I. Introduction
Automatic speech recognition (ASR) is at the heart of an ever-increasing variety of voice assistants, auto-captioning, and voice-search applications [1]. Speech classification is a subset of speech recognition comprising tasks in which a program must automatically assign a segment of input audio to one of several categories; examples include voice activity detection (binary or multi-class), speech command recognition (multi-class), and audio sentiment classification. Speech command recognition is the process of classifying an input speech pattern into one of a predefined set of classes. It is a branch of automatic speech recognition, also known as keyword spotting, in which a model continuously examines incoming speech to find particular “command” classes. When one of these commands is detected, the system can perform a specified action in response.
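The keyword-spotting loop described above can be illustrated with a minimal Python sketch. It is not the system proposed in this paper: the names `sliding_windows`, `spot_commands`, and `toy_classifier`, the window and hop sizes, and the "go" command are all hypothetical, and the amplitude threshold merely stands in for a trained model.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def sliding_windows(samples: Sequence[float], window: int, hop: int) -> Iterator[Sequence[float]]:
    """Yield fixed-length windows from an audio buffer, advancing by `hop` samples."""
    for start in range(0, len(samples) - window + 1, hop):
        yield samples[start:start + window]


def spot_commands(
    samples: Sequence[float],
    classify: Callable[[Sequence[float]], str],
    actions: Dict[str, Callable[[], None]],
    window: int,
    hop: int,
) -> List[str]:
    """Classify every window and run the action bound to each detected command."""
    fired = []
    for frame in sliding_windows(samples, window, hop):
        label = classify(frame)
        if label in actions:  # labels without an action (e.g. background) are ignored
            actions[label]()
            fired.append(label)
    return fired


def toy_classifier(frame: Sequence[float]) -> str:
    """Hypothetical stand-in for a trained model: thresholds the mean absolute amplitude."""
    level = sum(abs(x) for x in frame) / len(frame)
    return "go" if level > 0.5 else "background"


# Synthetic "audio": silence, a loud burst, then silence again.
audio = [0.0] * 8 + [1.0] * 8 + [0.0] * 8
log: List[str] = []
detected = spot_commands(
    audio, toy_classifier, {"go": lambda: log.append("moving")}, window=4, hop=4
)
```

In a practical system the classifier would be a trained acoustic model operating on features such as log-mel spectrograms, and consecutive detections of the same command would typically be smoothed or debounced before an action is triggered.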