I. Introduction
The vocal folds (VFs) are a pair of muscles in the larynx (voice box), the tubular structure that connects the throat to the windpipe (trachea). The VFs function like a valve in the upper airway, opening and closing as needed for breathing, swallowing, and speaking [24]. Therefore, VF dysfunction can result in breathing difficulty (dyspnea), swallowing dysfunction (dysphagia), and/or voice impairment (dysphonia), all of which can significantly reduce the patient’s quality of life [31]. However, dyspnea and dysphagia can become life-threatening as a result of VF "valving" impairment – inappropriate VF closure can obstruct breathing, while inappropriate VF opening can allow food and liquid into the airway, leading to choking and/or lung infection (aspiration pneumonia) [21]. While numerous medical conditions can result in life-threatening VF dysfunction, the most prevalent are neurological disorders (e.g., stroke, Parkinson’s disease, and amyotrophic lateral sclerosis) and head and neck cancer [27]. Aspiration pneumonia, is a leading cause of morbidity and mortality in these conditions/diseases [9], thus highlighting the clinical need for improved medical management of VF dysfunction. Our proposed solution entails incorporating deep learning-based automated video analysis methods into the current clinical test for VF dysfunction.