
Hybrid CNN-LSTM Network for Real-Time Apnea-Hypopnea Event Detection Based on IR-UWB Radar


Overall structure of the proposed CNN-LSTM model.

Abstract:

Polysomnography (PSG) is the gold standard for sleep apnea and hypopnea syndrome (SAHS) diagnosis. Because the PSG system is not suitable for long-term continuous use owing to the high cost and discomfort caused by attached multi-channel sensors, alternative methods using a non-contact sensor have been investigated. However, the existing methods have limitations in that the radar-person distance is fixed, and the detected apnea hypopnea (AH) event cannot be provided in real-time. In this paper, therefore, we propose a novel approach for real-time AH event detection with impulse-radio ultra-wideband (IR-UWB) radar using a deep learning model. 36 PSG recordings and simultaneously measured IR-UWB radar data were used in the experiments. After the clutter was removed, IR-UWB radar images were segmented by sliding a 20-s window at a 1-s shift, and categorized into two classes: AH and N. A hybrid model combining convolutional neural networks and long short-term memory networks was trained with the data, which consisted of class-balanced segments. Time-sequenced classification outputs were then fed to an event detector to identify valid AH events. The proposed method showed a Cohen's kappa coefficient of 0.728, sensitivity of 0.781, specificity of 0.956, and an accuracy of 0.930. According to the apnea-hypopnea index (AHI) estimation analysis, the Pearson's correlation coefficient between the estimated AHI and reference AHI was 0.97. In addition, the average accuracy and kappa of SAHS diagnosis were 0.98 and 0.96, respectively, for AHI cutoffs of ≥ 5, 15, and 30 events/h. The proposed method achieved state-of-the-art performance for classifying SAHS severity without any hand-engineered feature, regardless of the user's location. Our approach can be utilized for a cost-effective and reliable SAHS monitoring system in a home environment.
Published in: IEEE Access ( Volume: 10)
Page(s): 17556 - 17564
Date of Publication: 19 May 2021
Electronic ISSN: 2169-3536

SECTION I.

Introduction

Sleep apnea and hypopnea syndrome (SAHS) is the most common sleep-related breathing disorder in the general population and is caused by partial or complete obstruction of the upper airway [1]. This disorder is characterized by repetitive events in which breathing is shallow or paused for more than 10 s during sleep [2]. These events are typically accompanied by blood oxygen desaturation and arousals during sleep, leading to daytime sleepiness, decreased cognitive function, and negative mood [3], [4]. Moreover, SAHS is a known risk factor for several complications in untreated patients, including hypertension, type 2 diabetes, cardiovascular disease, stroke, and heart failure [5], [6]. Hence, screening for unrecognized SAHS and providing appropriate treatment enable preventive measures that reduce these potential health problems and mortality.

Although polysomnography (PSG) is the gold standard method for diagnosing SAHS, its usefulness as a long-term monitoring system is reduced by several drawbacks [7]: it requires trained sleep technicians in specially equipped laboratories, and the large number of electrodes attached to the patient's body may cause discomfort and affect their sleeping behavior. To overcome the shortcomings of PSG, numerous studies have proposed alternative methods to monitor SAHS using various sensors such as accelerometers, depth and thermal cameras, and piezoelectric pressure sensors [8], [9]. Nevertheless, these methods still require the user to physically contact the sensor or raise privacy concerns. In the past decade, non-contact methods using radio frequency (RF) sensors for SAHS monitoring have been proposed to address the drawbacks of PSG [10], [11]. Impulse-radio ultra-wideband (IR-UWB) is a short-range wireless communication technique employing radio waves occupying a frequency band of 3.1–10.6 GHz. Owing to the wide bandwidth, IR-UWB radar offers many advantages, such as strong permeability, multipath resistance, and high spatial resolution [12]. Thus, IR-UWB radar has been utilized for various clinical applications as well as obstacle detection by remotely detecting vital signs and body movement [13], [14].

Recently, a few studies have developed algorithms for estimating the apnea hypopnea index (AHI) and diagnosing SAHS using only IR-UWB radar [15]–​[17]. Javaid et al. [15] implemented an under-the-mattress ultra-wideband (UWB) radar sensor to detect apnea-hypopnea (AH) events using a linear discriminant classifier for 4 patients with sleep apnea. Zhou et al. [16] investigated a wireless radar sleep screening device (ZG-S01A) that estimates AHI and total sleep time (TST) based on IR-UWB radar, and Kang et al. [17] proposed an algorithm for the detection of AH events with the constant false alarm rate (CFAR) algorithm and a weight function based on IR-UWB radar. Both studies showed a high correlation coefficient between the estimated AHI and PSG AHI, as well as high sensitivity and specificity for diagnostic efficacy at three different AHI cutoffs. Most of the existing radio frequency (RF) sensor-based methods mainly used the adaptive threshold method based on the AASM manual [18]. The adaptive threshold method extracts the respiratory signal from the IR-UWB radar and detects decreases in amplitude relative to the baseline respiratory signal. However, this procedure requires information about the location of the human chest to extract the breathing signal from the radar data. Moreover, AH events are usually accompanied by body movements, which prevent the radar from both locating the body accurately and determining the exact baseline amplitude. In addition, there have been numerous studies on real-time monitoring of SAHS using alternative sensors, because it is critical to provide real-time instantaneous feedback for any associated medical treatment, such as continuous positive airway pressure (CPAP) pressure adjustments, when an AH event appears [19]–​[21]. Nonetheless, none of the existing studies provide real-time AH event detection techniques.

To effectively address the above challenges, we propose a deep learning approach based on a convolutional neural network (CNN) in combination with a long short-term memory (LSTM) network. CNNs automatically filter out noise and extract valuable features from a signal or image without any domain knowledge [22], [23]. Although CNNs are useful for extracting patterns that appear as a local trend or appear the same in different regions of a time sequence, they are not suitable for capturing temporal dependencies [24]. The LSTM network, a variant of the recurrent neural network (RNN), contains cyclic feedback connections designed to handle temporal sequences [25]. Thus, LSTM layers can encode relevant information about class-specific characteristics across time [26]. Owing to these characteristics, models combining CNNs and LSTMs have been successfully applied to detecting SAHS using bio-signal sequences in recent studies [27]–​[29]. On this premise, we take the temporal characteristics of radar signals into consideration and propose a hybrid model architecture that combines CNNs and an LSTM network. To the best of our knowledge, none of the previous studies applied a deep learning algorithm to monitor sleep-related breathing disorders using RF sensors.

In this study, we assessed the ability of the CNN-LSTM network to accurately detect AH events based on a single IR-UWB radar. The main contributions of this study are as follows:

  • A deep learning framework for SAHS monitoring using a single RF sensor is presented.

  • We develop a hybrid CNN-LSTM architecture for real-time AH event detection without any hand-engineering, a capability not offered by existing methods.

  • Performance of the AH event detection is evaluated using segment analysis and AHI estimation analysis.

  • The SAHS diagnostic efficacy of the proposed approach is compared with that of existing methods, and it shows state-of-the-art performance.

SECTION II.

Materials and Methods

A. Subjects and Polysomnography

Our study was performed in accordance with the ethical standards in the Declaration of Helsinki, and the Institutional Review Board of Seoul National University Hospital (IRB-SNUH No. 1807-190-964) approved this prospective cohort. All participants were briefed about the objective and procedure of the experiment, and they signed the consent forms. Subjects with suspected SAHS were recruited from clinic populations and the online clinical trials center of SNUH.

Subjects were initially screened by study coordinators to ensure that they met the inclusion and exclusion criteria. The inclusion criteria were adults aged ≥ 18 yrs. who were judged to be in the high-risk group by both the Berlin questionnaire [30] and the STOP-Bang questionnaire [31]. The exclusion criteria were any history of sleep disorders other than SAHS, or psychiatric, neurological, or cardiovascular disorders. Forty qualifying subjects underwent overnight PSG at the Center for Sleep and Chronobiology of SNUH. Of these, 4 PSG recordings with incomplete data, caused by a defective cable connecting the IR-UWB radar and PC, were excluded, leaving 36 PSG recordings for analysis. SAHS was diagnosed as an AHI ≥ 5 events/h, and subjects were classified into three groups: mild SAHS (5 ≤ AHI < 15 events/h), moderate SAHS (15 ≤ AHI < 30 events/h), and severe SAHS (AHI ≥ 30 events/h).

All PSG data were recorded with a NEUVO system (Compumedics Ltd., Victoria, Australia), scored by a sleep technologist, and verified by two sleep clinicians according to the 2018 AASM manual [18]. The following physiological data were collected: electroencephalogram (EEG) at O2-M1, C4-M1, and F4-M1; submental and tibialis anterior electromyogram (EMG); bilateral electrooculogram (EOG); electrocardiogram (ECG); oronasal airflow from a thermistor; thoracic and abdominal respiratory effort from piezoelectric-type belts; nasal pressure from a nasal cannula/pressure transducer; body posture from a 3-axis accelerometer; and blood oxygen saturation from a pulse oximeter. Of all these signals, only the blood oxygen saturation was sampled at 200 Hz; all other signals were measured at 500 Hz. The anthropometric and sleep parameters of the subjects are summarized in Table 1.

TABLE 1 Subjects Demographics and Sleep-Related Variables

B. IR-UWB Radar

A commercially available IR-UWB radar system on chip (SoC), the X4 (Novelda, Oslo, Norway), was adopted. The IR-UWB radar was fixed on a tripod within a range of approximately 0.5 to 2 m from the human chest and measured simultaneously with the PSG, as depicted in Fig. 1. The transmitter of the radar has a center frequency of 7.29 GHz and a bandwidth of 1.5 GHz. The receiver sampled the reflected signal at 23.328 GS/s, and the radar signals were digitized at 20 fps. The detection range was set to 2 m. The PSG system and IR-UWB radar device were time-synchronized using a respiratory effort signal from the PSG and a sampled chest movement signal from the IR-UWB data at a fixed distance, where the human chest was presumed to be located. Before starting PSG, we asked the subjects to hold their breath for 15–20 seconds during the calibration time. In this manner, the two devices were synchronized in time by finding the section where the two signals appear flat simultaneously using the cross-correlation method. For acquiring, processing, and storing data from the IR-UWB radar, we used MATLAB 2019a (MathWorks, Natick, MA, USA).
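The cross-correlation alignment described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation: the function name, the use of NumPy, and the Gaussian test pulse (standing in for the flat breath-hold section) are assumptions for demonstration.

```python
import numpy as np

def sync_lag(psg_resp, radar_resp, fs):
    """Estimate the time offset (s) between two respiration signals by
    locating the peak of their full cross-correlation.
    Positive result: psg_resp lags radar_resp."""
    a = (psg_resp - psg_resp.mean()) / (psg_resp.std() + 1e-12)
    b = (radar_resp - radar_resp.mean()) / (radar_resp.std() + 1e-12)
    xcorr = np.correlate(a, b, mode="full")
    lag_samples = np.argmax(xcorr) - (len(b) - 1)
    return lag_samples / fs
```

In practice the offset found once during the breath-hold calibration would be applied to the whole night's recording.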

FIGURE 1. Measurement of IR-UWB radar with PSG in our experiment.

C. Signal Preprocessing and Segmentation

Because the multiple baseband signals obtained by the IR-UWB radar sensor can be expressed as a 2-D virtual image, the raw signals were preprocessed and segmented to train and test the CNN-LSTM model. The window size was set to 20 s with a shift of 1 s to sufficiently reflect the normal breathing section. The image segment did not specify a region of interest and included all ranges from 0 to 2 m. In the sleep environment, the background is considered stationary, while the human body changes its position in terms of chest movements caused by breathing. Therefore, we obtained the target signals by subtracting the DC component from the raw radar signals with a moving-average method [32]. The clutter was calculated as the average of the 20-s epoch amplitudes along the fast-time range bin. Then, the image was downsized from 300 × 400 pixels to 80 × 300 pixels using the area interpolation method to increase the learning speed.
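The windowing and clutter-removal steps can be sketched as below. This is a simplified NumPy version under stated assumptions: the clutter estimate is the per-range-bin mean over each 20-s epoch as described in the text, and the area-interpolation resizing step is omitted.

```python
import numpy as np

FPS = 20             # radar frame rate (frames per second)
WIN = 20 * FPS       # 20-s window -> 400 frames
STEP = 1 * FPS       # 1-s shift   -> 20 frames

def segment_radar(frames):
    """Slide a 20-s window at a 1-s shift along the slow-time axis and
    remove clutter per segment by subtracting each range bin's mean
    (the static DC component) over the 20-s epoch.
    frames: (n_frames, n_range_bins) slow-time x fast-time matrix."""
    segs = []
    for s in range(0, frames.shape[0] - WIN + 1, STEP):
        w = frames[s:s + WIN].astype(float)
        segs.append(w - w.mean(axis=0, keepdims=True))  # clutter removal
    return np.stack(segs)
```

A 60-s recording at 20 fps (1,200 frames) yields 41 overlapping 20-s segments under this scheme.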

Fig. 2 shows examples of the synchronized IR-UWB radar image, nasal airflow, and thoracic respiratory effort during no apnea event, central sleep apnea (CSA), obstructive sleep apnea (OSA), and hypopnea for 1 min. During no apnea event (Fig. 2 (A)), the respiration activity occurring at a distance of approximately 1 m is clearly visible in the radar image. In contrast, during central apnea (Fig. 2 (B)), the airflow and thoracic waveforms disappear and, at the same time, the breathing pattern in the radar image also disappears. During obstructive apnea (Fig. 2 (C)) and hypopnea (Fig. 2 (D)), a significant decrease appears in the thoracic waveform, and the contrast due to breathing in the radar image is weakened.

FIGURE 2. Example of IR-UWB raw data and different normalized PSG signals with normal breath and three types of respiratory events (A) no apnea event (B) central sleep apnea (C) obstructive sleep apnea, and (D) hypopnea. The PSG signals were recorded at 500 Hz, and the IR-UWB signals were sampled with 20 fps.

The preprocessed images were categorized into two classes: AH and N. If at least 10 s of a segment fell within an apnea or hypopnea event period, it was labeled as class AH. All other cases were labeled as class N. As a result, 138,067 segments were labeled as class AH, and 778,471 segments were labeled as class N. Next, to prevent the model from overfitting to the majority class, we made the training set consist of the same number of samples for each class. Because class N had more segments than class AH, for each subject, class N segments were randomly subsampled by the imbalance ratio. The original imbalance ratio of our training set, defined as the number of class N segments divided by the number of class AH segments, was 5.64.
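The labeling rule and the per-subject subsampling can be sketched as follows. This is an illustrative interpretation (function names and the representation of events as (onset, offset) pairs in seconds are assumptions), not the authors' exact pipeline:

```python
import numpy as np

def label_segments(starts, win_s, events):
    """Label each segment AH (1) if >= 10 s of it overlaps scored
    apnea/hypopnea events, else N (0).
    starts: segment start times (s); events: list of (onset, offset) in s."""
    labels = np.zeros(len(starts), dtype=int)
    for i, s in enumerate(starts):
        seg_end = s + win_s
        overlap = sum(max(0.0, min(seg_end, off) - max(s, on))
                      for on, off in events)
        labels[i] = int(overlap >= 10.0)
    return labels

def balance_class_n(idx_n, idx_ah, rng):
    """Randomly subsample class-N segment indices to the class-AH count."""
    return rng.choice(idx_n, size=len(idx_ah), replace=False)
```

With a 20-s window, a segment starting at 0 s against an event spanning 10–25 s overlaps for exactly 10 s and is therefore labeled AH.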

D. CNN-LSTM Architecture

As shown in Fig. 3, our proposed CNN-LSTM architecture comprises three convolutional layers, two max-pooling layers, one bidirectional LSTM layer, and one fully connected layer. Note that the input image is treated as a 1-D signal along the time axis, and 1-D CNNs were applied. This better preserves the temporal characteristics of the breathing pattern and feeds time-dependent feature vectors into the LSTM units. Moreover, a 1-D CNN has less computational complexity than a 2-D CNN [33].

FIGURE 3. Overall structure of the proposed CNN-LSTM model.

To find the optimal hyperparameters and evaluate the model performance, we used nested 6-fold cross-validation. To do this, subjects were randomly divided into 6 equal-size subsets. In the outer 6-fold cross-validation loop, 5 subsets were used as the training dataset, and 1 subset was used as the test dataset. At each inner fold, the training dataset was further divided into 6 equal-size subsets. Then, 5 subsets were used as the training dataset, and 1 subset was used as the validation dataset. From the results of the 6 × 6 inner folds, the best hyperparameters were selected by maximizing Cohen's kappa coefficient on the validation dataset. Table 2 shows the detailed configuration of the various layers of the proposed model. In each CNN layer, the layer input and kernel were convolved with a stride of 2 and 'same' padding. We trained the model using the Adam optimizer [34] and He normal initializer [35]. The learning rate was set to 0.001, and binary cross-entropy was used as the loss function. The model was trained for a maximum of 100 epochs with an early stopping patience of 10 and a batch size of 128.
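A Keras sketch of the architecture described above is given below. Since Table 2 is not reproduced here, the filter counts, kernel sizes, and LSTM width are placeholders, and the input shape assumes the 80 × 300 downsized image is fed as 300 time steps of 80 range bins; only the layer types, stride-2 'same'-padded 1-D convolutions, Adam optimizer, learning rate, and loss match the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(n_frames=300, n_range_bins=80):
    """Sketch of the CNN-LSTM: three 1-D conv layers (stride 2, 'same'
    padding), two max-pooling layers, one bidirectional LSTM, and one
    fully connected sigmoid output (class AH vs. N)."""
    inp = layers.Input(shape=(n_frames, n_range_bins))
    x = layers.Conv1D(32, 7, strides=2, padding="same", activation="relu")(inp)
    x = layers.MaxPooling1D(2)(x)
    x = layers.Conv1D(64, 5, strides=2, padding="same", activation="relu")(x)
    x = layers.MaxPooling1D(2)(x)
    x = layers.Conv1D(128, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Bidirectional(layers.LSTM(64))(x)
    out = layers.Dense(1, activation="sigmoid")(x)
    model = models.Model(inp, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

The He normal initializer and early-stopping callback from the text would be added via `kernel_initializer="he_normal"` and `tf.keras.callbacks.EarlyStopping(patience=10)` at `fit` time.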

TABLE 2 Configuration of Various Layers of the Proposed Model

Models were implemented in Python 3.7 and the Keras framework [36] with a TensorFlow backend [37]. Training and testing were performed using a GTX 1080 8 GB GPU and a 3.4 GHz Intel i7-6700 CPU.

E. Performance Evaluation

After finding the optimal hyperparameters, we evaluated the performance of sleep apnea event detection on the test dataset. Test segments were applied to the CNN-LSTM model, and classification outputs, representing class AH or class N, were obtained. Finally, these time-sequenced classification labels were fed to the event detector to identify valid AH events. The event detector declares a valid AH event when at least six consecutive segments are classified as class AH.
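The six-consecutive-segment rule can be sketched as a simple run-length scan over the per-second label stream; the function name and the (start, end) index output format are illustrative choices.

```python
def detect_events(labels, min_run=6):
    """Return (start, end) segment-index pairs for runs of at least
    min_run consecutive class-AH (1) labels; shorter runs are ignored."""
    events, run_start = [], None
    for i, y in enumerate(list(labels) + [0]):   # sentinel 0 closes last run
        if y == 1 and run_start is None:
            run_start = i
        elif y != 1 and run_start is not None:
            if i - run_start >= min_run:
                events.append((run_start, i - 1))
            run_start = None
    return events
```

Because segments advance by 1 s, six consecutive AH labels correspond to a sustained detection consistent with the minimum 10-s event duration.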

To evaluate the performance of sleep apnea event detection, we performed three analyses. First, the conventional metrics of accuracy (ACC), sensitivity (SENS), specificity (SPEC), and Cohen's kappa coefficient (Kappa) were calculated by generating a confusion matrix between the estimated results and the reference PSG results in a segment-by-segment analysis. Second, we estimated the AHI based on the number of valid AH events. Then, Pearson's correlation analysis and Bland-Altman analysis between the estimated AHI and the reference PSG AHI were conducted. Lastly, the SAHS diagnostic performance at AHI cutoffs of 5, 15, and 30 events/h was validated with ACC, SENS, SPEC, positive predictive value (PPV), and Kappa.
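The segment-by-segment metrics can be computed directly from the binary confusion-matrix counts; the sketch below is a standard formulation (the dictionary return format is an illustrative choice), with Cohen's kappa derived from observed versus chance agreement.

```python
import numpy as np

def seg_metrics(y_true, y_pred):
    """ACC, SENS, SPEC, and Cohen's kappa from binary segment labels
    (1 = class AH, 0 = class N)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    n = tp + tn + fp + fn
    po = (tp + tn) / n                                    # observed agreement
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2  # chance
    return {"acc": po,
            "sens": tp / (tp + fn),
            "spec": tn / (tn + fp),
            "kappa": (po - pe) / (1 - pe)}
```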

SECTION III.

Results

A. Segment-By-Segment Analysis

We calculated the time taken to classify all the test segments in order to verify the real-time applicability of the trained model. Because it took 211.9 s to classify the 916,538 test segments, classifying one segment took approximately 0.00023 s on average. Thus, this model is sufficiently capable of real-time event detection.

Table 3 shows the performance of the proposed model on the test dataset based on the segment-by-segment analysis. AH events classified by the CNN-LSTM model were compared with the scored AH events from the reference PSG. When computing the performance for apnea event detection in each severity group, the SENS and Kappa values tended to gradually increase from the non-SAHS group (AHI < 5 events/h) to the severe SAHS group (AHI ≥ 30 events/h). For the overall test segments, we obtained an ACC of 0.930, SENS of 0.781, SPEC of 0.956, and Kappa of 0.728.

TABLE 3 Event Detection Performance in Each SAHS Severity Group

B. AHI Estimation Analysis

Pearson's correlation coefficients and Bland-Altman plots were used to assess the agreement between the estimated AHI (AHIEST) from the IR-UWB radar and the PSG-derived AHI (AHIPSG). Fig. 4 (A) and (B) show the scatter plot of AHIEST versus AHIPSG and the Bland-Altman plot for the entire night's sleep. The Pearson correlation coefficient (N = 36) was 0.970 (p < 0.001). The Bland-Altman plot shows a low mean bias (−1.983), and the limits of agreement were −14.655 to 10.689. To verify the performance in detail, the AHI of each hour was additionally calculated from both the IR-UWB radar and PSG for each subject. Fig. 4 (C) shows that the Pearson correlation coefficient for all samples (N = 258) was 0.955 (p < 0.001). Moreover, the Bland-Altman plot again shows a low mean bias (−1.567), and the limits of agreement were −17.710 to 15.375 (Fig. 4 (D)).

FIGURE 4. Scatter plots of estimated AHI using the proposed method (AHIEST) versus reference AHI obtained from the polysomnography (AHIPSG) for (A) total sleep time and (C) each hour from all subjects. Bland-Altman plots for visualization of the agreement between AHIEST and AHIPSG for (B) total sleep time and (D) each hour from all subjects. Gray line indicates an identity line in (A, C). Gray bold line and blue lines in (B, D) indicate the average difference (Bias) and the average ± 1.96*standard deviation, respectively.

Table 4 summarizes the SAHS severity classification and diagnostic performance for all test subjects. The diagnostic performance was calculated for AHI cutoffs of 5, 15, and 30 events/h. The average values for ACC, SENS, SPEC, PPV, and Kappa were 0.98, 0.97, 1.00, 1.00, and 0.96, respectively.

TABLE 4 SAHS Severity Classification and Diagnostic Performance

SECTION IV.

Discussion

The purpose of this study was to develop a new diagnostic algorithm for real-time SAHS monitoring by using a deep learning method based on a non-contact sensor. A single IR-UWB radar was used as the breathing monitoring device, and a hybrid model combining CNNs and an LSTM network was used as the classifier for AH events. To demonstrate the ability to detect AH events in real-time, radar images with overlapping windows were input into the CNN-LSTM model.

An important feature of our method is that it detects individual AH events in real-time and reports the results as per-segment classification performance. Most previous studies do not report event detection performance and only present the AHI for the entire sleep. Javaid et al. [15], who detected AH events using a machine learning technique with an under-the-mattress UWB radar sensor, reported only an accuracy, sensitivity, and specificity of 0.73, 0.71, and 0.71, respectively. Moreover, the number of participants in that study was very small, and the classification performance leaves room for improvement.

In the segment-by-segment analysis, the Kappa value of our proposed model for all dataset segments was 0.72, indicating substantial agreement. Table 3 reports the confusion matrix for the overall segment-by-segment comparison between the proposed CNN-LSTM model and the reference PSG. The total number of false positives was 34,309. Notably, about 68% of these false positives contained events in which the nasal airflow signal showed an amplitude reduction of more than 30% compared to the baseline, but the segments were scored as "no event" because of the absence of SpO2 desaturation or arousal. Because additional information for classifying hypopneas, such as EEG or SpO2, was not considered in this study, we speculate that these segments led the model to misclassify them as class AH and thus overestimate AH events. In Table 3, the Kappa value for all SAHS severity groups exceeded 0.5, which shows that our proposed model detects AH events with substantial agreement. Note that the performance indices tend to gradually increase from the non-SAHS group to the severe SAHS group. When the ratio of hypopnea events to apnea events was calculated for each severity group, it was 9.85, 7.22, 3.87, and 1.15, respectively. In other words, the aforementioned overestimation problem for hypopnea events deteriorated the model performance in the non-SAHS and mild SAHS groups, where the proportion of hypopneas is relatively high. In addition, according to Fig. 4 (B), the average bias value is −1.983, which indicates that the model overestimated AH events overall.

However, despite the problem of overestimation, the sensitivity in the non- and mild SAHS groups is lower than that of the moderate and severe SAHS groups (Table 3). Considering that the proportion of hypopnea events is relatively large in the non- and mild SAHS groups, it is reasonable to think that the factor that induced misclassification into class N is a hypopnea event rather than an apnea event. In fact, approximately 92% of false negatives in the non- and mild SAHS groups contained hypopnea events. These misclassifications can be understood from Fig. 2, in which the hypopnea event shows that the breathing weakens and amplitude decreases, while its waveforms resemble those of “no event.” In addition, the average duration of hypopnea events in the normal and mild groups was 26.3 s, whereas the average duration of hypopnea corresponding to false negatives was only 13.5 s. In other words, it can be assumed that the characteristic of the hypopnea event is not well reflected in the radar image, and hence, it is misclassified as class N when a hypopnea event has a short duration in the non- and mild SAHS groups.

As summarized in Table 4, despite the bias of the IR-UWB radar AHI relative to the reference PSG AHI, the SAHS diagnostic performance was high. The outstanding diagnostic performance and the statistically high AHI correlation of our model support its use for clinically screening or continuously monitoring SAHS severity. When the results are compared with those of previous studies based on IR-UWB radar technologies showing the highest performances [16], [17], our method achieved comparable or better performance across the reported metrics (average diagnostic performance in [16]: ACC = 0.99, SENS = 0.99, SPEC = 0.99, PPV = 0.99, and Kappa = 0.98; Pearson's correlation coefficient of AHI = 0.97; average diagnostic performance in [17]: ACC = 0.95, SENS = 0.88, SPEC = 0.98, PPV = 0.98, and Kappa = 0.88; Pearson's correlation coefficient of AHI = 0.93). This study is the first to investigate the effectiveness of deep learning for sleep-related breathing disorders based on an RF sensor. Fig. 2 shows that when an AH event occurs, the amplitude of the sinusoidal thoracic movement due to breathing decreases, which in turn weakens the oscillation-induced contrast in the radar image. CNN modules can learn these prominent characteristics and extract robust features [38]. Therefore, contrary to existing rule-based algorithms built on hand-engineered features that might miss important sleep apnea markers, our method does not include a hand-engineered feature extraction and selection process [39]. Moreover, the LSTM layer plays a key role in identifying sequence pattern information and short-term and long-term dependencies [40]. Thus, the proposed model was able to determine whether an AH event occurred in a segment with high performance in real-time, using only a 20-s radar image. In addition, the high-performance detection ability was maintained across the various positions where breathing occurred.

To confirm that our method showed high AH event detection performance uniformly, regardless of the location of the human chest, we additionally calculated the distribution of distances from the radar device to the subject's chest and analyzed their correlation with the detection performance. The chest distance was chosen as the range bin with the highest spectral power of the radar signal within the respiration frequency range (0.1–0.7 Hz) in the fast-time domain. The calculation was performed every 30-s epoch, but epochs scored as stage "Wake" or containing AH events were excluded from the analysis. The mean distance during the whole night's sleep for each subject was 0.98 ± 0.31 m with a range of 0.84–1.32 m, and it did not show any significant correlation with any of the AH event detection performance indices.
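The respiration-band power criterion above can be sketched as follows. This is a minimal NumPy interpretation (the function name and the use of a raw periodogram rather than any particular spectral estimator are assumptions); it returns the range-bin index with the most power in the 0.1–0.7 Hz band.

```python
import numpy as np

def chest_range_bin(frames, fps=20, band=(0.1, 0.7)):
    """Pick the fast-time range bin whose slow-time signal has the most
    spectral power in the respiration band.
    frames: (n_frames, n_range_bins) clutter-removed radar matrix."""
    spec = np.abs(np.fft.rfft(frames, axis=0)) ** 2      # per-bin periodogram
    freqs = np.fft.rfftfreq(frames.shape[0], d=1.0 / fps)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return int(np.argmax(spec[mask].sum(axis=0)))        # band power per bin
```

Converting the winning bin index to meters then only requires the radar's range resolution.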

Our study has a few limitations. First, although information regarding the actual sleep time is necessary for accurate AH event classification and AHI calculation, our method could not take the sleep stage into account because EEG was not included. Moreover, the occurrence of AH events is greatly influenced by sleep posture as well as sleep stage [41]; however, an accelerometer was not considered in this study. Nevertheless, studies have recently been conducted to classify sleep stages and sleep posture based on IR-UWB radar signals [42]–​[44]. Combining these approaches with our method will allow us to classify AH events and calculate the AHI accurately from a single IR-UWB radar. Second, this study could not compare the agreement in respiratory disturbance index (RDI) between the IR-UWB radar and the reference PSG. RDI is also a criterion for classifying the severity of SAHS and is very similar to AHI, but it includes the number of respiratory effort-related arousals (RERAs) in addition to apneas and hypopneas [18]. However, as mentioned above, since EEG information could not be included in this study, arousal-associated respiratory events could not be classified. Future studies including additional EEG information will enhance the usefulness of our method. Third, this study did not differentiate detailed types of AH events. As shown in Fig. 2, the respiration signals from PSG show different patterns, in which the waveform disappears or the amplitude decreases depending on the type of AH event. However, there is a limit to recognizing this difference in pattern using only the radar signal reflected from the chest movement, and it is difficult to train the model owing to the imbalance in sample numbers across event types. Finally, the AH event detection was tested in a controlled laboratory environment. To confirm usability, our model ought to be validated in home environments in future studies.

SECTION V.

Conclusion

In this paper, we developed a deep learning model combining CNNs and an LSTM network and detected AH events based on overlapping images of IR-UWB radar, allowing us to identify actual events in real-time. Despite not using any hand-engineered feature as input, our proposed method achieved state-of-the-art performance for classifying SAHS severity regardless of the user's location. Moreover, the hybrid architecture, which exploits the benefits of both deep learning techniques, did not require a feature extraction and selection process. Because users do not need to attach any sensors to their body, the IR-UWB radar is drawing attention as a sleep monitoring device with the potential to serve as an alternative to PSG [45]. Our proposed method based on IR-UWB radar and deep learning can be utilized for cost-effective and reliable SAHS monitoring in both hospital and home environments.
