Introduction
Brain–computer interface (BCI) systems that capture sensorimotor rhythms (SMRs) and event-related potentials from the central nervous system and convert them into artificial outputs have shown great value in medical rehabilitation, entertainment, learning, and military applications [1], [2], [3], [4]. Motor imagery (MI) can evoke SMRs, which share common neurophysiological dynamics and sensorimotor areas with the corresponding explicit motor execution (ME) but do not produce real motor actions [5], [6]. As a functionally equivalent counterpart to ME, MI is more convenient for BCI users with some degree of motor impairment who cannot perform overt ME tasks, making it an important paradigm for BCI research. However, MI-based BCI still faces two major challenges. First, improving the performance of MI-based classification remains difficult for BCI design and development. Second, existing algorithms usually require a large number of channels to achieve good classification performance, which limits the practicality of BCI systems and their translation into the clinic.
Because of the nonstationary, time-varying, and multichannel nature of EEG signals, traditional machine learning methods such as the Bayesian classifier [7] and the support vector machine (SVM) have limitations in achieving high classification performance. Recently, deep artificial neural networks, loosely inspired by biological neural networks, have shown remarkable performance in EEG signal classification. An et al. [8] proposed using multiple deep belief nets as weak classifiers and combining them into a stronger classifier with the AdaBoost algorithm, achieving a 4–6% performance improvement over the SVM algorithm. Tabar et al. [9] proposed a framework combining a convolutional neural network (CNN) and an autoencoder to classify features obtained with the short-time Fourier transform (STFT), with significant results. The recently proposed EEGNet [10] employed a novel scheme that combines feature extraction and classification in one network and achieved relatively good results in several BCI paradigms. Sun et al. [11], [12] added an attention mechanism to a CNN to give different attention to different channels of EEG data, achieving state-of-the-art results in current BCI applications. Although CNN models have achieved good results for MI classification, it is worth noting that traditional CNNs are better at processing local features of continuously varying signals such as speech, video, and images [13]. CNN approaches may be less suitable for EEG signals, as EEG signals are discrete and noncontinuous in the spatial domain.
Recent work has shown that graph neural networks (GNNs) can serve as valuable models for EEG signal classification. GNNs use graph theory to process data in the graph domain and have shown great potential in non-Euclidean domains such as image classification [14], channel classification [15], and traffic prediction [16]. ChebNet [14] was proposed to speed up the graph convolution operation while preserving performance by parameterizing the graph convolution with Chebyshev polynomials. Based on ChebNet, Kipf et al. [17] proposed the graph convolutional network (GCN) by combining CNNs with spectral theory. GCN not only outperforms ChebNet but is also highly scalable [15]. Compared with CNN models, GCN has an advantage in extracting discriminative features from signals [18]; more importantly, GCN offers a way to explore the intrinsic relationships between different channels of EEG signals. GCN has been widely used in brain signal processing, and its effectiveness has been proven. Several GCN-based methods have introduced innovations in the adjacency matrix. Zhang et al. [19] used prior knowledge to transform the 2-D or 3-D spatial positions of electrodes into an adjacency matrix. Li et al. [20] used mutual information to construct the adjacency matrix. Du et al. [21] used a spatial distance matrix and a relational communication matrix to initialize the adjacency matrix. However, most existing work has focused on designing adjacency matrices to improve decoding accuracy, which often requires manual design or a priori knowledge.
The use of dense electrodes for EEG recordings increases the burden on subjects, so it is becoming increasingly evident that novel channel selection approaches need to be explored [22]. The purpose of channel selection is to select the channels that are most critical to classification, thereby reducing the computational complexity of the BCI system, speeding up data processing, and reducing the adverse effects of irrelevant EEG channels on classification performance. Even for the same MI task, the activity of brain areas varies from subject to subject despite the maturity of brain region delineation. Therefore, selecting EEG channels appropriate for a particular subject on an individual basis is essential for the practical application of MI-BCI. There have been several studies on channel selection, including filter, wrapper, and embedded methods [23], [24], [25]. Among these methods, the common spatial pattern (CSP) algorithm and its variants [26], [27], [28] have received much attention for their simplicity and efficiency. Meng et al. [29] measured channel weight coefficients to select channels via CSP, but this approach cannot satisfy computational efficiency and accuracy at the same time. In order to solve the channel selection problem, Yong et al. [30] used
To address the above issues, this article proposes an EEG channel active inference neural network (EEG-ARNN), which not only outperforms state-of-the-art (SOTA) methods in terms of accuracy and robustness but also enables channel selection for specific subjects. The main contributions are as follows:
An end-to-end EEG-ARNN method for MI classification, which consists of a temporal feature extraction module (TFEM) and a channel active reasoning module (CARM), is proposed. The TFEM is used to extract temporal features of EEG signals. The CARM, which is based on GCN, eliminates the need to construct an artificial adjacency matrix and can continuously modify the connectivity between different channels in a subject-specific manner.
Two channel selection methods, termed edge-selection (ES) and aggregation-selection (AS), are proposed to choose an optimal subset of channels for particular subjects. In addition, when the selected channels are used to train EEG-ARNN, classification performance close to that of full-channel data can be obtained using only 1/6 to 1/2 of the original data volume. This helps to simplify the BCI setup and facilitates practical applications.
We explore the connection between the EEG channels selected by ES and AS during MI and the brain regions in which they are located, offering the possibility to further explore the activity levels in different brain regions during MI and paving the way for the development of practical brain–computer interface systems.
The rest of this article is organized as follows: Section II introduces the EEG-ARNN model, ES and AS methods. In Section III, experimental results are presented and the relationship between the brain regions is explored. Finally, Section IV concludes this article.
Methods
By simulating human brain activation with GCN and extracting temporal-domain EEG features with CNN, a novel MI-EEG classification framework is built in this work. As shown in Fig. 1, EEG-ARNN mainly consists of two parts: the CARM based on GCN and the TFEM based on CNN. In this section, the CARM, the TFEM, and the whole framework are described in detail. After that, the CARM-based ES and AS methods are described.
A. Channel Active Reasoning Module
GCN performs convolution operations on graph data in non-Euclidean space. The graph is defined as $\mathcal {G} = (\mathcal {V}, \mathcal {E}, \mathbf {W})$, where $\mathcal {V}$ is the set of $N$ nodes (the EEG channels), $\mathcal {E}$ is the set of edges, and $\mathbf {W} \in R^{N \times N}$ is the weighted adjacency matrix.
The Laplacian matrix of the graph $\mathcal {G}$ is defined as
\begin{equation*}
\mathbf {L} = \mathbf {D} - \mathbf {W} \in R^{N \times N} \tag{1}
\end{equation*}
where $\mathbf {D} \in R^{N \times N}$ is the diagonal degree matrix with $\mathbf {D}_{ii} = \sum _{j}\mathbf {W}_{ij}$. The graph Fourier transform of a signal $\mathbf {x} \in R^{N}$ is defined as
\begin{equation*}
\widehat{\mathbf{x}} = \mathbf {U}^{T}\mathbf {x} \tag{2}
\end{equation*}
where $\mathbf {U}$ is the matrix of eigenvectors given by the eigendecomposition of the Laplacian
\begin{equation*}
\mathbf {L} = \mathbf {U}\boldsymbol{\Lambda } \mathbf {U}^{T} \tag{3}
\end{equation*}
where $\boldsymbol{\Lambda }$ is the diagonal matrix of eigenvalues. Since $\mathbf {U}$ is orthogonal, the inverse graph Fourier transform is
\begin{equation*}
\mathbf {x} = \mathbf {U}\widehat{\mathbf{x}} = \mathbf {U}\mathbf {U}^{T}\mathbf {x}. \tag{4}
\end{equation*}
According to the convolution theorem, the graph convolution of two signals $\mathbf {x}_{1}$ and $\mathbf {x}_{2}$ is then
\begin{align*}
\mathbf {x_{1}}*_{\mathcal {G}} \mathbf {x_{2}} &= \mathbf {U}\left(\left(\mathbf {U}^{T} \mathbf {x_{1}}\right) \odot \left(\mathbf {U}^{T} \mathbf {x_{2}}\right)\right)\\
&= \mathbf {U}\left(\widehat{\mathbf{x}}_{1} \odot \left(\mathbf {U}^{T}\mathbf {x_{2}}\right)\right)\\
&= \mathbf {U}(\text{diag}\left(\widehat{\mathbf{x}}_{1}\right)\left(\mathbf {U}^{T}\mathbf {x_{2}}\right))\\
&= \mathbf {U}\text{diag}(\widehat{\mathbf{x}}_{1})\mathbf {U}^{T}\mathbf {x_{2}} \tag{5}
\end{align*}
where $\odot$ denotes the element-wise product. Let the filter function be $g_{\theta } = \text{diag}(\theta)$ with trainable spectral coefficients $\theta $; the graph convolution of a signal $\mathbf {x}$ with $g_{\theta }$ is then
\begin{equation*}
g_{\theta }*_{\mathcal {G}} \mathbf {x} = \mathbf {U}\text{diag}(\theta)\mathbf {U}^{T}\mathbf {x}. \tag{6}
\end{equation*}
Evaluating (6) directly requires the full eigendecomposition of $\mathbf {L}$, which is computationally expensive for large graphs. Following ChebNet [14], the filter is therefore approximated by a truncated Chebyshev polynomial expansion
\begin{equation*}
g(\boldsymbol{\Lambda }) = \sum _{k=0}^{K-1}\theta _{k}T_{k}(\bar{\mathbf {\Lambda }}) \tag{7}
\end{equation*}
where $\bar{\boldsymbol{\Lambda }} = 2\boldsymbol{\Lambda }/\lambda _{\max } - \mathbf {I}_{N}$ rescales the eigenvalues to $[-1, 1]$ ($\lambda _{\max }$ is the largest eigenvalue of $\mathbf {L}$), $\theta _{k}$ are the Chebyshev coefficients, and the Chebyshev polynomials are defined recursively as
\begin{equation*}
{\begin{cases}T_{0}(\bar{\mathbf {\Lambda }}) = 1 \\
T_{1}(\bar{\mathbf {\Lambda }}) = \bar{\mathbf {\Lambda }} \\
T_{k}(\bar{\mathbf {\Lambda }}) = 2\bar{\mathbf {\Lambda }}T_{k-1}(\bar{\mathbf {\Lambda }}) - T_{k-2}(\bar{\mathbf {\Lambda }}), \quad k \geq 2. \\
\end{cases}} \tag{8}
\end{equation*}
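As a numerical illustration (not the authors' code; the helper name `cheb_polynomials` is hypothetical), the recurrence in (8) can be evaluated for a matrix argument as follows:

```python
import numpy as np

def cheb_polynomials(L_bar: np.ndarray, K: int) -> list:
    """Chebyshev polynomials T_0 .. T_{K-1} of a matrix, via the recurrence in (8)."""
    N = L_bar.shape[0]
    T = [np.eye(N), L_bar.copy()]          # T_0 = I, T_1 = L_bar
    for k in range(2, K):
        T.append(2.0 * L_bar @ T[k - 1] - T[k - 2])
    return T[:K]
```

For a 1 × 1 matrix this reduces to the scalar Chebyshev polynomials, e.g., $T_{3}(x) = 4x^{3} - 3x$.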
According to (6) and (7), and noting that $\mathbf {U}T_{k}(\bar{\boldsymbol{\Lambda }})\mathbf {U}^{T} = T_{k}(\bar{\mathbf {L}})$ with $\bar{\mathbf {L}} = 2\mathbf {L}/\lambda _{\max } - \mathbf {I}_{N}$, we have
\begin{equation*}
g_{\theta }*_\mathcal {G} \mathbf {x} = \sum _{k=0}^{K-1}\theta _{k}T_{k}(\bar{\mathbf {L}}) \mathbf {x} \tag{9}
\end{equation*}
Truncating the expansion at first order (keeping only the terms $k \leq 1$) with $\lambda _{\max } \approx 2$ and the normalized Laplacian $\mathbf {L} = \mathbf {I}_{N} - \mathbf {D}^{-\frac{1}{2}}\mathbf {W}\mathbf {D}^{-\frac{1}{2}}$ yields
\begin{align*}
g_{\theta }*_{\mathcal {G}} \mathbf {x} &= \theta _{0}\mathbf {x} + \theta _{1}\left(\mathbf {L} - \mathbf {I}_{N}\right) \mathbf {x}\\
&= \theta _{0}\mathbf {x} - \theta _{1}\mathbf {D}^{-\frac{1}{2}}\mathbf {W}\mathbf {D}^{-\frac{1}{2}}\mathbf {x}. \tag{10}
\end{align*}
The above (10) has two trainable parameters $\theta _{0}$ and $\theta _{1}$; to restrain overfitting and reduce computation, a single parameter $\theta = \theta _{0} = -\theta _{1}$ is used, giving
\begin{equation*}
g_{\theta }*_{\mathcal {G}} \mathbf {x} = \theta \left(\mathbf {I}_{N} + \mathbf {D}^{-\frac{1}{2}} \mathbf {W}\mathbf {D}^{-\frac{1}{2}}\right)\mathbf {x}. \tag{11}
\end{equation*}
The eigenvalues of $\mathbf {I}_{N} + \mathbf {D}^{-\frac{1}{2}}\mathbf {W}\mathbf {D}^{-\frac{1}{2}}$ lie in $[0, 2]$, which can cause numerical instabilities when layers are stacked. Using the renormalization trick $\widetilde{\mathbf {W}} = \mathbf {W} + \mathbf {I}_{N}$ and $\widetilde{\mathbf {D}}_{ii} = \sum _{j}\widetilde{\mathbf {W}}_{ij}$, we obtain
\begin{equation*}
g_{\theta }*_{\mathcal {G}} \mathbf {x} = \theta \left(\widetilde{\mathbf{D}}^{-\frac{1}{2}} \widetilde{\mathbf{W}} \widetilde{\mathbf{D}}^{-\frac{1}{2}}\right) \mathbf {x}. \tag{12}
\end{equation*}
The input will be extended from a single graph signal in the spatial domain to the spatiotemporal domain, giving the signal matrix $\mathbf {X}_{t}$, whose graph convolution output is
\begin{equation*}
\mathbf {H}_{t} = \widetilde{\mathbf{D}}^{-\frac{1}{2}} \widetilde{\mathbf{W}} \widetilde{\mathbf{D}}^{-\frac{1}{2}} \mathbf {X}_{t} \Theta _{t} \tag{13}
\end{equation*}
where $\Theta _{t}$ is the matrix of trainable filter parameters. Writing $\hat{\mathbf {W}} = \widetilde{\mathbf{D}}^{-\frac{1}{2}} \widetilde{\mathbf{W}} \widetilde{\mathbf{D}}^{-\frac{1}{2}}$, (13) becomes
\begin{equation*}
\mathbf {H}_{t} = \hat{\mathbf {W}}\mathbf {X}_{t}\Theta _{t}. \tag{14}
\end{equation*}
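To make (12)–(14) concrete, here is a minimal NumPy sketch of one propagation step; the function name and the toy dimensions are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def gcn_propagate(W: np.ndarray, X: np.ndarray, Theta: np.ndarray) -> np.ndarray:
    """One graph-convolution step H = D~^{-1/2} W~ D~^{-1/2} X Theta, as in (13)/(14)."""
    N = W.shape[0]
    W_tilde = W + np.eye(N)                    # renormalization trick: add self-loops
    d = W_tilde.sum(axis=1)                    # degrees of the self-looped graph
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    W_hat = D_inv_sqrt @ W_tilde @ D_inv_sqrt  # symmetric normalization
    return W_hat @ X @ Theta
```

For a two-node graph with a single edge, identity features, and identity weights, every entry of the output equals 0.5, since the normalized self-looped adjacency averages each node with its neighbor.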
It has been shown that the brain does not activate only one area during MI; rather, several areas work together. In previous studies, Sun et al. [11] proposed constructing the adjacency matrix of the graph by connecting one channel to its surrounding neighboring channels in the standard 10/20 system arrangement, and Zhang et al. [19] proposed constructing the adjacency matrix using the 3-D spatial information of the natural EEG channel connections. Although the abovementioned methods provide rough descriptions of the connectivity of the brain regions where the EEG channels are located, they require artificial prior knowledge as input. Such static adjacency matrices cannot reflect the subject-specific connectivity of brain regions during MI in real-world situations; therefore, the CARM initially connects each channel to all remaining channels as
\begin{equation*}
\mathbf {W}^{*}_{ij}=\left\lbrace \begin{array}{l} 1, \quad i\ne j \\
0, \quad i =j \end{array} \right. \tag{15}
\end{equation*}
where $\mathbf {W}^{*}$ initializes the learnable adjacency matrix $\hat{\mathbf {W}}^{*}$. During training, $\hat{\mathbf {W}}^{*}$ is updated by backpropagation; the gradient of the loss with respect to $\hat{\mathbf {W}}^{*}$ is
\begin{equation*}
\frac{\partial \text{Loss}}{\partial \hat{\mathbf {W}}^{*}} = \left(\begin{array}{ccc}\frac{\partial \text{Loss}}{\partial \hat{\mathbf {W}}^{*}_{11}} & \cdots & \frac{\partial \text{Loss}}{\partial \hat{\mathbf {W}}^{*}_{1N}}\\
\vdots & \ddots & \vdots \\
\frac{\partial \text{Loss}}{\partial \hat{\mathbf {W}}^{*}_{N1}} & \cdots & \frac{\partial \text{Loss}}{\partial \hat{\mathbf {W}}^{*}_{NN}} \\
\end{array} \right) \tag{16}
\end{equation*}
and $\hat{\mathbf {W}}^{*}$ is then updated with learning rate and decay coefficient $\rho $ as
\begin{equation*}
\hat{\mathbf {W}}^{*} = (1 - \rho)\hat{\mathbf {W}}^{*} - \rho \frac{\partial \text{Loss}}{\partial \hat{\mathbf {W}}^{*}} \tag{17}
\end{equation*}
Finally, replacing the fixed $\hat{\mathbf {W}}$ in (14) with the learnable $\hat{\mathbf {W}}^{*}$, the output of the CARM is
\begin{equation*}
\mathbf {H}_{t} = \hat{\mathbf {W}}^{*}\mathbf {X}_{t}\Theta _{t}. \tag{18}
\end{equation*}
CARM does not require prior knowledge of the adjacency matrix and can correct the connection relations between different EEG channels in a subject-specific manner, improving the ability of graph convolution to extract relationships between EEG channels.
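A minimal sketch of the CARM adjacency handling follows: initialization per (15) and the decayed update per (17). The gradient matrix would come from backpropagation; here it is a zero placeholder, and the toy channel count is an assumption:

```python
import numpy as np

N = 4                                      # number of EEG channels (toy value)
# Eq. (15): start fully connected, with no self-loops.
W_star = np.ones((N, N)) - np.eye(N)
rho = 0.01                                 # learning-rate / decay coefficient
grad = np.zeros((N, N))                    # placeholder for dLoss/dW* from backprop
# Eq. (17): weight-decayed gradient step on the learnable adjacency.
W_star = (1 - rho) * W_star - rho * grad
```

With a zero gradient, the update reduces to pure decay: the diagonal stays 0 and each off-diagonal weight shrinks from 1 to 0.99.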
Algorithm 1: Training Procedure of EEG-ARNN.
Input: EEG trial
Output: Model prediction
Initialization of model parameters
repeat
repeat
Calculating the output of the TFEM
Calculating the output of the CARM
until
Calculating the results of the final TFEM
Flattening the features obtained in the previous step and calculating the predictions of the fully connected layer
Calculating
Updating the model parameters, including the learnable matrix
\begin{equation*}
\hat{\mathbf {W}}^{*} = (1 - \rho)\hat{\mathbf {W}}^{*} - \rho \frac{\partial \text{Loss}}{\partial \hat{\mathbf {W}}^{*}}
\end{equation*}
until
B. Temporal Feature Extraction Module
In previous work, amplitude–frequency features have been widely used for EEG signal classification owing to their high discriminability. However, extracting amplitude–frequency features increases the computation time of the model and may lose information from important frequency bands. Therefore, we design the CNN-based TFEM, which performs feature extraction directly in the time domain. There are four TFEMs in our framework. The first TFEM consists of convolution, batch normalization (BN), an exponential linear unit (ELU), and dropout. The kernel size and stride of the first TFEM are (1, 16) and (1, 1), respectively. The input data dimension is specified as
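A minimal PyTorch sketch of one such block is given below; the number of output feature maps, the padding, and the class name `TFEM` are assumptions for illustration, since only the kernel size (1, 16) and stride (1, 1) are specified above:

```python
import torch
import torch.nn as nn

class TFEM(nn.Module):
    """Temporal feature extraction block: Conv -> BatchNorm -> ELU -> Dropout.
    The (1, 16) kernel slides along the time axis only, leaving electrodes intact."""
    def __init__(self, in_ch=1, out_ch=8, kernel=(1, 16), p=0.25):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=kernel, stride=(1, 1),
                      padding=(0, kernel[1] // 2)),
            nn.BatchNorm2d(out_ch),
            nn.ELU(),
            nn.Dropout(p),
        )

    def forward(self, x):
        # x: (batch, 1, n_electrodes, n_timepoints)
        return self.block(x)
```

Because the kernel spans only the time axis, the electrode dimension is left untouched, so channel relationships remain available for the following CARM.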
C. Network Architecture Details
The EEG-ARNN consists of three main modules: CARM, TFEM, and a fully connected layer. Except for the fourth TFEM, each TFEM, which extracts temporal EEG features, is connected to a CARM to form a TFEM-CARM block. The fourth TFEM is used to compress the channel features and feed them into the fully connected layer. Since a Softmax activation function is applied to the output of the EEG-ARNN, the cross-entropy loss CE
D. EEG Channels Selection
How to select the EEG channels that are most beneficial for MI-EEG tasks is important for BCI systems. CARM solves the problem of the lack of a priori knowledge of the graph structure formed by the EEG channels. In addition, the dynamically adjustable adjacency matrix
Schematic representation of the results of selecting 4 channels from 64-channel EEG data using (a) ES and (b) AS methods. The corresponding adjacency matrices are illustrated as well.
1) Edge-Selection
In the dynamically adjustable adjacency matrix $\hat{\mathbf {W}}^{*}$, the entry $f_{i,j}$ measures the learned connection strength from channel $i$ to channel $j$. The importance of the edge between nodes $i$ and $j$ is therefore scored as
\begin{equation*}
\delta _{i,j} = |f_{i,j}| + |f_{j,i}|, \quad i\ne j. \tag{19}
\end{equation*}
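A possible NumPy realization of ES based on (19) is sketched below; the greedy truncation policy when an edge straddles the channel budget is our assumption, not a detail from the paper:

```python
import numpy as np

def edge_selection(W_star: np.ndarray, n_channels: int) -> np.ndarray:
    """Eq. (19): score each edge by |f_ij| + |f_ji|, then keep the channels
    touched by the strongest edges until n_channels distinct nodes are chosen."""
    N = W_star.shape[0]
    scores = np.abs(W_star) + np.abs(W_star).T       # symmetric edge scores
    iu = np.triu_indices(N, k=1)                     # each undirected edge once
    order = np.argsort(scores[iu])[::-1]             # strongest edges first
    selected = []
    for idx in order:
        for node in (iu[0][idx], iu[1][idx]):
            if node not in selected:
                selected.append(node)
        if len(selected) >= n_channels:
            break
    return np.array(selected[:n_channels])
```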
2) Aggregation-Selection
The above ES roughly describes the strength of the connection between two nodes but does not take into account the aggregating cooperation between a node and all of its neighboring nodes. To circumvent this issue, the AS method is proposed. For node $i$, the aggregation strength is defined as
\begin{equation*}
\tau _{i} = \sum _{j=1}^{N}|f_{i,j}| + |d_{i}| \tag{20}
\end{equation*}
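Correspondingly, AS based on (20) can be sketched as follows, interpreting $d_{i}$ as the diagonal self-term of the learned adjacency (an assumption on our part):

```python
import numpy as np

def aggregation_selection(W_star: np.ndarray, M: int) -> np.ndarray:
    """Eq. (20): score node i by sum_j |f_ij| (off-diagonal aggregation) plus
    the self-term |d_i|, then keep the M highest-scoring channels."""
    N = W_star.shape[0]
    off_diag = np.abs(W_star) * (1 - np.eye(N))      # zero out the diagonal
    tau = off_diag.sum(axis=1) + np.abs(np.diag(W_star))
    return np.argsort(tau)[::-1][:M]                 # top-M nodes
```

Unlike ES, which returns the endpoints of the strongest edges, this directly returns exactly `M` channels ranked by aggregation strength.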
Experiments and Results
A. Experimental Protocol and Data Preprocessing
TJU dataset: Experiments were conducted with 25 right-handed students (12 men and 13 women) at Tianjin University, with an average age of 25.3 years (range, 19–32). None of them had a personal or family history of neurological illness. In addition, participants were asked not to take psychotropic drugs for two days before the experiment and to get at least 7 h of sleep the night before the experiment, to avoid interference with the experiment. All recording procedures were approved by the China Rehabilitation Research Center Ethics Committee (No. CRRC-IEC-RF-SC-005-01). The EEG signals were acquired using the Neuroscan system, which consists of 64 Ag/AgCl scalp electrodes arranged according to the 10/20 system. The sampling frequency was set at 1000 Hz and could be downsampled during the preprocessing phase. Before the experiment, the electrode impedance was tuned to below 5 kΩ.
BCICIV 2a dataset [33]: The BCICIV 2a dataset collects EEG signals of 22 nodes recorded from nine healthy subjects. For each subject, two sessions of data were collected on two different days. Each session comprises 288 MI trials per subject. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 and 100 Hz by the dataset provider before release. In our experiment, for fairness of comparison, only left-hand and right-hand movement trials are included to validate the performance of the model, which results in 288 trials (144 trials × 2 sessions) per subject. The sampling rate was reduced to 128 Hz, with 4 s of data resulting in 512 time points.
PhysioNet dataset [34]: The PhysioNet dataset contains EEG data collected from 109 healthy subjects who were asked to imagine opening and closing the left/right fist, recorded with 64 channels at a sampling rate of 160 Hz. However, due to damaged recordings with multiple consecutive "rest" sections, the data of subjects #88, #89, #92, and #100 are removed. Thus, in this experiment, we have EEG data from 105 subjects, each providing approximately 43 trials, with a roughly balanced ratio between the two classes. Each trial consists of 3.2 s, resulting in 512 time points. We do not perform any additional preprocessing on the EEG data.
B. Baselines and Comparison Criteria
The computer hardware used in this article includes an NVIDIA Titan Xp GPU and an Intel Core i7 CPU. The proposed model is built and evaluated in PyTorch [35] under Python 3.5. For the TJU and BCICIV 2a datasets, the data of each subject are used to train and evaluate the model separately. 10-fold cross-validation is applied to each model: the trials are randomly divided into 10 equal-sized parts, nine of which are used as the training set while the remaining part is used as the test set. The average classification accuracy over the 10 test folds is reported as the final accuracy. For the PhysioNet dataset, the data partitioning is consistent with [19]: ten of the 105 subjects are randomly chosen as the test set and the rest as the training set. We run the experiments 10 times and report the averaged results.
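The 10-fold protocol described above can be sketched as follows (a generic split helper, not the authors' code; the function name is ours):

```python
import numpy as np

def ten_fold_splits(n_trials: int, seed: int = 0):
    """Randomly split trial indices into 10 near-equal folds; each fold serves
    once as the test set while the other nine form the training set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_trials)
    folds = np.array_split(idx, 10)
    for k in range(10):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(10) if j != k])
        yield train, test
```

Averaging the test accuracy over the 10 yielded splits gives the final per-subject figure reported in the tables.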
A total of five baselines are chosen to compare classification accuracy with the proposed EEG-ARNN, including FBCSP [36], CNN-SAE [9], EEGNet [10], ACS-SE-CNN [11], and the graph-based G-CRAM [19]. To ensure the reliability of our experiments, we set the batch size to 20 and train for 500 epochs for all deep learning methods. We use the Adam optimizer with a learning rate of 0.001. The dropout rate is set to 0.25.
C. Classification Performance Comparisons
To evaluate the proposed EEG-ARNN, we first run FBCSP, CNN-SAE, EEGNet, ACS-SE-CNN, G-CRAM, and EEG-ARNN in sequence on the TJU dataset of 25 subjects. The experimental results are shown in Table I. The average accuracies of the six methods are 67.5%, 74.7%, 84.9%, 87.2%, 71.5%, and 92.3%, respectively. EEG-ARNN provides a 24.8% improvement over FBCSP and a 17.4% improvement over CNN-SAE in terms of average accuracy; compared with these two methods, the improvement is substantial. As for EEGNet and ACS-SE-CNN, the average accuracy improvements of EEG-ARNN are 7.4% and 5.1%, respectively. Compared with the graph-based G-CRAM method, our average accuracy improves by 17.2%. G-CRAM is designed to handle cross-subject datasets, so the dataset size of a single subject limits its performance; this also shows that our method can deal with small datasets. Moreover, the average standard deviation (std) of the 10-fold cross-validation accuracies for EEG-ARNN is 3.0%, which is less than that of FBCSP (std = 7.9%), EEGNet (std = 5.0%), CNN-SAE (std = 5.7%), ACS-SE-CNN (std = 5.0%), and G-CRAM (std = 3.9%), which proves that EEG-ARNN is quite robust on EEG recordings. Table I also reports the F1-score, which indicates that the proposed model outperforms the other methods. In addition, EEG-ARNN outperforms FBCSP, EEGNet, and CNN-SAE on all 25 subjects, and performs better on 24 out of 25 subjects compared with ACS-SE-CNN and G-CRAM. Moreover, statistical significance is assessed by the Wilcoxon signed-rank test between each algorithm and EEG-ARNN, as shown in Fig. 3. The results show that EEG-ARNN dominates all algorithms in terms of average accuracy. The differences are significant except for EEG-ARNN versus ACS-SE-CNN, where EEG-ARNN performs only slightly better.
Mean classification performance (%) of each algorithm averaged across all 25 subjects from the TJU dataset. *** and * above certain lines denote that the performance of EEG-ARNN was significantly better than that of the corresponding algorithm at the 0.005 and 0.1 levels, respectively.
We also validate the performance of the proposed method on two widely used public datasets. Tables II and III list the classification accuracy, standard deviation, and F1-score of the proposed and baseline methods on the BCICIV 2a and PhysioNet datasets, respectively. It can be observed that the overall performance of EEG-ARNN is also competitive on public datasets. For the BCICIV 2a dataset, the average classification accuracy outperforms all baseline methods, including the traditional method, the CNN-based methods, and the graph-based method; in addition, the classification accuracy and F1-score of EEG-ARNN are higher than those of the other baselines for more than two-thirds of the subjects. For the PhysioNet dataset, as shown in Table III, the proposed method achieves the highest average accuracy and F1-score among all baseline methods. Furthermore, the average standard deviation of EEG-ARNN over the repeated runs is lower than that of four of the five baseline models. These results indicate that our proposed method is also competitive on cross-subject datasets.
D. Ablation Experiments
In this section, ablation experiments are conducted to identify the contribution of the key components of the proposed method (the part inside the black dashed line in Fig. 1). The training method and parameter settings for the ablation experiments remain the same as those in Section III-B.
We consider three cases on the TJU dataset, i.e., retaining only the TFEM or the CARM, using different numbers of TFEM-CARM blocks, and switching the order of the TFEM and the CARM. The average classification accuracies, standard deviations, and F1-scores in the three cases for all subjects are listed in Table IV. The accuracies of the EEG-ARNN without CARM or TFEM decrease considerably compared to the proposed method. When the CARM is removed, the model loses the update mechanism on
Therefore, singular temporal or spatial feature is insufficient to describe complex physiological activities, and fewer TFEM-CARM blocks are not enough to extract effective spatiotemporal feature. Furthermore, the advantage of using TFEM and CARM alternately is to guarantee that corresponding spatiotemporal features can be extracted from the feature map at various scales, due to the fact that the neural activities of different subjects often exhibit diversified spatiotemporal interactions. The result of ablation experiments demonstrates that our EEG-ARNN is a preferable model to comprehensively leverage spatiotemporal feature for MI classification task.
E. Results of ES and AS
In order to further generalize the model, we use the trained
The results of AS method are shown in Table V. We observe that when the number of channels is reduced to 10, the average accuracy of the results is
For the ES method, the average accuracy is
AS is a node-based, direct channel selection method. AS selects a number of channels equal to the specified value
F. Relation Between ES/AS and Neurology
To reveal which channels play a major role in the EEG acquisition process and to explore the relationship between the brain regions where the channels are located and the MI experiment, two channel selection methods were designed in Section II-D. We further investigate what the EEG channels selected using ES and AS can indicate and whether the structures shown in Figs. 4 and 5 match neurological concepts. We first extract the channels obtained from the top-20 experiments of the 25 subjects in the TJU dataset and list the channels selected most frequently by the ES and AS methods in Fig. 4(a) and (b); Fig. 4(c) and (d) show the distribution of these channels over the scalp electrodes. It can be seen that "C1," "C3," "CZ," "CP1," "CP3," and other electrodes related to motor imagery are selected several times by both methods, and some electrodes are chosen for more than two-thirds of the subjects, which indicates that the proposed channel selection methods have neurophysiological significance. Then, the edge/node structures of two subjects are selected and plotted using BrainNet Viewer [37]. According to Table I, the data of subject No.17 achieve excellent results on the six different classifiers, whereas subject No.23 has poor data quality. Based on this premise, we select the 20 edges and 20 nodes with the highest weights following the method of Section II-D.
Frequency and distribution of channels selected by the ES/AS methods among 25 subjects in the TJU dataset. (a) Most frequent channels selected by ES. (b) Most frequent channels selected by AS. (c) Distribution of channels selected by ES. (d) Distribution of channels selected by AS.
Top-20 edges/nodes drawn by the ES and AS methods for subjects No.17 and No.23, respectively. (a) Edge-selection (No.17). (b) Edge-selection (No.23). (c) Aggregation-selection (No.17). (d) Aggregation-selection (No.23).
As shown in Fig. 5(a), the selected edges of subject No.17 are mainly in the left hemisphere, and the most frequent channel is "CP3," followed by "CP1," "CZ," and other channels. Human high-level senses (e.g., somatosensory and spatial sensation) are mainly processed by the parietal lobe, where electrode "CP3" is located. In the MI experiment, subjects did not produce actual movements but only imagined movements based on cues on the screen, which requires the sensation of movement for which the parietal lobe is responsible. For the AS-selected channels shown in Fig. 5(c), the channel locations are similar to those selected using ES, with the channels mainly distributed in the left hemisphere. It is worth noting that ES selects the edges with the largest weights and then selects the EEG channels located at both ends of each edge; therefore, the number of EEG channels selected by ES is usually smaller than that selected by AS. For subject No.17, 20 nodes were selected using AS, while only 11 nodes were selected using ES, but the corresponding accuracy decreased by only 0.3%, as shown in Table VI.
The channel connections of subject No.23 are shown in Fig. 5(b), with more channels located in the right hemisphere, except for the "CP3" channel, which still plays an important role. In contrast to subject No.17, who required only 11 channels, the distribution of channels for No.23 is more dispersed. The EEG channels selected using AS show the same property in Fig. 5(d), with channels mostly distributed in the right hemisphere, while a few channels related to the sensation of movement, such as "FC5" and "PO3," are also selected. The "C"-series channels (CZ, C1, C2, ...) are mainly located over the precentral gyrus, whose neurons are primarily responsible for human movement. Most of the high-weight channels are "C"-series channels for subject No.17, whereas the distribution of the high-weight channels of subject No.23 is disorderly. The mean accuracies of subjects No.17 and No.23 are shown in Table VI. This further reveals the relationship between the ES/AS channels obtained through EEG-ARNN and the subjects' performance in the MI experiment. During the MI experiment, subject No.17 was energetically focused, while subject No.23 had problems such as lack of concentration during imagery. This confirms that the vital features of MI are captured by EEG-ARNN and demonstrates the importance of the proposed EEG-ARNN in revealing the working state of different brain regions of the subjects.
Conclusion
This article proposed a novel hybrid deep framework called EEG-ARNN based on CNN and GCN for MI-EEG classification, which integrates channel information dynamically and extracts features from EEG signals in the time domain. Experimental results on three datasets showed that the proposed EEG-ARNN outperforms SOTA methods in terms of accuracy and robustness. In addition, two channel selection methods, ES and AS, were proposed to select the best channels. Finally, we compared the ES/AS-selected channels with active brain regions, which will help us further understand why subjects differ significantly in their performance on MI tasks.
The proposed model can be further improved by integrating convolution and graph convolution to reduce computational complexity, rather than simply stacking the two operations. In addition, the proposed method was only validated on the MI task; a future direction is to extend EEG-ARNN to other paradigms, such as P300 and SSVEP, and to continue exploring the connection relationships of channels in EEG data. Finally, it would be meaningful to incorporate the proposed model into a real-world BCI and evaluate its performance online.