Introduction
The cognitive workload of the operator has been widely studied in the fields of human-robot interaction and passive brain-computer interfaces [1], [2]. It is a special case of cognitive states, described as the ratio of the operator's available cognitive resources (e.g., attention resources and working memory capacity) to the resources demanded by the task [3]. Because the brain's cognitive resources are limited, heavy cognitive work in real-world environments can lead to cognitive overload, which in turn degrades task execution and harms the operator's state [4]. As such, it is important to accurately recognize human cognitive workload to prevent accidents and maintain health.
To date, the measurements of cognitive workload can be categorized into subjective and objective measures [5]. Subjective measurements are mainly based on the operator's perceived feeling and rating scales, e.g., the National Aeronautics and Space Administration-Task Load Index (NASA-TLX) [6]. Although these measurements are easy to implement, they are post-hoc evaluations and cannot yield objective or real-time results [7]. Objective measurements, in contrast, mainly rely on physiological signals recorded during the task and therefore interfere less with task execution [5]. Among various physiological signals, the electroencephalogram (EEG) has been widely used and studied due to its high temporal resolution, safety, and convenience. Moreover, its effectiveness has been validated for detecting cognitive workload during the execution of workload-related cognitive tasks [8]. Hence, we concentrate on EEG-based cognitive workload recognition.
EEG signals vary across subjects and/or tasks, mainly showing intra-subject, inter-subject, and inter-task variabilities. These variabilities correspond to subject-dependent, cross-subject (or subject-independent), and cross-task (or task-independent) models, respectively [7]. The subject-dependent model is trained and tested on data from the same subject; given the large variability between subjects, the subject-dependent study can be viewed as the standard cognitive workload recognition design [9]. Many subject-dependent methods have been constructed and have achieved acceptable recognition performance [10], [11]. The inter-subject and inter-task variabilities are typically more challenging and complex, and the corresponding models perform worse than subject-dependent ones. If successful, however, they would enable and improve cognitive state monitoring in real-world environments [7], [9], so it is crucial to alleviate the cross-subject and cross-task issues. The cross-subject model is trained on data from one or a group of subjects and tested on data from a new subject; in cross-subject research, acceptable results have been obtained thanks to the development of optimal EEG feature extraction and classification models [12], [13].
The cross-task model is trained on one task and tested on another similar but different task. Although different cognitive tasks may recruit different cognitive resources, cognitive workload concerns the amount of occupied cognitive resources rather than the specific resources involved. Therefore, for a practical design, it should be possible to construct a generalizable, cross-task cognitive workload recognition model that is capable of recognizing workload across various tasks [14]. However, cross-task cognitive workload recognition performs worse than the subject-dependent model [14]. Across cognitive activities elicited by different tasks, the mismatched workload between the training and testing data [15], the highly dissimilar EEG patterns [16], the non-stationary characteristics of EEG data [15], [17], and further the distribution variabilities between the EEG data cause considerable difficulties for cross-task cognitive workload recognition. Existing studies assumed that a set of invariant features exists across tasks, searched for such common features, and then constructed task-independent models [8], [14]–[16]; this strategy may ignore the discrepancy between different tasks and thus has limited performance. To deal with the above problems, we propose to apply unsupervised domain adaptation to establish the cross-task cognitive workload recognition model, aiming to reduce the distribution discrepancy and improve the generalized classification accuracy across tasks. Given sufficient labeled source samples and unlabeled target samples, unsupervised domain adaptation transfers knowledge from the source domain to the target domain and tries to train a classifier that works well on the target domain; it reduces the distribution discrepancy between the source and target data, thus making them similar [18], [19]. To our knowledge, domain adaptation has rarely, if ever, been applied to cross-task cognitive workload recognition.
In this paper, we propose a new framework for EEG-based cross-task cognitive workload recognition using domain adaptation. The proposed framework is implemented under three transfer schemes: the same one-to-one, the various one-to-one, and the many-to-one cross-task transfers. We mainly explore four domain adaptation methods as a preliminary study of the new framework. All of these methods assume that shared feature representations exist between the source and target domains while reducing the distribution gap; they differ in how they treat the marginal and conditional distributions between domains. We then compare the performance of these methods on a private EEG dataset with two different tasks, constructing the workload recognition model as a binary classification problem. Assuming that the two tasks involved in a cross-task study should share some brain mechanisms but also present distinctive activations, as suggested and applied in [8], [20], [21], we use the Sternberg Working Memory task (denoted as the WM task) and the Mathematics Addition task (denoted as the MA task) to elicit cognitive workload states.
In Figure 1, we display the general framework of cross-task cognitive workload recognition, including the cross-task design, EEG data acquisition, data preprocessing, feature extraction, domain adaptation, and classification. Here, we take task A (e.g., the WM task) as source data to train the models, and task B (e.g., the MA task) as the target data to test the models.
The proposed framework of cross-task cognitive workload recognition using domain adaptation.
The major contributions of this work are three-fold. First, we design a workload paradigm including working memory and mathematical addition tasks with a fine-grained level partition. Second, we propose to use domain adaptation to reduce the distribution discrepancy and improve the classification accuracy. Third, we evaluate the proposed method on a real EEG dataset, with results demonstrating the superiority of our method over non-transfer methods.
The rest of the work is organized as follows. Section II briefly introduces the concepts of domain adaptation and related methods. Section III introduces EEG data recordings during two different cognitive workload tasks. The results are compared and presented in Section IV to evaluate the performance of the proposed methods. Section V discusses the major findings. Finally, Section VI concludes the whole paper.
Methods
We adopt unsupervised domain adaptation, without using labeled samples from the target domain, to cope with the task-to-task variability when building EEG-based cross-task cognitive workload recognition models.
The EEG data collected from one task of each subject are viewed as a domain, defined as $\mathcal{D}=\{\mathcal{X}, P(X)\}$, where $\mathcal{X}$ is the feature space and $P(X)$ is the marginal distribution of the samples $X\in\mathcal{X}$.

The source domain contains the labeled EEG samples of one task and the target domain contains the unlabeled EEG samples of the other task. Given a labeled source domain $\mathcal{D}_{S}=\{(x_{i}^{S}, y_{i}^{S})\}_{i=1}^{n_{S}}$ and an unlabeled target domain $\mathcal{D}_{T}=\{x_{j}^{T}\}_{j=1}^{n_{T}}$, unsupervised domain adaptation aims to learn a classifier from $\mathcal{D}_{S}$ that predicts the labels of the target samples, under the assumption that the feature spaces are shared but the distributions differ, i.e., $P(X_{S})\neq P(X_{T})$.
In the following, we will briefly introduce four domain adaptation methods used in the paper. These methods mainly focus on the shared feature representation and minimization of the distribution discrepancy.
A. Transfer Component Analysis
Transfer component analysis (TCA) tries to reduce the distribution discrepancy by embedding the source and target domains into a shared low-dimensional feature space and learning a set of transfer components [23]; it can be seen as a dimensionality reduction method. To achieve this goal, Pan et al. proposed to find a transformation matrix $W$ by solving
\begin{align*}
&\min _{W}~\mathrm{tr}\left(W^{T}KLKW\right)+\mu\,\mathrm{tr}\left(W^{T}W\right), \\
&{\it s.t.}~W^{T}KHKW=I_{m},\tag{1}
\end{align*}
where $K$ is the kernel matrix over the combined source and target samples, $L$ is the maximum mean discrepancy (MMD) matrix, $H$ is the centering matrix, $I_{m}$ is the identity matrix, and $\mu$ is a trade-off parameter.
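As a rough illustration of how the TCA projection can be computed, the following NumPy/SciPy sketch solves (1) as a generalized eigenproblem under a linear-kernel assumption; the variable names (Xs, Xt, mu, dim) are illustrative and are not taken from the paper.

```python
import numpy as np
import scipy.linalg

def tca(Xs, Xt, dim=30, mu=1.0):
    """Embed source Xs (ns x d) and target Xt (nt x d) into a shared subspace."""
    X = np.vstack([Xs, Xt])
    ns, nt, n = len(Xs), len(Xt), len(X)
    K = X @ X.T                              # linear kernel matrix (n x n)
    # MMD matrix L: 1/ns^2 (source-source), 1/nt^2 (target-target), -1/(ns*nt) (cross)
    e = np.vstack([np.ones((ns, 1)) / ns, -np.ones((nt, 1)) / nt])
    L = e @ e.T
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    # Minimize tr(W'(KLK + mu*I)W) subject to W'KHKW = I -> generalized eigenproblem
    a = K @ L @ K + mu * np.eye(n)
    b = K @ H @ K
    w, V = scipy.linalg.eig(a, b)
    idx = np.argsort(w.real)[:dim]           # transfer components: smallest eigenvalues
    W = V[:, idx].real
    Z = K @ W                                # embedded samples (n x dim)
    return Z[:ns], Z[ns:]                    # source and target embeddings
```

A downstream classifier (e.g., an SVM) can then be trained on the embedded source samples and applied to the embedded target samples.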
B. Joint Distribution Adaptation
Joint distribution adaptation (JDA) adapts both the marginal and conditional distributions in a principled dimensionality reduction procedure [25], approximating the domain divergence as
\begin{align*}
D\left(\mathcal{D}_{S},\mathcal{D}_{T}\right)&\approx D\left(P\left(X_{S}\right),P\left(X_{T}\right)\right) \\
&\quad+D\left(P\left(y_{S}\mid X_{S}\right),P\left(y_{T}\mid X_{T}\right)\right),\tag{2}
\end{align*}
Similar to TCA, JDA tries to find a transformation matrix $W$ by solving
\begin{align*}
&\min _{W}~\mathrm{tr}\left(W^{T}KLK^{T}W\right)+\sum _{c=1}^{C}\mathrm{tr}\left(W^{T}KM_{c}K^{T}W\right)+\lambda\left\Vert W\right\Vert _{F}^{2}, \\
&{\it s.t.}~W^{T}KHK^{T}W=I,\tag{3}
\end{align*}
where $M_{c}$ is the class-conditional MMD matrix of class $c$, built from the source labels and the pseudo-labels of the target samples, and $\lambda$ is a regularization parameter.
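The class-conditional MMD matrices $M_{c}$ in (3) require target labels, which are unavailable; JDA therefore uses pseudo-labels predicted by a classifier trained on the projected source data and refines them iteratively. A minimal sketch of the $M_{c}$ construction is given below; `ys`, `yt_pseudo`, and `classes` are illustrative names.

```python
import numpy as np

def conditional_mmd_matrices(ys, yt_pseudo, classes):
    """Return one (n x n) matrix M_c per class c, with n = len(ys) + len(yt_pseudo)."""
    ns, nt = len(ys), len(yt_pseudo)
    n = ns + nt
    mats = []
    for c in classes:
        e = np.zeros((n, 1))
        src_c = np.where(ys == c)[0]              # source samples of class c
        tgt_c = ns + np.where(yt_pseudo == c)[0]  # target samples pseudo-labelled c
        if len(src_c) > 0:
            e[src_c] = 1.0 / len(src_c)
        if len(tgt_c) > 0:
            e[tgt_c] = -1.0 / len(tgt_c)
        mats.append(e @ e.T)
    return mats
```

In practice, JDA alternates between solving (3) with the current $M_{c}$ and refreshing the target pseudo-labels, typically for a small fixed number of iterations.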
C. Balanced Domain Adaptation
As mentioned above, TCA only considers the marginal distributions of the source and target domains, whereas JDA considers both the marginal and conditional distributions. Although JDA uses more information, it assumes that the marginal and conditional distributions contribute equally to the domain divergence, which is not practical in real-world applications. To address this problem, Wang et al. [26] proposed balanced domain adaptation (BDA), which uses a balance factor $\mu$ to weight the two terms:
\begin{align*}
D\left(\mathcal{D}_{S},\mathcal{D}_{T}\right)&\approx\left(1-\mu\right)D\left(P\left(X_{S}\right),P\left(X_{T}\right)\right) \\
&\quad+\mu\, D\left(P\left(y_{S}\mid X_{S}\right),P\left(y_{T}\mid X_{T}\right)\right),\tag{4}
\end{align*}
leading to the optimization problem
\begin{align*}
&\min _{W}~\mathrm{tr}\left(W^{T}K\left(\left(1-\mu\right)L+\mu\sum _{c=1}^{C}M_{c}\right)K^{T}W\right)+\lambda\left\Vert W\right\Vert _{F}^{2}, \\
&{\it s.t.}~W^{T}KHK^{T}W=I,\quad\mu\in\left[0,1\right].\tag{5}
\end{align*}
The balance factor $\mu$ controls the relative importance of the two terms: when $\mu$ approaches 0, the marginal distribution discrepancy dominates, and when $\mu$ approaches 1, the conditional distribution discrepancy dominates; with $\mu=0.5$, BDA reduces to a JDA-like equal weighting.
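In implementation terms, BDA only changes how the marginal and conditional MMD matrices are combined before the same eigen-solver is applied; a minimal sketch (with illustrative names `L`, `M_list`, `mu`) is:

```python
import numpy as np

def balanced_mmd(L, M_list, mu=0.5):
    """Combine the marginal MMD matrix L and the class-conditional matrices M_c
    with the balance factor mu, as in (5)."""
    M = (1.0 - mu) * L + mu * sum(M_list)
    return M / np.linalg.norm(M, "fro")   # Frobenius normalization, a common stabilization step
```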
D. Transfer Joint Matching
When the source and target domains differ in both feature distribution and sample relevance, transfer joint matching (TJM), proposed by Long et al. [27], copes with this setting. TJM reduces the domain discrepancy by simultaneously matching the marginal feature distributions and reweighting the source samples in a principled dimensionality reduction procedure, constructing a new feature representation that is invariant to both the marginal distribution discrepancy and the irrelevant source samples:
\begin{align*}
&\min _{W}~\mathrm{tr}\left(W^{T}KLK^{T}W\right)+\lambda\left(\left\Vert W_{s}\right\Vert _{2,1}+\left\Vert W_{t}\right\Vert _{F}^{2}\right), \\
&{\it s.t.}~W^{T}KHK^{T}W=I,\tag{6}
\end{align*}
where $W_{s}$ and $W_{t}$ denote the rows of $W$ corresponding to the source and target samples, respectively, and the $\ell_{2,1}$-norm on $W_{s}$ induces row sparsity that down-weights irrelevant source samples.
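The $\ell_{2,1}$-norm in (6) is what performs the source-sample reweighting: rows of $W_{s}$ with small norms are driven toward zero, effectively discarding the corresponding source samples. A small sketch of this term and of the diagonal reweighting matrix used in a typical iterative solver is shown below (illustrative names; not the authors' implementation).

```python
import numpy as np

def l21_norm(W_s):
    """Sum of the l2 norms of the rows of the source block W_s (row sparsity)."""
    return np.sum(np.linalg.norm(W_s, axis=1))

def reweighting_diag(W, ns, eps=1e-8):
    """Diagonal matrix for an iterative solver of (6): source rows are weighted by
    1/(2*||w_i||), while target rows keep a plain Frobenius-type penalty."""
    g = np.ones(len(W))
    row_norms = np.linalg.norm(W[:ns], axis=1)
    g[:ns] = 1.0 / (2.0 * np.maximum(row_norms, eps))
    return np.diag(g)
```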
Data and Experiment
A. Subjects and Experiment Setup
In this study, we invited 45 college students (28 males and 17 females, aged 20 to 30, mean age 24.6 ± 6.6 years) to participate in EEG experiments at the iBRAIN laboratory of Nanjing University of Aeronautics and Astronautics. Except for one subject, all were right-handed. All subjects fulfilled the inclusion criterion of normal or corrected-to-normal vision and did not suffer from any condition that may cause anxiety or fatigue. Participants were required to avoid caffeine, medication, and alcohol, and to have a normal eight hours of sleep before the experiment. After the experimental protocol was explained, signed written consent was obtained from all subjects.
Each subject undertook three tasks: the resting state with eyes closed and eyes open, the WM task, and the MA task. The general experimental design is displayed in Figure 2(A).
The experimental design in this paper, including (A) the general task design, (B) the design of WM task and MA task, and (C) a detailed trial of both tasks at a difficulty level of L3.
In the WM task, subjects needed to memorize letter sequences of different lengths (English letter sequences were used as stimuli), maintain them for 2 seconds, and then judge whether the displayed letter had appeared in the memorized sequence [28]. The WM task consists of seven groups; each group has 20 trials, 30% of which are target stimuli. We set seven difficulty levels, from very low (L1), low (L2), medium (L3), medium-high (L4), high (L5), and very high (L6) to extremely high (L7), with sequence lengths of 1, 2, 4, 6, 8, 10, and 12, respectively. In each group, the levels were presented in random order. After a 1 s blank, the stimulus was presented for 2 s, followed by a fixation cross during a 3 s interval and then a 2 s judgment period.
In the MA task, subjects needed to retain the result of the displayed addition formula (e.g., 5+7) and judge whether the subsequently given number (e.g., 11) matched the result they had calculated. This task involves the temporary storage of intermediate results and the retrieval of information held in the cognitive workspace [11]. Similarly, the MA task consists of seven groups; each group has 20 trials, 30% of which are target stimuli. We set seven difficulty levels, corresponding to the number of digits and carries in the addition. In each group, the levels were presented in random order. After a 1 s blank, each addition was presented for 2 s, followed by a fixation cross for 3 s, and the answer was then displayed for 2 s.
In Figure 2(B), we display the detailed setup for each cognitive task, and Figure 2(C) shows a detailed trial of both tasks at a difficulty level of L3. In Table I, we list the level design and the examples of various difficulty levels.
All task stimuli were shown on a computer screen in white font on a black background. Participants were instructed to focus on both accuracy and speed and practiced five trials of each task before the EEG recordings. Notably, mental fatigue may be induced during the experiment and may contaminate the EEG quality. To mitigate this, we randomized the group order in each task and the order of difficulty levels within each group, and asked the subjects to rest before each task to ease fatigue. The task paradigms were implemented in E-Prime 2.0 software [29].
B. Data Acquisition and Preprocessing
A portable wireless EEG amplifier (NeuSen W64, Neuracle, China) was used for EEG data recording at a sampling rate of 1000 Hz. Fifty-nine electrodes were arranged according to the international 10-20 system, with the reference at CPz and the ground at AFz on the forehead. Electrode impedances were kept below 5 kΩ.
A widely used EEG preprocessing pipeline was adopted in the current study using the EEGLAB software [30]. Specifically, the raw EEG data were first re-referenced to the average of all electrodes, band-pass filtered between 0.1 and 70 Hz to eliminate noise, and notch filtered at 50 Hz to reduce power-line noise. The filtered data were down-sampled to 256 Hz. Signals were then baseline adjusted and segmented into 2 s epochs after stimulus onset. Eye-blink and muscle-related artifacts were removed via independent component analysis (ICA) by rejecting the components highly correlated with those artifacts; the ADJUST [31] and ICLabel [32] tools were used to mark the components. Due to high noise contamination, seven subjects were excluded, leaving 38 subjects (25 males and 13 females, mean age 24.4 ± 5.9 years) for the subsequent analysis. For these subjects, noisy epochs were further removed by visual inspection, leaving 4554 and 4681 trials (each 2 s long) for the WM and MA tasks, respectively. In Table II, we list the corresponding numbers of trials per workload level for the two tasks.
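For readers who prefer Python, a rough MNE-based approximation of the EEGLAB pipeline described above might look as follows; the file name, reader, ICA settings, and excluded component indices are placeholders, and the original study used EEGLAB with ADJUST/ICLabel in MATLAB.

```python
import mne

raw = mne.io.read_raw_brainvision("subject01.vhdr", preload=True)  # hypothetical file/reader
raw.set_eeg_reference("average")               # re-reference to the common average
raw.filter(l_freq=0.1, h_freq=70.0)            # band-pass 0.1-70 Hz
raw.notch_filter(freqs=50.0)                   # suppress power-line noise
raw.resample(256)                              # down-sample to 256 Hz

ica = mne.preprocessing.ICA(n_components=20, random_state=0)
ica.fit(raw)
ica.exclude = [0, 1]                           # blink/muscle component indices (example only)
ica.apply(raw)

events, _ = mne.events_from_annotations(raw)
epochs = mne.Epochs(raw, events, tmin=0.0, tmax=2.0, baseline=None, preload=True)
```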
C. Feature Extraction
For feature extraction, we extracted power spectral density (PSD) features from the spectral dimension [33] and coherence features from the brain connectivity network [34], both of which are widely used in EEG analysis [35], and combined them into the final feature set.
To be specific, the PSD and coherence features are extracted from five clinical frequency bands ($\delta$, $\theta$, $\alpha$, $\beta$, and $\gamma$).
Besides the PSD features, the functional connectivity network can be used to measure the relationship between different EEG electrodes [35] or functional brain regions [37]–[39]. EEG functional connectivity networks can be constructed with coherence [40], which for two channels $x$ and $y$ is defined as
\begin{equation*}
\mathrm{Coh}_{xy}(f)=\frac{\left\vert P_{xy}(f)\right\vert^{2}}{P_{xx}(f)\,P_{yy}(f)},
\end{equation*}
where $P_{xy}(f)$ is the cross-spectral density between $x$ and $y$, and $P_{xx}(f)$ and $P_{yy}(f)$ are the corresponding auto-spectral densities.
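A minimal sketch of the band-wise PSD and pairwise coherence feature extraction with SciPy is given below; the band boundaries are conventional values (not necessarily those used in the paper), the sampling rate follows the 256 Hz resampling, and the function and variable names are illustrative.

```python
import numpy as np
from itertools import combinations
from scipy.signal import welch, coherence

FS = 256
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_mean(f, values, lo, hi):
    """Average a spectral quantity over the [lo, hi) frequency band."""
    return values[(f >= lo) & (f < hi)].mean()

def epoch_features(epoch):
    """epoch: (n_channels, n_samples) array -> 1D feature vector (PSD + coherence)."""
    feats = []
    # PSD features: mean band power per channel
    f, pxx = welch(epoch, fs=FS, nperseg=FS)          # pxx: (n_channels, n_freqs)
    for lo, hi in BANDS.values():
        feats.extend(band_mean(f, p, lo, hi) for p in pxx)
    # Coherence features: mean band coherence per channel pair
    for i, j in combinations(range(len(epoch)), 2):
        f, cxy = coherence(epoch[i], epoch[j], fs=FS, nperseg=FS)
        feats.extend(band_mean(f, cxy, lo, hi) for lo, hi in BANDS.values())
    return np.array(feats)
```

For 59 channels and five bands, such a sketch yields 59 × 5 PSD features plus one coherence feature per channel pair and band for each epoch.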
D. Workload Classification
We aim to build EEG-based cross-task cognitive workload recognition models using domain adaptation techniques. During training for cross-task transfer, the labeled source samples from task A are used to predict the unlabeled target data of task B. Considering different application scenarios, we define three transfer schemes with different focuses.
1) One-to-one cross-task transfer (denoted as $\mathcal{O}\to\mathcal{O}$): the transfer considering the intra-subject task variability, where the data from each subject are used to complete the cross-task cognitive workload recognition. Given one subject, the source domain is the data of task A and the target domain is the data of task B of the same person.

2) Many-to-one cross-task transfer (denoted as $\mathcal{M}\to\mathcal{O}$): the transfer focusing on source combination, where the source domain is the task A data of all subjects and the target domain is the task B data of each individual subject. This scheme makes sense when multiple existing subjects are available and may carry more information than $\mathcal{O}\to\mathcal{O}$.

3) Various one-to-one cross-task transfer (denoted as $\mathcal{V}\mathcal{O}\to\mathcal{O}$): the transfer considering the subject variabilities, where the target domain is the task B data of one subject and the source domain is the task A data enumerated from each of the other subjects.
We thus focus on evaluating the domain adaptation methods for a common binary classification task. We take levels L1, L2, L3 as low workload, and levels L5, L6, L7 as high workload.
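The following sketch illustrates how the binary labels and the transfer schemes above can be assembled; `data[subj][task]` is a hypothetical container holding the feature matrix and difficulty levels for each subject and task, and is not part of the paper.

```python
import numpy as np

LOW, HIGH = {1, 2, 3}, {5, 6, 7}

def binarize(X, levels):
    """Keep L1-L3 (label 0, low) and L5-L7 (label 1, high); drop L4."""
    keep = [i for i, l in enumerate(levels) if l in LOW | HIGH]
    y = np.array([0 if levels[i] in LOW else 1 for i in keep])
    return X[keep], y

def one_to_one(data, src_subj, tgt_subj, src_task="WM", tgt_task="MA"):
    """O->O when src_subj == tgt_subj; V O->O when they differ."""
    Xs, ys = binarize(*data[src_subj][src_task])
    Xt, yt = binarize(*data[tgt_subj][tgt_task])
    return (Xs, ys), (Xt, yt)

def many_to_one(data, tgt_subj, src_task="WM", tgt_task="MA"):
    """M->O: source pools task A from all subjects; target is task B of one subject."""
    pooled = [binarize(*data[s][src_task]) for s in data]
    Xs = np.vstack([x for x, _ in pooled])
    ys = np.concatenate([y for _, y in pooled])
    Xt, yt = binarize(*data[tgt_subj][tgt_task])
    return (Xs, ys), (Xt, yt)
```

The $\mathcal{V}\mathcal{O}\to\mathcal{O}$ scheme simply enumerates `one_to_one` with the source taken from each of the other subjects in turn.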
Here we adopt widely used classifiers as non-transfer baselines, including a support vector machine (SVM) with a radial basis function (RBF) kernel, K-nearest neighbors (KNN), linear discriminant analysis (LDA), and a single-hidden-layer artificial neural network (ANN). The SVM is constructed with the RBF kernel and a soft-margin parameter C, and the parameter of KNN is the number of neighbors; note that this paper only reports the best SVM and KNN results. The LDA classifier is used with the default settings provided by MATLAB, and the ANN is used with 10 hidden units. Based on the preliminary results, we use the SVM with the RBF kernel as the base classifier for the domain adaptation methods. For the TCA, JDA, and TJM methods, the hyper-parameters include the dimension of the subspace and the regularization parameter; BDA additionally involves the balance factor $\mu$.
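As an illustration, the non-transfer baselines can be reproduced with scikit-learn as follows; the hyper-parameter values shown are placeholders rather than the tuned values reported in the paper.

```python
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

baselines = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "LDA": make_pipeline(StandardScaler(), LinearDiscriminantAnalysis()),
    "ANN": make_pipeline(StandardScaler(), MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000)),
}

def evaluate(Xs, ys, Xt, yt):
    """Train each baseline on the source task and score it on the target task."""
    return {name: clf.fit(Xs, ys).score(Xt, yt) for name, clf in baselines.items()}
```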
Experimental Results
The results for recognizing cognitive workload consist of two parts. First, behavioral data including response time and answer accuracy, and event-related spectral perturbation (ERSP) are analyzed to validate the different cognitive workload levels induced by WM and MA tasks. Second, the binary classification results of cross-task workload models are presented, using accuracy and F1 score as performance evaluation metrics.
A. Behavioral Results
We use the WM and MA tasks to explore cross-task cognitive workload recognition. In the experiment, we recorded the response time and answer accuracy as behavioral data. Due to the non-block experimental design and the random order of stimulus presentation [41], we did not collect subjective measurements such as the NASA-TLX questionnaire. Figure 3 presents the behavioral data, together with the corresponding one-way analysis of variance (ANOVA) results, to validate the effectiveness of the two cognitive tasks. This analysis confirms that the easy and hard conditions are distinguishable.
Results of the one-way ANOVA of the behavioral data. Bars represent mean ± standard error. WM task: working memory task; MA task: mathematical addition task; RT: response time.
For both the MA and WM tasks, when the difficulty levels increase, the subjects perform worse as in Figure 3(A) and (B), and take longer to provide the answers as in Figure 3(C) and (D). For the MA task, the accuracy difference between levels 1, 2, 3 and levels 4, 5, 6, 7 (p < 0.01), and the difference between levels 4, 5, 6, and level 7 (p < 0.01) are significant; the difference between levels 1, 2, 3 and the difference between levels 4, 5, 6 are not significant. For response time, the difference between levels 1, 2, 3 and levels 5, 6, 7 (p < 0.01), and the difference between level 4 and level 7 (p < 0.01) are significant; the difference between levels 1, 2, 3, 4, the difference between levels 4, 5, 6 and the difference between levels 5, 6, 7 are not significant. For the WM task, the accuracy difference between levels 1, 2, 3, 4, and levels 5, 6, 7 (p < 0.01) are significant; the difference between levels 1, 2, 3, 4, and the difference between levels 5, 6, 7 are not significant. For response time, the difference between levels 1, 2 and levels 5, 6, 7 (p < 0.01), and the difference between level 1 and levels 3, 4 (p < 0.01) are significant; the difference between levels 1, 2, the difference between levels 3, 4, and the difference between levels 4, 5, 6, 7 are not significant.
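A minimal sketch of the per-task one-way ANOVA over the seven difficulty levels with SciPy is given below; `rt_by_level` is a hypothetical mapping from difficulty level to an array of response times (the same call applies to accuracy).

```python
from scipy.stats import f_oneway

def level_anova(rt_by_level):
    """One-way ANOVA across the seven difficulty levels."""
    groups = [rt_by_level[level] for level in sorted(rt_by_level)]
    f_stat, p_value = f_oneway(*groups)
    return f_stat, p_value
```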
We analyze the EEG using the event-related spectral perturbation (ERSP) [20], [42] and perform one-way ANOVA with 2000 permutations for statistical testing. ERSP provides detailed information on event-related desynchronization/synchronization and visualizes the mean power changes [43]. In Figure 4, we present the ERSP maps for each cognitive task, for each of the seven stimulus conditions, across all 59 electrodes and four EEG frequency bands.
Results of the one-way ANOVA for ERSP analysis. WM task: working memory task; MA task: mathematic addition. For different levels, blue indicates event-related desynchronization, red indicates event-related synchronization. The last column represents p values, with redder color indicating the stronger significance; FDR correction was used for multiple comparisons.
The band-wise ERSP results for the WM task are shown in Figure 4(A), and those for the MA task in Figure 4(B), where significant differences across workload levels are indicated by the p-value maps in the last column.
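As a rough sketch of an ERSP-style computation (time-frequency power expressed in dB relative to a baseline), the following SciPy code averages per-trial spectrograms for one channel; the window length, overlap, and baseline choice are illustrative and may differ from the EEGLAB settings used in the paper.

```python
import numpy as np
from scipy.signal import spectrogram

def ersp(epochs, fs=256):
    """epochs: (n_trials, n_samples) for one channel -> mean dB change map (n_freqs, n_times)."""
    maps = []
    for x in epochs:
        f, t, sxx = spectrogram(x, fs=fs, nperseg=64, noverlap=48)
        baseline = sxx.mean(axis=1, keepdims=True)        # per-frequency baseline power
        maps.append(10 * np.log10(sxx / baseline))        # dB change relative to baseline
    return f, t, np.mean(maps, axis=0)
```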
B. Classification Results
Table V displays the classification accuracy of different methods and Supplementary Table S1 validates the differences between different methods. Here, SVM, LDA, KNN, and ANN are non-transfer methods, whereas TCA, JDA, BDA, and TJM are domain adaptation methods with SVM as the classifier.
Across the three transfer schemes, the domain adaptation methods achieve higher accuracy than the non-transfer baselines, with TJM achieving the best overall performance.
Supplementary Table S1 further validates the differences between the methods using Dunn's test with multiple-comparison correction.
Accuracy distributions for each algorithm and the differences between TJM and the other methods.
Discussion
The discussion of cross-task cognitive workload recognition consists of five parts. First, the classification with different numbers of source subjects and different subspace dimensions for TJM is examined in Figure 6. Second, the accuracies obtained with PSD and coherence features are compared in Figure 7. Third, the PSD and coherence feature distributions are examined separately in Figures 8 and 9 to evaluate their importance. Fourth, a nonlinear dimensionality reduction is used to visualize the feature representation learned by TJM in Figure 10. Finally, the limitations and future work are presented.
Accuracy of the TJM algorithm with different source combinations, where (A) is trained on the MA task and tested on the WM task, (B) is the reverse, and d is the subspace dimension of TJM.
Accuracy comparison between the PSD and coherence (COH) features.
The PSD distribution in low and high workload, where the upper is for WM task and the lower is for MA task. The color bar below indicates the mean PSD for each frequency band.
The coherence variance distribution of low and high workload. (A) The upper is for the WM task and the lower is for the MA task. The color bar indicates the coherence variation, where the redder color represents an increasing trend and the bluer represents a decreasing trend. (B) We then show features distribution for the five frequency bands in the pie charts and the channel location in the radar charts for each task.
Feature visualization by t-SNE. Here, (A) and (B) are the features of the original distribution, (C) and (D) are the features after TJM. The first column is trained on the MA task and tested on the WM task, the second column is in reverse order. Light colors denote features from the source domain and deep colors represent features from the target domain. For better visualization, we also highlight features from different categories with different colors (i.e., red and blue). S is Source, and T is Target domain.
A. Parameter Sensitivity
To figure out the best source combination number and the best subspace dimension d for the TJM method, we conduct the classification with different numbers of source subjects and different values of d, as shown in Figure 6.
B. Features Comparisons
To evaluate the effect of different features on the TJM results, we repeat the classification task using the PSD and coherence features separately. We set the kernel function to linear and the subspace dimension to 80, with the other experimental settings unchanged. The accuracy comparison between PSD and coherence is displayed in Figure 7.
Coherence has an advantage over PSD, with accuracy increments of about 4.3% and 7.1% in the two cross-task transfer directions.
C. Feature Importance
We also investigate the importance of the original features for both tasks. In Figure 8, we show the PSD distributions under the low and high workload conditions, with the mean PSD value of each channel across the five frequency bands. Here, the scalp is divided into the frontal, central, occipital, parietal, and temporal regions. We find that the PSD distributions within the same task show a similar pattern but vary between the two tasks; the band-wise changes with increasing workload for each task can be observed in Figure 8.
In Figure 9, we show the coherence variance distributions of the low and high workload conditions. To obtain them, we subtract the mean coherence features of the low workload condition from those of the high workload condition in the corresponding frequency bands. In Figure 9(A), the upper panel is for the WM task and the lower panel is for the MA task. As we can see, the coherence variance distributions show a similar pattern for both tasks.
D. Feature Visualization
To visualize the distributions of the feature representations learned by TJM, we project the latent feature representations onto a two-dimensional (2D) plane using t-distributed stochastic neighbor embedding (t-SNE), as shown in Figure 10.
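A minimal sketch of the 2D t-SNE projection of the source and target features (before or after TJM) with scikit-learn is shown below; the perplexity and random seed are illustrative.

```python
import numpy as np
from sklearn.manifold import TSNE

def project_2d(Zs, Zt, perplexity=30, random_state=0):
    """Jointly embed source (Zs) and target (Zt) features into 2D for visualization."""
    Z = np.vstack([Zs, Zt])
    emb = TSNE(n_components=2, perplexity=perplexity,
               random_state=random_state).fit_transform(Z)
    return emb[:len(Zs)], emb[len(Zs):]      # 2D coordinates for source and target
```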
E. Limitations and Future Work
In this paper, the experimental results show that domain adaptation methods in a cross-task setting can improve cognitive workload recognition compared with non-transfer methods. Nevertheless, this is a preliminary study, and several limitations remain to be addressed in future work.
Conclusion
In this paper, we have presented a preliminary study on EEG-based cross-task cognitive workload recognition using domain adaptation techniques, together with a comparative study on a private EEG dataset. The cross-task workload recognition models have been constructed under three transfer schemes ($\mathcal{O}\to\mathcal{O}$, $\mathcal{M}\to\mathcal{O}$, and $\mathcal{V}\mathcal{O}\to\mathcal{O}$), and the results demonstrate that domain adaptation improves cross-task recognition performance compared with non-transfer methods.