Journals & Magazines >IEEE Transactions on Neural S... >Volume: 30

Cross-Task Cognitive Workload Recognition Based on EEG and Domain Adaptation

Abstract:

Cognitive workload recognition is pivotal to maintain the operator’s health and prevent accidents in the human-robot interaction condition. So far, the focus of workload ...Show More

Metadata

Abstract:

Cognitive workload recognition is pivotal to maintain the operator’s health and prevent accidents in the human-robot interaction condition. So far, the focus of workload research is mostly restricted to a single task, yet cross-task cognitive workload recognition has remained a challenge. Furthermore, when extending to a new workload condition, the discrepancy of electroencephalogram (EEG) signals across various cognitive tasks limits the generalization of the existed model. To tackle this problem, we propose to construct the EEG-based cross-task cognitive workload recognition models using domain adaptation methods in a leave-one-task-out cross-validation setting, where we view any task of each subject as a domain. Specifically, we first design a fine-grained workload paradigm including working memory and mathematic addition tasks. Then, we explore four domain adaptation methods to bridge the discrepancy between the two different tasks. Finally, based on the supporting vector machine classifier, we conduct experiments to classify the low and high workload levels on a private EEG dataset. Experimental results demonstrate that our proposed task transfer framework outperforms the non-transfer classifier with improvements of 3% to 8% in terms of mean accuracy, and the transfer joint matching (TJM) consistently achieves the best performance.

Published in: IEEE Transactions on Neural Systems and Rehabilitation Engineering ( Volume: 30)

Page(s): 50 - 60

Date of Publication: 05 January 2022

ISSN Information:

PubMed ID: 34986098

DOI: 10.1109/TNSRE.2022.3140456

Funding Agency:

Contents

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I.

Introduction

Currently, the cognitive workload of the operator has been studied widely in the fields of human-robot interaction environment and passive brain-computer interface [1], [2]. It is a special case of cognitive states, described as the ratio of the operator’s available cognitive resources (e.g., the attention resources and working memory capacity) over the task demanded resources [3]. Due to the limited cognitive resources of the brain, the heavy cognitive work in real-world environments will lead to cognitive overload, further affect task execution and harm the operator’s state [4]. As such, it is important to accurately recognize human cognitive workload to prevent accidents and maintain health.

To date, we can categorize the measurements for cognitive workload into subjective and objective measures [5]. Subjective measurements are mainly based on the operator’s perceived feeling and rating scales, e.g., the National Aeronautics and Space Administration-Task Load Index (NASA-TLX) [6]. Though these measurements implement easily, they are post-evaluation and cannot yield objective or real-time results [7]. Meanwhile, objective measurements are mainly relied on the recorded physiological signals during the task process, hence having less interference on the task [5]. Among various physiological signals, the electroencephalogram (EEG) signal has been widely used and studied, due to its high-temporal-resolution, security, and convenience. Besides, its effectiveness has been validated in detecting cognitive workload during the execution of workload-related cognitive tasks [8]. Hence, we concentrate on EEG-based cognitive workload recognition.

The EEG signals may have variabilities among different subjects and/or tasks, mainly including intra- and inter-subject, and inter-task variabilities. These variabilities are related to constructing subject-dependent, cross-subject (or subject-independent), and cross-task (or task-independent) models, respectively [7]. The subject-dependent model is trained and tested on data specific to each subject. Here, we can view the subject-dependent study as the standard cognitive workload recognition design [9] considering the large variability between subjects. Hitherto, many subject-dependent methods have been constructed and achieved acceptable recognition performance [10], [11]. Typically, the inter-subject and inter-task variabilities are more challenging and complex and would have inferior performance to the former. But if successful, they would enable and improve cognitive state monitoring and probing in real-world environments [7], [9]. So it is crucial to alleviate the cross-subject and cross-task issues. The cross-subject model is trained on data from one or a group of subjects and tested with data from a new subject. On the cross-subject research, based on the development of EEG optimal features extraction and classification models, acceptable results are obtained [12], [13].

The cross-task model is trained on one task and tested on another similar but different task. Ideally, although different cognitive tasks may elicit different cognitive resources, the cognitive workload pays more attention to the occupied amount of cognitive resources rather than the specific cognitive resources. Therefore, for a practical design, it is possible to construct a generalizable or cross-task cognitive workload recognition model that is capable to recognize workload across various tasks [14]. However, we find the cross-task cognitive workload recognition has degraded performance than the subject-dependent model [14]. During different task-elicited cognitive activities, the mismatched workload between the training and testing data [15], the highly dissimilar EEG patterns [16], the non-stationary characteristics of EEG data [15], [17], further the distribution variabilities between the EEG data, may cause considerable difficulties for cross-task cognitive workload recognition. Existing studies assumed a set of invariant features exists across tasks and proposed to find the common features and then constructed the task-independent model [8], [14]–[16], which might ignore the discrepancy between different tasks and thus have limited performance. To deal with the above problems, we propose to apply unsupervised domain adaptation for establishing the cross-task cognitive workload recognition model, aiming to reduce the distribution discrepancy as well as to improve the generalized classification accuracy across various tasks. Given efficient labeled source samples and unlabeled target samples, unsupervised domain adaptation transfers knowledge from the source domain to the target domain and tries to train a classifier that works well on a target domain. It aims to reduce the distribution discrepancy between the source and target data, thus making them similar [18], [19]. To our knowledge, domain adaptation has not been or rarely been applied for cross-task cognitive workload recognition systems.

In this paper, we propose a new framework for EEG-based cross-task cognitive workload recognition using domain adaptation. The proposed framework is implemented under three transfer schemes, which are the same/various one-to-one cross-task transfers, and the many-to-one cross-task transfer. We mainly explore four domain adaptation methods as a preliminary study for the new framework. These domain adaptation methods both assume the shared feature representations are existing between the source domain and the target domain simultaneously reducing the distribution gap. The difference in them lies in the ideas on the marginal distribution and conditional distribution between domains. Then, we compare the performance of these methods on a private EEG dataset with two different tasks to construct the workload recognition model in a binary classification way. Assuming the two tasks involved in the cross-task study should share have some brain mechanisms in common but also present distinctive activations, as suggested and applied in [8], [20], [21], we thus use the Sternberg Working Memory task (denoted as WM task) and Mathematics Addition task (denoted as MA task) to elicit cognitive workload states.

In Figure 1, we display the general framework of cross-task cognitive workload recognition, including the cross-task design, EEG data acquisition, data preprocessing, feature extraction, domain adaptation, and classification. Here, we take task A (e.g., the WM task) as source data to train the models, and task B (e.g., the MA task) as the target data to test the models.

Fig. 1.

The proposed framework of cross-task cognitive workload recognition using domain adaptation.

Show All

The major contributions of this work are three-fold. First, we design a workload paradigm including working memory and mathematic addition tasks with fine-grained partition. Second, we propose to use domain adaptation to reduce the distribution discrepancy as well as to improve the classification accuracy. Third, we evaluate the proposed method on a real EEG dataset, with results demonstrating the superiority of our method over non-transfer methods.

The rest of the work is organized as follows. Section II briefly introduces the concepts of domain adaptation and related methods. Section III introduces EEG data recordings during two different cognitive workload tasks. The results are compared and presented in Section IV to evaluate the performance of the proposed methods. Section V discusses the major findings. Finally, Section VI concludes the whole paper.

SECTION II.

Methods

We adopt unsupervised domain adaptation without using the labeled samples from the target subjects, to cope with the task-to-task variability for building the EEG-based cross-task cognitive workload recognition models.

The EEG signal collected in one task of each subject is viewed as a domain, which is defined as $\left \{{(X^{i},y^{i}) }\right \}_{i=1}^{N}$ , where $N$ is the number of trials used. $X^{i}\in \mathbb {R}^{(E\times d)}$ is an EEG trial with $E$ electrodes and $d$ dimensional features, and $y^{i}\in \mathbb {R}^{C}$ is the corresponding label for $X^{i}$ of $C$ classes.

The source domain is the labeled EEG samples of one task and the target domain is the unlabeled EEG samples of the other task. Given a source domain $\mathcal {D}_{S}=\left \{{\left ({X_{s}^{i},y_{s}^{i} }\right),\cdots,\left ({X_{s}^{N_{s}},y_{s}^{N_{s}} }\right) }\right \}$ with $N_{s}$ labeled samples, and a target domain $\mathcal {D}_{T}=\left \{{ X_{T}^{1},{\cdots,X}_{T}^{N_{t}} }\right \}$ with $N_{t}$ unlabeled samples, domain adaptation aims to predict the labels ${y_{t}^{1},\cdots,y}_{t}^{N_{t}}$ corresponding to the inputs $X_{T}^{1},{\cdots,X}_{T}^{N_{t}}$ in the target domain, using the learned knowledge from the source domain. Generally, domain adaptation assumes the source and target domains have the same feature spaces and label spaces, i.e., $X_{S}{,X}_{T}\in \mathbb {R}^{(E\times d)}$ and $y_{S},y_{T}\in \mathbb {R}^{C}$ , but the marginal probability distribution and/or conditional probability distribution are different, i.e., $P(X_{S})\ne P(X_{T})$ and/or $P\left ({y_{S}\thinspace \vert \thinspace X_{S}}\right) \ne P\left ({y_{T}\thinspace \vert \thinspace X_{T}}\right)$ [22]. Specifically, in the paper, we follow the key assumption in most domain adaptation methods, that the source and target domains have different marginal probability distributions and different conditional probability distributions.

In the following, we will briefly introduce four domain adaptation methods used in the paper. These methods mainly focus on the shared feature representation and minimization of the distribution discrepancy.

A. Transfer Component Analysis

Transfer component analysis (TCA) tries to reduce the distribution discrepancy via embedding the source and target domains into a shared low-dimensional feature space and learning a set of transfer components [23]. TCA can be seen as a dimensionality reduction method. To achieve this goal, Pan et al. proposed to find a transformation function $\phi (\cdot)$ such that the ${P(\phi (X}_{s}))\approx {P(\phi (X}_{t}))$ and $P\left ({y_{s}\thinspace \vert \thinspace {\phi (X_{s})}}\right) \approx P\left ({y_{t}\thinspace \vert \thinspace {\phi (X_{t})}}\right)$ . Such $\phi (\cdot)$ should be satisfied that, it can reduce the distance between the domain distribution (e.g., the empirical means) and preserve the variance of the original data at the same time. By adopting the empirical Maximum Mean Discrepancy (MMD) [24] as the distance measure, the final mathematic representation of TCA is $\begin{align*}&\min \limits _{W}~{tr(W^{T}KLKW)}+\mu tr(W^{T}W), \\&{\it s.t.} ~W^{T}KHKW=I_{m},\tag{1}\end{align*}$ View Source where the first term is the MMD distance, i.e., $\left \Vert{ \! \frac {1}{N_{s}}\sum \nolimits _{i=1}^{N_{s}} {{W^{T}\phi \big(X}_{i}\big)\!-\!\frac {1}{N_{t}}\sum \nolimits _{j\!=\!1}^{N_{t}} {W^{T}{\phi \big(X}_{j}\big)}} \!}\right \Vert ^{2}{\!=\! tr\big(W}^{T}\!KLKW\big)$ , $K=[{\phi (X}_{i})^{T}{\phi (X}_{j})]$ is the kernel matrix defined on all the data. $L_{ij}=1/N_{s}^{2}$ , if $X_{i},X_{j}\in X_{s}$ , else $L_{ij}=1/N_{t}^{2}$ , if $X_{i},X_{j}\in X_{t}$ , otherwise $L_{ij}=-1/N_{s}N_{t}$ . $W\in \mathbb {R}^{(N_{s}+N_{t})\times m}$ is the transformation matrix that transforms the kernel matrix $K$ to an m-dimension space ( $m\ll N_{s}+N_{t}$ ). A regularization term $tr(W^{T}W)$ is added to control the complexity of the $W$ , $\mu >0$ is the trade-off parameter. $W^{T}KHK^{T}W=I$ is the constraint to keep the structure of the original data, where $H=I_{N_{s}+N_{t}}-(1/(N_{s}+N_{t}))\mathbf {1}\mathbf {1}^{T}$ is the centering matrix, $I_{N_{s}+N_{t}}\in \mathbb {R}^{(N_{s}+N_{t})\times (N_{s}+N_{t})}$ is the identity matrix, and $\mathbf {1}\in \mathbb {R}^{N_{s}+N_{t}}$ is the column vector with all 1’s. Finally, $W$ is composed of the $m$ leading components of the matrix $(KLK+\mu I)^{-1}KHK$ .

B. Joint Distribution Adaptation

Joint distribution adaptation (JDA) is proposed to adopt both the marginal and conditional distributions in a principled dimensionality reduction procedure [25], $\begin{align*}&\hspace {-0.5pc}{D}\left ({{\mathcal {D}_{s},\mathcal {D}}_{T} }\right)\approx \mathrm {D}\left ({P\left ({X_{S} }\right),P\left ({X_{T} }\right) }\right) \\&\qquad\qquad\qquad\qquad\quad+\,\mathrm {D(}P\left ({y_{S}\thinspace \vert \thinspace X_{S}}\right),P\left ({y_{T}\thinspace \vert \thinspace X_{T}}\right)),\tag{2}\end{align*}$ View Source

Similar to TCA, JDA tries to find a transformation matrix $W$ , such that ${P(W^{T}X}_{s})\approx {P(W^{T}X}_{t})$ and $P\left ({y_{s}\thinspace \vert \thinspace {W^{T}X}_{s}}\right) \approx P\left ({y_{t}\thinspace \vert \thinspace {W^{T}X}_{t}}\right)$ . Marginal distributions can be constructed by the TCA method, however, it is nontrivial to adopt the conditional distributions since there are no labeled data in the target domain. Long et al. proposed to use the pseudo labels of the target data and class-conditional distributions ( $P\left ({X_{s}\thinspace \vert \thinspace {y_{s}=c}}\right)$ and $P\left ({X_{t}\thinspace \vert \thinspace {y_{t}=c}}\right)$ , $c\mathrm {\epsilon }\left \{{1,\cdots,C }\right \}$ is in the label space) to approximate conditional distributions. Since the pseudo labels may be unreliable, we can iteratively update them. The JDA optimization problem is $\begin{align*}&\min \limits _{\text {W}}~tr\left ({W^{T}KLK^{T}W }\right)+\sum \nolimits _{c=1}^{C} tr(W^{T}KM_{c}K^{T}W) \\&\hphantom {\min \limits _{\text {W}}~}+\,\lambda \left \Vert{ W }\right \Vert _{F}^{2}, \\&{\it s.t.} ~W^{T}KHK^{T}W=I,\tag{3}\end{align*}$ View Source where the first term is the MMD distance between marginal distribution, the second term $\left \Vert{ \! 1 \mathord {\left /{ {\vphantom {1 n_{s}^{\left ({c }\right)}}} }\right. } n_{s}^{\left ({c }\right)}\sum \nolimits _{X_{i}\in \mathcal {D}_{s}^{\left ({c }\right)}} {{W^{T}\phi (X}_{i})\!-\!1 \mathord {\left /{\!\! {\vphantom {1 n_{t}^{\left ({c }\right)}}} }\right. } n_{t}^{\left ({c }\right)}\sum \nolimits _{X_{j}\in \mathcal {D}_{t}^{\left ({c }\right)}} {W^{T}{\phi (X}_{j})}} }\right \Vert ^{2}\!=\!tr(W^{T}KM_{c}KW)$ is the MMD distance between the class-conditional distribution, where $\mathcal {D}_{s}^{(c)}=\left \{{X_{i}:X_{i}\epsilon \mathcal {D}_{s}\wedge y\left ({X_{i} }\right)=c }\right \}$ is the set of examples belonging to class c in the source data, $y\left ({X_{i} }\right)$ is the true label of $X_{i}$ , and $n_{s}^{(c)}=\left \vert{ \mathcal {D}_{s}^{(c)} }\right \vert$ ; $\mathcal {D}_{t}^{(c)}=\left \{{X_{j}:X_{j}\epsilon \mathcal {D}_{t}\wedge \hat {y}\left ({X_{j} }\right)=c }\right \}$ is the set of examples belonging to class c in the target data, $\hat {y}\left ({X_{j} }\right)$ is the pseudo labels of $X_{j}$ , and $n_{t}^{(c)}=\left \vert{ \mathcal {D}_{t}^{(c)} }\right \vert$ ; ${(M_{c})}_{ij}=1/(n_{s}^{\left ({c }\right)}n_{s}^{\left ({c }\right)})$ , if $X_{i},X_{j}\in \mathcal {D}_{s}^{(c)}$ ; else ${(M_{c})}_{ij}=1/(n_{t}^{\left ({c }\right)}n_{t}^{\left ({c }\right)})$ , if $X_{i},X_{j}\in \mathcal {D}_{s}^{(c)}$ ; else ${(M_{c})}_{ij}=-1/(n_{s}^{\left ({c }\right)}n_{t}^{\left ({c }\right)})$ , if $X_{i}\in \mathcal {D}_{s}^{\left ({c }\right)},\mathrm { }X_{j}\in \mathcal {D}_{t}^{\left ({c }\right)}$ or $X_{j}\in \mathcal {D}_{s}^{\left ({c }\right)},X_{i}\in \mathcal {D}_{t}^{\left ({c }\right)}$ ; otherwise ${(M_{c})}_{ij}=0$ . The third term is the regularization term. The $W$ can be obtained by the following generalized eigen-decomposition problem and composed of its d smallest eigenvectors, $\left ({K\left({L+\sum \nolimits _{c=1}^{C} M_{c} }\right)K^{T}+\lambda I }\right)W=KHK^{T}W\Phi$ , where $\Phi =(\mathrm {\phi }_{1},\cdots,\phi _{d})$ are the Lagrange multipliers.

C. Balanced Domain Adaptation

As mentioned above, the TCA method only considers the marginal distribution of source and target domains, and JDA considers both the marginal and conditional distributions. Though JDA may have more information used, it assumes the marginal and conditional distributions are dedicating identically to the domain divergence, which is not practical in real-world applications. To address this problem, Wang et.al [26] proposed a balanced domain adaptation (BDA) by using a balance factor $\mu$ to exploit the different importance of distributions, $\begin{align*}&\hspace {-0.5pc}{D}\left ({{\mathcal {D}_{s},\mathcal {D}}_{T} }\right)\approx \left ({1-\mu }\right){D}\left ({P\left ({X_{S} }\right),P\left ({X_{T} }\right) }\right) \\&\qquad\qquad\qquad\qquad+\,\mu {D}(P\left ({y_{S}\thinspace \vert \thinspace X_{S}}\right),P\left ({y_{T}\thinspace \vert \thinspace X_{T}}\right)),\tag{4}\end{align*}$ View Source where $\mu \in \left [{ 0,1 }\right]$ , when $\mu \to 0$ , the marginal distribution is more important to adapt, when $\mu \to 1$ , the conditional distribution is more important. The optimization problem of BDA is $\begin{align*}&\min \limits _{W}~tr\left ({W^{T}K\left ({\left ({1-\mu }\right) }\right)L+\mu \sum \nolimits _{c=1}^{C} M_{c}\big)K^{T}W }\right) \\&\hphantom {\min \limits _{\text {W}}~}+\,\lambda \left \Vert{ W }\right \Vert _{F}^{2}, \\&{s.t.} ~W^{T}KHKW=I,\quad \mu \in \left [{ 0,1 }\right].\tag{5}\end{align*}$ View Source

The $W$ is obtained by solving the following problem, $\left ({K\left ({\left ({1-\mu }\right) }\right)L+\mu \sum \nolimits _{c=1}^{C} M_{c}\big)K^{T}+\lambda I }\right)W=KHK^{T}W\Phi,$ and composed of its $d$ smallest eigenvectors, where $\Phi =(\mathrm {\phi }_{1}\mathrm {,\cdots,}\mathrm {\phi }_{\mathrm {d}})$ are the Lagrange multipliers.

D. Transfer Joint Matching

When the source and target domains are different in both feature distribution and samples relevance, Long et al. [27] proposed Transfer Joint Matching (TJM) to cope with this setting. TJM aims to reduce the domain discrepancy by simultaneously matching the marginal feature distributions and reweighting the source samples across domains in a principled dimensionality reduction procedure, and construct a new feature representation that is stationary to both the marginal distribution discrepancy and the irrelevant samples, $\begin{align*}&\min \limits _{W}~{tr\left ({W^{T}KLK^{T}W }\right)+\lambda (\left \Vert{ W_{s} }\right \Vert _{2,1}+\left \Vert{ W_{t} }\right \Vert _{F}^{2})}, \\&{\it s.t.} ~W^{T}KHK^{T}W=I,\tag{6}\end{align*}$ View Source here, the first term is the MMD distance between marginal distribution, the second is the regularizer of sample reweighting, including the row-sparsity for source sample in the sample space. Followed with TCA, the constraint is retaining the structure of the original data. So, the new feature representation is $Z=W^{T}K$ . Finding the optimal adaptation matrix $W$ is reduced to solving following generalized eigen-decomposition for the $d$ smallest eigenvectors, $\left ({KLK^{T}+\lambda G }\right)W=KHK^{T}W$ , $\phi =(\phi _{1},\cdots,\phi _{d})$ are the Lagrange multipliers, and $G$ is a diagonal sub-gradient matrix. We recommend the readers refer to the detailed descriptions of TJM [27].

SECTION III.

Data and Experiment

A. Subjects and Experiment Setup

In this study, we have invited 45 college students (28 males and 17 females, aged 20 to 30, mean age of 24.6 ± 6.6) to participate in EEG experiments at the iBRAIN laboratory of Nanjing University of Aeronautics and Astronautics. Except for one subject, all others were right-handed. All subjects have fulfilled the inclusion criterion that having a normal or corrected-to-normal vision. They did not suffer from any condition that may cause anxiety or fatigue. Participants were required to keep away from caffeine, medication, and alcohol, and had a normal amount of eight sleep hours before the experiment. After the explanation of the experimental protocol, we have obtained the signed written consent from all subjects.

Each subject undertook three tasks, including resting state with eye closed and eye open, WM task, and MA task. The general experimental design is displayed in Figure 2(A).

Fig. 2.

The experimental design in this paper, including (A) the general task design, (B) the design of WM task and MA task, and (C) a detailed trial of both tasks at a difficulty level of L3.

Show All

In the WM task, subjects needed to remember different numbers of stimuli groups (here we choose English letter sequences as stimuli), maintaining for 2 seconds, and determine whether the shown English letter had existed in the memorization sequences before [28]. The WM task consists of seven groups, and each group has 20 trials, 30% of them are target stimuli. We set the difficulty levels of seven, from the very low (L1), low (L2), medium (L3), medium-high (L4), high (L5), very high (L6) to extreme high (L7), with the length of sequences as 1, 2, 4, 6, 8, 10, 12, respectively. In each group, the order of levels was randomly presented. After a blank of 1s, the stimulus was presented for 2s followed by a fixation cross during the interval of 3s, then a judgment time of 2s.

In the MA task, the subjects needed to retain the results of the currently shown addition formula (e.g., 5+7) and identify whether the given number (e.g., 11) matched the result they calculated before. This task involves temporal storage of intermediate results and information retrieval held in the cognitive workspace [11]. Similarly, the MA task consists of seven groups, and each group has 20 trials, 30% of them are target stimuli. We set the difficulty levels of seven, corresponding with the various digits and carries of addition. In each group, the order of levels is randomly presented. After a blank of 1s, each addition was presented for 2s followed by a fixation cross for 3s, while the answer is displayed for 2s.

In Figure 2(B), we display the detailed setup for each cognitive task, and Figure 2(C) shows a detailed trial of both tasks at a difficulty level of L3. In Table I, we list the level design and the examples of various difficulty levels.

TABLE I Details of the Task Design for All Seven Levels in the WM Task and MA Task

All task stimuli were shown on a computer screen in white font on a black background. Participants were instructed to focus on both accuracy and speed and practice five trials before the EEG recordings were implemented for both tasks. Notably, mental fatigue may be induced during the experiment and it may contaminate the EEG quality. To avoid this, we first randomized the group order in each task. Then, we randomized the order of difficulty levels in each group. Also, we asked the subjects to rest before each task to ease the fatigue. The task paradigms were carried out in E-Prime 2.0 software [29].

B. Data Acquisition and Preprocessing

A portable wireless EEG amplifier (NeuSen. W64, Neuracle, China) was used for EEG data recording at a sampling rate of 1000 Hz. Fifty-nine electrodes were arranged according to the international 10–20 system, with reference at CPz and a forehead ground at AFz. Electrode impedances were kept below 5 $\text{k}\Omega$ for all electrodes throughout the experiment. In addition to the EEG signals, the reaction time and the answer accuracy of the subjects were also recorded as objective behavior data to evaluate the cognitive workload levels.

A widely-used EEG preprocessing pipeline was adopted in the current study, using EEGLAB software [30]. Specifically, the raw EEG data were firstly re-referenced to the average of the electrodes, then, band-pass filtered to 0.1-70 Hz to eliminate noise, and additionally a 50 Hz notch filtered to reduce the power line noise. The filtered data were down-sampled to 256 Hz. Signals were then baseline adjusted and segmented into 2s epochs after stimulus onset. Eye-blink and muscle-related artifacts were removed via Independent Components Analysis by rejecting the components that were highly correlated with those artifacts. Here we used ADJUST [31] and ICLabel [32] tools to mark the components. Due to high noise contamination, seven subjects were excluded, thus leaving 38 subjects (25 males and 13 females, mean age of 24.4 ± 5.9 years) for the subsequent analysis. For these subjects, the bad epochs with noise were further removed by visual inspection, thus leaving 4554 and 4681 trials (each is 2s length) for the WM task and MA task, respectively. In Table II, we list the corresponding numbers of the workload levels with two tasks.

TABLE II The Number of Samples of Different Tasks

C. Feature Extraction

For feature extraction, we have extracted power spectral density (PSD) features from the spectral dimension [33] and coherence features from the brain connectivity network [34], which are both widely used in EEG analysis [35], and combined them as the final feature set.

To be specific, the PSD and coherence features are extracted from 5 clinical frequency bands ( $\delta$ [1~3 Hz], $\theta$ [4~7 Hz], $\alpha$ [8~13 Hz], $\beta$ [14~30 Hz], and $\gamma$ [31~50 Hz]), and PSD is extracted based on short-time Fourier transform. Short-time Fourier transform is a time-frequency domain decomposed method, and it decomposes signals into small sequential data frames with shifting windows, and a fast Fourier transform is then used to each frame [36]. We use Hanning windows with 0.5s window length (total is 2s) and no overlap. The calculation formula of PSD is $PSD=\frac {1}{NF_{s}}\left \vert{ \sum \nolimits _{n=1}^{N} {x\left [{ n }\right]w\left [{ n }\right]e^{-j2\pi fn/F_{s}}} }\right \vert ^{2}$ , where $x\left [{ n }\right]$ is the EEG trials, $n=1,2,\cdots,N,F_{s}$ is the sampling rate, $w\left [{ n }\right]$ is the window function. The total dimension of the PSD feature of a 59-channel EEG segment is 59 channels $\times$ 4 windows $\times$ 5 bands = 1180.

Besides the PSD feature, functional connectivity network can be used for measuring the relationship between different EEG electrodes [35] or functional brain regions [37]–[39]. EEG-related functional connectivity networks can be constructed by coherence [40]. The formula for coherence is $Coh_{xy}(\omega)=\frac {\left \vert{ P_{xy}(\omega) }\right \vert }{P_{xx}(\omega)P_{yy}(\omega)}$ , where $P_{xy}(\omega)$ is the cross-spectrum of channel signals $x$ and $y$ , and $P_{xx}(\omega)$ , $P_{yy}(\omega)$ are the power spectrum of $x$ and $y$ , respectively. Since the connection matrix is symmetric, i.e., the upper triangle is the same as the lower triangle, we thus remove the lower triangle entries. Since the main diagonal entries are both self-connections, we also remove them. The total dimension of coherence feature of a 59-channel EEG segment is $59\times$ ( $59-1$ )/ $2\times 5$ bands $=8555$ . Finally, we combine the PSD and coherence as the final features (total 9735).

D. Workload Classification

We aim to build EEG-based cross-task cognitive workload recognition models using domain adaptation techniques. Here, during training for cross-task transfer, the samples of the labeled source data from task A are used to predict the unlabeled target data in task B. Considering the different application scenarios, we define three kinds of transfer schemes with different focuses.

One-to-one cross-task transfer (denoted as $\boldsymbol {\mathcal {O}}\to \boldsymbol {\mathcal {O}}$ ): is the transfer considering the intra-subject task variability, where we use the data from each subject to complete the cross-task cognitive workload recognition. Given one subject, the source domain is the data of task A, and the target domain is the data of task B of the same person.
Many-to-one cross-task transfer (denoted as $\boldsymbol {\mathcal {M}}\to \boldsymbol {\mathcal {O}}$ ): is the transfer focusing on the source combination, where the source domain is the data of task A of all the subjects, and the target domain is the data of task B of each subject. It makes sense when multiple existing subjects are available and it might have more information than $\boldsymbol {\mathcal {O}}\to \boldsymbol {\mathcal {O}}$ .
Various one-to-one cross-task transfer (denoted as $\boldsymbol {V} \boldsymbol {\mathcal {O}}\to \boldsymbol {\mathcal {O}}$ ): which considers the subject variabilities, where the target domain is the task B of one subject, the source domain is the enumerated task A from the other subjects.

We thus focus on evaluating the domain adaptation methods for a common binary classification task. We take levels L1, L2, L3 as low workload, and levels L5, L6, L7 as high workload.

Here we adopt widely used classifiers for baseline comparison, including supporting vector machine (SVM) with radial basis function (RBF) kernel, K-nearest neighbor (KNN), linear discriminative analysis (LDA), and single hidden-layer artificial neural network (ANN). These methods also are non-transfer methods. SVM is constructed based on the RBF kernel with a soft margin parameter C. The parameter of KNN is the number of neighbors. Note that this paper only presents the best SVM and KNN results. The LDA classifier is used with the default setting provided by MATLAB. ANN is used with 10 hidden units. Based on the preliminary results, we then use SVM with RBF kernel as the base classifier for domain adaptation methods. For the TCA, JDA, and TJM methods, the hyper-parameters are the dimension of subspace $d$ , the kernel function, RBF kernel-based $\gamma$ , trade-off parameter $\lambda$ ; for BDA, it also includes the balance factor $\mu$ . The related hyper-parameters are summarized in Table IV. We first use the MA task and WM task as source domain (training data) and target domain (testing data), respectively, as MA $\to$ WM. Then, we select data in reverse order, as WM $\to$ MA. So, for each method, we repeat the classification process 228 times (38 subjects $\times$ 2 orders $\times$ 3 conditions).

TABLE III The Design of Transfer Schemes

TABLE IV Details of Hyper-Parameter for the Used Methods

SECTION IV.

Experimental Results

The results for recognizing cognitive workload consist of two parts. First, behavioral data including response time and answer accuracy, and event-related spectral perturbation (ERSP) are analyzed to validate the different cognitive workload levels induced by WM and MA tasks. Second, the binary classification results of cross-task workload models are presented, using accuracy and F1 score as performance evaluation metrics.

A. Behavioral Results

We use the WM task and MA task to explore cross-task cognitive workload recognition. In the experiment, we have recorded the response time and answer accuracy as behavior data. Due to the non-block experimental design and random order of experimental stimulus representation [41], we did not collect the subjective measurements, such as the NASA-TLX questionnaire. Figure 3 presents the behavior data to validate the effectiveness of these two types of cognitive tasks and the corresponding one-way analysis of variance (ANOVA) results. This experiment confirms that easy and hard tasks are distinguishable.

$Fig. 3. - Results of the one-way ANOVA of the behavioral data. Bars represent mean ± standard error. WM task: working memory task; MA task: mathematic addition; RT: Response Time; $^{\ast \ast }$ indicates p < 0.01; Games-Howell’s correction was used for multiple comparisons.$

Fig. 3.

Results of the one-way ANOVA of the behavioral data. Bars represent mean ± standard error. WM task: working memory task; MA task: mathematic addition; RT: Response Time; $^{\ast \ast }$ indicates p < 0.01; Games-Howell’s correction was used for multiple comparisons.

Show All

For both the MA and WM tasks, when the difficulty levels increase, the subjects perform worse as in Figure 3(A) and (B), and take longer to provide the answers as in Figure 3(C) and (D). For the MA task, the accuracy difference between levels 1, 2, 3 and levels 4, 5, 6, 7 (p < 0.01), and the difference between levels 4, 5, 6, and level 7 (p < 0.01) are significant; the difference between levels 1, 2, 3 and the difference between levels 4, 5, 6 are not significant. For response time, the difference between levels 1, 2, 3 and levels 5, 6, 7 (p < 0.01), and the difference between level 4 and level 7 (p < 0.01) are significant; the difference between levels 1, 2, 3, 4, the difference between levels 4, 5, 6 and the difference between levels 5, 6, 7 are not significant. For the WM task, the accuracy difference between levels 1, 2, 3, 4, and levels 5, 6, 7 (p < 0.01) are significant; the difference between levels 1, 2, 3, 4, and the difference between levels 5, 6, 7 are not significant. For response time, the difference between levels 1, 2 and levels 5, 6, 7 (p < 0.01), and the difference between level 1 and levels 3, 4 (p < 0.01) are significant; the difference between levels 1, 2, the difference between levels 3, 4, and the difference between levels 4, 5, 6, 7 are not significant.

We analyze EEG using the event-related spectral perturbation (ERSP) [20], [42] and perform the one-way ANOVA with 2000 permutations for statistical testing. ERSP provides detailed information on event-related desynchronization/ synchronization and can visualize the mean power changes [43]. In Figure 4, we analyze the ERSP maps for each cognitive task, for each of the 7 stimulus conditions in all 59 electrodes, and in the four EEG frequency-bands ( $\theta$ , $\alpha$ , $\beta$ , and $\gamma$ ) during the 0-1000ms after task stimulus, using EEGLAB [30].

Fig. 4.

Results of the one-way ANOVA for ERSP analysis. WM task: working memory task; MA task: mathematic addition. For different levels, blue indicates event-related desynchronization, red indicates event-related synchronization. The last column represents p values, with redder color indicating the stronger significance; FDR correction was used for multiple comparisons.

Show All

For the WM task, as shown in Figure 4(A), in $\theta$ band, levels 1 and 2, levels 4,5,6, and 7 have similar ERSP patterns, respectively. In $\alpha$ and $\beta$ band, level 1 to level 7 have similar ERSP. Significant differences between level 1 to level 7 are confined in $\theta$ band with frontal, right temporal, and occipital brain regions (p < 0.05), whereas the difference in the $\alpha$ , $\beta$ , and $\gamma$ bands are not significant.

For the MA task, as shown in Figure 4(B), in $\theta$ band, levels 1 and 2, levels 4,5, and 6 have similar ERSP patterns, respectively. In $\alpha$ and $\beta$ band, level 1 to level 7 have similar ERSP patterns. Significant differences between level 1 to level 7 are confined in $\theta$ band with frontal and left occipital brain regions (p < 0.05), and $\gamma$ band with occipital, and central brain regions (p < 0.05), whereas the difference in the $\alpha$ and $\beta$ bands are not significant. Furthermore, the ERSP distributions between the WM task and MA task have a similar pattern in the $\theta$ , $\alpha$ , and $\beta$ bands.

B. Classification Results

Table V displays the classification accuracy of different methods and Supplementary Table S1 validates the differences between different methods. Here, SVM, LDA, KNN, and ANN are non-transfer methods, whereas TCA, JDA, BDA, and TJM are domain adaptation methods with SVM as the classifier.

TABLE V The Classification Accuracy (%) With Mean and Standard Deviation

In ${\mathcal {O}}\to {\mathcal {O}}$ , ${\mathcal {M}}\to {\mathcal {O}}$ , and $V{\mathcal {O}}\to {\mathcal {O}}$ transfers, we find the proposed TJM method has the best average accuracy both for the MA $\to$ WM and WM $\to$ MA. For non-transfer methods, SVM performs better than other methods. Specifically, in ${\mathcal {O}}\to {\mathcal {O}}$ task transfer, compared TJM with SVM, the average accuracy is increased by 8.36% in MA $\to$ WM, 5.76% in WM $\to$ MA, 7.06% on average. In $\mathcal {M}\to \mathcal {O}$ task transfer, compared with SVM, the average accuracy of TJM is increased by 7% in MA $\to$ WM, 4.81% in WM $\to$ MA, 5.91% on average. When comparing ${\mathcal {O}}\to {\mathcal {O}}$ to $\mathcal {M}\to \mathcal {O}$ , we find SVM, ANN, and all domain adaptation methods have an increasing trend as having more source data samples. In $V{\mathcal {O}}\to {\mathcal {O}}$ task transfer, we have averaged the classification results of various source domains for each target domain, and have reported the average classification results of all the target domains as the final results. Compared with SVM, the average accuracy of TJM is increased by 3.59% in MA $\to$ WM, 3.17% in WM $\to$ MA, 3.38% on average. When comparing the ${\mathcal {O}}\to {\mathcal {O}}$ to ${\mathcal {M}}\to {\mathcal {O}}$ , the improvement of ${\mathcal {M}}\to {\mathcal {O}}$ design might come from the more training data; for ${\mathcal {M}}\to {\mathcal {O}}$ design, the proposed transfer methods outperform the non-transfer methods, we argue the improvement of this result might come from the proposed domain adaptation frameworks. Comparing the ${\mathcal {O}}\to {\mathcal {O}}$ , $V{\mathcal {O}}\to {\mathcal {O}}$ , and ${\mathcal {M}}\to {\mathcal {O}}$ classification results, we have found that the $V{\mathcal {O}}\to {\mathcal {O}}$ setting has lower classification accuracy than the other two, mainly due to the huge subject variabilities existing in this way. It indicates the importance to reduce the effect of subject variabilities.

Supplementary Table S1 validates the differences between different methods using Dunn’s test with multiple comparison corrections. In both ${\mathcal {O}}\to {\mathcal {O}}$ and $\mathcal {M}\to \mathcal {O}$ designs, the difference between TJM and non-transfer methods (SVM, LDA, KNN, and ANN) are significant (p < 0.05); the difference between all transfer methods to KNN and LDA are significant (p < 0.05). For domain adaptation, TCA versus BDA, JDA versus TJM have no significant difference (p > 0.05). In $V{\mathcal {O}}\to {\mathcal {O}}$ setting, the difference between non-transfer methods (SVM, LDA, KNN, and ANN) versus JDA and TJM are significant (p < 0.05). For domain adaptation, the difference between TCA versus TJM, BDA versus TJM are significant (p < 0.05), whereas JDA versus TJM, JDA versus BDA have no significant difference (p >0.05). To further display the performance, in Figure 5, we show the accuracy distributions across different algorithms, using box-plot figures. Besides, Supplementary Table S2 shows the F1 score results, which have similar results in Table V. We also provide additional case 1 and case 2 settings for binary cross-task cognitive workload recognition, here, in case 1, we take level L1 as low workload and level L7 as high; in case 2, we take levels L1, L2 as low workload, levels L6, L7 as high. Supplementary Tables S3 and S4 show the classification accuracy and F1 score results for both case 1 and case 2. In sum, the proposed transfer framework has better performance and significant improvements than non-transfer methods, especially for TJM. As such, both cognitive tasks generate positive transfers in the proposed cross-task framework.

$Fig. 5. - Accuracy distributions for each algorithm and differences between the TJM and the other methods. Here, (A) and (B) are in $\mathcal {O}\to \mathcal {O}$ design, (C) and (D) are in $\mathcal {M}\to \mathcal {O}$ design, (E) and (F) are in $V\mathcal {O}\to \mathcal {O}$ design. The first column is trained on the MA task and tested on the WM task, the second column is in reverse order. $^{\ast \ast }$ indicates p < 0.01; $^\ast $ indicates p < 0.05; Dunn’s test with multiple comparison correction was used.$

Fig. 5.

Accuracy distributions for each algorithm and differences between the TJM and the other methods. Here, (A) and (B) are in $\mathcal {O}\to \mathcal {O}$ design, (C) and (D) are in $\mathcal {M}\to \mathcal {O}$ design, (E) and (F) are in $V\mathcal {O}\to \mathcal {O}$ design. The first column is trained on the MA task and tested on the WM task, the second column is in reverse order. $^{\ast \ast }$ indicates p < 0.01; $^\ast$ indicates p < 0.05; Dunn’s test with multiple comparison correction was used.

Show All

SECTION V.

Discussion

The discussion of cross-task cognitive workload recognition consists of five parts. First, the classification tasks with combinations of different source numbers and the dimensions of subspace for the TJM are conducted in Figure 6. Second, accuracy comparisons between PSD and coherence are employed in Figure 7. Third, PSD and coherence are employed separately to evaluate their impacts in Figures 8 and 9. Fourth, a nonlinear dimensionality reduction is used to visualize the feature representation learned after TJM in Figure 10. Finally, the limitations and future work are presented.

Fig. 6.

Accuracy for TJM algorithm with different source combinations, where (A) is trained on MA task and tested on WM task, (B) is reverse, and d is the dimension number of TJM.

Show All

$Fig. 7. - Accuracy comparison between PSD and coherence (COH) features in a $\mathcal {M}\to \mathcal {O}$ design. Bars represent mean ± standard error. Here, the mean is the average accuracy of MA $\to $ WM and WM $\to $ MA transfers. Paired t-test is used for feature comparisons, and $^{\ast \ast }$ indicates p < 0.01.$

Fig. 7.

Accuracy comparison between PSD and coherence (COH) features in a $\mathcal {M}\to \mathcal {O}$ design. Bars represent mean ± standard error. Here, the mean is the average accuracy of MA $\to$ WM and WM $\to$ MA transfers. Paired t-test is used for feature comparisons, and $^{\ast \ast }$ indicates p < 0.01.

Show All

Fig. 8.

The PSD distribution in low and high workload, where the upper is for WM task and the lower is for MA task. The color bar below indicates the mean PSD for each frequency band.

Show All

Fig. 9.

The coherence variance distribution of low and high workload. (A) The upper is for the WM task and the lower is for the MA task. The color bar indicates the coherence variation, where the redder color represents an increasing trend and the bluer represents a decreasing trend. (B) We then show features distribution for the five frequency bands in the pie charts and the channel location in the radar charts for each task.

Show All

Fig. 10.

Feature visualization by t-SNE. Here, (A) and (B) are the features of the original distribution, (C) and (D) are the features after TJM. The first column is trained on the MA task and tested on the WM task, the second column is in reverse order. Light colors denote features from the source domain and deep colors represent features from the target domain. For better visualization, we also highlight features from different categories with different colors (i.e., red and blue). S is Source, and T is Target domain.

Show All

A. Parameter Sensitivity

To figure out the best source combination number in ${\mathcal {M}}\to {\mathcal {O}}$ design, we also conduct classification tasks using the TJM method with different source combination numbers from [4, 6, 8, $\cdots$ , 38] and the dimension of subspace from [40, 60, 80, 100], as shown in Figure 6. Here, (A) is trained on the MA task and tested on the WM task, (B) is in reverse, and d is the dimension number of TJM. For (A), the highest accuracy is obtained at source number N_S = 28, and d = 80. For the MA $\to$ WM task, when the N_S < 20, the overall trend is increasing, and the low-dimensional d can obtain acceptable performance, such as d = 40 in (A); when N_S > 20, the bigger d shows more stable power. For (B), the highest accuracy is obtained at source number N_S = 36, and d = 80. For the WM $\to$ MA task, the accuracy trend is first decreasing at N_S = 16, then increasing. Similarly, when N_S > 16, the higher d shows a stable increase. Since we direct combine data in all the source domains as a new source domain, the training time of ${\mathcal {M}}\to {\mathcal {O}}$ design is longer than ${\mathcal {O}}\to {\mathcal {O}}$ . The above results show that the multi-source selection might be useful for fast cognitive workload recognition.

B. Features Comparisons

To evaluate the different features on the results of the TJM method, we repeat the classification task using PSD and coherence features separately. We set the kernel function as linear, the dimension of subspace as 80, and the other experimental settings stay the same. Accuracy comparisons between PSD and coherence are displayed in Figure 7.

Coherence has an advantage over PSD for about 4.3% and 7.1% increments in the MA $\to$ WM and WM $\to$ MA transfers, respectively (p < 0.01, paired t-test). When compared with the average accuracy of the two transfers, the coherence has an advantage over PSD with about a 5.7% increment. In sum, in terms of the mean accuracy, the coherence performs better than PSD, although these two features have different feature numbers. Previous studies have shown that brain activity is usually completed by multiple brain regions [34], [35]. We argue that the functional connectivity feature (i.e., coherence used in the study) can better reflect the correlation between EEG electrodes, so coherence achieves better classification results than PSD.

C. Feature Importance

We also investigate the importance of the original features for both two tasks. In Figure 8, we show the PSD distributions in low and high workload conditions with the mean PSD value of each channel across 5 frequency bands. Here, the whole EEG brain is divided into the frontal, central, occipital, parietal, and temporal brain regions. We find the PSD distributions in the same task have a similar pattern but vary between the two tasks. For the WM task, with the workload increment, the $\theta$ frequency band in the frontal and central brain regions has an increased PSD; the $\alpha$ and $\beta$ bands in the parietal and occipital brain regions both have a decreased PSD; the $\gamma$ band in the frontal and occipital brain regions has an increased PSD. For the MA task, the $\theta$ and $\alpha$ bands in the temporal regions have similar PSD distribution; the $\beta$ band in the occipital region has a decreased PSD with the workload increment.

In Figure 9, we show the coherence variance distribution of low and high workloads. To obtain the coherence variance distribution, we first use the mean coherence features of high workload to minus the low workload in corresponding frequency bands. In Figure 9(A), the upper is for the WM task and the lower is for the MA task. As we can see, the coherence variance distributions have a similar pattern for both tasks, where $\delta$ band in frontal regions has great importance and the coherence values have an increasing trend, whereas the coherence values in $\beta$ and $\alpha$ bands from central and occipital regions have a decreasing trend. To further find the workload-related frequency bands and channels, we then use a t-test to select the most significate top 200 features (p < 0.01). In Figure 9(B), we show features distribution for the five frequency bands in the pie charts and the channel locations in the radar charts for each task. The majority of coherence features in the $\delta$ band included frontal, central, and occipital regions, following with theta and alpha bands. The PSD features and coherence features describe the EEG characteristics from the spectral and spatial connectivity perspectives, respectively, thus combining them may offer meaningful information.

D. Feature Visualization

To visualize the distributions of feature representation learned by TJM, we try to project the latent feature representations on a two-dimensional (2D) plane using $t$ -distributed stochastic neighbor embedding (t-SNE) in Figure 10. T-SNE is a nonlinear dimensionality-reduction method, aiming to sustain data structure in a low-dimensional space [44]. For briefness, we randomly select one target from the dataset in ${\mathcal {O}}\to {\mathcal {O}}$ design and draw the t-SNE maps on 2D space. Here, Figures 10 (A) and (B) are the features of the original distribution, (C) and (D) are the features after TJM. The first column is trained on the MA task and tested on the WM task, the second column is in reverse order. Light colors denote features from the source domain and deep colors represent features from the target domain. Red denotes low and blue denotes high workload class. Since we use a random order to display the experimental stimulus rather than the commonly-used block design, the classification task in our paper is harder to discriminate than the block design. As we can see from Figures 10 (A) and (B), originally, the samples belonging to different classes overlap substantially across domains, making it difficult to discriminate. In Figures 10 (C) and (D), the discrepancies between different tasks have been reduced, as the target samples are closer to the corresponding classes, which also makes it easy to classify. Although it is easier to classify the low workload and high workload samples in (C) and (D) than (A) and (B), we also find the classification boundaries after the TJM still have some overlaps. We argue that some hard-to-distinguish samples might be misclassified and mislead the classification boundaries in source domains, indicating the need to construct the effective and strong classifier in source domains.

E. Limitations and Future Work

In this paper, the experimental results show that domain adaptation methods in a ${\mathcal {M}}\to {\mathcal {O}}$ design have better performance than $\mathcal {O}\to \mathcal {O}$ design, indicating the multi-source domain adaptation would enhance the classification performance than a single source. We argue the ${\mathcal {M}}\to {\mathcal {O}}$ design may provide more information and more data for constructing a powerful and robust model than ${\mathcal {O}}\to {\mathcal {O}}$ design. However, we use traditional domain adaptation methods in the experiment. Besides, we focus on the binary classification task in the paper, whereas the experimental design can be used to classify seven workload levels. These findings provide the requisite deep transfer models necessary for recognizing the multi-class workload levels for real-world use, due to their effectiveness and strong power [45], especially for multi-source deep transfer learning [46].

SECTION VI.

Conclusion

In this paper, we have presented a preliminary study on EEG-based cross-task cognitive workload recognition using domain adaptation techniques, and a comparative study on a private EEG dataset. The cross-task workload recognition models have been constructed in ${\mathcal {O}}\to {\mathcal {O}}$ , ${\mathcal {M}}\to {\mathcal {O}}$ , and $V{\mathcal {O}}\to {\mathcal {O}}$ transfers. We have compared the performance of domain adaptation methods and conventional non-transfer classifiers. The experimental results have demonstrated the proposed task transfer framework outperforms the non-transfer classifier with an accuracy increase of 3% to 8%. Taking advantage of jointly considering the adaptation of data distributions and the weights of source samples, the TJM consistently achieves the best performance and significant improvements. The results provide the requisite deep and/or multi-source domain adaptation methods that might be necessary for recognizing the multi-class workload levels for real-world use.

References is not available for this document.

MIT Libraries

MIT Libraries

Cross-Task Cognitive Workload Recognition Based on EEG and Domain Adaptation

Abstract:

Metadata

Abstract:

ISSN Information:

Funding Agency:

Introduction