Introduction
To perform actions and tasks, the brain relies on the simultaneous activation of many Functional Brain Networks (FBNs), which interact in a coordinated manner to execute the task at hand. Such networks, potentially distributed over the whole brain, are defined as segregated regions exhibiting high functional connectivity. Connectivity is quantified via the underlying correlations among the associated activation/deactivation time patterns, referred to as time courses [1]. Functional Magnetic Resonance Imaging (fMRI) is the dominant data acquisition technique for the detection and study of FBNs [2]. fMRI measures the Blood Oxygenation Level-Dependent (BOLD) contrast [3], which tracks the evoked hemodynamic response of the brain to the corresponding neuronal activity. This process can be modeled as a convolution between the actual neuronal activation and a person-dependent impulse response function, called the Hemodynamic Response Function (HRF). fMRI captures 3D images with a typical resolution of
In the case of block- or event-related experimental designs, i.e., when the subject is presented with a fixed set of conditions, the time courses associated with these experimental conditions are usually estimated as the convolution of the pre-defined stimuli of each condition with the canonical Hemodynamic Response Function (cHRF) [4]. Hereafter, such time courses are referred to as task-related time courses.
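As a concrete sketch, a task-related time course can be generated by convolving a boxcar stimulus with a double-gamma cHRF. The gamma parameters below are commonly used defaults and serve only as illustrative assumptions; they are not values taken from this paper.

```python
import math
import numpy as np

def canonical_hrf(tr=1.0, duration=32.0):
    # Double-gamma cHRF sketch (peak ~6 s, undershoot ~16 s; assumed defaults).
    t = np.arange(0.0, duration, tr)
    def gpdf(t, a, b=1.0):
        # Gamma density with shape a and rate b.
        return (b ** a) * t ** (a - 1) * np.exp(-b * t) / math.gamma(a)
    h = gpdf(t, 6.0) - gpdf(t, 16.0) / 6.0
    return h / np.max(np.abs(h))

def task_time_course(onsets, durations, n_scans, tr=1.0):
    # Boxcar stimulus convolved with the cHRF, truncated to the scan length.
    stim = np.zeros(n_scans)
    for on, dur in zip(onsets, durations):
        stim[int(on / tr):int((on + dur) / tr)] = 1.0
    return np.convolve(stim, canonical_hrf(tr))[:n_scans]
```

The convolution is causal, so the modeled BOLD response lags the stimulus onset by several seconds, as expected from the hemodynamics.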
A prominent approach for fMRI data analysis is via Blind Source Separation (BSS), which is usually performed with the aid of appropriate matrix factorization schemes [5]. In general, BSS methods aim to recover the different sources from the fMRI data without requiring any prior information regarding the experimental task. This feature makes BSS-based methods the dominant tool for the analysis of resting-state fMRI data, which lack any prior external task-related information. Independent Component Analysis (ICA) [6]–[8] and Dictionary Learning (DL) are the most popular paths in this direction. A drawback of ICA is the underlying independence assumption, which can be violated in fMRI, especially in the presence of high spatial overlap [9]–[11]. Unlike ICA, DL relies on sparsity, which is a reasonable assumption for neuronal brain activity [12]–[15].
However, DL approaches are not without shortcomings. Tuning the associated regularization parameters is not easy in practice; it is typically performed via cross-validation, which is not possible in real experiments due to the lack of ground-truth data. Therefore, the only way to fine-tune the parameters is via visual inspection of the results, a process that relies on the subjective judgment of the user and is thus inconsistent and susceptible to errors. This may hamper the adoption of DL approaches in practice.
Alternatively, another candidate family of BSS methods for fMRI data analysis is the Non-negative Matrix Factorization (NMF) approach [16]–[18]. Unlike the aforementioned approaches, NMF methods impose a non-negativity constraint on the matrix factorization. However, the non-negativity constraint may not be valid in practice [19]. In the fMRI context, such a constraint would aim at eliminating the negative contribution of the BOLD signal response, leaving only the positive activations [16]; this is an undesired effect, taking into consideration the true nature of the hemodynamic response [3], [20]. Furthermore, NMF algorithms often require tuning of several regularization parameters, sharing the same limitations with standard DL techniques.
Conventional analysis of fMRI data relies on the General Linear Model (GLM), which assumes the prior availability of the task-related time courses [21]. This approach suffers from a critical limitation: It assumes that the HRF is known and fixed, whereas in reality the HRF may vary across subjects [22], as well as among brain locations [20]. In contrast, BSS methods make no assumption regarding the HRF and can reveal other brain-induced sources beyond the task-related ones. For example, they inherently model interfering artifacts, such as scanner-induced artifacts, uncorrected head-motion residuals, or other unmodeled physiological signals that may obscure the brain activity of interest.
Despite their advantages, BSS methods share a major drawback compared to GLM: when two or more task-related sources manifest themselves in highly overlapping brain regions, ICA (to a larger extent) and DL (to a smaller extent) can fail to discriminate them [23]. From a neuroscience perspective, the presence of overlaps between FBNs is frequent in most of the typical experimental designs of interest. More specifically, several research groups have reported that conventional task-related FBNs such as motor, language, emotion, or auditory, exhibit considerable overlap with each other [24]–[26].
In an attempt to overcome the aforementioned fundamental drawbacks of the BSS methods against GLM, alternative approaches have been proposed [27], [28]. In the ICA case, the most relevant approach is to impose task-related information. Collectively, such methods are referred to as constrained ICA [29]–[34]. Although these often lead to enhanced performance, compared with their fully blind counterparts [33], they suffer from a critical limitation: the embedded constraint, e.g., the imposed task-related time courses, must not violate the independence assumption. This requirement poses stringent constraints either on the total number of allowable time sequences, e.g., [35], or on the nature of the imposed time courses, which need to be independent of each other [30]–[37]. Both restrictions heavily limit the applicability of constrained ICA in fMRI, since the most common case is to have experimental designs that comprise more than two BOLD sequences.
Furthermore, in contrast to many unconstrained ICA algorithms, which require a reduced number of relatively easy to tune parameters, all constrained ICA algorithms require extensive regularization parameter fine-tuning [34], based on cross-validation. Even the most recent constrained ICA technique, referred to as CSTICA [34], involves three regularization parameters and, as it is pointed out by the authors, the algorithm needs further improvement to “enable these parameters not to be determined by the experiments”. Besides constrained ICA, there are also NMF algorithms that allow incorporation of external information [17], [18]; yet, they suffer from similar drawbacks that limit their applicability in practice.
Recently, a DL method called Supervised Dictionary Learning (SDL) [38] was introduced, which allows the incorporation of external information from the task-related time courses with a rationale similar to GLM. As a result, SDL is greatly aided in the case of highly overlapping spatial maps and attains performance similar to that of GLM. However, SDL inherits from GLM two primary drawbacks: a) it builds upon the cHRF, which is fixed and, inevitably, different from the true one, and b) it adopts a regularized formulation of the DL, which inherits the difficulties associated with the tuning of the corresponding regularization parameter. In Section IV, Table 2, we provide a thorough comparison among all competitive approaches and their characteristics of interest in the fMRI case.
In this paper, a novel DL formulation of the fMRI BSS problem, referred to as Information Assisted Dictionary Learning (IADL) is proposed, which, among other merits, alleviates the two aforementioned critical disadvantages of SDL as well as those of the constrained ICA approaches. More specifically:
A new sparsity constraint is adopted, which bears a physical interpretation that naturally complies with the segregated nature of FBNs. Unlike standard approaches, the proposed sparsity constraint establishes a bridge between the optimization parameters and the expected number of activated voxels of each source.
The proposed sparsity constraint also offers the flexibility of simultaneously dealing with sparse and dense sources. Indeed, in real fMRI, in addition to sparse sources, dense sources may appear, usually related to physiological or machine-induced artifacts.
A new semi-blind DL approach is proposed that incorporates task-related information. In contrast to the standard approaches, where any task-related information is fully governed by the canonical HRF, our novel formulation incorporates this information in a relaxed way, allowing the imposed time course to adjust to the subject (or subjects) at hand. Thanks to this relaxation, we implicitly accommodate discrepancies between the HRF and the cHRF, and we cope with distortions and inaccuracies regarding the convolutional model, e.g., due to nonlinear effects. In case no prior task-related information is available (e.g., resting-state fMRI data), the proposed method still benefits from the newly adopted sparsity constraint.
A new, highly realistic synthetic dataset is constructed, which allows a thorough performance evaluation of the new method against state-of-the-art ICA- and DL-based techniques.
Notation: A lower case letter,
Novel DL Constraints Tailored to Task-Related fMRI
A. Preliminaries on DL-Based fMRI Analysis
The data collected during an fMRI experiment form a two-dimensional data matrix as follows: Each of the acquired 3D images is unfolded and stored into a vector,
From a mathematical point of view, the source separation problem can be described as a matrix factorization task of the data matrix, i.e., \begin{equation*} \mathbf {X}\approx \mathbf {D}\mathbf {S},\tag{1}\end{equation*}
\begin{equation*} (\hat {\mathbf {D}},\hat {\mathbf {S}})=\underset {\mathbf {D},\mathbf {S}}{\text {argmin }}\left \Vert{ \mathbf {X}-\mathbf {D}\mathbf {S}}\right \Vert _{F}^{2}\quad \text {s.t. }~\begin{array}{c} \mathbf {D}\in \mathfrak {D}\\ \mathbf {S}\in \mathfrak {L} \end{array}, \tag{2}\end{equation*}
The concept of signal sparsity refers to discrete signals that involve a sufficiently large number of zero values. The typical way to quantify sparsity is via the
The proposed method introduces new constraints on the spatial maps (i.e., on each row of the coefficient matrix,
B. Information-Bearing Sparsity Constraints on the Spatial Maps
In the fMRI framework, sparsity appears to be a natural assumption for the segregated nature of the spatial maps of the FBNs. In other words, each row, say
To the best of our knowledge, this paper is the first to extend the DL framework to allow sparsity promotion along the rows of the coefficient matrix. Looking at (2), sparsity in the rows of the coefficient matrix can be imposed using the following admissible set of constraints:\begin{equation*} \mathfrak {L}_{0}=\left \{{ \mathbf {S}\in \mathbb {R}^{K\times N}\;|\;\left \Vert{ \mathbf {s}^{i}}\right \Vert _{0}\leqslant \phi _{i},\;i=1,2,\ldots,K}\right \},\quad \tag{3}\end{equation*}
It is well known that the \begin{equation*} \mathfrak {L}_{1}=\left \{{ \mathbf {S}\in \mathbb {R}^{K\times N}\,|\,\left \Vert{ \mathbf {s}^{i}}\right \Vert _{1}\leqslant \lambda _{i},\;i=1,2,\ldots,K}\right \}, \tag{4}\end{equation*}
An additional novelty of IADL that allows us to overcome this obstacle is the application across rows of a weighted version of the \begin{equation*} \left \Vert{ \mathbf {x}}\right \Vert _{1,\mathbf {w}}=\sum _{i=1}^{N}w_{i}\left |{x_{i}}\right |, \tag{5}\end{equation*}
\begin{equation*} w_{i}=\frac {1}{\left |{x_{i}}\right |+\varepsilon }\quad i=1,2,\ldots,N, \tag{6}\end{equation*}
\begin{equation*} \mathfrak {L}_{w}=\left \{{ \mathbf {S}\in \mathbb {R}^{K\times N}|\left \Vert{ \mathbf {s}^{i}}\right \Vert _{1,\mathbf {w}^{i}}\leqslant \phi _{i}\quad i=1,2,\ldots,K}\right \},\quad ~ \tag{7}\end{equation*}
Hereafter, an equivalent but conceptually easier to handle sparsity-related measure that is independent of the length of the vector, known as sparsity percentage, will be used interchangeably with sparsity level. Sparsity percentage expresses the proportion of zeros within a vector, \begin{equation*} \theta =\left ({1-\frac {\phi }{N}}\right)\times 100,\tag{8}\end{equation*}
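For illustration, the mapping of (8) between sparsity percentage and the allowed number of nonzeros, together with the weighted l1 norm of (5) and the reweighting rule of (6), can be sketched as follows (the value of epsilon is an illustrative assumption):

```python
import numpy as np

def nonzeros_from_percentage(theta, n):
    # Invert Eq. (8): phi = N * (1 - theta/100) active voxels allowed.
    return int(round(n * (1.0 - theta / 100.0)))

def reweighting(x, eps=1e-3):
    # Weights of Eq. (6): small |x_i| -> large weight, pushing it toward zero.
    return 1.0 / (np.abs(x) + eps)

def weighted_l1(x, w):
    # Weighted l1 norm of Eq. (5).
    return float(np.sum(w * np.abs(x)))
```

With the weights of (6), each nonzero entry contributes roughly 1 to the weighted norm, so the constraint bound can be read directly as a count of active voxels.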
C. Task-Related Dictionary Soft Constraints
The enhanced discriminative power of the GLM over BSS methods comes from the fact that, in the GLM, the task-related time courses are explicitly provided to GLM modeling via the estimated BOLD sequences [2]. Such information is left unexploited in the BSS framework. Hence, it seems reasonable to also incorporate this information into the BSS methods, leading naturally to a semi-blind formulation.
As stated in the introduction, in contrast to ICA techniques, DL-based methods can easily incorporate, in principle, any constraint in the time courses, since sparsity is not affected. This fact has been exploited in SDL through splitting the dictionary into two parts:\begin{equation*} \mathbf {D}=\left [{\boldsymbol{\Delta },\mathbf {D}_{F}}\right]\in \mathbb {R}^{T\times K},\tag{9}\end{equation*}
In this paper, we relax the strong equality requirement of SDL to a looser similarity-based distance-measuring norm constraint. Then, if part of the a priori information is inaccurate, e.g., the assumed HRF differs from the true one, the method can efficiently adjust the constrained atoms since they are not forced to remain fixed and equal to the preselected time courses. Moreover, the proposed modeling also accounts for multiple factors that potentially alter the functional shape of the task-related time courses across subjects and brain regions, such as vascular differences, partial volume imaging, brain activations [22], hematocrit concentrations [44], lipid ingestion [45], and even nonlinear effects due to short interstimulus intervals [46].
Mathematically, the starting point is to split the dictionary into two parts:\begin{equation*} \mathbf {D}=\left [{\mathbf {D}_{C},\mathbf {D}_{F}}\right]\in \mathbb {R}^{T\times K},\tag{10}\end{equation*}
\begin{equation*} \mathfrak {D}_{\delta }=\left \{{ \mathbf {D}\in \mathbb {R}^{T\times K}\;\left |{\begin{array}{ll} \left \Vert{ \mathbf {d}_{i}- \boldsymbol {\delta }_{i}}\right \Vert _{2}^{2}\leqslant c_{\delta } & \scriptstyle {i=1,\ldots,M} \\ \left \Vert{ \mathbf {d}_{i}}\right \Vert _{2}^{2}\leqslant c_{d} & \scriptstyle {i=M+1,\ldots,K}\end{array}}\right.}\right \},\quad ~ \tag{11}\end{equation*}
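A minimal sketch of how the admissible set of (11) can be enforced via Euclidean projections is given below. The radii values are illustrative assumptions, and this is not the authors' exact implementation:

```python
import numpy as np

def project_atom(d, center, c):
    # Euclidean projection of atom d onto the ball ||d - center||_2^2 <= c.
    r = d - center
    norm = np.linalg.norm(r)
    radius = np.sqrt(c)
    if norm <= radius:
        return d
    return center + r * (radius / norm)

def project_dictionary(D, Delta, c_delta=0.1, c_d=1.0):
    # First M atoms stay close to the imposed time courses Delta (Eq. 11);
    # the remaining (free) atoms are kept inside an l2 ball around the origin.
    D = D.copy()
    M = Delta.shape[1]
    for i in range(D.shape[1]):
        center = Delta[:, i] if i < M else np.zeros(D.shape[0])
        D[:, i] = project_atom(D[:, i], center, c_delta if i < M else c_d)
    return D
```

Small values of c_delta keep the constrained atoms close to the imposed cHRF-based time courses while still letting them adapt to the subject at hand.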
The IADL Algorithm
In this section, we present an implementation of IADL that solves (2), incorporating the two proposed sets of constraints, namely,
The simultaneous minimization for
Put succinctly, the proposed DL algorithm follows the standard scheme of classical DL methods, which iteratively alternate between a sparse coding step and a dictionary update step. Concerning the sparse coding step, the corresponding recovery mechanism is a soft thresholding operator similar to the one that corresponds to the standard
In Algorithm 1, we present the pseudo-code for solving (2), given the number of sources,
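The alternating scheme described above can be sketched as follows. For simplicity, the projection onto the row-sparsity set is implemented via hard thresholding (the paper's solver uses a soft-thresholding operator), and the plain least-squares dictionary update and step size mu are illustrative assumptions, not Algorithm 1 itself:

```python
import numpy as np

def row_threshold(S, phis):
    # Keep the phi_i largest-magnitude coefficients of each row of S,
    # a projection onto the row-sparsity set of Eq. (3).
    S = S.copy()
    for i, phi in enumerate(phis):
        if phi < S.shape[1]:
            idx = np.argsort(np.abs(S[i]))[:-phi]  # smallest N - phi entries
            S[i, idx] = 0.0
    return S

def dl_step(X, D, S, phis, mu=1e-3):
    # One alternation: gradient step on S followed by the sparsity projection,
    # then a least-squares dictionary update (dictionary constraints omitted).
    R = D @ S - X
    S = row_threshold(S - mu * D.T @ R, phis)
    D = X @ np.linalg.pinv(S)
    return D, S
```

In a full solver, the dictionary update would additionally be projected onto the admissible set of (11) at every iteration.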
Performance Comparison
A. Performance Results Based on Synthetic Data
In Section III of the supplementary material, a novel synthetic data set is presented. This highly realistic dataset emulates demanding experimental tasks, where some of the spatial maps substantially overlap each other. Therefore, this synthetic dataset allows us to effectively evaluate the performance of the proposed DL method, in comparison with the state-of-the-art of blind and semi-blind approaches, under more realistic settings.
The adopted performance measure,
The aim of this performance study is twofold: First, to study the effectiveness of the proposed approach in dealing with HRF mis-modeling; and second, to evaluate the decomposition performance of the algorithm with respect to the set of sources of interest. As benchmarks, the following competitive algorithms are considered: a) McICA, which is a constrained ICA algorithm [35] that allows assisting a source using task-related time courses, b) SDL, c) an Online DL algorithm (ODL) [51], which is included in the SPAMS toolbox, and d) three ICA algorithms, namely, Infomax [52], a widely used ICA algorithm within the fMRI community, JADE [53], which we used as an initialization point for all DL algorithms, and CanICA [54], a state-of-the-art ICA-based algorithm for fMRI data analysis.
To emulate HRF variability, we generated six different “subjects”, through six different, yet realistic, synthetic HRFs. The selected HRFs are depicted in Fig. 2. Six different datasets were built, one for each subject, with the only difference among them being the HRF used to generate the brain-induced time courses. Sources 1, 11 and 14 (see Fig. 1) were chosen to be the task-related time courses, since they correspond to realistic scenarios that are often encountered in practice: Source 1 is easy to identify, since it barely spatially overlaps with other sources and corresponds to a block-event experimental design. Sources 11 and 14 are more challenging and exhibit notable overlap, emulating an event-related task (intervals shorter than 5 seconds, see Fig. 1). Consequently, we generated the imposed task-related time courses
Visual representation of the synthetic spatial maps and their corresponding time courses generated with the canonical HRF. The intensity of the sources is normalized to facilitate visual inspection.
Graphic representation of 100 HRFs (gray) randomly generated from the two-gamma distributions model. The red curve represents the canonical HRF (cHRF) and the rest of the colored HRFs correspond to the five selected alternatives.
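The two-gamma HRF model referenced in the caption can be sketched as below; the jitter ranges used here to emulate inter-subject variability are illustrative assumptions, not the ones used to build the dataset:

```python
import math
import numpy as np

def double_gamma_hrf(t, peak=6.0, under=16.0, ratio=6.0):
    # Two-gamma HRF model; the defaults approximate the canonical HRF.
    def gpdf(t, a, b=1.0):
        return (b ** a) * t ** (a - 1) * np.exp(-b * t) / math.gamma(a)
    h = gpdf(t, peak) - gpdf(t, under) / ratio
    return h / np.max(np.abs(h))

def random_hrfs(n, tr=1.0, duration=32.0, seed=0):
    # Jitter the gamma parameters to emulate HRF variability across subjects.
    # The jitter ranges below are assumptions for illustration only.
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, duration, tr)
    return [double_gamma_hrf(t,
                             peak=rng.uniform(5.0, 8.0),
                             under=rng.uniform(14.0, 18.0),
                             ratio=rng.uniform(4.0, 8.0))
            for _ in range(n)]
```

Convolving the same stimulus sequence with each jittered HRF yields one synthetic "subject" per HRF, as in the experiment described above.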
Concerning the parametrization of the algorithms,
For (a), the imposed sparsity percentages for the task-related source 1, 11 and 14 are 95%, 90% and 94% respectively, which correspond to a slight overestimation compared to the true sparsity values. Sparsity set-up information related to this experiment is also listed in Table 1. In particular, the true sparsities of the task-related time courses are shown in Table 1.b and the corresponding imposed sparsities are shown in the corresponding rows of the first column of Table 1.a (sparsity set up scheme
For (b), the exact used sparsity values are shown in the rest of the rows of
Finally, for (c), we set
SDL and ODL tune the sparsity constraint via a single regularization parameter,
On the other hand, the optimal value for the fully blind ODL algorithm was found to be
McICA requires fine-tuning a set of four regularization parameters. We observed that the optimal selection of these parameters heavily depends on the particular synthetic subject, similarly to the SDL algorithm. Accordingly, we manually optimized these parameters via cross-validation, aiming to achieve the best average performance over all the subjects.
Fig. 3.A and Fig. 3.B show the performance results with respect to the full source and the time course, respectively. The horizontal axis indicates the six synthetic subjects that correspond to different HRFs. Both figures comprise three inset graphs, each of which depicts performance with respect to three different sets of sources: (a) the task-related sources 1, 11 and 14, (b) the brain-like sources (
Performance comparison of different approaches with respect to the full source (A) and with respect to the time courses only (B). The inset figures correspond to (a) the sources of interest [1, 11, 14], (b) the brain-like sources only, and (c) all sources (including artifacts).
Let us first focus on the two information-assisted DL algorithms, SDL and IADL, whose performance is indicated with green and dark blue curves, respectively. They are both assisted with the task-related time courses that correspond to the cHRF. In the SDL case, the solid and the dashed curves correspond to regularization parameter tuning equal to
In the cases where only brain-like and all the sources are considered (mid and rightmost subfigures), the proposed approach still outperforms SDL. Note that the time courses estimated by IADL are overall better than those of SDL even in the case of the canonical subject, in which SDL is fully optimized exploiting ground truth knowledge. In comparison to the fully blind methods, i.e., ODL, JADE, and Infomax, task-related assisted methods perform better. Concerning the performance of the blind methods, we observed that ODL works better than ICA-based approaches for the optimal selected
The yellow curves in Fig. 3 depict the performance of McICA. This particular constrained ICA algorithm provides estimates only of the assisted sources [35]; hence, results correspond only to the assisted brain-like sources. First, we observe that the McICA algorithm performs better than JADE, CanICA, and Infomax. On the other hand, our proposed IADL algorithm outperforms McICA for all the studied synthetic subjects.
Observe that CanICA (orange curves) exhibits performance that is similar to that of the other two ICA algorithms (see Fig. 3.B). This was expected because CanICA performs best at the multi-subject level [54], rather than in single-subject setups such as the one we used so far, where it does not offer any particular advantage. In Section V, where we deal with multi-subject analysis, we confirm the superiority of CanICA over the other ICA methods examined here.
B. IADL Robustness Against Sparsity Parameter Mistuning
In this section, the tolerance of the proposed approach to the choice of maximum sparsity parameters,
Fig. 4 shows the performance for these three sparsity setups. The first figure illustrates performance results with respect to the full sources, whereas the second one shows performance results concerning the time courses only. Besides the sparsity levels, the setup of the experiment is the same as the previous one. Performance curves of JADE and SDL are repeated here for reference. Observe that the proposed approach is remarkably robust to sparsity specifications. Indeed, in the analysis of the performance of the time courses (Fig. 4.B), there are no detrimental effects, whereas in the full-source case,
Performance evaluation of the IADL algorithm for three different choices of sparsity. Fig. (A) shows performance with respect to the full sources and Fig. (B) with respect to the time courses only. The inset figures correspond to (a) the sources of interest [1], [11], [14], (b) the brain-like sources only, and (c) all sources (including artifacts). The figures also include the SDL and JADE results from Fig. 3 as a reference.
C. Comparison Between IADL and GLM
For completeness, we have performed a comparison between the proposed IADL and the standard GLM approach using the SPM12 toolbox, where the design matrix comprises the three task-related time courses. For this study, we observed that SPM and IADL recover the assisted sources 1 and 3 correctly. However, for assisted source 2 the result of SPM is significantly inferior to that of IADL. Full details of this experiment can be found in Section V-B of the supplementary material.
D. Comparison Between IADL and Several Alternatives for the Analysis of fMRI Data
To position IADL among the alternative matrix factorization-based methods for fMRI, concerning all their features and analysis capabilities, we present a comprehensive comparison in Table 2. Among these different alternatives, observe that IADL complies with all the studied criteria apart from criterion F, namely the ability to explicitly specify the locations/voxels within the brain in which an FBN will appear through user-defined masks. However, note that IADL can also comply with this criterion with relatively mild modifications in the spatial map constraint,
Task-fMRI Data Analysis
The following study with real fMRI data aims to illustrate the advantages of the proposed approach in a realistic scenario, and to compare its performance with other standard techniques. In particular, for this study, in addition to the IADL algorithm we also employed SDL, the standard GLM implemented in SPM, and two ICA algorithms: ERBM from the GIFT toolbox and CanICA from the Nilearn toolbox.
A. fMRI Data
For this study, we considered 900 subjects from the motor-task fMRI dataset of the WU-Minn Human Connectome Project [57], which is available at the HCP repository; the acquisition parameters are summarized in the imaging protocols. This experiment follows a standard block paradigm, where a visual cue asks the participants to either tap their left/right fingers, squeeze their left/right toes, or move their tongue. Each movement block lasts 12 seconds and is preceded by a 3-second visual cue. In addition, there are three extra fixation blocks of 15 seconds each, as detailed in the Human Connectome Project protocols.
There are two main reasons for selecting this specific dataset: First, the FBNs related to this experimental design are well studied [38], [58]–[62], which facilitates the evaluation of the results. Second, this dataset is particularly challenging: The FBNs of interest exhibit significant asymmetries in their intensity [60] and some spatial maps exhibit high overlap, particularly within the cerebellar cortex [61]. Besides, the cerebral areas from the motor cortex are larger and exhibit a lower inter-subject variability than those from the cerebellum.
Finally, on top of the standard preprocessing pipeline already applied to the obtained datasets (see [59], [63]), we further smoothed each volume with a 4-mm FWHM Gaussian kernel.
B. Methods & Parameter Setup
For the GLM analysis, we used SPM12, and we followed the same standard procedure as described in [59]. Put succinctly, we defined six task-related time courses, i.e., one per experimental condition: visual, right-hand, left-hand, right-foot, left-foot, and tongue. We estimated each task-related time course as the convolution between the cHRF and each experimental condition, where each experimental condition consisted of a succession of blocks with duration equal to its presentation time. Apart from the six task-related time courses, the design matrix also includes the temporal and spatial derivatives. Then, following the same approach as in [59], at the single-subject level, we computed a linear contrast to assess significant activity.
For the group study, we randomly split the dataset to generate a total number of twenty groups of 15, 30, and 60 subjects each. Then, we performed a group analysis for each combination of subjects to assess significant activity. We studied these three group sizes to evaluate the impact of the number of subjects on performance.
For the matrix factorization methods, in all cases the total number of sources was set equal to 20, which is a reasonable estimate of the expected number of sources (see further discussion in Section I.A of the supplementary material). For the ICA analysis, first, we used the software toolbox GIFT, which implements multiple ICA algorithms in the context of fMRI data analysis. In this study, the algorithms Fast-ICA, Infomax, ERICA, and ERBM were tested. To save space, we only report the results of the ERBM algorithm [64], since it appeared to perform somewhat better compared to the rest. Finally, we performed a group fMRI analysis using CanICA, a state-of-the-art ICA-based algorithm.
Concerning IADL and SDL, they both used the same six task-related time courses used in the SPM analysis. Table 1.c shows the sparsity percentages set in IADL. The first six values correspond to the task-related time courses, i.e., visual, right/left-hand, right/left-foot, and tongue, in this order. The sparsity percentages of the rest of the sources (7th to 20th in Table 1.c) are gradually diminishing in a fashion similar to the one used in the analysis of synthetic data above (see Table 1.a). As discussed in the previous section, IADL is robust to sparsity overdetermination. Therefore, little difference in performance is expected if the sparsity percentage values are increased (e.g., all the first six values in Table 1.c could be set to 90%). Parameter
In SDL, the
The spatial maps used to evaluate the performance of the matrix factorization methods were computed via the pseudo-inverse approach, namely
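Assuming the pseudo-inverse estimator is S = pinv(D) X (the formula is truncated in the text above), a minimal sketch reads:

```python
import numpy as np

def spatial_maps(D, X):
    # Least-squares spatial maps via the Moore-Penrose pseudo-inverse:
    # S = pinv(D) @ X, assuming this is the intended estimator.
    return np.linalg.pinv(D) @ X
```

When D has full column rank, this recovers the coefficient matrix of the noiseless factorization X = DS exactly.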
C. Performance Evaluation
To quantitatively evaluate the performance of the different methods, we considered two different criteria: a) reproducibility and b) reliability.
1) Reproducibility
Reproducibility refers to the similarity among different realizations with the same number of subjects, since the particular selection of subjects may affect performance. In this study, we measured reproducibility among pairs of the twenty randomly generated groups using three complementary metrics:
- A metric that measures the one-to-one overlap among components [54].
- A metric that quantifies the match between the subspaces spanned by the maps from each component [54].
- The Jaccard overlap, a standard metric that quantifies the similarity between images, which has already been used to quantify group reproducibility in fMRI [62].
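The Jaccard overlap of the third metric can be sketched as follows for binarized spatial maps:

```python
import numpy as np

def jaccard(map_a, map_b):
    # Jaccard overlap of two binarized spatial maps: |A and B| / |A or B|.
    a, b = map_a > 0, map_b > 0
    union = np.count_nonzero(a | b)
    return np.count_nonzero(a & b) / union if union else 1.0
```

A value of 1 indicates identical activation patterns, while 0 indicates no common active voxels.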
It is essential to understand that reproducibility alone does not measure if the method works correctly or not, in the sense of correctly separating the different brain sources. Instead, it only provides information regarding how consistent the obtained results are among different realizations of the group analysis. Note that a method can systematically fail to separate a specific brain source, and it can still exhibit good reproducibility (i.e., consistently producing similar wrong results), for example, keeping two sources merged as a single one.
2) Reliability
Reliability refers to the ability of the methods to consistently detect significant activity within the expected regions of interest (ROIs). Note that the ROIs related to the motor tasks have been well documented and studied [24], [38], [58]–[62]. Therefore, we can define a mask that approximately delineates the corresponding ROIs for each motor task.
To measure reliability, we first construct a conjunction map for each motor task. The conjunction map is a particular kind of spatial map, where each voxel indicates the number of subjects/groups that exhibited significant activity within that specific voxel. In this study, for each method, we determine the conjunction map among the twenty groups for each studied group size. Then, we normalize the conjunction maps by dividing by the total number of realizations. Thus, the normalized conjunction maps have values from 0 to 1, where 0 means that no activity was detected in that voxel, and 1 means that all studied groups showed significant activity within that particular voxel.
Using the conjunction maps and the defined ROIs for each motor task, we quantify the reliability of the method through two complementary metrics:
OC - Overlap Consistency
FPR - False Positive Rate
The OC measures the mean value of the detected significant active voxels within the ROI, that is, the mean value of the active voxels of the conjunction map within the ROI. This value serves to evaluate the success of the method in finding consistent activity within the expected ROIs. On the other hand, the FPR (also known as fall-out or false alarm rate) is defined as the ratio of the number of negative events wrongly categorized as positive (false positives) over the total number of actual negative events. We count as a false positive any activated voxel outside the ROI.
Ideally, the perfect method should exhibit a mean overlap consistency of 100% and an FPR of 0%, meaning that all the analyses obtained the same spatial maps within the expected ROI. Note that both the OC and the FPR are needed to provide a complete view of the reliability of the results. For example, one method may exhibit a good OC, but also a large FPR, in which case, the method is not reliable. Similarly, a lower FPR is not useful if the method does not have a good OC.
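Based on the definitions above, the two reliability metrics can be sketched as follows; the exact thresholding and normalization conventions are our assumptions:

```python
import numpy as np

def overlap_consistency(conj_map, roi_mask):
    # Mean of the normalized conjunction map over active voxels inside the
    # ROI, expressed as a percentage (100% = all groups agree everywhere).
    vals = conj_map[roi_mask & (conj_map > 0)]
    return 100.0 * vals.mean() if vals.size else 0.0

def false_positive_rate(conj_map, roi_mask):
    # Percentage of voxels outside the ROI flagged active in any realization.
    outside = ~roi_mask
    n_fp = np.count_nonzero(conj_map[outside] > 0)
    return 100.0 * n_fp / np.count_nonzero(outside)
```

Both functions take the normalized conjunction map (values in [0, 1]) and a boolean ROI mask of the same shape.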
D. Results
We analyzed the brain areas associated with each motor task from the cerebrum and the cerebellum separately. We divided the analysis of performance over these two main areas to provide a complete view of the behavior of the studied methods, since the motor areas within the cerebrum present different behavior than those from the cerebellum, as we further discuss in the next section.
Table 3 shows the obtained reproducibility measurements for each studied method. The table depicts the mean value of the three implemented metrics (
For completeness, Fig. 5 shows the spatial maps from two randomly selected groups with 60 participants. We focused on the spatial maps from the group analysis with 60 participants since, according to Table 3, all methods performed best with this group size. Both figures show the significant active voxels at
Significant active voxels (
E. Discussion
The quantitative study verifies that the proposed IADL algorithm exhibited the best reproducibility and reliability at the same time, followed by CanICA, SDL, SPM, and ERBM.
A closer inspection reveals some interesting facts. First, regarding the performance of the ICA algorithms, CanICA achieved considerably better performance than ERBM. In particular, CanICA exhibits excellent reproducibility among groups. These results agree with the expected behavior of CanICA since, before the separation of the independent components, CanICA identifies the subspace common to all subjects that contains the activation patterns, as detailed in [54]. However, CanICA exhibited a relatively poor OC with considerable variance among tasks, as Table 4 shows. The reason is that CanICA (as well as ERBM) had difficulty separating some of the motor areas. For example, the spatial maps in Fig. 5 show that CanICA failed to correctly separate the motor areas of the feet. Furthermore, the excellent reproducibility of CanICA indicates that the algorithm was systematically failing to separate these motor areas.
Second, SPM presents an excellent OC, especially over the cerebral areas. However, SPM exhibited low reproducibility, driven by the large number of false-positive activations, as the large FPR in Table 4 indicates.
SDL appears to outperform SPM in all metrics. This is consistent with the quantitative analyses at the single-subject level performed in [38] but, to our knowledge, had never previously been demonstrated through quantitative analyses at the group level. Compared with IADL, SDL's performance lies roughly in the middle between IADL and SPM. Moreover, we observed that SDL fails to recover some motor areas in some realizations. For example, in Fig. 5.A, SDL missed the motor area corresponding to the right foot, whereas IADL and SPM both show significant activity within the expected ROI. Note that missing an area does not occur often and is driven by the particular set of subjects; unlike the case of CanICA, the relatively lower reproducibility of SDL does not allow us to generalize this observation. In addition, we noticed that SDL presented more spurious activity across the brain than IADL, though less than SPM (see the FPR in Table 4).
With respect to parameter selection for the semi-blind methods, we should emphasize that the differences between IADL and SDL are substantial. First of all, parameter
1) Differences Between the Main Brain Areas
In this study, we analyzed the performance of the studied methods over the cerebrum and the cerebellum separately. The main reason we divided the analysis between these two areas is that the FBNs of interest exhibit significant asymmetries, and some of the areas present high overlap. Furthermore, the areas of the motor cortex are large and present a relatively high intensity compared to those of the cerebellum.
The detailed quantitative analysis of the performance of the studied methods over these two main brain parts revealed that the cerebellar areas are considerably more challenging than the cerebral ones, as expected. In particular, we observed that the performance of all studied methods drops over the cerebellar areas compared to the results from the motor cortex. Interestingly, IADL attains the best performance over the cerebellum, exhibiting a relatively high OC, low FPR, and good reproducibility, even for the group analysis with just 15 subjects.
2) Effect of Group Size
Our quantitative analysis revealed that the number of subjects has a tangible effect on the performance of the various methods. In general, larger groups exhibit better performance than smaller ones, which complies with the expected general behavior. This effect is particularly evident for SPM, which showed the largest performance gain among the tested methods. Furthermore, we observe that the obtained results for SPM closely resemble the Jaccard overlap results regarding the effect of the number of subjects reported in [62], which also studied the same motor-task fMRI dataset.
3) Algorithmic Complexity
Regarding the computational cost of the proposed algorithm, we implemented an efficient approach based on [40], which avoids computationally expensive matrix calculations, as we detail in Section V of the supplementary material. Thus, the most computationally expensive step of the proposed algorithm is the sparse projection (see line 8 in Algorithm 1). The sparse projection depends on the number of voxels,
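To give a concrete sense of why such a projection scales with the number of voxels, the sketch below shows a standard Euclidean projection onto the (unweighted) ℓ1 ball in the style of Duchi et al. (2008), whose cost is dominated by a sort, i.e., O(V log V) in the number of entries V. This is a generic illustration only; it is not the weighted projection used by IADL, and the function name is hypothetical.

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto the l1-ball {x : ||x||_1 <= radius}.

    Cost is dominated by the descending sort of |v|: O(V log V) for a
    vector with V entries (one entry per voxel in the fMRI setting).
    """
    v = np.asarray(v, dtype=float)
    if np.abs(v).sum() <= radius:
        return v.copy()          # already feasible: nothing to do
    u = np.sort(np.abs(v))[::-1]             # sorted magnitudes, descending
    cssv = np.cumsum(u)                      # cumulative sums of magnitudes
    # Largest index rho with u[rho] * (rho+1) > cssv[rho] - radius
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > (cssv - radius))[0][-1]
    theta = (cssv[rho] - radius) / (rho + 1.0)   # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)
```

Vectors already inside the ball are returned unchanged; vectors outside are soft-thresholded onto its boundary, which is what makes the result sparse.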
Conclusions
In this paper, we present a new Dictionary Learning method that naturally incorporates external information via two novel convex constraints: a) a sparsity constraint based on the weighted
The proposed sparsity constraint constitutes a natural alternative to the standard
The advantages and the enhanced performance obtained by the proposed method have been verified through detailed quantitative analyses with both realistic synthetic and task-related real fMRI datasets.
ACKNOWLEDGMENT
The authors would like to thank Prof. E. Kofidis, Dept. of Statistics and Insurance Science, University of Piraeus (Greece), and C. Chatzichristos, Dept. of Electrical Engineering (ESAT), Leuven (Belgium), whose comments helped improve the quality of the final manuscript.