Introduction
Magnetic resonance spectroscopic imaging (MRSI) is a potentially powerful noninvasive, label-free molecular imaging modality. It allows for simultaneous detection and quantification of the spatiotemporal variations of many endogenous molecules in the human body [1]–[3]. However, because the underlying imaging problem of producing spatially resolved spectra is high-dimensional, and because the molecules of interest typically have very low concentrations and thus yield inherently low SNR, in vivo applications of MRSI have been limited to single-voxel spectroscopy or to low-resolution, time-consuming acquisitions [2], [3].
In the past several years, a number of constrained MRSI methods have been proposed to improve the speed, resolution and SNR tradeoffs. The key methodologies in these works typically rely on constructing a reduced-complexity signal model of the high-dimensional MRSI data, by exploiting either sparsity in the spatial and/or spectral domains [4]–[10] or low-rankness of the desired spatiospectral function [11]–[17]. Additional spatial constraints have also been introduced to further take advantage of anatomical prior information readily available in an MR experiment [18]–[21]. Recently, a subspace MRSI method called SPICE has been developed to successfully achieve fast, high-resolution MRSI by jointly designing both data acquisition and processing in a subspace imaging framework [22]–[24]. The key feature of this approach is modeling individual voxel spectra as a linear combination of a small number of basis functions, which can be predetermined using specially designed navigator data, thus significantly reducing the degrees-of-freedom and allowing better tradeoffs among speed, resolution and SNR. However, to capture more complicated spatiospectral variations, the dimension of the linear subspace can increase substantially, reducing its efficiency and thus motivating the need for a more general nonlinear low-dimensional model.
Learning a general nonlinear model for high-dimensional functions is a challenging problem. In the context of MRSI, locally linear embedding (LLE) [25] and Laplacian eigenmaps (LE) [26] have been applied to classify spectra from normal and diseased tissues by estimating the low-dimensional manifolds on which each class of spectra was assumed to reside. However, incorporating such classification models into the imaging process is difficult because reconstruction requires a more accurate representation. Meanwhile, the recent success of deep neural network-based methods in learning complex functional mappings and extracting nonlinear features from high-dimensional data presents new opportunities to address the model learning problem [27]–[29]. A number of works have been proposed in MRSI, mainly focusing on spectral quantification [30]–[32] or spectral artifact removal [33]. A common approach among these methods is to directly learn the entire inverse function that maps the noisy and artifact-containing signals to the desired artifact-free ones or to the spectral parameters (e.g., molecule concentrations or lineshape parameters), by training a deep neural network (DNN) [30], [33], [34]. This approach requires the learned function to simultaneously capture all the nuances in the noise, the artifacts, and the underlying true signals at all possible SNRs, thus dramatically increasing the complexity of the function and of the learning problem. Moreover, it is also sensitive to SNR levels and data acquisition designs.
We proposed in this work a different approach to learn a nonlinear low-dimensional model that captures the inherent degrees-of-freedom in spectroscopic signals and to incorporate the learned representation for constrained MRSI reconstruction. Specifically, we recognized that a general NMR spectrum can be characterized by a small number of parameters based on the well-defined spectral quantification model, and thus should reside on a low-dimensional manifold embedded in the original high-dimensional space. Accordingly, we designed a deep autoencoder (DAE) network to capture this manifold. The effectiveness of DAEs in learning nonlinear low-dimensional representations for high-dimensional data has been well-documented [27]–[29], but such learning requires a large amount of training data. To address this, we acquired a small amount of training data to estimate the distributions of molecule-dependent parameters and then used the spectral fitting model to generate the data needed to train the DAE. A regularization formulation was devised to integrate the learned representation and the spatiospectral encoding model for constrained MRSI reconstruction. An efficient algorithm was developed to solve the associated optimization problem. The proposed model was evaluated against linear dimensionality reduction and demonstrated a more efficient representation of MRSI data. Simulation and experimental results were obtained to demonstrate the capability of the proposed method in producing improved spatiospectral reconstructions over existing methods.
The rest of the paper is organized as follows. Section II provides background on MR spectroscopy modeling and the need for nonlinear dimension reduction, as well as a brief review of autoencoder based neural networks. Section III presents details on the proposed DAE-based learning method, reconstruction formulation, and optimization algorithm. Section IV summarizes the experimental results followed by discussion and conclusion in Sections V and VI.
Background
A. Spectroscopy Signal Modeling
The imaging problem in MRSI can be defined as recovering a high-dimensional spatiotemporal function from measured data \begin{equation*} d_{p}( {\mathbf k},t)\!=\!\int _{V}s_{p}( {\mathbf r})\rho ( {\mathbf r},t)e^{i2\pi \delta f( {\mathbf r})t}e^{-i2\pi {\mathbf k} {\mathbf r}}d {\mathbf r}+ n_{p}( {\mathbf k},t), \quad \tag{1}\end{equation*} where $\rho({\mathbf r},t)$ is the desired spatiotemporal function, $s_{p}({\mathbf r})$ the sensitivity profile of the $p$-th receiver coil, $\delta f({\mathbf r})$ the $B_0$ field inhomogeneity-induced frequency shift, and $n_{p}({\mathbf k},t)$ the measurement noise.
A common parametric model expresses the spatiotemporal function as a sum of $M$ molecule-specific components, \begin{equation*} \rho ( {\mathbf r},t)=\sum _{m=1}^{M}c_{m}( {\mathbf r})\phi _{m}(t)e_{m}(t;\boldsymbol {\theta }_{m}( {\mathbf r})), \tag{2}\end{equation*} where $c_{m}({\mathbf r})$ denotes the concentration of the $m$-th molecule, $\phi_{m}(t)$ its basis function, and $e_{m}(t;\boldsymbol{\theta}_{m}({\mathbf r}))$ a molecule-dependent lineshape factor parameterized by $\boldsymbol{\theta}_{m}({\mathbf r})$.
A common choice of the lineshape factor leads to \begin{equation*} \rho ( {\mathbf r},t) = \sum _{m=1}^{M}c_{m}( {\mathbf r})\phi _{m}(t)e^{-t/T_{2,m}^{*}( {\mathbf r})+i2\pi \delta f_{m}( {\mathbf r})t}e^{-\beta ( {\mathbf r})t^{2}},\quad \tag{3}\end{equation*} where $T_{2,m}^{*}({\mathbf r})$ and $\delta f_{m}({\mathbf r})$ are molecule-dependent relaxation and frequency-shift parameters, and $\beta({\mathbf r})$ models a Gaussian lineshape distortion.
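To make the parametric model concrete, the following sketch synthesizes an FID according to Eq. (3). All parameter values and the trivial basis functions are hypothetical, chosen only for illustration; real basis functions would come from quantum mechanical simulation.

```python
import numpy as np

def synthesize_fid(t, c, phi, T2s, df, beta):
    """Synthesize an FID following the parametric model in Eq. (3).

    t    : (T,) time samples in seconds
    c    : (M,) molecule concentrations c_m
    phi  : (M, T) molecule basis functions phi_m(t)
    T2s  : (M,) apparent relaxation times T2*_m in seconds
    df   : (M,) frequency shifts delta f_m in Hz
    beta : scalar Gaussian lineshape parameter
    """
    # Molecule-dependent Lorentzian decay and frequency shift
    decay = np.exp(-t[None, :] / T2s[:, None]
                   + 2j * np.pi * df[:, None] * t[None, :])
    # Shared Gaussian lineshape distortion e^{-beta t^2}
    gauss = np.exp(-beta * t**2)
    return (c[:, None] * phi * decay).sum(axis=0) * gauss

# Hypothetical two-component example with trivial (constant) basis functions
T = 512
t = np.arange(T) / 2000.0                     # 2 kHz spectral bandwidth
phi = np.ones((2, T), dtype=complex)          # placeholder basis for illustration
fid = synthesize_fid(t, np.array([1.0, 0.5]), phi,
                     np.array([0.05, 0.03]), np.array([0.0, 150.0]), 20.0)
```

At $t=0$ the decay and Gaussian terms are unity, so the first FID sample equals the sum of the concentrations, which gives a quick sanity check.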
B. Deep Autoencoders
A number of methods have been developed to learn nonlinear features from MRSI data for classification purposes, e.g., [25], [26], [39]. The incorporation of learned features into constrained reconstruction requires representations with much higher accuracy and flexibility. Motivated by the recent success of deep neural networks in representation learning [27]–[29], we propose in this work to use a deep autoencoder to learn an accurate and efficient low-dimensional model for MR spectroscopy signals. To provide context for the proposed method, we present in this section a brief introduction to autoencoders and deep autoencoders.
An autoencoder (AE) is a special type of artificial neural network that is typically composed of fully connected layers and has been developed to learn the underlying representation of data of various types [27]. Figure 1a illustrates a commonly used basic autoencoder, in which an input vector ${\mathbf x}$ is encoded into a hidden representation ${\mathbf h}$ and then decoded into a reconstruction $\hat{\mathbf x}$: \begin{align*} \quad {\mathbf h}=&f( {\mathbf x}) = {\mathbf a}\left ( {\mathbf W}_{1} {\mathbf x}; \mathbf {b}\right ), \tag{4}\\ \quad \hat {\mathbf x}=&g( {\mathbf h}) = {\mathbf W}_{2} {\mathbf h} \tag{5}\end{align*} where ${\mathbf W}_{1}$ and ${\mathbf W}_{2}$ are the encoding and decoding weight matrices, ${\mathbf b}$ a bias vector, and ${\mathbf a}(\cdot)$ a nonlinear activation function.
Illustration of AEs: (a) A basic three-layer AE with fully connected layers and a hidden layer to extract nonlinear features of the data; (b) A DAE constructed by stacking the basic AEs shown in (a) with multiple nonlinear hidden layers, which can be trained to extract hierarchical features from high-dimensional data and for nonlinear dimensionality reduction.
Deep autoencoders (DAEs) are deep neural networks that are constructed by stacking multiple nonlinear encoding and decoding layers from the standard AEs (Fig. 1b). The multiple nonlinear layers introduce additional modeling capacity, thus offering stronger representation and feature extraction power than the standard AEs. Mathematically, with $f(\cdot;\boldsymbol{\theta}_f)$ and $g(\cdot;\boldsymbol{\theta}_g)$ denoting the encoding and decoding mappings, a DAE is trained by minimizing the empirical reconstruction error over $N$ training samples $\{{\mathbf x}_s\}$: \begin{equation*} \{\hat {\boldsymbol {\theta }}_{f},\hat {\boldsymbol {\theta }}_{g}\} = \arg \underset {\boldsymbol {\theta }_{f},\boldsymbol {\theta }_{g}}{\text {min}}\cfrac {1}{N}\sum _{s=1}^{N}\varepsilon \left ({ {\mathbf x}_{s}, g(f({\mathbf x}_{s};\boldsymbol {\theta }_{f});\boldsymbol {\theta }_{g})}\right). \tag{6}\end{equation*}
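As an illustration of the encoder/decoder structure and the training objective in Eq. (6), here is a minimal numpy sketch of a fully connected autoencoder. The layer sizes, ReLU activations, and random (untrained) weights are illustrative assumptions, not the architecture used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

class TinyDAE:
    """Minimal fully connected autoencoder: encoder f and decoder g as in
    Eq. (6). Dimensions are illustrative only; biases omitted for brevity."""
    def __init__(self, dims=(64, 32, 8)):
        # Encoder maps 64 -> 32 -> 8; decoder mirrors it back to 64
        self.enc = [rng.standard_normal((m, n)) * 0.1
                    for m, n in zip(dims[1:], dims[:-1])]
        self.dec = [rng.standard_normal((m, n)) * 0.1
                    for m, n in zip(dims[-2::-1], dims[:0:-1])]

    def encode(self, x):
        for W in self.enc:
            x = relu(W @ x)
        return x                      # low-dimensional feature h = f(x)

    def decode(self, h):
        for W in self.dec[:-1]:
            h = relu(W @ h)
        return self.dec[-1] @ h       # linear output layer

def empirical_loss(model, X):
    # (1/N) * sum_s ||x_s - g(f(x_s))||^2, the objective of Eq. (6)
    return np.mean([np.sum((x - model.decode(model.encode(x)))**2) for x in X])

dae = TinyDAE()
X = rng.standard_normal((10, 64))
loss0 = empirical_loss(dae, X)
```

Training would then adjust the weights to minimize `empirical_loss` by stochastic gradient descent; only the forward structure is sketched here.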
Proposed Method
A. Low-Dimensional Spectral Model Learning Using DAE
Recognizing that an inherent low-dimensional representation should exist for the MR spectra specified by Eq. (3), we propose here a strategy to learn this general nonlinear model for MR spectroscopic signals using a DAE. Two major issues arise for this learning task. First, deep neural networks require a large amount of high-quality training data, which is generally a luxury for many MRSI applications. Second, proper designs of the architecture and training procedure are needed to learn a representation of the signals that is useful in the imaging process (e.g., reconstruction from noisy measurements).
We address the first issue by combining the physical model in Eq. (3), quantum mechanical (QM) simulations, and experimentally acquired training data. More specifically, the molecule basis functions are obtained from QM simulations, while the molecule-dependent spectral parameters are sampled from distributions estimated using a small set of experimentally acquired data; the sampled parameters and basis functions are then combined through the model in Eq. (3) to synthesize a large collection of training signals.
The proposed strategy to learn a nonlinear low-dimensional representation of spectroscopic signals: (Leftmost column) Metabolite resonance structures are obtained by quantum mechanical simulations, and spectral parameters are generated from distributions estimated from empirical data. This information is fed into the commonly used spectral fitting model (blue box) to generate a large collection of FID data for training.
We will use 31P spectroscopy data in this work to demonstrate the capability of the proposed approach, considering its clearly defined spectral features and the absence of nuisance signals and macromolecule baselines (see the Discussion for more details on extensions to MRSI of other nuclei). To this end, the QM-simulated resonance structures include the commonly observed 31P-containing metabolites, as described in [46]. The CSI data used to estimate the empirical parameter distributions were acquired on a 7T scanner (more details in the Results section).
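The training-data generation strategy described above can be sketched as follows. The parameter distributions and the simplified Lorentzian peak model (standing in for the QM-simulated basis functions) are hypothetical placeholders; the paper's actual distributions were estimated from acquired CSI data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical (mean, std) parameter distributions, standing in for the
# distributions estimated from a small set of empirical CSI fits.
param_dist = {
    "concentration": (1.0, 0.3),    # arbitrary units
    "T2_star":       (0.04, 0.01),  # seconds
    "freq_shift":    (0.0, 5.0),    # Hz
}

def sample_training_fid(t, n_peaks=3):
    """Draw spectral parameters from the assumed distributions and synthesize
    one training FID using simplified Lorentzian peaks (a stand-in for the
    QM-simulated metabolite basis functions)."""
    fid = np.zeros_like(t, dtype=complex)
    for _ in range(n_peaks):
        c = abs(rng.normal(*param_dist["concentration"]))
        T2s = max(rng.normal(*param_dist["T2_star"]), 1e-3)
        df = rng.normal(*param_dist["freq_shift"])
        fid += c * np.exp(-t / T2s + 2j * np.pi * df * t)
    return fid

t = np.arange(512) / 2000.0
# A miniature training set (the paper generated 300,000 such signals)
train = np.stack([sample_training_fid(t) for _ in range(100)])
```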
B. Constrained MRSI Using the Learned Model
With the trained DAE, denoted as $\mathcal{C}(\cdot)$, we formulate the reconstruction as the following regularized least-squares problem: \begin{align*}&\hspace {-2.6pc}\hat {\mathbf X} = \arg \underset {\mathbf {X}}{\text {min}}\left \Vert{ {\mathbf d}- \Omega \{ {\mathbf F} {\mathbf B} \odot {\mathbf X} \}}\right \Vert _{2}^{2} + \lambda _{2}\left \Vert{ {\mathbf D}_{w} {\mathbf X}}\right \Vert _{F}^{2} \\&\hspace {8pc} + \lambda _{1}\sum _{n=1}^{N}\left \Vert{ \mathcal {C}({\mathbf X}_{n}) - {\mathbf X}_{n} }\right \Vert _{2}^{2} \tag{7}\end{align*}
C. Optimization Algorithm
We describe here an efficient algorithm to solve Eq. (7). Specifically, to decouple the linear least-squares term and the nonlinear regularization functional, we introduce an auxiliary variable ${\mathbf S} = {\mathbf B}\odot{\mathbf X}$ and rewrite Eq. (7) as \begin{align*} \hat {\mathbf X}=&\arg \underset {\mathbf {X}}{\text {min}}\left \Vert{ {\mathbf d}- \Omega \{ {\mathbf F} {\mathbf S}\}}\right \Vert _{2}^{2} + \lambda _{2}\left \Vert{ {\mathbf D}_{w}\bar {\mathbf B}\odot {\mathbf S}}\right \Vert _{F}^{2} \\&+ \lambda _{1}\sum _{n=1}^{N}\left \Vert{ \mathcal {C}({\mathbf X}_{n}) - {\mathbf X}_{n} }\right \Vert _{2}^{2} \\&s.t. ~ {\mathbf B}\odot {\mathbf X}= {\mathbf S}. \tag{8}\end{align*}
The problem in Eq. (8) can then be solved by an ADMM-type algorithm that alternates among the following updates, with $i$ denoting the iteration number, ${\mathbf Y}$ the Lagrangian multiplier, and $\mu$ the penalty parameter:

Update ${\mathbf X}$ with ${\mathbf S}^{(i)}$ and ${\mathbf Y}^{(i)}$ fixed: \begin{align*}&\hspace {-2.6pc} {\mathbf X}^{(i+1)} = \arg \underset {\mathbf X}{\text {min}}~\lambda _{1}\sum _{n=1}^{N}\left \Vert{ \mathcal {C}({\mathbf X}_{n}) - {\mathbf X}_{n} }\right \Vert _{2}^{2} \\&\hspace {6pc} + \cfrac {\mu }{2}\left \Vert{ {\mathbf B}\odot {\mathbf X}- {\mathbf S}^{(i)} + \cfrac {\mathbf Y^{(i)}}{\mu } }\right \Vert _{F}^{2}. \tag{9}\end{align*}

Update ${\mathbf S}$ with ${\mathbf X}^{(i+1)}$ and ${\mathbf Y}^{(i)}$ fixed: \begin{align*}&\hspace {-2.6pc} {\mathbf S}^{(i+1)} = \arg \underset {\mathbf S}{\text {min}} \left \Vert{ {\mathbf d}- \Omega \{ {\mathbf F} {\mathbf S}\}}\right \Vert _{2}^{2} + \lambda _{2}\left \Vert{ {\mathbf D}_{w}\bar {\mathbf B}\odot {\mathbf S}}\right \Vert _{F}^{2} \\&\hspace {5pc} + \cfrac {\mu }{2}\left \Vert{ {\mathbf B}\odot {\mathbf X}^{(i+1)} - {\mathbf S} + \cfrac {\mathbf Y^{(i)}}{\mu } }\right \Vert _{F}^{2}. \tag{10}\end{align*}

Update ${\mathbf Y}$: \begin{equation*} {\mathbf Y}^{(i+1)} = {\mathbf Y}^{(i)} + \mu \left ({ {\mathbf B}\odot {\mathbf X}^{(i+1)} - {\mathbf S}^{(i+1)}}\right).\tag{11}\end{equation*}
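The structure of these alternating updates can be illustrated on a toy problem. In the sketch below, the $\mathbf X$- and $\mathbf S$-subproblems are replaced by closed-form quadratic surrogates (the toy splitting is min ||d - S||^2 + lam*||X||^2 subject to X = S), so this shows only the ADMM skeleton, not the paper's actual subproblem solvers.

```python
import numpy as np

def admm_sketch(d, lam=0.1, mu=1.0, n_iter=200):
    """Structural sketch of the iterations in Eqs. (9)-(11) on a toy
    splitting: min ||d - S||^2 + lam*||X||^2  s.t.  X = S."""
    X = np.zeros_like(d)
    S = np.zeros_like(d)
    Y = np.zeros_like(d)
    for _ in range(n_iter):
        # X-update, cf. Eq. (9): min lam*||X||^2 + (mu/2)*||X - S + Y/mu||^2
        X = mu * (S - Y / mu) / (2 * lam + mu)
        # S-update, cf. Eq. (10): min ||d - S||^2 + (mu/2)*||X - S + Y/mu||^2
        S = (2 * d + mu * X + Y) / (2 + mu)
        # Multiplier update, Eq. (11)
        Y = Y + mu * (X - S)
    return X, S

d = np.array([1.0, -2.0, 3.0])
X, S = admm_sketch(d)
# The toy problem's minimizer is d / (1 + lam); the constraint X = S is
# satisfied at convergence.
print(np.allclose(X, d / 1.1))
```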
The first subproblem requires general unconstrained optimization solvers due to the highly nonlinear mapping $\mathcal{C}(\cdot)$. Noting that the objective in Eq. (9) is separable across voxels, the cost for the $n$-th voxel can be written as \begin{align*} f_{n}({\mathbf X}_{n})=\lambda _{1}\left \Vert{ \mathcal {C}({\mathbf X}_{n})\! -\! {\mathbf X}_{n}}\right \Vert _{2}^{2} \!+\! \cfrac {\mu }{2}\left \Vert{ \left [{ {\mathbf B}\odot {\mathbf X} \!-\! {\mathbf S}^{(i)} \!+\! \cfrac {\mathbf Y^{(i)}}{\mu }}\right ]_{n}}\right \Vert _{2}^{2} \\ {}\tag{12}\end{align*} with the gradient \begin{align*}&\hspace {-3pc} \nabla f_{n}({\mathbf X}_{n}) = 2\lambda _{1}\left ({ {\mathbf J}_{\mathcal {C}} - {\mathbf I}}\right)^{T}\left ({\mathcal {C}({\mathbf X}_{n}) - {\mathbf X}_{n}}\right) \\&\hspace {4pc} + \mu {\mathbf B} _{(n)}^{H}\left ({ {\mathbf B}_{(n)} {\mathbf X}_{n} - {\mathbf S}_{n}^{(i)} + \cfrac {\mathbf Y_{n}^{(i)}}{\mu }}\right), \tag{13}\end{align*} where ${\mathbf J}_{\mathcal {C}}$ denotes the Jacobian of $\mathcal{C}(\cdot)$ evaluated at ${\mathbf X}_{n}$. For an $L$-layer network, the chain rule gives \begin{equation*} {\mathbf J}_{\mathcal {C}} = {\mathbf W}_{L}^{T}\times {\displaystyle \prod _{l=1}^{L-1} {\mathbf U}_{l} {\mathbf W}_{l}^{T}}, \tag{14}\end{equation*} where ${\mathbf W}_{l}$ contains the weights of the $l$-th layer and ${\mathbf U}_{l}$ is the diagonal matrix of the corresponding activation-function derivatives.
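The chain-rule computation of the network Jacobian in Eq. (14) can be sketched as follows. Tanh activations are assumed here purely so that the derivative is smooth and easy to verify against finite differences; the network size and weights are hypothetical.

```python
import numpy as np

def mlp_and_jacobian(x, Ws):
    """Forward pass of a small tanh network y = W_L a(... a(W_1 x)) and its
    Jacobian dy/dx via the chain rule: products of weight matrices and the
    diagonal activation-derivative matrices U_l, as in Eq. (14)."""
    J = np.eye(x.size)
    a = x
    for W in Ws[:-1]:
        z = W @ a
        a = np.tanh(z)
        U = np.diag(1.0 - a**2)   # derivative of tanh: the U_l of Eq. (14)
        J = U @ W @ J
    y = Ws[-1] @ a                # linear output layer
    return y, Ws[-1] @ J

rng = np.random.default_rng(2)
Ws = [rng.standard_normal((5, 4)), rng.standard_normal((4, 5))]
x = rng.standard_normal(4)
y, J = mlp_and_jacobian(x, Ws)

# Finite-difference check of the first Jacobian column
eps = 1e-6
dx = np.zeros(4)
dx[0] = eps
y2, _ = mlp_and_jacobian(x + dx, Ws)
print(np.allclose((y2 - y) / eps, J[:, 0], atol=1e-4))
```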
With the updated ${\mathbf X}^{(i+1)}$, the second subproblem in Eq. (10) is quadratic, and its solution satisfies the linear system \begin{align*}&\hspace {-2.9pc} {\mathbf F}^{H}\Omega ^{H}\Omega \{ {\mathbf F} {\mathbf S}\} + \lambda _{2} {\mathbf B}\odot {\mathbf D}_{w}^{H} {\mathbf D}_{w}\bar {\mathbf B}\odot {\mathbf S} + \cfrac {\mu }{2} {\mathbf S} \\&\hspace {1.6pc}= {\mathbf F}^{H}\Omega ^{H}\{ {\mathbf d}\} + \cfrac {\mu }{2}\left ({ {\mathbf B}\odot {\mathbf X}^{(i+1)} + \cfrac {\mathbf Y^{(i)}}{\mu }}\right), \tag{15}\end{align*}
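Eq. (15) is a large, structured linear system in $\mathbf S$. One standard matrix-free way to solve such systems is the conjugate gradient method, sketched below on a toy SPD system; the paper does not spell out its solver here, so treating it as a CG-type iteration is an assumption.

```python
import numpy as np

def conjugate_gradient(A, b, n_iter=50, tol=1e-10):
    """Solve A s = b for a symmetric positive-definite operator A. A is
    passed as a function so structured operators (FFTs, sampling masks)
    can be applied without ever forming a matrix."""
    s = np.zeros_like(b)
    r = b - A(s)
    p = r.copy()
    rs = np.vdot(r, r).real
    for _ in range(n_iter):
        Ap = A(p)
        alpha = rs / np.vdot(p, Ap).real
        s += alpha * p
        r -= alpha * Ap
        rs_new = np.vdot(r, r).real
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return s

# Toy SPD system standing in for the operator on the left side of Eq. (15)
M = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
s = conjugate_gradient(lambda v: M @ v, b)
print(np.allclose(M @ s, b))
```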
D. Single-Voxel Spectroscopy Denoising
The proposed method can also be applied to spectroscopy data acquired without spatial encoding. In this case, the formulation in Eq. (7) can be simplified to \begin{equation*} \hat {\mathbf x} = \arg \underset {\mathbf {x}}{\text {min}}\left \Vert{ {\mathbf d}- {\mathbf x}}\right \Vert _{2}^{2} + \lambda _{1}\left \Vert{ \mathcal {C}({\mathbf x}) - {\mathbf x} }\right \Vert _{2}^{2} \tag{16}\end{equation*} where ${\mathbf d}$ denotes the acquired noisy signal.
This problem can again be solved using gradient-based algorithms, with the gradient given by \begin{equation*} \nabla f({\mathbf x}) = 2({\mathbf x}- {\mathbf d}) + 2\lambda _{1}\left ({ {\mathbf J}_{\mathcal {C}} - {\mathbf I}}\right)^{T}\left ({\mathcal {C}({\mathbf x}) - {\mathbf x}}\right). \tag{17}\end{equation*}
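A gradient-descent sketch of the denoising problem in Eqs. (16)-(17) is given below. The trained DAE is replaced by a linear orthogonal projector purely for illustration, so that the Jacobian is exact ($J_\mathcal{C} = P$) and the fixed point can be checked analytically; this is not the paper's nonlinear model.

```python
import numpy as np

def denoise(d, P, lam=4.0, step=0.05, n_iter=500):
    """Gradient descent on Eq. (16) using the gradient of Eq. (17), with
    the learned mapping C(.) replaced by a linear orthogonal projector P
    (illustrative stand-in for the trained DAE)."""
    x = d.copy()
    I = np.eye(len(d))
    for _ in range(n_iter):
        # Eq. (17) with C(x) = P x, so J_C = P exactly
        grad = 2 * (x - d) + 2 * lam * (P - I).T @ (P @ x - x)
        x -= step * grad
    return x

# Projector onto the first coordinate (a stand-in "signal subspace")
P = np.diag([1.0, 0.0])
d = np.array([2.0, 1.0])
x = denoise(d, P)
# In-subspace component is kept; out-of-subspace component is shrunk by
# a factor 1/(1 + lam) = 1/5.
print(np.allclose(x, [2.0, 0.2], atol=1e-6))
```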
E. Training and Other Implementation Details
We generated 300,000 31P MR spectra as described in Section III.A for model learning. The molecules we considered in this work are 31P-containing compounds commonly observed during in vivo 31P MRS/MRSI experiments, i.e., PCr,
Results
A. Numerical Simulations
Simulation studies have been conducted to evaluate the proposed low-dimensional model learning and reconstruction method. We first investigated the approximation accuracy of the learned low-dimensional representation by comparing the dimensionality reduction errors at different model orders.
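The comparison against linear dimensionality reduction can be reproduced in miniature: the snippet below computes the relative ℓ2 approximation error of rank-truncated SVD (the linear baseline the learned nonlinear model is compared against) at several model orders. The random test matrix is a stand-in for real spectral data.

```python
import numpy as np

def relative_l2_error(X, Xhat):
    """Relative l2 dimensionality-reduction error: ||X - Xhat||_F / ||X||_F."""
    return np.linalg.norm(X - Xhat) / np.linalg.norm(X)

def svd_truncation_error(X, order):
    """Error of the best rank-`order` LINEAR approximation of the data X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Xr = (U[:, :order] * s[:order]) @ Vt[:order]
    return relative_l2_error(X, Xr)

rng = np.random.default_rng(4)
X = rng.standard_normal((50, 20))        # placeholder "spectra" matrix
errs = [svd_truncation_error(X, k) for k in (1, 5, 10, 20)]
# Error decreases monotonically with model order and vanishes at full rank
print(all(a >= b for a, b in zip(errs, errs[1:])) and errs[-1] < 1e-10)
```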
Approximation accuracy of the learned model: (a) relative approximation errors at different model orders.
A numerical phantom was constructed to validate the proposed MRSI reconstruction method using the learned nonlinear model. Specifically, segmented brain tissue compartments, i.e., gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF), were obtained from an experimentally acquired brain image. The SNR of the simulated data was defined as \begin{equation*} \text {SNR} = \cfrac {\max |\rho ({\mathbf r},f_{\text {PCr}})|}{\sigma }, \tag{18}\end{equation*} where $\sigma$ denotes the standard deviation of the noise.
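The SNR definition in Eq. (18) amounts to the following computation; the spectra, PCr peak location, and noise level below are all hypothetical placeholders.

```python
import numpy as np

def snr_pcr(spectra, pcr_index, sigma):
    """SNR as defined in Eq. (18): maximum PCr peak magnitude over the
    spatial dimension, divided by the noise standard deviation sigma."""
    return np.max(np.abs(spectra[:, pcr_index])) / sigma

# Hypothetical 4-voxel spectra with the PCr peak at frequency index 10
rng = np.random.default_rng(3)
sigma = 0.1
spectra = sigma * rng.standard_normal((4, 32))
spectra[:, 10] += [1.0, 2.0, 1.5, 0.5]       # add PCr peaks of varying height
snr = snr_pcr(spectra, pcr_index=10, sigma=sigma)
```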
The computational phantom for evaluating different reconstruction methods. The top row shows maps of PCr,
A set of denoising results obtained by the proposed method from simulation data with an SNR of 20 is shown in Fig. 5, with comparison to the noisy data (obtained by the standard Fourier reconstruction) as well as results produced by a spatially regularized reconstruction [20] and a subspace constrained reconstruction (SPICE), also with the spatial regularization term [12], [22]. The spatially regularized reconstruction is equivalent to solving Eq. (7) with $\lambda_1 = 0$.
Simulation results showing spatiospectral reconstructions from the ground truth (Gold Standard), noisy data (Noisy Data), anatomically constrained reconstruction (Spatial), SPICE reconstruction (Subspace) and the proposed method (Proposed), respectively. The normalized MSEs are shown for each case under the method labels. The first three columns compare the reconstructed maps of PCr,
Reconstruction errors for different methods (identified by different colored curves) under various SNR levels (SNR defined in the text). As can be seen, the proposed method consistently yields the lowest errors.
Effects of the regularization parameters on the reconstruction results.
B. Experimental Studies
The performance of the proposed method under practical experimental conditions has been evaluated using brain 31P-MRSI data acquired from healthy volunteers on a 7T system (Siemens Magnetom) using a double-tuned 31P-1H surface coil. Data were acquired using a CSI sequence [55] with the following parameters: TR/TE = 170/2.3 ms, field-of-view (FOV)
Reconstructions from the in vivo data acquired using a surface coil. The first row contains the anatomical images within the imaging volume.
Another data set was acquired using a double-tuned volume 31P-1H coil on the same 7T system with matched TR, TE and spectral BW. The FOV was modified to
Results from the data acquired using a volume coil. The maps of PCr and
C. Special Case: Single-Voxel Spectroscopy Denoising
As discussed in Section III.D, the proposed method can also be applied to denoise single-voxel spectroscopy (SVS) data by solving the optimization problem in Eq. (16), which allows the incorporation of data-driven priors to improve the SNR of general single-voxel acquisitions. Figure 10 shows an example of such results to demonstrate this capability. Specifically, a noisy brain 31P spectrum was acquired in vivo and denoised by the proposed method. In addition, a “reference” spectrum was generated by spectral fitting of this data (the residual was inspected to ensure high-quality fitting) and compared to the denoising result. As can be seen, the proposed method achieved effective noise reduction with excellent preservation of spectral features as compared to the fitted spectrum. The peaks of PCr,
Application of the proposed method to denoise a single-voxel 31P spectrum. The noisy data and denoised spectra are shown in the first and second rows, respectively, with comparison to a noiseless spectrum (last row) obtained by performing a spectral fitting of the noisy data.
Discussion
The proposed method has several key differences compared to other learning-based denoising/reconstruction strategies. First, by not directly learning the inverse transform that maps the noisy, artifact-containing data to the desired signals or parameters, we simplified the learning problem and allowed the constructed DAE to focus on learning to extract the physiologically meaningful, molecule-specific low-dimensional features of spectroscopic data (for a particular field strength) instead of the nuisance noise and artifacts. Second, the proposed formulation represents a different way to incorporate deep learning into the imaging process that effectively combines the physics-based data acquisition model and the learned signal model. It offers higher flexibility and can work with different noise levels, acquisition parameters and other spatiospectral constraints without having to retrain the network. Third, our algorithm directly solves the resulting optimization problem, as opposed to previous works which mapped an iterative process to a cascade of networks that approximately solve a similar regularized reconstruction formulation (e.g., [56], [57]). These "unrolling"-based methods can be efficient but still fall into the category of directly learning the entire inverse mapping and can be sensitive to SNR levels and acquisition designs.
An important issue with the proposed method is the selection of the regularization parameters, which we examined by comparing reconstructions obtained with different combinations of $\lambda_1$ and $\lambda_2$.
Several other issues are worth investigating in future research. In particular, the network for learning the nonlinear low-dimensional representation can be further optimized. For example, structures for better handling complex-valued data, different layer designs (convolutional versus fully-connected), asymmetric encoder and decoder networks, and choices of activation functions may be studied. Characterizations of the connections between network complexity, dimensionality of the nonlinear features, and the complexity of the spectral functions will be topics of both theoretical and practical interest. High-quality in vivo data can be acquired to improve the estimation of spectral parameter distributions or directly augment the neural network training data, which should make the learned model more adaptive to experimental variations not captured by synthesized data alone. One unique advantage of the proposed nonlinear model is that the dimensionality is not affected by the range of the spectral parameters, thus more generalizable to patient populations with larger parameter changes. However, if there exist novel molecular features in certain patient groups that are not present in the training data, patient-specific data need to be collected to adapt the learned model to capture such features. A more general lineshape distortion function
While we demonstrated the capability of the proposed approach using 31P-MRSI data, application to other nuclei is possible. Extending the proposed method to 1H-MRSI will require addressing a number of important issues such as residual nuisance water/lipid signals, the presence of a macromolecule baseline in short-TE data, more spectral features, and potentially stronger lineshape distortion due to larger intravoxel $B_0$ inhomogeneity.
It is also worth mentioning that as the proposed method integrates the data acquisition model and the learned nonlinear model through a regularization formalism, it can be readily extended to incorporate other spatiospectral constraints (e.g., spatiospectral sparsity using non-quadratic regularizations) and even potentially learned spatial priors. The improved SNR can allow faster speeds (e.g., with less number of averages and/or shorter TRs) and higher resolution acquisitions. Although we only demonstrated denoising reconstruction, the proposed formulation can be extended to other scenarios such as sparse sampling in the (k,t)-space with modifications of the sampling operator. These directions are currently being pursued and could lead to new opportunities in synergizing physics-based modeling and machine learning to push the resolution and SNR limits of in vivo MRSI.
Conclusion
We have presented a new method to model MRSI data by learning a low-dimensional nonlinear representation and using the learned model for spatiospectral reconstruction. The model was learned using a deep autoencoder-based neural network, which can accurately capture the inherent low-dimensional features of high-dimensional spectral variations and enable effective dimensionality reduction. The proposed constrained reconstruction method incorporates the learned model through a regularization formalism, which was solved by an efficient ADMM-based algorithm. Significantly improved spatiospectral reconstructions over conventional methods have been demonstrated using both simulation and experimental data. The proposed method represents a new means to incorporate deep learning into the imaging process and may be extended in various ways, including improved network designs, training data generation, choices of error metrics, and combinations with other spatiospectral constraints.
ACKNOWLEDGMENT
The authors would like to acknowledge Drs. Wei Chen and Xiao-Hong Zhu for providing the 31P-MRSI data, and Dr. Dinesh Deelchand for sharing the simulated metabolite basis.