Introduction
Brain tumours (BTs) are a serious and potentially life-threatening form of cancer [1]. Frequent headaches, memory loss, difficulty concentrating, seizures, and coordination and speech problems are among the major symptoms of BT. Based on growth rate, origin, and progression state, BTs are classified into different grades [2]. Detecting BTs at an early stage and assigning them to a grade is highly important for selecting the best treatment. Among the available imaging technologies, Magnetic Resonance Imaging (MRI) is widely used to identify BTs without brain surgery [3]. MRI is a non-invasive, pain-free medical imaging technique that produces high-resolution images of tumours. Owing to this high resolution, MRI is considered the best medical imaging approach for BT detection [4]. At present, researchers have established many automated techniques for the detection of BTs. Most existing systems are based on machine learning (ML) algorithms, which involve supervised and unsupervised learning approaches [5].
In recent times, deep learning (DL), a subgroup of ML, has demonstrated high effectiveness compared with conventional methods, particularly in classification and segmentation problems [6]. The convolutional neural network (CNN) model is now expanding quickly owing to its performance. A CNN is a kind of DL model used to analyse visual imagery and generally needs minimal preprocessing [7]. Compared with traditional ML, the CNN model offers improved accuracy and feature learning for categorizing different grades and types of BTs [8]. Medical image classification assigns images to categories based on the lesion type observed in them, using a supervised learning technique. Once the classifier has been trained on a set of images, it is used in subsequent machine-based healthcare diagnoses. Lately, BT classification has been performed using ML and imaging techniques [9]. CNNs use a convolution operator rather than general matrix multiplication in multiple layers of the network, which has contributed to the preference for convolutional networks in solving problems with a high computational cost [10].
This study designs an Ensemble Learning Driven Computer-Aided Diagnosis Model for Brain Tumor Classification (ELCAD-BTC) technique on MRIs. The presented system applies a Gabor filtering (GF) approach to remove noise and increase the quality of MRI images. Moreover, an ensemble of three DL models, namely EfficientNet, DenseNet, and MobileNet, is used for feature extraction. Furthermore, the denoising autoencoder (DAE) approach is exploited to detect the presence of BTs. Finally, a social spider optimization algorithm (SSOA) is executed for the hyperparameter tuning of the DL models. To demonstrate the improved BT classification outcome, a set of simulations is carried out on the BraTS 2015 database.
Related Works
In [11], a deep CNN (DCNN) based on the EfficientNet-B0 architecture is fine-tuned with the presented layers to effectively detect and classify BT images. Image enhancement with different filters is applied to improve image quality, and data augmentation is exploited to increase the number of data samples for further training. Toğaçar et al. [12] introduced a new CNN model termed BrainMRNet. This model is based on the hypercolumn technique and attention modules, and it incorporates a ResNet. Initially, the images are pre-processed in BrainMRNet. In the second phase, image augmentation is applied to every image and the results are passed to the attention module. The attention module selects the significant areas of the image, and the images are then passed to the convolution layers. In [13], a DL-based technique that uses various MRI modalities is proposed for BT classification. The presented hybrid CNN architecture uses a patch-based technique and considers both contextual and local information while predicting output labels. The method handles overfitting by using batch normalization and dropout regularization, while data imbalance is addressed by a two-stage training model.
In [14], the authors proposed a three-phase pre-processing technique for enhancing the quality of MRI images, alongside a new DCNN framework for robust diagnosis of glioma, meningioma, and pituitary tumours. The framework makes use of Batch Normalization (BN) for faster training with a higher learning rate and to ease the initialization of layer weights. The presented model is computationally lightweight, with a small number of convolutional layers, max-pooling layers, and training iterations. In [15], the authors combined an ANN with the Fuzzy K-means algorithm for classifying the tumour location. The method encompasses four stages: (1) feature selection and extraction, (2) noise removal, (3) segmentation, and (4) classification. At first, the acquired images are denoised using a Wiener filter, and then significant grey level co-occurrence matrix (GLCM) features are extracted from the image. Next, DL-based classification is applied to separate abnormal images from normal ones. Finally, the Fuzzy K-Means approach is employed to classify the tumour area.
Gull et al. [16] presented a novel architecture for the diagnosis of BT using MRI scans. The architecture relies on transfer learning and Fully CNN (FCNN) approaches. It contains five stages: transfer learning, skull stripping, preprocessing, CNN-based tumour classification, and post-processing-based BT binary classification. The framework is applied to classify BT images, and in post-processing a global threshold approach is used to eliminate tiny non-tumour areas, which improves segmentation performance. Gupta and Gupta [17] suggested a method for the fully automatic classification of BT. Their work presents a unique ensemble of CNNs (ConvNets) for glioma segmentation in MRI. The ensemble is formed from two fully connected ConvNets (a 2D and a 3D ConvNet).
The existing models do not focus on the hyperparameter selection process, which strongly influences the performance of the classification model. In particular, the selection of hyperparameters such as epoch count, batch size, and learning rate is essential to attain effective outcomes. Since trial-and-error hyperparameter tuning is a tedious and error-prone process, metaheuristic algorithms can be applied. Therefore, in this work, we employ the SSOA for the parameter selection of the DAE model.
The Proposed Model
In this manuscript, we introduce a novel ELCAD-BTC system for accurate BT classification using MRIs. The proposed ELCAD-BTC technique exploits the ensemble learning concept to detect and classify various stages of BTs. The presented system encompasses GF-based noise removal, ensemble feature extraction, DAE-based classification, and SSOA-based hyperparameter tuning. Fig. 1 represents the working process of the ELCAD-BTC algorithm.
A. GF-Based Noise Removal
At the preliminary level, the GF approach is used to remove noise from the MRI. The Fourier transform (FT) is an effective signal-processing mechanism that transforms images from the spatial to the frequency domain [18] and extracts features that are not easy to extract in the spatial domain. After the FT, however, the frequency features of an image at different positions are combined, whereas the GF is capable of extracting spatially local frequency features, which makes it a robust texture detection technique. The GF is calculated by multiplying a Gaussian envelope with a complex sinusoid, whose real and imaginary parts are a cosine and a sine function, as follows:
\begin{align*} g\left(x,y,\lambda,\theta,\phi,\sigma,\gamma\right)&=\exp\left(-\frac{x'^{2}+\gamma^{2}y'^{2}}{2\sigma^{2}}\right) \\ &\quad \times \exp\left(i\left(2\pi\frac{x'}{\lambda}+\phi\right)\right) \tag{1}\\ g_{real}\left(x,y,\lambda,\theta,\phi,\sigma,\gamma\right)&=\exp\left(-\frac{x'^{2}+\gamma^{2}y'^{2}}{2\sigma^{2}}\right) \\ &\quad \times \cos\left(2\pi\frac{x'}{\lambda}+\phi\right) \tag{2}\\ g_{imag}\left(x,y,\lambda,\theta,\phi,\sigma,\gamma\right)&=\exp\left(-\frac{x'^{2}+\gamma^{2}y'^{2}}{2\sigma^{2}}\right) \\ &\quad \times \sin\left(2\pi\frac{x'}{\lambda}+\phi\right) \tag{3}\end{align*}
Here, in the standard Gabor formulation, the rotated coordinates are x' = x cos θ + y sin θ and y' = −x sin θ + y cos θ.
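As a hedged illustration of Eqs. (2) and (3), the following NumPy sketch builds the real and imaginary Gabor kernels on a square grid. The function name `gabor_kernel`, the odd kernel size, and the parameter values in the usage lines are assumptions made for illustration; they are not the settings used in the experiments.

```python
import numpy as np

def gabor_kernel(size, lam, theta, phi, sigma, gamma):
    """Return (real, imag) Gabor kernels of shape (size, size); size is assumed odd."""
    half = size // 2
    # Grid of integer pixel offsets centred on zero.
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotated coordinates x' and y' (standard Gabor definition).
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope shared by Eqs. (1)-(3).
    envelope = np.exp(-(x_r ** 2 + gamma ** 2 * y_r ** 2) / (2.0 * sigma ** 2))
    phase = 2.0 * np.pi * x_r / lam + phi
    return envelope * np.cos(phase), envelope * np.sin(phase)

# Illustrative (hypothetical) parameter values.
real, imag = gabor_kernel(size=9, lam=4.0, theta=0.0, phi=0.0, sigma=2.0, gamma=0.5)
```

Filtering an MRI slice would then amount to convolving the image with `real` (or with a bank of kernels at several orientations θ); the choice of convolution routine is left open here.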
B. Ensemble Learning-Based Feature Extraction
In this work, an ensemble of three DL models, namely EfficientNet, DenseNet, and MobileNet, is used for feature extraction. For a given number of classes and D base models, the ensemble assigns instance k to the class with the largest weighted sum over the base models: \begin{equation*} c_{k}=\mathop{\arg\max}\limits_{j}\sum\nolimits_{i=1}^{D}\left(\Delta_{ji}\times w_{i}\right) \tag{4}\end{equation*}
Now, the accuracy is computed as \begin{equation*} Acc=\frac{\sum\nolimits_{k}\left\{1 \mid c_{k}~\mathrm{is~the~true~class~of~instance}~k\right\}}{\mathrm{Size~of~test~instances}}\times 100\%. \tag{5}\end{equation*}
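Eqs. (4) and (5) can be sketched in a few lines of plain Python. This is a minimal sketch under one explicit assumption: Δ_ji is taken to be a binary indicator that equals 1 when base model i votes for class j and 0 otherwise. The function names and the example weights are hypothetical.

```python
def weighted_vote(votes, weights, num_classes):
    """Eq. (4): votes[i] is the class chosen by base model i, weights[i] is w_i."""
    scores = [0.0] * num_classes
    for predicted, w in zip(votes, weights):
        # Assumed indicator: Delta_ji = 1 only for j == predicted.
        scores[predicted] += w
    # arg max over classes j of the weighted sum.
    return max(range(num_classes), key=scores.__getitem__)

def accuracy(predicted, true):
    """Eq. (5): percentage of test instances whose predicted class is the true class."""
    hits = sum(1 for p, t in zip(predicted, true) if p == t)
    return 100.0 * hits / len(true)

# Three base models, two classes; the first model carries the largest weight.
label = weighted_vote(votes=[0, 1, 1], weights=[0.5, 0.2, 0.2], num_classes=2)
acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```

In this example the single vote with weight 0.5 outweighs the two votes with weight 0.2 each, so class 0 is selected.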
1) EfficientNet Model
EfficientNet is a family of DL models that are scaled to balance the width, depth, and input resolution of the network in order to achieve a better trade-off between efficiency and training time [19]. Before EfficientNet came along, the most common way to scale up a CNN was along one of three dimensions:
Depth (count of hidden layers): while a deeper network provides better image classification accuracy, it is also harder to train because of the well-known vanishing gradient problem. Accuracy gains diminish rapidly beyond a particular depth.
Width (count of channels or filters): although wider networks are simpler to train and capable of capturing fine-grained features, they have difficulty capturing higher-level image content.
Image resolution (image size): a higher resolution of the input images in principle offers the CNN more information.
EfficientNet instead performs compound scaling: it scales all three dimensions (image resolution, depth, and width) simultaneously while preserving a balance between them.
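The idea of compound scaling can be illustrated with a short sketch. It assumes the usual formulation in which depth, width, and resolution are multiplied by α^φ, β^φ, and γ^φ for a single compound coefficient φ; the base coefficients below are the values commonly quoted for the EfficientNet baseline and are used here only as illustrative assumptions.

```python
# Assumed base coefficients (commonly quoted for the EfficientNet baseline).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution

def compound_scale(phi):
    """Return the (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

def flops_factor(phi):
    """Approximate growth in FLOPs: depth * width^2 * resolution^2."""
    d, w, r = compound_scale(phi)
    return d * w ** 2 * r ** 2

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")
```

With these coefficients, each unit increase of φ roughly doubles the computational cost while keeping the three dimensions in a fixed proportion.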
2) DenseNet Model
The DenseNet201 model uses a densely connected network to improve efficiency and to construct easy-to-train, highly parameterized, and robust models [20]. DenseNet201 has performed well on datasets such as CIFAR-100 and ImageNet. It provides direct connections from each layer to the subsequent layers, which enhances connectivity. The feature maps of the preceding layers (layer 0 onward) are concatenated into a single tensor to simplify implementation. A transition layer is a structural module of the network; it consists of a 1×1 convolutional layer followed by a 2×2 pooling layer with BN. The hyperparameter “H” sets the growth rate of the DenseNet201 model and illustrates how the dense structure improves efficiency. Despite its moderate growth rate, DenseNet201 performs well because its structure reuses feature maps. Fig. 2 demonstrates the architecture of the DenseNet model.
Accordingly, each layer has access to every feature map of the preceding layers. The number of input feature maps at layer l, represented as “fm”, can be evaluated as (fm)_l = H0 + H × (l − 1), where the “H” feature maps relate to the global state and H0 is the number of channels of the input layer. Every
3) MobileNet Model
The MobileNet structure is used for feature extraction in this study. A CNN is mainly composed of input, convolution, pooling, fully connected (FC), and output layers [21]. In comparison to a traditional neural network, it features weight sharing, local connections, and downsampling. These properties can effectively reduce the number of network parameters, avoid overfitting, and improve the effectiveness of extracting local features. The convolution layer is the major component of a CNN, and local features are extracted by connecting the input of each neuron to the local receptive field of the previous layer. The convolution function is divided into convolution and activation steps, and it is computed by the following expression:\begin{equation*} T=f_{k}\left ({\sum \nolimits _{x,y,z=1}^{r} C_{x,y,z} w_{x,y,z}^{s}+b^{s} }\right) \tag{6}\end{equation*}
In Eq. (6),
C. BT Classification
To classify the existence of BT, the DAE model is used. The DAE is an extension of the autoencoder (AE) that aims to recover the original data from a noise-corrupted version [22]. It relies on the fact that data preserve their fundamental features even when they are partially destroyed. Hence, the DAE is capable of recovering the original dataset from the noise-added input. The DAE is well established for noise filtering, recovery of voice or image, and typo correction, amongst other applications. It comprises two parts. The first is an autoencoder, which is decomposed into an encoding part and a decoding part. The encoding part is termed manifold learning and successively decreases the size of the input dataset. Consequently, the core of the original dataset, named the hidden values, is attained and retains sufficient information about the original dataset. For the provided dataset, the encoding is given by \begin{equation*} f_{\theta }\left ({x }\right)=h=s\left ({Wx+b }\right) \tag{7}\end{equation*}
In Eq. (7),
The decoding part is the reverse procedure of the encoding and is termed a generative learning system. It uses successively expanding layers to recover the original dataset from the encoded result. The decoding is formulated as \begin{equation*} g_{\theta ^{\prime} }\left ({h }\right)=\hat {x}=s\left ({Wh+b^{\prime} }\right) \tag{8}\end{equation*}
In Eq. (8), the output is the reconstruction of the input. The parameters are obtained by minimizing the reconstruction error over the N samples: \begin{align*} L\left(\theta,\theta^{\prime}\right)&=\frac{1}{N}\sum\nolimits_{k=1}^{N}\left\|x^{k}-\hat{x}^{k}\right\|^{2} \\ &=\frac{1}{N}\sum\nolimits_{k=1}^{N}\left\|x^{k}-g_{\theta^{\prime}}\left(f_{\theta}\left(x^{k}\right)\right)\right\|^{2} \tag{9}\end{align*}
Another part of DAE is the addition of noise to the raw dataset. After this step, the noise dataset is selected as
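The encoding, decoding, and loss of Eqs. (7) to (9) can be traced with a single forward pass in NumPy. This is a minimal sketch, not the trained model: it assumes additive Gaussian noise for the corruption step, a sigmoid for the activation s, a separate decoder weight matrix (`W_dec`, a hypothetical name), and arbitrary layer sizes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dae_forward(x, W, b, W_dec, b_dec, noise_std, rng):
    """One forward pass of a single-hidden-layer DAE on a batch x of shape (N, d)."""
    # Corrupt the input (assumed: additive Gaussian noise).
    x_noisy = x + rng.normal(0.0, noise_std, size=x.shape)
    h = sigmoid(x_noisy @ W.T + b)            # Eq. (7): hidden values
    x_hat = sigmoid(h @ W_dec.T + b_dec)      # Eq. (8): reconstruction
    # Eq. (9): mean squared reconstruction error against the clean input.
    loss = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    return h, x_hat, loss

rng = np.random.default_rng(0)
x = rng.random((8, 16))                       # 8 samples with 16 features each
W, b = rng.normal(0.0, 0.1, (4, 16)), np.zeros(4)
W_dec, b_dec = rng.normal(0.0, 0.1, (16, 4)), np.zeros(16)
h, x_hat, loss = dae_forward(x, W, b, W_dec, b_dec, noise_std=0.1, rng=rng)
```

Note that the loss compares the reconstruction with the clean input rather than the corrupted one, which is what drives the model to remove noise.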
D. Hyperparameter Tuning
Finally, the SSOA is utilized for the optimum hyperparameter adjustment of the DAE algorithm. SSOA is a bio-inspired metaheuristic that simulates the behaviour of spider colonies [23]. The members of the colony are either male or female, and each spider represents a possible solution to the problem. The count of female spiders (FSP), randomly chosen within the range of 65% to 90% of the whole population of N spiders, is computed by Eq. (10), and the count of male spiders (MSP) follows from Eq. (11).\begin{equation*} N_{f}=floor\left[\left(0.9-rand\left(0,1\right)\cdot 0.25\right)\cdot N\right] \tag{10}\end{equation*}
\begin{equation*} N_{m}=N-N_{f} \tag{11}\end{equation*}
The initial positions of the FSP and MSP are computed as in Eqs. (12) and (13), respectively.\begin{align*} f_{i,j}^{0}&=P_{j}^{low}+rand\left ({0,1 }\right)\cdot \left ({P_{j}^{high}-P_{j}^{low} }\right) \tag{12}\\ m_{i,j}^{0}&=P_{j}^{low}+rand\left ({0,1 }\right)\cdot \left ({P_{j}^{high}-P_{j}^{low} }\right) \tag{13}\end{align*}
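The population initialization of Eqs. (10) to (13) can be sketched as follows. The search bounds in the usage lines (learning rate, batch size, epoch count) are hypothetical examples of DAE hyperparameters, not the ranges used in the experiments, and the function name is an assumption.

```python
import math
import random

def init_population(n, low, high, rng):
    """Split n spiders into females and males and place them uniformly within the bounds."""
    n_f = math.floor((0.9 - rng.random() * 0.25) * n)   # Eq. (10)
    n_m = n - n_f                                       # Eq. (11)

    def spawn():
        # Eqs. (12)/(13): P_low + rand(0, 1) * (P_high - P_low) in each dimension j.
        return [lo + rng.random() * (hi - lo) for lo, hi in zip(low, high)]

    females = [spawn() for _ in range(n_f)]
    males = [spawn() for _ in range(n_m)]
    return females, males

rng = random.Random(42)
low = [1e-4, 16, 10]     # hypothetical lower bounds: learning rate, batch size, epochs
high = [1e-1, 128, 100]  # hypothetical upper bounds
females, males = init_population(20, low, high, rng)
```

Each spider is a candidate hyperparameter vector; integer-valued entries such as batch size would need rounding before being passed to the classifier.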
\begin{equation*} w_{i}=\frac{J\left(s_{i}\right)-worst_{s}}{best_{s}-worst_{s}} \tag{14}\end{equation*}
In which \begin{equation*} Vib_{i,j}=w_{j}\cdot e^{-d_{i,j}^{2}} \tag{15}\end{equation*}
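The weight of Eq. (14) and the vibration of Eq. (15) can be computed as in the sketch below. It assumes that the fitness J is maximized and that d is the Euclidean distance between two spiders; the fallback for a population with identical fitness values is an added assumption to avoid division by zero.

```python
import math

def spider_weights(fitness):
    """Eq. (14): normalize fitness values to [0, 1] (higher fitness gives a heavier spider)."""
    best, worst = max(fitness), min(fitness)
    if best == worst:
        # Assumed fallback: all spiders are equally good.
        return [1.0] * len(fitness)
    return [(f - worst) / (best - worst) for f in fitness]

def vibration(w_j, pos_i, pos_j):
    """Eq. (15): vibration perceived by spider i from spider j of weight w_j."""
    d_squared = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    return w_j * math.exp(-d_squared)

weights = spider_weights([0.5, 0.7, 0.9])
vib = vibration(weights[2], [0.0, 0.0], [1.0, 0.0])
```

The exponential term means that heavy but distant spiders are perceived only weakly, so the search is guided mostly by good solutions that are nearby.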
\begin{align*} f_{i}^{t+1}=\begin{cases} f_{i}^{t}+\alpha\cdot Vib_{ci}\cdot(S_{c}-f_{i}^{t})+\beta\cdot Vib_{bi}\cdot(S_{b}-f_{i}^{t})+\gamma\cdot(rand-0.5) & \text{with probability } PF\\ f_{i}^{t}-\alpha\cdot Vib_{ci}\cdot(S_{c}-f_{i}^{t})+\beta\cdot Vib_{bi}\cdot(S_{b}-f_{i}^{t})+\gamma\cdot(rand-0.5) & \text{with probability } 1-PF \end{cases} \tag{16}\end{align*}
\begin{align*} m_{i}^{t+1}=\begin{cases} m_{i}^{t}+\alpha\cdot Vib_{fi}\cdot(S_{f}-m_{i}^{t})+\delta\cdot(rand-0.5) & \text{if } w_{N_{f}+i}>w_{N_{f}+m}\\ m_{i}^{t}-\alpha\cdot\left(\dfrac{\sum\nolimits_{h=1}^{N_{m}} m_{h}^{t}\cdot w_{N_{f}+h}}{\sum\nolimits_{h=1}^{N_{m}} w_{N_{f}+h}}-m_{i}^{t}\right) & \text{otherwise} \end{cases} \tag{17}\end{align*}
In which, \begin{equation*} r=\frac{\sum\nolimits_{j=1}^{n}\left(P_{j}^{high}-P_{j}^{low}\right)}{2\cdot n} \tag{18}\end{equation*}
The choice of fitness function is a key aspect of the SSOA algorithm. An encoded solution is used to evaluate the fitness of each candidate. Here, the precision value P is the central criterion employed to design the fitness function.\begin{align*} Fitness&=\max\left(P\right) \tag{19}\\ P&=\frac{TP}{TP+FP} \tag{20}\end{align*}
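As a small worked example of Eqs. (19) and (20), the sketch below computes P from true and predicted labels and keeps the candidate with the highest value. The label convention (1 for the positive class), the zero fallback when no positive prediction is made, and the candidate names are assumptions made for illustration.

```python
def precision_fitness(y_true, y_pred, positive=1):
    """Eq. (20): P = TP / (TP + FP) for the chosen positive label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    # Assumed fallback: no positive predictions gives zero fitness.
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

y_true = [1, 0, 1, 1]
# Hypothetical predictions produced under two candidate hyperparameter settings.
candidates = {"spider_a": [1, 1, 1, 0], "spider_b": [1, 0, 1, 0]}
scores = {name: precision_fitness(y_true, pred) for name, pred in candidates.items()}
best = max(scores, key=scores.get)   # Eq. (19): keep the candidate that maximizes P
```

Here `spider_a` makes one false positive out of three positive predictions, while `spider_b` makes none, so `spider_b` is retained.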
Results and Discussion
The proposed model is simulated using Python 3.6.5 with packages such as tensorflow (GPU, CUDA enabled), keras, numpy, pickle, matplotlib, sklearn, pillow, and opencv-python. The experiments are run on a PC with an i5-8600k CPU, a GeForce 1050Ti 4GB GPU, 16GB RAM, a 250GB SSD, and a 1TB HDD.
In this section, the BT classification outcome of the ELCAD-BTC system is investigated using the BraTS 2015 database. To increase the size of the dataset, data augmentation is applied in different ways: rotation, cropping, flipping, translation, and colour space transformation. The dataset holds 1320 images in two classes, as represented in Table 1.
Fig. 3 demonstrates the confusion matrices of the ELCAD-BTC system on the BT classification process. On 80% of the training set (TRS), the ELCAD-BTC technique has identified 55.87% of samples under the benign class and 42.42% of samples under the malignant class. At the same time, on 20% of the testing set (TSS), the ELCAD-BTC method has identified 50.38% of samples under the benign class and 48.86% of samples under the malignant class. Next, on 70% of TRS, the ELCAD-BTC algorithm has identified 54.11% of samples under the benign class and 43.43% of samples under the malignant class.
Fig. 3. Confusion matrices of the ELCAD-BTC method: (a-b) TRS/TSS of 80:20 and (c-d) TRS/TSS of 70:30.
In Table 2, the overall BT classification results of the ELCAD-BTC approach with an 80:20 split of TRS and TSS are given. On 80% of TRS, the ELCAD-BTC technique has proficiently recognized benign and malignant samples. For instance, on benign class, the ELCAD-BTC technique has obtained
In Table 3, an overall BT classifier outcome of the ELCAD-BTC method with 70:30 of TRS and TSS is given. On 70% of TRS, the ELCAD-BTC method has proficiently recognized benign and malignant samples. For example, on benign class, the ELCAD-BTC method has attained
The training accuracy (TAY) and validation accuracy (VAY) of the ELCAD-BTC technique on BT classification are inspected in Fig. 4. The results imply that the ELCAD-BTC method achieves high performance, with high values of both TAY and VAY. Notably, the ELCAD-BTC system obtains the maximal TAY outcomes.
The training loss (TLSS) and validation loss (VLSS) of the ELCAD-BTC technique on BT classification are examined in Fig. 5. The results indicate that the ELCAD-BTC approach achieves superior efficiency, with minimal values of TLSS and VLSS. Notably, the ELCAD-BTC algorithm results in the least VLSS outcomes.
A clear precision-recall (PR) examination of the ELCAD-BTC system on the test database is portrayed in Fig. 6. The figure indicates that the ELCAD-BTC algorithm yields high PR values under all class labels.
A comprehensive ROC outcome of the ELCAD-BTC approach on the test database is represented in Fig. 7. The outcomes indicate that the ELCAD-BTC system is capable of categorizing the two classes.
A brief comparative outcome of the ELCAD-BTC algorithm with other DL systems is made in Table 4 and Fig. 8 [24]. The experimental outcome stated that the novel 3D-CNN and VGG-19 models attain reduced
On the contrary, the Inception-v3 and fine-tuned VGG-19 models have resulted in closer
Conclusion
In this manuscript, we have introduced a novel ELCAD-BTC system for accurate BT classification using MRIs. The proposed ELCAD-BTC technique exploits the ensemble learning concept to detect and classify various stages of BTs. The presented system applies a GF approach to remove noise and increase the quality of MRI images. Moreover, an ensemble of three DL models, namely EfficientNet, DenseNet, and MobileNet, is exploited for feature extraction. Furthermore, the DAE model tuned by the SSOA is exploited to detect the presence of BTs. To demonstrate the improved BT classification results, a set of simulations was carried out on the BraTS 2015 database. The enhanced results of the ELCAD-BTC approach show its promising performance on BT classification. In the future, the presented ELCAD-BTC technique can be extended to three-dimensional MRI for accurate BT classification.