Introduction
In recent years, the convergence of AI and IoT for efficient smart healthcare systems has been gaining attention in the community [1]–[4]. This convergence allows different health diseases to be detected more efficiently than ever before. Diabetes is a chronic disease that occurs when the body is not able to produce the insulin hormone or is not able to use it effectively. The World Health Organization (WHO) reported that diabetes caused more than 1.6 million deaths in 2016 [5]. Diabetic patients tend to have a high glucose level in the blood, which may cause damage and failure of the body’s organs. According to the International Diabetes Federation (IDF), 1 out of 10 people is diabetic, which is a serious concern. Possible complications of diabetes include heart disease and stroke, kidney failure, serious vision problems, etc.
One of the serious complications of diabetes is Diabetic Retinopathy (DR). DR may cause complete blindness, and many people around the world are affected by it [6], [7]. Around 25% of all diabetic patients are affected by DR, which makes it a widespread complication [8]. Long-term diabetes may result in DR, which is a progressive disease and may cause partial or permanent vision impairment. The majority of people affected by DR are of working age, the main workforce of any growing economy [9]. The IDF also reported that India alone accounts for a large share of diabetic people around the world, and this share is growing rapidly every year [9].
It is difficult to detect the symptoms of DR at an early stage, which is a challenging and important issue in medical science. The initial symptoms of DR are very mild, and patients are usually unaware of them until the disease results in irreversible damage to the retina or it is diagnosed through a medical test. Therefore, it is highly desirable that DR be detected as early as possible. DR can be detected when a highly trained specialist is available to evaluate digital color fundus photographs of the retina. The rear part of the human eye is known as the fundus. These fundus images are evaluated by locating the lesions associated with the vascular abnormalities that arise due to diabetes. An effective solution therefore exists, but it is a time-consuming process and demands the availability of highly skilled medical practitioners.
Deep learning is a very popular approach in healthcare and patient-monitoring systems that require large-scale medical image processing [10]–[16]. Convolutional neural networks (ConvNets) are a deep learning approach that is very effective for image data analysis in several domains [13]. The accuracy of ConvNets can be improved by scaling up different parameters, subject to the availability of resources. These scaled versions of ConvNets are very efficient for the medical domain, where accuracy is highly desirable. Therefore, in this study, the EfficientNet architecture [17] is used to analyze retina images in order to detect DR.
A. Contribution
The major contributions of this work are as follows:
We utilized the state-of-the-art EfficientNet model to identify blindness symptoms at the earliest instance and found a significant improvement in identifying blindness in retinal images, with over 92% validation accuracy, outperforming CNN and ResNet50 models.
We proposed a novel augmentation step, called polar unrolling, for this imbalanced dataset of retinal images, which significantly improves the prediction accuracy during test-time augmentation.
B. Organization
The rest of the article is organized as follows: Section 2 provides a state-of-the-art literature review. In Section 3, the methodology is discussed in detail. Section 4 provides the details of the experiments, followed by the evaluation and discussion of the results in Section 5. The article concludes with a discussion of future scope in Section 6.
Related Work
Researchers [18]–[23] have been working in the area of connected smart health [22], the healthcare Internet of Things (HealthIoT) [24], and patient monitoring [25], where AI and IoT technologies have considerable potential. AI tools and technologies, particularly for Diabetic Retinopathy, will enhance the state of global health. Kumar and Vashist [26] discussed the various challenges and achievements in eye care with a focus on the Indian community. They noted that several eye diseases that may result in vision loss should be given effective care and potential solutions by 2020 in India. Bhalla et al. [27] proposed a model demonstrating an innovative modality to address the issue of DR in India by organizing certification programs for technicians and doctors to improve the knowledge and skills required for early DR detection.
Diabetic Retinopathy is a very common disease among diabetic patients that can lead to partial or complete blindness. Therefore, early diagnosis and detection are very important. The existing state-of-the-art diagnosis methods for detecting DR require continuous observation of patients by a skilled physician. These diagnosis methods are time-consuming and subject to the availability of expensive medical equipment.
In [18], the researchers mention that a group of only 10–15 physicians is responsible for manually diagnosing DR in over 2 million retinal images per year at the largest eye care facility in the world, Aravind Eye Hospital, India. They highlight the extreme effort, in terms of infrastructure and time, required for this task when a large number of cases turn out to be normal.
Automating this process can assist doctors in diagnosing DR patients effectively and save the time spent on diagnosing normal cases. With this aspect in mind, several researchers have worked in this area and used machine learning to devise models that can predict the presence of DR in a patient. In the following paragraphs, we review some of the key studies in this field.
In [18], the authors presented an account of their initial efforts in developing an automated system using computer vision techniques to identify patients in the early stage of DR from retinal color fundus images. The project was a continuation of their previous work on WiLDNet at Aravind Eye Hospital. Their main focus was to create a model based on different retinopathy types using SIFT descriptors; the important features include hemorrhages and exudates. The intention was to develop a robust and long-term model. A Support Vector Machine classifier is trained to label each image patch, after which the patches are aggregated and a decision over the entire image is made based on the patch-level predictions. The authors reported an equal error rate of 87% using 1000 images.
In [20], the researchers developed a solution for early detection of DR using a deep convolutional neural network to detect micro-aneurysms (MAs) in diabetic patients. They also performed multi-label classification [28] by assigning retinal fundus images to five categories. Earlier, they had proposed multi-layer convolutional neural networks (CNNs) with two fully connected layers and a single output layer to efficiently detect DR. To cater to the issue of oversampled classes, they use a small-capacity network with L2 regularization and dropout.
A model built using convolutional neural networks such as AlexNet, VggNet, GoogleNet, and ResNet is proposed in [23] to identify Diabetic Retinopathy in diabetic patients. The fundus images are classified into five classes, based on which the model identifies the stage of DR for the patient. Moreover, transfer learning and hyper-parameter tuning gave the CNN models better accuracy [29], which was not possible before with non-transfer learning on noisy data. Normalization and data augmentation techniques were deployed for preprocessing the images, and non-local means denoising (NLMD) was used to remove noise. The performance of the system was tested and evaluated on datasets available on Kaggle, and a classification accuracy of 95.68% was reported for the application of CNNs with transfer learning.
The EyeWeS model was proposed to achieve high performance and greater efficiency by converting a pre-trained convolutional neural network architecture for DR detection into a weakly-supervised model, eliminating the lesion-wise annotation required for pixel-level training [30], [31]. Moreover, the researchers not only focused on the identification of diabetic retinopathy but also pointed to the region of the eye that was affected. The model uses bag-level labels to train on instances (i.e., image patches) through a pooling function.
Past work [32] employed a deep learning methodology to detect diabetic retinopathy (DR). The authors focused on a regression activation map (RAM) model to capture the discriminative area of the input retina image. The network architecture they proposed uses a global average pooling (GAP) layer to identify each neuron’s contribution to the final prediction.
Methodology
ConvNets can be scaled based on requirements to achieve better accuracy, subject to the availability of resources. There are several ways to achieve this scalability. For example, we can increase the number of layers in ResNet [33] to scale it up from ResNet-18 to ResNet-200. The popular approaches to scaling up a ConvNet model are to increase the width or depth of the model, or to use higher-resolution images for model training and testing. However, these approaches do not use a well-defined criterion to select the width, depth, or resolution of the input image. In this study, EfficientNet [17], a balanced and more accurate deep ConvNet that scales all dimensions using a well-defined criterion, is used to detect DR in the retina image dataset.
A. Model
This section details the recent convolutional neural network-based EfficientNet model [17] adapted for Diabetic Retinopathy, a disease caused by damage to the retina that, if not treated, can progress to blindness. The existing method of diagnosis grades severity on classes 0 to 4, where 0 represents no presence of the disease and 4 represents severe progression. Our proposed model is based on EfficientNet-B5 [17] with ImageNet pre-trained weights and different input image sizes. EfficientNet scales network width, depth, and input resolution jointly through a compound coefficient $\phi$:\begin{align*}&width: w = \beta ^{\phi },\quad depth: d = \alpha ^{\phi },\quad resolution: r = \gamma ^{\phi }, \\&\text {such that} ~\alpha \cdot \beta ^{2}\cdot \gamma ^{2}\approx 2, \\&\alpha \geq 1,\quad \beta \geq 1,\quad \gamma \geq 1\tag{1}\end{align*}
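As a minimal illustration of the compound scaling rule in Eq. (1), the sketch below computes the depth, width, and resolution multipliers for a given compound coefficient $\phi$. The values of $\alpha$, $\beta$, and $\gamma$ are the ones reported in the EfficientNet paper [17]; they are quoted here for illustration only and are not parameters we tuned.

```python
# Minimal sketch of the compound scaling rule in Eq. (1).
# ALPHA, BETA, GAMMA are the coefficients reported in [17] (illustrative here).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases


def compound_scaling(phi: float):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    depth = ALPHA ** phi        # d = alpha^phi
    width = BETA ** phi         # w = beta^phi
    resolution = GAMMA ** phi   # r = gamma^phi
    # The constraint alpha * beta^2 * gamma^2 ~= 2 keeps FLOPS growth near 2^phi.
    assert abs(ALPHA * BETA ** 2 * GAMMA ** 2 - 2.0) < 0.1
    return depth, width, resolution


if __name__ == "__main__":
    for phi in range(6):
        d, w, r = compound_scaling(phi)
        print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")
```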
Our aim is to optimize the quadratic weighted kappa score [34], [35], and we treat this as a regression problem. Framing it as regression gives flexibility in the optimization and can yield higher scores than solely optimizing for accuracy. We optimize a pre-trained EfficientNet-B5 with a few added layers. The metric that we optimize is the mean squared error, i.e., the mean of the squared differences between the predictions and the labels, as given in the formula below. By optimizing this metric we are also optimizing the quadratic weighted kappa, provided we round the predictions afterwards.\begin{equation*} \frac {1}{n}\sum _{i=1}^{n} (Y_{i} - \hat {Y_{i}})^{2}\tag{2}\end{equation*}
Since we do not have much training data (3,662 images), we augment the data to increase the robustness of our proposed model. We rotate the images by arbitrary angles and flip them both horizontally and vertically. Finally, we divide the pixel values by 128 for normalization.
We also examined an earlier work [36] that developed a sampling technique to handle imbalanced image data (in our study, retinal images). However, it did not improve our predictions once the augmentation step removing the black background was applied. Without augmentation it improved results slightly, but the model then appeared to overfit, with the loss plateauing at the same rate without learning the training parameters, even when using a cyclic learning rate scheduler.
Experimental Evaluation
In this section, we detail the employed dataset, followed by the pre-processing and data augmentation techniques as well as the training steps, including inference with the trained models.
A. Dataset
In this work, we employ a real-world eye disease image dataset containing a large collection of high-resolution retina images captured using fundus photography under varied imaging conditions. A trained clinician has assessed the presence of diabetic retinopathy in each image on the standard ICDR (International Clinical Diabetic Retinopathy) severity scale of 0 to 4, where 0 denotes no DR, 1 mild, 2 moderate, 3 severe, and 4 proliferative DR.
The original images are quite large, on the order of 2,896 pixels along one dimension.
B. Preprocessing
This section details the data processing steps performed before modeling this regression task, so that training on such an unbalanced dataset can be made faster. First, we transformed each retinal image into RGB channels, followed by a circular crop of each image around its center to separate the black regions from the actual fundus, keeping the original aspect ratio so that the images appear natural. We also employ Ben Graham’s method to improve the illumination of the images so that we can extract richer information from the eye images. Then, we resize the original images to the target input size.
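A minimal OpenCV sketch of this preprocessing step is shown below, assuming RGB input arrays. The crop threshold and the Gaussian sigma are illustrative choices, not the exact settings of our pipeline.

```python
import cv2
import numpy as np


def crop_black_borders(img, tol=7):
    """Crop the dark background around the fundus (threshold tol is illustrative)."""
    gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    mask = gray > tol
    if not mask.any():                       # fully dark image: return unchanged
        return img
    rows, cols = np.where(mask)
    return img[rows.min():rows.max() + 1, cols.min():cols.max() + 1]


def circle_crop(img):
    """Keep only the circular fundus region, centred on the image centre."""
    img = crop_black_borders(img)
    h, w = img.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    cv2.circle(mask, (w // 2, h // 2), min(h, w) // 2, 255, thickness=-1)
    return cv2.bitwise_and(img, img, mask=mask)


def ben_graham(img, sigma=10):
    """Ben Graham-style illumination correction: subtract a blurred copy."""
    blur = cv2.GaussianBlur(img, (0, 0), sigma)
    return cv2.addWeighted(img, 4, blur, -4, 128)
```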
After preprocessing the dataset, we found that images zoomed towards the center exhibited a close relation to the light spots on the left. To address this zoom issue, as reported in Figure 3, we performed augmentation on the processed images to make sure the model generalises better and does not overfit.
C. Data Augmentation
For better generalisation on the processed dataset, we address this problem with several augmentation steps. We employ the Albumentations library [37] to perform augmentation: we flipped images horizontally and vertically, rotated them by up to 360°, and applied random zoom, as sketched below.
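The following is a minimal sketch of such a pipeline with the Albumentations library [37]. The probabilities and the input size are placeholders (456 is the resolution commonly associated with EfficientNet-B5), not the exact values used in our experiments.

```python
import albumentations as A

IMG_SIZE = 456  # placeholder input size; EfficientNet-B5 is commonly run at 456x456

train_transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.Rotate(limit=360, p=0.5),                          # full-circle rotation
    A.ShiftScaleRotate(shift_limit=0.0, scale_limit=0.1,
                       rotate_limit=0, p=0.5),           # mild random zoom
    A.Resize(IMG_SIZE, IMG_SIZE),
])

# Usage: augmented = train_transforms(image=img)["image"]
```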
We adopted a new augmentation step, polar unrolling, which allows us to better leverage the pixel space, remove “rotation” from the augmentation list, and obtain uniformly scaled eye images (for instances with no or partial cropping of the fundus image, with preservation of the radius).
Originally, the images contain noticeable black regions. To whittle away these regions, we first apply an autocrop. After the autocrop, we obtain the circle’s radius (given by the broadest side of the image). The circle is then extracted using polar unrolling. By unrolling, we change the coordinate space so that rotation augmentation is no longer required: a rotation becomes a plain shift along one axis, which does not matter for convolutional neural networks. This is more than just the absence of this type of augmentation; it effectively lets the model consider all possible rotations (except at some borders, which can be handled by a single 50% shift along that axis).
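The sketch below shows one way to implement the polar unrolling step using OpenCV’s warpPolar, assuming the fundus circle has already been auto-cropped so that its radius equals half of the broadest image side. The output dimensions and the random-shift helper are illustrative assumptions rather than our exact implementation.

```python
import cv2
import numpy as np


def polar_unroll(img, out_w=512, out_h=256):
    """Unroll the (already auto-cropped) circular fundus image into polar space.

    In the unrolled image, rows correspond to the angle and columns to the radius,
    so a rotation of the original image becomes a plain circular shift of rows.
    out_w and out_h are illustrative placeholders.
    """
    h, w = img.shape[:2]
    center = (w / 2.0, h / 2.0)
    radius = max(h, w) / 2.0                  # radius from the broadest side
    return cv2.warpPolar(
        img, (out_w, out_h), center, radius,
        flags=cv2.INTER_LINEAR + cv2.WARP_POLAR_LINEAR,
    )


def random_rotation_as_shift(unrolled, rng=np.random):
    """Equivalent of a random rotation: circularly shift rows of the polar image."""
    shift = rng.randint(0, unrolled.shape[0])
    return np.roll(unrolled, shift, axis=0)
```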
Our augmentation technique improves on the existing contour-transform method for retinal images [6] in terms of handling the imbalanced data and the training technique used with the state-of-the-art EfficientNet models [17].
Apart from the above augmentation approaches, we also look into a specific oversampling technique [38] to check for any methodological flaw in polar unrolling as a key part of the augmentation step. Oversampling the data can be an alternative to augmentation given the imbalanced data. We used a fixed image size for this comparison.
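For this comparison, oversampling can be applied to the training index rather than the images themselves. The sketch below is a minimal pandas version under the assumption that the training labels live in a dataframe with a `diagnosis` column; the column name and the seed are hypothetical.

```python
import pandas as pd


def oversample_to_balance(df: pd.DataFrame, label_col: str = "diagnosis",
                          seed: int = 42) -> pd.DataFrame:
    """Randomly oversample minority classes until every class matches the majority count."""
    max_count = df[label_col].value_counts().max()
    balanced = [
        group.sample(max_count, replace=True, random_state=seed)
        for _, group in df.groupby(label_col)
    ]
    return pd.concat(balanced).sample(frac=1.0, random_state=seed)  # shuffle rows
```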
D. Training and Inference
We trained our proposed EfficientNet-B5 based model with a batch size of 32. In the warm-up stage, we trained for 5 epochs with all layers frozen except the last two, a learning rate of 4e-3, the Adam optimizer, and a cosine learning rate (LR) scheduler. In the fine-tuning stage, we unfroze all layers and trained for 30 epochs. We also used early stopping with a patience of 5 epochs, monitoring the validation loss. The LR scheduler at different steps of training is schematically shown in Figure 5, followed by the training and validation loss in Figure 6.
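A condensed Keras sketch of this two-stage schedule is given below. The regression head (pooling, dropout, and dense layers), the use of tf.keras.applications for the backbone, the cosine-decay step count, and the placeholder datasets `train_ds`/`val_ds` are assumptions about reasonable choices rather than our exact configuration.

```python
import tensorflow as tf

IMG_SIZE = 456  # placeholder input resolution


def build_model():
    base = tf.keras.applications.EfficientNetB5(
        include_top=False, weights="imagenet",
        input_shape=(IMG_SIZE, IMG_SIZE, 3))
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dropout(0.3)(x)                      # illustrative added layers
    out = tf.keras.layers.Dense(1, activation="linear")(x)   # single regression output
    return tf.keras.Model(base.input, out)


model = build_model()

# Warm-up stage: freeze everything except the last two layers.
for layer in model.layers[:-2]:
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(4e-3), loss="mse")
# model.fit(train_ds, validation_data=val_ds, epochs=5)

# Fine-tuning stage: unfreeze all layers, cosine-decayed LR, early stopping.
for layer in model.layers:
    layer.trainable = True
cosine_lr = tf.keras.optimizers.schedules.CosineDecay(4e-3, decay_steps=10_000)
model.compile(optimizer=tf.keras.optimizers.Adam(cosine_lr), loss="mse")
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                              restore_best_weights=True)
# model.fit(train_ds, validation_data=val_ds, epochs=30, callbacks=[early_stop])
```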
We found that the training loss smooths out around ~0.4 and then improves steeply to less than 0.2 by the 30th epoch. This shows that TTA preserves the retinal features during validation while training, and the results could be improved further by training longer with a higher batch size.
We cross-validated our EfficientNet-B5 model over 5 folds and then ran inference with the trained models using test-time augmentation (TTA) [40] 10 times, averaging the predictions of the 5 models, as sketched below.
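A minimal sketch of the fold-and-TTA averaging follows. Here `models`, `images`, and `tta_transform` are placeholders for the corresponding pieces of our pipeline (a list of trained fold models, a batch of raw images, and an Albumentations pipeline).

```python
import numpy as np


def predict_with_tta(models, images, tta_transform, n_tta=10):
    """Average predictions over the cross-validation models and TTA rounds."""
    all_preds = []
    for model in models:                        # 5 cross-validation folds
        for _ in range(n_tta):                  # 10 augmented passes per model
            batch = np.stack(
                [tta_transform(image=img)["image"] for img in images])
            all_preds.append(model.predict(batch, verbose=0).ravel())
    return np.mean(all_preds, axis=0)           # averaged regression scores
```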
We used one NVIDIA Tesla P100 GPU and one Tesla T4 GPU for training the models.
E. Evaluation Measures
To evaluate the detection model, we use the quadratic weighted kappa (Cohen’s kappa), which measures the consensus between expert ratings and submitted ratings. The quadratic weights make the rounding of predictions an additional optimization factor. This metric ranges from 0 (random agreement among raters) to 1 (complete agreement among raters). A perfect score of 1.0 is obtained when the actuals and predictions are identical; the lowest possible score is −1, which arises when the predictions are furthest away from the actuals, e.g., when actuals of 0 are predicted as 4 and vice versa. The weighted kappa is given below:\begin{equation*} \kappa = 1 - \frac {\sum _{i=1}^{k} \sum _{j=1}^{k} w_{ij} x_{ij}}{\sum _{i=1}^{k} \sum _{j=1}^{k} w_{ij} m_{ij}}\tag{3}\end{equation*}
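Eq. (3) can be reproduced directly with scikit-learn’s cohen_kappa_score using quadratic weights, as in the sketch below; the example labels are purely illustrative and not taken from our dataset.

```python
from sklearn.metrics import cohen_kappa_score

# Illustrative ratings on the 0-4 ICDR scale.
y_true = [0, 2, 4, 1, 3, 0, 2, 4]
y_pred = [0, 2, 3, 1, 3, 0, 2, 4]
print(cohen_kappa_score(y_true, y_pred, weights="quadratic"))   # close to 1

# Predictions furthest from the actuals drive the score towards -1.
worst_true = [0] * 5 + [4] * 5
worst_pred = [4] * 5 + [0] * 5
print(cohen_kappa_score(worst_true, worst_pred, weights="quadratic"))  # -1.0
```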
Experimental Results
This section reports the evaluation results, i.e., the predicted DR labels for blindness detection. For evaluation, we predict values from the generator and round them to the nearest integer to obtain valid predictions. We then compute the quadratic weighted kappa score on the training set and the validation set, as reported in Table 1.
We performed a grid search to optimize the validation score over a range of thresholds; the quadratic weighted kappa score is 0.92 with the threshold set (0.5, 1.5, 2.5, 3.5). Our fine-tuned EfficientNet-B5 based model scores 92.32% on the validation set at the chosen input image size.
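The threshold search can be sketched as follows: the continuous regression outputs are bucketed into grades by a set of cut points, and the cut points are varied around the defaults (0.5, 1.5, 2.5, 3.5) to maximise the validation kappa. The candidate offsets below are illustrative, not the grid we actually searched.

```python
import itertools
import numpy as np
from sklearn.metrics import cohen_kappa_score


def bucket(preds, thresholds):
    """Turn continuous regression outputs into ICDR grades 0-4 via cut points."""
    return np.digitize(preds, thresholds)


def grid_search_thresholds(y_true, raw_preds, offsets=(-0.15, 0.0, 0.15)):
    """Search small shifts around the default cut points (0.5, 1.5, 2.5, 3.5)."""
    base = np.array([0.5, 1.5, 2.5, 3.5])
    best_kappa, best_thr = -1.0, base
    for deltas in itertools.product(offsets, repeat=4):
        thr = base + np.array(deltas)
        kappa = cohen_kappa_score(y_true, bucket(raw_preds, thr),
                                  weights="quadratic")
        if kappa > best_kappa:
            best_kappa, best_thr = kappa, thr
    return best_thr, best_kappa
```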
We performed test-time augmentation (the same augmentation steps as mentioned in the section above) 10 times and averaged the TTA predictions of all 5 models. The prediction results are reported in Figure 8 and Figure 9.
We also trained several convolutional neural network based models, reported in Table 2, and found that DenseNet169 performs significantly better than ResNet50, as the former model benefits more from the Adam optimizer.
The prediction scores of the convolutional neural network models reported in Table 2 treat the identification problem as regression, the same formulation we adopted for our proposed model built on top of EfficientNet-B5. These models are trained without any fine-tuning steps, and EfficientNet-B5 performs better than the other CNN models with an improvement of 0.10%.
Case Study of Smart IoT Based Healthcare System
IoT based healthcare systems become more powerful through the integration of deep learning approaches. Recent advancements in IoT technology have revolutionized electronic healthcare research and industry applications. The huge increase in the use of portable smart health devices has improved the quality of health monitoring, diagnosis, and data collection for clinicians, with the potential to perform early diagnosis and provide necessary treatment on time. However, the use of personal medical data and records raises concerns about data security and data sharing policies [47]. Blockchain [41] provides a solution to deal with the privacy and transparency of data. A typical IoT blockchain platform for smart healthcare is illustrated in Fig. 10.
The framework consists of a vital sign monitoring system, an IoT server [42], a blockchain network, and a communication interface to collect patients’ information from the healthcare sensors. All the information is stored securely and communicated to the medical staff for further diagnosis and treatment. This information can further be used to develop decision-making models based on deep learning that provide accurate and timely diagnosis. Once developed, the approach can be optimized into an IoT based smart device that medical staff can use efficiently.
Conclusion and Future Work
In this work, we introduced a state-of-the-art deep learning based smart health system for the identification of blindness due to eye disease (diabetic retinopathy), evaluated on a retinal image dataset in an IoT setting. We have shown that the convergence of IoT with AI can provide an effective smart health system.
Our fine-tuned EfficientNet-B5 based model outperforms CNN and ResNet50 models with 92.32% validation accuracy, predicting the severity of diabetic retinopathy (eye blindness) on a five-point scale from retinal images. Our baseline EfficientNet-B5 model, trained on the average doctor opinion, yields state-of-the-art results on identifying blindness with 90.20% validation accuracy, and the freezing and unfreezing schedule used for the fine-tuned EfficientNet-B5 significantly improved the prediction to 92.32% validation accuracy. The proposed approach has been developed and tested only for the early detection of diabetic retinopathy in diabetic patients; for other medical image diagnosis tasks, the approach needs to be tested before drawing any conclusions. We also performed oversampling strategies for the interpretation of our detection results. We found that labels 0 and 2, i.e., no diabetic retinopathy and moderate retinopathy, account for around 89% of the images.
We intend to adopt other CNN architectures [43]–[45], such as UNet with ResNet backbones and UNet with EfficientNet weights [46], for such imbalanced image collections. Pseudo-labeling the imbalanced dataset may also improve the prediction for the given classes, and treating this identification task as binary classification by individually labeling the data could be an added advantage. The limitation in doing so is processing power, which increases greatly when using EfficientNet-B6 or B7 weights.
Declaration of Competing Interest
We wish to confirm that there are no known conflicts of interest associated with this publication.