
Expresso-AI: An Explainable Video-Based Deep Learning Model for Depression Diagnosis



Abstract:

Given the widespread prevalence of depression and its consequential impact on individuals and society, it is crucial to obtain objective measures for early diagnosis and intervention. As a multidisciplinary topic, these objective measures should be interpretable and accessible to health care professionals, ensuring effective collaboration and treatment planning in mental health care. Although automated depression diagnosis approaches have improved over the last decade, a critical gap remains: they often lack affect-specificity and interpretability, limiting their practical application and potential impact on mental health care. In particular, the interpretability of temporal activity in videos processed by deep models has not been fully explored. In this study, we present a novel framework for analyzing the decisions of Deep Neural Networks trained on facial videos, focusing specifically on automatic depression severity diagnosis. By fine-tuning Deep Convolutional Neural Networks (DCNNs) pre-trained on action recognition datasets with depression severity facial videos from the AVEC depression dataset, our framework interprets the model's saliency maps by examining face regions and temporal expression semantics. Our approach generates both visual and quantitative explanations for the model's decisions, providing greater insight into its reasoning. Beyond this interpretability, our video-based modeling improves upon previous single-face benchmarks for visual depression diagnosis, yielding enhanced predictive performance. Overall, our work demonstrates a framework capable of generating hypotheses from a facial model's decisions while simultaneously improving the predictive capability of depression diagnosis.
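
The transfer-learning setup the abstract describes can be made concrete with a short sketch. The following is a minimal, hypothetical illustration (not the authors' code): it assumes torchvision's R3D-18 backbone pre-trained on Kinetics-400 as a stand-in for the action-recognition pre-training, and treats severity estimation as plain MSE regression on labeled face-video clips.

```python
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18, R3D_18_Weights

# Load a 3D CNN pre-trained on Kinetics-400 action recognition and swap
# the classification head for a single regression output (severity score).
model = r3d_18(weights=R3D_18_Weights.KINETICS400_V1)
model.fc = nn.Linear(model.fc.in_features, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def finetune_step(clips, scores):
    """One fine-tuning step on a batch of aligned face-video clips.

    clips:  (B, 3, T, H, W) float tensor of cropped face frames
    scores: (B,) ground-truth severity labels (e.g., BDI-II in AVEC 2014)
    """
    model.train()
    optimizer.zero_grad()
    preds = model(clips).squeeze(1)          # (B,) predicted severities
    loss = nn.functional.mse_loss(preds, scores)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Re-using spatio-temporal filters learned for action recognition gives the network motion-sensitive features before it ever sees a depression label, which is the premise the abstract's fine-tuning setup rests on.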
Date of Conference: 10-13 September 2023
Date Added to IEEE Xplore: 15 January 2024
Conference Location: Cambridge, MA, USA

I. Introduction

Depression is a pervasive mental health issue affecting millions of people globally. According to the World Health Organization (WHO), more than 264 million people of all ages suffer from depression, making it one of the leading causes of disability worldwide [1]. Current diagnostic methods for depression rely primarily on subjective assessments, such as self-reports and clinical interviews, which can be influenced by personal bias, cultural differences, and subjective interpretation. Given the widespread prevalence of depression and its consequential impact on individuals and society, there is a pressing need to develop objective measures that accurately identify and assess depressive symptoms. Such objective measures should be interpretable and accessible to health care professionals, ensuring effective collaboration and treatment planning in mental health care. By addressing this need, we can pave the way for earlier diagnosis, timely intervention, and, ultimately, better mental health outcomes for those affected by depression.
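
To make the interpretability side of the framework tangible, here is a minimal sketch continuing the one given after the abstract. Plain input-gradient saliency is an assumption on my part, a stand-in for whatever attribution method the paper actually uses; the point is only that a video model yields a per-frame spatial map that can then be matched against face regions and moments in time.

```python
def temporal_saliency(model, clip):
    """Per-frame saliency for one clip via plain input gradients:
    |d(prediction)/d(input)|, max-reduced over colour channels.

    clip: (3, T, H, W) float tensor of one face-video clip.
    Returns a (T, H, W) saliency volume: one spatial map per frame.
    """
    model.eval()
    # Detach to get a fresh leaf tensor, then track gradients w.r.t. it.
    x = clip.detach().unsqueeze(0).requires_grad_(True)  # (1, 3, T, H, W)
    model(x).sum().backward()
    # Collapse the channel axis so each frame keeps one saliency map.
    return x.grad.abs().squeeze(0).amax(dim=0)
```

The per-frame maps could then be overlaid on faces detected and aligned with tools such as S3FD [28] and 2D/3D landmark alignment [29], which the reference list suggests supply the face-region geometry for the region-level and temporal-semantics analyses described in the abstract.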

References
[1] "Depression (WHO fact sheet)", World Health Organization, 2020. [Online]. Available: https://www.who.int/news-room/factsheets/.
[2] A. Pampouchidou, P. Simos, K. Marias, F. Meriaudeau, F. Yang, M. Pediaditis, et al., "Automatic assessment of depression based on visual cues: A systematic review", IEEE Transactions on Affective Computing, 2017.
[3] T. Miller, "Explanation in artificial intelligence: Insights from the social sciences", Artificial Intelligence, vol. 267, pp. 1-38, 2019.
[4] O. Biran and C. Cotton, "Explanation and justification in machine learning: A survey", IJCAI-17 Workshop on Explainable AI (XAI), vol. 8, no. 1, 2017.
[5] F. Hohman, M. Kahng, R. Pienta and D. H. Chau, "Visual analytics in deep learning: An interrogative survey for the next frontiers", IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 8, pp. 2674-2693, 2018.
[6] A. Nguyen, J. Yosinski and J. Clune, "Understanding neural networks via feature visualization: A survey", Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, pp. 55-76, 2019.
[7] R. Fong and A. Vedaldi, "Explanations for attributing deep neural network predictions", Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, pp. 149-167, 2019.
[8] Z. Li, W. Wang, Z. Li, Y. Huang and Y. Sato, "A comprehensive study on visual explanations for spatio-temporal networks", arXiv preprint arXiv:2005.00375, 2020.
[9] X. Li, W. Guo and H. Yang, "Depression severity prediction from facial expression based on the DRR DepressionNet network", 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2757-2764, 2020.
[10] Z. Bylinskii, T. Judd, A. Oliva, A. Torralba and F. Durand, "What do different evaluation metrics tell us about saliency models?", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 3, pp. 740-757, 2018.
[11] S. Al-gawwam and M. Benaissa, "Depression detection from eye blink features", 2018 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), pp. 388-392, 2018.
[12] Y. Suhara, Y. Xu and A. S. Pentland, "DeepMood: Forecasting depressed mood based on self-reported histories via recurrent neural networks", Proceedings of the 26th International Conference on World Wide Web (WWW '17), pp. 715-724, 2017.
[13] T. Alhanai, M. Ghassemi and J. Glass, "Detecting depression with audio/text sequence modeling of interviews", Proceedings of INTERSPEECH 2018, pp. 1716-1720, 2018.
[14] A. Haque, M. Guo, A. S. Miner and L. Fei-Fei, "Measuring depression symptom severity from spoken language and 3D facial expressions", arXiv preprint arXiv:1811.08592, 2018.
[15] S. Yin, C. Liang, H. Ding and S. Wang, "A multi-modal hierarchical recurrent neural network for depression detection", Proceedings of the 9th International Audio/Visual Emotion Challenge and Workshop, pp. 65-71, 2019.
[16] Y. Zhu, Y. Shang, Z. Shao and G. Guo, "Automated depression diagnosis based on deep networks to encode facial appearance and dynamics", IEEE Transactions on Affective Computing, vol. 9, no. 4, pp. 578-584, Oct. 2018.
[17] L. Yang, D. Jiang and H. Sahli, "Integrating deep and shallow models for multi-modal depression analysis - hybrid architectures", IEEE Transactions on Affective Computing, 2018.
[18] Z. Du, W. Li, D. Huang and Y. Wang, "Encoding visual behaviors with attentive temporal convolution for depression prediction", 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019), pp. 1-7, 2019.
[19] W. C. de Melo, E. Granger and M. B. Lopez, "Encoding temporal information for automatic depression recognition from facial analysis", ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1080-1084, 2020.
[20] J. F. Cohn, N. Cummins, J. Epps, R. Goecke, J. Joshi and S. Scherer, "Multimodal assessment of depression from behavioral signals", The Handbook of Multimodal-Multisensor Interfaces: Signal Processing, Architectures, and Detection of Emotion and Cognition, Volume 2, pp. 375-417, 2018.
[21] E. A. Stepanov, S. Lathuiliere, S. A. Chowdhury, A. Ghosh, R.-L. Vieriu, N. Sebe, et al., "Depression severity estimation from multiple modalities", 2018 IEEE 20th International Conference on e-Health Networking, Applications and Services (Healthcom), pp. 1-6, 2018.
[22] K. Anis, H. Zakia, D. Mohamed and C. Jeffrey, "Detecting depression severity by interpretable representations of motion dynamics", 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 739-745, 2018.
[23] X. Zhou, K. Jin, Y. Shang and G. Guo, "Visually interpretable representation learning for depression recognition from facial images", IEEE Transactions on Affective Computing, 2018.
[24] W. C. de Melo, E. Granger and A. Hadid, "Depression detection based on deep distribution learning", 2019 IEEE International Conference on Image Processing (ICIP), pp. 4544-4548, 2019.
[25] S. Song, S. Jaiswal, L. Shen and M. Valstar, "Spectral representation of behaviour primitives for depression analysis", IEEE Transactions on Affective Computing, 2020.
[26] M. Valstar, B. Schuller, K. Smith, T. Almaev, F. Eyben, J. Krajewski, et al., "AVEC 2014: 3D dimensional affect and depression recognition challenge", Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge, pp. 3-10, 2014.
[27] F. Ringeval, B. Schuller, M. Valstar, N. Cummins, R. Cowie, L. Tavabi, M. Schmitt, S. Alisamir, S. Amiriparian, E.-M. Messner, et al., "AVEC 2019 workshop and challenge: State-of-mind, detecting depression with AI, and cross-cultural affect recognition", Proceedings of the 9th International Audio/Visual Emotion Challenge and Workshop, pp. 3-12, 2019.
[28] S. Zhang, X. Zhu, Z. Lei, H. Shi, X. Wang and S. Z. Li, "S3FD: Single shot scale-invariant face detector", Proceedings of the IEEE International Conference on Computer Vision, pp. 192-201, 2017.
[29] A. Bulat and G. Tzimiropoulos, "How far are we from solving the 2D & 3D face alignment problem? (And a dataset of 230,000 3D facial landmarks)", Proceedings of the IEEE International Conference on Computer Vision, pp. 1021-1030, 2017.
[30] K. Turkowski, "Filters for common resampling tasks", Graphics Gems, Academic Press, 1990.
