Interpretable probabilistic identification of depression in speech / S. Ntalampiras. - In: SENSORS. - ISSN 1424-8220. - 25:4(2025 Feb), pp. 1270.1-1270.24. [10.3390/s25041270]

Interpretable probabilistic identification of depression in speech

S. Ntalampiras
2025

Abstract

Mental health assessment is typically carried out through a series of conversation sessions with medical professionals, with the overall aim of diagnosing mental illnesses and evaluating well-being. Despite the socioeconomic significance of such services, national health systems fail to meet the increased demand for them observed in recent years. To assist and accelerate the diagnostic process, this work proposes an AI-based tool that provides interpretable predictions by automatically processing recorded speech signals. An explainability-by-design approach is followed: the feature vector is formed from audio descriptors related to the problem at hand (Mel-scaled spectrum summarization, Teager operator and periodicity description), while modeling is based on Hidden Markov Models adapted from an ergodic universal model following a suitably designed data selection scheme. After extensive experiments adopting a standardized protocol on a publicly available dataset, we report results significantly higher than the state of the art. In addition, an ablation study was carried out, providing a comprehensive analysis of the relevance of each system component. Finally, the proposed solution not only provides excellent performance, but its operation and predictions are also transparent and interpretable, paving the way toward closing the usability gap between such systems and medical personnel.
medical acoustics; audio pattern recognition; universal background modeling; hidden Markov models; mental health; depression; explainable AI; interpretable AI
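
The abstract names the Teager operator among the audio descriptors without giving its form. As a minimal sketch, assuming the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], the computation could look as follows. The function name `teager_energy` and the test tone are illustrative assumptions, not the paper's feature pipeline.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]**2 - x[n-1] * x[n+1].

    Hypothetical helper for illustration; not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 3:
        raise ValueError("need at least 3 samples")
    # Defined only for interior samples, so the output has len(x) - 2 values.
    return x[1:-1] ** 2 - x[:-2] * x[2:]


# Illustrative input (not data from the paper): a pure tone A * cos(omega * n).
A, omega = 0.5, 0.3
n = np.arange(200)
tone = A * np.cos(omega * n)
psi = teager_energy(tone)
```

For a pure tone the operator returns the constant A^2 * sin^2(omega), so it reflects both the amplitude and the frequency of the signal, which is one reason it is commonly used as a speech descriptor.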
Sector INFO-01/A - Computer Science
February 2025
Article (author)
Files in this record:
sensors-25-01270.pdf (open access)
Type: Publisher's version/PDF
License: Creative Commons
Size: 1.46 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1148695
Citations
  • PMC: ND
  • Scopus: 3
  • ISI: 2
  • OpenAlex: 2