Explainable Artificial Intelligence Techniques for Speech Emotion Recognition: A Focus on XAI Models

Authors

  • Michael Norval, University of South Africa, Johannesburg
  • Zenghui Wang, University of South Africa, Johannesburg

DOI:

https://doi.org/10.4114/intartif.vol28iss76pp85-123

Keywords:

Artificial Intelligence, Speech Emotion Recognition, Shapley Additive Explanations, Local Interpretable Model-agnostic Explanations

Abstract

This study employs Explainable Artificial Intelligence (XAI) techniques, namely SHAP and LIME, together with XGBoost, to interpret speech emotion recognition (SER) models. Unlike previous work that focuses on generic datasets, this research integrates these tools to explore the emotional nuances of an Afrikaans speech corpus. The complexity of deep SER architectures poses significant challenges for model interpretability. This paper aims to bridge gaps in existing SER systems by integrating advanced XAI techniques. The objective is to develop an ensemble stacking model that combines CNN, CLSTM, and XGBoost, augmented by SHAP and LIME, to enhance the interpretability, accuracy, and adaptability of SER systems, particularly for underrepresented languages such as Afrikaans. Our methodology uses XAI methods to explain the decision-making processes of the CNN and CLSTM models, enhancing trust, diagnostic insight, and theoretical understanding. We train the models on a comprehensive dataset of emotional speech samples. Post-training, we apply SHAP and LIME to these models to generate explanations for their predictions, focusing on feature importance and the models’ decision logic.
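As a minimal illustration of this post-training explanation step, the sketch below applies SHAP to a toy XGBoost emotion classifier. The synthetic MFCC-style features, the two-class setup, and all variable names are illustrative assumptions, not the paper’s actual pipeline or data:

    # Hedged sketch: SHAP feature attribution for a toy SER classifier.
    # Synthetic data; the two-class setup (neutral vs angry) is an
    # assumption made to keep the attribution arrays simple.
    import numpy as np
    import shap
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 13))      # stand-in MFCC statistics per utterance
    y = rng.integers(0, 2, size=200)    # toy labels: 0 = neutral, 1 = angry
    feature_names = [f"mfcc_{i}" for i in range(13)]

    model = xgb.XGBClassifier(n_estimators=50, max_depth=3)
    model.fit(X, y)

    # TreeExplainer yields exact SHAP values for tree ensembles, attributing
    # each prediction to the individual acoustic features.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X[:20])   # one (20, 13) attribution matrix

    # Rank features by mean absolute attribution across the 20 utterances.
    importance = np.abs(shap_values).mean(axis=0)
    for name, score in sorted(zip(feature_names, importance), key=lambda t: -t[1])[:5]:
        print(f"{name}: {score:.3f}")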
By comparing the explanations generated by SHAP and LIME, we assess the efficacy of each method in providing meaningful insight into the models’ operation. A comparative study of the SER models demonstrates their capability to discern complex emotional states through diverse analytical approaches, from spatial feature extraction to temporal dynamics. Our results show that XAI techniques improve the interpretability of complex SER models; this added transparency builds end-user trust and yields diagnostic insight into model behaviour. The study underscores the importance of explainability when deploying AI in emotionally sensitive applications, paving the way for more accountable and user-centric SER systems.
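For the LIME side of this comparison, a matching sketch (continuing from the toy model and features above; class names and parameters are again illustrative) shows how a single utterance’s prediction is explained with a local linear surrogate:

    # Hedged sketch: a local LIME explanation for one utterance, reusing
    # X, feature_names, and model from the SHAP sketch above.
    from lime.lime_tabular import LimeTabularExplainer

    lime_explainer = LimeTabularExplainer(
        X,
        feature_names=feature_names,
        class_names=["neutral", "angry"],
        mode="classification",
    )

    # LIME perturbs the instance and fits a local linear model to the
    # classifier's responses, so its importances are per-prediction
    # (local) rather than global like the SHAP ranking above.
    explanation = lime_explainer.explain_instance(
        X[0], model.predict_proba, num_features=5
    )
    print(explanation.as_list())    # top 5 locally important features

Unlike SHAP’s game-theoretic attributions, LIME’s output depends on random perturbation sampling, which is one reason comparing the two methods is informative.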

Published

2025-06-17

How to Cite

Norval, M., & Wang, Z. (2025). Explainable Artificial Intelligence Techniques for Speech Emotion Recognition: A Focus on XAI Models. Inteligencia Artificial, 28(76), 85–123. https://doi.org/10.4114/intartif.vol28iss76pp85-123