Explainable Artificial Intelligence Techniques for Speech Emotion Recognition: A Focus on XAI Models
DOI:
https://doi.org/10.4114/intartif.vol28iss76pp85-123

Keywords:
Artificial Intelligence, Speech Emotion Recognition, Shapley Additive Explanations, Local Interpretable Model-agnostic Explanations

Abstract
This study employs Explainable Artificial Intelligence (XAI) techniques, SHAP and LIME, alongside XGBoost, to interpret speech emotion recognition (SER) models. Unlike previous work that focuses on generic datasets, this research integrates these tools to explore the emotional nuances of an Afrikaans speech corpus. The complexity of modern SER architectures poses significant challenges to model interpretability, and this paper aims to bridge the gaps in existing SER systems by integrating advanced XAI techniques. The objective is to develop an ensemble stacking model that combines CNN, CLSTM, and XGBoost, augmented by SHAP and LIME, to enhance the interpretability, accuracy, and adaptability of SER systems, particularly for underrepresented languages such as Afrikaans. Our methodology applies XAI methods to explain the decision-making processes of the CNN and CLSTM models, with the goal of strengthening trust, diagnostic insight, and theoretical understanding. We train the models on a comprehensive dataset of emotional speech samples and, after training, apply SHAP and LIME to generate explanations for their predictions, focusing on feature importance and the models' decision logic. By comparing the explanations produced by SHAP and LIME, we assess how effectively each method provides meaningful insight into the models' operations. A comparative study of the models demonstrates their capability to discern complex emotional states through diverse analytical approaches, from spatial feature extraction to temporal dynamics. Our results show that XAI techniques improve the interpretability of complex SER models; this added transparency builds end-user trust and yields valuable diagnostic insight. The study underscores the importance of explainability when deploying AI in emotionally sensitive applications, paving the way for more accountable and user-centric SER systems.
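As a rough illustration of the post-hoc explanation workflow described above, the following minimal sketch (not the authors' implementation) applies SHAP and LIME to a trained emotion classifier. The MFCC feature matrix, the four-emotion label set, and the use of an XGBoost classifier as a stand-in for the trained CNN/CLSTM models are assumptions made to keep the example self-contained; for deep models one would typically substitute a SHAP DeepExplainer or KernelExplainer.

# Minimal sketch: post-hoc SHAP and LIME explanations for an SER classifier.
# Assumes utterances have been reduced to fixed-length acoustic feature
# vectors (e.g. 40 MFCC means); X, y, and the emotion labels are placeholders.
import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from xgboost import XGBClassifier

emotions = ["anger", "happiness", "neutral", "sadness"]   # hypothetical label set
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))                            # placeholder MFCC features
y = rng.integers(0, len(emotions), size=200)              # placeholder emotion labels

# Stand-in model; the paper's CNN/CLSTM would take this role in practice.
model = XGBClassifier(n_estimators=100, max_depth=4).fit(X, y)

# SHAP: additive feature attributions for a batch of predictions.
shap_explainer = shap.TreeExplainer(model)
shap_values = shap_explainer.shap_values(X[:10])

# LIME: local surrogate explanation for a single utterance's prediction.
lime_explainer = LimeTabularExplainer(
    X,
    feature_names=[f"mfcc_{i}" for i in range(40)],
    class_names=emotions,
    mode="classification")
lime_exp = lime_explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(lime_exp.as_list())

Comparing the per-feature contributions returned by the two explainers, as the abstract describes, is then a matter of inspecting shap_values against the LIME weight list for the same utterance.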
License
Copyright (c) 2025 Iberamia & The Authors

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.