Developing a negative speech emotion recognition model for safety systems using deep learning

Growing threats in public spaces have raised concerns about personal security and made technology, especially speech recognition, more relevant. This paper proposes a safety system that combines keyword detection and negative emotion detection to address this problem. The system responds when the wake-up word "ON" is spoken with negative emotion. Our main contribution is two-fold: first, detecting the presence of the wake-up keyword "ON" in speech using a Convolutional Neural Network (CNN) model, and second, detecting negative emotion in the speech using a Long Short-Term Memory (LSTM) model. We propose combining these two models to address the same problem statement. With the proposed methodology, the CNN-based keyword detection model achieves 97.23% accuracy on the safety-related "ON" keyword, slightly above comparable works, while the LSTM-based negative emotion recognition model reaches 88.94% accuracy, trailing more advanced recent architectures. The paper further discusses dataset curation, the methodologies implemented, and the system pipeline. It also compares feature extraction techniques such as Mel-Frequency Cepstral Coefficients (MFCC), Linear Prediction Cepstral Coefficients (LPCC), chroma, and Mel features. Moreover, as speech recognition applications built on more than one model are becoming increasingly popular, this analysis should help in developing applications that require a similar end-to-end design. © 2025 Elsevier B.V., All rights reserved.
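
The two-model decision described in the abstract can be illustrated with a minimal sketch. It assumes each model is exposed as a function that returns a probability; the class name `SafetyPipeline`, the 0.5 thresholds, and the order in which the two stages run are illustrative assumptions, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical types: a feature matrix (e.g. frames x coefficients) and a
# scorer that maps it to a probability in [0, 1].
Features = Sequence[Sequence[float]]
Scorer = Callable[[Features], float]


@dataclass
class SafetyPipeline:
    keyword_model: Scorer            # stand-in for the CNN keyword detector
    emotion_model: Scorer            # stand-in for the LSTM emotion model
    keyword_threshold: float = 0.5   # assumed value, not from the paper
    emotion_threshold: float = 0.5   # assumed value, not from the paper

    def should_alert(self, features: Features) -> bool:
        # Stage 1 (assumed order): is the wake-up word "ON" present?
        if self.keyword_model(features) < self.keyword_threshold:
            return False
        # Stage 2: was it spoken with negative emotion?
        return self.emotion_model(features) >= self.emotion_threshold


# Usage with dummy scorers in place of trained models.
pipeline = SafetyPipeline(keyword_model=lambda f: 0.9,
                          emotion_model=lambda f: 0.8)
alert = pipeline.should_alert([[0.0, 0.1], [0.2, 0.3]])
```

Running the keyword check first means the emotion model is only evaluated on utterances that already contain the wake-up word; whether the authors order the stages this way is not stated in the abstract.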

Authors
Jena Shreya 1, Basak Sneha 1, Agrawal Himanshi 1, Saini Bunny 1, Gite Shilpa Shailesh 2, 3, Kotecha Ketan V. 2, 5, Alfarhood Sultan 4
Publisher
Springer Nature
Issue number
1
Language
English
Status
Published
Number
54
Volume
12
Year
2025
Affiliations
  • 1 Department of Computer Science and Engineering, Symbiosis Institute of Technology, Pune, India
  • 2 Symbiosis Centre for Applied Artificial Intelligence, Pune, India
  • 3 Department of Artificial Intelligence and Machine Learning, Symbiosis Institute of Technology, Pune, India
  • 4 College of Sciences, Riyadh, Saudi Arabia
  • 5 RUDN University, Moscow, Russian Federation
Keywords
Automatic speech recognition (ASR); Convolutional neural network (CNN); Long short-term memory (LSTM) model; Mel-frequency cepstral coefficients (MFCC); Safety systems