Developing a negative speech emotion recognition model for safety systems using deep learning

Growing threats in public spaces have raised concerns about personal security and increased the relevance of speech-based technology. This paper proposes a safety system that combines keyword detection with negative emotion detection: it triggers when the wake-up word "ON" is spoken with negative emotion. Our contribution is two-fold: first, detecting the wake-up keyword "ON" in speech using a Convolutional Neural Network (CNN) model, and second, detecting negative emotion in speech using a Long Short-Term Memory (LSTM) model. We propose combining the two models to address the same problem. With the proposed methodology, the CNN-based keyword detection model achieves 97.23% accuracy for the safety-related "ON" keyword, slightly above comparable works, while the LSTM-based negative emotion recognition model reaches 88.94% accuracy, below more advanced recent architectures. The paper discusses dataset curation, the methodologies implemented, and the system pipeline. It also compares feature extraction techniques such as Mel-Frequency Cepstral Coefficients (MFCC), Linear Prediction Cepstral Coefficients (LPCC), chroma, and Mel features. Because speech recognition applications that combine more than one model are becoming increasingly popular, this analysis may help in developing applications that require a similar end-to-end construct. © 2025 Elsevier B.V., All rights reserved.
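The abstract describes the system as raising an alert only when the keyword is present and the speech carries negative emotion. A minimal sketch of that decision step follows. It assumes each model emits a probability in [0, 1]; the names `fuse` and `Decision` and the 0.5 thresholds are hypothetical illustrations, since the abstract does not state how the two model outputs are combined.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Outcome of combining the two detectors (hypothetical structure)."""
    keyword_detected: bool
    negative_emotion: bool

    @property
    def alert(self) -> bool:
        # An alert requires both the wake-up keyword and negative emotion.
        return self.keyword_detected and self.negative_emotion


def fuse(keyword_prob: float, negative_prob: float,
         keyword_threshold: float = 0.5,
         emotion_threshold: float = 0.5) -> Decision:
    """Combine a keyword score and a negative-emotion score.

    Both thresholds are assumptions for illustration, not values from the paper.
    """
    for name, p in (("keyword_prob", keyword_prob),
                    ("negative_prob", negative_prob)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {p}")
    return Decision(
        keyword_detected=keyword_prob >= keyword_threshold,
        negative_emotion=negative_prob >= emotion_threshold,
    )
```

For example, `fuse(0.9, 0.8).alert` is true, while `fuse(0.9, 0.2).alert` is false because the keyword was spoken without negative emotion. In practice the thresholds would be tuned on validation data to trade off missed alerts against false alarms.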

Authors

Jena Shreya ¹, Basak Sneha ¹, Agrawal Himanshi ¹, Saini Bunny ¹, Gite Shilpa Shailesh ², ³, Kotecha Ketan V. ², ⁵, Alfarhood Sultan ⁴

Journal

Journal of Big Data

Publisher

Springer Nature

Issue number

Language

English

Status

Published

DOI

10.1186/S40537-025-01090-0

Number

Volume

Year

2025

Affiliations

¹ Department of Computer Science and Engineering, Symbiosis Institute of Technology, Pune, India
² Symbiosis Centre for Applied Artificial Intelligence, Pune, India
³ Department of Artificial Intelligence and Machine Learning, Symbiosis Institute of Technology, Pune, India
⁴ College of Sciences, Riyadh, Saudi Arabia
⁵ RUDN University, Moscow, Russian Federation

Keywords

Automatic speech recognition (ASR); Convolutional neural network (CNN); Long short-term memory (LSTM) model; MEL-frequency cepstral coefficients (MFCC); Safety systems
