Optimizing deep learning for webshell detection based on flexible dataset reduction

Incidents involving webshells (malicious scripts or code snippets) have risen dramatically, posing significant threats to organizations across sectors. Traditional security measures often fail to detect them, which calls for more advanced detection mechanisms. This article proposes a deep learning-based technique for webshell detection that addresses two challenges: high computational cost and sensitivity to variations in input length. The method combines a flexible dataset reduction approach with two feature extraction techniques, TF-IDF and Word2Vec, to lower computational complexity and standardize model input. To handle input variability and high dimensionality, we introduce two dataset reduction strategies, Flat-based and Depth-based reduction, both of which rely on a standard deviation-based representation to preserve essential statistical characteristics while reducing dataset size. This combination improves the performance and scalability of deep learning models, making them more feasible for practical webshell detection. The study systematically reviews existing techniques, highlights their limitations, and presents a solution that improves detection accuracy and efficiency. Experimental results show that our approach achieves high accuracy (up to 98.50% using a CNN) while significantly reducing training time. The findings indicate that flexible dataset reduction combined with dual feature extraction offers a scalable and effective solution for real-time webshell detection. © 2025 Elsevier B.V. All rights reserved.
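
The abstract does not spell out how the standard deviation-based representation is computed, so the following is only a minimal sketch of the general idea: a variable-length numeric feature sequence is summarized as a fixed number of per-segment standard deviations, giving every sample the same input length. The function name `reduce_to_fixed_length`, the equal-width segmentation, and the zero-fill for empty segments are all assumptions made for illustration; this is not the paper's Flat-based or Depth-based algorithm.

```python
from statistics import pstdev


def reduce_to_fixed_length(values, n_segments):
    """Summarize a numeric sequence as n_segments population standard deviations.

    Hypothetical helper (not from the paper): the sequence is split into
    n_segments contiguous, roughly equal slices, and each slice is replaced
    by its standard deviation. Empty slices (input shorter than n_segments)
    contribute 0.0, so the output length is always n_segments.
    """
    if n_segments <= 0:
        raise ValueError("n_segments must be positive")
    n = len(values)
    reduced = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        segment = values[start:end]
        reduced.append(pstdev(segment) if segment else 0.0)
    return reduced


# Two inputs of different lengths map to vectors of the same length.
short_sample = [0.0, 2.0, 0.0, 4.0]
long_sample = [0.1 * k for k in range(50)]
print(len(reduce_to_fixed_length(short_sample, 2)))
print(len(reduce_to_fixed_length(long_sample, 2)))
```

In a pipeline like the one the abstract describes, the input values would plausibly be TF-IDF weights or Word2Vec components rather than raw numbers, but that mapping is likewise an assumption here.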

Authors
Medileh Saci 1, Hammoudeh Mohammad Ali A. 6, Bounceur Ahcène 2, Brahim Ferik 3, Abdelkader Laouid Azzeddine 1, Mostefa Kara 4, Muthanna Ammar 5
Publisher
Elsevier B.V.
Language
English
Status
Published
Number
100770
Volume
31
Year
2025
Affiliations
  • 1 LIAP Laboratory, Université d’Echahid Hamma Lakhdar – El-oued, El Oued Province, Algeria
  • 2 University of Sharjah, Sharjah, United Arab Emirates
  • 3 Université Larbi Tébessi - Tébessa, Tebessa, Algeria
  • 4 King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
  • 5 RUDN University, Moscow, Russian Federation
  • 6 Department of Computer and Information Science, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
Keywords
Cybersecurity; Dataset reduction; Deep learning; Depth-based reduction; Flat-based reduction; Machine learning; Standard deviation representation; TF-IDF; Webshell detection; Word2Vec