A Malware Family Classification Approach Based on Deep Sequence Modeling

Authors

Corresponding author's email:

samnx@hcmute.edu.vn

DOI:

https://doi.org/10.54644/jte.2025.1527

Keywords:

One-Dimensional Convolutional Neural Network (1D-CNN), Bidirectional Long Short-Term Memory (BiLSTM), Malware Family Classification (MFC), Sequential Data (SD), Opcode Pattern Extraction Mechanism (OPEM)

Abstract

In cyber networks, malware appears in many forms and belongs to many different families. Classifying malware into families helps defenders respond to specific threats more effectively. Executable files contain instructions and opcodes that identify the type of malware, and these appear as sequential data, so sequence learning models are needed to improve the performance of malware family classification. In this work, we propose hybrid models based on a one-dimensional convolutional neural network (1D-CNN) and bidirectional long short-term memory (BiLSTM). The 1D-CNN works as a preprocessing mechanism that extracts malware features from raw data, and the BiLSTM network processes the sequential data in both forward and backward directions. Simulation results show that the proposed model classifies 21 malware families with a training and testing accuracy of 95%. Its testing accuracy is higher than that of a standalone 1D-CNN, which reaches 98% training accuracy but only 91% testing accuracy. Similarly, the training and testing loss of the proposed model decreases more smoothly than that of the 1D-CNN.
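The data flow described in the abstract (opcode sequence, 1D convolution, bidirectional LSTM, 21-way classification) can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the vocabulary size, embedding size, filter count, kernel width, hidden size, the embedding step, the omission of bias terms, and the random untrained weights are all assumptions made for the example. Only the number of families (21) comes from the abstract.

```python
import math
import random

NUM_FAMILIES = 21  # number of malware families reported in the abstract
# All sizes below are illustrative assumptions, not values from the paper.
VOCAB, EMB, FILTERS, KERNEL, HIDDEN = 50, 8, 6, 3, 5

rng = random.Random(0)


def rand_matrix(rows, cols):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def conv1d_relu(seq, kernels):
    """'Valid' 1D convolution along the time axis, followed by ReLU.

    seq: T vectors of size C. kernels: one row of KERNEL * C weights per filter.
    Returns T - KERNEL + 1 feature vectors of size len(kernels). Biases omitted.
    """
    out = []
    for t in range(len(seq) - KERNEL + 1):
        window = [x for vec in seq[t:t + KERNEL] for x in vec]  # flatten window
        out.append([max(0.0, sum(w * x for w, x in zip(k, window)))
                    for k in kernels])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_pass(seq, weights):
    """Run one LSTM direction over seq and return the hidden state per step."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    states = []
    for x in seq:
        xh = x + h  # concatenate current input with previous hidden state
        pre = {name: [sum(w * v for w, v in zip(row, xh)) for row in weights[name]]
               for name in "ifog"}
        i = [sigmoid(z) for z in pre["i"]]    # input gate
        f = [sigmoid(z) for z in pre["f"]]    # forget gate
        o = [sigmoid(z) for z in pre["o"]]    # output gate
        g = [math.tanh(z) for z in pre["g"]]  # candidate cell values
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        states.append(h)
    return states


def new_lstm_weights(input_size):
    return {name: rand_matrix(HIDDEN, input_size + HIDDEN) for name in "ifog"}


def classify(opcode_ids, params):
    """Opcode ids -> embedding -> 1D-CNN features -> BiLSTM -> softmax."""
    embedded = [params["emb"][i] for i in opcode_ids]
    feats = conv1d_relu(embedded, params["conv"])
    fwd = lstm_pass(feats, params["fwd"])        # reads left to right
    bwd = lstm_pass(feats[::-1], params["bwd"])  # reads right to left
    summary = fwd[-1] + bwd[-1]                  # final state of each direction
    logits = [sum(w * v for w, v in zip(row, summary)) for row in params["out"]]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]     # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]


params = {
    "emb": rand_matrix(VOCAB, EMB),
    "conv": rand_matrix(FILTERS, KERNEL * EMB),
    "fwd": new_lstm_weights(FILTERS),
    "bwd": new_lstm_weights(FILTERS),
    "out": rand_matrix(NUM_FAMILIES, 2 * HIDDEN),
}

opcodes = [rng.randrange(VOCAB) for _ in range(20)]  # stand-in opcode sequence
probs = classify(opcodes, params)
print(len(probs), round(sum(probs), 6))
```

With untrained weights the predicted probabilities are close to uniform. The point of the sketch is the shape of the pipeline: the convolution shortens the sequence and produces local opcode-pattern features, and the two LSTM passes summarize those features from both ends before classification.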

Author Biographies

Xuan Sam Nguyen, Ho Chi Minh City University of Technology and Education, Vietnam

Xuan Sam Nguyen received the Bachelor of Engineering in Electronic and Communication Engineering from PTIT, Hanoi, Vietnam in 2002, the Master of Science in Information and Communications Engineering from the Andong National University, and the Doctor of Philosophy in Computer Engineering from Korea University (Seoul campus), Republic of Korea in 2009 and 2016, respectively. He is currently a faculty member of FIT, HCMUTE. His research interests include distributed computing, real-time embedded systems, artificial intelligence for the Internet of Things, and cyber security.

Email: samnx@hcmute.edu.vn. ORCID: https://orcid.org/0009-0005-7225-9104

Han Nguyen, University of South Florida, United States of America

Han Nguyen is currently an undergraduate student majoring in Computer Science at the University of South Florida, United States of America. Her research interests include artificial intelligence, machine learning and cyber security.

Email: han316@usf.edu. ORCID: https://orcid.org/0009-0007-3684-6226

Published

28-08-2025

How to Cite

[1] X. S. Nguyen and H. Nguyen, “A Malware Family Classification Approach Based on Deep Sequence Modeling”, JTE, vol. 20, no. 03, pp. 8–15, Aug. 2025.

Section

Research Article
