A Novel Feature Vector for AI-Assisted Windows Malware Detection

Title:
A Novel Feature Vector for AI-Assisted Windows Malware Detection
Journal Title:
2023 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech)
Publication Date:
25 December 2023
Citation:
Yau, L. Q., Lam, Y. T., Lokesh, A., Gupta, P., Lim, J., Singh, I. S., Loo, J.-Y., Ngo, M. V., Teo, S. G., & Truong-Huu, T. (2023, November 14). A Novel Feature Vector for AI-Assisted Windows Malware Detection. 2023 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). https://doi.org/10.1109/dasc/picom/cbdcom/cy59711.2023.10361451
Abstract:
Dynamic malware analysis, which has been a major field in malware analysis and detection, involves executing malware in a controlled environment and observing its behavior. Dynamic analysis reports include Windows API calls, which are extracted as a data source for statistical features, and have allowed for effective malware detection. However, existing works neglect certain critical information about the API calls when constructing feature vectors. In this work, we develop a novel feature vector, taking into account not only the API name and its arguments but also other statistical features such as the return values and the number of times it is called in a sample. Due to the diversity of API calls in terms of the number of arguments, names, and return values, we adopt hash functions to construct a fixed-size feature vector, thus facilitating the design and development of artificial intelligence (AI)-assisted algorithms for malware detection. We experiment with various deep learning and machine learning models and perform extensive hyperparameter tuning to come up with an optimal model for our feature vector. The experimental dataset was recently collected from an anti-virus company, including 14860 samples with 7398 malign samples and 7462 benign samples. Extensive experiments show that our solution outperforms many baseline state-of-the-art malware detectors in various performance metrics, including accuracy, and false positive or false negative rate, thus proving the effectiveness of our feature vector and detection models.
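
Illustrative sketch (not part of the published record): the abstract describes using hash functions to map API names, arguments, return values, and call counts into a fixed-size feature vector. The Python below shows the general hashing-trick idea only. It is not the authors' implementation: the report field names (`api`, `args`, `return`), the choice of SHA-256, the token format, and the vector size of 1024 are all assumptions made for illustration.

```python
import hashlib

VECTOR_SIZE = 1024  # hypothetical size; the abstract does not state one


def bucket(token: str, size: int) -> int:
    """Map a string token to a stable index in [0, size)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size


def build_feature_vector(api_calls, size=VECTOR_SIZE):
    """Hash every API call of one sample into a fixed-size count vector.

    api_calls is a list of dicts with assumed keys 'api', 'args', 'return'.
    """
    vector = [0.0] * size
    for call in api_calls:
        name = call["api"]
        # One token for the API name: its bucket accumulates the call count.
        tokens = [f"name:{name}"]
        # One token for the return value, scoped to the API name.
        tokens.append(f"ret:{name}:{call.get('return')}")
        # One token per argument, so any number of arguments is supported.
        for arg_name, arg_value in sorted(call.get("args", {}).items()):
            tokens.append(f"arg:{name}:{arg_name}={arg_value}")
        for token in tokens:
            vector[bucket(token, size)] += 1.0
    return vector


# Hypothetical excerpt of a dynamic analysis report.
sample_report = [
    {"api": "CreateFileW", "args": {"path": "C:\\a.txt", "access": 3}, "return": 0},
    {"api": "CreateFileW", "args": {"path": "C:\\a.txt", "access": 3}, "return": 0},
    {"api": "RegOpenKeyExW", "args": {"key": "HKLM\\Software"}, "return": 2},
]
vec = build_feature_vector(sample_report)
```

Because every token lands in one of a fixed number of buckets, samples with different numbers of calls, arguments, and return values all yield vectors of the same length, which is what lets a standard classifier consume them. Distinct tokens can collide in the same bucket; a larger vector reduces that at the cost of dimensionality.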
License type:
Publisher Copyright
Funding Info:
There was no specific funding for the research done
Description:
© 2023 IEEE.  Personal use of this material is permitted.  Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISBN:
979-8-3503-0461-9
Files uploaded:

File Size Format
dacs2023-v3.pdf 483.58 KB PDF