Enhancing Robustness of Malware Detection using Synthetically-adversarial Samples

Title:
Enhancing Robustness of Malware Detection using Synthetically-adversarial Samples
Other Titles:
2020 IEEE Global Communications Conference (GLOBECOM 2020)
DOI:
10.1109/GLOBECOM42002.2020.9322377
Publication Date:
25 January 2021
Citation:
W. L. Tan and T. Truong-Huu, "Enhancing Robustness of Malware Detection using Synthetically-Adversarial Samples," GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2020, pp. 1-6, doi: 10.1109/GLOBECOM42002.2020.9322377.
Abstract:
Malware detection is a critical cybersecurity task that protects computers and networks from the activities of malicious software. With the emergence of machine learning, and especially deep learning, many malware detection models (malware classifiers) have been developed to learn features of malware samples collected from static or dynamic analysis. However, these classifiers experience a deterioration in performance (e.g., detection accuracy) over time due to changes in the distribution of malware samples. Leveraging the positive aspects of adversarial samples, we aim to enhance the robustness of malware classifiers using synthetically-adversarial samples. We develop Generative Adversarial Networks (GANs) that learn to generate not only malicious samples but also benign samples to enrich the training set of a baseline malware classifier. We improve the performance of the developed GANs by incorporating a relativistic discriminator and the cosine margin loss function so that quasi-realistic samples can be generated. We carry out extensive experiments with publicly available malware samples to evaluate the performance of the proposed approach. The experimental results show that, without synthetic samples in the training set, the baseline classifier experiences a drop in detection accuracy of up to 18.20% when evaluated against a test set that includes synthetic samples. Introducing synthetic samples into the training set and retraining the classifier not only compensates for this drop but also improves detection accuracy further by up to 4.15%.
License type:
Publisher Copyright
Funding Info:
This research is supported by the Agency for Science, Technology and Research - RIE2020 AME Core Funds (SERC Grant)
Grant Reference No.: A1916g2047
Description:
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISSN:
2576-6813
1930-529X
ISBN:
978-1-7281-8298-8
978-1-7281-8299-5
Files uploaded:

09322377.pdf (PDF, 9.15 MB)