Ngo, M. V., Truong-Huu, T., Rabadi, D., Loo, J. Y., & Teo, S. G. (2023). Fast and Efficient Malware Detection with Joint Static and Dynamic Features Through Transfer Learning. Lecture Notes in Computer Science, 503–531. https://doi.org/10.1007/978-3-031-33488-7_19
Abstract:
In malware detection, dynamic analysis extracts the runtime behavior of malware samples in a controlled environment, and static analysis extracts features using reverse engineering tools. While the former faces the challenges of anti-virtualization and evasive behavior of malware samples, the latter faces the challenges of code obfuscation. To tackle these drawbacks, prior works proposed developing detection models by aggregating dynamic and static features, thus leveraging the advantages of both approaches. However, simply concatenating dynamic and static features raises an issue of imbalanced contribution due to heterogeneous dimensions of feature vectors, resulting in little performance improvement. Moreover, dynamic analysis is a time-consuming task and requires a secure environment, leading to detection delays and high costs for maintaining the analysis infrastructure. In this paper, we first introduce a novel method of constructing aggregated features via concatenating latent features learned through deep learning with equally-contributed dimensions. We then develop a knowledge distillation technique to transfer knowledge learned from aggregated features by a teacher model to a student model trained only on static features. We carry out extensive experiments with a dataset of 86,709 samples including both benign and malware samples. The experimental results show that the teacher model trained on aggregated features constructed by our method outperforms the state-of-the-art models with an improvement of up to 2.38% in detection accuracy. The distilled student model not only achieves accuracy (97.81%) comparable to that of the teacher model but also significantly reduces the detection time (from 70,046.6 ms to 194.9 ms) without requiring dynamic analysis.
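The abstract does not give the loss used for distillation. As a rough illustration only, the sketch below shows the commonly used temperature-scaled formulation, in which a student is trained against both the teacher's softened outputs and the true label. The function names, the two-class logits, and the `temperature` and `alpha` values are assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical distillation objective (assumed, not the paper's exact loss).

    Combines:
      - a soft term: KL(teacher || student) on temperature-softened outputs,
        scaled by temperature**2 as is customary;
      - a hard term: cross-entropy of the student against the true label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    hard = -math.log(softmax(student_logits)[true_label])
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# Illustrative logits for (benign, malware); the values are made up.
teacher = [-1.0, 2.0]   # teacher sees aggregated static + dynamic features
student = [-0.2, 0.6]   # student sees static features only
loss = distillation_loss(student, teacher, true_label=1)
```

In this formulation the teacher is needed only during training; at inference time the student runs on static features alone, which is consistent with the reduced detection time reported in the abstract.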
License type:
Publisher Copyright
Funding Info:
There was no specific funding for the research done
Description:
This version of the article has been accepted for publication, after peer review, and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/978-3-031-33488-7_19