An intelligent malware classification framework

University of New Brunswick


Over the last decades, malicious software (malware) has become a primary source of attacks across the Internet. The prevalence of new malware for which no signatures are available, along with the difficulty anti-malware software has in keeping up with the continuous stream of new samples, has made the adoption of classification and clustering approaches necessary. Machine-learning methods have been extensively applied to classify or cluster malware into families, based on different features derived from static or dynamic analysis of the malware. While these approaches demonstrate promise, they are themselves subject to a growing array of countermeasures. In this work, we propose a framework to enhance traditional machine learning-based classification by utilizing high-level domain knowledge. We outline major behaviours of Windows malware from an analyst's point of view and provide possible methods (rules) to extract them from the output of static and dynamic analysis tools. We also take advantage of memory forensics to extract other stealthy aspects of an executable, which would otherwise remain undetected. Comparative experiments against state-of-the-art malware classification approaches confirm the effectiveness of our framework, which achieves an average classification accuracy of 81% while leaving only 0.5% of samples unlabeled.