A dynamic graph-based malware classifier

dc.contributor.advisorGhorbani, Ali
dc.contributor.authorJazi, Hossein Hadian
dc.description.abstractThe anti-virus industry receives a sheer amount of new malware samples on a daily basis. The prevalence of new sophisticated instances, for most of which no signature is available, coupled with the significant growth of potentially harmful programs have made the adoption of an effective automated classifier almost inevitable. Due to the vast majority of obfuscation techniques employed by the malware authors, extraction of a high-level representation of malware structure is an efficient way in this regard. High-level graph representations such as Function Call Graphs or Control Flow Graphs are able to represent the main functionality of a given sample in more abstract way. The graph-based approaches have mostly revolved around static analysis of the binary and share the common drawbacks of any static-based approaches. For example, generating a graph from a packed executable does not reflect the real structure of the code at all. In addition to the type of analysis, the scalability of these approaches is also affected by the employed graph comparison algorithm. Full graph comparison is by itself an NP-hard problem. Approximated graph comparison algorithms such as Graph Edit Distance have been commonly studied in the field of graph classification. To address the two major weaknesses involved with the current graph-based approaches, we propose a dynamic and scalable graph-based malware classifier. At the time of this proposal, this is the first attempt to generate and classify dynamic graphs. In spite of providing more accurate graphs, dynamic analysis leads to the generating larger graphs, and aggravating the problem of comparison measurement. To address this problem we modify an existing algorithm called Simulated Annealing to reduce computational complexity. To have a reasonable estimation of the effectiveness, our proposed system is compared against Classy, which is the state-of-the-art graph-based system. Our results show that proposed classifier is able to outperform Classy by an average classification accuracy of 94%, 4% false positive rate, and leaving only 2% of samples unlabeled.
dc.description.copyright© Hossein Hadian Jazi, 2016
dc.format.extentxii, 105 pages
dc.publisherUniversity of New Brunswick
dc.subject.disciplineComputer Science
dc.titleA dynamic graph-based malware classifier
dc.typemaster thesis
thesis.degree.disciplineComputer Science
thesis.degree.fullnameMaster of Computer Science
thesis.degree.grantorUniversity of New Brunswick


Original bundle
Now showing 1 - 1 of 1
Thumbnail Image
1.52 MB
Adobe Portable Document Format