
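Malware Clustering Based on Graph Convolutional Networks
Cite this article: Liu Kai, Fang Yong, Zhang Lei, Zuo Zheng, Liu Liang. Malware Clustering Based on Graph Convolutional Networks[J]. Journal of Sichuan University (Natural Science Edition), 2019, 56(4): 654-660.
Authors: Liu Kai, Fang Yong, Zhang Lei, Zuo Zheng, Liu Liang
Affiliations: College of Electronics and Information Engineering, Sichuan University; College of Cybersecurity, Sichuan University; College of Cybersecurity, Sichuan University; College of Electronics and Information Engineering, Sichuan University; College of Cybersecurity, Sichuan University
Funding: National Key Research and Development Program of China (2017YFB0802904)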
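Abstract: Many new malware samples are created by attackers who modify existing malware, so analyzing the family homology of malware helps in studying its evolution and tracing its origin. Starting from the API call graph of malware and combining it with graph convolutional networks (GCN), this paper designs a similarity calculation and family clustering model for malware. First, a disassembly tool is used to extract the API calls of the malware, and the API functions are annotated with attributes. Then, key API functions are selected according to each API's contribution to the malware families, and the API call graph of the malware is constructed. GCN and a convolutional neural network (CNN) serve as the similarity calculation model, which takes API call graphs as input and computes the similarity between malware samples. Finally, the DBSCAN clustering algorithm is used to cluster the malware into families. Experimental results show that the proposed method achieves a clustering accuracy of 87.3% and can effectively cluster malware into families.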

Keywords: malware; graph convolutional network; clustering; API call graph; convolutional neural network
Received: 2019/1/9
Revised: 2019/3/5
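
The abstract does not describe the internals of the GCN used in the paper. As a rough illustration of what a graph convolution over an API call graph computes, the sketch below assumes the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W), a toy undirected 3-node graph, and made-up feature and weight values. The function names `normalize_adjacency`, `matmul`, and `gcn_layer` are hypothetical and are not taken from the paper.

```python
import math

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix (list of lists)."""
    n = len(adj)
    # Add self-loops so every node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 for non-negative entries
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, features, weights):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    a_norm = normalize_adjacency(adj)
    z = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in z]

# Toy graph (assumption): three API nodes connected as a path 0 - 1 - 2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
# Hypothetical 2-dimensional attribute vector per API node.
features = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]
# Hypothetical weights; in practice these would be learned.
weights = [[1.0, -1.0],
           [0.5, 1.0]]

out = gcn_layer(adj, features, weights)
print(out)  # one 2-dimensional embedding per node
```

A real API call graph is directed, so a faithful implementation would need to decide how to treat edge direction; the symmetric normalization here is a simplification.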

Malware Clustering Based on Graph Convolutional Networks
Liu Kai, Fang Yong, Zhang Lei, Zuo Zheng and Liu Liang. Malware Clustering Based on Graph Convolutional Networks[J]. Journal of Sichuan University (Natural Science Edition), 2019, 56(4): 654-660.
Authors: Liu Kai, Fang Yong, Zhang Lei, Zuo Zheng and Liu Liang
Institution: College of Cybersecurity, Sichuan University, College of Cybersecurity, Sichuan University, College of Electronics and Information Engineering, Sichuan University, College of Cybersecurity, Sichuan University
Abstract: Many new types of malware are created by attackers who modify existing malware. Family homology analysis of malware therefore helps in studying its evolutionary trends and tracing its origin. In this paper, starting from the API call graphs of malware and combining them with Graph Convolutional Networks (GCN), we propose a similarity calculation and family clustering model for malware. First, we extract the API calls of malware with disassembly tools and label the attributes of the API functions. Then, we select key API functions according to their contribution to the malware families and construct the API call graphs. We use GCN and Convolutional Neural Networks (CNN) as the similarity calculation model, which takes the API call graphs as input and computes the similarity between malware samples. Finally, we use the DBSCAN algorithm to cluster the malware into families. The experimental results show that the proposed method achieves 87.3% clustering accuracy and can effectively cluster malware families.
Keywords: malware; GCN; clustering; API call graph; CNN
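
The final clustering step can be illustrated with a small, self-contained DBSCAN over a precomputed similarity matrix. This is only a sketch under stated assumptions: the similarity values, the conversion `distance = 1 - similarity`, and the parameters `eps` and `min_pts` are all invented for illustration, and the abstract does not report the settings actually used.

```python
def dbscan(dist, eps, min_pts):
    """Cluster points given a precomputed distance matrix.

    Returns one label per point: 0, 1, ... for clusters and -1 for noise.
    """
    n = len(dist)
    labels = [None] * n  # None = not yet visited
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            j_neighbors = [k for k in range(n) if dist[j][k] <= eps]
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels

# Hypothetical pairwise similarities in [0, 1] for five samples.
sim = [
    [1.00, 0.90, 0.85, 0.10, 0.20],
    [0.90, 1.00, 0.80, 0.15, 0.10],
    [0.85, 0.80, 1.00, 0.20, 0.10],
    [0.10, 0.15, 0.20, 1.00, 0.30],
    [0.20, 0.10, 0.10, 0.30, 1.00],
]
# Assumption: turn similarity into a distance as 1 - similarity.
dist = [[1.0 - s for s in row] for row in sim]

labels = dbscan(dist, eps=0.25, min_pts=2)
print(labels)  # [0, 0, 0, -1, -1]
```

In this toy input the first three samples are mutually similar and form one family, while the last two are not close enough to anything and are reported as noise. DBSCAN suits this setting because the number of families does not have to be fixed in advance.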