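Source Code Defect Detection Based on Deep Learning (深度学习源代码缺陷检测方法)
Cite this article: WANG Xiao-meng, ZHANG Tao, XIN Wei, HOU Chang-yu. 深度学习源代码缺陷检测方法 [Source Code Defect Detection Based on Deep Learning][J]. 北京理工大学学报, 2019, 39(11): 1155-1159. DOI: 10.15918/j.tbit1001-0645.2018.396
Authors: WANG Xiao-meng (王晓萌), ZHANG Tao (张涛), XIN Wei (辛伟), HOU Chang-yu (侯长玉)
Affiliation: China Information Technology Security Evaluation Center, Beijing 100085, China
Funding: National Natural Science Foundation of China (U1636115, U1736209)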
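Abstract (translated from the Chinese): Traditional source code defect analysis depends on analysts' understanding of security issues and on long-accumulated experience, which results in high false positive and false negative rates in defect detection. To address this problem, a source code defect detection method based on deep learning is proposed. The method uses the abstract syntax tree and data-flow features of program source code and performs defect detection by training a source code defect classifier. Its key idea is to apply deep learning, together with word embedding algorithms from natural language processing, to learn the deep semantic and syntactic features contained in the abstract syntax tree and data flow of source code; on this basis, a general deep learning framework for source code defect detection is proposed. The method is validated on the public SARD dataset, and the results show that it outperforms existing detection methods in accuracy, recall, false positive rate, and false negative rate.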

Keywords: defect detection; deep learning; static analysis; semantic feature; syntactic feature
Received: 2018-10-16

Source Code Defect Detection Based on Deep Learning
WANG Xiao-meng, ZHANG Tao, XIN Wei and HOU Chang-yu. Source Code Defect Detection Based on Deep Learning[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2019, 39(11): 1155-1159. DOI: 10.15918/j.tbit1001-0645.2018.396
Authors: WANG Xiao-meng, ZHANG Tao, XIN Wei, HOU Chang-yu
Affiliation: China Information Technology Security Evaluation Center, Beijing 100085, China
Abstract: Traditional source code defect analysis techniques rely mainly on analysts' understanding of security issues and on long-term experience, which leads to high false positive and false negative rates. To address this, a source code defect detection method based on deep learning is proposed. First, the method takes the abstract syntax tree and data-flow features of program source code as input and trains a source code defect classifier to perform detection. Second, deep learning is combined with word embedding algorithms from natural language processing to learn the semantic and syntactic features contained in the abstract syntax tree and the data flow, and on this basis a general framework for deep-learning-based source code defect detection is proposed. Finally, the public SARD dataset is used to validate the method. The experimental results show that the proposed method learns semantic and syntactic features hidden in the source code and outperforms existing methods in terms of accuracy, recall, false positive rate, and false negative rate.
Keywords: defect detection; deep learning; static analysis; semantic feature; syntactic feature
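
The feature-extraction step described in the abstract (abstract syntax tree plus data-flow features, encoded for a learned embedding) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: Python's standard `ast` module stands in for whatever parser the paper uses, the data-flow analysis is a crude straight-line def-use approximation that ignores control flow, and the helper names `ast_token_sequence`, `def_use_pairs`, `build_vocab`, and `encode` are hypothetical.

```python
import ast
from typing import Dict, List, Tuple


def ast_token_sequence(source: str) -> List[str]:
    """Return the pre-order sequence of AST node type names (syntactic feature)."""
    tokens: List[str] = []

    def visit(node: ast.AST) -> None:
        tokens.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


def def_use_pairs(source: str) -> List[Tuple[str, int, int]]:
    """Return (variable, definition line, use line) triples.

    Assumption: a use is linked to the most recent earlier assignment of the
    same name in line order; branches and loops are not modelled.
    """
    names = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Name)]
    # On a given line, process reads before writes so `x = x + 1` reads the old x.
    names.sort(key=lambda n: (n.lineno, isinstance(n.ctx, ast.Store), n.col_offset))
    last_def: Dict[str, int] = {}
    pairs: List[Tuple[str, int, int]] = []
    for n in names:
        if isinstance(n.ctx, ast.Store):
            last_def[n.id] = n.lineno
        elif isinstance(n.ctx, ast.Load) and n.id in last_def:
            pairs.append((n.id, last_def[n.id], n.lineno))
    return pairs


def build_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Map each token to an integer id; ids 0 and 1 are reserved."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(seq: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Truncate or pad a token sequence to max_len integer ids."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in seq[:max_len]]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


if __name__ == "__main__":
    sample = "a = 1\nb = a + 2\n"
    tokens = ast_token_sequence(sample)
    vocab = build_vocab([tokens])
    print(encode(tokens, vocab, max_len=16))
    print(def_use_pairs(sample))
```

In a full pipeline, the fixed-length id sequences would feed an embedding layer followed by a neural classifier trained on labelled defective and non-defective samples; that part is omitted here because the abstract does not specify the network architecture.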
This article is indexed in CNKI, Wanfang Data, and other databases.