
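Robust Boundary-Enhanced GMM-SMOTE Software Defect Detection Method (稳健边界强化GMM-SMOTE软件缺陷检测方法)
Cite this article: 罗森林, 苏霞, 潘丽敏. 稳健边界强化GMM-SMOTE软件缺陷检测方法[J]. 北京理工大学学报, 2021, 41(3): 303-310.
Authors: 罗森林 (LUO Senlin), 苏霞 (SU Xia), 潘丽敏 (PAN Limin)
Affiliation: School of Information and Electronics, Beijing Institute of Technology (北京理工大学信息与电子学院), Beijing 100081
Funding: National Science and Technology Support Program of the 13th Five-Year Plan (SQ2018YFC200004)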
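Abstract (translated from the Chinese): Automated defect detection models built on software big data have become an important tool for defect discovery. In software big data, accurately labeled defect samples are scarce and the rates of missing and incorrect labels are high, so existing machine-learning data-balancing methods tend to amplify noise and blur the classification boundary. To address these problems, a robust boundary-enhanced GMM-SMOTE software defect detection method is proposed. The method uses Gaussian mixture clustering to partition the software sample set into multiple clusters, selects reliable samples based on the within-cluster class ratio, and identifies boundary samples through posterior probabilities. These results guide a weighted data-balancing step, and the balanced data are finally used to build the software defect detection model. Experiments on multiple public NASA datasets show that GMM-SMOTE achieves data balancing with noise suppression and boundary enhancement, effectively improves software defect identification, and has substantial practical value.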

Keywords (translated): software defect detection  data imbalance  oversampling  Gaussian mixture model
Received: 2019/12/17

Robust Boundary-Enhanced GMM-SMOTE Software Defect Detection Method
LUO Senlin, SU Xia, PAN Limin. Robust Boundary-Enhanced GMM-SMOTE Software Defect Detection Method[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2021, 41(3): 303-310.
Authors:LUO Senlin  SU Xia  PAN Limin
Institution:School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
Abstract:Software defects are bugs that can disrupt the normal operation of a system or software, and detecting and locating them is costly. Automatic defect detection models based on software data have become an important tool for defect discovery. Accurately labeled defective samples are rare, and the rates of missing and incorrect labels are high, which causes existing data-balancing methods to amplify noise and blur classification boundaries. To solve this problem, a robust boundary-enhanced GMM-SMOTE software defect detection method is proposed. The method uses Gaussian mixture clustering to divide the software data set into multiple clusters, selects reliable samples based on the intra-cluster class ratio, and identifies boundary samples from posterior probabilities. These results guide a weighted data-balancing step, and the balanced data are then used to build the software defect detection model. Experimental results on multiple public NASA data sets show that GMM-SMOTE balances the data while suppressing noise and enhancing boundaries, effectively improving software defect detection and offering substantial practical value.
Keywords:software defect detection  data imbalance  oversampling  Gaussian mixture model
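
The abstract names three steps (reliable-sample selection by intra-cluster class ratio, boundary recognition from posterior probabilities, and weighted balancing) without giving formulas. The Python sketch below is one possible reading of that pipeline, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function name `gmm_smote_sketch`, the threshold `min_ratio`, the neighbor count `k`, and the boundary weight `1 - max posterior`. The posterior matrix `post` is assumed to come from a Gaussian mixture fitted elsewhere (for example, the output of scikit-learn's `GaussianMixture.predict_proba`).

```python
import math
import random


def gmm_smote_sketch(X, y, post, n_new, min_ratio=0.5, k=3, seed=0):
    """Generate n_new synthetic minority (label 1) samples.

    X    : list of feature vectors (lists of floats)
    y    : list of labels, 1 = defective (minority), 0 = clean
    post : post[i][c] = posterior of sample i under mixture component c,
           assumed to be produced by a GMM fitted elsewhere
    """
    rng = random.Random(seed)
    # Hard cluster assignment: component with the largest posterior.
    cluster = [max(range(len(p)), key=p.__getitem__) for p in post]

    # Step 1 (assumed rule): keep minority samples only from clusters whose
    # minority ratio reaches min_ratio; others are treated as likely noise.
    members = {}
    for i, c in enumerate(cluster):
        members.setdefault(c, []).append(i)
    reliable = []
    for idx in members.values():
        ratio = sum(y[i] for i in idx) / len(idx)
        if ratio >= min_ratio:
            reliable.extend(i for i in idx if y[i] == 1)
    if not reliable:
        return []

    # Step 2 (assumed rule): a low maximum posterior means ambiguous cluster
    # membership, read here as "near a boundary" and given a larger weight.
    weights = [1.0 - max(post[i]) + 1e-6 for i in reliable]

    # Step 3: SMOTE-style interpolation, with seeds drawn by boundary weight.
    synthetic = []
    for _ in range(n_new):
        s = rng.choices(reliable, weights=weights, k=1)[0]
        mates = [j for j in reliable if j != s and cluster[j] == cluster[s]]
        if not mates:
            synthetic.append(list(X[s]))  # no neighbor: duplicate the seed
            continue
        mates.sort(key=lambda j: math.dist(X[s], X[j]))
        n = rng.choice(mates[:k])  # one of the k nearest reliable neighbors
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(X[s], X[n])])
    return synthetic


# Toy data: cluster 0 is mostly defective, cluster 1 is mostly clean.
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2],
     [5.0, 5.0], [5.2, 5.1], [5.1, 4.9], [4.9, 5.2]]
y = [1, 1, 1, 0, 0, 0, 0, 1]
post = [[0.95, 0.05], [0.9, 0.1], [0.6, 0.4], [0.8, 0.2],
        [0.1, 0.9], [0.05, 0.95], [0.2, 0.8], [0.3, 0.7]]
synthetic = gmm_smote_sketch(X, y, post, n_new=5)
```

In this toy run the lone defective sample in the mostly clean cluster (index 7) is excluded as probable label noise, so every synthetic point is interpolated between the three defective samples of cluster 0. Sample 2, whose maximum posterior is only 0.6, is the most likely seed under the assumed weighting.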
This article is indexed in CNKI, 万方数据 (Wanfang Data), and other databases.