
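Efficient incremental semi-supervised feature selection based on eigenspace and manifold regularization
Cite this article: GU Nannan, LI Luyun, SHI Chunhong, CHEN Qingmei. Efficient incremental semi-supervised feature selection based on eigenspace and manifold regularization[J]. Systems Engineering - Theory & Practice, 2020, 40(11): 2968-2980.
Authors: GU Nannan  LI Luyun  SHI Chunhong  CHEN Qingmei
Affiliation: School of Statistics, Capital University of Economics and Business, Beijing 100070, China
Funding: National Natural Science Foundation of China (61503263)
Abstract: Sequential forward selection based on the Fisher Score is a well-performing and widely used supervised feature selection method. However, it can only analyze labeled samples and cannot exploit the information in the large number of "cheap" unlabeled samples; moreover, as the number of selected features grows, the computational cost of scoring candidate features grows cubically. To address these two problems, an efficient incremental semi-supervised feature selection method based on eigenspace and manifold regularization is proposed. On the one hand, the method performs semi-supervised feature selection by extracting local linear representations of the labeled and unlabeled data, so that the selected features preserve the local manifold structure of the data. On the other hand, the method scores features using eigenspace theory, so its time complexity depends on the dimension of the feature space rather than on the number of selected features; if the dimension of the feature space is fixed, the method takes nearly constant time to evaluate each candidate feature. Compared with the cubic complexity of Fisher Score-based sequential forward selection, the proposed method is substantially more time-efficient. Experiments on five standard data sets confirm the effectiveness of the method.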

Keywords: semi-supervised learning  feature selection  manifold regularization  eigenspace
Received: 2020-02-17
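
The cubic cost attributed to the Fisher Score baseline can be illustrated with a minimal sketch of forward selection. The abstract does not state the scoring formula, so the subset criterion used here, trace(S_w^{-1} S_b), is an assumption, as are the function names and the small ridge term; this is not the authors' implementation.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (S_w) and between-class (S_b) scatter of X (samples x features)."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)
    return S_w, S_b

def subset_score(X, y, subset, ridge=1e-6):
    """Assumed criterion: trace(S_w^{-1} S_b) restricted to the columns in `subset`."""
    S_w, S_b = scatter_matrices(X[:, subset], y)
    k = len(subset)
    # Solving a k x k linear system costs O(k^3), so each candidate
    # becomes more expensive to score as the selected set grows.
    return float(np.trace(np.linalg.solve(S_w + ridge * np.eye(k), S_b)))

def forward_select(X, y, n_select):
    """Greedily add the feature that maximizes the subset score."""
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < n_select and remaining:
        best = max(remaining, key=lambda j: subset_score(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Only labeled rows of `X` can be passed in, because both scatter matrices depend on `y`; this is the first limitation the abstract names.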

Efficient semi-supervised feature selection based on eigenspace model and manifold regularization
GU Nannan, LI Luyun, SHI Chunhong, CHEN Qingmei. Efficient semi-supervised feature selection based on eigenspace model and manifold regularization[J]. Systems Engineering - Theory & Practice, 2020, 40(11): 2968-2980.
Authors: GU Nannan  LI Luyun  SHI Chunhong  CHEN Qingmei
Institution: School of Statistics, Capital University of Economics and Business, Beijing 100070, China
Abstract: The forward search method based on the Fisher Score is a widely used supervised feature selection method with good performance. However, it can only analyze labeled samples and cannot exploit the information in large amounts of "cheap" unlabeled samples; moreover, as the number of selected features increases, the computational complexity of scoring candidate features grows cubically. To address these two problems, we propose an efficient semi-supervised feature selection method based on an eigenspace model and manifold regularization. On the one hand, the method performs semi-supervised feature selection by extracting locally linear representations of labeled and unlabeled data, so that the selected features preserve the local manifold structure of the data. On the other hand, the method scores features using eigenspace model theory, so its time complexity depends on the dimension of the feature space rather than on the number of selected features. If the dimension of the eigenspace model is fixed, the method takes almost constant time to evaluate each candidate feature. Compared with the cubic complexity of the Fisher Score forward search method, the proposed method is substantially more time-efficient. Experiments on five standard data sets verify the effectiveness of the method.
Keywords: semi-supervised learning  feature selection  manifold regularization  eigenspace model
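
The "locally linear representations" mentioned in the abstract can be sketched as reconstruction weights in the style of locally linear embedding: each sample is approximated by an affine combination of its nearest neighbours. The abstract does not say how the authors build or use these weights, so the function name, the neighbour count, and the regularization constant below are assumptions for illustration only.

```python
import numpy as np

def local_linear_weights(X, n_neighbors=5, reg=1e-3):
    """LLE-style weights: row i reconstructs sample i from its neighbours and sums to 1.

    Assumes n_neighbors is smaller than the number of samples.
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    # Pairwise squared Euclidean distances; a sample is never its own neighbour.
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(D, np.inf)
    for i in range(n):
        idx = np.argsort(D[i])[:n_neighbors]
        Z = X[idx] - X[i]            # neighbours centred on sample i
        G = Z @ Z.T                  # local Gram matrix, n_neighbors x n_neighbors
        # Regularize so the system stays solvable when G is singular.
        G += reg * (np.trace(G) + 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, idx] = w / w.sum()      # enforce the sum-to-one constraint
    return W
```

No labels enter this computation, so labeled and unlabeled samples can be stacked into one `X`, which is how a manifold term can draw on unlabeled data.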