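Unsupervised Text Feature Selection Method Based on Domain Division (基于论域划分的无监督文本特征选择方法)
Cite this article: Zhu Hao-Dong, Wu Huai-Guang. Unsupervised Text Feature Selection Method Based on Domain Division[J]. Science Technology and Engineering (科学技术与工程), 2013, 13(7): 1836-1839.
Authors: Zhu Hao-Dong (朱颢东), Wu Huai-Guang (吴怀广)
Affiliation: Zhengzhou University of Light Industry (both authors)
Funding: National Natural Science Foundation of China (61201447)
Abstract: Because class information is unavailable, the problem of unsupervised text feature selection has not yet been solved satisfactorily. This paper studies the problem and proposes an unsupervised text feature selection method based on domain division. The method introduces the idea of domain division into unsupervised text feature selection. It first performs a preliminary selection with a new unsupervised document frequency measure to filter out low-frequency noise words, and then refines the selection with the proposed attribute reduction based on domain division. Experimental results show that the method compensates for the lack of prior class knowledge in text clustering and handles unsupervised text feature selection well.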

Keywords: text clustering; feature selection; document frequency; domain division
Received: 2012/10/21
Revised: 2012/10/21

Unsupervised Text Feature Selection Method Based on Domain Division
Zhu Hao-Dong and Wu Huai-Guang. Unsupervised Text Feature Selection Method Based on Domain Division[J]. Science Technology and Engineering, 2013, 13(7): 1836-1839.
Authors: Zhu Hao-Dong and Wu Huai-Guang
Institution:Zhengzhou University of Light Industry
Abstract: Due to the lack of class labels, the unsupervised text feature selection problem has not been resolved effectively. This paper studies the problem and proposes an unsupervised text feature selection method based on domain division. The method introduces the idea of domain division into unsupervised text feature selection. It first uses a new unsupervised document frequency measure to filter out low-frequency noise words, and then applies the proposed attribute reduction based on domain division to refine the selected features. The experimental results show that the method can compensate for the lack of prior class knowledge in text clustering and solves the unsupervised text feature selection problem well.
Keywords:Text Clustering  Feature Selection  Document Frequency  Domain Division  
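
The abstract does not specify the new unsupervised document frequency measure, so the following is only a minimal sketch of the first stage using ordinary document frequency: a term is kept if it occurs in at least `min_df` documents. The function names, the `min_df` threshold, and the toy corpus are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it."""
    df = Counter()
    for tokens in docs:
        # set() ensures a term is counted once per document
        df.update(set(tokens))
    return df


def filter_low_frequency(docs, min_df=2):
    """Drop terms that occur in fewer than min_df documents.

    Plain document frequency is used here as a stand-in; the paper's
    own measure is not described in the abstract.
    """
    df = document_frequency(docs)
    vocab = {term for term, count in df.items() if count >= min_df}
    filtered = [[t for t in tokens if t in vocab] for tokens in docs]
    return filtered, sorted(vocab)


# Hypothetical toy corpus of tokenized documents
docs = [
    ["text", "clustering", "feature"],
    ["feature", "selection", "text"],
    ["rough", "set", "feature"],
]
filtered, vocab = filter_low_frequency(docs, min_df=2)
print(vocab)
```

In this sketch the surviving vocabulary would then be passed to the second stage, the attribute reduction based on domain division, which the abstract names but does not detail.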
This article is indexed in CNKI and other databases.