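A Frequency-Statistics-Based Method for Eliminating Multi-Category Noise
Cite this article: Yin Zhonghang, Wang Yongcheng, Song Juping, Cai Wei. A Frequency-Statistics-Based Method for Eliminating Multi-Category Noise [J]. Journal of Shanghai Jiaotong University, 2003, 37(3): 408-410.
Authors: Yin Zhonghang, Wang Yongcheng, Song Juping, Cai Wei
Affiliation: School of Electronics and Information, Shanghai Jiaotong University, Shanghai 200030, China
Funding: National Natural Science Foundation of China (60082003)
Abstract: This paper analyzes the multi-category text noise found in knowledge bases for automatic classification and proposes a new algorithm that uses frequency statistics to reduce it. Following a theoretical analysis, the concrete implementation steps are discussed, and the noise-reduction effect is tested in a classification experiment on a news corpus. The results show that the method reduces the redundant occurrences of multi-category concepts in the knowledge base and improves the performance metrics of the automatic classification system.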

Keywords: knowledge base; noise reduction; natural language processing
Article ID: 1006-2467(2003)03-0408-03
Revised: January 16, 2002

An Algorithm to Reduce Multi-Category Noise Based on Frequency Statistics
YIN Zhong-hang, WANG Yong-cheng, SONG Ju-ping, CAI Wei. An Algorithm to Reduce Multi-Category Noise Based on Frequency Statistics [J]. Journal of Shanghai Jiaotong University, 2003, 37(3): 408-410.
Authors: YIN Zhong-hang, WANG Yong-cheng, SONG Ju-ping, CAI Wei
Abstract: This paper analyzes the multi-category noise in knowledge bases for automatic text classification and presents a new algorithm that exploits the statistical characteristics of the knowledge base to reduce it. Building on a theoretical analysis, it discusses the concrete steps of the algorithm and tests its effect by classifying news samples. The experimental results indicate that the algorithm clearly decreases the redundant occurrences, in the knowledge base, of concepts from multi-category samples. The performance of automatic classification improved after the revision.
Keywords: knowledge base; noise reduction; natural language processing
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.