
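Automatic Classification of Mayor's Public Telephone Line Texts Based on Multiple Hypothesis Testing (基于多重假设检验市长公开电话文本的自动分类)
Cite this article: HAO Li-zhu (郝立柱), ZHAO Shi-shun (赵世舜), HAO Li-li (郝立丽). Automatic Classification of Mayor's Public Telephone Line Texts Based on Multiple Hypothesis Testing [J]. Journal of Jilin University (Science Edition) (吉林大学学报(理学版)), 2008, 46(6): 1101-1104.
Authors: HAO Li-zhu (郝立柱), ZHAO Shi-shun (赵世舜), HAO Li-li (郝立丽)
Affiliation: Institute of Mathematics, Jilin University, Changchun 130012
Abstract: This paper proposes a feature-weighted naive Bayes classification algorithm based on multiple hypothesis testing. The algorithm obtains several sets of feature words through feature selection, then assigns each set its own weight coefficient according to the multiple hypothesis testing error rate, and these weights take part in building the classifier. The method has been applied to text classification for the mayor's public telephone line: three feature-weighted naive Bayes classifiers were built to classify complaint texts automatically by computer, with better classifier efficiency and accuracy than traditional methods.

Keywords: multiple hypothesis testing; text classification; feature weighting; mayor's public telephone line
Received: 2008-01-23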

Text Automatic Classification Based on Multiple Hypothesis Testing in the Mayor's Public Access Line Project
HAO Li-zhu, ZHAO Shi-shun, HAO Li-li. Text Automatic Classification Based on Multiple Hypothesis Testing in the Mayor's Public Access Line Project[J]. Journal of Jilin University: Sci Ed, 2008, 46(6): 1101-1104.
Authors:HAO Li-zhu  ZHAO Shi-shun  HAO Li-li
Institution:Institute of Mathematics, Jilin University, Changchun 130012, China
Abstract: On the basis of multiple hypothesis testing, we propose a feature-weighted naive Bayesian algorithm. The algorithm produces several sets of feature words by means of feature selection and assigns each set a weight coefficient determined by the error rate of the multiple hypothesis test; these coefficients are then used to construct the classifier. The algorithm was applied to text classification in the mayor's public access line project, where complaint texts were classified automatically by three feature-weighted naive Bayesian classifiers. Compared with traditional methods, the resulting classifiers achieve higher efficiency and accuracy.
Keywords: multiple hypothesis testing; text classification; feature weighting; the mayor's public access line project
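
The abstract describes the weighting idea only in outline, so the following Python sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes that each feature-word set comes with a known error rate, that a set's weight is simply 1 minus that error rate, and that the weight scales the word's log-likelihood in a multinomial naive Bayes score. The class name WeightedNaiveBayes, the helper weights_from_feature_sets, the category labels, and the toy complaint data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def weights_from_feature_sets(feature_sets):
    """Map each word to a weight derived from its feature set's error rate.

    ASSUMPTION: weight = 1 - error_rate. The abstract does not give the formula.
    A word that appears in several sets keeps its largest weight.
    """
    weights = {}
    for words, error_rate in feature_sets:
        for w in words:
            weights[w] = max(weights.get(w, 0.0), 1.0 - error_rate)
    return weights


class WeightedNaiveBayes:
    """Multinomial naive Bayes whose per-word log-likelihoods are scaled by weights."""

    def __init__(self, feature_weights, alpha=1.0):
        self.feature_weights = feature_weights  # word -> weight; other words are ignored
        self.alpha = alpha                      # Laplace smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for words, label in zip(docs, labels):
            for w in words:
                if w in self.feature_weights:
                    self.word_counts[label][w] += 1
        self.totals = {c: sum(self.word_counts[c].values()) for c in self.class_counts}
        return self

    def log_scores(self, words):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.feature_weights)
        scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / n_docs)  # log prior
            for w in words:
                if w in self.feature_weights:
                    p = (self.word_counts[c][w] + self.alpha) / (
                        self.totals[c] + self.alpha * vocab_size
                    )
                    score += self.feature_weights[w] * math.log(p)  # weighted log-likelihood
            scores[c] = score
        return scores

    def predict(self, words):
        scores = self.log_scores(words)
        return max(scores, key=scores.get)


# Hypothetical feature sets with made-up error rates, plus toy complaint texts.
feature_sets = [
    ({"noise", "construction"}, 0.01),
    ({"water", "pipe"}, 0.05),
    ({"street"}, 0.20),
]
docs = [
    ["noise", "construction", "street"],
    ["noise", "night", "construction"],
    ["water", "pipe", "street"],
    ["pipe", "leak", "water"],
]
labels = ["environment", "environment", "utilities", "utilities"]

weights = weights_from_feature_sets(feature_sets)
clf = WeightedNaiveBayes(weights).fit(docs, labels)
print(clf.predict(["construction", "noise", "street"]))
```

With all weights equal to 1 the score reduces to ordinary multinomial naive Bayes, so the weights only change how strongly each feature set influences the decision; words from sets with a higher error rate contribute less.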
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).