基于特征相关的改进加权朴素贝叶斯分类算法
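Cite this article: 饶丽丽, 刘雄辉, 张东站. 基于特征相关的改进加权朴素贝叶斯分类算法 [J]. 厦门大学学报(自然科学版), 2012, 51(4): 682-685.
Authors: 饶丽丽, 刘雄辉, 张东站
Affiliations: 1. School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005, China
2. Department of Information Technology, Longyan Tobacco Industrial Co., Ltd., Longyan, Fujian 364021, China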
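Abstract: The naive Bayes classification algorithm assumes strong independence among features, an assumption that is rarely satisfied in practice. To relax this assumption to some extent, this paper proposes an improved weighted naive Bayes classification algorithm based on feature correlation. The algorithm uses a new weighting scheme that builds on the traditional term frequency-inverse document frequency (TF-IDF) weight: it takes into account how each feature is distributed within and between classes, and it also uses the correlation between features to adjust the weight, increasing the weight of the features that best represent their class. This scheme is called TF-IDF-FC weighting. Compared with weighted naive Bayes classification based on the traditional TF-IDF weight and with other commonly used weighted naive Bayes algorithms, such as attribute-weighted naive Bayes classification, the proposed algorithm achieves somewhat better classification results.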

Keywords: naive Bayes text classifier; weighted naive Bayes text classification algorithm; TF-IDF weight; correlation between features

An Improved Weighted Naive Bayes Classification Algorithm Using Feature Correlation
RAO Li-li, LIU Xiong-hui, ZHANG Dong-zhan. An Improved Weighted Naive Bayes Classification Algorithm Using Feature Correlation [J]. Journal of Xiamen University (Natural Science), 2012, 51(4): 682-685.
Authors: RAO Li-li, LIU Xiong-hui, ZHANG Dong-zhan
Institution: 1. School of Information Science and Technology, Xiamen University, Xiamen 361005, China; 2. Department of Information Technology, Longyan Tobacco Industrial Co., Ltd., Longyan 364021, China
Abstract: The strong independence among features assumed by the naive Bayes classification algorithm is very difficult to satisfy in practice. To relax this condition to some extent, this paper puts forward an improved weighted naive Bayes classification algorithm using feature correlation. The algorithm adopts a new weighting method called TF-IDF-FC weight calculation. Building on the traditional TF-IDF weight calculation, it takes into account the feature distribution within and between classes and adjusts each feature weight in combination with feature correlation, so that the features that best represent their class receive larger weights. Compared with weighted naive Bayes classification based on the traditional TF-IDF weight and with other commonly used weighted naive Bayes classification algorithms, such as attribute-weighted naive Bayes classification, this algorithm improves classification performance to a certain extent.
Keywords: naive Bayes text classification; weighted naive Bayes text classification; TF-IDF weight; feature correlation
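
For orientation, the baseline that the abstract compares against (weighted naive Bayes with traditional TF-IDF weights) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not give the TF-IDF-FC formula, so the class-distribution and feature-correlation adjustments are not reproduced here. The function names (idf_weights, train, predict), the smoothed IDF formula, the Laplace smoothing, and the scoring rule log P(c) + sum of tf * idf * log P(t|c) are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def idf_weights(docs):
    """Smoothed IDF per term: log((1 + N) / (1 + df)) + 1 (assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def train(docs, labels):
    """Estimate class priors and Laplace-smoothed term likelihoods."""
    idf = idf_weights(docs)
    vocab = set(idf)
    class_sizes = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        term_counts[label].update(doc)

    log_prior = {c: math.log(k / len(docs)) for c, k in class_sizes.items()}
    log_like = {}
    for c in class_sizes:
        total = sum(term_counts[c].values()) + len(vocab)
        log_like[c] = {
            t: math.log((term_counts[c][t] + 1) / total) for t in vocab
        }
    return idf, log_prior, log_like


def predict(model, doc):
    """Pick the class maximizing log P(c) + sum_t tf(t) * idf(t) * log P(t|c)."""
    idf, log_prior, log_like = model
    # Terms never seen in training carry no weight and are skipped.
    tf = Counter(t for t in doc if t in idf)
    scores = {
        c: log_prior[c]
        + sum(count * idf[t] * log_like[c][t] for t, count in tf.items())
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy corpus (invented for illustration only).
docs = [
    ["ball", "goal", "team"],
    ["team", "match", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "price"],
]
labels = ["sports", "sports", "finance", "finance"]
model = train(docs, labels)
print(predict(model, ["goal", "team", "price"]))
```

In this form each term's log-likelihood is scaled by a single per-term weight. A scheme such as the TF-IDF-FC weighting described in the abstract would replace the tf * idf factor with a weight that also reflects the within-class and between-class distribution of the term and its correlation with other terms.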
This article is indexed in CNKI, Wanfang Data, and other databases.