
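A multi-classification method for detecting microblog spam users
Cite this article: YANG Yun, XU Guangxia, LEI Juan. A multi-classification method for detecting microblog spam users[J]. Journal of Chongqing University (Natural Science Edition), 2018, 41(8): 44-55.
Authors: YANG Yun  XU Guangxia  LEI Juan
Affiliations: State Grid Chongqing Information & Telecommunication Company, Chongqing 400014; Postdoctoral Research Station of Chongqing University, Chongqing 400044; State Grid Chongqing Electric Power Co. Electric Power Research Institute, Chongqing 401123
Funding: National Natural Science Foundation of China (61772099), China Postdoctoral Science Foundation (2014M562282), Chongqing Postdoctoral Project (XM2014039), Chongqing Major Theme Special Project for Artificial Intelligence Technology Innovation (cstc2017rgzn-zdyf0140), Chongqing University Outstanding Achievement Transformation Funding Project (KJZH17116)
Abstract: To address the detection of multiple classes of microblog spam users, a detection method based on a fuzzy multi-class support vector machine is designed. First, the one-versus-rest SVM (support vector machine) scheme is used to construct the multi-class classifier, and the training set is re-selected for the classifier of each user class. Then, the constructed training sets are used to train the multi-class classifier, and five user classifiers are obtained after repeated parameter tuning. Finally, samples that the multi-class classifier cannot separate are handled by fuzzy processing: an improved membership function is defined in the direction perpendicular to the SVM optimal separating hyperplane, and each such sample is reclassified into the class with the largest membership degree. Experimental results show that the method resolves the mixed-classification and missed-classification problems of multi-class classification while maintaining spam user detection performance.

Keywords: microblog spam user detection  multi-classification  fuzzy processing  membership function
Received: 2018/4/2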

A multi-classification method for detecting microblog spam users
YANG Yun, XU Guangxia and LEI Juan. A multi-classification method for detecting microblog spam users[J]. Journal of Chongqing University (Natural Science Edition), 2018, 41(8): 44-55.
Authors: YANG Yun, XU Guangxia and LEI Juan
Institution: State Grid Chongqing Information & Telecommunication Company, Chongqing 400014, P. R. China; Postdoctoral Research Station of Chongqing University, Chongqing 400044, P. R. China; State Grid Chongqing Electric Power Co. Electric Power Research Institute, Chongqing 401123, P. R. China
Abstract: To detect multiple classes of microblog spam users, a detection method based on a fuzzy multi-class support vector machine is designed. First, the one-versus-rest SVM (support vector machine) scheme is used to construct the multi-class classifier, and a training set is re-selected for the classifier of each user class. Then, the constructed training sets are used to train the multi-class classifier, and five user classifiers are obtained after repeated parameter tuning. Finally, samples that the multi-class classifier cannot separate are handled by fuzzy processing: an improved membership function is defined in the direction perpendicular to the SVM optimal separating hyperplane, and each such sample is reclassified into the class with the maximum membership degree. Experimental results show that the method resolves the problems of mixed classification (a sample assigned to several classes) and missed classification (a sample assigned to no class) in multi-class classification while maintaining spam user detection performance.
Keywords: microblog spammer detection  multi-classification  fuzzy processing  membership function
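
The reclassification step described in the abstract can be illustrated with a minimal Python sketch. It assumes the five one-versus-rest classifiers have already produced signed decision values for a sample (positive meaning "belongs to this class"). The abstract does not give the paper's improved membership function, so the clipped-linear membership below is a stand-in assumption, and the function names `crisp_vote`, `memberships`, and `classify` are hypothetical.

```python
from typing import List, Optional, Sequence


def crisp_vote(decisions: Sequence[float]) -> Optional[int]:
    """Return the class index if exactly one one-vs-rest classifier fires.

    Returns None when several classifiers fire (mixed classification)
    or none fires (missed classification).
    """
    positive = [k for k, d in enumerate(decisions) if d > 0]
    return positive[0] if len(positive) == 1 else None


def memberships(decisions: Sequence[float]) -> List[float]:
    """Clipped-linear membership of the sample in each class.

    ASSUMPTION: this is a simple stand-in, not the paper's improved function.
    For class i, the sample's own classifier contributes min(1, d_i) and every
    other classifier j contributes min(1, -d_j); the class membership is the
    minimum of these values.
    """
    result = []
    for i, d_i in enumerate(decisions):
        own = min(1.0, d_i)
        others = [min(1.0, -d_j) for j, d_j in enumerate(decisions) if j != i]
        result.append(min([own] + others))
    return result


def classify(decisions: Sequence[float]) -> int:
    """Use the crisp vote when it is unambiguous, else the largest membership."""
    label = crisp_vote(decisions)
    if label is not None:
        return label
    m = memberships(decisions)
    return max(range(len(m)), key=m.__getitem__)


# Hypothetical decision values for three samples and five user classes.
clear_sample = [0.8, -1.2, -0.5, -2.0, -0.3]   # only classifier 0 fires
mixed_sample = [0.4, 0.9, -1.5, -1.0, -2.0]    # classifiers 0 and 1 both fire
missed_sample = [-0.2, -0.7, -1.5, -0.1, -3.0]  # no classifier fires

print(classify(clear_sample), classify(mixed_sample), classify(missed_sample))
```

With this particular membership, the fallback picks the class whose classifier gives the largest decision value, so every mixed or missed sample still receives exactly one label.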
This article is indexed in CNKI, Wanfang Data, and other databases.