A Spam Identification Method Based on User Feedback and Incremental Learning

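Cite this article:	WANG Xin, TRAN Quang Anh, DUAN Haixin, LI Xuenong. A spam identification method based on user feedback and incremental learning[J]. Journal of Tsinghua University (Science and Technology), 2006, 46(1): 70-73.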

Authors:	WANG Xin, TRAN Quang Anh, DUAN Haixin, LI Xuenong

Affiliation:	Information Network Engineering Research Center, Tsinghua University, Beijing 100084

Funding:	Project funded by the Chinese Academy of Sciences; National Key Basic Research and Development Program (973 Program)

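Abstract:	To improve the accuracy of spam identification and reduce misclassification, an interactive spam identification method is proposed. The method identifies spam with a set of rules that carry specific weights; the weight distribution is trained with an improved genetic algorithm. Interaction between users and the server is added to collect user reports of misclassified messages, and incremental learning uses this feedback to adjust the rule weights dynamically. The method was implemented as an extension of SpamAssassin and tested on a mail server. In the experiments, the misclassification rate fell by about 10% without affecting the spam detection rate. The results show that the method not only reduces misclassification effectively but also avoids tedious retraining and speeds up rule-weight updates.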
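Keywords:	pattern recognition; e-mail; spam identification; improved genetic algorithm; user feedback; incremental learning
Article ID:	1000-0054(2006)01-0070-04
Revised:	December 28, 2004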
Incremental learning based on interactive spam filter

WANG Xin, TRAN Quang Anh, DUAN Haixin, LI Xuenong. Incremental learning based on interactive spam filter[J]. Journal of Tsinghua University (Science and Technology), 2006, 46(1): 70-73.

Authors:	WANG Xin, TRAN Quang Anh, DUAN Haixin, LI Xuenong

Abstract:	An interactive spam filter was developed to reduce misclassification rates when filtering spam. A set of weighted rules is used to filter spam, with the weights selected by an improved genetic algorithm. User feedback on misclassified messages is collected, and incremental learning uses it to adjust the rule weights dynamically, reducing both false positives and false negatives. The filtering method was implemented by extending SpamAssassin and was tested on an email server at CCERT (CERNET Computer Emergency Response Team) at Tsinghua University. Test results show that the method effectively reduces misclassifications without affecting spam filtering quality.
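
The abstract describes two mechanisms: scoring a message by summing the weights of the rules it triggers, and adjusting those weights incrementally when a user reports a misclassification. The sketch below illustrates that general idea only. It is not the authors' algorithm: the perceptron-style update, the rule names, the threshold of 5.0, and the step size of 0.5 are all assumptions chosen for illustration, and the genetic-algorithm training of the initial weights is not shown.

```python
from typing import Dict, Set


def score(weights: Dict[str, float], fired: Set[str]) -> float:
    """Sum the weights of the rules that fired on a message."""
    return sum(weights[rule] for rule in fired)


def is_spam(weights: Dict[str, float], fired: Set[str],
            threshold: float = 5.0) -> bool:
    """Classify as spam when the total score reaches the threshold."""
    return score(weights, fired) >= threshold


def apply_feedback(weights: Dict[str, float], fired: Set[str],
                   user_says_spam: bool, threshold: float = 5.0,
                   step: float = 0.5) -> Dict[str, float]:
    """Return updated weights after one piece of user feedback.

    Assumed update rule (not from the paper): if the filter was wrong,
    nudge every rule that fired by `step` toward the user's label.
    Correctly classified messages leave the weights unchanged.
    """
    updated = dict(weights)
    if is_spam(weights, fired, threshold) == user_says_spam:
        return updated
    direction = 1.0 if user_says_spam else -1.0
    for rule in fired:
        updated[rule] += direction * step
    return updated


# Hypothetical rule names and starting weights, for illustration only.
weights = {
    "HYPOTHETICAL_SUBJECT_ALL_CAPS": 2.0,
    "HYPOTHETICAL_HTML_ONLY_BODY": 2.0,
    "HYPOTHETICAL_KNOWN_SENDER": -1.0,
}
fired = {"HYPOTHETICAL_SUBJECT_ALL_CAPS", "HYPOTHETICAL_HTML_ONLY_BODY"}

before = is_spam(weights, fired)   # score 4.0, below threshold
weights_after = apply_feedback(weights, fired, user_says_spam=True)
after = is_spam(weights_after, fired)  # score 5.0, reaches threshold
print(before, after)
```

Because only the rules that fired on the reported message are touched, a single report updates the weights immediately, with no retraining over the full corpus; this is the property the abstract credits for faster weight updates.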

Keywords:	pattern classification; electronic mail; spam filtering; improved genetic algorithm; user feedback; incremental learning
This article is indexed in CNKI, Wanfang Data, and other databases.
