J4 ›› 2010, Vol. 28 ›› Issue (03): 298-.

• Article

Simple and Efficient Algorithm for Spam Filter

LIANG Hao, XU Chang-geng, LIN He-ping

  1. School of Computer, Northeast Normal University, Changchun 130117, China
  • Online: 2010-05-30 Published: 2010-06-12
  • Corresponding author: LIANG Hao (b. 1984), male, from Qinhuangdao, Hebei; master's student at Northeast Normal University; main research interest: data mining. Tel: 86-15543675247; E-mail: liangh656@nenu.edu.cn
  • About the authors: LIANG Hao (see the corresponding author entry); LIN He-ping (b. 1956), male (Manchu), from Changchun; professor and master's supervisor at Northeast Normal University; main research interests: artificial intelligence, computer graphics, and computational engineering. Tel: 86-13500810142; E-mail: linhp@nenu.edu.cn

Abstract:

To improve the accuracy and efficiency of spam filtering for e-mail, this paper improves on traditional spam filtering algorithms, taking the Naive Bayes algorithm and the K-Nearest Neighbors (KNN) algorithm as its basis. It introduces two concepts, the legitimate attribute and the nonlicet (illegitimate) attribute of an e-mail, and proposes a new classification algorithm built on them: SEASF (Simple and Efficient Algorithm to Spam Filter based on legitimate attribute and nonlicet attribute). SEASF has low computational complexity, so it is suitable for large-scale settings and for online e-mail filtering. Applied to spam filtering, SEASF substantially improves classification accuracy, with higher recall and precision, and its classification speed is satisfactory.

Key words: spam filter, K-nearest neighbors (KNN), naive Bayes algorithm
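
The abstract names Naive Bayes and KNN as the two traditional filters that SEASF builds on, but it does not describe SEASF itself. For orientation only, the following is a minimal sketch of those two baselines on bag-of-words features. It is not the authors' method. The whitespace tokenizer, the add-one smoothing, the cosine similarity measure, the toy messages, and all names (`NaiveBayesSpamFilter`, `knn_predict`, `TRAIN_MESSAGES`, `TRAIN_LABELS`) are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative assumption)."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, messages, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)  # messages per class
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def log_score(self, text, label):
        """Log prior plus summed log likelihoods of the known words."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for word in tokenize(text):
            if word in self.vocab:  # skip words never seen in training
                score += math.log((counts[word] + 1) / denom)
        return score

    def predict(self, text):
        return max(self.classes, key=lambda c: self.log_score(text, c))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(text, messages, labels, k=3):
    """Majority vote among the k training messages most similar to text."""
    query = Counter(tokenize(text))
    ranked = sorted(
        zip(messages, labels),
        key=lambda pair: cosine_similarity(query, Counter(tokenize(pair[0]))),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training data, invented for this sketch.
TRAIN_MESSAGES = [
    "win cash prize now",
    "cheap pills win money",
    "free prize claim now",
    "meeting agenda for monday",
    "project report attached",
    "lunch on monday",
]
TRAIN_LABELS = ["spam", "spam", "spam", "ham", "ham", "ham"]

if __name__ == "__main__":
    nb = NaiveBayesSpamFilter().fit(TRAIN_MESSAGES, TRAIN_LABELS)
    print(nb.predict("claim your free cash prize"))
    print(knn_predict("project meeting on monday", TRAIN_MESSAGES, TRAIN_LABELS))
```

In this sketch the two baselines differ in where the work happens. Naive Bayes scores a message from counts gathered during training, while KNN compares each incoming message against every stored training message. This is one reason low computational complexity matters for the large-scale and online filtering that the abstract targets.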

CLC number:

  • TP391