
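An email community clustering method based on EVS similarity
Cite this article: WANG Fang, GUO Hua-ping, NIU Chang-yong, FAN Ming. An email community clustering method based on EVS similarity [J]. Journal of Shandong University (Natural Science), 2010, 45(3): 34-40.
Authors: WANG Fang  GUO Hua-ping  NIU Chang-yong  FAN Ming
Affiliation: School of Information Engineering, Zhengzhou University, Zhengzhou 450052, Henan, China
Abstract: The core of any clustering method is how to measure the proximity between objects. This paper introduces a vector representation of email features, constructs an email feature matrix, and fits the communication feature information between emails with a transformed extreme value distribution function model. On this basis, a new proximity measure, extreme value distribution similarity (EVS), is proposed to guide email community partitioning. The effectiveness of the measure is verified with a micro-clustering/macro-clustering email community partitioning algorithm. Experiments show that, on the test dataset, the partitioning algorithm that uses EVS as its criterion discovers high-quality email communities more effectively than the same algorithm using classical proximity measures such as cosine similarity and PCC.

Keywords: social network  email community partitioning  extreme value distribution  EVS similarity
Received: 2009-12-30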

New email community clustering method based on EVS similarity  
WANG Fang,GUO Hua-ping,NIU Chang-yong,FAN Ming.New email community clustering method based on EVS similarity[J].Journal of Shandong University,2010,45(3):34-40.
Authors:WANG Fang  GUO Hua-ping  NIU Chang-yong  FAN Ming
Institution:School of Information and Engineering, Zhengzhou University, Zhengzhou 450052, Henan, China
Abstract:Measuring the proximity between objects is a key problem in clustering. An email feature vector is introduced, and an email feature matrix is constructed. The email feature information is fitted with a transformed extreme value distribution function model. On this basis, EVS (extreme value distribution similarity) is proposed for email community clustering. The effectiveness of the new measure is verified with a micro-macro clustering algorithm. Experiments show that, compared with cosine similarity and the Pearson correlation coefficient, the algorithm using the proposed similarity measure identifies higher-quality communities.
Keywords:social network  email community partition  extreme value distribution  EVS similarity
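
For orientation, the two baseline proximity measures named in the abstract, cosine similarity and the Pearson correlation coefficient (PCC), can be sketched as follows. This is only an illustrative sketch: the function names and the toy feature vectors are assumptions, not taken from the paper, and the EVS measure itself is not reproduced here because the abstract does not give its definition.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: an all-zero vector
        # is treated as having no similarity to anything.
        return 0.0
    return dot / (norm_u * norm_v)


def pearson_correlation(u, v):
    """PCC, computed as the cosine similarity of the mean-centred vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    centred_u = [a - mean_u for a in u]
    centred_v = [b - mean_v for b in v]
    return cosine_similarity(centred_u, centred_v)


# Hypothetical feature vectors for two mailboxes (illustrative values only).
mailbox_a = [1.0, 2.0, 3.0]
mailbox_b = [4.0, 5.0, 6.0]

cos_ab = cosine_similarity(mailbox_a, mailbox_b)
pcc_ab = pearson_correlation(mailbox_a, mailbox_b)
```

In this toy case the second vector is the first shifted by a constant, so the PCC treats the two as perfectly correlated while the cosine similarity stays slightly below 1. Differences of this kind are one reason the choice of proximity measure affects the resulting clusters.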