首页 | 本学科首页   官方微博 | 高级检索  
     

基于三元闭包和会员闭包的社区发现算法研究
引用本文:许云峰,赵宁,郝雪君,李兵,刘慧娟. 基于三元闭包和会员闭包的社区发现算法研究[J]. 河北科技大学学报, 2014, 35(1): 103-108
作者姓名:许云峰  赵宁  郝雪君  李兵  刘慧娟
作者单位:[1]河北科技大学信息科学与工程学院,河北石家庄050018 [2]河北科技大学人事处,河北石家庄050018
基金项目:基金项目:国家自然科学基金(71271076)
摘    要:随着网络的发展和人们沟通方式的扩展,社交网络影响了人们的生活,改变了人们传播与分享消息的方式,吸引了越来越多的人关注和研究社交网络。社交网络即社交网络服务,源自英文SNS(social network service)的翻译,社交网络有多种表现平台,比如QQ、微博、Facebook和微信。本文主要研究微博这一新兴的社交平台,研究微博的主要目的是搞清用户之间的种种关系。当代人一般认为,微博中存在5种关系即关注关系、提及关系、转发关系、评论关系以及好友关系。由于社交网络中人数众多,关系错综复杂,因而产生的社交数据和传统的数据相比具有数据量大、结构复杂、语义丰富等特点,针对这种情况,依据用户之间的关系,提出了一种基于三元闭包的社区划分算法。该算法首先设初始社区为空,在所有的顶点中,选择度最大的顶点作为初始顶点;然后求初始顶点与其邻接顶点的三元闭包数和顶点属于该社区的概率PS,取它们最大的邻接顶点加入初始顶点所在社区,形成新的社区,继续迭代,当剩余的顶点很少时,可以使用会员闭包和三元闭包这种归集算法把剩余的顶点划分到不同的社区,直到把整个社区划分完毕;最后以图形这种直观、形象的方式把每一个社区表示出来。在该算法中,三元闭包数、顶点属于某社区的概率、扩张度的差是评估复杂网络中顶点划分的关键。该方法综合了顶点全局重要性的特点,即在复杂网络中,三元闭包数越大,它们处在一个社区的可能性就越大;顶点的会员闭包越大,该顶点就会越优先被划分;扩张度的差是确定第i个社区是否被划分完毕的关键。社交网络的研究不仅可以帮助人们了解网络结构、分析网络结构特性、探测分析网络的社团结构,而且还可以把虚拟世界中这种关系链接到现实世界中,即把虚拟关系转化成利润,为企业提供有价值的关系网络,从而挖掘出潜藏在社交网络背后的巨大的经济价值,具体体现在:1)帮助企业找到潜在的商机,比如分析某个用户的评论和发表内容,可知他的消费能力、喜好和最近的购买习惯,从而知道他购买自己产品的概率;2)危机预警,根据用户的消息内容可以知道他对自己产品的满意度;3)带动了消息的传播速度和广度。企业可以利用这一点,为自己的产品更好地做宣传。通过与宽吻海豚网和Zachary空手道俱乐部的社区网络作比较,证明了该算法的有效性和可行性。

关 键 词:社交网络  三元闭包  社区划分
收稿时间:2013-11-01
修稿时间:2013-12-02

A community detection algorithm based on triadic closure and membership closure
XU Yunfeng,ZHAO Ning,HAO Xuejun,LI Bing and LIU Huijuan. A community detection algorithm based on triadic closure and membership closure[J]. Journal of Hebei University of Science and Technology, 2014, 35(1): 103-108
Authors:XU Yunfeng  ZHAO Ning  HAO Xuejun  LI Bing  LIU Huijuan
Affiliation:1. School of Information Science and Engineering, H ebei University of Science and Technology, Shijiazhuang H ebei 050018, China;2. Personnel Department, Hebei University of Science and Technology, Shijiazhuang Hebei 050018, China)
Abstract:With the development of network and the expansion of people's communication ways, the social network penetrates into almost every corner of the entire society, and changes the ways of information communicating and news sharing, attracting more and more people's attention and research on it. Social network, also named social networking service, originated from the translation of British SNS (social network service), was literally translated as social network services or social networking serv- ice in Chinese. There're many manifestations of social networking platforms, such as:QQ, WeChat, Facebook and Micro-biog. In this paper, we mainly focus on the micro- blog, the emerging social platform. The main purpose of research on micro-blog is to find out the various relationships between users. People generally believe there're mainly 5 relations existing in miro-blog among users:the relationship of concerning, mentioning, forwarding, commenting and being friends. Due to the large number of social network users and the complicated relations among them, the generated social data, compared with the traditional da- ta, has the characteristics of large amount of data, complex structure and semantically rich features. So according to the rela- tionship among users, this paper proposes divided community algorithm based on triadic closure. In the first instance, this algorithm took the initial community as being empty, in which the vertex degree maximum among all vertices was chosen as the initial vertex, then requesting for the number of Triadic closures between the initial vertex and the adjacent vertex, and requesting for the probability of vertices belonged to the community. The vertex with the maximum Ps joined the initial vertex community, forming a new initial community. With continued iterating, and by using collection algo- rithm of triadic and membership closure, the remained few vertices could be divided into different communities until the entire community was completely divided. Finally, every community was intuitively and visually presented by Graphics. When using this algorithm, the number of Triadic closure, the probability of vertex belonged to a community and the difference in expansion degree are the keys to value vertices in complex network. This method combines the characteristics of the global importance of vertices. Namely, in complex networks, the greater the number of Triadic closure is, the greater the likelihood of them in a community will be. The greater the vertex membership closures are, the priorer the vertex will be divided. The difference in expansion degree is to determine whether the i community is divided completely or not. The research of social networking can not only help us understand and analyze the network structure, and detect to analyze network, but also can help link the relationship in virtual world to the real world. So the virtual relationship could be trans- ferred into profits, providing valuable network for enterprises, and digging out the great economic value behind the social net- work. It can be embodied like this:firstly, to help companies find potential business opportunities by analyzing users comments and published content to learn their consumption power, preferences and recent buying habits, thus to know the probability that he could purchase products. Second, it can give crisis warning message. According to user's information, products satis- faction degree of users can be learnt. Third, it can drive the information propagation speed and message breadth, of which the enterprises take advantage, achieving better product publicity. Compared with the community network of Bottlenose Dolphin Internet and Zachary, the algorithm mentioned in this paper was proved to be effective and feasible.
Keywords:social network  number of triadic closure  community divide
本文献已被 CNKI 维普 等数据库收录!
点击此处可从《河北科技大学学报》浏览原始摘要信息
点击此处可从《河北科技大学学报》下载免费的PDF全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号