Web Mining Based on Improved MapReduce Model
Cite this article: YING Yi, REN Kai, CAO Yang. Web Mining Based on Improved MapReduce Model[J]. Science Technology and Engineering, 2013, 13(5): 1205-1209
Authors: YING Yi, REN Kai, CAO Yang
Affiliations: Sanjiang University; Jinling College, Nanjing University; College of Computer Science and Engineering, Sanjiang University
Abstract: Web mining systems based on a single server lack the computing power to process massive data sets. To address this problem, a cloud-computing-based mining method is proposed in which large data sets and mining tasks are decomposed across multiple computers and processed in parallel. A parallel Web mining platform is implemented on the open-source Hadoop framework, and an improved MapReduce model, MapReduce-LP, is proposed. Mining the Web logs of an e-commerce system verifies the effectiveness of the system and the efficiency of the new model. Experiments show that using cloud-computing technology to process large data sets on a cluster significantly improves mining efficiency.

Keywords: Web mining; cloud-computing technology; Hadoop; MapReduce-LP model; Web log mining
Received: 2012-08-09
Revised: 2012-08-28
This article is indexed in CNKI and other databases.