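Web Mining Based on Improved MapReduce Model
Cite this article: YING Yi, REN Kai, CAO Yang. Web Mining Based on Improved MapReduce Model[J]. Science Technology and Engineering, 2013, 13(5): 1205-1209.
Authors: YING Yi, REN Kai, CAO Yang
Affiliations: Sanjiang University; Jinling College, Nanjing University; College of Computer Science and Engineering, Sanjiang University
Abstract: Web mining systems built on a single server lack the computing power to process massive data sets. To address this problem, a mining method based on cloud computing is proposed, in which large data sets and mining tasks are decomposed across multiple computers and processed in parallel. A parallel Web mining platform is implemented on the open-source Hadoop framework, and an improved MapReduce model, MapReduce-LP, is proposed. Mining the Web logs of an e-commerce system verifies the effectiveness of the system and the efficiency of the new model. Experiments show that using cloud computing technology in a cluster to process large data sets clearly improves mining efficiency.

Keywords: Web mining; cloud computing technology; Hadoop; MapReduce-LP model; Web log mining
Received: 2012/8/9 22:15:22
Revised: 2012/8/28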

Web Mining Based on Improved MapReduce Model
YING Yi, REN Kai and CAO Yang. Web Mining Based on Improved MapReduce Model[J]. Science Technology and Engineering, 2013, 13(5): 1205-1209.
Authors: YING Yi, REN Kai and CAO Yang
Institution: Jinling College, Nanjing University; College of Computer Science and Technology, Sanjiang University
Abstract: Web mining systems that run on a single server hit a computational bottleneck when processing massive data sets. To address this problem, we propose a Web mining method based on cloud computing, in which large data sets and mining tasks are decomposed across multiple computers and processed in parallel. We build a parallel Web mining platform on the open-source Hadoop framework and put forward an improved MapReduce model, MapReduce-LP. A Web log mining job on an e-commerce system verifies the effectiveness of the platform and the efficiency of the new model. Experimental results show that using cloud computing technology to process large data sets on a cluster significantly improves the efficiency of Web mining.
Keywords: Web Mining; Cloud-Computing Technology; Hadoop; MapReduce-LP Model; Web Log Mining
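
The decomposition the abstract describes can be illustrated with a minimal, single-process sketch of the standard MapReduce pattern applied to Web logs. It is not the MapReduce-LP model, whose details the abstract does not give, and it does not use Hadoop. The function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the assumption that log lines follow the Common Log Format, with the requested URL as the seventh whitespace-separated field, are hypothetical choices made only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(log_line):
    """Emit (url, 1) for one log line.

    Assumption: Common Log Format, where the requested URL is the
    seventh whitespace-separated field. Malformed lines are skipped.
    """
    fields = log_line.split()
    if len(fields) > 6:
        yield fields[6], 1


def shuffle(pairs):
    """Group intermediate values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one URL."""
    return key, sum(values)


def run_job(splits):
    """Run map over every input split, then shuffle and reduce.

    Each split stands in for the portion of the log that one worker
    would process; here the splits are simply processed in sequence.
    """
    mapped = chain.from_iterable(
        map_phase(line) for split in splits for line in split
    )
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Two hypothetical input splits of an e-commerce access log.
    splits = [
        [
            '10.0.0.1 - - [09/Aug/2012:22:15:22 +0800] "GET /cart HTTP/1.1" 200 512',
            '10.0.0.2 - - [09/Aug/2012:22:15:23 +0800] "GET /item/1 HTTP/1.1" 200 2048',
        ],
        [
            '10.0.0.3 - - [09/Aug/2012:22:15:24 +0800] "GET /cart HTTP/1.1" 200 512',
        ],
    ]
    print(run_job(splits))
```

On a real cluster the map and reduce calls would run on different machines and the shuffle would move data between them; the sketch keeps only the data flow.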
This article is indexed in CNKI and other databases.