Scalable method for detecting duplicate stream data
Cite this article: Hu Li-hui. Scalable method for detecting duplicate stream data[J]. Systems Engineering and Electronics, 2008, 30(2): 351-353.
Author: Hu Li-hui
Affiliation: School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, Hunan 410076
Funding: Supported by a research project of the Hunan Provincial Department of Transportation (200610)
Abstract: Stream data are real-time, continuous, ordered, and unbounded, so duplicates are generally detected with approximate methods, which can miss some of them. For a class of continuous, time-segmented stream data sequences, an approach is introduced that uses time ordered intervals (TOI) to determine whether a data item already exists. A TOI linked-list structure is designed, an algorithm is given that detects duplicates exactly and updates the TOI list dynamically, and the complexity of the algorithm and the factors that influence it are analysed. The method is adaptive, scalable, exact, simple, and independent of time. It can also be applied to identifying omitted stream data and to optimizing query procedures, and it makes up for the shortcomings of approximate algorithms.

Keywords: stream data; duplicate; omitted; scalable; time ordered interval
Article ID: 1001-506X(2008)02-0351-03
Revised: January 10, 2007
This article is indexed in CNKI, Wanfang Data, and other databases.
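
The abstract describes exact duplicate detection by keeping a list of time ordered intervals, but it does not give the data layout or the update rules. The following is a minimal Python sketch of one way such a structure could work, not the paper's algorithm. It assumes that every stream item carries an integer sequence key, and it stores the intervals in sorted arrays searched with `bisect` instead of a linked list. The names `IntervalList`, `add`, `missing`, and `intervals` are hypothetical.

```python
from bisect import bisect_right


class IntervalList:
    """Sorted, disjoint, closed integer intervals [start, end] of keys seen so far.

    Illustrative only: the integer-key assumption and the array layout
    are not taken from the paper.
    """

    def __init__(self):
        self.starts = []  # interval start keys, ascending
        self.ends = []    # interval end keys, aligned with self.starts

    def add(self, key):
        """Record key; return True if it was already present (a duplicate)."""
        # Index of the last interval that starts at or before key (-1 if none).
        i = bisect_right(self.starts, key) - 1
        if i >= 0 and key <= self.ends[i]:
            return True  # key lies inside an existing interval
        joins_left = i >= 0 and self.ends[i] + 1 == key
        joins_right = i + 1 < len(self.starts) and self.starts[i + 1] == key + 1
        if joins_left and joins_right:
            # key fills the one-key gap between two intervals: merge them
            self.ends[i] = self.ends[i + 1]
            del self.starts[i + 1]
            del self.ends[i + 1]
        elif joins_left:
            self.ends[i] = key        # extend the left neighbour upward
        elif joins_right:
            self.starts[i + 1] = key  # extend the right neighbour downward
        else:
            # isolated key: open a new single-key interval
            self.starts.insert(i + 1, key)
            self.ends.insert(i + 1, key)
        return False

    def missing(self):
        """Return the gaps between consecutive intervals as (first, last) pairs."""
        return [(self.ends[j] + 1, self.starts[j + 1] - 1)
                for j in range(len(self.starts) - 1)]

    def intervals(self):
        """Return the current intervals as (start, end) pairs."""
        return list(zip(self.starts, self.ends))


seen = IntervalList()
flags = [seen.add(k) for k in (1, 2, 3, 7, 8, 2, 5)]
```

In this sketch the storage grows with the number of gaps in the sequence, not with the number of items received, so a stream that arrives mostly in order collapses into a few intervals. The gaps reported by `missing()` correspond to the omitted-data check that the abstract mentions.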