A Survey of Bitmap Index Compression Algorithms for Big Data
Authors: Zhen Chen; Yuhao Wen; Junwei Cao; Wenxun Zheng; Jiahui Chang; Yinjun Wu; Ge Ma; Mourad Hakmaoui; Guodong Peng
Institution: Research Institute of Information Technology, Tsinghua University; Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University; Department of Electronic Engineering, Tsinghua University; Department of Physics, Tsinghua University; Department of Aerospace Engineering, Tsinghua University; Department of Automation, Tsinghua University; Department of Computer Science and Technology, Tsinghua University; Department of Mechanical Engineering, Tsinghua University
Abstract: With the growing popularity of Internet applications and the widespread use of the mobile Internet, Internet traffic has grown rapidly over the past two decades. Internet Traffic Archival Systems (ITAS) for packets or flow records are increasingly used in network monitoring, network troubleshooting, and the analysis of user behavior and experience. Among the three key technologies in ITAS, we focus on bitmap index compression algorithms and give a detailed survey in this paper. The current state-of-the-art bitmap index encoding schemes include BBC, WAH, PLWAH, EWAH, PWAH, CONCISE, COMPAX, VLC, DF-WAH, and VAL-WAH. We categorize these algorithms according to their differences in segmentation, chunking, merge compression, and Near Identical (NI) features. We also propose several new bitmap index encoding algorithms, namely SECOMPAX, ICX, MASC, and PLWAH+, and present the state diagrams of their encoding algorithms. We then evaluate their CPU and GPU implementations on a real Internet trace from CAIDA. Finally, we summarize the survey and discuss future directions for bitmap index compression algorithms. Beyond its applications in network security and network forensics, bitmap index compression, with its faster bitwise logical operations and reduced search space, is widely used in the analysis of genome data, geographical information systems, graph databases, image retrieval, the Internet of Things, and other fields. Bitmap index compression, which has been studied since the 1980s, is expected to thrive again in the Big Data era.
This article is indexed in CNKI, Wanfang Data, and other databases.