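Streaming Cleaning System for Periodic Industrial Time Series Data
Cite this article: WANG Yao, ZHAO Jiong, ZHOU Qicai, XIONG Xiaolei, CHEN Chuanlin, ZHANG Heng. Streaming Cleaning System for Periodic Industrial Time Series Data[J]. Journal of Tongji University (Natural Science), 2024, 52(3): 462-471
Authors: WANG Yao  ZHAO Jiong  ZHOU Qicai  XIONG Xiaolei  CHEN Chuanlin  ZHANG Heng
Affiliations: 1. College of Mechanical Engineering, Tongji University, Shanghai 201804, China; 2. Tongji Zhejiang College, Jiaxing, Zhejiang 314051, China; 3. Shanghai Metro Shield Machine Equipment & Engineering Co., Ltd., Shanghai 200233, China
Funding: Research Program of Shanghai Shentong Metro Group Co., Ltd. (JS-KY21R003-3)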
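Abstract: To efficiently clean industrial data that are time-ordered and periodic, a streaming cleaning system is first designed from distributed components. Mosquitto serves as the aggregation hub for collected data, Flume as the connector, and Kafka as the buffer feeding the data cleaning component, which gives the system high throughput and a large buffer. Then, based on the speed constraint model, a cleaning algorithm for periodic data is designed. Taking into account the temporal order, periodicity, and physical meaning of industrial data, the algorithm adds periodicity detection and data slicing to the original speed constraint algorithm, in order to resolve the distortion that the speed constraint algorithm produces on periodic data and to improve usability. Finally, a shield tunneling data set is used as a sample to verify the effectiveness of the system and the algorithm, as well as the applicability of the improved algorithm.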

Keywords: data cleaning  industrial big data  time series data  speed constraint  periodicity
Received: 2022-05-12

Streaming Cleaning System for Periodic Industrial Time Series Data
WANG Yao, ZHAO Jiong, ZHOU Qicai, XIONG Xiaolei, CHEN Chuanlin, ZHANG Heng. Streaming Cleaning System for Periodic Industrial Time Series Data[J]. Journal of Tongji University (Natural Science), 2024, 52(3): 462-471
Authors: WANG Yao  ZHAO Jiong  ZHOU Qicai  XIONG Xiaolei  CHEN Chuanlin  ZHANG Heng
Affiliation: 1. College of Mechanical Engineering, Tongji University, Shanghai 201804, China; 2. Tongji Zhejiang College, Jiaxing, Zhejiang 314051, China; 3. Shanghai Metro Shield Machine Equipment & Engineering Co., Ltd., Shanghai 200233, China
Abstract: To efficiently clean industrial data that are time-ordered and periodic, a streaming data cleaning system was first designed using distributed components. The system uses Mosquitto as the hub that gathers collected data, Flume as the connector, and Kafka as the buffer in front of the data cleaning component, which is the core of the system; this design provides high throughput and a large buffer. Then, a cleaning algorithm for periodic time series was proposed based on the speed constraint model. Taking into account the temporal order, periodicity, and physical meaning of industrial data, periodicity detection and data slicing were added to the original speed constraint algorithm, so as to resolve the distortion that the original algorithm introduces on periodic data and to improve its usability. Finally, the effectiveness of the system and the applicability of the improved algorithm were verified using a tunnel boring machine data set as a case study.
Keywords: data cleaning  industrial big data  time series data  speed constraint  periodic
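
The speed constraint idea named in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: the abstract gives no implementation details, so the greedy clamping repair, the fixed-length slicing, and the names `speed_repair`, `slice_by_period`, and `clean_periodic` are all assumptions made for illustration, and the period is taken as known rather than detected. A speed constraint bounds the rate of change between consecutive points, s_min <= (x_j - x_i) / (t_j - t_i) <= s_max, and the sketch assumes strictly increasing timestamps.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp, value)


def speed_repair(points: List[Point], s_min: float, s_max: float) -> List[Point]:
    """Greedy repair (an illustrative simplification): clamp each value into
    the range the speed constraint allows relative to the previously
    repaired point. Assumes strictly increasing timestamps."""
    if not points:
        return []
    repaired = [points[0]]
    for t, x in points[1:]:
        t_prev, x_prev = repaired[-1]
        dt = t - t_prev
        lo = x_prev + s_min * dt  # lowest value the constraint permits
        hi = x_prev + s_max * dt  # highest value the constraint permits
        repaired.append((t, min(max(x, lo), hi)))
    return repaired


def slice_by_period(points: List[Point], period: int) -> List[List[Point]]:
    """Split the series into consecutive slices of `period` samples.
    The period is assumed known here; detecting it is out of scope."""
    return [points[i:i + period] for i in range(0, len(points), period)]


def clean_periodic(points: List[Point], period: int,
                   s_min: float, s_max: float) -> List[Point]:
    """Repair each slice independently so the constraint is never applied
    across a cycle boundary (hypothetical reading of 'data slicing')."""
    cleaned: List[Point] = []
    for chunk in slice_by_period(points, period):
        cleaned.extend(speed_repair(chunk, s_min, s_max))
    return cleaned


if __name__ == "__main__":
    # Sawtooth with period 4 and one spike (9) in the first cycle.
    values = [0, 1, 9, 3, 0, 1, 2, 3]
    series = list(enumerate(values))

    unsliced = speed_repair(series, -1.5, 1.5)
    sliced = clean_periodic(series, 4, -1.5, 1.5)

    print([x for _, x in unsliced])  # [0, 1, 2.5, 3, 1.5, 1, 2, 3]
    print([x for _, x in sliced])    # [0, 1, 2.5, 3, 0, 1, 2, 3]
```

In this toy series both variants repair the spike, but the unsliced repair also pulls the legitimate reset at the start of the second cycle from 0 up to 1.5, which is one plausible form of the distortion on periodic data that the abstract mentions. Repairing each slice on its own leaves the reset untouched.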