
Abnormal Value Detection of Large Power Data Based on Density Peak Clustering
Citation: Lu Chunguang, Ye Fangbin, Zhao Ling, Jiang Chi, Dong Wei. Abnormal Value Detection of Large Power Data Based on Density Peak Clustering[J]. Science Technology and Engineering, 2020, 20(2): 654-658.
Authors: Lu Chunguang  Ye Fangbin  Zhao Ling  Jiang Chi  Dong Wei
Institution: State Grid Zhejiang Electric Power Co., Ltd. Institute of Electric Power Science, Hangzhou 310007; Zhejiang Huayun Information Technology Co., Ltd., Hangzhou 310002
Funding: State Grid Science and Technology Project (5211DS18000A)
Received: 2018/12/27
Revised: 2019/2/16

Abstract: Traditional detection algorithms have low accuracy and high complexity, which makes them unsuitable for abnormal value detection in large power data. To address this, abnormal value detection in large power data is studied using the density peak clustering algorithm. The clustering process of the density peak clustering algorithm is first analyzed. Following the principle of cluster center selection, the difference degree of each candidate point is measured by the normalized product of its adjacent distance and its density, and the group of points with the largest difference degree is selected as the cluster centers according to the statistical characteristics and changing trend of the difference degree. Based on the Z-order space-filling curve and the fact that the Z-value of a high-dimensional data point carries its location information, a Z-value-based distributed density peak clustering algorithm is proposed, which reduces the complexity of anomaly detection so as to meet the requirements of abnormal value detection in large power data. The optimized density peak clustering algorithm is then used to detect abnormal values in large power data: a power data point is considered abnormal when its local density exceeds a threshold and its distance also exceeds a threshold. A distance-based detection algorithm and a density-based detection algorithm are used for comparison. The results show that the abnormal power data points found by the proposed algorithm are consistent with the actual situation and that, unlike the other two algorithms, the proposed algorithm produces no false detections or missed detections. The proposed algorithm is therefore suitable for abnormal value detection in large power data and yields highly accurate detection results.
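The difference-degree measure described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard density peak clustering definitions, which the abstract does not spell out: local density as the number of neighbors within a cutoff distance, and adjacent distance as the distance to the nearest point of higher density. The function name `density_peak_scores`, the cutoff parameter `d_c`, and the min-max normalization are illustrative assumptions, not the authors' implementation.

```python
import math


def density_peak_scores(points, d_c):
    """Return (rho, delta, gamma) lists, one entry per point.

    rho   : local density, the number of other points closer than d_c
    delta : distance to the nearest point with a higher density
    gamma : product of min-max normalized rho and delta (difference degree)
    All three definitions are assumptions taken from standard density
    peak clustering, not from the paper itself.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: count neighbors inside the cutoff radius d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Adjacent distance: nearest point of strictly higher density.
    # Points with no denser neighbor get their largest pairwise distance.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))

    def minmax(values):
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

    gamma = [r * d for r, d in zip(minmax(rho), minmax(delta))]
    return rho, delta, gamma
```

Points with the largest `gamma` are the natural cluster-center candidates. The abstract selects them from the statistical characteristics and changing trend of the difference degree; a fixed top-k cut on `gamma` would be a simpler stand-in for that step.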
Keywords: Density Peak Clustering    Large Power Data    Abnormal Value    Detection
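The abstract's Z-value idea rests on the Z-order (Morton) curve: interleaving the bits of a point's grid coordinates gives a single integer that preserves much of the point's spatial locality. The sketch below shows one way such Z-values could be used to split data into contiguous chunks for distributed processing. The helper names `z_value`, `quantize`, and `partition_by_z`, the shared coordinate range `[lo, hi]`, and the equal-size chunking are assumptions for illustration; the abstract does not describe the paper's actual partitioning scheme.

```python
def z_value(coords, bits=8):
    """Interleave the bits of non-negative integer coordinates (Morton code)."""
    z = 0
    dims = len(coords)
    for bit in range(bits):
        for d, c in enumerate(coords):
            # Bit `bit` of dimension d lands at position bit * dims + d.
            z |= ((c >> bit) & 1) << (bit * dims + d)
    return z


def quantize(point, lo, hi, bits=8):
    """Map real coordinates in [lo, hi] onto integer cells 0 .. 2**bits - 1."""
    cells = (1 << bits) - 1
    return [min(cells, max(0, int((x - lo) / (hi - lo) * cells)))
            for x in point]


def partition_by_z(points, lo, hi, parts, bits=8):
    """Sort points by Z-value and cut the order into contiguous chunks."""
    ordered = sorted(points,
                     key=lambda p: z_value(quantize(p, lo, hi, bits), bits))
    size = -(-len(ordered) // parts)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]
```

For example, `partition_by_z([(0, 0), (9, 9), (0.5, 0.5), (9.5, 9.5)], 0, 10, 2)` places the two points near the origin in one chunk and the two points near (9, 9) in the other, so nearby points tend to be processed together. Points close to a chunk boundary can still have neighbors in the adjacent chunk, which a real distributed algorithm would need to handle.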
This article is indexed in CNKI, Wanfang Data, and other databases.