Uncertain Data Generation Algorithm Based on Outlier Factor
Cite this article: LIU Gang, TANG Dongkai, WANG Hongmei, HU Ming. Uncertain Data Generation Algorithm Based on Outlier Factor[J]. Journal of Jilin University: Sci Ed, 2018, 56(4): 925-932.
Authors: LIU Gang  TANG Dongkai  WANG Hongmei  HU Ming
Institution: 1. School of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; 2. School of Computer Technology and Engineering, Changchun Institute of Technology, Changchun 130012, China
Abstract: Based on the representation model of uncertain data, we propose AC-UDGen (attribute level continuous uncertain data set generation algorithm), an algorithm for generating attribute-level uncertain data. The algorithm introduces the outlier detection algorithm LOF (local outlier factor) and uses the outlier factor of each data object as the parameter that controls the perturbation range of the corresponding uncertain data object. The generated data therefore follow the distribution characteristics of the original data, which addresses the lack of such distribution information in existing work. Experimental results show that the uncertain data sets generated by the algorithm yield better clustering results, that the influence of outliers on the clustering results is reduced, and that the size of each data object's MBR (minimum bounding rectangle) adapts to the object's own distribution characteristics.

Keywords: representation model  AC-UDGen algorithm  uncertain data  outlier factor
Received: 2017-06-20
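
The core idea in the abstract, using each object's local outlier factor to set the size of its uncertainty region, can be illustrated with a minimal Python sketch. This is not the AC-UDGen algorithm as published: the abstract does not give the mapping from LOF to perturbation range, so the sketch assumes the MBR half-width is simply `base_half_width * LOF`. The function names (`lof_scores`, `generate_uncertain`), the parameter `base_half_width`, and the simplified neighbourhood (exactly k neighbours, ties broken by index) are all hypothetical choices made for illustration.

```python
import math


def k_nearest(points, i, k):
    """Return the k nearest (distance, index) pairs for points[i].

    Simplification: exactly k neighbours are kept, so ties at the
    k-distance are broken by index rather than all being included.
    """
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]


def lof_scores(points, k):
    """Compute a local outlier factor for every point (plain-Python LOF)."""
    n = len(points)
    neigh = [k_nearest(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / max(sum(reach), 1e-12))  # guard duplicates

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


def generate_uncertain(points, k=2, base_half_width=0.1):
    """Turn each certain point into an uncertain object with an interval MBR.

    ASSUMPTION (not from the paper): the half-width of every attribute
    interval is base_half_width * LOF, so the MBR size follows the object's
    own outlier factor.
    """
    objects = []
    for p, score in zip(points, lof_scores(points, k)):
        h = base_half_width * score
        mbr = [(x - h, x + h) for x in p]  # one interval per attribute
        objects.append({"center": p, "lof": score, "mbr": mbr})
    return objects


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
    for obj in generate_uncertain(data):
        print(obj["center"], round(obj["lof"], 3), obj["mbr"])
```

In this toy data set the four clustered points receive an LOF close to 1 and the isolated point a much larger one, so their MBRs differ in size accordingly. Whether the published algorithm widens or narrows the region for outliers, and which probability density it attaches to each interval, cannot be determined from the abstract alone.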
This article is indexed in CNKI and other databases.