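A clustering algorithm for mixed-attribute data based on the aggregate function
Cite this article: WANG Yu, YANG Li. A clustering algorithm for mixed-attribute data based on the aggregate function[J]. Journal of Dalian University of Technology, 2006, 46(3): 446-448.
Authors: WANG Yu  YANG Li
Affiliations: 1. School of Management, Dalian University of Technology, Dalian, Liaoning 116024
2. School of Foreign Languages, Dalian University of Technology, Dalian, Liaoning 116024
Abstract: Using the aggregate function, which approximates the maximum function, the traditional data clustering problem is converted into an unconstrained optimization problem. First, the first-order necessary condition is used to derive the computing scheme for cluster centers under numeric attributes. Second, a categorical-attribute decomposition method is adopted to propose a new way of computing the distance between categorical data objects; on this basis, a computing scheme for cluster centers under mixed attributes is given, together with an aggregate clustering algorithm that can handle data sets mixing numeric and categorical attributes. Finally, starting from different initial cluster centers, the aggregate clustering algorithm is applied to English loanwords in clustering experiments and analysis. The results show that the aggregate clustering algorithm outperforms the fuzzy k-prototypes clustering algorithm in both computational efficiency and the quality of the results.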

Keywords: clustering  aggregate function  mixed attributes  optimization  English loanwords
Article ID: 1000-8608(2006)03-0446-03
Received: 2005-02-11
Revised: 2006-03-04

A clustering algorithm for mixed valued data based on aggregate function
WANG Yu,YANG Li.A clustering algorithm for mixed valued data based on aggregate function[J].Journal of Dalian University of Technology,2006,46(3):446-448.
Authors:WANG Yu  YANG Li
Institution:1. School of Manage., Dalian Univ. of Technol., Dalian 116024, China; 2. School of Foreign Lang., Dalian Univ. of Technol., Dalian 116024, China
Abstract:An aggregate function, which approximates the maximum function, is introduced, and the data clustering problem is reformulated as an unconstrained optimization problem. First, a computing scheme for the cluster centers of numeric valued data is derived by applying the first-order necessary condition. Second, a new distance concept and computing scheme for categorical valued data are presented, based on a decomposition of the categorical attributes, and from these a new clustering approach for mixed numeric and categorical valued data is developed. Finally, computational experiments on Chinese loanwords in English are reported and analyzed for different initial cluster centers. The results show that the aggregate clustering algorithm is superior to the fuzzy k-prototypes algorithm in both computing efficiency and the quality of the results.
Keywords:clustering  mixed valued data  aggregate function  optimization  Chinese Loanwords in English
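The abstract does not state the aggregate function itself. The following is a minimal sketch, assuming the log-sum-exp form that commonly goes by that name in the optimization literature, F_p(v) = (1/p) ln Σ exp(p·v_i), which lies between max(v) and max(v) + ln(m)/p for m values. The function names, the smoothing parameter p, and the use of squared Euclidean distance on numeric attributes are illustrative assumptions, not the authors' formulation.

```python
import math


def aggregate_max(values, p):
    """Smooth upper approximation of max(values): (1/p) * ln(sum(exp(p * v)))."""
    m = max(values)  # shift by the maximum so the exponentials cannot overflow
    return m + math.log(sum(math.exp(p * (v - m)) for v in values)) / p


def aggregate_min(values, p):
    """Smooth lower approximation of min(values), via min(v) = -max(-v)."""
    return -aggregate_max([-v for v in values], p)


def smooth_clustering_objective(points, centers, p):
    """Sum over points of the smoothed minimum squared distance to a center.

    The hard 'nearest center' minimum is replaced by aggregate_min, so the
    objective becomes differentiable in the centers and unconstrained.
    """
    total = 0.0
    for x in points:
        sq_dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
        total += aggregate_min(sq_dists, p)
    return total


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
    centers = [(0.5, 0.0), (10.0, 0.0)]
    print(aggregate_max([1.0, 2.0, 3.0], p=50.0))
    print(smooth_clustering_objective(points, centers, p=100.0))
```

Larger values of p tighten the approximation at the cost of a more sharply curved objective; how the paper chooses p, and how it extends the distance to categorical attributes, is not given in the abstract.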
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.