
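A cost-sensitive semi-supervised ensemble model for customer churn prediction
Cite this article: XIAO Jin, LI Sihan, HE Xiaozhou, TENG Geer, JIA Pinrong, XIE Ling. A cost-sensitive semi-supervised ensemble model for customer churn prediction[J]. Systems Engineering - Theory & Practice, 2021, 41(1): 188-199.
Authors: XIAO Jin  LI Sihan  HE Xiaozhou  TENG Geer  JIA Pinrong  XIE Ling
Affiliations: 1. Business School, Sichuan University, Chengdu 610064; 2. Management Science and Operations Research Institute, Sichuan University, Chengdu 610064; 3. Beijing Research Center for Science of Science, Beijing 100089; 4. School of Medical Information Engineering, Zunyi Medical University, Zunyi 563006
Funding: Major Program of the National Social Science Fund of China (18VZL006); Sichuan University Outstanding Young Scholars Fund for the Humanities and Social Sciences (sksyl201709); "Beike Scholar" Program of the Beijing Academy of Science and Technology (PXM2020-178216-000008); Beijing Municipal Finance Project (PXM2020-178216-000001)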
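Abstract: Customer churn prediction is an important part of enterprise customer relationship management. In many real-world churn prediction tasks, the class distribution is highly imbalanced, which degrades the classification performance of the model. In addition, only a small number of samples carry class labels while most data are unlabeled, so a large amount of useful information is wasted. To address these two problems, this study combines meta cost-sensitive learning, semi-supervised learning, and Bagging ensembles, and proposes a cost-sensitive semi-supervised ensemble model for customer churn prediction (semi-supervised ensemble based on metacost, SSEM). The model consists of three stages: 1) the Metacost method is used to modify the class labels of the initial labeled training set L, yielding a new training set Lm, which is randomly split into a model training set Ltr and a model validation set Va; 2) Va is used to pick the three base classifiers with the highest classification accuracy, which then selectively label samples from the unlabeled set U, and these samples are added to Ltr; 3) N base classification models are trained on the new model training set Ltr and used to classify the test samples, and their classification results are then integrated. In an empirical analysis on two customer churn prediction datasets, SSEM is compared with commonly used supervised and semi-supervised ensemble models, and the results show that SSEM achieves better churn prediction performance.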

Keywords: customer churn prediction  imbalanced class distribution  semi-supervised  co-training  cost-sensitive
Received: 2019-12-19
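
The relabeling step in stage 1 can be illustrated with a minimal sketch. It assumes the usual MetaCost rule: each training sample is given the class with the lowest expected misclassification cost under class probabilities estimated by a bagged ensemble. The cost matrix, the probability values, and the function names below are hypothetical and are not taken from the paper.

```python
def expected_cost(probs, cost, predicted):
    """Expected cost of predicting `predicted`, given class probabilities `probs`.

    cost[i][j] is the cost of predicting class i when the true class is j.
    """
    return sum(p * cost[predicted][actual] for actual, p in enumerate(probs))


def metacost_relabel(prob_estimates, cost):
    """Return, for each sample, the class with the minimum expected cost."""
    n_classes = len(cost)
    return [
        min(range(n_classes), key=lambda c: expected_cost(probs, cost, c))
        for probs in prob_estimates
    ]


# Hypothetical cost matrix (class 1 = churner): missing a churner is assumed
# to cost five times as much as a false alarm.
COST = [[0.0, 5.0],
        [1.0, 0.0]]

# Hypothetical averaged estimates [P(non-churn), P(churn)] for three samples,
# standing in for the output of bagged models trained on L.
PROBS = [[0.9, 0.1], [0.7, 0.3], [0.4, 0.6]]

new_labels = metacost_relabel(PROBS, COST)
print(new_labels)  # [0, 1, 1]
```

The second sample is relabeled as a churner even though its churn probability is only 0.3, because a missed churner is assumed to be far more costly. This is how relabeling can counteract class imbalance before any classifier is trained on Lm.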

Semi-supervised ensemble based on metacost model for customer churn prediction
XIAO Jin, LI Sihan, HE Xiaozhou, TENG Geer, JIA Pinrong, XIE Ling. Semi-supervised ensemble based on metacost model for customer churn prediction[J]. Systems Engineering - Theory & Practice, 2021, 41(1): 188-199.
Authors:XIAO Jin  LI Sihan  HE Xiaozhou  TENG Geer  JIA Pinrong  XIE Ling
Institutions: 1. Business School, Sichuan University, Chengdu 610064, China; 2. Management Science and Operations Research Institute, Sichuan University, Chengdu 610064, China; 3. Beijing Research Center for Science of Science, Beijing 100089, China; 4. School of Medical Information Engineering, Zunyi Medical University, Zunyi 563006, China
Abstract: Customer churn prediction is an important part of customer relationship management (CRM). In many real-world churn prediction tasks, the class distribution is highly imbalanced, so model performance is poor and satisfactory results are difficult to achieve. Moreover, only a small number of samples are labeled while most are unlabeled, which wastes a large amount of useful information. To solve these two problems, this study combines meta cost-sensitive learning, semi-supervised learning, and the Bagging ensemble method, and proposes a semi-supervised ensemble model based on metacost (SSEM) for customer churn prediction. The model includes three stages: 1) the Metacost method is used to modify the labels of the initial labeled training set L, yielding a new training set Lm, which is then randomly divided into a model training set Ltr and a model validation set Va; 2) Va is used to select the three base classifiers with the highest classification accuracy, and these classifiers cooperate to selectively label some samples from the unlabeled data set U, which are added to Ltr; 3) N base classifiers are trained on the new model training set Ltr and used to classify the samples in the test set, and the final classification results are obtained by integrating their outputs. An empirical analysis on two customer churn prediction datasets shows that SSEM outperforms commonly used supervised and semi-supervised ensemble models.
Keywords:customer churn prediction  imbalanced class distribution  semi-supervised  co-training  cost-sensitive  
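
Stages 2 and 3 can be sketched in a few lines under two stated assumptions. The abstract does not specify the selection rule or the integration rule, so this sketch assumes that an unlabeled sample is pseudo-labeled only when all three classifiers agree, and that the final prediction is a simple majority vote. The threshold classifiers and all names below are hypothetical stand-ins for trained base classifiers.

```python
from collections import Counter


def make_threshold_classifier(threshold):
    """Hypothetical base classifier: predicts churn (1) above a score threshold."""
    return lambda x: int(x > threshold)


def select_pseudo_labels(classifiers, unlabeled):
    """Keep only the samples on which every classifier agrees (assumed rule)."""
    selected = []
    for x in unlabeled:
        votes = {clf(x) for clf in classifiers}
        if len(votes) == 1:  # unanimous agreement
            selected.append((x, votes.pop()))
    return selected


def majority_vote(classifiers, x):
    """Integrate predictions by majority vote (assumed integration rule)."""
    return Counter(clf(x) for clf in classifiers).most_common(1)[0][0]


# Three classifiers standing in for the top three chosen on the validation set Va.
CLASSIFIERS = [make_threshold_classifier(t) for t in (0.4, 0.5, 0.6)]

# Hypothetical one-dimensional samples standing in for the unlabeled set U.
UNLABELED = [0.1, 0.45, 0.55, 0.9]

pseudo_labeled = select_pseudo_labels(CLASSIFIERS, UNLABELED)
print(pseudo_labeled)                    # [(0.1, 0), (0.9, 1)]
print(majority_vote(CLASSIFIERS, 0.55))  # 1
```

Only the two samples far from the decision boundary receive pseudo-labels and would be added to Ltr. The ambiguous samples near the thresholds stay unlabeled, which limits the label noise introduced into the training set.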
This article is indexed in databases including CNKI and VIP.