
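Research on a classification model based on mRMR and factorization machines
Cite this article: Wang Mei, Long Hua, Shao Yubin, Du Qingzhi. Research on a classification model based on mRMR and factorization machines[J]. Journal of Sichuan University (Natural Science Edition), 2020, 57(1): 96-102.
Authors: Wang Mei, Long Hua, Shao Yubin, Du Qingzhi
Affiliation: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650000 (all four authors)
Funding: National Natural Science Foundation of China (61761025)
Abstract: Many researchers have used the Global Terrorism Database (GTD) together with game theory, k-nearest neighbors, and support vector machines to analyze the clustering of terrorist events, and have obtained some results. However, earlier work did not adequately address data sparsity or high dimensionality with heavy redundancy, both of which can lower the accuracy of clustering classification. This paper proposes TFM, a classification model that combines minimal redundancy maximal relevance (mRMR) with factorization machines (FM). An incremental search finds an approximately optimal feature subset to handle the high-dimensional, redundant features, and FM handles the data sparsity. The TFM model is then used for quantitative classification of preprocessed terrorist attack data. Four models, naive Bayes (NB), support vector machine (SVM), logistic regression (LR), and TFM, are compared by Matthews correlation coefficient (MCC). The MCC of TFM is 49.9%, 2.5%, and 2.3% higher than that of NB, SVM, and LR, respectively, which suggests that the TFM model is feasible to some extent.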
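The incremental search mentioned in the abstract can be illustrated with a minimal sketch of greedy mRMR selection. This is not the authors' implementation: it assumes discrete features, uses the difference form of the criterion (relevance to the label minus mean redundancy with already selected features), and the names `mutual_information` and `mrmr_select` are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(a,b) * log(p(a,b) / (p(a) * p(b))) over observed pairs only.
    return sum((c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def mrmr_select(X, y, k):
    """Greedy (incremental) mRMR: return the indices of k selected features.

    X is a list of rows; each row is a list of discrete feature values.
    Assumption: score = relevance - mean redundancy (difference form).
    """
    n_features = len(X[0])
    k = min(k, n_features)
    columns = [[row[j] for row in X] for j in range(n_features)]
    relevance = [mutual_information(col, y) for col in columns]

    # Start with the single most relevant feature.
    selected = [max(range(n_features), key=relevance.__getitem__)]
    while len(selected) < k:
        best_j, best_score = None, float("-inf")
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```

Each step adds one feature and never revisits earlier choices, which is why the result is only approximately optimal. An exact duplicate of an already selected feature is penalized by its full entropy, so it tends to be skipped in favor of a less redundant one.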

Keywords: minimal redundancy maximal relevance; GTD; factorization machines; Matthews correlation coefficient; TFM classification model
Received: 2019/4/3
Revised: 2019/8/30

Classification model based on mRMR and factorization machines algorithm
Wang Mei, Long Hua, Shao Yubin, Du Qingzhi. Classification model based on mRMR and factorization machines algorithm[J]. Journal of Sichuan University (Natural Science Edition), 2020, 57(1): 96-102.
Authors: Wang Mei, Long Hua, Shao Yubin, Du Qingzhi
Institution: Faculty of Information Engineering and Automation, Kunming University of Science and Technology (all four authors)
Abstract: Many scholars have analyzed the aggregation of terrorist events using the Global Terrorism Database (GTD) with game theory, the k-nearest neighbor method, and support vector machines, and have made some progress. However, previous research has not adequately considered data sparsity and high-dimensional redundancy, which may lead to low accuracy in clustering classification. This paper proposes a TFM classification model that combines minimal redundancy maximal relevance (mRMR) with factorization machines (FM): an incremental search finds approximately optimal features to address the high-dimensional redundancy, and the FM method tackles the data sparsity. The TFM model is then used for quantitative classification of the preprocessed terrorist attack data. The experimental results show that the Matthews correlation coefficient (MCC) of the proposed TFM model is 49.9%, 2.5%, and 2.3% higher than that of naive Bayes (NB), support vector machine (SVM), and logistic regression (LR), respectively. The comparison indicates that the TFM model is feasible to some extent.
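To make the role of FM on sparse data concrete, the following is a minimal sketch of the standard second-order factorization machine score, computed with the well-known linear-time identity for the pairwise term. The abstract does not give the authors' parameterization or training procedure, so the names `fm_score`, `fm_score_naive`, and `fm_probability`, and the sigmoid link for binary classification, are assumptions for illustration only.

```python
from math import exp


def fm_score(x, w0, w, V):
    """Second-order FM score in O(k * n) time.

    x: feature vector (length n), w0: bias, w: linear weights (length n),
    V: n x k list of latent factor vectors, one row per feature.
    """
    n, k = len(x), len(V[0])
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        # 0.5 * ((sum)^2 - sum of squares) equals the sum over pairs i < j.
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


def fm_score_naive(x, w0, w, V):
    """Same score by explicit summation over all pairs i < j (for checking)."""
    n = len(x)
    total = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


def fm_probability(x, w0, w, V):
    """Assumed binary-classification link: sigmoid of the FM score."""
    return 1.0 / (1.0 + exp(-fm_score(x, w0, w, V)))
```

Because each pairwise weight is the inner product of two latent vectors rather than a free parameter, an interaction between two features can be estimated even when that pair rarely or never co-occurs in the training data, which is the usual argument for FM on sparse inputs.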
Keywords: mRMR; GTD; factorization machines; MCC; TFM classification model
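For reference, the MCC used as the comparison metric can be computed from a binary confusion matrix as in the sketch below. The abstract does not say whether the paper's task is binary or multi-class, so the binary form is an assumption here, as is the function name `mcc_binary` and the convention of returning 0.0 when the denominator is zero.

```python
from math import sqrt


def mcc_binary(y_true, y_pred):
    """Matthews correlation coefficient for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention (assumed): undefined MCC is reported as 0.0.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it uses all four cells of the confusion matrix, which makes it more informative than accuracy on imbalanced classes.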
This article is indexed in CNKI, Wanfang Data, and other databases.