
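SMOTENC-XGBOOST Driver Traffic Safety Assessment Model for Unbalanced Dataset
Cite this article: Wang Bowen, Wang Jingsheng, Wu Enzhong. SMOTENC-XGBOOST Driver Traffic Safety Assessment Model for Unbalanced Dataset[J]. Science Technology and Engineering, 2023, 23(2): 831-837.
Authors: Wang Bowen  Wang Jingsheng  Wu Enzhong
Affiliations: People's Public Security University of China, School of Traffic Management; People's Public Security University of China, Public Security College
Funding: National Key Research and Development Program of China (2020YFC1522603); People's Public Security University of China 2022 Fundamental Research Funds, Disciplinary Basic Theory System Project (2022JKF02013)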
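Abstract: To explore in depth the relationship between driver factors and traffic accidents, a classification algorithm that rates a driver's traffic status as good or bad is proposed, based on SMOTENC and extreme gradient boosting (XGBoost). First, because records with and without accidents are imbalanced, the SMOTENC algorithm is used to oversample the data, with random perturbation added during sampling, which resolves the imbalance. Next, an embedded method combined with L1 regularization selects the feature subset through model evaluation. Finally, the XGBoost algorithm performs feature extraction and classification. Experiments show that, on the task of comprehensively evaluating a driver's traffic status, the XGBoost model reaches an accuracy of 99.85%, about 1.12%-1.80% higher than control models such as random forest and support vector machine. In addition, after SMOTENC is applied to the imbalance problem, the confusion matrix shows that the model identifies both good and bad individuals well.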

Keywords: traffic safety    XGBOOST    SMOTENC    driver factors    accident prevention
Received: 2022/4/22
Revised: 2022/10/23

SMOTENC-XGBOOST Driver Traffic Safety Assessment Model for Unbalanced Dataset
Wang Bowen,Wang Jingsheng,Wu Enzhong.SMOTENC-XGBOOST Driver Traffic Safety Assessment Model for Unbalanced Dataset[J].Science Technology and Engineering,2023,23(2):831-837.
Authors:Wang Bowen  Wang Jingsheng  Wu Enzhong
Institution: People's Public Security University of China, School of Traffic Management; People's Public Security University of China, Public Security College
Abstract: To explore in depth the relationship between driver factors and traffic accidents, a classification algorithm based on SMOTENC and extreme gradient boosting (XGBoost) is proposed. First, because records with and without accidents are imbalanced, the SMOTENC algorithm is used to up-sample the data, with random perturbation added during sampling, to resolve the imbalance. Then, an embedded method combined with L1 regularization selects the feature subset through model evaluation. Finally, the XGBoost algorithm performs feature extraction and classification. Experiments show that the XGBoost model reaches an accuracy of 99.85%, about 1.12%-1.80% higher than control models such as random forest and support vector machine. In addition, after the SMOTENC algorithm is applied to the imbalance problem, the confusion matrix shows that the model identifies both good and bad individuals well.
Keywords: traffic safety      XGBoost      SMOTENC      driver factor      accident prevention
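
The oversampling step named in the abstract can be illustrated with a minimal, standard-library sketch of the usual SMOTE-NC idea: continuous features are interpolated between a minority sample and one of its nearest neighbours, and categorical features take the most frequent value among those neighbours. This is not the authors' code. The function name `smotenc_sketch`, its parameters, and the driver records (age, annual mileage in thousand km, licence class) are hypothetical, and the extra random perturbation mentioned in the abstract is not modelled because the abstract does not specify it.

```python
import random
import statistics
from collections import Counter


def smotenc_sketch(minority, cat_idx, n_new, k=3, seed=0):
    """Generate n_new synthetic minority rows (simplified SMOTE-NC).

    Assumes at least one continuous feature and more than k rows.
    """
    rng = random.Random(seed)
    cat_idx = set(cat_idx)
    num_idx = [j for j in range(len(minority[0])) if j not in cat_idx]

    # Penalty for a categorical mismatch: the median of the standard
    # deviations of the continuous features within the minority class.
    med = statistics.median(
        [statistics.pstdev([row[j] for row in minority]) for j in num_idx]
    )

    def dist2(a, b):
        # Squared Euclidean distance on continuous features, plus
        # med**2 for every categorical feature that differs.
        d = sum((a[j] - b[j]) ** 2 for j in num_idx)
        d += sum(med ** 2 for j in cat_idx if a[j] != b[j])
        return d

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [row for t, row in enumerate(minority) if t != i]
        neighbours = sorted(others, key=lambda r: dist2(base, r))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        new = list(base)
        for j in num_idx:
            new[j] = base[j] + gap * (mate[j] - base[j])
        for j in cat_idx:
            # Majority vote over the k nearest neighbours.
            new[j] = Counter(r[j] for r in neighbours).most_common(1)[0][0]
        synthetic.append(new)
    return synthetic


# Hypothetical minority-class records: [age, mileage, licence class].
minority = [
    [23, 12.0, "C1"],
    [25, 15.5, "C1"],
    [31, 9.0, "C1"],
    [28, 20.0, "B2"],
    [45, 30.0, "B2"],
]
synthetic = smotenc_sketch(minority, cat_idx=[2], n_new=4)
for row in synthetic:
    print(row)
```

In practice a maintained implementation such as the `SMOTENC` class in imbalanced-learn would normally be applied to the training split only, so that synthetic rows do not leak into the evaluation data.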