
Research on Delay Classification and Prediction Algorithm Based on BTS Flight Data Set
Cite this article: Guo Haizhou, Yang Jingjing, Wu Jida, Zhang Bin, Huang Ming. Research on Delay Classification and Prediction Algorithm Based on BTS Flight Data Set[J]. Science Technology and Engineering (科学技术与工程), 2023, 23(12): 5304-5311.
Authors: Guo Haizhou  Yang Jingjing  Wu Jida  Zhang Bin  Huang Ming
Institutions: School of Information Science and Engineering, Yunnan University; Yunnan Provincial Radio Monitoring Center
Funding: National Natural Science Foundation of China (61863035, 62261059, 61963037)
Received: 2022/9/15
Revised: 2023/4/18

Abstract: Neural network classification models show large prediction errors on the imbalanced data in the U.S. Bureau of Transportation Statistics (BTS) flight dataset. To address this, the adaptive synthetic sampling approach (ADASYN) and the synthetic minority over-sampling technique (SMOTE) are used to balance the flight delay categories, and a random forest (RF) model is trained on the balanced data with Bayesian hyperparameter tuning. Compared with the same approach without balanced sampling, the weighted-average precision, recall, and F1 score improve by 19%, 8%, and 16%, respectively; classification accuracy improves by 8.03%, and the model fit index AUC (area under the curve) improves by 5.4%. In addition, Graph WaveNet, a graph neural network model with multi-feature fusion, is used to predict the average flight delay time. Compared with the single-feature model, its mean absolute error and root mean square error are reduced by 16% and 12.45%, respectively. These methods and results offer a reference for research on flight delay classification and prediction algorithms.

Keywords: imbalanced classification data  balanced sampling algorithm  random forest (RF) model  graph neural network  feature fusion
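
The abstract names SMOTE as one of the two class-balancing methods. As a rough illustration of the core idea only (not the authors' implementation), the sketch below creates synthetic minority samples by interpolating between a minority point and one of its nearest minority-class neighbours. The function name `smote_sketch`, the toy feature values, and the choice of `k` are all assumptions made for this example.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` inside the minority class, excluding itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: two features per flight, 3 "delayed" vs 8 "on time".
delayed = [(1.0, 5.0), (1.5, 4.5), (2.0, 5.5)]
on_time = [(float(i), 0.0) for i in range(8)]

# Oversample the minority class until both classes have the same size.
balanced_delayed = delayed + smote_sketch(delayed, len(on_time) - len(delayed), k=2)
```

ADASYN follows the same interpolation step but generates more synthetic samples around minority points whose neighbourhoods contain a larger share of majority-class samples, so harder regions receive more new data.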
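
The classification results in the abstract are reported as weighted-average precision, recall, and F1. The sketch below shows one common reading of that term, assumed here rather than confirmed by the source: each class's score is weighted by its share of the true labels, which is the definition scikit-learn uses for `average="weighted"`. The function name `weighted_prf` and the labels are hypothetical.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted average of per-class precision, recall and F1."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / n
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = n / total  # class share of the true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Hypothetical labels: 4 on-time flights and 2 delayed flights.
y_true = ["on_time"] * 4 + ["delayed"] * 2
y_pred = ["on_time", "on_time", "on_time", "delayed", "delayed", "on_time"]
precision, recall, f1 = weighted_prf(y_true, y_pred)
```

Under this definition the weighted recall always equals plain accuracy, so on imbalanced data it can hide poor minority-class recall; per-class scores are worth inspecting alongside the averages.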