
Parallel Parameters: A Method for Optimizing Machine Learning Parameters Based on Swarm Heuristic Algorithm
Cite this article: Yang Yanyan, Li Leixiao, Lin Hao, Wang Yongsheng, Wang Hui, Gao Jing. Parallel Parameters: A Method for Optimizing Machine Learning Parameters Based on Swarm Heuristic Algorithm[J]. Science Technology and Engineering, 2022, 22(5): 1972-1980.
Authors: Yang Yanyan, Li Leixiao, Lin Hao, Wang Yongsheng, Wang Hui, Gao Jing
Institution: College of Data Science and Application, Inner Mongolia University of Technology; College of Computer and Information Engineering, Inner Mongolia Agricultural University
Abstract: To address the low efficiency of hyperparameter optimization for machine learning algorithms, and taking into account the characteristics of mainstream parameter optimization algorithms, this paper proposes a machine learning parameter optimization method based on a parameter-parallel mechanism. The method uses a swarm heuristic algorithm to optimize the parameters of the machine learning algorithm, converts the population into a resilient distributed dataset (RDD), the distributed collection specific to the Spark platform, and, given how time-consuming parameter optimization is, computes the fitness of the individuals in the population in parallel. IC card data from Guangzhou buses serve as the experimental data, with random forest and a genetic algorithm as the experimental algorithms, and multiple sets of experiments are designed to validate the proposed training method. The results show that on small datasets (under 200,000 records), the proposed parameter-parallel method reduces running time by up to 2 hours compared with a parameter optimization method based on the data-parallel mechanism, and that it scales well.
Keywords: parameter optimization; swarm heuristic algorithm; Spark; parameter parallelism; machine learning algorithm
Received: 2021/6/2
Revised: 2021/9/17
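
The parameter-parallel idea in the abstract (score every individual of the population independently and in parallel) can be illustrated with a minimal sketch. It is not the authors' implementation. A standard-library thread pool stands in for the Spark RDD, a toy `fitness` function stands in for training and scoring a random forest, and the search space `BOUNDS` and all function names are hypothetical. Threads keep the example self-contained; real model training would need processes or a cluster to gain actual speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space: (n_trees, max_depth), standing in for
# random forest hyperparameters. Not taken from the paper.
BOUNDS = [(10, 200), (2, 20)]

def fitness(individual):
    """Toy stand-in for 'train a model with these hyperparameters and score it'.
    Peaks at n_trees=120, max_depth=8; a real run would return a validation score."""
    n_trees, max_depth = individual
    return -((n_trees - 120) ** 2) / 1000.0 - (max_depth - 8) ** 2

def random_individual(rng):
    return tuple(rng.randint(lo, hi) for lo, hi in BOUNDS)

def evaluate_population(population, workers=4):
    """Parameter parallelism: each individual is scored independently.
    On Spark this step would look roughly like
    sc.parallelize(population).map(fitness).collect() (assumed usage)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))

def crossover(a, b, rng):
    # Uniform crossover: pick each gene from one of the two parents.
    return tuple(rng.choice(pair) for pair in zip(a, b))

def mutate(individual, rng, rate=0.2):
    # Resample each gene within its bounds with probability `rate`.
    return tuple(
        rng.randint(lo, hi) if rng.random() < rate else gene
        for gene, (lo, hi) in zip(individual, BOUNDS)
    )

def genetic_search(pop_size=20, generations=15, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = evaluate_population(population)  # the parallel step
        ranked = sorted(zip(scores, population), reverse=True)
        if ranked[0][0] > best_score:
            best_score, best = ranked[0]
        # Keep the better half as parents, refill with mutated offspring.
        parents = [ind for _, ind in ranked[: pop_size // 2]]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return best, best_score

if __name__ == "__main__":
    print(genetic_search())
```

Only `evaluate_population` is distributed; selection, crossover and mutation stay on the driver because they are cheap compared with model training. This contrasts with the data-parallel mechanism named in the abstract, in which the training data rather than the population is what gets partitioned.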