
Regression modeling and multi-objective optimization for small sample scattered data
Cite this article: YAO Yu, HU Tao, FU Jianxun, HU Shunbo. Regression modeling and multi-objective optimization for small sample scattered data[J]. Journal of Shanghai University (Natural Science), 2021, 28(3): 451-462.
Authors: YAO Yu  HU Tao  FU Jianxun  HU Shunbo
Affiliation: 1. School of Computer Engineering & Science, Shanghai University, Shanghai 200444, China; 2. Center for Advanced Solidification Technology (CAST), State Key Laboratory of Advanced Special Steel, School of Materials Science and Engineering, Shanghai University, Shanghai 200444, China; 3. Center of Materials Informatics and Data Science, Materials Genome Institute, Shanghai University, Shanghai 200444, China; 4. Zhejiang Laboratory, Hangzhou 311100, Zhejiang, China
Funding: National Key Research and Development Program of China (2018YFB0704400); Major Science and Technology Special Project of Yunnan Province (202102AB080019-3); Major Science and Technology Special Project of Yunnan Province (202002AB080001-2); Key Research Project of Zhejiang Laboratory (2021PE0AC02); Major Project of the Special Development Fund of the Shanghai Zhangjiang National Independent Innovation Demonstration Zone (ZJ2021-ZD-006)
Abstract: Regression on small-sample scattered data is challenging to model. In this study, a Gaussian process is used for the regression: the hyperparameters of the kernel function are learned by maximum likelihood estimation, and the posterior is used to compute the regression result and to predict the mean and variance of the objective function. Building on this, multi-objective optimization that includes the variance makes it possible to estimate the uncertainty of the design results while performing inverse materials design. The approach is validated experimentally on a 1215MS non-quenched and tempered steel dataset and a three-point bending concrete dataset. For the three-point bending concrete, on average 50% of the experimental data fall within the predicted 95% confidence interval, and the Gaussian process regression (GPR) model measures the uncertainty of scattered small-sample data reasonably well and yields reasonable predictions. For the 1215MS dataset, the elitist non-dominated sorting genetic algorithm (NSGA-Ⅱ) is applied on top of the GPR model for multi-objective optimization. The mechanical properties of the material and their corresponding variances serve as the optimization objectives, so that the effect of uncertainty on the experimental results is taken into account alongside the best mechanical properties. The resulting optimal Pareto solution set provides candidate points for the next experiment, assisting material design and preparation optimization.
Keywords: small sample scattered data; Gaussian process regression; multi-objective optimization; elitist non-dominated sorting genetic algorithm (NSGA-Ⅱ)
Received: 2022-03-18
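The workflow summarized in the abstract can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are made up, the kernel is assumed to be a squared-exponential, maximum likelihood is approximated by a grid search over the length scale only, the variance is assumed to be minimized, and a plain non-dominated filter over a candidate grid stands in for NSGA-Ⅱ. All function names are hypothetical.

```python
import math

def rbf(a, b, length, signal):
    """Squared-exponential kernel for scalar inputs (assumed kernel choice)."""
    return signal * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(L)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit(xs, ys, length, signal, noise):
    """Return (L, alpha, log marginal likelihood) for fixed hyperparameters."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j], length, signal) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    lml = (-0.5 * sum(y * a for y, a in zip(ys, alpha))
           - sum(math.log(L[i][i]) for i in range(n))
           - 0.5 * n * math.log(2 * math.pi))
    return L, alpha, lml

def predict(x, xs, L, alpha, length, signal):
    """Posterior mean and variance of the latent function at x."""
    k = [rbf(x, xi, length, signal) for xi in xs]
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve_lower(L, k)
    return mean, max(signal - sum(vi * vi for vi in v), 0.0)

def pareto_front(points):
    """Indices of points not dominated by any other (all objectives minimized)."""
    def dominates(p, q):
        return (all(a <= b for a, b in zip(p, q))
                and any(a < b for a, b in zip(p, q)))
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]

# Toy data, made up for illustration (not from the paper).
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0.1, 0.6, 0.8, 1.1, 0.9, 0.5, 0.2]
signal, noise = 1.0, 0.05  # held fixed here; only the length scale is tuned

# Pick the length scale with the highest log marginal likelihood.
grid = [0.2, 0.5, 1.0, 2.0]
best_length = max(grid, key=lambda l: fit(xs, ys, l, signal, noise)[2])
L, alpha, _ = fit(xs, ys, best_length, signal, noise)

# Fraction of observations inside the 95% predictive interval.
train_preds = [predict(x, xs, L, alpha, best_length, signal) for x in xs]
coverage = sum(abs(y - m) <= 1.96 * math.sqrt(v + noise)
               for y, (m, v) in zip(ys, train_preds)) / len(ys)

# Two objectives per candidate: maximize the mean (minimize -mean), minimize the variance.
candidates = [i * 0.25 for i in range(13)]
preds = [predict(x, xs, L, alpha, best_length, signal) for x in candidates]
front = pareto_front([(-m, v) for m, v in preds])
```

Here `front` holds the indices of candidate inputs that trade off a high predicted mean against a low predicted variance, the role the Pareto solution set plays as candidate points for the next experiment. A full NSGA-Ⅱ would search a continuous design space with crossover, mutation, and crowding-distance selection rather than filtering a fixed grid.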