
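Analysis of the bias of the contrastive divergence algorithm in deep learning
Cite as:Wu Haijia, Zhang Xiongwei, Sun Meng, Yang Jibin. Analysis of the bias of the contrastive divergence algorithm in deep learning[J]. Journal of PLA University of Science and Technology, 2015(3):224-230.
Authors:Wu Haijia  Zhang Xiongwei  Sun Meng  Yang Jibin
Institution:College of Command Information Systems, PLA University of Science and Technology, Nanjing 210007, Jiangsu, China
Funding:National Natural Science Foundation of China (NSFC61471394, NSFC61402519); Natural Science Foundation of Jiangsu Province (BK2012510); Jiangsu Province Youth Fund (BK20140071, BK20140074)
Abstract:To provide theoretical guidance for further optimizing the contrastive divergence (CD) algorithm, this paper attempts a theoretical analysis of its convergence. Starting from a Boltzmann machine with only 4 nodes, a simplex is used to represent the probability space of the model, and a manifold is used to represent the relationship between the probability space and the model parameters. Together they give a visual picture of how the CD algorithm and the maximum likelihood algorithm converge. It is then derived theoretically that the set difference between the convergence set of the CD algorithm and that of the maximum likelihood algorithm is non-empty, which proves that the CD algorithm is biased. Based on this conclusion, a training strategy is designed that first pre-trains with the CD algorithm and then fine-tunes with the maximum likelihood algorithm. Experimental results show that, for the same convergence quality, this strategy reduces the number of training iterations by 83.3%.

Keywords:deep learning  contrastive divergence  restricted Boltzmann machine  maximum likelihood estimation
Received:7/5/2014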

Biasness of contrastive divergence in deep learning
WU Haijia, ZHANG Xiongwei, SUN Meng and YANG Jibin. Biasness of contrastive divergence in deep learning[J]. Journal of PLA University of Science and Technology (Natural Science Edition), 2015(3):224-230.
Authors:WU Haijia  ZHANG Xiongwei  SUN Meng and YANG Jibin
Institution:College of Command Information Systems, PLA Univ. of Sci. & Tech., Nanjing 210007, China
Abstract:The convergence properties of the contrastive divergence (CD) algorithm were investigated theoretically, with the aim of providing guidance for optimizing the algorithm. A simplex was used to represent the probability space of the model, and a manifold was used to represent the relationship between the probability space and the parameters of the model. Together they make the convergence process visible. For a Boltzmann machine with only 4 nodes, comparison with standard maximum likelihood estimation (MLE) shows that the set of points to which CD converges is not contained in the set to which MLE converges, so the CD algorithm is biased. Based on this conclusion, a new training strategy of CD pre-training followed by MLE fine-tuning was designed. The experimental results show that, for the same convergence quality, the new strategy reduces the number of training iterations by 83.3%.
Keywords:deep learning  CD  restricted Boltzmann machine(RBM)  MLE
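
The comparison described in the abstract can be tried on a toy model small enough to enumerate exactly. The following is a minimal sketch, not the authors' code. It assumes the 4 nodes form a restricted Boltzmann machine with 3 visible units and 1 hidden unit (the abstract does not give the split), and it uses a randomly drawn target distribution, a learning rate of 0.1 and 2000 steps per stage, all chosen only for illustration. The expected CD-k update is computed exactly by enumeration instead of by sampling, so the run is deterministic. All function names are hypothetical, and the sketch does not attempt to reproduce the 83.3% figure.

```python
import itertools
import numpy as np

# Assumed split of the 4 nodes; the abstract does not specify it.
N_VIS, N_HID = 3, 1

# All binary configurations of the visible and hidden layers.
V = np.array(list(itertools.product([0, 1], repeat=N_VIS)), dtype=float)  # (8, 3)
H = np.array(list(itertools.product([0, 1], repeat=N_HID)), dtype=float)  # (2, 1)


def joint(W, b, c):
    """Exact p(v, h) by enumeration; rows index v, columns index h."""
    neg_energy = V @ W @ H.T + (V @ b)[:, None] + (H @ c)[None, :]
    p = np.exp(neg_energy)
    return p / p.sum()


def negative_phase(q, W, b, c, k):
    """Visible distribution used in the negative phase.

    k=None gives the model marginal (exact maximum likelihood);
    an integer k gives the data distribution after k Gibbs steps (CD-k).
    """
    p = joint(W, b, c)
    if k is None:
        return p.sum(axis=1)
    p_h_given_v = p / p.sum(axis=1, keepdims=True)  # each row sums to 1
    p_v_given_h = p / p.sum(axis=0, keepdims=True)  # each column sums to 1
    step = p_h_given_v @ p_v_given_h.T              # one v -> h -> v transition
    return q @ np.linalg.matrix_power(step, k)


def gradient(q_pos, q_neg, W, c):
    """Positive-phase minus negative-phase expected statistics."""
    p_h = 1.0 / (1.0 + np.exp(-(V @ W + c)))  # p(h_j = 1 | v) for every v
    d = (q_pos - q_neg)[:, None]
    return V.T @ (d * p_h), (d * V).sum(axis=0), (d * p_h).sum(axis=0)


def train(q, params, k, steps, lr=0.1):
    """Gradient ascent with the expected CD-k (or exact MLE) update."""
    W, b, c = (x.copy() for x in params)
    for _ in range(steps):
        gW, gb, gc = gradient(q, negative_phase(q, W, b, c, k), W, c)
        W += lr * gW
        b += lr * gb
        c += lr * gc
    return W, b, c


def kl(q, W, b, c):
    """KL divergence from the target q to the model's visible marginal."""
    p_v = joint(W, b, c).sum(axis=1)
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p_v[mask])))


rng = np.random.default_rng(0)
q = rng.dirichlet(np.ones(len(V)))  # illustrative target distribution
init = (0.01 * rng.standard_normal((N_VIS, N_HID)),
        np.zeros(N_VIS), np.zeros(N_HID))

cd_params = train(q, init, k=1, steps=2000)            # CD-1 only
mle_params = train(q, init, k=None, steps=2000)        # exact MLE only
two_stage = train(q, cd_params, k=None, steps=2000)    # CD-1, then MLE

print("KL after CD-1:        ", kl(q, *cd_params))
print("KL after MLE:         ", kl(q, *mle_params))
print("KL after CD-1 then MLE:", kl(q, *two_stage))
```

If CD-1 and MLE settle at different KL values on this toy problem, that gap is one way to see the kind of bias the abstract refers to. The third run mirrors the proposed strategy of CD pre-training followed by MLE fine-tuning.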