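Research on Imbalanced Sentiment Classification Based on Denoising Autoencoders
Cite this article: Qin Sheng-jun, Lu Zhi-ping. Research on Imbalanced Sentiment Classification Based on Denoising Autoencoders[J]. Science Technology and Engineering, 2014, 14(12): 238-241.
Authors: Qin Sheng-jun, Lu Zhi-ping
Affiliation: Guangxi University of Science and Technology (both authors)
Funding: Research on the Integration of Industrialization and Informatization in Underdeveloped Regions and Its System Dynamics Mechanism (11FJL007); Humanities and Social Sciences Research Project of the Guangxi Department of Education (SK13YB069)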
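Abstract: At present, most research on sentiment classification of online reviews deals with imbalanced sample data: positive samples generally far outnumber negative ones, and classifying such imbalanced sets tends to produce large errors on the minority class. Moreover, because online reviews vary widely in form of expression, large amounts of labeled data are hard to obtain. To address these problems, this paper studies unsupervised sentiment classification of imbalanced online reviews. First, the denoising autoencoder is modified to raise the feature values of the minority class and keep classified samples from drifting toward the majority class. The resulting feature values are then used as input to the k-means algorithm, achieving unsupervised sample classification. Experiments show that the algorithm adapts well to samples with a high imbalance ratio, which verifies its effectiveness.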

Keywords: Sentiment Classification    Deep Learning    Denoising Autoencoder    Imbalanced Data
Received: 2013/11/6
Revised: 2013/11/20

Research of Unbalance Sentiment Classification Based on Denoising Autoencoders
Qin Sheng-jun and Lu Zhi-ping. Research of Unbalance Sentiment Classification Based on Denoising Autoencoders[J]. Science Technology and Engineering, 2014, 14(12): 238-241.
Authors: Qin Sheng-jun and Lu Zhi-ping
Abstract: Currently, most studies of sentiment classification for online reviews work with imbalanced sample data, in which positive samples generally far outnumber negative ones. Classifying such imbalanced samples is prone to large errors on the minority class. In addition, because online reviews vary widely in expression, it is difficult to obtain large amounts of labeled data. To address these problems, this article studies unsupervised sentiment classification of imbalanced web reviews. First, the denoising autoencoder is modified to increase the feature values of the minority class, so that classified samples do not shift toward the majority class. The resulting feature values are then fed to the k-means algorithm as input to achieve unsupervised classification. Experimental results show that the algorithm adapts well to sample data with a high imbalance ratio, which verifies its effectiveness.
Keywords: Sentiment Classification    Deep Learning    Denoising Autoencoder    Imbalanced Data
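
The two-stage pipeline the abstract describes (learn features with a denoising autoencoder, then cluster them with k-means) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how the denoising autoencoder is modified for the minority class, so the sketch uses a plain one-hidden-layer denoising autoencoder with masking noise and does not reproduce the authors' modification. The synthetic 90/10 data, the layer size, noise rate, learning rate, and the names train_dae, encode, and kmeans are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_dae(X, n_hidden=8, noise=0.3, lr=0.5, epochs=200, seed=0):
    """Fit a plain one-hidden-layer denoising autoencoder (assumed architecture).

    Inputs are corrupted with masking noise and the network is trained by
    full-batch gradient descent to reconstruct the clean inputs.
    Returns the encoder parameters (W1, b1).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        mask = rng.random(X.shape) > noise      # zero out ~`noise` of the entries
        Xn = X * mask
        H = sigmoid(Xn @ W1 + b1)               # hidden code
        R = sigmoid(H @ W2 + b2)                # reconstruction
        dR = (R - X) * R * (1.0 - R)            # squared-error gradient at output
        dH = (dR @ W2.T) * H * (1.0 - H)        # backpropagated to hidden layer
        W2 -= lr * (H.T @ dR) / n
        b2 -= lr * dR.mean(axis=0)
        W1 -= lr * (Xn.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)
    return W1, b1

def encode(X, W1, b1):
    """Map clean inputs to hidden-layer feature values."""
    return sigmoid(X @ W1 + b1)

def kmeans(F, k=2, iters=50, seed=0):
    """Basic Lloyd's k-means; returns one cluster index per row of F."""
    rng = np.random.default_rng(seed)
    centers = F[rng.choice(len(F), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):             # keep old center if cluster is empty
                centers[j] = F[labels == j].mean(axis=0)
    return labels

# Synthetic imbalanced data (hypothetical): 90 "positive" and 10 "negative"
# binary bag-of-words vectors over a 20-word vocabulary.
rng = np.random.default_rng(42)
p_pos = np.r_[np.full(10, 0.8), np.full(10, 0.1)]
p_neg = np.r_[np.full(10, 0.1), np.full(10, 0.8)]
X = np.vstack([(rng.random((90, 20)) < p_pos).astype(float),
               (rng.random((10, 20)) < p_neg).astype(float)])

W1, b1 = train_dae(X)
features = encode(X, W1, b1)        # stage 1: unsupervised feature values
labels = kmeans(features, k=2)      # stage 2: unsupervised two-way split
```

The cluster indices in labels carry no polarity on their own; deciding which cluster corresponds to positive or negative sentiment would need a separate step that the abstract does not describe.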
This article is indexed in CNKI and other databases.