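Chinese Explanatory Opinionated Sentence Recognition Based on Auto-Encoding Features (基于自动编码特征的汉语解释性意见句识别)
Cite this article: HE Yu, PAN Da, FU Guohong. Chinese Explanatory Opinionated Sentence Recognition Based on Auto-Encoding Features [J]. Acta Scientiarum Naturalium Universitatis Pekinensis (北京大学学报(自然科学版)), 2015, 51(2): 234-240.
Authors: HE Yu (贺宇), PAN Da (潘达), FU Guohong (付国宏)
Institution: School of Computer Science and Technology, Heilongjiang University (黑龙江大学计算机科学技术学院), Harbin 150080
Funding: Supported by the National Natural Science Foundation of China (61170148, 60973081); the Science and Technology Activities Program for Returned Overseas Scholars of the Heilongjiang Provincial Department of Human Resources and Social Security; and the Harbin Special Research Fund for Scientific and Technological Innovation Talents (2009RFLXG007)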
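Abstract: This paper proposes a classification method for Chinese explanatory opinionated sentence recognition based on auto-encoding features. First, an explanatory opinion corpus is constructed from product reviews in two domains, cars and mobile phones, and explanatory opinionated sentence recognition is then treated as a classification task. In particular, auto-encoding is used to represent and learn the word-vector features for explanatory opinionated sentence classification. Finally, within a support vector machine framework, the dimension of the explanatory word vectors is selected experimentally, and the approach is compared with several traditional feature representations. Experimental results show that, compared with the traditional chi-square, information gain, and TF-IDF methods and their combinations, introducing auto-encoding features effectively improves the performance of Chinese explanatory opinionated sentence recognition.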

Keywords: opinion mining; explanatory opinionated sentence recognition; auto-encoding
Received: 2014-06-28

Chinese Explanatory Opinionated Sentence Recognition Based on Auto-Encoding Features
HE Yu, PAN Da, FU Guohong. Chinese Explanatory Opinionated Sentence Recognition Based on Auto-Encoding Features [J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2015, 51(2): 234-240.
Authors: HE Yu, PAN Da, FU Guohong
Institution: School of Computer Science and Technology, Heilongjiang University, Harbin 150080
Abstract: This paper presents a classification method for Chinese explanatory opinionated sentence recognition based on auto-encoding features. First, an explanatory opinion corpus is built from online product reviews in the cellphone and car domains. Word embeddings are then learned from the product reviews using an auto-encoding technique. Finally, the learned word embeddings are used as features for explanatory opinionated sentence classification within a support vector machine framework. Experimental results show that word embeddings are more effective than traditional feature representations such as chi-square, TF-IDF, and information gain for explanatory opinionated sentence classification.
Keywords: opinion mining; explanatory opinionated sentence recognition; auto-encoding
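
The abstract names the pipeline (learn word vectors by auto-encoding, then use them as sentence-classification features) but gives no architecture, input representation, or vector dimension. The sketch below is therefore only an illustration under stated assumptions, not the authors' implementation: it assumes each word is described by a row-normalised co-occurrence vector, uses a single sigmoid hidden layer trained on squared reconstruction error, and averages the hidden codes of a sentence's words to get its feature vector. The names `cooccurrence_rows`, `TinyAutoencoder`, `sentence_vector`, and `DIM`, as well as the English toy tokens standing in for segmented Chinese review text, are all hypothetical. The SVM step is omitted; the resulting `features` would be passed to any SVM trainer.

```python
import math
import random


def cooccurrence_rows(sentences):
    """Return vocabulary, word index, and one row-normalised co-occurrence vector per word."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    rows = [[0.0] * len(vocab) for _ in vocab]
    for s in sentences:
        for a in s:
            for b in s:
                if a != b:
                    rows[index[a]][index[b]] += 1.0
    for row in rows:
        total = sum(row) or 1.0  # guard against words that never co-occur
        for j in range(len(row)):
            row[j] /= total
    return vocab, index, rows


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyAutoencoder:
    """One sigmoid hidden layer, linear output, squared reconstruction error (assumed design)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def decode(self, h):
        return [sum(w * v for w, v in zip(row, h)) + b
                for row, b in zip(self.w2, self.b2)]

    def loss(self, data):
        """Mean squared reconstruction error over all input vectors."""
        total = 0.0
        for x in data:
            y = self.decode(self.encode(x))
            total += sum((a - b) ** 2 for a, b in zip(y, x))
        return total / len(data)

    def train_step(self, x, lr):
        """One stochastic gradient step on 0.5 * ||decode(encode(x)) - x||^2."""
        h = self.encode(x)
        y = self.decode(h)
        err = [a - b for a, b in zip(y, x)]
        # Back-propagate into the hidden layer before the decoder weights change.
        dh = [sum(self.w2[i][j] * err[i] for i in range(len(err))) * h[j] * (1.0 - h[j])
              for j in range(len(h))]
        for i, e in enumerate(err):
            for j, hj in enumerate(h):
                self.w2[i][j] -= lr * e * hj
            self.b2[i] -= lr * e
        for j, d in enumerate(dh):
            for k, xk in enumerate(x):
                self.w1[j][k] -= lr * d * xk
            self.b1[j] -= lr * d


def sentence_vector(sentence, index, codes, dim):
    """Average the learned codes of the known words in a sentence."""
    known = [codes[index[w]] for w in sentence if w in index]
    if not known:
        return [0.0] * dim
    return [sum(c[j] for c in known) / len(known) for j in range(dim)]


# Hypothetical toy corpus; real input would be segmented product reviews.
sentences = [
    ["screen", "is", "sharp", "because", "resolution", "is", "high"],
    ["engine", "is", "noisy", "because", "insulation", "is", "poor"],
    ["screen", "is", "good"],
    ["engine", "is", "bad"],
]
DIM = 4  # word-vector dimension; the paper tunes this experimentally

vocab, index, rows = cooccurrence_rows(sentences)
model = TinyAutoencoder(len(vocab), DIM)
loss_before = model.loss(rows)
for _ in range(200):
    for x in rows:
        model.train_step(x, lr=0.1)
loss_after = model.loss(rows)

codes = [model.encode(x) for x in rows]
features = [sentence_vector(s, index, codes, DIM) for s in sentences]
```

Each entry of `features` is a fixed-length vector regardless of sentence length, which is what lets a standard classifier consume it. Changing `DIM` corresponds to the word-vector dimension that the abstract says is selected by experiment.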
This article is indexed in CNKI, Wanfang Data, and other databases.