
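Unsupervised Text Sentiment Analysis Based on a Topic and Sentiment Unification Model
Cite this article:SUN Yan, ZHOU Xueguang, FU Wei. Unsupervised Topic and Sentiment Unification Model for Sentiment Analysis[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2013, 49(1): 102-108
Authors:SUN Yan  ZHOU Xueguang  FU Wei
Affiliation:Department of Information Security, Naval University of Engineering, Wuhan 430033;
Funding:Supported by the National Natural Science Foundation of China (61100042)
Abstract:Supervised and semi-supervised text sentiment analysis suffers from the difficulty of obtaining labeled samples. To address this, a sentiment model is incorporated into the LDA model to give an unsupervised topic and sentiment unification model (UTSU model). The UTSU model samples a sentiment label for each sentence and a topic label for each word. Without any labeled samples, it obtains the sentiment words of each topic and uses them to classify the sentiment of a document collection. Comparative sentiment classification experiments show that the UTSU model performs slightly worse than supervised sentiment classification methods but best among the unsupervised methods: its overall sentiment classification metric is about 2% higher than that of the ASUM model and about 16% higher than that of the JST model.

Keywords:topic model  LDA  sentiment analysis  unification model  
Received:2012-06-02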

Unsupervised Topic and Sentiment Unification Model for Sentiment Analysis
SUN Yan,ZHOU Xueguang,FU Wei. Unsupervised Topic and Sentiment Unification Model for Sentiment Analysis[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2013, 49(1): 102-108
Authors:SUN Yan  ZHOU Xueguang  FU Wei
Affiliation:Department of Information Security, Naval University of Engineering, Wuhan 430033;
Abstract:Supervised and semi-supervised sentiment classification methods need labeled corpora for classifier training, and such corpora are hard to obtain. To solve this problem, an unsupervised topic and sentiment unification model (UTSU model) is proposed based on the LDA model. The UTSU model imposes the constraint that all words in a sentence are generated from one sentiment, while each word is generated from one topic. This constraint conforms to how sentiment is expressed in language and does not limit the topic relations among words. The UTSU model is completely unsupervised: it needs neither labeled corpora nor sentiment seed words. Sentiment classification experiments show that the UTSU model comes close to supervised classification methods and outperforms other topic and sentiment unification models, improving the F1 score of sentiment classification by about 2% over the ASUM model and by about 16% over the JST model.
Keywords:topic model  latent Dirichlet allocation (LDA)  sentiment analysis  unification model  
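
The constraint stated in the abstract (one sentiment per sentence, one topic per word) can be illustrated with a minimal generative sketch. This is not the authors' implementation: the abstract does not give the priors or the exact dependency structure, so the symmetric Dirichlet priors, the hyperparameter names `alpha`, `beta`, `gamma`, the per-sentiment topic distributions `theta`, and the function names below are all assumptions made for illustration.

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw from a symmetric Dirichlet by normalizing Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw one index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(sentence_lengths, n_sentiments, n_topics, vocab_size,
                      alpha=0.5, beta=0.5, gamma=1.0, seed=0):
    """Generate one synthetic document as [(sentiment, [(topic, word), ...]), ...].

    All distributions here are assumed for illustration only.
    """
    rng = random.Random(seed)
    # Assumed: one word distribution per (sentiment, topic) pair. In a full
    # model these would be shared across the corpus; this sketch draws them
    # once for a single document.
    phi = [[sample_dirichlet(beta, vocab_size, rng) for _ in range(n_topics)]
           for _ in range(n_sentiments)]
    # Assumed: a per-document distribution over sentiments.
    pi = sample_dirichlet(gamma, n_sentiments, rng)
    # Assumed: a per-document topic distribution for each sentiment.
    theta = [sample_dirichlet(alpha, n_topics, rng) for _ in range(n_sentiments)]

    document = []
    for length in sentence_lengths:
        sentiment = sample_index(pi, rng)  # one sentiment for the whole sentence
        words = []
        for _ in range(length):
            topic = sample_index(theta[sentiment], rng)  # one topic per word
            word = sample_index(phi[sentiment][topic], rng)
            words.append((topic, word))
        document.append((sentiment, words))
    return document
```

Calling `generate_document([3, 2, 4], n_sentiments=2, n_topics=3, vocab_size=10)` returns three sentences, each carrying a single sentiment label while its words may belong to different topics. Inference in a model of this kind would run in the opposite direction, estimating the hidden sentiment and topic labels from observed words.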