Research on document representation based on word co-occurrence model
Original title: 基于共现词对的文档表示方法研究
Citation: SHI Ke, XUAN Guo-qing. Research on document representation based on word co-occurrence model[J]. Journal of Fuyang Teachers College: Natural Science, 2012, 29(4): 60-63, 77
Authors: SHI Ke (史科), XUAN Guo-qing (宣国庆)
Affiliations: 1. Shengzhi Branch School of Anhui Open University, Hefei, Anhui 230001, China; 2. Hefei Luyang Middle School, Hefei, Anhui 230041, China
Abstract: This article proposes a new document representation model, a vector space model based on word co-occurrence (VSMBWC). The model takes the word pairs that co-occur in a document as its basic unit of analysis and uses statistical feature selection to choose representative word pairs to represent the document. Text classification experiments based on the cross-cover algorithm show that the model represents documents effectively, offering a new approach to automated text processing.

Keywords: word co-occurrence; document representation; vector space model (VSM); feature selection
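
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how co-occurrence is scoped or which statistic selects the pairs, so this example assumes document-level co-occurrence and ranks pairs by document frequency; the function names `word_pairs`, `select_pairs`, and `to_vector` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def word_pairs(tokens):
    """Unordered pairs of distinct words that co-occur in one document.

    Assumption: "co-occurrence" means appearing in the same document.
    """
    return set(combinations(sorted(set(tokens)), 2))


def select_pairs(docs, k):
    """Keep the k pairs with the highest document frequency.

    Assumption: document frequency stands in for the paper's
    unspecified statistical selection criterion.
    """
    df = Counter()
    for tokens in docs:
        df.update(word_pairs(tokens))
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(df.items(), key=lambda item: (-item[1], item[0]))
    return [pair for pair, _ in ranked[:k]]


def to_vector(tokens, features):
    """Binary vector: 1 if the selected pair co-occurs in the document."""
    present = word_pairs(tokens)
    return [1 if pair in present else 0 for pair in features]


docs = [
    ["text", "classification", "model"],
    ["text", "classification", "vector"],
    ["vector", "space", "model"],
]
features = select_pairs(docs, k=3)
vectors = [to_vector(tokens, features) for tokens in docs]
```

Each document becomes a vector over the selected word pairs instead of over single words, and those vectors could then be passed to any classifier. The weighting here is binary for simplicity; a frequency-based weight would be an equally plausible choice.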
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.