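Similarity Calculation for Chinese Long Sentences Based on Fine-Grained Dependency Relations
Cite this article:Wang Jipeng, Wei Moji. Similarity Calculation for Chinese Long Sentences Based on Fine-Grained Dependency Relations[J]. Science Technology and Engineering, 2017, 17(11).
Authors:Wang Jipeng, Wei Moji
Institution:Anyang Normal University; Information Research Institute of Shandong Academy of Sciences
Funding:Supported by the National Natural Science Foundation of China (U1504612), the Henan Province University Innovation Talent Program (15HASTIT023), and the Henan Province Science and Technology Research Project (132102210264)
Abstract:Long sentences are common in written Chinese, and their complex structure makes sentence similarity hard to compute. Taking the key elements of dependency relations into account, this paper studies and analyzes Chinese dependency syntax trees and proposes a similarity calculation method based on fine-grained dependency relations. Using several features of each node in the dependency syntax tree (its word, its part of speech, the dependency relations between nodes, and their importance weights), the paper gives a method for computing the similarity of two dependency syntax trees, and on that basis implements similarity calculation for Chinese long sentences. Experimental results show that the method achieves higher accuracy on Chinese long sentences than other algorithms.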

Keywords:natural language processing  sentence similarity  dependency syntax  HowNet
Received:2016/10/15
Revised:2016/11/21

Chinese Long Sentences Similarity Calculation based on Fine-grained Dependency Syntax
Wang Jipeng and Wei Moji. Chinese Long Sentences Similarity Calculation based on Fine-grained Dependency Syntax[J]. Science Technology and Engineering, 2017, 17(11).
Authors:Wang Jipeng and Wei Moji
Institution:Anyang Normal University,Information Research Institute of Shandong Academy of Sciences
Abstract:Long sentences are common in written Chinese, and their complex structure makes sentence similarity difficult to compute. This paper analyzes Chinese dependency syntax trees and proposes a multi-feature fusion method for computing the similarity between them. Based on the structure of the dependency syntax tree, the method considers the word at each node, its part of speech, and the dependency relations between words. The similarity between two dependency syntax trees is then computed by combining these features according to the weights of the dependency relations. Similarity calculation for Chinese long sentences is implemented on the basis of this method. Experimental results show that the method achieves higher accuracy than the other methods compared.
Keywords:NLP  sentence similarity  dependency syntax  HowNet
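
The abstract names the features the method combines (node words, parts of speech, dependency relations, and relation weights) but gives no formulas. The following Python sketch is only one plausible way to combine such features, not the authors' algorithm. Everything in it is an assumption made for illustration: the `Arc` representation, the relation labels and their values in `RELATION_WEIGHT`, the feature weights in `arc_sim`, the best-match aggregation in `directed_sim`, and the exact-match `word_sim`, which stands in for a HowNet-based word similarity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One dependency relation: head word -> dependent word."""
    head: str
    head_pos: str
    relation: str
    dep: str
    dep_pos: str


# Hypothetical importance weights per relation label (illustrative values only).
RELATION_WEIGHT = {"HED": 1.0, "SBV": 0.9, "VOB": 0.9, "ATT": 0.5, "ADV": 0.5}
DEFAULT_WEIGHT = 0.3


def word_sim(a: str, b: str) -> float:
    """Placeholder word similarity: exact match only.

    A semantic measure (for example one based on HowNet) would go here.
    """
    return 1.0 if a == b else 0.0


def arc_sim(x: Arc, y: Arc, w_word=0.6, w_pos=0.2, w_rel=0.2) -> float:
    """Weighted mix of word, part-of-speech, and relation agreement."""
    words = (word_sim(x.head, y.head) + word_sim(x.dep, y.dep)) / 2
    pos = ((x.head_pos == y.head_pos) + (x.dep_pos == y.dep_pos)) / 2
    rel = 1.0 if x.relation == y.relation else 0.0
    return w_word * words + w_pos * pos + w_rel * rel


def directed_sim(src: list, dst: list) -> float:
    """Relation-weighted average of each src arc's best match in dst."""
    if not src or not dst:
        return 0.0
    total = 0.0
    weight_sum = 0.0
    for arc in src:
        w = RELATION_WEIGHT.get(arc.relation, DEFAULT_WEIGHT)
        total += w * max(arc_sim(arc, other) for other in dst)
        weight_sum += w
    return total / weight_sum


def tree_sim(a: list, b: list) -> float:
    """Symmetric similarity between two trees given as lists of arcs."""
    return (directed_sim(a, b) + directed_sim(b, a)) / 2


if __name__ == "__main__":
    # Two toy parses that differ only in the object noun.
    s1 = [Arc("喜欢", "v", "SBV", "我", "r"), Arc("喜欢", "v", "VOB", "苹果", "n")]
    s2 = [Arc("喜欢", "v", "SBV", "我", "r"), Arc("喜欢", "v", "VOB", "香蕉", "n")]
    print(tree_sim(s1, s2))
```

In this sketch the arcs would come from a dependency parser, which is not shown. With exact-match `word_sim`, the two toy sentences differ only on the object arc, so replacing `word_sim` with a semantic measure is where related words such as 苹果 and 香蕉 would be credited.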
This article is indexed in CNKI and other databases.