
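Semantic Search on Non-Factoid Questions for Domain-Specific Question Answering Systems
Cite this article:QIU Yu,CHENG Li,Daniyal Alghazzawi.Semantic Search on Non-Factoid Questions for Domain-Specific Question Answering Systems[J].Acta Scientiarum Naturalium Universitatis Pekinensis,2019,55(1):55-64.
Authors:QIU Yu  CHENG Li  Daniyal Alghazzawi
Institution:1. Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011 2. University of Chinese Academy of Sciences, Beijing 100049 3. Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011 4. Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21493
Funding:Supported by the "West Light" Talent Training Program of the Chinese Academy of Sciences (2017-XBZG-BR-001), the national "Thousand Talents Program" (Y32H251201), and the Director's Fund of the Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences (2015RC007)
Abstract:For non-factoid questions in the finance and taxation domain, a method based on semantic retrieval is proposed to extract answers. First, a domain knowledge base is used to semantically annotate questions and domain documents, and semantic similarity features are introduced to improve the retrieval accuracy for regulations and cases. Second, a learning-to-rank algorithm fuses multiple features of the domain text to optimize the regulation retrieval results. Finally, regulation features are used to filter the case retrieval results, and the corresponding answers are extracted from similar cases. Tests on a real dataset show that the method significantly improves on the baseline methods in both accuracy and efficiency.

Keywords:question answering system  non-factoid question  domain knowledge base  semantic search  learning to rank  
Received:2018-06-29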

Semantic Search on Non-Factoid Questions for Domain-Specific Question Answering Systems
QIU Yu,CHENG Li,Daniyal Alghazzawi.Semantic Search on Non-Factoid Questions for Domain-Specific Question Answering Systems[J].Acta Scientiarum Naturalium Universitatis Pekinensis,2019,55(1):55-64.
Authors:QIU Yu  CHENG Li  Daniyal Alghazzawi
Institution:1. Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011
2. University of Chinese Academy of Sciences, Beijing 100049
3. Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011
4. Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21493
Abstract:A semantic-based retrieval method was proposed to extract answer sentences from tax regulations and cases. Firstly, a domain knowledge base was employed to generate semantic annotations for questions, regulations and cases. Secondly, a filtering system was developed to remove irrelevant cases from the answer candidates. In addition, a semantic similarity measure was employed for answer extraction. Finally, a ranking model was proposed to optimize the retrieved results. To validate the proposed method, a series of experiments was performed on a real-life dataset. Experimental results show a noticeable improvement in accuracy and efficiency over the baseline methods.
Keywords:question answering system  non-factoid question  domain knowledge base  semantic search  learning to rank  
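
The pipeline summarized in the abstract (semantic annotation, similarity-based retrieval, ranking, case filtering) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge base `KB`, the function names, the Jaccard similarity, and the fixed weights `w_sem` and `w_lex` are all assumptions made for illustration. In particular, the fixed weighted sum only stands in for the learning-to-rank model, which the paper trains on data.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical toy knowledge base: surface term -> domain concept.
KB: Dict[str, str] = {
    "vat": "ValueAddedTax",
    "invoice": "Invoice",
    "deduction": "TaxDeduction",
    "income tax": "IncomeTax",
}


def annotate(text: str) -> Set[str]:
    """Semantic annotation: concepts whose terms occur as whole words."""
    lowered = text.lower()
    return {
        concept
        for term, concept in KB.items()
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    }


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(
    question: str,
    docs: Dict[str, str],
    w_sem: float = 0.7,  # assumed weight, not learned
    w_lex: float = 0.3,  # assumed weight, not learned
) -> List[Tuple[str, float]]:
    """Score documents by a weighted sum of semantic and lexical similarity."""
    q_concepts = annotate(question)
    q_tokens = set(question.lower().split())
    scored = []
    for doc_id, text in docs.items():
        semantic = jaccard(q_concepts, annotate(text))
        lexical = jaccard(q_tokens, set(text.lower().split()))
        scored.append((doc_id, w_sem * semantic + w_lex * lexical))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def filter_cases(cases: Dict[str, Set[str]], top_regs: Set[str]) -> List[str]:
    """Keep only cases that cite at least one top-ranked regulation."""
    return [case_id for case_id, cited in cases.items() if cited & top_regs]
```

Calling `rank` on a question and a dictionary of regulation texts returns regulation ids ordered by score; the top ids can then be passed to `filter_cases` to discard cases that cite none of them. A real system would replace the fixed weights with a trained ranking model and the dictionary lookup with a full domain knowledge base.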
This article is indexed in CNKI and other databases.