
Uyghur Retrieval Word Weighting Scheme Using Relevance Feedback and Document Similarity
Cite as: YU Li, YASEN·AIZEZI. Uyghur Retrieval Word Weighting Scheme Using Relevance Feedback and Document Similarity[J]. Journal of Huaqiao University (Natural Science), 2017, 0(3): 408-413. DOI: 10.11830/ISSN.1000-5013.201703022
Authors: YU Li, YASEN·AIZEZI
Affiliation: Department of Information Security Engineering, Xinjiang Police College, Urumqi 830011, China
Abstract: To address the problem of effectively retrieving Uyghur web documents, a retrieval word weighting scheme based on relevance feedback and document similarity is proposed. First, the Uyghur documents are preprocessed to obtain the corresponding stem set. Then, when the user enters multiple retrieval words, an initial search is executed and the top N ranked documents are extracted, following the idea of local relevance feedback. Next, the TF-IDF algorithm is used to compute the term-frequency similarity between the retrieval words and the feedback documents, the cosine distance is used to compute the similarity between documents, and the retrieval words are weighted twice on this basis. Finally, document retrieval is performed with the weighted retrieval words. Experimental results show that the proposed method accurately retrieves the documents the user needs and ranks them near the top.

Keywords: Uyghur; document retrieval; retrieval word weighting; relevance feedback; document similarity
This article is indexed in CNKI and other databases.
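
The pipeline summarized in the abstract (initial retrieval, top-N feedback documents, TF-IDF weighting, cosine-based document similarity, re-retrieval) can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the smoothed IDF, the way the two weighting passes are combined, and the names `tf_idf_vectors`, `cosine`, `rank`, and `reweight_query` are hypothetical and are not taken from the paper. Documents are assumed to be already stemmed, and the stems shown are placeholders.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {stem: tf-idf} dict per document; each doc is a list of stems."""
    n = len(docs)
    df = Counter(stem for doc in docs for stem in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1.
        vectors.append({
            stem: (count / len(doc)) * (math.log((1 + n) / (1 + df[stem])) + 1)
            for stem, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(stem, 0.0) for stem, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(vectors, term_weights):
    """Return document indices sorted by weighted TF-IDF score, best first."""
    scores = [
        sum(wt * vec.get(term, 0.0) for term, wt in term_weights.items())
        for vec in vectors
    ]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)


def reweight_query(docs, query, top_n=2):
    """Weight the query terms twice using the top-N feedback documents."""
    vectors = tf_idf_vectors(docs)
    # Initial retrieval: every retrieval word starts with weight 1.
    feedback = rank(vectors, {term: 1.0 for term in query})[:top_n]
    if not feedback:
        return {term: 1.0 for term in query}, []

    # Assumption: a feedback document counts more in the second pass when it
    # is similar (by cosine) to the other feedback documents.
    coherence = {}
    for i in feedback:
        sims = [cosine(vectors[i], vectors[j]) for j in feedback if j != i]
        coherence[i] = sum(sims) / len(sims) if sims else 1.0

    weights = {}
    for term in query:
        # First pass: mean TF-IDF of the term over the feedback documents.
        first = sum(vectors[i].get(term, 0.0) for i in feedback) / len(feedback)
        # Second pass: the same mean, scaled by document similarity.
        second = sum(
            coherence[i] * vectors[i].get(term, 0.0) for i in feedback
        ) / len(feedback)
        weights[term] = 1.0 + first + second

    # Final retrieval with the reweighted retrieval words.
    return weights, rank(vectors, weights)


# Placeholder stems standing in for a preprocessed Uyghur corpus.
docs = [["a", "b", "a", "c"], ["a", "d"], ["e", "f"], ["b", "b", "c"]]
weights, ranking = reweight_query(docs, ["a", "b"], top_n=2)
print(weights)
print(ranking)
```

In this sketch a retrieval word that is prominent in mutually similar feedback documents ends up with a larger weight, which is one plausible reading of "weighted twice"; the paper itself may combine the two passes differently.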