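An Efficient Algorithm for Mining Locus-Phenotype Associations in Gene Sequencing Data
Cite this article: Hu Jianlong, Shao Fengjing, Wu Shunyao. An Efficient Algorithm for Mining Locus-Phenotype Associations in Gene Sequencing Data[J]. Journal of Qingdao University (Natural Science Edition), 2014(2): 23-28.
Authors: Hu Jianlong  Shao Fengjing  Wu Shunyao
Affiliations: [1] College of Information Engineering, Qingdao University, Qingdao 266071 [2] College of Automation Engineering, Qingdao University, Qingdao 266071
Funding: Supported by the National Natural Science Foundation of China (Grant No. 91130035); the National Special Research Fund for Public Welfare Industries (Grant No. 200905030-2); and the Natural Science Foundation of Shandong Province (Grant Nos. ZR2012FZ003 and ZR2012FQ017).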
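Abstract (translated from the Chinese): Disease phenotypes are often regulated by SNP loci, and mining association rules between disease phenotypes and SNP loci can help provide personalized molecular diagnosis and treatment plans. Because SNP loci exhibit genetic heterogeneity, the minimum support threshold must be set to a low value, or even 0, when mining these rules. Combined with the very large volume of SNP data, this makes the time complexity of association rule algorithms extremely high. To address this, the HEMAPS algorithm is proposed, which improves the Apriori algorithm by using multi-threaded parallel processing and a vertical data format. In addition, to handle imbalanced sample proportions in qualitative trait phenotypes, a new evaluation metric for association rules, the match degree, is proposed. Experimental results show that the time complexity of HEMAPS is markedly lower than that of Apriori.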

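Keywords (translated from the Chinese): association rules  vertical data structure  multi-threaded parallelism  Apriori algorithm  match degree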

High-Efficiency Mining Algorithm for Association Rules Between Phenotypes and SNPs
HU Jian-long,SHAO Feng-jing,WU Shun-yao.High-Efficiency Mining Algorithm for Association Rules Between Phenotypes and SNPs[J].Journal of Qingdao University(Natural Science Edition),2014(2):23-28.
Authors:HU Jian-long  SHAO Feng-jing  WU Shun-yao
Institution:(a. College of Information Engineering, b. College of Automation Engineering, Qingdao University, Qingdao 266071, China)
Abstract: Since SNPs usually regulate disease phenotypes, association rules between disease phenotypes and SNPs can help provide personalized molecular diagnosis and treatment. Because of the genetic heterogeneity of SNPs, the minimum support threshold must be set to a low value, or even to zero, when mining association rules between disease phenotypes and SNPs. In addition, the time complexity of the mining algorithm becomes very high owing to the large number of SNPs. This paper therefore presents the HEMAPS algorithm, which improves Apriori by using a vertical data format and multi-thread parallel computing. The paper also proposes match degree as a new evaluation metric for association rules, to address the problem of imbalanced sample ratios in qualitative traits. Experimental results show that the time complexity of HEMAPS is significantly lower than that of Apriori.
Keywords:association rule  vertical data format  multi-thread parallel computing  Apriori  match degree
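
The abstract names two general techniques, a vertical data format and multi-thread parallel computing, without giving the details of HEMAPS itself. The following is a minimal Python sketch of those two ideas only, not the authors' implementation. The function names, the toy genotype labels such as `rs1=AA`, and the `workers` parameter are all assumptions made for illustration. In a vertical layout each item maps to the set of sample IDs containing it, so the support of a candidate itemset is the size of a set intersection rather than a rescan of every record. The match degree metric is not defined in the abstract, so it is not sketched here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def to_vertical(samples):
    """Convert horizontal records {sample_id: items} into item -> set of sample ids."""
    vertical = {}
    for sample_id, items in samples.items():
        for item in items:
            vertical.setdefault(item, set()).add(sample_id)
    return vertical


def support_count(itemset, vertical):
    """Support of a non-empty itemset = size of the intersection of its sample-id sets."""
    return len(set.intersection(*(vertical[item] for item in itemset)))


def count_candidates(candidates, vertical, workers=4):
    """Count the support of each candidate itemset using a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda c: support_count(c, vertical), candidates)
        return dict(zip(candidates, counts))


# Hypothetical toy data: two SNP genotypes and a case/control phenotype per sample.
samples = {
    "s1": {"rs1=AA", "rs2=CT", "case"},
    "s2": {"rs1=AA", "rs2=CC", "case"},
    "s3": {"rs1=AG", "rs2=CT", "control"},
    "s4": {"rs1=AA", "rs2=CT", "control"},
}

vertical = to_vertical(samples)
# All 2-item candidates, stored as sorted tuples so they can be looked up by name.
candidates = [tuple(sorted(pair)) for pair in combinations(vertical, 2)]
counts = count_candidates(candidates, vertical)
```

With a minimum support of zero, as the abstract describes, no candidate can be pruned by support, which is why cheap per-candidate counting matters. Note that CPython threads share a global interpreter lock, so this sketch shows the structure of the parallel step rather than a measured speedup.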
This article is indexed in VIP and other databases.