Big Data Research ›› 2020, Vol. 6 ›› Issue (5): 16-28. doi: 10.11959/j.issn.2096-0271.2020041

• TOPIC: MEDICAL BIG DATA •

Parallel optimization of variation detection algorithms for large-scale genome data

Yingbo CUI¹, Chun HUANG¹, Tao TANG¹, Canqun YANG¹, Xiangke LIAO¹, Shaoliang PENG²,³

  1 College of Computer, National University of Defense Technology, Changsha 410073, China
  2 College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
  3 National Supercomputer Center in Changsha, Changsha 410082, China
  • Online: 2020-09-20 Published: 2020-09-29
  • Supported by:
    The National Key Research and Development Program of China (2018YFB0204301, 2017YFB0202602); The National Natural Science Foundation of China (61772543, 61972408)

Abstract:

Sequence alignment and mutation detection are the basic steps of genomic data analysis. They are prerequisites for subsequent functional analysis and are also the most time-consuming steps. To handle the massive volume of genomic data produced by high-throughput sequencing, MPI, OpenMP, and other technologies were used to apply multi-level parallel optimization to the sequence alignment and SNP detection algorithms. In tests on different data sets and at different parallel scales, the core algorithm achieved a speedup of more than 9x, and parallel efficiency remained above 60% in large-scale tests. The improved algorithms show good parallel performance and scalability, which effectively improves mutation detection capability for large-scale genomic data.

Key words: sequence alignment, SNP, OpenMP, MPI
