Big Data Research (大数据) ›› 2016, Vol. 2 ›› Issue (2): 76-87. doi: 10.11959/j.issn.2096-0271.2016021

• Research •

Bioinformatics methods for high-throughput DNA sequencing data

Xiaojuan Zhan1, Dengju Yao2, Huaiqiu Zhu3

  • 1 College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin 150050, China
    2 School of Software, Harbin University of Science and Technology, Harbin 150040, China
    3 Department of Biomedical Engineering, Peking University, Beijing 100871, China
  • Online: 2016-03-20 Published: 2020-09-29
  • About the authors: Xiaojuan Zhan (1978-), female, is a lecturer at Heilongjiang Institute of Technology; her main research interests are data mining, machine learning, and bioinformatics. Dengju Yao (1980-), male, is an associate professor at Harbin University of Science and Technology; his main research interests are data mining, machine learning, and bioinformatics. Huaiqiu Zhu (1970-), male, is a professor at Peking University; his main research interests are biomedical informatics and computational systems biology.
  • Supported by:
    The Natural Science Foundation of Heilongjiang Province (F201313); The Foundation of Heilongjiang Province Educational Committee (12541124); The Harbin Special Funds for Technological Innovation Research of Heilongjiang Province of China (2013RFQXJ114)

Abstract:

DNA sequence reads generated by high-throughput sequencing are short, and the volume of data is enormous. This paper analyzes the challenges and opportunities that big data presents in the high-throughput sequencing setting. It then summarizes and discusses research results, including algorithms and tools, in three areas: data compression, metagenomic sequence assembly, and metagenomic sequence analysis. Finally, it discusses future trends in research on short-read DNA sequence data from high-throughput sequencing.

Key words: high-throughput DNA sequencing, bioinformatics, short-read sequence data compression, short-read sequence data assembly, short-read sequence data analysis
