[1] |
SCHUSTER S C . Next-generation sequencing transforms today’s biology[J]. Nature Methods, 2008,5(1): 16-18.
|
[2] |
SANGER F , NICKLEN S , COULSON A R . DNA sequencing with chain-terminating inhibitors[J]. Proceedings of the National Academy of Sciences, 1977,74(12): 5463-5467.
|
[3] |
SHENDURE J , JI H . Next-generation DNA sequencing[J]. Nature Biotechnology, 2008,26(10): 1135-1145.
|
[4] |
HIGGINS G . Human Genomes and Big Data Challenges[R]. Mason: AssureRx Health Inc, 2013.
|
[5] |
WARD R M , SCHMIEDER R , HIGHNAM G , et al. Big data challenges and opportunities in high-throughput sequencing[J]. Systems Biomedicine, 2013,1(1): 29-34.
|
[6] |
DUNHAM I , BIRNEY E , LAJOIE B R , et al. An integrated encyclopedia of DNA elements in the human genome[J]. Nature, 2012,489(7414): 57-74.
|
[7] |
COLLINS F S , BARKER A D . Mapping the cancer genome[J]. Scientific American, 2007,296(3): 50-57.
|
[8] |
HAYDEN E C . International genome project launched[J]. Nature, 2008,451(7177): 378-379.
|
[9] |
GEVERS D , KNIGHT R , PETROSINO J F , et al. The human microbiome project: a community resource for the healthy human microbiome[J]. PLoS Biology, 2012,10(8): e1001377.
|
[10] |
HAUSSLER D , O’BRIEN S J , RYDER O A , et al. Genome 10K: a proposal to obtain whole-genome sequence for 10 000 vertebrate species[J]. The Journal of Heredity, 2009,100(6): 659-674.
|
[11] |
O’ROAK B J , VIVES L , GIRIRAJAN S , et al. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations[J]. Nature, 2012,485(7397): 246-250.
|
[12] |
EHRLICH S D . MetaHIT: the European Union project on metagenomics of the human intestinal tract[M]// Metagenomics of the Human Body. New York: Springer, 2011: 307-316.
|
[13] |
LEGRAIN P , AEBERSOLD R , ARCHAKOV A , et al. The human proteome project: current state and future direction[J]. Molecular & Cellular Proteomics, 2011,10(7): M111.009993.
|
[14] |
GILBERT J A , MEYER F , ANTONOPOULOS D , et al. Meeting report: the terabase metagenomics workshop and the vision of an earth microbiome project[J]. Standards in Genomic Sciences, 2010,3(3): 243.
|
[15] |
ROBINSON G E , HACKETT K J , PURCELL M M , et al. Creating a buzz about insect genomes[J]. Science, 2011,331(6023): 1386.
|
[16] |
JOLY Y , DOVE E S , KNOPPERS B M , et al. Data sharing in the post-genomic world: the experience of the international cancer genome consortium (ICGC) data access compliance office (DACO)[J]. PLoS Computational Biology, 2012,8(7): e1002549.
|
[17] |
WU X D , ZHU X Q . Data mining with big data[J]. IEEE Transactions on Knowledge and Data Engineering, 2014,26(1): 97-108.
|
[18] |
CHRISTLEY S , LU Y , LI C , et al. Human genomes as email attachments[J]. Bioinformatics, 2009,25(2): 274-275.
|
[19] |
BRANDON M C , WALLACE D C , BALDI P , et al. Data structures and compression algorithms for genomic sequence data[J]. Bioinformatics, 2009,25(14): 1731-1738.
|
[20] |
KOZANITIS C , SAUNDERS C , KRUGLYAK S , et al. Compressing genomic sequence fragments using SlimGene[J]. Journal of Computational Biology, 2011,18(3): 401-413.
|
[21] |
WANG C , ZHANG D . A novel compression tool for efficient storage of genome resequencing data[J]. Nucleic Acids Research, 2011,39(7): e45.
|
[22] |
FRITZ M H Y , LEINONEN R , COCHRANE G , et al. Efficient storage of high throughput DNA sequencing data using reference-based compression[J]. Genome Research, 2011,21(5): 734-740.
|
[23] |
MILLER J R , KOREN S , SUTTON G , et al. Assembly algorithms for next-generation sequencing data[J]. Genomics, 2010,95(6): 315-327.
|
[24] |
BONFIELD J K , MAHONEY M V . Compression of FASTQ and SAM format sequencing data[J]. PLoS One, 2013,8(3): e59190.
|
[25] |
COX A J , BAUER M J , JAKOBI T , et al. Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform[J]. Bioinformatics, 2012,28(11): 1415-1419.
|
[26] |
HACH F , NUMANAGIĆ I , ALKAN C , et al. SCALCE: boosting sequence compression algorithms using locally consistent encoding[J]. Bioinformatics, 2012,28(23): 3051-3057.
|
[27] |
SELVA J J , CHEN X . SRComp: short read sequence compression using burstsort and Elias omega coding[J]. PLoS One, 2013,8(12): e81414.
|
[28] |
PATRO R , KINGSFORD C . Data-dependent bucketing improves reference-free compression of sequencing reads[J]. Bioinformatics, 2015: btv248.
|
[29] |
JONES D C , RUZZO W L , PENG X , et al. Compression of next-generation sequencing reads aided by highly efficient de novo assembly[J]. Nucleic Acids Research, 2012,40(22): e171.
|
[30] |
METZKER M L . Sequencing technologies: the next generation[J]. Nature Reviews Genetics, 2010,11(1): 31-46.
|
[31] |
WOOLEY J C , GODZIK A , FRIEDBERG I . A primer on metagenomics[J]. PLoS Computational Biology, 2010,6(2): e1000667.
|
[32] |
POP M , PHILLIPPY A , DELCHER A L , et al. Comparative genome assembly[J]. Briefings in Bioinformatics, 2004,5(3): 237-248.
|
[33] |
KECECIOGLU J , JU J . Separating repeats in DNA sequence assembly[C]// The 5th Annual International Conference on Computational Biology, April 22-25, 2001, Montreal, Canada. [S.l.:s.n.], 2001: 176-183.
|
[34] |
PRIDE D T , MEINERSMANN R J , WASSENAAR T M , et al. Evolutionary implications of microbial genome tetranucleotide frequency biases[J]. Genome Research, 2003,13(2): 145-158.
|
[35] |
WU Y W , YE Y . A novel abundance-based algorithm for binning metagenomic sequences using l-tuples[J]. Journal of Computational Biology, 2011,18(3): 523-534.
|
[36] |
PRAKASH T , TAYLOR T D . Functional assignment of metagenomic data: challenges and applications[J]. Briefings in Bioinformatics, 2012,13(6): 711-727.
|
[37] |
QIN J , LI R , RAES J , et al. A human gut microbial gene catalogue established by metagenomic sequencing[J]. Nature, 2010,464(7285): 59-65.
|
[38] |
QIN J , LI Y , CAI Z , et al. A metagenome-wide association study of gut microbiota in type 2 diabetes[J]. Nature, 2012,490(7418): 55-60.
|
[39] |
BORODOVSKY M , MCININCH J . GENMARK: parallel gene recognition for both DNA strands[J]. Computers & Chemistry, 1993,17(2): 123-133.
|
[40] |
LUKASHIN A , BORODOVSKY M . GeneMark.hmm: new solutions for gene finding[J]. Nucleic Acids Research, 1998,26(4): 1107-1115.
|
[41] |
BESEMER J , LOMSADZE A , BORODOVSKY M , et al. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions[J]. Nucleic Acids Research, 2001,29(12): 2607-2618.
|
[42] |
SALZBERG S L , DELCHER A L , KASIF S , et al. Microbial gene identification using interpolated Markov models[J]. Nucleic Acids Research, 1998,26(2): 544-548.
|
[43] |
DELCHER A L , BRATKE K A , POWERS E C , et al. Identifying bacterial genes and endosymbiont DNA with Glimmer[J]. Bioinformatics, 2007,23(6): 673-679.
|
[44] |
FRIGAARD N U , MARTINEZ A , MINCER T J , et al. Proteorhodopsin lateral gene transfer between marine planktonic bacteria and archaea[J]. Nature, 2006,439(7078): 847-850.
|
[45] |
OUYANG Z , ZHU H , WANG J , et al. Multivariate entropy distance method for prokaryotic gene identification[J]. Journal of Bioinformatics and Computational Biology, 2004,2(2): 353-373.
|
[46] |
ZHU H Q , HU G Q , YANG Y F , et al. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes[J]. BMC Bioinformatics, 2007,8(1): 97.
|
[47] |
NOGUCHI H , TANIGUCHI T , ITOH T , et al. MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes[J]. DNA Research, 2008,15(6): 387-396.
|
[48] |
HOFF K J , LINGNER T , MEINICKE P , et al. Orphelia: predicting genes in metagenomic sequencing reads[J]. Nucleic Acids Research, 2009,37(suppl 2): W101-W105.
|
[49] |
ZHU W , LOMSADZE A , BORODOVSKY M , et al. Ab initio gene identification in metagenomic sequences[J]. Nucleic Acids Research, 2010,38(12): e132.
|
[50] |
RHO M , TANG H , YE Y , et al. FragGeneScan: predicting genes in short and error-prone reads[J]. Nucleic Acids Research, 2010,38(20): e191.
|
[51] |
KELLEY D R , LIU B , DELCHER A L , et al. Gene prediction with Glimmer for metagenomic sequences augmented by classification and clustering[J]. Nucleic Acids Research, 2012,40(1): e9.
|
[52] |
HYATT D , LOCASCIO P F , HAUSER L J , et al. Gene and translation initiation site prediction in metagenomic sequences[J]. Bioinformatics, 2012,28(17): 2223-2230.
|
[53] |
WANG Y , LEUNG H C M , YIU S M , et al. MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample[J]. Bioinformatics, 2012,28(18): i356-i362.
|
[54] |
LIU Y , GUO J , HU G , et al. Gene prediction in metagenomic fragments based on the SVM algorithm[J]. BMC Bioinformatics, 2013,14(suppl 5): S12.
|
[55] |
DESANTIS T Z , HUGENHOLTZ P , LARSEN N , et al. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB[J]. Applied and Environmental Microbiology, 2006,72(7): 5069-5072.
|
[56] |
PRUESSE E , QUAST C , KNITTEL K , et al. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB[J]. Nucleic Acids Research, 2007,35(21): 7188-7196.
|