Comparing de novo assemblers for metagenomic data obtained from Medium-Length sequencing technologies

by Amir Zadeh on 22 October 2012

Transcript of Comparing de novo assemblers for metagenomic data obtained from Medium-Length sequencing technologies

Introduction

Background: Medium-Length Assemblers:

AATCGTGCATTGCCAATCGTGCATTGAATCCGAT ... CGATATCCGATTAAGCGATTAGCAGTTAACGGCAAT (~200 bp)

DATASETS
- A mesocosm study’s metagenomic data
- Presence of a corresponding metatranscriptome
- Knowledge about large gene families and a high percentage of novel genes
- The reads are medium-length (average sequence length of 207 bp)

ASSEMBLERS

Basic assembly metrics

Performance Evaluation

Amir M. Zadeh
Bioinformatics Research Lab.

CONCLUSION:
- Merging two assemblies at a time

FUTURE WORK
- Use a simulation approach
- Exploring the parameter space for each assembler

Thanks for your attention.

- Six Medium-Length assemblers
- All were run under default parameters
- Two sets of criteria were used to evaluate the assemblers:
  - Basic assembly metrics
  - Assembler's performance

- Overall assembly size
- Number of major contigs
- Average contig size
- N50 and N80 values

- Newbler
- Velvet
- All the remaining assemblers performed almost as well
- No single assembler performed best in all our criteria
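
The basic assembly metrics named in the slides (overall assembly size, number of major contigs, average contig size, N50 and N80) can be computed from a list of contig lengths. The following is a minimal sketch in Python, not the code used in the presentation. The function names and the 1000 bp cutoff for a "major" contig are assumptions made for illustration, since the slides do not state how major contigs were defined.

```python
from typing import Dict, List


def nx_value(contig_lengths: List[int], percent: int) -> int:
    """Return the Nx value: the length L of the contig at which the
    longest contigs, summed in descending order, first cover at least
    `percent` percent of the total assembly size."""
    if not contig_lengths:
        return 0
    ordered = sorted(contig_lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]


def basic_assembly_metrics(
    contig_lengths: List[int],
    major_cutoff: int = 1000,  # hypothetical cutoff; not taken from the slides
) -> Dict[str, float]:
    """Summarise an assembly given the lengths (in bp) of its contigs."""
    total = sum(contig_lengths)
    count = len(contig_lengths)
    return {
        "assembly_size": total,
        "num_contigs": count,
        "num_major_contigs": sum(1 for n in contig_lengths if n >= major_cutoff),
        "average_contig_size": total / count if count else 0.0,
        "n50": nx_value(contig_lengths, 50),
        "n80": nx_value(contig_lengths, 80),
    }
```

For example, calling `basic_assembly_metrics([100, 200, 300, 400, 500, 1000, 1500])` gives an assembly size of 4000 bp, two major contigs under the assumed cutoff, an N50 of 1000 bp and an N80 of 400 bp. In practice the contig lengths would be read from each assembler's FASTA output.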