MS-SIG 2013 - Galaxy-P and CloudBioLinux
John M. Chilton*, James E. Johnson, Ebbing P. de Jong, Getiria Onsongo, Benjamin J. Lynch, Pratik D. Jagtap, Timothy J. Griffin
1 University of Minnesota Supercomputing Institute, 2 University of Minnesota
What is Galaxy?
Goecks, J, Nekrutenko, A, Taylor, J and The Galaxy Team. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 2010, 11: R86.
A web-based bioinformatics data analysis platform
Originally designed to address issues in genomic informatics including:
• Software accessibility and usability
• Analytical transparency
New paradigm: “…allows transparent sharing of not only the precise computational details underlying an analysis, but also intent, context, and narrative.” (Goecks et al., Genome Biol. 2010, 11: R86)
Workflows, histories, and pages can be shared with others or the world.
Why Galaxy for Mass Spec Informatics?
Emanating from a larger community, Galaxy is more mature and developing faster than any comparable open source proteomics platform.
no need to reinvent the wheel
ideally suited for systems biology
Four of the six talks at the protein identification oral session of ASMS incorporated sequencing data. Multi-omic biology can benefit immensely from a multi-omics data analysis platform.
What is Galaxy-P?
A three-year, US NSF-funded grant to build a mass spec and proteomics data analysis platform on top of Galaxy.
nearing end of year 1
Galaxy-P on the Tool Shed
The Tool Shed is the "app store" for Galaxy tools, datatypes, and workflows.
We have contributed many tools and datatypes to the Galaxy Tool Shed, including...
tools and datatypes
Scaffold tools and datatypes
An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics
(J. Proteome Res., 2013, PMID: 23391308).
All of these tools are available.
We have assembled these pieces into useful workflows for identification, quantification, proteogenomics, and metaproteomics.
"Minnesota Two Step"
Galaxy-P for Proteogenomics
Galaxy-P provides an integrated platform for every step of proteogenomic analysis.
Build target database - download and translate EST databases or perform gene prediction with Augustus.
Numerous tools for identification and text manipulation.
Workflow utilizing BLAST to identify novel peptides.
Tool to assess peptide-spectrum matches and visualize spectra.
Visualize identified peptides on the genome.
A seamless, integrated 150-step proteogenomic workflow
Visit ISMB Poster O097 for more information
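To make the "build target database" and "peptides on the genome" steps above more concrete, here is a minimal Python sketch. It is not the Galaxy-P implementation or the peptide-to-gff tool: it assumes the standard genetic code, exact string matching instead of a search engine or BLAST, and a single unspliced nucleotide sequence. The function names, the "sketch" source label, and the "match" feature type are hypothetical choices made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from position 0; codons with unknown bases become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return {(strand, offset): protein} for all six reading frames."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def locate_peptide(dna, peptide):
    """Yield (start, end, strand) for each exact peptide match.

    Coordinates are 1-based, inclusive, and always on the forward strand.
    """
    n = len(dna)
    for (strand, offset), protein in six_frame_translation(dna).items():
        pos = protein.find(peptide)
        while pos != -1:
            begin = offset + 3 * pos         # 0-based start on the translated strand
            stop = begin + 3 * len(peptide)  # 0-based exclusive end
            if strand == "+":
                yield begin + 1, stop, strand
            else:
                yield n - stop + 1, n - begin, strand
            pos = protein.find(peptide, pos + 1)


def gff3_line(seqid, start, end, strand, peptide):
    """Format one nine-column GFF3 feature line (source and type are placeholders)."""
    return "\t".join(
        [seqid, "sketch", "match", str(start), str(end), ".", strand, ".",
         "Name=" + peptide]
    )


if __name__ == "__main__":
    est = "ATGGCCAAATGA"  # made-up sequence for illustration only
    for start, end, strand in locate_peptide(est, "MAK"):
        print(gff3_line("est_1", start, end, strand, "MAK"))
```

The resulting lines could be loaded into a genome browser as a peptide track. A real proteogenomic workflow would also need to handle spliced transcripts, stop codons, and inexact matches.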
Galaxy-P in the Cloud
Visit bit.ly/galaxyp-cloud to spin up your own Galaxy-P cluster on Amazon's infrastructure today!
Afgan, Chapman, et al. "Using Cloud Computing Infrastructure with CloudBioLinux, CloudMan, and Galaxy." Current Protocols in Bioinformatics. June 2012.
BioCloudCentral and CloudMan are easy-to-use web interfaces for launching and managing Galaxy clusters without requiring knowledge of AMIs, SSH keys, or the Unix terminal.
More than Galaxy-P
For those with more technical savvy, our cloud clusters come preloaded with more than just Galaxy-P:
Large Proteomic Suites
NCBI Blast+, EMBOSS, Augustus, peptide-to-gff
Trans-proteomic pipeline, OpenMS, crux
Identification Tools (Standard and Specialized)
Myrimatch, X! Tandem, OMSSA, TagRecon, Pepitome, PepNovo
Percolator, Fido, Mayu, psm-eval
MZmine, PeptideShaker/SearchGUI, TOPPAS, PRIDE Converter, PRIDEInspector
R (w/xcms, mzR, FactoMineR, caret, ggplot2, VennDiagram)
Three Ways to Galaxy-P
Built on CloudBioLinux
Our public and internal Galaxy-P servers, as well as the cloud images, are built with CloudBioLinux.
We have contributed numerous enhancements to make this an excellent framework for building proteomic and mass spec data analysis platforms.
See our ASMS poster on building proteomic platforms with CloudBioLinux.
The whole Galaxy and Galaxy-P team at the Minnesota Supercomputing Institute, with special thanks to Anne-Francoise Lamblin.
Brad Chapman, Dave Clements, Enis Afgan, Jorrit Boekel, Dannon Baker, Nate Coraor, Jeremy Goecks, Ira Cooke, and the rest of the Galaxy and CloudBioLinux communities.
This work is funded by the National Science Foundation.