MS-SIG 2013 - Galaxy-P and CloudBioLinux


John Chilton

on 27 September 2013


Transcript of MS-SIG 2013 - Galaxy-P and CloudBioLinux

Innovative, Reproducible MS-Based Proteomic Informatics in the Cloud for Emerging Applications with Galaxy-P and CloudBioLinux
John M. Chilton*, James E. Johnson, Ebbing P. de Jong, Getiria Onsongo, Benjamin J. Lynch, Pratik D. Jagtap, Timothy J Griffin
1 University of Minnesota Supercomputing Institute, 2 University of Minnesota
* chilton@msi.umn.edu

What is Galaxy?
Goecks, J, Nekrutenko, A, Taylor, J and The Galaxy Team. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 2010, 11: R86.
A web-based bioinformatics data analysis platform
Originally designed to address issues in genomic informatics including:
• Software accessibility and usability
• Analytical transparency
• Reproducibility
• Scalability
• Share-ability
Old Paradigm
New Paradigm
New paradigm: “…allows transparent sharing of not only the precise computational details underlying an analysis, but also intent, context, and narrative.” (Goecks et al Genome Biol. 2010, 11: R86.)
Workflows, histories, and pages can be shared with others or the world.
with Galaxy...
Why Galaxy for Mass Spec Informatics?
Paradigm Shift?
Emanating from a larger community, Galaxy is more mature and developing faster than any comparable open source proteomics platform.
no need to reinvent the wheel
ideally suited for systems biology
4 of 6 talks at the protein identification oral session of ASMS incorporated sequencing data; multi-omic biology can benefit immensely from a multi-omics data analysis platform.
What is Galaxy-P?
Three-year US NSF-funded grant to build a mass spec & proteomics data analysis platform on top of Galaxy.
nearing end of year 1
Galaxy-P on the Tool Shed
Tool shed is the "app store" for Galaxy tools, datatypes, & workflows.
We have contributed many tools and datatypes to the Galaxy tool shed including...
mzQuantML datatype
MaxQuant tool
Various OpenMS tools and datatypes
Scaffold tools and datatypes
An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics
(J. Proteome Res., 2013, PMID: 23391308).
All these tools available.
Galaxy-P Workflows
We have assembled these pieces into useful workflows for identification, quantification, proteogenomics, and metaproteomics.
"Dual-label workflow"
"Minnesota Two Step"
Galaxy-P for Proteogenomics
Galaxy-P provides an integrated platform for every step of proteogenomic analysis.
Build target database - download and translate EST databases or perform gene prediction with Augustus.
Numerous tools for identification and text manipulation.
Workflow utilizing BLAST to identify novel peptides.
Tool to assess peptide-spectrum matches and visualize spectra.
Visualize identified peptides on the genome.
150-step seamless, integrated proteogenomic workflow
Visit ISMB Poster O097 for more information
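The "build target database" step can be illustrated with a minimal sketch of six-frame translation, the usual way nucleotide sequences such as ESTs are turned into candidate protein sequences for a search database. This is not the Galaxy-P tool itself: the function names (`six_frame_orfs`, `to_fasta`), the `min_length` cutoff, and the FASTA header layout are all assumptions made for illustration, and splitting on stop codons is a simplification of what a real translation tool does.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate one reading frame; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(seq, min_length=7):
    """Yield (frame, peptide) pairs for stop-free stretches in all six frames.

    min_length is a hypothetical cutoff; a real pipeline would choose it
    to suit the search engine and digestion settings.
    """
    seq = seq.upper()
    results = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            protein = translate(strand_seq[offset:])
            for fragment in protein.split("*"):
                if len(fragment) >= min_length:
                    results.append((f"{strand}{offset + 1}", fragment))
    return results


def to_fasta(name, seq, min_length=7):
    """Format the translated fragments as FASTA text (header layout is illustrative)."""
    lines = []
    for n, (frame, fragment) in enumerate(six_frame_orfs(seq, min_length), start=1):
        lines.append(f">{name}_frame{frame}_{n}")
        lines.append(fragment)
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_fasta("example_est", "ATGGCCAAGTAA", min_length=3))
```

The resulting FASTA text could then serve as a search database for an identification tool; in Galaxy-P that hand-off is handled by the workflow rather than by a script like this one.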
Galaxy-P in the Cloud
Visit bit.ly/galaxyp-cloud to spin up your own Galaxy-P cluster on Amazon's infrastructure today!
"Using Cloud Computing Infrastructure with CloudBioLinux, CloudMan, and Galaxy." Afgan, Chapman, et al. Current Protocols in Bioinformatics. June 2012.
BioCloudCentral and CloudMan are easy to use web interfaces for launching and managing Galaxy clusters without requiring knowledge of AMIs, ssh keys, or the Unix terminal.
More than Galaxy-P
For those with more technical savvy, our cloud clusters come preloaded with more than just Galaxy-P.
Large Proteomic Suites
Trans-proteomic pipeline, OpenMS, crux
NCBI Blast+, EMBOSS, Augustus, peptide-to-gff
Identification Tools (Standard and Specialized)
Myrimatch, X! Tandem, OMSSA, TagRecon, Pepitome, PepNovo
Validation Tools
Percolator, Fido, Mayu, psm-eval
GUI Applications
MZmine, PeptideShaker/SearchGUI, TOPPAS, PRIDE Converter, PRIDEInspector
Programming Environments
pyteomics (Python)
mspire (Ruby)
R (w/xcms, mzR, FactoMineR, caret, ggplot2, VennDiagram)
Three Ways to Galaxy-P
Built on CloudBioLinux
Our public and internal Galaxy-P servers, as well as the cloud images, are built with CloudBioLinux.
We have contributed numerous enhancements to make it an excellent framework for building proteomic and mass spec data analysis platforms.
See our ASMS poster on building proteomic platforms with CloudBioLinux
this presentation
The whole Galaxy and Galaxy-P team at the Minnesota Supercomputing Institute, with special thanks to Anne-Francoise Lamblin.
Brad Chapman, Dave Clements, Enis Afgan, Jorrit Boekel, Dannon Baker, Nate Coraor, Jeremy Goecks, Ira Cooke, and the rest of the Galaxy and CloudBioLinux communities.
This work is funded by the National Science Foundation.