ASMS Galaxy-P Workshop Presentation - ProteoGenomics
Project-based approach to install, test and use tools and workflows within Galaxy.
- Proteogenomics (and Metaproteomics).
- Quantitative analysis projects.
Galaxy-P offers multiple application tools: some are specific to proteomics, and others come from the genomics Galaxy framework.
For example, Protein Database Downloader downloads UniProt protein FASTA databases for various organisms.
Proteogenomics: Identifies protein sequences that are not yet annotated in the predicted proteome of the organism under study.
Provides information for gene annotation, thus enhancing our understanding of genomes.
STEPS IN PROTEOGENOMICS ANALYSIS
Various sources of databases exist.
- cDNA databases (ECgene and Ensembl).
- Gene-prediction tool generated databases (Augustus).
- 6-frame databases.
- RNA expression databases.
- Metagenomic databases.
- Cancer sample datasets.
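As an illustration of how a 6-frame database entry can be derived from a nucleotide sequence, here is a minimal sketch in plain Python. It is not the Galaxy-P tool itself: the function names, the frame labels (`+1` to `-3`) and the use of `X` for unrecognized codons are assumptions made for this example, and it uses the standard genetic code only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (the third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unrecognized codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translate(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translate("ATGGCCTAA").items():
        print(label, protein)
```

In a real database the translations would typically be split at stop codons (`*`) and short fragments discarded before being written out as FASTA entries; those filtering choices are left out here.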
The two-step method maximizes peptide-matching sensitivity for applications requiring large databases, and is especially valuable for proteogenomics and metaproteomics studies.
BLAST-P searches short peptides (8-30 amino acids) and long peptides (31 amino acids or more) against the NCBI nr database.
Use of peptide summary processing; use of parameters for short-BLAST-P (8-30 aas) and BLAST-P (31 aas and more); threshold parameters for selecting non-matching peptides.
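The length-based split described above can be sketched as follows. This is a hypothetical helper, not part of Galaxy-P: the function name and the decision to set aside peptides shorter than 8 residues are assumptions, while the 8-30 and 31+ boundaries come from the slide. The slide does not give the threshold parameters for selecting non-matching peptides, so they are not modelled here.

```python
def partition_for_blast(peptides, short_range=(8, 30)):
    """Split peptide sequences into short-BLAST-P and BLAST-P groups.

    short_range is inclusive; peptides longer than its upper bound go to
    the regular BLAST-P group, and peptides shorter than its lower bound
    are returned separately (an assumption of this sketch).
    """
    lower, upper = short_range
    short, long_, skipped = [], [], []
    for peptide in peptides:
        length = len(peptide)
        if length < lower:
            skipped.append(peptide)
        elif length <= upper:
            short.append(peptide)
        else:
            long_.append(peptide)
    return short, long_, skipped


if __name__ == "__main__":
    # Made-up sequences, used only to show the three groups.
    example = ["ACDEFGH", "ACDEFGHIK", "A" * 30, "A" * 31]
    short, long_, skipped = partition_for_blast(example)
    print(len(short), len(long_), len(skipped))
```

Keeping the two groups apart allows each one to be searched with its own parameter set, as the slide describes for short-BLAST-P and BLAST-P.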
Automated BLAST-P Search
'Minnesota Two-Step Method'
Peptide Spectrum Match Evaluator
From Peptide Summary to Spectra to Validation.
Input: Peptide summary of unmatched peptide sequences.
Process: Uses the mzML file to extract spectral information.
Output: Spectral characteristics of PSMs, with a link to the annotated spectrum for visualization and confirmation.
We have a blueprint / prototype for an integrated proteogenomics analytical workflow that includes all steps, from database creation through genomic context analysis.
We exercised an abundance of caution at each filtering step. We identified 24, 28 and 38 novel peptides from three datasets.
The platform, tools and workflows are versatile enough to adapt to other systems biology studies, including metaproteomics studies.
Ebbing de Jong.
Sean Seymour (AB Sciex)
NSF Grant 1147079.
Minnesota Supercomputing Institute.
Center for Mass Spectrometry and Proteomics.
PSM Evaluation Inputs
Genomic Context Analysis
Input: Peptide summary of validated Peptide Spectral Matches.
Process: Uses a peptide summary, an Ensembl GTF file and a cDNA file to create a GFF3 file.
Output: A link to the GFF3 file, which can be opened in the IGV browser.
Integrative Genomics Viewer (Broad Institute)
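To make the GFF3 output of the genomic context step concrete, here is a minimal sketch of how one mapped peptide could be written as a GFF3 feature line. It follows the nine tab-separated GFF3 columns, but the function name, the `proteogenomics` source label, the `region` feature type and the example coordinates are all assumptions for illustration. The actual mapping from peptide to genomic coordinates (via the GTF and cDNA files) is not shown.

```python
GFF3_HEADER = "##gff-version 3"


def gff3_line(seqid, start, end, strand, feature_id, name,
              source="proteogenomics", feature_type="region"):
    """Format one GFF3 feature line (1-based, inclusive coordinates)."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    attributes = f"ID={feature_id};Name={name}"
    # Columns: seqid, source, type, start, end, score, strand, phase, attributes
    columns = [seqid, source, feature_type, str(start), str(end),
               ".", strand, ".", attributes]
    return "\t".join(columns)


if __name__ == "__main__":
    # Hypothetical peptide location, not taken from the presentation.
    print(GFF3_HEADER)
    print(gff3_line("chr1", 100, 126, "+", "pep1", "PEPTIDEK"))
```

A peptide spanning a splice junction would need one line per exon segment; that case is omitted from this sketch.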