GMDS

by Martin Steinegger on 3 August 2016

Transcript of GMDS

Pricing
Discussion
Results
1.) AWS Image
Local cluster Architecture
Cloud Architecture
Infrastructure (IaaS)
Platform (PaaS)
Application (SaaS)
Elastic Compute Cloud
S3 Storage
Elastic Block Store
PredictProtein
BLAST NCBI
2010
2011
Local cluster cost
Spot Market
Be patient and save money
"Translational bioinformatics in the cloud: an affordable alternative", Dudley et al., Genome Medicine 2010
AWS Spot Market
"Resources and costs for microbial sequence analysis evaluated using virtual machines and cloud computing", Angiuoli et al., PLoS ONE 2011
"Translational bioinformatics in the cloud: an affordable alternative", Dudley et al., Genome Medicine 2010
Predicted runtimes using varying bid prices
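The trade-off between bid price and runtime can be sketched with a small simulation. This is a minimal sketch, not the method used for the slide's figure. It assumes a simplified spot model in which an instance runs during an hour only if the spot price is at or below the bid, the hour is billed at the spot price, and no work is lost on interruption. The function name `simulate_bid` and the price series are hypothetical and made up for illustration.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical hourly spot prices in $/h (illustrative, not real AWS data).
HOURLY_PRICES = [0.50, 0.60, 2.00, 0.55, 0.70, 0.52, 1.20, 0.54]


def simulate_bid(
    prices: Sequence[float], bid: float, hours_needed: int
) -> Optional[Tuple[int, float]]:
    """Return (wall-clock hours, total cost), or None if the job never finishes.

    Simplified model (an assumption): the instance runs in a given hour only
    if the spot price is at or below the bid, and that hour is billed at the
    spot price. Interruptions lose no work. hours_needed must be >= 1.
    """
    hours_run = 0
    cost = 0.0
    for elapsed, price in enumerate(prices, start=1):
        if price <= bid:
            hours_run += 1
            cost += price
            if hours_run == hours_needed:
                return elapsed, cost
    return None  # the price history ended before the job completed


if __name__ == "__main__":
    for bid in (0.54, 0.70, 2.54):
        print(bid, simulate_bid(HOURLY_PRICES, bid, hours_needed=3))
```

With this made-up price series, the lowest bid finishes last but pays the least, which is the "be patient and save money" idea in miniature.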
Conclusion
Cloud computing is well positioned to address big computing and data challenges in bioinformatics
Easy to use
StarCluster, Apache Mesos, AWS API ...
Competitive pricing
StarCluster
Time to build a cluster in the cloud -> 8 minutes
Time to build a local cluster -> 1 month
"StarCluster Brings HPC to the Amazon Cloud", Riley et al., High Performance Computing (HPC) in the Cloud 2010
2.) PredictProtein in the Cloud
Central Dogma
DNA
RNA
Protein
Image Source: Wikipedia
SNAP
SNAP works with different methods, databases and machine learning
Published in 2007 by Yana Bromberg and Burkhard Rost
Predicts the effect of non-synonymous polymorphisms on function
Screening for Non-Acceptable Polymorphisms
SNAP2: new version, recently optimized by Maximilian Hecht (thesis)
Use Case
SNAP Prediction
Experimental
Finding residues important for function
Unraveling signals related to disease
Linking genotype to phenotype
Can be used for ...
Functional sites
Large scale annotation of protein data
?
Single-nucleotide polymorphisms
Image Source: Wikipedia
Our Idea
Database with the effect of every single possible SNP
Mutations to calculate for one human genome
> 350,000,000
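To give a feel for where a number of this size comes from, the following is a minimal sketch of enumerating every single amino-acid substitution for one protein. It is an illustration, not the pipeline's actual code. The variant notation (`M1A` for methionine at position 1 changed to alanine) and the function names are assumptions. The sketch also lists all 19 alternatives per residue, although not every one of them is reachable by a single nucleotide change.

```python
from typing import Iterator

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence: str) -> Iterator[str]:
    """Yield every single amino-acid substitution, e.g. 'M1A'.

    Positions are 1-based. The notation is illustrative only and is not
    claimed to be the input format of any particular tool.
    """
    for position, wild_type in enumerate(sequence, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield f"{wild_type}{position}{mutant}"


def count_substitutions(length: int) -> int:
    """Number of substitutions for a protein of standard residues: 19 per position."""
    return 19 * length


if __name__ == "__main__":
    variants = list(single_substitutions("MKT"))  # toy 3-residue sequence
    print(len(variants), variants[:3])
    print(count_substitutions(351))  # the 351-residue example from the slides
```

Under this counting, the 351-residue protein mentioned in the slides yields 19 × 351 = 6,669 variants, so tens of thousands of proteins quickly add up to hundreds of millions of predictions.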
Architecture
Introduction
Whole Genome Mutagenesis
Optimization
Cloud
Reprof: speedup through rewriting and vectorization
351-residue protein => 68 min (before optimization)
351-residue protein => 17 min (after optimization)
SNAP2: recently optimized by Maximilian Hecht (thesis)
Cluster Utilization over one Year
Run Human
Performance Baseline
Future Development
Extending the database to the complete UniProt (run by another group member, Manfred Roos)

Integration of data into PredictProtein.org
Enabling API access to data and toolset
Base cost of approx. $10,000
Final cost of $350
2.54 $/h
16 Cores
29,000 genes
Average time: 68 minutes/gene
Spot Market
Optimization
0.54 $/h
17 minutes/gene
Area under the curve ~54%
Complete in silico mutagenesis
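The figures above can be combined into a rough back-of-the-envelope cost estimate. This is a sketch under stated assumptions, not the calculation behind the slides. It assumes one gene is processed per core, all 16 cores of an instance are kept busy, and the quoted hourly prices apply to the whole 16-core instance. The slides do not state how work was distributed, so the estimates need not match the reported totals of approx. $10,000 and $350. The function name `estimate_cost` is hypothetical.

```python
def estimate_cost(
    genes: int, minutes_per_gene: float, cores: int, price_per_hour: float
) -> float:
    """Estimate the total cost in dollars for processing all genes.

    Assumptions (not stated on the slides): one gene per core at a time,
    perfect utilization of all cores, and price_per_hour covering the
    whole multi-core instance.
    """
    core_hours = genes * minutes_per_gene / 60
    instance_hours = core_hours / cores
    return instance_hours * price_per_hour


if __name__ == "__main__":
    # Baseline: 68 min/gene at 2.54 $/h. Optimized: 17 min/gene at 0.54 $/h.
    baseline = estimate_cost(29_000, 68, cores=16, price_per_hour=2.54)
    optimized = estimate_cost(29_000, 17, cores=16, price_per_hour=0.54)
    print(f"runtime speedup: {68 / 17:.1f}x")
    print(f"baseline estimate:  ${baseline:,.0f}")
    print(f"optimized estimate: ${optimized:,.0f}")
```

The sketch separates the two sources of savings: the code optimization reduces the compute time per gene by a factor of four, and the spot market reduces the price paid per hour.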