Biomedical Data Fusion

A CIBR Seminar, Baylor College of Medicine, May 7, 2014
by Blaz Zupan on 21 July 2014

Transcript of Biomedical Data Fusion

2054
Steven Spielberg
Minority Report
user interfaces
visualization
human-computer interaction
In a future where a special police unit is able to arrest murderers before they commit their crimes, an officer from that unit is himself accused of a future murder.
precognitions
population register
driving license data
architectural features
urban planning data
geo data
recommendation systems

"meta" movies
"meta" users
Matrix tri-factorization
Data -> Factorization -> Predictions
All together now: data fusion by simultaneous matrix factorization
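The "meta" movies and "meta" users idea behind matrix tri-factorization can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: it assumes a nonnegative users x movies matrix `R` with no missing entries and fits `R ≈ G1 @ S @ G2.T` with standard multiplicative updates. The function name `tri_factorize`, the ranks `k1` and `k2`, and the toy data are all hypothetical.

```python
import numpy as np

def tri_factorize(R, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix R (n x m) as G1 @ S @ G2.T.

    G1 (n x k1) maps rows (users) to "meta" users, G2 (m x k2) maps
    columns (movies) to "meta" movies, and S (k1 x k2) relates the two.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    G1 = rng.random((n, k1))
    S = rng.random((k1, k2))
    G2 = rng.random((m, k2))
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep every factor nonnegative.
        G1 *= (R @ G2 @ S.T) / (G1 @ S @ G2.T @ G2 @ S.T + eps)
        G2 *= (R.T @ G1 @ S) / (G2 @ S.T @ G1.T @ G1 @ S + eps)
        S *= (G1.T @ R @ G2) / (G1.T @ G1 @ S @ G2.T @ G2 + eps)
        errors.append(np.linalg.norm(R - G1 @ S @ G2.T))
    return G1, S, G2, errors

# Toy data (made up): 20 users, 15 movies, hidden low-rank structure.
rng = np.random.default_rng(1)
R = rng.random((20, 2)) @ rng.random((2, 15))

G1, S, G2, errors = tri_factorize(R, k1=2, k2=2)
predictions = G1 @ S @ G2.T  # reconstructed "original space" ratings
```

The product `G1 @ S @ G2.T` maps the compressed representation back to the original users x movies space, which is where predictions are read off. A real recommender would also have to handle unobserved ratings, which this sketch ignores.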
Thanks!
Waleed Nasser
Edward Nam
Chris Dinh
Adam Kuspa
Gad Shaulsky
Rafael Rosengarten
Mariko Katoh-Kurasawa
Balaji Santhanam
Marinka Žitnik
Blaz Zupan

Computer Science @ University of Ljubljana
Genetics @ Baylor College of Medicine
Adam Kuspa
Gad Shaulsky
Dictyostelium discoideum
bacteria
predator!
genetic screens
Dicty genes for bacterial response?
Gram+ defective: swp1, gpi, nagB1
Gram- defective: clkB, spc3, alyL, nip7
genome: 12,000 genes
found: 7 genes
workload: 5 years
estimated: ~200 genes
Now what?!
Nasser et al., Current Biology 23(10), 2013.
More genetic screens?
50% coverage (100 genes) -> 20 screens required!
80% coverage (160 genes) -> 65 screens required!
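A back-of-the-envelope version of this estimate is sketched below. It rests on a simplifying assumption that is not stated in the slides: every screen recovers 7 of the ~200 relevant genes uniformly at random, independently of earlier screens, so the expected coverage after k screens is 1 - (1 - 7/200)^k. The function name `screens_needed` is hypothetical. Under this assumption the 50% figure comes out close to the slide's 20 screens, while the 80% figure comes out lower than 65, so the talk presumably used a less optimistic model (for example, genes that are unequally easy to hit).

```python
import math

def screens_needed(target_coverage, genes_per_screen=7, total_genes=200):
    """Smallest k with expected coverage 1 - (1 - p)^k >= target_coverage.

    Assumes each screen hits genes uniformly at random (an assumption
    made for this sketch, not a claim about the original analysis).
    """
    p_hit = genes_per_screen / total_genes
    return math.ceil(math.log(1 - target_coverage) / math.log(1 - p_hit))

half = screens_needed(0.5)
most = screens_needed(0.8)
print(half, most)
```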
Data-driven approach
genes
mutant phenotypes
timepoints
phenotype data
expression data
phenotype ontology
publications
PubMed data
MeSH terms
MeSH annotation
MeSH ontology
Data Integration
Marinka Žitnik
Movies
Users
"meta" -> original space recipes
Dicty bacterial response
14 data sources
7 seed genes
ranked 12,000 genes
9 candidates
P(X>=8) = 10^-13
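One plausible way to arrive at a tail probability of this size is a hypergeometric test: how likely is it that at least 8 of 9 candidates drawn at random from the genome are true bacterial-response genes? The sketch below uses the numbers from the slides (12,000 genes, ~200 estimated relevant genes, 9 candidates, 8 correct), but the choice of test is an assumption, since the transcript does not say how the value was computed. The helper `hypergeom_tail` is hypothetical.

```python
from math import comb

def hypergeom_tail(k_min, population, successes, draws):
    """P(X >= k_min) when `draws` items are taken without replacement
    from `population` items, of which `successes` are of interest."""
    total = comb(population, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(k_min, min(draws, successes) + 1)
    )
    return favourable / total

# Numbers taken from the slides; the test itself is an assumption.
p = hypergeom_tail(k_min=8, population=12_000, successes=200, draws=9)
print(f"P(X >= 8) = {p:.1e}")
```

With these inputs the result lands within about an order of magnitude of the slide's figure, which is consistent with the claim that 8 correct predictions out of 9 is very unlikely to happen by chance.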
Drug-induced liver injury prediction
fusion of 29 data sources

substantial improvement of accuracy over other machine learning approaches
Zitnik & Zupan, CAMDA-2013, Berlin (best paper award)
Discovery of disease-disease associations
Fusion of 11 data sources of systems-level molecular data.

Proposed 14 new associations not present in Disease Ontology.
Confirmed in literature.

Large-scale data fusion for reclassification of diseases.
Zitnik et al. (2013) Scientific Reports.
Biomedical Data Fusion
data integration by ranking & filtering
data fusion:
all data -> a single gene ranking model
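The "all data -> a single gene ranking model" idea can be illustrated with a toy fusion scheme. The sketch below is not the authors' algorithm: it assumes every data source is a nonnegative genes x objects matrix, factorizes all of them at once with one shared gene factor `G`, and then ranks genes by cosine similarity to the average latent profile of the seed genes. The names `fuse` and `rank_by_seeds`, the ranks, and the random data are hypothetical.

```python
import numpy as np

def fuse(relations, k_gene, ks, n_iter=200, eps=1e-9, seed=0):
    """Fit R_j ~ G @ S_j @ H_j.T for every source j with a shared G.

    relations: list of nonnegative arrays, each (n_genes x n_objects_j).
    ks: latent rank for the object type of each source.
    """
    rng = np.random.default_rng(seed)
    n_genes = relations[0].shape[0]
    G = rng.random((n_genes, k_gene))
    S = [rng.random((k_gene, k)) for k in ks]
    H = [rng.random((R.shape[1], k)) for R, k in zip(relations, ks)]
    for _ in range(n_iter):
        # The shared gene factor sees evidence from all sources at once.
        num = sum(R @ Hj @ Sj.T for R, Sj, Hj in zip(relations, S, H))
        den = sum(G @ Sj @ Hj.T @ Hj @ Sj.T for Sj, Hj in zip(S, H))
        G *= num / (den + eps)
        for j, R in enumerate(relations):
            H[j] *= (R.T @ G @ S[j]) / (H[j] @ S[j].T @ G.T @ G @ S[j] + eps)
            S[j] *= (G.T @ R @ H[j]) / (G.T @ G @ S[j] @ H[j].T @ H[j] + eps)
    return G, S, H

def rank_by_seeds(G, seed_idx):
    """Order genes by cosine similarity to the mean seed-gene profile."""
    profiles = G / (np.linalg.norm(G, axis=1, keepdims=True) + 1e-12)
    centroid = profiles[seed_idx].mean(axis=0)
    return np.argsort(-(profiles @ centroid))

# Toy stand-ins for two sources, e.g. expression and annotation data.
rng = np.random.default_rng(2)
expression = rng.random((30, 6))
annotation = rng.random((30, 10))

G, S, H = fuse([expression, annotation], k_gene=3, ks=[2, 4])
ranking = rank_by_seeds(G, seed_idx=[0, 1, 2])  # best candidates first
```

Because `G` is updated from all sources together, every gene gets one latent profile that reflects all the data, and the ranking is computed once in that shared space rather than per source.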
Funding: NIH, ARRS, EU FP7, Fulbright
50,000 clonal mutants
Chisholm & Firtel (2004) Nat Rev Mol Cell Biol
8 predictions correct
Current work with
Mariko Katoh-Kurasawa (Gad Shaulsky)
Graeme Mardon
Rui Chen