(Almost) Big Data in Biology: the Arising of a New Ecosystem

by Vincent Guillemot on 5 June 2014

Transcript of (Almost) Big Data in Biology: the Arising of a New Ecosystem

(Almost) Big Data in Biology: the Arising of a New Ecosystem
Biological data analysis
Genomics (RNA Seq, Whole Exome, Whole Genome, Microarrays)
Neuro-imaging
"Small" data (biostatistics)
Heterogeneous data integration
Teaching
...
Biostatistics and Bioinformatics Platform
Genomics
Heterogeneous data
Integration
Introduce sparsity and structure
Non-smooth convex penalty
Proximal mapping
Proximal gradient descent
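As an illustration of proximal gradient descent with a non-smooth convex penalty, here is a minimal sketch on a lasso problem. The choice of the L1 penalty, the least-squares loss, the function names and the simulated data are all assumptions made for the example; they do not come from the slides.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||X b - y||^2 + lam * ||b||_1 (hypothetical helper)."""
    # Step size 1/L, with L the Lipschitz constant of the smooth part's gradient.
    L = np.linalg.norm(X, 2) ** 2
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)                    # gradient of the smooth term
        b = soft_threshold(b - grad / L, lam / L)   # proximal step on the penalty
    return b

def objective(X, y, b, lam):
    return 0.5 * np.sum((X @ b - y) ** 2) + lam * np.sum(np.abs(b))

# Simulated data (assumption): 3 active variables out of 20.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
b_true = np.zeros(20)
b_true[:3] = [2.0, -1.5, 1.0]
y = X @ b_true + 0.01 * rng.standard_normal(50)
b_hat = proximal_gradient(X, y, lam=1.0)
```

Each iteration takes a gradient step on the smooth part and then applies the proximal mapping of the penalty, which is why the method needs that mapping in closed form.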
Proximal operator too complex or not known
Nesterov smoothing
--> Differentiable
--> Known gradient
--> Lipschitz continuous gradient
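The three properties listed for Nesterov smoothing can be checked on a small case. The sketch below smooths the L1 norm, chosen only because its smoothed form has a closed-form expression that is easy to verify (the L1 norm itself has a known proximal operator); the function name and the value of mu are assumptions made for the example.

```python
import numpy as np

def smoothed_l1(x, mu):
    """Nesterov-smoothed L1 norm: max over |a_i| <= 1 of <a, x> - (mu/2) ||a||^2."""
    a = np.clip(x / mu, -1.0, 1.0)        # closed-form maximizer of the dual problem
    value = a @ x - 0.5 * mu * (a @ a)    # smooth approximation of ||x||_1
    grad = a                              # known gradient, Lipschitz with constant 1/mu
    return value, grad

x = np.array([2.0, -0.05, 0.0, -3.0])
mu = 0.1                                  # smoothing parameter (assumed value)
value, grad = smoothed_l1(x, mu)
gap = np.abs(x).sum() - value             # lies between 0 and mu * len(x) / 2
```

A smaller mu gives a tighter approximation but a larger Lipschitz constant, hence smaller gradient steps.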
Neuroimaging
Professor Terence Speed, University of California at Berkeley, Department of Statistics
What is Big Data for the US NIH?
The term 'Big Data' is meant to capture the opportunities and challenges facing all biomedical researchers in accessing, managing, analyzing, and integrating datasets of diverse data types [e.g., imaging, phenotypic, molecular (including various '-omics'), exposure, health, behavioral, and the many other types of biological and biomedical and behavioral data] that are increasingly larger, more diverse, and more complex, and that exceed the abilities of currently used approaches to manage and analyze effectively.
Heterogeneous
Many variables
Complex
Difficult to integrate results
Major challenges in using biomedical Big Data include:
Locating data and software tools.
Getting access to the data and software tools.
Standardizing data and metadata.
Extending policies and practices for data and software sharing.
Organizing, managing, and processing biomedical Big Data.
Developing new methods for analyzing & integrating biomedical data.
Training researchers who can use biomedical Big Data effectively.
Biological Knowledge
Groups or networks
Neuro-images
Genomic data
Phenotypes
Clinical status
Biological measurements
Scores
Etc.
[Figure: diagram with elements labeled N, G and y, and three labels reading 0/1]
Multi-block Analysis
[Figure, shown twice: weights, outer component, inner component; add constraints]
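The loop over weights, outer components and inner components can be sketched as follows. This is only an illustration in the spirit of RGCCA under strong assumptions that are not in the slides: centered blocks, unit-norm weight vectors, inner components formed as a plain sum over linked blocks, and a 0/1 design matrix C stating which blocks are linked. The function name and the simulated data are hypothetical.

```python
import numpy as np

def multiblock_components(blocks, C, n_iter=200):
    """Alternate between outer components, inner components and weight updates."""
    rng = np.random.default_rng(0)
    weights = []
    for X in blocks:
        a = rng.standard_normal(X.shape[1])
        weights.append(a / np.linalg.norm(a))
    # Outer components: y_j = X_j a_j
    outer = [X @ a for X, a in zip(blocks, weights)]
    for _ in range(n_iter):
        for j, X in enumerate(blocks):
            # Inner component: sum of the outer components of the linked blocks
            inner = sum(C[j, k] * outer[k] for k in range(len(blocks)) if k != j)
            a = X.T @ inner
            weights[j] = a / np.linalg.norm(a)   # constraint assumed here: ||a_j|| = 1
            outer[j] = X @ weights[j]
    return weights, outer

# Two simulated blocks sharing one latent variable (assumption).
rng = np.random.default_rng(1)
latent = rng.standard_normal(100)
X1 = np.outer(latent, rng.standard_normal(5)) + 0.1 * rng.standard_normal((100, 5))
X2 = np.outer(latent, rng.standard_normal(4)) + 0.1 * rng.standard_normal((100, 4))
X1 -= X1.mean(axis=0)
X2 -= X2.mean(axis=0)
C = np.array([[0, 1], [1, 0]])                   # the two blocks are linked
weights, outer = multiblock_components([X1, X2], C)
```

With two linked blocks this iteration reduces to a power method on X1'X2. Adding a constraint amounts to changing the weight-update line; a sparsity constraint, for instance, would typically soft-threshold the vector before normalizing it, which is not shown here.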
Remarks
Add a constraint to...
make a more general model (RGCCA),
introduce sparsity (SGCCA)
or structure (PLS'14),
generalize the "link" (PLS'14).
THANK YOU
Fouad Hadj-Selem
Tommy Löfstedt
Edouard Duchesnay
Ivan Moszer
Justine Guégan
Andigoni Malousi
Vincent Perlbarg
Bioinformatics and Biostatistics platform
Convex Optimization with non-smooth penalties
Arthur Tenenhaus
Cathy Philippe
Vincent Frouin
SGCCA