Internship-Intelligent network layout

by Urszula Czerwinska on 16 June 2015

Transcript of Internship-Intelligent network layout

Create a layout of a biological network driven by the data
How?
Java programming
DeDaL
Interaction network
DeDaL
Computational Systems Biology for Cancer
Urszula Czerwińska
ulcia.liberte@gmail.com

layout - visualization of the network
Cytoscape 3.0 plug-in
Conclusions
network structure + high-throughput data = data driven network layout
accessible and useful tool for biological network analysis
Thank You!
grid
hierarchical
force directed
structure based layouts
U900
Cytoscape 3.0 plugin
Data Driven Network Layout
Pure structure based layout
Mixed Layout
Pure Data Driven Layout
High-throughput data
Protein/gene expression data (-omic): RNAseq, microArray, LC-MS/MS, ...
Big data!
Multidimensional
Statistical methods for meaningful visualization and pattern identification
PCA: Principal Component Analysis
PMs: nonlinear Principal Manifolds*
*1. Gorban A, Kegl B, Wunsch D, Zinovyev A (eds.). Principal Manifolds for Data Visualisation and Dimension Reduction. 2008. Lecture Notes in Computational Science and Engineering 58, p. 340.
2. Gorban AN, Zinovyev A. 2010. Principal manifolds and graphs in practice: from molecular biology to dynamical systems. Int J Neural Syst 20(3):219-32.
orthogonal linear transformation

data is projected into a new coordinate system

where axes are principal components (PC)

first PC has the largest possible variance, second: the second largest variance etc.

dimension reduction
individuals with the same profiles will be placed nearby in the 2D space
cluster identification


Linear PCA versus nonlinear Principal Manifolds for visualization of breast cancer microarray data
a) Configuration of nodes and 2D Principal Surface in the 3D PCA linear manifold. The dataset is curved and cannot be mapped adequately on a 2D principal plane;
b) The distribution in the internal 2D non-linear principal surface coordinates (ELMap2D) together with an estimation of the density of points;
c) The same as b), but for the linear 2D PCA manifold (PCA2D).
elastic maps algorithm
principal manifolds approximation
cytoscape-swing-api
VDAOengine
libraries
User-friendly interface
dialogue windows
Data Driven network Layout
so far in Cytoscape we have...
no freely accessible data driven layout!
low expression
high expression
now, there is DeDaL

Transformation between two layouts
network (structure based layout)
Data Analysis (PCA)
purely Data Driven network Layout
DeDaL
Networks' Alignment
0%
100%
50%
purely structure based
purely data driven
mixed data driven
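
The transformation between a structure based layout and a data driven layout can be sketched as a blend of node coordinates. This is a minimal illustration that assumes plain linear interpolation; the slides do not state the exact formula, and the function name `morph_layout`, the parameter `t`, and the convention that `t = 0` means purely structure based are all assumptions.

```python
def morph_layout(structure_layout, data_layout, t):
    """Blend two layouts of the same nodes.

    t = 0.0 returns the structure based positions, t = 1.0 the data driven
    positions, and values in between give a mixed layout (assumed convention).
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0 and 1")
    mixed = {}
    for node, (sx, sy) in structure_layout.items():
        dx, dy = data_layout[node]
        # Linear interpolation of each coordinate (an assumption of this sketch).
        mixed[node] = ((1.0 - t) * sx + t * dx, (1.0 - t) * sy + t * dy)
    return mixed


# Hypothetical coordinates for two nodes.
structure = {"geneA": (0.0, 0.0), "geneB": (10.0, 0.0)}
data_driven = {"geneA": (4.0, 8.0), "geneB": (6.0, -2.0)}
half_way = morph_layout(structure, data_driven, 0.5)
```

With `t = 0.5` every node sits at the midpoint between its two positions, which corresponds to the 50% mixed layout.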
rotation
mirroring
minimization of Euclidean distance
reference layout
before alignment
after alignment
it works with any two layouts!
condition 1
it works with any layouts
you can align more than two layouts at the same time
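
Alignment by rotation and mirroring with minimal Euclidean distance to a reference layout can be sketched as follows. This is an illustration only, not the plugin's code: it assumes both layouts are first centered on their centroids, it uses the closed-form best rotation angle for 2D point sets, and the names `align_layout` and `_centered` are hypothetical.

```python
import math


def _centered(points, names):
    """Return the points shifted so their centroid is the origin, plus the centroid."""
    cx = sum(points[n][0] for n in names) / len(names)
    cy = sum(points[n][1] for n in names) / len(names)
    return [(points[n][0] - cx, points[n][1] - cy) for n in names], (cx, cy)


def align_layout(layout, reference):
    """Rotate (and mirror if it helps) `layout` to best match `reference`.

    Returns (aligned positions, whether mirroring was used, remaining cost),
    where cost is the sum of squared Euclidean distances between matched nodes.
    """
    names = sorted(set(layout) & set(reference))
    ref_pts, (rx, ry) = _centered(reference, names)
    lay_pts, _ = _centered(layout, names)
    best = None
    for mirrored in (False, True):
        pts = [(-x, y) for x, y in lay_pts] if mirrored else lay_pts
        # Closed-form angle that minimizes the summed squared distance.
        a = sum(x * u + y * v for (x, y), (u, v) in zip(pts, ref_pts))
        b = sum(x * v - y * u for (x, y), (u, v) in zip(pts, ref_pts))
        theta = math.atan2(b, a)
        c, s = math.cos(theta), math.sin(theta)
        rotated = [(x * c - y * s, x * s + y * c) for x, y in pts]
        cost = sum((p - u) ** 2 + (q - v) ** 2
                   for (p, q), (u, v) in zip(rotated, ref_pts))
        if best is None or cost < best[0]:
            best = (cost, mirrored, rotated)
    cost, mirrored, rotated = best
    aligned = {n: (p + rx, q + ry) for n, (p, q) in zip(names, rotated)}
    return aligned, mirrored, cost
```

To align more than two layouts, one option is to call the function once per layout against the same reference layout.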
Website
Description
Files
Tutorial
Coming soon
Publication
DeDaL
Independent functions
http://bioinfo-out.curie.fr/projects/dedal/
Follow up of the project
Acknowledgments: Zinovyev Andrei, Calzone Laurence, Bonnet Eric, Viara Eric, Martignetti Loredana and all Inserm U900
Motivation
Diseases, like cancer, are related to dysregulation of molecular interactions in large molecular networks
protein 1 ppi protein 2
source (node) / inter. type (edge) / target (node)
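
A line of the form "source, interaction type, target" can be read into nodes and edges as sketched below. The example assumes a whitespace-separated, three-column text format similar to Cytoscape's simple interaction format (SIF) and node names without spaces; the function name `parse_interactions` and the sample identifiers are hypothetical.

```python
def parse_interactions(lines):
    """Split 'source type target' lines into a node list and an edge list."""
    nodes, edges = [], []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        source, kind, target = parts
        for name in (source, target):
            if name not in nodes:
                nodes.append(name)
        edges.append((source, kind, target))
    return nodes, edges


# Hypothetical interactions; 'ppi' marks a protein-protein interaction.
nodes, edges = parse_interactions([
    "protein1 ppi protein2",
    "protein2 ppi protein3",
    "",
])
```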
Providing a meaningful representation of the knowledge of molecular interactions is not trivial
There is a large amount of high-throughput data, whose analysis is problematic as well
condition 2
Maths...
1. Multidimensional data: a matrix of genes by cell lines / tumor types / time points
2. Center the matrix by gene (Xc: centered matrix)
3. Compute the covariance matrix S
4. Compute the eigenvectors and eigenvalues of the covariance matrix
The eigenvector with the highest eigenvalue will become PC1, the eigenvector with the second highest eigenvalue will become PC2, etc.
5. Derive a new dataset (projection into a new coordinate system)
We decide to keep only 2 PCs (for a 2D representation)
L: number of PCs kept, 1 <= L <= m
Y: new dataset; E: eigenvector matrix; X: initial centered dataset
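
The PCA steps can be sketched in plain Python as below. This is a generic textbook PCA, not the plugin's implementation: it treats each row as one item to be placed in 2D and each column as a variable, centers every column, and finds the leading eigenvectors by power iteration with deflation. How rows and columns map onto genes and samples is not stated on the slides, so that orientation, the function names, and the sample numbers are assumptions.

```python
import math


def center_columns(rows):
    """Subtract each column's mean so every variable is centered at zero."""
    n, m = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in rows]


def covariance_matrix(xc):
    """Sample covariance (m x m) of an already centered n x m matrix."""
    n, m = len(xc), len(xc[0])
    return [[sum(row[a] * row[b] for row in xc) / (n - 1) for b in range(m)]
            for a in range(m)]


def top_eigenpairs(s, count, iterations=1000):
    """Leading (eigenvalue, eigenvector) pairs via power iteration and deflation."""
    m = len(s)
    work = [row[:] for row in s]
    pairs = []
    for k in range(count):
        v = [1.0 + 0.1 * ((i + k) % m) for i in range(m)]  # deterministic start
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        value = sum(v[i] * work[i][j] * v[j] for i in range(m) for j in range(m))
        pairs.append((value, v))
        # Remove the component just found so the next pass finds the next PC.
        for i in range(m):
            for j in range(m):
                work[i][j] -= value * v[i] * v[j]
    return pairs


def pca_project(rows, keep=2):
    """Return the rows projected on the first `keep` PCs, and the eigenvalues."""
    xc = center_columns(rows)
    pairs = top_eigenpairs(covariance_matrix(xc), keep)
    projected = [[sum(row[j] * vec[j] for j in range(len(row))) for _, vec in pairs]
                 for row in xc]
    return projected, [value for value, _ in pairs]


# Hypothetical expression profiles: 5 items measured in 3 conditions.
profiles = [[2.0, 4.1, 1.0], [3.0, 6.2, 1.1], [4.0, 7.9, 0.9],
            [5.0, 10.1, 1.0], [6.0, 12.0, 1.05]]
coords_2d, eigenvalues = pca_project(profiles, keep=2)
```

Each entry of `coords_2d` is an (x, y) pair that could serve as a node position in a purely data driven layout.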
or Simple Data-Driven Layout
Data and network
Data: A549 epithelial cells treated for up to 72 hours with TGF-beta to induce epithelial-mesenchymal transition (EMT). Affymetrix Human Genome U133 Plus 2.0 Array.

Divided into two conditions: early time (0-16h) and late time (24-72h)

Network: 56 genes with highest variance retrieved from Human Protein Reference Database (HPRD)

Some improvements....
Outliers are detected according to the formula

m(node) > m(all) + p*std(all)

m - mean pairwise distance
std - standard deviation of the pairwise distance
p - parameter (=1.5)
such a node is placed on the same line at the distance 2*std from the center
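
The outlier rule can be sketched as follows. Several details are assumptions of this illustration and are not given on the slides: "center" is taken to be the centroid of all nodes, std is the population standard deviation of all pairwise distances, and "the same line" is read as the ray from the center through the node. The function name `adjust_outliers` is hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def adjust_outliers(coords, p=1.5):
    """Flag nodes with m(node) > m(all) + p*std(all) and pull them inwards."""
    names = list(coords)
    pair = {frozenset((a, b)): math.dist(coords[a], coords[b])
            for a, b in combinations(names, 2)}
    distances = list(pair.values())
    m_all = mean(distances)
    std_all = pstdev(distances)
    # Centroid of all nodes, used as the "center" (assumption).
    cx = mean(x for x, _ in coords.values())
    cy = mean(y for _, y in coords.values())
    adjusted = dict(coords)
    outliers = []
    for node in names:
        m_node = mean(pair[frozenset((node, other))]
                      for other in names if other != node)
        if m_node > m_all + p * std_all:
            outliers.append(node)
            x, y = coords[node]
            radius = math.hypot(x - cx, y - cy)
            if radius > 0:
                # Keep the direction from the center, set the distance to 2*std.
                scale = 2 * std_all / radius
                adjusted[node] = (cx + (x - cx) * scale, cy + (y - cy) * scale)
    return adjusted, outliers
```

In a layout with very few nodes a single distant node inflates both the mean and the standard deviation, so it may not pass the threshold; the rule is more selective on larger networks.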
Overlapping
noise is introduced
if edge < 1
r=random( -3 : 3 )
Node (x,y) = Node (x+r, y+r)
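
The overlap rule can be sketched as below. The slide leaves some points open, so this illustration assumes that "edge < 1" refers to the distance between the two endpoints of an edge, that the target endpoint is the one that moves, and that r is drawn uniformly from the interval [-3, 3]. The name `separate_overlaps` and its parameters are hypothetical.

```python
import math
import random


def separate_overlaps(coords, edges, min_length=1.0, spread=3.0, rng=None):
    """Shift the target of every too-short edge by the same random r in x and y."""
    rng = rng or random.Random()
    result = dict(coords)
    moved = []
    for source, target in edges:
        if math.dist(result[source], result[target]) < min_length:
            r = rng.uniform(-spread, spread)  # r = random(-3 : 3)
            x, y = result[target]
            result[target] = (x + r, y + r)   # Node(x, y) = Node(x + r, y + r)
            moved.append(target)
    return result, moved


# Hypothetical positions: 'b' nearly overlaps 'a', 'c' is far away.
positions = {"a": (0.0, 0.0), "b": (0.2, 0.1), "c": (10.0, 10.0)}
links = [("a", "b"), ("a", "c")]
spread_out, moved = separate_overlaps(positions, links, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the jitter reproducible between runs.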
Nodes without values
Ignored in the PCA/PMs algorithm; each such node is placed at the mean of its neighbor nodes' coordinates
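
Placing a node without values at the mean of its neighbors' coordinates can be sketched like this. It is a minimal illustration: the name `place_missing_nodes` is hypothetical, and leaving a node unplaced when none of its neighbors has a position is an assumption, since the slides do not cover that case.

```python
def place_missing_nodes(coords, edges, all_nodes):
    """Give nodes without data the mean position of their positioned neighbors."""
    neighbors = {node: set() for node in all_nodes}
    for source, target in edges:
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)
    placed = dict(coords)
    for node in all_nodes:
        if node in coords:
            continue  # already positioned by the data driven layout
        known = [coords[n] for n in sorted(neighbors[node]) if n in coords]
        if known:
            placed[node] = (sum(x for x, _ in known) / len(known),
                            sum(y for _, y in known) / len(known))
    return placed


# Hypothetical example: 'x' has no expression values but two positioned neighbors.
known_positions = {"a": (0.0, 0.0), "b": (4.0, 2.0)}
network_edges = [("a", "x"), ("b", "x")]
final_positions = place_missing_nodes(known_positions, network_edges,
                                      ["a", "b", "x", "lonely"])
```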
[Figure labels: layouts l source, l target, l', l ref; degree scale 0 to 6; "outlier", "adjustment", "after adjustment"]