Coordinate Descent for Big Data - Multicore, Cluster, GPU

Martin Takac

on 31 October 2013

Transcript of Coordinate Descent for Big Data - Multicore, Cluster, GPU

Coordinate Descent Methods
Martin Takáč
Big Data Optimization
Failure of naive parallelism!
The Problem
LASSO with 1 Billion Variables
Problem size: 500GB
joint work with Peter Richtárik, Jakub Mareček
(Multicore, Cluster, GPU, ....)
Multicore, shared memory
Cluster, MPI/OpenMP
#CPUs: 32
RAM: 20-30GB, sometimes 1-2TB
Following theory
all processors read the same x
all processors choose different coordinates
all processors compute their updates
x is updated by all processors afterwards
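
The four steps above can be sketched for the LASSO objective 0.5*||Ax - b||^2 + lam*||x||_1. This is a minimal serial simulation, not the implementation from the talk: the loop over `coords` stands in for the processors, and the names `parallel_cd_step`, `soft_threshold`, `lam` and the step-size damping factor `beta` are assumptions introduced for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v towards zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def parallel_cd_step(A, b, x, lam, coords, beta):
    """One synchronous step; A is a list of rows, x is modified in place.

    beta >= 1 damps the step to account for updating several
    coordinates at once (hypothetical parameter name).
    """
    m, n = len(A), len(x)
    # Step 1: every "processor" reads the same x (shared residual Ax - b).
    r = [sum(A[j][i] * x[i] for i in range(n)) - b[j] for j in range(m)]
    deltas = {}
    # Steps 2 and 3: each processor takes a different coordinate
    # and computes its update from the shared snapshot.
    for i in coords:
        L_i = sum(A[j][i] ** 2 for j in range(m))  # squared column norm
        if L_i == 0.0:
            continue
        g = sum(A[j][i] * r[j] for j in range(m))  # partial derivative
        step = beta * L_i
        deltas[i] = soft_threshold(x[i] - g / step, lam / step) - x[i]
    # Step 4: only after all updates are computed is x changed.
    for i, d in deltas.items():
        x[i] += d
    return x
```

With two identical columns, `parallel_cd_step([[1, 1], [1, 1]], [1, 1], [0.0, 0.0], 0.0, [0, 1], 1.0)` moves both coordinates by a full step and overshoots to `[1.0, 1.0]`, leaving the objective unchanged, while `beta = 2.0` lands on `[0.5, 0.5]` and fits the data exactly. Setting `beta` to the number of coordinates updated is a conservative choice that never increases the objective.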
Asynchronous implementation
requires 2 synchronization steps per iteration!!!
Tesla K20X
#CUDA cores: 2688
max single precision performance: 3.95 Tflops
GDDR5: 6 GB (Memory bandwidth: 250 GB/sec)
SIMD architecture
sensitive to memory access patterns
Nice sampling of coordinates is not good!
What about creating blocks?
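
A minimal sketch of the two sampling strategies contrasted here. It assumes the point of the slide is that uniformly random coordinates are scattered in memory, whereas blocks of consecutive coordinates can be read contiguously; the function names and the fixed `block_size` are hypothetical, and `n` is assumed to be divisible by `block_size`.

```python
import random


def tau_nice_sampling(n, tau, rng):
    """Uniformly random subset of tau distinct coordinates out of n."""
    return sorted(rng.sample(range(n), tau))


def block_sampling(n, block_size, num_blocks, rng):
    """Pick num_blocks distinct blocks of consecutive coordinates.

    Coordinates inside a block are adjacent in memory, which suits
    SIMD hardware better than scattered indices.
    """
    total_blocks = n // block_size  # assumes n % block_size == 0
    chosen = rng.sample(range(total_blocks), num_blocks)
    coords = []
    for block_index in sorted(chosen):
        start = block_index * block_size
        coords.extend(range(start, start + block_size))
    return coords
```

For example, `block_sampling(16, 4, 2, random.Random(0))` returns 8 coordinates forming two aligned runs of 4 consecutive indices, while `tau_nice_sampling(16, 8, random.Random(0))` returns 8 indices with no such structure guaranteed.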
many nodes, communication using MPI
each node has many cores => OpenMP parallelization
warp size = 2
Synchronous vs. asynchronous
FP (Fully Parallel)
PS (Parallel and Serial blocks)
Lasso problem with 3 TB data matrix
additional memory requirements
How many updates on node for RA-PS variant
avg. computation time for one coordinate
duration of one reduce all operation
total computation time
4 sockets, 8 cores per socket
IBM Research