Effective Interoperability

Talk given at 2014 HIPERFIT Workshop for Partners and Faculty
by

Simon Andreas Frimann Lund

on 23 March 2016

Transcript of Effective Interoperability

Python/Chapel interoperability module
pyChapel
pychapel.readthedocs.org
FFI
Module Compiler
Example
that turns
Python / Numpy
into an array language
npbackend
also known as the Python module
npbackend, how does that help?
And how do I use it?
npbackend - Usage and User Interface
Goals
Legacy support
Minimally Intrusive
Graceful degradation

Enable Use of multiple "targets"
Bohrium
Numexpr
libgpuarray
pyChapel
... <insert your backend here>
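The "graceful degradation" goal above can be sketched as a fallback import. This is a minimal illustration, not npbackend's actual mechanism: the helper `load_backend` and the module name `hypothetical_accelerated_target` are assumptions made for the example, and the standard-library `math` module stands in for NumPy so the sketch runs without third-party packages.

```python
import importlib


def load_backend(candidates, fallback="math"):
    """Return the first importable module in `candidates`, else the fallback.

    Hypothetical helper: it only illustrates the idea of falling back to a
    default implementation when an accelerated target is unavailable.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # target not installed: try the next one
    return importlib.import_module(fallback)


# "hypothetical_accelerated_target" is a made-up name, so the import fails
# and the code degrades to the fallback module instead of crashing.
backend = load_backend(["hypothetical_accelerated_target"])
print(backend.__name__)
```

User code keeps calling `backend.<function>` either way, which is what keeps the approach minimally intrusive.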
Benchmarks
Shallow Water
Domain: 2000x2000
Iterations: 100
Heat Equation
Domain: 3000x3000
Iterations: 100
Measuring elapsed wall-clock in seconds
Comparing NumPy to npbackend with different targets
Comparing NumPy to NumPy through npbackend to measure overhead
Intel Xeon E5640, 2.66 GHz
12 MB LLC
96 GB DDR3 main memory
NVIDIA GeForce GTX 460, 1 GB GDDR5
With: GCC 4.8.2, OpenCL 1.2, Linux 3.13, Python 2.7, and NumPy 1.8.1.

Average of three runs; deviation from the mean used as verification.
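The measurement procedure described above (elapsed wall-clock time, averaged over three runs, with the deviation from the mean as a sanity check) could look roughly like the sketch below. The names `measure`, `speedup`, and `workload` are assumptions for illustration; the actual benchmark scripts are not shown in the talk.

```python
import statistics
import time


def measure(fn, runs=3):
    """Time `fn` over `runs` executions using a wall-clock timer.

    Returns (mean, largest absolute deviation from the mean, samples).
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    mean = statistics.mean(samples)
    max_dev = max(abs(s - mean) for s in samples)
    return mean, max_dev, samples


def speedup(baseline_seconds, target_seconds):
    """Speedup of a target relative to the baseline, e.g. plain NumPy."""
    return baseline_seconds / target_seconds


def workload():
    # Stand-in computation; the real benchmarks are Shallow Water and Heat Equation.
    sum(i * i for i in range(100000))


mean, max_dev, samples = measure(workload)
print("mean %.6fs, max deviation %.6fs" % (mean, max_dev))
```

A large deviation relative to the mean would suggest a noisy run that should be repeated before its speedup is reported.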
npbackend, facilitating target optimizations
Indirection Layer
Lazy Evaluation

Construct GPU kernels
Array Contraction
Loop fusion
Loop tiling

Dead code elimination

Task parallelism
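To make lazy evaluation, loop fusion, and array contraction concrete, here is a toy sketch in pure Python. It is not how Bohrium or any other target implements them: the `Lazy` class is a hypothetical stand-in that records elementwise operations and only computes when `evaluate()` is called, using a single loop and no temporary array for the intermediate `alpha * C`.

```python
import operator


class Lazy:
    """Deferred elementwise expression; nothing is computed on construction."""

    def __init__(self, fn, size):
        self.fn = fn      # maps an index to the value at that index
        self.size = size

    @staticmethod
    def wrap(value, size=None):
        if isinstance(value, Lazy):
            return value
        if isinstance(value, list):
            return Lazy(value.__getitem__, len(value))
        return Lazy(lambda i: value, size)  # scalar, broadcast to `size`

    def _binary(self, other, op):
        other = Lazy.wrap(other, self.size)
        if other.size != self.size:
            raise ValueError("operand sizes differ")
        f, g = self.fn, other.fn
        # Only a new per-index function is recorded; no array is allocated.
        return Lazy(lambda i: op(f(i), g(i)), self.size)

    def __add__(self, other):
        return self._binary(other, operator.add)

    def __mul__(self, other):
        return self._binary(other, operator.mul)

    __radd__ = __add__   # valid because + and * are commutative here
    __rmul__ = __mul__

    def evaluate(self):
        # One fused loop over the index space evaluates the whole expression.
        return [self.fn(i) for i in range(self.size)]


B = Lazy.wrap([1.0, 2.0, 3.0])
C = Lazy.wrap([10.0, 20.0, 30.0])
alpha = 0.5
A = (B + alpha * C).evaluate()
print(A)  # [6.0, 12.0, 18.0]
```

An eager evaluation would first materialize `alpha * C` as a full array and then loop a second time for the addition; deferring the work is what gives a target the chance to fuse the loops and drop the temporary.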
Exemplified
Results
Numexpr x2
BohriumCPU x2
libgpuarray x3.7
BohriumGPU x12
Results
Numexpr x2.2
BohriumCPU x2.6
libgpuarray x5.6
BohriumGPU x18
Effective Interoperability
or what can be gained from playing well with others...
Productivity vs Performance
Choose one?
High-Level vs Low-Level Languages
R
Python
MATLAB
OpenMP/pthreads
CUDA
OpenCL/CUDA/OpenACC
MPI
GPUs,
Accelerators
APUs,
Hybrid,
FPGA
CPUs
For efficiency
Modeling
Visualization
Data Analysis
Experimentation


The Best of Both Worlds?
High Productivity Computing Systems Project aka
High Productivity High Performance Languages
X10
Fortress
UPC
Chapel
Parallel Programming Languages
Shared Memory
Distributed Memory
Convenient notation
DEAD
Java-ish
C/C++ ish
Alive and well: open source, with an active team at Cray Inc. and academic collaborations
Designed from scratch with heritage / lessons learned from HPF and ZPL.
Multi-resolution: high-level abstractions are implemented within the language itself, and abstractions can be peeled off.
Chapel
pragmas
C / C++ / Fortran
Locale Abstraction
Hierarchical
Abstract unit of target architecture
Reasoning about locality and affinity

PGAS
Public vs Private determined by scoping

Express:

begin on Locales[0]
....
begin on node.left do
search(node.left)

Probing:
locale.(physicalMemory, id, name, coreCount, etc.)

Code-blocks / Expression
begin
cobegin
coforall
sync
serial
On Data
atomic variables
sync variables
Global-view abstractions
forall loops
High-level: A = B + alpha * C
Mapped to hardware with Domain Maps using Locales
Default strategies user-overridable
in one slide
First-class index-set
dense
sparse
strided
associative
un-structured

Handles:
Memory layout
Distribution
Domains
Loop constructs
Iterators
Parallel Iterators
I/O
The case for Interoperability
or, why bother? My approach is the best!
Languages as toolboxes
Systems programming (C)
Desktop experimentation (R, MATLAB, Python)
Parallelization and scalability (Chapel, Futhark, Streaming NESL, APL)

Interoperate to avoid duplication of efforts
Who wants to reimplement the entire software stack for interactive interpreters and visualization that MATLAB, R, IPython Notebook, and the like already provide?

An aid for adopting new languages

Revamp "old" languages