
Conclusion

Process

Machine Learning is numerically intensive

Matrix Multiplication

Matrix Representation

  • Get Data

What is Machine Learning?

  • Analyse

Machine Learning involves developing systems that process (potentially very large) datasets, and developing algorithms and models that can then be used to predict future events.

  • Develop Model
  • Train Model

Matrix Multiplication

  • Profit?

Performance

  • Display Results

http://upload.wikimedia.org/wikipedia/en/e/eb/Matrix_multiplication_diagram_2.svg
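As an illustration of the matrix representation and multiplication slides, here is a minimal sketch of the naive triple-loop algorithm over a 'vector of vectors'. It is written in Python for brevity and is not the code used in the talk; the name `mat_mul` and the sample matrices are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows (a 'vector of vectors')."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("columns of a must equal rows of b")
    result = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            # result[i][j] is the dot product of row i of a and column j of b
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            result[i][j] = total
    return result


a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]    # 3x2
b = [[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]   # 2x3
product = mat_mul(a, b)                      # 3x3
print(product)
```

The three nested loops are why the cost grows as n^3 for square matrices, which is what makes the choice of implementation matter so much for large inputs.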

GPGPU on the JVM soon?

  • Aparapi - converts Java bytecode to OpenCL to run on the GPU (not Clojure)

Comparison

  • Project Sumatra - generalised API for tuples / GPGPU on JVM

Vectorized vs Multi-Threaded

*

Why Clojure for ML?

Multiply 1000x1000 matrix of doubles

  • Naive implementation: 220 s
  • Multithreaded (Incanter): 5.7 s
  • Vectorized (Clatrix/jblas): 0.8 s
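To put the three timings in perspective, the following Python sketch derives the speedup over the naive version and the implied arithmetic throughput. It uses only the figures quoted above; counting 2·n^3 floating-point operations for an n x n multiply (one multiply and one add per inner-loop step) is an assumption of this sketch, not something stated in the talk.

```python
n = 1000
# Assumption: one multiply and one add per inner-loop step, so 2 * n^3 operations.
flops = 2 * n ** 3

# Timings in seconds, as quoted on the slide.
timings = {
    "naive": 220.0,
    "multithreaded (Incanter)": 5.7,
    "vectorized (Clatrix/jblas)": 0.8,
}

baseline = timings["naive"]
summary = {
    name: {"speedup": baseline / seconds, "mflops": flops / seconds / 1e6}
    for name, seconds in timings.items()
}

for name, stats in summary.items():
    print(f"{name}: {stats['speedup']:.1f}x faster than naive, "
          f"{stats['mflops']:.0f} MFLOPS")
```

On these numbers the vectorized version is a few hundred times faster than the naive one, a far larger gain than the four cores of the benchmark machine alone could explain.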

A 2nd-order Markov model over 100,000 words, P(3rd | 1st, 2nd), needs 100,000^3 = 10^15 values
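The 10^15 figure follows from needing one probability for every (1st, 2nd, 3rd) word triple. A short Python sketch of the arithmetic is below; storing each value as an 8-byte double is an assumption added here to give a sense of scale.

```python
vocabulary_size = 100_000
context_length = 2  # 2nd order: condition on the two preceding words

# One entry per (1st, 2nd, 3rd) word combination.
table_entries = vocabulary_size ** (context_length + 1)

bytes_per_value = 8  # assumption: each probability is a 64-bit double
table_bytes = table_entries * bytes_per_value
petabytes = table_bytes / 10 ** 15

print(f"{table_entries:.0e} entries, about {petabytes:.0f} PB as a dense table")
```

A dense table of that size is clearly impractical, which is the point of the slide: the numerical scale of the problem drives the engineering choices.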

It's not just about the theory

  • Scalability
  • Reliability
  • Maintainability

* Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz

It's not just about optimizing the maths

  • Part of a larger system
  • Connect with the real world
  • General purpose vs mathematical language

Get Data

Load, process & store large/messy data sets

Data Pipelines

Data Load / Process

Parse CSV into 'vector of vectors'

  • Load File
  • Strip Comments
  • Split at Delimiter
  • Convert to Doubles
  • ML sits inside larger systems.
  • Scalability, reliability and maintainability are important metrics for success.
  • Clojure is *great* for the 'general' (general-purpose) parts.
  • Idiomatic Clojure can perform 'quite' well in many circumstances.
  • There are options to break out to high-performance native code.
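The CSV steps listed above (load file, strip comments, split at delimiter, convert to doubles) can be sketched as a small pipeline. This is an illustrative Python version, not the Clojure shown in the talk; the function names, the `#` comment marker and the `,` delimiter are assumptions.

```python
def parse_csv(text, delimiter=",", comment="#"):
    """Parse CSV text into a 'vector of vectors' of floats."""
    rows = []
    for line in text.splitlines():
        # Strip comments: drop everything from the comment marker onwards.
        line = line.split(comment, 1)[0].strip()
        if not line:
            continue  # skip blank and comment-only lines
        # Split at the delimiter and convert each field to a double.
        rows.append([float(field) for field in line.split(delimiter)])
    return rows


def load_csv(path, **kwargs):
    """Load a file from disk and parse it with parse_csv."""
    with open(path, encoding="utf-8") as handle:
        return parse_csv(handle.read(), **kwargs)


sample = "# height, weight\n1.5, 60\n1.8, 82.5  # trailing comment\n\n"
matrix = parse_csv(sample)
print(matrix)
```

Keeping each step a small, separate transformation mirrors the pipeline style of the slides and makes it easy to swap in a different delimiter or comment convention.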

Connecting to Data

  • Access to Java ecosystem.

Scraped websites, XML/JSON
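For JSON sources, the same 'vector of vectors' shape can be reached by picking numeric fields out of each record. The Python sketch below shows the idea; the function name `records_to_rows`, the field names and the sample payload are all hypothetical.

```python
import json


def records_to_rows(json_text, fields):
    """Turn a JSON array of objects into rows of floats, one column per field."""
    records = json.loads(json_text)
    return [[float(record[field]) for field in fields] for record in records]


# Hypothetical payload, standing in for data fetched from a real service.
payload = '[{"height": 1.5, "weight": 60}, {"height": 1.8, "weight": 82.5}]'
rows = records_to_rows(payload, ["height", "weight"])
print(rows)
```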

Advanced Pipelines

https://github.com/swannodette/enlive-tutorial

Machine Learning vs Clojure

A story in n^2 parts

Neale Swinnerton - @sw1nn

London Clojure User Group 11/2012

Hadoop

Storm

Cascalog
