Machine Learning vs Clojure

A review of Machine Learning options in the Clojure ecosystem

Neale Swinnerton

on 6 November 2012

Transcript of Machine Learning vs Clojure

Get Data
Load, process & store large/messy data sets
Connecting to Data
Comparison
Process
Machine Learning is numerically intensive
Scraped websites, XML/JSON

Machine Learning vs Clojure
A story in n^2 parts

Data Load / Process
Parse CSV into 'vector of vectors':
Load File
Strip Comments
Split at Delimiter
Convert to Doubles

Data Pipelines
Advanced Pipelines
Storm
Hadoop
Cascalog

Matrix Multiplication
http://upload.wikimedia.org/wikipedia/en/e/eb/Matrix_multiplication_diagram_2.svg

Matrix Multiplication Performance
Vectorized vs Multi-Threaded
Multiply 1000x1000 matrix of doubles
Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
Naive impl: 220s
Multithreaded (incanter): 5.7s
Vectorized (clatrix/jblas): 0.8s

2nd Order MM, 100,000 words
P(3rd|1st,2nd) needs 10^15 values

GPGPU on the JVM soon?
aparapi - converts Java bytecode to OpenCL to run on GPU (not Clojure)
Project Sumatra - generalised API for tuples / GPGPU on JVM

Matrix Representation

Neale Swinnerton - @sw1nn
London Clojure User Group 11/2012

https://github.com/swannodette/enlive-tutorial

Conclusion
ML sits inside larger systems.

Why Clojure for ML?
It's not just about the theory
Maintainability
It's not just about optimizing the maths
Part of a larger system
Connect with the real world
General purpose vs mathematical language

What is Machine Learning?
Machine Learning involves developing systems to process (potentially very large) datasets, developing algorithms and models that can then be used to make predictions of future events.

Display Results
Get Data
Develop Model
Train Model
Profit?

Access to Java ecosystem.
Options to break out to high performance native code.
Idiomatic Clojure can perform 'quite' well in many circumstances.
Clojure is *great* for the 'general'
Scalability, Reliability, Maintainability: important metrics for success.
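
The four CSV steps named in the transcript (load file, strip comments, split at delimiter, convert to doubles) can be sketched as a small pipeline. The talk's own Clojure code is not in the transcript, so this is only an illustrative Python sketch; the function names, the "#" comment marker and the "," delimiter are all assumptions.

```python
from pathlib import Path


def parse_csv(text, delimiter=",", comment="#"):
    """Parse delimited text into a list of rows, each a list of floats.

    The comment marker and delimiter defaults are assumptions; the
    transcript does not say which ones the talk used.
    """
    rows = []
    for line in text.splitlines():
        # Strip comments: drop everything from the comment marker onwards.
        line = line.split(comment, 1)[0].strip()
        if not line:
            continue  # skip blank and comment-only lines
        # Split at the delimiter, then convert each field to a double.
        rows.append([float(field) for field in line.split(delimiter)])
    return rows


def load_csv(path, **options):
    """Load a file (hypothetical path) and parse it with parse_csv."""
    return parse_csv(Path(path).read_text(), **options)


sample = "# x, y\n1.0, 2.5\n\n3, 4  # trailing comment\n"
print(parse_csv(sample))  # [[1.0, 2.5], [3.0, 4.0]]
```

The result is a list of lists, the Python analogue of the 'vector of vectors' the slide mentions. Keeping parsing separate from file loading lets the same function run over text that arrives from a scraper or a queue.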
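
For reference, the "naive" approach in the matrix multiplication comparison presumably means the textbook triple loop. A minimal Python sketch of that loop follows; it is not the implementation that was benchmarked, and the timings in the transcript do not apply to it. The name mat_mul is hypothetical.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows (textbook triple loop)."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions do not match")
    result = [[0.0] * p for _ in range(n)]
    for i in range(n):          # each row of a
        for k in range(m):      # each element along the shared dimension
            a_ik = a[i][k]
            for j in range(p):  # each column of b
                result[i][j] += a_ik * b[k][j]
    return result


print(mat_mul([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]))
# [[19.0, 22.0], [43.0, 50.0]]
```

For two 1000x1000 matrices the inner statement runs 10^9 times, which is why the vectorized and multithreaded options in the comparison matter.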
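
The 10^15 figure for the second-order model can be checked with a few lines of arithmetic. This sketch reads "MM" as Markov model, which is an assumption, and takes a dense table with one probability for every (1st, 2nd, 3rd) word triple; the 8 bytes per value is likewise an assumption (one double each).

```python
def dense_table_entries(vocabulary, order):
    """Entries in a dense table of P(next | previous `order` words)."""
    return vocabulary ** (order + 1)


vocabulary = 100_000
entries = dense_table_entries(vocabulary, order=2)
print(entries == 10 ** 15)  # True

# Assuming one 8-byte double per entry, storage in petabytes (10^15 bytes):
print(entries * 8 / 1e15)  # 8.0
```

A table that size cannot be stored densely, which fits the slide sitting next to a "Matrix Representation" heading.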