Scalable Machine Learning with Apache Mahout
Transcript of Scalable Machine Learning with Apache Mahout
Linear, or near linear scaling:
Double the data, should take twice the time.
or: doubling the cluster size should halve the time.
Note: mention of "cluster size" implies that the algorithm is parallelizable. This is very helpful, but neither necessary nor sufficient to imply "scalable".

Collaborative Filtering
- Recommenders: User
- Taste Web App:

Clustering
- Dirichlet Process
- KMeans / Fuzzy KMeans
- Canopy
- Mean-Shift

Dimensional Reduction
- SingularValueDecomposition
- Latent Dirichlet Allocation

Frequent Pattern Mining

Classification
- Random Forests
- Perceptron / Winnow
- (Complementary) Naive Bayes

Text Utilities
- Lucene (index as input)
- Text to Vectors
- Collocations (nGrams / frequent phrases)
- Examples: Wikipedia

Math
- COLT
- Primitive Collections
- Distributed Matrices
- Online Stats
- Log-Likelihood

Jake Mannix
Unemployed Layabout (work done while Principal Search Engineer at LinkedIn)
(starting Monday @twitter)
email@example.com
@pbrane
/in/jakemannix

Resources:
Mahout in Action (Manning EAP)
Sean Owen, Robin Anil, Ted Dunning, and Ellen Friedman

Powered By
Here are a few companies powered by Mahout
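
The definition of linear scaling given at the start of the transcript (double the data, twice the time; double the cluster size, half the time) can be checked against measured run times. The following is only a minimal sketch under stated assumptions: the function names, the tolerance, and the timings are all hypothetical and are not part of Mahout or of the talk. It fits the slope of log(time) against log(data size), where a slope near 1 suggests near-linear scaling, and it compares measured speedup against the ideal speedup as the cluster grows.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope near 1.0 means doubling the data roughly doubles the time.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def is_near_linear(sizes, times, tolerance=0.15):
    """True when the fitted exponent is within `tolerance` of 1.0.

    The tolerance is an arbitrary illustrative choice.
    """
    return abs(scaling_exponent(sizes, times) - 1.0) <= tolerance


def parallel_efficiency(nodes, times):
    """Measured speedup divided by ideal speedup, relative to the first entry.

    1.0 means doubling the cluster size halved the time.
    """
    base_nodes, base_time = nodes[0], times[0]
    return [(base_time / t) / (n / base_nodes) for n, t in zip(nodes, times)]


if __name__ == "__main__":
    # Made-up timings in seconds, for illustration only.
    sizes = [1_000_000, 2_000_000, 4_000_000, 8_000_000]
    linear_times = [10.0, 20.5, 39.0, 81.0]
    quadratic_times = [10.0, 40.0, 160.0, 640.0]

    print(is_near_linear(sizes, linear_times))
    print(is_near_linear(sizes, quadratic_times))
    print(parallel_efficiency([1, 2, 4], [100.0, 60.0, 40.0]))
```

An efficiency that stays close to 1.0 as nodes are added matches the "doubling the cluster size should halve the time" criterion. As the note in the transcript points out, parallelizability alone is neither necessary nor sufficient for an algorithm to count as scalable, so the two checks are kept separate.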