Machine Learning and Deep Learning

by Li Dai on 2 April 2016


Transcript of Machine Learning and Deep Learning

Machine Learning
Deep Learning
by Li Dai
Based On
Anshul Joshi Machine learning
Zhen Zuo Deep Learning

How humankind knows the world
Machine Learning
Deep Learning
Trends
Opportunities
Do you like it
Why&How
Outline
Machine Learning
Work flow
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Types of Machine Learning
What is Machine Learning?
What is machine learning.
Supervised and Unsupervised learning.
Some algorithms of supervised and unsupervised learning.
Reinforcement Learning
Clustering.
Recommendation systems. (Use case, example.)
"A computer program is said to learn from experience (E) with some class of tasks (T) and a performance measure (P) if its performance at tasks in T as measured by P improves with E"
Examples:
Automatic speech recognition
Automatic voice/face/fingerprint recognition
Natural Language Processing
Automatic Medical diagnostics
Email spam detection
Advertisements
Content (image, video, text) categorization
Suspicious activity detection from CCTVs, fraudulent activities in banks. (ex. credit card fraud).
Frequent pattern mining
Predicting tastes in music (Pandora), in movies/shows (Netflix), predicting interests (Facebook, Grindr, Tinder), shopping list (Flipkart, Amazon) etc.
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Supervised Learning
The correct labels of the training data are known.
Classification: predict class from observations.
discrete/categorical variable
group the output into a class.
example: Email spam detection, Content (image, video, text) categorization, Google News.
Regression:
statistical process for estimating the relationships among variables.
the goal is to predict an output value from the training data.
real number/continuous.
example: Housing price prediction, sales prediction, web traffic prediction
Unsupervised Learning
The correct classes of the training data are not known
Clustering:
Task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).

Blind signal separation:
separation of a set of source signals from a set of mixed signals.


Use Case
Recommendation Engine
Decision Trees
A flowchart-like tree structure
Each internal node represents a test on an attribute
Each branch represents an outcome of that test
Each leaf node represents a class label or class distribution.

Used to classify an unknown example.
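The structure described above can be sketched as a small hand-built tree. This is only an illustration: the attributes (`outlook`, `windy`) and the labels are hypothetical and do not come from the slides, and the tree is written by hand rather than learned from data.

```python
# Hypothetical hand-built tree: internal nodes test an attribute,
# branches are the possible outcomes of the test, leaves are class labels.
TREE = {
    "attribute": "outlook",
    "branches": {
        "sunny": {"attribute": "windy",
                  "branches": {True: "stay in", False: "play"}},
        "rainy": "stay in",
        "overcast": "play",
    },
}

def classify(node, example):
    """Walk from the root to a leaf, following the branch that matches
    the example's value for each tested attribute."""
    while isinstance(node, dict):
        value = example[node["attribute"]]
        node = node["branches"][value]
    return node

label = classify(TREE, {"outlook": "sunny", "windy": False})
```

Classifying an unknown example is just this root-to-leaf walk; learning the tree from data (choosing which attribute to test at each node) is a separate step not shown here.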
k-Nearest Neighbour algorithm
All instances correspond to points in n-dimensional space.
The nearest neighbors are defined in terms of Euclidean distance.
The target function can be discrete or real valued.
For a discrete-valued target, kNN returns the most common value among the k training examples nearest to the query.
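A minimal sketch of the discrete-valued case, assuming points are tuples of floats and using a made-up two-class training set. The function name `knn_classify` and the default `k=3` are illustrative choices, not part of the slides.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """train: list of (point, label) pairs; each point is a tuple of floats."""
    # Sort the training examples by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    # Majority vote among the labels of the k nearest examples.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up data: two well-separated groups in 2-D space.
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")]
prediction = knn_classify(train, (1.1, 1.0), k=3)
```

For a real-valued target, the vote would typically be replaced by the mean of the k nearest target values.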
Linear Regression (Prediction)
Predicting the value of one variable on the basis of another variable.
If two variables are involved, one is the dependent variable, which is to be predicted, and the other is the independent variable, which is the basis for predicting the dependent variable.
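For the two-variable case, a sketch of an ordinary least squares fit might look like the following. The data is made up so that it lies exactly on y = 2x + 1, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept,
    where x is the independent and y the dependent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # made-up data lying on y = 2x + 1
slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept  # prediction for an unseen x
```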
k Means clustering
Organizing data into clusters such that there is:
high intra-cluster similarity
low inter-cluster similarity
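One common way to pursue this goal is Lloyd's algorithm, sketched below under simplifying assumptions: the initial centroids are supplied by the caller, the number of iterations is fixed, and the data points are made up.

```python
import math

def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

In practice the result depends on the initial centroids, so implementations usually run several random restarts.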
Recommender systems have changed the way people find products, information, and even other people.
They study patterns of behavior to predict what someone will prefer from among a collection of things they have never experienced.
Common approaches include content-based filtering, user-user collaborative filtering, item-item collaborative filtering, dimensionality reduction, and interactive critique-based recommenders.
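As an illustration of one of these approaches, here is a rough sketch of item-item collaborative filtering. The ratings table, user names, and item names are all hypothetical, and cosine similarity over co-rating users is just one of several possible similarity measures.

```python
import math

# Hypothetical ratings: user -> {item: rating}.
RATINGS = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5},
    "cat": {"A": 1, "C": 5},
}

def item_vector(item):
    """Ratings given to one item, keyed by user."""
    return {u: r[item] for u, r in RATINGS.items() if item in r}

def cosine(v, w):
    """Cosine similarity over the users who rated both items."""
    shared = set(v) & set(w)
    if not shared:
        return 0.0
    dot = sum(v[u] * w[u] for u in shared)
    norm_v = math.sqrt(sum(v[u] ** 2 for u in shared))
    norm_w = math.sqrt(sum(w[u] ** 2 for u in shared))
    return dot / (norm_v * norm_w)

def predict(user, item):
    """Similarity-weighted average of the user's ratings on other items."""
    target = item_vector(item)
    pairs = [(cosine(target, item_vector(other)), rating)
             for other, rating in RATINGS[user].items() if other != item]
    total = sum(sim for sim, _ in pairs)
    return sum(sim * rating for sim, rating in pairs) / total if total else None

score = predict("bob", "C")  # estimate for an item bob has not rated
```

With so few ratings the similarities are unreliable (two items sharing a single rater always get similarity 1.0), which is one reason real systems need much more data or a dimensionality-reduction step.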
Deep Learning
Fundamental Neural Network
Mainstream deep learning approaches
Conclusions
Function decomposition:
Convolutional Neural Network (CNN)
Motivated by the deep architecture of human brain and cognitive process, Deep Learning is a new area of Machine Learning research, which has been introduced with the objective of moving Machine Learning closer to one of its original goals: Artificial Intelligence.

Deep Learning is about learning multiple levels of representation and abstraction that help to make sense of data such as images, sound, and text.

Key principles of deep architecture:
Unsupervised learning of representations is used to pretrain each layer.
Unsupervised training of one layer at a time, on top of the previously trained ones. The representation learned at each level is the input for the next layer.
Use supervised training to fine-tune all the layers
Motivation

Fundamental Neural Network

Mainstream deep learning approaches

Recent applications of deep learning

Conclusion
Motivation
In order to learn the kind of complicated functions that can represent high-level abstractions, one may need deep architectures.
Current Recognition Approach
Traditional Neural Network
Deep Learning
Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae.
image/video pixels
object class
Hand-designed Feature Extraction
Trainable Classifier
Features are not learned (e.g. SIFT, HOG)
Trainable classifier is often generic (e.g. SVM)
Need supervised training labels
problem
Combining multiple hand-designed features or using multi-kernel learning improves performance, but the gain is small, and extracting multiple features is time consuming.

Can we learn better features? Unsupervised/Semisupervised?
Inspired by the architectural depth of the brain, neural network researchers had wanted for decades to train deep multi-layer neural networks, but no successful attempts were reported before 2006: researchers reported positive experimental results with typically two or three levels (i.e., one or two hidden layers), but training deeper networks consistently yielded poorer results.
Perceptron Neural Network

Feed-forward Neural Network

Recurrent Neural Network
Hierarchies in Vision
Mid-level features
Combination of attributes
Spatial pyramid
New methods for unsupervised pre-training have been developed (Restricted Boltzmann Machines = RBMs, autoencoders, contrastive estimation, etc.)

More efficient parameter estimation methods

Better understanding of model regularization
Improvements
Recent applications of deep learning
Forward-propagation/Backward-propagation:
A multilayer neural net can be thought of as a stack of logistic regression classifiers. The input to each layer is the output of the previous layer.
Logistic neurons:
Loss:
1) Compute loss on small mini-batch (F-prop)

2) Compute gradient w.r.t. parameters (B-prop)

3) Use gradient to update parameters
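The three steps above can be sketched for the simplest possible case, a single logistic neuron trained with cross-entropy loss. The data set, learning rate, batch size, and epoch count are made-up assumptions chosen only to keep the example small; a multilayer network would repeat the same loop with the gradient propagated through every layer.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, lr=0.5, epochs=100, batch_size=2, seed=0):
    """Mini-batch SGD for one logistic neuron y = sigmoid(w * x + b)."""
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not shuffled
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # 1) F-prop: predictions and mean loss on the mini-batch.
            preds = [sigmoid(w * x + b) for x, _ in batch]
            loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                        for p, (_, y) in zip(preds, batch)) / len(batch)
            # 2) B-prop: gradient of the loss w.r.t. w and b.
            grad_w = sum((p - y) * x for p, (x, y) in zip(preds, batch)) / len(batch)
            grad_b = sum(p - y for p, (_, y) in zip(preds, batch)) / len(batch)
            # 3) Use the gradient to update the parameters.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b, loss

# Made-up 1-D data: negative inputs are class 0, positive inputs are class 1.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w, b, last_loss = train(data)
```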
Deep Belief Networks (DBN)
Deep Boltzmann Machine (DBM)
Conceptual example of convolutional neural network
The convolution and subsampling process
Graphical model of DBN: The structure is similar to a sigmoid belief network, except that the top two layers form a Restricted Boltzmann Machine (RBM).
Comparison of different types of deep neural networks
What's RBM?
Structure of DBM: The model captures dependencies between hidden variables. All connections are undirected.
Flowchart of Deep Autoencoder
Restricted Boltzmann Machine (RBM)
Undirected graphical model of RBM
A pseudo-code of k-step Contrastive Divergence (CD-k)
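A rough sketch of what a k-step Contrastive Divergence update can look like for a small binary RBM follows. It is an illustration under stated assumptions, not the pseudo-code from the slide: the helper names, the learning rate, the zero initialization, and the single training pattern are all made up, and the update uses hidden probabilities rather than sampled states in the gradient estimate, which is one common variant.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample(probs, rng):
    """Draw one binary value per unit from its Bernoulli probability."""
    return [1 if rng.random() < p else 0 for p in probs]

def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]

def visible_probs(h, W, b):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]

def cd_k_update(v0, W, b, c, rng, k=1, lr=0.1):
    """One in-place CD-k update from a single training vector v0 (k >= 1)."""
    ph0 = hidden_probs(v0, W, c)       # positive phase, driven by the data
    h = sample(ph0, rng)
    for _ in range(k):                 # k steps of alternating Gibbs sampling
        vk = sample(visible_probs(h, W, b), rng)
        phk = hidden_probs(vk, W, c)
        h = sample(phk, rng)
    # Gradient estimate: data statistics minus reconstruction statistics.
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - vk[i] * phk[j])
        b[i] += lr * (v0[i] - vk[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - phk[j])

rng = random.Random(0)
n_visible, n_hidden = 4, 2
W = [[0.0] * n_hidden for _ in range(n_visible)]  # weights, visible x hidden
b = [0.0] * n_visible                             # visible biases
c = [0.0] * n_hidden                              # hidden biases
for _ in range(100):
    cd_k_update([1, 1, 0, 0], W, b, c, rng, k=1)
```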
Face expression recognition
Multi-modal learning
Parsing
Pattern completion
Share knowledge across categories
Data dimension reduction
2000-500-250-125-2
2000-2
Object Recognition
Abstract
Simple
Chaos
How humankind knows the world
Humankind
Emotion
Reasoning
Creation
Imagination
Learning
Computer
Hardworking
Fast
Machine Learning
Deep Learning
[10000hours Expert]
Trends
Opportunities
Do you like it
From Deductive To Inductive
From Explanation To Modeling
From Accuracy To Approximation
Trends in the ways of knowing the world
Artificial Intelligence Startups
https://angel.co/artificial-intelligence

http://en.wikipedia.org/wiki/List_of_mergers_and_acquisitions_by_Google

http://en.wikipedia.org/wiki/List_of_mergers_and_acquisitions_by_Facebook

http://en.wikipedia.org/wiki/List_of_mergers_and_acquisitions_by_Apple

http://en.wikipedia.org/wiki/List_of_mergers_and_acquisitions_by_Yahoo!
Artificial Intelligence
Business
Interdisciplinary
What humankind can think
will all become reality
Why&How
Panacea
The 10 Best Tech Careers In 2015
No. 1: Software Engineer, $98,074
No. 2: Database Administrator, $97,835
No. 3: Product Manager, $113,363
No. 4: Data Scientist, $104,476
No. 5: Solutions Architect, $121,657
No. 6: QA Engineer, $77,499
No. 7: Network Engineer, $87,518
No. 8: IT Project Manager, $103,710
No. 9: Mobile Developer, $79,810
No. 10: Sales Engineer, $91,318
We are still pursuing the truth of the world.

But what machine learning or deep learning produces from data is only the world as seen through that data. It works well, but it is not the world itself.

So machine learning and deep learning help us know how, but not why.

We still need genius and more powerful methods to know the world.
Questions & Thanks
Programming
– R Python
Math Essential
statistics
– Probability
– Statistical inference
– Validation
– Estimates of error, confidence intervals
Linear algebra
– Hugely useful for compact representation of linear transformations on data
– Dimensionality reduction techniques
Optimization theory
Business Feeling
In Detail
Top Masters in this field
Peter L. Bartlett
Professor of Computer Science and Statistics
Department of Statistics,
Division of Computer Science/EECS
University of California, Berkeley
Research Expertise and Interest
statistics, machine learning, statistical learning theory, adaptive control

Michael I. Jordan
Pehong Chen Distinguished Professor
Department of EECS
Department of Statistics
University of California, Berkeley
Research Expertise and Interest
Mixtures of experts, spectral clustering, Graphical model, nonparametric, Bayesian.

Yoshua Bengio
Full Professor
Department of Computer Science
and Operations Research
Canada Research Chair in
Statistical Learning Algorithms
Université de Montréal
Research Expertise and Interest
Deep Learning
Geoffrey E. Hinton
Co-author of the work that popularized backpropagation;
introduced the contrastive divergence training algorithm
Part-time researcher at Google
Department of Computer Science
University of Toronto
Research expertise and interest
deep learning
Yann LeCun,
Director of AI Research, Facebook
Founding Director of the NYU Center for Data Science
Silver Professor of Computer Science, Neural Science,
and Electrical and Computer Engineering,
The Courant Institute of Mathematical Sciences,
Center for Neural Science, and
Electrical and Computer Engineering Department, NYU School of Engineering
New York University.
ANDREW NG
Andrew Ng is Associate Professor of Computer
Science at Stanford;
Chief Scientist of Baidu;
and Chairman and Co-founder of Coursera.
Research expertise and interest
machine learning
Daphne Koller
Professor
Computer Science Department at Stanford University
MacArthur Foundation Fellowship
Research expertise and interest
Probabilistic models and machine learning
John Lafferty
Louis Block Professor
Department of Statistics
Department of Computer Science
Physical Sciences Division
University of Chicago
Before 2011 Professor of Computer Science,
Machine Learning, and Statistics
at Carnegie Mellon University.
Research expertise and interest
statistics and machine learning
Yaser S. Abu-Mostafa
Professor
Electrical Engineering and Computer Science
California Institute of Technology
Research expertise and interest
machine learning and computational finance
David M. Blei
Professor
Statistics and Computer Science
Columbia University
Research expertise and interest
Probabilistic graphical models
Approximate posterior inference
Topic models
Bayesian nonparametric statistics
Journal of Machine Learning Research
Basic setup of the learning problem
Death will be precious
Robots will come to protect mind and intelligence
Bodies Gone Thoughts Left
Love will be commodity
Global Brain dominates world
Machine Learning
Deep Learning
Artificial Intelligence