Transcript of Data Science
What is Data
Transformation of raw data into meaningful and useful information for business analysis purposes
Kanwal Prakash Singh
Small Data (really) ?
Internet of Things
Extraction of knowledge / insights from data
Depict the analysis in an engaging way
Art and science
A statistician who knows more programming than other statisticians, and a programmer who knows more statistics / ML / maths than other programmers
A Data Scientist can be a better BI manager / expert
A BI manager / expert who also knows statistics / ML / maths is, again, a Data Scientist
In Memory Storage
Nominal : categorical, discrete labels with no order
Ordinal : natural ordering, ranking
Interval : similar to ordinal, with a defined difference between values
Ratio : similar to interval, with a natural zero
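A minimal sketch of what each scale allows, using only the standard library. The survey values and variable names are made up for illustration.

```python
import statistics

# Hypothetical survey columns, one per measurement scale.
blood_group = ["A", "O", "B", "O", "AB", "O"]                               # nominal
satisfaction = ["low", "high", "medium", "high", "low", "medium", "high"]  # ordinal
temperature_c = [10.0, 20.0, 30.0]   # interval: 0 °C is an arbitrary zero
weight_kg = [40.0, 80.0]             # ratio: 0 kg means no weight at all

# Nominal: only counting is meaningful, so the mode is the usable summary.
most_common_group = statistics.mode(blood_group)

# Ordinal: values can be ranked, so the median is meaningful.
order = ["low", "medium", "high"]
ranks = [order.index(s) for s in satisfaction]
median_satisfaction = order[statistics.median_low(ranks)]

# Interval: differences and means are meaningful, ratios are not.
mean_temperature = statistics.mean(temperature_c)

# Ratio: the natural zero makes "twice as heavy" a meaningful statement.
weight_ratio = weight_kg[1] / weight_kg[0]

print(most_common_group, median_satisfaction, mean_temperature, weight_ratio)
```

Note that 20 °C is not "twice as hot" as 10 °C, while 80 kg really is twice 40 kg; that is the practical difference between interval and ratio.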
Null Hypothesis : refers to a general statement or default position that there is no relationship between two measured phenomena
A Statistical Model can be thought of as a pair (Y, P), where Y is the set of possible observations and P is a set of probability distributions over Y
A statistical model is a formalization of relationships between variables in the form of mathematical equations (source: Wikipedia)
Errors & Bias
Type 1 : False positive - incorrect rejection of a true null hypothesis
Type 2 : False negative - incorrect failure to reject a false null hypothesis
Bias : systematically missing the target by some quantity / measure
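The two error types can be seen in a small simulation. This is a sketch under assumptions that are not in the slides: a two-sided z-test with known standard deviation 1, a 5% significance level (critical value 1.96), samples of size 30, and a true mean of 0.5 for the "null is false" case.

```python
import random
import statistics

def rejects_null(sample, mu0=0.0, sigma=1.0, z_crit=1.96):
    """Two-sided z-test: reject H0 (mean == mu0) when |z| > z_crit."""
    z = (statistics.mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return abs(z) > z_crit

def rejection_rate(true_mean, trials=2000, n=30, seed=0):
    """Fraction of simulated samples for which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        rejections += rejects_null(sample)
    return rejections / trials

# H0 is true here, so every rejection is a Type 1 error (false positive).
type1_rate = rejection_rate(true_mean=0.0)

# H0 is false here, so every non-rejection is a Type 2 error (false negative).
type2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Type 1 rate: {type1_rate:.3f}, Type 2 rate: {type2_rate:.3f}")
```

The Type 1 rate should land near the chosen significance level, while the Type 2 rate depends on how far the true mean is from the null value and on the sample size.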
Supervised : learning from a labeled data set.
Example : linear / logistic regression, SVM, K-NN
Unsupervised : finding structures and relationships in unlabeled data.
Example : K-means, DBSCAN
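As an illustration of learning without labels, here is a minimal K-means sketch (the standard assign / update loop). The 2-D points and the starting centroids are invented for the example; real use would normally pick starting centroids at random and check for convergence.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain K-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical data: two visually obvious groups, no labels supplied.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

No label is ever passed in; the grouping comes purely from distances between the points.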
Construction and study of systems that can learn from data. Examples: explain it like I'm 5!
Classification : problem of identifying to which of a set of categories a new observation belongs
Clustering : grouping of samples such that samples belonging to the same group / cluster are more similar to each other than to samples in other groups
Regression : takes a group of random variables, thought to be predicting Y, and tries to find a mathematical relationship between them and Y
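For regression, the simplest case is one predictor and a straight line. The sketch below fits y = slope * x + intercept by ordinary least squares; the data points are invented, and least squares is only one of several ways to find such a relationship.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations that roughly follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

The fitted line can then be used to predict Y for a new x, for example `slope * 6.0 + intercept`.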
Support Vector Machines
Perceptron Training Algorithm
Artificial Neural Networks
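The perceptron training algorithm listed above can be sketched in a few lines. The version below is the classic mistake-driven update rule with labels +1 / -1; the learning rate, epoch limit, and the logical AND toy data are assumptions chosen for the example.

```python
def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Perceptron rule: on each misclassified sample, nudge the weights towards it."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in zip(samples, labels):  # target is +1 or -1
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else -1
            if predicted != target:
                weights = [w + learning_rate * target * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * target
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly: stop early
            break
    return weights, bias

def predict(weights, bias, x):
    """Classify x as +1 or -1 using the learned weights and bias."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1

# Toy, linearly separable problem: logical AND.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(weights, bias, predictions)
```

The rule is only guaranteed to converge when the classes are linearly separable; on data like XOR it would keep making mistakes until the epoch limit.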
Forecasting : making statements (predictions) about events that have not yet occurred. Examples: weather forecasting, trading, sales forecasting, etc.
Optimization : minimizing / maximizing a cost function. Examples: gradient descent, K-means.
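Gradient descent, mentioned above, can be shown on a one-variable cost function. This is a sketch with a made-up cost, (x - 3)^2, and an arbitrarily chosen learning rate and step count.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the cost."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

# Hypothetical cost function: cost(x) = (x - 3)^2, which is smallest at x = 3.
def cost_gradient(x):
    """Derivative of (x - 3)^2 with respect to x."""
    return 2 * (x - 3)

minimum = gradient_descent(cost_gradient, start=0.0)
print(minimum)
```

A learning rate that is too large makes the steps overshoot and diverge, while one that is too small makes progress very slow, so in practice it is tuned per problem.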
Mathematics / statistics
Behavior Analysis : Clustered SVMs
Business Expansion Optimization
Data Science Central
Courses in IITB
Play with Data Sets