Transcript of Session 4
Search for Signal
This step is for the modeler's own understanding.
Prepare your Data
Making Sure Your IVs Are Truly Independent
Where to Apply the Logistic Model
When the business objective/target involves categorizing observations into groups.
Risk analysis: Default prediction
Campaign response: Building a model to predict whether a customer will respond (Category 1) or not (Category 2)
HR and IT: Attrition prediction
Filtering Exercise: Finding the Relationship of the IVs with the Target
Nature: Categorical
Nature: Categorical
Nature: Numeric
Output of this Step
A short list of important variables that can predict the target variable
Search for the Signal
Step 1: Division of the Dataset into Training and Testing Datasets
Step 2: Creation of Dummy Variables (if required)
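Steps 1 and 2 can be sketched in plain Python as below. This is only an illustration: the records, the column names (`region`, `spend`, `responded`), the 70/30 split, and the seed are all hypothetical and not taken from the session's dataset.

```python
import random

def add_dummies(rows, column):
    """Replace a categorical column with 0/1 dummies.

    The first level (alphabetically) is dropped and acts as the reference level.
    """
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for level in levels[1:]:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy records, for illustration only.
customers = [
    {"region": "North", "spend": 120, "responded": 1},
    {"region": "South", "spend": 80, "responded": 0},
    {"region": "East", "spend": 45, "responded": 0},
    {"region": "North", "spend": 200, "responded": 1},
    {"region": "East", "spend": 60, "responded": 0},
    {"region": "South", "spend": 150, "responded": 1},
    {"region": "North", "spend": 30, "responded": 0},
    {"region": "East", "spend": 95, "responded": 0},
    {"region": "South", "spend": 110, "responded": 1},
    {"region": "North", "spend": 70, "responded": 0},
]

encoded = add_dummies(customers, "region")
train, test = train_test_split(encoded, test_fraction=0.3, seed=1)
print(len(train), len(test))
```

The sketch encodes the dummies before splitting so that both partitions share the same dummy columns; that ordering is a design choice of this example, not something the session prescribes.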
Use the VIF parameter to identify the variables that exhibit multicollinearity. If the VIF for a variable is greater than 10, it indicates a strong degree of correlation with at least one other variable in the model.
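For reference, the standard definition of the variance inflation factor shows what the threshold of 10 implies. The symbols are introduced here for illustration: R_j^2 is the R-squared obtained by regressing predictor j on all the other predictors.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{VIF}_j > 10 \iff R_j^2 > 0.9
```

In other words, a VIF above 10 means that more than 90% of the variance of that predictor is explained by the other predictors.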
Performing a demo run to understand the health of the model.
Model Thumb Rules
Thumb Rule 1:
Model convergence status
Thumb Rule 2:
Model p-value < .05
Thumb Rule 3:
p-value < .05
Thumb Rule 4:
Somers' D >= .5
Thumb Rule 5:
H&L p-value > .05
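To make the Somers' D rule concrete, here is a minimal sketch of how the statistic can be computed from predicted probabilities by counting concordant and discordant pairs. The function name and the toy data are hypothetical, and tied pairs are counted as neither concordant nor discordant.

```python
def somers_d(actual, predicted):
    """Somers' D = (concordant - discordant) / total (event, non-event) pairs."""
    events = [p for y, p in zip(actual, predicted) if y == 1]
    non_events = [p for y, p in zip(actual, predicted) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = discordant = 0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1   # the event was scored higher: concordant pair
            elif p_event < p_non_event:
                discordant += 1   # the event was scored lower: discordant pair

    total_pairs = len(events) * len(non_events)
    return (concordant - discordant) / total_pairs

# Hypothetical toy data, for illustration only.
actual = [0, 0, 1, 1]
predicted = [0.10, 0.40, 0.35, 0.80]
d = somers_d(actual, predicted)
print(d)  # 3 concordant, 1 discordant, 4 pairs -> 0.5
```

Somers' D is related to the area under the ROC curve by D = 2 * AUC - 1, so the threshold of .5 corresponds to an AUC of .75.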
For F and FirstPruch, VIF > 10, which means multicollinearity is present.
We will use the stepwise regression algorithm to help identify the optimal model combination.
According to stepwise selection, these 5 variables are the best possible combination to predict the target.
Deciding the cutoff probability value
Testing the stability of the model on validation data
Right now the predicted output is a probability value, but the prediction should be in binary format: whether people are buying Florence (Class 1) or not (Class 0).
To do this conversion, a cutoff probability value is required.
Rule: Choose the cutoff value where sensitivity and specificity are close to each other.
At a probability value of 0.060, sensitivity and specificity are closest to each other.
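The cutoff rule can be sketched as a simple sweep over candidate cutoffs. The labels, probabilities, and candidate grid below are hypothetical toy values chosen for illustration; they are not the session's data.

```python
def sensitivity_specificity(actual, probabilities, cutoff):
    """Classify as 1 when probability >= cutoff, then return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for y, p in zip(actual, probabilities):
        predicted = 1 if p >= cutoff else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)

def balanced_cutoff(actual, probabilities, candidates):
    """Return (cutoff, sensitivity, specificity) with the smallest |sens - spec|."""
    best = None
    for cutoff in candidates:
        sens, spec = sensitivity_specificity(actual, probabilities, cutoff)
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, cutoff, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical toy data: five non-buyers (0) followed by five buyers (1).
actual = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
probabilities = [0.02, 0.03, 0.04, 0.05, 0.09, 0.03, 0.07, 0.08, 0.10, 0.15]
candidates = [0.02, 0.04, 0.06, 0.08, 0.10]

cutoff, sens, spec = balanced_cutoff(actual, probabilities, candidates)
print(cutoff, sens, spec)
```

On this toy data the sweep picks 0.06, where both rates equal 0.8. When two cutoffs give the same gap, this sketch keeps the first one it meets.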
The ROC curve can also be used to decide the cutoff.
Sensitivity and specificity can both be 70%.
On the training dataset we are also getting 70% specificity and sensitivity.
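The stability check amounts to comparing the same metrics across the two datasets at the same cutoff. A minimal sketch follows; the tolerance of 0.05 and the function name are assumptions made for this example, since the session does not state how close the figures must be.

```python
def is_stable(train_rates, validation_rates, tolerance=0.05):
    """True when every metric differs by at most `tolerance` between the datasets.

    Each argument is a (sensitivity, specificity) tuple measured at the same cutoff.
    The default tolerance is an assumed value, not a rule from the session.
    """
    return all(
        abs(train_value - validation_value) <= tolerance
        for train_value, validation_value in zip(train_rates, validation_rates)
    )

# 70% / 70% on both datasets, as reported in the session.
stable = is_stable((0.70, 0.70), (0.70, 0.70))

# A hypothetical unstable case: validation metrics drift away from training.
unstable = is_stable((0.70, 0.70), (0.55, 0.80))

print(stable, unstable)
```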
Good news!
The model is stable.
All p-values are < .05
Somers' D > .5
H&L > .05