Follow your Rules, but listen to your Data

Rules Fest 2010 - Presentation: Seamless integration of predictive analytics and business rules = enhanced decisioning.
by Alex Guazzelli on 23 October 2010

Transcript of Follow your Rules, but listen to your Data

The Making of a Predictive Solution
Development
Deployment
Execution

Model Development
R/Rattle allows for reliable data manipulation and model development

Model Deployment
PMML allows for easy expression and deployment of data transformations and data mining models

Execution
Instant execution of solutions via Web Console, Web Services and Excel

Phases of the CRISP-DM Process Model

Rattle
Developed by Togaware - Australia
Offers a simple graphical interface for data analytics and model development
Great way to learn R, as it shows the R code it produces
It is open source
Exports models in PMML

Predictive Model Markup Language (PMML)
Standard used to represent data mining models
XML-based
Avoid proprietary issues and incompatibilities
Share models between compliant applications
Eliminates need for custom model deployment

PMML Structure
PMML defines a standard not only to represent data-mining models, but also data transformations
A Data Dictionary defines all the input data fields
Several data transformation strategies allow for intelligent extraction of feature detectors
A comprehensive list of data mining models offers power and flexibility
Model explanation allows for performance evaluation

Industry Support
Mature Standard - current version 4.0
Data Mining Group (dmg.org)
Active group and constant enhancements
Vendor-independent consortium
Supporters include: (list not preserved in this transcript)
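
To make the PMML Structure described above more concrete, here is a minimal sketch in Python that reads the Data Dictionary of a tiny PMML 4.0 fragment. The fragment is hand-written for illustration and is not from the talk: the field names `length`, `scale_intensity` and `species` are assumptions borrowed from the fish-sorting example used later in the presentation, and a deployable file would also carry any transformations and a model element.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML fragment (assumption: not taken from the talk).
# It contains only a Header and a DataDictionary, no model element.
PMML_FRAGMENT = """
<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Illustrative fragment only"/>
  <DataDictionary numberOfFields="3">
    <DataField name="length" optype="continuous" dataType="double"/>
    <DataField name="scale_intensity" optype="continuous" dataType="double"/>
    <DataField name="species" optype="categorical" dataType="string">
      <Value value="salmon"/>
      <Value value="sea bass"/>
    </DataField>
  </DataDictionary>
</PMML>
"""

# PMML elements live in a versioned XML namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_0"}


def data_fields(pmml_text):
    """Return (name, optype, dataType, allowed values) for each DataField."""
    root = ET.fromstring(pmml_text.strip())
    fields = []
    for field in root.findall("pmml:DataDictionary/pmml:DataField", NS):
        values = [v.get("value") for v in field.findall("pmml:Value", NS)]
        fields.append(
            (field.get("name"), field.get("optype"), field.get("dataType"), values)
        )
    return fields


if __name__ == "__main__":
    for name, optype, data_type, values in data_fields(PMML_FRAGMENT):
        print(name, optype, data_type, values)
```

Because the Data Dictionary is plain XML, any compliant application can discover the expected input fields this way, which is the portability argument the slides make.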
One Standard, One Process

A (new) first book on PMML

Model Execution via iPhone

Zementis Contributions
ADAPA: Decisioning Engine available for on-site and cloud deployments
Excel Add-in: scores from within Excel

Member of the DMG: helping to shape PMML
Code contributor for the R PMML Package

PMML Book: available on Amazon
PMML Blogs
PMML Articles: R Journal and others

Thank you!

The yin yang of model deployment
R
PMML

Rules + Predictive Analytics = Enhanced Decisioning

Rules = Expert Knowledge
Logic used by experts to solve problems can be represented as business rules: When body temperature more than X AND blood pressure more than Y, then ...

Predictive Analytics = Data-Driven Knowledge
Based on the ability to automatically recognize patterns in data not obvious to the expert eye. Learn from past behavior present in historical data to predict the future.

Ideal World: Seamless Integration of Predictive Analytics and Business Rules

Development

Knock, Knock.
"Who's there?"
"FBI. You're under arrest."
"But I haven't done anything"
"You will if we don't arrest you," replied Agent Smith of the Precrime Squad.
Minority Report - 20th Century Fox 2002

Rules and Predictive Analytics

Predictive Solutions

Fish Processing Plant
Goal: Automate the process of sorting incoming fish according to species (salmon or sea bass)
From [Duda, Hart and Stork, 2001]

Popular Techniques
Association Rules
Cluster Models
Decision Trees
Naive Bayes Classifiers
Neural Networks
Regression Models
Support Vector Machines
Time-Series

Outline

Introduction

Deployment

Predictive Solution in R
1) Raw Data: Obtain images of salmon and sea bass (as to be implemented in production).



2) Preprocessing: Image Processing Algorithms (e.g. segmentation to separate fish from background).


3) Feature Extraction: Data analysis shows that (a) on average, sea bass are larger than salmon; and (b) salmon have a higher scale intensity.



4) Model Training: Select a predictive technique and train the model to classify incoming fish based on extracted features.



5) Rules: Create business strategies around model output probabilities.

Length of fish
Intensity of the scales

Model Results

IF probability of salmon > 95 AND probability of sea bass < 5
THEN assign fish to [Premier Salmon Conveyor Belt]

IF probability of salmon > 80 AND probability of sea bass < 10
THEN assign fish to [Special Salmon Conveyor Belt]

IF probability of salmon > 60 AND probability of sea bass < 20
THEN assign fish to [Ordinary Salmon Conveyor Belt]

...

Business Rules in Our Proposed Solution

Enhanced Decisioning 1
Enhanced Decisioning 2
Enhanced Decisioning 3

Model Execution
Development
Deployment (PMML)
Execution

Our Solution at Work

Closing Remarks

Accessible from Anywhere

Fish Plant PMML Code
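
As a rough sketch of how the pieces above fit together (this is not the PMML or R code from the talk), the Python below trains a small Gaussian Naive Bayes classifier on the two features named in the transcript, fish length and scale intensity, and then applies the three IF/THEN conveyor-belt rules to the resulting probabilities. The training measurements are invented purely for illustration, equal class priors are assumed, and the function names `fit`, `probabilities` and `assign_belt` are hypothetical. Naive Bayes is swapped in only because it is one of the techniques listed; the transcript does not say which model the presenter used.

```python
import math
from statistics import mean, pstdev

# Invented toy data (assumption): (length in cm, scale intensity on a 0-10 scale).
TRAINING = {
    "salmon": [(60.0, 7.5), (65.0, 8.0), (58.0, 7.0), (62.0, 8.5)],
    "sea bass": [(80.0, 4.0), (85.0, 3.5), (78.0, 4.5), (90.0, 3.0)],
}


def fit(training):
    """Estimate a (mean, standard deviation) pair per class and feature."""
    model = {}
    for species, rows in training.items():
        columns = list(zip(*rows))  # one tuple of values per feature
        model[species] = [(mean(col), pstdev(col)) for col in columns]
    return model


def gaussian(x, mu, sigma):
    """Normal density; sigma must be non-zero (true for the toy data)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi)
    )


def probabilities(model, fish):
    """Return class probabilities in percent, assuming equal class priors."""
    likelihoods = {
        species: math.prod(
            gaussian(x, mu, sigma) for x, (mu, sigma) in zip(fish, params)
        )
        for species, params in model.items()
    }
    total = sum(likelihoods.values())
    return {s: 100.0 * value / total for s, value in likelihoods.items()}


def assign_belt(p_salmon, p_sea_bass):
    """Apply the three salmon rules from the slides, strictest first."""
    if p_salmon > 95 and p_sea_bass < 5:
        return "Premier Salmon Conveyor Belt"
    if p_salmon > 80 and p_sea_bass < 10:
        return "Special Salmon Conveyor Belt"
    if p_salmon > 60 and p_sea_bass < 20:
        return "Ordinary Salmon Conveyor Belt"
    return None  # the slides elide the remaining rules ("...")


if __name__ == "__main__":
    model = fit(TRAINING)
    probs = probabilities(model, (61.0, 7.8))  # a short, bright fish
    print(assign_belt(probs["salmon"], probs["sea bass"]))
```

Keeping the thresholds in a separate function from the model mirrors the division the talk argues for: the model can be retrained on new data without touching the business rules, and the rules can be tightened without retraining the model.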