Follow your Rules, but listen to your Data

Rules Fest 2010 - Presentation: Seamless integration of predictive analytics and business rules = enhanced decisioning.
by Alex Guazzelli on 23 October 2010

Prezi Transcript

Model Development
  • R/Rattle allows for reliable data manipulation and model development

The Making of a Predictive Solution
  • Development
  • Deployment
  • Execution

Model Deployment
  • PMML allows for easy expression and deployment of data transformations and data mining models

Execution
  • Instant execution of solutions via Web Console, Web Services and Excel

Phases of the CRISP-DM Process Model

Rattle
  • Developed by Togaware - Australia
  • Offers a simple graphical interface for data analytics and model development
  • Great way to learn R, as it shows the R code it produces
  • It is open source
  • Exports models in PMML

Predictive Model Markup Language
  • Standard used to represent data mining models
  • XML-based
  • Avoids proprietary issues and incompatibilities
  • Lets compliant applications share models
  • Eliminates the need for custom model deployment

PMML Structure
  • PMML defines a standard not only to represent data mining models, but also data transformations
  • A Data Dictionary defines all the input data fields
  • Several data transformation strategies allow for intelligent extraction of feature detectors
  • A comprehensive list of data mining models offers power and flexibility
  • Model explanation allows for performance evaluation

Industry Support
  • Mature Standard - current version 4.0
  • Data Mining Group (dmg.org)
  • Active group and constant enhancements
  • Vendor independent consortium
  • Supporters include:

One Standard, One Process

A (new) first book on PMML

Model Execution via iPhone

Zementis Contributions
  • ADAPA: Decisioning Engine available for on-site and cloud deployments
  • Excel Add-in: scores from within Excel
  • Member of the DMG: helping to shape PMML
  • Code contributor for the R PMML Package
  • PMML Book: available on Amazon
  • PMML Blogs
  • PMML Articles: R Journal and others

PMML

Thank you!
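
The "PMML Structure" slide says that a Data Dictionary defines all the input data fields. As a rough illustration only, the Python sketch below reads a hand-written PMML 4.0 fragment and lists those fields. The fragment is not taken from the presentation: the field names (length, scale_intensity, species) are hypothetical, loosely based on the fish example, and a complete PMML file would also contain a model element. The presentation itself works in R; Python is used here purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative PMML 4.0 fragment (NOT from the presentation).
# Field names are hypothetical; a full PMML file would also hold a model.
PMML_DOC = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Illustrative fragment"/>
  <DataDictionary numberOfFields="3">
    <DataField name="length" optype="continuous" dataType="double"/>
    <DataField name="scale_intensity" optype="continuous" dataType="double"/>
    <DataField name="species" optype="categorical" dataType="string">
      <Value value="salmon"/>
      <Value value="sea bass"/>
    </DataField>
  </DataDictionary>
</PMML>"""

# PMML 4.0 elements live in this XML namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_0"}


def list_data_fields(pmml_text):
    """Return one dict per DataField declared in the DataDictionary."""
    root = ET.fromstring(pmml_text)
    fields = []
    for field in root.findall("pmml:DataDictionary/pmml:DataField", NS):
        # Categorical fields may enumerate their allowed values.
        values = [v.get("value") for v in field.findall("pmml:Value", NS)]
        fields.append({
            "name": field.get("name"),
            "optype": field.get("optype"),
            "dataType": field.get("dataType"),
            "values": values,
        })
    return fields


if __name__ == "__main__":
    for f in list_data_fields(PMML_DOC):
        print(f["name"], f["optype"], f["dataType"], f["values"])
```

Because the fragment is plain XML, any consumer with an XML parser can discover the expected inputs without knowing which tool produced the model, which is the portability argument the slides make.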
R

The yin yang of model deployment

Rules + Predictive Analytics = Enhanced Decisioning

Rules = Expert Knowledge
  • Logic used by experts to solve problems can be represented as business rules: When body temperature is more than X AND blood pressure is more than Y, then ...

Predictive Analytics = Data-Driven Knowledge
  • Based on the ability to automatically recognize patterns in data not obvious to the expert eye.
  • Learn from past behavior present in historical data to predict the future.

Ideal World: Seamless Integration of Predictive Analytics and Business Rules

Development

Knock, Knock.
"Who's there?"
"FBI. You're under arrest."
"But I haven't done anything"
"You will if we don't arrest you," replied Agent Smith of the Precrime Squad.
Minority Report - 20th Century Fox 2002

Rules and Predictive Analytics

Predictive Solutions

Fish Processing Plant
  • Goal: Automate the process of sorting incoming fish according to species (salmon or sea bass)
  • From [Duda, Hart and Stork, 2001]

  • Association Rules
  • Cluster Models
  • Decision Trees
  • Naive Bayes Classifiers
  • Neural Networks
  • Regression Models
  • Support Vector Machines
  • Time-Series

Outline

Popular Techniques

Thank you!

Introduction

Deployment

Predictive Solution in R
  1) Raw Data: Obtain images of salmon and sea bass (as to be implemented in production).
  2) Preprocessing: Image Processing Algorithms (e.g. segmentation to separate fish from background).
  3) Feature Extraction: Through data analysis we find out that 1) on average, sea bass is larger than salmon; and 2) salmon has a higher scale intensity.
  4) Model Training: Select a predictive technique and train the model to classify incoming fish based on extracted features.
  5) Rules: Create business strategies around model output probabilities.
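
Steps 3 and 4 above (feature extraction and model training) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the presenter's model: the transcript does not say which technique was trained, so a Gaussian naive Bayes classifier is picked here from the list of popular techniques, the training measurements are made up, and Python stands in for the R/Rattle workflow the talk describes.

```python
import math

# Made-up training data, NOT from the presentation:
# (length in cm, scale intensity on an arbitrary 0-10 scale).
# Chosen so that sea bass is longer and salmon has higher intensity.
TRAINING = {
    "salmon":   [(60.0, 7.5), (65.0, 8.0), (58.0, 7.0), (62.0, 8.5)],
    "sea bass": [(80.0, 4.0), (85.0, 3.5), (78.0, 5.0), (90.0, 4.5)],
}


def fit(training):
    """Estimate a prior plus a per-feature mean and variance for each species."""
    total = sum(len(rows) for rows in training.values())
    model = {}
    for species, rows in training.items():
        stats = []
        for column in zip(*rows):  # one column per feature
            mean = sum(column) / len(column)
            var = sum((x - mean) ** 2 for x in column) / len(column)
            stats.append((mean, var))
        model[species] = {"prior": len(rows) / total, "stats": stats}
    return model


def log_gaussian(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict_proba(model, features):
    """Return posterior probabilities per species, summing to 1."""
    log_scores = {}
    for species, params in model.items():
        score = math.log(params["prior"])
        for x, (mean, var) in zip(features, params["stats"]):
            score += log_gaussian(x, mean, var)
        log_scores[species] = score
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    weights = {s: math.exp(v - top) for s, v in log_scores.items()}
    norm = sum(weights.values())
    return {s: w / norm for s, w in weights.items()}


if __name__ == "__main__":
    model = fit(TRAINING)
    print(predict_proba(model, (61.0, 7.8)))  # a short fish with bright scales
```

The per-species probabilities returned by `predict_proba` are the kind of model output that step 5 builds business rules around.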
Length of fish

Intensity of the scales

Model Results

  • IF probability of salmon > 95 AND probability of sea bass < 5 THEN assign fish to [Premier Salmon Conveyor Belt]
  • IF probability of salmon > 80 AND probability of sea bass < 10 THEN assign fish to [Special Salmon Conveyor Belt]
  • IF probability of salmon > 60 AND probability of sea bass < 20 THEN assign fish to [Ordinary Salmon Conveyor Belt]
  • ...

Business Rules in Our Proposed Solution

Enhanced Decisioning 1

Enhanced Decisioning 2

Enhanced Decisioning 3

Model Execution
  • Development
  • Deployment (PMML)
  • Execution

Our Solution at Work

Closing Remarks

Accessible from Anywhere

Fish Plant PMML Code
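
The three conveyor-belt rules in the transcript can be sketched as a small rule table in Python. The thresholds and belt names are taken from the transcript; everything else is an assumption made for illustration: the function name `assign_belt` is hypothetical, probabilities are treated as percentages, rules are checked top to bottom with the first match winning, and `None` is returned where the transcript elides the remaining rules ("...").

```python
# (minimum salmon %, maximum sea bass %, destination) per rule, in the
# order given in the transcript. First-match-wins ordering is an assumption.
RULES = [
    (95, 5, "Premier Salmon Conveyor Belt"),
    (80, 10, "Special Salmon Conveyor Belt"),
    (60, 20, "Ordinary Salmon Conveyor Belt"),
]


def assign_belt(p_salmon, p_sea_bass):
    """Return the belt for a fish given model probabilities in percent.

    Returns None when no listed rule fires; the transcript's remaining
    rules are elided, so that case is deliberately left undefined here.
    """
    for min_salmon, max_sea_bass, belt in RULES:
        if p_salmon > min_salmon and p_sea_bass < max_sea_bass:
            return belt
    return None


if __name__ == "__main__":
    print(assign_belt(97, 3))   # fires the first rule
    print(assign_belt(70, 15))  # fires the third rule
```

Keeping the thresholds in a table rather than in nested conditionals lets the business strategy change without touching the model, which matches the separation between rules and predictive analytics that the talk argues for.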