KM data mining

Kacper Winiarczyk

on 21 April 2010

Transcript of KM data mining

Data mining (DM) is also known as:
1. Knowledge discovery in databases (KDD)
2. Information discovery
3. Information harvesting
4. Data archaeology
5. Data pattern processing
DM definition:
1. A body of scientific knowledge accumulated through decades of forming well-established disciplines
2. A technology evolving from high-volume transaction systems, data warehouses, and the Internet
3. A business community forced by an intensely competitive environment to innovate
The search for relationships and global patterns that exist in large databases but are hidden among the vast amount of data (Holsheimer and Kersten, 1994)
BUSINESS INTELLIGENCE
A global term for all processes, techniques, and tools that support decision making based on information technology
DATA MINING AND BUSINESS INTELLIGENCE
Business drivers
- Competition
- Information abundance
- Serving the knowledge workers efficiently
TECHNICAL DRIVERS
Analytical methods provide advanced learning capabilities:
• Drawing hypotheses
• Finding patterns
• Undertaking actions
• No analysis – no business intelligence
• No business intelligence – no assimilation of data
Objective of DM
• Optimize the use of available data
• Reduce the risk of wrong decisions
DM development: analytical foundations
• Statistics
  o Testing hypotheses
  o Finding relations
• Machine learning
  o Subfield of artificial intelligence
  o Computers learning by themselves
  o Learning from examples
  o Reinforcement learning
  o Supervised and unsupervised learning
Data warehouses
• Data generated from operations is a volatile asset:
  o OLTP (online transaction processing)
  o POS (point of sale)
  o ATM (automated teller machine)
• Data warehousing extracts and transforms operational data into informational or analytical data and loads it into a central data warehouse
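The extract, transform, and load steps described above can be sketched roughly as follows. This is only an illustration: the POS records, the `sales` table, and the `transform` and `load` helpers are hypothetical, and an in-memory SQLite database stands in for the central warehouse.

```python
import sqlite3

# Hypothetical operational records, e.g. extracted from a point-of-sale (POS) system.
pos_rows = [
    {"store": "north", "sku": "A1", "amount_cents": 1250, "day": "2010-04-19"},
    {"store": "north", "sku": "B7", "amount_cents": 300, "day": "2010-04-19"},
    {"store": "south", "sku": "A1", "amount_cents": 1250, "day": "2010-04-20"},
]


def transform(row):
    """Convert one operational record into the analytical form used by the warehouse."""
    return (row["day"], row["store"].upper(), row["sku"], row["amount_cents"] / 100)


def load(rows):
    """Load transformed rows into a central warehouse table (in-memory for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day TEXT, store TEXT, sku TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)", [transform(r) for r in rows]
    )
    conn.commit()
    return conn


warehouse = load(pos_rows)
# Analytical query over the loaded data: total sales per store.
totals = dict(warehouse.execute("SELECT store, SUM(amount) FROM sales GROUP BY store"))
```

Once loaded, the warehouse is only read from, which matches the initial-loading and access operations listed for a DW.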
Major features of DW
• Time variant – data is stored, not updated
  o Predictions, forecasting, trend analysis
• Subject oriented – customers, vendors, products
  o Loans, accounts, transactions
• Nonvolatile – only two kinds of operations in a DW
  o Initial loading
  o Accessing data
  o This is the difference from operational databases
• Integration – consistent naming of variables
OLAP (Online Analytical Processing)
• Nonstatic reporting system
• Data shown in several dimensions – drilling into details and zooming out to a general view
Power of OLAP
• Visualization tool
• Easy to use, interactive
• Good as a first step to understand data
Limitations of OLAP (in contrast to its main strengths)
• Does not find patterns automatically
• Does not have powerful analytical techniques
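Viewing data along several dimensions, zoomed out or in detail, can be sketched as a simple group-and-sum over a fact table. The fact rows, the dimension names, and the `aggregate` helper below are hypothetical. Note that the user still chooses the view, so no pattern is found automatically.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales).
facts = [
    (2009, "EU", "loans", 120),
    (2009, "EU", "accounts", 80),
    (2009, "US", "loans", 200),
    (2010, "EU", "loans", 150),
    (2010, "US", "accounts", 90),
]
DIMENSIONS = ("year", "region", "product")


def aggregate(rows, dims):
    """Sum sales grouped by the chosen dimensions; keys are tuples of dimension values."""
    idx = [DIMENSIONS.index(d) for d in dims]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[3]
    return dict(out)


by_year = aggregate(facts, ["year"])                    # zoomed-out general view
by_year_region = aggregate(facts, ["year", "region"])   # drill-down into details
```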
Evolution of the decision-making architecture
• DM as a component of the business intelligence architecture
• BI brings benefits from actions
  o Online analytical processing
  o Data visualization
  o Data analysis
DM complements BI tools by providing a global, integrated approach to the decision-making process, which allows users to:
  o Access data from different sources
  o Visualize models
  o Interact with results to gain knowledge
DM Virtuous Cycle
What is the goal of DM?
To allow a corporation to improve its internal operations and external relationships with customers, suppliers, and other entities. The virtuous cycle of DM is about harnessing the power of data and transforming it into added value for the entire organization.
• Response to extracted patterns
• Selection of the right action
• Learning from past actions
• Turning action into business value
The main focus of DM
To extract maximum benefit from data and to gain a better understanding of customers, suppliers, and the market
The virtuous cycle
Business Understanding
• Identifying the business opportunities and problems faced by the firm
• The goal is to identify the areas where data can provide value
• Defining the problems should involve both the technical people and the business experts
Taking the insurance industry as an example:
• A major challenge is fraud, such as claim fraud, premium fraud, and indemnity fraud
• Other challenges are customer retention and loyalty enhancement programs
Developing the DM Application
The goal is to extract knowledge from data and make the mechanism of discovering new patterns operational. This includes four activities:
• Define the adequate data-mining tasks
• Organize data for analysis
• Use the right DM technique to build the data model
• Validate the model
The Developing Process
Begins with identifying the outcomes expected from the DM application. These consist of the following:
• Clustering – divides a database into different groups
• Classification – identifies the characteristics of the group to which each case belongs
• Affinity grouping – a descriptive approach to exploring data that can help identify relationships among values in a database
Two common approaches to affinity grouping:
• Association discovery
• Sequence discovery
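Association discovery can be illustrated with the two measures conventionally used to rate a rule such as "bread implies milk": support and confidence. The slides do not name these measures, so treat them, the basket data, and the helper functions as assumptions made for this sketch.

```python
# Hypothetical market baskets; each set is one transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the share that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


bread_support = support({"bread"}, baskets)
bread_to_milk = confidence({"bread"}, {"milk"}, baskets)
```

Sequence discovery follows the same idea but additionally takes the order of events over time into account.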
Data management
• Data sources
• Taxonomy of data
• Data preparation
• Model building
• Parameter settings and tuning
• Model testing and analysis of results
• Taking action and deployment
• Postdevelopment phase
Data sources
• Flat files
• Relational databases
• Data warehouses
• Geographical databases
• Time-series databases
• World Wide Web
Taxonomy of data
• Business transactions
• Scientific data
• Medical data
• Personal data
• Text and documents
• Web repositories
Data preparation
• Evaluating data quality
• Handling missing data
• Processing outliers
• Normalizing data
• Quantifying data
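Three of the preparation steps above (handling missing data, processing outliers, and normalizing) might look like this in a minimal form. The specific choices here (median imputation, clipping to fixed bounds, and min-max scaling) are assumptions picked for illustration; the slides do not prescribe particular methods.

```python
from statistics import median


def fill_missing(values):
    """Replace None with the median of the values that are present."""
    present = [v for v in values if v is not None]
    m = median(present)
    return [m if v is None else v for v in values]


def clip_outliers(values, low, high):
    """Pull values outside [low, high] back to the nearest bound."""
    return [min(max(v, low), high) for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw = [10, None, 20, 30, 1000]          # hypothetical column with a gap and an outlier
filled = fill_missing(raw)
clipped = clip_outliers(filled, 0, 100)  # bounds chosen by hand for this example
scaled = min_max_normalize(clipped)
```

The order matters: normalizing before the outlier is handled would squeeze all ordinary values toward zero.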
Model building
The most popular techniques:
- Association rules
- Classification rules
- Neural networks
Model testing and analysis of results
• Reviewing the business objectives and success criteria
• Assessing the success of the data management project
• Understanding the data-mining results
• Interpreting the results
• Comparing the results
Taking action and deployment
The deployment phase involves several tasks:
- Summarizing the deployable results
- Identifying the users of the discovered knowledge
- Defining a performance measure to monitor
Postdevelopment phase
Return on investment (ROI): financial impact, response rate, and other impacts
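As a rough illustration of the two measures named above, the sketch below uses the conventional definitions: ROI as net gain divided by cost, and response rate as responses divided by customers contacted. The campaign figures are hypothetical.

```python
def roi(gain, cost):
    """Return on investment: net gain relative to what was spent."""
    return (gain - cost) / cost


def response_rate(responses, contacted):
    """Share of contacted customers who responded."""
    return responses / contacted


# Hypothetical campaign: 1,000 customers contacted, 50 responded,
# 10,000 spent, 15,000 earned.
campaign_roi = roi(gain=15_000, cost=10_000)
campaign_response = response_rate(responses=50, contacted=1_000)
```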
Role of DM in customer relationship management
DM technologies and techniques help businesses sift through layers of seemingly unrelated data for meaningful relationships, so that they can anticipate, rather than simply react to, customer needs.
DM applications that enhance an organization’s customer service:
1. Customer Acquisition
   - Finding customers who previously were not aware of the product, were not purchasing it, or bought it from competitors
2. Campaign Optimization
   - Taking a set of offers and customers, along with the characteristics and constraints of the campaign, to determine which offer should go to which customers, over which channels, and at what time (Precision Marketing)
3. Customer Scoring
   - Identifying customers who are at risk of changing their service
   - Determining the right action to encourage customers to retain their service
4. Direct Marketing
   - Applied to obtain a better campaign response and a higher return on investment
   - A response model can be developed from past direct marketing campaign data to predict those most likely to respond to the next campaign
5. Integrating DM, CRM, and e-business
   - An intelligent e-business system enhances CRM by enabling a level of responsiveness and proactive customer care not achievable through other channels
   - Through personalization, corporations can build successful 1-to-1 relationships with customers
   - The ability to reach customers is fundamental to a successful e-business
   - Feedback is an integral process that measures the overall effectiveness of CRM
Parameter settings and tuning
• Several parameters are adjusted empirically
• Parameters: eventual use and comparison
• The testing and validation samples are used for this task
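Empirical tuning with separate validation and testing samples can be sketched as follows. The model here is deliberately trivial (a single score threshold), and the samples, candidate values, and helper names are all hypothetical.

```python
def accuracy(threshold, sample):
    """Fraction of (score, label) pairs classified correctly at this threshold."""
    return sum((score >= threshold) == label for score, label in sample) / len(sample)


def tune_threshold(candidates, validation):
    """Pick the candidate with the best validation accuracy (the first one wins ties)."""
    return max(candidates, key=lambda t: accuracy(t, validation))


# Hypothetical (model score, true label) pairs.
validation = [(0.2, False), (0.4, False), (0.6, True), (0.9, True)]
test = [(0.1, False), (0.55, True), (0.45, False), (0.8, True)]

best = tune_threshold([0.3, 0.5, 0.7], validation)  # tuned on the validation sample only
test_accuracy = accuracy(best, test)                # reported on the untouched testing sample
```

Keeping the testing sample out of the tuning loop is what makes the final figure a fair basis for comparing models.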
Implications for knowledge management
Data problems
• Bad quality
• Incorrect
Managerial problems
• Too long
• Too expensive
• Bad selection of vendors
Modeling mistakes
• Inadequate tools
• Bad sampling
• Inadequate testing
Structural problems
• No commitment
• No deployment efforts
Challenges
Basics
Common problems
• Insufficient understanding of business needs
• Careless handling of data
  a. Overquantifying data
  b. Miscoding data
  c. Analyzing without taking precautions against sampling errors
  d. Loss of precision due to improper rounding of data values
  e. Incorrectly handling missing values
• Invalidly validating the data-mining model
• Believing in alchemy