
# Copy of Neural Networks Project

A Prezi presentation describing a proposed algorithm that enhances k-means.

by Lubna Shaikh on 22 February 2013

#### Transcript

## Data Mining Using Neural Networks

Made By:

Lubna Shaikh

10BCE090

### Artificial Neural Networks

#### Back-Propagation Algorithm

The activation function is a weighted sum (the sum of the inputs x_i multiplied by their respective weights w_ji):

A_j(x, w) = Σ_(i=0)^n x_i w_ji

The output function is the sigmoid, which is close to 1 for large positive numbers, 0.5 at 0, and close to 0 for large negative numbers:

O_j(x, w) = 1 / (1 + e^(-A_j(x, w)))

The error function is:

E_j(x, w, d) = (O_j(x, w) - d_j)^2

where O_j is the output of the neural network and d_j is the desired output for the given inputs.

The error of the network is the sum of the errors of all the neurons in the output layer:

E(x, w, d) = Σ_j (O_j(x, w) - d_j)^2

The weights are adjusted using gradient descent:

Δw_ji = -μ ∂E/∂w_ji

How much the error depends on the output is the derivative of E with respect to O_j:

∂E/∂O_j = 2(O_j - d_j)

How much the output depends on the activation, which in turn depends on the weights:

∂O_j/∂w_ji = (∂O_j/∂A_j)(∂A_j/∂w_ji) = O_j(1 - O_j) x_i
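The sigmoid derivative used in this step, ∂O_j/∂A_j = O_j(1 - O_j), can be checked numerically with a finite difference; a minimal sketch (the test point a = 0.7 is arbitrary):

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

# Analytic derivative from the text: dO/dA = O * (1 - O)
a = 0.7
o = sigmoid(a)
analytic = o * (1.0 - o)

# Central finite-difference approximation of the same derivative
h = 1e-6
numeric = (sigmoid(a + h) - sigmoid(a - h)) / (2 * h)

assert abs(analytic - numeric) < 1e-8
```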

#### Definition

A neural network is a massively parallel, distributed processor made up of simple processing units, with a natural propensity for storing experiential knowledge.

It resembles the brain in two respects:

1. Knowledge is acquired by the network from its environment through a learning process.

2. Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.

A neuron is a real function of the input vector (y_1, ..., y_k), given by

f(x_j) = f(α_j + Σ_(i=1)^k w_ij y_i)

where f is the sigmoid (logistic or hyperbolic tangent) function.

The network computes compositions of weighted sums of these functions.
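A single neuron as defined above, the sigmoid of a bias plus a weighted sum of the inputs, can be sketched as follows (the function name and the sample numbers are illustrative, not from the presentation):

```python
import math

def neuron(y, w, alpha):
    # x_j = alpha_j + sum_i w_ij * y_i, then the logistic sigmoid f(x_j)
    x = alpha + sum(w_i * y_i for w_i, y_i in zip(w, y))
    return 1.0 / (1.0 + math.exp(-x))

out = neuron([0.5, -1.0, 2.0], [0.1, 0.4, 0.2], alpha=0.05)
assert 0.0 < out < 1.0  # sigmoid outputs always lie in (0, 1)
```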

The higher the weight of an artificial neuron, the stronger the input multiplied by it; negative weights can inhibit a signal.

A neural network is characterised by three properties:

1. The pattern of its interconnections: its architecture.

2. The method of determining and updating the weights on the interconnections: training.

3. The function that determines the output of each individual neuron: the activation or transfer function.

#### Characteristics of Neural Networks

1. A large number of processing elements, called neurons.

2. Interconnection strengths, or links with an associated weight, store the memory.

4. Information is processed by changing the strengths of interconnections and/or changing the state of each neuron.

5. A neural network is trained rather than programmed.

6. It acts as an associative memory.

7. It can generalize, i.e., detect similarities between patterns.

8. It is robust: performance does not degrade appreciably if some neurons or interconnections are lost (distributed memory).

9. It can recall information from incomplete, noisy, or partially incorrect inputs.

10. It is self-organizing: it can generalize from the data patterns used in training without being given specific instructions on exactly what to learn.

#### Learning/Training Methods

**Supervised learning**

Every input pattern is associated with the target or the desired pattern.

Error = Computed output - correct expected output

The error is used to change network parameters, which results in an improvement in performance.

**Unsupervised learning**

Target output is not presented to the network.

The system learns on its own by discovering and adapting to structural features of the input patterns.

**Reinforcement learning**

Only whether the computed output is correct or incorrect is known; the expected answer itself is not available.

Combining the derivative of the error with respect to the output and the derivative of the output with respect to the weights, the adjustment to each weight is:

Δw_ji = -2μ (O_j - d_j) O_j (1 - O_j) x_i
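Putting the whole derivation together, a single sigmoid neuron trained with exactly this update rule can be sketched as follows (learning a small OR-style dataset; the rate μ, epoch count, and data are illustrative):

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train(samples, mu=0.5, epochs=2000):
    """Train one sigmoid neuron with the update from the text:
    delta_w_ji = -2 * mu * (O_j - d_j) * O_j * (1 - O_j) * x_i"""
    random.seed(0)  # deterministic initial weights for reproducibility
    n = len(samples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]
    for _ in range(epochs):
        for x, d in samples:
            o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i in range(n):
                w[i] += -2 * mu * (o - d) * o * (1 - o) * x[i]
    return w

# Logical OR; the constant first input acts as a bias term.
data = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]
w = train(data)
predictions = [round(sigmoid(sum(wi * xi for wi, xi in zip(w, x)))) for x, _ in data]
assert predictions == [0, 1, 1, 1]
```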

#### Limitations of Backpropagation

It can get trapped in a local minimum.

The trained weights may not necessarily produce the best answer.

#### Applications

- Modelling real neural networks
- Studying behaviour and control in animals and machines
- Engineering purposes, such as pattern recognition, forecasting, data mining, and data compression

### Data Mining

#### Data Mining Techniques

Prediction: the use of existing variables to predict unknown or future values of interest.

Description: finding patterns that describe the data, and presenting them for user interpretation.

The main data mining techniques are:

- Associations
- Classifications
- Sequential patterns
- Clustering

#### Association Rules

An association rule is an expression of the form X => Y, where X and Y are sets of items.

X => Y has support s in the transaction set D if s% of the transactions in D support X ∪ Y.

Support measures how often X and Y occur together, as a percentage of the total transactions.
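Support, and the closely related confidence measure (standardly defined as support(X ∪ Y) / support(X); the presentation does not spell out the formula), can be sketched over a small transaction set:

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(x, y, transactions):
    """Standard definition: support(X U Y) / support(X)."""
    return support(x | y, transactions) / support(x, transactions)

D = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
assert support({"bread", "milk"}, D) == 0.5          # 2 of 4 transactions
assert confidence({"bread"}, {"milk"}, D) == 2 / 3   # 2 of the 3 bread transactions
```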

Confidence measures how strongly one item depends on another. Patterns combining intermediate values of confidence and support provide the user with interesting, previously unknown information.

#### Clustering

A method of grouping data according to similar trends and patterns.

The data is partitioned into a set of regions or clusters, either deterministically or probabilistically, in some optimal fashion under a chosen similarity measure.
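The presentation's stated topic is an algorithm that enhances k-means, the classic instance of such deterministic partitioning. A baseline k-means (a generic sketch, not the proposed enhancement) looks like this:

```python
import random

def kmeans(points, k, iters=50):
    """Plain k-means: assign each point to its nearest centre, then
    recompute each centre as the mean of its cluster."""
    random.seed(0)  # deterministic initial centres for reproducibility
    centres = random.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:
                centres[j] = tuple(sum(v) / len(cl) for v in zip(*cl))
    return centres, clusters

# Two well-separated groups of two points each
pts = [(0.0, 0.1), (0.2, 0.0), (9.8, 10.0), (10.0, 9.9)]
centres, clusters = kmeans(pts, 2)
assert all(len(c) == 2 for c in clusters)
```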

One can also build set functions that measure a particular property of groups, to achieve an optimal partitioning.

#### Classification Rules

Rules that partition the data into disjoint groups.

Input: a training data set whose class labels are already known.

Classification constructs a model based on the class labels, and aims to assign labels to future unlabelled records.

Known as supervised learning.

Classification discovery models include:

- Decision trees
- Neural networks
- Genetic algorithms
- Statistical models

### Data Mining Methods

- Neural Networks
- Genetic Algorithms
- Rough Sets Techniques
- Support Vector Machines
- Cluster Analysis
- Induction
- OLAP

- Data Visualization

### Data Mining Process Based on Neural Networks

There are three main phases:

1. Data preparation
2. Rules extraction
3. Rules assessment

#### Data Preparation

Define and process the mining data so that it fits the specific data mining method.

This phase plays a decisive role in the entire data mining process.

1. Data cleansing: fill in missing values, eliminate noisy data, and correct inconsistencies.

2. Data option: select the data range and rows to be used.

3. Data pre-processing: further clean the data that has been selected.

4. Data expression: transform the data into a form that the neural-network-based data mining algorithm can accept.
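Step 4's transformation of symbolic data into a numerical form the network can accept is not specified further; one common choice is one-hot encoding, sketched here:

```python
def one_hot(values):
    """Map each distinct symbolic value to a 0/1 indicator vector."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    return [[1 if index[v] == i else 0 for i in range(len(categories))]
            for v in values]

# Categories sort as blue, green, red, so "red" maps to [0, 0, 1]
encoded = one_hot(["red", "green", "red", "blue"])
assert encoded == [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```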

Data mining based on neural networks can only handle numerical data, so symbolic data must be transformed into numerical form.

#### Rules Extraction

Methods to extract rules include:

- The LRE method
- The black-box method
- Extracting fuzzy rules
- Extracting rules from recursive networks
- The binary input-output rules extraction algorithm (BIO-RE)
- The partial rules extraction algorithm (Partial-RE)
- The full rules extraction algorithm (Full-RE)

#### Rules Assessment

Rules can be assessed against the following objectives:

- Find the optimal sequence for extracting rules, so that they obtain the best results on the given data set.
- Test the accuracy of the extracted rules.
- Detect how much knowledge in the neural network has not been extracted.
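The second objective, testing the accuracy of the extracted rules, can be sketched as follows (the rule representation, a list of (condition, label) pairs tried in order, is an assumption, not from the presentation):

```python
def rule_accuracy(rules, records):
    """Fraction of records on which the first matching rule predicts
    the true label. Each rule is (condition, label), where condition
    is a predicate over the record."""
    correct = 0
    for rec, label in records:
        for cond, pred in rules:
            if cond(rec):
                correct += pred == label
                break
    return correct / len(records)

# Hypothetical extracted rules over a single numeric feature x.
rules = [(lambda r: r["x"] > 0.5, 1), (lambda r: True, 0)]
data = [({"x": 0.9}, 1), ({"x": 0.2}, 0), ({"x": 0.6}, 0)]
assert abs(rule_accuracy(rules, data) - 2 / 3) < 1e-12  # 2 of 3 records correct
```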
