Multilabel Classification with Label Ranking

by Shail Dave on 21 June 2015


Transcript of Multilabel Classification with Label Ranking

Methods for Multi-label Classification
Problem Transformation Methods
Applications of Single Label Classification
Single-label and Multi-label Classification
Overview of Multi-label Learning Algorithms
The various problem transformation methods are [2]:

1) Simple Problem Transformation Methods (Copy, Copy-Weight, Select-Max, Select-Min, Select-Random, Ignore)
2) Binary Relevance (BR)
3) Label Power-Set (LP)
4) Pruned Problem Transformation (PPT)
5) Ranking by Pairwise Comparison (RPC)
6) Calibrated Label Ranking (CLR)
What is Data Classification?

Classification is considered an instance of supervised learning, i.e. learning where a training set of correctly identified observations is available.

Classification can be divided into two types: single-label classification and multi-label classification.

Single-label classification is a common learning problem where the goal is to learn from a set of instances, each associated with a unique class label from a set of disjoint class labels L. Depending on the total number of disjoint classes in L, the problem is identified as binary classification (when |L| = 2) or multi-class classification (when |L| > 2) [1].

Multi-label classification allows instances to be associated with more than one class. The goal in multi-label classification is to learn from a set of instances where each instance belongs to one or more classes in L [1]. Multi-label classification is a special case of multi-target classification.
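A minimal sketch of the difference, using hypothetical documents and labels (all names illustrative):

```python
# A single-label dataset assigns each instance exactly one class from L;
# a multi-label dataset assigns each instance a non-empty subset of L.
L = {"sports", "finance", "weather"}

single_label = {"doc1": "sports", "doc2": "finance"}
multi_label = {"doc1": {"sports", "finance"}, "doc2": {"weather"}}

assert all(label in L for label in single_label.values())              # one class each
assert all(labels <= L and labels for labels in multi_label.values())  # subsets of L
```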

Multi-label classification methods can be classified into two main categories:

1) Problem Transformation Methods: methods that transform the multi-label classification problem into one or more single-label classification or regression problems. This approach is algorithm-independent [2][3].

2) Algorithm Adaptation Methods: methods that extend specific learning algorithms in order to handle multi-label data directly [3].
The various algorithm adaptation methods are [2]:

1) Decision Tree Based (C4.5)
2) Tree Based Boosting (AdaBoost.MH & AdaBoost.MR)
3) Neural Network Based (BP-MLL)
4) Lazy Learning (kNN, ML-kNN)
Problem Transformation Methods
Algorithm Adaptation Methods
Figure 4: Example of Multi-label dataset [2]
(I) Simple Problem Transformation Methods

1) The Copy Transformation method replaces each multi-label instance with one single-label instance for each class label occurring in that instance [2].

2) A variation of this method, dubbed Copy-Weight, associates a weight with each produced instance. These methods increase the number of instances without any information loss [2].
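As a rough sketch (toy dataset, illustrative names), the Copy and Copy-Weight transformations can be written as:

```python
# Copy replicates each multi-label instance once per label it carries;
# Copy-Weight additionally weights each copy by 1/|labels|.
dataset = [("x1", {"a", "b"}), ("x2", {"c"})]

def copy_transform(data):
    return [(x, label) for x, labels in data for label in sorted(labels)]

def copy_weight_transform(data):
    return [(x, label, 1.0 / len(labels))
            for x, labels in data for label in sorted(labels)]

print(copy_transform(dataset))
# → [('x1', 'a'), ('x1', 'b'), ('x2', 'c')]
```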
(I) Simple Problem Transformation Methods (contd.)

3) The Select Transformation method replaces the label set of an instance with one of its members. These methods are very simple, but they result in some information loss [2]. Its variants are select-max, select-min, and select-random.
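The three select variants can be sketched as follows (toy dataset; the frequency-based tie-breaking is an assumption of this sketch):

```python
import random
from collections import Counter

# select-max keeps the globally most frequent member of each label set,
# select-min the least frequent, select-random an arbitrary member.
dataset = [("x1", {"a", "b"}), ("x2", {"b"}), ("x3", {"a", "b", "c"})]
freq = Counter(label for _, labels in dataset for label in labels)

def select(data, mode="max"):
    out = []
    for x, labels in data:
        if mode == "max":
            out.append((x, max(labels, key=lambda l: (freq[l], l))))
        elif mode == "min":
            out.append((x, min(labels, key=lambda l: (freq[l], l))))
        else:  # select-random
            out.append((x, random.choice(sorted(labels))))
    return out

print(select(dataset, "max"))
# → [('x1', 'b'), ('x2', 'b'), ('x3', 'b')]
```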
4) The Ignore Transformation method simply discards every multi-label instance, keeping only the instances with a single label [2].

Figure 8: Ignore Transformation method [2]
(II) Binary Relevance (BR)

The Binary Relevance (BR) method [4] learns q binary classifiers (q = |L|, the total number of classes in the data set), one for each label. If a particular instance contains label Lj (1 ≤ j ≤ q), it is labeled positively for that classifier; otherwise it is labeled negatively. For a new instance, BR outputs the union of the labels that are predicted positively by the q classifiers.

BR does not consider label dependency; this is its major limitation [2].
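The BR scheme can be sketched with a trivial memorizing learner standing in for any real base classifier such as a decision tree (all names illustrative):

```python
# One independent binary classifier per label; prediction is the union of
# the labels whose classifier votes positive.
class MemorizingClassifier:
    def fit(self, X, y):
        self.positive = {x for x, t in zip(X, y) if t == 1}
        return self

    def predict(self, x):
        return 1 if x in self.positive else 0

def br_fit(X, Y, labels):
    # q = |L| binary problems: label present → 1, absent → 0.
    return {label: MemorizingClassifier().fit(X, [int(label in y) for y in Y])
            for label in labels}

def br_predict(models, x):
    return {label for label, clf in models.items() if clf.predict(x) == 1}

models = br_fit(["i1", "i2"], [{"a"}, {"a", "b"}], {"a", "b"})
print(sorted(br_predict(models, "i2")))
# → ['a', 'b']
```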
(III) Label Power-Set (LP)

The Label Power-Set (LP) method [4] considers each unique set of labels that exists in a multi-label training set as one of the classes of a new single-label classification task. Given a new instance, the single-label classifier of LP outputs the most probable class, which is actually a set of labels. LP can also rank the labels following the approach in [5].
Figure 10: Transformed data set using Label Power-Set method
Figure 11: Example of obtaining ranking from LP
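A minimal sketch of the LP transformation on a toy dataset (illustrative names):

```python
# Each distinct label set observed in training becomes one class of a
# single-label problem.
def lp_transform(data):
    return [(x, frozenset(labels)) for x, labels in data]

data = [("x1", {"a", "b"}), ("x2", {"a"}), ("x3", {"a", "b"})]
transformed = lp_transform(data)
distinct_classes = {y for _, y in transformed}
print(len(distinct_classes))
# → 2
```

A predicted class such as frozenset({'a', 'b'}) is decoded back into the label set {a, b}.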
Figure 9: Transformation using Binary Relevance
(IV) Pruned Problem Transformation (PPT)

The Pruned Problem Transformation (PPT) method [5] extends LP. It prunes away label sets that occur fewer times than a small user-defined threshold (e.g. 2 or 3) and optionally replaces their information by introducing disjoint subsets of these label sets that do occur more often than the threshold.
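A simplified sketch of PPT on toy data (the subset-replacement rule here is deliberately naive):

```python
from collections import Counter

# Label sets occurring fewer than `t` times are pruned; their instances are
# re-expressed via frequent proper subsets of the pruned set.
def ppt_transform(data, t=2):
    counts = Counter(frozenset(y) for _, y in data)
    frequent = {s for s, c in counts.items() if c >= t}
    out = []
    for x, y in data:
        fy = frozenset(y)
        if fy in frequent:
            out.append((x, fy))
        else:
            out.extend((x, s) for s in frequent if s < fy)
    return out

data = [("x1", {"a"}), ("x2", {"a"}), ("x3", {"a", "b"})]
print(ppt_transform(data, t=2))
# → [('x1', frozenset({'a'})), ('x2', frozenset({'a'})), ('x3', frozenset({'a'}))]
```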
(V) Ranking by Pairwise Comparison (RPC)

Ranking by Pairwise Comparison (RPC) [6] transforms the multi-label dataset into q(q-1)/2 binary label datasets, one for each pair of labels (Li, Lj), 1 ≤ i < j ≤ q.
Figure 14: Transformation using Ranking by Pairwise Comparison
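The pairwise transformation can be sketched as follows (toy data, illustrative names):

```python
from itertools import combinations

# One binary dataset per label pair (Li, Lj) with i < j — q(q-1)/2 in total.
# An instance is used for a pair only when exactly one of the two labels
# is relevant to it.
def rpc_transform(data, labels):
    pair_data = {pair: [] for pair in combinations(sorted(labels), 2)}
    for x, y in data:
        for (li, lj), rows in pair_data.items():
            if (li in y) != (lj in y):
                rows.append((x, 1 if li in y else 0))
    return pair_data

pairs = rpc_transform([("x1", {"a"})], {"a", "b", "c"})
print(len(pairs), pairs[("a", "b")])
# → 3 [('x1', 1)]
```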
Figure 1: Data Classification
(VI) Calibrated Label Ranking (CLR)

Calibrated Label Ranking (CLR) [7] extends RPC by introducing an additional virtual label, which acts as a natural breaking point of the ranking into relevant and irrelevant sets of labels. This way, CLR solves the complete multi-label ranking (MLR) task.
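The calibration step can be sketched as follows (the vote counts are illustrative; in CLR they would come from the pairwise classifiers):

```python
# The virtual label v0 participates in the ranking; everything ranked above
# it is predicted relevant, everything below irrelevant.
def clr_split(votes, virtual="v0"):
    order = sorted(votes, key=votes.get, reverse=True)
    cut = order.index(virtual)
    return order[:cut], order[cut + 1:]

relevant, irrelevant = clr_split({"a": 3, "b": 2, "v0": 1, "c": 0})
print(relevant, irrelevant)
# → ['a', 'b'] ['c']
```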
Merits & Demerits of Problem Transformation Methods
Outline
Objective
Related Background
Literature Review
Simulation
Motivation from Literature
Methodology
Implementation Strategy
Results and Discussion
Suggestions from MSR
Conclusion & Future Work
Publication
References
OBJECTIVE
Multi-target classification is one in which each instance has multiple class labels and each class label can take multiple values [2].
Categorizing news stories as finance, weather, entertainment, sports, etc.

Classifying credit card transactions as legitimate or fraudulent.

Classifying secondary structures of proteins as alpha-helix, beta-sheet, or random coil.
Figure 2: Example of Single Label Classification
Applications of Multi-Label Classification
A text document that talks about scientific contributions in medical science can belong to both the science and health categories.

Genes may have multiple functionalities (e.g. diseases), causing them to be associated with multiple classes.

An image that captures a field and fall-colored trees can belong to both the field and fall foliage categories.

A movie can simultaneously belong to the action, crime, thriller, and drama categories.

An e-mail message can be tagged as both work and research project [3].
RELATED BACKGROUND
Figure 3: Applications of Multi-Label Classification
Applications of Multi-Target Classification

The Thyroid data set contains 7 class-labels, and each class-label has multiple values:

class hyperthyroid {A, B, C, D, NEG}
class hypothyroid {E, F, G, H, NEG}
class general_health {K, NEG}
class replacement_theory {L, M, N, NEG}
class antithyroid_treatment {O, P, Q, NEG}
class discordant_results {R, NEG}
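The class structure above can be written out as a simple mapping (the example target assignment is hypothetical):

```python
# Class attribute → its value domain, as listed on the slide.
thyroid_classes = {
    "hyperthyroid":          {"A", "B", "C", "D", "NEG"},
    "hypothyroid":           {"E", "F", "G", "H", "NEG"},
    "general_health":        {"K", "NEG"},
    "replacement_theory":    {"L", "M", "N", "NEG"},
    "antithyroid_treatment": {"O", "P", "Q", "NEG"},
    "discordant_results":    {"R", "NEG"},
}

# A multi-target instance assigns one value per class attribute:
example_targets = {"hyperthyroid": "A", "hypothyroid": "NEG",
                   "general_health": "NEG", "replacement_theory": "NEG",
                   "antithyroid_treatment": "NEG", "discordant_results": "NEG"}
assert all(example_targets[c] in dom for c, dom in thyroid_classes.items())
```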
Objective
Provide classification rules in multi-label form using the J48 classifier.

Provide a confusion matrix showing overall accuracy as well as label-wise accuracy for each value of every class label.

Provide label-wise ranking among class labels.

LITERATURE REVIEW
Figure 5: Copy Transformation method
Figure 6: Copy-Weight method
Figure 7: Select Transformation methods
Figure 12: Pruned Problem Transformation method with threshold=3
Presentation (Dissertation Phase II) on Discovering Classification Rules and Providing Ranking among Labels to Multi-Label Data using Novel Approach
Prepared by: Priyadarshini Barot
Enrollment no.: 130910701001
Co-Guide: Mahesh Panchal
Shri Satsangi Saketdham "Ram Ashram" Group of Institutions
Gujarat Technological University
MOTIVATION
Motivation from Literature
1) Since MEKA (Multi-label Extension to WEKA, a machine learning tool) does not provide classification rules, this system provides rules using the J48 classifier.

2) MEKA does not present the confusion matrix in a readily comprehensible form; this system generates a confusion matrix for all possible class label values.

3) Since the label power-set approach tends to generate too many subsets, the current work uses the pruned problem transformation method to identify infrequent subsets and replace them with frequent lower-order subsets. This has the advantage of avoiding loss of data.

METHODOLOGY
IMPLEMENTATION STRATEGY
RESULTS AND DISCUSSION
Implementation Strategy
Platform: Windows
Language: Java
Tool: Eclipse IDE
Classifier: J48
Dataset: Thyroid.arff
Figure 23: Information on thyroid dataset [9]
PUBLICATION
Publication
REFERENCES
References
[1] M. S. Sorower, "A Literature Survey on Algorithms for Multi-label Learning", Oregon State University, Corvallis, December 2010.
[2] H. Modi and M. Panchal, "Experimental Comparison of Different Problem Transformation Methods for Multi-Label Classification using MEKA", International Journal of Computer Applications, vol. 59, no. 15, 2012.
[3] G. Tsoumakas and I. Katakis, "Multi-label classification: An overview", International Journal of Data Warehousing and Mining (IJDWM), vol. 3, no. 3, pp. 1-13, 2007.
[4] G. Tsoumakas, I. Katakis and I. Vlahavas, "Mining multi-label data", in Data Mining and Knowledge Discovery Handbook, 2nd ed., O. Maimon and L. Rokach, Eds., Springer US, 2010, pp. 1-20.
[5] J. Read, "A pruned problem transformation method for multi-label classification", in Proc. 2008 New Zealand Computer Science Research Student Conference (NZCSRS), 2008, pp. 143-150.
[6] E. Hüllermeier, J. Fürnkranz, W. Cheng and K. Brinker, "Label ranking by learning pairwise preferences", Artificial Intelligence, vol. 172, no. 16-17, pp. 1897-1916, November 2008.
[7] J. Fürnkranz, E. Hüllermeier, E. Loza Mencía and K. Brinker, "Multilabel classification via calibrated label ranking", Machine Learning, vol. 73, no. 2, pp. 133-153, November 2008.
[8] P. Barot and M. Panchal, "Review on various problem transformation methods for classifying multi-label data", International Journal of Data Mining and Emerging Technologies, vol. 4, no. 2, 2014.
[9] A. Asuncion and D. J. Newman, UCI Machine Learning Repository, 2007. Retrieved from http://www.uci.edu/~mlearn/MLRepository.html.
[10] S. Godbole and S. Sarawagi, "Discriminative methods for multi-labeled classification", in Proc. 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2004), Springer, 2004, pp. 22-30.
[11] R. E. Schapire and Y. Singer, "BoosTexter: A boosting-based system for text categorization", Machine Learning, vol. 39, no. 2-3, pp. 135-168, 2000.

Guide: Avdhesh Gupta
Methodology
MODULE 1
Figure 20: Transformation of multi-target dataset into single-label dataset
Module 2
Figure 21: Generation of classification rules and confusion matrix
Module 3
Figure 13: RPC dataset
(V) Ranking by Pairwise Comparison (RPC) (contd.)
New instance x'
Ranking:
Figure 15: Ranking by RPC
Figure 16: Transformation using CLR
(VI) Calibrated Label Ranking (CLR) (contd.)
New instance x'
Ranking:
Figure 17: Ranking by CLR
Figure 22: Assign rank to labels
Figure 18: Comparison of various Problem Transformation methods
Conclusion
Figure 48: Review paper publication
For example:

Assigning a given e-mail to the "spam" or "non-spam" class.

Assigning a diagnosis to a given patient described by observed characteristics (gender, blood pressure, presence or absence of certain symptoms, etc.).

Classifying whether a creature is a mammal, an insect, a bird or a plant, etc.
SIMULATION
Simulation
Figure 19: Simulation Results [8]
Performance Measures

Accuracy: the proportion of predicted correct labels to the total number of labels for that instance [10].

Exact Match: the proportion of examples for which all label relevancies match the ground truth exactly.

Hamming Loss: reports how many times the relevance of an example to a class label is incorrectly predicted [11].
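On toy label sets (hypothetical values), the three measures can be computed as:

```python
# Accuracy here is the mean per-instance Jaccard score |Y ∩ Z| / |Y ∪ Z| [10];
# Hamming loss is the fraction of the q label relevancies predicted wrongly [11].
def accuracy(true, pred):
    return sum(len(t & p) / len(t | p) for t, p in zip(true, pred)) / len(true)

def exact_match(true, pred):
    return sum(t == p for t, p in zip(true, pred)) / len(true)

def hamming_loss(true, pred, q):
    return sum(len(t ^ p) for t, p in zip(true, pred)) / (len(true) * q)

Y = [{"a", "b"}, {"c"}]   # true label sets
Z = [{"a"}, {"c"}]        # predicted label sets
print(accuracy(Y, Z), exact_match(Y, Z), round(hamming_loss(Y, Z, q=3), 3))
# → 0.75 0.5 0.167
```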
Snapshots of the system
Upload a multi-target dataset in ARFF file format.
Figure 26: Upload multi-target dataset
Figure 28: Enter number of class attributes and index
Figure 32: Get confusion matrix
Figure 36: Classification rules
CONCLUSION AND FUTURE WORK
The system can generate a confusion matrix for any multi-target dataset.

It can provide label-wise and class-wise ranking for any multi-target dataset.

It can generate classification rules for any multi-target dataset.
Threshold vs. Accuracy
Figure 48: Threshold value vs. accuracy
Platform: Windows
Language: Java
Tool: Eclipse IDE
Classifier: J48
Dataset: Solar_Flare.arff
Figure 24: Information on solar flare dataset [9]
Platform: Windows
Language: Java
Tool: Eclipse IDE
Classifier: J48
Dataset: Bridges.arff
Figure 25: Information on bridges dataset [9]
Operating System: Windows 8.1

All experiments to implement the algorithm were performed on Windows 8.1.

Programming Language: Java

Java was designed for ease of use and is therefore easier to write, compile, debug and learn than many other programming languages. One of its most significant advantages is the ease with which programs move from one computer system to another.

Tool: Eclipse Juno

Eclipse is a widely used free Java integrated development environment (IDE). It supports several other languages (PHP, C/C++, etc.) and frameworks. Eclipse implements widgets through a Java toolkit called SWT, whereas most Java applications use the standard Abstract Window Toolkit (AWT) or Swing.

Servlet

A servlet is a Java class used to extend the capabilities of a server. Although servlets can respond to any type of request, they are commonly used to extend applications hosted by web servers.

JSP

JavaServer Pages (JSP) is a technology that helps software developers create web pages dynamically based on HTML and XML.
Click on the browse button and select the required ARFF file.
Figure 27: Select ARFF file
Conversion into single-label by PS method
Figure 29: Enter required number of class attributes as threshold
Figure 30: Multi-target ARFF file
Figure 31: Converted single-label CSV file
Get Confusion Matrix
Figure 33: Confusion Matrix
Confusion Matrix
Ranking
Figure 34: Ranking(1)
Figure 35: Ranking(2)
Classification Rules
Classwise confusion matrix and ranking
Figure 37: Enter class for which ranking is required
Figure 38: Class-wise confusion matrix and ranking
Comparison with existing methods
THYROID DATASET
Figure 37: Accuracy for thyroid dataset
Figure 38: Hamming Loss for thyroid dataset
Figure 39: Exact match for thyroid dataset
SOLAR FLARE
Figure 40: Accuracy for Solar flare dataset
Figure 41: Hamming Loss for solar flare dataset
Figure 42: Exact match for solar flare dataset
BRIDGES
Figure 43: Accuracy for Bridges dataset
Figure 44: Hamming Loss for Bridges dataset
Figure 45: Exact Match for Bridges dataset
Comparison with existing tools
Figure 46: Comparison with MEKA
Figure 47: Comparison with WEKA
Future Work
The system can be enhanced so that the user no longer needs to supply the total number of class attributes and their indices.

The system can be made more reliable and user-friendly.

The system may be redesigned to relieve the user from entering values into the confusion matrix at run time.

The comparison was done against problem transformation methods; it could also be extended to algorithm adaptation methods of multi-label classification.
SUGGESTIONS FROM MSR
How to find FP Rate
Classification rules in IF-THEN form
Figure 49: FP Rate
Figure 50: Classification rules in IF-THEN form