Classification in EDM & LA
Popular tools & Platforms in the industry
Data Mining tool
Application of Classification
Discussion & Future Insights
in Educational Data Mining & Learning Analytics
Comparison of Classification Methods
Briefly define EDM & LA.
Define classification, its different methods, and how they differ.
Popular Classification tools in the industry.
Application of Classification in EDM & LA
Choosing a Classifier
Separation of training & testing Data
Classifier Accuracy Evaluation
Probabilistic or Discriminative Classifier?
Linear or Non-Linear Class Boundaries?
Filling missing values
Try correcting errors in values
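One common way to fill missing values is mean imputation. The sketch below is only an illustration under that assumption; the function name `impute_mean` and the quiz scores are hypothetical, and the slides do not say which filling strategy was intended.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical quiz scores; None marks a missing value.
scores = [70, None, 90, 80, None]
filled = impute_mean(scores)
print(filled)
```

Mean imputation keeps the column average unchanged but shrinks its variance, so it may not suit every attribute.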
The goodness of feature extraction and selection cannot be evaluated before the classifier is learned and tested. If the number of attributes is large, all possibilities cannot be tested.
New attributes are produced by combining and transforming the original data.
* Analyzing the dependencies between the class attribute & explanatory attributes
Support Vector Machine
* Use a learning algorithm and select the most important attributes in the resulting classifier.
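The first selection strategy above, analyzing the dependency between the class attribute and each explanatory attribute, could be sketched with mutual information as the dependency measure. This is an assumption for illustration; the slides do not name a specific measure, and the attribute names below are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) )
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi

# Hypothetical binary attributes and class label for four students.
passed         = [0, 0, 1, 1]
used_hints     = [0, 0, 1, 1]   # fully determines the class here
attended_forum = [0, 1, 0, 1]   # independent of the class here

ranking = sorted(
    {"used_hints": used_hints, "attended_forum": attended_forum}.items(),
    key=lambda item: mutual_information(item[1], passed),
    reverse=True,
)
print([name for name, _ in ranking])
```

Attributes with a score near zero carry little information about the class and are candidates for removal.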
To Measure it ...
Receiver Operating Characteristics (ROC)
can be used for assessing the goodness of a predictor
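As a minimal sketch of how ROC analysis assesses a predictor, the code below sweeps a threshold over predicted scores to get ROC points and computes the area under the curve (AUC) as the fraction of positive/negative pairs ranked correctly. The function names and the example labels and scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical ground truth (1 = at-risk student) and predicted scores.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
print(roc_points(labels, scores))
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking.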
If the data set is already small, it is not advisable to reduce the training set any further.
Data used for training is never used for testing, and vice versa.
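A simple hold-out split keeps the two sets disjoint. The sketch below assumes a random shuffle and a fixed test fraction; the helper names, the fraction, and the seed are illustrative choices, not taken from the slides.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and split it into disjoint train/test lists."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(true_labels, predicted_labels):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical student record ids.
train, test = train_test_split(range(8))
print(len(train), len(test))
print(set(train).isdisjoint(test))
```

When the data set is too small to spare a test set, cross-validation reuses every record for testing exactly once while still never testing on data used to train that fold.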
Data Size Vs Model
E-Learning and Assessment tools
Supervised Vs. Unsupervised
Emergence of Bayesian Networks
Classification and Clustering Converging
Tools Limitation & a new workbench
And More ...
Cognitive Tutor, Moodle, ITS ... etc.
Needed Definitions of EDM & LA
Classification ("supervised"): you have a set of predefined classes and want to know which class a new object belongs to.
Clustering ("unsupervised"): tries to group a set of objects and find whether there is some relationship between them.
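The contrast can be sketched on one-dimensional data: the classifier below learns one centroid per predefined label, while the clustering routine sees no labels and only groups nearby values. Both are deliberately simple stand-ins (nearest centroid and a basic k-means), and the exam scores are hypothetical.

```python
from statistics import mean

def fit_centroids(values, labels):
    """Supervised: compute one centroid per predefined class label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(vs) for label, vs in groups.items()}

def classify(value, centroids):
    """Assign a new value to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))

def kmeans_1d(values, k=2, iterations=10):
    """Unsupervised: group values without labels; k is chosen by the user."""
    step = max(1, len(values) // k)
    centers = sorted(values)[::step][:k]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters

# Hypothetical exam scores.
exam_scores = [35, 40, 45, 80, 85, 90]
known_labels = ["fail", "fail", "fail", "pass", "pass", "pass"]

centroids = fit_centroids(exam_scores, known_labels)
print(classify(70, centroids))   # uses the predefined classes
print(kmeans_1d(exam_scores))    # discovers groups, but gives them no names
```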
Customization and Adaptation of Behavior
Method Improvement and optimization
Providing Monitoring and Feedback for instructors
Predicting student's performance
Bayesian networks are used for reasoning under uncertainty (Pearl, 1988).
Bayesian networks are also known as causal networks or belief networks.
Probability theory and graph theory form their basis: random variables are nodes and conditional dependencies are edges in a directed acyclic graph.
Edges typically point from cause to effect
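The smallest possible example is a two-node network with one edge from cause to effect. The sketch below assumes a hypothetical network "knows skill -> answers correctly" with made-up probabilities, and reasons backwards from the observed effect to the cause using Bayes' rule.

```python
def posterior_cause(prior, p_effect_given_cause, p_effect_given_not_cause):
    """P(cause | effect observed) for a two-node network cause -> effect."""
    joint_cause = prior * p_effect_given_cause
    joint_not_cause = (1 - prior) * p_effect_given_not_cause
    return joint_cause / (joint_cause + joint_not_cause)

# Hypothetical parameters, not taken from the slides:
p_knows = 0.6                 # P(student knows the skill)
p_correct_if_knows = 0.9      # P(correct answer | knows)
p_correct_if_not = 0.2        # P(correct answer | does not know), i.e. a guess

# Belief that the student knows the skill after one correct answer.
print(posterior_cause(p_knows, p_correct_if_knows, p_correct_if_not))
```

Larger networks apply the same idea across many nodes, with each node storing a conditional probability table given its parents in the directed acyclic graph.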
What are Bayesian Networks?
Why Bayesian Networks?
Adoption By Wider educational research and practice communities
developing the labels that support supervised learning
distilling relevant and appropriate data features
setting up appropriate cross-validation
configuration and building algorithms for classification
Will it be solved in 2012 ?!
it's a Start ...
Version 1.0 of the EDM Workbench
Expected Future Work
In the coming months
label previously collected educational log data with behavior categories of interest.
Allows learning scientists to
automatically distill additional information from log files for use in machine learning.
collaborate with others in labeling data.
(1) The automatically distilled features are hard-coded; future releases will make it easier to alter the feature list.
(2) The process of amending XML to create new features will be made more user-friendly.
(3) The coders cannot change the way in which the text replays are displayed.
(4) Users can currently only sample data and assemble it into clips in a limited number of fashions; we intend to implement more sophisticated sampling and clip-creation strategies.
EDM Proceedings 2010
EDM Proceedings 2011
EDM Proceedings 2012
Analysis of Publication in Terms of Charts
Bayesian Networks Vs Decision Trees
Although both can be similar in accuracy, BN still has the advantage with small samples ... Handling incomplete data?!
Dr. Mohamed Amine Chatti
Discussion & Insights About the Future
Thank you for your attention