RCOMM 2010 Cross-validation

See prekopcsak.hu for details

Zoltan Prekopcsak

on 15 September 2010


Transcript of RCOMM 2010 Cross-validation

Cross-Validation: the illusion of reliable performance estimation
Z. Prekopcsák, T. Henk, Cs. Gáspár-Papanek
Budapest University of Technology and Economics

  • Holdout validation
  • K-fold cross-validation (K = 5)
  • Leave-one-out (K = N)

Cross-validation: easy to use, easy to misuse

Performance estimation
  • predict future performance (DM model: £ $ € %)
  • model selection (DM model 1, DM model 2, DM model 3, ...: Best model)

Typical mistakes
  • Using global information (feature selection)
  • Using future data (time series)
  • Picking the best (optimization)

[Slide diagrams: time series, cross-validation 100%, 100%; optimization, 100 models at ~50% each, pick the best, cross-validation 62%]

Comments? http://prekopcsak.com

CHECKLIST

Does the preprocessing work without the label?
Should NOT use the label attribute before validation.
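
The first checklist item can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' experiment: the data are 100 rows of 500 random binary attributes with random 0-1 labels, and the helper names (agreement, select, fit, predict, cross_validate) and the simple voting classifier are hypothetical choices made for this example. Selecting 10 attributes on the full table before cross-validation uses the label of every test row, so the estimate is inflated even though there is nothing to learn; selecting inside each training fold avoids this.

```python
import random

def agreement(rows, labels, j):
    """Fraction of rows where binary attribute j equals the label."""
    return sum(r[j] == y for r, y in zip(rows, labels)) / len(rows)

def select(rows, labels, k):
    """Pick the k attributes whose agreement with the label is farthest from 0.5."""
    p = len(rows[0])
    return sorted(range(p), key=lambda j: -abs(agreement(rows, labels, j) - 0.5))[:k]

def fit(rows, labels, feats):
    """For each selected attribute, learn whether it votes as-is or inverted."""
    return [(j, agreement(rows, labels, j) >= 0.5) for j in feats]

def predict(model, row):
    """Majority vote of the selected attributes; ties go to class 1."""
    votes_for_one = sum((row[j] == 1) == direct for j, direct in model)
    return 1 if 2 * votes_for_one >= len(model) else 0

def cross_validate(rows, labels, folds=5, k=10, select_inside=True):
    """K-fold accuracy; select_inside=False selects attributes on ALL rows (leaky)."""
    n = len(rows)
    global_feats = None if select_inside else select(rows, labels, k)
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, n, folds))
        train_rows = [rows[i] for i in range(n) if i not in test_idx]
        train_y = [labels[i] for i in range(n) if i not in test_idx]
        feats = select(train_rows, train_y, k) if select_inside else global_feats
        model = fit(train_rows, train_y, feats)
        correct += sum(predict(model, rows[i]) == labels[i] for i in test_idx)
    return correct / n

rng = random.Random(0)
rows = [[rng.randint(0, 1) for _ in range(500)] for _ in range(100)]
labels = [rng.randint(0, 1) for _ in range(100)]  # random labels: no real signal

leaky = cross_validate(rows, labels, select_inside=False)
proper = cross_validate(rows, labels, select_inside=True)
print("selection before validation:", leaky)
print("selection inside each fold: ", proper)
```

With random labels, any estimate clearly above 50% comes from the validation procedure rather than from the model.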

Are the observations (rows in the table) independent?
Should NOT have duplicate rows.
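
A minimal sketch of a duplicate-row check, assuming rows are stored as tuples of attribute values; the function names duplicate_rows and leaked_test_rows and the toy table are hypothetical. The second function counts how many test rows have an exact copy in the training part of the same split, which is the situation the checklist warns about.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return each row that occurs more than once, with its count."""
    counts = Counter(tuple(r) for r in rows)
    return {row: c for row, c in counts.items() if c > 1}

def leaked_test_rows(rows, folds=5):
    """Count test rows that also appear verbatim in their own training set."""
    n = len(rows)
    leaked = 0
    for f in range(folds):
        test_idx = set(range(f, n, folds))
        train = {tuple(rows[i]) for i in range(n) if i not in test_idx}
        leaked += sum(tuple(rows[i]) in train for i in test_idx)
    return leaked

# Toy table: rows 0 and 2 are identical, and so are rows 1 and 5.
rows = [(1, 0, 1), (0, 1, 1), (1, 0, 1), (0, 0, 0), (1, 1, 0), (0, 1, 1)]
dups = duplicate_rows(rows)
leaked = leaked_test_rows(rows, folds=3)
print(dups)
print(leaked)
```

If the count is nonzero, removing duplicates (or keeping all copies of a row in the same fold) before validation restores independence between training and test rows.
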
[Slide diagrams: random 0-1 labels; tables of 100 rows and 1000 rows; 500 attributes and 10 attributes; 100 models at ~50% each, pick the best; cross-validation results shown: 63%, 54%, 53%]
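
The "pick the best of 100 models" effect can be sketched with coin flips. This is an illustration under assumptions, not a reproduction of the slides: each "model" here simply guesses at random, so its measured accuracy on random 0-1 labels stands in for a cross-validation estimate, and the function name best_of_random_models is hypothetical. The average model sits near 50%, but the best of 100 looks better than chance, and the gap shrinks as the number of rows grows.

```python
import random

def best_of_random_models(n_rows, n_models=100, seed=0):
    """Return (mean accuracy, best accuracy) over n_models random guessers."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_rows)]
    accuracies = []
    for _ in range(n_models):
        # A model with no skill: its predictions are independent coin flips.
        preds = [rng.randint(0, 1) for _ in range(n_rows)]
        accuracies.append(sum(p == y for p, y in zip(preds, labels)) / n_rows)
    return sum(accuracies) / n_models, max(accuracies)

mean_100, best_100 = best_of_random_models(100)
mean_1000, best_1000 = best_of_random_models(1000)
print("100 rows:  mean", mean_100, "best", best_100)
print("1000 rows: mean", mean_1000, "best", best_1000)
```

The best score is a biased estimate because the same data were used both to rank the models and to report the winner's performance; a fresh holdout set for the selected model removes that bias.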