Send the link below via email or IMCopy
Present to your audienceStart remote presentation
- Invited audience members will follow you as you navigate and present
- People invited to a presentation do not need a Prezi account
- This link expires 10 minutes after you close the presentation
- A maximum of 30 users can follow your presentation
- Learn more about this feature in our knowledge base article
Do you really want to delete this prezi?
Neither you, nor the coeditors you shared it with will be able to recover it again.
Make your likes visible on Facebook?
You can change this under Settings & Account at any time.
ICDM 2014 - Chordalysis MML
Transcript of ICDM 2014 - Chordalysis MML
We need to know why!
An information-theoretic scoring of decomposable models
Experiments: it works!
We have very effective approaches for making predictions from data
Credit card fraud detection
How good are we at understanding the why of our prediction systems?
"We predicted you that you are going to get
in 4 years.
What you can do is...
Temperature on Earth is likely to increase by 3°C,
Data will be even more valuable when we know
how to take actions
that will make a difference.
A statistically efficient and scalable method
for log-linear analysis of high-dimensional data
Log-linear analysis for more than 2 variables
Closed form iff the graph is chordal (triangulated)
2. Efficient computation of sub-scores -
4. Details about chordal graphs:
We need a controlled environment...
... because in reality, we never have the "ground truth".
... hard to tell for sure...
1. choose a structure
2. parametrize the model
3. sample data
4. learn structure
5. compare structure
6. output stats
Study of the elderly
20 years of trading
1. can analyse high-dimensional data
2. is parameter free
3. is released on
Monash University - Melbourne
IEEE ICDM 2014
...a very simple example
for the presidential election independent of the being a
member of one or more associations
Why log-linear analysis?
The null model
How to compute E?
Let's do the maths
For multinomial sampling, the null model will be rejected when gets large.
... i.e. when
we reject the null hypothesis
1. formulate hypotheses
2. Get data
2 things have to be sent
1. What model it is?
2. The parameters of the model
There are 2 possible models
François Petitjean, Lloyd Allison and Geoff Webb
Send denominator 1473 (all models will have it)
Model => 1 bit
=> 18.9 bits
Total = 19.9 bits
Model => 1 bit
=> 29.8 bits
If you know that 987 people voted, how many bits does it take to reconstitute the data re
This is an ordering problem:
Sending any takes
Total = 2,742 bits
Total = 2,690 bits
more probable than
Disclaimer: it's actually a bit more complicated than that, see our paper - Section III.A
General framework: selecting a statistically significant log-linear model = superset of Markov Networks
If the data has more than
variables, the statistical approach to log-linear analysis
only works for the class of
ICDM 2013 paper
Question addressed by this paper:
Can we design a Bayesian framework that keeps a very
low-rate of false discovery
and scale to datasets with
Scoring a decomposable model
All we need is...
given the model
1. sending : the number of edges
2. sending where the edges are
Encoding the structure of the graph
We only need to send the edges:
: the number of nodes/variables is common to all models => no impact on the score => no need to take it into account
max number of edges
Encoding the parameters of the model
multinomial distributions, e.g.
... and every parameter is a ratio
A clique C has
Encoding the data given the model
Sending positions for:
- combinations of
- combinations of
Sending the positions for the clique
N is the number of transactions/examples in the dataset
The positions for
would be sent twice!
minimal separators of the graph
What are the elements that are sent twice?
Scoring models that have only 1 edge difference
Scoring can be done efficiently
function of only
- 500 maximal cliques
- 500 min. separators
We only need to consider
Not in the talk...
What else will you find in the paper?
...but in the paper!
3. Memoization of partial results
- How to elect the edges that will keep the graph triangulated?
- How to compute maximal cliques and minimal separators?
1. Theory for scoring edges' addition
Learning graphical models from data
1. Learn structure from data
2. Use the structure
Alice has difficulty walking
She had a
10 years ago
She takes medication for her
high blood pressure
The doctor suspects Type 2 diabetes
Diabetes is not likely, the doctor decides to change her medication for high blood pressure
Studying correlations & independencies
The systolic blood pressure is independent of the weight, given
diastolic blood pressure
medication for high blood pressure
A weight change is not likely to change the systolic blood pressure only
number of samples