
Case based entropy

by Rajeev Rajaram on 6 January 2017


Transcript of Case based entropy

Why (re-iterated)
Scale free comparison of distributions
Measures the true diversity contribution of all or part of a distribution
Currently no method available to do this!!
Scale free comparison
How do we compare a left skewed, a bimodal and a right skewed distribution?
Idea: Compute the number of equiprobable types needed for the same amount of information
Why
Case-based entropy C_c provides a common normalization where it computes the percentage diversity contribution up to a cumulative probability c
Case based entropy
Why?
The Math
Examples

Empirical Examples

Household Income
The mathematics
For each part of the distribution, up to cumulative probability c, compute the number of equiprobable types required to maintain the same amount of Shannon information
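One possible reading of this step, sketched in Python. The function names are hypothetical, and the exact formula is an assumption: for each cut point, the types up to that point are renormalized by their cumulative probability c, their Shannon entropy is converted to a number of equiprobable types, and that number is expressed as a percentage of the equiprobable types for the whole distribution. The sketch also assumes the probabilities are already in the chosen order, are strictly positive, and sum to 1.

```python
import math

def true_diversity(probs):
    """Number of equiprobable types with the same Shannon entropy: exp(H)."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return math.exp(h)

def case_based_entropy(probs):
    """Return a list of (c, C_c) pairs, one per cumulative cut point.

    c   : cumulative probability of the first k types
    C_c : percentage diversity contribution of those k types (assumed formula)
    """
    total_diversity = true_diversity(probs)
    curve = []
    c = 0.0
    for k in range(1, len(probs) + 1):
        c += probs[k - 1]
        # Renormalize the first k types so they form a distribution of their own.
        conditional = [p / c for p in probs[:k]]
        c_c = 100.0 * true_diversity(conditional) / total_diversity
        curve.append((c, c_c))
    return curve

# Illustrative, made-up distribution over four ordered types.
for c, c_c in case_based_entropy([0.4, 0.3, 0.2, 0.1]):
    print(f"c = {c:.2f}  C_c = {c_c:.2f}%")
```

Under this reading the last point of the curve is always (1, 100%), and a perfectly uniform distribution gives a straight line in which C_c equals 100c.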
(from galaxies to gases)
Here, we wanted to know what the diversity of household income was for the United States in 2012.
Here, we have 41 different diversity types, constituting 41 different economic probability states for household income in the United States.

If one explored this distribution using regular statistical terminology, N=41 would be used to compute the mean, median, mode, skew, etc. in relation to the total sample size.

THAT IS NOT WHAT WE ARE DOING!


INSTEAD, we are measuring the diversity of information in the system, based on a case-based notion of equi-probable types.

1. As such, the N = 41 types is not what we used to compute perfect diversity.

2. Instead, based on the formulas shown in our SIMPLE EXAMPLE, the number of equi-probable diversity types necessary to maintain the same value of Shannon-entropy is N equi-types = 31.21.

3. With our N equi-types computed, we treat this number as our denominator, which 'converts diversity' to percentages. We do this to make our measure scale-free, which allows us to:

(a) compare distributions to one another; and also
(b) compare any part of a distribution to the rest of it, so as to know its percentage contribution to the total diversity.

4. With our denominator determined, we can then compute the diversity contribution for any of our original N = 41 types relative to the N equi-types = 31.21. For example:

(a) The diversity contribution from Type 1 (Under $5,000), which comprises 3.32% of all cases in the sample, is 3.20%.

(b) In turn, the combined diversity contribution from Type 1 to Type 13 ($60,000 to $64,999), which together comprise 59.32% of all cases in the sample, is 41.03%.
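The two percentages above can be checked with simple arithmetic. This is a minimal sketch, assuming that a single type on its own counts as exactly one equiprobable type and that each percentage is the number of equiprobable types for that part divided by 31.21. The variable names are illustrative; the only inputs taken from the slides are 31.21 and 41.03%.

```python
n_equi_types = 31.21  # denominator reported in the slides

# Assumption: one type by itself is equivalent to exactly one equiprobable type.
single_type_pct = 100 * 1 / n_equi_types
print(f"Type 1 contribution: {single_type_pct:.2f}%")

# Working backwards from the reported 41.03% gives the number of
# equiprobable types implied for Types 1 to 13 taken together.
implied_types = 0.4103 * n_equi_types
print(f"Implied equiprobable types for Types 1-13: {implied_types:.2f}")
```

The first figure reproduces the 3.20% stated for Type 1. The second is only an implied value derived from the reported percentage, not a number given in the slides.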




Figure 1 Part A
Figure 1 Part B
As the comparison of these two pictures shows, the graph for case-based entropy is very different from the skewed-right probability distribution with which we started.

The x-axis represents the percentage of diversity explained by any combination of the N = 41 types relative to the N = 31.21 equi-types.

The y-axis is the cumulative frequency of cases in the system of study.

In this case, we find that roughly 60% of all cases, which also happen to be the smallest incomes, account for only 40% of the total diversity of household income.

Interestingly enough, we found this rule to be true for the skewed-right probability distributions for an exceptionally wide variety of complex systems.

Here, we have eight different examples of complex systems. On the left are the skewed-right probability distributions. On the right are the case-based entropy graphs for the same eight systems. It is noteworthy that all eight systems pass through the 60-40 region of the case-based entropy graphs, suggesting that the 60-40 rule plays a role in the diversity of complexity in many systems.
Inspired by our initial findings for the above eight complex systems, we decided to explore further three of the most classic energy distributions in physics to see if case-based entropy could:

(1) effectively map the distribution of diversity of probability states in these systems -- which constitute different forms of energy types, in both discrete and continuous form.

(2) compare the diversity of complexity in these systems to see if the 60/40 rule held sway.

We examined:
(a) The Maxwell-Boltzmann Distribution, in both one dimension and three dimensions (i.e., MB 1D & MB 3D).
(b) The Bose-Einstein Distribution, for both helium and photons (i.e., BE Helium & BE Photon).
(c) The Fermi-Dirac Distribution, for sodium at four different temperatures (i.e., Na at 6000K, 300K, 1.2K and 15000000K).

NOTE: We postulate that, because fermions obey the Pauli Exclusion Principle, a Fermi gas does not sufficiently clump toward the lower bound.
https://www.dropbox.com/s/xv8qbmln9ijhl3z/Flower%20Example.xlsx?dl=0
Why equiprobable
Ideal diversity means each probability state or each type occurs with equal probability
If the original distribution is not equiprobable, then we ask how many states or types are required if we have an "equivalent" equiprobable distribution.
The "equivalence" is established by demanding that we have the same amount of Shannon entropy as the original distribution
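The equivalence described above can be written out as a short derivation. The symbols $p_i$, $H$, $N$ and $N_{\text{equi}}$ are notation assumed here for illustration, with the natural logarithm used throughout; this is a sketch of the standard argument, not necessarily the presenters' own notation.

```latex
% Shannon entropy of the original distribution over types i = 1, ..., N
H = -\sum_{i=1}^{N} p_i \ln p_i

% An equiprobable distribution over N_equi types has p_i = 1 / N_equi, so
H_{\text{equi}} = -\sum_{i=1}^{N_{\text{equi}}} \frac{1}{N_{\text{equi}}} \ln \frac{1}{N_{\text{equi}}} = \ln N_{\text{equi}}

% Demanding the same Shannon entropy, H_equi = H, gives
N_{\text{equi}} = e^{H}
```

When the original distribution is already equiprobable, $H = \ln N$ and so $N_{\text{equi}} = N$. Otherwise $N_{\text{equi}} < N$, as in the household income example, where 41 types reduce to 31.21 equiprobable types.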
Hypothetical example
Imagine a garden that has different types of flowers.
Each flower type will, in general, have a different frequency of occurrence.
In an ideal world, perfect diversity would mean that each type has the SAME frequency of occurrence, i.e., each type is equi-probable.
Diversity in the mathematical sense refers not to the number of types of flowers, but to the equiprobable occurrence of all types.
If we don't have perfect diversity, then we compute the number of flower types required for an "equivalent" equiprobable distribution that has the same value of Shannon information.
Caveat
We are assuming some order of importance of the flowers (or order of preference). Changing the order will change the diversity contribution if the frequencies are not the same.
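The effect of ordering can be illustrated with a small, made-up garden. The flower counts below are hypothetical and the helper names are illustrative. The contribution of the first k types is taken to be the number of equiprobable types for those k types divided by the number for the whole garden, which is an assumed reading of the method.

```python
import math

def diversity(counts):
    """Equiprobable types with the same Shannon entropy as the given counts."""
    total = sum(counts)
    probs = [n / total for n in counts if n > 0]
    return math.exp(-sum(p * math.log(p) for p in probs))

def contribution(counts, k):
    """Percentage diversity contribution of the first k types, in the given order."""
    return 100 * diversity(counts[:k]) / diversity(counts)

garden = [50, 30, 15, 5]            # hypothetical counts, most common type first
reordered = list(reversed(garden))  # same garden, rarest type first
even = [25, 25, 25, 25]             # perfectly diverse garden

print(contribution(garden, 2), contribution(reordered, 2))  # differ with the order
print(contribution(even, 2))                                # order does not matter here
```

The first two types contribute a different share depending on which types come first. In the perfectly even garden any two types contribute the same share, which matches the condition stated in the caveat.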
Why
For us, case-based entropy is a measure of the diversity within a complex system (be it physical, biological, psychological, social, ecological, etc.) that incorporates both species richness (the number of cases in a system) and the evenness of species' abundances (the diversity of types, be they discrete or continuous).

SYNONYMS FOR DIVERSITY
Inequality
cultural diversity in society
species diversity in an ecological community
diversity of information in cybernetic systems
diversity of major and minor trends in longitudinal data
diversity of health and wellbeing
trade diversity of a country
network diversity
diversity of complexity in systems

