Chi Square Test - Class Presentation

References: http://www.simafore.com/blog/bid/54594/How-to-use-Chi-Square-test-for-3-common-business-analytics-problems, http://www.youtube.com/watch?v=CRyoXkxObsA & http://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm


Transcript of Chi Square Test - Class Presentation

Presented By:
Siva Krishnamurthy Daniel Cohen
Alex Panagiotidis Michael Davies
What Is The Chi Square Test
How Do We Use Chi Square
Advantages And Disadvantages Of Using Chi Square Test

O = Original (Observed) Value
E = Estimated (Expected) Value
In simple terms:
Chi Square is used to test whether the observed values differ from the estimated values by more than chance alone would explain
A measurement of how expectations compare to results. The data used in calculating a chi square statistic must be random, raw, mutually exclusive, drawn from independent variables and be drawn from a large enough sample.
Areas Chi Square Can Be Used

Actuarial Sciences – Study Of Risk Management

Bio-statistics – Statistics Used For Biological Purposes

Business Analytics – Used For Business Forecasting

Chemometrics – Statistics Used For Chemistry Purposes

Operations Research – Statistics Used To Determine Operations

Quality Control – Statistics Related To Quality And Reliability

General Relations Of Variances
The higher the value of chi square, the greater the difference between the
Original value & Estimated value
Chi Square = Σ (O - E)² / E, summed over every category
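A minimal sketch of the calculation, assuming the standard form of the statistic, in which (O - E)² / E is summed over every category. The function name `chi_square` and the counts below are invented for illustration and are not taken from the presentation.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over paired observed and expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, invented for illustration only.
observed = [18, 22, 20]
expected = [20, 20, 20]

stat = chi_square(observed, expected)
print(round(stat, 2))
```

Squaring each difference keeps positive and negative gaps from cancelling out, and dividing by E scales each gap relative to the size of the count that was expected.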
There are two major forms for this test:
Goodness Of Fit Test:
Test Of Association:
This is a test which determines whether results from research are consistent with expected results from a hypothesis.
This is a test which determines whether there is an association between two or more different variables and is used to find out whether a change in one variable would affect another.
Chi Square is a more precise method of mathematically estimating future projections than regression & correlation
Once the surveying is completed we can begin our
Chi Square Testing.
"Level Of Significance"
has to be set.
In this example it is 5%

H0 (the null hypothesis): gender DOESN'T affect working conditions.

H1 (the alternate hypothesis): gender DOES affect working conditions.
Level of Significance
df    P = 0.05    P = 0.01    P = 0.001
1     3.84        6.64        10.83
2     5.99        9.21        13.82
3     7.82        11.35       16.27
4     9.49        13.28       18.47
5     11.07       15.09       20.52
6     12.59       16.81       22.46
7     14.07       18.48       24.32
8     15.51       20.09       26.13
9     16.92       21.67       27.88
10    18.31       23.21       29.59
11    19.68       24.73       31.26
12    21.03       26.22       32.91
13    22.36       27.69       34.53
Calculating Chi Square
What does this mean?
Example Question!
Now It's Your Turn!
Class Activity
Chi Square Terminology
Null & Alternate Hypothesis

Level Of Significance

Degree Of Freedom
DF = (N Of Columns - 1) x (N Of Rows - 1)
DF = (3-1) x (2-1)
DF = 2
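The degree of freedom formula can be sketched as a one-line helper. The function name `degrees_of_freedom` is an assumption made for this example; the inputs are the 3 columns and 2 rows used in the line above.

```python
def degrees_of_freedom(n_columns, n_rows):
    """DF = (number of columns - 1) x (number of rows - 1)."""
    if n_columns < 2 or n_rows < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (n_columns - 1) * (n_rows - 1)


df = degrees_of_freedom(3, 2)
print(df)
```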
Web Definition - A number representing the range of possibilities for movement.
Work out the experimental values and then apply the
"Chi Square Formula"
"You have been asked to write a report on the results of a questionnaire among the employees of a multinational computer manufacturer concerning pay & conditions"
The level of significance can be described as the chance we accept of making an error in the test by rejecting the null hypothesis when it is actually true
The survey results are based on the replies of 400 selected employees relating gender to the question - "What is the most important thing to me about my work?"
The Survey:
The Observed Results
To obtain the estimated values:
E = (RT x CT) / GT
RT = Row Total
CT = Column Total
GT = Grand Total
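As a rough sketch, the estimated value for each cell is usually taken as row total times column total divided by grand total. The function name `expected_values` and the 2 x 2 table below are hypothetical; the survey counts from the presentation are not reproduced in this transcript, so they are not used here.

```python
def expected_values(observed):
    """Return the estimated count for each cell: (RT x CT) / GT."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [rt * ct / grand_total for ct in column_totals]
        for rt in row_totals
    ]


# Hypothetical observed counts, invented for illustration only.
observed_table = [
    [30, 10],
    [20, 40],
]

estimated_table = expected_values(observed_table)
for row in estimated_table:
    print(row)
```

The estimated table keeps the same row and column totals as the observed table, which is what the counts would look like if the two variables were unrelated.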
Asymmetry Of Proof
The Survey:
The Estimated Results
Some disadvantages include:
First you need to determine how your data will be collected to ensure that it is unbiased and from a large enough sample.
Original Entry
H0: the degree classifications at City and Kingston universities are the same

H1: the degree classifications at City and Kingston universities are not the same
Null and Alternate
Estimated Entry
Chi Square
Since the test stat < critical stat, the null hypothesis is not rejected: the test shows no significant difference in the degree classification between the two universities
Class Activity Findings
This percentage generally tends to be 5% (0.05), 1% (0.01) or 0.1% (0.001).

The most common level of significance used by most scientists and statisticians would be 5%.

The level of significance table would be used to help guide us to confirm either the null or alternate hypothesis.
The table would look something like this
This will be explained later, don't worry!
Overall, helping to predict variables which may occur in the future,
to investigate potential occurrences of one set of variables when another one changes.
Our Hypothesis
Level Of Significance
We have 95% confidence in this test which would mean our level of significance is 5% (0.05)

(as long as we are satisfied that the data is not biased)
x 300
Now To Apply The Chi Square Formula
By stating a hypothesis and applying chi square to the data we can test if this survey has a degree of authority and is not a result of randomness.
This is the calculated chi square value. If the value is high, it means there is a big difference between the O and E values.
This can also be known as the test statistic and is the value which will be compared with the level of significance table in order to decide whether our null hypothesis should be accepted or rejected.
In order to test this against our level of significance table we must first calculate the degree of freedom.
df =(4-1) x (2-1)
Therefore our df = 3
So, what does this tell us?
Test Value (Chi Square) = 31.99
Critical Value at df 3 (Level of Significance Table) = 7.82
Because the chi square value is higher than our critical value, the null hypothesis (H0) is rejected. Therefore it can be stated that the findings of this test show gender does affect working conditions.
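The comparison against the table can be sketched as a simple lookup. The critical values below are the P = 0.05 column of the level of significance table in this transcript; the names `CRITICAL_VALUES_5_PERCENT` and `decide` are assumptions made for this example.

```python
# P = 0.05 column of the level of significance table, keyed by df.
CRITICAL_VALUES_5_PERCENT = {
    1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07,
    6: 12.59, 7: 14.07, 8: 15.51, 9: 16.92, 10: 18.31,
    11: 19.68, 12: 21.03, 13: 22.36,
}


def decide(test_stat, df, table=CRITICAL_VALUES_5_PERCENT):
    """Compare the test statistic with the critical value for this df."""
    if df not in table:
        raise ValueError(f"no critical value listed for df = {df}")
    critical = table[df]
    if test_stat >= critical:
        return "reject H0"
    return "do not reject H0"


# Worked example from the presentation: 31.99 against 7.82 at df 3.
survey_decision = decide(31.99, 3)
# Class activity from the presentation: 0.156 against 7.82 at df 3.
activity_decision = decide(0.156, 3)

print(survey_decision)
print(activity_decision)
```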
Data must be numerical values only - no percentages
Data must be in groups and of a reasonably high sample size
Not always easy to determine whether data is biased or not
If there is only one degree of freedom and the significance level is set at 0.01 or 0.001, the results would not be as reliable
You are provided with data collected of degree classifications from two different universities.
Your Job:
Calculate / Answer The Following:
The Estimated Values
The Null & Alternate Hypothesis
Chi Square/ Test Statistic
Compare the chi square value with the critical value from the level of significance (at 5%) table
Degree Of Freedom
Think about what the null hypothesis may be AND identify the degree of freedom
Next, find out the estimated values
Observed Entry
It's time to calculate chi square...
df = 3
Test Stat = 0.156
Critical Stat = 7.82
This is when it is difficult to prove a general statement true.
For example
"All penguins can't fly"
This statement can never really be proved, since you cannot observe every penguin. A single penguin that can fly, however, would be enough to disprove it
Therefore we create a null and alternate hypothesis
H0: All penguins can't fly
H1: Some penguins can fly

If test stat < critical stat: do not reject H0
If test stat >= critical stat: reject H0 and accept H1