Degrees of Freedom

The degrees of freedom are, essentially, the number of values in the final calculation of a statistic that are free to vary. In a Chi Square analysis, the degrees of freedom are the number of categories in the test minus 1. In this M&M experiment, we have 6 colors of M&M's, and since the colors are what we compare to the expected data, they are the categories. Our degrees of freedom are therefore 5, since we subtract one from 6.
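As a quick illustration of the rule above, here is a minimal Python sketch. The color names are an assumption for illustration only (the write-up just says there are 6 colors), and the variable names are made up for this example.

```python
# The six categories in the test. The specific color names are assumed
# for illustration; only the count of 6 comes from the lab write-up.
colors = ["blue", "brown", "green", "orange", "red", "yellow"]

# Degrees of freedom for a goodness-of-fit test: categories minus one.
degrees_of_freedom = len(colors) - 1

print(degrees_of_freedom)
```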

**How to Chi Square**

By: Andrew Lee

Chi Square is an important tool in science and biology because it helps us determine whether the results of an experiment are due to chance or to something involved in the experiment. We recently used Chi Square to test whether the M&M company really puts the amount of each color of M&M's into its packages that it says it does.

But how do we know if the amount of each color is accurate enough? This is where Chi Square analysis comes into play. Through Chi Square, we can determine how closely the data we observe match the expected data. The method takes the number that results from the Chi Square equation and compares it to a Chi Square value from a table, which we look up using something called degrees of freedom.

So how did Chi Square come to be? In 1900, a mathematician named Karl Pearson introduced the x^2 test, which is now known as the Chi Square test for goodness of fit. R.A. Fisher developed something similar, called tests of significance, in which a proposed model is compared to the data, just like Pearson's. In 1933, Pearson's son Egon and Jerzy Neyman developed the idea of testing a hypothesis that is either accepted or rejected, which gave us the Chi Square analysis we use today.

The Null Hypothesis

Shown in this picture are the M&M's I received in a cup. The M&M's originally came in a bag; Mrs. Stock transferred the pieces from the bag to the cup.

A null hypothesis is a prediction that the difference between the expected results and the results observed in the experiment is due entirely to chance. For every Chi Square analysis, the null hypothesis is the same:

Any difference between the observed and expected data is due to chance.

Chi Square Value

Finding the degrees of freedom leads us to the Chi Square value (the critical value from a Chi Square table) that we compare our calculated x^2 value to. The Chi Square value is the cutoff for how large the difference between the observed and expected numbers can be while still being put down to chance. It depends on the number of degrees of freedom: at the commonly used 0.05 significance level, the value is 11.07 for 5 degrees of freedom and 9.49 for 4 degrees of freedom. Since we have 5 degrees of freedom, our Chi Square value is 11.07. If the x^2 value we calculate from the observed and expected data is less than or equal to the Chi Square value, we must accept the null hypothesis. If it is higher than the Chi Square value, we must reject the null hypothesis.

In explaining the Chi Square value, I mentioned the x^2 value. How exactly do we find this value? We use the equation x^2 = Σ (O-E)^2 / E

These mysterious symbols all have meanings. The O represents the observed data that you get from the test, and the E represents the results expected from the test. In this M&M experiment, the O is the percentage of each color of M&M's in the bag, and the E is the percentage of each color expected in each bag, based on the information posted on the M&M website. Lastly, the Σ means that we calculate (O-E)^2/E for each color and add the results together, and this sum is our x^2 value.
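To make the equation concrete, here is a minimal Python sketch of the calculation. Every number in it is a hypothetical placeholder: the observed counts are not my data, and the expected shares are not the figures from the M&M website. The sketch also works with counts rather than percentages, which is how the test is usually computed, so each expected count is the expected share times the number of candies in the sample. The function name chi_square is made up for this example.

```python
# Hypothetical observed counts for one sample of 50 candies
# (placeholders for illustration, not the data from this lab).
observed = {"blue": 12, "brown": 4, "green": 9,
            "orange": 11, "red": 6, "yellow": 8}

# Hypothetical expected shares of each color (placeholders, not the
# percentages published by the company). They must add up to 1.
expected_share = {"blue": 0.20, "brown": 0.10, "green": 0.20,
                  "orange": 0.20, "red": 0.10, "yellow": 0.20}


def chi_square(observed, expected_share):
    """Return the sum over all categories of (O - E)^2 / E."""
    total = sum(observed.values())
    statistic = 0.0
    for color, o in observed.items():
        e = expected_share[color] * total  # expected count for this color
        statistic += (o - e) ** 2 / e
    return statistic


x2 = chi_square(observed, expected_share)
print(round(x2, 2))
```

With these made-up numbers the x^2 value comes out to about 1.4, which would be well under the 11.07 cutoff for 5 degrees of freedom.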

Based on the data from my individual test, I reject the null hypothesis because the x^2 value calculated from my individual data was 13.97, which is larger than 11.07, the Chi Square value for any experiment with 5 degrees of freedom. My value may have come out above the Chi Square value because of a possible mistake in carrying out the protocol (sorting the M&M's into the cups and plates).

For the class data, the calculated x^2 value, 6.95, is less than the Chi Square value, which, as stated in the analysis of my individual data, is 11.07 for 5 degrees of freedom, and so I accept the null hypothesis.
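The two decisions above follow the same rule, which can be sketched in Python. The table holds only the two cutoff values mentioned in this write-up (11.07 for 5 degrees of freedom and 9.49 for 4), and the names CRITICAL_VALUES and decide are made up for this example.

```python
# Chi Square cutoff values quoted in this write-up, keyed by degrees
# of freedom. A full table would have many more rows.
CRITICAL_VALUES = {4: 9.49, 5: 11.07}


def decide(x2, degrees_of_freedom):
    """Reject the null hypothesis only if x2 is above the cutoff."""
    critical = CRITICAL_VALUES[degrees_of_freedom]
    return "reject" if x2 > critical else "accept"


print(decide(13.97, 5))  # the 13.97 result from my individual data
print(decide(6.95, 5))   # the 6.95 result
```

The first call gives "reject" and the second gives "accept", matching the conclusions in the text.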

So what was this lab?

This lab was essentially for us to learn how to apply Chi Square to find out whether the difference between the data we observe and the data we expect to see is due to chance. In this lab we used M&M's to determine whether the number of each color matched what the company stated closely enough. We first took individual samples and used a Chi Square test to compare the numbers, and eventually the percentages, of each color to the expected percentages on the company's website. We then combined all our data as a class, did another Chi Square test, and found that the M&M company was true to its word.

The expected results were the percentages for each color of M&M's found on the M&M website. These were the best basis for our expected results because they let us check whether the company's claim about the approximate share of each color in a packet holds up. The percentages also give us something to contrast our observed measurements against, because they are the company's own claim about how many M&M's of each color are put in. Lastly, the percentages let us reject or accept the null hypothesis, which is "Any difference between the observed and expected results is purely by chance."

We also want to identify our controls in any test or experiment. The controls for this M&M Chi Square test are:

Same procedure and protocol

Same M&M colors being tested for everybody

Same company and distributor of the M&M's

Type and size of the M&M's

Approximately equal M&M mass

Same M&M bag

Comparison of Data

Here is a comparison of the percentages of the class data, my personal data, and the expected data, all in a bar graph for each color.

Graph of x^2 value

Here is a graph of the calculated Chi Square values of both my individual sample data and the class data.