**Statistics: Understanding the Basics**

**Introduction**

**How can we prove what our "gut" feels?**

*Steps to understanding a phenomenon:

collect and consider the data

know when data is not useful, remembering that data often explains "what," not "why"

Often careful analysis of data can get us to the ROOT CAUSE.

It's All Important

Statistics - A set of tools designed to help describe the sample or population from which the data were gathered and to explain the possible relationship between variables.

(Creighton 2010, p. 2)

measures of variability illustrate how spread out data (scores) are along a distribution

statistical tools help collect, organize, analyze and interpret data

Descriptive statistics - summarize numeric characteristics of a sample
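A minimal sketch of descriptive statistics and variability, using Python's standard `statistics` module. The scores are made up for illustration; they do not come from the lecture or from Creighton.

```python
import statistics

# Hypothetical test scores, invented for illustration only
scores = [70, 75, 80, 85, 90]

mean = statistics.mean(scores)            # central tendency
median = statistics.median(scores)        # middle score
spread = statistics.stdev(scores)         # sample standard deviation (variability)
score_range = max(scores) - min(scores)   # simplest measure of spread

print(mean, median, round(spread, 2), score_range)
```

The mean and median summarize the sample; the standard deviation and range show how spread out the scores are.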

Correlations

Examine relationships between two variables

Show direction and strength

DO NOT EXPLAIN CAUSES

Pearson Correlation Coefficient is most frequently used

Pearson measures the strength of a linear relationship between paired data

Level of significance is conventionally set at p < .05

Spearman correlations are used instead when variables do not meet the assumptions Pearson requires; Spearman's correlation coefficient is a statistical measure of the strength of a monotonic relationship between paired data
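To make the linear vs. monotonic distinction concrete, here is a rough sketch that computes both coefficients by hand. The function names (`pearson`, `ranks`, `spearman`) and the data are assumptions for illustration; a statistics package would normally do this, and would also report the p value.

```python
import math

def pearson(x, y):
    """Pearson correlation: strength of a linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical paired data: y always rises with x, but not in a straight line
hours = [1, 2, 3, 4, 5]
score = [1, 8, 27, 64, 125]

r = pearson(hours, score)      # below 1: the relationship is not perfectly linear
rho = spearman(hours, score)   # 1: the relationship is perfectly monotonic
```

Neither number says anything about cause; both only describe direction and strength.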

Regressions

Regressions predict one variable based on another

Simple regressions have two variables - ex: SAT/ACT scores and GPA

Multiple regressions have several variables - ex: SAT/ACT scores, GPA, and hours spent studying for ACT/SAT tests

"Must Haves": at least 20 cases for each regression

"Must Meets": linearity, independence of errors, homoscedasticity, normality

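A simple regression can be sketched as an ordinary least-squares line. The helper name `simple_regression` and the SAT/GPA numbers below are hypothetical, and five cases is far fewer than the 20 this section calls for; the point is only to show the mechanics.

```python
def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical SAT scores and GPAs, invented for illustration
sat = [1000, 1100, 1200, 1300, 1400]
gpa = [2.6, 2.9, 3.2, 3.5, 3.8]

slope, intercept = simple_regression(sat, gpa)
predicted = intercept + slope * 1250   # predicted GPA for an SAT score of 1250
```

The fitted line predicts one variable (GPA) from the other (SAT score); it does not check the "Must Meets" assumptions, which need to be examined separately.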
t-Tests

t-Tests are used to compare the means of two groups

Dependent - comparing the means within the same group/subjects, paired samples (ex: SAT scores for all 11th grade boys before and after a study session)

Independent - comparing the means between two groups (ex: mean SAT scores for all 11th grade girls compared to all 11th grade boys after a study session)

In order to perform a t-Test the following assumptions must be met:

presence of interval/ratio data
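The two kinds of t-Test differ in what goes into the formula. The sketch below computes only the t statistic for each; the function names and scores are assumptions for illustration, and the independent version uses Welch's form (which does not assume equal variances) as one reasonable choice. Turning t into a p value requires a t distribution, which is left to a statistics package.

```python
import math
import statistics

def paired_t(before, after):
    """Dependent (paired-samples) t statistic on the per-subject differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

def independent_t(group_a, group_b):
    """Independent-samples t statistic (Welch's form)."""
    var_a = statistics.variance(group_a) / len(group_a)
    var_b = statistics.variance(group_b) / len(group_b)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(var_a + var_b)

# Hypothetical SAT scores for the same four students before and after a study session
before = [1000, 1050, 1100, 1150]
after = [1020, 1080, 1110, 1180]
t_paired = paired_t(before, after)

# Hypothetical SAT scores for two separate groups
girls = [1100, 1150, 1200, 1250]
boys = [1000, 1050, 1100, 1150]
t_independent = independent_t(girls, boys)
```

The paired version works on each subject's change, so it needs the two lists in the same subject order; the independent version only needs the two group means and variances.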

ANOVAs vs. Chi-square Goodness of Fit

ANOVA - allows for the comparison of mean differences between two or more samples (with only two means, a t-Test does the same job)

MUST HAVES - independence of errors, homogeneity of variance, no outliers, normality

TYPES - one-way ANOVA (compares multiple groups), repeated measures ANOVA (tracks changes over time), two-way ANOVA (examines two independent variables)

*use with care...figure out first what you wish to discover.
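For a one-way ANOVA, the F statistic is the between-group variance divided by the within-group variance. This is a bare-bones sketch with a hypothetical helper name and invented classroom scores; it does not test the "MUST HAVES" and does not compute a p value.

```python
def one_way_anova_f(*groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three classrooms
class_a = [1, 2, 3]
class_b = [2, 3, 4]
class_c = [3, 4, 5]
f_stat = one_way_anova_f(class_a, class_b, class_c)
```

A larger F means the group means differ more than the spread inside each group would suggest.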

Chi-square Goodness of Fit (or what do you do when your data are not normal?)

CsqGF - compares (observed) frequency count of a sample to the (expected) frequency count of a population

WHEN to USE? - with categorical (nominal) data, no interval or ratio data, or a one-option data-set

*this is a non-parametric test! Use parametric statistics first whenever the data meet their assumptions
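The observed vs. expected comparison reduces to one sum. A small sketch, with a hypothetical function name and made-up counts; the resulting value would still need to be compared against a chi-square distribution with the given degrees of freedom.

```python
def chi_square_gof(observed, expected):
    """Chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical: 100 students each pick one of four electives;
# the expected counts assume an even split across the four options
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1   # degrees of freedom = number of categories - 1
```

The data here are frequency counts per category, which is why no interval or ratio measurement is needed.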

*remember the difficulties of data: it does have many uses, but there can be mistrust of data, measurement errors and challenges, and lack of time to reflect on collected data.

CAN'T DO IT WITHOUT THE NUMBERS

Research Design (the BIG PICTURE for the LITTLE DETAILS)

There are 5 main elements of research design - purpose/questions, literature review, METHODS, findings, conclusion

METHODS: QUALITATIVE vs. QUANTITATIVE

...and of course, MIXED METHODS

*want a quick illustration of correlation between two variables? Use a scatterplot

normal distribution (think bell curve)

*a note about validity and reliability

reliability = consistent responses (how consistent is the measure?)

validity = correct responses (how well does the tool measure what it is supposed to measure?)

Measurement

- to assign numbers to a variable using measures (questionnaire, survey, test...)

scales of measurement:

(how are these variables differentiated?)

nominal - name/classification, mutually exclusive categories

ordinal - rank

interval - zero is an arbitrary point on the scale, ordered categories of the same size

ratio - the most precise scale, absolute zero (zero is absence of measured quality), ordered sets

References

Creighton, Theodore. (2007). Schools and data: The educator's guide for using data to improve decision making. Thousand Oaks, CA: Corwin Press.

Varga, Mary Alice. (2013). Notes for lecture on Data-driven Decision Making. Retrieved from westga.view.usg.edu/dl2

ANOVA = analysis of variance