Overview of Statistics DDDM9960

Just the Basics

Lena Hamilton

on 5 July 2013

Transcript of Overview of Statistics DDDM9960

Statistics: Understanding the Basics
How can we prove what our "gut" feels?

*Steps to understanding a phenomenon:
collect and consider the data
know when data are not useful, remembering that data often explain "what", not "why"
Careful analysis of data can often get us to the ROOT CAUSE.

It's All Important
Statistics - A set of tools designed to help describe the sample or population from which the data* were gathered and
to explain the possible relationship between variables.
(Creighton 2010, p. 2)

measures of variability illustrate how spread out data (scores) are along a distribution

statistical tools help collect, organize, analyze and interpret data

Descriptive statistics - summarize numeric characteristics of a sample
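
A minimal sketch of descriptive statistics and measures of variability, using only Python's standard library. The scores are hypothetical values invented for illustration, not data from the course.

```python
import statistics

# Hypothetical test scores for a small sample (illustration only).
scores = [70, 75, 80, 85, 90]

# Measures of central tendency summarize the "typical" score.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)

# Measures of variability show how spread out the scores are.
score_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)  # sample standard deviation (n - 1)

print(mean_score, median_score, score_range, round(sample_sd, 2))
```

The range only looks at the two extreme scores, while the standard deviation uses every score, so the two can tell different stories about the same distribution.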

Examine relationships between two variables
Show direction and strength

Pearson Correlation Coefficient is most frequently used; it measures the strength of a linear relationship between paired data
Level of significance is conventionally set at p < .05

Spearman correlations are used instead when variables do not meet the assumptions Pearson requires. Spearman's correlation coefficient is a statistical measure of the strength of a monotonic relationship between paired data
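
To make the linear vs. monotonic distinction concrete, here is a rough sketch that computes both coefficients by hand. The function names and the data are assumptions made for this example; Spearman is computed here as Pearson applied to ranks, with tied values given their average rank.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical paired data: y always rises with x, but not in a straight line.
hours = [1, 2, 3, 4, 5]
points = [1, 8, 27, 64, 125]

r = pearson_r(hours, points)
rho = spearman_rho(hours, points)
print(round(r, 3), round(rho, 3))
```

Because the points always increase with the hours, Spearman reports a perfect monotonic relationship, while Pearson comes out below 1 because the relationship is curved rather than linear.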
Regressions predict one variable based on another

Simple regressions have two variables - ex: SAT/ACT scores and GPA

Multiple regressions have several predictor variables - ex: predicting SAT/ACT scores from GPA and hours spent studying for the tests
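
A simple regression can be sketched as fitting a least-squares line and using it to predict. The SAT and GPA values are made up for illustration, and `fit_line` is a hypothetical helper name; five cases are far fewer than a real analysis would need.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical paired observations (illustration only).
sat = [1000, 1100, 1200, 1300, 1400]
gpa = [2.6, 2.9, 3.2, 3.5, 3.8]

slope, intercept = fit_line(sat, gpa)

# Predict GPA for a student with an SAT score of 1250.
predicted_gpa = intercept + slope * 1250
print(round(slope, 4), round(intercept, 2), round(predicted_gpa, 2))
```

The slope is the predicted change in GPA for each additional SAT point; the prediction is only as trustworthy as the assumptions behind the fit.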

"Must Haves": at least 20 cases for each regression

"Must Meets": linearity, independence of errors, homoscedasticity, normality
t-Tests are used to compare the means of two groups

Dependent - comparing the means within the same group/subjects, paired samples (ex: SAT scores for all 11th grade boys before and after a study session)

Independent - comparing the means between two groups
(ex: mean SAT scores for all 11th grade girls, compared to all 11th grade boys, after a study session)

In order to perform a t-Test, the following assumptions must be met:
presence of interval/ratio data
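
The two kinds of t-Test can be sketched as follows. This computes only the t statistic (not the p-value), uses the pooled-variance form for the independent test, and runs on invented scores; the function names are assumptions for this example.

```python
import math
import statistics

def paired_t(before, after):
    """Dependent (paired) t statistic: same subjects measured twice."""
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

def independent_t(group_a, group_b):
    """Independent t statistic with pooled variance: two separate groups."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

# Hypothetical scores for the same five students before and after a session.
before = [50, 52, 48, 51, 49]
after = [53, 54, 50, 54, 52]
t_paired = paired_t(before, after)

# Hypothetical scores for two separate groups of students.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
t_independent = independent_t(group_a, group_b)

print(round(t_paired, 2), round(t_independent, 2))
```

The paired version works on each student's difference score, which is why it needs the same subjects in both lists and in the same order.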
ANOVAs vs. Chi-square Goodness of Fit
ANOVA allows for the comparison of means across two or more samples (with only two means, use a t-Test)

MUST HAVES - independence of errors, homogeneity of variance, no outliers, normality

TYPES - one way ANOVA (compares multiple groups), repeated measures ANOVA (tracks changes over time),
two-way ANOVA (compares two independent variables)

*use with care...figure out first what you wish to discover.
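
As a rough illustration of a one-way ANOVA, the sketch below computes the F statistic as the ratio of between-group to within-group variance. The three groups are invented, `one_way_f` is a hypothetical helper name, and no p-value is computed.

```python
def one_way_f(groups):
    """F statistic for a one-way ANOVA across two or more groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three groups (illustration only).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_f(groups)
print(round(f_stat, 2))
```

A large F means the group means differ more than the spread inside the groups would lead us to expect; it does not say which groups differ.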

Chi-square Goodness of Fit (or what do you do when your data are not normal?)

CsqGF - compares (observed) frequency count of a sample to the (expected) frequency count of a population

WHEN to USE? - with categorical (nominal) data, no interval or ratio data, or a one-option data-set

*this is a non-parametric test! Consider parametric statistics first, and use CsqGF when their assumptions are not met
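
The observed vs. expected comparison can be sketched like this. The counts and the equal expected proportions are assumptions invented for the example, and `chi_square_gof` is a hypothetical helper name; only the statistic and degrees of freedom are computed.

```python
def chi_square_gof(observed, expected_proportions):
    """Chi-square goodness-of-fit statistic for one categorical variable."""
    total = sum(observed)
    expected = [p * total for p in expected_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of 100 students choosing among four electives,
# compared with a population where each elective is equally popular.
observed = [18, 22, 20, 40]
expected_proportions = [0.25, 0.25, 0.25, 0.25]

chi2 = chi_square_gof(observed, expected_proportions)
degrees_of_freedom = len(observed) - 1
print(round(chi2, 2), degrees_of_freedom)
```

The statistic is then compared with a critical value from a chi-square table for the given degrees of freedom; a value above the cutoff suggests the sample does not fit the expected distribution.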

*remember the difficulties of data: it does have many uses, but there can be mistrust of data, measurement errors and challenges, and lack of time to reflect on collected data.
Research Design (the BIG PICTURE for the LITTLE DETAILS)
There are 5 main elements of research design - purpose/questions, literature review, METHODS, findings, conclusion

...and of course, MIXED METHODS

*want a quick illustration of correlation between two variables? Use a scatterplot
normal distribution (think bell curve)
*a note about validity and reliability

reliability = consistent responses (how consistent is the measure?)
validity = correct responses (how well does the tool measure what it is supposed to measure?)

scales of measurement:
(how are these variables differentiated?)
nominal - name/classification, mutually exclusive categories
ordinal - rank
interval - ordered categories of the same size, zero is just a point on the scale (no absolute zero)
ratio - the most precise scale, ordered categories of the same size with an absolute zero (zero is absence of the measured quality)

- to assign numbers to a variable using measures (questionnaire, survey, test...)

Creighton, Theodore. (2007). Schools and data: The educator's guide for using data to improve decision making. Thousand Oaks, CA: Corwin Press.

Varga, Mary Alice. (2013). Notes for lecture on Data-driven Decision Making. Retrieved from westga.view.usg.edu/dl2
ANOVA = analysis of variance