Statistical Decision Tree

by Kelsey Mariah, 22 November 2016


Transcript of Statistical Decision Tree

Statistical Decision Tree for Inferential Statistics

This is a decision tree designed to explain each step of selecting an appropriate statistic. If you are unsure how to make a decision, just zoom in and look a bit closer to see more details. Also, feel free to zoom out and try to get a better idea of how all these statistical analyses relate. :) Enjoy!

Nominal Dependent Variable
- Looking for the difference between expected and observed? Chi square goodness of fit
- Looking at the relationship between two variables? Chi square test of independence

Ordinal Dependent Variable
- Between subjects design
  - 2 groups, >20 per cell: rank-sum
  - 2 groups, <20 per cell: Mann-Whitney U
  - 3+ groups: Kruskal-Wallis
- Not a between subjects design
  - 2 groups: Wilcoxon or Sign Test
  - 3+ groups: Friedman

Interval/Ratio Dependent Variable
- No Independent Variables
  - Known sigma: z-test
  - Unknown sigma: t-test
- 1 Independent Variable
  - Independent Variable has 2 levels
    - Between subjects design: independent samples t-test or 1-way between subjects ANOVA
    - Not a between subjects design: dependent samples t-test or 1-way within subjects ANOVA
  - Independent Variable has 3+ levels: 1-way ANOVA (between or within subjects)
- 2+ Independent Variables
  - Independent Variables are all between subjects: 2-way (or higher) between subjects ANOVA
  - Independent Variables are all within subjects: 2-way (or higher) within subjects ANOVA
  - Independent Variables are both between and within subjects: 2-way (or higher) mixed ANOVA

Multiple Dependent Variables
- 1 Independent Variable with 2+ levels: One-Way MANOVA
- 2+ Independent Variables: Multivariate Multiple Linear Regression
- 0 Independent Variables: Factor Analysis
An ordinal variable has:
two or more categories
no true zero
an ordering or ranking to the categories, with varying distances between the options
Examples:
First place, second place, third place, etc.
Level of education (high school, undergraduate, graduate, etc)
Economic Status (low, medium, high)
A categorical (nominal) variable has:
two or more categories
no intrinsic 'order' to the categories
no true zero
Examples:
gender (male, female)
hair color (blonde, brunette, red, etc)
presence/absence
race (black, asian, latino, white, etc)
An interval variable has:
two or more categories.
Intrinsic order to the categories
Equal differences between categories
Examples:
Temperature in Celsius and Fahrenheit
IQ scores
Likert-type scales with 'equal' distances between categories (these are sometimes considered ordinal)
A ratio variable has:
two or more categories.
An intrinsic 'order' to the categories
Equal differences between categories
A true zero
Examples:
height
weight
Temperature in Kelvin only
True zero


A true zero typically refers to the absence of something. It is the point from which positive or negative numerical quantities can be measured.
Between Subjects


An experimental design with 2+ groups of subjects, where each group is tested under a different treatment condition.
Not Between Subjects


An experimental design that is not between subjects is referred to as a within-subjects (aka repeated measures) design. In within-subjects designs, the participants are exposed to multiple levels of the treatment conditions. This is also the type of measure used in longitudinal studies.
Per cell refers to per condition

Sigma


Sigma shows how much variation from the average there is (standard deviation). A low sigma indicates the data points are very close to the mean, whereas a high sigma indicates that the data points are spread out over a large range. The deviation is how far a given point is from the average.
Image source: David Chandler, MIT
http://web.mit.edu/newsoffice/2012/explained-sigma-0209.html
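As a small illustration (with made-up numbers, not from the presentation), the sample standard deviation can be computed in Python with NumPy:

    import numpy as np

    scores = np.array([4, 5, 5, 6, 6, 6, 7, 7, 8])  # hypothetical data points
    mean = scores.mean()
    sigma = scores.std(ddof=1)  # sample standard deviation
    print(mean, sigma)          # points clustered near the mean give a small sigma
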
Independent Variables


Independent variables are the causes or predictor variables in the experiment. Technically, they are only referred to as independent variables in a true experiment; in all other types of experimental design they are called predictor variables, as they predict the dependent variables.
Sources
These are a few of my favorite statistics resources:

This one gives great info on how to interpret statistical analyses in SPSS, R, SAS, and Stata (this is my favorite statistical website, I think; you should check it out) :)
http://www.ats.ucla.edu/stat/mult_pkg/whatstat/

Here is my second favorite (as this one is obviously my favorite) interactive decision tree:
http://www.microsiris.com/Statistical%20Decision%20Tree/default.htm

This is a glossary from the above interactive decision tree:
http://www.microsiris.com/Statistical%20Decision%20Tree/Glossary.htm#STATISTICAL MEASURE

I found this website particularly useful for a variety of things. It does a great job of explaining power, effect size, and the like very succinctly:
http://onlinestatbook.com/2/introduction/levels_of_measurement.html
What it does:
With a Chi square goodness of fit test, we can see if the observed (actual experimental) results are different from the expected (hypothesized/predicted) results.
What to report:
You'll want to report the Chi-square value, degrees of freedom, and the p value.
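If you want to try this outside SPSS, here is a minimal Python sketch using scipy (the observed and expected counts are made up for illustration):

    from scipy import stats

    observed = [18, 22, 20, 40]   # hypothetical counts per category
    expected = [25, 25, 25, 25]   # counts predicted by the hypothesis
    chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
    # report the chi-square value, df = len(observed) - 1, and the p value
    print(chi2, p)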

What it does:
The Wilcoxon-Mann-Whitney test is the nonparametric version of the independent samples t-test. It is handy because you don't have to assume that the dependent variable is normally distributed. (The dependent variable does not have to be interval, but it does have to be ordinal.)
What to report:
Make sure to report the z score and p value.
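A minimal Python/scipy sketch, assuming two small made-up groups of ordinal scores:

    from scipy import stats

    group_a = [3, 4, 2, 5, 4, 3]   # hypothetical ordinal scores, group A
    group_b = [5, 6, 5, 4, 6, 5]   # hypothetical ordinal scores, group B
    u, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
    print(u, p)   # report the test statistic (or z score) and the p value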

What it does:
With a Chi square test of independence, you can check for a relationship between two categorical variables. This test assumes that the expected count in each cell is at least 5. If you don't meet this criterion, use Fisher's exact test (only for a 2x2 table).
What to report:
You'll want to report the Chi-square value, degrees of freedom, and the p value.
Hint:
In SPSS, the chi-squared test is under the crosstabs command.
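Outside SPSS, a minimal Python/scipy sketch with a made-up 2x2 table of counts might look like this:

    from scipy import stats

    table = [[12, 8],    # hypothetical counts: rows = variable 1, columns = variable 2
             [5, 15]]
    chi2, p, dof, expected = stats.chi2_contingency(table)
    print(chi2, dof, p)

    # if an expected count falls below 5 and the table is 2x2, fall back to Fisher's exact test
    odds_ratio, p_exact = stats.fisher_exact(table)
    print(p_exact)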

What it does:
With a One-way ANOVA, we can look at the differences in the means of the dependent variable (normally distributed, interval) broken down by levels of the independent variable (2+ categories, categorical).
What to report:
The F value is important (as well as its p value). The means and standard deviations of the variables may also be worth reporting.
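A minimal Python/scipy sketch, assuming three made-up groups on an interval dependent variable:

    from scipy import stats

    low = [10, 12, 11, 13]      # hypothetical scores for each level of the IV
    medium = [14, 15, 13, 16]
    high = [18, 17, 19, 20]
    f, p = stats.f_oneway(low, medium, high)
    print(f, p)   # report F, the degrees of freedom, and the p value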

What it does:
You will want to use the sign test when the variable has only direction information (positive and negative) rather than ordinal (ranked) information. The Wilcoxon is used when there is direction (positive and negative) as well as magnitude information in the data.

(Like the Wilcoxon, this is also a nonparametric test)
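One way to run a sign test by hand in Python is a binomial test on the number of positive differences (a sketch with made-up paired data; scipy.stats.binomtest is assumed to be available, as in recent scipy versions):

    from scipy import stats

    before = [5, 7, 6, 8, 6, 7, 5, 9]   # hypothetical paired observations
    after  = [6, 9, 6, 9, 8, 8, 7, 9]
    diffs = [a - b for a, b in zip(after, before) if a != b]   # drop ties
    n_positive = sum(d > 0 for d in diffs)
    result = stats.binomtest(n_positive, n=len(diffs), p=0.5)  # direction only
    print(result.pvalue)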

What it does:
The Wilcoxon signed rank sum test is the nonparametric version of a paired samples t-test (so you don't have to assume that the difference between the two variables is normally distributed or interval). You can use it to compare the differences between two variables.
What to report:
You'll want to report the z score and p value.
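A minimal Python/scipy sketch, assuming made-up paired measurements:

    from scipy import stats

    before = [12, 15, 11, 14, 13, 16, 12, 15]   # hypothetical paired scores
    after  = [14, 16, 13, 15, 15, 18, 13, 17]
    w, p = stats.wilcoxon(before, after)
    print(w, p)   # report the test statistic (or z score) and the p value
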
What it does:
There are several types of t-tests: independent, paired samples (dependent), and one-sample. This category is specific to the independent samples t-test.
Using the independent samples t-test, we can test to see if a mean (for a normally distributed, interval, dependent variable) is the same for two independent groups.
What to report:
You'll want to report the t value and p value, and perhaps the raw means and standard deviations.
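A minimal Python/scipy sketch, assuming two made-up independent groups:

    from scipy import stats

    group_1 = [10.1, 11.4, 9.8, 12.0, 10.7]   # hypothetical interval scores
    group_2 = [12.3, 13.1, 11.9, 12.8, 13.4]
    t, p = stats.ttest_ind(group_1, group_2)
    print(t, p)   # report t, the p value, and perhaps the group means and SDs
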
What it does:
There are several types of t-tests: independent, paired samples (dependent), and one-sample. This category is specific to the dependent t-test, also known as the paired samples t-test.
Using the dependent (paired) samples t-test, we can test to see if two means (normally distributed) differ from one another.
What to report:
You'll want to report the t value and p value.
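A minimal Python/scipy sketch, assuming made-up repeated measurements on the same subjects:

    from scipy import stats

    pre  = [10, 12, 9, 14, 11, 13]    # hypothetical scores at time 1
    post = [12, 13, 11, 15, 12, 16]   # hypothetical scores at time 2
    t, p = stats.ttest_rel(pre, post)
    print(t, p)   # report the t value and p value
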
What it does:
There are several types of t-tests: independent, paired samples (dependent), and one-sample. This category is specific to the one-sample t-test.
Using the one-sample t-test, we can test to see if a sample mean is significantly different from the hypothesized value.
The t-test follows a Student's t-distribution (versus the z-test, which follows a normal distribution). The t-test is also better for smaller samples than the z-test, and it is more commonly used.
What to report:
You'll want to report the t value and p value.
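A minimal Python/scipy sketch, assuming a made-up sample and hypothesized value:

    from scipy import stats

    sample = [98.2, 99.1, 98.7, 99.5, 98.9, 99.3]    # hypothetical measurements
    t, p = stats.ttest_1samp(sample, popmean=99.0)   # 99.0 = hypothesized mean
    print(t, p)   # report the t value and p value
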
What it does:
The One-way MANOVA is similar to the ANOVA except that there are 2+ dependent variables and one categorical independent variable. It allows us to look at the differences between the dependent variables broken down by level of the independent variable.
What to report:
F values, degrees of freedom, and p values for each of the relationships. This test also provides the Type III Sum of Squares in the output.
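A minimal Python sketch using statsmodels (the column names and data are made up; the MANOVA class is assumed to be available, as in recent statsmodels releases):

    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    # hypothetical data: two dependent variables, one categorical independent variable
    df = pd.DataFrame({
        "dv1":   [4.1, 5.2, 4.8, 6.0, 6.5, 7.1],
        "dv2":   [2.0, 2.4, 2.2, 3.1, 3.5, 3.8],
        "group": ["a", "a", "a", "b", "b", "b"],
    })
    fit = MANOVA.from_formula("dv1 + dv2 ~ group", data=df)
    print(fit.mv_test())   # multivariate test statistics, F values, df, and p values
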
What it does:
Factor analysis is used to either decrease the number of variables or find relationships among variables. With this analysis, we can describe the variability among the variables in terms of factors (a factor is something that may encompass one or more of the variables together, helping to decrease the number of variables overall). This analysis shows the interdependence between variables. This analysis can also be used to test for the validity of a test.
*Note: Factor analysis is related to principal component analysis (PCA), although PCA is a descriptive statistical technique.
*Note: This is a complicated analysis to explain and I would highly recommend looking at a book or taking some time online to really understand it well. With that said, it is a highly useful and often used analysis.
What to report:
Variance accounted for by each factor, communalities (what the variables have in common... the opposite of uniqueness) for the factors, and rotated factor loadings.
Source: http://www.ats.ucla.edu/stat/mult_pkg/whatstat/
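A minimal Python sketch using scikit-learn's FactorAnalysis (the data here are random and made up; dedicated packages such as factor_analyzer add rotation and communalities):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 6))          # hypothetical scores on 6 observed variables
    fa = FactorAnalysis(n_components=2)    # try to explain them with 2 factors
    fa.fit(X)
    print(fa.components_)                  # factor loadings (one row per factor)
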
What it does:
When using a within-subjects independent variable with multiple levels, you use the Friedman test to look for a difference among the repeated scores. The null hypothesis would be that there is no difference between the scores.
What to report:
You'll want to report the chi-square value and p-value.
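A minimal Python/scipy sketch, assuming the same made-up subjects measured under three conditions:

    from scipy import stats

    cond_1 = [5, 6, 7, 4, 6, 5]   # hypothetical scores, condition 1
    cond_2 = [7, 8, 8, 6, 7, 7]   # hypothetical scores, condition 2
    cond_3 = [6, 7, 9, 5, 8, 6]   # hypothetical scores, condition 3
    chi2, p = stats.friedmanchisquare(cond_1, cond_2, cond_3)
    print(chi2, p)   # report the chi-square value and the p value
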
What it does:
Using the z-test, we can test to see if a sample mean is significantly different from a provided constant number or provided mean when we know sigma. The z-test follows a normal distribution (versus the t-test which follows a Student's t-distribution). The z-test is also better for larger samples than the t-test; however, the z-test is less commonly used.
What to report:
You'll want to report the z value and p value.
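A minimal Python sketch computing the z statistic directly, assuming a made-up sample and a known sigma:

    import math
    from scipy import stats

    sample = [102, 98, 105, 99, 101, 104, 97, 103]   # hypothetical scores
    mu_0 = 100.0     # hypothesized population mean
    sigma = 15.0     # known population standard deviation
    n = len(sample)
    z = (sum(sample) / n - mu_0) / (sigma / math.sqrt(n))
    p = 2 * stats.norm.sf(abs(z))   # two-tailed p value
    print(z, p)
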
What it does:
In a two-way ANOVA, you are comparing the mean differences between groups that are split on two independent variables. It allows you to look for an interaction between the two independent variables on the dependent variable.

Here's a good review of the assumptions to make sure you meet before beginning: https://statistics.laerd.com/spss-tutorials/two-way-anova-using-spss-statistics.php

A between-subjects 2-way ANOVA just means that both of the independent variables are between subjects.

What to report:
You'll want to report the F values (for main effects and interaction), degrees of freedom, and p-values.
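A minimal Python sketch using statsmodels (the formula, column names, and data are made up):

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # hypothetical data: two between-subjects factors and one interval dependent variable
    df = pd.DataFrame({
        "y": [3, 4, 6, 7, 5, 6, 9, 10],
        "a": ["low", "low", "high", "high", "low", "low", "high", "high"],
        "b": ["ctrl", "drug", "ctrl", "drug", "ctrl", "drug", "ctrl", "drug"],
    })
    model = ols("y ~ C(a) * C(b)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))   # F, df, p for main effects and interaction
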
What it does:
In a two-way ANOVA, you are comparing the mean differences between groups that are split on two independent variables. It allows you to look for an interaction between the two independent variables on the dependent variable.

Here's a good review of the assumptions to make sure you meet before beginning: https://statistics.laerd.com/spss-tutorials/two-way-anova-using-spss-statistics.php

A within-subjects 2-way ANOVA just means that both of the independent variables are within subjects.

What to report:
You'll want to report the F values (for main effects and interaction), degrees of freedom, and p-values.
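A minimal Python sketch using statsmodels' AnovaRM for a fully within-subjects design (the long-format data below are made up):

    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    # hypothetical long-format data: each subject is measured under every combination
    df = pd.DataFrame({
        "subject": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
        "time":    ["pre", "pre", "post", "post"] * 3,
        "task":    ["easy", "hard"] * 6,
        "y":       [5, 7, 6, 9, 4, 6, 5, 8, 6, 8, 7, 10],
    })
    res = AnovaRM(df, depvar="y", subject="subject", within=["time", "task"]).fit()
    print(res)   # F values, df, and p values for main effects and the interaction
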
What it does:
In a two-way ANOVA, you are comparing the mean differences between groups that are split on two independent variables. It allows you to look for an interaction between the two independent variables on the dependent variable.

Here's a good review of the assumptions to make sure you meet before beginning: https://statistics.laerd.com/spss-tutorials/two-way-anova-using-spss-statistics.php

A mixed 2-way ANOVA just means that one independent variable is between subjects and the other is within subjects.

What to report:
You'll want to report the F values (for main effects and interaction), degrees of freedom, and p-values.
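One option for running this in Python is the third-party pingouin package (a sketch with made-up long-format data; the mixed_anova function and its arguments are an assumption about that library, so check its documentation):

    import pandas as pd
    import pingouin as pg

    # hypothetical long-format data: 'group' is between subjects, 'time' is within subjects
    df = pd.DataFrame({
        "subject": [1, 1, 2, 2, 3, 3, 4, 4],
        "group":   ["ctrl", "ctrl", "ctrl", "ctrl", "drug", "drug", "drug", "drug"],
        "time":    ["pre", "post"] * 4,
        "y":       [5, 6, 4, 5, 5, 9, 6, 10],
    })
    print(pg.mixed_anova(data=df, dv="y", within="time",
                         subject="subject", between="group"))
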
Multivariate Multiple Linear Regression
What it does:
This is used to predict 2 or more dependent variables from two or more independent variables. It estimates a single regression model with multiple outcome (dependent) variables.
What to report:
F values, degrees of freedom, and p values for each of the relationships. This test also provides the Type III Sum of Squares in the output.
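A minimal Python sketch using scikit-learn's LinearRegression, which can fit several outcome columns at once (the data are made up; this gives coefficients per dependent variable but not the F and p output a dedicated statistics package would report):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 2))                 # two hypothetical predictors
    true_coefs = np.array([[1.0, 0.5],
                           [0.2, 2.0]])
    Y = X @ true_coefs + rng.normal(size=(50, 2))  # two hypothetical dependent variables
    model = LinearRegression().fit(X, Y)
    print(model.coef_)    # one row of estimated coefficients per dependent variable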