The purpose of Statistics is to

DECIDE: yes or no? (hypothesis tests, p-values)

DESCRIBE: what's going on? (descriptive statistics: graphs and basic numbers)

PREDICT/EXPLAIN: what's the formula? (modelling and regression)

ESTIMATE: what's this number? (confidence intervals)

How can I calculate a student's grade based on their number of hours of sleep during semester?

What sorts of things might be related to whether a person does volunteer work?

What sort of relationship might the amount of sleep a student gets have with their grades?

What possibilities are there for body temperature after a meal with or without chilli?

How many chapters do novels have?

How can I explain a person's body temperature after a meal using their temperature before and the chilli content of the meal?

How can I use a person's gender, age, income and religion to predict their chances of participating in volunteer work?

Does getting more sleep affect a student's grades?

Are women more likely to participate in volunteer work than men?

Is your body temperature higher after a meal if it has chilli in it?

Is the median number of chapters in a novel 20?

Know the type of question and you can choose what type of statistics ...

The purpose of Statistics is to

**ANSWER QUESTIONS**

using DATA

Know more about your data and you can choose what statistical method ...

VARIABLES IN THE DATA: what type? what distribution?

HOW THE DATA IS COLLECTED: what is done to the subjects? when is information recorded? how are the subjects chosen?

HOW MUCH DATA: lots of things recorded per subject? lots of subjects?

VARIABLES (things you record)

Categorical / Qualitative (words: how far apart has no meaning)

Nominal (names: "more" or "less" has no meaning)

Ordinal (ordered: "more" or "less" has meaning)

Numerical / Quantitative / Scale / Interval (numbers: how far apart has meaning)

Continuous (measured)

Discrete (counted)

**Some information you need to know to figure out what stats you need.**

**THE RIGHT QUESTIONS ABOUT STATISTICS**

**Maths Learning Centre**

how to measure?

DISTRIBUTION of NUMERICAL data

(how the possible values are spread out)

Approximately normal: a parametric test will be fine

Skewed or worse: a non-parametric test might be more appropriate
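One rough way to check which description fits is to compute the sample skewness. The sketch below is only an illustration: the function names, the made-up data and the cutoff of 1 are all assumptions rather than anything from the original, and a low skewness only suggests symmetry, not normality. A histogram is usually a better guide than any single number.

```python
from statistics import mean, pstdev


def sample_skewness(values):
    """Return the population-style skewness: the mean of the cubed z-scores."""
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return 0.0  # all values identical, so there is no skew to measure
    return sum(((v - centre) / spread) ** 3 for v in values) / len(values)


def suggest_test_family(values, cutoff=1.0):
    """Hypothetical rule of thumb: |skewness| below the cutoff counts as roughly symmetric."""
    if abs(sample_skewness(values)) < cutoff:
        return "parametric"
    return "non-parametric"


# Made-up data for illustration only.
roughly_symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 20]

print(suggest_test_family(roughly_symmetric))
print(suggest_test_family(right_skewed))
```

The cutoff is a parameter precisely because there is no single agreed value; treat the result as a prompt to look at the data, not as a decision.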

**KNOW**

YOUR QUESTION

**KNOW**

YOUR DATA

HOW TO ORGANISE DATA

becomes...

missing data?

**The purpose of Statistics is to**

**ANSWER QUESTIONS**

using DATA

**...**

the question has to be specific ... the data has to be specific


Statisticians say:

"PLEASE make it consistent!"

defining groups or measurements?

**ANSWER QUESTIONS**

using DATA

WHAT CATEGORIES DEFINE

Independent Groups

Repeated Measures (matched pairs)

On average, how much of an effect does 30 minutes more sleep have on a student's grades?

How much more (or less) likely is a woman to participate in volunteer work than a man?

How much higher is your body temperature after a chilli meal compared to one without?

What is the median number of chapters in a novel?

Actually it's a way to cope with VARIATION

Statistics helps to find descriptions and explanations for how things VARY, using the information from data.

This means we can use it to answer questions in situations where things do vary.

Questions are about CONCEPTS

About the relationship between concepts

Are science fiction books thicker than other books?

Does getting more sleep affect a student's grades?

Are women more likely to participate in volunteer work than men?

About a single concept

What is the median number of chapters in a novel?

How much salt is there in an avocado?

What percentage of children finish all their homework?

**KNOW**

YOUR STATS

**This information helps you choose which stats to use.**

Note: This probably doesn’t matter if you have a lot of data.

Likert scale: a non-parametric test probably won't work, so you might have to treat the data as categorical.

OBSERVATIONAL OR EXPERIMENT

Observational study:

the only thing you did to the subjects while you were watching them was record information about them.

Experiment:

you made a choice at least once to do something that might influence the outcome (possibly you did this randomly)

RANDOMNESS

Random selection:

you chose the subjects randomly from a population, or at least they are independent of each other.

Random allocation:

you chose which subject got what treatment randomly.
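Random allocation can be sketched in a few lines. This is a minimal illustration using the chilli meal example; the function name, the subject labels and the seed are assumptions made for the example, and real studies often use more careful schemes (such as blocking).

```python
import random


def allocate_randomly(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for position, subject in enumerate(shuffled):
        groups[treatments[position % len(treatments)]].append(subject)
    return groups


# Hypothetical subject labels, for illustration only.
subjects = [f"subject_{i}" for i in range(1, 11)]
groups = allocate_randomly(subjects, ["chilli", "no chilli"], seed=1)
for treatment, members in groups.items():
    print(treatment, members)
```

Dealing out a shuffled list keeps the group sizes as equal as possible, which a coin flip per subject would not guarantee.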

OUTCOMES and EXPLANATORY VARIABLES

outcome or explanatory?

Explanatory Variables (also known as predictor variables or independent variables)

Outcome Variable (also known as response variable or dependent variable)

Hypothesis tests are designed to answer yes or no questions.

One of the answers (yes or no) is called the NULL HYPOTHESIS.

The p-value can be thought of as a measure of how consistent your data is with the null hypothesis, so a LOWER p-value is STRONGER evidence against the null hypothesis.

The SIGNIFICANCE LEVEL is the cutoff for the p-value where you decide the answer is yes or no. If the p-value is below the cutoff you REJECT the null hypothesis; otherwise you RETAIN it.

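p-value below 0.01: very strong evidence against the null hypothesis

p-value between 0.01 and 0.05: strong evidence

p-value between 0.05 and 0.10: some evidence

p-value above 0.10: no evidence

p-value below the significance level: reject. p-value above the significance level: retain.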
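As a small worked example, take the question "Is the median number of chapters in a novel 20?" with the null hypothesis that the median is 20. The sketch below uses a sign test, which is just one possible choice of test and is not named in the original; the chapter counts are made up, and the function names and the 0.05 significance level are assumptions for illustration.

```python
from math import comb


def sign_test_p_value(values, hypothesised_median):
    """Two-sided sign test: are values above and below the median balanced?"""
    above = sum(v > hypothesised_median for v in values)
    below = sum(v < hypothesised_median for v in values)
    n = above + below  # values equal to the hypothesised median are dropped
    if n == 0:
        return 1.0
    k = min(above, below)
    # Probability of a split at least this lopsided if each side is equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def decide(p_value, significance_level=0.05):
    """Reject the null hypothesis when the p-value is below the cutoff."""
    return "reject" if p_value < significance_level else "retain"


# Made-up chapter counts, for illustration only.
chapters = [12, 15, 18, 22, 25, 27, 30, 31, 33, 35, 40, 42]
p_value = sign_test_p_value(chapters, 20)
print(round(p_value, 3), decide(p_value))
```

With these invented numbers, nine of the twelve novels have more than 20 chapters, which is not lopsided enough to reject the null hypothesis at the 0.05 level.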

Confidence intervals are designed to answer "what's the number" questions.

They give a range of possible values for what a number in the population could be.

They list all of the values that mathematically seem consistent with the data.

INSIDE the interval: appears consistent with data

OUTSIDE the interval (on either side): not consistent with data
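To make this concrete for "What is the median number of chapters in a novel?", here is a minimal sketch of a percentile bootstrap interval. The bootstrap is only one of several ways to build a confidence interval and is not named in the original; the chapter counts, the function name, the number of resamples and the seed are all assumptions for illustration.

```python
import random
from statistics import median


def bootstrap_median_interval(values, confidence=0.95, resamples=2000, seed=0):
    """Percentile bootstrap: resample with replacement and keep the middle medians."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    medians = sorted(
        median(rng.choices(values, k=len(values))) for _ in range(resamples)
    )
    tail = (1 - confidence) / 2
    lower = medians[int(tail * resamples)]
    upper = medians[int((1 - tail) * resamples) - 1]
    return lower, upper


# Made-up chapter counts, for illustration only.
chapters = [12, 15, 18, 22, 25, 27, 30, 31, 33, 35, 40, 42]
lower, upper = bootstrap_median_interval(chapters)
print(f"median = {median(chapters)}, interval = ({lower}, {upper})")
```

Values between `lower` and `upper` are the ones that appear consistent with the data; with a sample this small the interval will be wide.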

HYPOTHESIS TESTS

CONFIDENCE INTERVALS