**F distribution**

T distribution

Chi-square distribution

Normal distribution


**Normal distribution:**

a theoretical continuous probability distribution of a random variable


**Statistics**

Level of measurement: Nominal, Ordinal, Interval

Number of variables: Univariate, Bivariate, Multivariate

Purpose: Descriptive, Inferential

THE NORMAL DISTRIBUTION

By Susana Muniz

AS 313

T-TH 3:00 - 4:00 pm.

DESCRIPTIVE STATISTICS

To describe groups of people, countries, cars, and so on...

Frequency Distributions

Measures of Central Tendency

Measures of Dispersion

INFERENTIAL STATISTICS

To make guesses/ inferences about a population from sample data

To know how much we can rely on such inferences

POPULATION

sample

**Normal distribution:**

a theoretical continuous probability distribution of a random variable


Frequency distribution:

an arrangement of data that shows all the possible outcomes of a variable and the number of times each outcome is observed

**Probability distribution:**

an arrangement of data that shows all the possible outcomes of a variable and their probabilities.


Probability: the likelihood that something will happen; the likelihood of an outcome.


Probabilities go from 0 to 1.

0 means that the outcome is impossible.

1 means that the outcome is certain.

e.g.

Probability of being struck by lightning: 0.000006

Probability of being in prison (males): 0.005

Probability of being female at birth: 0.5

Probability of dying, for human beings: 1

The larger the number, the more probable the outcome, and the smaller, the less likely to happen.

$$P = \frac{\text{\# of times something can happen}}{\text{total observations of all possible outcomes}} = \frac{f}{n}$$
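As a minimal sketch of the f / n rule, the snippet below turns a frequency distribution into a probability distribution. The list of observations is made up purely for illustration, and the function name `probability_distribution` is an assumption, not something from the slides.

```python
from collections import Counter

# Hypothetical observations of a variable (invented for illustration only).
observations = ["female", "male", "female", "female", "male",
                "female", "male", "male", "female", "female"]


def probability_distribution(values):
    """Convert observed outcomes into probabilities using P = f / n."""
    frequencies = Counter(values)  # f: number of times each outcome is observed
    n = len(values)                # n: total observations of all outcomes
    return {outcome: f / n for outcome, f in frequencies.items()}


probabilities = probability_distribution(observations)
print(probabilities)
```

Each probability falls between 0 and 1, and the probabilities of all possible outcomes add up to 1.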

**Discrete**

**Continuous**

**Binomial distribution**

Hypergeometric distribution

Poisson distribution


**Normal distribution:**

a theoretical continuous probability distribution of a random variable


**Continuous**

**SHAPE**

**Theoretical**

A characteristic or feature of a case (e.g. a person) can be:

Constant: only 1 possible outcome, e.g. species: Homo sapiens

Variable: 2 or more possible outcomes, e.g. gender: male/female

Random variable: when the possible outcomes happen by chance.


**Features:**

1) It is unimodal

2) Its mean, median and mode have the same value

3) It is symmetric about its mean

4) It is non-zero over the entire real line


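[Figure: a symmetrical curve has 50% of the area on each side of the mean; a non-symmetrical curve has more than 50% on one side and less than 50% on the other.]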

**So how do we know if an empirical distribution looks like the normal distribution?**

FIRST OF ALL, BY VISUAL INSPECTION!


**NORMAL OR NOT?**

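Skewness:

a measure of symmetry.

If the distribution is normal, its value is zero.

The less normal it is, the further the value moves from zero, in the positive or negative direction.

Kurtosis:

a measure of "peakedness".

If the distribution is normal, its value is zero (this is excess kurtosis, measured relative to the normal curve).

The less normal it is, the further the value moves from zero, in the positive or negative direction.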
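To make the two measures concrete, here is a rough sketch that computes them from the central moments of a sample. It assumes the simple population-moment formulas; statistical packages often apply small-sample corrections, so their output can differ slightly from these values. The data lists are invented for illustration.

```python
from statistics import fmean


def central_moment(values, order):
    """Average of (x - mean) ** order over all values."""
    mean = fmean(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: 0 for a perfectly symmetric sample."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so the normal curve scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


symmetric = [1, 2, 3, 4, 5]      # hypothetical, symmetric about 3
right_skewed = [1, 1, 1, 2, 10]  # hypothetical, long tail to the right

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed), excess_kurtosis(right_skewed))
```

A positive skewness points to a longer right tail, and a negative one to a longer left tail.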

**Skewness = 0.61**, Kurtosis = -0.48

**Skewness = -0.14**, Kurtosis = 0.9

**Skewness = 5.13**, Kurtosis = 27.19

**Normal distribution:**

a theoretical continuous probability distribution of a random variable.

Features:

1) It is unimodal

2) Its mean, median and mode have the same value

3) It is symmetric about its mean

4) It is non-zero over the entire real line


Instead of using a scale such as years, income, scores, and so on, the standard normal distribution uses STANDARD DEVIATION UNITS, or Z SCORES.

Z-score:

a value in a distribution expressed as its distance from the mean, in standard deviation units.

**Why do we bother transforming original scores into Z-scores?**

**Because we want to estimate probabilities!**

We want to know how likely or unlikely something is to happen!

**% ≈ proportions ≈ probabilities**

The areas under the normal curve are constant.

Between the mean (z=0) and +1 standard deviation (z=+1) there will always be 34.13% of the total area under the normal curve.
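The 34.13% figure can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `area_between` is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def area_between(z_low, z_high):
    """Area under the standard normal curve between two Z-scores."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)


# Area between the mean (z = 0) and one standard deviation above it (z = +1).
print(round(area_between(0, 1), 4))  # 0.3413
```

Because the curve is symmetric, the area between z = -1 and the mean is the same.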

Below a score

Between two scores

Outside two scores

**Thank you.**

Find this presentation at: www.prezi.com

If you have any questions, come to my office hours!

MSc. Susana Muniz

AS 313, T-TH, 3:00 to 4:00


**Theoretical Distribution**

Perfect. It does not refer to a concrete phenomenon. It is an abstraction, a model.

In the normal curve, the probabilities of finding values within a specific range of standard deviations are constant.

**Empirical Distribution**

It comes from real data. It refers to a real phenomenon.

An empirical distribution will never be normal. It may look normal, be fairly normally distributed, or be called normal-like or almost normal, but it will never be completely, perfectly normal.

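**Standard normal distribution**

| Std. deviations (Z scores) | -3 | -2 | -1 | 0 | 1 | 2 | 3 |
|---|---|---|---|---|---|---|---|
| Age (years old), $\bar{X}=25$, $s=7$ | 4 | 11 | 18 | 25 | 32 | 39 | 46 |
| Exam scores (points), $\bar{X}=150$, $s=30$ | 60 | 90 | 120 | 150 | 180 | 210 | 240 |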
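The age and exam-score scales above line up with the standard deviation scale because each value is the mean plus a number of standard deviations. A small sketch, with the helper name `z_to_x` chosen only for this example:

```python
def z_to_x(z, mean, sd):
    """Convert a Z-score back to the original scale: X = mean + Z * sd."""
    return mean + z * sd


z_values = [-3, -2, -1, 0, 1, 2, 3]

# Means and standard deviations taken from the slide.
ages = [z_to_x(z, mean=25, sd=7) for z in z_values]
exam_scores = [z_to_x(z, mean=150, sd=30) for z in z_values]

print(ages)         # [4, 11, 18, 25, 32, 39, 46]
print(exam_scores)  # [60, 90, 120, 150, 180, 210, 240]
```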

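$$Z = \frac{X - \bar{X}}{s}$$

Find the value of Z for X = 32 years old ($\bar{X} = 25$, $s = 7$):

$$Z = \frac{32 - 25}{7} = \frac{7}{7} = 1$$

Find the value of Z for X = 210 points ($\bar{X} = 150$, $s = 30$):

$$Z = \frac{210 - 150}{30} = \frac{60}{30} = 2$$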

Z scores

Find the value of Z for X = 90, given that $\bar{X} = 150$ and $s = 30$:

$$Z = \frac{X - \bar{X}}{s} = \frac{90 - 150}{30} = \frac{-60}{30} = -2$$
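The same transformation can be written as a short function. This is only a sketch; the name `z_score` and the guard against a non-positive standard deviation are choices made for this example.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, expressed in standard deviation units."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / sd


print(z_score(32, mean=25, sd=7))     # 1.0
print(z_score(210, mean=150, sd=30))  # 2.0
print(z_score(90, mean=150, sd=30))   # -2.0
```

A negative Z-score means the value lies below the mean; a positive one means it lies above.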

Between -1 and +1 standard deviations from the mean (z=0), there will always be 68.26% of the area (P = 0.6826): 34.13% (P = 0.3413) on each side of the mean.

What is the probability of X = 90 (Z = -2) or lower?

2.14% + 0.13% ≈ 2.28%, or P = 0.0228 (the rounded pieces add up to 2.27%; the table value for the area beyond Z = 2.00 is 0.0228)

What is the probability of X = 90 or higher?

13.59% + 34.13% + 34.13% + 13.59% + 2.14% + 0.13% ≈ 97.72%, or P = 0.9772

Also: 100% - 2.28% = 97.72%, or P = 0.9772
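These two probabilities can also be obtained without adding up slices of the curve. The sketch below assumes exam scores follow a normal distribution with mean 150 and standard deviation 30, as in the example, and uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# Assumed model for the exam scores in the example: mean 150, s = 30.
exam_scores = NormalDist(mu=150, sigma=30)

# Probability of a score of 90 or lower (area below Z = -2).
p_at_or_below_90 = exam_scores.cdf(90)

# Probability of a score of 90 or higher (everything else under the curve).
p_at_or_above_90 = 1 - p_at_or_below_90

print(round(p_at_or_below_90, 4), round(p_at_or_above_90, 4))
```

The two results always add up to 1, which is the "100 - 2.28" shortcut in the slide.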

Above a score

Find the probability of finding a value between X = 100 and X = 130, given $\bar{X} = 150$ and $s = 30$.

$$Z = \frac{X - \bar{X}}{s}$$

1) $Z = \frac{100 - 150}{30} = \frac{-50}{30} = -1.67$

2) $Z = \frac{130 - 150}{30} = \frac{-20}{30} = -0.67$

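Look at The Table!!!!

Figure A.1 Area between Mean and Z

| Z | (b) Area between mean and Z |
|---|---|
| ... | ... |
| 1.65 | 0.4505 |
| 1.66 | 0.4515 |
| 1.67 | 0.4525 |
| ... | ... |

| Z | (b) Area between mean and Z |
|---|---|
| ... | ... |
| 0.65 | 0.2422 |
| 0.66 | 0.2454 |
| 0.67 | 0.2486 |
| ... | ... |

Both scores lie on the same side of the mean, so subtract the smaller area from the larger one:

**0.4525 - 0.2486 = 0.2039**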
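As a cross-check on the table lookup, the sketch below computes the area between the two scores directly with `statistics.NormalDist`. It shows two versions: one using the Z-scores rounded to two decimals, as the table requires, and one using the unrounded scores. The small gap between them comes only from rounding Z.

```python
from statistics import NormalDist

standard_normal = NormalDist()              # mean 0, standard deviation 1
exam_scores = NormalDist(mu=150, sigma=30)  # assumed model from the example

# Using the rounded Z-scores from the worked example (-1.67 and -0.67).
p_rounded_z = standard_normal.cdf(-0.67) - standard_normal.cdf(-1.67)

# Using the original scores directly, with no rounding of Z.
p_exact = exam_scores.cdf(130) - exam_scores.cdf(100)

print(round(p_rounded_z, 4), round(p_exact, 4))
```

The rounded version agrees with the table result to about three decimal places; the unrounded one is slightly larger.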

What is the value of X at the 90th percentile? ($\bar{X} = 150$, $s = 30$)

Median = 50th percentile, so the area between the mean and Z must be 0.90 - 0.50 = 0.40. Z?

Look at The Table!!!!

Figure A.1 Area between the mean and Z

| Z | (b) Area between mean and Z |
|---|---|
| 1.27 | 0.3980 |
| 1.28 | 0.3997 |
| 1.29 | 0.4015 |
| 1.30 | 0.4032 |
| ... | ... |

Z = 1.28

$$X = \bar{X} + Z(s) = 150 + 1.28(30) = 150 + 38.4 = 188.4$$

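What is the value of X at the 30th percentile?

The 30th percentile is below the mean, so the area between Z and the mean is 0.50 - 0.30 = 0.20.

| Z | (b) Area between mean and Z |
|---|---|
| 0.51 | 0.1950 |
| 0.52 | 0.1985 |
| 0.53 | 0.2019 |

Z = 0.52 (below the mean, so it is subtracted)

$$X = \bar{X} - Z(s) = 150 - 0.52(30) = 150 - 15.6 = 134.4$$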
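Going from a percentile back to a score is the inverse of the table lookup. A minimal sketch, assuming the same normal model (mean 150, s = 30) and using `NormalDist.inv_cdf` from the Python standard library; the function name `score_at_percentile` is made up for this example.

```python
from statistics import NormalDist

# Assumed model for the exam scores in the example: mean 150, s = 30.
exam_scores = NormalDist(mu=150, sigma=30)


def score_at_percentile(percentile):
    """Score below which the given percentage of the distribution falls."""
    return exam_scores.inv_cdf(percentile / 100)


x_90 = score_at_percentile(90)
x_30 = score_at_percentile(30)
x_50 = score_at_percentile(50)  # the median equals the mean

print(round(x_90, 1), round(x_30, 1), round(x_50, 1))
```

The results differ a little from the hand calculations because the table forces Z to two decimals (1.28 and 0.52), while `inv_cdf` works with the unrounded Z.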

**Let's do some SPSS!**


[Figure: normal curve marking X = 100 (Z = -1.67, area between the mean and Z = 0.4525) and X = 130 (Z = -0.67, area between the mean and Z = 0.2486).]

Now what do I do?

The normal curve: definition and characteristics.

Z-scores: what they are, and how to transform X to Z-scores and vice versa.

Probabilities: how to calculate probabilities from Z-scores, and how to find X and Z-scores from probabilities.

Look at

The Table!!!!

Table. Area under the Normal Curve

FIGURE A.1 Area Between Mean and Z (column b)

FIGURE A.2 Area Beyond Z (column c)

| Z | (b) Area Between Mean and Z | (c) Area Beyond Z |
|---|---|---|
| 0.00 | 0.0000 | 0.5000 |
| 0.01 | 0.0040 | 0.4960 |
| 0.02 | 0.0080 | 0.4920 |
| 0.03 | 0.0120 | 0.4880 |
| ... | ... | ... |
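The two columns of the table are related: for any Z, column (b) plus column (c) equals 0.5000, half of the curve. The sketch below regenerates table rows under that reading, using `statistics.NormalDist`; the function name `table_row` is an assumption for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def table_row(z):
    """Return (Z, area between mean and Z, area beyond Z), rounded like the table."""
    area_between = standard_normal.cdf(z) - 0.5  # column (b)
    area_beyond = 1 - standard_normal.cdf(z)     # column (c)
    return (round(z, 2), round(area_between, 4), round(area_beyond, 4))


# First rows of the table: Z = 0.00, 0.01, 0.02, 0.03.
rows = [table_row(i / 100) for i in range(4)]
for row in rows:
    print(row)

print(table_row(2.0))  # (2.0, 0.4772, 0.0228)
```

Because the curve is symmetric, the same row serves for a negative Z: the areas for Z = -2.00 are read from the Z = 2.00 row.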

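Find P for Z = -2.00

In the table, Z = 2.00 has an area between the mean and Z of 0.4772 and an area beyond Z of 0.0228. Each half of the curve has an area of 0.5000.

Below Z = -2.00: P = 0.0228

Above Z = -2.00: 0.4772 + 0.5000, so P = 0.9772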

0.2039

**Empirical**

Above the mean

Below the mean