**Statistics for Social Research**

**Professor McCabe**

**Measurement and the Collection of Data**

**Describing Data and Distributions**

Visualizing Quantitative Data

**Introduction to Probability Theory**

Concept: Observation and the Unit of Analysis

Concept: Variable

Concept: Levels of Measurement

1. Nominal - Categories, Unranked

2. Ordinal - Categories, Ranked

3. Interval - Continuous, no true zero

4. Ratio - Continuous, true zero

A. Population of Towns in Pennsylvania

B. Religious Denominations

(Christian, Jewish, Muslim, Buddhist)

C. Grade in High School

(Freshman, Sophomore, Junior, Senior)

D. Grade on the Final Exam

(100, 99, 98, 97, etc.)

A. Genres of movies

B. Average temperature in DC in August

C. Number of students in each major

at Georgetown

D. Ways to cook your steak

(Well-done, medium-well, medium, etc.)

A. Feeling Thermometer

(between 0 - 100, how do you feel about ... )

B. Positions in the Church Hierarchy

(e.g., Bishop, Archbishop, Pope, etc.)

C. Marital Status

(e.g., Married, Single, Divorced, Widowed)

D. Amount of money in your wallet

**Concept: Three Key Measures of Central Tendency**

Mean

Median

Mode

**Concept: Mean**

**Concept: Median**

**Concept: Mode**

Concept: Collecting Data

Concept: Reliability

Concept: Validity

**Introducing Statistics**

**Samples and Populations**

**Confidence Intervals**

**Statistical Inference and Hypothesis Testing**

**Comparing Means: t-tests and ANOVA**

**Relationships of Nominal Variables**

**Correlation and Linear Regression**

**Introducing Multiple Regression**

**Thinking about Causality in Social Research**

**Reading Social Research**

Over 3 percent of people in Washington are HIV+.

More than 75 percent of people with HIV are African-American, even though African-Americans account for less than fifty percent of the population of Washington, DC.

Last year, Washington, DC supported 110,000 tests for HIV, nearly triple the number of tests supported in 2006.

(Statistics from the DC Department of Health)

The team average batting average for the Washington Nationals last season was 0.258.

For players with at least forty at-bats, the batting average ranges from 0.318 (Jayson Werth) to 0.122 (Gio Gonzalez).

Four players have at least 100 hits and a dozen home runs.

(Statistics from the Washington Nationals website)

The more churches in a city, the more crime there is. Thus, churches lead to crime.

A major flaw is that both the larger number of churches and the higher crime rates can be explained by larger populations. In bigger cities, there are both more churches and more crime. This problem, which we will discuss in more detail in a later module, is known as the third variable problem. Namely, a third variable can cause both situations; however, people erroneously believe that there is a causal relationship between the two primary variables.

Statistics refers to more than just the numbers and facts that are collected and reported; instead, statistics refers to the set of tools and techniques social scientists use for collecting, describing, analyzing, and interpreting data about the world around us.

- How do we collect information on the HIV+ population in DC? (Hint: We don't know the HIV status of every person; instead, we make decisions about how to select a sample, make a claim about the population, etc.)

- How do we decide what baseball statistics to collect, record and make decisions based on? For every statistic we track, there are plenty of others that we don't track, or at least don't use in our decision-making process (e.g., when fans decide who should join All-Star teams, reporters decide who is "hot," or management decides who to sign).

Become thoughtful, careful young social science researchers, able to consider ways of collecting and analyzing data, and using quantitative data to make claims about the social world.

Become critical consumers of statistical information, questioning the source, content and claims of quantitative data from the world around you.

Course Goals:

•Identify ways that social scientists collect, describe and analyze data about the social world;

•Create and critically evaluate visual displays of information, including charts, graphs and other visual tools;

•Explain the importance of sampling for making statistical inferences about broader populations;

•Conduct various statistical tests for evaluating the relationship between variables;

•Differentiate between correlation and causation, recognizing the importance of causal inference to social research, as well as the limitations of generating causal estimates;

•Consume statistical information in your everyday lives with a critical eye toward the source of that data and the legitimacy of the research claims.

Why use statistics?

1. ... to describe events (Descriptive Statistics)

2. ... to make inferences about a population (Inferential Statistics)

3. ... to observe patterns or relationships among variables.

4. ... to predict future events, or quantify the likelihood of a particular event occurring.

Wait! Can I do statistics if I'm not very good at math?

Yes! For this course, I will assume only a basic set of math skills.

Addition, Multiplication, Division

Squares, Exponents and Square Roots

Basic Linear Algebra

Some of the notation may be unfamiliar.

More important than your math skills, I think, are developing your critical thinking and analytical skills.

"There are three kinds of lies -

lies, damned lies, and statistics."

- Mark Twain, quoting British Prime

Minister Benjamin Disraeli

Descriptive Statistics

1. Winning times for the Olympics marathon.

2. Racial demographics of the Georgetown student body

3. The percentage of Washington, DC public elementary students passing their standardized exams each year.

Inferential Statistics - Using a sample, or a subset of a larger set of data, to draw inferences about the larger population

1. To test whether women approve of President Obama at higher rates than men.

2. To test whether attitudes on social issues (e.g., abortion, gay marriage or racial profiling) reach a certain threshold.

Concept: Independent vs. Dependent Variable

1. Finding Statistics in Everyday Life

2. Well, what are statistics? (And can I do statistics if I hate math?)

3. Fine, but why would I use statistics?

4. What should I expect to get out of this course?

Professor Smiley

Professor Sweater Vest

Professor Shabby Tie

**Professor**

Smiley

5 5 5 5 4

4 4 4 4 4

3 3 3 3 3

3 2 2 2 2

Professor

Sweater Vest

5 5 4 4 4

4 4 4 4 4

4 4 4 4 4

4 4 4 3 3

Professor

Shabby Tie

5 5 5 5 5

5 5 5 5 5

5 3 3 3 1

1 1 1 1 1

Displaying Data Badly

Question: Can you explain the public option (in the health care debate)?

**Blackjack**

**Roulette**

Probability Question #1:

If I randomly pick one card from a deck, what is the probability of picking the Ace of Spades? What is the probability of picking either the two of hearts or the Ace of Spades? What are the odds of picking a card other than an Ace?

Probability Question #2:

Assuming the chances of having a boy and having a girl are the same, what are the chances my first child will be a boy? Assuming my first child was a boy, what are the chances my second child will be a boy? Assuming my first and second child are both boys, what are the chances that my third child will be a girl?

Probability Question #3:

Assume that I plan to have three children. Before having any children, what are the chances that my first child will be a boy, the second child will be a boy, and the third child will be a girl? Now, regardless of the order, what is the probability that I will end up with two boys and a girl?
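One way to check an answer to Question #3 is to list every possible birth order and count. The sketch below assumes, as the question does, that boys and girls are equally likely and that births are independent; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Every equally likely birth order for three children (B = boy, G = girl).
outcomes = list(product("BG", repeat=3))

# Exact order: boy, then boy, then girl.
p_bbg = Fraction(sum(1 for o in outcomes if o == ("B", "B", "G")), len(outcomes))

# Two boys and one girl, in any order.
p_two_boys_one_girl = Fraction(sum(1 for o in outcomes if o.count("B") == 2), len(outcomes))

print(p_bbg)                # 1/8
print(p_two_boys_one_girl)  # 3/8
```

Counting outcomes works here only because all eight birth orders are equally likely.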

Probability Question #4:

The World Series is played until a team wins four games (with the maximum number of possible games being seven). Assuming that each team is equally likely to win each game, and that the games are independent events, what is the probability of having a four-game, five-game, six-game and seven-game World Series?

| Length (games) | Theoretical Probability | Expected Number (out of 92) | Actual Number |
|---|---|---|---|
| 4 | 1/8 | 11.5 | 18 |
| 5 | 1/4 | 23.0 | 20 |
| 6 | 5/16 | 28.8 | 20 |
| 7 | 5/16 | 28.8 | 34 |
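The theoretical probabilities in the table above can be reproduced with a short calculation. This is a sketch under the question's assumptions (each game is a 50/50 coin flip and games are independent); the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def series_length_probability(games: int) -> Fraction:
    """Probability that a best-of-seven series lasts exactly `games` games."""
    # The eventual winner takes the final game plus exactly 3 of the earlier
    # games; the factor of 2 covers either team being the winner.
    return 2 * comb(games - 1, 3) * Fraction(1, 2) ** games

total_series = 92  # number of series covered by the table
for games in range(4, 8):
    p = series_length_probability(games)
    expected = float(p) * total_series
    print(f"{games} games: probability {p}, expected {expected:.2f} of {total_series}")
```

Comparing the expected counts with the actual counts shows more four-game and seven-game series than the coin-flip model predicts.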

48% of Americans approve of the job the United States Supreme Court is doing.

(Fox News Poll, April 2012, MoE: 3 points)

Only 23% of Americans are satisfied with the direction of the country.

(Gallup Poll, August 2012, MoE: 4 points)

52% of Americans support taxing junk food and using the money for programs aimed at fighting obesity.

(Washington Post Poll, July 2012, MoE: 2 points)

45% of Americans support limiting the number of guns an individual can own.

(CNN Poll, July 2012, MoE: 3 points)

Polls for the presidential election last year showed that the race was within the margin of error.

At one point, President Obama held an eight-point lead among women, but Governor Romney held an eight-point lead among men.

Polling organizations eventually switched from a sample of registered voters to a sample of likely voters.

(Statistics from Gallup, July 30-August 19)

What is the "mark" of a criminal record? Does evidence of a criminal record change the likelihood that job applicants get interviews?

Are homeowners more likely to vote, volunteer or participate in community organizations than renters?

How have individual donors to political campaigns changed over the last thirty years? Are they more partisan? Do they give more money? Do they give to a larger number of candidates?

Are there differences in outcomes (e.g., educational success, behavioral problems, etc.) between children that grow up in stable, two-parent heterosexual households vs. those that grow up in stable, two-parent same-sex households?

How much does class size matter for kindergarten students? Do students perform better on standardized tests when they learn in small classrooms?

(Available from Hoya Computing)

When we collect information, each of the individuals or subjects in our research represents a unique observation.

In research, we need to think about the unit we are analyzing. At what level do we collect, analyze and measure data?

e.g., households vs. individual (income)

e.g., colleges vs. college students

Variables are the characteristics that vary from one observation to another.

Hair color

Eye color

Favorite movie

Worst fear

College GPA

Number of siblings

Annual salary

Gini coefficient

Length of the coastline

Whether country has nuclear weapons

Continent

GDP (or GDP per capita)

Average lifespan for residents

Color of house

Number of occupants

Sales price

Year built

Number of bedrooms

Dependent variables are the variables whose variation we are trying to explain.

Why do some students score better on standardized tests than other students in DC public schools? (Standardized test scores)

Why do some people make more money than other people? (Income)

What explains why people have different BMIs (or weights, relative to their heights)? (BMI)

Independent variables are the variables used to predict variation in the dependent variable, or those that are related to the dependent variable.

Does a student's race predict their performance on standardized tests? (Race of student)

Do people with more education tend to make more money? (Level of education)

Does your proximity to a local grocery store predict your BMI? (Distance to grocery store)

Discrete

Continuous

Measurement

Good measurement must be ...

1. Reliable.

2. Valid.

3. Exhaustive (Discrete)

4. Mutually Exclusive (Discrete)

Reliability refers to the consistency of a measure, whether it produces the same result across time.

Validity refers to whether the measurement you use actually gets at the concept you're trying to measure.

Concept: Exhaustive

For discrete (ordinal or nominal) variables, we want to make sure that the response options cover all possible outcomes. When all potential responses are included in one of the categories, we can say that the response options are exhaustive.

Concept: Mutually Exclusive

For discrete (ordinal or nominal) variables, we want to make sure that all response options fit into one and only one category. In other words, there is no ambiguity about how a response should be coded. When responses fit into one and only one category, we can say that the response options are mutually exclusive.

Concept: Measurement Error

Concept: Organizing Data

Concept: Types of Data

How do social scientists collect data about the social world?

Experimental studies vs. Observational studies

Natural experiments

Controlled experiments

Surveys

Administrative Data

Content analysis

1. Cross-sectional data

the Social Capital Community Survey

a survey conducted by GUSA about campus facilities

A Survey Monkey survey you completed.

2. Repeated cross-sections

the General Social Survey

the American National Election Studies

3. Longitudinal (or Panel) data

the Panel Study of Income Dynamics (PSID)

annual World Bank country indicators

When we organize data for statistical analysis, we typically organize the observations in rows and the variables in columns.

Statistical programs, including Minitab, Excel, SPSS and Stata, should make this organization intuitive.

Data (and particularly data collected in a survey) typically come with a codebook that describes the content of the dataset in more detail. The codebook includes information on how the data were collected, the response options for discrete categories, ways the data are coded, information on missing values, etc.

Concept: Frequency Distribution

The frequency distribution is a way of understanding all of the observations that share a common property. It displays the frequency - or the number of times - that a particular property occurs among observations.

Imagine that in a class, there are eleven seniors, fourteen juniors and one sophomore.

Concept: Percents, Proportions, Rates and Ratios

The proportion is the number of items in a group relative to the number of items in total. It is expressed in decimal form.

The percent is simply the proportion multiplied by 100.

The ratio expresses the comparison of one subgroup to another subgroup (rather than one subgroup to the whole).
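As a rough illustration of these three definitions, the sketch below uses the class described earlier (eleven seniors, fourteen juniors and one sophomore). The dictionary and variable names are illustrative only.

```python
from fractions import Fraction

class_counts = {"Senior": 11, "Junior": 14, "Sophomore": 1}
total = sum(class_counts.values())  # 26 students in all

# Proportion = group count / total; percent = proportion * 100.
for year, count in class_counts.items():
    proportion = count / total
    print(f"{year:<10}{count:>4}{proportion:>8.3f}{proportion * 100:>7.1f}%")

# A ratio compares one subgroup to another subgroup, not to the whole.
seniors_to_juniors = Fraction(class_counts["Senior"], class_counts["Junior"])
print(f"Seniors to juniors: {seniors_to_juniors.numerator}:{seniors_to_juniors.denominator}")
```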

Construct a Frequency Table

(Because the variable is nominal, rather than ordinal, you don't need to include the cumulative frequency or the cumulative percentage.)

Number of Medals Won in the 2012 Olympics by Continent

446 - Europe

238 - Asia

166 - North America

34 - Africa

29 - South America

48 - Oceania
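A frequency table for these medal counts could be built along the following lines. This is only a sketch: the column layout is an assumption, and the counts are copied from the list above.

```python
medals = {
    "Europe": 446, "Asia": 238, "North America": 166,
    "Africa": 34, "South America": 29, "Oceania": 48,
}
total = sum(medals.values())

# One row per continent: frequency, proportion and percent of all medals.
print(f"{'Continent':<15}{'Frequency':>10}{'Proportion':>12}{'Percent':>9}")
for continent, count in medals.items():
    proportion = count / total
    print(f"{continent:<15}{count:>10}{proportion:>12.3f}{proportion * 100:>9.1f}")
print(f"{'Total':<15}{total:>10}{1:>12.3f}{100:>9.1f}")
```

Because continent is a nominal variable, the table leaves out cumulative columns.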

Concept: Percentiles

The percentile is the value of a variable below which a certain percentage of observations fall. For example, the 25th percentile would be the score or value below which one-quarter of scores (on an exam, for example) fall.

Common percentiles include quartiles (25/50/75), quintiles (20/40/60/80) and deciles (10/20/etc.)

Review: In a class of twenty students, scores on the midterm were as follows:

76 78 78 80 82 82 84 86 89 89

90 92 92 92 93 93 94 96 96 98

Using five-point intervals (e.g., 71-75, 76-80, 81-85, etc.), create a frequency table to describe scores on the midterm. Include both the cumulative frequency and the cumulative percentage.
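One possible way to build this table is sketched below. The exact cut points (76-80 up to 96-100) are an assumption, chosen so that every interval covers five whole-number scores.

```python
scores = [76, 78, 78, 80, 82, 82, 84, 86, 89, 89,
          90, 92, 92, 92, 93, 93, 94, 96, 96, 98]

# Assumed five-point intervals, inclusive at both ends.
intervals = [(76, 80), (81, 85), (86, 90), (91, 95), (96, 100)]

table = []
cumulative = 0
for low, high in intervals:
    frequency = sum(low <= s <= high for s in scores)
    cumulative += frequency  # running total of observations so far
    table.append((f"{low}-{high}", frequency, cumulative, 100 * cumulative / len(scores)))

print(f"{'Interval':<10}{'Freq':>6}{'Cum. freq':>11}{'Cum. %':>8}")
for label, frequency, cum_freq, cum_pct in table:
    print(f"{label:<10}{frequency:>6}{cum_freq:>11}{cum_pct:>8.1f}")
```

The last row should always reach the total number of observations and 100 percent, which is a quick check that the intervals are exhaustive.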

The measures of central tendency are used to tell us something about the normal, typical or average score in a distribution of scores.

Concept: Distribution

The distribution tells us about the frequency that scores occur within any dataset. It lays out and clarifies the set of scores from the data. In some cases, like an exam of twenty students, it is easy to see all the numbers in the distribution. In other cases, there may be too many observations to actually see all the values in the distribution.

But what if we looked at the amount of money every individual donated to political candidates in the 2010 election? There would be millions of observations, and writing them all out would be tedious, boring and unnecessary ...

$25

$200

$35

$100

$60

$60

$55

$320

$100

$105

$140

$20

$10

$100

$200

$330

(Yawn, this is boring)

The mean is equal to what you colloquially think of as the average. It is equal to the sum of the scores divided by the total number of scores.

76 78 78 80 82 82 84 86 89 89

90 92 92 92 93 93 94 96 96 98

Sum of scores = 1760

Number of observations = 20

Mean = 1760/20 = 88

Notes: The mean can only be used on continuous variables; it can't be used to understand nominal variables. The mean is skewed by outliers. Outliers are scores that are extremely large or extremely small, relative to the rest of the distribution.

Imagine that the lowest two scores were 10 and 15, rather than 76 and 78.

Sum of scores = 1631

Number of observations = 20

Mean = 1631/20 = 81.55
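The two calculations above can be checked with a few lines of Python. This is a minimal sketch using the twenty midterm scores from the slide; the variable names are illustrative.

```python
from statistics import mean

scores = [76, 78, 78, 80, 82, 82, 84, 86, 89, 89,
          90, 92, 92, 92, 93, 93, 94, 96, 96, 98]
print(sum(scores), mean(scores))  # 1760 88

# Replace the two lowest scores (76 and 78) with the outliers 10 and 15.
with_outliers = [10, 15] + scores[2:]
print(sum(with_outliers), mean(with_outliers))  # 1631 81.55
```

Changing just two of the twenty scores pulls the mean down by more than six points.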

**Concept: Weighted Mean**

When you're combining group means from groups of different sizes, you can't simply average the means! Instead, you need to take a weighted mean, weighted by the size of each group.

Group 1: Women

N = 12

Mean Exam Score = 84.5

Group 2: Men

N = 8

Mean Exam Score = 93.25

If I asked you the mean exam score for the class, you can't simply average the two scores.

It is not just (84.5 + 93.25)/2=88.875.

Instead, you must weight each mean by the number of observations and take the weighted mean.

The proper calculation is ((84.5*12)+(93.25*8))/20=88
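The same comparison can be written as a short sketch. The group sizes and means come from the example above; the tuple layout is just one convenient way to hold them.

```python
# (group name, number of students, mean exam score)
groups = [
    ("Women", 12, 84.5),
    ("Men", 8, 93.25),
]

total_n = sum(n for _, n, _ in groups)

# Weight each group mean by its group size, then divide by the total size.
weighted_mean = sum(n * m for _, n, m in groups) / total_n

# The naive approach ignores group sizes and overstates the class mean.
simple_average = sum(m for _, _, m in groups) / len(groups)

print(weighted_mean)   # 88.0
print(simple_average)  # 88.875
```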

The median is the middle score in an ordered distribution. It is the score that divides the distribution equally in half.

76 78 78 80 82 82 84 86 89 89 92 92 92 93 93 94 96 96 98

The mode is the score that occurs most frequently in the distribution.

76 78 78 80 82 82 84 86 89 89 90 92 92 92 93 93 94 96 96 98

Notes: The median is insensitive to extreme scores elsewhere in the distribution. Again, you can't use the median with nominal variables.

76 78 78 80 82 82 84 86 89 89 92 92 92 93 93 94 96 96 98

26 28 28 30 32 32 34 36 39 89 92 92 92 93 93 94 96 96 98

The median is the same, even though the bottom nine scores in the distribution have changed substantially. (What would happen to the mean in this example?)
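To answer the question in parentheses, a quick sketch can compare the two lists of nineteen scores shown above. Only the standard library is used; the list names are illustrative.

```python
from statistics import mean, median

original = [76, 78, 78, 80, 82, 82, 84, 86, 89, 89,
            92, 92, 92, 93, 93, 94, 96, 96, 98]
shifted = [26, 28, 28, 30, 32, 32, 34, 36, 39, 89,
           92, 92, 92, 93, 93, 94, 96, 96, 98]

# The middle (10th) score is unchanged, so the median does not move.
print(median(original), median(shifted))

# The mean uses every score, so it drops sharply.
print(round(mean(original), 2), round(mean(shifted), 2))
```

The median stays at 89 in both lists, while the mean falls from roughly 87.9 to roughly 64.2.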

Notes: The mode can be used to talk about the most frequent score, but it doesn't tell us anything about the scores that occur around that score!

Question: How do we pick between the measures of central tendency? When is the mean the best measure, or when is the mode or median the best?

We often look at course evaluations to determine the "best" professor to take. For each of the following three professors, calculate the mean, median and mode of their course evaluations. Looking at the data, interpret the meaning of these measures of central tendency.
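A sketch of how this exercise might be checked is below. The ratings are copied from the three grids of evaluations shown earlier in these slides (twenty ratings per professor); `multimode` is used because a distribution can have more than one mode.

```python
from statistics import mean, median, multimode

evaluations = {
    "Smiley":       [5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2],
    "Sweater Vest": [5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3],
    "Shabby Tie":   [5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 3, 3, 3, 1, 1, 1, 1, 1, 1],
}

for professor, ratings in evaluations.items():
    print(professor,
          "mean:", mean(ratings),
          "median:", median(ratings),
          "mode(s):", sorted(multimode(ratings)))
```

Two of the professors share the same mean but have very different medians and modes, which is why no single measure tells the whole story.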

**Statistics Lectures**

Introducing Statistics

Measurement and the Collection of Data

Describing Data and Distributions

Visualizing Quantitative Data

Introduction to Probability Theory

Samples and Populations

**End!**

**Concept: Measures of Dispersion (or Measures of Variability)**

When we have a distribution of scores, the measures of dispersion (or variability) tell us how the scores are spread around the mean (or another measure of central tendency). In doing so, these measures tell us about the shape of the distribution. Instead of describing the typical score, as we do with a measure of central tendency, we are now concerned with describing the way the scores are spread relative to each other.

**Concept: Range**

The range is simply the distance between the minimum and the maximum score in a distribution.

Exam 1: 89-80 = 9

Exam 2: 99-72 = 27

**Concept: Deviation**

The deviation represents the difference between any particular observation (xi) and the mean. For each observation, the deviation tells us the distance from that observation to the mean. It can be positive (if the score is greater than the mean) or negative (if the score is less than the mean).

**Concept: Variance**

The variance is the average of the squared distances from each score to the mean. On its own, the variance is not a particularly useful statistic, but it is an important step along the way.

To calculate the variance ...

1. Calculate the difference between each score and the mean.

2. Square each difference (or deviation). (Note: Squaring them makes all values positive!)

3. Add up the squared differences.

4. Divide by the number of observations.

**Concept: Standard Deviation**

To calculate the standard deviation, simply take the square root of the variance. The standard deviation is the central statistic that tells us how the scores are spread around the mean in a distribution.
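The four variance steps and the final square root can be sketched as follows. Since the Exam 1 and Exam 2 scores are not reproduced in these notes, the example assumes the twenty midterm scores used earlier. It divides by the number of observations, as the steps above describe; sample formulas divide by n - 1 instead.

```python
from math import sqrt
from statistics import pstdev

scores = [76, 78, 78, 80, 82, 82, 84, 86, 89, 89,
          90, 92, 92, 92, 93, 93, 94, 96, 96, 98]
mean_score = sum(scores) / len(scores)

# Step 1: difference between each score and the mean (the deviations).
deviations = [s - mean_score for s in scores]
# Step 2: square each deviation so that every value is non-negative.
squared_deviations = [d ** 2 for d in deviations]
# Step 3 and Step 4: add them up and divide by the number of observations.
variance = sum(squared_deviations) / len(scores)
# The standard deviation is the square root of the variance.
std_dev = sqrt(variance)

print(round(variance, 2), round(std_dev, 2))
# Cross-check against the standard library's population standard deviation.
print(round(pstdev(scores), 2))
```

The raw deviations always sum to zero, which is why they are squared before averaging.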

Exam 1 - Standard Deviation

Exam 2 - Standard Deviation

The standard deviation is lower for Exam 1 - where the scores were all bunched closer to the mean - than for Exam 2 - where the scores were spread farther away from the mean.

Example: There are two judges, both of whom sentence criminals charged with misdemeanors. While the mean sentence both judges give is the same - 18 months in jail - the standard deviation is very different. One has a very small standard deviation, while the other has a very large standard deviation. What does this mean?

Rules about the standard deviation:

1. Standard deviation is always greater than or equal to zero!

2. Standard deviation is only equal to zero when all the values in a distribution are the same; in other words, when there is no variation in scores!

3. The greater the variability in scores around the mean, the greater the standard deviation.

**Concept: Normal Distribution**

**Concept: Z-Scores (or standardized scores)**

**Concept: Skewed Distributions**

Metaphor: You can't compare apples and oranges?

ACT vs. SAT

Both college entrance exams.

However, one point on the ACT (ACT point) does not equal one point on the SAT (SAT point).

Different units of measurement.

In order to compare them, you need to standardize the measurements.

A student who scored 2 standard deviations above the mean SAT score got a 1,100 on the exam.

A student who scored 2 standard deviations above the mean ACT score got a 24 on the exam.

The student who scored a 1,100 on the SAT did as well, relative to the average grade, as a student who scored a 24 on the ACT. They are both 2 standard deviations above the mean.

Now that we know about the standard deviation, we can begin to think about standardized scores, or Z-scores.

To simplify the concept, you can consider that every score can be represented in two ways - as a raw score or as a standardized score (Z-score).

The raw score is the score in the original units of measurement.

You weigh 140 lbs.

You scored 85 points on the exam.

You have an IQ of 105.

You have $1,250 in your bank account.

You are 72 inches tall.

These scores are all presented in their original units of measurement - pounds, exam points, IQ points, dollars or inches. Note that these scores tell us nothing about how someone scored relative to everyone else in the distribution.

Each of these scores also has a corresponding standardized score, expressed as the number of standard deviations the score falls from the mean score in the distribution.

Your weight is 1.2 sd above the mean.

Your exam score is 1 sd below the mean.

Your IQ is 0 sd from the mean (meaning you have the average IQ score).

Your bank balance is 2 sd below the mean.

Your height is 0.2 sd above the mean.

Note that these scores are all expressed in standard deviation units. We can claim that your height falls much closer to the average height than your bank balance, which is actually pretty far from the average bank balance (even though height is measured in inches and bank balances are measured in dollars)!

Calculating the Z-Score

Example: The average score on the Physics midterm was an 82. Smart kid that you are, you scored a 92. The professor calculated the standard deviation and told you that it was 6. What is your Z-score?

Wait! I've calculated a Z-Score, showing that I scored 1.67 standard deviations above the mean on the Physics exam, but what does that actually mean?

Question: On the next exam, you again score a 92 and the class average is again an 82. However, the standard deviation has changed to 10. What does this mean for the spread of scores on the exam? What does this mean for your score, relative to the other scores?
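The Z-score is the raw score minus the mean, divided by the standard deviation. A minimal sketch of both Physics exam scenarios follows; the function name is illustrative.

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """Number of standard deviations a raw score falls from the mean."""
    return (raw - mean) / sd

first_exam = z_score(92, 82, 6)    # standard deviation of 6
second_exam = z_score(92, 82, 10)  # same raw score and mean, standard deviation of 10

print(round(first_exam, 2))   # 1.67
print(round(second_exam, 2))  # 1.0
```

With a larger standard deviation the scores are more spread out, so the same 92 sits closer to the middle of the pack.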

This is the most common curve you will see in statistics. It is called the normal distribution.

About 68 percent of scores fall within 1 sd of the mean!

About 95 percent of scores fall within 2 sd of the mean!

About 99.7 percent of scores fall within 3 sd of the mean!

The normal distribution - also known as the bell curve - is a symmetrical curve defined by two statistics - the mean (mean = mode = median) and the standard deviation. In the curve, half of all observations fall above the mean and half fall below the mean. Many social phenomena (e.g., intelligence, height, etc.) approximately follow the normal distribution. As you get farther from the mean score, you will find fewer and fewer observations.

The three rules of thumb!

The GPA Distribution of Students at College X

(Horizontal axis, raw score: 2.10, 2.40, 2.70, 3.00, 3.30, 3.60, 3.90)

What is the average GPA?

What is the standard deviation?

For a student with a GPA of 3.45 (raw score), what is her standardized score?

If a student has a GPA 1 standard deviation below the mean, what is his GPA?

What percentage of GPAs fall between 2.70 and 3.30?

95 percent of GPAs fall between what raw scores?

Would it be common to find a student with a GPA of 2.00 or below?
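These questions can be explored with the standard library's `NormalDist`. The sketch assumes, reading from the axis labels of the figure, that the mean GPA is 3.00 and the standard deviation is 0.30; those two numbers are inferred, not stated on the slide.

```python
from statistics import NormalDist

# Assumed from the axis: center at 3.00, tick marks 0.30 apart.
gpa = NormalDist(mu=3.00, sigma=0.30)

z_for_345 = (3.45 - gpa.mean) / gpa.stdev            # standardized score for a 3.45
one_sd_below = gpa.mean - gpa.stdev                  # GPA one sd below the mean
share_270_to_330 = gpa.cdf(3.30) - gpa.cdf(2.70)     # share within 1 sd of the mean
low_95 = gpa.mean - 2 * gpa.stdev                    # rule-of-thumb 95 percent range
high_95 = gpa.mean + 2 * gpa.stdev
share_at_or_below_200 = gpa.cdf(2.00)                # more than 3 sd below the mean

print(round(z_for_345, 2), round(one_sd_below, 2))
print(round(share_270_to_330, 3))
print(round(low_95, 2), round(high_95, 2))
print(round(share_at_or_below_200, 5))
```

A GPA of 2.00 sits more than three standard deviations below the assumed mean, so under this model it would be quite rare.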

When scores are normally distributed, the right tail of the distribution is the same length as the left tail of the distribution, and the mean=median=mode.

However, we will sometimes find social phenomena that are not normally distributed. This is because some observations have extremely high or extremely low scores, thereby making it so that the mean, median and the mode are not equal to one another.

Example of a Right Skew (or Positive Skew): In the United States, income is not normally distributed because some people make millions of dollars. What happens to the mean, median and the mode when you have some outliers at the top of the distribution?

Example of a Left Skew (or Negative Skew): On a final exam, a handful of students do really poorly, getting extremely low grades relative to everyone else. What happens to the mean, median and the mode when you have some outliers at the bottom of the distribution?

16 Books

37 Books

Between 2008 and 2013, the graphical representation of books increased by 131%. It more than doubled, from 16 to 37.

However, tuition increased by only 16 percent - from $47,908 to $55,640.

Thus, the visual representation is extremely misleading.
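The mismatch between the picture and the data is easy to verify with a percent-change calculation. This is a small sketch using the figures quoted above; the helper function is illustrative.

```python
def percent_change(old: float, new: float) -> float:
    """Percent change from an old value to a new value."""
    return (new - old) / old * 100

books_change = percent_change(16, 37)            # books drawn in the graphic
tuition_change = percent_change(47908, 55640)    # actual tuition in dollars

print(round(books_change), round(tuition_change))  # 131 16
```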

30 stick figures

13 stick figures.

11.5 stick figures.

18 stick figures.

The ratio of stick figures from 2013 to 2015 (11.5: 30) makes it look like the yield nearly tripled! In fact, the yield went up by only 15%.

Rules for Displaying Data Well

Concept: Bar Graphs

Concept: Line Graphs

A bar graph is a visual display of discrete categories (either nominal or ordinal) where the length of each bar represents the percentage or frequency of a category.

Title: The percent of people (age 12 or older) who report using illicit drugs last month, by type of county.

Source: 2010 National Survey of Drug Use and Health

Source: 2010 National Survey of Drug Use and Health

Number of users (age 12 and older) with dependence or abuse, by drug type.

**Concept: Histograms**

A histogram is a visual display for continuous data (interval/ratio) where the scores are presented along one axis and the frequency (or percentage) of that score is presented along the other axis. Often, continuous data are recoded into categories before the construction of a histogram (e.g., a continuous GPA may be recoded into intervals of 0.10).

Average annual count of evicted tenants, by gender and neighborhood racial composition

Source: Desmond 2012

How could you improve the quality of this graph?

Predicted Probability of Trusting Various Social Groups, by Homeownership Status

Source: McCabe 2012

A line graph is a visual display of data typically used to track a social phenomenon across time, or some other continuous measure.

Concept: Pie Graphs

Pie charts are good for making representations of Pac Man, but aren't particularly good for displaying statistical information. The reason is two-fold. First, and most importantly, pie charts (unlike bar charts or histograms) can tell us about the relative sizes of categories, but tell us nothing about their frequency. Second, it is often difficult to correctly visualize the relative size of a piece of the pie.

Which Type of Visual Tool Should I Use?

6. I want to show the distribution of GPAs for the students at Georgetown.

Using Statistical Tools to Describe and Visualize Quantitative Data

Open and describe the data (e.g., the number of variables, observations, missing values, etc.).

Sort the data according to particular variables in the data.

Recode continuous measures into discrete measures (e.g., continuous age measure into categorical age measure).

Get basic descriptive statistics (e.g., measures of central tendency, measures of dispersion).

Create data visualizations (bar charts, histograms, line graphs, etc.)

How do you play roulette (and why does that woman look like she's having so much fun)?

**Concept: Probability**

Probability refers to the likelihood that a particular outcome will occur over a long sequence of observations. It is equal to the proportion of times we expect a particular event over a large number of trials.

Notation: P(A) refers to the probability of event "A" occurring. For example, in the flip of a fair coin, P(Head) = 0.50. In a class of twenty-five students where ten of the students are sophomores, the probability of picking a sophomore when randomly selecting a student is P(sophomore) = 0.40.

**Concept: Probability Rules (or Probability Rules!)**

1. The probability of an event occurring equals the number of successful outcomes divided by the total number of possible outcomes.

P(A) = Number of Successful Outcomes/Number of Total Outcomes

2. The probability of an event occurring always ranges between 0 and 1.

3. Converse Rule: The probability of an event not occurring is equal to 1 minus the probability of that event occurring.

P(not A) = 1-P(A)

4. Addition Rule: If A and B are distinct outcomes with no overlap, then the probability of either getting A or B is equal to just adding up the probability of both outcomes.

P(A or B) = P(A) + P(B)

5. Multiplication Rule: The probability of getting a combination of independent events is equal to the product of the probabilities of their separate occurrences.

If A and B are independent events, then P(A and B) = P(A) * P(B).

4a. Adjusting for Joint Occurrence: If A and B can occur together, simply adding their probabilities double-counts the outcomes in which both occur. In this case, we subtract out the joint occurrences.

P(A or B) = P(A) + P(B) - P(A and B)

5a. Conditional Probability: If A and B are both possible outcomes, then P(A and B) = P(A) * P(B given A)

If I randomly select one student in this class, what is the probability he or she will have a Georgetown ID?

If I randomly select one student in this class, what is the probability he or she will already have a bachelor's degree?

P(King): The probability of selecting a King from a deck of cards is 0.0769.

P(Not King): The probability of not selecting a King from a deck of cards is 0.9231.

**Concept: Probability Distribution**

**Concept: Random Variable**

**Concept: Probability**

and the Normal Curve

The rate is the frequency of an occurrence, relative to a base number (measured in the 10's, 100's or 1000's, etc.)

Seven cities with the highest murder rate (2010)

(Note: the murder rate is the number of murders per 100,000 people.)

1. New Orleans - 49.1

2. St. Louis - 40.5

3. Baltimore - 34.8

4. Detroit - 34.5

5. Newark - 32.1

6. Oakland - 22.0

7. Washington, DC - 21.9

Can we measure grit?

Concept: Unit of Measurement

What is the unit you're measuring?

For income, you're measuring in dollars.

For standardized test scores, you're measuring in points.

For education, you're measuring in years of schooling.

Nominal

Ordinal

Interval

Ratio

(Special case: Dichotomous/Dummy [yes/no])

Note: It is possible to make continuous variables into discrete categories. For example, you could have a continuous age variable (e.g., 18, 19, 20, 21, 22, 23, 24 etc.) and recode it into a categorical variable (e.g., 18-21, 22-30, 31-40, etc.)
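The recoding described in the note can be sketched in a few lines of Python. The ages below are hypothetical, and the function name `recode_age` is only an illustration; the three bins are the ones given as examples in the note.

```python
# Hypothetical ages; the bins follow the example categories in the note.
ages = [18, 19, 21, 22, 24, 30, 31, 35, 40]

def recode_age(age):
    """Map a continuous age onto one of the discrete example categories."""
    if 18 <= age <= 21:
        return "18-21"
    if 22 <= age <= 30:
        return "22-30"
    if 31 <= age <= 40:
        return "31-40"
    return "other"  # ages outside the three example bins

age_groups = [recode_age(a) for a in ages]
print(age_groups)
```

Note that the bins do not overlap and every age lands in exactly one category, which is what mutually exclusive and exhaustive categories require.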

Person A: Weighs 150 lbs!

Weighs himself five times and gets the following scores ...

130

145

150

120

160

Weighs himself five times and gets the following scores ...

125

126

125

126

124

1

2

If you were asking a survey question about a person's race, what would be the set of response options you would give to ensure that the categories are exhaustive?

In creating a question about race, what is an example of a set of response options that are not mutually exclusive?

Measurement is imperfect.

Sometimes the instruments are imperfect (e.g., the scale is slightly off). Sometimes people are misleading about their responses (e.g., they might under-report their weight, or over-report their voting behavior).

While we work to minimize measurement error, there is the possibility for error in all measurement.

1. Imprecise tools.

2. Poorly worded questions, surveys.

3. Interviewer biases.

4. Respondent biases (e.g., social desirability).

5. Coding/processing errors.

Using the variable for the number of siblings for each person in this class, construct (and interpret) a frequency table.
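As a rough sketch of what a frequency table looks like, the Python below counts how often each value occurs. The sibling counts are made up for illustration (they are not real class data).

```python
from collections import Counter

# Hypothetical sibling counts for ten students (not real class data).
siblings = [0, 1, 1, 2, 1, 3, 0, 2, 1, 4]

counts = Counter(siblings)  # value -> number of times it occurs
n = len(siblings)

print("Siblings  Frequency  Percent")
for value in sorted(counts):
    freq = counts[value]
    print(f"{value:>8}  {freq:>9}  {100 * freq / n:>6.1f}%")
```

Reading the table: the frequency column should add up to n, and the percent column should add up to 100.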

Final Activity:

How do we measure poverty

in the United States?

We often hear accounts of poverty, or the percentage of Americans who live in poverty, but what are the actual indicators used to measure poverty?

"X bar" is the mean!

"Sigma" is to sum everything

or to add them all up!

"X i" (or X subscript i) is the ith

observation in a dataset

"N" is the total number of

observations in that dataset

10 15

76 78

78 80 82 82 84 86 89

89 90 92 92 92 93 93 94 96 96 98

Outliers!

Outliers are scores that are markedly different from the rest of the scores in the distribution. They distort calculations of central tendency, like the mean.
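To see how much outliers can pull the mean, here is a small Python sketch. It assumes the scores listed above form a single distribution (that grouping is an assumption) and treats 10 and 15 as the outliers.

```python
from statistics import mean, median, mode

# The scores listed above, treated as one distribution (an assumption).
scores = [10, 15, 76, 78, 78, 80, 82, 82, 84, 86, 89,
          89, 90, 92, 92, 92, 93, 93, 94, 96, 96, 98]

# Same scores with the two low outliers (10 and 15) dropped.
trimmed = [s for s in scores if s > 15]

print("With outliers:    mean", round(mean(scores), 2),
      "median", median(scores), "mode", mode(scores))
print("Without outliers: mean", round(mean(trimmed), 2),
      "median", median(trimmed), "mode", mode(trimmed))
```

Dropping the two outliers moves the mean by almost seven points, while the median and mode barely change. That is why the median is often preferred for distributions with extreme scores.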

Looking back at these scores ... In which distribution do the scores typically fall closest to the mean?

Example: Home Prices in DC

To calculate the variance

1. Subtract the mean from each observation (to get the deviation)

2. Square the deviation

3. Add up all of the squared

deviations ("Sum of Squares")

4. Divide the Sum of Squares

by the number of observations

to get the variance.

(Note: These are two normal curves.

The peak of each curve is the mean.

When the scores are bunched closer

to the mean, the standard deviation

is small; when the scores are spread

wider from the mean, the standard

deviation is larger. We will learn

about the normal curve soon.)

Two students in two separate Introduction to Sociology courses both score a 90 on their exams. Are all 90s created equal?

If a person in class A got a 90, but the mean was 95, he did below average; if a person in class B got a 90, but the mean was 75, she did well above the average.

Even though the raw scores are the same (because 90=90), the students scored very differently

relative to the rest of the students in their class!

**Example: Calculate**

the measures of dispersion.

There are six jobless households. They have

received unemployment benefits for the following number of weeks:

9 8 6 4 2 1

Calculate the mean, the range, and the standard deviation for the distribution.

Mean = 5

Range = 8

Standard Deviation = 2.94
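The answers above can be checked by following the variance steps one at a time. This is a minimal Python sketch; the variable names are illustrative, and it divides the Sum of Squares by the number of observations (N), as in the steps described earlier.

```python
from statistics import mean, pstdev

weeks = [9, 8, 6, 4, 2, 1]  # weeks of unemployment benefits

avg = mean(weeks)                                  # mean
value_range = max(weeks) - min(weeks)              # range
deviations = [x - avg for x in weeks]              # step 1: deviations
sum_of_squares = sum(d ** 2 for d in deviations)   # steps 2-3: Sum of Squares
variance = sum_of_squares / len(weeks)             # step 4: divide by N
std_dev = variance ** 0.5                          # square root of the variance

print(avg, value_range, sum_of_squares, round(std_dev, 2))
# statistics.pstdev also divides by N, so it should agree with std_dev.
print(round(pstdev(weeks), 2))
```

Many programs divide by N-1 instead of N when working with a sample (Python's `statistics.stdev` does this), which would give about 3.22 here rather than 2.94.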

z = (X - Xbar) / s

where z is the standardized score for a value of X, (X - Xbar) is the difference between X and the mean of X (X-bar), and s is the standard deviation.

Things to remember about

the normal curve ...

1. The entire area under the curve

is always equal to 100 percent!

2. The peak of the normal curve is

the mean of the distribution. In a

normal distribution, the mean = mode =

median.

3. Half of all observations fall above the mean.

Half of all observations fall below the mean.

4. Nearly all scores - more than 99 percent of them! - fall within three standard deviations of the mean

(+/- 3 sd). It is very unlikely to find an observation with a score more than 3 sd from the mean.

5. About 95 percent of scores fall within 2 sd of the mean. About 68 percent of scores fall within 1 sd of the mean.

What percentage of scores fall

between 0 and 1 standard

deviation

above

the mean?

What percentage of scores fall

between 0 and 2 standard

deviations

below

the mean?

What percentage of scores fall

between 0 and 3 standard

deviations

above

the mean?

**Normal Distribution Table**

(p. 596, Ritchey Book)

When you calculate a z-score, it often won't be a nice, even number (e.g., 1, 2 or 3). You may get a z-score of 1.45, or a z-score of -2.05.

In the Normal Distribution Table, you will find z-scores and the corresponding area under the normal curve for each z-score.

Example 1: On the SATs, Person A scores 1.5 standard deviations above the mean. Assuming SAT scores are normally distributed, what percentage of observations fall at or below her score?

Example 2: A parent brings her child to get measured (height) and weighed. The child's height is 0.20 standard deviations below the mean. What percentage of children fall between 0.20 standard deviations on either side of the mean?

Example 3: The average strawberry weighs 8.5 ounces, with a standard deviation of 1 oz. A worker finds a strawberry that weighs 11 ounces! What is the z-score for this strawberry's weight? What percentage of strawberries would weigh more than this one?
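If you want to check your table lookups for Examples 1-3, Python's standard library can compute areas under the normal curve directly. This is a sketch for checking answers, not a replacement for the table; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, standard deviation 1

# Example 1: share of scores at or below z = 1.5
ex1 = z.cdf(1.5)

# Example 2: share of children within 0.20 sd on either side of the mean
ex2 = z.cdf(0.20) - z.cdf(-0.20)

# Example 3: z-score for an 11 oz strawberry (mean 8.5 oz, sd 1 oz),
# and the share of strawberries that weigh more
z3 = (11 - 8.5) / 1
ex3 = 1 - z.cdf(z3)

print(round(ex1, 4), round(ex2, 4), z3, round(ex3, 4))
```

Answers from the printed table can differ in the last decimal place because table entries are rounded before they are added or subtracted.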

Scores

Frequency (# of times it occurs)

Here's what you should be able to do by now ...

- Distinguish between levels of measurement (e.g., nominal, ordinal, interval, etc.).

- Talk about the challenges of measurement (e.g., reliability, validity, mutual exclusivity, etc.) in the social sciences.

- Calculate rates, ratios, percentages and proportions.

- Make a frequency table, including recoding continuous variables into discrete measures.

- Calculate the measures of central tendency (mean, median and mode) and talk about their limitations.

- Calculate the measures of dispersion (range, deviation, standard deviation).

- Explain the standard deviation.

- Calculate a z-score and explain the difference between a z-score and a raw score.

- Use the normal distribution table in the back of your book when calculating z-scores.

Example 4: We are studying the distribution of annual income for cashiers at fast food chains. The mean income is $14,000 and the standard deviation is $1,500. Assuming a normal distribution, what is the standardized score for a cashier making $16,000? What is the probability of randomly selecting a cashier whose income falls between $14,000 and $16,000?
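Example 4 can be checked the same way. The sketch below uses Python's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mean_income, sd_income = 14_000, 1_500

# Standardized score for a cashier making $16,000
z_score = (16_000 - mean_income) / sd_income

# Probability that a randomly selected cashier earns between $14,000 and $16,000
incomes = NormalDist(mu=mean_income, sigma=sd_income)
p_between = incomes.cdf(16_000) - incomes.cdf(14_000)

print(round(z_score, 2), round(p_between, 4))
```

With the table you would round the z-score to 1.33 before looking it up, so the table answer may differ slightly in the third or fourth decimal place.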

1. Crash course on statistical programs

(e.g., Excel, Minitab)

2. Visualizing Data Badly

3. Creating Good Data Visualizations

(Bar Charts, Line Graphs, Histograms)

1. Open & Describe the Data

2. Sort the Data

3. Recode Variables

4. Descriptive Statistics

5. Data Visualizations

Opening Data:

FILE --> Open Project

Basic Descriptives:

STAT --> Basic Statistics -->

Display Descriptive Statistics

Sorting Data:

DATA --> Sort

Recoding Data:

DATA --> Code

Descriptive Statistics:

CALC --> Column Statistics

Z-Scores:

CALC --> Standardize

Data Visualizations:

GRAPH --> Bar Chart

GRAPH --> Line Chart (or Time Series Plot)

GRAPH --> Histogram

1. Axis scales can be misleading!

Don't manipulate the scale in a way

that makes people read the figure differently.

2. The visual images should be

correctly sized to the data. Otherwise

we end up with perceptual distortion

(where the image tells a story different from the actual data).

3. Consistent scales & consistent axes!

(Units across the axes must be the same.)

4. Enough information, but not too much information. (Avoid data junk.)

1. Select the appropriate type of graph.

2. Clearly label your axes.

3. Ensure consistent scales on the axes. Include a legend (where appropriate).

4. Write titles that identify the information in the chart. (We should be able to "read" the chart without any accompanying text).

5. Avoid perceptual distortions.

6. Minimize data "junk". (This includes excess colors, symbols, and information not directly related to the data story itself.)

7. Remember: Data visualizations are being used to tell a story!

Explain Each Visual Display of Data ...

1. For every year since 1980, I want to graph the acceptance rate at Georgetown.

2. I want to plot the number of murders committed in each of the four quadrants (NW, NE, SW, SE) in DC last year.

3. I would like to show the distribution of SAT scores for students from public and private schools.

4. I want to show voter participation rates for different age groups.

5. I would like to compare the incarceration rates in each state in the United States.

Reading Social Research

Understanding Statistical Tools in Contemporary Research

Murray et al. 1990. Teacher Personality Traits and Student Instructional Ratings in Six Types of University Courses. Journal of Educational Psychology.

McAdam, Doug and Cynthia Brandt. 2009. Assessing the Effectiveness of Voluntary Youth Service: The Case of Teach for America. Social Forces

**Concept: Sample**

Polling & Presidential Politics

"The percentage of voters believing that the country is better off since Barack Obama became president jumped seven points after the two political conventions, according to a new NBC News/Wall Street Journal poll.

Thirty-eight percent of registered voters now say the nation is better off, which is up from 31 percent in the August NBC/WSJ poll conducted before the conventions. Still, a plurality of voters -- 41 percent -- believe the country is worse off, while 21 percent think it's in the same place." September 18, 2012

"Obama holds a five-point advantage over Romney, 50 percent to 45 percent, among likely voters in the nationwide survey conducted Sept. 12-16. Obama led Romney by 50 percent to 44 percent among registered voters in the poll released yesterday."

Registered vs. Likely Voters:

What is a likely voter model?

**Concept: Population**

The population is the group of observations, all of which share some characteristic, that we are trying to learn about.

Examples could include ...

- The population of Americans (e.g., "Americans believe ...")

- The population of registered (or likely) voters

- The universe of students at Georgetown

- The people that own at least one car in Washington, DC

- The elementary schools in America

The sample is a smaller number of individuals or observations drawn from the population. The sample is used to make claims about the broader population.

As opposed to descriptive statistics, which describe the characteristics of a particular set of observations (e.g., the measures of central tendency, the frequency with which particular observations occur, or the measures of dispersion around a central score), inferential statistics use a sample of observations to make claims - or to infer something - about a larger population.

For example, we ...

- use a sample of likely voters to infer how the electorate is likely to vote.

- use a sample of American households to infer American attitudes on a range of social issues.

- use a sample of American workers to make claims about the rate of unemployment and the characteristics of the labor force.

**Concept: Inferential**

Statistics

Note: What's important in each of these cases is that we use a smaller sample to make claims about a broader population.

Concept: Point Estimate

(or Sample Statistic)

Concept: Population

Parameters

**Concept: Drawing**

a Sample

How do we select a sample of cases to research?

What is the process for drawing a sample? What

are the criteria?

**Concept: Probability vs.**

Non-Probability Sample

A probability sample - also known as a random sample - generally fits two requirements. First, every unit in the population has an equal probability (or likelihood) of being in the sample. Second, and related, the goal of a probability sample is to be representative of the population at large.

In a non-probability sample - or a convenience sample - not every member of the population has an equal chance of being included in the sample. Examples might be a snowball sample, a sample of students in your class, or the respondents to an online poll.

How do we ensure a representative sample?

Follow the rules of EPSEM: Equal Probability of SElection Methods

EPSEM requires that you select a sample in a

manner that affords each case an equal probability

of being included in the sample.

There are several types of probability samples:

- Simple random sample:

- Systematic random sample:

- Stratified random sample:

- Clustered random sample:

Why would we select a

non-probability sample?

- Some research questions make the selection of a probability sample very difficult (e.g., a sample of IV drug users, a sample of people exhibiting a particular deviant behavior, etc.)

- Sometimes researchers "pilot" their studies on a convenience sample before testing their hypotheses on a larger, representative sample.

What are the drawbacks of

a non-probability sample?

- Difficulty in making generalizations about the broader population when the sample is not representative.

We typically draw samples, rather than surveying the entire population, because the latter would prove to be prohibitively expensive, time-consuming or simply unnecessary.

The population parameter is a value describing a population (e.g., its mean or a proportion) that is, at least in theory, both unknown and unknowable.

Because the population parameter is unknown, we calculate a sample statistic, known as the point estimate (or sometimes, the parameter estimate), that is our best guess of that population parameter.

For population parameters, we typically use Greek letters.

For point estimates, or sample statistics, we typically use English letters.

Concept: Single Sample

vs. Repeated Samples

Concept: Sampling

Distribution

Because you only draw a single random sample, you don't know empirically what the distribution of means would look like if you drew ten, a hundred or a thousand random samples ... Each sample, though, would have a (slightly different) mean. The sampling distribution is a theoretical distribution for all samples of a particular size (N). In other words, it helps us imagine what the distribution of sample means would look like if we drew a large (infinite) number of samples. Because it is theoretical, it is never obtained in real life by researchers (who usually only draw a single sample). Sometimes, you will hear researchers refer to this as the sampling distribution of sample means.

Concept: Sampling

Error

Concept: Standard Error

Research question: How many minutes each week do Americans spend on their phones? Obviously, I can't ask this question of every American, so instead, I draw a sample of 200 Americans.

Let's imagine that, in my sample (n=200), the mean amount of time spent gabbing on the phone each week is 101.55 minutes. (Xbar = 101.55)

Because I've got too much time and too many resources, I decide to draw a couple more samples ... (Note: This is where the situation veers from real life. Researchers never draw repeated samples.)

In the second sample (n=200), Xbar = 96.50

In the third sample (n=200), Xbar = 100.40

In the fourth sample (n=200), Xbar = 103.05

In the fifth sample (n=200), Xbar = 92.65

And on and on and on ... In theory, if we did this exhaustively (and exhaustingly) we would draw every imaginable sample and form our own sampling distribution of sample means.
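Although researchers never draw repeated samples in real life, a computer can. The Python sketch below simulates the thought experiment: it builds a hypothetical population of weekly phone minutes (the population and its shape are invented for illustration), draws 1,000 samples of n=200, and records the mean of each.

```python
import random
from statistics import mean, pstdev

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of weekly phone minutes (skewed, always positive).
population = [rng.gammavariate(4, 25) for _ in range(100_000)]
pop_mean = mean(population)
pop_sd = pstdev(population)

# Draw 1,000 repeated samples of n=200 and record each sample mean.
n = 200
sample_means = [mean(rng.sample(population, n)) for _ in range(1_000)]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print(f"population mean = {pop_mean:.2f}, mean of means = {mean_of_means:.2f}")
print(f"population sd   = {pop_sd:.2f}, sd of sample means = {sd_of_means:.2f}")
```

Two things to look for in the output: the mean of the sample means sits very close to the population mean, and the sample means are spread much more tightly than the raw scores are.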

What are the characteristics of the sampling distribution?

1. The distribution approximates a normal curve (or is normally distributed).

2. The mean of the sampling distribution is equal to the true population mean (which, in real life, is unknown). (Here, we sometimes say that the mean of means equals the population mean.)

3. The standard deviation of the sampling distribution is smaller than the standard deviation of the population. (This is important.)

4. The sampling distribution describes the possible outcomes from our samples, and the probability that each outcome will occur. Intuitively, we know that sample means far from the population mean are unlikely (low probability, in the tails of the curve) while those closer to the true population mean are substantially more likely to occur.

From a deck of cards (52), the probability that

I will randomly select a King = P(King) = 4/52 = 0.0769

P(King or Queen) = P(King) + P(Queen) = 4/52 + 4/52 = 0.0769 + 0.0769 = 0.1538.

P(King or Heart ) = P(King) + P(Heart) - P(Joint Occurrence) = 4/52 + 13/52 - 1/52 = 16/52 = 0.3077.

P(King and Tails) = P(King) * P(Tails) = 4/52 * 1/2 = 4/104 = 0.0385
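One way to convince yourself that the rules work is to build the deck and count. The Python sketch below does that; the helper `prob` and the variable names are illustrative, and fractions are used so that nothing is lost to rounding.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Successful outcomes divided by total outcomes (Rule 1)."""
    successes = sum(1 for card in deck if event(card))
    return Fraction(successes, len(deck))

p_king = prob(lambda c: c[0] == "K")
p_not_king = 1 - p_king                                          # Converse Rule
p_king_or_queen = prob(lambda c: c[0] in ("K", "Q"))             # Addition Rule
p_king_or_heart = prob(lambda c: c[0] == "K" or c[1] == "hearts")  # joint occurrence counted once
p_king_and_tails = p_king * Fraction(1, 2)                       # Multiplication Rule (independent events)

for name, p in [("King", p_king), ("Not King", p_not_king),
                ("King or Queen", p_king_or_queen),
                ("King or Heart", p_king_or_heart),
                ("King and Tails", p_king_and_tails)]:
    print(f"P({name}) = {p} = {float(p):.4f}")
```

Counting the King of Hearts only once is exactly what subtracting the joint occurrence accomplishes in the formula.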

with Replacement

vs.

without Replacement

Question: What is the probability of selecting two Kings in a row? Here, replacement matters!

Do you put the card back in the deck before selecting the second card, or do you select from the remaining 51 cards?

P(Ace and Ace) with replacement?

P(Ace and Ace) without replacement?
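A short Python sketch of the two cases, using fractions to keep the arithmetic exact (the variable names are illustrative):

```python
from fractions import Fraction

# With replacement: the first card goes back, so the second draw is
# independent of the first and the deck still has 4 Aces among 52 cards.
with_replacement = Fraction(4, 52) * Fraction(4, 52)

# Without replacement: after one Ace is drawn, 3 Aces remain among 51 cards,
# so we use P(A and B) = P(A) * P(B given A).
without_replacement = Fraction(4, 52) * Fraction(3, 51)

print(with_replacement, round(float(with_replacement), 4))
print(without_replacement, round(float(without_replacement), 4))
```

The second probability is smaller because drawing the first Ace leaves fewer Aces in the deck for the second draw.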

In randomized events (e.g., childbirth, flipping coins, etc.), the probability distribution describes the likelihood of each possible outcome of the event. The probability distribution is analogous to the frequency distribution, except that it is based on the number of expected occurrences in the long term (based on probability theory) rather than the actual number of occurrences (as described by empirical evidence).

The normal curve - which you have already seen - is basically an ideal or theoretical model showing the probability that particular events will occur. When we calculated z-scores and looked at the area under the curve, we were figuring out how likely it was that particular events would occur. This is an application of the rules of probability.

Percentile Rankings: We can also use the area under the normal curve to think about percentiles, knowing the percentage of the population that falls at or below a certain score.

Question: How likely is a score to fall +/- 1 standard deviation from the mean?

P (+/-1sd) = P(-1<z<0) + P(0<z<1) = .3413 + .3413 = .6826

Question: What is the percentile ranking for someone with a score 1 standard deviation above the mean?

P(z<1) = P(z<0) + P(0<z<1) = .5000 + .3413 = .8413 = 84th percentile; that is, 84 percent of scores fall at or below this score.

Question: How likely is a score to fall between 2 and 2.5 standard deviations above the mean?

P(2<z<2.5) = P(0<z<2.5) - P(0<z<2) = 0.4938-0.4772 = .0166

Question: If you scored 1.4 standard deviations above the mean, what is your percentile ranking?

P(z<1.4) = 0.5000 + 0.4192 = .9192

Percentile ranking is the 92nd percentile. You scored at or above 92 percent of people.
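The four answers above come from the normal distribution table. As a check, the same areas can be computed with Python's `statistics.NormalDist`; this is a sketch with illustrative variable names.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve

within_one_sd = z.cdf(1) - z.cdf(-1)     # P(-1 < z < 1)
percentile_1sd = z.cdf(1)                # P(z < 1)
between_2_and_2_5 = z.cdf(2.5) - z.cdf(2)  # P(2 < z < 2.5)
percentile_1_4 = z.cdf(1.4)              # P(z < 1.4)

print(round(within_one_sd, 4), round(percentile_1sd, 4),
      round(between_2_and_2_5, 4), round(percentile_1_4, 4))
```

The computed values can differ from the table answers by about .0001 because each table entry is rounded to four decimals before being added or subtracted.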

When we talk about samples and populations, we begin to move into the realm of inferential statistics. In this part of the course, we begin to use statistical techniques to make inferences about populations larger than we are able to reach. We might want to know something about DC residents, likely voters, or bike commuters. Because we can't talk to all DC residents, all likely voters or all bike commuters, we draw a sample - a smaller group representative of the total group. The size of the sample and the strategies we use to select the sample matter.

In the last election, we heard a lot about public opinion polls. What was the population being targeted? How were we selecting the samples? And why does sample selection matter to our polling?

Contacting a sample

How do pollsters select the people for their polls?

Random Digit Dialing (RDD) is a common

method of contacting polling respondents.

However, one concern with RDD is that it

often captures only landlines, but doesn't

include cellphones.

What could be some problems if we include

only landlines in a RDD design, but not cell

phones?

How likely is she to have a landline?

How likely is he to have a landline?

The concepts of equal probability and representativeness are important because we want the sample to look like the population. For example, if the population is ~55% female, we want the sample to be ~55% female. If ~20% of the population is under 30, we want ~20% of the sample to be under 30. Drawing a random sample where everyone has an equal probability of selection makes it likely that the sample resembles the population (assuming a large enough sample size).

For example, there are more than 300 million Americans. We couldn't possibly ask them all about their views on abortion, health care reform, gay marriage, etc. Those statistics - the percent that support abortion rights, the percent that oppose health care reform, etc. - are the population parameters.

From our sample, we might find out that 51% of people support abortion rights. This sample statistic is our best guess of the population parameter.

If I flip a coin twenty times and it lands on heads 12 times (and tails 8 times) ....

        Probability   Observed Frequency

Heads   0.50          0.60 (12/20)

Tails   0.50          0.40 (8/20)

When we draw a sample to estimate a population parameter, we typically draw only one sample. Under the conditions we will talk about, that sample provides a good estimate of the true population parameter. However, as you can imagine, if you drew a second sample (or a third or fourth sample), your sample statistics would be different each time.

Ping Pong Balls

For reasons of cost, benefit and efficiency, we don't draw repeated samples in real life ... but the concept of repeated sampling will be important shortly.

In each sample, the mean of the sample is our best guess of the population parameter, but it is not equal to the parameter (unless we're very very lucky). The difference between the value of the sample statistic and the population parameter is known as sampling error. However, because we don't actually know the population parameter, we can't actually calculate a value for the sampling error.

Concept: Law of

Large Numbers

Concept: Central

Limit Theorem

While we often want to know something about an entire population, it is often difficult to reach every member (or every observation) of a population.

How would you contact every voter? Or every car owner in Washington, DC? (And even when it is feasible, it is often unnecessary.)

We often talk about the number of people in our

sample by referring to the "sample size." In statistics,

n=sample size.

If we polled 300 Georgetown students, n=300

If we asked 820 Americans something, n=820

How do you draw

a simple random

sample?

1. RDD = Random Digit Dialing

2. Lottery

3. Number Tables
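The lottery method is easy to mimic on a computer. In the Python sketch below, the roster of 300 students is hypothetical; `random.sample` draws without replacement and gives every student on the list the same chance of being selected.

```python
import random

rng = random.Random(2012)  # fixed seed so the draw can be reproduced

# Hypothetical sampling frame: a roster of 300 students.
roster = [f"student_{i}" for i in range(1, 301)]

# "Lottery": pull 20 names out of the hat, without replacement.
sample = rng.sample(roster, k=20)

print(len(sample), "students selected, for example:", sample[:3])
```

The important design point is the list itself: a simple random sample is only as good as the roster (the sampling frame) it is drawn from.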

The standard error is simply the standard deviation for the sampling distribution. If we drew repeated samples from the population, they would each give us a sample mean. The standard error tells us about the spread of these sample means around the mean of the distribution. In other words, the standard deviation of that sampling distribution is known as the standard error.

To understand the standard error, we need to begin by reviewing the standard deviation. Remember that in any distribution, the standard deviation tells us how far the scores in the distribution are spread from the mean. When the standard deviation is small, the scores in the distribution are tightly bundled around the mean; when the standard deviation is larger, the scores in the distribution fall farther from the mean.

Ok, I understand what the standard error is, but what does it tell us about the sampling distribution? The standard error tells us how much sampling error there is when we repeatedly sample the population.

To calculate the standard error, we simply divide the standard deviation of the sample by the square root of the sample size.

Put simply, the law of large numbers tells us that drawing a big sample is better than drawing a small sample. (However, this is true only up to a point. At some point, drawing a bigger sample doesn't really give you more information about the population. A sample of 10,200 people is not really any better than a sample of 10,000 people ...)

Intuitively, this should make sense - having more

data points (i.e., more information about a population) is better than having few data points.

If I wanted to draw a sample from this class to estimate the class GPA, a sample of 18 students (n=18) is a better guess at the class mean than a sample of 5 students (n=5).

Likewise, if I wanted to estimate the average weight of American men, drawing a sample of 300 men (n=300) would probably give me a better estimate of the national body weight than a sample of (n=30).

Mathematically, the law of large numbers is important because as you increase the sample size (n), you decrease the standard error. In other words, as you increase the number of people in your sample, there is less variability around the mean - sample means tend to cluster closer to the mean of means (or the population mean).

Draw a sample of ping pong balls of n=20.

Draw a sample of ping pong balls of n=5.

Calculate the standard error for n=20

Calculate the standard error for n=5

If n increases, the standard error decreases

If n decreases, the standard error increases
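A minimal Python sketch of the calculation, assuming a hypothetical sample standard deviation of 10 (substitute the value from your own ping pong ball samples):

```python
from math import sqrt

def standard_error(sd, n):
    """Standard deviation of the sample divided by the square root of n."""
    return sd / sqrt(n)

sd = 10.0  # hypothetical sample standard deviation

se_5 = standard_error(sd, 5)
se_20 = standard_error(sd, 20)

print(f"n=5:  standard error = {se_5:.2f}")
print(f"n=20: standard error = {se_20:.2f}")
```

Because n sits under a square root, quadrupling the sample size (from 5 to 20) only cuts the standard error in half. This is part of why ever-bigger samples give diminishing returns.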

We have talked extensively about the normal distribution and variables - like IQ, body weight, or heights - that are normally distributed. However, not all social phenomena are normally distributed. Some variables are skewed - for example, income - while others might have a bi-modal distribution.

The Central Limit Theorem says that, regardless of the shape of the distribution of the underlying (raw score) data, the sampling distribution will be approximately normal when the sample size is large enough (as a rule of thumb here, n>121).

In other words, even if the raw data is skewed (income) or bi-modal, the sampling distribution of means will be normally shaped.
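The Central Limit Theorem can be seen in a simulation. The Python sketch below invents a heavily skewed "income" variable (an exponential distribution with a mean of 30,000, chosen only for illustration), draws 2,000 samples of n=150, and compares the skewness of the raw scores with the skewness of the sample means. A skewness near 0 indicates a symmetric shape, like the normal curve.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Average cubed z-score: about 0 for symmetric data, positive for right skew."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

rng = random.Random(7)  # fixed seed so the sketch is repeatable

def draw_income():
    # Hypothetical right-skewed incomes with a mean of 30,000.
    return rng.expovariate(1 / 30_000)

raw_scores = [draw_income() for _ in range(20_000)]

n = 150  # larger than the n>121 rule of thumb
sample_means = [sum(draw_income() for _ in range(n)) / n for _ in range(2_000)]

skew_raw = skewness(raw_scores)
skew_means = skewness(sample_means)

print(f"skewness of raw incomes:  {skew_raw:.2f}")
print(f"skewness of sample means: {skew_means:.2f}")
```

The raw incomes are strongly skewed, but the distribution of sample means is close to symmetric, which is what the theorem leads us to expect.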

Ok, so far everything we have talked about has been for continuous (interval/ratio) variables. How does this all work for nominal variables?

Concept: Estimating

Proportions

Rather than estimating the mean of a variable, we might be interested in estimating the proportion.

The proportion of ping pong balls that have a blue dot.

The proportion of Americans who support stronger gun control.

The proportion of American elementary schools built after 1980.

The proportion of Georgetown students with blond hair.

Each of these variables is a discrete variable - nominal or ordinal - rather than a continuous measure.

From the sample of ping pong balls, I'd like to estimate the percentage with a red "X". In other words, what percentage of ping pong balls have a red "X"? (Note: This is not a question about means; it's a question about percentages because the variable is discrete (nominal) rather than continuous (ratio).)

Like the sampling distribution of means, the sampling distribution of proportions is normally distributed (when the sample is large enough).

Because the sampling distribution

is normally distributed ...

We know that about 68% of all sample means fall within +/-1 standard error of the population mean.

We know that about 95% of all sample means fall within +/-2 standard errors of the population mean.

Previously, when we calculated the standard deviation, we had all the observations in the distribution. Here, we have only one observation - the mean that we've just drawn from our sample!

Smaller standard deviation.

The sampling distribution has a smaller standard deviation than the underlying data.

Larger standard deviation.

The underlying data has a larger

standard deviation than the standard

deviation of the sampling distribution.

Draw one more sample of n=20.

Calculate the standard deviation

of the sample.

Calculate the standard error.

Why do I keep making you draw a sample

of n=20? Why not draw a sample of 10, or

a sample of 40? Does it really matter

how big the sample is?

Building on this, we will talk about

confidence intervals (or, what in these polls,

are referred to as the margin of error) and

think more seriously about why sample

sizes matter.

What is an example of an underlying variable where the distribution might be bimodal?

The Central Limit Theorem: Regardless of the shape of the underlying distribution of data, the sampling distribution will be approximately normal for samples that are large enough (as a rule of thumb here, n>121).

Draw two samples (n=20) of ping pong

balls to estimate the percent with a red "X"

Good.

Now you can draw ping pong balls out of

a box (or other samples from a population)

and you can estimate the standard errors,

both for discrete variables and for continuous

variables.

But what does this all mean?

Because we know the standard error -

the standard deviation for the sampling

distribution - we can figure out what

percentage of random samples will fall

within a certain distance from the mean.

Again, remember the 68%, 95%, 99% rules

We said that the sampling distribution was normally distributed for continuous variables when n>121.

For discrete variables, the sampling distribution is

normally distributed when the following rules are met.

First, decide on the smaller proportion - either P or (1-P).

Second, multiply that proportion by the sample size (either P*n or (1-P)*n).

Third, if that value - either P*n or (1-P)*n - is greater than 5 (e.g., P*n > 5), then the sampling distribution is normally distributed.
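The three-step check can be written as a small Python function. The function name and the example proportions below are hypothetical.

```python
def normal_approximation_ok(p, n):
    """Apply the rule: the smaller of P and (1-P), times n, must exceed 5."""
    smaller = min(p, 1 - p)   # step 1: pick the smaller proportion
    value = smaller * n       # step 2: multiply by the sample size
    return value > 5          # step 3: compare with 5

# Hypothetical examples
print(normal_approximation_ok(0.50, 20))   # 0.50 * 20 = 10  -> True
print(normal_approximation_ok(0.10, 20))   # 0.10 * 20 = 2   -> False
print(normal_approximation_ok(0.10, 100))  # 0.10 * 100 = 10 -> True
```

The second and third examples show why rare outcomes need bigger samples: with a proportion of 0.10, a sample of 20 is too small, but a sample of 100 passes the check.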

Midterm Review

Confidence Intervals

**Your turn!**

Questions?

Comments?

Confusions?

Thoughts?

Statistical Inference and

Hypothesis Testing

**Concept: Confidence**

Intervals

We said that the point estimate (or sample statistic) was the best estimate of the population parameter ... but how good of an estimate is it? How confident can we be that our sample statistic accurately represents the population parameter?

The most common confidence interval is a 95% interval. At this level, about 95% of the confidence intervals constructed from repeated samples will contain the true population parameter.

**Concept: Level of**

Confidence

As social scientists, we're usually happy with a confidence interval of 95%. However, you can select any level of confidence. You want to be 99% confident? No problem! Only interested in making a claim with 80% confidence? Well, ok!

The trade-off will be between precision and confidence.

As your confidence level rises - from 95% to 99%, let's say - your level of precision will fall. In other words, there will be a wider range of values into which your estimate will fall.

The expected level of error is symbolized by alpha (the Greek letter α). (It's easy to remember - alpha males are (over)confident.)

If you were 100% confident, there would be no error. Since you can't be 100% confident, you must subtract your level of confidence (e.g., 95%) from 100% to get your alpha.

**Concept: Alpha**

**Concept: The Importance**

of the Sample Size

**Concept: Student's t**

**Concept: Confidence**

Intervals for Proportions

The confidence interval is a range of values, centered around the sample statistic. It specifies the degree of confidence that we have that the true population parameter falls within that range.

The level of confidence is simply the specified probability that your range of values contains the true population parameter. As the researcher, you get to select the level of confidence you want!

Level of Confidence = 100% - Alpha

95% = 100% - 5%

99% = 100% - 1%

**Concept: Calculating**

the Confidence Interval

Confidence Interval = Sample Statistic +/- (Z)(Standard error)

Z is the Z-score that corresponds with the level of error you're willing to accept.

**Concept: Interpreting**

the Confidence Interval

Here is the correct interpretation of the confidence interval:

If you were to repeatedly draw samples, the true population parameter would fall within the confidence intervals 95% of the time

(for a 95% confidence interval).

I draw a sample and find that the sample statistic is 3.65. The standard error is 0.20. Construct a 95% confidence interval.
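One way to check this exercise is a short calculation. This is a sketch only: the function name is an assumption, and it uses z = 1.96 for 95% confidence.

```python
def confidence_interval(statistic: float, standard_error: float, z: float = 1.96):
    """Return (LCL, UCL) for statistic +/- z * standard_error."""
    margin = z * standard_error
    return statistic - margin, statistic + margin


lcl, ucl = confidence_interval(3.65, 0.20)
print(f"95% CI: {lcl:.3f} to {ucl:.3f}")
```

The margin is 1.96 * 0.20 = 0.392, so the interval runs from roughly 3.26 to 4.04.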

Back to the ping pong balls ... Draw a sample of n=20.

Sophomores/Juniors: Construct a 90% confidence interval.

Seniors: Construct a 99% confidence interval.

What is the trade-off between confidence and precision?

As confidence level goes up,

the precision goes down.

As the confidence goes down,

the precision goes up.

Why does the sample size matter for our

discussions of the confidence interval?

How does the sample size impact the

range of the confidence interval?

If n increases, the standard error decreases

When the standard error decreases, the range

of the confidence interval shrinks.

As the confidence interval shrinks, the range of

values included within that interval gets smaller.

Activity: Construct a

Confidence Interval

Graphing Confidence

Intervals

This graph measures a score for a level

of prejudice (Y Axis) and year in college

(X Axis). The points in the middle are

the sample statistics, and the lines

extending out are the 95% confidence

intervals.

LCL = Lower Confidence Limit

UCL = Upper Confidence Limit

Standard error

Sample Proportion

**Concept: Margin**

of Error

1. Get a point estimate (or sample statistic) and calculate the standard error.

2. Decide on alpha. What level of error are you willing to accept?

3. What is your level of confidence? What is the corresponding Z-score?

Hint: 90% confidence, z = 1.64

95% confidence, z = 1.96

99% confidence, z = 2.57

4. Calculate the confidence interval. State the range of values, from the LCL to the UCL.
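The Z-scores in the hint come from the normal table. As a sketch, they can also be derived from alpha with Python's standard library. The helper name is an assumption. The computed values are about 1.645, 1.960 and 2.576, so the hint's 1.64 and 2.57 are slightly truncated versions of the same numbers.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided Z-score: alpha is split evenly between the two tails."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_for_confidence(level):.3f}")
```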

Practice

Problems

You poll 260 Georgetown students to ask whether they support building a pub in the basement of Healy, and you find that 58% of your sample support the Healy pub. Can you say with 95% confidence that more than half of Georgetown students support the Healy pub?

1. Calculate the sample

proportion.

P = 152/260 = .585

2. Calculate the standard

error of proportions

s = sqrt[(P(1-P))/n]

= sqrt[(.585*(1-.585))/260]

= 0.031

3. Select your alpha

Confidence Level = 95%

Alpha = 0.05

4. Determine the

corresponding Z-score

For alpha = 0.05,

Z=1.96

5. Calculate the Lower Confidence Limit and the Upper Confidence Limit

CI = P +/- (Z)(s)

LCL = 0.524

UCL = 0.646

6. Answer the

Research Question

Yes! The confidence interval is 0.524

to 0.646. I know that 95% of confidence

intervals contain the true population

parameter. The confidence interval does

not include 0.50.
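The six steps above can be sketched in a few lines. The function name is an assumption. Note that carrying full precision gives roughly 0.525 to 0.645; the 0.524 to 0.646 on the slide comes from rounding the standard error to 0.031 before multiplying. The conclusion is the same either way.

```python
from math import sqrt


def proportion_ci(successes: int, n: int, z: float = 1.96):
    """Confidence interval for a proportion: P +/- (Z)(s)."""
    p = successes / n                 # step 1: sample proportion
    se = sqrt(p * (1 - p) / n)        # step 2: standard error of proportions
    return p - z * se, p + z * se     # step 5: LCL and UCL


lcl, ucl = proportion_ci(152, 260)
print(f"95% CI: {lcl:.3f} to {ucl:.3f}")
print("Interval entirely above 0.50?", lcl > 0.50)
```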

Twist one: Imagine that the proportion was obtained from

a sample of n=140. Calculate the new 95% confidence interval. (Last names: A-F)

Twist two: Imagine that the proportion was obtained from

a sample of n=840. Calculate the new 95% confidence interval. (Last names: G-J)

Twist three: Sticking with the original sample of n=260, calculate the 90% confidence interval. (Last names: K-R)

Twist four: Sticking with the original sample of n=260, calculate the 99% confidence interval. (Last names: S-Z)

Twist one: Explain what happens to the confidence interval when the sample size goes down.

Twist two: Explain what happens to the confidence interval when the sample size goes up.

Twist three: Explain what happens to the confidence interval when you decide on a lower level of confidence.

Twist four: Explain what happens to the confidence interval when you decide on a higher level of confidence.

Twist one: The confidence interval gets wider. (More values are included in the interval.)

Twist two: The confidence interval gets narrower. (Fewer values are included in the interval.)

Twist three: The confidence interval gets narrower. (Fewer values are included in the interval.)

Twist four: The confidence interval gets wider. (More values are included in the confidence interval.)

Note that none of these confidence intervals include the score 0.50, meaning that for each of these scenarios, we can be confident that the true population proportion is greater than 0.50!

**Concept: Optimal**

Sample Size

Unless you have unlimited resources, you will be limited in your sample size. What is the optimal

number of respondents to pick?

In the previous example, we saw that the confidence interval changed from ...

0.503 to 0.666 when n=140

0.525 to 0.645 when n=260

0.551 to 0.618 when n=840

Note: As the sample size increases, the standard error declines. With declining standard errors, the confidence intervals narrow. Therefore, an increase in the sample size leads to a narrower confidence interval. (Note, however, diminishing returns on this.)
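A short sketch makes the diminishing returns concrete. It reuses the Healy pub proportion (152/260); the function name is an assumption. Because n sits under a square root, quadrupling the sample only halves the width of the interval.

```python
from math import sqrt


def ci_width(p: float, n: int, z: float = 1.96) -> float:
    """Full width (UCL - LCL) of the interval P +/- (Z)(s)."""
    return 2 * z * sqrt(p * (1 - p) / n)


p = 152 / 260  # sample proportion from the Healy pub problem
for n in (140, 260, 840):
    print(f"n={n}: width = {ci_width(p, n):.3f}")

# Ratio of widths when the sample is four times larger.
print(ci_width(p, 4 * 260) / ci_width(p, 260))
```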

**Concept: Optimal**

Confidence Intervals

Now that we've discussed sample sizes, how about confidence levels? What level of confidence should we pick?

Note: As we demand greater confidence, our

alpha (expected error) decreases. Corresponding

to a decrease in alpha is an increase in the z-score.

This corresponds with wider confidence intervals.

Therefore, as we require higher levels of confidence,

we accept wider confidence intervals.
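The same trade-off can be sketched with the Healy pub sample (152 of 260) held fixed while only the confidence level changes. The Z-scores are the rounded values from the earlier hint; the variable names are assumptions.

```python
from math import sqrt

p, n = 152 / 260, 260          # Healy pub sample
se = sqrt(p * (1 - p) / n)     # standard error of the proportion

# Z-scores as given in the slides' hint (rounded table values).
z_scores = {"90%": 1.64, "95%": 1.96, "99%": 2.57}

intervals = {level: (p - z * se, p + z * se) for level, z in z_scores.items()}
for level, (lcl, ucl) in intervals.items():
    print(f"{level}: {lcl:.3f} to {ucl:.3f} (width {ucl - lcl:.3f})")
```

Each step up in confidence widens the interval, yet all three lower limits stay above 0.50.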

So far, we have used the normal distribution curve

to get z-scores in all of our statistics. We noted early

on that the sampling distribution is normally distributed when the sample is of a particular size (n>120). What happens when we have a smaller sample?

In principle, the concern of a smaller sample is

that the sample statistic is more likely to be skewed

by outliers. In other words, an extreme score in a small-n sample is likely to distort the sample mean. Big outliers will inflate the mean; small outliers will depress the mean.

As a result, in repeated samples with smaller sample sizes (n<121), the sampling distribution has a larger spread in sampling error (i.e., there is more sampling error in small-n samples than in large-n samples) and the curve will be flatter. This curve is called the t-distribution (or the student's t).

Characteristics of the student's t

1. Approximately normal.

2. Symmetrical.

3. Flatter than a normal curve (and gets

even flatter as n decreases)

4. Above n=121, t=z

Really, though, the take away is that when

we have small samples, we will use the t-distribution. (More on this in future classes.)

**Concept: Hypothesis**

and Hypothesis Testing

A hypothesis is a prediction about how

two variables are related to one another.

**Concept: The**

Null Hypothesis

**Concept: The**

Research Hypothesis

**Concept: One-tail tests**

vs. two tail tests

The null hypothesis is the statement of "no effect" or "no difference". Regardless of what you actually think the relationship is between two variables, the null hypothesis always claims that there is no relationship between the variables.

Reject the null hypothesis?

Your turn:

Give me a hypothesis!

The mean GPA of students from private school equals the mean GPA of students from public school.

The proportion of black fifth

graders who pass standardized

tests is equal to the proportion

of Hispanic fifth graders who

pass standardized tests.

The percentage of Washington, DC

residents who approve of the job Mayor

Gray is doing is equal to fifty percent.

The average number of baseball cards

owned by an American teenager is 1,000

When we run a hypothesis test, we are

always asking the following question: Can

we reject the null hypothesis?

The research hypothesis is the hypothesis we will accept if the null hypothesis is rejected. In other words, if we reject the null hypothesis, we can then accept the research hypothesis.

Final Review, Part I

The proportion of black fifth

graders who pass standardized

tests is greater than the proportion

of Hispanic fifth graders who

pass standardized tests.

The average number of

baseball cards owned by an American teenager is not 1,000

Practice Problem:

A car company wants to know the expected number of miles per gallon for a new model it's producing. Recognizing that not all of the cars are perfectly the same because of workmanship and variation in parts, it takes 100 of the cars off the assembly line and does a test run. The mean miles per gallon is 26, and the standard deviation is 4. Calculate the 95% confidence interval.

Standard error = 4/sqrt(100) = 0.4

Z-score = 1.96

Margin of Error = 0.78

Confidence Interval is 25.22 to 26.78

We can be 95% confident that the true mean miles per gallon is between 25.22 and 26.78.
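The arithmetic behind this answer, as a minimal sketch with assumed variable names:

```python
from math import sqrt

mean_mpg, sd, n, z = 26, 4, 100, 1.96

standard_error = sd / sqrt(n)            # 4 / 10
margin_of_error = z * standard_error     # 1.96 * 0.4
lcl = mean_mpg - margin_of_error
ucl = mean_mpg + margin_of_error

print(f"SE = {standard_error:.2f}, margin = {margin_of_error:.3f}")
print(f"95% CI: {lcl:.2f} to {ucl:.2f}")
```

The unrounded margin is 0.784, which the slide reports as 0.78.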

**Concept: One Sample**

Means Test

In a one-sample test, you are testing to determine whether your sample mean is equal to a specific, target value.

**Step-by-Step**

Tests

1. State the Null Hypothesis.

2. State the Research Hypothesis and, in doing so, decide on a one-tailed or a two-tailed test.

In conducting a significance test, we're asking

whether the difference between a sample mean

and the target value occurred because of a true

difference, or could the difference - on account

of sampling error - be the result of chance and

chance alone?

**Concept: Two Types**

of Error - Type I and

Type II Errors

In a one-tailed test, we hypothesize that a particular value is greater than or less than another value.

The average GPA of seniors in the spring is less than the average GPA of seniors in the fall.

Students living in single-parent households spend less time doing homework than students living in two-parent households.

In a two-tailed test, the hypothesis is non-directional. We are not concerned about whether the value is greater than or less than; either way satisfies our hypothesis.

The GPA of sociology majors is different from the GPA of government majors. (Note: This hypothesis doesn't specify whether sociology majors have higher GPAs or lower GPAs than government majors; it only hypothesizes that they are not the same.)

Remember that hypothesis testing is all about determining how likely it would be to get your sample statistic if the null hypothesis were true. Because it is all about the likelihood - or the probability - we will sometimes make mistakes in rejecting or failing to reject the null hypothesis. These are called Type I and Type II errors.

A Type I error occurs when we reject the null hypothesis, even though the null hypothesis is actually true.

A Type II error occurs when we fail to reject the null hypothesis even though the null hypothesis is actually false.

3. Write down the sample statistic

(e.g., the sample mean). Calculate

the standard error.

4. Decide on a level of significance

- alpha - which is also the level

of expected error. (Typically, we

will just stick with alpha = 0.05)

5. Calculate the test statistic (Z).

6. Determine the p-value that corresponds with the test statistic.

7. Make the rejection decision.

If the p-value < alpha, reject the

null hypothesis (and accept the

alternative hypothesis).

If the p-value > alpha, we fail to

reject the null hypothesis. (Note:

We don't accept the null hypothesis;

we just fail to reject the null

hypothesis.)

8. Interpret your decision.

**Concept: Test Statistic**

When we conduct a hypothesis test, we first have to calculate a test statistic.

Test Statistic = (Sample mean - Hypothesized population mean) / Standard error

The hypothesized population mean is just the number you're comparing against your sample mean.

**Concept: p-value**

The p-value is the area in the tail of the normal curve, beginning with your test statistic. It tells

us how unusual it would be to find your test

statistic if the null hypothesis were true.

For example, let's say that we are testing whether the mean GPA at Georgetown is 3.52. That's our null hypothesis. We draw a sample, write down the sample mean, and calculate the test statistic. From there, the p-value tells us how unusual it would be to find our sample mean if the true population mean were 3.52. In repeated samples, it tells us the

probability of drawing a sample with our sample mean (or a sample mean farther from the hypothesized population mean) if the population mean actually is 3.52.
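To make the GPA example concrete, here is a sketch of the test statistic and the tail area beyond it. Only the null value of 3.52 comes from the slide. The sample mean (3.58), standard deviation (0.40), sample size (150) and the function name are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean: float, null_mean: float, sd: float, n: int):
    """Return the test statistic and the area in the tail beyond it."""
    se = sd / sqrt(n)                               # standard error
    z = (sample_mean - null_mean) / se              # test statistic
    tail_area = 1.0 - NormalDist().cdf(abs(z))      # one-tail area
    return z, tail_area


# Hypothetical sample: mean 3.58, sd 0.40, n = 150; null hypothesis: 3.52.
z, tail = one_sample_z(3.58, 3.52, 0.40, 150)
print(f"z = {z:.2f}, tail area = {tail:.4f}")
```

The tail area is the share of sample means that would land at least this far above 3.52 if the null hypothesis were true.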

**Concept: Critical**

Value

The critical value is a Z-score that corresponds with a particular level of alpha. When your test statistic is greater than the critical value, you know that you can reject the null hypothesis.

**Practice Problem**

You hypothesize that students who go on to graduate school have a higher IQ than the average American college student. You decide to test the hypothesis by administering an IQ test to 144 college students who are headed to graduate school in the fall. The average IQ of college students is 105. In your sample, you find that graduate school-bound students have a mean IQ of 111 with a standard deviation of 4. Test your hypothesis that students who go on to graduate school have a higher IQ than other students. What do you conclude?

A politician administers a "feeling thermometer" to a random sample of her constituents (n=225) to determine whether she will support a piece of legislation. She decides that if the average feeling thermometer score of her constituents is above 50 - feeling generally positive about the legislation - then she will support it. She finds that the mean score for her constituents is 53, and that the standard deviation is 3. Should she support the legislation?

On the 2006 General Social Survey (GSS), respondents were asked to identify their political ideology on a scale of 1-7, with 1 being extremely liberal, 4 being moderate, and 7 being extremely conservative. Using a sample of 186 individuals, researchers want to know whether Americans lean towards being liberal or being conservative. To do so, they will test whether the population response is significantly different from 4 - the position of being "moderate, middle of the road." A score above 4 shows a propensity toward conservatism. A score below 4 shows a propensity toward liberalism. The mean score for their sample is 4.075 and the standard deviation is 1.512.

Standard error = 0.111

Test statistic = 0.68

p-value = 0.2483

because 0.2483 > 0.05,

we fail to reject the null

If the null hypothesis were true (that the population mean = 4), it would not be unusual to observe the results of our sample. It is plausible that the population mean is 4.0, leaning neither conservative nor liberal.
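The numbers in this solution can be reproduced with a short sketch (variable names are assumptions). Computed without intermediate rounding, the tail area is about 0.249 rather than the table value 0.2483. That figure is the area in one tail; doubling it gives the two-tailed probability of roughly 0.50. Either way it is far above alpha, so the decision does not change.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, null_mean, sd, n = 4.075, 4.0, 1.512, 186

se = sd / sqrt(n)                            # standard error
z = (sample_mean - null_mean) / se           # test statistic
one_tail = 1.0 - NormalDist().cdf(abs(z))    # area beyond z in one tail
two_tail = 2.0 * one_tail                    # area beyond +/- z

print(f"SE = {se:.3f}, z = {z:.2f}")
print(f"one-tail area = {one_tail:.4f}, two-tail area = {two_tail:.4f}")
print("Reject the null at alpha = 0.05?", two_tail < 0.05)
```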

**Concept: Confidence**

Intervals and

Hypothesis Testing

When I asked about the IQ score of students going to graduate school ... We rejected the null hypothesis that the mean IQ of students going to graduate school was equal to the mean IQ of all students (and we accepted the alternative hypothesis that their IQ was higher).

Now, imagine that in the real world - if we had data on all the students going to graduate school - we knew that their mean IQ actually was equal to the mean IQ of all students. In this case, the null hypothesis is, in fact, true, but our sample led us to reject the null hypothesis.

Classic Type I error.

Why does the Type I error occur?

When we draw samples, most of the sample means will fall close to the population mean ... but not all of them! (Remember: the sampling distribution of means.) If, just by chance, we draw a sample that falls farther from the population mean, then we might be led to a Type I error.

We failed to reject the null hypothesis that the population mean for political ideology was equal to 4. What if we knew, though, that the population mean was, in fact, 4.25? In this case, we would have failed to reject the null hypothesis when, in fact, the null hypothesis is false.

Classic Type II error.

We never know if we have, in fact, committed a Type I error or a Type II error because we don't actually know the population mean! (That's why we're drawing samples and running statistical tests.)

Instead, we can quantify the likelihood of committing a Type I error or a Type II error by knowing alpha.

Alpha is the probability of a Type I error. As the probability of a Type I error goes down, the probability of a Type II error goes up. (The likelihood of a Type II error depends on how far the population mean falls from the hypothesized value.)
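A small simulation can illustrate why alpha is the probability of a Type I error. This is a sketch under stated assumptions: a made-up population that is normal with mean 100 and standard deviation 15, a null hypothesis that is true by construction, and a two-tailed test with a known standard deviation. The function name, number of trials and seed are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_one_error_rate(trials=2000, n=144, alpha=0.05, seed=1):
    """Share of samples that wrongly reject a null hypothesis that is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed cutoff
    se = 15 / sqrt(n)                                # known population sd
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(100, 15) for _ in range(n)]
        z = (sum(sample) / n - 100) / se             # null mean is the true mean
        if abs(z) > critical:
            rejections += 1                          # a Type I error
    return rejections / trials


rate = type_one_error_rate()
print(f"Share of samples that rejected a true null: {rate:.3f}")
```

The share should land near alpha, since every rejection here is a mistake by design.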

Think about our legal system.

What's the equivalent of a Type I

or Type II error in our legal system?

Alpha is the expected level of error. When alpha equals 0.05, we need to decide if that expected error should occur on just one side of the distribution (one-tail test) or whether it should be split between both sides of the distribution (two-tail test).

Remember: We're asking how unusual it would be to find our test statistic if the null hypothesis were true. Two-tailed tests are more conservative than one-tailed tests. It is more difficult to reject the null hypothesis (and accept the research hypothesis) when conducting a two-tailed test. (Imagine you got a p-value of 0.04. With alpha = 0.05, you could reject the null on a one-tailed test, where the whole 0.05 sits in one tail, but you would fail to reject the null on a two-tailed test, where each tail gets only 0.025.)

Critical Value = 1.64

Alpha = 0.05

Test Statistic = 1.92

p-value = 0.0274

In this case, you reject the null hypothesis because

the p-value < alpha. (0.0274 < 0.05)

You can also make this rejection decision knowing that the test statistic is greater than the critical value. (1.92>1.64)

Critical Value = 1.64

Alpha = 0.05

Test Statistic = 1.24

p-value = 0.1075

In this case, you fail to reject the null hypothesis because the p-value > alpha. (0.1075 > 0.05)

You can also make this decision knowing that the test statistic isn't as big as the critical value. (1.24 < 1.64). Your test statistic would have to be larger than the critical value to reject the null.
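The two decision rules always agree. This sketch runs both on the two test statistics above, for a one-tailed test at alpha = 0.05. The function name is an assumption, and the computed critical value is about 1.645 rather than the rounded 1.64.

```python
from statistics import NormalDist

std_normal = NormalDist()


def decide(z: float, alpha: float = 0.05):
    """Return (reject by p-value rule, reject by critical value rule)."""
    p_value = 1.0 - std_normal.cdf(z)                # area in the upper tail
    critical_value = std_normal.inv_cdf(1 - alpha)   # one-tailed cutoff
    return p_value < alpha, z > critical_value


for z in (1.92, 1.24):
    p_value = 1.0 - std_normal.cdf(z)
    print(f"z = {z}: p-value = {p_value:.4f}, decisions = {decide(z)}")
```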

Constructing confidence intervals and running single-sample hypothesis tests require the same set of inputs ...

- Sample mean

- Level of confidence/alpha

- Standard error of the sampling distribution

So, how are hypothesis testing and confidence intervals related to one another? What do they tell us about a sample statistic?

For our political ideology problem (Xbar = 4.075),

construct a 95% confidence interval. (Remember: se = 0.111)
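A sketch of this calculation, using the rounded standard error from the slide (variable names are assumptions):

```python
sample_mean, se, z, null_mean = 4.075, 0.111, 1.96, 4.0

lcl = sample_mean - z * se
ucl = sample_mean + z * se
contains_null = lcl <= null_mean <= ucl

print(f"95% CI: {lcl:.3f} to {ucl:.3f}")
print("Interval contains the hypothesized mean of 4?", contains_null)
```

The interval runs from about 3.86 to 4.29 and includes 4, which matches the decision not to reject the null hypothesis.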

Now, let's compare the confidence interval with the results of the hypothesis test.

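[Figure: normal curve centered on the hypothesized mean of 4 (Z = 0), with critical values (CV) at Z = -1.96 and Z = 1.96 bounding the 95% confidence interval and alpha/2 = 0.025 in each tail. The sample mean of 4.075 falls at Z = 0.68, with a p-value of 0.2483.]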

When the p-value is small, it tells us the probability of coming up with that sample statistic if the null hypothesis were true is small. If the null hypothesis were true, it would be unusual to draw a sample where the p-value is very low.

**Concept: Sample**

sizes and Hypothesis

Tests

As with confidence intervals, changing the sample size (n) has an impact on the hypothesis tests.

Why does the sample size impact the hypothesis test? How does it impact the hypothesis test?

As the sample size goes up, the

standard error goes down.

As the standard error goes down,

the value of the test statistic goes up.

As the value of the test statistic goes up,

the p-value gets smaller.

As the p-value gets smaller, it becomes

more likely that the p-value will be

smaller than alpha.

So as the sample size increases, it

becomes easier to reject the null

hypothesis.
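The chain of reasoning above can be sketched by holding the observed difference fixed and letting n grow. The difference (0.075) and standard deviation (1.512) are from the political ideology problem. The sample sizes of 1,000 and 5,000 are hypothetical, as is the function name.

```python
from math import sqrt
from statistics import NormalDist


def z_and_tail(difference: float, sd: float, n: int):
    """Test statistic and upper-tail area for a fixed observed difference."""
    se = sd / sqrt(n)
    z = difference / se
    return z, 1.0 - NormalDist().cdf(z)


results = [z_and_tail(0.075, 1.512, n) for n in (186, 1000, 5000)]
for n, (z, tail) in zip((186, 1000, 5000), results):
    print(f"n={n}: z = {z:.2f}, tail area = {tail:.4f}")
```

The same 0.075 gap looks unremarkable at n = 186 but very unusual at n = 5,000.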

**Concept: Statistical**

Significance vs.

Substantive Significance

**Concept: Tests**

for Proportions

(rather than Means)

**Concept:**

t-distribution

**Hypothesis**

Testing Review

1. This is about the null hypothesis. Are you able to reject the null hypothesis, or do you fail to reject the null hypothesis? Failure to reject the null does not mean that you can accept the research hypothesis.

2. In running our hypothesis test, we are asking how unusual it would be to find our test statistic if the null hypothesis were true. The p-value is the probability of finding our sample statistic if the null hypothesis were true.

3. To determine our rejection decision, we can do one of two things.

First, we can compare the p-value to alpha. If the

p-value is less than alpha, we reject the null.

Second, we can compare the test statistic to the critical value. If the test statistic is greater than

the critical value, we can reject the null.

4. As the sample size increases, it becomes

easier to reject the null hypothesis.

A random sample of 1,200 Floridians was asked whether they supported a new plan to raise taxes as a way of avoiding service reductions. 52% of Floridians sampled favored this plan. Can we say with confidence that this number is different from 50%?

For the standard error, we calculate the standard error for the distribution around the null hypothesis.

standard error = 0.0144

test statistic = 1.39

alpha = 0.05

critical value = 1.96

p-value = 0.0823

because p-value > alpha,

we cannot reject the null.
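A sketch of the same test in code (variable names are assumptions). The standard error is built from the null proportion of 0.50, as the slide describes. Without rounding, the tail area comes out near 0.083, in line with the table value above.

```python
from math import sqrt
from statistics import NormalDist

p_hat, p_null, n = 0.52, 0.50, 1200

se = sqrt(p_null * (1 - p_null) / n)     # standard error under the null
z = (p_hat - p_null) / se                # test statistic
tail_area = 1.0 - NormalDist().cdf(z)    # area beyond z in one tail
reject = abs(z) > 1.96                   # two-tailed critical value

print(f"SE = {se:.4f}, z = {z:.2f}, tail area = {tail_area:.4f}")
print("Reject the null?", reject)
```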

When looking at proportions, rather than means,

the only major difference is in our estimate of the

standard error.

Remember: Hypothesis testing is all about whether or not we can reject the null hypothesis. We are asking how unusual it would be to find our sample statistic if the null hypothesis were true. Because we can never be 100% certain, we are always accepting some room for error.

One-tail and two-tailed tests, then,

are all about the distribution of alpha -

or the amount of error we're willing

to accept.

What is the relationship of the

p-value to the critical value?

Imagine: Alpha = 0.05

Test Statistic = 1.92

Imagine: Alpha = 0.05

Test Statistic = 1.24

We use the t-distribution when the sample size is small (n<121). For larger samples, the values from the t-distribution are essentially equal to the z-scores that we have used throughout the semester.

You will read about degrees of freedom throughout these sections. At the moment, it is sufficient to know that you find the degrees of freedom by subtracting 1 from the sample size.

df = n-1

The chart for the t-distribution

gives you critical values! Therefore,

it's important to understand the interplay

of critical values and p-values, and

know how to use critical values to reject

the null hypothesis.

Two-tailed

Alpha = 0.05

n = 50

test statistic = 2.44

Two-tailed

Alpha = 0.05

n=70 (always round down,

to be conservative)

test statistic = 1.52

1. At the beginning of the semester, we differentiated between types of variables. What are the different types of variables - or levels of measurement - and why is it important to distinguish between them?

2. We often encounter measurement error when measuring social phenomena. Define measurement error, and give an example of the source of that error.

3. What is the difference between descriptive statistics and inferential statistics? Give an example of each.

4. As part of descriptive statistics, we construct frequency tables. What do frequency tables tell us about our data?

5. What's an outlier? How does it influence different statistical measures we've talked about this year?

6. When we say that an observation scored at a particular percentile (or percentile rank), what do we mean?

7. What are the three measures of central tendency? Give an example where it might be useful to use each one.

8. What is a standardized score? What does it mean if your standardized score is 0? What does it mean if your standardized score is negative? What does it mean if your standardized score is above 1?

9. Describe the normal distribution. How did our discussions of probability factor into our understanding of the normal curve?

10. Name different ways that social scientists visually display quantitative data. Provide an example where each visual display might be useful.

11. Why do researchers draw samples from the population? Why do we typically prefer probability samples, and what are the characteristics of this kind of sample?

12. Give a reason why researchers should be concerned about the size of their sample (sample size, n).

13. What's the difference between a sample statistic and a population parameter?

14. Define the sampling distribution, both in theoretical terms and in its practical significance to the study of statistics.

15. When does sampling error occur? Should we be concerned about sampling error?

16. What's the difference between the standard deviation and the standard error?

Example: Measuring Poverty

in the United States

How do we measure poverty?

What type of variable is it?

Why might poverty be prone to measurement error?

What are alternative ways of measuring poverty?

What does this mean?

When we fail to draw a random sample, we end up with sampling bias. A biased sample means that some parts of the population are over-represented, but other parts are under-represented. This limits the external validity of our research findings, or our ability to generalize the findings from our sample to the population.

17. How does the standard error change when the sample size changes?

18. Explain the concept of a confidence interval. How does the confidence interval change as the sample size increases? How does the confidence interval change as the level of confidence we demand increases?

19. What's the margin of error for a confidence interval?

20. Why don't researchers ever accept the null hypothesis? (And what would be an appropriate penalty if you accepted a null hypothesis?)

22. How do you know if you're supposed to use a one-tail test or a two-tail test? Give an example of each.

To observe patterns or relationships among variables

1. What is the relationship between how high students rate a course and how easy they think it is?

2. What is the relationship between the number of crimes committed in a neighborhood and the number of bars and restaurants?

3. What is the relationship between whether or not a basketball player made his last free-throw and whether or not he makes his next one?

4. What is the relationship between your parents' income and your future income? (Do you think this varies depending on your race, where you grew up, or the year you were born?)

To predict future events ...

1. Can we use statistics to quantify the likelihood of a political candidate winning, or a basketball team winning?

2. Can we use statistics to quantify the likelihood that it will be sunny, or that it will snow?

Comparison: Bar Chart vs. Pie Chart

**Concept: Odds**

The odds refer to the probability that an event will occur relative to the probability that the event will not occur.

If there are twenty women in the class and ten men, the odds of randomly selecting a woman are 2:1.

If I randomly pick a day of the week, the odds that I pick Tuesday are 1:6 (equivalently, the odds against picking Tuesday are 6:1).
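The two examples above can be checked with a small sketch. The helper names `odds_in_favor` and `probability_from_odds` are illustrative assumptions, not part of the course materials; the sketch counts favorable outcomes against unfavorable ones, following the definition of odds given above.

```python
from fractions import Fraction


def odds_in_favor(favorable, unfavorable):
    """Return the odds in favor of an event as a reduced (for, against) pair."""
    ratio = Fraction(favorable, unfavorable)
    return ratio.numerator, ratio.denominator


def probability_from_odds(in_favor, against):
    """Convert odds in favor (for:against) back into a probability."""
    return Fraction(in_favor, in_favor + against)


# Twenty women and ten men: 20 favorable outcomes against 10 unfavorable.
women_odds = odds_in_favor(20, 10)

# One Tuesday against six other days of the week.
tuesday_odds = odds_in_favor(1, 6)

print("Odds of selecting a woman: %d:%d" % women_odds)
print("Probability of selecting a woman:", probability_from_odds(*women_odds))
print("Odds of picking Tuesday: %d:%d" % tuesday_odds)
print("Probability of picking Tuesday:", probability_from_odds(*tuesday_odds))
```

Note that odds and probability are different quantities: odds of 2:1 correspond to a probability of 2/3, not 2.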

Question 1: What is the probability that the ball will land on black? What are the odds that the ball will land on black?

Question 2: What is the probability that the ball will land on 21? What are the odds that the ball will land on 21?

Question 3: What is the probability that the ball will land on 28 or 31? What are the odds that the ball will land on 28 or 31?

Question: How likely is a score to fall +/- 1 standard deviation from the mean?

Question: What is the percentile ranking for someone with a score 1 standard deviation above the mean?

Question: How likely is a score to fall between 2 and 2.5 standard deviations above the mean?

Question: If you scored 1.4 standard deviations above the mean, what is your percentile ranking?
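One way to check answers to the four questions above, instead of reading a z-table, is a short sketch using Python's standard library. It assumes scores are normally distributed and treats "percentile ranking" as the share of the distribution at or below the score; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist()

# Share of scores within 1 standard deviation of the mean.
within_one_sd = z.cdf(1) - z.cdf(-1)

# Percentile ranking for a score 1 standard deviation above the mean.
percentile_at_plus_one = z.cdf(1)

# Share of scores between 2 and 2.5 standard deviations above the mean.
between_2_and_2_5 = z.cdf(2.5) - z.cdf(2)

# Percentile ranking for a score 1.4 standard deviations above the mean.
percentile_at_1_4 = z.cdf(1.4)

print(f"Within +/- 1 SD:          {within_one_sd:.4f}")
print(f"Percentile at z = 1:      {percentile_at_plus_one:.4f}")
print(f"Between z = 2 and 2.5:    {between_2_and_2_5:.4f}")
print(f"Percentile at z = 1.4:    {percentile_at_1_4:.4f}")
```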

Question: Sylvia has an IQ of 95. Herman has an IQ of 101. Assuming that IQ scores are normally distributed, with a mean of 100 and a standard deviation of 15, what percentage of the population has an IQ score between Sylvia and Herman?

P(-0.333 < z < 0.067) = P(-0.333 < z < 0) + P(0 < z < 0.067) = .1293 + .0279 = .1572
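The same calculation can be sketched in code, using the mean of 100 and standard deviation of 15 stated in the question. The variable names are illustrative. Because the code does not round the z-scores to two decimals the way a printed z-table does, its result can differ from the table-based answer in the fourth decimal place.

```python
from statistics import NormalDist

# IQ scores, assumed normal with mean 100 and standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# z-scores for Sylvia (95) and Herman (101).
z_sylvia = (95 - iq.mean) / iq.stdev
z_herman = (101 - iq.mean) / iq.stdev

# Share of the population with an IQ between the two scores.
share_between = iq.cdf(101) - iq.cdf(95)

print(f"z for Sylvia: {z_sylvia:.3f}")
print(f"z for Herman: {z_herman:.3f}")
print(f"Share between 95 and 101: {share_between:.4f}")
```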

Imagine I draw six samples from the same population.

The population mean is 100.

Sample mean #1 = 95.125

Sample mean #2 = 98.375

Sample mean #3 = 104.75

Sample mean #4 = 101.125

Sample mean #5 = 97.25

Sample mean #6 = 97.75
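To see the sampling error in each of the six samples above, a brief sketch can subtract the population mean from each sample mean. The variable names are illustrative, and the numbers are the ones listed above.

```python
from statistics import mean

population_mean = 100
sample_means = [95.125, 98.375, 104.75, 101.125, 97.25, 97.75]

# Sampling error: how far each sample mean lands from the population mean.
sampling_errors = [m - population_mean for m in sample_means]

# The average of the sample means; with only six samples it need not equal 100.
mean_of_sample_means = mean(sample_means)

for number, (m, error) in enumerate(zip(sample_means, sampling_errors), start=1):
    print(f"Sample #{number}: mean = {m:7.3f}, sampling error = {error:+.3f}")
print(f"Mean of the six sample means: {mean_of_sample_means:.4f}")
```

None of the six sample means equals the population mean exactly, and the errors fall on both sides of it.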

**Concept: Confidence vs. Precision**

Confidence: As my level of confidence goes up, I have to accept a wider margin of error.

Precision: The narrower the range of values included in my confidence interval, the more precise my estimate.

There is a trade-off between confidence and precision.

As the confidence level goes up, the precision goes down.

As the confidence level goes down, the precision goes up.
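The trade-off can be illustrated with a small sketch. The standard deviation (15) and sample size (100) below are hypothetical values chosen only for illustration, and the helper `margin_of_error` is an assumed name. The sketch uses the normal (z) critical value, which is a reasonable approximation for a sample this large.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sd, n, confidence):
    """Margin of error for a mean, using the normal (z) critical value."""
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_critical * sd / sqrt(n)


# Hypothetical sample: standard deviation 15, sample size 100.
sd, n = 15, 100
margins = {level: margin_of_error(sd, n, level) for level in (0.90, 0.95, 0.99)}

for level, margin in margins.items():
    print(f"{level:.0%} confidence: margin of error = +/- {margin:.2f}")
```

With the sample held fixed, demanding more confidence widens the interval, which is the loss of precision described above.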

Example 1: On the 2006 General Social Survey (GSS), respondents were asked to identify their political ideology on a scale of 1-7, with 1 being extremely liberal, 4 being moderate, and 7 being extremely conservative. Researchers want to know whether Americans lean toward being liberal or being conservative. To do so, they will test whether the population response is significantly different from 4 - the position of being "moderate, middle of the road." A score above 4 shows a propensity toward conservatism. A score below 4 shows a propensity toward liberalism. The mean score for their sample is 4.075 and the standard deviation is 1.512.

First, assume the sample is 186 people.

Second, assume the sample is 500 people.

Third, assume the sample is 3,020 people.

In each case, how does your test statistic change? How does your p-value change? How does your rejection decision change?
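A sketch of the three cases follows, using the sample mean (4.075), standard deviation (1.512), and null value (4) from the example. Two things are assumptions, not part of the example: the significance level of 0.05, and the use of the normal distribution to approximate the two-tailed p-value (with samples this large, the t distribution gives nearly the same answer). The function name `one_sample_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, null_value, sample_sd = 4.075, 4, 1.512
alpha = 0.05  # assumed significance level; the example does not specify one


def one_sample_test(n):
    """Return (test statistic, two-tailed p-value, reject?) for sample size n."""
    standard_error = sample_sd / sqrt(n)
    test_statistic = (sample_mean - null_value) / standard_error
    # Normal approximation to the two-tailed p-value.
    p_value = 2 * (1 - NormalDist().cdf(abs(test_statistic)))
    return test_statistic, p_value, p_value < alpha


results = {n: one_sample_test(n) for n in (186, 500, 3020)}

for n, (statistic, p_value, reject) in results.items():
    decision = "reject" if reject else "fail to reject"
    print(f"n = {n:5d}: test statistic = {statistic:.3f}, "
          f"p = {p_value:.4f}, {decision} the null hypothesis")
```

The sample mean and standard deviation never change across the three cases; only the standard error shrinks as n grows, which is what moves the test statistic and the p-value.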