**Unit 4: Regression & Correlation**

Ex 5 - How can drawing a cloud around a data set help identify if there are any outliers?

Draw the "Tic-Tac-Toe" board and label where each type of outlier (x-only, y-only, x & y, and x and y when considered jointly) would be located.

**U4L1I2: Clouds**

**What is different about these three scatter plots?**

U4L2I: Sum of Squared Errors (Residuals)

U4L2I1: Line of Regression (SSE), Residuals, Error in Prediction

**U4L1I1: Correlation**

**Correlation**

Ex 1: Classify each scatter plot below as Strong or Weak, Positive or Negative, then put the scatter plots in order from largest positive r to smallest negative r.
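To check a ranking numerically, the correlation coefficient r can be computed directly from the points. The sketch below is one way to do that; the function name `pearson_r` and all three data sets are made up for illustration and are not the scatter plots from this exercise.

```python
import math

def pearson_r(xs, ys):
    """Return the sample correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (not from the worksheet)
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # every point sits on a rising line
falling = [10, 8, 6, 4, 2]   # every point sits on a falling line
scattered = [3, 1, 4, 1, 5]  # loosely rising, widely scattered

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
r_scattered = pearson_r(xs, scattered)
print(r_rising, r_falling, r_scattered)
```

The sign of r matches the direction of the trend, and the closer the points hug a line, the closer r gets to 1 or -1.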

**Cool Correlation website**

http://www.shodor.org/interactivate/activities/Regression/


[Scatter plots A through J]

Ex 1 - r values

Ex 2 - The line of best fit is drawn for each data set below. Predict the r-value, then explain how the line of best fit helped in your rankings.

G,F,E,D,C,B,A, J,I,H

[Figure: scatter plots A through J, with the labels "Positive", "Negative", and "STRONG"]

Ex 3 - Rewrite your r-value rankings from Ex 1, then draw a cloud around each of the scatter plots below. How does the shape of the cloud relate to the correlation coefficient of the scatter plot?

Ex 4 - Draw the cloud surrounding each scatter plot, then classify the scatter plot as strong or weak and as linear or non-linear.

[Scatter plots A through F]

Ex 5 - Classify each scatter plot below as linear or non-linear and as varying or non-varying. Then identify the strongest linear and the strongest non-linear relationship.

[Scatter plots A through G]

Ex 6 - Draw a cloud and the tic-tac-toe board for the data set below. Add an outlier that is...

a. An x-only outlier, label it A.

b. A y-only outlier, label it B.

c. An x & y outlier, label it C.

d. An x and y outlier when considered jointly, label it D.

Ex 7 - Examine the data sets below. Identify and classify all outliers.

**U4L1I2 - Outliers**

Teaching Example

Plot each set of points on the graph, then use a ruler to draw your "line of best fit".

Ex 8 - Finding the Least-Squares Regression Line

Find the least-squares regression line for each data set. Then determine the predicted y-value for each point, draw the regression line, and calculate the residuals.
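The steps in this exercise (fit the line, predict y for each x, subtract to get residuals) can be sketched in a few lines of code. This is only an illustration: the helper `least_squares_line` and the five data points are hypothetical and are not one of the worksheet's data sets.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data (not from the worksheet)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)

# Predicted y-value for each x, using y-hat = intercept + slope * x
predicted = [intercept + slope * x for x in xs]

# Residual = observed y minus predicted y
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print("slope:", slope, "intercept:", intercept)
for x, y, y_hat, res in zip(xs, ys, predicted, residuals):
    print(x, y, round(y_hat, 2), round(res, 2))
```

A positive residual means the point lies above the line; a negative residual means it lies below.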

Ex. 9

What is the sum of the residuals for each data set?

Regression Intro

EX. 10 - Identifying Positive and Negative Residuals

A. Which points have a positive residual?

B. Which points have a negative residual?

C. Which point has the largest positive residual?

D. Which point has the largest negative residual?

E. How did you determine which point has the largest residual without actually calculating a residual?

G. In general, where are the points with a positive residual located with respect to the regression line?

EX. 11 - Which line is the regression line? Explain.

A

B

Ex. 12 - Andrea's Stats

Andrea plays forward on her HS basketball team and she has kept track of her points per game in the table below.

a. Find a regression line and equation for Andrea's points per game. Draw the regression line on the graph.

b. What conclusions can you draw about Andrea's basketball skills from the slope of the regression line?

c. Calculate the residual for each point in the data table. Circle the point with the largest positive and largest negative residual, then draw a line that represents the residual on the graph.

d. How many points do you think Andrea will score in her 11th game?

Ex 13 - High Temps in November

A local weatherman has recorded the daily high temperature for the first 15 days in November, in the table and graph shown.

a. Find a regression line and equation for temperatures in November. Draw the regression line on the graph.

b. What conclusions can you draw about the high temperatures in November from the slope of the regression line?

c. Calculate the residual for each point in the data table. Circle the point with the largest positive and largest negative residual, then draw a line that represents the residual on the graph.

d. Based on the data, predict the daily high on November 19.

e. When do you think the temperature will be below freezing (freezing is 32°F)?

Making a Regression Equation with a Calculator

SSE

Finding SSE with the Calculator.

Ex 14a - Find the sum of squared errors (SSE) for the data set below.

Ex 14b - Find the sum of squared errors (SSE) for the data set below.
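SSE is found by squaring each residual and adding the squares. The sketch below shows that calculation for a hypothetical data set and two candidate lines; the function name `sse`, the points, and both lines are assumptions made for illustration, not values from the exercises.

```python
def sse(xs, ys, slope, intercept):
    """Sum of squared errors of the line y = intercept + slope * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        residual = y - (intercept + slope * x)  # observed minus predicted
        total += residual ** 2
    return total

# Hypothetical data (not from the worksheet)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# The least-squares line for these points is y = 2.2 + 0.6x
sse_regression = sse(xs, ys, slope=0.6, intercept=2.2)

# A line drawn slightly lower, for comparison
sse_other = sse(xs, ys, slope=0.6, intercept=2.0)

print(sse_regression, sse_other)
```

For these points the regression line gives the smaller SSE, which is what "least squares" refers to.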

U4L2I2: Influential Points

Ex. 15

Go into L1 and L2, delete the point (20, 29), and then recalculate the regression equation and correlation coefficient.

Y=________________________________________ r= ______

What effect did removing this point have on the regression equation?

Draw the line of best fit on the scatter plot in a different color. (What two points did you use?)

Would you consider (20,29) an influential point? (Did the regression equation and r value change after the point was removed?)

Ex. 16

First re-enter (20, 29) into L1 and L2, then remove the point (12.9, 24.1) from L1 and L2.

Re-calculate the regression equation and correlation coefficient.

Y=________________________________________ r= ______

What effect did removing this point have on the regression equation?

Draw the line of best fit on the scatter plot in a different color. (What two points did you use?)

Would you consider (12.9, 24.1) an influential point?

Which point was more influential, (12.9, 24.1) or (20, 29)? Explain.
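The delete-and-refit procedure used in Ex. 15 and Ex. 16 can also be tried in code. The sketch below is a rough illustration with made-up points (the worksheet's full data set is not reproduced here); the helpers `fit`, `without`, and `slope_change` are hypothetical names.

```python
import math

def fit(points):
    """Return (slope, intercept, r) for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)

def without(points, point):
    """Copy of the data with one point deleted."""
    return [p for p in points if p != point]

def slope_change(points, point):
    """How much the slope moves when `point` is removed."""
    return abs(fit(without(points, point))[0] - fit(points)[0])

# Hypothetical data: five points in a rising cloud plus one far to the right
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6), (15, 3)]

print("all points:     ", fit(points))
print("without (15, 3):", fit(without(points, (15, 3))))
print("without (3, 5): ", fit(without(points, (3, 5))))

change_far = slope_change(points, (15, 3))
change_mid = slope_change(points, (3, 5))
```

Comparing the slope and r before and after each deletion shows which point the line depends on more heavily.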

Ex 17

Is a point influential?

Finding SSE without a Calculator.

What is the SSE?

What is the average length and width of the spider monkeys?

*Does this point lie on the line of best fit?

*Your friend has a pet spider monkey that is 13.8 inches long and weighs 28 pounds. What is the error in prediction for your regression equation?
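The error in prediction is the observed value minus the value the regression equation predicts. The sketch below shows the arithmetic only. It assumes length is the x-variable and weight is the y-variable, and the slope and intercept are placeholder values, not the regression equation for the spider monkey data; substitute your own equation.

```python
def error_in_prediction(x, y_observed, slope, intercept):
    """Observed y minus the y predicted by y-hat = intercept + slope * x."""
    y_predicted = intercept + slope * x
    return y_observed - y_predicted

# Placeholder regression equation (hypothetical, NOT the worksheet's answer)
slope = 2.0
intercept = 1.0

# Observation from the question: 13.8 inches long, 28 pounds
error = error_in_prediction(13.8, 28, slope, intercept)
print(round(error, 2))
```

A negative error means the equation predicted a value higher than the one observed; a positive error means it predicted lower.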