Copy of 13 Halloween Traditions Explained
2.5 OLS Fitted Values and Residuals
II. Mechanics and Interpretation of Ordinary Least Squares
I. Motivation for Multiple Regression
1.1 The Model with Two Independent Variables
2.3 On the Meaning of "Holding Other Factors Fixed" in Multiple Regression
2.1 Obtaining the OLS Estimates
Multiple Regression Analysis: Estimation
Multiple regression analysis
$\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2})^2$ (3.10)
I. Motivation for Multiple Regression
II. Mechanics and Interpretation of Ordinary Least Squares
III. The Expected Value of the OLS Estimators
IV. The Variance of the OLS Estimators
V. Efficiency of OLS: The Gauss-Markov Theorem
The multiple regression model is still the most widely used vehicle for empirical analysis in economics and other social sciences. Likewise, the method of ordinary least squares is popularly used for estimating the parameters of the multiple regression model.
We now summarize some computational and algebraic features of the method of ordinary least squares as it applies to a particular set of data. We also discuss how to interpret the estimated equation.
We first consider estimating the model with two independent variables. The estimated OLS equation is written in a form similar to the simple regression case:
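$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2,$
where $\hat{\beta}_0$ = the estimate of $\beta_0$,
$\hat{\beta}_1$ = the estimate of $\beta_1$,
$\hat{\beta}_2$ = the estimate of $\beta_2$.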
To understand what OLS is doing, it is important to master the meaning of the indexing of the independent variables in (3.10). The independent variables have two subscripts here, i followed by either 1 or 2. The i subscript refers to the observation number, so the sum in (3.10) is over all i = 1 to n observations.
The second index is simply a method of distinguishing between the different independent variables.
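The indexing can be made concrete with a minimal sketch, assuming a small made-up data set and NumPy (neither is part of the original slides). Row i of the design matrix holds the values for observation i, and the columns play the role of the second subscript. The estimates are whatever coefficients minimize the sum in (3.10).

```python
import numpy as np

# Hypothetical data: six observations on y and two independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])

# Row i of X is (1, x_i1, x_i2); the column of ones carries the intercept.
X = np.column_stack([np.ones_like(x1), x1, x2])


def sum_squared_residuals(b):
    """Sum over i = 1..n of (y_i - b0 - b1*x_i1 - b2*x_i2)^2."""
    residuals = y - X @ b
    return float(residuals @ residuals)


# np.linalg.lstsq returns the coefficient vector that minimizes that sum.
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print("estimates (b0, b1, b2):", b_hat)
print("minimized sum of squares:", sum_squared_residuals(b_hat))
```

Moving any coefficient away from `b_hat` can only raise the value returned by `sum_squared_residuals`, which is the sense in which these estimates are "least squares".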
The partial effect interpretation of slope coefficients in multiple regression analysis can cause some confusion, so we provide a further discussion now.
The power of multiple regression analysis is that it allows us to do in nonexperimental environments what natural scientists are able to do in a controlled laboratory setting: keep other factors fixed.
After obtaining the OLS regression line (3.11), we can obtain a fitted or predicted value for each observation. For observation i, the fitted value is simply
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} + \dots + \hat{\beta}_k x_{ik},$
which is just the predicted value obtained by plugging the values of the independent variables for observation i into equation (3.11). We should not forget about the intercept in obtaining the fitted values; otherwise the answer can be very misleading.
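A short sketch of fitted values and residuals follows, assuming hypothetical data and NumPy. It also shows what goes wrong when the intercept is left out of the prediction: every "fitted" value is off by exactly the estimated intercept.

```python
import numpy as np

# Hypothetical sample; i indexes observations as in the text.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])

X = np.column_stack([np.ones_like(x1), x1, x2])
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Fitted value for each i: b0 + b1*x_i1 + b2*x_i2 (intercept included).
y_fitted = X @ b_hat

# Residual for each i: actual value minus fitted value.
residuals = y - y_fitted

# Dropping the intercept shifts every prediction by b0.
y_without_intercept = X[:, 1:] @ b_hat[1:]

print("fitted values:", y_fitted)
print("residuals sum to (approximately) zero:", residuals.sum())
print("error from forgetting the intercept:", y_fitted - y_without_intercept)
```

Because the regression includes an intercept, the residuals sum to zero up to floating-point error.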
Sometimes, we want to change more than one independent variable at the same time to find the resulting effect on the dependent variable.
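For the two-variable estimated equation, the algebra can be sketched as follows. The intercept drops out when we take differences, so the predicted change in y is the sum of each slope times the change in its own variable. The $\Delta$ notation is an assumption here; it does not appear in the slides.

```latex
\Delta\hat{y} = \hat{\beta}_1\,\Delta x_1 + \hat{\beta}_2\,\Delta x_2
```

Setting $\Delta x_2 = 0$ recovers the usual partial-effect reading of the first slope, $\Delta\hat{y} = \hat{\beta}_1\,\Delta x_1$.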
We begin with the case of two independent variables:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
More important than the details underlying the computation of the $\hat{\beta}_j$ is the interpretation of the estimated equation.
As with simple regression, we can define the total sum of squares (SST), the explained sum of squares (SSE), and the residual sum of squares or sum of squared residuals (SSR).
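The slide does not reproduce the definitions. The standard ones are given below, where $\bar{y}$ is the sample average of the $y_i$ and $\hat{u}_i = y_i - \hat{y}_i$ is the OLS residual; both symbols are assumptions introduced here.

```latex
\begin{aligned}
\text{SST} &= \sum_{i=1}^{n} (y_i - \bar{y})^2, \\
\text{SSE} &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \\
\text{SSR} &= \sum_{i=1}^{n} \hat{u}_i^2, \\
\text{SST} &= \text{SSE} + \text{SSR}, \qquad
R^2 = \frac{\text{SSE}}{\text{SST}} = 1 - \frac{\text{SSR}}{\text{SST}}.
\end{aligned}
```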
The fact that $R^2$ never decreases when any variable is added to a regression makes it a poor tool for deciding whether one variable or several variables should be added to the model. The factor that should determine whether an explanatory variable belongs in a model is whether the explanatory variable has a nonzero partial effect on y in the population.
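To illustrate, the sketch below uses simulated data (the seed, sample size, and coefficients are arbitrary assumptions) and adds a regressor that is pure noise. $R^2$ still does not fall, even though the added variable has no partial effect on y by construction.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
n = 50
x1 = rng.normal(size=n)
noise_regressor = rng.normal(size=n)  # unrelated to y by construction
y = 1.0 + 2.0 * x1 + rng.normal(size=n)


def r_squared(X, y):
    """R^2 = 1 - SSR/SST for an OLS fit of y on the columns of X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ b) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - ssr / sst


ones = np.ones(n)
r2_small = r_squared(np.column_stack([ones, x1]), y)
r2_large = r_squared(np.column_stack([ones, x1, noise_regressor]), y)

print("R^2 with x1 only:          ", r2_small)
print("R^2 with x1 and pure noise:", r2_large)
```

The second figure is at least as large as the first for any data set, so a rise in $R^2$ alone says nothing about whether the new variable belongs in the model.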
2.4 Changing More than One Independent Variable Simultaneously
Professor Dr Soobong Uh
By: Miss Photchamane PHENGRATTANAVONH (Jean)
Multiple regression analysis is more amenable to ceteris paribus analysis because it allows us to explicitly control for many other factors that simultaneously affect the dependent variable. This is important both for testing economic theories and for evaluating policy effects when we must rely on nonexperimental data. Because multiple regression models can accommodate many explanatory variables that may be correlated, we can hope to infer causality in cases where simple regression analysis would be misleading.
The first example is a simple variation of the wage equation introduced in chapter 2 for obtaining the effect of education on hourly wage:
$\mathit{wage} = \beta_0 + \beta_1\,\mathit{edu} + \beta_2\,\mathit{exper} + u$
As a second example, consider the problem of explaining the effect of per-student spending on the average standardized test score at the high school level. Suppose that the average test score depends on funding, average family income, and other unobservables:
$\mathit{avgscore} = \beta_0 + \beta_1\,\mathit{expend} + \beta_2\,\mathit{avginc} + u$
1.2 The Model with k Independent Variables
The general multiple linear regression model can be written in the population as:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + u,$
where $\beta_0$ is the intercept,
$\beta_1$ is the parameter associated with $x_1$,
$\beta_2$ is the parameter associated with $x_2$,
and so on.
Terminology for Multiple Regression
No matter how many explanatory variables we include in our model, there will always be factors we cannot include, and these are collectively contained in $u$.
2.9 Regression through the Origin
Sometimes, an economic theory or common sense suggests that $\beta_0$ should be zero, and so we should briefly mention OLS estimation when the intercept is zero.
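A minimal sketch of estimation through the origin follows, assuming hypothetical data and NumPy. The only change from the usual setup is that the design matrix has no column of ones, so the intercept is fixed at zero rather than estimated.

```python
import numpy as np

# Hypothetical data with two independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])

# Regression through the origin: no column of ones in the design matrix.
X_origin = np.column_stack([x1, x2])
b_origin, *_ = np.linalg.lstsq(X_origin, y, rcond=None)
residuals_origin = y - X_origin @ b_origin
ssr_origin = float(residuals_origin @ residuals_origin)

# For comparison: the same data with an estimated intercept.
X_full = np.column_stack([np.ones_like(x1), x1, x2])
b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
residuals_full = y - X_full @ b_full
ssr_full = float(residuals_full @ residuals_full)

print("slopes through the origin:", b_origin)
print("SSR through the origin:", ssr_origin, " SSR with intercept:", ssr_full)
```

Forcing the intercept to zero can never lower the sum of squared residuals relative to the fit with an intercept. Without an intercept the residuals also need not sum to zero.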
Assumption MLR.3 (No Perfect Collinearity)
3.3 Omitted Variable Bias: More General Cases
Deriving the sign of omitted variable bias when there are multiple regressors in the estimated model is more difficult. We must remember that correlation between a single explanatory variable and the error generally results in all OLS estimators being biased.
IV. The Variance of the OLS Estimators
The Linear Relationships among the Independent Variables, $R_j^2$
Multiple regression analysis is also useful for generalizing functional relationships between variables.
Once we are in the context of multiple regression, there is no need to stop with two independent variables. Multiple regression analysis allows many observed factors to affect y. In the wage example, we might also include amount of job training, years of tenure with the current employer, measures of ability, and even demographic variables like the number of siblings or mother's education. In the school funding example, additional variables might include measures of teacher quality and school size.
1.2 The Model with k Independent Variables
The terminology for multiple regression is similar to that for simple regression and is given in Table 3.1. Just as in simple regression, the variable $u$ is the error term or disturbance. It contains factors other than $x_1, x_2, \dots, x_k$ that affect $y$.
2.6 A "Partialling Out" Interpretation of Multiple Regression
When applying OLS, we do not need to know explicit formulas for the $\hat{\beta}_j$ that solve the system of equations in (3.13). Nevertheless, for certain derivations, we do need explicit formulas for the $\hat{\beta}_j$. These formulas also shed further light on the workings of OLS.
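The "partialling out" idea can be sketched numerically. This example assumes simulated data and the standard two-step result, which the slides do not spell out: the slope on $x_1$ in the multiple regression equals $\sum_i \hat{r}_{i1} y_i / \sum_i \hat{r}_{i1}^2$, where $\hat{r}_{i1}$ are the residuals from regressing $x_1$ on $x_2$. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for repeatability
n = 100
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)  # x1 and x2 are correlated
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)
ones = np.ones(n)

# Multiple regression of y on x1 and x2 (with intercept).
b_full, *_ = np.linalg.lstsq(np.column_stack([ones, x1, x2]), y, rcond=None)

# Step 1: regress x1 on x2 and keep the residuals, i.e. the part of x1
# that is uncorrelated with x2 in the sample.
Z = np.column_stack([ones, x2])
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r1 = x1 - Z @ g

# Step 2: the slope on x1 is sum(r1 * y) / sum(r1^2).
b1_partial = (r1 @ y) / (r1 @ r1)

print("slope on x1 from the multiple regression:", b_full[1])
print("slope on x1 from partialling out x2:     ", b1_partial)
```

The two numbers agree up to floating-point error, which is the sense in which the slope on $x_1$ measures its effect after $x_2$ has been netted out.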
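Two special cases exist in which the simple regression of $y$ on $x_1$ will produce the same OLS estimate on $x_1$ as the regression of $y$ on $x_1$ and $x_2$.
2.7 Comparison of Simple and Multiple Regression Estimates
1. Simple regression of $y$ on $x_1$: $\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1$
2. Multiple regression of $y$ on $x_1$ and $x_2$: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$
Comparison between simple and multiple regression:
$\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1,$
where $\tilde{\delta}_1$ is the slope coefficient from the simple regression of $x_2$ on $x_1$.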
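The standard relationship between the two slopes is $\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$, with $\tilde{\delta}_1$ the slope from regressing $x_2$ on $x_1$. It holds exactly in any sample, and the sketch below checks it on simulated data. The data-generating numbers are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed for repeatability
n = 100
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
ones = np.ones(n)


def ols(X, target):
    """OLS coefficients from regressing target on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef


b_tilde = ols(np.column_stack([ones, x1]), y)      # simple: y on x1
b_hat = ols(np.column_stack([ones, x1, x2]), y)    # multiple: y on x1, x2
delta_tilde = ols(np.column_stack([ones, x1]), x2) # auxiliary: x2 on x1

lhs = b_tilde[1]
rhs = b_hat[1] + b_hat[2] * delta_tilde[1]
print("simple-regression slope:        ", lhs)
print("b1_hat + b2_hat * delta1_tilde: ", rhs)
```

The identity also makes the two special cases visible: the slopes coincide when the estimated partial effect of $x_2$ is zero, or when $x_1$ and $x_2$ are uncorrelated in the sample so that $\tilde{\delta}_1 = 0$.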
III. The Expected Value of the OLS Estimators
We now turn to the statistical properties of OLS for estimating the parameters in an underlying population model. In this section, we derive the expected value of the OLS estimators.
In particular, we state and discuss four assumptions, which are direct extensions of the simple regression model assumptions, under which the OLS estimators are unbiased for the population parameters. We also explicitly obtain the bias in OLS when an important variable has been omitted from the regression.
The first assumption we make simply defines the multiple linear regression (MLR) model.
Assumption MLR.1 (Linear in Parameters)
The model in the population can be written as
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u,$
where $\beta_0, \beta_1, \dots, \beta_k$ are the unknown parameters of interest and $u$ is an unobservable random error or disturbance term.
Assumption MLR.2 (Random Sampling)
We have a random sample of n observations,
$\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i): i = 1, 2, \dots, n\}$, following the population model in Assumption MLR.1.
In the sample, none of the independent variables is constant, and there are no exact linear relationships among the independent variables.
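One way to check this condition in a given sample is to test whether the design matrix, including the column of ones, has full column rank. The sketch below assumes NumPy; the helper name `has_perfect_collinearity` and the data are hypothetical.

```python
import numpy as np


def has_perfect_collinearity(X):
    """Return True if the columns of X are exactly linearly dependent."""
    X = np.asarray(X, dtype=float)
    return bool(np.linalg.matrix_rank(X) < X.shape[1])


ones = np.ones(5)
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
x3 = 2.0 * x1 + x2             # exact linear function of x1 and x2
x_const = np.full(5, 3.0)      # a regressor that never varies

X_ok = np.column_stack([ones, x1, x2])
X_linear = np.column_stack([ones, x1, x2, x3])
X_constant = np.column_stack([ones, x1, x_const])

print(has_perfect_collinearity(X_ok))        # correlated but not collinear
print(has_perfect_collinearity(X_linear))    # x3 violates the assumption
print(has_perfect_collinearity(X_constant))  # a constant regressor does too
```

Note that the assumption rules out only exact linear relationships. Regressors may be correlated, even highly so, without violating it.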
Assumption MLR.4 (Zero Conditional Mean)
The error $u$ has an expected value of zero given any values of the independent variables. In other words,
$E(u \mid x_1, x_2, \dots, x_k) = 0.$
The error $u$ has the same variance given any value of the explanatory variables. In other words,
$\mathrm{Var}(u \mid x_1, \dots, x_k) = \sigma^2.$
3.1 Including Irrelevant Variables in a Regression Model
One issue that we can dispense with fairly quickly is that of inclusion of an irrelevant variable, or overspecifying the model, in multiple regression analysis. This means that one of the independent variables is included in the model even though it has no partial effect on y in the population.
The multiple regression model allows us to effectively hold other factors fixed while examining the effects of a particular independent variable on the dependent variable. It explicitly allows the independent variables to be correlated.
Although the model is linear in its parameters, it can be used to model nonlinear relationships by appropriately choosing the dependent and independent variables.
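As a small illustration, a quadratic relationship between y and x can be estimated by OLS once $x^2$ is included as its own regressor, because the model remains linear in the parameters. The data below are hypothetical and noiseless so that the fit recovers the chosen coefficients.

```python
import numpy as np

# Hypothetical, noiseless data generated from y = 1 + 2x - 0.5x^2.
x = np.linspace(0.0, 9.0, 10)
y = 1.0 + 2.0 * x - 0.5 * x**2

# x and x^2 enter as two separate regressors; the model is nonlinear in x
# but still linear in the parameters (b0, b1, b2).
X = np.column_stack([np.ones_like(x), x, x**2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

print("estimated (b0, b1, b2):", b)
```

Here x and $x^2$ are related, but not linearly, so including both does not create perfect collinearity.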
The method of ordinary least squares is easily applied to estimate the multiple regression model. Each slope estimate measures the partial effect of the corresponding independent variable on the dependent variable, holding all other independent variables fixed.
$R^2$ is the proportion of the sample variation in the dependent variable explained by the independent variables, and it serves as a goodness-of-fit measure. It is important not to put too much weight on the value of $R^2$ when evaluating econometric models.
Under the first four Gauss-Markov assumptions, the OLS estimators are unbiased. This implies that including an irrelevant variable in a model has no effect on the unbiasedness of the intercept and other slope estimators. On the other hand, omitting a relevant variable causes OLS to be biased. In many circumstances, the direction of the bias can be determined.
Adding an irrelevant variable to an equation generally increases the variances of the remaining OLS estimators because of multicollinearity.
3.2 Omitted Variable Bias: The Simple Case
Now suppose that rather than including an irrelevant variable, we omit a variable that actually belongs in the true model. This is often called the problem of excluding a relevant variable or underspecifying the model.
Summary of bias in $\tilde{\beta}_1$ when $x_2$ is omitted from the estimated equation
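The table this caption belongs to is not reproduced in the transcript. The standard result for the simple case is $E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1$, so the bias takes the sign of $\beta_2$ times the sign of the sample correlation between $x_1$ and $x_2$. The Monte Carlo sketch below illustrates this with made-up population values chosen so that the bias is positive.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed for repeatability

# Hypothetical population parameters (not from the slides).
beta0, beta1, beta2 = 1.0, 1.0, 2.0
delta1 = 0.5          # population slope of x2 on x1, so Corr(x1, x2) > 0
n, reps = 200, 500

short_slopes = np.empty(reps)
for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = delta1 * x1 + rng.normal(size=n)
    u = rng.normal(size=n)
    y = beta0 + beta1 * x1 + beta2 * x2 + u

    # Underspecified regression: y on x1 only, x2 wrongly omitted.
    X_short = np.column_stack([np.ones(n), x1])
    b_short, *_ = np.linalg.lstsq(X_short, y, rcond=None)
    short_slopes[r] = b_short[1]

average_slope = short_slopes.mean()
approx_bias = average_slope - beta1

print("average slope on x1 when x2 is omitted:", average_slope)
print("approximate bias:", approx_bias, " beta2 * delta1:", beta2 * delta1)
```

With $\beta_2 > 0$ and $x_1$, $x_2$ positively correlated, the short regression overstates the effect of $x_1$ on average. Flipping the sign of either `beta2` or `delta1` would flip the direction of the bias.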
The term $R_j^2$ in the variance formula is the most difficult of the three components to understand. This term does not appear in simple regression analysis because there is only one independent variable in such cases.
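The equation this paragraph refers to is not reproduced in the transcript. Under the Gauss-Markov assumptions the standard form is $\mathrm{Var}(\hat{\beta}_j) = \sigma^2 / [\mathrm{SST}_j (1 - R_j^2)]$, where $\mathrm{SST}_j$ is the total sample variation in $x_j$ and $R_j^2$ is the R-squared from regressing $x_j$ on all the other independent variables. The sketch below uses simulated regressors and an assumed error variance, and checks this expression against the matrix form $\sigma^2 (X'X)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(7)  # arbitrary seed for repeatability
n = 80
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)  # x1 is correlated with x2
ones = np.ones(n)
X = np.column_stack([ones, x1, x2])
sigma2 = 1.5  # hypothetical error variance, treated as known

# Component 1: total sample variation in x1.
sst_1 = np.sum((x1 - x1.mean()) ** 2)

# Component 2: R^2 from regressing x1 on the other regressor(s).
Z = np.column_stack([ones, x2])
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
ssr_aux = np.sum((x1 - Z @ g) ** 2)
r2_1 = 1.0 - ssr_aux / sst_1

# Variance of the slope on x1, computed two ways.
var_formula = sigma2 / (sst_1 * (1.0 - r2_1))
var_matrix = sigma2 * np.linalg.inv(X.T @ X)[1, 1]

print("R_1^2:", r2_1)
print("variance from the component formula:", var_formula)
print("variance from sigma^2 (X'X)^-1:     ", var_matrix)
```

As $R_1^2$ approaches one, the denominator shrinks and the variance grows. That is the multicollinearity effect the term captures.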
We now obtain the variance of the OLS estimators so that, in addition to knowing the central tendencies of the $\hat{\beta}_j$, we also have a measure of the spread in their sampling distributions. Before finding the variances, we add a homoskedasticity assumption, as in chapter 2.
We begin with some simple examples to show how multiple regression analysis can be used to solve problems that cannot be solved by simple regression.