Simple Regression and Multiple Regression
by Rommon C., 21 August 2013

Transcript of Simple Regression and Multiple Regression

Simple Regression and Multiple Regression
Simple/Multiple Regression
Normally, in statistics we use the Simple Linear Regression or Multiple Linear Regression technique when the dependent variable is quantitative and the independent variable(s) are also quantitative.
In addition, Multiple Regression can be used when the dependent variable is quantitative but the independent variables (at least two of them) are nominal or ordinal.
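As a sketch of the nominal-predictor case, a three-level nominal factor can be represented by two indicator (dummy) variables, here called D1 and D2 purely for illustration:

\[
y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \varepsilon_i,
\qquad D_{1i}, D_{2i} \in \{0, 1\}
\]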
Multiple Regression
Multiple Linear Regression is an approach to modeling the relationship between a scalar dependent variable y and more than one explanatory variable, denoted X.
Linear regression models are often fitted using the least squares approach, but they may also be fitted in other ways, such as by minimizing the "lack of fit" in some other norm, or by minimizing a penalized version of the least squares loss function as in ridge regression.
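For example, the ridge-regression version of the penalized least-squares objective can be written as follows, where λ ≥ 0 is the penalty weight (a symbol introduced here):

\[
\hat{\boldsymbol{\beta}}_{\text{ridge}}
= \arg\min_{\boldsymbol{\beta}} \; \|\mathbf{y} - X\boldsymbol{\beta}\|^2 + \lambda \|\boldsymbol{\beta}\|^2
\]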
Multiple Linear Regression (cont.)
Outline
Simple Linear Regression
Multiple Linear Regression
To fit the Regression Line
Suppose there are n data points {yi, xi}, where i = 1, 2, …, n. The goal is to find the equation of the straight line which would provide a "best" fit for the data points.
The least-squares approach chooses the line that minimizes the sum of squared residuals of the linear regression model. In other words, the numbers α (the y-intercept) and β (the slope) solve the following minimization problem:
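Written out in the usual notation, this is presumably the standard problem (Q is a label introduced here for the objective function):

\[
\min_{\alpha,\beta} \; Q(\alpha,\beta),
\qquad
Q(\alpha,\beta) = \sum_{i=1}^{n} \left( y_i - \alpha - \beta x_i \right)^2
\]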
To fit the Regression Line (cont.)
Test of Beta using t-Statistics
By expanding the objective to obtain a quadratic expression in α and β, it can be shown that the values of α and β that minimize it are
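presumably the standard least-squares solutions, written here with x̄ and ȳ denoting the sample means:

\[
\hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}
\]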
To Fit The Regression Lines
Given a data set of n statistical units, a linear regression model assumes that the relationship between the dependent variable yi and the p-vector of regressors xi is linear.
This relationship is modelled through a disturbance term or error variable — an unobserved random variable that adds noise to the linear relationship between the dependent variable and regressors. Thus the model takes the form
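In standard notation, the per-observation model is presumably:

\[
y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i
= \mathbf{x}_i^{\mathsf T} \boldsymbol{\beta} + \varepsilon_i,
\qquad i = 1, \dots, n
\]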
Often these n equations are stacked together and written in vector form as y = Xβ + ε, where:
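Assuming the standard definitions:

\[
\mathbf{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},
\quad
X = \begin{pmatrix} \mathbf{x}_1^{\mathsf T} \\ \vdots \\ \mathbf{x}_n^{\mathsf T} \end{pmatrix}
= \begin{pmatrix} x_{11} & \cdots & x_{1p} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{np} \end{pmatrix},
\quad
\boldsymbol{\beta} = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_p \end{pmatrix},
\quad
\boldsymbol{\varepsilon} = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}
\]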
Assumptions
Normality
Linearity
Constant Variance
Independence
Lack of Multicollinearity
Linearity
The mean of the response variable is a linear combination of the parameters (regression coefficients) and the predictor variables.
Note that the linearity assumption is only a restriction on the parameters, not on the predictor variables themselves.
Constant Variance
Different response variables have the same variance in their errors, regardless of the values of the predictor variables.
To check for heterogeneous error variance, or for a pattern of residuals that violates the homoscedasticity assumption, it is prudent to look for a "fanning effect" between the residual errors and the predicted values.
When this assumption is violated, the errors are not evenly distributed across the regression line.
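In symbols, constant variance (homoscedasticity) means that, for some common unknown variance σ²,

\[
\operatorname{Var}(\varepsilon_i) = \sigma^2 \quad \text{for all } i,
\]

whereas heteroscedasticity allows Var(εᵢ) = σᵢ² to change with i.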
To Fit Regression Lines (cont.)
Given the model described above, several estimation methods can be used to estimate the parameters.
These methods differ in computational simplicity of algorithms, presence of a closed-form solution, robustness with respect to heavy-tailed distributions, and theoretical assumptions needed to validate desirable statistical properties such as consistency and asymptotic efficiency.
Least Squares Estimation
1. The OLS method minimizes the sum of squared residuals and leads to a closed-form expression for the estimated value of the unknown parameter β:
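presumably the familiar normal-equations solution:

\[
\hat{\boldsymbol{\beta}} = (X^{\mathsf T} X)^{-1} X^{\mathsf T} \mathbf{y}
\]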
The estimator is unbiased and consistent if the errors have finite variance and are uncorrelated with the regressors
2. GLS is an extension of the OLS method that allows efficient estimation of β when heteroscedasticity, correlation among the error terms, or both are present, as long as the form of the heteroscedasticity and correlation is known independently of the data. GLS minimizes a weighted analogue of the sum of squared residuals from OLS regression.
The special case of GLS in which the error covariance matrix is diagonal (uncorrelated errors with unequal variances) is called "weighted least squares".
The GLS solution to the estimation problem is:
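presumably the standard expression, with Ω denoting the known covariance matrix of the errors (a symbol introduced here):

\[
\hat{\boldsymbol{\beta}} = (X^{\mathsf T} \Omega^{-1} X)^{-1} X^{\mathsf T} \Omega^{-1} \mathbf{y}
\]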
3. Other techniques include Iteratively Reweighted Least Squares (IRLS), Instrumental Variables (IV), Total Least Squares (TLS), optimal-instruments regression, etc.
Maximum-Likelihood Estimation and Related Techniques
1. Maximum-Likelihood Estimation can be performed when the distribution of the error terms is known to belong to a certain parametric family f(θ) of probability distributions. When the errors are normally distributed with zero mean, the resulting estimate is identical to the OLS estimate. GLS estimates are maximum-likelihood estimates when the errors follow a multivariate normal distribution with a known covariance matrix.
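As a sketch, assuming i.i.d. N(0, σ²) errors, the log-likelihood being maximized is

\[
\ell(\boldsymbol{\beta}, \sigma^2)
= -\frac{n}{2}\log\left(2\pi\sigma^2\right)
- \frac{1}{2\sigma^2}\,\|\mathbf{y} - X\boldsymbol{\beta}\|^2,
\]

so maximizing over β is the same as minimizing the OLS sum of squared residuals.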
2. Other techniques include Ridge Regression, Least Absolute Deviations (LAD), and Adaptive Estimation.
Other Techniques in Estimation
Bayesian linear regression
Quantile regression
Mixed models
Principal Component Regression (PCR)
Least-angle regression
The Theil–Sen estimator
etc.
Independence of Error
This assumes that the errors of the response variables are uncorrelated with each other.
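In symbols, the independence (uncorrelatedness) assumption can be written as

\[
\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0 \quad \text{for all } i \neq j.
\]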
Lack of Multicollinearity
For standard least squares estimation methods, the design matrix X must have full column rank p; otherwise, a condition known as multicollinearity exists among the predictor variables.
Multicollinearity can also arise when too little data is available compared to the number of parameters to be estimated.
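The link to the OLS formula: full column rank of X is exactly what makes XᵀX invertible, so the closed-form estimate is well defined.

\[
\operatorname{rank}(X) = p
\;\Longleftrightarrow\;
X^{\mathsf T} X \text{ is invertible}
\;\Longrightarrow\;
\hat{\boldsymbol{\beta}} = (X^{\mathsf T} X)^{-1} X^{\mathsf T} \mathbf{y} \text{ exists}
\]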
Linear regression is widely used in biological, behavioral and social sciences to describe possible relationships between variables. It ranks as one of the most important tools used in these disciplines.
Under the Hypothesis
Test Statistic
To test hypothesis by
Reject H0 when
Correlation
The Pearson correlation is +1 in the case of a perfect positive linear relationship, −1 in the case of a perfect decreasing linear relationship, and some value between −1 and 1 in all other cases, indicating the degree of linear dependence between the variables. As it approaches zero there is less of a relationship. The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables.
If the variables are independent, Pearson's correlation coefficient is 0, but the converse is not true because the correlation coefficient detects only linear dependencies between two variables.
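For reference, the sample Pearson correlation can be written as below; the fitted simple-regression slope is related to it by β̂ = r·s_y/s_x, where s_x and s_y are the sample standard deviations (symbols introduced here):

\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]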
Beta test Overall and Each Beta
Under the Hypothesis
Test Statistic
To test hypothesis by
Reject H0 when
For Overall test
For individual Beta
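A hedged sketch of the standard forms, assuming p predictors plus an intercept and n observations (symbols introduced here): the overall test uses an F statistic,

\[
H_0: \beta_1 = \cdots = \beta_p = 0,
\qquad
F = \frac{\mathrm{MSR}}{\mathrm{MSE}} = \frac{\mathrm{SSR}/p}{\mathrm{SSE}/(n - p - 1)},
\qquad
\text{reject } H_0 \text{ when } F > F_{\alpha,\, p,\, n-p-1},
\]

while an individual coefficient is tested with t = β̂j / SE(β̂j) against t(α/2, n − p − 1).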
Assumption
- Independence
- Normality
- Homoscedasticity
- Linearity
Simple Linear Regression
In statistics, Simple Linear Regression is the least squares estimator of a linear regression model with a single explanatory variable. In other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model as small as possible.
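In symbols, the simple linear regression model can be sketched as (α the intercept, β the slope, εᵢ the error term):

\[
y_i = \alpha + \beta x_i + \varepsilon_i, \qquad i = 1, \dots, n
\]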
Simple Linear Regression (cont.)
Regression Model
Dependent Variables: Scale
Independent Variables: O (ordinal), N (nominal), Scale
Restrictions
1. Extrapolation
2. Inverse prediction
Under the Hypothesis
Test Statistic
To test hypothesis by
Reject H0 when
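A hedged sketch of the standard slope test in simple linear regression, with SE denoting the estimated standard error and n − 2 degrees of freedom:

\[
H_0: \beta = 0 \quad \text{vs.} \quad H_1: \beta \neq 0,
\qquad
t = \frac{\hat{\beta}}{\widehat{\operatorname{SE}}(\hat{\beta})},
\qquad
\text{reject } H_0 \text{ when } |t| > t_{\alpha/2,\, n-2}
\]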