What is it: Regression analysis is a statistical technique for modeling and analyzing the association between variables. It operates on numerical data consisting of values of a dependent variable (the response variable) and of one or more independent variables (the explanatory variables). The dependent variable in the regression equation is modeled as a function of the independent variables, corresponding parameters ("constants"), and an error term. The error term is treated as a random variable and represents the unexplained variation in the dependent variable. The parameters are estimated so as to give a "best fit" to the data. Most commonly, the best fit is evaluated using the least-squares method, but other criteria have also been used.
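The pieces named above (dependent variable, explanatory variable, parameters, error term, least-squares fit) can be illustrated with a minimal sketch. NumPy and the simulated data below are assumptions for illustration; the source prescribes no particular tool or dataset.

```python
import numpy as np

# Simulated data: the dependent variable y is a linear function of the
# explanatory variable x plus random noise (the error term).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)

# Least-squares estimation: choose the intercept b0 and slope b1 that
# minimize the sum of squared residuals sum((y - (b0 + b1*x))**2).
X = np.column_stack([np.ones_like(x), x])   # design matrix [1, x]
(b0, b1), *_ = np.linalg.lstsq(X, y, rcond=None)

# The residuals are the estimated error term: the variation in y
# the fitted line does not explain.
residuals = y - (b0 + b1 * x)
print(b0, b1)   # estimates should land near the true values 2 and 3
```

With only 50 noisy observations the estimates will not equal the true parameters exactly; they are the "best fit" in the least-squares sense for this sample.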
Why use it: Regression can be used for prediction (including forecasting of time-series data), inference, hypothesis testing, and modeling of causal relationships. These uses rely heavily on the underlying assumptions being satisfied.
Where to use it: To identify and mathematically define the relationship between variables so that predictions can be made from the expected relationship.
How to use it: A regression model predicts a value of the y variable given known values of the x variables. Prediction within the range of x values used to construct the model is known as interpolation; prediction outside that range is known as extrapolation, which is riskier because the fitted relationship may not hold beyond the observed data.
Important Notes:
Classical assumptions for regression analysis include:
- the sample is representative of the population for which the prediction is made;
- the error is a random variable with a mean of zero conditional on the explanatory variables;
- the independent variables are measured without error;
- the predictors are linearly independent (no predictor can be expressed as a linear combination of the others);
- the errors are uncorrelated;
- the variance of the error is constant across observations (homoscedasticity).
These are sufficient (but not all necessary) conditions for the least-squares estimator to possess desirable properties; in particular, these assumptions imply that the parameter estimates will be unbiased, consistent, and efficient in the class of linear unbiased estimators. Many of these assumptions may be relaxed in more advanced treatments.

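Some of the classical assumptions can be checked empirically from the residuals of a fitted model. The sketch below, again using NumPy on simulated data (an assumption, not a prescribed method), verifies two consequences of least-squares fitting with an intercept: the residuals average to (numerically) zero, and they are uncorrelated with the explanatory variable.

```python
import numpy as np

# Simulated data satisfying the classical assumptions by construction.
rng = np.random.default_rng(2)
x = np.linspace(0, 10, 200)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)

# Ordinary least-squares fit via the design matrix [1, x].
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Diagnostic 1: residuals have (numerically) zero mean when an
# intercept is included in the model.
mean_ok = abs(resid.mean()) < 1e-8

# Diagnostic 2: residuals are uncorrelated with the explanatory
# variable; a clear correlation would suggest a misspecified model.
uncorr_ok = abs(np.corrcoef(x, resid)[0, 1]) < 1e-6
print(mean_ok, uncorr_ok)   # both checks hold for this simulated data
```

Real diagnostic work goes further (residual plots, tests for heteroscedasticity and autocorrelation), but checks of this kind are the practical counterpart of the assumption list above.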