Simple Linear Regression


The fundamental idea is that a linear model summarizes the relationship between one or more explanatory variables (plotted on the x-axis) and a response variable (plotted on the y-axis) as a straight line. Simple linear regression uses a single explanatory variable.

Linear Regression in Python 3

Fitting the Model

Python 3 has no built-in function for fitting linear models, so a library must be imported. A commonly used option is the LinearRegression class from the scikit-learn (sklearn) library. The code is as follows:

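from sklearn.linear_model import LinearRegression

# expVars: 2-D array of explanatory variables (one column per variable)
# resVar: 1-D array holding the response variable
model = LinearRegression().fit(expVars, resVar)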
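As a minimal end-to-end sketch of fitting and then predicting with scikit-learn, the example below uses made-up data. The array name expVars and the toy values are assumptions for illustration only; the response is constructed exactly as 1 + 2*expVar1 + 3*expVar2 so that the fitted coefficients are easy to check.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: two explanatory variables (columns) and one response.
expVars = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 6.0],
])
# Response built exactly as 1 + 2*expVar1 + 3*expVar2 (no noise, by assumption).
resVar = 1.0 + 2.0 * expVars[:, 0] + 3.0 * expVars[:, 1]

# fit() expects a 2-D array of shape (n_samples, n_features) and a 1-D response.
model = LinearRegression().fit(expVars, resVar)

intercept = model.intercept_   # estimated intercept
coefs = model.coef_            # one estimated coefficient per explanatory variable

# Predict the response for expVar1 = 7.5 and expVar2 = 35.
prediction = model.predict(np.array([[7.5, 35.0]]))[0]
```

Because the toy response contains no noise, the estimates should recover the intercept and coefficients used to build it, up to floating-point error.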

Linear Regression in R

Fitting the Model

The following code creates a linear model in which ‘resVar’ is the response variable (y-axis) and ‘expVar1’ and ‘expVar2’ are the explanatory variables (x-axis), all taken from the data frame ‘dataVar’:

linearModelVar <- lm(resVar ~ expVar1 + expVar2, data = dataVar)


Making Predictions

Now that the model has been fitted, we can use it to make predictions. This amounts to asking, “When expVar1 = 7.5 and expVar2 = 35, what does the model predict resVar will equal?” The question is expressed with the following code:

predict(linearModelVar, newdata = data.frame(expVar1 = 7.5, expVar2 = 35))

Linear Regression in SAS

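You can perform basic linear regression in SAS with PROC REG, as in the following code (the data set name dataVar is assumed here, matching the R example):

proc reg data=dataVar; model resVar = expVar1 expVar2; run;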

Linear Regression in Excel

In Excel, you can use the TREND function as follows, where A1:A10 are the y-values (the response values) and B1:B10 are the x-values (the explanatory values). It returns the fitted y-values along the least-squares line:

=TREND(A1:A10, B1:B10)

Linear Regression in SPSS

Coming soon…

Basic Linear Regression

Y_j = β_0 + β_1 X_{1j} + β_2 X_{2j} + ε_j, for j = 1, 2, ..., n.

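In the above model, Y is the response variable, β_0 is the intercept (the value at which the line crosses the Y-axis, i.e. the expected Y when all regressors are zero), X_1 and X_2 are the regressors (covariates), β_1 and β_2 are their coefficients, and ε is the random error term.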

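Random Component (ε): the error term, i.e. the variation in Y that the regressors do not explain.

Systematic Component (Xβ): the linear combination of the regressors and their coefficients.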
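To make the split between the two components concrete, here is a small simulation sketch. The parameter values, the sample size and the choice of normally distributed errors with mean 0 and standard deviation 1 are all assumptions made for illustration; they are not taken from the text above.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed "true" parameters, chosen only for illustration.
beta0, beta1, beta2 = 1.0, 2.0, -0.5
n = 5

# Hypothetical regressor values.
x1 = [random.uniform(0, 10) for _ in range(n)]
x2 = [random.uniform(0, 10) for _ in range(n)]

# Systematic component: the part of Y determined by the regressors.
systematic = [beta0 + beta1 * a + beta2 * b for a, b in zip(x1, x2)]

# Random component: one error term per observation (normal errors assumed).
epsilon = [random.gauss(0, 1) for _ in range(n)]

# Each observed response is the sum of the two components.
y = [s + e for s, e in zip(systematic, epsilon)]
```

Subtracting the systematic part from each simulated response gives back the error term for that observation, which is exactly what the residuals of a fitted model try to estimate.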

The three key steps in fitting and assessing the model are:

  1. Use least squares to fit a line to the data
  2. Calculate R^2
  3. Calculate a p-value for R^2
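The steps above can be sketched in plain Python for the case of a single explanatory variable. The data values are hypothetical. The sketch stops at the F statistic for step 3, because turning it into a p-value needs the F distribution, which the Python standard library does not provide.

```python
# Hypothetical data for one explanatory variable x and one response y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_mean = sum(x) / n
y_mean = sum(y) / n

# Step 1: least-squares slope and intercept.
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
sxx = sum((xi - x_mean) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Step 2: R^2 = 1 - SS_residual / SS_total.
fitted = [intercept + slope * xi for xi in x]
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_tot = sum((yi - y_mean) ** 2 for yi in y)
r_squared = 1.0 - ss_res / ss_tot

# Step 3 (first half): the F statistic that the p-value is based on.
# With one explanatory variable the degrees of freedom are 1 and n - 2.
f_stat = (r_squared / 1) / ((1.0 - r_squared) / (n - 2))
```

A large F statistic corresponds to a small p-value, meaning an R^2 this high would be unlikely if x had no linear relationship with y.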

Significance Test for Regression

H0: β = 0 (or β_1 = β_2 = β_3 = ... = β_k = 0)

HA: not all β_i are zero (at least one β_i ≠ 0) for i = 1, 2, ..., k.
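One common way to carry out this test is the overall F-test, sketched below with NumPy and SciPy. The data are hypothetical, and the 0.05 significance level is an assumption rather than something the hypotheses themselves prescribe.

```python
import numpy as np
from scipy import stats

# Hypothetical data: n = 8 observations, k = 2 regressors.
X = np.array([
    [1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0],
    [5.0, 6.0], [6.0, 5.0], [7.0, 8.0], [8.0, 7.0],
])
y = np.array([9.1, 7.8, 19.2, 18.1, 28.9, 28.2, 39.0, 37.9])
n, k = X.shape

# Add a column of ones so the intercept is estimated as well.
design = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)

fitted = design @ beta_hat
ss_res = np.sum((y - fitted) ** 2)          # unexplained variation
ss_reg = np.sum((fitted - y.mean()) ** 2)   # variation explained by the regressors

# F = (SS_reg / k) / (SS_res / (n - k - 1)), with k and n - k - 1 degrees of freedom.
f_stat = (ss_reg / k) / (ss_res / (n - k - 1))

# p-value: probability of an F at least this large if H0 were true.
p_value = stats.f.sf(f_stat, k, n - k - 1)

reject_h0 = p_value < 0.05  # 0.05 is an assumed significance level
```

Rejecting H0 only says that at least one coefficient is non-zero; it does not say which one.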

Multiple Linear Regression Models (MLRs)


Generalized Linear Models (GLMs)

A generalized linear model extends multiple linear regression to cases where the relationship between the regressors and the mean of the dependent variable (Y) may be non-linear.

In other words, the regressors still enter the model as a linear combination, but that combination is related to the mean of the response through a link function rather than directly. With a log link, for example, the relationship between the regressors and the mean response is exponential.

Random Component: the probability distribution of Y, with mean µ = E[Y]

Systematic Component: the linear predictor η = Xβ

Link Function: g(µ) = η
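As an illustration of how the link function ties the linear predictor to the mean, the sketch below uses a log link, which is only one possible choice. The coefficients and x-values are assumed for illustration.

```python
import math

# Assumed coefficients, chosen only for illustration.
beta = [0.5, 0.2]               # [intercept, slope]
x_values = [0.0, 1.0, 2.0, 3.0]  # hypothetical regressor values

def linear_predictor(x):
    """Systematic component: eta = X beta (linear in the regressor)."""
    return beta[0] + beta[1] * x

def link(mu):
    """Log link: g(mu) = log(mu)."""
    return math.log(mu)

def inverse_link(eta):
    """Inverse of the log link: mu = exp(eta)."""
    return math.exp(eta)

etas = [linear_predictor(x) for x in x_values]
mus = [inverse_link(eta) for eta in etas]   # mean response E[Y] for each x

# eta rises by a constant amount per unit of x,
# while mu is multiplied by a constant factor.
eta_steps = [b - a for a, b in zip(etas, etas[1:])]
mu_ratios = [b / a for a, b in zip(mus, mus[1:])]
```

The constant steps in η alongside constant ratios in µ show a model that is linear on the scale of the link but exponential on the scale of the response.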