In general, the goal of regression analysis is to estimate the
relationship between a dependent variable $y$ as a function
of $d$ independent variables,
typically represented as a $d$-dimensional vector
$\mathbf{x}=[x_1,\dots,x_d]^T$.
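The estimation is based on a given dataset of $N$
observed points $\{\mathbf{x}_1,\dots,\mathbf{x}_N\}$
and the corresponding values of the
dependent variable
$\{y_1,\dots,y_N\}$, which can be more concisely
represented as
$$\mathcal{D}=\{(\mathbf{x}_n,\,y_n),\; n=1,\dots,N\} \tag{1}$$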
As the simplest form of regression, linear regression tries to
model the given dataset by a linear relationship between $\mathbf{x}$ and
$y$:
$$\hat{y}=f(\mathbf{x})=\mathbf{w}^T\mathbf{x}+b=\sum_{i=1}^d w_i x_i+b \tag{2}$$
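As a minimal sketch of what such a linear model computes, the snippet below evaluates a weighted sum of the input components plus an offset. The names `predict`, `w`, and `b`, and the numeric values, are assumptions chosen for illustration and do not come from the text.

```python
def predict(w, b, x):
    """Return the linear-model prediction: sum_i w[i] * x[i] + b."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length d")
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Hypothetical parameters and input for d = 3 (illustrative values only).
w = [2.0, -1.0, 0.5]
b = 1.0
x = [1.0, 3.0, 4.0]
y_hat = predict(w, b, x)
```

The prediction is a single scalar for each $d$-dimensional input, which is why the model describes a hyperplane over the input space.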
There are in general two different methods for finding the model
parameters $\mathbf{w}$ and $b$ that fit the given data optimally. First,
the least squares (LS) method is based on the residual, the
difference between the value $\hat{y}_n$ predicted by the model
and the actual value
$y_n$ given in the
dataset:
$$r_n = y_n-\hat{y}_n,\qquad n=1,\dots,N \tag{3}$$
(4) |
(5) |
Substituting
back into the expression of
we get
(7) |
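The least squares idea can be illustrated for the special case of a single independent variable ($d=1$). The sketch below uses the standard closed-form solution for a straight-line fit (slope from the centered cross products, intercept from the means), which may differ in form from the derivation in these notes. The function names and the data values are hypothetical, and the residual is taken here as the actual value minus the predicted value, a sign convention assumed for this sketch.

```python
def fit_least_squares_1d(xs, ys):
    """Fit y ~ w * x + b by minimizing the sum of squared residuals (d = 1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two or more paired samples")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Centered sums of squares and cross products.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    w = sxy / sxx             # slope
    b = y_mean - w * x_mean   # intercept
    return w, b


def residuals(xs, ys, w, b):
    """Residuals, taken as actual value minus predicted value."""
    return [y - (w * x + b) for x, y in zip(xs, ys)]


# Hypothetical data, roughly following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.0]
w, b = fit_least_squares_1d(xs, ys)
r = residuals(xs, ys, w, b)
sse = sum(ri ** 2 for ri in r)  # sum of squared residuals at the optimum
```

With an intercept term in the model, the residuals at the least squares solution sum to zero (up to rounding), which is a quick sanity check on any implementation.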
In linear regression, we try to fit a hyperplane in a $(d+1)$-dimensional space to a set of
$N$ given data points:
$\mathbf{X}=[\mathbf{x}_1,\dots,\mathbf{x}_N]$, containing $N$ vectors, one for each of the
data points in the
$d$-dimensional space, and
$\mathbf{y}=[y_1,\dots,y_N]^T$, containing the corresponding values.
The relationship between the inputs and outputs can be described by a
function $f(\mathbf{x})$
with additive noise $e$, i.e.,
$$y=f(\mathbf{x})+e \tag{8}$$
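To make the additive-noise model concrete, the following sketch generates a small synthetic dataset by evaluating a function and adding a random perturbation to each output. The zero-mean Gaussian noise, the underlying function `f`, the helper name `make_noisy_dataset`, and all numeric values are assumptions made for illustration; the text does not specify a noise distribution.

```python
import random


def make_noisy_dataset(f, xs, noise_std, seed=0):
    """Return pairs (x, f(x) + e), with e drawn from N(0, noise_std^2).

    Gaussian noise is an assumption of this sketch, not a claim of the text.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [(x, f(x) + rng.gauss(0.0, noise_std)) for x in xs]


def f(x):
    """Hypothetical underlying function (illustrative only)."""
    return 2.0 * x + 1.0


xs = [0.5 * i for i in range(10)]
dataset = make_noisy_dataset(f, xs, noise_std=0.1)
noise_free = make_noisy_dataset(f, xs, noise_std=0.0)
```

Regression then amounts to recovering an estimate of `f` from pairs such as those in `dataset`, without access to the noise-free values.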
The simplest form of this regression problem is linear regression,
based on the assumption that the function
$f(\mathbf{x})$ is a linear combination of all components of the input vector
$\mathbf{x}=[x_1,\dots,x_d]^T$ with weights
$\mathbf{w}=[w_1,\dots,w_d]^T$:
(9) |
(10) |
(11) |
(12) |
Example
(13) |
(14) |
(15) |
(16) |