t-Test for Linear Regression

The null hypothesis for the linear regression model states that the parameters equal hypothesized true values, $w_0=w_{00}$ and $w_1=w_{10}$. If the residual $r_i=y_i-\hat{y}_i=y_i-(w_0+w_1x_i)$ is assumed to be normally distributed as ${\cal N}(0,\sigma^2)$, then the estimated intercept and slope also have normal distributions:

  $\displaystyle w_0 \sim {\cal N}\left(w_{00}, \;
\left(\frac{1}{N}+\frac{\bar{x}^2}{\sum_i(x_i-\bar{x})^2}\right)\sigma^2\right),
\;\;\;\;\;\;
w_1 \sim {\cal N}\left(w_{10},\;\frac{\sigma^2}{\sum_i(x_i-\bar{x})^2}\right)
$ (87)
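As a quick sanity check of (87), the following sketch (a minimal simulation assuming NumPy is available; the sample size, true parameters, and noise level are arbitrary illustrative choices) refits the line on many noisy samples and compares the empirical variance of $w_1$ with $\sigma^2/\sum_i(x_i-\bar{x})^2$:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
N, w00, w10, sigma = 50, 1.0, 2.0, 0.5    # arbitrary true values
x = np.linspace(0, 1, N)
Sxx = np.sum((x - x.mean())**2)

# Refit the model on many noisy samples, collecting the slope estimates.
slopes = []
for _ in range(10000):
    y = w00 + w10 * x + rng.normal(0, sigma, N)
    w1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx   # OLS slope
    slopes.append(w1)

print("empirical Var(w1):  ", np.var(slopes))
print("theoretical s^2/Sxx:", sigma**2 / Sxx)
\end{verbatim}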
The unbiased mean square error is
  $\displaystyle S_e^2=\frac{\sum_i r_i^2}{N-2},\;\;\;\;\;\;S_e=\sqrt{\frac{\sum_i r_i^2}{N-2}}
$ (88)
and the standard errors of $w_0$ and $w_1$ are:
  $\displaystyle SE_{w_0}=S_e
\sqrt{\frac{1}{N}+\frac{\bar{x}^2}{\sum_{i=1}^N(x_i-\bar{x})^2}}
=\sqrt{\left(\frac{1}{N-2}\sum_{i=1}^N r_i^2\right)
\left(\frac{1}{N}+\frac{\bar{x}^2}{\sum_{i=1}^N(x_i-\bar{x})^2}\right)}
$ (89)
and
  $\displaystyle SE_{w_1}=S_e \sqrt{\frac{1}{\sum_{i=1}^N(x_i-\bar{x})^2}}
=\sqrt{ \frac{\frac{1}{N-2}\sum_{i=1}^N r_i^2}
{\sum_{i=1}^N(x_i-\bar{x})^2}}
$ (90)
The test statistic for $w_j\;(j\in\{0,\,1\})$ is
  $\displaystyle t=\frac{w_j-w_{j0}}{SE_{w_j}}
$ (91)
which has a t-distribution with $\nu=N-2$ degrees of freedom. Given a significance level $\alpha$, we can then either reject or accept the null hypothesis that $w_j=w_{j0}$.
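For concreteness, here is a minimal sketch (assuming NumPy and SciPy, on synthetic data; all variable names are illustrative) that computes $S_e$ of (88), the standard errors (89) and (90), and the t statistic (91) with its two-sided p-value for $H_0:\;w_1=0$:

\begin{verbatim}
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 30)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, x.size)   # synthetic data
N = x.size

# OLS estimates of the intercept w0 and slope w1
Sxx = np.sum((x - x.mean())**2)
w1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
w0 = y.mean() - w1 * x.mean()

# Unbiased mean square error (88) and standard errors (89), (90)
r = y - (w0 + w1 * x)                     # residuals
Se = np.sqrt(np.sum(r**2) / (N - 2))
SE_w0 = Se * np.sqrt(1/N + x.mean()**2 / Sxx)
SE_w1 = Se / np.sqrt(Sxx)

# t statistic (91) for H0: w1 = 0, with two-sided p-value at nu = N-2
t = w1 / SE_w1
p = 2 * stats.t.sf(abs(t), df=N - 2)
print(f"t = {t:.3f}, p = {p:.3g}")
\end{verbatim}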

Typically we assume $w_{10}=0$ and test the null hypothesis $H_0:\;w_1=0$, i.e., there is no relationship between variables $x$ and $y$. The alternative hypothesis is $H_a:\; w_1\ne 0$, i.e., there is some relationship between $x$ and $y$.

We can also find the upper and lower limits $\pm t_{\alpha/2}$ from the t-table (with $\nu=N-2$) so that

  $\displaystyle P(-t_{\alpha/2}\le t_j\le t_{\alpha/2})
=\int_{-t_{\alpha/2}}^{t_{\alpha/2}} {\cal T}_{N-2}(\tau)\;d\tau =1-\alpha
$ (92)
which is equivalent to
    $\displaystyle P(-t_{\alpha/2}\le t_j\le t_{\alpha/2})
=P\left(-t_{\alpha/2}\le \frac{w_j-w_{j0}}{SE_{w_j}}
\le t_{\alpha/2}\right)
=P(w_j-t_{\alpha/2}\,SE_{w_j}\le w_{j0} \le w_j+t_{\alpha/2}\,SE_{w_j})
=1-\alpha$

i.e., the confidence interval for $w_{j0}$ is $w_j\pm t_{\alpha/2}\,SE_{w_j}$.
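Continuing the sketch above, the critical value $t_{\alpha/2}$ and the resulting confidence interval for the slope can be obtained with scipy.stats.t.ppf:

\begin{verbatim}
alpha = 0.05
t_crit = stats.t.ppf(1 - alpha/2, df=N - 2)   # upper alpha/2 critical value
lo, hi = w1 - t_crit * SE_w1, w1 + t_crit * SE_w1
print(f"{100*(1 - alpha):.0f}% CI for the slope: [{lo:.3f}, {hi:.3f}]")
\end{verbatim}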

For multivariate linear regression, we can also carry out tests to answer questions such as which subset of the independent variables $\{ x_1,\cdots,x_D\}$ is most important in affecting $y$. For simplicity, let $\{x_1,\cdots,x_p\}$ be a subset of $p<D$ variables out of the $D$ variables. The corresponding null hypothesis is $H_0:\;w_1=\cdots=w_p=0$, i.e., variable $y$ is not related to any of the variables in the subset $\{x_1,\cdots,x_p\}$. The corresponding test statistic is

  $\displaystyle f=\frac{(SSR_p-SSR)/p}{SSR/(N-D-1)}
$ (93)
with an f-distribution ${\cal F}_{p,N-D-1}(f)$, where $SSR$ is the residual sum of squares of the full model containing all $D$ variables and $SSR_p$ is that of the reduced model with the $p$ tested variables removed. The null hypothesis is rejected or accepted depending on whether the p-value is smaller than a pre-specified significance level $\alpha$.
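A minimal sketch of this partial F-test follows, again assuming NumPy and SciPy; the synthetic data with $D=3$ predictors, the tested subset of size $p=2$, and the helper ssr are all illustrative choices:

\begin{verbatim}
import numpy as np
from scipy import stats

def ssr(X, y):
    """Residual sum of squares of a least-squares fit of y on X."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ w)**2)

rng = np.random.default_rng(2)
N, D, p = 100, 3, 2
X = rng.normal(size=(N, D))
y = 0.5 + 1.5 * X[:, 2] + rng.normal(0, 1.0, N)  # y depends only on x_3

ones = np.ones((N, 1))
X_full = np.hstack([ones, X])          # full model: all D variables
X_red  = np.hstack([ones, X[:, p:]])   # reduced model: x_1,...,x_p removed

SSR   = ssr(X_full, y)
SSR_p = ssr(X_red, y)
f = ((SSR_p - SSR) / p) / (SSR / (N - D - 1))
p_value = stats.f.sf(f, p, N - D - 1)
print(f"f = {f:.3f}, p-value = {p_value:.3g}")
\end{verbatim}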