Modify t test for linear regression parameters in R

I would like to know how I can change the t test for the parameters of a linear regression in R. I want to test whether B0 = 0 and whether B1 = 1. By default, the regression output tests whether B0 = 0 and whether B1 = 0. For example, the regression below

x <- c(1, 2, 3, 4, 5)
y <- c(1.2, 2.4, 3.3, 4.2, 5.1)
reg <- lm(y~x)

gives the following result:

summary(reg)
Coefficients:
Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.36000    0.11489   3.133 0.051929 .  
x            0.96000    0.03464  27.713 0.000103 ***

So B0 is not different from 0 (p-value = 0.051929) and B1 is different from 0 (p-value = 0.000103). But how do I test whether B1 is different from 1? Is that possible?

2 answers


With base R, you can get confidence intervals with the confint function:

confint(reg)
                   2.5 %    97.5 %
(Intercept) -0.005635243 0.7256352
x            0.849756826 1.0702432

In the case above, with a 95% interval, any value inside the confidence interval is not "rejected" by a hypothesis test at the 5% significance level (so you would not reject B1 = 1, for example).

Similarly, any value outside the interval is rejected at the 5% significance level. To change the confidence level of the interval, change the level parameter; for example, if you want 99%, use confint(reg, level = 0.99).
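If it helps to see where those numbers come from, below is a minimal sketch that rebuilds the slope's 95% interval by hand, assuming the usual estimate ± t-quantile × standard error construction (the objects est and se are just illustrative names):

# rebuild the 95% interval for the coefficient of x by hand
est <- coef(summary(reg))["x", "Estimate"]
se  <- coef(summary(reg))["x", "Std. Error"]
est + c(-1, 1) * qt(0.975, df = reg$df.residual) * se

This should reproduce the 0.8498 to 1.0702 line reported by confint above.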

If you prefer a specific test rather than the interval, the car package provides convenience functions for regression analysis, among them linearHypothesis for hypothesis tests. To test whether the coefficient of x (B1) is 1:

# if the package is not installed, run install.packages("car") first
library(car) 
linearHypothesis(reg, "x = 1")
Linear hypothesis test

Hypothesis:
x = 1

Model 1: restricted model
Model 2: y ~ x

  Res.Df   RSS Df Sum of Sq      F Pr(>F)
1      4 0.052                           
2      3 0.036  1     0.016 1.3333 0.3318

The p-value in this case is 0.3318, that is, you do not reject the hypothesis, as we saw with the confidence interval. The linearHypothesis function also accepts any other linear restriction on the parameters. For example, you can test whether the intercept (B0) is equal to the coefficient of x (B1):

linearHypothesis(reg, "(Intercept) = x")
Linear hypothesis test

Hypothesis:
(Intercept) - x = 0

Model 1: restricted model
Model 2: y ~ x

  Res.Df   RSS Df Sum of Sq      F  Pr(>F)  
1      4 0.236                              
2      3 0.036  1       0.2 16.667 0.02655 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Among other possible tests.
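For example, linearHypothesis also accepts a vector of restrictions, so the question's two hypotheses (B0 = 0 and B1 = 1) can be tested jointly; a minimal sketch (output omitted):

# joint test of both restrictions at once
linearHypothesis(reg, c("(Intercept) = 0", "x = 1"))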

PS: I know this is not the place for this, but be careful with the interpretation of p-values. For example, regarding "B0 is not different from 0": you do not reject the null hypothesis that B0 equals any value inside its confidence interval (such as 0.6), not just zero specifically.


I don’t know of any function in R capable of doing this directly. In general, what is suggested is to perform likelihood ratio tests, which are much more general and handle more sophisticated problems, even in generalized linear models.
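Just to illustrate that route on the question's data, here is a minimal sketch that imposes B1 = 1 through an offset and compares the nested models with anova (an F test, which plays the same role as the likelihood ratio test in a Gaussian linear model; the names reg0 and reg1 are just illustrative):

# data from the question
x <- c(1, 2, 3, 4, 5)
y <- c(1.2, 2.4, 3.3, 4.2, 5.1)
# restricted model: slope fixed at 1 via the offset; only the intercept is estimated
reg0 <- lm(y ~ 1, offset = x)
# unrestricted model
reg1 <- lm(y ~ x)
# compare the nested models
anova(reg0, reg1)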

However, nothing prevents us from writing our own function. After all, we want to test the hypotheses

H_0: beta = beta_0

H_1: beta != beta_0

where beta_0 is the reference value for which we wish to perform the test. By default, this value is zero. The function TesteCoeficiente, defined below, implements this hypothesis test for any value of beta_0. Just pass the following arguments to the function:

reg: the regression model obtained with the function lm

coeficiente: the position of the coefficient in the regression model. For y = beta_0 + beta_1 * x, 1 tests beta_0 and 2 tests beta_1

h0: null hypothesis value to be tested. In general, this value is zero

With this set up, just run the example below to see how the function works:

x <- c(1, 2, 3, 4, 5)
y <- c(1.2, 2.4, 3.3, 4.2, 5.1)
reg <- lm(y~x)

TesteCoeficiente <- function(reg, coeficiente, h0){
  estimativas <- coef(summary(reg))
  # t statistic: (estimate - hypothesised value) / standard error
  estatistica <- (estimativas[coeficiente, 1] - h0) / estimativas[coeficiente, 2]
  # two-sided p-value, using the residual degrees of freedom
  valor.p <- 2 * pt(abs(estatistica), reg$df.residual, lower.tail = FALSE)
  # return both, like the t value / Pr(>|t|) columns of summary()
  c(estatistica, valor.p)
}

TesteCoeficiente(reg, coeficiente=2, h0=0)
[1] 2.771281e+01 1.031328e-04

summary(reg)
Call:
lm(formula = y ~ x)

Residuals:
         1          2          3          4          5 
-1.200e-01  1.200e-01  6.000e-02  1.457e-16 -6.000e-02 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.36000    0.11489   3.133 0.051929 .  
x            0.96000    0.03464  27.713 0.000103 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1095 on 3 degrees of freedom
Multiple R-squared:  0.9961,    Adjusted R-squared:  0.9948 
F-statistic:   768 on 1 and 3 DF,  p-value: 0.0001031

Note that the output of TesteCoeficiente matches the t value and p-value that summary reports when testing h0 = 0. So the function works for this particular case, and the same calculation applies to any value of beta_0. Now just change the h0 parameter of the function to perform the desired hypothesis test.
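For the question's actual hypothesis, B1 = 1, the call is the one below; the p-value should come out around 0.33, agreeing with the linearHypothesis result in the other answer (for a single restriction, that F statistic is just the square of this t statistic):

# test whether the coefficient of x differs from 1
TesteCoeficiente(reg, coeficiente = 2, h0 = 1)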
