Weighted linear regression using the inverse of variance as the weighting factor

I have the following data set that establishes a relationship between two variables "X" and "Y":

df <- data.frame(X=c(25,25,25,25,25,25,50,50,50,50,50,50,
                     75,75,75,75,75,75,100,100,100,100,100,100,
                     125,125,125,125,125,125,150,150,150,150,150,150),
                 Y=c(2457524,2391693,2450828,2391252,2444638,2360293,
                     4693194,4844527,4835596,4878092,4809226,4722253,
                     7142763,7182769,7135550,7173920,7216871,7076359,
                     9496553,9537788,9405825,9439201,9609870,9707734,
                     12031958,12027037,11935594,11930086,12154132,12096462,
                     14298064,14396607,13964716,14221039,14283992,14042220))

Consider the following problem:

"Fit a weighted linear model using the 'lm' function, taking as the weighting factor the inverse of the variance of 'Y' at each level of 'X'". That is, each observation should be weighted by the inverse of the variance of "Y" within its level of "X". How can this weighting be specified? Is there a specific function that should be passed to the "weights" argument?

Technical detail: the model must be fitted with "lm" only, not with any other function (gls, glm, etc.).

1 answer

To solve this, build the weight vector yourself and pass it to the weights argument of lm. Here I called the vector pesos ("weights"):

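# Variance of Y within each level of X (aggregate() returns the levels in ascending order)
variancias_condicionais <- aggregate(df$Y, list(df$X), var)$x
# Number of observations at each level of X, in the same ascending order
quantidade_X <- as.numeric(table(df$X))
# Repeat each inverse variance once per observation; this assumes df is sorted by X
pesos <- rep(1/variancias_condicionais, quantidade_X)

# Weighted fit using lm() only
ajuste <- lm(Y ~ X, data=df, weights=pesos)
summary(ajuste)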

Call:
lm(formula = Y ~ X, data = df, weights = pesos)

Weighted Residuals:
     Min       1Q   Median       3Q      Max 
-2.17331 -0.71861 -0.08895  0.84733  2.42540 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)    28185      22538   1.251     0.22    
X              95300        330 288.777   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.097 on 34 degrees of freedom
Multiple R-squared:  0.9996,    Adjusted R-squared:  0.9996 
F-statistic: 8.339e+04 on 1 and 34 DF,  p-value: < 2.2e-16
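
A possible variant, offered only as a sketch and not as part of the answer above: the rep() construction relies on the rows of df being grouped by X in ascending order. Base R's ave() returns the within-level statistic for every row directly, so the weights line up with the rows regardless of their order and for any set of levels. The names var_by_level, w and fit are illustrative choices, not from the original answer.

```r
# Same data as in the question (6 replicates at each level of X).
df <- data.frame(
  X = rep(c(25, 50, 75, 100, 125, 150), each = 6),
  Y = c(2457524, 2391693, 2450828, 2391252, 2444638, 2360293,
        4693194, 4844527, 4835596, 4878092, 4809226, 4722253,
        7142763, 7182769, 7135550, 7173920, 7216871, 7076359,
        9496553, 9537788, 9405825, 9439201, 9609870, 9707734,
        12031958, 12027037, 11935594, 11930086, 12154132, 12096462,
        14298064, 14396607, 13964716, 14221039, 14283992, 14042220)
)

# For every row, the sample variance of Y among rows sharing that row's X.
var_by_level <- ave(df$Y, df$X, FUN = var)

# Inverse-variance weights, one per observation, aligned with the rows of df.
w <- 1 / var_by_level

# Weighted fit using lm() only, as the question requires.
fit <- lm(Y ~ X, data = df, weights = w)
coef(fit)
```

On the data as ordered in the question, w should coincide with the pesos vector built with rep(), so the fit should be the same; the difference only matters when the rows are not sorted by X.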
  • Hello Marcus. We’re almost there... I printed the variance for each level of "X" and noticed that it was the same for all levels (25, 50, 75, 100, 125, 150). This needs to be corrected: the weight should be the inverse of the variance of "Y" computed separately within each level of "X".

  • The expected result for the weight vector would be the following six values, each repeated six times (once per observation at the corresponding level of "X"): 6.20046E-10, 1.86313E-10, 4.28484E-10, 7.96637E-11, 1.29098E-10, 3.67609E-11. And congratulations on the elegant approach!!!

  • I did not understand where these values came from. Please explain to me how the variance of a constant is a non-zero number.

  • I have edited the code.

  • Here is a clearer explanation. For X=25 (the first level of X), the values of "Y" are 2457524, 2391693, 2450828, 2391252, 2444638, 2360293, so the variance of Y given X=25 is var(Y|X=25)=1612784428. The inverse of this variance is 1/var(Y|X=25)=6.20046E-10. This value is repeated for each of the observations at level 25 of X. The same logic applies to the other levels of X, i.e. 50, 75, 100, 125, 150. Note that each level of X will have a different variance.

  • Another thing: consider generic levels (N1, N2, N3, ...) because we do not want to restrict the code to the example.

  • Very good, Marcus!!! As always, a very elegant response!

  • Thank you. If possible, in addition to upvoting the answer, please accept it, so that people in the future can more easily see that this answer solves your problem.

  • Marcus, what is the procedure for accepting the answer? I’m new here and I’m still in the learning phase of the site (laughs).
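
The per-level calculation described in the comments for X = 25 can be checked by hand, as in the short sketch below. It uses only the six "Y" values quoted in the comment; the names y25, v25_manual, v25 and w25 are illustrative. Note that var() in R is the sample variance, with denominator n - 1.

```r
# The six Y values observed at X = 25, as quoted in the comments.
y25 <- c(2457524, 2391693, 2450828, 2391252, 2444638, 2360293)

# Sample variance written out explicitly: sum of squared deviations / (n - 1).
v25_manual <- sum((y25 - mean(y25))^2) / (length(y25) - 1)

# The same quantity from the built-in function.
v25 <- var(y25)

# Inverse-variance weight shared by all six observations at X = 25.
w25 <- 1 / v25

print(v25, digits = 10)   # approximately 1612784428
print(w25, digits = 6)    # approximately 6.20046e-10
```

Repeating this for each remaining level gives the other five weights, which is what a grouped computation such as aggregate() or tapply() does in one call.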
