I have the following data set that establishes a relationship between two variables "X" and "Y":
df <- data.frame(X=c(25,25,25,25,25,25,50,50,50,50,50,50,
75,75,75,75,75,75,100,100,100,100,100,100,
125,125,125,125,125,125,150,150,150,150,150,150),
Y=c(2457524,2391693,2450828,2391252,2444638,2360293,
4693194,4844527,4835596,4878092,4809226,4722253,
7142763,7182769,7135550,7173920,7216871,7076359,
9496553,9537788,9405825,9439201,9609870,9707734,
12031958,12027037,11935594,11930086,12154132,
12096462,14298064,14396607,13964716,14221039,
14283992,14042220))
Consider the following problem:
"Fit a weighted linear model with the 'lm' function, using as the weighting factor the inverse of the variance of 'Y' at each level of 'X'". That is, each observation should be weighted by the inverse of the variance of "Y" within its level of "X". In this case, how can we specify the weights? Is there a specific function that should be passed to the "weights" argument?
Technical detail: the model must be fitted with the "lm" function only, not with any other method (gls, glm, etc.).
Hello Marcus. We're almost there... I printed the variance for each level of "X" and couldn't help noticing that it was the same for all levels (25, 50, 75, 100, 125, 150). This needs to be corrected: the weight should be the inverse of the variance of "Y" computed separately for each level of "X".
– Weidson C. de Souza
The expected result for the weight vector, with each value repeated once per observation in its level of "X", would be:
6.20046E-10, 6.20046E-10, 6.20046E-10, 6.20046E-10, 6.20046E-10, 6.20046E-10,
1.86313E-10, 1.86313E-10, 1.86313E-10, 1.86313E-10, 1.86313E-10, 1.86313E-10,
4.28484E-10, 4.28484E-10, 4.28484E-10, 4.28484E-10, 4.28484E-10, 4.28484E-10,
7.96637E-11, 7.96637E-11, 7.96637E-11, 7.96637E-11, 7.96637E-11, 7.96637E-11,
1.29098E-10, 1.29098E-10, 1.29098E-10, 1.29098E-10, 1.29098E-10, 1.29098E-10,
3.67609E-11, 3.67609E-11, 3.67609E-11, 3.67609E-11, 3.67609E-11, 3.67609E-11
And congratulations on the elegant view!
– Weidson C. de Souza
I did not understand where these values came from. Please explain to me how the variance of a constant is a non-zero number.
– Marcus Nunes
Edited code.
– Marcus Nunes
Here is a clearer explanation. For X=25 (the first level of X), we have the following values of "Y": 2457524, 2391693, 2450828, 2391252, 2444638, 2360293, so the variance of Y given X=25 is var(Y|X=25)=1612784428. The inverse of this variance is 1/var(Y|X=25)=6.20046E-10. Note that this value is repeated for each of the six observations at level 25 of X. The same logic applies to the other levels of X, i.e. 50, 75, 100, 125, 150. Please note that each level of X will have a different variance.
– Weidson C. de Souza
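The per-level calculation described in the comment above can be sketched in base R as follows. This is only one possible approach, not necessarily the code used in the answer: it assumes `ave()` is an acceptable way to compute the within-level variance, and the column name `w` is an illustrative choice.

```r
# Data from the question (X written with rep() for brevity)
df <- data.frame(
  X = rep(c(25, 50, 75, 100, 125, 150), each = 6),
  Y = c(2457524, 2391693, 2450828, 2391252, 2444638, 2360293,
        4693194, 4844527, 4835596, 4878092, 4809226, 4722253,
        7142763, 7182769, 7135550, 7173920, 7216871, 7076359,
        9496553, 9537788, 9405825, 9439201, 9609870, 9707734,
        12031958, 12027037, 11935594, 11930086, 12154132, 12096462,
        14298064, 14396607, 13964716, 14221039, 14283992, 14042220)
)

# ave() applies var() to Y within each level of X and returns a vector
# aligned with the rows of df, so every row receives its own level's variance.
df$w <- 1 / ave(df$Y, df$X, FUN = var)

# Weighted least squares using only lm(); 'w' is looked up in 'data'.
fit <- lm(Y ~ X, data = df, weights = w)

# One distinct weight per level of X
unique(df$w)
```

Because `ave()` returns one value per row, the weight is automatically repeated for every observation in a level, which is the layout `lm()` expects for its `weights` argument.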
Another thing: consider generic levels (N1, N2, N3, ...) because we do not want to restrict the code to the example.
– Weidson C. de Souza
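For generic levels, one option is to wrap the calculation in a small function that takes the column names as strings. The sketch below is hypothetical: the helper name `fit_inverse_variance_lm`, the temporary column `w_inv_var`, and the toy data are all made up for illustration and do not come from the question.

```r
# Hypothetical helper: fits lm(y ~ x) weighted by the inverse of the
# variance of y within each level of x, for any number of levels.
fit_inverse_variance_lm <- function(data, x, y) {
  # Within-level variance of y, aligned with the rows of 'data'
  w <- 1 / ave(data[[y]], data[[x]], FUN = var)

  # A level with a single observation gives NA, and a level with zero
  # variance gives Inf; neither is a usable weight.
  if (any(!is.finite(w))) {
    stop("every level of '", x, "' needs at least two observations ",
         "and a non-zero variance of '", y, "'")
  }

  data$w_inv_var <- w
  lm(reformulate(x, response = y), data = data, weights = w_inv_var)
}

# Toy data invented for this example: three levels, three replicates each
toy <- data.frame(
  dose = rep(c(1, 2, 3), each = 3),
  resp = c(1.0, 1.2, 0.9, 2.1, 1.8, 2.4, 3.3, 2.7, 3.0)
)

fit <- fit_inverse_variance_lm(toy, "dose", "resp")
coef(fit)
```

Nothing in the function refers to specific level values, so it should work unchanged on the data in the question by calling it with "X" and "Y".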
Very good Marcus!!! As always a very elegant response!
– Weidson C. de Souza
Thank you. If possible, in addition to voting on the answer, please accept it, so that people who find it in the future can more easily tell that it solves your problem.
– Marcus Nunes
Marcus, what is the procedure for accepting the answer? I’m new here and I’m still in the learning phase of the site (laughs).
– Weidson C. de Souza
Here is a tutorial explaining step by step, with images.
– Marcus Nunes