2
I have a table with several columns of factors that vary over time. With multiple regression I can evaluate the influence of a group of factors on the variation of one of them. How can I do this in R?
3
You can run a regression in R using the function lm. Using the mtcars dataset, which already comes with R, as an example:
regressao <- lm(mpg ~ cyl, data = mtcars)
First we pass the function lm the regression formula mpg ~ cyl and then the dataset, data = mtcars. The formula mpg ~ cyl means that we are regressing the variable mpg (miles per gallon) on the variable cyl (number of cylinders). This is equivalent to the equation mpg = B0 + B1*cyl + e, where you are estimating the parameters B0 (the constant) and B1 (the slope). The regression result was saved in the object regressao.
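As a minimal sketch of how the estimated parameters map onto that equation, coef() returns the estimates of B0 and B1 from the fitted object. The names coeficientes, b0 and b1, and the 6-cylinder example, are chosen here only for illustration.

```r
# Fit the same simple regression as above (mtcars ships with R)
regressao <- lm(mpg ~ cyl, data = mtcars)

# coef() returns a named numeric vector with the estimated parameters
coeficientes <- coef(regressao)
b0 <- coeficientes[["(Intercept)"]]  # estimate of B0 (the constant)
b1 <- coeficientes[["cyl"]]          # estimate of B1 (the slope)

# Fitted mpg for a hypothetical car with 6 cylinders: B0 + B1 * 6
b0 + b1 * 6
```

The same values appear in the Estimate column of the summary output.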
Calling summary shows the main regression results:
summary(regressao)
Call:
lm(formula = mpg ~ cyl, data = mtcars)
Residuals:
    Min      1Q  Median      3Q     Max
-4.9814 -2.1185  0.2217  1.0717  7.5186
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  37.8846     2.0738   18.27  < 2e-16 ***
cyl          -2.8758     0.3224   -8.92 6.11e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.206 on 30 degrees of freedom
Multiple R-squared: 0.7262, Adjusted R-squared: 0.7171
F-statistic: 79.56 on 1 and 30 DF, p-value: 6.113e-10
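If you need these numbers in your own code rather than just printed, the object returned by summary can be indexed directly. This is a small sketch; the name resumo is only illustrative.

```r
# Refit the simple regression and store its summary
regressao <- lm(mpg ~ cyl, data = mtcars)
resumo <- summary(regressao)

# Coefficient table as a numeric matrix
# (columns: Estimate, Std. Error, t value, Pr(>|t|))
resumo$coefficients

# Individual statistics shown at the bottom of the printed summary
resumo$r.squared      # Multiple R-squared
resumo$adj.r.squared  # Adjusted R-squared
resumo$sigma          # Residual standard error
```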
To run a multiple regression, just include more variables after the ~. More specifically, the element to the left of the ~ is the dependent variable (your y) and all variables to the right of the ~ are explanatory variables (the X's). For example:
regressao_multipla <- lm(mpg ~ cyl + disp + wt + hp, data = mtcars)
Here we run a regression with 4 explanatory variables: cyl, disp, wt and hp, all in the data.frame mtcars. To see the main results, use summary again:
summary(regressao_multipla)
Call:
lm(formula = mpg ~ cyl + disp + wt + hp, data = mtcars)
Residuals:
    Min      1Q  Median      3Q     Max
-4.0562 -1.4636 -0.4281  1.2854  5.8269
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 40.82854    2.75747  14.807 1.76e-14 ***
cyl         -1.29332    0.65588  -1.972 0.058947 .
disp         0.01160    0.01173   0.989 0.331386
wt          -3.85390    1.01547  -3.795 0.000759 ***
hp          -0.02054    0.01215  -1.691 0.102379
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.513 on 27 degrees of freedom
Multiple R-squared: 0.8486, Adjusted R-squared: 0.8262
F-statistic: 37.84 on 4 and 27 DF, p-value: 1.061e-10
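Since the question is about the influence of a group of factors, one possible way to check whether the added variables (disp, wt and hp) jointly improve on the simple model is to compare the two nested fits with anova. This is only a sketch of that idea, not something the question asked for; the name comparacao is illustrative.

```r
# Fit the simple and the multiple regression on the same data
regressao <- lm(mpg ~ cyl, data = mtcars)
regressao_multipla <- lm(mpg ~ cyl + disp + wt + hp, data = mtcars)

# F test for the nested models: the second row tests whether
# disp, wt and hp together reduce the residual sum of squares
comparacao <- anova(regressao, regressao_multipla)
comparacao

# p-value of the joint test
comparacao[2, "Pr(>F)"]
```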
There are several other functions for working with regressions in R. The object that the function lm returns is of class lm; to get an idea of the methods available for this class, you can run methods(class = "lm").
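As a brief illustration, a few of those methods can be applied like this. The data frame novo_carro and its values are hypothetical, made up only to show predict.

```r
# Refit the multiple regression from the example above
regressao_multipla <- lm(mpg ~ cyl + disp + wt + hp, data = mtcars)

# List the methods available for objects of class "lm"
methods(class = "lm")

# 95% confidence intervals for the coefficients
intervalos <- confint(regressao_multipla, level = 0.95)

# Residuals of the fitted model (first few)
head(residuals(regressao_multipla))

# Prediction for a hypothetical new observation
novo_carro <- data.frame(cyl = 4, disp = 120, wt = 2.5, hp = 100)
previsao <- predict(regressao_multipla, newdata = novo_carro)
previsao
```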
Wow, thank you so much for the explanation! It helped me a lot!
I tested the function with a dataset I am working with and it worked very well.
The question is not clear. Do you just want to know how to do a regression in R?
– Carlos Cinelli
Yes, I would like to know how to do multiple regression in R and, if possible, linear regression too.
– Bruno Rigueti
Do you have any code that you have tried or are working on? Paste it into the question; it will help people understand the problem and give better help.
– Flávio Granato
Just follow Carlos Cinelli's step-by-step answer below. I followed the steps, managed to do the calculations perfectly, and got the data I needed.
– Bruno Rigueti
We don’t do your work for nothing. What have you tried?
– Bruno Costa