How to set up GLM models in R

I would like to fit a GLM with the following variables:

Response variable: EVI

Independent variables: % forest, edge density, number of fragments, temperature and precipitation.

Could someone kindly help me set up the models? I have no idea how to proceed.

  • What is EVI? What kind of variable is this?

  • It is a vegetation index, Joseph. It varies from 0.1 to 1, and I use it as a proxy for plant productivity.

  • Why would you use glm instead of lm? Is this EVI variable not continuous?

1 answer

There are several packages for fitting generalized linear models (GLMs) in R. However, the most widely used option is probably the glm function that already ships with base R.

To fit a model with glm, you pass the model formula and the distribution family (for example, binomial for binary data, poisson for count data, gaussian for the traditional linear model, and so on), along with the link function (for example, probit, logit, or cloglog for binomial). If you do not specify the link, R uses the default for the chosen family.
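As a quick way to check which link a family uses by default, the family objects themselves can be inspected. This is only a small sketch using the family constructors from base R; the variable name fam is arbitrary.

```r
# Default link for each family (used when no link is given)
binomial()$link   # "logit"
poisson()$link    # "log"
gaussian()$link   # "identity"

# Overriding the default: a binomial family with a probit link
fam <- binomial(link = "probit")
fam$family        # "binomial"
fam$link          # "probit"
```

The same family object can be passed directly to glm through its family argument.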

Let’s see an example with logistic regression. Simulating some sample data:

set.seed(10)
x <- rnorm(1000)
prob <- plogis(-2*x)
y <- rbinom(1000, 1, prob)

And now fitting the model:

glm(y ~ x, family = binomial(link = logit))
Call:  glm(formula = y ~ x, family = binomial(link = logit))

Coefficients:
(Intercept)            x  
   -0.01969     -1.94726  

Degrees of Freedom: 999 Total (i.e. Null);  998 Residual
Null Deviance:      1386 
Residual Deviance: 931.9    AIC: 935.9
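In practice you would usually store the fitted model in an object rather than just printing it, so that you can inspect it further. The sketch below repeats the simulation above and then uses summary, coef, and predict; the names fit and new_x are only illustrative.

```r
# Same simulated data as above
set.seed(10)
x <- rnorm(1000)
prob <- plogis(-2 * x)
y <- rbinom(1000, 1, prob)

# Store the fit instead of only printing it
fit <- glm(y ~ x, family = binomial(link = logit))

summary(fit)   # coefficient table with standard errors and z tests
coef(fit)      # estimated intercept and slope, on the logit scale

# Predicted probabilities for a few new values of x
new_x <- data.frame(x = c(-1, 0, 1))
predict(fit, newdata = new_x, type = "response")
```

With type = "response", predict returns values on the probability scale; without it, the predictions are on the scale of the link (here, log-odds).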

Other packages that extend generalized linear models include glmnet for penalized GLMs (L1, L2, or both), lme4 for mixed-effects GLMs, and VGAM for vector generalized linear models.

So, to begin, you need an idea of which model(s) you want to fit, and then follow more or less the ideas outlined above. In your case, the formula would be something like EVI ~ % forest + edge density + number of fragments + temperature + precipitation, or some other more suitable functional form. However, explaining which functional form to adopt, or which distribution or link to choose, is a specific question that involves statistics and substantive knowledge of your problem, and it is beyond the scope of Stack Overflow.
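To make the formula above concrete, here is a rough sketch of how the call might look. Names such as "% forest" are not valid R variable names, so the columns are renamed. Everything here is an assumption for illustration: the data frame evi_data, its column names, and the simulated values are placeholders for your real data, and the gaussian family is only a neutral starting point, not a recommendation.

```r
# Simulated placeholder data; replace with your own data frame.
# All column names and values here are hypothetical.
set.seed(1)
n <- 200
evi_data <- data.frame(
  forest_pct    = runif(n, 0, 100),
  edge_density  = runif(n, 0, 50),
  n_fragments   = rpois(n, 5),
  temperature   = rnorm(n, 22, 3),
  precipitation = rnorm(n, 1500, 300)
)
# Arbitrary relationship, only so that the example runs
evi_data$EVI <- plogis(-1 + 0.02 * evi_data$forest_pct + rnorm(n, sd = 0.3))

# Additive model with all five predictors
fit_evi <- glm(
  EVI ~ forest_pct + edge_density + n_fragments + temperature + precipitation,
  family = gaussian(link = "identity"),
  data = evi_data
)
summary(fit_evi)
```

Swapping the family argument, for instance to Gamma(link = "log") for a strictly positive response, changes the distributional assumption without changing the formula. Which choice is appropriate for EVI is exactly the statistical question that has to be settled with knowledge of the problem.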
