Predict function in GLMM

When we fit a traditional logistic regression in R and make a prediction, for example:

library(dplyr)
n = 300
xx<-c("r1","r2","r3","r4","r5")   # levels of Rank
xxx<-c("e1","e2","e3")            # levels of School
p=0.3
# tibble() replaces the deprecated data_frame()
df1 <- tibble(
  xx1 = runif(n, min = 0, max = 10),
  xx2 = runif(n, min = 0, max = 10),
  xx3 = runif(n, min = 0, max = 10),
  School = factor(sample(xxx, n, replace=TRUE)),
  Rank = factor(sample(xx, n, replace=TRUE)),
  yx = as.factor(rbinom(n, size = 1, prob = p))
)
df1
# Ordinary logistic regression
mm<-glm(yx ~ xx1 + xx2 + xx3 + School + Rank,binomial,df1)
# New observation to predict
n11 = data.frame(School="e3",Rank="r2",xx1=8.58,xx2=8.75,xx3=7.92)

predict(mm, n11, type="response") # In my specific case

or predict(mm, n11)

depending on what we are interested in, there is no problem.

But when we work with GLMM, for example

library(lme4)
mm2 <- glmer(yx ~ xx1 + xx2 + xx3 + Rank +  (Rank | School), data = df1, 
family = "binomial",control = glmerControl(calc.derivs = FALSE))
predict(mm2, n11, type="response") # In my specific case

R throws the error

 Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

I also tried this:

 predict(m2,n11, re.form=(~Rank|School))

and it gives the error

 Error in UseMethod("predict") : 
   no applicable method for 'predict' applied to an object of class "glmmadmb"

What is the correct way to make predictions from a GLMM in R?

  • Cleber, I apologize for being nosy, but I took a look at your history here on Stack Overflow. You have asked six questions, and most of them have no answers. I suspect this happens because your code is not reproducible. Take a look at this link to see how to ask a reproducible R question, so that the people who want to help you can do so in the best possible way.

  • I had never paid attention to that. I changed the question, and I hope it is better now. Thank you very much.

1 answer

The problem lies in how the new data set used for prediction is declared. In some cases (unfortunately I can't tell you exactly which), the lme4 package requires the categorical variables in the new data to be factors. So I created a new df1 taking this into account:

n = 300
xx<-c("r1", "r2", "r3", "r4", "r5")   # levels of Rank
xxx<-c("e1", "e2", "e3")              # levels of School
p=0.3
# Build the factors outside df1, with explicit levels
School = factor(sample(xxx, n, replace=TRUE), levels=xxx, ordered=FALSE)
Rank = factor(sample(xx, n, replace=TRUE), levels=xx, ordered=TRUE)

# tibble() replaces the deprecated data_frame()
df1 <- tibble(
  xx1 = runif(n, min = 0, max = 10),
  xx2 = runif(n, min = 0, max = 10),
  xx3 = runif(n, min = 0, max = 10),
  School = School,
  Rank = Rank,
  yx = as.factor(rbinom(n, size = 1, prob = p))
)

df1

Note that my code is very similar to your original. However, I forced School and Rank to have specified levels (xxx and xx, respectively), and I declared School as unordered and Rank as ordered. I also created objects called School and Rank outside df1. This will be important later.

So far, this is not much different from what you did. Now look at how I defined n11, the data set on which the prediction will be made:

mm<-glm(yx ~ xx1 + xx2 + xx3 + School + Rank,binomial,df1)

n11 = data.frame(School=sort(unique(School))[3], 
  Rank=sort(unique(Rank))[2], xx1=8.58, xx2=8.75, xx3=7.92)

Note that I set the values of School and Rank from the School and Rank objects I created earlier. So sort(unique(School))[3] is the third level of School, and sort(unique(Rank))[2] is the second level of Rank. Because subsetting a factor keeps its full set of levels, both columns of n11 are factors with the same levels as in df1. Now just make the predictions:

predict(mm, n11, type="response")
        1 
0.3715539

library(lme4)
mm2 <- glmer(yx ~ xx1 + xx2 + xx3 + Rank +  (Rank | School), data = df1, 
             family = "binomial",control = glmerControl(calc.derivs = FALSE))

predict(mm2, n11, type="response") # In my specific case
        1 
0.4048813 
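
To illustrate why the declaration of the new data matters, here is a minimal sketch in base R only. The names school_levels, school, one_level and full_levels are made up for the illustration and are not part of the code above. A factor built from a bare string has a single level, and base R cannot build contrasts for it, whereas an element taken from an existing factor keeps all of its levels. Presumably this is what triggers the contrasts error in predict(), but that link to lme4's internals is an assumption, not something verified here.

```r
# Hypothetical stand-in for the School factor used above
school_levels <- c("e1", "e2", "e3")
school <- factor(c("e1", "e3", "e2", "e3"), levels = school_levels)

# A factor built from a bare string knows only one level
one_level <- factor("e3")
nlevels(one_level)

# An element taken from the existing factor keeps all three levels
full_levels <- sort(unique(school))[3]
nlevels(full_levels)
as.character(full_levels)

# Building a design matrix from the one-level factor raises an error
fails <- inherits(
  try(model.matrix(~ f, data.frame(f = one_level)), silent = TRUE),
  "try-error"
)
fails

# The version that carries all levels works
design <- model.matrix(~ f, data.frame(f = full_levels))
colnames(design)
```

Printing nlevels() for each column of your own new data is a quick way to check whether it carries the same levels as the data used to fit the model.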

I admit that writing the levels to be predicted as School=sort(unique(School))[3] and Rank=sort(unique(Rank))[2] is a little ugly, but this is the only way I know to make it work.
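
A possibly tidier spelling is to name the level directly and borrow the levels from the original factor with levels(). The sketch below is only an illustration under that assumption: school, rank, by_position and by_name are hypothetical stand-ins, not objects from the code above. It checks that both spellings produce the same one-row data frame.

```r
# Stand-in factors with the same levels as School and Rank above
school <- factor(c("e1", "e2", "e3"), levels = c("e1", "e2", "e3"))
rank <- factor(c("r1", "r2", "r3", "r4", "r5"),
               levels = c("r1", "r2", "r3", "r4", "r5"), ordered = TRUE)

# Spelling used in this answer: pick the level by position
by_position <- data.frame(School = sort(unique(school))[3],
                          Rank = sort(unique(rank))[2],
                          xx1 = 8.58, xx2 = 8.75, xx3 = 7.92)

# Alternative: name the level and reuse the original levels
by_name <- data.frame(School = factor("e3", levels = levels(school)),
                      Rank = factor("r2", levels = levels(rank),
                                    ordered = TRUE),
                      xx1 = 8.58, xx2 = 8.75, xx3 = 7.92)

# Compare the two data frames column by column and as a whole
same <- identical(by_position, by_name)
same
```

Naming the level avoids depending on the position of a level in the sorted factor, at the cost of repeating the levels() call for each factor.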

  • Marcus Nunes, thank you very much for your reply. I just wanted to show an answer that was sent to me on the English Stack Overflow, which in my tests gives the same result. With it you can keep the structure I had originally sent, as long as you do what you did (and which I had wrongly left out), namely including the levels and ordered. God bless you. n11 = data.frame(School=factor("e3", levels = levels(df1$School),ordered=FALSE), Rank=factor("r2", levels = levels(df1$Rank),ordered=TRUE),xx1=8.58,xx2=8.75,xx3=7.92)

  • From what I saw, both answers rest on the same principle. The change in the declaration of n11 is only aesthetic. The two versions end up being identical for the purpose of prediction. I'm glad that two people independently arrived at the same result and that both serve you well.
