Complete Separation in Hurdle Model

In a hurdle model analysis, how can one handle a variable that shows complete (or almost complete) separation in the binomial part of the model?

1 answer

Short answer

You can't, at least not directly. Under complete separation the likelihood function has no finite maximum, and this breaks the estimation of the parameters of the logistic part of the model.

Not so short answer

It depends. Complete separation in logistic regression can be handled by using a penalized likelihood. There are several ways to do this in R, although none of them are implemented in the pscl package (at least to my knowledge).
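To make the idea concrete, here is a minimal sketch of what a penalty buys you under separation. It is written in plain Python rather than R only so that it is self-contained, and everything in it is an assumption for illustration: the toy data, the function name `fit_logistic`, and the choice of a simple ridge (L2) penalty on the slope. In practice people often use Firth's bias-reducing penalty instead (the R package logistf implements it), so read this as a demonstration of the principle, not as the method any particular package uses.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(x, y, ridge=0.0, iters=50):
    """Newton-Raphson for logit P(y = 1) = b0 + b1 * x.

    Maximizes loglik(b0, b1) - (ridge / 2) * b1**2; ridge=0 is plain ML.
    (Hypothetical helper written for this illustration.)
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        g1 -= ridge * b1              # the penalty pulls the slope toward 0
        h11 += ridge
        det = h00 * h11 - h01 * h01
        if det < 1e-12:               # information matrix numerically singular
            break
        # Newton step: solve the 2x2 system H * d = g by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data (invented): x perfectly separates y between 4 and 5.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Plain ML: the slope keeps growing the longer the optimizer runs.
_, b1_short = fit_logistic(x, y, ridge=0.0, iters=5)
_, b1_long = fit_logistic(x, y, ridge=0.0, iters=20)

# Ridge-penalized fit: the slope settles at a finite value.
b0_pen, b1_pen = fit_logistic(x, y, ridge=1.0)

print("ML slope after 5 and 20 iterations:", b1_short, b1_long)
print("penalized intercept and slope:", b0_pen, b1_pen)
```

Without the penalty the slope has nowhere to stop, because a steeper curve always fits separated data a little better. With the penalty the estimate is finite, but its size depends on the penalty strength (`ridge=1.0` here is arbitrary), so that choice would have to be justified in a real analysis.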

Another, easier way to deal with this is to leave the problematic variable out of the analysis.

A third alternative, if the variable causing the complete separation is categorical, is to combine it with other categorical variables, provided this makes sense in the context of your problem.

  • Hey Marcus, thanks for the tips! The variable in question is quantitative, so I can't combine it with others. I have thought about leaving it out of the analysis, but it is very important in my model, and other studies demonstrate this in contexts similar to the one I am working in. Is it possible to use penalized likelihood within the hurdle approach? I searched for this, but I confess I did not find satisfactory results :( Any help is welcome!

  • In theory, it is possible to use penalized likelihood with hurdle models, because the model splits the likelihood into two independent likelihoods, one for the counts and one for the proportions. But I think Achim Zeileis did not implement this in the pscl package, which is the package I know for working with hurdle models. So someone would have to program this specific function, derive the likelihood and so on, to be able to fit this kind of model. But doing that would, by itself, already be enough for a master's dissertation (perhaps a doctoral thesis) in a statistics department.

  • My dissertation, which is in Ecology, will soon turn into a statistics one, hahaha. Okay, I understand it better now, and I'm going to decide on the most logical way to treat this variable. Thanks! :)

  • I understand you. To write my thesis, which was in statistics, I had to learn biology, bioinformatics and chemistry, just to understand how to develop a new statistical method to deal with RNA-Seq data. Anyone who wants to do multidisciplinary work ends up going through this.

  • Hahahaha, exactly!
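For readers who want to see the split of the likelihood mentioned in the comments, it can be written out as follows. The notation is assumed here, not taken from the thread: \pi_i is the probability that observation i is positive, modeled through parameters \gamma (the binomial part), and f is the probability mass function of the count distribution (Poisson or negative binomial, say) with parameters \beta and covariates x_i.

```latex
\ell(\gamma, \beta)
  = \underbrace{\sum_{i:\, y_i = 0} \log\bigl(1 - \pi_i(\gamma)\bigr)
      + \sum_{i:\, y_i > 0} \log \pi_i(\gamma)}_{\ell_{\text{zero}}(\gamma)}
  \;+\;
    \underbrace{\sum_{i:\, y_i > 0}
      \Bigl[\log f(y_i;\, x_i, \beta)
      - \log\bigl(1 - f(0;\, x_i, \beta)\bigr)\Bigr]}_{\ell_{\text{count}}(\beta)}
```

Because \gamma appears only in the first term and \beta only in the second, complete separation affects \ell_{\text{zero}} alone, and a penalty added to that term leaves the count part unchanged.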
