A while ago, here on Stack Overflow, I commented on variable selection (link to the publication). The variable selection problem is similar to the model selection problem: we are trying to choose the simplest model that explains our data (in statistics, we always prefer the simplest model that adequately describes the data).
But to run the kind of sum-of-squares test you want, the models being compared must be nested. The problem is that your models are not nested, so a hypothesis test of this kind makes no sense: they are not simpler and more complex versions of the same model. The nonlinear functions defined by the arguments fct = BC.4()
and fct = LL.3()
are simply different. Therefore, from the standpoint of nonlinear model theory (see Bates and Watts, Nonlinear Regression Analysis and Its Applications (1988), pp. 103-104), the test you are trying to apply has no meaning. It can be carried out numerically, because the residual sum of squares can be computed for each model, but such a test has no theoretical backing.
What can be done is to compare two nested models. For example,
library(drc)  # provides drm(), BC.4(), BC.5() and the lettuce data
lett.BC5 <- drm(weight ~ conc, data = lettuce, fct = BC.5())
lett.BC4 <- drm(weight ~ conc, data = lettuce, fct = BC.4())
The only difference between the nonlinear functions specified by fct = BC.5()
and fct = BC.4()
is that BC.5()
has one extra parameter, the lower limit c, which BC.4() fixes at 0:
summary(lett.BC5)
Model fitted: Brain-Cousens (hormesis) (5 parms)
Parameter estimates:
Estimate Std. Error t-value p-value
b:(Intercept) 1.502065 0.352231 4.2644 0.002097 **
c:(Intercept) 0.280173 0.248569 1.1271 0.288836
d:(Intercept) 0.963030 0.078186 12.3171 6.164e-07 ***
e:(Intercept) 1.120457 0.612908 1.8281 0.100799
f:(Intercept) 0.988182 0.776136 1.2732 0.234846
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error:
0.1149117 (9 degrees of freedom)
summary(lett.BC4)
Model fitted: Brain-Cousens (hormesis) with lower limit fixed at 0 (4 parms)
Parameter estimates:
Estimate Std. Error t-value p-value
b:(Intercept) 1.282812 0.049346 25.9964 1.632e-10 ***
d:(Intercept) 0.967302 0.077123 12.5423 1.926e-07 ***
e:(Intercept) 0.847633 0.436093 1.9437 0.08059 .
f:(Intercept) 1.620703 0.979711 1.6543 0.12908
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error:
0.1117922 (10 degrees of freedom)
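The nesting can also be checked directly on the mean functions. Below is a base-R sketch; the formula is my reading of the Brain-Cousens parameterization used by drc, so treat it as an assumption rather than the package's exact internals:

```r
# Brain-Cousens mean function, as (I believe) drc parameterizes BC.5():
#   f(x) = c + (d - c + f*x) / (1 + exp(b*(log(x) - log(e))))
bc5 <- function(x, b, c, d, e, f) {
  c + (d - c + f * x) / (1 + exp(b * (log(x) - log(e))))
}

# BC.4() is the same curve with the lower limit c fixed at 0:
bc4 <- function(x, b, d, e, f) bc5(x, b, c = 0, d = d, e = e, f = f)

# At c = 0 the two curves coincide for every x, which is exactly what
# makes the BC.4 model a nested (restricted) version of the BC.5 model:
x <- c(0.25, 0.5, 1, 2, 4)
all.equal(bc4(x, b = 1.5, d = 0.96, e = 1.12, f = 0.99),
          bc5(x, b = 1.5, c = 0, d = 0.96, e = 1.12, f = 0.99))
```

This is why the sum-of-squares test is legitimate for the BC.5/BC.4 pair but not for BC.4 versus LL.3: no setting of the parameters turns one of those curves into the other.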
In this way, it is possible to compare the models lett.BC5
and lett.BC4
via their residual sums of squares, using the F-test for nested models:
anova(lett.BC5, lett.BC4)
1st model
 fct:      BC.4()
2nd model
 fct:      BC.5()

ANOVA table

          ModelDf     RSS Df F value p value
1st model      10 0.12498
2nd model       9 0.11884  1  0.4644  0.5127
(see ?anova.drc for more information)
Since the p-value is greater than 0.05, there is no evidence that the extra parameter improves the fit, so we opt for lett.BC4
, which is simpler.
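As a sanity check, the F statistic in that table can be reproduced by hand from the two residual sums of squares. The sketch below uses the rounded values printed above, so the result differs from the table in the last digit:

```r
rss.BC4 <- 0.12498; df.BC4 <- 10  # reduced model, BC.4 (4 parameters)
rss.BC5 <- 0.11884; df.BC5 <- 9   # full model, BC.5 (5 parameters)

# F = [(RSS_reduced - RSS_full) / (df_reduced - df_full)] / (RSS_full / df_full)
Fstat <- ((rss.BC4 - rss.BC5) / (df.BC4 - df.BC5)) / (rss.BC5 / df.BC5)
pval  <- pf(Fstat, df.BC4 - df.BC5, df.BC5, lower.tail = FALSE)

round(Fstat, 3)  # close to the 0.4644 printed by anova()
round(pval, 3)   # close to the 0.5127 printed by anova()
```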
Note that I did not answer the main question. Perhaps your real interest is in comparing the function families LL
and BC
and deciding which family best fits your data. Unfortunately, I do not know of a statistical method, such as a hypothesis test, that solves this problem, precisely because the families are not nested. I can offer two suggestions for deciding between LL
and BC
:
1) Choose the best possible model within each of the families LL
and BC
, using the methodology above. Then analyze the residuals of the two models found and, based on that residual analysis, see which model violates the fewest assumptions.
2) Make a conscious choice. Check your field's literature to see whether models from LL
(log-logistic) or BC
(Brain-Cousens modified log-logistic) are more commonly used, and why. Or, since you are making a parametric fit to the data, justify either of the two options by its interpretability or because your data shows behavior that resembles it. Or test some other function family, such as Weibull; your results may turn out even better.
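For suggestion 1, the residual comparison might look like the sketch below. It assumes the drc package is installed (the lettuce data ships with it), and LL.3() here merely stands in for whichever LL model wins within its own family:

```r
library(drc)  # assumed installed; lettuce comes with the package

# best candidate from each family (BC.4 from the comparison above;
# LL.3 is just an illustrative pick from the LL family)
m.BC <- drm(weight ~ conc, data = lettuce, fct = BC.4())
m.LL <- drm(weight ~ conc, data = lettuce, fct = LL.3())

op <- par(mfrow = c(2, 2))
# residuals vs fitted: look for patterns or unequal spread
plot(fitted(m.BC), residuals(m.BC), main = "BC.4: residuals vs fitted")
abline(h = 0, lty = 2)
plot(fitted(m.LL), residuals(m.LL), main = "LL.3: residuals vs fitted")
abline(h = 0, lty = 2)
# normal QQ plots: look for departures from the straight line
qqnorm(residuals(m.BC), main = "BC.4: QQ plot"); qqline(residuals(m.BC))
qqnorm(residuals(m.LL), main = "LL.3: QQ plot"); qqline(residuals(m.LL))
par(op)
```

Whichever model shows the less structured residuals and the smaller departure from normality is the one I would keep.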