The first thing I would do, before proceeding with the analysis, is to organize the dataset. It is not necessary to use nor attach
nor convert variables whenever it is to adjust a different model:
library(nlme)
library(ggplot2)
dados = read.table("dadosnovo.csv", header = T, sep=",", dec=".")
dados$G <- factor(dados$G)
dados$S <- factor(dados$S)
dados$S1 <- factor(dados$S1)
str(dados)
##'data.frame': 471 obs. of 5 variables:
## $ G : Factor w/ 5 levels "G1","G2","G3",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ S : Factor w/ 57 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ S1 : Factor w/ 13 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Peso : num 39.2 40.6 41.9 42.8 43 43.2 42.3 42.9 42.5 42.6 ...
## $ Tempo: int 1 2 3 4 5 6 7 8 9 10 ...
Note that the new data frame dados
has all five columns with the type of data they should have.
My second step would be to do a graphical exploratory analysis of the data. For this, I will plot a graph of panels, in which each panel refers to a Swiss mouse and its colors refer to the group of which they are part. Since the cage isn’t important in this analysis, I didn’t put it on my chart.
ggplot(dados, aes(x = Tempo, y = Peso)) +
geom_line(aes(colour = G)) +
facet_wrap(~ S) +
scale_x_continuous(breaks = seq(3, 12, 3))
With this, we can see how the weight behavior of each subject during the 12 weeks of the experiment is. Even, it is possible to notice that not everyone has reached the end (although I suspect you already know this).
Now we can proceed with the analysis. As the models 1 to 3 had no problems, I go straight to the analysis of model 4. When choosing a correlation structure of type ARMA, it is necessary to define the degrees p and q of the autoregressive polynomials and moving average, respectively. I won’t get into the merit of how to do this, but I recommend the book Mixed Effects Models and Extensions in Ecology with R (Zuur et al., 2009) for a discussion on this subject.
This being said, to adjust a correlation structure ARMA(1, 1) (i.e., p=1 and q=1), just rotate
model4 <- lme(Peso ~ G + Tempo, random=list(S=pdIdent(~1)),
correlation = corARMA(p=1, q=1, form= ~ 1 | S ), data=dados)
summary(model4)
## Linear mixed-effects model fit by REML
## Data: dados
## AIC BIC logLik
## 2069.962 2111.382 -1024.981
##
## Random effects:
## Formula: ~1 | S
## (Intercept) Residual
## StdDev: 1.655606 3.900222
##
## Correlation Structure: ARMA(1,1)
## Formula: ~1 | S
## Parameter estimate(s):
## Phi1 Theta1
## 0.7932553 0.2635397
## Fixed effects: Peso ~ G + Tempo
## Value Std.Error DF t-value p-value
## (Intercept) 43.62625 1.5002678 413 29.078976 0.0000
## GG2 -1.42735 1.7021576 52 -0.838551 0.4056
## GG3 -3.30517 1.6971294 52 -1.947506 0.0569
## GG4 -3.47787 1.6902994 52 -2.057548 0.0447
## GG5 -3.46626 1.6964361 52 -2.043260 0.0461
## Tempo 0.21538 0.0832649 413 2.586674 0.0100
## Correlation:
## (Intr) GG2 GG3 GG4 GG5
## GG2 -0.802
## GG3 -0.801 0.687
## GG4 -0.802 0.689 0.690
## GG5 -0.803 0.687 0.688 0.690
## Tempo -0.361 0.098 0.090 0.082 0.093
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.8640659 -0.3889211 0.1319522 0.7970502 2.8488140
##
## Number of Observations: 471
## Number of Groups: 57
Ready. Your model is adjusted and can now be compared with others, through verisimilitude ratio tests, so that it is decided the best way to model this data.
It is very difficult to give you any definite suggestion without having a description of what experiment was run. I will list three doubts I have. 1) At first glance, from the experience I have in the area, it makes me strange that
Tempo
is considered a factor in model fit. Is there any special reason why this specific covariable is not considered continuous? 2) I supposeS
is the variable that identifies the subject in the experiment andG
be the group to which each subject is a part. Is that right? 3) And the variableS1
? She doesn’t get into modeling right now?– Marcus Nunes