Use of summarySE function for graph construction in ggplot2

Hello, I am looking for help improving an R script that I am developing with the ggplot2 package. I am getting some errors and would appreciate feedback. In the study, I evaluated the effect of intra- and interspecific interactions between two species of tadpoles, each with two size classes ("p" and "g"), on the use of spatial resources, using two response variables: vertical displacement (sec) and horizontal displacement (cm).

Errors:

  • When running the summarySE function: "Warning message: In qt(conf.interval/2 + 0.5, datac$N - 1) : NaNs produced"

  • When running legend.title=element_blank(): Error: unexpected symbol in: "legend.title=element_blank() legend.title"

The script follows:

# Import data. Packages for the plot
pmed<-read.table("https://drive.google.com/open?id=1cET5cROSb-_D-cZYud3yKBHGR-qamind",header=T, sep=',') 
require(Rmisc)
require(ggplot2)
require(EnvStats)
require(sciplot)
require(dplyr)
summarypmed <- summarySE(pmed, measurevar="PV", groupvars=c("ID","INT"),na.rm=T)
# plot 1
                graf1 <- ggplot(summarypmed, aes(x=INT, y= PV, fill=INT)+ facet_wrap(~INT) +  
                geom_dotplot(binwidth=0.05,binaxis="y",stackdir = "center") +                  
                geom_errorbar(aes (ymin=PV.y-se,ymax=PV.y+se),width = 0.25,size=0.25)+         
                  xlab("Tratamentos") +                  
                  ylab(" Posição Vertical (cm)") +                  
                  geom_text(aes(label = paste("N", "==",N,sep = "")), 
                                              parse = TRUE, y=-0.15) +
                  geom_point(aes(y=PV.y), size=1,show.legend = F) +
                  theme_bw () +                  
                  scale_fill_manual(values=c("grey85","grey30")) +     
                    theme(axis.ticks.x=element_blank(),                          
                    axis.title.x=element_blank(),                    
                    axis.text=element_text(size=7),                    
                    axis.text.y=element_text(size=10),                    
                    axis.text.x=element_blank(),                    
                    axis.title.y=element_text(size=11),                    
                    plot.title=element_text(size=12, hjust = 0.5),                    
                    panel.grid=element_blank(),                    
                    plot.title=element_text(vjust=2),     
                    legend.title=element_blank())

1 answer

This problem has no solution with the provided dataset. The help page for the summarySE function says the following (emphasis mine):

Gives count, mean, standard deviation, standard error of the mean, and confidence interval (default 95%).

The three measures I highlighted above (standard deviation, standard error of the mean, and confidence interval) are measures of dispersion or variability. All three depend on the sample variance. The sample variance formula is given by

$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 $$

Note that the sum is divided by n-1. So in order to measure variability in a data set, we need at least two observations. After all, if n=1, we would be dividing by zero.
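A minimal sketch of this in base R, using made-up values (`x_single` and `x_pair` are hypothetical, not taken from the question's data). It also evaluates the same `qt()` expression that appears in the warning message, with N = 1 and the default 95% confidence level:

```r
# Hypothetical values, for illustration only
x_single <- c(3.2)        # a group with one observation
x_pair   <- c(3.2, 4.1)   # a group with two observations

# With n = 1 the sample variance divides by n - 1 = 0, so sd() returns NA
sd_single <- sd(x_single)
sd_pair   <- sd(x_pair)

# The confidence-interval multiplier from the warning message:
# qt(conf.interval/2 + 0.5, N - 1). With N = 1 the degrees of freedom are 0,
# which gives NaN (and the "NaNs produced" warning, suppressed here).
ci_mult_single <- suppressWarnings(qt(0.95 / 2 + 0.5, length(x_single) - 1))
ci_mult_pair   <- qt(0.95 / 2 + 0.5, length(x_pair) - 1)

print(c(sd_single = sd_single, ci_mult_single = ci_mult_single))
print(c(sd_pair = sd_pair, ci_mult_pair = ci_mult_pair))
```

The group with two observations gets finite values for both quantities, while the group with a single observation gets NA and NaN.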

In your original question, by grouping the data by ID (groupvars=c("ID","INT")), you are saying that you want to calculate the variability per subject ID. But look at the following code, where I count the number of subjects per ID:

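library(dplyr)
# plyr::count() tallies each unique ID and returns the "freq" column shown below
# (dplyr::count() without arguments would return a single total instead)
pmed %>%
  select(ID) %>%
  plyr::count()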
 ID freq
  1    2
  2    2
  3    2
  4    2
  5    2
  6    1
  7    1
  8    2
  9    1
 10    2
 11    2
 12    2
 13    2
 15    2
 16    1
 17    1
 18    1
 19    1
 20    1
 21    1
 22    1
 23    1
 24    1
 25    1
 26    1
 27    1
 29    1
 30    1

Many IDs have only one subject. Therefore, it is impossible to calculate a measure of variability in these cases, because there is not enough data. That's why you got a warning about NaNs being produced (NaN stands for Not a Number).
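One way to spot the problematic groups before calling summarySE is to count the observations per group and flag those with fewer than two. The sketch below uses base R only and a hypothetical toy data frame (`toy`) that merely reuses the column names from the question; the values and treatment labels are invented for illustration:

```r
# Hypothetical toy data with the same column names as in the question
toy <- data.frame(
  ID  = c(1, 1, 2, 2, 3, 4),
  INT = c("intra", "intra", "inter", "inter", "intra", "inter"),
  PV  = c(0.4, 0.6, 0.5, 0.9, 0.7, 0.3)
)

# Count the non-missing PV observations in each ID/INT combination
n_per_group <- aggregate(PV ~ ID + INT, data = toy, FUN = length)
names(n_per_group)[names(n_per_group) == "PV"] <- "N"

# Groups with N < 2 cannot have a standard deviation, standard error,
# or confidence interval
too_small <- n_per_group[n_per_group$N < 2, ]
print(too_small)
```

Any row listed in `too_small` corresponds to a group for which the dispersion measures would come out as NA or NaN.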
