Create boxplot graph of values in classes

I have these values in a dataframe:

structure(list(NotaFinal = c(23.95, 25.4, 31.55, 25.4, 27.8, 
27.3, 31.85, 20.45, 31.95, 28.55, 20, 24.95, 14.45, 22.55, 25.65, 
10.35, 27.95, 21.45, 18.45, 21.1, 12.3, 22.65, 30.35, 27.4, 12.85, 
21.95, 26.25, 28.55, 24.3, 22.35), TempoConvertido = c("21.85", 
"32.88", "42.58", "44.24", "20.06", "29.93", "49.2", "22.71", 
"76.1", "25.76", "19.79", "32.87", "15.55", "62.4", "21.25", 
"12.89", "104.76", "15.35", "13.48", "24.47", "7.37", "22.73", 
"81.42", "24.25", "6.89", "42.4", "64.08", "49.71", "17.76", 
"16.62")), .Names = c("NotaFinal", "TempoConvertido"), row.names = c(NA, 
-30L), class = "data.frame")

I use the fdt function from the fdth package:

distribuicaoDeFrequenciaTempoConvertido=fdt(as.numeric(dfTempoNota[,2]))

It creates 6 classes from the values of dfTempoNota[,2]. I need a boxplot that shows the values of dfTempoNota[,1] grouped by the classes generated in distribuicaoDeFrequenciaTempoConvertido. How can I assign each value to its class and plot the result?

  • What is dfTempoNota? It is not defined in the data above.

  • dfTempoNota is the name I gave to the data frame built from the data above.

1 answer


dfTempoNota <- structure(list(NotaFinal = c(23.95, 25.4, 31.55, 25.4, 27.8, 
                                            27.3, 31.85, 20.45, 31.95, 28.55, 20, 24.95, 14.45, 22.55, 25.65, 
                                            10.35, 27.95, 21.45, 18.45, 21.1, 12.3, 22.65, 30.35, 27.4, 12.85, 
                                            21.95, 26.25, 28.55, 24.3, 22.35), 
                              TempoConvertido = c("21.85", "32.88", "42.58", "44.24", "20.06", "29.93", "49.2", 
                                                  "22.71", "76.1", "25.76", "19.79", "32.87", "15.55", "62.4", 
                                                  "21.25", "12.89", "104.76", "15.35", "13.48", "24.47", "7.37", 
                                                  "22.73","81.42", "24.25", "6.89", "42.4", "64.08", "49.71", 
                                                  "17.76", "16.62")), 
                         .Names = c("NotaFinal", "TempoConvertido"), row.names = c(NA, -30L), class = "data.frame")

First I converted the variable TempoConvertido to numeric:

dfTempoNota$TempoConvertido <- as.numeric(dfTempoNota$TempoConvertido)

Then I used the function fdt to create the class ranges:

distribuicaoDeFrequenciaTempoConvertido <- fdth::fdt(dfTempoNota$TempoConvertido)

The fdt object stores, in its breaks element, the start of the first class, the end of the last class, and the class width. From these three values I build the vector int, which holds the boundaries of every class. The function findInterval then assigns each TempoConvertido value to the class it falls in. Finally, I rename the factor levels with the class limits from the fdt table and draw the boxplot.

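# breaks holds the start, the end and the width (h) of the fdt classes
breaks <- distribuicaoDeFrequenciaTempoConvertido[['breaks']]
# Class boundaries: start, start + h, ..., end
int <- round(seq(from = breaks[1], to = breaks[2], by = breaks[3]), 4)
# Index of the class each observation falls in
dfTempoNota$fdt <- as.factor(findInterval(dfTempoNota$TempoConvertido, int))
# Label the classes with the limits shown in the fdt table
levels(dfTempoNota$fdt) <- levels(distribuicaoDeFrequenciaTempoConvertido$table$`Class limits`)
boxplot(NotaFinal ~ fdt, data = dfTempoNota)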
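
One caveat with the snippet above: as.factor() only creates levels for classes that actually contain observations, so if a class were empty the labels could end up attached to the wrong classes. All classes are populated for this data, so the result is unaffected here. The following is a minimal sketch of a more defensive variant, not the answer's original method: it swaps findInterval() for base R's cut(), which keeps empty classes as levels. It assumes, as the answer does, that the first three entries of breaks are the start, the end and the class width; the names tabela, limites and classe are only illustrative.

```r
library(fdth)

# Data from the question, with TempoConvertido already numeric
dfTempoNota <- data.frame(
  NotaFinal = c(23.95, 25.4, 31.55, 25.4, 27.8, 27.3, 31.85, 20.45, 31.95,
                28.55, 20, 24.95, 14.45, 22.55, 25.65, 10.35, 27.95, 21.45,
                18.45, 21.1, 12.3, 22.65, 30.35, 27.4, 12.85, 21.95, 26.25,
                28.55, 24.3, 22.35),
  TempoConvertido = c(21.85, 32.88, 42.58, 44.24, 20.06, 29.93, 49.2, 22.71,
                      76.1, 25.76, 19.79, 32.87, 15.55, 62.4, 21.25, 12.89,
                      104.76, 15.35, 13.48, 24.47, 7.37, 22.73, 81.42, 24.25,
                      6.89, 42.4, 64.08, 49.71, 17.76, 16.62)
)

tabela <- fdt(dfTempoNota$TempoConvertido)

# Assumption (same as in the answer): breaks[1:3] are start, end and width
b <- unname(tabela[["breaks"]])
limites <- seq(from = b[1], to = b[2], by = b[3])

# cut() returns a factor with one level per class, including empty classes;
# right = FALSE makes the classes left-closed, like findInterval()
dfTempoNota$classe <- cut(dfTempoNota$TempoConvertido,
                          breaks = limites, right = FALSE, dig.lab = 5)

boxplot(NotaFinal ~ classe, data = dfTempoNota)
```

Because cut() generates its own interval labels, there is no need to copy the levels from the fdt table; the labels may differ slightly in formatting from the ones fdt prints.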

  • Brilliant solution! One detail: some of the class labels on the x-axis are hidden because they are too long. Would a legend be the most appropriate fix in this situation?

  • This is up to you. You can pass the argument cex.axis to the function boxplot to reduce the font size.

  • If I use cex.axis to reduce the font size, all labels are displayed only when the font is so small that it becomes unreadable. Is there an argument for spacing out the boxplots?

  • Yes, you can set the position of the boxplots with the argument at, using a vector whose length equals the number of boxplots. For example, at = c(1, 3, 5, 7, 9, 11). Here is a tutorial that can help.

  • Thanks for the tips, Rafael. I tested the at argument, but I did not find the result visually pleasing. I chose to use ggplot2 instead, showing the interval number on the x-axis and using the caption to specify the class limits.
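
The approach described in the last comment (interval numbers on the x-axis, class limits in the caption) might look roughly like the sketch below. This is not the asker's actual code, which was not posted; the names intervalo, legenda and p are hypothetical, and the classes are rebuilt with cut() from the fdt breaks under the same assumption as in the answer, namely that breaks[1:3] are the start, the end and the class width.

```r
library(fdth)
library(ggplot2)

# Data from the question, with TempoConvertido already numeric
dfTempoNota <- data.frame(
  NotaFinal = c(23.95, 25.4, 31.55, 25.4, 27.8, 27.3, 31.85, 20.45, 31.95,
                28.55, 20, 24.95, 14.45, 22.55, 25.65, 10.35, 27.95, 21.45,
                18.45, 21.1, 12.3, 22.65, 30.35, 27.4, 12.85, 21.95, 26.25,
                28.55, 24.3, 22.35),
  TempoConvertido = c(21.85, 32.88, 42.58, 44.24, 20.06, 29.93, 49.2, 22.71,
                      76.1, 25.76, 19.79, 32.87, 15.55, 62.4, 21.25, 12.89,
                      104.76, 15.35, 13.48, 24.47, 7.37, 22.73, 81.42, 24.25,
                      6.89, 42.4, 64.08, 49.71, 17.76, 16.62)
)

# Class boundaries taken from the fdt object (assumed layout of breaks)
tabela <- fdt(dfTempoNota$TempoConvertido)
b <- unname(tabela[["breaks"]])
limites <- seq(from = b[1], to = b[2], by = b[3])

# Left-closed classes; the levels are the generated interval labels
classe <- cut(dfTempoNota$TempoConvertido,
              breaks = limites, right = FALSE, dig.lab = 5)

# Short axis labels: the class number instead of the class limits
dfTempoNota$intervalo <- factor(as.integer(classe),
                                levels = seq_len(nlevels(classe)))

# Caption text mapping each class number to its limits, one per line
legenda <- paste(paste0(seq_len(nlevels(classe)), ": ", levels(classe)),
                 collapse = "\n")

p <- ggplot(dfTempoNota, aes(x = intervalo, y = NotaFinal)) +
  geom_boxplot() +
  scale_x_discrete(drop = FALSE) +  # keep empty classes on the axis
  labs(x = "Interval number", y = "NotaFinal", caption = legenda)

print(p)
```

Keeping the numbers on the axis avoids the overlapping labels discussed in the comments, at the cost of making the reader look up each class in the caption.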
