What is the difference between bar charts and column charts in R?


I would like to know the practical difference between bar charts and column charts, because I see that ggplot2, for example, has both geom_bar and geom_col. Which one is better, and what are the limitations of each?

1 answer


You can get the same results with geom_bar and geom_col. The syntax differs, but both can produce exactly the same plot.

geom_bar makes the height of the bars (or their length, if the bars are horizontal) proportional to the number of cases in each group (unless the weight aesthetic is used, but that's another story). That is, by default, geom_bar uses stat_count() to determine the size of the bars.
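As a brief aside on the weight case mentioned above, here is a minimal sketch of what it does. It assumes the mpg dataset that ships with ggplot2; the choice of hwy as the weight and the names p_weighted and totals are only for illustration.

```r
library(ggplot2)

# With the weight aesthetic, stat_count() sums the weights in each group
# instead of counting rows, so each bar is the total of hwy per drv.
p_weighted <- ggplot(mpg, aes(x = drv, weight = hwy)) +
  geom_bar() +
  labs(y = "sum of hwy", title = "geom_bar with weight")

# The same totals computed by hand, for comparison with the bar heights
totals <- tapply(mpg$hwy, mpg$drv, sum)
```

Printing p_weighted draws the plot, and layer_data(p_weighted)$count should match totals.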

geom_col is used when we want the bars to represent values that are already present in the dataset. By default, geom_col uses stat_identity() to determine the size of the bars.
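For instance, when the values to plot are already one row per category, geom_col can draw them as they are. The sketch below uses a made-up data frame (sales, region and total are hypothetical names and numbers, not from any real dataset).

```r
library(ggplot2)

# Hypothetical data: one row per category, bar height already computed
sales <- data.frame(
  region = c("North", "South", "West"),
  total  = c(120, 85, 143)
)

# geom_col() needs both x and y; each bar is drawn at its y value unchanged
p_col <- ggplot(sales, aes(x = region, y = total)) +
  geom_col()
```

Printing p_col draws three bars whose heights are the values in the total column.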

In practice, this means that geom_bar will directly make a bar graph representing the frequency of some categorical variable present in the data set. To get the same result with geom_col it is necessary to pre-process the data, first finding those frequencies.

library(tidyverse)

ggplot(mpg, aes(x = drv)) +
  geom_bar() +
  labs(title = "geom_bar")

ggplot(mpg, aes(x = drv)) +
  geom_col()
#> Error: geom_col requires the following missing aesthetics: y

mpg %>%
  group_by(drv) %>%
  count() %>%
  ggplot(aes(x = drv, y = n)) +
  geom_col() +
  labs(y = "count", title = "geom_col")

Created on 2021-05-03 by the reprex package (v2.0.0)
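One way to check that the two approaches really plot the same thing is to compare the bar heights that ggplot2 computes for each. This is only a sketch: it builds the count table with base R's table() instead of dplyr (an arbitrary choice, to keep dependencies down), and it also includes geom_bar(stat = "identity"), which as far as I know behaves like geom_col.

```r
library(ggplot2)

# geom_bar() counts the rows in each drv group by itself
p_bar <- ggplot(mpg, aes(x = drv)) + geom_bar()

# Count table built by hand, so it can be inspected before plotting
counts <- as.data.frame(table(drv = mpg$drv), responseName = "n")
p_col <- ggplot(counts, aes(x = drv, y = n)) + geom_col()

# geom_bar() with stat = "identity" also plots the y values as they are
p_identity <- ggplot(counts, aes(x = drv, y = n)) +
  geom_bar(stat = "identity")

# Bar heights as computed by each plot
heights_bar      <- layer_data(p_bar)$y
heights_col      <- layer_data(p_col)$y
heights_identity <- layer_data(p_identity)$y
```

If the three vectors of heights are equal, the plots differ only in how the counts were obtained.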

Which one to use is a matter of personal preference. I prefer geom_col, even though it takes a little more work, because I can see exactly which count table is being plotted.
