How to sort data on Y-axis using ggplot2 in R


I’m plotting some data as a bubble chart using the ggplot2 package in R.

My data is out of order on the Y axis. I would like it to follow the order of the numbers in the names, for example: CenpSat1A, CenaSat2A, ..., CenaSat15Y, but I can’t get that result by simply sorting the data.

Below is the code I used to generate the attached chart:

library(ggplot2)

ggplot(dados, aes(x = Espécies, y = DNAsat, size = Reads, 
fill=Espécies)) + geom_point(shape = 21) + theme_bw() + 
scale_fill_brewer(palette="Pastel1") + scale_size_area(max_size=13)

Output of dput(dados), to help with answers:

structure(list(DNAsat = structure(c(9L, 4L, 10L, 5L, 11L, 12L, 
13L, 14L, 15L, 6L, 1L, 2L, 7L, 8L, 3L, 9L, 4L, 10L, 5L, 11L, 
12L, 13L, 14L, 15L, 6L, 1L, 2L, 7L, 8L, 3L), .Label = c("CenaSat11B", 
"CenaSat12Y", "CenaSat15Y", "CenaSat2A", "CenaSat4A", "CenpSat10B", 
"CenpSat13Z", "CenpSat14Y", "CenpSat1A", "CenpSat3A", "CenpSat5Y", 
"CenpSat6A", "CenpSat7Y", "CenpSat8Z", "CenpSat9Y"), class = "factor"), 
    Espécies = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Cenchrus americanus", 
    "Cenchrus purpureus"), class = "factor"), Reads = c(35629, 
    32123, 33698, 31857, 31812, 30664, 7534, 7128, 6395, 1887, 
    1865, 1435, 1069, 272, 18, 28201, 26867, 27799, 26206, 25967, 
    25987, 0, 11419, 0, 11879, 11887, 336, 0, 0, 220)), class = "data.frame", row.names = c(NA, 
-30L))

[Image: bubble chart]

  • Welcome to Stack Overflow! Unfortunately, this question cannot be reproduced by anyone trying to answer it. Please take a look at this link (especially the use of the dput function) to see how to ask a reproducible question in R. That way, people who want to help will be able to do so in the best possible way.

  • Hi, thanks for the tips. I have added the dput output.

2 answers

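The fct_inorder function from the forcats package can help. It sets the factor levels to the order in which they first appear in the data, and fct_rev then reverses them so that the first one is drawn at the top of the axis: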

library(forcats)
library(ggplot2)

dados <- structure(list(DNAsat = structure(c(9L, 4L, 10L, 5L, 11L, 12L,  13L, 14L, 15L, 6L, 1L, 2L, 7L, 8L, 3L, 9L, 4L, 10L, 5L, 11L,  12L, 13L, 14L, 15L, 6L, 1L, 2L, 7L, 8L, 3L), .Label = c("CenaSat11B",  "CenaSat12Y", "CenaSat15Y", "CenaSat2A", "CenaSat4A", "CenpSat10B",  "CenpSat13Z", "CenpSat14Y", "CenpSat1A", "CenpSat3A", "CenpSat5Y",  "CenpSat6A", "CenpSat7Y", "CenpSat8Z", "CenpSat9Y"), class = "factor"),  Espécies = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,  2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,  1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Cenchrus americanus",  "Cenchrus purpureus"), class = "factor"), Reads = c(35629,  32123, 33698, 31857, 31812, 30664, 7534, 7128, 6395, 1887,  1865, 1435, 1069, 272, 18, 28201, 26867, 27799, 26206, 25967,  25987, 0, 11419, 0, 11879, 11887, 336, 0, 0, 220)), class = "data.frame", row.names = c(NA,  -30L))


ggplot(dados,aes(x = Espécies, y = fct_rev(fct_inorder(DNAsat)), size = Reads, 
                  fill=Espécies)) + geom_point(shape = 21) + theme_bw() + 
  scale_fill_brewer(palette="Pastel1") + scale_size_area(max_size=13)

Result
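
To see why this works, note that the rows of dados are already in the desired numeric order, so taking the levels in order of first appearance is enough. The following is only a base R sketch of the same idea, not the code used in the answer; `sat` is a hypothetical short vector standing in for dados$DNAsat.

```r
# Hypothetical short vector standing in for dados$DNAsat (same naming scheme).
sat <- factor(c("CenpSat1A", "CenaSat2A", "CenpSat10B", "CenaSat15Y",
                "CenpSat1A", "CenaSat2A", "CenpSat10B", "CenaSat15Y"))

# Default factor levels are sorted alphabetically, which scrambles the axis.
levels(sat)

# Base R counterpart of fct_inorder(): levels in order of first appearance.
in_order <- factor(sat, levels = unique(as.character(sat)))

# Base R counterpart of fct_rev(): reverse the levels, because a discrete
# y-axis draws the first level at the bottom.
reversed <- factor(in_order, levels = rev(levels(in_order)))
levels(reversed)
```

Using `reversed` as the y aesthetic should place CenpSat1A at the top and CenaSat15Y at the bottom, as in the plot above.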

  • Thanks for the code. I loaded the forcats package in R, but when I run the mutate line I get the following error: Error in mutate(DNAsat = DNAsat %>% fct_inorder() %>% fct_rev()) : could not find function "mutate"

  • The mutate function is in the dplyr package. Try installing tidyverse, which also includes ggplot2.

  • Is there a way to install the package externally? It is not available in the RStudio package list.

  • I was able to install it with a command.

  • And did it work after that?

  • Error loading the package: > library(tidyverse) Error: package or namespace load failed for ‘tidyverse’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]): namespace ‘rlang’ 0.4.5 is already loaded, but >= 0.4.6 is required

  • Which version of R are you using? Try installing rlang as well, although I don’t think that will be enough. You could also try installing only dplyr. I will change the code so that it doesn’t need dplyr, but you should use it because it helps a lot.

  • I’ll update the version and see if that solves it.

  • In case you need it, I quickly added a version without tidyverse, but the code gets more confusing: this nested call to fct_inorder and fct_rev is very ugly.

  • Right. I’ll keep it saved, and if the first version doesn’t work I’ll use this one. Thanks a lot.

  • It worked. Thanks again.


Another way to solve this problem is to use a regular expression to extract the numbers in DNAsat and order the levels by them. The advantage of this method is that it avoids typing out the level order by hand, which can lead to errors.

library(stringr)
library(tidyverse)

# extract only the numbers present in DNAsat
numeros <- as.numeric(str_extract(dados$DNAsat, "[[:digit:]]+"))

ggplot(dados, aes(x = Espécies, y = reorder(DNAsat, -numeros), 
    size = Reads, fill=Espécies)) + 
  geom_point(shape = 21) + 
  theme_bw() + 
  scale_fill_brewer(palette="Pastel1") + 
  scale_size_area(max_size=13)


In addition, it is very easy to reverse the chart order if necessary:

ggplot(dados, aes(x = Espécies, y = reorder(DNAsat, numeros), 
    size = Reads, fill=Espécies)) + 
  geom_point(shape = 21) + 
  theme_bw() + 
  scale_fill_brewer(palette="Pastel1") + 
  scale_size_area(max_size=13)
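
If stringr or the tidyverse cannot be loaded, the same ordering can be sketched in base R and stored in the factor itself, so that every later plot uses it. This is an illustrative variant rather than the code from this answer; `sat` is a hypothetical short vector standing in for dados$DNAsat, and the sketch assumes every level name contains at least one digit.

```r
# Hypothetical short vector standing in for dados$DNAsat.
sat <- factor(c("CenaSat11B", "CenpSat1A", "CenaSat2A", "CenpSat10B"))

# Pull the first run of digits out of each level name (base R only).
# Assumes every level contains digits; regmatches() drops non-matches.
lv <- levels(sat)
num <- as.numeric(regmatches(lv, regexpr("[0-9]+", lv)))

# Rebuild the factor with its levels sorted by that number.
sat_sorted <- factor(sat, levels = lv[order(num)])
levels(sat_sorted)
```

Wrapping the levels in rev(), as in `factor(sat, levels = rev(lv[order(num)]))`, would put the lowest number at the top of a discrete y-axis instead of the bottom.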

