How to identify a single point and label it in R

3

I want to identify and label a single point on a plot. I am working with a scatter plot that has an indicator on the y-axis and the years on the x-axis. My goal is to highlight Brazil and identify it with a label inside the plot.

The data below can be used to illustrate the problem:

pibs <- tibble::tibble(
  posicao = c(1, 2, 3, 7, 4, 5, 6, 8, 9, 11, 13, 14, 10, 12, 17, 15, 16, 18,
              20, 19, 21),
  pais = c("Brasil", "México", "Argentina", "Venezuela", "Colômbia",
           "Chile", "Peru", "Equador", "Cuba", "República Dominicana",
           "Uruguai", "Costa Rica", "Guatemala", "Panamá", "El Salvador",
           "Bolívia", "Paraguai", "Honduras", "Haiti", "Nicarágua",
           "Santa Lúcia"),
  valor = c(2055000, 1149000, 637700, 210100, 309200,
            277000, 215200, 102300, 93790, 75020, 58420,
            58060, 75660, 61840, 28020, 37120, 29620,
            22980, 8608, 13730, 1686)
)
  • 3

    Welcome to Stack Overflow! Unfortunately, this question cannot be reproduced by anyone trying to answer it. Please take a look at this link to see how to ask a reproducible question in R, so that people who wish to help can do so in the best possible way.

  • 2

    Can you please edit the question to include the output of dput(dados) or, if the dataset is too large, dput(head(dados, 20))? Note: dados is the name of the dataset, for example a data frame.
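For readers unfamiliar with dput(), here is a minimal sketch of what the comment above asks for. The data frame dados below is a made-up stand-in with three rows, not the asker's actual data.

```r
# Hypothetical stand-in for the asker's data frame
dados <- data.frame(
  pais  = c("Brasil", "México", "Argentina"),
  valor = c(2055000, 1149000, 637700)
)

# dput() prints R code that recreates the object;
# that printed text is what gets pasted into the question
dput(head(dados, 20))

# Anyone reading the question can rebuild the object from the printed text
texto <- paste(capture.output(dput(dados)), collapse = "\n")
copia <- eval(parse(text = texto))
all.equal(dados, copia)
```

Using head() keeps the pasted text short when the real table has many rows.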

2 answers

5

gghighlight is an excellent package for this.

Here is an example of how to use it:

library(tidyverse)
library(gghighlight)

ggplot(pibs, aes(x = posicao, y = valor)) +
  geom_point(color = "blue") +
  gghighlight(pais == "Brasil", label_key = pais, unhighlighted_colour = "black")

Created on 2019-02-28 by the reprex package (v0.2.1)
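A note for newer installations: as far as I know, gghighlight 0.2.0 and later deprecate unhighlighted_colour in favour of unhighlighted_params. Assuming such a version is installed, the same plot could be sketched as below. The five-row pibs_mini table is a shortened copy of the data, used only to keep the block self-contained.

```r
library(ggplot2)
library(gghighlight)

# Shortened copy of the question's data, for illustration only
pibs_mini <- data.frame(
  posicao = c(1, 2, 3, 4, 5),
  pais    = c("Brasil", "México", "Argentina", "Colômbia", "Chile"),
  valor   = c(2055000, 1149000, 637700, 309200, 277000)
)

p_gg <- ggplot(pibs_mini, aes(x = posicao, y = valor)) +
  geom_point(color = "blue") +
  gghighlight(
    pais == "Brasil",
    label_key = pais,
    # Assumption: gghighlight >= 0.2.0, where this list replaces unhighlighted_colour
    unhighlighted_params = list(colour = "black")
  )

p_gg
```

On older versions of the package, the original call with unhighlighted_colour should be used instead.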

3

First, let’s create a dataset to use in the answer. I used the datapasta package to copy and paste the table from this page.

library(tidyverse)
pibs <- tibble(
  posicao = c(1, 2, 3, 7, 4, 5, 6, 8, 9, 11, 13, 14, 10, 12, 17, 15, 16, 18,
              20, 19, 21),
  pais = c("Brasil", "México", "Argentina", "Venezuela", "Colômbia",
           "Chile", "Peru", "Equador", "Cuba", "República Dominicana",
           "Uruguai", "Costa Rica", "Guatemala", "Panamá", "El Salvador",
           "Bolívia", "Paraguai", "Honduras", "Haiti", "Nicarágua",
           "Santa Lúcia"),
  valor = c(2055000, 1149000, 637700, 210100, 309200,
            277000, 215200, 102300, 93790, 75020, 58420,
            58060, 75660, 61840, 28020, 37120, 29620,
            22980, 8608, 13730, 1686)
)

pibs
#> # A tibble: 21 x 3
#>    posicao pais                   valor
#>      <dbl> <chr>                  <dbl>
#>  1       1 Brasil               2055000
#>  2       2 México               1149000
#>  3       3 Argentina             637700
#>  4       7 Venezuela             210100
#>  5       4 Colômbia              309200
#>  6       5 Chile                 277000
#>  7       6 Peru                  215200
#>  8       8 Equador               102300
#>  9       9 Cuba                   93790
#> 10      11 República Dominicana   75020
#> # ... with 11 more rows

Created on 2019-02-27 by the reprex package (v0.2.1)

The way this works is that the table needs a column indicating which countries to highlight. We can create it like this:

pibs2 <- pibs %>% 
  mutate(mostrar = pais == "Brasil")
pibs2
#> # A tibble: 21 x 4
#>    posicao pais                   valor mostrar
#>      <dbl> <chr>                  <dbl> <lgl>  
#>  1       1 Brasil               2055000 TRUE   
#>  2       2 México               1149000 FALSE  
#>  3       3 Argentina             637700 FALSE  
#>  4       7 Venezuela             210100 FALSE  
#>  5       4 Colômbia              309200 FALSE  
#>  6       5 Chile                 277000 FALSE  
#>  7       6 Peru                  215200 FALSE  
#>  8       8 Equador               102300 FALSE  
#>  9       9 Cuba                   93790 FALSE  
#> 10      11 República Dominicana   75020 FALSE  
#> # ... with 11 more rows
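The same flag column extends to more than one country. As a small sketch, %in% can take the place of ==. The choice of México as a second highlighted country and the names pibs_mini, destaques and pibs_varios are only illustrative.

```r
library(dplyr)

# Shortened copy of the data, for illustration only
pibs_mini <- tibble(
  pais  = c("Brasil", "México", "Argentina", "Chile"),
  valor = c(2055000, 1149000, 637700, 277000)
)

# TRUE for every country listed in `destaques`, FALSE otherwise
destaques <- c("Brasil", "México")
pibs_varios <- pibs_mini %>%
  mutate(mostrar = pais %in% destaques)

pibs_varios
```

The rest of the approach stays the same, because the plot only looks at the logical column.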

Now, with the column in hand, it can be used to achieve the goal. To highlight the point with color, just map the new column to the color aesthetic.

p <- ggplot(pibs2, aes(posicao, valor, col = mostrar)) +
  geom_point() +
  scale_color_manual(values = c("black", "darkgreen"))

p
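One detail worth making explicit: with an unnamed vector, scale_color_manual() assigns the colors in the order of the levels, which for a logical column is FALSE and then TRUE. A small sketch with a named vector removes the dependence on that order. The pibs_mini table and the names cores and p_cores are illustrative only.

```r
library(ggplot2)

# Shortened copy of the data, for illustration only
pibs_mini <- data.frame(
  posicao = c(1, 2, 3),
  pais    = c("Brasil", "México", "Argentina"),
  valor   = c(2055000, 1149000, 637700)
)
pibs_mini$mostrar <- pibs_mini$pais == "Brasil"

# Names tie each color to a value of `mostrar`, regardless of the order written
cores <- c("TRUE" = "darkgreen", "FALSE" = "black")

p_cores <- ggplot(pibs_mini, aes(posicao, valor, col = mostrar)) +
  geom_point() +
  scale_color_manual(values = cores)

p_cores
```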

After that, we can filter the table and pass only the row that matters to the text layer.

p + 
  geom_text(aes(label = pais), data = filter(pibs2, pais == "Brasil"),
            position = position_nudge(1)) +
  theme(legend.position = "none")
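Putting the pieces together, here is a self-contained sketch that follows the suggestion from the comments and passes show.legend = FALSE to the text layer. Unlike theme(legend.position = "none"), this keeps the color legend and only leaves the text glyph out of it. The pibs_mini table and the nudge of 0.3 are illustrative choices, not values from the answer.

```r
library(ggplot2)

# Shortened copy of the data, for illustration only
pibs_mini <- data.frame(
  posicao = c(1, 2, 3, 4, 5),
  pais    = c("Brasil", "México", "Argentina", "Colômbia", "Chile"),
  valor   = c(2055000, 1149000, 637700, 309200, 277000)
)
pibs_mini$mostrar <- pibs_mini$pais == "Brasil"

p_final <- ggplot(pibs_mini, aes(posicao, valor, col = mostrar)) +
  geom_point() +
  geom_text(
    aes(label = pais),
    data = subset(pibs_mini, mostrar),  # only the row to be labelled
    nudge_x = 0.3,                      # shift the label to the right of the point
    show.legend = FALSE                 # keep the text glyph out of the legend key
  ) +
  scale_color_manual(values = c("FALSE" = "black", "TRUE" = "darkgreen"))

p_final
```

The nudge is expressed in data units, so a suitable value depends on the range of the x-axis.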

  • 2

    To get rid of that ugly "a" in the legend, use position = position_nudge(2), show.legend = FALSE. Note the change in the value of position_nudge. Other than that, upvoted.

  • 1

    I did not want to complicate the answer too much (by giving a lot of new information), but I will accept the suggestion.

  • Thanks for the answers
