Consider return routes in the same way

I'm having trouble coding the variable rota in R so that it takes the same value whenever a route connects the same two airports, regardless of which one is the origin (the first 4 characters of rota) and which is the destination (the last 4 characters). The data are as follows:

    base <- data.frame(rota = c("SBAA - SBEE", "SBAA - SBBR", "SBAA - SBCI",
                                "SBEE - SBAA", "SBEE - SBBR", "SBBR - SBEE"),
                       assentos = c(1231, 1021, 715, 989, 759, 695))

    base$rota<-as.character(base$rota)

      rota        assentos
      <chr>          <dbl>
    1 SBAA - SBEE     1231
    2 SBAA - SBBR     1021
    3 SBAA - SBCI      715
    4 SBEE - SBAA      989
    5 SBEE - SBBR      759
    6 SBBR - SBEE      695

I thought of making a transformation to generate the variable codigo:

    base$codigo <- as.numeric(as.factor(base$rota))

However, this gives different codes to routes that connect the same airports but have the origin and destination reversed. For example, "SBAA - SBEE" and "SBEE - SBAA" should have the same code, but every row ends up with a distinct one:

      rota        assentos codigo
      <chr>          <dbl>  <dbl>
    1 SBAA - SBEE     1231      3
    2 SBAA - SBBR     1021      1
    3 SBAA - SBCI      715      2
    4 SBEE - SBAA      989      5
    5 SBEE - SBBR      759      6
    6 SBBR - SBEE      695      4

I need routes that connect the same airports to share a code, so that the variable codigo gives the following result:

      rota        assentos codigo
      <chr>          <dbl>  <dbl>
    1 SBAA - SBEE     1231      1
    2 SBAA - SBBR     1021      2
    3 SBAA - SBCI      715      3
    4 SBEE - SBAA      989      1
    5 SBEE - SBBR      759      4
    6 SBBR - SBEE      695      4

Note that the code for "SBAA - SBEE" is identical to "SBEE - SBAA".


Solution

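    library(dplyr)
    library(stringr)
    library(purrr)

    # Extract both airport codes, sort them so direction no longer matters,
    # rebuild the route string, and number the distinct results.
    base %>%
        mutate(codigo = as.integer(factor(map_chr(
            str_extract_all(rota, "\\w+"),
            ~ str_c(sort(.x), collapse = " - ")))))

A base R alternative that builds the direction-independent route string:

    base[["rota"]] <- as.character(base[["rota"]])
    base[["rota_unica"]] <- unlist(lapply(strsplit(base[["rota"]], " - "), function(x) {
        x <- sort(x, method = "radix")
        paste0(x, collapse = " - ")
    }))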
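The dplyr approach above groups reversed routes correctly, but factor() numbers its levels in sorted order, so on the example data the codes come out as 3, 1, 2, 3, 4, 4 rather than the 1, 2, 3, 1, 4, 4 shown in the question. If the exact numbering matters, a minimal base R sketch is to number the keys in order of first appearance instead. It assumes every airport code is exactly 4 characters, as the question states; the names origem, destino, and chave are hypothetical helpers introduced only for this illustration.

```r
# Example data from the question
base <- data.frame(
  rota = c("SBAA - SBEE", "SBAA - SBBR", "SBAA - SBCI",
           "SBEE - SBAA", "SBEE - SBBR", "SBBR - SBEE"),
  assentos = c(1231, 1021, 715, 989, 759, 695),
  stringsAsFactors = FALSE
)

# Assumption: origin is the first 4 characters, destination the last 4
origem  <- substr(base$rota, 1, 4)
destino <- substr(base$rota, nchar(base$rota) - 3, nchar(base$rota))

# Direction-independent key: the alphabetically smaller airport comes first
# (pmin/pmax compare character vectors element by element)
chave <- paste(pmin(origem, destino), pmax(origem, destino), sep = " - ")

# Number the keys in order of first appearance rather than sorted order
base$codigo <- match(chave, unique(chave))

print(base)
```

Because pmin() and pmax() are vectorized, this avoids the per-row sort() call, and match(chave, unique(chave)) is what keeps the numbering in the order the routes first appear.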
