How to count strings from a variable

4

I have a variable in my database, COMPOSIÇÃO DA COLIGAÇÃO (coalition composition), that looks like this:

DS_COMPOSICAO DA COLIGACAO
AVANTE / PDT / PODE / PMN
AVANTE / PR / PV
DC / PRTB / AVANTE / SOLIDARIEDADE / PRP / PATRI 
DC / PSL / PRTB / SOLIDARIEDADE...

I want to create a new variable holding the number of parties in each coalition (each row).

For example, the first row has 4 parties and the third has 6.

My database has 29,400 rows, so this can't be done manually.

4 answers

4

I will use the data as given in the answer by Tomás Barcellos.

A single line of base R code solves the problem.

lengths(strsplit(dado[["DS_COMPOSICAO_DA_COLIGACAO"]], "/"))
#[1] 4 3 6 4

Now just create the new column with this instruction.

dado$contagem <- lengths(strsplit(dado[["DS_COMPOSICAO_DA_COLIGACAO"]], "/"))
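One caveat, as a minimal sketch: strsplit() returns a missing value as a single piece, so lengths() would report 1 party for a row where no coalition is recorded. If the real column can contain NA (an assumption, the sample data has none), a small wrapper can pass the missing value through. The names contar_partidos and exemplo are hypothetical, used only for illustration.

```r
# Hypothetical helper: count the pieces separated by "/",
# but return NA instead of 1 when the input itself is NA.
contar_partidos <- function(x) {
  n <- lengths(strsplit(x, "/", fixed = TRUE))
  n[is.na(x)] <- NA_integer_
  n
}

# Illustrative input, not taken from the real database
exemplo <- c("AVANTE / PDT / PODE / PMN", NA, "AVANTE")
contar_partidos(exemplo)
#[1]  4 NA  1
```

A coalition with a single party and no "/" is still counted as 1, which is usually what is wanted here.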

4

One way is to count the "/" characters and add one, since the first party is never preceded by a slash.

library(tidyverse)
dado <- tibble(
  DS_COMPOSICAO_DA_COLIGACAO = c(
    "AVANTE / PDT / PODE / PMN", "AVANTE / PR / PV",
    "DC / PRTB / AVANTE / SOLIDARIEDADE / PRP / PATRI",
    "DC / PSL / PRTB / SOLIDARIEDADE"
  )
)


dado %>% 
  mutate(quantidade = str_count(DS_COMPOSICAO_DA_COLIGACAO, "/") + 1)

# A tibble: 4 x 2
  DS_COMPOSICAO_DA_COLIGACAO                       quantidade
  <chr>                                                 <dbl>
1 AVANTE / PDT / PODE / PMN                                 4
2 AVANTE / PR / PV                                          3
3 DC / PRTB / AVANTE / SOLIDARIEDADE / PRP / PATRI          6
4 DC / PSL / PRTB / SOLIDARIEDADE                           4
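The same "count the slashes and add one" idea can also be sketched without the tidyverse, by comparing the length of each string before and after removing the "/" characters. This is only an illustration under the assumption that every "/" is a separator between two parties; the vector x and the variable names are hypothetical.

```r
# Illustrative data, same four coalitions as above
x <- c(
  "AVANTE / PDT / PODE / PMN", "AVANTE / PR / PV",
  "DC / PRTB / AVANTE / SOLIDARIEDADE / PRP / PATRI",
  "DC / PSL / PRTB / SOLIDARIEDADE"
)

# Number of "/" = characters lost when every "/" is deleted
n_barras <- nchar(x) - nchar(gsub("/", "", x, fixed = TRUE))

# One more party than separators
quantidade <- n_barras + 1
quantidade
#[1] 4 3 6 4
```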

4

A solution without dependency on external packages:

partidos_txt <- c("AVANTE / PDT / PODE / PMN
AVANTE / PR / PV
DC / PRTB / AVANTE / SOLIDARIEDADE / PRP / PATRI 
DC / PSL / PRTB / SOLIDARIEDADE")

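# Split the text block into one coalition per line and trim stray whitespace
coligs <- trimws(unlist(strsplit(partidos_txt, split = "\n", fixed = TRUE)))

# For each coalition, split on "/" and count the resulting pieces
lista <- lapply(coligs, function(x) {
  count <- length(strsplit(x, "/", fixed = TRUE)[[1]])
  data.frame(colig = x, count = count, stringsAsFactors = FALSE)
})

# Stack the one-row data frames into a single table
do.call(rbind, lista)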

Producing:

> do.call(rbind, lista)
                                             colig count
1                        AVANTE / PDT / PODE / PMN     4
2                                 AVANTE / PR / PV     3
3 DC / PRTB / AVANTE / SOLIDARIEDADE / PRP / PATRI     6
4                  DC / PSL / PRTB / SOLIDARIEDADE     4

-6

Assuming "/" always follows the pattern shown:

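-- varchar(100) so the 25-character sample string is not truncated
declare @myvar varchar(100)
set @myvar = 'AVANTE / PDT / PODE / PMN'

-- number of "/" characters, plus one for the first party
select (len(@myvar) - len(replace(@myvar, '/', ''))) + 1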

Another way is to split on "/"...
