How to count repetitions in a range using length(which()) in R

I created a data frame called producao:

producao<-data.frame(aluno = c("Pedro", "Joao", "Marcio", "Pedro", "Joao", "Pedro", "Marcio", "Marcio", "Marcio"),
                     qualis = c("A1", "B2", "A1", "B2", "NP", "C", "A2", "A1", "B2"))

producao

> producao
   aluno qualis
1  Pedro     A1
2   Joao     B2
3 Marcio     A1
4  Pedro     B2
5   Joao     NP
6  Pedro      C
7 Marcio     A2
8 Marcio     A1
9 Marcio     B2

I would like to know how to count the occurrences of a given value in the "qualis" column. I am particularly interested in how to do this with the function length() combined with which().

If I want to know how many "A1" there are:

length(which(producao$qualis == "A1"))

[1] 3

But, what if I want to know how many records there are in the range from "A1" to "A2"?

I can do it using %in%:

length(which(producao$qualis %in% c("A1", "A2")))

[1] 4

With only a few levels it is easy to list them separated by commas, but what if the variable "qualis" has many levels?

It is inconvenient to type every level I want to consider.

So I looked for a command that could capture a range of levels.

I thought that : would work, but it did not:

length(which(producao$qualis %in% c("A1":"A2")))

Error in "A1":"A2" : NA/NaN argument
In addition: Warning messages:
1: In producao$qualis %in% c("A1":"A2") : NAs introduced by coercion
2: In producao$qualis %in% c("A1":"A2") : NAs introduced by coercion

LONG VECTOR EXAMPLE

exemplo <- factor(c(
  "B1",
  "B4",
  "NP",
  "A3",
  "B4",
  "B1",
  "B1",
  "B4",
  "B2",
  "A3",
  "A1",
  "B2",
  "B1",
  "A3",
  "B1",
  "B4",
  "A3",
  "B1",
  "B2",
  "B3",
  "B1",
  "A2",
  "B3",
  "A3",
  "NP",
  "B1",
  "B1",
  "B2",
  "A3",
  "B1",
  "B1",
  "B3",
  "A3",
  "A4",
  "A4",
  "B3",
  "B2",
  "B1",
  "B1",
  "A3",
  "B1",
  "B1",
  "B1",
  "A3",
  "A3",
  "A3",
  "B2",
  "B1",
  "NP",
  "A1",
  "NP",
  "NP",
  "A2",
  "A2",
  "B1",
  "B1",
  "A1",
  "B1",
  "B3",
  "B1",
  "B2",
  "C",
  "NP",
  "C",
  "B1",
  "A1",
  "A3",
  "A1",
  "A4",
  "C",
  "A1",
  "A2",
  "C",
  "B1",
  "A3",
  "A2",
  "A2",
  "A3",
  "B4",
  "A3",
  "B1",
  "A3",
  "NP",
  "C",
  "C",
  "B1",
  "C",
  "B4",
  "B2",
  "B1",
  "C",
  "B1",
  "B1",
  "B1",
  "C",
  "B2",
  "B1",
  "A3",
  "B3",
  "B1",
  "B4",
  "B1",
  "A4",
  "A4",
  "A4",
  "B3",
  "B1",
  "NP",
  "NP",
  "B2",
  "B2",
  "B2",
  "B1",
  "B2",
  "B2",
  "B1",
  "C",
  "B1",
  "B1",
  "B1",
  "A1",
  "B1",
  "A3",
  "A2",
  "B1",
  "B1",
  "A3",
  "A1",
  "A1",
  "B4",
  "A3",
  "B1",
  "A3",
  "B1",
  "A1",
  "B1",
  "A1",
  "C",
  "A3",
  "A3",
  "B3",
  "B2",
  "B2",
  "B1",
  "B1",
  "C",
  "B2",
  "A3",
  "A3",
  "B1",
  "B1",
  "NP",
  "A3",
  "NP",
  "A3",
  "A4",
  "B1",
  "A3",
  "A3",
  "B3",
  "A3",
  "B1",
  "NP",
  "B1",
  "A3",
  "A3",
  "B1",
  "B1",
  "B2",
  "B1",
  "NP",
  "B1",
  "A3",
  "A3",
  "A3",
  "B1",
  "A2",
  "C",
  "C",
  "B1",
  "B1",
  "A3",
  "B1",
  "B1",
  "NP",
  "A1",
  "A1",
  "A1",
  "A4",
  "A1",
  "B4",
  "NP",
  "B2",
  "C",
  "C",
  "C",
  "B1",
  "C",
  "A4",
  "B1",
  "C",
  "C",
  "A4",
  "B2",
  "B1",
  "C",
  "A3",
  "B1",
  "C",
  "A3",
  "B1",
  "B1",
  "A4",
  "B1",
  "B2",
  "A3",
  "B2",
  "B1",
  "B2",
  "B2",
  "NP",
  "NP",
  "A3",
  "A2",
  "NP",
  "A2",
  "NP",
  "A3",
  "A3",
  "B2",
  "A3",
  "A3",
  "C",
  "C",
  "A2",
  "A3",
  "A3",
  "C",
  "A4",
  "B1",
  "A3",
  "A3",
  "B2",
  "A2",
  "A2",
  "A2",
  "B2",
  "NP",
  "C",
  "A1",
  "B2",
  "B3",
  "B3",
  "B1",
  "B2",
  "A3",
  "A3",
  "B3",
  "B2",
  "A2",
  "A1",
  "B1",
  "A2",
  "B1",
  "A2",
  "B1",
  "B1",
  "B2",
  "B1",
  "A1",
  "B3",
  "A4",
  "B1",
  "A1",
  "A2",
  "B3",
  "B1",
  "B1",
  "B1",
  "B1",
  "A2",
  "NP",
  "A3",
  "B1",
  "B1",
  "A3",
  "B1",
  "A2",
  "B1",
  "NP",
  "B1",
  "B1",
  "B1",
  "B1",
  "A3",
  "B1",
  "B1",
  "B1",
  "B2",
  "A2",
  "A3",
  "A2",
  "A2",
  "A3",
  "B1",
  "A4",
  "A2",
  "B1",
  "A3",
  "A4",
  "A3",
  "B1",
  "B1",
  "NP",
  "B3",
  "A3",
  "NP",
  "NP",
  "B1",
  "B1",
  "B1",
  "B1",
  "B1",
  "B1",
  "B1",
  "A3",
  "B2",
  "B1",
  "A3",
  "A2",
  "A3",
  "B1",
  "B3",
  "B2",
  "B1",
  "NP",
  "B1",
  "A3",
  "A2",
  "NP",
  "A4",
  "A3",
  "B3",
  "A3",
  "A3",
  "B1",
  "A3",
  "B2",
  "B1",
  "A3",
  "A3",
  "A2",
  "B1",
  "A3",
  "B2",
  "A4",
  "A3",
  "B1",
  "NP",
  "B1",
  "A3"),
  levels = c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4", "C", "NP"))

head(exemplo)

[1] B1 B4 NP A3 B4 B1
Levels: A1 A2 A3 A4 B1 B2 B3 B4 C NP
  • Is the letter always the same, with only the numbers varying? For example, paste0("A", 2:9) gives a vector from "A2" to "A9".

  • Hello Rui Barradas. There are 10 levels: "A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4", "C", "NP". exemplo is already a factor.

2 answers

4


"A1":"A2" will not work because the : operator builds numeric sequences, and character strings such as "A1" cannot be converted to numbers. As suggested by @Rui-Barradas in the comments, you can build the strings with paste0():

length(which(exemplo %in% paste0("A", 1:4)))
#> [1] 134
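
To see why the colon fails, here is a minimal sketch (the names endpoint and labels_a are illustrative only): the : operator tries to convert its operands to numbers, and "A1" has no numeric value.

```r
# ":" needs numeric endpoints; "A1" cannot be converted, so the result is NA
endpoint <- suppressWarnings(as.numeric("A1"))
is.na(endpoint)
#> [1] TRUE

# Building the labels from a numeric sequence works instead
labels_a <- paste0("A", 1:4)
labels_a
#> [1] "A1" "A2" "A3" "A4"
```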

Another option is to use a regular expression indicating the desired letter and digit range:

length(grep("A[1-4]", exemplo))
#> [1] 134

Or simply "A" if you want all Qualis A, regardless of the number. If you want to select levels with more than one letter (e.g., all Qualis levels that have an impact factor):

length(grep("A|B[1-2]", exemplo))
#> [1] 281
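
One caveat, sketched on a small hypothetical vector: in "A|B[1-2]" the alternation splits the whole pattern, so it matches any value that contains "A" anywhere, or "B1" or "B2". With the ten levels in the question this is harmless, but grouping and anchoring the pattern makes the intent explicit. The value "NA1" below is invented purely to show an unintended match.

```r
# Hypothetical values; "NA1" is not a real Qualis level
x <- c("A1", "A4", "B1", "B3", "C", "NA1")

# Unanchored: "NA1" also matches because it contains "A"
length(grep("A|B[1-2]", x))
#> [1] 4

# Anchored and grouped: only A1 to A4, B1 and B2 match
length(grep("^(A[1-4]|B[1-2])$", x))
#> [1] 3
```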

2

Maybe the following function can solve the problem.
It accepts two strings in the format used in the question and returns a character vector with the sequence of labels between them.

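rangeAlphanum <- function(x, y){
  # Letter prefix, taken from x only (x and y are assumed to share it)
  xchar <- unlist(strsplit(x, "[[:digit:]]+"))
  # Numeric parts of x and y; splitting on non-digits leaves empty strings
  xnum <- unlist(strsplit(x, "[^[:digit:]]"))
  ynum <- unlist(strsplit(y, "[^[:digit:]]"))
  # Drop the empty strings and convert to integers
  xnum <- as.integer(xnum[xnum != ""])
  ynum <- as.integer(ynum[ynum != ""])
  # Paste the prefix onto the integer sequence
  paste0(xchar, xnum:ynum)
}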

rangeAlphanum("A1", "A4")
#[1] "A1" "A2" "A3" "A4"

Applying the function to the vector exemplo, first with the sum of a logical vector, since FALSE/TRUE is coded as 0/1, and then with length(which(.)), as requested:

sum(exemplo %in% rangeAlphanum("A1", "A4"))
#[1] 134

length(which(exemplo %in% rangeAlphanum("A1", "A4")))
#[1] 134
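
rangeAlphanum() takes the letter prefix from its first argument only, so it does not cover a range that crosses letters, such as "A3" to "B2". Since exemplo is already a factor with its levels in the desired order, one possible alternative is to slice the levels by position. The sketch below assumes that level order is meaningful; rangeLevels is a hypothetical helper, and the small factor reuses the qualis values from producao.

```r
# Small factor built from the question's producao$qualis, with the full level order
qualis <- factor(c("A1", "B2", "A1", "B2", "NP", "C", "A2", "A1", "B2"),
                 levels = c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4", "C", "NP"))

# rangeLevels is a hypothetical helper: all levels of f between from and to, inclusive
rangeLevels <- function(f, from, to) {
  lv <- levels(f)
  pos <- match(c(from, to), lv)
  stopifnot(!anyNA(pos))          # both endpoints must be existing levels
  lv[pos[1]:pos[2]]
}

rangeLevels(qualis, "A3", "B2")
#> [1] "A3" "A4" "B1" "B2"

length(which(qualis %in% rangeLevels(qualis, "A1", "B2")))
#> [1] 7
```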
