I’m working with the following database: Qualis.
I import it with rio::import() and store it in the object "df". I also load dplyr with library(dplyr).
Result:
library(dplyr)
df<-rio::import("EXEMPLO.xlsx")
head(df)
ordem ano qualis.ref
1 1 2017 B1
2 2 2017 B4
3 3 2017 NP
4 4 2017 A3
5 5 2017 B4
6 6 2017 B1
It turns out that the values of the variable "qualis.ref" correspond to weights, according to the following equivalence:
A1=1.0
A2=0.8
A3=0.7
A4=0.6
B1=0.5
B2=0.35
B3=0.2
B4=0.1
C=0
NP=0
What I’m trying to do is get the score, per year, of each "qualis.ref" category.
To do so, I first convert the variable "qualis.ref" into a factor using the function as.factor():
df$qualis.ref<-as.factor(df$qualis.ref)
Then I create a new variable called "peso" (weight), which is a copy of "qualis.ref":
peso<-df$qualis.ref
Then I assign values to its levels according to the equivalence above:
levels(peso)<-c(1, 0.85, 0.7, 0.6, 0.5, 0.35, 0.2, 0.1, 0, 0)
Then I bind everything into a new data.frame called "df2" using the function cbind():
df2<-cbind(df, peso)
ordem ano qualis.ref peso
1 1 2017 B1 0.5
2 2 2017 B4 0.1
3 3 2017 NP 0
4 4 2017 A3 0.7
5 5 2017 B4 0.1
6 6 2017 B1 0.5
Finally, I group with the function group_by() and count the occurrences of each "qualis.ref" with the function count().
That’s where my problem arises. I used the function mutate() to create a new column called "pontuacao" (score), expecting to multiply the count of each "qualis.ref" by its respective weight.
It ended up like this:
df2 %>%
group_by(ano, qualis.ref, peso) %>%
count(qualis.ref) %>%
mutate(pontuacao=peso*n)
# A tibble: 40 x 5
# Groups: ano, qualis.ref, peso [40]
ano qualis.ref peso n pontuacao
<dbl> <fct> <fct> <int> <lgl>
1 2017 A1 1 4 NA
2 2017 A2 0.85 8 NA
3 2017 A3 0.7 26 NA
4 2017 A4 0.6 4 NA
5 2017 B1 0.5 39 NA
6 2017 B2 0.35 10 NA
7 2017 B3 0.2 3 NA
8 2017 B4 0.1 9 NA
9 2017 C 0 14 NA
10 2017 NP 0 10 NA
# ... with 30 more rows
There were 40 warnings (use warnings() to see them)
However, the whole "pontuacao" column comes out as "NA".
This makes me think the problem is that the variable "peso" is of type "factor".
I tested with variables of type "double" from "mtcars" and the multiplication worked. I multiplied the variables "gear" and "carb":
mtcars %>%
mutate(teste=gear*carb) %>%
head()
mpg cyl disp hp drat wt qsec vs am gear carb teste
1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 16
2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 16
3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 4
4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 3
5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 6
6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 3
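For comparison, a minimal sketch of the same kind of multiplication applied to a factor. The toy vector `f` is hypothetical and not taken from the Qualis data; it only assumes a factor whose labels look like numbers, as "peso" does:

```r
# Hypothetical toy factor whose labels look numeric, like "peso"
f <- factor(c("0.5", "0.1", "0.5"))

# Arithmetic is not defined for factors: R returns NA for every element
# and emits a warning (suppressed here so the result is easy to inspect)
res <- suppressWarnings(f * 2)
print(res)

# A factor stores integer codes plus character labels; it is not numeric
print(is.numeric(f))
```

If this behaves the same way on your machine, it would be consistent with the `<lgl>` column full of NA and the warnings shown in the output above.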
No, it is not possible to perform arithmetic on variables of type factor. I recommend this reading. – yoyo
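A minimal sketch of one common workaround, assuming the factor labels of "peso" are the numeric weights themselves: convert the factor to character first and then to numeric before multiplying. The toy data frame below is hypothetical and only mimics the structure of "df2"; it is not the real file.

```r
library(dplyr)

# Hypothetical toy data mimicking the structure of df2 (not the real file)
df2 <- data.frame(
  ano        = c(2017, 2017, 2017, 2018),
  qualis.ref = factor(c("B1", "B4", "B1", "A3")),
  peso       = factor(c("0.5", "0.1", "0.5", "0.7"))
)

res <- df2 %>%
  # as.character() first, so the labels are converted and not the
  # internal integer codes of the factor
  mutate(peso = as.numeric(as.character(peso))) %>%
  # count() groups by the listed columns and adds the column n
  count(ano, qualis.ref, peso) %>%
  mutate(pontuacao = peso * n)

print(res)
```

Calling as.numeric() directly on the factor would return the integer level codes (1, 2, 3, ...) rather than the weights, which is why the intermediate as.character() step is used here.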
How is it possible for a question to have 2 downvotes and an answer to have 5 upvotes? The answer may be useful to others but the question is not? – Rui Barradas