Error while converting numbers. How to convert factors to numbers?

Viewed 10,996 times

13

In the following example:

dados <- data.frame(x=c("11", "10", "20", "15"), y=c("25", "30", "35", "40"))
dados
   x  y
1 11 25
2 10 30
3 20 35
4 15 40

When I try to convert the variable x to numeric, I get the following instead of 11, 10, 20, 15:

as.numeric(dados$x)
[1] 2 1 4 3

How can I convert x to numbers?

2 answers

10

In R, the default behavior of data.frame (before R 4.0.0) is to convert text into factors. This can produce unexpected results when numbers are misread as text during data import or manipulation and are then converted to factors.

In general, when working with data.frames, it is advisable to set the option stringsAsFactors = FALSE so that text columns are not turned into factors unintentionally.

However, once the variable has already been converted to a factor by mistake, one possible solution is to convert it to character first and then to numeric:
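As a minimal sketch of this option, the data frame from the question can be rebuilt with stringsAsFactors = FALSE. The names dados2 and x_num are only illustrative and are not part of the original question. The columns then stay as character, so as.numeric() returns the values themselves:

```r
# Rebuild the question's data frame, keeping text columns as character.
# dados2 and x_num are illustrative names, not from the original post.
dados2 <- data.frame(x = c("11", "10", "20", "15"),
                     y = c("25", "30", "35", "40"),
                     stringsAsFactors = FALSE)

class(dados2$x)                # "character", not "factor"
x_num <- as.numeric(dados2$x)  # converts the text directly
x_num                          # 11 10 20 15
```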

as.numeric(as.character(dados$x))
[1] 11 10 20 15

8


If you analyze the structure of the object you will see where the problem occurs:

str(unclass(dados$x))
atomic [1:4] 2 1 4 3
- attr(*, "levels")= chr [1:4] "10" "11" "15" "20"

The object dados$x is composed of the integer vector [2,1,4,3] plus the attribute levels. This attribute is what appears on the console when dados$x is printed.
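The relationship between the integer codes and the levels can be inspected directly. The sketch below builds the factor explicitly with factor(), so it should behave the same way regardless of the R version; the names x, codes and lv are assumptions made for illustration:

```r
# Build the factor explicitly so the result does not depend on
# the data.frame default for stringsAsFactors.
x <- factor(c("11", "10", "20", "15"))

codes <- as.integer(x)  # internal integer codes: 2 1 4 3
lv <- levels(x)         # labels: "10" "11" "15" "20"

# Indexing the labels by the codes recovers the original text values
lv[codes]               # "11" "10" "20" "15"
```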

To solve the problem, in addition to the solution already mentioned, you can adopt the following solution:

as.numeric(levels(dados$x))[dados$x]

In the first part of the solution, the levels of dados$x are extracted and converted to numbers. R stores these levels in sorted order (sorted as text). Then [dados$x] indexes them by the factor's integer codes, which puts the values back in the original order.

This solution is slightly more efficient than as.numeric(as.character(dados$x)), but it may be harder to remember.
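To check that the two approaches agree, here is a small sketch with a hypothetical factor f whose values are not from the question. It was chosen because its levels sort as text rather than numerically, which is the case where the indexing step matters most:

```r
# Illustrative factor (not from the question); its levels are sorted
# as text, so their order differs from the numeric order.
f <- factor(c("9", "10", "100"))

via_character <- as.numeric(as.character(f))
via_levels    <- as.numeric(levels(f))[f]  # index converted levels by the codes

identical(via_character, via_levels)       # TRUE
via_levels                                 # 9 10 100
```

Converting only the levels means one conversion per distinct value instead of one per element, which is a plausible reason for the small speed difference on long vectors with few distinct values.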
