First, see which columns are of class "factor". These will be the columns to transform.
str(df)
#'data.frame': 6 obs. of 5 variables:
# $ fixed.acidity : Factor w/ 3 levels "7.4","7.8","11.2": 1 2 2 3 1 1
# $ volatile.acidity: Factor w/ 5 levels "0.28","0.66",..: 3 5 4 1 3 2
# $ citric.acid : Factor w/ 3 levels "0","0.04","0.56": 1 1 2 3 1 1
# $ residual.sugar : Factor w/ 4 levels "1.8","1.9","2.3",..: 2 4 3 2 2 1
# $ chlorides : Factor w/ 4 levels "0.075","0.076",..: 2 4 3 1 2 1
This can be obtained programmatically with
ind_cols <- sapply(df, is.factor)
ind_cols
# fixed.acidity volatile.acidity citric.acid residual.sugar chlorides
# TRUE TRUE TRUE TRUE TRUE
This logical index can be used directly in what follows. However, since the question defines the columns of interest by a numeric vector, 1:12, I will also use a numeric vector, with fewer columns to match the sample data.
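As a minimal sketch of the first option, the logical vector returned by sapply can index the data frame directly, so that only the factor columns are converted. The data frame `demo` and its columns are made up for illustration and are not part of the question's data.

```r
# Hypothetical data: two factor columns and one character column
demo <- data.frame(
  a = factor(c("1.5", "2.5", "1.5")),
  b = factor(c("10", "20", "30")),
  id = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# Logical index: TRUE for factor columns only
is_fac <- sapply(demo, is.factor)

# Convert just those columns; 'id' is left untouched
demo[is_fac] <- lapply(demo[is_fac], function(x) as.numeric(as.character(x)))

str(demo)
```

The logical form is convenient when the factor columns are mixed with columns of other classes, because nothing has to be counted by hand.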
To transform the columns, the column index must appear both on the left-hand side, in the data.frame that receives the result, and on the right-hand side, in the data.frame to which the anonymous function is applied.
ind_cols <- 1:5
df[ind_cols] <- lapply(df[ind_cols], function(x) as.numeric(as.character(x)))
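The intermediate as.character step matters because calling as.numeric directly on a factor returns its internal integer codes, not the values shown as labels. A small sketch with an illustrative factor `f` (not taken from the question) shows the difference:

```r
# Illustrative factor built from numbers, similar to the sample data
f <- factor(c(7.4, 7.8, 7.8, 11.2))

as.numeric(f)                 # internal integer codes: 1 2 2 3
as.numeric(as.character(f))   # the labels as numbers: 7.4 7.8 7.8 11.2
as.numeric(levels(f))[f]      # alternative mentioned in help("factor")
```

The last form converts each level only once and then indexes by the codes, which can be slightly more efficient on long vectors with few levels.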
Check the result.
str(df)
#'data.frame': 6 obs. of 5 variables:
# $ fixed.acidity : num 7.4 7.8 7.8 11.2 7.4 7.4
# $ volatile.acidity: num 0.7 0.88 0.76 0.28 0.7 0.66
# $ citric.acid : num 0 0 0.04 0.56 0 0
# $ residual.sugar : num 1.9 2.6 2.3 1.9 1.9 1.8
# $ chlorides : num 0.076 0.098 0.092 0.075 0.076 0.075
head(df)
# fixed.acidity volatile.acidity citric.acid residual.sugar chlorides
#1 7.4 0.70 0.00 1.9 0.076
#2 7.8 0.88 0.00 2.6 0.098
#3 7.8 0.76 0.04 2.3 0.092
#4 11.2 0.28 0.56 1.9 0.075
#5 7.4 0.70 0.00 1.9 0.076
#6 7.4 0.66 0.00 1.8 0.075
Note
The problem is solved above, but it comes up so often that it may be worth having a function that transforms an object of class "factor" into an object of class "numeric", or of typeof "double", which amounts to the same thing. For this, one can take advantage of the S3 class system and write a method for as.numeric or as.double.
From the documentation, help("as.numeric"):
as.numeric is a generic function, but S3 methods must be written for as.double. It is identical to as.double.
That is, the method is written for as.double, and both the name as.double and the name as.numeric call that same method.
as.double.factor <- function(x) as.double(as.character(x))
Now the following code gets the desired result.
ind_cols <- 1:5
df[ind_cols] <- lapply(df[ind_cols], as.double) # or as.numeric
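To check that both names really reach the new method, here is a small self-contained sketch. The factor `f` is illustrative, and the method is repeated with a `...` argument added to match the generic's signature, which is my assumption about good practice rather than something the method above requires.

```r
# Method from the text above, repeated so the sketch is self-contained
as.double.factor <- function(x, ...) as.double(as.character(x))

f <- factor(c(0.7, 0.88, 0.76))

identical(as.numeric, as.double)  # TRUE: two names for the same function
as.numeric(f)                     # 0.7 0.88 0.76 instead of the codes
as.double(f)                      # same result
```

Keep in mind that, once defined in the global environment, this method may also change the result of other code in the session that calls as.numeric on a factor and expects the integer codes.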
Data
df <- read.table(text = "
fixed.acidity volatile.acidity citric.acid residual.sugar chlorides
1 7.4 0.7 0 1.9 0.076
2 7.8 0.88 0 2.6 0.098
3 7.8 0.76 0.04 2.3 0.092
4 11.2 0.28 0.56 1.9 0.075
5 7.4 0.7 0 1.9 0.076
6 7.4 0.66 0 1.8 0.075
", header = TRUE)
df[] <- lapply(df, as.factor)