How do I get back the original "double" values of the df?

1

When I convert df from "factor" to "numeric", the values become integers. How do I get back the original "double" values of the df?

df <- read.csv2(file.choose())

View(df)
str(df)  

df1 <- as.numeric(df[1:12])  
df1 <- as.numeric(unlist(df))
df1 <- lapply(df, as.numeric)    

I also tried it this way, without success:

df3 <- as.double(df[1:12]) 
df3 <- as.double(unlist(df))
df3 <- lapply(df,as.double)

str(df1)

df2 <- as.data.frame(df1)
View(df2)  

In short:

head(df)
  fixed.acidity volatile.acidity citric.acid residual.sugar chlorides
1           7.4              0.7           0            1.9     0.076
2           7.8             0.88           0            2.6     0.098
3           7.8             0.76        0.04            2.3     0.092
4          11.2             0.28        0.56            1.9     0.075
5           7.4              0.7           0            1.9     0.076
6           7.4             0.66           0            1.8     0.075


head(df2)
  fixed.acidity volatile.acidity citric.acid residual.sugar chlorides
1            71               77           1             11        40
2            75              113           1             31        62
3            75               89           5             26        56
4            13               13          57             11        39
5            71               77           1             11        40
6            71               69           1             10        39

2 answers

1

df <- data.frame(valores = as.factor(c(1.1, 2.0, 3.3, 1.1, 1.0, 2.0)))

df$valores_num <- as.double(as.character(df$valores))
df$valores_num

You can try converting to character first and then to double.

Output:

[1] 1.1 2.0 3.3 1.1 1.0 2.0

Checking the data.frame:

str(df)
 $ valores    : Factor w/ 4 levels "1","1.1","2",..: 2 3 4 2 1 3
 $ valores_num: num  1.1 2 3.3 1.1 1 2
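
As a minimal sketch of why the detour through character is needed: a factor stores integer codes that point into its levels, and as.numeric() applied directly returns those codes rather than the labels. The vector f below is made up for illustration.

```r
# Illustrative factor built from character labels (not the asker's data)
f <- factor(c("7.4", "7.8", "11.2", "7.4"))

# Levels are sorted as character strings, so "11.2" comes first
lv <- levels(f)

# Direct conversion returns the internal codes, i.e. positions in levels(f)
codes <- as.numeric(f)

# Converting the labels instead recovers the original values
values <- as.numeric(as.character(f))

print(lv)
print(codes)
print(values)
```

This is the same effect seen in df2 in the question: the integers there are level positions, not measurements.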

1

First, see which columns are of class "factor". These are the columns to transform.

str(df)
#'data.frame':  6 obs. of  5 variables:
# $ fixed.acidity   : Factor w/ 3 levels "7.4","7.8","11.2": 1 2 2 3 1 1
# $ volatile.acidity: Factor w/ 5 levels "0.28","0.66",..: 3 5 4 1 3 2
# $ citric.acid     : Factor w/ 3 levels "0","0.04","0.56": 1 1 2 3 1 1
# $ residual.sugar  : Factor w/ 4 levels "1.8","1.9","2.3",..: 2 4 3 2 2 1
# $ chlorides       : Factor w/ 4 levels "0.075","0.076",..: 2 4 3 1 2 1

This can be obtained programmatically with

ind_cols <- sapply(df, is.factor)
ind_cols
#   fixed.acidity volatile.acidity      citric.acid   residual.sugar        chlorides 
#            TRUE             TRUE             TRUE             TRUE             TRUE 

This logical index can be used directly in what follows. However, since the question selects the columns with a numeric vector, 1:12, I will also use a predefined numeric vector, with fewer columns: those of the sample data.

To transform the columns, the column index must appear both in the data.frame that receives the result and in the data.frame to which the anonymous function is applied.

ind_cols <- 1:5
df[ind_cols] <- lapply(df[ind_cols], function(x) as.numeric(as.character(x)))

Check the result.

str(df)
#'data.frame':  6 obs. of  5 variables:
# $ fixed.acidity   : num  7.4 7.8 7.8 11.2 7.4 7.4
# $ volatile.acidity: num  0.7 0.88 0.76 0.28 0.7 0.66
# $ citric.acid     : num  0 0 0.04 0.56 0 0
# $ residual.sugar  : num  1.9 2.6 2.3 1.9 1.9 1.8
# $ chlorides       : num  0.076 0.098 0.092 0.075 0.076 0.075
 
head(df)
#  fixed.acidity volatile.acidity citric.acid residual.sugar chlorides
#1           7.4             0.70        0.00            1.9     0.076
#2           7.8             0.88        0.00            2.6     0.098
#3           7.8             0.76        0.04            2.3     0.092
#4          11.2             0.28        0.56            1.9     0.075
#5           7.4             0.70        0.00            1.9     0.076
#6           7.4             0.66        0.00            1.8     0.075
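
For comparison, here is a sketch of the same transformation driven by the logical index instead of 1:5. The data frame df_demo and its quality column are made up for illustration; the point is that columns that are not factors are left untouched.

```r
# Illustrative data: two factor columns and one integer column (hypothetical)
df_demo <- data.frame(
  fixed.acidity = factor(c("7.4", "7.8", "11.2")),
  citric.acid   = factor(c("0", "0.04", "0.56")),
  quality       = c(5L, 5L, 6L)
)

# Logical index of the factor columns; vapply guarantees a logical vector
is_fct <- vapply(df_demo, is.factor, logical(1))

# Convert only the selected columns, going through character
df_demo[is_fct] <- lapply(df_demo[is_fct], function(x) as.numeric(as.character(x)))

str(df_demo)
```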

Note

The problem is solved above, but it comes up so often that it may be worth having a function that converts an object of class "factor" into an object of class "numeric", that is, of typeof "double", which is the same thing. For this, one can take advantage of the S3 class system and write a method for as.numeric or as.double.

From the documentation, help("as.numeric"):

as.numeric is a generic function, but S3 methods must be written for as.double. It is identical to as.double.

That is, the method is written for as.double, and calls to either as.double or as.numeric dispatch to that same method.

# S3 method for factors; the signature mirrors the generic as.double(x, ...)
as.double.factor <- function(x, ...) as.double(as.character(x), ...)

Now the following code gets the desired result.

ind_cols <- 1:5
df[ind_cols] <- lapply(df[ind_cols], as.double)  # or as.numeric
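
If defining an S3 method for a base generic feels too invasive, a plain helper is an alternative. The sketch below uses the idiom as.numeric(levels(f))[f], which help("factor") recommends as slightly more efficient than going through as.character. The name factor_to_double is hypothetical, chosen only for this example.

```r
# Hypothetical helper: convert a factor with numeric-looking labels to double
factor_to_double <- function(x) {
  stopifnot(is.factor(x))
  # Convert each level once, then expand by the factor's integer codes
  as.numeric(levels(x))[x]
}

# Illustrative input (not the asker's data)
f <- factor(c("0.7", "0.88", "0.76", "0.7"))
res <- factor_to_double(f)
print(res)
```

It can be applied column-wise in the same way, for example with lapply(df[ind_cols], factor_to_double).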

Data

df <- read.table(text = "
  fixed.acidity volatile.acidity citric.acid residual.sugar chlorides
1           7.4              0.7           0            1.9     0.076
2           7.8             0.88           0            2.6     0.098
3           7.8             0.76        0.04            2.3     0.092
4          11.2             0.28        0.56            1.9     0.075
5           7.4              0.7           0            1.9     0.076
6           7.4             0.66           0            1.8     0.075
", header = TRUE)

df[] <- lapply(df, as.factor)
