How do you convert a number with a decimal comma to a number in R?


Viewed 4,592 times

4

I have a csv file saved from Excel (using ";" as the column separator). When I import it into R, the numbers that should be in the 0.00 format come in as factors, with a comma as the decimal mark.

Ex: "123,45"

After replacing the comma with a point, the value is still text.

num <- gsub("," , "." , "123,45")

num = "123.45"

When I convert the values one at a time, they become numbers.

num <- as.numeric(num)

num = 123.45

But when I do it on a whole vector, the numbers come out rounded.

numeros <- gsub(",",".",numeros)
numeros <- as.numeric(numeros)

numeros = 123 457 ...

Even using a loop the same thing happens.

for (i in 1:length(numeros)) {
    numeros[i] <- as.numeric(numeros[i])
}

numeros = 123 457 ...

I would like to know how to convert numbers written with a decimal comma into numbers in R's standard format.

2 answers

6

When reading the csv file data, use the argument dec to specify the decimal separator:

read.csv('dados.csv', dec = ",")
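
Since the file in the question uses ";" between columns, the field separator probably has to be given as well, because read.csv defaults to sep = ",". A minimal sketch, assuming a small made-up file written to a temporary path (the column names and values are hypothetical, not the asker's data):

```r
# Hypothetical data in the Excel-style layout from the question:
# ";" between columns and "," as the decimal mark.
linhas <- c("id;valor", "1;123,45", "2;456,78")
arquivo <- tempfile(fileext = ".csv")
writeLines(linhas, arquivo)

# Both sep and dec are set explicitly, overriding read.csv's defaults.
dados <- read.csv(arquivo, sep = ";", dec = ",")
str(dados)  # inspect the column types after import
```

With both arguments set, the valor column should be read directly as numeric, so no gsub step is needed afterwards.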

6

I believe that what you want should be solved with sub and not with gsub.

x <- c("123,45", "456,78", "0,001")
y <- sub(",", ".", x)
y
[1] "123.45" "456.78" "0.001"

as.numeric(y)
[1] 123.450 456.780   0.001

Note that since 0.001 has three decimal places, the print method for numeric vectors is smart enough to show the other elements of the vector with three decimal places as well.
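
The extra zeros are only a matter of display; the stored values are not changed. A small sketch, assuming the same three example values, that checks this and uses format with nsmall as one possible way to control how many decimals are shown:

```r
x <- c("123,45", "456,78", "0,001")
valores <- as.numeric(sub(",", ".", x, fixed = TRUE))

# The trailing zeros in the printed vector are padding, not extra stored digits.
isTRUE(all.equal(valores[1], 123.45))

# format() returns character strings, so it is meant for display only;
# keep the numeric vector for calculations.
format(valores, nsmall = 2)
```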

Regarding @Willian Vieira’s suggestion to use the argument dec = "," when reading the file: this is of course desirable, but R has the function read.csv2 precisely for reading .csv files that come from countries where the decimal separator is a comma.
On the help page help("read.table") (or read.csv, it is the same page) you can read the following. My emphasis.

read.csv and read.csv2 are identical to read.table except for the defaults. They are intended for reading 'comma separated value' files ('.csv') or (read.csv2) the variant used in countries that use a comma as decimal point and a semicolon as field separator.

This seems to be the case described in the question.
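
A minimal sketch of read.csv2 on data in that layout, assuming made-up inline content passed through the text argument instead of a real file (column names and values are hypothetical):

```r
# Hypothetical content with ";" as field separator and "," as decimal mark.
texto <- "id;valor\n1;123,45\n2;0,001"

# read.csv2 defaults to sep = ";" and dec = ",", so no extra arguments are needed.
dados2 <- read.csv2(text = texto)
is.numeric(dados2$valor)
```

For a file on disk, the call would be read.csv2 with the file path in place of the text argument.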
