Correctly converting a value shown in scientific notation to text in R

1

I'm a bit lost with a problem in my data frame. The table below comes from a SQL Server database, and one of its columns holds the number of each process. This value is a long numeric sequence that R displays in scientific notation. The problem is that when I convert the column to text, I get the wrong number. How do I make R keep the correct number?

Here is some code to illustrate the problem. When I convert to text, the result ends in 6 instead of 7.

df <- data.frame(Processo = (25351001641201357))
df

> df
    Processo
1 2.5351e+16

df1 <- as.character(df$Processo)
df1

> df1
[1] "25351001641201356"

Adding the result of the suggested solution, which has not worked so far.

> options(scipen = 1e9)
> df <- data.frame(Processo = c(25351001641201357))
> df
           Processo
1 25351001641201356

> as.character(format(df, scientific = FALSE))
[1] "25351001641201356"

I’m using Windows 10 Home Single Language (64-bit) and RStudio version 1.4.1106.

  • 2

    Can’t you read the column as character?

  • The table is provided by another department at my work. I have no way to change how the data arrive.

2 answers

2

You can use format() to specify the display format:

n <- c(25351001641201357706367, 72952982679250725702754)

as.character(n)
#> [1] "2.53510016412014e+22" "7.29529826792507e+22"

as.character(format(n, scientific = FALSE))
#> [1] "25351001641201359126528" "72952982679250725240832"

Or use options() to raise the threshold for scientific notation to a fairly high value, so that it is not used for the rest of the working session:

options(scipen = 1e6)

as.character(n)
#> [1] "25351001641201359126528" "72952982679250725240832"

Update: numbers with many digits

From the help page for integer (?integer; my free translation):

Note that current implementations of R use 32-bit integers for integer vectors. Therefore, the range of representable integers is restricted to about +/-2*10^9.

You can check the exact limit with .Machine$integer.max. Higher values are automatically converted to double:

n <- 25351001641201357

print(n, digits = 22)
#> [1] 25351001641201356

class(n)
#> [1] "numeric"

is.integer(n)
#> [1] FALSE

is.double(n)
#> [1] TRUE
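
To make the integer limit concrete, here is a small base R sketch (the variable name `big` is mine, for illustration only). It shows that going past `.Machine$integer.max` either gives a double or, in pure integer arithmetic, `NA`:

```r
# Largest value an integer vector can hold (32-bit signed)
.Machine$integer.max
#> [1] 2147483647

# Adding a double (1) promotes the result to double, so nothing overflows
big <- .Machine$integer.max + 1
class(big)
#> [1] "numeric"

# Adding an integer (1L) stays in integer arithmetic and overflows to NA
suppressWarnings(.Machine$integer.max + 1L)
#> [1] NA
```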

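Explaining why would take too long for an SO answer, but the number of significant digits R can represent depends on the underlying C library and follows an international standard, IEEE 754 (you can read more on Wikipedia in English). In short, a double cannot represent every whole number above 2^53, so a 17-digit value may be stored as the nearest representable neighbour.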
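
As a rough illustration of that precision limit, the following base R sketch assumes the usual IEEE 754 double with a 53-bit significand, which is what `.Machine$double.digits` reports on common platforms. The names `limit` and `spacing` are mine:

```r
# Number of bits in the significand of a double
.Machine$double.digits
#> [1] 53

# Up to 2^53 every whole number is exact; above it, gaps appear
limit <- 2^53
sprintf("%.0f", limit)
#> [1] "9007199254740992"
limit + 1 == limit
#> [1] TRUE

# Gap between neighbouring doubles near the value in the question
# (2.5351e16 lies between 2^54 and 2^55)
spacing <- 2^(floor(log2(2.5351e16)) - 52)
spacing
#> [1] 4
```

With a gap of 4, a value ending in ...357 cannot be stored, so the nearest double, ending in ...356, is used instead. That matches the output in the question.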

The solution to this is to load the data as character. Because R automatically converts high values to double, even packages that implement higher limits (e.g. gmp) require them to be read first as a string.
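
A minimal sketch of reading the column as character, using base R only. The inline CSV text is a hypothetical stand-in for the real data source, and `csv_text` is a name I made up for the example:

```r
# Hypothetical stand-in for the real data source
csv_text <- "Processo
25351001641201357
25351001641201385
25351123456202188"

# colClasses = "character" stops R from parsing the column as numeric
df <- read.csv(text = csv_text, colClasses = "character")

df$Processo
#> [1] "25351001641201357" "25351001641201385" "25351123456202188"

class(df$Processo)
#> [1] "character"
```

If the values come through a database query instead, casting the column to text in the query itself should have the same effect, though that depends on the driver and I have not tested it.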

  • Carlos, thank you for the answer. Unfortunately the problem has not been solved yet. I tested this code:

    options(scipen = 1e9)
    df <- data.frame(Processo = c(25351001641201357,25351001641201385,25351123456202188))
    df
               Processo
    1 25351001641201356
    2 25351001641201384
    3 25351123456202188

  • @rodgreg, please do not post code blocks in the comments; edit the question if necessary. Does it work using format(..., scientific = FALSE)? If not, edit the question to include this detail and also add your system/session information.

  • 1

    @rodgreg, I’m sorry it took me so long to respond. Your problem is related to the number of significant digits and is a little more complicated than just changing the display. I will update the answer.

-4

options(scipen = 999)

df <- data.frame(Processo = (25351001641201357))
df

  • 1

    When answering questions, try not to just post code; also explain what was done and why.
