Format latitude and longitude coordinates as numeric in R

1

I received a data frame with the coordinates of the schools in the city of São Paulo. The values came with points as thousands separators, e.g. -23.456.789, so R reads the column as character (chr).

There were also cases where the coordinate had one digit fewer and the points were (perhaps automatically) placed in the wrong position, e.g. -2.345.678.

However, for st_as_sf to read the values and turn them into geospatial coordinates, they must be numeric and contain a single decimal point (in my case, always after the first two digits, e.g. -23.456789).

How can I format the coordinates so that the decimal point comes after the first two digits?

2 answers

3


I believe this function solves the problem. The question says there are always two digits before the decimal point, even where the data contain errors. So the function

  1. Removes all points;
  2. Replaces the first two digits, matched by (\\d\\d), with the same digits, \\1 (the first group captured by the regex), followed by a point.

Finally, it converts the result to "numeric".

formatCoord <- function(x){
  y <- gsub("\\.", "", x)
  y <- sub("(\\d\\d)", "\\1.", y)
  as.numeric(y)
}

x <- c("-23.456.789", "-2.345.678", "12.345.678")

formatCoord(x)
#[1] -23.45679 -23.45678  12.34568
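As a rough trace of the two steps above, the sketch below runs them one at a time on the same input. The variable names (no_points, with_point, coords) are only illustrative. It also suggests that the shorter values in the printed output come from R's default display of 7 significant digits, not from lost precision.

```r
x <- c("-23.456.789", "-2.345.678", "12.345.678")

# Step 1: drop every point.
no_points <- gsub("\\.", "", x)
no_points
# "-23456789" "-2345678"  "12345678"

# Step 2: put a single point after the first two digits (the sign is skipped
# because \\d only matches digits).
with_point <- sub("(\\d\\d)", "\\1.", no_points)
with_point
# "-23.456789" "-23.45678"  "12.345678"

# Step 3: convert to numeric; all six decimals are still stored.
coords <- as.numeric(with_point)
sprintf("%.6f", coords)
# "-23.456789" "-23.456780" "12.345678"
```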
  • I don’t understand your line y <- gsub("\\.", "", sub("(^-*\\d+)\\.", "\\1,", x)). I tested it like this: y <- gsub("\\.", "", sub("\\.", ",", x)) and it gave the same result.

  • 1

    See the answer now; I think it is okay.

1

First, remove the points from the numbers, leaving e.g. -23456789, in each column (latitude and longitude):

dados$latitude <- as.numeric(gsub("\\.", "", dados$latitude)) # "\\." is needed so that the . is matched as a literal character.
dados$longitude <- as.numeric(gsub("\\.", "", dados$longitude))

Then we use str_c (from the stringr package) to concatenate the pieces that str_sub extracts at the positions given by start and end, placing the point in the desired position:

library(stringr) # provides str_c() and str_sub()

dados$latitude <- str_c(
  str_sub(dados$latitude, start = 1, end = 3),
  ".",
  str_sub(dados$latitude, start = 4)
)

dados$longitude <- str_c(
  str_sub(dados$longitude, start = 1, end = 3),
  ".",
  str_sub(dados$longitude, start = 4) # no end position is given, in case some coordinate does not have the same number of characters as the others
)

Without this concatenation, assigning "." directly with str_sub would replace one of the coordinate's digits with the point, which would put the location in the wrong place on the map.
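The difference between overwriting and inserting can be seen in a small sketch. It uses base R's substr(), substring() and paste0() as stand-ins for str_sub() and str_c(), so it runs without stringr. That substitution and the sample value are assumptions for illustration, not part of the answer.

```r
x <- "-23456789"

# Assigning into position 4 overwrites the character there: the digit 4 is lost.
overwritten <- x
substr(overwritten, 4, 4) <- "."
overwritten
# "-23.56789"

# Splitting at position 3 and concatenating keeps every digit.
inserted <- paste0(substr(x, 1, 3), ".", substring(x, 4))
inserted
# "-23.456789"
```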

  • 1

    Fails for "12.345.678". There’s no guarantee that the coordinates will always start with a negative sign.

  • No, but that was my case. These are the coordinates of the schools in the city of São Paulo, where they are all negative. The data frame came with numbers full of points. If there are no negatives, you have to change end to 2.

  • Your solution is much better. But I don’t think I explained it properly. I still needed to fix the coordinates that had one digit fewer, so their punctuation was wrong too. Do these cases have to be included in this function?
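One possible way to cover the points raised in these comments with the split-and-concatenate approach is to derive the cut position from the sign. This is a minimal base R sketch, assuming there are always exactly two digits before the decimal point. The helper name insert_point is hypothetical and does not come from either answer.

```r
# Hypothetical helper: put the point after the first two digits, shifting the
# cut by one character when the value starts with a minus sign.
insert_point <- function(x) {
  digits <- gsub("\\.", "", x)                  # drop the misplaced points
  cut <- ifelse(startsWith(digits, "-"), 3, 2)  # "-23" is 3 chars, "12" is 2
  as.numeric(paste0(substr(digits, 1, cut), ".", substring(digits, cut + 1)))
}

coords <- insert_point(c("-23.456.789", "-2.345.678", "12.345.678"))
sprintf("%.6f", coords)
# "-23.456789" "-23.456780" "12.345678"
```

Because the points are removed before the cut is made, a value with one digit fewer, such as "-2.345.678", ends up with its point after the first two digits as well.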
