I get NA when I convert character to date-time (POSIXlt)

Why do I get NA when I do this conversion from character to POSIXlt?

    library(bReeze)
    data(winddata)

    tempo <- winddata[,1]
    tempo[1:6] # Preview 
    # [1] "06.05.2009 11:20" "06.05.2009 11:30" "06.05.2009 11:40"

    tempo_POSIX <- strptime(tempo, format = "%d.%m.%Y %H:%M")
    sum(is.na(tempo_POSIX))
    # [1] 6

    valores_NA <- which(is.na(tempo_POSIX))
    tempo[valores_NA]
    # [1] "18.10.2009 00:00" "18.10.2009 00:10" "18.10.2009 00:20"
    # [4] "18.10.2009 00:30" "18.10.2009 00:40" "18.10.2009 00:50"

As you can see, the values that were converted to NA look normal: they follow the same format as the others.

Interestingly, the problem does NOT occur if I pass a value to the tz argument:

    tempo_POSIX <- strptime(tempo, format = "%d.%m.%Y %H:%M", tz = "GMT")
    sum(is.na(tempo_POSIX))
    # [1] 0

My system information is:

    > sessionInfo()
    R version 3.0.2 (2013-09-25)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252   
    [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C                      
    [5] LC_TIME=Portuguese_Brazil.1252    

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     

    other attached packages:
    [1] bReeze_0.4-0

    loaded via a namespace (and not attached):
    [1] tools_3.0.2
  • 1

    Here it works smoothly. tempo_POSIX <- strptime(tempo, format = "%d.%m.%Y %H:%M"); sum(is.na(tempo_POSIX)) [1] 0 # [1] 6

  • @Djongs, I don’t understand your output. Is the number of missing values you get 0 or 6? The problem still happens for me, so I edited the question and added my system information.

  • Rogério, this result varies with the system's locale and time zone settings, because of daylight saving time. Djongs's system must be set to a different one. @Djongs, to reproduce the example, run Sys.setenv(TZ='America/Sao_Paulo') first.

1 answer

3


The help page for as.POSIXlt contains the following passage, which points out that converting between date-time classes requires a time zone, that the time is validated against it, and that this can cause problems around Daylight Saving Time (DST) transitions:

> Character input is first converted to class "POSIXlt" by strptime: numeric input is first converted to "POSIXct". Any conversion that needs to go between the two date-time classes requires a time zone: conversion from "POSIXlt" to "POSIXct" will validate times in the selected time zone. One issue is what happens at transitions to and from DST

When you call strptime(tempo, format = "%d.%m.%Y %H:%M"), you are converting the object to the POSIXlt class.

    class(tempo_POSIX)
    # [1] "POSIXlt" "POSIXt"

But when you call is.na(), you are converting to POSIXct. Note that the method is.na.POSIXlt uses the function as.POSIXct:

    is.na.POSIXlt
    # function (x) 
    # is.na(as.POSIXct(x))
    # <bytecode: 0x26519a14>
    # <environment: namespace:base>
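The same point can be sketched without the bReeze data. This is a minimal illustration, assuming the single timestamp below (taken from the NA values in the question) and the time zone America/Sao_Paulo, which is a guess at the asker's setting. The POSIXlt object keeps the fields exactly as parsed; the time zone is only validated in the conversion to POSIXct. Whether the last line returns TRUE depends on the platform and R version.

```r
# Illustrative timestamp; the time zone is an assumption, not taken from the question
x <- strptime("18.10.2009 00:00", format = "%d.%m.%Y %H:%M",
              tz = "America/Sao_Paulo")

class(x)      # "POSIXlt" "POSIXt"

# The broken-down fields are stored as parsed, even for a skipped wall-clock time
x$hour
x$mday
x$mon + 1     # POSIXlt stores months 0-based, so add 1

# The validation happens only here, in the conversion that is.na() relies on
is.na(as.POSIXct(x))  # platform-dependent for non-existent times
```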

Daylight saving time in Brazil in 2009 started on 18 October at 00:00. That is, with daylight saving time in effect, there is no 00:00 in Brazil on October 18, 2009: when the clock passed 23:59 the day before, it jumped straight to 1:00 in the morning.

So when you call is.na(), the date is converted to POSIXct, and this conversion validates it against your time zone (probably Brazil/São Paulo: since you did not specify a time zone, the system's time zone is used). Since 00:00 on October 18th does not exist in this time zone, the result is (correctly, but unexpectedly) NA. When you specify GMT, or any other time zone in which that time exists (like London), the conversion works normally. That is why it worked with tz = "GMT", and why it worked for Djongs, whose system must be set to a different time zone.
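The skipped hour can be checked directly. The sketch below assumes the zone name America/Sao_Paulo and that R has access to the tz database with its historical rules; the variable names are only illustrative. It measures the real time elapsed between the last wall-clock hour before the transition and the first one after it, then parses the problematic timestamp in GMT, where it is valid.

```r
tz_br <- "America/Sao_Paulo"  # assumed zone; the asker's actual TZ is not shown

# Wall-clock times on either side of the skipped midnight hour
before <- as.POSIXct("2009-10-17 23:00", tz = tz_br)
after  <- as.POSIXct("2009-10-18 01:00", tz = tz_br)

# The wall clocks are two hours apart, but less real time elapsed,
# because 00:00 to 00:59 on 18 October 2009 did not exist in this zone
elapsed <- as.numeric(difftime(after, before, units = "hours"))
elapsed

# GMT has no DST, so the same wall-clock time is valid there
gmt <- strptime("18.10.2009 00:00", format = "%d.%m.%Y %H:%M", tz = "GMT")
is.na(gmt)    # FALSE
```

If the timestamps are known to be recorded without DST adjustments, parsing them with an explicit DST-free zone such as tz = "GMT" avoids the problem, as the question already observed.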

  • 1

    Sensational, @Carlos!
