How to work with timezones without using a timestamp?

Asked

Viewed 1,326 times

4

I was reading that answer, of the Stack Overflow in English, and I came across the following excerpt:

For Mysql (or Mariadb), if you don’t need the time information consider using the DATE column type. If you need Higher Accuracy, use DATETIME rather than TIMESTAMP. Beware that DATETIME Columns do not store information about the Timezone, so your application will have to know which Timezone was used.

According to the bold section of the above quote, if I use the type DATETIME, my application will no longer be aware of Timezone user, what would happen if I were using a TIMESTAMP, right?

Questions are:

  • How can I save the date of an event that will occur in the future, also saving the Timezone user’s?
  • How do I make even with timezones different, the point in time is the same?

It’s important to point out that I would even use the type TIMESTAMP if he were not susceptible to bug of the year 2038.

2 answers

4


Actually, none of the Mysql types (DATETIME and TIMESTAMP) store information relating to the time zone. Both have information about the date and time, but according to the documentation, there are some differences:

  • DATETIME: supports dates between 1000-01-01 00:00:00 and 9999-12-31 23:59:59
  • TIMESTAMP: supports dates between 1970-01-01 00:00:01 and 2038-01-19 03:14:07

Another detail is that the TIMESTAMP always works internally with UTC. When saving a date, it converts the current Timezone date/time to UTC, and when querying the data, converts it back (from UTC to the current Timezone). In this case, the "current Timezone" is the one configured on the server, but this can be overwritten if you specify a Timezone in the connection (for example: jdbc:mysql://localhost:3306/dbname?serverTimezone=UTC or jdbc:mysql://localhost:3306/dbname?serverTimezone=America/Sao_Paulo, etc.).

This can cause confusion if the data has been recorded using a Timezone, but when consulting another is used (there are examples of this in that article, and also I have an example in Java on Github showing this behavior). This is not a problem if the Timezone settings are "stable", but it cannot be guaranteed that no one will ever change them (either on purpose or unintentionally). Already one DATETIME does not suffer these conversion problems between timezones and UTC.

Another point to note is that, despite the name, the TIMESTAMP mysql nay is the same as a timestamp (a numerical value - usually in seconds or milliseconds - indicating the amount of time elapsed since the Unix Epoch). Obviously, since the value of this is in UTC, it can be converted to the numeric value of the timestamp, but thanks to these conversion details from/to UTC, it is not always "obvious" which value will be saved.


If you want to record an exact instant something occurred (a point on the timeline, independent of the Timezone), and bypass the limitation of the year 2038, an option is to use a numeric field (such as a BIGINT) and save the timestamp value (the amount of seconds elapsed since 1970-01-01T00:00Z), which you can get using your preferred language (all have some way to get it). If you need to display this information to the user, most languages have mechanisms to convert this value to a date and time in a specific Timezone. And how BIGINT supports very large values, you can even save the amount of milliseconds if you want (the highest value for BIGINT Signed is 263 - 1, and if this value represents a timestamp in milliseconds, it is equivalent to a date in the year 292.278.994 - already if this value is in seconds, is equivalent to a date in the year 292,277,026,596).

That is, in this case the bank would only keep a numerical value, and the responsibility to translate this value to a date and time is in the language that is consulting and manipulating this data. The downside is that because it is no longer a date field, it will not be possible to use Mysql-specific functions (such as adding days to a date, etc.), and all of this manipulation is off the bank (although it is still possible to compare whether a timestamp is larger or smaller than another, to know if an instant is in the past or future, or even put the data in chronological order, since it is a numerical field). More details about this approach in this reply by Soen.

If the problem of the year 2038 is not relevant, you can use TIMESTAMP also, looking at the Timezone settings mentioned above. There are still functions to convert timestamps from/to dates (such as UNIX_TIMESTAMP and UTC_TIMESTAMP), but they are also limited to ranges mysql.

Finally, in this question by Soen has several answers about DATETIME x TIMESTAMP.


UTC and future events

In general the recommendation of "always use UTC" (or the variation "always use timestamps") is valid for most cases. A very recommended "good practice" is to convert to UTC as soon as possible (as soon as you receive the data, for example) and convert to any other Timezone at the last instant (eg to show the date and time to the user, using your time zone).

In fact, for many cases this solves. But "always" using "good practice" will not work 100% of the time. An example are dates in future events.

Let’s assume we’re in 2016, and we have a system where users can register future events. Then a user registers an event that will take place in 31 October 2018, at 10 am, in the Official Time of Brasilia.

In 2016, the Brazilian summer time rule stated that it would begin on the third Sunday in October. That is, on October 30, 2018, it would already be daylight saving time, and the offset used (the difference with respect to UTC) is two hours behind the UTC (-02:00).

Therefore, the date/time and offset of the future event would be 2018-10-31T10:00-02:00, which in UTC corresponds to 2018-10-31T12:00Z (the "Z" at the end means that the date/time is in UTC, in accordance with the format defined by ISO 8601). Then you follow the "good practice" and record the value in UTC in the bank. When any user wants to know the date and time of the event, you query the value (which is in UTC: 2018-10-31T12:00Z) and converts to the user’s Timezone (if it is the Brasilia Time, the result will be 2018-10-31T10:00-02:00). So far, so good.

But there’s one detail, which we don’t always take into account: daylight saving time is set by the government, and it can change its mind at any time (just see historical). And in this case, it really changed: the Brazilian Summer Time rule was changed by a decree published in December 2017. According to this decree, from 2018 the beginning of daylight saving time would be on the first Sunday of November.

Therefore, by the new rules, 31 October 2018 is no longer in daylight saving time, which means that on this day the Schedule of Brasilia is still 3 hours behind UTC (ie, the offset is -03:00). And converting the value into UTC that had been recorded in the bank (2018-10-31T12:00Z) to the offset -03:00, the result is 2018-10-31T09:00-03:00 (9 am, one hour before the user registered). Remember that UTC is a standard that defines a time from which time zones are based, and does not suffer the effects of daylight saving time. It is an absolute value, and if any Timezone changes its rules (such as the offset used at each time of the year), the difference between local time and UTC will also change.

In that case, the solution would be to keep a DATETIME with the local date and time (2018-10-31T10:00) and record Timezone separately. If a user just wants to know when the event will be, show the date and time, and tell which Timezone that day and time refer to, if this information is relevant to users. If you want to convert to UTC, most languages have some date API that does this conversion, using the Timezone rules in question.

For Timezone, most languages and Apis have support for IANA Time Zone Database, defining identifiers as America/Sao_Paulo, Europe/London and Asia/Tokyo. Each of these identifiers has history of changes to the local time of a particular region (America/Sao_Paulo, for example, corresponds to the Time of Brasilia).

Each time a government decides to change the time zone rules for a region, IANA updates its bank and releases a new version. All languages and systems that use IANA’s TZBD have some mechanism to update their data, as IANA makes these updates available. In the above example, the change of the Brazilian daylight saving time rule was released in version 2018c (in January 2018).

That is, if you have saved the Timezone separate local time and date, simply update the TZBD so that the conversion to UTC starts giving the correct value. If you had already saved the date and time in UTC, you would have to analyze case by case and change the dates manually (and not to change all the dates between October and November, only those that were registered before you update TZDB).

In this article by Jon Skeet has a much more complete example about this problem.


It is also worth remembering that if you are really going to save the timezones, keep in mind that they change all the time. In Brazil, for example, we had the aforementioned change in the beginning of summer time in 2018, and in 2019 it changed again (in that year we won’t have daylight savings time - including the IANA has already released a version containing this change). And nothing guarantees that it will continue like this forever, because in the future the government can change the rules again.

And of course this is not restricted to Brazil. The EU is on track to eliminate daylight saving time, and in various parts of the world there’s always someone arguing whether or not to change these rules. So this is another point of attention, to keep your system always updated with the new versions of IANA.

2

There are errors there. One of them is that all kinds of Mysql schedules do not save time zones, especially TIMESTAMP It doesn’t do this because it doesn’t even make sense to have this in this guy, so much so that he converts times that are in a UTC zone. And it’s usually right in most situations.

In general times should be recorded normalized, so using UTC always. In some rare cases it should store in a specific zone and other rare to store which zone, and just because this information has some relevance to the system (almost always does not, but programmers think it has).

If you really need to save the time zone create a column to store this data. I issue a numerical or textual format that gives you this information in a manner appropriate to your need. Then the time would be recorded without converting to UTC. And treat the time as a piece of data composed of the two columns, converting if necessary by observing the time zone field.

A common mistake that programmers make is to mix the pure data and the representation of the data, the spindle is usually only interesting for the representation in specific scenarios, the data should be stored free of this and normalized to the simplest way that in the case of time it is using the UTC zone (converted to time zone 00:00).

So for all this TIMESTAMP is wrong even, it should only be used for internal computer schedules which was the motivation that it was created. Some people create their own kind of timestamp to get around the problem of the year 2038, which is already another issue.

  • Thank you for the reply! I confess that it was very enlightening. Regarding mistakes, sorry for them. Do you think I edit the question to try to remove incorrect information or can I leave as is? ps: I did not check how you accept to see if someone else answers as well. :)

  • Not because many people would make the same mistake and can help them, for Iss what there are questions.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.