Should I check dates with DateTime or regex?

7

I have seen two different ways to check whether a date is valid.

In a "modern" way, with DateTime:

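$date = "2014-02-04";
$dt = DateTime::createFromFormat("Y-m-d", $date);
$errors = DateTime::getLastErrors(); // array, or false on PHP 8.2+ when there were no warnings or errors

return $dt !== false && ($errors === false || ($errors['warning_count'] + $errors['error_count']) === 0);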

And using a regular expression:

$date = "2014-02-04";

// very simple regex, just to illustrate the situation.
return preg_match("/^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$/", $date);

I didn't find much information comparing these methods. I know that DateTime is more modern, but is that enough to make it the recommended choice? What are the advantages and disadvantages of each method?

  • I recommend the DateTime class (http://www.php.net/manual/en/class.datetime.php), since it is specialized in dates. Regular expressions should be a last resort, as they are quite expensive both to work with and for the server, and in some cases they may cause slowness.

4 answers

10

Dates are, underneath, numbers. Even if you declare them as strings, it is more efficient to represent them as integers.

The DateTime class is specialized, so it is already optimized for working with dates.

Regular expressions are general-purpose tools for processing strings. They can have a high computational cost: patterns that backtrack heavily can degrade to O(n²) or worse, although a short anchored pattern like the one in the question is cheap.

Add to that the fact that it is much easier for a computer to interpret integers than strings, and you have your answer.

As a rule: whenever there is an alternative to regular expressions, use it. Regular expressions are like a Swiss Army knife: always useful, but if you have real pliers, why use the so-so pliers that come with the Swiss Army knife?

  • +1 Excellent comparison with the penknife :)

5

No need to complicate things. Use PHP's native checkdate function:

http://br1.php.net/manual/en/function.checkdate.php

bool checkdate ( int $month , int $day , int $year )

Example:

if (checkdate(2, 29, 2014)) {
    echo 'Valid date';
} else {
    echo 'Invalid date';
}

This will print "Invalid date", because 2014 is not a leap year.
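Since the question starts from a string rather than three integers, here is a minimal sketch of how the string could be fed to checkdate. The helper name isValidYmd is hypothetical (it is not part of PHP or of this answer), and it assumes the input is always meant to be in year-month-day order separated by dashes.

```php
<?php
// Hypothetical helper: validates a "year-month-day" string with checkdate().
function isValidYmd(string $date): bool
{
    // Expect exactly three parts separated by dashes.
    $parts = explode('-', $date);
    if (count($parts) !== 3) {
        return false;
    }

    // Each part must consist of digits only (ctype_digit('') is false).
    foreach ($parts as $part) {
        if (!ctype_digit($part)) {
            return false;
        }
    }

    [$year, $month, $day] = array_map('intval', $parts);

    // checkdate() takes month, day, year, in that order.
    return checkdate($month, $day, $year);
}

var_dump(isValidYmd('2014-02-04')); // bool(true)
var_dump(isValidYmd('2014-02-29')); // bool(false): 2014 is not a leap year
var_dump(isValidYmd('2016-02-29')); // bool(true)
```

Note that this sketch does not enforce zero padding, so "2014-2-4" is also accepted. If the exact layout matters, it would need an extra length or pattern check.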

4

It is best to use DateTime::createFromFormat() itself. It is designed for this and handles cases that your regular expression does not.
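As an illustration, a strict check could look roughly like the sketch below. The function name isValidDate is hypothetical, and the round-trip comparison is one common approach rather than the only one. It is there because createFromFormat() rolls overflowing values over instead of failing: 2014-02-31 is parsed as 2014-03-03.

```php
<?php
// Hypothetical helper: strict validation built on DateTime::createFromFormat().
function isValidDate(string $date, string $format = 'Y-m-d'): bool
{
    $dt = DateTime::createFromFormat($format, $date);

    // Parsing fails outright for malformed input (returns false).
    // For overflowing values such as 2014-02-31 it succeeds but yields
    // 2014-03-03, so re-formatting and comparing with the input rejects them.
    return $dt !== false && $dt->format($format) === $date;
}

var_dump(isValidDate('2014-02-04')); // bool(true)
var_dump(isValidDate('2014-02-31')); // bool(false)
var_dump(isValidDate('not a date')); // bool(false)
```

The comparison also rejects input that is not zero padded, such as "2014-2-4", because it re-formats as "2014-02-04".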

I can't tell you which is faster (my bet is on DateTime), because it will depend on the implementation and the PHP version.


For PHP versions older than 5.3, the documentation presents this alternative code for DateTime::createFromFormat():

static function createFromFormat($format, $time) {
    assert($format != "");
    if ($time == "") {
        return new DateClass();
    }

    // one named group per supported format character
    // (?:19|20) is grouped so that \d\d applies to both centuries
    $regexpArray['Y'] = "(?P<Y>(?:19|20)\d\d)";
    $regexpArray['m'] = "(?P<m>0[1-9]|1[012])";
    $regexpArray['d'] = "(?P<d>0[1-9]|[12][0-9]|3[01])";
    $regexpArray['-'] = "[-]";
    $regexpArray['.'] = "[\. /.]";
    $regexpArray[':'] = "[:]";
    $regexpArray['space'] = "[\s]";
    $regexpArray['H'] = "(?P<H>0[0-9]|1[0-9]|2[0-3])";
    $regexpArray['i'] = "(?P<i>[0-5][0-9])";
    $regexpArray['s'] = "(?P<s>[0-5][0-9])";

    $formatArray = str_split($format);
    $regex = "";

    // create the regular expression
    foreach ($formatArray as $character) {
        if ($character == " ") {
            $regex = $regex . $regexpArray['space'];
        } elseif (array_key_exists($character, $regexpArray)) {
            $regex = $regex . $regexpArray[$character];
        }
    }
    $regex = "/" . $regex . "/";

    // get results for the regular expression
    preg_match($regex, $time, $result);

    // create the init string for the new DateTime
    $initString = $result['Y'] . "-" . $result['m'] . "-" . $result['d'];

    // if no value for hours, minutes and seconds was found, add 00:00:00
    if (isset($result['H'])) {
        $initString = $initString . " " . $result['H'] . ":" . $result['i'] . ":" . $result['s'];
    } else {
        $initString = $initString . " 00:00:00";
    }

    $newDate = new DateClass($initString);
    return $newDate;
}
} // closes the enclosing class, whose declaration is not shown

As you can see, this fallback also uses a regex (I don't know whether that is the case in versions 5.3+). But if it comes down to a regex either way, use DateTime::createFromFormat() anyway.

2

Regular expressions are not very good for validating dates, since you don't only want to check that the string contains digits and dashes; you also want to check that the date itself is valid.

The expression you posted

^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$

would accept the date "2014-02-31", and that date does not exist.
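A quick way to see this, as a small sketch using the pattern from the question next to PHP's built-in checkdate() (the variable names are just for illustration):

```php
<?php
$pattern = '/^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$/';
$date = '2014-02-31';

// The pattern only checks the shape of the string, so this matches.
var_dump(preg_match($pattern, $date) === 1); // bool(true)

// checkdate(month, day, year) knows the calendar, so this fails.
var_dump(checkdate(2, 31, 2014)); // bool(false)
```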

One option would be to use a more complex regular expression (valid for the years 1800-2099, including leap years):

^(?:(?:18|19|20)[0-9]{2}-(?:0[13578]|1[02])-31|(?:18|19|20)[0-9]{2}-(?:01|0[3-9]|1[0-2])-(?:29|30)|(?:18|19|20)[0-9]{2}-(?:0[1-9]|1[0-2])-(?:0[1-9]|1[0-9]|2[0-8])|(?:(?:18|19|20)(?:04|08|[2468][048]|[13579][26])|2000)-02-29)$

But the more complex the regular expression, the more processing it requires, so the recommended option is to convert the text to a date type: the conversion itself keeps you from working with invalid dates.

  • Don’t... like, don’t do it like that.

  • @Henriquebarcelos How about expanding your comment to better explain what’s wrong with this form? It’s better for everyone

  • 1

    It is explained in my reply just above.

  • The answer makes it clear that the best option is the other one, not this one. I think that validates the answer: it shows the other way to do it and why not to use it, @Henriquebarcelos.
