Since I do not know the purpose of your code, in principle it looks fine to me, except that it never draws days 29, 30 and 31. If that date (which I see is "naive", i.e. it carries no timezone information) represents a time in UTC, then it also never draws leap seconds, although Python's datetime does not support them anyway.
Including these missing values brings an additional complication: the probability of a random date falling in a 31-day month is slightly higher than that of falling in a 30-day month (likewise for 28 and 29 days), and the same goes for a leap year versus an ordinary year. So if the goal is a uniform distribution, drawing field by field would become excessively laborious, long and error-prone.
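To make the imbalance concrete, here is a small sketch that counts how many calendar days fall in each month over the default range used in this answer (1915 to 1996). The helper name days_per_month is hypothetical, introduced only for illustration; a uniform draw over days would have to reproduce these weights.

```python
import calendar
from collections import Counter

def days_per_month(min_year=1915, max_year=1996):
    """Count how many calendar days fall in each month number over the range."""
    counts = Counter()
    for year in range(min_year, max_year + 1):
        for month in range(1, 13):
            # monthrange returns (weekday of the first day, number of days in the month)
            counts[month] += calendar.monthrange(year, month)[1]
    return counts

counts = days_per_month()
total = sum(counts.values())
for month in range(1, 13):
    # Share of all days in the range that belong to this month
    print(f"{month:2d}: {counts[month] / total:.4%}")
```

January ends up with a larger share than April, and February with the smallest, which is why picking the month with equal probability and then the day does not give a uniform date.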
An alternative is to draw a delta: take the difference between (max_year+1)-01-01 00:00:00.000000 and min_year-01-01 00:00:00.000000 (i.e. the total seconds of a timedelta, as a float), draw a number of seconds between zero and this delta, and then convert back to a date:
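from datetime import datetime, timedelta
from random import random

def gen_timestamp(min_year=1915, max_year=1996):
    min_date = datetime(min_year, 1, 1)
    max_date = datetime(max_year + 1, 1, 1)
    # Uniform offset in seconds within [min_date, max_date)
    delta = random() * (max_date - min_date).total_seconds()
    return (min_date + timedelta(seconds=delta)).isoformat(" ")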
This way any date in the range can be drawn, and the draw will be uniform. For example, here it is drawing a 29 February:
>>> i, d = 0, gen_timestamp()
>>> while d[5:10] != '02-29' and i < 100000:
...     i, d = i+1, gen_timestamp()
...
>>> i,d
(770, '1960-02-29 21:28:40.688135')
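As a rough sanity check on how many draws one should expect before hitting a 29 February, the sketch below computes the fraction of days in the range that are leap days. The function name feb29_probability is hypothetical; the reasoning assumes the draw is uniform over the whole interval, so the chance of landing on a given day is proportional to its length.

```python
import calendar
from datetime import date

def feb29_probability(min_year=1915, max_year=1996):
    """Fraction of days in [min_year-01-01, (max_year+1)-01-01) that are 29 February."""
    total_days = (date(max_year + 1, 1, 1) - date(min_year, 1, 1)).days
    leap_days = sum(1 for y in range(min_year, max_year + 1) if calendar.isleap(y))
    return leap_days / total_days

p = feb29_probability()
# The number of draws until the first hit follows a geometric distribution with mean 1/p
print(p, 1 / p)
```

With the default range there are 21 leap days out of 29951 days, so on average about 1426 draws are needed; a hit after 770 draws is well within what one would expect.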
Note: according to the documentation, if the interval between the latest and earliest dates is very large (more than 270 years on most platforms), this method loses microsecond precision.
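If that precision loss matters, one possible workaround is to stay in integer arithmetic: draw a whole number of microseconds instead of a float number of seconds. This is only a sketch of a variant, not the method shown above; the name gen_timestamp_exact is hypothetical.

```python
from datetime import datetime, timedelta
from random import randrange

def gen_timestamp_exact(min_year=1915, max_year=1996):
    min_date = datetime(min_year, 1, 1)
    max_date = datetime(max_year + 1, 1, 1)
    # Floor-dividing two timedeltas yields an int, so no float rounding is involved
    span_us = (max_date - min_date) // timedelta(microseconds=1)
    # randrange(n) draws an integer uniformly from 0 to n - 1
    offset = timedelta(microseconds=randrange(span_us))
    return (min_date + offset).isoformat(" ")

print(gen_timestamp_exact())
```

Every microsecond in the half-open interval from min_date to max_date is equally likely here, regardless of how wide the interval is.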
Is the month really only November and December, or was that a typo? (i.e. did you mean month = random.randint(1, 12)?) And what is "PR"? – mgibsonbr
PR is Pull Request. – Regis Santos