Issue 13957: parsedate_tz doesn't distinguish -0000 from +0000

This is what I'm seeing:

    >>> import email.utils
    >>> email.utils.parsedate_tz('Fri, 09 Nov 2001 01:08:47 +0000')
    (2001, 11, 9, 1, 8, 47, 0, 1, -1, 0)
    >>> email.utils.parsedate_tz('Fri, 09 Nov 2001 01:08:47 -0000')
    (2001, 11, 9, 1, 8, 47, 0, 1, -1, 0)

But RFC 5322 says:

minutes). The form "+0000" SHOULD be used to indicate a time zone at Universal Time. Though "-0000" also indicates Universal Time, it is used to indicate that the time was generated on a system that may be in a local time zone other than Universal Time and that the date-time contains no information about the local time zone.

(As does RFC 2822, which RFC 5322 obsoletes.)

And the documentation for email.utils.parsedate_tz is:

Performs the same function as parsedate(), but returns either None or a 10-tuple; the first 9 elements make up a tuple that can be passed directly to time.mktime(), and the tenth is the offset of the date’s timezone from UTC (which is the official term for Greenwich Mean Time) [1]. If the input string has no timezone, the last element of the tuple returned is None. Note that indexes 6, 7, and 8 of the result tuple are not usable.
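For reference, the tenth element is an offset in seconds. A minimal sketch of the documented tuple layout follows; the `+0100` date string is an illustrative assumption rather than one of the inputs from this report, and `email.utils.mktime_tz` is the standard-library companion that converts the 10-tuple to a UTC timestamp.

```python
import email.utils

# Illustrative input with a nonzero zone (not taken from the report).
parsed = email.utils.parsedate_tz('Fri, 09 Nov 2001 01:08:47 +0100')

# Tenth element: offset of the date's timezone from UTC, in seconds.
offset_seconds = parsed[9]

# mktime_tz() applies that offset and returns seconds since the epoch (UTC).
timestamp = email.utils.mktime_tz(parsed)

print(offset_seconds, timestamp)
```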

So it seems like I should have seen:

    >>> email.utils.parsedate_tz('Fri, 09 Nov 2001 01:08:47 -0000')
    (2001, 11, 9, 1, 8, 47, 0, 1, -1, None)

This is fixed already in 3.3. It is a behavior change that could theoretically cause some problems. Currently, you can think of None as meaning "there was no timezone info at all", which is subtly different from -0000, which means "this time is UTC, but I don't know what timezone it originated from". These two tend to be conflated in practice (how else are you going to interpret a time with no timezone attached?), and since we are making other additions to email in 3.3 we decided it was a small enough change that it was OK for a dot release. But not for a maintenance release, just in case. (I'm open to argument on that, but these backward compatibility calls are notoriously hard to make.)