PEP 495 – Local Time Disambiguation
Author: Alexander Belopolsky <alexander.belopolsky at gmail.com>, Tim Peters <tim.peters at gmail.com>
Status: Final
Type: Standards Track
Created: 02-Aug-2015
Python-Version: 3.6
Table of Contents
- Abstract
- Rationale
- Terminology
- Proposal
- The “fold” attribute
- Affected APIs
* Attributes
* Constructors
* Methods
* C-API
- Affected Behaviors
* What time is it?
* Conversion from naive to aware
* Conversion from POSIX seconds from EPOCH
* Conversion to POSIX seconds from EPOCH
* Aware datetime instances
* Combining and splitting date and time
* Pickles
- Implementations of tzinfo in the Standard Library
- Guidelines for New tzinfo Implementations
- Temporal Arithmetic and Comparison Operators
- Backward and Forward Compatibility
- Questions and Answers
- Implementation
- Copyright
- Picture Credit
Abstract
This PEP adds a new attribute fold to instances of the datetime.time and datetime.datetime classes that can be used to differentiate between two moments in time for which local times are the same. The allowed values for the fold attribute will be 0 and 1, with 0 corresponding to the earlier and 1 to the later of the two possible readings of an ambiguous local time.
Rationale
In most world locations, there have been and will be times when local clocks are moved back. [1] In those times, intervals are introduced in which local clocks show the same time twice in the same day. In these situations, the information displayed on a local clock (or stored in a Python datetime instance) is insufficient to identify a particular moment in time. The proposed solution is to add an attribute to the datetime instances taking values of 0 and 1 that will enumerate the two ambiguous times.
Terminology
When clocks are moved back, we say that a fold [2] is created in time. When the clocks are moved forward, a gap is created. A local time that falls in the fold is called ambiguous. A local time that falls in the gap is called missing.
Proposal
The “fold” attribute
We propose adding an attribute called fold to instances of the datetime.time and datetime.datetime classes. This attribute should have the value 0 for all instances except those that represent the second (chronologically) moment in time in an ambiguous case. For those instances, the value will be 1. [3]
Affected APIs
Attributes
Instances of datetime.time and datetime.datetime classes will get a new attribute fold with two possible values: 0 and 1.
Constructors
The __new__ methods of the datetime.time and datetime.datetime classes will get a new keyword-only argument called fold with the default value 0. The value of the fold argument will be used to initialize the value of the fold attribute in the returned instance.
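The constructor behavior described above can be tried out in a few lines. This is a minimal sketch; the variable names are illustrative only, and the rejection of values other than 0 and 1 with ValueError reflects CPython's behavior rather than wording in this section.

```python
from datetime import datetime, time

# fold defaults to 0 when it is not given.
first = datetime(2014, 11, 2, 1, 30)

# fold is passed by keyword to both constructors.
second = datetime(2014, 11, 2, 1, 30, fold=1)
wall = time(1, 30, fold=1)

# Values other than 0 and 1 are rejected (ValueError in CPython).
try:
    datetime(2014, 11, 2, 1, 30, fold=2)
    rejected = False
except ValueError:
    rejected = True
```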
Methods
The replace() methods of the datetime.time and datetime.datetime classes will get a new keyword-only argument called fold. It will behave similarly to the other replace() arguments: if the fold argument is specified and given a value 0 or 1, the new instance returned by replace() will have its fold attribute set to that value. In CPython, any non-integer value of fold will raise a TypeError, but other implementations may allow the value None to behave the same as when fold is not given. [4] (This is a nod to the existing difference in treatment of None arguments in other positions of this method across Python implementations; it is not intended to leave the door open for future alternative interpretation of fold=None.) If the fold argument is not specified, the original value of the fold attribute is copied to the result.
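A short sketch of the replace() rules above, using illustrative variable names: an explicit fold is applied, and an omitted fold is carried over from the original instance.

```python
from datetime import datetime

first = datetime(2014, 11, 2, 1, 30)

# An explicit fold argument sets the attribute on the new instance.
second = first.replace(fold=1)

# When fold is omitted, the original value is copied to the result.
later_minute = second.replace(minute=45)

# fold can be cleared again the same way.
back = second.replace(fold=0)
```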
C-API
Access macros will be defined to extract the value of fold from PyDateTime_DateTime and PyDateTime_Time objects.
int PyDateTime_DATE_GET_FOLD(PyDateTime_DateTime *o)
Return the value of fold as a C int.
int PyDateTime_TIME_GET_FOLD(PyDateTime_Time *o)
Return the value of fold as a C int.
New constructors will be defined that will take an additional argument to specify the value of fold in the created instance:
PyObject* PyDateTime_FromDateAndTimeAndFold(int year, int month, int day, int hour, int minute, int second, int usecond, int fold)
Return a datetime.datetime object with the specified year, month, day, hour, minute, second, microsecond and fold.
PyObject* PyTime_FromTimeAndFold(int hour, int minute, int second, int usecond, int fold)
Return a datetime.time object with the specified hour, minute, second, microsecond and fold.
Affected Behaviors
What time is it?
The datetime.now() method called without arguments will set fold=1 when returning the second of the two ambiguous times in a system local time fold. When called with a tzinfo argument, the value of the fold will be determined by the tzinfo.fromutc() implementation. When an instance of the datetime.timezone class (the stdlib’s fixed-offset tzinfo subclass, e.g. datetime.timezone.utc) is passed as tzinfo, the returned datetime instance will always have fold=0. The datetime.utcnow() method is unaffected.
Conversion from naive to aware
A new feature is proposed to facilitate conversion from naive datetime instances to aware.
The astimezone() method will now work for naive self. The system local timezone will be assumed in this case and the fold flag will be used to determine which local timezone is in effect in the ambiguous case.
For example, on a system set to US/Eastern timezone:
    >>> dt = datetime(2014, 11, 2, 1, 30)
    >>> dt.astimezone().strftime('%D %T %Z%z')
    '11/02/14 01:30:00 EDT-0400'
    >>> dt.replace(fold=1).astimezone().strftime('%D %T %Z%z')
    '11/02/14 01:30:00 EST-0500'
An implication is that datetime.now(tz) is fully equivalent to datetime.now().astimezone(tz) (assuming tz is an instance of a post-PEP tzinfo implementation, i.e. one that correctly handles and sets fold).
Conversion from POSIX seconds from EPOCH
The fromtimestamp() class method of datetime.datetime will set the fold attribute appropriately in the returned object.
For example, on a system set to US/Eastern timezone:
    >>> datetime.fromtimestamp(1414906200)
    datetime.datetime(2014, 11, 2, 1, 30)
    >>> datetime.fromtimestamp(1414906200 + 3600)
    datetime.datetime(2014, 11, 2, 1, 30, fold=1)
Conversion to POSIX seconds from EPOCH
The timestamp() method of datetime.datetime will return different values for datetime.datetime instances that differ only by the value of their fold attribute if and only if these instances represent an ambiguous or a missing time.
When a datetime.datetime instance dt represents an ambiguous time, there are two values s0 and s1 such that:
    datetime.fromtimestamp(s0) == datetime.fromtimestamp(s1) == dt
(This is because == disregards the value of fold – see below.)
In this case, dt.timestamp() will return the smaller of s0 and s1 values if dt.fold == 0 and the larger otherwise.
For example, on a system set to US/Eastern timezone:
    >>> datetime(2014, 11, 2, 1, 30, fold=0).timestamp()
    1414906200.0
    >>> datetime(2014, 11, 2, 1, 30, fold=1).timestamp()
    1414909800.0
When a datetime.datetime instance dt represents a missing time, there is no value s for which:
    datetime.fromtimestamp(s) == dt
but we can form two “nice to know” values of s that differ by the size of the gap in seconds. One is the value of s that would correspond to dt in a timezone where the UTC offset is always the same as the offset right before the gap and the other is the similar value but in a timezone where the UTC offset is always the same as the offset right after the gap.
The value returned by dt.timestamp() given a missing dt will be the greater of the two “nice to know” values if dt.fold == 0 and the smaller otherwise. (This is not a typo – it’s intentionally backwards from the rule for ambiguous times.)
For example, on a system set to US/Eastern timezone:
    >>> datetime(2015, 3, 8, 2, 30, fold=0).timestamp()
    1425799800.0
    >>> datetime(2015, 3, 8, 2, 30, fold=1).timestamp()
    1425796200.0
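The timestamp rules for ambiguous and missing times can be reproduced without changing the machine's configured timezone. The sketch below is Unix-only and rests on two assumptions that are not part of this PEP: that time.tzset() is available, and that the C library accepts the POSIX rule string used here as a stand-in for US/Eastern.

```python
import os
import time
from datetime import datetime

# Assumption: Unix-like platform; this POSIX TZ rule string approximates
# US/Eastern (DST from the 2nd Sunday of March to the 1st Sunday of November).
os.environ["TZ"] = "EST5EDT,M3.2.0,M11.1.0"
time.tzset()

ambiguous = datetime(2014, 11, 2, 1, 30)  # local time that occurs twice
missing = datetime(2015, 3, 8, 2, 30)     # local time that is skipped

# In the fold, fold=0 gives the smaller timestamp and fold=1 the larger.
fold_ts = [ambiguous.replace(fold=f).timestamp() for f in (0, 1)]

# In the gap the order is reversed: fold=0 gives the greater value.
gap_ts = [missing.replace(fold=f).timestamp() for f in (0, 1)]

# Converting the later timestamp back yields the same wall time with fold=1.
second_reading = datetime.fromtimestamp(fold_ts[1])
```

Under these assumptions the values match the US/Eastern examples in this section, and the two gap timestamps differ by exactly the size of the gap (3600 seconds).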
Aware datetime instances
Users of pre-PEP implementations of tzinfo will not see any changes in the behavior of their aware datetime instances. Two such instances that differ only by the value of the fold attribute will not be distinguishable by any means other than an explicit access to the fold value. (This is because these pre-PEP implementations are not using the fold attribute.)
On the other hand, if an object’s tzinfo is set to a fold-aware implementation, then in a fold or gap the value of fold will affect the result of several methods: utcoffset(), dst(), tzname(), astimezone(), strftime() (if the “%Z” or “%z” directive is used in the format specification), isoformat(), and timetuple().
Combining and splitting date and time
The datetime.datetime.combine() method will copy the value of the fold attribute to the resulting datetime.datetime instance.
The datetime.datetime.time() method will copy the value of the fold attribute to the resulting datetime.time instance.
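A small sketch of the splitting and combining rules above, with illustrative variable names: fold travels with the time part in both directions.

```python
from datetime import datetime

second = datetime(2014, 11, 2, 1, 30, fold=1)

# Splitting: the time part keeps the fold value.
wall = second.time()

# Combining: the fold of the time argument is copied to the result.
rebuilt = datetime.combine(second.date(), wall)
```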
Pickles
The value of the fold attribute will only be saved in pickles created with protocol version 4 (introduced in Python 3.4) or greater.
Pickle sizes for the datetime.datetime and datetime.time objects will not change. The fold value will be encoded in the first bit of the 3rd byte of the datetime.datetime pickle payload; and in the first bit of the 1st byte of the datetime.time payload. In the current implementation these bytes are used to store the month (1-12) and hour (0-23) values and the first bit is always 0. We picked these bytes because they are the only bytes that are checked by the current unpickle code. Thus loading post-PEP fold=1 pickles in a pre-PEP Python will result in an exception rather than an instance with out of range components.
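The pickling rules can be checked with a round trip. This is a sketch with illustrative names; it relies on CPython's implementation of the rule that fold is only stored for protocol 4 and above.

```python
import pickle
from datetime import datetime

second = datetime(2014, 11, 2, 1, 30, fold=1)

# Protocol 4 keeps fold; lower protocols silently drop it.
restored_v4 = pickle.loads(pickle.dumps(second, protocol=4))
restored_v3 = pickle.loads(pickle.dumps(second, protocol=3))

# The payload size is the same whether or not the fold bit is set.
same_size = (
    len(pickle.dumps(second, protocol=4))
    == len(pickle.dumps(second.replace(fold=0), protocol=4))
)
```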
Implementations of tzinfo in the Standard Library
No new implementations of datetime.tzinfo abstract class are proposed in this PEP. The existing (fixed offset) timezones do not introduce ambiguous local times and their utcoffset() implementation will return the same constant value as they do now regardless of the value of fold.
The basic implementation of fromutc() in the abstract datetime.tzinfo class will not change. It is currently not used anywhere in the stdlib because the only included tzinfo implementation (the datetime.timezone class implementing fixed offset timezones) overrides fromutc(). Keeping the default implementation unchanged has the benefit that pre-PEP 3rd party implementations that inherit the default fromutc() are not accidentally affected.
Guidelines for New tzinfo Implementations
Implementors of concrete datetime.tzinfo
subclasses who want to support variable UTC offsets (due to DST and other causes) should follow these guidelines.
Ignorance is Bliss
New implementations of utcoffset(), tzname() and dst() methods should ignore the value of fold unless they are called on the ambiguous or missing times.
In the Fold
New subclasses should override the base-class fromutc() method and implement it so that in all cases where two different UTC times u0 and u1 (u0 < u1) correspond to the same local time t, fromutc(u0) will return an instance with fold=0 and fromutc(u1) will return an instance with fold=1. In all other cases the returned instance should have fold=0.
The utcoffset(), tzname() and dst() methods should use the value of the fold attribute to determine whether an otherwise ambiguous time t corresponds to the time before or after the transition. By definition, utcoffset() is greater before and smaller after any transition that creates a fold. The values returned by tzname() and dst() may or may not depend on the value of the fold attribute depending on the kind of the transition.
The sketch above illustrates the relationship between the UTC and local time around a fall-back transition. The zig-zag line is a graph of the function implemented by fromutc(). Two intervals on the UTC axis adjacent to the transition point and having the size of the time shift at the transition are mapped to the same interval on the local axis. New implementations of fromutc() method should set the fold attribute to 1 when self is in the region marked in yellow on the UTC axis. (All intervals should be treated as closed on the left and open on the right.)
Mind the Gap
The fromutc() method should never produce a time in the gap.
If the utcoffset(), tzname() or dst() method is called on a local time that falls in a gap, the rules in effect before the transition should be used if fold=0. Otherwise, the rules in effect after the transition should be used.
The sketch above illustrates the relationship between the UTC and local time around a spring-forward transition. At the transition, the local clock is advanced skipping the times in the gap. For the purposes of determining the values of utcoffset(), tzname() and dst(), the line before the transition is extended forward to find the UTC time corresponding to the time in the gap with fold=0 and for instances with fold=1, the line after the transition is extended back.
Summary of Rules at a Transition
On ambiguous/missing times utcoffset() should return values according to the following table:
| | fold=0 | fold=1 |
|---|---|---|
| Fold | oldoff | newoff = oldoff - delta |
| Gap | oldoff | newoff = oldoff + delta |
where oldoff (newoff) is the UTC offset before (after) the transition and delta is the absolute size of the fold or the gap.
Note that the interpretation of the fold attribute is consistent in the fold and gap cases. In both cases, fold=0 (fold=1) means use fromutc() line before (after) the transition to find the UTC time. Only in the “Fold” case, the UTC times u0 and u1 are “real” solutions for the equation fromutc(u) == t, while in the “Gap” case they are “imaginary” solutions.
The DST Transitions
On a missing time introduced at the start of DST, the values returned by utcoffset() and dst() methods should be as follows:
| | fold=0 | fold=1 |
|---|---|---|
| utcoffset() | stdoff | stdoff + dstoff |
| dst() | zero | dstoff |
On an ambiguous time introduced at the end of DST, the values returned by utcoffset() and dst() methods should be as follows:
| | fold=0 | fold=1 |
|---|---|---|
| utcoffset() | stdoff + dstoff | stdoff |
| dst() | dstoff | zero |
where stdoff is the standard (non-DST) offset, dstoff is the DST correction (typically dstoff = timedelta(hours=1)) and zero = timedelta(0).
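The guidelines in this section can be put together in a small tzinfo subclass. The sketch below is not an implementation from this PEP or the standard library: the class name SketchEastern, the helper _nth_sunday and the hard-coded rule (DST from the second Sunday of March to the first Sunday of November, switching at 02:00 local time) are assumptions chosen to resemble current US Eastern rules.

```python
from datetime import datetime, timedelta, timezone, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)


def _nth_sunday(year, month, n):
    """Naive local datetime at 02:00 on the n-th Sunday of the month."""
    first = datetime(year, month, 1, 2)
    days_to_sunday = (6 - first.weekday()) % 7
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))


class SketchEastern(tzinfo):
    """Hypothetical fold-aware zone: UTC-5 standard time, UTC-4 during DST."""

    stdoff = timedelta(hours=-5)
    dstoff = HOUR

    def _bounds(self, year):
        # Local wall times: the gap is [start, start + 1h) and the
        # fold is [end - 1h, end).
        return _nth_sunday(year, 3, 2), _nth_sunday(year, 11, 1)

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)  # naive comparisons ignore fold
        start, end = self._bounds(dt.year)
        if start + HOUR <= naive < end - HOUR:
            return self.dstoff  # unambiguous DST: fold is ignored
        if end - HOUR <= naive < end:
            # Fold: fold=0 is the first reading (DST), fold=1 the second.
            return ZERO if dt.fold else self.dstoff
        if start <= naive < start + HOUR:
            # Gap: fold=0 uses the rules before the transition (standard).
            return self.dstoff if dt.fold else ZERO
        return ZERO

    def utcoffset(self, dt):
        return self.stdoff + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

    def fromutc(self, dt):
        # dt carries UTC field values with tzinfo=self.
        start, end = self._bounds(dt.year)
        utc = dt.replace(tzinfo=None)
        start_utc = start - self.stdoff
        end_utc = end - (self.stdoff + self.dstoff)
        if start_utc <= utc < end_utc:
            return dt + self.stdoff + self.dstoff
        local = dt + self.stdoff
        if end_utc <= utc < end_utc + self.dstoff:
            # Second pass over the repeated local hour.
            return local.replace(fold=1)
        return local


tz = SketchEastern()
first = datetime(2014, 11, 2, 5, 30, tzinfo=timezone.utc).astimezone(tz)
second = datetime(2014, 11, 2, 6, 30, tzinfo=timezone.utc).astimezone(tz)
gap = datetime(2015, 3, 8, 2, 30, tzinfo=tz)
```

Here first and second show the same wall time 01:30 but differ in fold and in utcoffset(), while gap picks the standard offset for fold=0 and the DST offset for fold=1, following the two tables above.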
Temporal Arithmetic and Comparison Operators
In mathematicks he was greater
Than Tycho Brahe, or Erra Pater:
For he, by geometric scale,
Could take the size of pots of ale;
Resolve, by sines and tangents straight,
If bread or butter wanted weight,
And wisely tell what hour o’ th’ day
The clock does strike by algebra.
– “Hudibras” by Samuel Butler
The value of the fold attribute will be ignored in all operations with naive datetime instances. As a consequence, naive datetime.datetime or datetime.time instances that differ only by the value of fold will compare as equal. Applications that need to differentiate between such instances should check the value of fold explicitly or convert those instances to a timezone that does not have ambiguous times (such as UTC).
The value of fold will also be ignored whenever a timedelta is added to or subtracted from a datetime instance which may be either aware or naive. The result of addition (subtraction) of a timedelta to (from) a datetime will always have fold set to 0 even if the original datetime instance had fold=1.
No changes are proposed to the way the difference t - s is computed for datetime instances t and s. If both instances are naive or t.tzinfo is the same instance as s.tzinfo (t.tzinfo is s.tzinfo evaluates to True) then t - s is a timedelta d such that s + d == t. As explained in the previous paragraph, timedelta addition ignores both fold and tzinfo attributes and so does intra-zone or naive datetime subtraction.
Naive and intra-zone comparisons will ignore the value of fold and return the same results as they do now. (This is the only way to preserve backward compatibility. If you need an aware intra-zone comparison that uses the fold, convert both sides to UTC first.)
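The naive-arithmetic rules above can be observed directly. The variable names in this sketch are illustrative only.

```python
from datetime import datetime, timedelta

first = datetime(2014, 11, 2, 1, 30)
second = first.replace(fold=1)

# Naive comparison and hashing ignore fold.
same = first == second
same_hash = hash(first) == hash(second)

# Adding a timedelta always yields fold=0, even starting from fold=1.
shifted = second + timedelta(minutes=10)

# Naive subtraction ignores fold as well.
difference = second - first
```

An application that must tell first and second apart has to inspect the fold attribute itself, as the text recommends.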
The inter-zone subtraction will be defined as it is now: t - s is computed as (t - t.utcoffset()) - (s - s.utcoffset()).replace(tzinfo=t.tzinfo), but the result will depend on the values of t.fold and s.fold when either t.tzinfo or s.tzinfo is post-PEP. [5]
Aware datetime Equality Comparison
The aware datetime comparison operators will work the same as they do now, with results indirectly affected by the value of fold whenever the utcoffset() value of one of the operands depends on it, with one exception. Whenever one or both of the operands in inter-zone comparison is such that its utcoffset() depends on the value of its fold attribute, the result is False. [6]
Formally, t == s when t.tzinfo is s.tzinfo evaluates to False can be defined as follows. Let toutc(t, fold) be a function that takes an aware datetime instance t and returns a naive instance representing the same time in UTC assuming a given value of fold:
    def toutc(t, fold):
        u = t - t.replace(fold=fold).utcoffset()
        return u.replace(tzinfo=None)
Then t == s is equivalent to
    toutc(t, fold=0) == toutc(t, fold=1) == toutc(s, fold=0) == toutc(s, fold=1)
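The exception for inter-zone equality can be demonstrated with a toy zone. OneFoldZone below is a hypothetical class written for this illustration (a single fall-back transition on 2014-11-02, UTC-4 before it and UTC-5 after it); only toutc() is taken from the text.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class OneFoldZone(tzinfo):
    """Hypothetical zone with one fold: local 01:00-02:00 on 2014-11-02."""

    FOLD_START = datetime(2014, 11, 2, 1)

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        if naive < self.FOLD_START:
            return timedelta(hours=-4)
        if naive < self.FOLD_START + timedelta(hours=1):
            # Inside the fold the offset depends on the fold attribute.
            return timedelta(hours=-5 if dt.fold else -4)
        return timedelta(hours=-5)

    def dst(self, dt):
        return None  # DST details are not needed for this illustration

    def tzname(self, dt):
        return "DEMO"


def toutc(t, fold):
    u = t - t.replace(fold=fold).utcoffset()
    return u.replace(tzinfo=None)


zone = OneFoldZone()
t = datetime(2014, 11, 2, 1, 30, tzinfo=zone)         # ambiguous local time
s = datetime(2014, 11, 2, 5, 30, tzinfo=timezone.utc)
unambiguous = datetime(2014, 11, 2, 0, 30, tzinfo=zone)

zero_difference = (t - s) == timedelta(0)
equal_in_fold = t == s
equal_outside = unambiguous == datetime(2014, 11, 2, 4, 30, tzinfo=timezone.utc)
```

Although t - s is zero, t == s is False because toutc(t, fold=0) and toutc(t, fold=1) differ; outside the fold the inter-zone comparison behaves as before.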
Backward and Forward Compatibility
This proposal will have little effect on the programs that do not read the fold flag explicitly or use tzinfo implementations that do. The only visible change for such programs will be that conversions to and from POSIX timestamps will now round-trip correctly (up to floating point rounding). Programs that implemented a work-around to the old incorrect behavior may need to be modified.
Pickles produced by older programs will remain fully forward compatible. Only datetime/time instances with fold=1 pickled in the new versions will become unreadable by the older Python versions. Pickles of instances with fold=0 (which is the default) will remain unchanged.
Questions and Answers
Why not call the new flag “isdst”?
A non-technical answer
- Alice: Bob - let’s have a stargazing party at 01:30 AM tomorrow!
- Bob: Should I presume initially that Daylight Saving Time is or is not in effect for the specified time?
- Alice: Huh?
- Bob: Alice - let’s have a stargazing party at 01:30 AM tomorrow!
- Alice: You know, Bob, 01:30 AM will happen twice tomorrow. Which time do you have in mind?
- Bob: I did not think about it, but let’s pick the first.
(same characters, an hour later)
- Bob: Alice - this Py-O-Clock gadget of mine asks me to choose between fold=0 and fold=1 when I set it for tomorrow 01:30 AM. What should I do?
- Alice: I’ve never heard of a Py-O-Clock, but I guess fold=0 is the first 01:30 AM and fold=1 is the second.
A technical reason
While the tm_isdst field of the time.struct_time object can be used to disambiguate local times in the fold, the semantics of such disambiguation are completely different from the proposal in this PEP.
The main problem with the tm_isdst field is that it is impossible to know what value is appropriate for tm_isdst without knowing the details about the time zone that are only available to the tzinfo implementation. Thus while tm_isdst is useful in the output of methods such as time.localtime, it is cumbersome as an input of methods such as time.mktime.
If the programmer passes the wrong non-negative value of tm_isdst to time.mktime, the result will be a time that is 1 hour off, and since there is rarely a way to know anything about DST before a call to time.mktime is made, the only sane choice is usually tm_isdst=-1.
Unlike tm_isdst, the proposed fold attribute has no effect on the interpretation of the datetime instance unless without that attribute two (or no) interpretations are possible.
Since it would be very confusing to have something called isdst that does not have the same semantics as tm_isdst, we need a different name. Moreover, the datetime.datetime class already has a method called dst() and if we called fold “isdst”, we would necessarily have situations when “isdst” is zero but dst() is not or the other way around.
Why “fold”?
Suggested by Guido van Rossum and favored by one (but initially disfavored by another) author. A consensus was reached after the allowed values for the attribute were changed from False/True to 0/1. The noun “fold” has correct connotations and easy mnemonic rules, but at the same time does not invite unbased assumptions.
What is “first”?
This was a working name of the attribute chosen initially because the obvious alternative (“second”) conflicts with the existing attribute. It was rejected mostly on the grounds that it would make True a default value.
The following alternative names have also been considered:
later
A close contender to “fold”. One author dislikes it because it is confusable with equally fitting “latter,” but in the age of auto-completion everywhere this is a small consideration. A stronger objection may be that in the case of missing time, we will have a later=True instance converted to an earlier time by .astimezone(timezone.utc) than that with later=False. Yet again, this can be interpreted as a desirable indication that the original time is invalid.
which
The original placeholder name for the localtime function branch index was independently proposed for the name of the disambiguation attribute and received some support.
repeated
Did not receive any support on the mailing list.
ltdf
(Local Time Disambiguation Flag) - short and no-one will attempt to guess what it means without reading the docs. (This abbreviation was used in PEP discussions, with ltdf=False meaning the earlier time, by those who didn’t want to endorse any of the alternatives.)
Are two values enough?
Several reasons have been raised to allow a None or -1 value for the fold attribute: backward compatibility, analogy with tm_isdst and strict checking for invalid times.
Backward Compatibility
It has been suggested that backward compatibility can be improved if the default value of the fold flag was None which would signal that pre-PEP behavior is requested. Based on the analysis below, we believe that the proposed changes with the fold=0 default are sufficiently backward compatible.
This PEP provides only three ways for a program to discover that two otherwise identical datetime instances have different values of fold: (1) an explicit check of the fold attribute; (2) if the instances are naive - conversion to another timezone using the astimezone() method; and (3) conversion to float using the timestamp() method.
Since fold is a new attribute, the first option is not available to the existing programs. Note that option (2) only works for naive datetimes that happen to be in a fold or a gap in the system time zone. In all other cases, the value of fold will be ignored in the conversion unless the instances use a fold-aware tzinfo which would not be available in a pre-PEP program. Similarly, the astimezone() called on a naive instance will not be available in such a program because astimezone() does not currently work with naive datetimes.
This leaves us with only one situation where an existing program can start producing different results after the implementation of this PEP: when a datetime.timestamp() method is called on a naive datetime instance that happens to be in the fold or the gap. In the current implementation, the result is undefined. Depending on the system mktime implementation, the programs can see different results or errors in those cases. With this PEP in place, the value of timestamp will be well-defined in those cases but will depend on the value of the fold flag. We consider the change in datetime.timestamp() method behavior a bug fix enabled by this PEP. The old behavior can still be emulated by the users who depend on it by writing time.mktime(dt.timetuple()) + 1e-6*dt.microsecond instead of dt.timestamp().
Analogy with tm_isdst
The time.mktime interface allows three values for the tm_isdst flag: -1, 0, and 1. As we explained above, -1 (asking mktime to determine whether DST is in effect for the given time from the rest of the fields) is the only choice that is useful in practice.
With the fold flag, however, datetime.timestamp() will return the same value as mktime with tm_isdst=-1 99.98% of the time for most time zones with DST transitions. Moreover, tm_isdst=-1-like behavior is specified regardless of the value of fold.
It is only in the 0.02% cases (2 hours per year) that the datetime.timestamp() and mktime with tm_isdst=-1 may disagree. However, even in this case, most of the mktime implementations will return the fold=0 or the fold=1 value even though relevant standards allow mktime to return -1 and set an error code in those cases.
In other words, tm_isdst=-1 behavior is not missing from this PEP. To the contrary, it is the only behavior provided in two different well-defined flavors. The behavior that is missing is when a given local hour is interpreted as a different local hour because of the misspecified tm_isdst.
For example, in the DST-observing time zones in the Northern hemisphere (where DST is in effect in June) one can get
    >>> from time import mktime, localtime
    >>> t = mktime((2015, 6, 1, 12, 0, 0, -1, -1, 0))
    >>> localtime(t)[:]
    (2015, 6, 1, 13, 0, 0, 0, 152, 1)
Note that 12:00 was interpreted as 13:00 by mktime. With the datetime.timestamp, datetime.fromtimestamp, it is currently guaranteed that
    >>> t = datetime.datetime(2015, 6, 1, 12).timestamp()
    >>> datetime.datetime.fromtimestamp(t)
    datetime.datetime(2015, 6, 1, 12, 0)
This PEP extends the same guarantee to both values of fold:
    >>> t = datetime.datetime(2015, 6, 1, 12, fold=0).timestamp()
    >>> datetime.datetime.fromtimestamp(t)
    datetime.datetime(2015, 6, 1, 12, 0)
    >>> t = datetime.datetime(2015, 6, 1, 12, fold=1).timestamp()
    >>> datetime.datetime.fromtimestamp(t)
    datetime.datetime(2015, 6, 1, 12, 0)
Thus one of the suggested uses for fold=-1 – to match the legacy behavior – is not needed. Either choice of fold will match the old behavior except in the few cases where the old behavior was undefined.
Strict Invalid Time Checking
Another suggestion was to use fold=-1 or fold=None to indicate that the program truly has no means to deal with the folds and gaps and dt.utcoffset() should raise an error whenever dt represents an ambiguous or missing local time.
The main problem with this proposal is that dt.utcoffset() is used internally in situations where raising an error is not an option: for example, in dictionary lookups or list/set membership checks. So strict gap/fold checking behavior would need to be controlled by a separate flag, say dt.utcoffset(raise_on_gap=True, raise_on_fold=False). However, this functionality can be easily implemented in user code:
    # Hypothetical exception types; they are not defined by this PEP.
    class AmbiguousTimeError(ValueError):
        pass

    class MissingTimeError(ValueError):
        pass

    def utcoffset(dt, raise_on_gap=True, raise_on_fold=False):
        u = dt.utcoffset()
        # Offset the same wall time would have with the other fold value.
        v = dt.replace(fold=not dt.fold).utcoffset()
        if u == v:
            return u  # neither ambiguous nor missing
        if (u < v) == dt.fold:
            # The fold=0 offset is the greater one: dt is in a fold.
            if raise_on_fold:
                raise AmbiguousTimeError
        else:
            # The fold=0 offset is the smaller one: dt is in a gap.
            if raise_on_gap:
                raise MissingTimeError
        return u
Moreover, raising an error in the problem cases is only one of many possible solutions. An interactive program can ask the user for additional input, while a server process may log a warning and take an appropriate default action. We cannot possibly provide functions for all possible user requirements, but this PEP provides the means to implement any desired behavior in a few lines of code.
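As one illustration of such user-level handling, the sketch below labels a local time as normal, ambiguous or missing by comparing the offsets for both fold values, using the rule that utcoffset() is greater for fold=0 in a fold and smaller in a gap. The Demo2015 zone and the classify() helper are hypothetical names introduced here, not part of this PEP.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)


class Demo2015(tzinfo):
    """Hypothetical zone: UTC-5, with a UTC-4 DST period in 2015 only."""

    GAP = datetime(2015, 3, 8, 2)    # local 02:00-03:00 is skipped
    FOLD = datetime(2015, 11, 1, 1)  # local 01:00-02:00 occurs twice

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.GAP <= naive < self.GAP + HOUR:
            return HOUR if dt.fold else ZERO
        if self.FOLD <= naive < self.FOLD + HOUR:
            return ZERO if dt.fold else HOUR
        return HOUR if self.GAP + HOUR <= naive < self.FOLD else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "DST" if self.dst(dt) else "STD"


def classify(dt):
    """Return 'normal', 'ambiguous' or 'missing' for an aware datetime."""
    before = dt.replace(fold=0).utcoffset()
    after = dt.replace(fold=1).utcoffset()
    if before == after:
        return "normal"
    # In a fold the offset shrinks across the transition; in a gap it grows.
    return "ambiguous" if before > after else "missing"


z = Demo2015()
labels = [
    classify(datetime(2015, 6, 1, 12, tzinfo=z)),
    classify(datetime(2015, 11, 1, 1, 30, tzinfo=z)),
    classify(datetime(2015, 3, 8, 2, 30, tzinfo=z)),
]
```

A caller can then decide per label whether to raise, prompt the user, or fall back to a default, which is the flexibility the paragraph above describes.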
Implementation
- Github fork: https://github.com/abalkin/cpython/tree/issue24773-s3
- Tracker issue: http://bugs.python.org/issue24773
Copyright
This document has been placed in the public domain.
Picture Credit
This image is a work of a U.S. military or Department of Defense employee, taken or made as part of that person’s official duties. As a work of the U.S. federal government, the image is in the public domain.