[Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review

Nick Coghlan ncoghlan at gmail.com
Tue Feb 14 03:14:32 CET 2012


On Tue, Feb 14, 2012 at 4:33 AM, Victor Stinner <victor.stinner at gmail.com> wrote:

>> However, I am still -1 on the solution proposed by the PEP.  I still think that migrating to datetime use is a better way to go, rather than a proliferation of the data types used to represent timestamps, along with an API to specify the type of data returned.
>>
>> Let's look at each item in the PEP's rationale for discarding the use of datetimes:
>
> Oh, I forgot to mention my main concern about datetime: many functions returning timestamps have an undefined starting point (and no timezone information), and so cannot be converted to datetime:
>  - time.clock(), time.wallclock(), time.monotonic(), time.clock_gettime() (except for CLOCK_REALTIME)
>  - time.clock_getres()
>  - signal.get/setitimer()
>  - os.wait3(), os.wait4(), resource.getrusage()
>  - etc.
>
> Allowing the datetime.datetime type for just a few functions (like datetime.datetime or time.time) but not the others (raise an exception) is not an acceptable solution.

A datetime module based approach would need to either use a mix of datetime.datetime() (when returning an absolute time) and datetime.timedelta() (when returning a time relative to an unknown starting point), or else just always return datetime.timedelta (even when we know the epoch and could theoretically make the time absolute).

In the former case, it may be appropriate to adopt a boolean flag API design and the "I want high precision time" request marker would just be "datetime=True". You'd then get back either datetime.datetime() or datetime.timedelta() as appropriate for the specific API.

In the latter case, the design would be identical to the current PEP, only with "datetime.timedelta" in place of "decimal.Decimal".
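A rough sketch of the two designs, purely for illustration. The function names (get_time, get_monotonic, get_time_as_delta) are hypothetical and do not come from the PEP or the standard library, and the flag is spelled "as_datetime" here only to avoid shadowing the imported datetime class. The sketch goes through float, so it shows the API shape only, not any precision gain.

```python
import time
from datetime import datetime, timedelta, timezone

# Hypothetical wrappers: none of these signatures exist in the stdlib.

def get_time(as_datetime=False):
    """Wall-clock time: the epoch is known, so an absolute value is possible."""
    seconds = time.time()
    if as_datetime:
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    return seconds

def get_monotonic(as_datetime=False):
    """Monotonic clock: the starting point is undefined, so only a
    relative value (a timedelta) makes sense."""
    seconds = time.monotonic()
    if as_datetime:
        return timedelta(seconds=seconds)
    return seconds

# Former design: the returned type depends on which API is called.
print(type(get_time(as_datetime=True)).__name__)
print(type(get_monotonic(as_datetime=True)).__name__)

# Latter design: always a timedelta, even though the epoch is known.
def get_time_as_delta():
    return timedelta(seconds=time.time())

print(type(get_time_as_delta()).__name__)
```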

The challenge relative to the current PEP is that any APIs that wanted to accept either of these as a timestamp would need to do some specific work to avoid failing with a TypeError.

For timedelta values, we'd have to define a way to easily extract the full precision timestamp as a number (total_seconds() currently returns a float, and hence can't handle nanosecond resolutions), as well as improve interoperability with algorithms that expect a floating point value.

If handed a datetime value, you need to know the correct epoch value, do the subtraction, then extract the full precision timestamp from the resulting timedelta object.
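To make the extra work concrete, here is a minimal sketch of those steps using only existing datetime and decimal operations. The helper names (exact_seconds, datetime_to_timestamp) and the choice of the Unix epoch in UTC are assumptions made for the example, not part of any proposal.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# Assumed epoch for this example: the Unix epoch, in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def exact_seconds(delta):
    """Return a timedelta as an exact Decimal number of seconds."""
    # timedelta // timedelta yields an int, so no float is involved.
    microseconds = delta // timedelta(microseconds=1)
    return Decimal(microseconds).scaleb(-6)

def datetime_to_timestamp(moment, epoch=EPOCH):
    """Subtract the epoch, then extract the full precision number."""
    return exact_seconds(moment - epoch)

moment = datetime(2012, 2, 14, 2, 14, 32, 123456, tzinfo=timezone.utc)
print(datetime_to_timestamp(moment))

# For a large enough delta, total_seconds() cannot hold the final
# microsecond in a float, while the integer-based route keeps it.
big = timedelta(days=200000, microseconds=1)
print(big.total_seconds())
print(exact_seconds(big))
```

Note that the caller has to supply the right epoch; nothing in the datetime value itself says which clock it came from.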

To make a datetime module based counter-proposal acceptable, it would need to be something along the following lines:

It may also take some fancy footwork to avoid a circular dependency between time and datetime while supporting this (Victor allowed this in an earlier version of his patch, but he did it by accepting datetime.datetime and datetime.timedelta directly as arguments to the affected APIs). That's a relatively minor implementation concern, though (at worst it would require factoring out a support module used by both datetime and time).

The bigger issue is that datetime and timedelta pose a huge compatibility problem for existing third party APIs that accept timestamp values: code that expects a number fails with a TypeError when handed either type, rather than merely losing precision.

This is in stark contrast to what happens with decimal.Decimal: coercion to float() or int() will potentially lose precision, but still basically works. While addition and subtraction of floats will fail, addition and subtraction of integers works fine. To avoid losing precision, it's sufficient to just avoid the coercion.
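A short demonstration of that contrast, as I understand current CPython behaviour. The raises_type_error helper and the sample values are only there for the example.

```python
from datetime import timedelta
from decimal import Decimal

stamp = Decimal("1329185672.123456789")
delta = timedelta(seconds=1329185672, microseconds=123456)

# Decimal: coercion may lose precision, but it succeeds.
print(float(stamp))
print(int(stamp))

# Decimal and int arithmetic works and stays exact.
print(stamp + 1)

def raises_type_error(operation):
    """Return True if calling operation() raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False

print(raises_type_error(lambda: stamp + 1.5))   # Decimal + float
print(raises_type_error(lambda: float(delta)))  # float(timedelta)
print(raises_type_error(lambda: delta + 1))     # timedelta + int
```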

I think the outline above really illustrates why the raw data type for timestamps should just be a number, not a higher level semantic type like timedelta or datetime. Eventually, you want to be able to express a timestamp as a number of seconds relative to a particular epoch. To do that, you want a number. Originally we used ints; then, to support microsecond resolution, we used floats. The natural next step, to support arbitrary resolutions, is decimal.Decimal.
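The progression can be illustrated with a single nanosecond-resolution reading. The value below is made up for the example; it is roughly the date of this message, expressed as an integer count of nanoseconds since the Unix epoch.

```python
from decimal import Decimal

# Example value only: nanoseconds since the epoch, as an integer.
nanoseconds = 1329185672123456789

as_int = nanoseconds // 10**9                 # whole seconds only
as_float = nanoseconds / 10**9                # nearest binary double
as_decimal = Decimal(nanoseconds).scaleb(-9)  # exact, all nine digits kept

print(as_int)
print(as_float)
print(as_decimal)

# Decimal(float) converts the double exactly, so this comparison shows
# that the double is not the same number as the original reading.
print(Decimal(as_float) == as_decimal)
```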

Then, the higher level APIs can be defined in terms of that high precision number. Would it be nice if there was a PyPI module that provided APIs that converted the raw timestamps in stat objects and other OS level APIs into datetime() and timedelta() objects as appropriate? Perhaps, although I'm not sure it's necessary. But are those types low-level enough to be suitable for the OS interface definition? I don't think so - we really just want a number to express "seconds since a particular time" that plays fairly nicely with other numbers, not anything fancier than that.

Notice that PEP 410 as it stands can be used to solve the problem of how to extract the full precision timestamp from a timedelta object as a number: timedelta.total_seconds() can be updated to accept a "timestamp" argument, just like the other time related APIs already mentioned in the PEP. Then "delta.total_seconds(timestamp=decimal.Decimal)" will get you a full precision timestamp. If PEP 410 were instead defined in terms of timedelta, it would need to come up with a different solution for this.
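As a sketch of what that could look like, here is a free function standing in for the proposed method. It is hypothetical: timedelta.total_seconds() takes no such argument today, and the set of accepted types (float, int, Decimal) and the error raised for anything else are assumptions made for the example.

```python
from datetime import timedelta
from decimal import Decimal

def total_seconds(delta, timestamp=float):
    """Hypothetical stand-in for timedelta.total_seconds(timestamp=...)."""
    if timestamp is float:
        return delta.total_seconds()  # existing behaviour
    # Exact integer count of microseconds, built from the stored fields.
    microseconds = ((delta.days * 86400 + delta.seconds) * 10**6
                    + delta.microseconds)
    if timestamp is int:
        return microseconds // 10**6  # whole seconds, rounded down
    if timestamp is Decimal:
        return Decimal(microseconds).scaleb(-6)
    raise ValueError("unsupported timestamp type: %r" % (timestamp,))

delta = timedelta(hours=1, microseconds=250)
print(total_seconds(delta))
print(total_seconds(delta, timestamp=int))
print(total_seconds(delta, timestamp=Decimal))
```

The default stays float, so existing callers would see no change; only callers that ask for Decimal pay for the extra precision.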

Also, by using decimal.Decimal, we open up the possibility of, at some point in the future, switching to returning high precision values by default. There are at least two prerequisites for that, though: incorporation of cdecimal into CPython, and implicit promotion of floats to decimal values in binary operations without losing data. (We've already started down that path by accepting floating point values directly in the Decimal constructor.) No such migration path for the default behaviour presents itself for an API based on datetime or timedelta (unless we consider making timedelta behave a lot more like a number than it does now).

Cheers, Nick.

-- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


