[Python-Dev] PEP: New timestamp formats

Victor Stinner victor.stinner at haypocalc.com
Thu Feb 2 02:03:15 CET 2012


Even if I am not really convinced that a PEP helps to design an API, here is a draft of a PEP to add new timestamp formats to Python 3.3. Don't see the draft as a final proposal; it is just a document meant to help the discussion :-)

PEP: xxx
Title: New timestamp formats
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner at haypocalc.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-February-2012
Python-Version: 3.3

Abstract

Python 3.3 introduced functions supporting nanosecond resolutions. Python 3.3 only supports int or float to store timestamps, but these types cannot be used to store a timestamp with a nanosecond resolution.

Motivation

Python 2.3 introduced float timestamps to support subsecond resolutions, and os.stat() has used float timestamps by default since Python 2.5. Python 3.3 introduced functions supporting nanosecond resolutions:

os.stat()
os.utimensat()
os.futimens()
time.clock_gettime()
time.clock_getres()
time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC))

The problem is that 64-bit floats are unable to store nanoseconds (10^-9) for timestamps bigger than 2^24 seconds (194 days 4 hours: 1970-07-14 for an Epoch timestamp) without losing precision.

.. note:: A 64-bit float starts to lose precision at microsecond (10^-6) resolution for timestamps bigger than 2^33 seconds (272 years: 2242-03-16 for an Epoch timestamp).
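The two thresholds above can be checked by looking at the gap between adjacent 64-bit floats. The following is only a rough sketch: `float_spacing` is a helper name made up for this illustration, and it assumes the platform float is an IEEE 754 double with a 53-bit significand.

```python
import math

def float_spacing(x):
    """Gap between x and the next larger float (hypothetical helper).

    Assumes IEEE 754 doubles with a 53-bit significand and x > 0.
    """
    # frexp() returns (m, e) with x == m * 2**e and 0.5 <= m < 1.
    _, exponent = math.frexp(x)
    return 2.0 ** (exponent - 53)

# At 2**24 seconds the gap is already wider than one nanosecond...
spacing_ns = float_spacing(2.0 ** 24)
# ...and at 2**33 seconds it is wider than one microsecond.
spacing_us = float_spacing(2.0 ** 33)

# Adding one nanosecond to a timestamp of 2**24 seconds is silently dropped.
t = 2.0 ** 24
nanosecond_lost = (t + 1e-9 == t)
```

With these assumptions, `spacing_ns` is 2^-28 (about 3.7e-9 s) and `spacing_us` is 2^-19 (about 1.9e-6 s).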

Timestamp formats

Choose a new format for nanosecond resolution

To support nanosecond resolution, four formats were considered:

128 bits float
decimal.Decimal
datetime.datetime
tuple of integers

Criteria

It should be possible to do arithmetic, for example::

t1 = time.time()
# ...
t2 = time.time()
dt = t2 - t1

Two timestamps should be comparable (t2 > t1).

The format should have a resolution of at least 1 nanosecond (without losing precision). It is better if the format can have an arbitrary resolution.

128 bits float

Add a new IEEE 754-2008 quad-precision float type. The IEEE 754-2008 quad precision float has 1 sign bit, 15 bits of exponent and 112 bits of mantissa.

128 bits float is supported by GCC (4.3), Clang and ICC. The problem is that Visual C++ 2008 doesn't support it. Python must be portable and so cannot rely on a type only available on some platforms. Another example: GCC 4.3 does not support __float128 in 32-bit mode on x86 (but GCC 4.4 does).

Intel CPUs have an FPU supporting 80-bit floats, but not through SSE instructions. Other CPU vendors don't support this float size.

There is also a license issue: GCC uses the MPFR library which is distributed under the GNU LGPL license. This license is incompatible with the Python Software License.

datetime.datetime

datetime.datetime only supports microsecond resolution, but can be enhanced to support nanosecond.

datetime.datetime has issues:

there is no easy way to convert it into "seconds since the epoch"
any broken-down time has issues of time stamp ordering in the duplicate hour of switching from DST to normal time
time zone support is flaky-to-nonexistent in the datetime module

decimal.Decimal

The decimal module is implemented in Python and is not really fast.

Using Decimal by default would cause a bootstrap issue because the module is implemented in Python.

Decimal can store a timestamp with any resolution, not only nanosecond; the resolution is configurable at runtime.

Decimal objects support all arithmetic operations and are compatible with int and float.

The decimal module is slow, but there is a C reimplementation of the decimal module which is almost ready for inclusion.
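As a small illustration of the criteria applied to Decimal, here is a sketch with two made-up timestamps one nanosecond apart. The literal values are arbitrary and only serve the example.

```python
from decimal import Decimal

# Two illustrative Epoch timestamps that differ by exactly one nanosecond.
t1 = Decimal("1328144595.123456789")
t2 = Decimal("1328144595.123456790")

# Arithmetic and comparison both work and keep the nanosecond.
dt = t2 - t1
is_later = t2 > t1

# Converted to 64-bit floats, the two timestamps collapse to the same value.
same_as_float = float(t1) == float(t2)
```

Here `dt` equals `Decimal("1E-9")`, while the float conversion can no longer tell the two timestamps apart.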

tuple

Various kinds of tuples have been proposed. All proposals only use integers:

a) (sec, nsec): C timespec structure, useful for os.futimens() for example
b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
c) (sec, floatpart, divisor): value = sec + floatpart / divisor

Format (a) only supports nanosecond resolution.

Formats (a) and (b) may lose precision if the clock divisor is not a power of 10.

Format (c) should be enough for most cases.

Creating a tuple of integers is fast.

Arithmetic operations cannot be done directly on tuples: t2-t1 doesn't work for example.
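The trade-offs of the tuple formats can be sketched as follows. The helper `tuple_to_fraction` and the clock that ticks in thirds of a second are assumptions made for the illustration, not part of any proposal; `fractions.Fraction` is used here only to get exact results.

```python
from fractions import Fraction

def tuple_to_fraction(ts):
    """Convert a format (c) tuple (sec, floatpart, divisor) to an exact value."""
    sec, floatpart, divisor = ts
    return sec + Fraction(floatpart, divisor)

# Hypothetical clock whose divisor (3) is not a power of 10.
t1 = (1328144595, 1, 3)
t2 = (1328144596, 2, 3)

# Tuples do not support subtraction, so t2 - t1 fails.
try:
    t2 - t1
except TypeError:
    tuple_subtraction_works = False
else:
    tuple_subtraction_works = True

# An explicit conversion is needed before doing arithmetic.
dt = tuple_to_fraction(t2) - tuple_to_fraction(t1)

# Format (a) would store 1/3 second as truncated nanoseconds, which is inexact.
nsec = 1 * 10**9 // 3
format_a_is_exact = Fraction(nsec, 10**9) == Fraction(1, 3)
```

In this sketch `dt` is `Fraction(4, 3)`, and the format (a) value is not exact because 1/3 second is not a whole number of nanoseconds.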

Final formats

The PEP proposes to provide 5 different timestamp formats:

numbers:
- int
- float
- decimal.Decimal
- datetime.timedelta
broken-down time:
- datetime.datetime

API design

Change the default result type

Python 2.3 introduced os.stat_float_times(). The problem is that this flag is global, and so may break libraries if the application changes the type.

Changing the default result type would break backward compatibility.

Callback and creating a new module to convert timestamps

Use a callback taking integers to create a timestamp. Example with float:

def timestamp_to_float(seconds, floatpart, divisor):
    return seconds + floatpart / divisor

The time module can provide some built-in converters, and other modules, like datetime, can provide their own converters. Users can define their own types.
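To make the callback idea more concrete, here is a sketch of a second converter next to the float one. The names `timestamp_to_decimal` and `make_timestamp` are hypothetical; `make_timestamp` merely stands in for a function such as time.time() that would call the converter with the raw integers.

```python
from decimal import Decimal

def timestamp_to_float(seconds, floatpart, divisor):
    # Float converter from the example above.
    return seconds + floatpart / divisor

def timestamp_to_decimal(seconds, floatpart, divisor):
    # Hypothetical converter: builds a Decimal from the same three integers.
    return Decimal(seconds) + Decimal(floatpart) / Decimal(divisor)

def make_timestamp(seconds, floatpart, divisor, converter=timestamp_to_float):
    # Hypothetical stand-in for a timestamp-producing function:
    # it only passes the raw integers on to the chosen callback.
    return converter(seconds, floatpart, divisor)

# The same raw clock reading, seen through two converters.
as_float = make_timestamp(1328144595, 123456789, 10**9)
as_decimal = make_timestamp(1328144595, 123456789, 10**9,
                            converter=timestamp_to_decimal)
```

The callback signature is the part that would be frozen by such a design: every converter has to accept exactly (seconds, floatpart, divisor).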

An alternative is to add a new module for all functions converting timestamps.

The problem is that we have to design the API of the callback and we cannot change it later. Future needs may require more information than the callback receives.

os.stat: add new fields

It was proposed to add 3 fields to the os.stat() structure to get the nanoseconds of timestamps.

Add an argument to change the result type

Add an argument to all functions creating timestamps, like time.time(), to change their result type. It was first proposed to use a string argument, e.g. time.time(format="decimal"). The problem is that the function has to import a module internally. Then it was decided to pass the type directly, e.g. time.time(format=decimal.Decimal). Using a type, the user has first to import the module. There is no direct link between a type and the function used to create the timestamp.

By default, the float type is used to keep backward compatibility. For stat functions like os.stat(), the default type depends on os.stat_float_times().
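The following sketch shows how passing a type could be dispatched internally. It is only an illustration: `convert_timestamp` is a hypothetical helper that takes the raw (sec, nsec) pair explicitly, it is not the proposed implementation, and the set of accepted types is an assumption.

```python
import decimal

def convert_timestamp(sec, nsec, format=float):
    """Hypothetical helper: build a timestamp of the requested type.

    sec and nsec are integers (whole seconds and nanoseconds).
    """
    if format is float:
        # Default type, kept for backward compatibility.
        return sec + nsec * 1e-9
    if format is int:
        # Whole seconds only; the fractional part is dropped.
        return sec
    if format is decimal.Decimal:
        # Exact: the nanoseconds are preserved.
        return decimal.Decimal(sec) + decimal.Decimal(nsec) / 10**9
    # The type itself says nothing about how to build the timestamp,
    # so unknown types have to be rejected explicitly.
    raise ValueError("unsupported timestamp format: %r" % (format,))

default_value = convert_timestamp(1328144595, 123456789)
exact_value = convert_timestamp(1328144595, 123456789,
                                format=decimal.Decimal)
```

The caller has to import decimal before it can request that type, which is the point made above about passing a type instead of a string.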

Add new functions

Add new functions for each type, examples:

time.time_decimal()
os.stat_decimal()
os.stat_datetime()
etc.

Changes

Add an optional format argument to time.clock(), time.clock_gettime(), time.clock_getres(), time.time() and time.wallclock().
Add an optional timestamp argument to os.fstat(), os.fstatat(), os.lstat() and os.stat().

Functions accepting a timestamp as input should support decimal.Decimal objects without an internal conversion to float, which may lose precision:

datetime.datetime.fromtimestamp()
time.localtime()
time.gmtime()

TODO:

Change os.utimensat() and os.futimens() to accept Decimal
Change os.utimensat() and os.futimens() to not accept tuple anymore
Drop os.utimensat() and os.futimens() and patch os.utimeat() instead?
datetime should maybe support nanosecond?

Backwards Compatibility

Changes only add a new optional argument. The default type is unchanged and there is no impact on performance.

Copyright

This document has been placed in the public domain.
