[Python-Dev] Adding a tzidx cache to datetime

Paul Ganssle paul at ganssle.io
Mon May 13 20:01:39 EDT 2019


From Marc-Andre Lemburg, I understand that Paul's PR is a good compromise, and that other datetime implementations which cannot use the tzidx() cache (because it's limited to an integer in [0; 254]) can subclass datetime or use a cache outside datetime.

One idea that we can put out there (though I'm hesitant to suggest it, because generally Python avoids this sort of language lawyering anyway) is that I think it's actually fine for the situations under which tzidx() will cache a value to be implementation-dependent, and to document that in CPython it's only integers in [0; 254].

The reason to mention this is that I suspect that PyPy, which has a pure-Python implementation of datetime, will likely either forgo the cache entirely and always fall through to the underlying function call, or cache /any/ Python object returned, since with a pure-Python implementation they do not have the advantage of storing the tzidx cache in an unused padding byte.

Other than the speed concerns, because of the fallback nature of datetime.tzidx, whether or not the cache is hit will not be visible to the end user, so I think it's fair to allow interpreter implementations to choose when a value is or is not cached according to what works best for their users.
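To make the "fallback" behaviour concrete, here is a minimal pure-Python sketch. It is not the code from the PR: the tzidx() method on the datetime subclass, the tzidx() hook on the tzinfo, and the names CachingDatetime and CountingZone are all assumptions made for illustration. It only shows that caching integers in [0; 254] and falling through for everything else returns the same value either way.

```python
from datetime import datetime, timedelta, tzinfo

_MAX_CACHED = 254  # limit discussed in the thread: integers in [0; 254]


class CountingZone(tzinfo):
    """Toy UTC zone; its tzidx() hook is hypothetical and counts its calls."""

    def __init__(self, idx):
        self.idx = idx
        self.calls = 0

    def utcoffset(self, dt):
        return timedelta(0)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "UTC"

    def tzidx(self, dt):  # hypothetical hook, not part of the stdlib tzinfo API
        self.calls += 1
        return self.idx


class CachingDatetime(datetime):
    """datetime subclass emulating the proposed cache (illustration only)."""

    def tzidx(self):  # hypothetical method modelled on the proposal
        try:
            return self._tzidx_cache  # cache hit
        except AttributeError:
            pass
        value = self.tzinfo.tzidx(self)  # fall through to the tzinfo
        # Cache only small non-negative integers, mirroring the CPython limit.
        if type(value) is int and 0 <= value <= _MAX_CACHED:
            self._tzidx_cache = value
        return value
```

With a zone returning 3, the second call is served from the cache; with a zone returning 1000, every call falls through. The caller sees the same return value in both cases, and only the number of calls into the tzinfo differs.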

On 5/13/19 7:52 PM, Victor Stinner wrote:

On Fri, May 10, 2019 at 09:22, M.-A. Lemburg <mal at egenix.com> wrote:

Given that many datetime objects in practice don't use timezones (e.g. in large data stores you typically use UTC and naive datetime objects), I think that making the object itself larger to accommodate a cache, which will only be used in a smaller percentage of the use cases, isn't warranted. Going from 64 bytes to 72 bytes also sounds like this could have negative effects on cache lines.

If you need a per object cache, you can either use weakref objects or maintain a separate dictionary in dateutil or other timezone helpers which indexes objects by id(obj).

That said, if you only add a byte field which doesn't make the object larger in practice (you merely use space that alignments would use anyway), this shouldn't be a problem. The use of that field should be documented, though, so that other implementations can use/provide it as well.

From Marc-Andre Lemburg, I understand that Paul's PR is a good compromise and that other datetime implementations which cannot use tzidx() cache (because it's limited to an integer in [0; 254]) can subclass datetime or use a cache outside datetime.

Note: right now, creating a weakref to a datetime fails.

Victor
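The "cache outside datetime" option mentioned in the thread could look roughly like the sketch below. The names _tzidx_cache, cached_tzidx and supports_weakref are hypothetical helpers, not dateutil or stdlib APIs, and the compute callback stands in for whatever the timezone library would actually calculate.

```python
import weakref
from datetime import datetime, timezone

# Hypothetical external cache: maps id(dt) -> cached value of any type.
_tzidx_cache = {}


def cached_tzidx(dt, compute):
    """Return compute(dt), remembering the result under id(dt)."""
    key = id(dt)
    try:
        return _tzidx_cache[key]
    except KeyError:
        value = _tzidx_cache[key] = compute(dt)
        return value


def supports_weakref(obj):
    """Report whether a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


dt = datetime(2019, 5, 13, tzinfo=timezone.utc)
print(cached_tzidx(dt, lambda d: 1000))  # computed, then stored
print(cached_tzidx(dt, lambda d: 1000))  # served from the dictionary
print(supports_weakref(dt))
```

An id()-keyed dictionary has no size limit on the cached value, but entries are never evicted and an id can be reused once the original object is freed, so a real implementation would need some way to invalidate stale entries. Weak references would normally provide that, which is why the note about weakrefs to plain datetime objects failing matters here.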


