Message 333615 - Python tracker

I dislike adding a public API for an optimization. Would it be possible to make it private? Would it make sense? tzidx => _tzidx.

One other thing I might mention here is that I did explore the idea of storing this cache on the tzinfo implementation itself, but it is problematic for a number of reasons:

It would either need to use some sort of expiring cache (lru, ttl) or require a great deal of memory, greatly reducing the utility - the proposed implementation requires no additional memory.

In the test in your PR, the tzinfo allocates memory for its cache:

        offsets = [timedelta(hours=0), timedelta(hours=1)]
        names = ['+00:00', '+01:00']
        dsts = [timedelta(hours=0), timedelta(hours=1)]

This memory isn't free. I don't see how using an index completely prevents the need to allocate memory for a cache.

One way or another, we need a way to clear the cache, and we need to decide on a caching policy. The simplest policy is to have no limit. re.compile() uses a cache of 512 entries. functools.lru_cache uses a default limit of 128 entries.

Instead of adding a new API, would it be possible to reuse functools.lru_cache somehow?
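To make the question concrete, here is a minimal sketch of what functools.lru_cache provides out of the box: a bounded cache, statistics, and a way to clear it. The helper offset_from_minutes is hypothetical and is not part of the PR; it only stands in for any function whose results are worth caching.

```python
from datetime import timedelta
from functools import lru_cache


@lru_cache(maxsize=128)  # 128 is also the default when maxsize is omitted
def offset_from_minutes(minutes):
    """Hypothetical helper: build the timedelta for a fixed UTC offset."""
    return timedelta(minutes=minutes)


first = offset_from_minutes(60)    # miss: the timedelta is created and stored
second = offset_from_minutes(60)   # hit: the stored object is returned

info = offset_from_minutes.cache_info()   # hits, misses, maxsize, currsize

# The cache can be emptied explicitly, which covers the "clear" requirement.
offset_from_minutes.cache_clear()
info_after_clear = offset_from_minutes.cache_info()
```

Note that lru_cache keys on the function arguments, so the arguments must be hashable. That is fine for an integer key like the one above, but it matters if the key is a tz-aware datetime.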

Because the implementation of datetime.__hash__ invokes utcoffset(), it is impossible to implement utcoffset() in terms of a dictionary keyed on tz-aware datetimes. This means that you need to construct a new, naive datetime, which is a fairly slow operation and really puts a damper on the utility of the cache.
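The following sketch illustrates the point about hashing. CountingTZ is a hypothetical tzinfo subclass written only for this illustration: it counts calls to utcoffset() and keeps a dictionary keyed on naive copies of the datetimes it sees. Keying the dictionary on the aware datetime itself would hash it, which would call utcoffset() again and recurse.

```python
from datetime import datetime, timedelta, tzinfo


class CountingTZ(tzinfo):
    """Hypothetical fixed +01:00 zone that counts utcoffset() calls."""

    def __init__(self):
        self.utcoffset_calls = 0
        self._cache = {}

    def utcoffset(self, dt):
        self.utcoffset_calls += 1
        # Build a naive copy to use as the key. Hashing a naive datetime
        # does not call utcoffset(), so this lookup cannot recurse.
        key = dt.replace(tzinfo=None)
        try:
            return self._cache[key]
        except KeyError:
            offset = self._cache[key] = timedelta(hours=1)
            return offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "+01:00"


tz = CountingTZ()
dt = datetime(2018, 11, 13, 12, 0, tzinfo=tz)

calls_before = tz.utcoffset_calls
hash(dt)                           # hashing an aware datetime asks for its offset
calls_after = tz.utcoffset_calls
```

The dt.replace(tzinfo=None) call is the extra object construction the paragraph above refers to: it happens on every lookup, whether or not the result is already cached.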

For special local timezones, would it be possible to explicitly exclude them, and restrict the cache to simple timezones (fixed offset)?