[Python-Dev] Fwd: Re: PEP 267 improvement idea (original) (raw)

Oren Tirosh oren-py-d@hishome.net
Tue, 14 Jan 2003 12:20:50 -0500

Previous message: [Python-Dev] import vs. data files (Fwd: ResourcePackage 1.0.0a2 available...)
Next message: [Python-Dev] Re: Re: PEP 267 improvement idea
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

I guess this really belongs on python-dev, not python-list.

----- Forwarded message from Oren Tirosh <oren-py-l@hishome.net> -----

X-Envelope-To: oren-py-l@hishome.net From: Oren Tirosh <oren-py-l@hishome.net> To: Skip Montanaro <skip@pobox.com> Cc: python-list@python.org Subject: Re: PEP 267 improvement idea (fwd) Date: Tue, 14 Jan 2003 12:14:33 -0500

On Tue, Jan 14, 2003 at 09:54:48AM -0600, Skip Montanaro wrote:

Oren> I know Guido likes the idea but the implementation is still far Oren> from complete. Can you elaborate on its incompleteness?

Ummmm... I've experimented with several different strategies and tricks. I'm not so sure which ones are actually implemented in the latest version of the patch on sourceforge.

One of the big improvements was speeding up failed lookups by adding negative entries to "positively" identify keys not in the dictionary (me_value == NULL, me_key != NULL, != dummy). This considerably speeds up the global/builtin lookup chain. Management of these entries is somewhere between limited to nonexistent. New entries are simply not added if there is not enough free space the table, resizing a dictionary with negative entries is probably buggy. Negative entries should also help instance/class/superclass lookup chain but it segfaults if used for anything other than LOAD_NAME and LOAD_GLOBAL so there must be other bugs lurking there.

Inlining works really fast - but only if the entry is found in the first hash probe. I experimented with dynamically shuffling the entries to ensure that the entry accessed most frequently would be first >90% of the time instead of ~%75 of the time. This is not in the patch and I don't have code that is anywhere near usable.

Interning - now that interned strings are not immortal this patch could do be much more aggressive about interning everything. Dictionaries used as namespaces could be guaranteed to have only interned strings as entry keys and avoid the need for a second search pass on first failed lookup (before the negative entry is added). This is important because many small temporary objects have attributes that are accessed only once. Currently this first access is made slower by the patch instead of faster.

The last strategy I was going to try before my free time shrunk considerably (a job found me) was to create a new species of dictionary in addition to "lookdict" and "lookdict_string". Dictionaries will start their life as either "lookdict_string" or "lookdict_namespace", depending on their expected usage. This is just an optimization hint - the semantics are otherwise identical. A "lookdict_namespace" dictionary interns all keys and adds negative entries for failed lookups. This overhead pays when it's used as a namespace. Both "lookdict_namespace" and "lookdict_string" will fall back to a "lookdict" dictionary when a non-string key is inserted.

Optimizing or inlining setitem for the common case where the entry already exists and is only set to a new value will also be nice.

Oren

-- http://mail.python.org/mailman/listinfo/python-list

----- End forwarded message -----

Previous message: [Python-Dev] import vs. data files (Fwd: ResourcePackage 1.0.0a2 available...)
Next message: [Python-Dev] Re: Re: PEP 267 improvement idea
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]