


On Wed, 20 Jan 2016 at 10:46 Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Brett,

On 2016-01-20 1:22 PM, Brett Cannon wrote:
>
>
> On Wed, 20 Jan 2016 at 10:11 Yury Selivanov <yselivanov.ml@gmail.com>
> wrote:
>
> On 2016-01-18 5:43 PM, Victor Stinner wrote:
> > Is someone opposed to this PEP 509?
> >
> > The main complaint was the change on the public Python API, but
> > the PEP doesn't change the Python API anymore.
> >
> > I'm not aware of any remaining issue on this PEP.
>
> Victor,
>
> I've been experimenting with the PEP to implement a per-opcode
> cache in the ceval loop (I'll share my progress on that in a few
> days). This allows us to significantly speed up the LOAD_GLOBAL and
> LOAD_METHOD opcodes, to the point where they don't require
> any dict lookups at all. Some macro-benchmarks (such as
> chameleon_v2) demonstrate an impressive ~10% performance boost.
>
>
> Ooh, now my brain is trying to figure out the design of the cache. :)

Yeah, it's tricky. I'll need some time to draft a comprehensible
overview. And I want to implement a couple more optimizations and
benchmark it better.

BTW, I've some updates (html5lib benchmark for py3, new benchmarks
for calling C methods, and I want to port some PyPy benchmarks)
to the benchmarks suite. Should I just commit them, or should I
use bugs.python.org?

I actually emailed speed@ to see if people were interested in finally sitting down with all the various VM implementations at PyCon and trying to come up with a reasonable base set of benchmarks that better reflect modern Python usage, but I never heard back.

Anyway, issues on bugs.python.org are probably best to talk about new benchmarks before adding them (fixes and updates to pre-existing benchmarks can just go in).

>
> I rely on your dict->ma_version to implement cache invalidation.
>
> However, besides guarding against version change, I also want
> to guard against the dict being swapped for another dict, to
> avoid situations like this:
>
>
> def foo():
>     print(bar)
>
> exec(foo.__code__, {'bar': 1}, {})
> exec(foo.__code__, {'bar': 2}, {})
>
>
> What I propose is to add a pointer "ma_extra" (same 64 bits),
> which will be set to NULL for most dict instances (instead of
> ma_version). "ma_extra" can then point to a struct that has a
> globally unique dict ID (uint64) and a version tag (uint64).
> Macros like PyDict_GET_ID and PyDict_GET_VERSION could then
> efficiently fetch the version/unique ID of the dict for guards.
>
> "ma_extra" would also make it easier for us to extend dicts
> in the future.
>
>
> Why can't you simply use the id of the dict object as the globally
> unique dict ID? It's already globally unique amongst all Python
> objects, which makes it inherently unique amongst dicts.

We have a freelist for dicts -- so if the dict dies, there
could be a new dict in its place, with the same ma_version.

Ah, I figured it would be too simple to use something we already had.

While the probability of such hiccups is low, we still have
to account for it.

Yep.
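
To make the concern concrete, here is a small self-contained sketch
(hypothetical names, not the real ma_extra patch) of how pairing a
never-reused unique ID with the version tag defeats both the
exec-with-new-globals case and the freelist reuse case:

#include <stdint.h>
#include <stdio.h>

/* Toy model of the proposed "ma_extra" record: a globally unique
   dict ID plus a version tag.  Hypothetical names; not the real
   CPython layout. */
typedef struct {
    uint64_t unique_id;   /* never reused, even if the dict's memory is */
    uint64_t version;     /* bumped on every mutation */
} dict_extra;

typedef struct {
    dict_extra extra;
    long bar;             /* stand-in for the dict's contents */
} toy_dict;

static uint64_t next_unique_id = 1;

static void toy_dict_init(toy_dict *d, long bar) {
    d->extra.unique_id = next_unique_id++;   /* fresh ID per dict */
    d->extra.version = 1;
    d->bar = bar;
}

/* Guard: the cache is valid only if BOTH the unique ID and the
   version match what was recorded when the cache was filled. */
typedef struct {
    uint64_t id;
    uint64_t version;
    long cached_bar;
} guard;

static int guard_ok(const guard *g, const toy_dict *d) {
    return g->id == d->extra.unique_id && g->version == d->extra.version;
}

int main(void) {
    toy_dict globals1, globals2;
    toy_dict_init(&globals1, 1);

    guard g;
    g.id = globals1.extra.unique_id;
    g.version = globals1.extra.version;
    g.cached_bar = globals1.bar;
    printf("guard ok on same dict: %d\n", guard_ok(&g, &globals1));   /* 1 */

    /* A different dict -- e.g. foo.__code__ exec'd with new globals,
       or a new dict placed at the old address by the freelist -- has
       a different unique ID even if its version happens to match. */
    toy_dict_init(&globals2, 2);
    globals2.extra.version = globals1.extra.version;   /* force a version collision */
    printf("guard ok on swapped dict: %d\n", guard_ok(&g, &globals2)); /* 0 */
    return 0;
}

The guard stays a couple of integer comparisons, so the fast path
cost is essentially unchanged.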