[Python-Dev] Speeding up CPython 5-10% (original) (raw)

Damien George damien.p.george at gmail.com
Wed Jan 27 15:10:03 EST 2016


Hi Yuri,

I think these are great ideas to speed up CPython. They are probably the simplest yet most effective ways to get performance improvements in the VM.

MicroPython has had LOAD_METHOD/CALL_METHOD from the start (inspired by PyPy, and the main reason to have it is because you don't need to allocate on the heap when doing a simple method call). The specific opcodes are:

LOAD_METHOD # same behaviour as you propose CALL_METHOD # for calls with positional and/or keyword args CALL_METHOD_VAR_KW # for calls with one or both of /*

We also have LOAD_ATTR, CALL_FUNCTION and CALL_FUNCTION_VAR_KW for non-method calls.

MicroPython also has dictionary lookup caching, but it's a bit different to your proposal. We do something much simpler: each opcode that has a cache ability (eg LOAD_GLOBAL, STORE_GLOBAL, LOAD_ATTR, etc) includes a single byte in the opcode which is an offset-guess into the dictionary to find the desired element. Eg for LOAD_GLOBAL we have (pseudo code):

CASE(LOAD_GLOBAL): key = DECODE_KEY; offset_guess = DECODE_BYTE; if (global_dict[offset_guess].key == key) { // found the element straight away } else { // not found, do a full lookup and save the offset offset_guess = dict_lookup(global_dict, key); UPDATE_BYTECODE(offset_guess); } PUSH(global_dict[offset_guess].elem);

We have found that such caching gives a massive performance increase, on the order of 20%. The issue (for us) is that it increases bytecode size by a considerable amount, requires writeable bytecode, and can be non-deterministic in terms of lookup time. Those things are important in the embedded world, but not so much on the desktop.

Good luck with it!

Regards, Damien.



More information about the Python-Dev mailing list