[Python-Dev] VM imaging based launch optimizations for CPython?
"Martin v. Löwis" martin at v.loewis.de
Sat Dec 20 22:55:30 CET 2008
> Any opinions?
I would use a different marshal implementation. Instead of defining a stream format for marshal, make marshal dump its graph of objects along with the actual memory layout. On load, copying can be avoided; just a few pointers need to be updated. The resulting marshal files would be platform-specific (wrt. endianness and pointer width).
On marshaling, you copy all objects into a contiguous block of memory (aligned to 8 bytes), and dump that. On unmarshaling, you just map that block. If the target supports true memory mapping with page boundaries, you might be able to store multiple .pyc files in a single page. This reformatting could also be done offline.
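As a rough illustration of the dump-then-map idea, here is a minimal Python sketch. It does not lay out real CPython objects; it only packs opaque byte payloads at 8-byte boundaries and maps the file back without copying. The names dump_block, map_block and read_payload, and the length-prefixed record layout, are assumptions made for this example.

```python
import mmap
import os
import struct
import tempfile

ALIGN = 8  # alignment suggested in the message


def align_up(n, align=ALIGN):
    """Round n up to the next multiple of align (a power of two)."""
    return (n + align - 1) & ~(align - 1)


def dump_block(payloads, path):
    """Write each payload at an 8-byte boundary; return the start offsets.

    Hypothetical layout: an 8-byte little-endian length, then the bytes.
    """
    offsets = []
    with open(path, "wb") as f:
        for payload in payloads:
            pos = f.tell()
            f.write(b"\0" * (align_up(pos) - pos))  # padding up to the boundary
            offsets.append(f.tell())
            f.write(struct.pack("<Q", len(payload)))
            f.write(payload)
    return offsets


def map_block(path):
    """Map the whole file read-only; nothing is copied at this point."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def read_payload(mapped, offset):
    """Return a zero-copy view of the payload stored at offset."""
    (length,) = struct.unpack_from("<Q", mapped, offset)
    return memoryview(mapped)[offset + 8:offset + 8 + length]
```

A caller would keep the mapping alive for as long as any view into it is in use, which mirrors the requirement that the mapped objects never be freed individually.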
A few things need to be considered:
- compatibility. The original marshal code would probably need to be preserved for the "marshal" module.
- relative pointers. Code objects, tuples, etc. contain pointers. Assuming the marshaled object cannot be loaded back into the same address, you need to adjust pointers. A common trick is to put a desired load address into the memory block, then try to load into that address. If the address is already taken, load into a different address, and walk through all objects, adjusting pointers.
- type references. On loading, you will need to patch all ob_type fields. Put the marshal codes into the ob_type field on marshalling, then switch on unmarshalling.
- references to interned strings. On loading, you can either intern them all, or you have a "fast interning" algorithm that assigns a fixed table of interned-string numbers.
- reference counting. Make sure all these objects start out with a reference count of 1, so they will never become garbage.
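The pointer adjustment and the ob_type patching from the list above can be sketched in Python on a simulated memory block. This is only a model under stated assumptions: the two-field record layout, the header holding the preferred load address, and the names build_block, relocate and resolve_type are all hypothetical, and the two type codes shown are meant only as illustrative values in the style of the marshal stream format.

```python
import struct

HEADER = struct.Struct("<Q")    # hypothetical header: preferred load address
RECORD = struct.Struct("<QQ")   # hypothetical record: type code, pointer (0 = NULL)

# Illustrative type codes standing in for the ob_type field until load time.
TYPE_TABLE = {ord("("): tuple, ord("s"): bytes}


def build_block(preferred_base, links, type_codes):
    """Build a block whose pointers assume it is loaded at preferred_base.

    links[i] is the index of the record that record i points to, or None.
    """
    block = bytearray(HEADER.size + RECORD.size * len(links))
    HEADER.pack_into(block, 0, preferred_base)
    for i, (link, code) in enumerate(zip(links, type_codes)):
        ptr = 0 if link is None else preferred_base + HEADER.size + RECORD.size * link
        RECORD.pack_into(block, HEADER.size + RECORD.size * i, code, ptr)
    return block


def relocate(block, actual_base):
    """Adjust every non-NULL pointer if the block did not load at its
    preferred address. Returns the number of pointers patched."""
    (preferred,) = HEADER.unpack_from(block, 0)
    delta = actual_base - preferred
    if delta == 0:
        return 0                # loaded at the desired address: nothing to do
    patched = 0
    for off in range(HEADER.size, len(block), RECORD.size):
        code, ptr = RECORD.unpack_from(block, off)
        if ptr:
            RECORD.pack_into(block, off, code, ptr + delta)
            patched += 1
    HEADER.pack_into(block, 0, actual_base)
    return patched


def resolve_type(block, index):
    """Switch on the stored type code, as the load step would for ob_type."""
    code, _ = RECORD.unpack_from(block, HEADER.size + RECORD.size * index)
    return TYPE_TABLE[code]
```

In this model the fast path (loading at the preferred address) touches no records at all, which is the point of storing the desired address in the block.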
If you use a container file for multiple .pyc files, you can have additional savings by sharing strings across modules; this should help in particular for references to builtin symbols, and for common method names. A fixed interning might become unnecessary, as the unique single string object in the container will either become the interned string itself, or point to it after being interned once. With such a container system, unmarshalling should be lazy; e.g. for each object, the value of ob_type can be used to determine whether the object was unmarshalled.
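A small Python sketch of the shared string table, assuming a hypothetical Container class: each distinct string is stored once for all modules and is interned only when a module that uses it is first loaded. Note that the sketch tracks "already unmarshalled" with a dictionary; the message proposes inspecting ob_type instead, which is not expressible at the Python level.

```python
import sys


class Container:
    """Hypothetical container holding the names used by several modules."""

    def __init__(self):
        self.strings = []      # one entry per distinct string, shared by all modules
        self._index = {}       # string -> position in self.strings
        self.modules = {}      # module name -> list of string-table indices
        self._interned = {}    # index -> interned str, filled lazily

    def add_module(self, name, names):
        """Record a module's names, reusing table entries that already exist."""
        refs = []
        for s in names:
            if s not in self._index:
                self._index[s] = len(self.strings)
                self.strings.append(s)
            refs.append(self._index[s])
        self.modules[name] = refs

    def load_names(self, name):
        """Materialize a module's names, interning each string at most once."""
        out = []
        for i in self.modules[name]:
            if i not in self._interned:
                self._interned[i] = sys.intern(self.strings[i])
            out.append(self._interned[i])
        return out
```

With two modules that both refer to "len" and "append", the table holds each of those strings once, and the second module's load reuses the objects interned by the first.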
Of course, you still have the actual interpretation of the top-level module code - if it's not the marshalling but this part that actually costs performance, this efficient marshalling algorithm won't help. It would be interesting to find out which modules have a particularly high startup cost - perhaps they can be rewritten.
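One simple way to look for expensive modules is to time each import. The sketch below uses a hypothetical helper, time_import; the figure it reports covers both unmarshalling and running the top-level module code, including any nested imports, so it cannot separate the two costs on its own.

```python
import importlib
import sys
import time


def time_import(name):
    """Return the seconds spent importing name, or None if it was already loaded."""
    if name in sys.modules:
        return None             # a cached import would measure nothing useful
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start
```

Later CPython releases (3.7 and newer) also provide the -X importtime command-line option, which prints a per-module breakdown of import times.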
Regards, Martin