[Python-Dev] Split unicodeobject.c into subfiles

Maciej Fijalkowski fijall at gmail.com
Thu Oct 25 11:18:53 CEST 2012


On Thu, Oct 25, 2012 at 8:57 AM, M.-A. Lemburg <mal at egenix.com> wrote:

On 25.10.2012 08:42, Nick Coghlan wrote:

Why are any of these codecs here in unicodeobjectland in the first place? Sure, they're needed so that Python can find its own stuff, but in principle any codec could be needed. Is it just an heuristic that the codecs needed for 99% of the world are here, and other codecs live in separate modules?

I believe it's a combination of history and whether or not they're needed by the interpreter during the bootstrapping process before the encodings namespace is importable.

They are in unicodeobject.c so that the compilers can inline the code in the various other places where they are used in the Unicode implementation directly as necessary and because the codecs use a lot of functions from the Unicode API (obviously), so the other direction of inlining (Unicode API in codecs) is needed as well.

I'm sorry to interrupt, but have you actually measured? What effect the lack of said inlining has on any benchmark is definitely beyond my ability to guess, and I suspect beyond that of anyone else on this list.

I challenge you to find a benchmark that is significantly affected (>15%) by the split proposed by Victor. It does not even have to be a real-world one, although that would definitely buy it more credibility.
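A measurement along these lines could be as simple as a codec micro-benchmark run once on a build with the single unicodeobject.c and once on a build with the split. The following is only a minimal sketch under stated assumptions: the sample text, the codec list, and the helper names `bench_codec` and `relative_change` are hypothetical and not taken from any existing benchmark suite.

```python
import timeit

# Illustrative assumptions: neither the sample text nor the codec list
# comes from the thread; pick whatever workload you actually care about.
SAMPLE = "hello w\u00f6rld \u20ac " * 1000
CODECS = ["utf-8", "latin-1", "utf-16"]


def bench_codec(codec, text=SAMPLE, number=200, repeat=5):
    """Return the best observed seconds per encode+decode round trip."""
    # errors="replace" lets latin-1 cope with characters it cannot encode.
    data = text.encode(codec, errors="replace")

    def round_trip():
        text.encode(codec, errors="replace")
        data.decode(codec)

    timings = timeit.repeat(round_trip, number=number, repeat=repeat)
    # The minimum is the least noisy estimate of the achievable time.
    return min(timings) / number


def relative_change(before, after):
    """Fractional slowdown of `after` relative to `before` (0.15 means 15%)."""
    return (after - before) / before


if __name__ == "__main__":
    for codec in CODECS:
        per_call = bench_codec(codec)
        print(f"{codec}: {per_call * 1e6:.2f} us per round trip")
```

Running the script under both builds and feeding the two timings for each codec into `relative_change` gives a number to compare against the 15% threshold.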

Cheers, fijal


