[Python-Dev] Split unicodeobject.c into subfiles

Stephen J. Turnbull stephen at xemacs.org
Fri Oct 26 04:35:38 CEST 2012


Antoine Pitrou writes:

> Well, "tangled monolithic mess" is quite true about unicodeobject.c, IMO.

s/object.c// and your point remains valid. Just reading the table of contents for UTR#17 (http://www.unicode.org/reports/tr17/) should convince you that it's not going to be easy to produce an elegant implementation!

> Seriously, I agree with Victor: navigating around unicodeobject.c is a PITA. Perhaps it isn't if you are using emacs, or you have 35 fingers, or just a lot of spare time, but in my experience it's painful.

Sure, but I don't know of a Unicode implementation which isn't.

I don't think that having a unicode/ directory with a dozen files in it (the *.[ch] files plus the README etc) is going to make the code much more navigable. If there are too many files, it's going to be a PITA to maintain because there won't be an obvious place to put certain functions. E.g., I've already mentioned my suspicions about the charmap code (I apologize for not reading Victor's code to confirm them).

I don't object in principle to splitting unicodeobject.c. At the very least, with all due respect to MAL, XEmacs experience with coding systems (the Emacs equivalent of Python codecs) suggests that there is very little to be lost by moving the codec implementations to a separate file from the Unicode object implementation. (Here I'm talking about codecs in the narrow sense of wire-format to Python 3 str and back, not the more general Python 2 sense that included zip and base64 and so on. I.e., PyUnicode_Translate is not a codec in the relevant sense.)
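To make the narrow/general distinction concrete, here is a rough Python 3 sketch (mine, not taken from Victor's patch). The sample strings and variable names are made up for illustration, and I'm using str.translate only as the Python-level analogue of PyUnicode_Translate.

```python
import codecs

text = "naïve café"  # illustrative sample, not from the thread

# Narrow sense: a codec maps a str to a wire format (bytes) and back.
wire = text.encode("utf-8")
round_tripped = wire.decode("utf-8")

# General (Python 2 style) sense: a bytes-to-bytes transform such as
# base64 is also reachable through the codecs machinery.
b64 = codecs.encode(b"hello", "base64_codec")

# str-to-str character mapping: no wire format is involved, so this
# is not a codec in the narrow sense.
translated = "abc".translate({ord("a"): "x"})
```

Only the first pair crosses the str/bytes boundary, which is the part I'd expect to move into a separate file most easily.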

On the other hand, I wouldn't be surprised if (despite my earlier suggestion) codecs and unicode object internals need a close relationship. (My intuition and sense of style say splitting codecs from the low level memory management and PEP 393 stuff is a good idea, but I'm not confident it would have no impact on performance.)


