[Python-Dev] Re: python/dist/src/Objectsunicodeobject.c, 2.204, 2.205

Tim Peters tim.one at comcast.net
Fri Dec 19 10:40:13 EST 2003


[Hye-Shik Chang]

BTW, do we really support architectures with 9bits-sized char?

I don't think so. There are assumptions that a char is 8 bits scattered throughout Python's code, not so much in the context of using characters as characters, but more indirectly by assuming that the number of bits in an object of a non-char type T can be computed as sizeof(T)*8.
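To make the sizeof(T)*8 assumption concrete, here is a minimal sketch contrasting it with the portable spelling that uses CHAR_BIT from <limits.h>. The macro and function names are made up for illustration and do not come from Python's source.

```c
#include <limits.h>   /* CHAR_BIT: number of bits in a char on this platform */
#include <stddef.h>   /* size_t */

/* Portable: sizeof counts chars, so scale by the real char width.
 * This is the number of storage bits, which may include padding bits. */
#define BITS_IN(T) (sizeof(T) * CHAR_BIT)

/* The fragile idiom: silently hard-codes 8 bits per char. */
#define BITS_IN_ASSUMING_8(T) (sizeof(T) * 8)

/* Returns 1 when both spellings give the same answer for long,
 * which is the case exactly when CHAR_BIT is 8. */
int idioms_agree(void)
{
    size_t portable = BITS_IN(long);
    size_t assumed = BITS_IN_ASSUMING_8(long);
    return portable == assumed;
}
```

On every platform with 8-bit char the two macros are interchangeable, which is why the hard-coded form goes unnoticed.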

Skip's idea of making config smarter about this is a good one, but instead of trying to "fix stuff" for a case that's probably never going to arise, and that can't really be tested anyway until it does, I'd add a block like this everywhere we know we're relying on 8-bit char:

    #ifdef HAS_FUNNY_SIZE_CHAR
    #error "The following code needs rework when a char isn't 8 bits"
    #endif
    /* A comment explaining why the following code needs rework */
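One way such a guard could be wired up, as a sketch only: HAS_FUNNY_SIZE_CHAR is the name proposed above, but deriving it from CHAR_BIT in <limits.h> is an assumption here, not something config actually does, and bits_in_unsigned_long is a hypothetical stand-in for code that relies on 8-bit char.

```c
#include <limits.h>   /* CHAR_BIT */

/* Assumed detection step: flag any platform whose char is not 8 bits. */
#if CHAR_BIT != 8
#define HAS_FUNNY_SIZE_CHAR 1
#endif

#ifdef HAS_FUNNY_SIZE_CHAR
#error "The following code needs rework when a char isn't 8 bits"
#endif
/* The function below multiplies sizeof by a literal 8, so it is only
 * correct when a char is exactly 8 bits wide. */
unsigned int bits_in_unsigned_long(void)
{
    return (unsigned int)(sizeof(unsigned long) * 8);
}
```

On an ordinary 8-bit-char platform this compiles silently. On anything else the build stops at the #error, right next to the code that needs attention.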

Crays are a red herring here. It's true that some Cray hardware can't address anything smaller than 64 bits, and that's also true of some other architectures. char is nevertheless 8 bits on all such 64-bit boxes I know of (and since I worked in a 64-bit world for 15 years, I know about most of them). On Crays, this is achieved (albeit at major expense) in software: by software convention, a pointer-to-char stores the byte offset in the most-significant 3 bits of a pointer, and long-winded generated code picks the pointer apart at runtime, loading or storing 8 bytes at a time (the HW can't do less than that), shifting and masking and or'ing to give the illusion of byte addressing for char. Some Alphas do something similar, but that HW's loads and stores simply ignore the last 3 bits of a memory address, and the CPU has special-purpose instructions to help generated code do the subsequent extraction and insertion of 8-bit chunks efficiently and succinctly.
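The load, shift, mask and or sequence described above can be simulated in plain C. This is only an illustrative sketch, not actual Cray or Alpha code generation: the pointer layout (byte offset in the top 3 bits, word index in the rest), the big-endian byte order within a word, and all names are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed layout: top 3 bits hold the byte offset within a 64-bit word,
 * the remaining 61 bits hold the word address (here, an array index). */
#define OFFSET_SHIFT 61
#define WORD_MASK ((UINT64_C(1) << OFFSET_SHIFT) - 1)

typedef uint64_t char_ptr;   /* a software "pointer to char" */

char_ptr make_char_ptr(uint64_t word_index, unsigned byte_offset)
{
    return ((uint64_t)(byte_offset & 7u) << OFFSET_SHIFT)
           | (word_index & WORD_MASK);
}

/* Bit position of the addressed byte inside its word.
 * Assumes byte 0 is the most significant byte (big-endian). */
static unsigned shift_for(char_ptr p)
{
    unsigned byte_offset = (unsigned)(p >> OFFSET_SHIFT);
    return (7u - byte_offset) * 8u;
}

unsigned char load_char(const uint64_t *memory, char_ptr p)
{
    uint64_t word = memory[p & WORD_MASK];            /* full-word load */
    return (unsigned char)((word >> shift_for(p)) & 0xFFu);
}

void store_char(uint64_t *memory, char_ptr p, unsigned char value)
{
    unsigned shift = shift_for(p);
    uint64_t word = memory[p & WORD_MASK];            /* full-word load */
    word &= ~(UINT64_C(0xFF) << shift);               /* mask out the old byte */
    word |= (uint64_t)(value & 0xFFu) << shift;       /* or in the new byte */
    memory[p & WORD_MASK] = word;                     /* full-word store */
}
```

Every char store becomes a read-modify-write of a whole word, which gives a feel for why the technique is expensive when done entirely in software.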


