[Python-Dev] What type of object mmap.read_byte should return on py3k?

Nick Coghlan ncoghlan at gmail.com
Sat Feb 28 13:01:09 CET 2009


Hirokazu Yamamoto wrote:

Hello. I noticed mmap.read_byte returns a 1-length unicode string on py3k. I felt this was strange, so I created an issue on the bug tracker (http://bugs.python.org/issue5391), and Martin suggested it be discussed on python-dev. I'll quote the messages from the bug tracker here.

I wrote: On Python3000, mmap.read_byte returns str not bytes, and mmap.write_byte accepts str. Is this intended behavior?

    >>> import mmap
    >>> m = mmap.mmap(-1, 10)
    >>> type(m.read_byte())
    <class 'str'>
    >>> m.write_byte("a")
    >>> m.write_byte(b"a")

Maybe another possibility: read_byte() returns an int which represents the byte, and write_byte() accepts an int which represents the byte. (Like b"abc"[0], which returns an int, not a 1-length bytes object.)

Martin wrote: Indeed, I think it should use the "b" code, instead of the "c" code. Please discuss this on python-dev, though. It might not be ok to backport this to 3.0, since it may break existing code. Furthermore, all other uses of the "c" code might need to be reconsidered.

It certainly seems like mmap should be playing in an all-bytes world (where only already encoded strings are allowed). On the specific question of whether it would be better for read_byte()/write_byte() to use 1-length bytes objects or integers, I have no strong opinion (the former is closer to the 2.x class API, the latter more consistent with the operation of the 3.x bytes class).
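To make the two candidate conventions concrete, here is a small sketch that uses a plain bytes object rather than mmap itself, since the point is only how the 3.x bytes class treats a single byte. The variable names are illustrative and not part of any proposed API.

```python
data = b"abc"

# Indexing a 3.x bytes object yields an int: the "integer" convention.
as_int = data[0]

# Slicing yields a 1-length bytes object: the convention closer to 2.x.
as_bytes = data[0:1]

# Converting between the two conventions is cheap in either direction.
int_to_bytes = bytes([as_int])
bytes_to_int = as_bytes[0]

print(type(as_int).__name__, as_int)
print(type(as_bytes).__name__, as_bytes)
```

Either convention can be recovered from the other in one step, so the choice is mostly about which one reads more naturally next to the rest of the bytes API.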

However, as Martin says, it wouldn't be reasonable to backport fixes for this to 3.0 - the associated API changes would almost certainly break otherwise working code.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


