[Python-Dev] bytes thoughts
Baptiste Carvello baptiste13 at altern.org
Thu Mar 2 01:59:04 CET 2006
some more thoughts about the bytes object:
- it would be nice to have a trivial way to change a bytes object to an int / long, and vice versa.
Rationale:
while manipulating binary data will happen mostly with bytes objects, some operations are better done with ints, like the bit manipulations with the &|~^ operators. So we should make sure there is no impedance mismatch between those 2 ways of editing binary data. Getting an individual byte at a time is not sufficient, because the part of the data you want to edit might span over a few bytes, or simply fall across a byte boundary.
Toy implementation:
>>> class bytes(list):
...     def from_int(cls, value, length):
...         return cls([(value >> 8*i) % 256 for i in range(length)[::-1]])
...     from_int=classmethod(from_int)
...     def int(self):
...         return sum([256**i*n for i,n in enumerate(self[::-1])])
...

The length argument to from_int is necessary to create a fixed number of bytes, even if those bytes are 0.
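To illustrate the role of the length argument, here is a minimal standalone sketch of the same two conversions. The function names `int_to_byte_list` and `byte_list_to_int` are hypothetical, plain lists of ints stand in for the bytes object, and big-endian order is assumed as in the toy class.

```python
# Illustrative helpers only; the names are not part of any proposal.
def int_to_byte_list(value, length):
    # Most significant byte first; always yields exactly `length` items,
    # so small values are padded with leading zero bytes.
    return [(value >> (8 * i)) & 0xFF for i in reversed(range(length))]

def byte_list_to_int(byte_values):
    # Fold the bytes into one integer, most significant byte first.
    result = 0
    for b in byte_values:
        result = (result << 8) | b
    return result

padded = int_to_byte_list(5, 4)           # four bytes, three of them zero
restored = byte_list_to_int(padded)       # leading zeros do not change the value
truncated = int_to_byte_list(0x1234, 1)   # a too-short length keeps only the low byte
```

Note that, like the toy from_int, this sketch silently drops the high bytes when the length is too small; a real implementation might prefer to raise an error in that case.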
Use case:
let's say you have a binary record made of 7 bits of padding and 3x3 bits of unix permissions (2 bytes in total). You want to change the user permissions, and store the record back to a bytes object:

>>> record=bytes([1,36]) # this could be a slice of a preexisting bytes object
>>> perms=record.int()
>>> print oct(perms)
0444
>>> perms &=~( 7 <<6 ) # clear the bits corresponding to user permissions
>>> perms |= 6 <<6 # set the bits to the new value
>>> print oct(perms)
0644
>>> record=bytes.from_int(perms,2)
- a common case of interactive use is to display a bytes string as a character string in order to spot which parts are text. In this case you ignore non-ASCII characters, and replace everything that cannot be printed with a space (as some hex editors do). So you don't need to care about encodings.
>>> import string
>>> def printable(c):
...     if c not in string.printable: return ' '
...     if c.isspace(): return ' '
...     return c
...
>>> class bytes(list):
...     def printable_ascii(self):
...         return u"".join([printable(chr(i)) for i in self])
...
>>> nb=bytes([48,0,10,12,34,65,66])
>>> print nb.printable_ascii()
0   "AB
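The same display rule can also be sketched as a standalone function that does not depend on the toy class. The name `printable_ascii` is reused here for a hypothetical free function, and the `VISIBLE` set is an assumption of this sketch: printable ASCII with all whitespace removed.

```python
import string

# Characters shown as-is: printable ASCII minus every kind of whitespace.
VISIBLE = set(string.printable) - set(string.whitespace)

def printable_ascii(data):
    # `data` is any iterable of integers in range(256), e.g. a list.
    chars = []
    for value in data:
        c = chr(value)
        # Anything that is not a visible ASCII character becomes a space.
        chars.append(c if c in VISIBLE else ' ')
    return ''.join(chars)

text = printable_ascii([48, 0, 10, 12, 34, 65, 66])
```

Each input byte maps to exactly one output character, so offsets in the displayed string line up with offsets in the data, as in a hex editor's text column.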
by the way, what will chr return in py3k ?
Cheers, BC