[Python-Dev] bytes.from_hex()
Josiah Carlson jcarlson at uci.edu
Sat Feb 18 10:16:07 CET 2006
Ron Adam <rrr at ronadam.com> wrote:
Josiah Carlson wrote:
> Bengt Richter had a good idea with bytes.recode() for strictly bytes
> transformations (and the equivalent for text), though it is ambiguous as
> to the direction; are we encoding or decoding with bytes.recode()? In
> my opinion, this is why .encode() and .decode() makes sense to keep on
> both bytes and text, the direction is unambiguous, and if one has even a
> remote idea of what the heck the codec is, they know their result.
>
> - Josiah
I like the bytes.recode() idea a lot. +1 It seems to me it's a far more useful idea than encoding and decoding by overloading and could do both and more. It has a lot of potential to be an intermediate step for encoding as well as being used for many other translations to byte data.
Indeed it does.
I think I would prefer that encode and decode be just functions with well defined names and arguments instead of being methods or arguments to string and Unicode types.
Attaching it to string and unicode objects is a useful convenience, just as x.replace(y, z) is a convenience for string.replace(x, y, z). Tossing encode/decode somewhere else, like encodings or even string, I see as a backwards step.
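A minimal sketch of the convenience being described, written for Python 3 (an assumption; this thread predates it, and the old string.replace() module function no longer exists there). The variable names are illustrative only. It shows that the method form and the free-function form produce the same result:

```python
import codecs

text = "spam and eggs"

# Method form versus the same call made through the type.
via_method = text.replace("spam", "ham")
via_type = str.replace(text, "spam", "ham")

# The same pairing exists for encoding: a method on the object,
# and a free function that lives in the codecs module.
encoded_method = text.encode("utf-8")
encoded_function = codecs.encode(text, "utf-8")
```

Both pairs compare equal; the method spelling just keeps the operation next to the object it acts on.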
I'm not sure exactly how this would work. Maybe it would need two sets of encodings, i.e., decoders and encoders. An exception would be raised if the encoding wasn't found for the direction one was going in.
Roughly... something or other like:

    import encodings

    # Proposed as encodings.tostr(); encoders and bytes.recode() are hypothetical.
    def tostr(obj, encoding):
        if encoding not in encoders:
            raise LookupError('encoding not found in encoders')
        # check if obj works with encoding to string
        # ...
        b = bytes(obj).recode(encoding)
        return str(b)

    # Proposed as encodings.tounicode(); decoders is likewise hypothetical.
    def tounicode(obj, decoding):
        if decoding not in decoders:
            raise LookupError('decoding not found in decoders')
        # check if obj works with decoding to unicode
        # ...
        b = bytes(obj).recode(decoding)
        return unicode(b)

Anyway... food for thought.
Again, the problem is ambiguity; what does bytes.recode(something) mean? Are we encoding to something, or are we decoding from something? Are we going to need to embed the direction in the encoding/decoding name (to_base64, from_base64, etc.)? That doesn't seem any better than binascii.b2a_base64. What about .reencode and .redecode? It seems as though the 're' added as a prefix to .encode and .decode makes it clearer that you get the same type back as you put in, and it is also unambiguous as to direction.
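To make the direction question concrete, here is a small sketch in Python 3 (an assumption; the bytes-to-bytes "base64" codec shown is how later Python versions ended up exposing it, not something proposed in this thread). binascii puts the direction in the function name, while the codecs functions put it in the verb and leave the codec name alone:

```python
import binascii
import codecs

raw = b"hello"

# binascii: direction is embedded in the name (b2a = binary to ASCII).
encoded = binascii.b2a_base64(raw)
decoded = binascii.a2b_base64(encoded)

# codecs: the verb carries the direction, the codec name does not change.
encoded2 = codecs.encode(raw, "base64")
decoded2 = codecs.decode(encoded2, "base64")
```

In both spellings the round trip returns the original bytes, and the input and output types are the same, which is the property the 're' prefix was meant to signal.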
The question remains: is str.decode() returning a string or unicode depending on the argument passed, when the argument quite literally names the codec involved, difficult to understand? I don't believe so; am I the only one?
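As a rough illustration of a return type that follows from the codec name, here is a sketch using the Python 3 codecs module (an assumption; in the Python 2 of this thread the equivalent would be str.decode() itself). The "hex" codec maps bytes to bytes, while "utf-8" maps bytes to text:

```python
import codecs

data = b"68656c6c6f"

# "hex" is a bytes-to-bytes codec, so decoding yields bytes.
as_bytes = codecs.decode(data, "hex")

# "utf-8" is a text codec, so decoding yields a str.
as_text = codecs.decode(as_bytes, "utf-8")
```

Anyone who knows what the named codec does can predict the result type from the call alone.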
- Josiah