
Instead of byte literals, how about a classmethod bytes.from_hex(), which works like this:



  # two equivalent things

  expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')

  expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227, 131, 79, 229, 201, 46, 106])



It's just a nicety; the former fits my brain a little better.  This would work fine both in 2.5 and in 3.0.
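For illustration, here is a rough pure-Python sketch of what the proposed classmethod might do. The whitespace-stripping and the odd-length check are my assumptions, not part of the proposal as written; the function name is hypothetical.

```python
def bytes_from_hex(s):
    """Rough sketch of the proposed bytes.from_hex() classmethod."""
    # Assumption: tolerate whitespace between digits ('5c 53 50 24').
    s = ''.join(s.split())
    if len(s) % 2:
        raise ValueError('odd-length hex string')
    # Parse each pair of hex digits into one byte.
    return bytes(int(s[i:i + 2], 16) for i in range(0, len(s), 2))
```

  # the two spellings from above agree:
  assert bytes_from_hex('5c535024cac5199153e3834fe5c92e6a') == \
      bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227, 131, 79, 229, 201, 46, 106])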



I thought about unicode.encode('hex'), but obviously it will continue
to return a str in 2.x, not bytes.  Also the pseudo-encodings
('hex', 'rot13', 'zip', 'uu', etc.) generally scare me.  And now
that bytes and text are going to be two very different types, they're
even weirder than before.  Consider:



  text.encode('utf-8') ==> bytes

  text.encode('rot13') ==> text

  bytes.encode('zip') ==> bytes

  bytes.encode('uu') ==> text (?)



This state of affairs seems kind of crazy to me.



Actually, users trying to figure out Unicode would probably be better served if bytes.encode() and text.decode() did not exist at all.



-j