[Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

Ron Adam rrr at ronadam.com
Tue Feb 21 00:40:19 CET 2006


Bengt Richter wrote:

On Sat, 18 Feb 2006 23:33:15 +0100, Thomas Wouters <thomas at xs4all.net> wrote:

note what base64 really is for. Its essence is to create a character sequence which can succeed in being encoded as ascii. The concept of base64 going str->str is really a mental shortcut for sstr.decode('base64').encode('ascii'), where 3 octets are decoded as code for 4 characters modulo padding logic.

Wouldn't it be...

obj.encode('base64').encode('ascii')

This would probably also work...

obj.encode('base64').decode('ascii')  ->  ascii alphabet in unicode

Where the underlying sequence might be ...

obj -> bytes -> bytes:base64 -> base64 ascii character set

The point is to have the data in a safe-to-transmit form that can survive being encoded and decoded into different forms along the transmission path and still be restored at the final destination.

base64 ascii character set -> bytes:base64 -> original bytes -> obj
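
As a rough illustration of the round trip above, here is a minimal sketch using the base64 module as it exists in today's Python 3. It is not what the 2006 str/unicode API looked like, and the variable names (payload, b64_bytes, b64_text, restored) are made up for the example.

```python
import base64

# Arbitrary binary data standing in for "obj -> bytes".
payload = bytes([0, 255, 16, 128, 7, 42])

# bytes -> bytes:base64 (still a bytes object, but only ASCII values)
b64_bytes = base64.b64encode(payload)

# bytes:base64 -> base64 ascii character set, as text
b64_text = b64_bytes.decode("ascii")

# ... the text can now pass through text-only channels ...

# Reverse path: text -> bytes:base64 -> original bytes
restored = base64.b64decode(b64_text.encode("ascii"))

# Three input octets become four output characters.
three_octets = base64.b64encode(b"abc")
```

Here restored compares equal to payload, and three_octets has length 4, which is the "3 octets for 4 characters" relation mentioned in the quoted text.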

If the str type constructor had an encoding argument like the unicode type does, along with a str.encoded_with attribute, then it might be possible to deprecate the .decode() and .encode() methods and remove them from P3k entirely, or use them as data coders/decoders instead of char type encoders.

It could also create a clear separation between character encodings and data coding. The following should give an exception.

str(str, 'rot13')

Rot13 isn't a character encoding, but a data coding method.

data_str.encode('rot13') # could be ok

But this wouldn't...

new_str = data_str.encode('latin_1') # could cause an exception

We'd have to use...

new_str = str(data_str, 'latin_1') # New string sub type...
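
For comparison only, the sketch below shows how present-day Python 3 draws a line between character encodings and data codings. It splits the roles differently from the proposal above: str.encode() is kept for character encodings, and data codings such as rot13 go through codecs.encode(). The names coded, raw and rejected are assumptions for the example.

```python
import codecs

data_str = "Hello"

# Data coding: rot13 is a str -> str transform, reached via the codecs module.
coded = codecs.encode(data_str, "rot13")

# Character encoding: str -> bytes through str.encode().
raw = data_str.encode("latin_1")

# str.encode() refuses codecs that are not text encodings.
try:
    data_str.encode("rot13")
    rejected = False
except LookupError:
    rejected = True
```

With this input, coded is 'Uryyb', raw is b'Hello', and rejected ends up True because str.encode() raises LookupError for rot13.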

Cheers, Ronald Adam


