[Python-Dev] Why does base64 return bytes?
Steven D'Aprano steve at pearwood.info
Tue Jun 14 11:19:35 EDT 2016
Normally I'd take a question like this to Python-List, but this question has turned out to be quite divisive, with people having strong opinions but no definitive answer. So I thought I'd ask here in the hope that some of the core devs would have an idea.
Why does base64 encoding in Python return bytes?
base64.b64encode takes bytes as input and returns bytes. Some people are arguing that this is the wrong behaviour, as RFC 3548 specifies that Base64 should transform bytes to characters:
https://tools.ietf.org/html/rfc3548.html
albeit US-ASCII characters. E.g.:
The encoding process represents 24-bit groups of input bits
as output strings of 4 encoded characters.
[...]
Each 6-bit group is used as an index into an array of 64 printable
characters. The character referenced by the index is placed in the
output string.
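To make the quoted procedure concrete, here is a minimal sketch of it in Python that returns a str, which is how the people arguing for "characters" read the RFC. The helper name rfc_b64encode is hypothetical and not part of the standard library; the alphabet is the one in Table 1 of the RFC, and the "=" padding follows the RFC's rule for a final group shorter than 24 bits.

```python
import base64
import string

# The 64 printable characters of the standard Base64 alphabet (RFC 3548, Table 1).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def rfc_b64encode(data: bytes) -> str:
    """Hypothetical helper: encode bytes into a str of Base64 characters."""
    chars = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        pad = 3 - len(chunk)
        # Pad a short final chunk with zero bits to form a full 24-bit group.
        group = int.from_bytes(chunk + b"\x00" * pad, "big")
        # Split the group into four 6-bit indexes, most significant first,
        # and look each one up in the alphabet.
        quad = [ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that came purely from padding bits are replaced by "=".
        if pad:
            quad[-pad:] = "=" * pad
        chars.extend(quad)
    return "".join(chars)


# The sketch agrees with the standard library once the bytes result is decoded.
sample = b"Python-Dev"
same_as_stdlib = rfc_b64encode(sample) == base64.b64encode(sample).decode("ascii")
```

The only difference from base64.b64encode in this sketch is the type of the result: the same 64 symbols come out either way, as a str here and as ASCII bytes from the standard library.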
Are they misinterpreting the standard? Has Python got it wrong? Is there a good reason for returning bytes?
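For reference, the behaviour in question can be seen with a short sketch under Python 3. The variable names are only illustrative, and the final decode step is one common way of getting text out, not something the base64 module does for you.

```python
import base64

data = b"hello"

# b64encode accepts a bytes-like object and returns bytes, not str.
encoded = base64.b64encode(data)
returns_bytes = isinstance(encoded, bytes)

# Passing a str is rejected with TypeError.
try:
    base64.b64encode("hello")
    str_rejected = False
except TypeError:
    str_rejected = True

# The result contains only ASCII, so callers who want text decode it themselves.
text = encoded.decode("ascii")
```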
I see that other languages choose different strategies. Microsoft's languages C#, F# and VB (plus their C++ compiler) take an array of bytes as input and output a UTF-16 string:
https://msdn.microsoft.com/en-us/library/dhx0d524%28v=vs.110%29.aspx
Java's base64 encoder takes and returns bytes:
https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html
and JavaScript's Base64 encoder takes input as UTF-16 encoded text and returns the same:
https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding
I'm not necessarily arguing that Python's strategy is the wrong one, but I am interested in what (if any) reasons are behind it.
Thanks in advance,
Steve