[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

Chris Barker chris.barker at noaa.gov
Sat Jan 11 00:50:04 CET 2014


On Fri, Jan 10, 2014 at 3:40 PM, Juraj Sukop <juraj.sukop at gmail.com> wrote:

What this all means is that the PDF objects are expressed in ASCII, "stream" objects like images and fonts may have a binary part, and I never saw those UTF-16 strings.

hmm -- I wonder if they are out there in the wild, though....

u"stream\n%s\nendstream\nendobj"%binarydata.decode('latin-1')

The argument for dropping "%f" et al. has been that if something is a text, then it should be Unicode. Conversely, if it is not text, then it should not be Unicode.

????

What I'm trying to demonstrate / test is that you can use unicode objects for mixed binary + ascii, if you make sure to encode/decode using latin-1 (any others?). The idea is that ascii can be seen/used as text, and other bytes are preserved, and you can ignore whatever meaning latin-1 gives them.
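
A quick way to check the round-trip property, as a minimal sketch assuming Python 3: latin-1 maps each byte value 0-255 to the code point with the same number, so decoding and re-encoding loses nothing. The variable names here are illustrative only.

```python
# Every possible byte value, 0 through 255.
all_bytes = bytes(range(256))

# latin-1 maps byte n to code point n, so decoding never fails...
text = all_bytes.decode('latin-1')
assert [ord(c) for c in text] == list(range(256))

# ...and encoding back returns the original bytes unchanged.
assert text.encode('latin-1') == all_bytes

# By contrast, ASCII and UTF-8 reject arbitrary binary data.
rejected = []
for codec in ('ascii', 'utf-8'):
    try:
        all_bytes.decode(codec)
    except UnicodeDecodeError:
        rejected.append(codec)

print('codecs that reject arbitrary bytes:', rejected)
```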

Using unicode objects means that you can use the existing string formatting (%s). If you want to pass in binary blobs, you need to decode them as latin-1, creating a unicode object, which gets interpolated into your unicode object. When that unicode object is then encoded back to latin-1, the original bytes are preserved.
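
The whole sequence might look like the following sketch in Python 3. The payload in `binarydata` is a made-up stand-in for real stream contents such as image data, and `template` and `output` are names chosen for illustration.

```python
# Hypothetical binary payload; includes bytes outside the ASCII range.
binarydata = bytes([0x00, 0x7f, 0x80, 0xfe, 0xff])

# Decode the blob as latin-1 so it can be interpolated with ordinary %s.
template = "stream\n%s\nendstream\nendobj"
text = template % binarydata.decode('latin-1')

# Encode the whole result back to latin-1 to get the bytes to write out.
output = text.encode('latin-1')

# The ASCII framing and the binary payload both come through unchanged.
assert output == b"stream\n" + binarydata + b"\nendstream\nendobj"
```

One caveat with this approach: if any interpolated text contains a code point above U+00FF, the final `encode('latin-1')` raises `UnicodeEncodeError`, so it only works when everything in the string is either ASCII or latin-1-decoded binary.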

I think this is confusing, as we are calling it latin-1 but not really using it that way, but it seems it should work.

-Chris

--

Christopher Barker, Ph.D. Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

Chris.Barker at noaa.gov