[Python-Dev] Filename as byte string in python 2.6 or 3.0?
glyph at divmod.com
Mon Sep 29 16:01:33 CEST 2008
On 11:59 am, eckhardt at satorlaser.com wrote:
> Sorry, I wasn't clear enough. I'll try to explain further...
>
> Let's assume we have a filename like this:
>
>   0xc2 0xa9 0x2f 0x7f
>
> The first two bytes are the copyright sign encoded in UTF-8, followed by a slash (0x2f, path separator) and a character encoded in an unknown codepage (0x7f is not ASCII!).
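To make the byte sequence concrete, here is a minimal Python 3 sketch that tries to decode it as UTF-8. One caveat: 0x7f is the DEL control character, which lies inside 7-bit ASCII, so the sequence exactly as quoted decodes without error. The second half of the sketch therefore swaps in 0xff, a byte that can never appear in valid UTF-8, purely as an assumed stand-in for "a character from an unknown codepage". The variable names are illustrative only.

```python
# Bytes from the quoted example: UTF-8 copyright sign, slash, 0x7f.
name = bytes([0xC2, 0xA9, 0x2F, 0x7F])

# 0x7f is DEL, which is part of ASCII, so strict UTF-8 decoding succeeds.
decoded = name.decode("utf-8")

# Assumed variant (not from the original message): 0xff is never valid UTF-8.
bad_name = bytes([0xC2, 0xA9, 0x2F, 0xFF])
try:
    bad_name.decode("utf-8")
    undecodable = False
    bad_offset = None
except UnicodeDecodeError as exc:
    undecodable = True
    bad_offset = exc.start  # index of the first offending byte
```

With the substituted byte, the decoder rejects the name at the final byte, which is the situation the thread is concerned with: a filename that exists on disk but has no faithful text representation under the assumed encoding.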
Originally I thought that this was a valid idea, but then it became clear that this could be a problem. Consider a filename which includes a UTF-8 encoding of a PUA code point.
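The collision can be sketched as follows. This is not the scheme anyone in the thread specified; it assumes a hypothetical mapping in which each undecodable byte `b` becomes the PUA code point U+E000 + `b`, implemented with a custom error handler registered through `codecs.register_error`. The names `PUA_BASE`, `pua_escape_demo`, and `decode_name` are invented for illustration.

```python
import codecs

PUA_BASE = 0xE000  # hypothetical base for escaped bytes, chosen for this sketch


def _pua_escape(exc):
    """Map each undecodable byte to a PUA code point (illustrative scheme)."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Return the replacement text and the position at which decoding resumes.
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


codecs.register_error("pua_escape_demo", _pua_escape)


def decode_name(raw):
    """Decode a filename, escaping invalid bytes into the PUA."""
    return raw.decode("utf-8", "pua_escape_demo")


# A filename consisting of one invalid byte...
escaped = decode_name(b"\xff")
# ...and a filename that genuinely contains U+E0FF, validly encoded.
genuine = decode_name("\ue0ff".encode("utf-8"))

# Two different byte strings now map to the same text.
collision = escaped == genuine
```

Under this assumed mapping, `b"\xff"` and the valid UTF-8 encoding of U+E0FF both decode to the same one-character string, so the original bytes cannot be recovered from the text alone. Any scheme that reuses code points which may legitimately appear in filenames has this ambiguity unless it also escapes the genuine occurrences.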
> I'm not sure if the use I proposed is correct according to the intended use of the PUA. I know that ideally no such string would escape from Python, i.e. it should only be visible internally. I would guess that that is something the PUA was intended for.
Viewing the PUA with GNOME charmap, I can see that many code points there have character renderings on my Ubuntu system. I have to assume, therefore, that there are other (and potentially conflicting) uses for this Unicode feature.