[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Paul Moore p.f.moore at gmail.com
Tue Apr 28 11:20:44 CEST 2009
2009/4/28 Glenn Linderman <v+python at g.nevcal.com>:
> So assume a non-decodable sequence in a name. That puts us into Martin's funny-decode scheme. His funny-decode scheme produces a bare string, indistinguishable from a bare string that would be produced by a str API that happens to contain that same sequence. Data puns.
> So when open is handed the string, should it open the file with the name that matches the string, or the file with the name that funny-decodes to the same string? It can't know, unless it knows that the string is a funny-decoded string or not.
Sorry for picking on Glenn's comment - it's only one of many in this thread. But it seems to me that there is an assumption that problems will arise when code gets a potentially funny-decoded string and doesn't know where it came from.
Is that a real concern? How many programs really don't know where their data came from? A general-purpose library routine might need to document explicitly how it handles funny-encoded data (I can't actually imagine anything that would, but I'll concede it may be possible), but that's just a matter of documenting your assumptions - no better or worse than many other cases.
This all sounds similar to the idea of "tainted" data in security - if you lose track of untrusted data from the environment, you expose yourself to potential security issues. So the same techniques should be relevant here (including ignoring the issue if your application isn't one where it's a concern!)
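As a rough illustration of the "keep track of where it came from" idea, here is a minimal sketch. It assumes the `surrogateescape` error handler that Python 3 later shipped for PEP 383, which may differ from the exact scheme under discussion in this thread. The helper names `funny_decode`, `funny_encode` and `is_funny_decoded` are hypothetical, made up for this example.

```python
def funny_decode(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode OS bytes; undecodable bytes become lone surrogates U+DC80..U+DCFF."""
    return raw.decode(encoding, "surrogateescape")


def funny_encode(text: str, encoding: str = "utf-8") -> bytes:
    """Reverse funny_decode, turning the escape surrogates back into raw bytes."""
    return text.encode(encoding, "surrogateescape")


def is_funny_decoded(text: str) -> bool:
    """Heuristic 'taint' check: does the string carry any escaped bytes?

    Assumption: no other part of the program produces lone surrogates in
    this range. If one does, this check cannot tell the two apart.
    """
    return any("\udc80" <= ch <= "\udcff" for ch in text)


raw_name = b"caf\xe9.txt"          # Latin-1 bytes, not valid UTF-8
name = funny_decode(raw_name)      # data arriving from the environment

print(is_funny_decoded(name))              # escaped byte is detectable
print(funny_encode(name) == raw_name)      # and the round trip is lossless
print(is_funny_decoded("caf\u00e9.txt"))   # an ordinary str is not flagged
```

The check is only a heuristic: a string built elsewhere that happens to contain the same lone surrogates would look identical, so a program that cares still has to know which of its strings came from the system interfaces.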
I've yet to hear anyone claim that they would have an actual problem with a specific piece of code they have written. (NB, if such a claim has been made, feel free to point me to it - I admit I've been skimming this thread at times).
Paul.