[Python-Dev] bytes / unicode
P.J. Eby pje at telecommunity.com
Sun Jun 27 19:02:28 CEST 2010
- Previous message: [Python-Dev] bytes / unicode
- Next message: [Python-Dev] bytes / unicode
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
At 03:53 PM 6/27/2010 +1000, Nick Coghlan wrote:
> We could talk about this even longer, but the most effective way forward is going to be a patch that improves the URL parsing situation.
Certainly, it's the only practical solution for the immediate problems in 3.2.
I only mentioned that I "hate the idea" because I'd be more comfortable if it was explicitly declared to be a temporary hack to work around the absence of a string coercion protocol, due to the moratorium on language changes.
But, since the moratorium is in effect, I'll try to make this my last post on string protocols for a while... and maybe wait until I've looked at the code (str/bytes C implementations) in more detail and can make a more concrete proposal for what the protocol would be and how it would work. (Not to mention closer to the end of the moratorium.)
> There are a very small number of APIs where it is appropriate to be polymorphic
This is only true if you focus exclusively on bytes vs. unicode, rather than the general issue that it's currently impractical to pass any sort of user-defined string type through code that you don't directly control (stdlib or third-party).
> The virtues of a separate polystr type are that:
> 1. It can be simple and implemented in Python, dispatching to str or bytes as appropriate (probably in the strings module)
> 2. No chance of impacting the performance of the core interpreter (as builtins are not affected)
Note that adding a string coercion protocol isn't going to change core performance for existing cases, since any place where the protocol would be invoked would be a code branch that either throws an error or already falls back to some other protocol (e.g. the buffer protocol).
> 3. Lower impact if it turns out to have been a bad idea
How many protocols have been added that turned out to be bad ideas? The only ones that have been removed in 3.x, IIRC, are three-way compare, slice-specific operations, and __coerce__... and I'm going to miss __cmp__. ;-)
However, IIUC, the reason these protocols were dropped isn't that they were "bad ideas". Rather, they're things that can be implemented in terms of a finer-grained protocol. I.e., if you want __cmp__ or __getslice__ or __coerce__, you can always implement them via a mixin that converts the newer fine-grained protocols into invocations of the older protocol. (As I plan to do for __cmp__ in the handful of places I use it.)
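A minimal sketch of that kind of mixin, for illustration only: the names CmpMixin, _via_cmp and Version are hypothetical, and this is just one way such an adapter might be written. It routes the six rich comparison methods to a single legacy-style __cmp__ supplied by the subclass.

```python
class CmpMixin:
    """Map the six rich comparisons onto one legacy-style __cmp__.

    Subclasses define __cmp__(other), returning a negative number, zero,
    or a positive number, or NotImplemented for unsupported operands.
    """

    def _via_cmp(self, other, test):
        result = self.__cmp__(other)
        if result is NotImplemented:
            # Let Python try the reflected operation on the other operand.
            return NotImplemented
        return test(result)

    def __eq__(self, other):
        return self._via_cmp(other, lambda r: r == 0)

    def __ne__(self, other):
        return self._via_cmp(other, lambda r: r != 0)

    def __lt__(self, other):
        return self._via_cmp(other, lambda r: r < 0)

    def __le__(self, other):
        return self._via_cmp(other, lambda r: r <= 0)

    def __gt__(self, other):
        return self._via_cmp(other, lambda r: r > 0)

    def __ge__(self, other):
        return self._via_cmp(other, lambda r: r >= 0)

    # Because __eq__ is defined here without __hash__, instances are
    # unhashable unless a subclass defines __hash__ itself.


class Version(CmpMixin):
    """Hypothetical example type that only knows three-way comparison."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __cmp__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        # The usual Python 3 replacement for the removed cmp() builtin.
        return (self.key > other.key) - (self.key < other.key)


ordered = sorted([Version(2, 0), Version(1, 10), Version(1, 2)])
```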
At the moment, however, this isn't possible for multi-string operations outside of __add__/__radd__ and comparison -- the coercion rules are hard-wired and can't be overridden by user-defined types.
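To make the limitation concrete, here is a small sketch assuming a hypothetical user-defined string-like type called Markup (not a str subclass). Concatenation can be intercepted through __add__ and __radd__, but str.join checks its items itself and gives the type no hook to take part.

```python
class Markup:
    """Hypothetical string-like type that is deliberately not a str subclass."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        if isinstance(other, Markup):
            return Markup(self.text + other.text)
        if isinstance(other, str):
            return Markup(self.text + other)
        return NotImplemented

    def __radd__(self, other):
        # Invoked for `"literal" + Markup(...)`, since str cannot handle
        # a Markup right-hand operand on its own.
        if isinstance(other, str):
            return Markup(other + self.text)
        return NotImplemented


# Concatenation works in both directions and preserves the custom type.
combined = "<b>" + Markup("x") + "</b>"

# str.join, by contrast, accepts only str items, so Markup is rejected.
try:
    ", ".join([Markup("a"), Markup("b")])
except TypeError:
    join_rejected = True
else:
    join_rejected = False
```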