[Python-Dev] bytes / unicode
Nick Coghlan ncoghlan at gmail.com
Sun Jun 27 04:43:23 CEST 2010
On Sun, Jun 27, 2010 at 4:17 AM, P.J. Eby <pje at telecommunity.com> wrote:
> The idea that I'm proposing is that the basic string and byte types should defer to "user-defined" string types for mixed type operations, so that polymorphism of string-manipulation functions is the default case, rather than a special case. This makes tainting easier to implement, as well as optimizing and other special cases (like my "source string w/file and line info", or a string with font/formatting attributes).
Rather than building this into the base string type, perhaps it would be better (at least initially) to add in a polymorphic str subtype that worked along the following lines:
- Has an encoded argument in the constructor (e.g. poly_str("/", encoded=b"/"))
- If given objects with an encode() method, assumes they're strings and uses its own parent class methods
- If given objects with a decode() method, assumes they're encoded and delegates to the encoded attribute
str/bytes agnostic functions would need to invoke poly_str deliberately, while bytes-only and text-only algorithms could just use the appropriate literals.
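A minimal sketch of how such a subtype might look, assuming that only concatenation and join() need to dispatch. The method selection, the SLASH constant and the two helper functions are illustrative assumptions, not part of any existing API:

```python
class poly_str(str):
    """Hypothetical str subtype that carries a pre-encoded bytes twin."""

    def __new__(cls, text, encoded):
        self = super().__new__(cls, text)
        self.encoded = encoded  # bytes equivalent of the text value
        return self

    def __add__(self, other):
        # An operand with decode() is assumed to be encoded data.
        if hasattr(other, "decode"):
            return self.encoded + other
        # Otherwise fall back to the parent str behaviour.
        return super().__add__(other)

    def __radd__(self, other):
        if hasattr(other, "decode"):
            return other + self.encoded
        if hasattr(other, "encode"):
            return other + str(self)
        return NotImplemented

    def join(self, parts):
        parts = list(parts)
        # Dispatch on the first element; mixed sequences still fail as usual.
        if parts and hasattr(parts[0], "decode"):
            return self.encoded.join(parts)
        return super().join(parts)


# A str/bytes agnostic helper invokes poly_str deliberately.
SLASH = poly_str("/", encoded=b"/")


def join_path(*segments):
    """Join path segments of either type with a slash of the same type."""
    return SLASH.join(segments)


def add_trailing_slash(path):
    """Append a slash to a str or bytes path."""
    return path + SLASH
```

With this sketch, join_path("a", "b") yields a str and join_path(b"a", b"b") yields bytes. Calls that pass the poly_str as an argument to a bytes method (for example b"a/b".split(SLASH)) would still fail, which is one of the operations where this approach runs into trouble.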
Third party types would be supported to some degree (by having either encode or decode methods), although they could still run into trouble with some operations. While full support for third party string and byte sequence implementations is an interesting idea, I think it's overkill for the specific problem of making it easier to write str/bytes agnostic functions for tasks like URL parsing.
Regards, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia