[Python-Dev] Divorcing str and unicode (no more implicit conversions).
Antoine Pitrou solipsis at pitrou.net
Mon Oct 3 14:32:48 CEST 2005
On Monday 3 October 2005 at 02:09 -0400, Martin Blais wrote:
> What if we could completely disable the implicit conversions between unicode and str?
This would be very annoying when dealing with some modules or libraries where the type (str / unicode) returned by a function depends on the context, build, or platform.
A good rule of thumb is to convert to unicode everything that is semantically textual, and to only use str for what is to be semantically treated as a string of bytes (network packets, identifiers...). This is also, AFAIU, the semantic model which is favoured for a hypothetical future version of Python.
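To illustrate that rule of thumb, here is a minimal sketch: decode bytes once at the input boundary, work on unicode in between, and encode once on the way out. The helper names `from_wire` and `to_wire` and the choice of UTF-8 are assumptions made for illustration, not part of any library.

```python
# Hypothetical boundary helpers (names and charset are illustrative assumptions).
WIRE_CHARSET = 'utf-8'

def from_wire(raw):
    """Decode bytes received from outside into a unicode object."""
    return raw.decode(WIRE_CHARSET)

def to_wire(text):
    """Encode a unicode object into bytes just before sending it out."""
    return text.encode(WIRE_CHARSET)

raw = b'caf\xc3\xa9'       # 5 bytes, as they might arrive from a socket
text = from_wire(raw)      # 4 characters of text
shouted = text.upper()     # text operations happen on unicode only
out = to_wire(shouted)     # back to bytes at the output boundary
```

The byte string is longer than the text it encodes, which is one reason to keep the two kinds of object apart: lengths, slicing and case conversion only make sense on the decoded form.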
This is what I'm using to do safe conversion to a given type without worrying about the type of the argument:
DEFAULT_CHARSET = 'utf-8'

def safe_unicode(s, charset=None):
    """
    Forced conversion of a string to unicode, does nothing
    if the argument is already a unicode object.
    This function is useful because the .decode method
    on a unicode object, instead of being a no-op, tries to do
    a double conversion back and forth (which often fails because
    'ascii' is the default codec).
    """
    if isinstance(s, str):
        return s.decode(charset or DEFAULT_CHARSET)
    else:
        return s

def safe_str(s, charset=None):
    """
    Forced conversion of a unicode to string, does nothing
    if the argument is already a plain str object.
    This function is useful because the .encode method
    on a str object, instead of being a no-op, tries to do
    a double conversion back and forth (which often fails because
    'ascii' is the default codec).
    """
    if isinstance(s, unicode):
        return s.encode(charset or DEFAULT_CHARSET)
    else:
        return s
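As a quick usage sketch, both helpers are idempotent, so they can be applied without knowing which type came in. The aliases `text_type` and `bytes_type` below are assumptions added only so that the snippet also runs on an interpreter where the byte-string type is named `bytes`; with the interpreter discussed here they are simply `unicode` and `str`.

```python
import sys

# Assumption: alias the two string types so the sketch is version-neutral.
if sys.version_info[0] >= 3:
    text_type, bytes_type = str, bytes
else:
    text_type, bytes_type = unicode, str

DEFAULT_CHARSET = 'utf-8'

def safe_unicode(s, charset=None):
    # Decode byte strings; pass text through untouched.
    if isinstance(s, bytes_type):
        return s.decode(charset or DEFAULT_CHARSET)
    return s

def safe_str(s, charset=None):
    # Encode text; pass byte strings through untouched.
    if isinstance(s, text_type):
        return s.encode(charset or DEFAULT_CHARSET)
    return s

raw = b'\xc3\xa9t\xc3\xa9'           # UTF-8 bytes for a three-letter word
text = safe_unicode(raw)
assert safe_unicode(text) is text    # already text: returned as-is
assert safe_str(text) == raw         # round trip gives the same bytes
assert safe_str(raw) is raw          # already bytes: returned as-is
```

Because a second call is a no-op, the helpers can be placed at every entry and exit point of a module without risking the double conversion described in the docstrings.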
Good luck
Antoine.