[Python-Dev] Re: [I18n-sig] Changes to gettext.py for Python 2.3
"Martin v. Löwis" martin@v.loewis.de
Fri, 11 Apr 2003 21:54:50 +0200
Barry Warsaw wrote:
- Set the default charset to iso-8859-1. It used to be None, which would cause problems with .ugettext() if the file had no charset parameter. Arguably, the po/mo file would be broken, but I still think iso-8859-1 is a reasonable default.
I'm -1 here. Why do you think it is a reasonable default?
Errors should never pass silently. Unless explicitly silenced.
While iso-8859-1 might be a reasonable default in other application domains, in the context of non-English text (which it typically is), assuming Latin-1 is bound to create mojibake.
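As a small illustration of the concern above (a sketch, not code from gettext.py): decoding with iso-8859-1 never raises, because every byte value maps to some character, so a wrong guess passes silently. The sample string is only an assumption chosen for the demonstration.

```python
# A translated string stored as UTF-8 bytes, as it might appear in a .mo file.
raw = "L\u00f6wis".encode("utf-8")       # "Löwis" -> b'L\xc3\xb6wis'

# Decoding with the correct charset round-trips cleanly.
right = raw.decode("utf-8")

# Decoding with a Latin-1 default does not fail; it just yields mojibake,
# because each of the two UTF-8 bytes for "ö" becomes its own character.
wrong = raw.decode("iso-8859-1")

print(right)   # Löwis
print(wrong)   # LÃ¶wis
```

No exception is raised in the second case, which is exactly why the error goes unnoticed.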
If your application can accept creating mojibake, I suggest a method setdefaultencoding on the catalog, which has no effect if an encoding was found in the catalog.
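One way the suggested method could look is sketched below. The class `Catalog` and its `charset()` accessor are hypothetical stand-ins for illustration, not the actual gettext.py classes; only the method name `setdefaultencoding` comes from the suggestion above.

```python
class Catalog:
    """Hypothetical stand-in for a translation catalog (not the real gettext class)."""

    def __init__(self, charset=None):
        # charset is None when the catalog file declared no charset parameter.
        self._charset = charset

    def setdefaultencoding(self, encoding):
        """Use `encoding` only if the catalog itself did not declare one."""
        if self._charset is None:
            self._charset = encoding

    def charset(self):
        return self._charset


# Catalog without a declared charset: the application's default is applied.
undeclared = Catalog()
undeclared.setdefaultencoding("iso-8859-1")

# Catalog that declared utf-8: the call has no effect.
declared = Catalog("utf-8")
declared.setdefaultencoding("iso-8859-1")
```

With this shape the fallback is an explicit opt-in by the application, so a catalog without a charset still fails loudly unless the caller chose a default.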
- Add a "coerce" default argument to GNUTranslations's constructor. The reason for this is that in Zope, we want all msgids and msgstrs to be Unicode. For the latter, we could use .ugettext() but there isn't currently a mechanism for Unicode-ifying msgids.
Could you please explain in what context this is needed? msgids are ASCII, and you can pass a Unicode string to ugettext just fine.
The plan then is that the charset parameter specifies the encoding for both the msgids and msgstrs, and both are decoded to Unicode when read. For example, we might encode po files with utf-8. I think the GNU gettext tools don't care.
They complain loudly if they find bytes > 127 in the msgid.
Since this could potentially break code [*] that wants to use the encoded interface .gettext(), the constructor flag is added, defaulting to False. Most code I suspect will want to set this to True and use .ugettext().
To avoid breakage, you could define ugettext as
def ugettext(self, message):
    if isinstance(message, unicode):
        tmsg = self._catalog.get(message.encode(self._charset))
        if tmsg is None:
            return message
    else:
        tmsg = self._catalog.get(message, message)
    return unicode(tmsg, self._charset)
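For readers who want to run the idea, here is a rough Python 3 rendering of the same lookup logic, where `str` plays the role of `unicode` and the catalog maps encoded msgids to encoded msgstrs. The class name `SketchTranslations` and the sample catalog are assumptions made for the example; this is not the code that went into gettext.py.

```python
class SketchTranslations:
    """Illustrative only: catalog keys and values are encoded byte strings."""

    def __init__(self, catalog, charset):
        self._catalog = catalog    # dict mapping bytes msgid -> bytes msgstr
        self._charset = charset

    def ugettext(self, message):
        if isinstance(message, str):
            # Text msgid: encode it to match the catalog's byte keys.
            tmsg = self._catalog.get(message.encode(self._charset))
            if tmsg is None:
                return message     # untranslated, already text
        else:
            # Byte-string msgid: look it up as-is, falling back to itself.
            tmsg = self._catalog.get(message, message)
        return tmsg.decode(self._charset)


t = SketchTranslations({b"hello": "hall\u00f6".encode("utf-8")}, "utf-8")
print(t.ugettext("hello"))      # text msgid
print(t.ugettext(b"hello"))     # byte-string msgid
print(t.ugettext("missing"))    # falls back to the msgid
```

Both kinds of msgid reach the same catalog entry, so callers of the encoded interface are not affected.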
- A few other minor changes from the Zope project, including asserting that a zero-length msgid must have a Project-ID-Version header for it to be counted as the metadata record.
That test was there, and removed on request of Bruno Haible, the GNU gettext maintainer, as he points out that Project-ID-Version is not mandatory for the metadata (see Patch #700839).
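A minimal sketch of treating the empty msgid's msgstr as metadata without requiring Project-ID-Version follows. The helper names `parse_metadata` and `charset_of` and the sample header are hypothetical, and the parsing is simplified compared with what gettext.py does.

```python
def parse_metadata(msgstr):
    """Parse 'Key: value' lines from the msgstr of the zero-length msgid."""
    info = {}
    for line in msgstr.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        info[key.strip().lower()] = value.strip()
    return info


def charset_of(info):
    """Return the charset named in Content-Type, or None if none is declared."""
    content_type = info.get("content-type", "")
    if "charset=" in content_type:
        return content_type.split("charset=", 1)[1].strip()
    return None


# A metadata record with no Project-ID-Version header is still usable.
header = "Content-Type: text/plain; charset=utf-8\nContent-Transfer-Encoding: 8bit\n"
info = parse_metadata(header)
print(charset_of(info))
```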
Regards, Martin