Issue 19459: Python does not support the GEORGIAN-PS charset

Created on 2013-10-31 10:52 by Caolán.McNamara, last changed 2022-04-11 14:57 by admin.

Files
File name: georgian_ps.py
Uploaded: vstinner, 2013-10-31 11:24
Messages (7)
msg201800 - Author: Caolán McNamara (Caolán.McNamara) Date: 2013-10-31 10:52

    LANG=ka_GE.georgianps /usr/bin/python3
    Fatal Python error: Py_Initialize: Unable to get the locale encoding
    LookupError: unknown encoding: GEORGIAN-PS
    Aborted (core dumped)

but with python-2.7.5 no crash...

    LANG=ka_GE.georgianps /usr/bin/python2
    Python 2.7.5 (default, Oct 8 2013, 12:19:40)
    [GCC 4.8.1 20130603 (Red Hat 4.8.1-1)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

(fedora 19)
msg201801 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-31 10:56
This bug was initially reported in LibreOffice: https://bugs.freedesktop.org/show_bug.cgi?id=68850
msg201802 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-31 11:24

I found three Georgian encodings:

https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/charmaps/GEORGIAN-PS;h=64615ff4344d74ea0c70cfd7a6c6c8019afb884e;hb=HEAD
https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/charmaps/GEORGIAN-ACADEMY;h=9dc1bc9e782e9fe6092a00daf1a75274fd6dd738;hb=HEAD
http://tools.ietf.org/html/draft-giasher-geostd8-00

The first one ("GEORGIAN-PS") is probably the most accurate because it is the one included in the GNU libc.

Could you please try copying the attached georgian_ps.py file into /usr/lib64/python3.3/encodings/ (or /usr/lib/python3.3/encodings/ for 32-bit Linux)? Then try to print Georgian letters using:

    print(bytes(range(0xc0, 0xe6)).decode("GEORGIAN-PS"))

Please also give me your locale encoding:

    import locale; print(locale.getpreferredencoding())

@Caolán: Do you know the GEORGIAN-ACADEMY encoding? It doesn't appear to be used by any glibc locale.

On my Fedora 18, I have 3 Georgian locales:

* ka_GE.georgianps: locale encoding GEORGIAN-PS
* ka_GE: locale encoding GEORGIAN-PS
* ka_GE.utf8: locale encoding UTF-8

You can work around this issue by switching your locale from ka_GE.georgianps to ka_GE.utf8.
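Editor's sketch: the contents of the attached georgian_ps.py are not shown in this thread, so the following only illustrates how a charmap-based codec can be registered at runtime with the standard `codecs` module. The codec name `x-demo-georgian` and the byte-to-character table are hypothetical (the table simply assumes bytes 0xC0 to 0xE5 map to consecutive code points starting at U+10D0, over a Latin-1 base); the authoritative mapping is the glibc GEORGIAN-PS charmap linked above.

```python
import codecs

# ASSUMPTION: illustrative table only, NOT the real GEORGIAN-PS charmap.
# Start from a Latin-1 identity mapping, then remap 0xC0..0xE5 to
# consecutive code points beginning at U+10D0.
_table = [chr(i) for i in range(256)]
for offset in range(0xE6 - 0xC0):
    _table[0xC0 + offset] = chr(0x10D0 + offset)
DECODING_TABLE = "".join(_table)
ENCODING_TABLE = codecs.charmap_build(DECODING_TABLE)


def _encode(text, errors="strict"):
    return codecs.charmap_encode(text, errors, ENCODING_TABLE)


def _decode(data, errors="strict"):
    return codecs.charmap_decode(data, errors, DECODING_TABLE)


def _search(name):
    # Codec names are normalized before lookup; accept either separator.
    if name.replace("-", "_") == "x_demo_georgian":  # hypothetical name
        return codecs.CodecInfo(
            name="x-demo-georgian", encode=_encode, decode=_decode
        )
    return None


codecs.register(_search)

# Same shape as the test requested in the message above.
print(bytes(range(0xC0, 0xE6)).decode("x-demo-georgian"))
```

Registering a codec this way only helps after the interpreter is running; it does not address the startup failure, because the locale encoding is looked up during initialization, before user code executes.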
msg404214 - Author: Tal Einat (taleinat) * (Python committer) Date: 2021-10-18 19:46

With recent versions of Python (e.g. 3.9) this no longer causes a crash. Python apparently falls back to UTF-8, at least on my system:

    $ LANG=ka_GE.georgianps python3.9
    Python 3.9.7 (default, Sep 9 2021, 23:20:13)
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import locale; print(locale.getpreferredencoding())
    UTF-8

I'm marking this as fixed. If someone still has issues with this encoding, please open a new issue with up-to-date information.
msg404250 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-10-18 23:46

Python uses UTF-8 if the locale is not supported:

    $ LANG=xxx python3.9 -c "import sys; print(sys.flags.utf8_mode)"
    1

On Fedora 34, the locale is still supported, and Python 3.11 still fails:

    vstinner@apu$ LANG=ka_GE.georgianps locale
    LANG=ka_GE.georgianps
    LC_CTYPE="ka_GE.georgianps"
    LC_NUMERIC="ka_GE.georgianps"
    LC_TIME="ka_GE.georgianps"
    LC_COLLATE="ka_GE.georgianps"
    LC_MONETARY="ka_GE.georgianps"
    LC_MESSAGES="ka_GE.georgianps"
    LC_PAPER="ka_GE.georgianps"
    LC_NAME="ka_GE.georgianps"
    LC_ADDRESS="ka_GE.georgianps"
    LC_TELEPHONE="ka_GE.georgianps"
    LC_MEASUREMENT="ka_GE.georgianps"
    LC_IDENTIFICATION="ka_GE.georgianps"
    LC_ALL=

    vstinner@apu$ LANG=ka_GE.georgianps python3.11 -c "import sys; print(sys.flags.utf8_mode)"
    Python path configuration:
      PYTHONHOME = (not set)
      PYTHONPATH = (not set)
      program name = './python'
      isolated = 0
      environment = 1
      user site = 1
      import site = 1
      stdlib dir = '/home/vstinner/python/main/Lib'
      sys._base_executable = '/home/vstinner/python/main/python'
      sys.base_prefix = '/usr/local'
      sys.base_exec_prefix = '/usr/local'
      sys.platlibdir = 'lib'
      sys.executable = '/home/vstinner/python/main/python'
      sys.prefix = '/usr/local'
      sys.exec_prefix = '/usr/local'
      sys.path = [
        '/usr/local/lib/python311.zip',
        '/home/vstinner/python/main/Lib',
        '/home/vstinner/python/main/build/lib.linux-x86_64-3.11-pydebug',
      ]
    Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
    Python runtime state: core initialized
    LookupError: unknown encoding: GEORGIAN-PS

    Current thread 0x00007ff89b81d2c0 (most recent call first):
msg404275 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-10-19 08:44

Possible solutions (they can be combined):

1. Add support for the GEORGIAN-PS charset and all other encodings used in libc. The problem is that it is difficult to get official information about these encodings.

2. Fall back to utf-8 or ascii+surrogateescape in case of an unsupported locale encoding. But typos in the locale name could then slip by unnoticed.
msg404290 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2021-10-19 11:20

On 19.10.2021 10:44, Serhiy Storchaka wrote:
>
> Possible solutions (they can be combined):
>
> 1. Add support for the GEORGIAN-PS charset and all other encodings used in libc (). The problem is that it is difficult to get the official information about these encodings.

As with all encodings we add: there has to be a real need to support them natively in Python (as opposed to installing codecs via PyPI) and we need a definite source for the encoding, e.g. a standards document from an official body.

IMO, we should not really add more encodings to the stdlib, but instead point people to e.g. the iconv package:

https://pypi.org/project/python-iconv/

Perhaps we ought to make it easier for such packages to provide additional codecs even during the startup phase, e.g. via a special env var which points Python to a list of codec packages to load prior to initializing the I/O encoding... not sure whether this is possible, though.

> 2. Falls back to utf-8 or ascii+surrogateescape in case of unsupported locale encoding. But typos can slip unnoticed.

I think this would be a more general solution to such cases, provided the startup logic issues a visible warning about the fallback.
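Editor's sketch: the "fall back with a visible warning" idea from the last two messages can be illustrated at application level. This is not CPython's startup logic, which runs in C before any Python code; the helper name `resolve_locale_encoding` and its `fallback` parameter are hypothetical. It relies only on `codecs.lookup`, `locale.getpreferredencoding`, and `warnings.warn` from the standard library.

```python
import codecs
import locale
import warnings


def resolve_locale_encoding(name=None, fallback="utf-8"):
    """Return a codec name Python can use, warning when it falls back.

    Hypothetical helper for illustration only.
    """
    if name is None:
        # Ask the C library for the locale encoding without changing
        # the current locale settings.
        name = locale.getpreferredencoding(False)
    try:
        # codecs.lookup raises LookupError for encodings Python lacks.
        return codecs.lookup(name).name
    except LookupError:
        warnings.warn(
            f"unsupported locale encoding {name!r}; "
            f"falling back to {fallback!r}",
            RuntimeWarning,
            stacklevel=2,
        )
        return fallback


if __name__ == "__main__":
    print(resolve_locale_encoding())
```

The warning addresses the concern that a mistyped locale name would otherwise be silently treated as UTF-8.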
History
Date User Action Args
2022-04-11 14:57:52 admin set github: 63658
2021-12-11 19:13:45 iritkatriel set versions: + Python 3.9, Python 3.10, Python 3.11, - Python 3.3, Python 3.4
2021-10-19 11:20:36 lemburg set messages: +
2021-10-19 08:44:49 serhiy.storchaka set messages: +
2021-10-18 23:46:36 vstinner set status: closed -> open; resolution: fixed -> ; messages: +
2021-10-18 19:46:45 taleinat set status: open -> closed; nosy: + taleinat; messages: +; resolution: fixed; stage: resolved
2014-10-28 14:29:49 jwilk set nosy: + jwilk
2014-10-20 16:50:51 serhiy.storchaka link issue22679 dependencies
2013-10-31 11:37:25 serhiy.storchaka set nosy: + lemburg, loewis, serhiy.storchaka
2013-10-31 11:25:06 vstinner set title: Fatal Python error: Py_Initialize: Unable to get the locale encoding: GEORGIAN-PS -> Python does not support the GEORGIAN-PS charset; versions: + Python 3.4
2013-10-31 11:24:45 vstinner set files: + georgian_ps.py; messages: +
2013-10-31 10:56:24 vstinner set nosy: + vstinner; messages: +
2013-10-31 10:52:59 Caolán.McNamara create