These failures happen now that issue 28596 has been fixed and locale.getpreferredencoding(False) returns 'UTF-8'.

======================================================================
FAIL: test_strcoll_with_diacritic (test.test_locale.TestEnUSCollation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/sdcard/org.bitbucket.pyona/lib/python3.7/test/test_locale.py", line 362, in test_strcoll_with_diacritic
    self.assertLess(locale.strcoll('à', 'b'), 0)
AssertionError: 1 not less than 0

======================================================================
FAIL: test_strxfrm_with_diacritic (test.test_locale.TestEnUSCollation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/sdcard/org.bitbucket.pyona/lib/python3.7/test/test_locale.py", line 365, in test_strxfrm_with_diacritic
    self.assertLess(locale.strxfrm('à'), locale.strxfrm('b'))
AssertionError: 'à' not less than 'b'

----------------------------------------------------------------------
Both strcoll() and strxfrm() are broken (the Unicode code point of 'à' is U+00E0):

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'C.UTF-8'
>>> locale.strcoll('\u00e0', 'b')
1
>>> locale.strxfrm('\u00e0') < locale.strxfrm('b')
False

The correct results are -1 and True.
I'm afraid that the sentence "wcscoll/wcsxfrm have known bugs" is misleading for people who are not familiar with Android. The actual cause is that Bionic's setlocale() behaves differently from other platforms. Most implementations return NULL if en_US.UTF-8 is not available, but Bionic returns C.UTF-8. I guess it's better to add a comment explaining the real cause on Android.
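
To illustrate the difference, here is a rough sketch of how a caller could detect that setlocale() silently substituted another locale instead of failing. The helper names locale_was_honored() and try_setlocale() are hypothetical and do not exist in test_locale.py or the locale module. The sketch assumes that comparing the requested name with the returned name, ignoring case and hyphens, is a good enough check.

```python
import locale


def locale_was_honored(requested, returned):
    """Return True if the name reported by setlocale() matches the request.

    Hypothetical helper: names are compared ignoring case and hyphens,
    so 'en_US.UTF-8' and 'en_US.utf8' are treated as the same locale.
    """
    def normalize(name):
        return name.replace('-', '').lower()

    return normalize(requested) == normalize(returned)


def try_setlocale(category, name):
    """Set the locale and return its name, or None if it was not honored.

    Hypothetical helper: None covers both the usual failure mode
    (locale.Error is raised) and the behavior described above, where a
    different locale such as 'C.UTF-8' is returned instead.
    """
    previous = locale.setlocale(category)  # query the current setting
    try:
        returned = locale.setlocale(category, name)
    except locale.Error:
        return None
    if not locale_was_honored(name, returned):
        # Undo the silent substitution before reporting failure.
        locale.setlocale(category, previous)
        return None
    return returned
```

With something like this, the collation tests could be skipped when try_setlocale(locale.LC_ALL, 'en_US.UTF-8') returns None, rather than being attributed to bugs in wcscoll/wcsxfrm.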