[Python-Dev] strftime/strptime locale funnies...

Brett Cannon brett at python.org
Wed Apr 5 21:13:35 CEST 2006


On 4/5/06, Donovan Baarda <abo at minkirri.apana.org.au> wrote:

G'day,

Just noticed on Debian (testing), Ubuntu (warty?), and RedHat (old) based systems Python's time.strptime() seems to ignore the environment's locale and just uses "C". Last time I looked at this, time.strptime() leveraged off the platform's strptime(), which meant it had all the extra features, bugs and missingness of the platform's implementation. We now seem to be using a Python implementation in strptime.py. This implementation handles locales by feeding a magic date to time.strftime() and figuring out how it formats it.

The Python implementation of time.strptime() has been in Python since summer 2002 so it was first introduced in 2.3 (I can't believe I have been on python-dev that long!).
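
For readers unfamiliar with the "magic date" trick described above, here is a minimal sketch of the idea, not the actual strptime.py code. The function name infer_format() and the replacement table are made up for illustration, and the real module also has to deal with month names, weekday names and AM/PM markers, which this sketch ignores.

```python
import time

# A known date whose fields are all distinct, so each rendered number can be
# traced back to one directive: 1999-03-17 22:44:55, a Wednesday, day 76.
MAGIC = time.struct_time((1999, 3, 17, 22, 44, 55, 2, 76, 0))

# Hypothetical replacement table: rendered value -> directive that produced it.
# Longer values come first so "1999" is handled before "99".
REPLACEMENTS = [
    ("1999", "%Y"), ("99", "%y"),
    ("22", "%H"), ("44", "%M"), ("55", "%S"),
    ("76", "%j"), ("17", "%d"),
    ("03", "%m"), ("3", "%m"),
]


def infer_format(directive):
    """Guess the explicit format behind a locale directive such as %x."""
    rendered = time.strftime(directive, MAGIC)
    for value, replacement in REPLACEMENTS:
        rendered = rendered.replace(value, replacement)
    return rendered


if __name__ == "__main__":
    print("%x looks like", infer_format("%x"))
    print("%X looks like", infer_format("%X"))
```

Because the result is derived entirely from what time.strftime() returns, any locale problem in strftime() carries straight through to the inferred format, which is the coupling described in the original report.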

This revealed that time.strftime() is not honouring the Locale settings, which is causing the new Python strptime() to also get it wrong.

This isn't time.strftime(). If you look at Modules/timemodule.c:450 you will find buflen = strftime(outbuf, i, fmt, &buf); for the actual strftime() call. Before that the only things that could possibly change the locale are localtime() or gettmarg(). Everything else is error-checking of arguments.

$ set | grep "^LC\|LANG"
GDM_LANG=en_AU.UTF-8
LANG=en_AU.UTF-8
LANGUAGE=en_AU.UTF-8
LC_COLLATE=C

$ date -d "1999-02-22" +%x
22/02/99
$ python
...
>>> import time
>>> time.strftime("%x", time.strptime("1999-02-22","%Y-%m-%d"))
'02/22/99'

This is consistent across all three platforms for multiple Python versions, including 2.1 and 1.5 (where they were available) which BTW don't use the Python implementation of strptime(). This suggests that all three of these platforms have a broken libc strftime() implementation... but all three? And why does date work?

Beats me. This could be a locale thing. If I remember correctly Python assumes the C locale on some things. I suspect the reason for this is in the locale module or libc. But you can't even find the word 'locale' or 'Locale' in timemodule.c nor do I know of any calls that mess with the locale, so I doubt 'time' is at fault for this.
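
One way to test the "Python assumes the C locale" theory: a C program starts in the "C" locale until it calls setlocale(), and the locale module exposes that call. The sketch below is only a diagnostic under that assumption. The helper name strftime_with_env_locale() is made up for illustration, and it touches only LC_TIME so the rest of the process is left alone.

```python
import locale
import time

SAMPLE = time.strptime("1999-02-22", "%Y-%m-%d")


def strftime_with_env_locale(fmt, t):
    """Format t using the environment's LC_TIME, then restore the old setting."""
    saved = locale.setlocale(locale.LC_TIME)  # one argument: query only
    try:
        try:
            # An empty string means "take the setting from LANG / LC_*".
            locale.setlocale(locale.LC_TIME, "")
        except locale.Error:
            # The environment names a locale that is not installed;
            # fall through and format with whatever is currently active.
            pass
        return time.strftime(fmt, t)
    finally:
        locale.setlocale(locale.LC_TIME, saved)


if __name__ == "__main__":
    print("LC_TIME at this point:", locale.setlocale(locale.LC_TIME))
    print("without setlocale:    ", time.strftime("%x", SAMPLE))
    print("with environment:     ", strftime_with_env_locale("%x", SAMPLE))
```

If the two %x lines differ on an en_AU system, then libc's strftime() is behaving as designed and the date command simply makes the setlocale() call that the interpreter does not make on the program's behalf.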

Can others reproduce this? Have I done something stupid? Is this a bug, and in what, libc or Python?

Slightly OT, is it wise to use a Python strptime() on platforms that have a perfectly good one in libc? The Python reverse-engineering of libc's strftime() output to figure out locale formatting is clever, but...

The reason it was made the default implementation is consistency across platforms. Since the work of implementing a pure Python version had already been done and it is not a performance-critical operation, consistency across platforms was deemed more important than allowing various platforms to support whatever directives they chose and having people write code that relied upon them.

Plus it has been in there for so long there would be backwards-compatibility issues if we switched this now.

I see there have already been bugs submitted about strftime/strptime non-symmetry for things like support of extensions. There has also been a bug against strptime() Locale switching not working because of caching Locale formatting info from the strftime() analysis,

None open, right? Those should all be closed and fixed.

-Brett


