Issue 7739: time.strftime may hung while trying to open /etc/localtime but does not release GIL (original) (raw)

Created on 2010-01-19 09:23 by dorontal, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
time_strftime_gil.patch vstinner,2010-01-21 01:37
Messages (10)
msg98049 - (view) Author: Doron Tal (dorontal) Date: 2010-01-19 09:23
I've encountered a hung of python process for more than a second. It appears that the stall happens due to time.strftime call, which internally opens a file ('/etc/localtime'). I think it is best if the GIL would have been released to allow other threads to continue working. Snippet from strace on: ----------------------- import time time.strftime('%Z') ----------------------- Note the line: open("/etc/localtime", O_RDONLY) = 4 ----=== strace output Starts here ===--- stat("/usr/local/lib/python2.6/lib-old/time", 0x7fff871deff0) = -1 ENOENT (No such file or directory) open("/usr/local/lib/python2.6/lib-old/time.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/local/lib/python2.6/lib-old/timemodule.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/local/lib/python2.6/lib-old/time.py", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/local/lib/python2.6/lib-old/time.pyc", O_RDONLY) = -1 ENOENT (No such file or directory) stat("/usr/local/lib/python2.6/lib-dynload/time", 0x7fff871deff0) = -1 ENOENT (No such file or directory) open("/usr/local/lib/python2.6/lib-dynload/time.so", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0755, st_size=46995, ...}) = 0 futex(0x393b4030ec, FUTEX_WAKE_PRIVATE, 2147483647) = 0 open("/usr/local/lib/python2.6/lib-dynload/time.so", O_RDONLY) = 4 read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\30\0\0\0\0\0\0"..., 832) = 832 fstat(4, {st_mode=S_IFREG 0755, st_size=46995, ...}) = 0 mmap(NULL, 2115944, PROT_READ PROT_EXEC, MAP_PRIVATE
msg98055 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-01-19 13:54
You might want to investigate which C library call opens this file, and propose a patch to release the GIL before doing that call. As far as I can tell, Python itself does not open /etc/localtime in any place.
msg98066 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-01-19 20:27
I'd also be curious why it takes a second to open /etc/localtime. That sounds like a bug in the C library, or a mismanaged system.
msg98069 - (view) Author: Doron Tal (dorontal) Date: 2010-01-19 21:36
We didn't prove this point yet, currently it's merely circumstantial. The application calls strftime few dozen times each second from different threads. We spot the stall using another two threads, the first takes a time stamp before calling a python function and reset the time stamp when the python function is done. The other thread watches the timestamp and if it gets over a certain limit (2 seconds) it aborts the process, leaving a core dump behind. We've build this infrastructure to tackle python stalls caused by misbehaving C calls which takes long but does not release the GIL. Here is the traceback (one of the threads): #0 0x0000003424ac5800 in __open_nocancel () from /lib64/libc.so.6 #1 0x0000003424a6c383 in __GI__IO_file_open () from /lib64/libc.so.6 #2 0x0000003424a6c43c in _IO_new_file_fopen () from /lib64/libc.so.6 #3 0x0000003424a61764 in __fopen_internal () from /lib64/libc.so.6 #4 0x0000003424a8d666 in __tzfile_read () from /lib64/libc.so.6 #5 0x0000003424a8c56f in tzset_internal () from /lib64/libc.so.6 #6 0x0000003424a8d010 in tzset () from /lib64/libc.so.6 #7 0x0000003424a91934 in strftime_l () from /lib64/libc.so.6 #8 0x00002aea01bced34 in ?? () from /usr/lib64/python2.4/lib-dynload/timemodule.so #9 0x00000034304360f0 in PyObject_Call () from /usr/lib64/libpython2.4.so.1.0 #10 0x0000003430436bbc in PyObject_CallMethod () from /usr/lib64/libpython2.4.so.1.0 #11 0x00002aea01dda64a in ?? () from /usr/lib64/python2.4/lib-dynload/datetime.so #12 0x00002aea01dda9c3 in ?? () from /usr/lib64/python2.4/lib-dynload/datetime.so #13 0x00000034304950ba in PyEval_EvalFrame () from /usr/lib64/libpython2.4.so.1.0 #14 0x0000003430494b66 in PyEval_EvalFrame () from /usr/lib64/libpython2.4.so.1.0 #15 0x0000003430495fe5 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.4.so.1.0 #16 0x000000343044c45a in ?? () from /usr/lib64/libpython2.4.so.1.0 #17 0x00000034304360f0 in PyObject_Call () from /usr/lib64/libpython2.4.so.1.0 #18 0x000000343049352c in PyEval_EvalFrame () from /usr/lib64/libpython2.4.so.1.0 #19 0x0000003430495fe5 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.4.so.1.0 #20 0x000000343044c45a in ?? () from /usr/lib64/libpython2.4.so.1.0 #21 0x00000034304360f0 in PyObject_Call () from /usr/lib64/libpython2.4.so.1.0 #22 0x000000343049352c in PyEval_EvalFrame () from /usr/lib64/libpython2.4.so.1.0 #23 0x0000003430494b66 in PyEval_EvalFrame () from /usr/lib64/libpython2.4.so.1.0 #24 0x0000003430495fe5 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.4.so.1.0 #25 0x000000343044c367 in ?? () from /usr/lib64/libpython2.4.so.1.0 #26 0x00000034304360f0 in PyObject_Call () from /usr/lib64/libpython2.4.so.1.0 #27 0x000000343043c1ef in ?? () from /usr/lib64/libpython2.4.so.1.0 #28 0x00000034304360f0 in PyObject_Call () from /usr/lib64/libpython2.4.so.1.0 #29 0x000000343048fc3d in PyEval_CallObjectWithKeywords () from /usr/lib64/libpython2.4.so.1.0 #30 0x00000034304bbbbd in ?? () from /usr/lib64/libpython2.4.so.1.0 #31 0x00000034256064a7 in start_thread () from /lib64/libpthread.so.0 #32 0x0000003424ad3c2d in clone () from /lib64/libc.so.6 So, we didn't actually proved that opening /etc/localtime takes so long. I assumed that it is better to release the GIL prior to opening a file. In any case, it won't solve our problem, since we use old Python 2.4.3 (CentOS5.4)... --doron On Tue, Jan 19, 2010 at 10:27 PM, Martin v. Löwis <report@bugs.python.org>wrote: > > Martin v. Löwis <martin@v.loewis.de> added the comment: > > I'd also be curious why it takes a second to open /etc/localtime. That > sounds like a bug in the C library, or a mismanaged system. > > ---------- > nosy: +loewis > > _______________________________________ > Python tracker <report@bugs.python.org> > <http://bugs.python.org/issue7739> > _______________________________________ >
msg98088 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-01-21 01:01
> It appears that the stall happens due to time.strftime call, which internally opens > a file ('/etc/localtime') In the GNU libc, strftime() calls tzset() and tzset() uses a lock to be thread safe. Yes, releasing the GIL before calling strftime() would speed up your use case.
msg98092 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-01-21 01:37
Here you have a patch releasing the GIL for the call to strftime().
msg98093 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-01-21 01:47
time_tzset() might also release the GIL, but: - i hope that multithreaded programs don't call regulary tzset() in different threads - time_tzset() calls tzset() and inittimezone() - inittimezone() modify the time module attributes and may call tzset() (depending on the OS / C library). it's also called from inittime() So I prefer to leave it unchanged :-)
msg107415 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-06-09 20:55
Victor, I understand your last comment as wanting to keep the GIL while calling tzset(), but release it around the strftime() call. You still want to apply your patch, right? I think the problem is not with a concurrent tzset() call, but with another thread changing TZ environment variable. See http://sourceware.org/bugzilla/show_bug.cgi?id=4350
msg107614 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-06-12 00:20
> I understand your last comment as wanting to keep the GIL while calling > tzset(), but release it around the strftime() call. Yes > You still want to apply your patch, right? http://sourceware.org/bugzilla/show_bug.cgi?id=4350 describes a crash. This issue is not a crash but a concurrency problem. I don't know if my patch fixes the initial problem: dorontal didn't try it and he doesn't explain us how to reproduce it. I also not sure that releasing the GIL may not introduce a new concurrency issue: it might crash if a thread calls strftime() and another thread calls setenv("TZ", "...") (if I understood correctly the GNU libc bug 4350).
msg125964 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-11 00:11
-1 on releasing the GIL around strftime(). Until we have an implementation that can support multiple locales and multiple timezones, people will work around this limitation by messing up with the TZ environment variable or global locale settings.
History
Date User Action Args
2022-04-11 14:56:56 admin set github: 51988
2015-08-28 21:05:57 belopolsky set status: open -> closedresolution: wont fixstage: patch review -> resolved
2011-01-11 00:11:27 belopolsky set versions: + Python 3.3, - Python 2.6nosy:loewis, belopolsky, pitrou, vstinner, dorontalmessages: + assignee: belopolsky -> type: performancestage: patch review
2010-06-12 00:20:19 vstinner set messages: +
2010-06-09 20:55:10 belopolsky set assignee: belopolskymessages: + nosy: + belopolsky
2010-01-21 01:47:39 vstinner set messages: +
2010-01-21 01:37:42 vstinner set files: + time_strftime_gil.patchkeywords: + patchmessages: +
2010-01-21 01:37:06 vstinner set files: - unnamed
2010-01-21 01:01:23 vstinner set nosy: + vstinnermessages: +
2010-01-19 21:36:53 dorontal set files: + unnamedmessages: +
2010-01-19 20:27:45 loewis set nosy: + loewismessages: +
2010-01-19 13:54:46 pitrou set nosy: + pitroumessages: +
2010-01-19 09:23:55 dorontal create