msg296896 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-06-26 13:50 |
The current code of PyThread_acquire_lock_timed() (the implementation not using semaphore) doesn't compute correctly the timeout when pthread_cond_timedwait() is interrupted by a signal. We should recompute the timeout using a deadline. Something like select.select(): if (tvp) deadline = _PyTime_GetMonotonicClock() + timeout; do { ... use tvp if (errno != EINTR) break; /* select() was interrupted by a signal */ if (PyErr_CheckSignals()) goto finally; if (tvp) { timeout = deadline - _PyTime_GetMonotonicClock(); if (timeout < 0) { n = 0; break; } _PyTime_AsTimeval_noraise(timeout, &tv, _PyTime_ROUND_CEILING); /* retry select() with the recomputed timeout */ } } while (1); |
|
|
msg296898 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-06-26 13:51 |
See also the old bpo-12822: "NewGIL should use CLOCK_MONOTONIC if possible.". |
|
|
msg296899 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-06-26 13:51 |
See also the PEP 475 "Retry system calls failing with EINTR" and PEP 418 (time.monotonic). |
|
|
msg304892 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-10-24 10:21 |
interrupted_lock.py: test threading.Lock.acquire(timeout=1.0) with SIGALRM sent every 1 ms (so up to 1000 times in total). Example: haypo@selma$ ./python interrupted_lock.py acquire(timeout=1.0) took 1.0 seconds and got 1000 signals Oh, in fact, threading.Lock.acquire(timeout=1.0) already recomputes the timeout when interrupted. In Python stdlib, PyThread_acquire_lock_timed() is only called from one place with intr_flag=0: faulthandler watchdog thread, but this thread blocks all signals: /* we don't want to receive any signal */ sigfillset(&set); pthread_sigmask(SIG_SETMASK, &set, NULL); |
|
|
msg304893 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-10-24 10:28 |
Ah, I found another caller of PyThread_acquire_lock_timed() with a timeout > 0 and intr_flag=0: _enter_buffered_busy() of Modules/_io/bufferedio.c: /* When finalizing, we don't want a deadlock to happen with daemon * threads abruptly shut down while they owned the lock. * Therefore, only wait for a grace period (1 s.). * Note that non-daemon threads have already exited here, so this * shouldn't affect carefully written threaded I/O code. */ st = PyThread_acquire_lock_timed(self->lock, (PY_TIMEOUT_T)1e6, 0); |
|
|
msg304895 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-10-24 10:30 |
I wrote PR 4103 to fix the pthread+semaphore implementation of PyThread_acquire_lock_timed(). Apply PR 4103, apply attached test.patch, recompile Python, and run interrupted_lock.py to test the PR. Result: haypo@selma$ ./python interrupted_lock.py acquire(timeout=1.0) took 1.0 seconds and got 911 signals |
|
|
msg304896 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-10-24 10:31 |
To check if you are using pthread+semaphore, use: haypo@selma$ ./python -c 'import sys; print(sys.thread_info)' sys.thread_info(name='pthread', lock='semaphore', version='NPTL 2.25') Here you have pthread+semaphore. It's Fedora 26 running Linux kernel 4.13. |
|
|
msg304951 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-10-24 23:53 |
New changeset 850a18e03e8f8309bc8c39adc6e7d51a4568cd9a by Victor Stinner in branch 'master': bpo-30768: Recompute timeout on interrupted lock (GH-4103) https://github.com/python/cpython/commit/850a18e03e8f8309bc8c39adc6e7d51a4568cd9a |
|
|
msg304952 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-10-25 00:13 |
I merged my PR. Thanks Antoine Pitrou for the review! This change only impacts the io.BufferedWriter and io.BufferedReader during Python finalization. It has no effect on theading.Lock.acquire(). It might impact faulthandler.dump_traceback_later(), but in practice, it shouldn't change anything since the internal faulthandler watchdog thread blocks all signals. If I understand correctly, if the system clock is stepped back by 1 hour during Python finalization, after PyThread_acquire_lock_timed() computed the deadline, but before sem_timedwait() completed, _enter_buffered_busy() can be blocked during 1 hour. Moving the system clock backward by 1 hour occurs once a year on the DST change. But the race condition is unlikely since the size of the time window is only 1 second. The DST change should occur at Python shutdown when an io.BufferedReader or io.BufferedWriter is used. Since the race condition seems very unlikely and was never reported by another user, I propose to not backward this change. Moreover, I'm not confident to modify locks in a stable release :-) |
|
|
msg304953 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-10-25 00:15 |
Oh, the pthread condvar+mutex implementation still has the bug, so I reopen the issue. |
|
|