gh-91351: Fix some bugs in importlib handling of re-entrant imports by exarkun · Pull Request #94504 · python/cpython
See #91351 for details about the problem.
Re-entrancy is always tricky, and given the requirements of _bootstrap.py (it has to cope with both re-entrancy and multi-threading without exposing any of the details to application code doing an import), I think this goes double.
This PR does a few things to achieve better safety in the face of re-entrancy:
- Switch some data structures to ones that support atomic operations, so that they stay consistent under asynchronous re-entrancy (e.g. from the garbage collector or a signal handler).
- Use RLock and add more RLock-like behavior to prevent deadlocks in the re-entrant case.
- Update the deadlock detection algorithm to support the fact that one thread might be "blocked on" acquiring the module lock for more than one module at a time.
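To illustrate the last point, here is a minimal sketch of a deadlock check in which each thread can be blocked on several locks at once. It is not the code from this PR: FakeLock, would_deadlock, and the shape of the blocking_on table are hypothetical names chosen for the example, and thread ids are plain integers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class FakeLock:
    """Hypothetical stand-in for a module lock; it only records its owner."""
    name: str
    owner: Optional[int] = None  # id of the thread currently holding the lock


def would_deadlock(me: int, wanted: FakeLock,
                   blocking_on: Dict[int, List[FakeLock]]) -> bool:
    """Return True if thread `me` waiting on `wanted` would close a wait cycle."""
    if wanted.owner == me:
        # Re-entrant acquire by the owner: not a deadlock.
        return False
    seen: Set[int] = set()
    pending = [wanted.owner]
    while pending:
        tid = pending.pop()
        if tid is None or tid in seen:
            continue
        if tid == me:
            return True
        seen.add(tid)
        # A thread may be waiting on several locks, so follow every one.
        for lock in blocking_on.get(tid, []):
            pending.append(lock.owner)
    return False


a = FakeLock("a", owner=1)
b = FakeLock("b", owner=2)
c = FakeLock("c", owner=3)

# Thread 2 is blocked on two locks at once: c (held by 3) and a (held by 1).
blocking_on = {2: [c, a]}

cycle_found = would_deadlock(1, b, blocking_on)       # 1 -> b -> 2 -> a -> 1
no_cycle = would_deadlock(4, b, blocking_on)          # thread 4 is not in any cycle
missed_if_single = would_deadlock(1, b, {2: [c]})     # only one entry recorded
```

With only a single "blocked on" entry per thread (the last line), the cycle through lock a goes unnoticed, which is the situation the updated algorithm is meant to cover.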
I'm not quite sure I believe this new version of the code is 100% correct with respect to re-entrancy, but it does fix the mishandling of two specific cases:
- A re-entrant import is performed between the time _blocking_on is populated and cleaned up inside _ModuleLock.acquire. Previously this failed with a KeyError (as described in the linked issue).
- A re-entrant import is performed while the module lock is held inside _ModuleLock.acquire. Previously this failed by deadlocking.
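The first case can be modelled outside of importlib. The sketch below is an assumption about the general shape of the failure, not a copy of _bootstrap.py: acquire_with_dict and acquire_with_list are hypothetical helpers, and the re-entrant import is simulated deterministically with a reenter callback rather than by the garbage collector.

```python
import threading


def acquire_with_dict(blocking_on, lock_name, reenter=None):
    """One entry per thread: a nested call clobbers and deletes the outer entry."""
    tid = threading.get_ident()
    blocking_on[tid] = lock_name
    try:
        if reenter is not None:
            reenter()  # stands in for a re-entrant import
    finally:
        del blocking_on[tid]


def acquire_with_list(blocking_on, lock_name, reenter=None):
    """A list per thread: nested calls add and remove only their own entry."""
    tid = threading.get_ident()
    blocking_on.setdefault(tid, []).append(lock_name)
    try:
        if reenter is not None:
            reenter()
    finally:
        blocking_on[tid].remove(lock_name)


dict_table = {}
try:
    acquire_with_dict(dict_table, "outer",
                      lambda: acquire_with_dict(dict_table, "inner"))
    dict_failed = False
except KeyError:
    # The inner call already deleted the key the outer call tries to delete.
    dict_failed = True

list_table = {}
acquire_with_list(list_table, "outer",
                  lambda: acquire_with_list(list_table, "inner"))
list_leftover = list_table[threading.get_ident()]
```

In the single-entry version the outer cleanup raises KeyError, while the list-based version leaves an empty list behind and raises nothing.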
This PR also does not include any new unit tests. I have a small stand-alone program which can reproduce both of these but only with the assistance of some additional instrumentation inside _bootstrap.py to make sure the re-entrancy happens at the interesting times. If adding this kind of instrumentation is acceptable then it may be possible to turn this program into some unit tests.
Compared to the previous PR, this one is a bit simpler because it uses RLock in a place where main currently uses a regular Lock.
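As a reminder of why the lock type matters for re-entrancy, the following sketch compares the two using the public threading module (an assumption made for the example; _bootstrap.py itself works with lower-level primitives). can_reacquire is a hypothetical helper.

```python
import threading


def can_reacquire(lock) -> bool:
    """Check whether the thread holding `lock` can take it a second time."""
    with lock:
        # Non-blocking, so a plain Lock reports failure instead of hanging.
        got = lock.acquire(blocking=False)
        if got:
            lock.release()
        return got


plain_result = can_reacquire(threading.Lock())
rlock_result = can_reacquire(threading.RLock())
print(plain_result, rlock_result)
```

A plain Lock refuses the second acquisition from the same thread (a blocking acquire would hang forever), whereas an RLock allows it and simply counts the nesting depth.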
For reference, here is a stand-alone reproducer. This one very reliably reproduces the KeyError problem (by exercising the codepath forever until it hits it). I haven't managed to create a deterministic reproducer for the deadlock case, except with the assistance of instrumentation inside _bootstrap.py itself.
import sys, socket, gc

class Cycle:
    pass

def a_cycle():
    c = Cycle()
    c.cycle = c
    c.s = socket.socket()

def main():
    while True:
        # import a module that socket.__del__ is going to import to exercise
        # re-entrant _ModuleLock.lock handling
        a_cycle()
        import linecache
        del sys.modules["linecache"]

main()