msg274005 - (view) |
Author: Dima Tisnek (Dima.Tisnek) * |
Date: 2016-08-31 07:27 |
To reproduce: ``` import threading import time class U(threading.Thread): def run(self): time.sleep(1) if not x.ident: x.start() x = U() for u in [U() for i in range(10000)]: u.start() time.sleep(10) ``` Chance to reproduce ~20% in my setup. This script has a data race (check then act on x.ident). I expected it to occasionally hit `RuntimeError: threads can only be started once` Instead, I get: ``` Unhandled exception in thread started by <bound method Thread._bootstrap of <U(Thread-1, started 139798116361984)>> Traceback (most recent call last): File "/usr/lib64/python3.5/threading.py", line 882, in _bootstrap self._bootstrap_inner() File "/usr/lib64/python3.5/threading.py", line 906, in _bootstrap_inner del _limbo[self] KeyError: <U(Thread-1, started 139798116361984)> ``` |
|
|
msg276325 - (view) |
Author: Maciej Urbański (rooter) * |
Date: 2016-09-13 18:58 |
Successfully reproduced on 2.7.12 and 3.5.2 . Currently there seems to be no protection against starting the same thread twice at the same time. What was checked was only if start operation already finished once. Attached patch makes it so limbo, our starting threads' waiting room, is checked first. |
|
|
msg276331 - (view) |
Author: Dima Tisnek (Dima.Tisnek) * |
Date: 2016-09-13 19:24 |
@rooter, if you go this way, you should also `self._started.set()` with lock held, together with removing thread from `_limbo` |
|
|
msg276333 - (view) |
Author: Maciej Urbański (rooter) * |
Date: 2016-09-13 19:40 |
@Dima.Tisnek, only reason for having both of these conditions together is so I won't have to repeat the same error message in the code at little price of the performance in this edge case (trying to start the same thread multiple times). Unless I'm missing something there should be no way how it would make this lock required for setting `self._started`. |
|
|
msg276337 - (view) |
Author: Dima Tisnek (Dima.Tisnek) * |
Date: 2016-09-13 20:31 |
Your logic is accurate; _started is in fact separate from _limbo. As such taking a lock for "test-then-set" only would suffice. Now when you bring the other primitive under this lock in one place, it would look cleaner if it was also brought in the other. There's one more issue with proposed change: Before the change, if "already started" exception is ever raised for a given Thread object, then it's guaranteed that that object was started successfully. With the change, it is possible that exception is raised, and thread fails to start, leaving the object in initial state. If it were up to me, I would buy this limitation as price of safety. Someone may disagree. |
|
|
msg276605 - (view) |
Author: Maciej Urbański (rooter) * |
Date: 2016-09-15 19:35 |
To address @Dima.Tisnek concern I have changed exception message in case thread start process is merely in progress. I kept `self._started` check under a lock so we can avoid more extreme race condition of one thread checking `self._started` right before another sets it and exits the limbo. As for testing `self._started` under a lock, but setting it without one. I'm avoiding it only because of performance reasons. The read is super cheap, while notification of `.set()` is more complex, so if aesthetics are only reasons for doing it there then I would advise against holding that lock while executing it. Of course I could also do a `self in _active` check under a lock, but that is slightly more costly, than `self._started` check and not any more useful. I may be prematurely optimizing things, but people tend to like they threads to startup ASAP. |
|
|