Issue 5502: io-c: TextIOWrapper is faster than BufferedReader but not protected by a lock

Created on 2009-03-18 01:16 by vstinner, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name               Uploaded                     Description
crash_textiowrapper.py  vstinner, 2009-03-18 01:59
speedup-bufio.patch     pitrou, 2009-04-06 22:16
Messages (5)
msg83724 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-18 01:16
TextIOWrapper.readline() is much faster (e.g. 72 ms vs 95 ms) than BufferedReader.readline(). That's because BufferedReader always acquires the file lock, whereas TextIOWrapper only acquires it when the buffer is empty. I would like BufferedReader.readline() to be as fast as TextIOWrapper.readline(), or faster!
Why are BufferedReader's attributes protected by a lock whereas TextIOWrapper's attributes are not? Does it mean that TextIOWrapper may crash if two threads call readline() (or different methods) at the "same time"? How do Python 2.x and 3.0 handle this issue?
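The comparison described above can be approximated with a small timing harness. This is only a sketch: the helper names `make_sample_file` and `time_readline`, the file contents, and the line count are assumptions made for illustration, not the benchmark that produced the 72 ms and 95 ms figures, and results will differ between Python versions.

```python
import os
import tempfile
import time


def make_sample_file(num_lines=20000):
    """Write a temporary ASCII file with `num_lines` short lines; return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        for i in range(num_lines):
            f.write(b"line %d of the sample file\n" % i)
    return path


def time_readline(path, mode, **kwargs):
    """Call readline() until EOF; return (number of lines, elapsed seconds).

    mode "rb" yields a BufferedReader, mode "r" yields a TextIOWrapper.
    """
    start = time.perf_counter()
    count = 0
    with open(path, mode, **kwargs) as f:
        while f.readline():
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        for label, mode, kwargs in [
            ("BufferedReader", "rb", {}),
            ("TextIOWrapper", "r", {"encoding": "ascii"}),
        ]:
            lines, elapsed = time_readline(sample, mode, **kwargs)
            print("%-15s %d lines in %.1f ms" % (label, lines, elapsed * 1000))
    finally:
        os.remove(sample)
```

A single run is noisy, so repeating the measurement several times and keeping the minimum gives a fairer comparison.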
msg83728 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-18 01:59
I wrote a short script to test TextIOWrapper.readline() with 32 threads. After 5 seconds, I found this issue in Python trunk (2.7):
Exception in thread Thread-26:
Traceback (most recent call last):
  File "/home/SHARE/SVN/python-trunk/Lib/threading.py", line 522, in __bootstrap_inner
    self.run()
  File "/home/haypo/crash_textiowrapper.py", line 15, in run
    line = self.file.readline()
  File "/home/SHARE/SVN/python-trunk/Lib/io.py", line 1835, in readline
    self._rewind_decoded_chars(len(line) - endpos)
  File "/home/SHARE/SVN/python-trunk/Lib/io.py", line 1541, in _rewind_decoded_chars
    raise AssertionError("rewind decoded_chars out of bounds")
AssertionError: rewind decoded_chars out of bounds
But it looks like py3k is stronger because it doesn't crash. Is it the power of the GIL?
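The attached crash_textiowrapper.py is not reproduced in this thread, so the following is only a guess at the general shape of such a test, written for Python 3. The names `make_test_file` and `stress_readline` and the `use_lock` switch are assumptions added for illustration. Several threads share one text file object and call readline() until EOF; with `use_lock=True` every call is serialized by an external lock, which is the conservative way to share a TextIOWrapper between threads.

```python
import os
import tempfile
import threading

NUM_THREADS = 32  # same thread count as mentioned in the message above


def make_test_file(num_lines):
    """Create a temporary ASCII file with `num_lines` lines; return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="ascii") as f:
        for i in range(num_lines):
            f.write("line %d\n" % i)
    return path


def stress_readline(path, num_threads=NUM_THREADS, use_lock=False):
    """Read `path` from many threads sharing a single text file object.

    Returns (total_lines_read, errors), where errors holds any exception
    raised inside a worker thread.
    """
    counts = [0] * num_threads  # one slot per thread, so no shared counter
    errors = []
    lock = threading.Lock()

    def worker(index, f):
        try:
            while True:
                if use_lock:
                    with lock:  # serialize access to the shared wrapper
                        line = f.readline()
                else:
                    line = f.readline()  # unsynchronized access
                if not line:
                    break
                counts[index] += 1
        except Exception as exc:  # record the failure for the caller
            errors.append(exc)

    with open(path, "r", encoding="ascii") as f:
        threads = [
            threading.Thread(target=worker, args=(i, f))
            for i in range(num_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return sum(counts), errors
```

Calling `stress_readline(path)` without the lock exercises the unsynchronized pattern discussed here; what happens then depends on the interpreter version. With `use_lock=True` each line is read exactly once.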
msg83739 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-03-18 10:35
> But it looks like py3k is stronger because it doesn't crash. Is it the
> power of the GIL?
Yes, it is. In theory, we needn't take the lock in all of BufferedReader.readline(), only when calling external code which might itself release the GIL. In practice, we didn't bother optimizing the lock-taking, for the sake of simplicity. If the lock really accounts for a significant part of the runtime cost, we can try to do better.
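The idea of taking the lock only around calls to external code can be illustrated with a toy reader. This is not the io module's implementation: `LineReader` and its attributes are hypothetical, and because pure Python code cannot rely on the GIL the way C code can, the sketch uses `collections.deque`, whose `popleft()` is thread-safe, to get a lock-free fast path. Only the refill, which calls the external `raw.read()`, runs under the lock.

```python
import io
import threading
from collections import deque


class LineReader:
    """Toy line reader: lock-free fast path, locked refill path."""

    def __init__(self, raw, chunk_size=4096):
        self._raw = raw                # any object with read(n) -> bytes
        self._chunk_size = chunk_size
        self._lines = deque()          # complete lines ready to hand out
        self._partial = b""            # incomplete tail; touched only under the lock
        self._eof = False              # touched only under the lock
        self._lock = threading.Lock()

    def readline(self):
        # Fast path: a buffered line is available, no lock needed.
        try:
            return self._lines.popleft()
        except IndexError:
            pass
        # Slow path: refilling calls external code, so serialize it.
        with self._lock:
            while True:
                try:
                    # Another thread may have refilled while we waited.
                    return self._lines.popleft()
                except IndexError:
                    pass
                if self._eof:
                    return b""
                chunk = self._raw.read(self._chunk_size)
                if not chunk:
                    self._eof = True
                    tail, self._partial = self._partial, b""
                    return tail  # last line without a newline, or b""
                # Split off complete lines; keep the unfinished tail.
                *complete, self._partial = (self._partial + chunk).split(b"\n")
                self._lines.extend(line + b"\n" for line in complete)


if __name__ == "__main__":
    reader = LineReader(io.BytesIO(b"first\nsecond\nthird"), chunk_size=4)
    for line in iter(reader.readline, b""):
        print(line)
```

The trade-off is the one described above: the fast path is cheaper, but the code has two paths to reason about instead of one.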
msg85674 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-04-06 22:16
Here is a patch which provides a significant speedup (up to 30%) on small operations (small reads, iteration) on binary files. Please test.
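One way to test a patch like this is to time the kinds of operations it targets before and after applying it. The sketch below is an assumption about how that could be done, not the procedure used in this issue: `bench_small_ops` is a hypothetical helper, and it wraps an in-memory `io.BytesIO` in a `BufferedReader` instead of opening a real binary file, which removes disk effects but also differs from real file I/O.

```python
import io
import timeit


def bench_small_ops(payload_size=1 << 16, repeat=3):
    """Time one-byte reads and line iteration on a BufferedReader.

    Returns a dict mapping operation name to the best time in seconds.
    """
    line = b"x" * 63 + b"\n"                     # 64-byte lines
    payload = line * (payload_size // len(line))

    def small_reads():
        f = io.BufferedReader(io.BytesIO(payload))
        while f.read(1):                         # one byte per call
            pass

    def iteration():
        f = io.BufferedReader(io.BytesIO(payload))
        for _ in f:                              # one line per step
            pass

    return {
        "read(1)": min(timeit.repeat(small_reads, number=1, repeat=repeat)),
        "iteration": min(timeit.repeat(iteration, number=1, repeat=repeat)),
    }


if __name__ == "__main__":
    for name, seconds in bench_small_ops().items():
        print("%-10s %.2f ms" % (name, seconds * 1000))
```

Running the same script with the unpatched and the patched interpreter gives two sets of numbers to compare.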
msg85865 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-04-11 15:39
Committed in r71483.
History
Date User Action Args
2022-04-11 14:56:46 admin set github: 49752
2009-04-11 15:39:45 pitrou set status: open -> closed; resolution: fixed; messages: +
2009-04-06 22:16:56 pitrou set files: + speedup-bufio.patch; keywords: + patch; messages: +
2009-03-22 13:46:09 pitrou set priority: normal; assignee: pitrou; type: performance; stage: needs patch
2009-03-18 10:35:38 pitrou set messages: +
2009-03-18 01:59:31 vstinner set files: + crash_textiowrapper.py; messages: +
2009-03-18 01:16:11 vstinner create