Issue 5502: io-c: TextIOWrapper is faster than BufferedReader but not protected by a lock

Created on 2009-03-18 01:16 by vstinner, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description
crash_textiowrapper.py	vstinner, 2009-03-18 01:59
speedup-bufio.patch	pitrou, 2009-04-06 22:16

Messages (5)
msg83724	Author: STINNER Victor (vstinner) *	Date: 2009-03-18 01:16
TextIOWrapper.readline() is much faster (e.g. 72 ms vs. 95 ms) than BufferedReader.readline(). This is because BufferedReader always acquires the file lock, whereas TextIOWrapper only acquires the file lock when the buffer is empty. I would like BufferedReader.readline() to be as fast as TextIOWrapper.readline(), or faster!

Why are BufferedReader's attributes protected by a lock whereas TextIOWrapper's attributes are not? Does it mean that TextIOWrapper may crash if two threads call readline() (or different methods) at the "same time"? How do Python 2.x and 3.0 fix this issue?
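A minimal sketch of how such a comparison might be timed. The script that produced the 72 ms and 95 ms figures is not part of this report, so the in-memory test data, the function names and the repeat counts below are all assumptions, and timings on a current interpreter will not match the ones quoted above.

```python
import io
import timeit

# Hypothetical test data: 10,000 short ASCII lines held in memory.
DATA = b"".join(b"line %d\n" % i for i in range(10_000))


def count_lines_binary():
    """Read every line through BufferedReader.readline()."""
    f = io.BufferedReader(io.BytesIO(DATA))
    n = 0
    while f.readline():
        n += 1
    return n


def count_lines_text():
    """Read every line through TextIOWrapper.readline()."""
    f = io.TextIOWrapper(io.BufferedReader(io.BytesIO(DATA)), encoding="ascii")
    n = 0
    while f.readline():
        n += 1
    return n


if __name__ == "__main__":
    for func in (count_lines_binary, count_lines_text):
        # Take the best of 5 runs of 10 passes to reduce scheduling noise.
        best = min(timeit.repeat(func, number=10, repeat=5))
        print(f"{func.__name__}: {best * 1000:.1f} ms for 10 passes")
```

Reading from memory rather than from disk keeps I/O latency out of the measurement, so what remains is mostly per-call overhead in readline() itself.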
msg83728	Author: STINNER Victor (vstinner) *	Date: 2009-03-18 01:59
I wrote a short script to test TextIOWrapper.readline() with 32 threads. After 5 seconds, I found this issue in Python trunk (2.7):

Exception in thread Thread-26:
Traceback (most recent call last):
  File "/home/SHARE/SVN/python-trunk/Lib/threading.py", line 522, in __bootstrap_inner
    self.run()
  File "/home/haypo/crash_textiowrapper.py", line 15, in run
    line = self.file.readline()
  File "/home/SHARE/SVN/python-trunk/Lib/io.py", line 1835, in readline
    self._rewind_decoded_chars(len(line) - endpos)
  File "/home/SHARE/SVN/python-trunk/Lib/io.py", line 1541, in _rewind_decoded_chars
    raise AssertionError("rewind decoded_chars out of bounds")
AssertionError: rewind decoded_chars out of bounds

But it looks that py3k is stronger because it doesn't crash. Is it the power of the GIL?
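The attached crash_textiowrapper.py is not reproduced on this page. The following is only a sketch of what a stress test of this kind could look like, written for Python 3; the helper name read_all_lines, the in-memory data and the line count are assumptions. It is demonstrated on a BufferedReader, whose readline() is protected by a lock, so every line should come back exactly once.

```python
import io
import threading

NUM_THREADS = 32  # same thread count as in the report


def read_all_lines(fileobj, num_threads=NUM_THREADS):
    """Call fileobj.readline() from many threads until EOF.

    Returns (lines, errors): every line handed to any thread, and every
    exception raised inside a worker thread.
    """
    lines = []
    errors = []

    def worker():
        try:
            while True:
                line = fileobj.readline()
                if not line:
                    break
                lines.append(line)  # list.append is thread-safe in CPython
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return lines, errors


if __name__ == "__main__":
    data = b"".join(b"line %d\n" % i for i in range(50_000))
    shared = io.BufferedReader(io.BytesIO(data))
    lines, errors = read_all_lines(shared)
    print(len(lines), "lines read,", len(errors), "errors")
```

Passing an io.TextIOWrapper instead of the BufferedReader approximates the case described in this message. Whether lines are lost, duplicated or an exception is raised then depends on the interpreter version and on thread scheduling, so no particular outcome is claimed here.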
msg83739	Author: Antoine Pitrou (pitrou) *	Date: 2009-03-18 10:35
> But it looks that py3k is stronger because it doesn't crash. Is it the
> power of the GIL?

Yes, it is. In theory, we needn't take the lock in all of BufferedReader.readline(), only when calling external code which might itself release the GIL. In practice, we didn't bother optimizing the lock-taking, for the sake of simplicity. If the lock really accounts for a significant part of the runtime cost, we can try to do better.
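One rough way to get a feel for what an uncontended lock costs per call is sketched below. It measures a Python-level threading.Lock, not the C-level lock used inside the io module, so it is only a loose proxy for the overhead discussed here; the function names and iteration counts are assumptions.

```python
import threading
import timeit

lock = threading.Lock()


def with_lock():
    # Acquire and release an uncontended lock around an empty body.
    with lock:
        pass


def without_lock():
    # Baseline: the same call overhead without any locking.
    pass


def per_call_overhead(number=1_000_000):
    """Estimate the seconds per call attributable to the lock."""
    locked = min(timeit.repeat(with_lock, number=number, repeat=5))
    plain = min(timeit.repeat(without_lock, number=number, repeat=5))
    return (locked - plain) / number


if __name__ == "__main__":
    print(f"approx. {per_call_overhead() * 1e9:.0f} ns per acquire/release")
```

Comparing this figure with the per-call time of readline() on small lines gives an idea of whether lock-taking could plausibly account for a noticeable share of the runtime.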
msg85674	Author: Antoine Pitrou (pitrou) *	Date: 2009-04-06 22:16
Here is a patch which provides a significant speedup (up to 30%) on small operations (small reads, iteration) on binary files. Please test.
msg85865	Author: Antoine Pitrou (pitrou) *	Date: 2009-04-11 15:39
Committed in r71483.

History
Date	User	Action	Args
2022-04-11 14:56:46	admin	set	github: 49752
2009-04-11 15:39:45	pitrou	set	status: open -> closed; resolution: fixed; messages: +
2009-04-06 22:16:56	pitrou	set	files: + speedup-bufio.patch; keywords: + patch; messages: +
2009-03-22 13:46:09	pitrou	set	priority: normal; assignee: pitrou; type: performance; stage: needs patch
2009-03-18 10:35:38	pitrou	set	messages: +
2009-03-18 01:59:31	vstinner	set	files: + crash_textiowrapper.py; messages: +
2009-03-18 01:16:11	vstinner	create