TextIOWrapper.readline() is much faster (e.g. 72 ms vs 95 ms) than BufferedReader.readline(). This is because BufferedReader always acquires the file lock, whereas TextIOWrapper only acquires it when the buffer is empty. I would like BufferedReader.readline() to be as fast as TextIOWrapper.readline(), or faster!

Why are BufferedReader's attributes protected by a lock whereas TextIOWrapper's attributes are not? Does it mean that TextIOWrapper may crash if two threads call readline() (or different methods) at the "same time"? How do Python 2.x and 3.0 fix this issue?
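For anyone who wants to compare the two readline() implementations, here is a minimal benchmark sketch. It is not the script that produced the 72 ms / 95 ms figures above: the in-memory data, the line count, and the helper names (read_all_lines, bench_buffered, bench_text) are all assumptions made for illustration, and the timings will differ between machines and Python versions.

```python
import io
import timeit

# Hypothetical test data: short ASCII lines held in memory, so that disk
# speed does not influence the measurement.
NUM_LINES = 20000
DATA = b"".join(b"line %d\n" % i for i in range(NUM_LINES))


def read_all_lines(f):
    """Call readline() until EOF and return the number of lines read."""
    count = 0
    while f.readline():
        count += 1
    return count


def bench_buffered():
    # Bytes lines read straight from a BufferedReader.
    return read_all_lines(io.BufferedReader(io.BytesIO(DATA)))


def bench_text():
    # Text lines read through a TextIOWrapper on top of a BufferedReader.
    raw = io.BufferedReader(io.BytesIO(DATA))
    return read_all_lines(io.TextIOWrapper(raw, encoding="ascii"))


if __name__ == "__main__":
    for name, func in (("BufferedReader", bench_buffered),
                       ("TextIOWrapper", bench_text)):
        # Keep the best of 5 runs to reduce noise from other processes.
        best = min(timeit.repeat(func, number=1, repeat=5))
        print("%s.readline(): %.1f ms" % (name, best * 1000))
```

Both functions read the same bytes, so a difference in the timings reflects the cost of the readline() path itself (including any lock handling) rather than I/O.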
I wrote a short script to test TextIOWrapper.readline() with 32 threads. After 5 seconds, I found this issue in Python trunk (2.7):

Exception in thread Thread-26:
Traceback (most recent call last):
  File "/home/SHARE/SVN/python-trunk/Lib/threading.py", line 522, in __bootstrap_inner
    self.run()
  File "/home/haypo/crash_textiowrapper.py", line 15, in run
    line = self.file.readline()
  File "/home/SHARE/SVN/python-trunk/Lib/io.py", line 1835, in readline
    self._rewind_decoded_chars(len(line) - endpos)
  File "/home/SHARE/SVN/python-trunk/Lib/io.py", line 1541, in _rewind_decoded_chars
    raise AssertionError("rewind decoded_chars out of bounds")
AssertionError: rewind decoded_chars out of bounds

But it looks that py3k is stronger because it doesn't crash. Is it the power of the GIL?
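A stress test along these lines can be sketched as follows. This is not the original crash_textiowrapper.py: only the Thread subclass whose run() calls self.file.readline() is taken from the traceback. The in-memory data, the stress() helper, and its io_module parameter are assumptions, and the sketch reads to EOF instead of running for a fixed time.

```python
import io
import threading

NUM_THREADS = 32
# Hypothetical test data: 20,000 lines of 50 ASCII characters each.
DATA = (("x" * 50 + "\n") * 20000).encode("ascii")


class Reader(threading.Thread):
    """Read lines from a file object shared with the other threads."""

    def __init__(self, file, errors):
        super().__init__()
        self.file = file
        self.errors = errors

    def run(self):
        try:
            # Every thread calls readline() on the same object until EOF.
            while self.file.readline():
                pass
        except Exception as exc:
            # Record the failure instead of letting the thread die silently.
            self.errors.append(exc)


def stress(num_threads=NUM_THREADS, io_module=io):
    """Run the readers and return the list of exceptions they raised."""
    file = io_module.TextIOWrapper(io_module.BytesIO(DATA), encoding="ascii")
    errors = []
    threads = [Reader(file, errors) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return errors


if __name__ == "__main__":
    errors = stress()
    print("%d thread(s) raised an exception" % len(errors))
    for exc in errors:
        print(repr(exc))
```

On Python 3, passing the pure-Python _pyio module as io_module exercises code that is closer to the Lib/io.py shown in the traceback. Whether any exception actually appears depends on the Python version and on thread timing, so an empty result does not prove the code is thread-safe.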
> But it looks that py3k is stronger because it doesn't crash. Is it the
> power of the GIL?

Yes, it is. In theory, we needn't take the lock in all of BufferedReader.readline(), only when calling external code which might itself release the GIL. In practice, we didn't bother optimizing the lock-taking, for the sake of simplicity. If the lock really accounts for a significant part of the runtime cost, we can try to do better.