bpo-30228: TextIOWrapper uses abs_pos, not tell() by vstinner · Pull Request #1385 · python/cpython (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation11 Commits1 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
The TextIOWrapper constructor now gets directly the abs_pos attribute
of BufferedWriter and BufferedRandom instead of calling the tell()
method to avoid one lseek() syscall on open(fname, "w") and
open(fname, "w+").
Move the buffered structure to _iomodule.h and rename it to
_PyIO_buffered. Add also "pythread.h" to _iomodule.h, needed by
_PyIO_buffered lock.
https://bugs.python.org/issue30228
The TextIOWrapper constructor now gets directly the abs_pos attribute of BufferedWriter and BufferedRandom instead of calling the tell()
This change means that TextIOWrapper becomes inconsistent if the file descriptor is moved direcly using os.lseek()... but BufferedReader/BufferedWriter don't detect neither when the file descriptor is moved directly, no? I mean, abs_pos cached attribute already has the bug, no?
@pitrou: Would you mind to review this one? Does it look like an acceptable optimization?
@serhiy-storchaka: Same questions. Would you mind to review this one? Does it look like an acceptable optimization?
@Haypo: this looks like an acceptable optimization, but the question is whether it brings any significant speedup.
/* Fast-path for _io.BufferedWriter and _io.BufferedRandom: |
---|
read directly the private abs_pos attribute */ |
_PyIO_buffered *bw = (_PyIO_buffered *)buffer; |
/* _buffered_init() sets abs_pos (if the file is seekable) */ |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if it's not seekable?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only go into this path if self->seekable is set.
goto error; |
---|
if (Py_TYPE(buffer) == &PyBufferedWriter_Type |
| |
/* Fast-path for _io.BufferedWriter and _io.BufferedRandom: |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about BufferedReader?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only go into this path if self->encoder is set. self->encoder is only set if the buffer is writable.
@pitrou: I replied to your comments. So, what do you think? Is my PR safe?
As I said above : this looks like an acceptable optimization, but the question is whether it brings any significant speedup :-)
Of course I'm not against making optimizations in buffered I/O. I also know that buffering can be tricky (being responsible for the data corruption issue in Python 3.2 made me cautious about this!). But if this can increase performance significantly then ok (and you'll bear the responsability of any regression ;-)).
The TextIOWrapper constructor now gets directly the private abs_pos attribute of BufferedWriter and BufferedRandom instead of calling the tell() method to avoid one lseek() syscall on open(fname, "w") and open(fname, "w+").
Move the buffered structure to _iomodule.h and rename it to _PyIO_buffered. Add also "pythread.h" to _iomodule.h, needed by _PyIO_buffered lock.