Issue 36111: Non-zero `offset`s are no longer acceptable with SEEK_END/SEEK_CUR implementation of `seek` in python3 when in text mode, breaking py 2.x behavior/POSIX
Created on 2019-02-25 22:12 by ngie, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Messages (6)
Author: Enji Cooper (ngie) *
Date: 2019-02-25 22:12
I tried using os.SEEK_END in a technical interview, but unfortunately, that didn't work with python 3.x:
pinklady:cpython ngie$ python3
Python 3.7.2 (default, Feb 12 2019, 08:15:36)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> fp = open("configure"); fp.seek(-100, os.SEEK_END)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
io.UnsupportedOperation: can't do nonzero end-relative seeks
It does however work with 2.x, which is aligned with the POSIX spec implementation, as shown below:
pinklady:cpython ngie$ python
Python 2.7.15 (default, Oct 2 2018, 11:47:18)
[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> fp = open("configure"); fp.seek(-100, os.SEEK_END)
>>> fp.tell()
501076
>>> os.stat("configure").st_size
501176
Author: Steven D'Aprano (steven.daprano) *
Date: 2019-02-25 22:33
I believe you will find that this is because you opened the file in text mode, which means Unicode, not bytes. If you open it in binary mode, the POSIX spec applies:
py> fp = open("sample", "rb"); fp.seek(-100, os.SEEK_END)
350
Supported values for seeking in text (Unicode) files are documented here:
https://docs.python.org/3/library/io.html#io.TextIOBase.seek
I don't believe this is a bug, or possible to be changed. Do you still think otherwise? If not, we should close this ticket.
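To make the distinction above concrete, here is a minimal sketch that tries the same end-relative seek in both modes. The throwaway file and its 450-byte contents are assumptions made up for the example; the seek calls themselves are the ones discussed in this thread and in the linked io documentation.

```python
import io
import os
import tempfile

# Hypothetical sample data; any file with at least 100 bytes would do.
path = os.path.join(tempfile.mkdtemp(), "sample")
with open(path, "wb") as out:
    out.write(b"x" * 450)

# Binary mode: offsets are byte counts, so end-relative seeks are accepted.
with open(path, "rb") as fp:
    binary_pos = fp.seek(-100, os.SEEK_END)

# Text mode: a non-zero end-relative seek raises io.UnsupportedOperation.
text_error = None
with open(path, "r") as fp:
    try:
        fp.seek(-100, os.SEEK_END)
    except io.UnsupportedOperation as exc:
        text_error = exc
    # Zero offsets, and positions previously returned by tell(), still work.
    fp.seek(0, os.SEEK_END)
    cookie = fp.tell()
    fp.seek(0)
    fp.seek(cookie)
    at_end = fp.read()  # nothing left to read at the end of the file
```

In text mode the value returned by tell() should be treated as an opaque number that is only meaningful when passed back to seek().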
Author: Enji Cooper (ngie) *
Date: 2019-02-25 22:42
?!
Being blunt: why should opening a file in binary vs text mode matter? POSIX doesn't make this distinction.
Per the pydoc (https://docs.python.org/2/library/functions.html#open):
The default is to use text mode, which may convert '\n' characters to a platform-specific representation on writing and back on reading.
If this is one of the only differentiators between binary and text mode, why should certain types of seeking be made impossible?
Having to stat the file, then set the cursor to the size of the file, minus the offset breaks the 'seek(..)' interface, and having to use 'rb', then convert from bytes to unicode overly complicates things :(.
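For reference, the binary-mode route described above might look roughly like the following sketch. The helper name `tail_text`, its parameters, and the sample file contents are hypothetical and not taken from any library. It assumes the caller knows the file's encoding, and it accepts that the cut point may land in the middle of a multi-byte character.

```python
import os
import tempfile

def tail_text(path, nbytes, encoding="utf-8"):
    """Return roughly the last `nbytes` bytes of `path`, decoded as text.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as fp:
        # seek() returns the new absolute position, i.e. the file size here.
        size = fp.seek(0, os.SEEK_END)
        fp.seek(max(size - nbytes, 0))
        data = fp.read()
    # errors="replace" because the cut may fall inside a multi-byte character.
    return data.decode(encoding, errors="replace")

# Made-up sample file: three lines, 29 bytes in total.
path = os.path.join(tempfile.mkdtemp(), "configure")
with open(path, "wb") as out:
    out.write(b"line one\nline two\nline three\n")

last = tail_text(path, 11)
whole = tail_text(path, 1000)  # asking for more than the file holds
```

Unlike text mode, this does not translate platform line endings, so a file written with "\r\n" would keep them.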
Author: Enji Cooper (ngie) *
Date: 2019-02-26 00:29
Opening and seeking using SEEK_END worked in text mode with python 2.7. I'm not terribly sure why 3.x should depart from this behavior:
>>> fp = open("configure", "rt"); fp.seek(-100, os.SEEK_END)
>>> fp.tell()
501076
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2019-02-26 05:26
This is unrelated to POSIX, since POSIX says nothing about Unicode files. "Text mode" in POSIX means binary files with converted newlines. This mode is not supported in Python 3.
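One way to see why a byte offset does not carry over to a Unicode text stream is to compare character and byte counts for a string with a non-ASCII character. This is only an illustrative sketch; the sample string and the choice of UTF-8 are assumptions made for the example.

```python
# "h\u00e9llo" is "héllo"; the 'é' takes two bytes in UTF-8.
text = "h\u00e9llo"
data = text.encode("utf-8")

n_chars = len(text)   # number of code points
n_bytes = len(data)   # number of encoded bytes

# A byte offset that falls inside 'é' cannot be decoded on its own.
try:
    data[2:].decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False
```

Here byte offset 2 lands on the second byte of 'é', so a seek expressed in bytes can leave a decoder in the middle of a character.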
Author: Inada Naoki (methane) *
Date: 2019-02-26 06:32
If you want byte IO, you can use "rb" mode. You can seek on it.
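If text is still wanted after a byte-level seek, one possible approach is to seek on the binary file first and then wrap it in io.TextIOWrapper. This is a sketch under assumptions: the file name, its contents, and the ASCII encoding are invented for the example, and the byte offset is chosen so that it falls on a line boundary.

```python
import io
import os
import tempfile

# Hypothetical sample file: three lines, 17 bytes in total.
path = os.path.join(tempfile.mkdtemp(), "log.txt")
with open(path, "wb") as out:
    out.write(b"alpha\nbeta\ngamma\n")

with open(path, "rb") as raw:
    # Byte offsets relative to the end are allowed in binary mode.
    raw.seek(-6, os.SEEK_END)
    # Decode from the current position onward as text.
    reader = io.TextIOWrapper(raw, encoding="ascii")
    lines = reader.readlines()
```

With a multi-byte encoding the offset would have to land on a character boundary, which the caller has to guarantee.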
History
Date | User | Action | Args
2022-04-11 14:59:11 | admin | set | github: 80292
2019-02-26 06:32:35 | methane | set | nosy: + methane; messages: +
2019-02-26 06:31:08 | methane | set | status: open -> closed; resolution: not a bug; stage: resolved
2019-02-26 05:26:43 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: +
2019-02-26 03:43:53 | ngie | set | title: Non-zero `offset`s are no longer acceptable with implementation of `seek` in some cases with python3 when in text mode; should be per POSIX -> Non-zero `offset`s are no longer acceptable with SEEK_END/SEEK_CUR implementation of `seek` in python3 when in text mode, breaking py 2.x behavior/POSIX
2019-02-26 03:43:02 | ngie | set | title: Negative `offset` values are no longer acceptable with implementation of `seek` with python3 when in text mode; should be per POSIX -> Non-zero `offset`s are no longer acceptable with implementation of `seek` in some cases with python3 when in text mode; should be per POSIX
2019-02-26 00:30:14 | ngie | set | versions: + Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8
2019-02-26 00:29:37 | ngie | set | messages: +
2019-02-25 22:44:15 | ngie | set | title: Negative `offset` values are no longer acceptable with implementation of `seek` with python3; should be per POSIX -> Negative `offset` values are no longer acceptable with implementation of `seek` with python3 when in text mode; should be per POSIX
2019-02-25 22:42:10 | ngie | set | messages: +
2019-02-25 22:33:23 | steven.daprano | set | nosy: + steven.daprano; messages: +
2019-02-25 22:12:02 | ngie | create |