Deprecating codecs.open?
codecs.open was a way to open text files that worked in Python 2, but with the introduction of io.open, its significance has greatly diminished.
Now, the difference between codecs.open and TextIOWrapper is a source of paper cuts.
It seems we do not have enough resources to maintain the codecs module.
What do you think about deprecating codecs.open, StreamReader and StreamWriter?
Do they have any use cases that cannot be replaced with open and io.TextIOWrapper?
Since replacements have been available for a long time, I think 3 years of deprecation in the documentation plus 3 years of DeprecationWarning is enough before removing them.
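As a rough illustration of the kind of paper cut mentioned above, here is a minimal sketch comparing the two APIs on newline handling. codecs.open is documented to always open the underlying file in binary mode, so no newline translation happens, while the built-in open translates newlines in text mode by default. The helper name read_both and the sample file name are made up for this example, and the warning filter is only a precaution in case the running Python version already warns about codecs.open.

```python
import codecs
import os
import tempfile
import warnings


def read_both(data: bytes, encoding: str = "utf-8"):
    """Write `data` to a temporary file and read it back three ways.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        with open(path, "wb") as f:
            f.write(data)

        # codecs.open: binary mode underneath, no newline translation.
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", DeprecationWarning)
            with codecs.open(path, "r", encoding=encoding) as f:
                via_codecs = f.read()

        # Built-in open: universal newlines by default in text mode.
        with open(path, "r", encoding=encoding) as f:
            via_open = f.read()

        # Built-in open with newline="": line endings are left untouched.
        with open(path, "r", encoding=encoding, newline="") as f:
            via_open_raw = f.read()

    return via_codecs, via_open, via_open_raw


via_codecs, via_open, via_open_raw = read_both(b"a\r\nb\r\n")
```

Under these assumptions, passing newline="" to open appears to be the closest match when migrating code that relied on codecs.open leaving line endings alone.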
hugovk (Hugo van Kemenade) April 13, 2025, 11:58am 2
Searching the top 15k PyPI projects (downloaded today) for codecs\.open:
- Found 1,567 matching lines in 613 projects
\bStreamReader\b:
- Found 2,959 matching lines in 247 projects
\bStreamWriter\b:
- Found 2,565 matching lines in 185 projects
I can share the details if wanted, it’s too big for Discord.
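For readers who want to run a similar count on their own code, here is a small sketch using the three regular expressions above. It is not the tool that produced the numbers in this post; the function name count_matches and the in-memory input format (project name mapped to a list of file contents) are assumptions made for the example. Note that a bare name pattern such as \bStreamReader\b also matches unrelated classes with the same name, for example asyncio.StreamReader.

```python
import re

# The three patterns quoted in the post.
PATTERNS = {
    "codecs.open": re.compile(r"codecs\.open"),
    "StreamReader": re.compile(r"\bStreamReader\b"),
    "StreamWriter": re.compile(r"\bStreamWriter\b"),
}


def count_matches(projects):
    """Count matching lines and matching projects per pattern.

    `projects` maps a project name to a list of source-file contents
    (hypothetical input format, chosen to keep the sketch self-contained).
    """
    totals = {name: {"lines": 0, "projects": 0} for name in PATTERNS}
    for files in projects.values():
        for name, pattern in PATTERNS.items():
            hits = sum(
                1
                for text in files
                for line in text.splitlines()
                if pattern.search(line)
            )
            totals[name]["lines"] += hits
            if hits:
                totals[name]["projects"] += 1
    return totals


sample = {
    "a": ["import codecs\nf = codecs.open('x')\n", "class StreamReader: pass"],
    "b": ["reader = asyncio.StreamReader()\nMyStreamReaderX = 1\n"],
    "c": ["print('hi')"],
}
totals = count_matches(sample)
```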
methane (Inada Naoki) April 14, 2025, 6:47am 3
Most of the StreamReader and StreamWriter matches are from asyncio or aiohttp.
Anyway, it is difficult to remove codecs.open, and StreamReader/StreamWriter cannot be deprecated unless codecs.open is deprecated.
How about deprecating codecs.open without a removal schedule?
As with the alias methods in TestCase, we need a very long deprecation period. Maybe 10+ years.
malemburg (Marc-André Lemburg) April 14, 2025, 8:16am 4
I think you are forgetting that StreamReader/Writer play a central part in the whole codecs sub-system, so eventually removing them would require a lot more redesign work for the sub-system to continue working (both are part of what defines a codec in Python; see CodecInfo).
You’d essentially have to replace the StreamReader/Writer logic which the codec sub-system uses with io stack classes, and only after investigating whether this is easily possible. They look fairly similar, but their method signatures are different, TextIOWrapper does not separate reading and writing, and the semantics are different as well.
So overall, I think the idea of simply deprecating the two base classes is premature at this point.
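To make the "separate reading and writing" point concrete, here is a minimal sketch that round-trips a string through the stream classes exposed on CodecInfo and then through a single TextIOWrapper. The variable names and the sample string are arbitrary; this only illustrates the shape of the two APIs, not their full semantic differences.

```python
import codecs
import io

info = codecs.lookup("utf-8")  # returns a CodecInfo

# Writing uses one class: the codec's StreamWriter wraps a byte stream.
raw = io.BytesIO()
writer = info.streamwriter(raw)
writer.write("h\u00e9llo\n")
encoded = raw.getvalue()

# Reading uses a different class: the codec's StreamReader.
reader = info.streamreader(io.BytesIO(encoded))
decoded = reader.read()

# io.TextIOWrapper: a single class handles both directions.
text = io.TextIOWrapper(io.BytesIO(), encoding="utf-8", newline="")
text.write("h\u00e9llo\n")
text.flush()
text.seek(0)
roundtrip = text.read()
```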
methane (Inada Naoki) April 14, 2025, 8:49am 5
Now I am proposing deprecating only codecs.open. I updated the thread title.
vstinner (Victor Stinner) April 14, 2025, 10:01am 6
vstinner (Victor Stinner) April 14, 2025, 10:37am 7
CodecInfo.streamreader and CodecInfo.streamwriter are only used by codecs.open(). Would you mind elaborating on what you mean by “play a central part in the whole codecs sub-system”?
It’s possible to emit a DeprecationWarning in the StreamReader and StreamWriter constructors without breaking Python. It remains possible to define sub-classes (without emitting DeprecationWarning), which is needed to define codecs such as UTF-8 (Lib/encodings/utf_8.py):
import codecs

class StreamWriter(codecs.StreamWriter):
    encode = codecs.utf_8_encode

class StreamReader(codecs.StreamReader):
    decode = codecs.utf_8_decode
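The idea of warning in the constructor while leaving subclass definitions silent can be sketched as follows. This uses hypothetical stand-in classes (BaseStreamReader, Utf8StreamReader) rather than the real codecs classes, and the warning message is made up; it only shows that defining a subclass does not run the base constructor, while instantiating it does.

```python
import warnings


class BaseStreamReader:
    """Hypothetical stand-in for codecs.StreamReader."""

    def __init__(self, stream, errors="strict"):
        # The warning is emitted only when an instance is created.
        warnings.warn(
            "BaseStreamReader is deprecated (illustrative message)",
            DeprecationWarning,
            stacklevel=2,
        )
        self.stream = stream
        self.errors = errors


# Defining a subclass does not call __init__, so nothing is emitted.
with warnings.catch_warnings(record=True) as on_define:
    warnings.simplefilter("always")

    class Utf8StreamReader(BaseStreamReader):
        pass


# Instantiating the subclass runs the inherited constructor and warns.
with warnings.catch_warnings(record=True) as on_instantiate:
    warnings.simplefilter("always")
    Utf8StreamReader(stream=None)
```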
malemburg (Marc-André Lemburg) April 14, 2025, 6:58pm 8
Every single codec in Python exposes subclasses of these two base classes via the CodecInfo returned by the codec search function. PEP 100 has the details.
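A quick way to check this for a handful of built-in codecs is sketched below. The selection of encodings is arbitrary and the helper name stream_classes is hypothetical; codecs.lookup and the streamreader/streamwriter attributes of CodecInfo are part of the standard library.

```python
import codecs


def stream_classes(encoding):
    """Return the (streamreader, streamwriter) classes for an encoding.

    Hypothetical helper; it only reads attributes off the CodecInfo
    object returned by codecs.lookup().
    """
    info = codecs.lookup(encoding)
    return info.streamreader, info.streamwriter


# Arbitrary sample of text encodings shipped with CPython.
checked = {}
for name in ("utf-8", "latin-1", "cp1252", "utf-16"):
    reader_cls, writer_cls = stream_classes(name)
    checked[name] = (
        issubclass(reader_cls, codecs.StreamReader),
        issubclass(writer_cls, codecs.StreamWriter),
    )
```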
+1 on soft deprecating codecs.open(), without actually removing the function for a longer while. The standard open() is the better choice these days.
Still, the function is in widespread use, so it’ll take a while longer to convince people to reconsider their choice. It is still needed by projects wanting to maintain Python 2 compatibility.
Note that deprecating the function will not allow deprecating StreamReader/Writer as a result, unless there’s a working migration path forward to e.g. use new base classes around the io stack for the codecs. These could be added as additional fields in CodecInfo and have codecs slowly migrate over with a longer deprecation period.