Issue 20413: Errors in documentation of standard codec error handlers
Created on 2014-01-27 20:41 by RalfM, last changed 2022-04-11 14:57 by admin.
Messages (5)
Author: (RalfM)
Date: 2014-01-27 20:41
The standard library documentation lists the standard codec error handlers in three places:
(a) 2. Built-in Functions, section open()
(b) 7.2 codecs - Codec registry and base classes
(c) 7.2.1 Codec Base Classes
As far as I can judge these lists, (c) looks ok, but (a) and (b) contain two errors:
- 'surrogatepass' is not mentioned.
- 'surrogateescape' is described as: 'on decoding, replace with code points in the Unicode Private Use Area ranging from U+DC80 to U+DCFF. These private code points will ...' This is incorrect insofar as U+DC80 to U+DCFF are not private-use code points but (low) surrogate code points. This is correctly explained in (c) and in PEP 383 (and, of course, in the Unicode standard, chapter 16).
I suggest correcting (a) and (b) by:
- adding 'surrogatepass' with the description given in (c),
- changing the description of 'surrogateescape' to something like: 'on decoding, replace with surrogate code points ranging from U+DC80 to U+DCFF. These surrogate code points will ...'.
These errors are present in the documentation (more precisely, the .chm files) of at least
- Python 3.3.3
- Python 3.3.4rc1
- Python 3.4.0b3.
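For illustration, the distinction can be checked in an interpreter. The following is a minimal sketch and not part of the original report: it assumes CPython 3.3 or later, and the sample bytes are arbitrary. It decodes invalid UTF-8 with 'surrogateescape', uses unicodedata to confirm that the resulting code points have general category Cs (surrogate) rather than Co (private use), and shows 'surrogatepass' encoding a lone surrogate that the default 'strict' handler rejects.

```python
import unicodedata

# Arbitrary sample: 0xFF and 0x80 are not valid UTF-8 on their own.
raw = b"abc\xff\x80"

# 'surrogateescape' maps each undecodable byte 0xNN to U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")
escaped = [ch for ch in text if 0xDC80 <= ord(ch) <= 0xDCFF]

for ch in escaped:
    # Category "Cs" means surrogate; private-use code points would be "Co".
    print("U+{:04X}".format(ord(ch)), unicodedata.category(ch))

# Encoding with the same handler restores the original bytes.
round_tripped = text.encode("utf-8", errors="surrogateescape")

# 'surrogatepass' lets a lone surrogate through the UTF-8 codec,
# which the default 'strict' handler refuses to encode.
lone = "\ud800"
try:
    lone.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

passed = lone.encode("utf-8", errors="surrogatepass")
```

Here `raw`, `escaped`, `lone`, and `passed` are names chosen for this sketch only.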
Author: Alyssa Coghlan (ncoghlan) *
Date: 2014-01-28 05:43
I plan to take a look at the codec docs in general in the next week or so, so I'll tackle this as well.
Author: Matthew Barnett (mrabarnett) *
Date: 2015-02-06 20:40
The docs for Python 3.5.0a0 still say "Unicode Private Use Area".
Author: Martin Panter (martin.panter) *
Date: 2015-02-06 21:44
I changed “code point in the Unicode Private Use Area” to “individual surrogate code” in the “codecs” module documentation for Issue 19548. So perhaps (a) still needs addressing, but (b) and (c) are hopefully already fixed.
Author: Alyssa Coghlan (ncoghlan) *
Date: 2015-02-07 00:01
Ah, February 2014, many of my plans went in rather different directions than expected that month, and this was one of them :)
As Martin noted, he already fixed (b) and (c), but we missed that the list of error handlers was also duplicated in the builtin open() docs.
That duplication is likely worthwhile from a docs usability perspective, but we should:
- Bring it in line with Martin's recent fixes to the codecs module docs
- Add a comment in the error handler docs noting that the open() docs may need to be updated to reflect changes to error handler semantics
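Since the open() docs repeat the handler list, a short sketch of the same handler passed through open() may help show why the two descriptions need to agree. This is an assumption-laden example rather than text from the docs: the file name sample.txt, the copy name, and the sample bytes are made up, and it assumes CPython 3.3 or later.

```python
import os
import tempfile

# Arbitrary sample: Latin-1 encoded text, which is not valid UTF-8.
raw = b"caf\xe9\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(raw)

    # The errors= argument of open() takes the same handler names as the
    # codecs module; newline="" disables newline translation.
    with open(path, encoding="utf-8", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Writing with the same handler turns the surrogate back into byte 0xE9.
    copy_path = os.path.join(tmp, "copy.txt")  # hypothetical file name
    with open(copy_path, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
        f.write(text)
    with open(copy_path, "rb") as f:
        copied = f.read()
```

The undecodable byte 0xE9 comes back as the surrogate U+DCE9 in `text`, and the copy is byte-for-byte identical to the original file.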
History
Date | User | Action | Args
2022-04-11 14:57:57 | admin | set | github: 64612
2015-02-07 00:01:07 | ncoghlan | set | assignee: ncoghlan ->; messages: +
2015-02-06 21:44:08 | martin.panter | set | nosy: + martin.panter; messages: +
2015-02-06 20:40:15 | mrabarnett | set | nosy: + mrabarnett; messages: +; versions: + Python 3.5
2014-02-15 15:48:31 | ezio.melotti | set | nosy: + ezio.melotti
2014-01-28 05:43:55 | ncoghlan | set | assignee: docs@python -> ncoghlan; messages: +; nosy: + ncoghlan
2014-01-27 20:41:06 | RalfM | create |