Issue 20413: Errors in documentation of standard codec error handlers

Created on 2014-01-27 20:41 by RalfM, last changed 2022-04-11 14:57 by admin.

Messages (5)

msg209477

Author: (RalfM)

Date: 2014-01-27 20:41

The standard library documentation lists the standard codec error handlers in three places:

(a) 2. Built-in Functions, section open()
(b) 7.2 codecs - Codec registry and base classes
(c) 7.2.1 Codec Base Classes

As far as I can tell, (c) looks OK, but (a) and (b) contain two errors:

  1. 'surrogatepass' is not mentioned.
  2. 'surrogateescape' is described as: 'on decoding, replace with code points in the Unicode Private Use Area ranging from U+DC80 to U+DCFF. These private code points will ...' This is incorrect insofar as U+DC80 to U+DCFF are not private-use code points but (low) surrogate code points. This is correctly explained in (c) and in PEP 383 (and, of course, in the Unicode standard, chapter 16).
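
The distinction in point 2 can be checked from the interpreter. The following is a minimal sketch, not part of the original report: the sample bytes and variable names are illustrative assumptions, and it relies only on `bytes.decode`, `str.encode`, and `unicodedata.category` from the standard library. In the Unicode general categories, `Cs` means surrogate and `Co` means private use.

```python
import unicodedata

# Illustrative input (an assumption): two bytes that are not valid UTF-8.
raw = b"abc\x80\xff"

# 'surrogateescape' maps each undecodable byte 0xNN to the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")
escaped = [ch for ch in text if ord(ch) > 0x7F]
code_points = [f"U+{ord(ch):04X}" for ch in escaped]

# General category of each escaped character: 'Cs' (surrogate), not 'Co' (private use).
categories = [unicodedata.category(ch) for ch in escaped]

# Encoding with the same handler restores the original bytes.
round_trip = text.encode("utf-8", errors="surrogateescape")

# 'surrogatepass' lets the UTF-8 codec encode and decode a lone surrogate as-is.
lone = "\udc80"
passed = lone.encode("utf-8", errors="surrogatepass")

print(code_points, categories, round_trip == raw, passed)
```

With the default `errors="strict"`, encoding `lone` as UTF-8 raises `UnicodeEncodeError`, which is the behaviour 'surrogatepass' exists to relax.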

I suggest to correct (a) and (b) by

These errors are present in the documentation (more precisely, the .chm files) of at least

msg209502

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2014-01-28 05:43

I plan to take a look at the codec docs in general in the next week or so; I'll tackle this as well.

msg235496

Author: Matthew Barnett (mrabarnett) * (Python triager)

Date: 2015-02-06 20:40

The docs for Python 3.5.0a0 still say "Unicode Private Use Area".

msg235500

Author: Martin Panter (martin.panter) * (Python committer)

Date: 2015-02-06 21:44

I changed “code point in the Unicode Private Use Area” to “individual surrogate code” in the “codecs” module documentation for Issue 19548. So perhaps (a) still needs addressing, but (b) and (c) are hopefully already fixed.

msg235506

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2015-02-07 00:01

Ah, February 2014: many of my plans went in rather different directions than expected that month, and this was one of them :)

As Martin noted, he already fixed (b) and (c), but we missed that the list of error handlers was also duplicated in the builtin open() docs.

That duplication is likely worthwhile from a docs usability perspective, but we should:

  1. Bring it in line with Martin's recent fixes to the codecs module docs
  2. Add a comment in the error handler docs noting that the open() docs may need to be updated to reflect changes to error handler semantics
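
The reason the open() docs repeat the list is that open() accepts the same handler names through its `errors` parameter. A minimal sketch of that overlap follows; the file names and sample bytes are hypothetical, chosen only for illustration.

```python
import os
import tempfile

# Illustrative input (an assumption): Latin-1 bytes that are invalid as UTF-8.
raw = b"caf\xe9\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(raw)

    # Text-mode read: the undecodable byte 0xE9 becomes the lone surrogate U+DCE9.
    # newline="" disables newline translation so the bytes round-trip exactly.
    with open(path, encoding="utf-8", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Text-mode write with the same handler turns U+DCE9 back into byte 0xE9.
    copy = os.path.join(tmp, "copy.txt")  # hypothetical file name
    with open(copy, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
        f.write(text)

    with open(copy, "rb") as f:
        written = f.read()

print(ascii(text), written == raw)
```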

History

Date | User | Action | Args
2022-04-11 14:57:57 | admin | set | github: 64612
2015-02-07 00:01:07 | ncoghlan | set | assignee: ncoghlan -> ; messages: +
2015-02-06 21:44:08 | martin.panter | set | nosy: + martin.panter; messages: +
2015-02-06 20:40:15 | mrabarnett | set | nosy: + mrabarnett; messages: + ; versions: + Python 3.5
2014-02-15 15:48:31 | ezio.melotti | set | nosy: + ezio.melotti
2014-01-28 05:43:55 | ncoghlan | set | assignee: docs@python -> ncoghlan; messages: + ; nosy: + ncoghlan
2014-01-27 20:41:06 | RalfM | create |