bpo-28393: Update encoding lookup docs wrt bpo-27938 (GH-4871) (#4881) · python/cpython@77bf6da

@@ -977,10 +977,14 @@ e.g. ``'utf-8'`` is a valid alias for the ``'utf_8'`` codec.
977 977
978 978 Some common encodings can bypass the codecs lookup machinery to
979 979 improve performance. These optimization opportunities are only
980 - recognized by CPython for a limited set of aliases: utf-8, utf8,
981 - latin-1, latin1, iso-8859-1, mbcs (Windows only), ascii, utf-16,
982 - and utf-32. Using alternative spellings for these encodings may
983 - result in slower execution.
980 + recognized by CPython for a limited set of (case insensitive)
981 + aliases: utf-8, utf8, latin-1, latin1, iso-8859-1, iso8859-1, mbcs
982 + (Windows only), ascii, us-ascii, utf-16, utf16, utf-32, utf32, and
983 + the same using underscores instead of dashes. Using alternative
984 + aliases for these encodings may result in slower execution.
985 +
986 + .. versionchanged:: 3.6
987 + Optimization opportunity recognized for us-ascii.
984 988
985 989 Many of the character sets support the same languages. They vary in individual
986 990 characters (e.g. whether the EURO SIGN is supported or not), and in the
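
The effect described in the patched paragraph can be observed with a small script. The following is a minimal sketch, not part of the commit: it assumes CPython, uses a few of the listed UTF-8 spellings, and picks ``U8`` as an example of a valid alias that is absent from the list. The names ``listed``, ``unlisted`` and ``time_encode`` are illustrative only. Timings depend on the interpreter version and machine, so no particular result is claimed here.

```python
import timeit

# Sample text with a non-ASCII character so the encoder does real work.
text = "caf\u00e9 " * 1000

# Spellings taken from the alias list in the patched paragraph
# (case-insensitive, dashes or underscores).
listed = ["utf-8", "utf8", "UTF-8", "utf_8"]
# "U8" is a valid alias for the same codec that is not in that list
# (chosen for illustration only).
unlisted = "U8"

# Every spelling must produce identical bytes; only the lookup path may differ.
expected = text.encode("utf-8")
same_output = all(text.encode(name) == expected for name in listed + [unlisted])


def time_encode(name, number=2000):
    """Return the total seconds taken to encode `text` `number` times."""
    return timeit.timeit(lambda: text.encode(name), number=number)


timings = {name: time_encode(name) for name in listed + [unlisted]}
for name, seconds in timings.items():
    print(f"{name:>6}: {seconds:.4f}s")
```

If the shortcut applies as documented, the listed spellings should cluster together and the unlisted alias may come out slower, since it has to go through the full codec lookup.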