Issue 26717: wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

Created on 2016-04-08 20:48 by Anthony Sottile, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name, Uploaded
patch, Anthony Sottile, 2016-04-08 20:48
patch, Anthony Sottile, 2016-04-08 22:34
patch, Anthony Sottile, 2016-04-09 01:47
patch, Anthony Sottile, 2016-04-09 02:55
simple_server.py.diff, Александр Эри, 2016-04-20 10:46
Messages (10)
msg263043 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2016-04-08 20:48
Patch attached with test. In summary: A request to the url b'/\x80' appears to the application as a request to b'\xc2\x80' -- The issue being the latin1 decoded PATH_INFO is re-encoded as UTF-8 and then decoded as latin1 (on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode utf-8)-> b'\xc2\x80' -(decode latin1)-> b'\xc2\x80' My patch cuts out the encode(utf-8)->decode(latin1)
msg263044 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2016-04-08 20:53
A few typos in my previous comment (pressed enter too quickly), so here's an updated comment:

Patch attached with test.

In summary: a request to the url b'/\x80' appears to the application as a request to b'/\xc2\x80'. The issue is that the latin1-decoded PATH_INFO is re-encoded as UTF-8 and then decoded as latin1:

(on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode utf-8)-> b'\xc2\x80' -(decode latin1)-> u'\xc2\x80'

My patch cuts out the encode(utf-8)->decode(latin1):

(on the wire) b'\x80' -(decode latin1)-> u'\x80'
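The chain described above can be reproduced outside of wsgiref with plain bytes/str operations. This is only an illustrative sketch of the encode/decode steps named in the comment, not the actual code in wsgiref.simple_server; the variable names are made up for the example.

```python
# Path bytes as they arrive on the wire.
wire = b"/\x80"

# Step 1: the request line is decoded as Latin-1.
decoded = wire.decode("latin-1")

# Step 2 (the reported behaviour): re-encode as UTF-8,
# then decode the result as Latin-1 again.
mojibake = decoded.encode("utf-8").decode("latin-1")

# Step 2 (with the extra round trip removed): keep the Latin-1 string.
fixed = decoded

print(repr(mojibake))  # the application sees two characters after the slash
print(repr(fixed))     # the application sees the single original character
```

Encoding `fixed` with Latin-1 gives back the bytes that were sent, while encoding `mojibake` with Latin-1 gives b'/\xc2\x80'.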
msg263048 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2016-04-08 22:34
Oops, broke b'/%80'. Here's a better fix that now takes:

(on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode utf-8)-> b'\xc2\x80' -(decode latin1)-> u'\xc2\x80'

to:

(on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode latin1)-> b'\x80' -(decode latin1)-> u'\x80'
msg263050 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-08 23:50
I was going to say your original fix was the reverse of a change in r86146. But you seem to be fixing the problems before I express them :) For the fix I would suggest something like unquote(path, "latin-1") would be simpler. I left some other review comments about the tests.
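A minimal sketch of the unquote(path, "latin-1") suggestion, assuming the path has already been decoded from the wire as Latin-1. The helper name wsgi_path_info is hypothetical and is not part of wsgiref; only urllib.parse.unquote is a real standard-library call.

```python
from urllib.parse import unquote


def wsgi_path_info(raw_path):
    """Hypothetical helper: percent-decode a Latin-1-decoded path.

    Passing "latin-1" as the encoding makes every %XX escape map to the
    code point with the same value, so a raw byte and its percent-encoded
    form end up as the same character.
    """
    return unquote(raw_path, "latin-1")


# A raw 0x80 byte (already Latin-1-decoded) and its escaped form agree.
raw = wsgi_path_info(b"/\x80".decode("latin-1"))
escaped = wsgi_path_info("/%80")

# For comparison: the default (UTF-8, errors="replace") loses the byte.
lossy = unquote("/%80")
```

With this approach both b'/\x80' and b'/%80' reach the application as the same one-character path segment, which can be turned back into the original byte with encode("latin-1").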
msg263054 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2016-04-09 01:47
Updates after review.
msg263055 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-09 02:41
Thanks, this version looks pretty good to me.
msg263056 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2016-04-09 02:55
Forgot to remove the pyver code (leaning a bit too much on pre-commit)
msg263596 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-04-17 03:04
New changeset 1f2cfcd5a83f by Martin Panter in branch '3.5':
Issue #26717: Stop encoding Latin-1-ized WSGI paths with UTF-8
https://hg.python.org/cpython/rev/1f2cfcd5a83f

New changeset 815a4ac67e68 by Martin Panter in branch 'default':
Issue #26717: Merge wsgiref fix from 3.5
https://hg.python.org/cpython/rev/815a4ac67e68
msg263818 - (view) Author: Александр Эри (Александр Эри) Date: 2016-04-20 10:46
Why does wsgiref use latin1? It must use utf-8.
msg263844 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2016-04-20 14:34
PEP3333 states that environ variables are str variables decoded using latin1: https://www.python.org/dev/peps/pep-3333/#id19 Therefore, to get the original bytes, one must encode using latin1.
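As a sketch of what that means for an application: the original bytes can be recovered by encoding with latin1, and the application can then decode them with whatever encoding it expects. The helper names and the sample environ below are assumptions made for illustration; they are not part of wsgiref or PEP 3333.

```python
def path_info_bytes(environ):
    """Hypothetical helper: recover the bytes that were sent on the wire."""
    return environ.get("PATH_INFO", "").encode("latin-1")


def path_info_text(environ, encoding="utf-8"):
    """Hypothetical helper: decode the path with the application's encoding."""
    return path_info_bytes(environ).decode(encoding, errors="replace")


# Sample environ: the client sent b'/caf\xc3\xa9' (UTF-8 for '/café'),
# and the server exposed it as a Latin-1-decoded str.
environ = {"PATH_INFO": b"/caf\xc3\xa9".decode("latin-1")}

original = path_info_bytes(environ)
text = path_info_text(environ)
```

An application that wants UTF-8 paths can therefore get them without the server itself having to guess the encoding.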
History
Date User Action Args
2022-04-11 14:58:29 admin set github: 70904
2016-04-20 14:34:55 Anthony Sottile set messages: +
2016-04-20 10:46:18 Александр Эри set files: + simple_server.py.diff; nosy: + Александр Эри; messages: +; keywords: + patch
2016-04-17 08:23:34 martin.panter set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2016-04-17 03:04:43 python-dev set nosy: + python-dev; messages: +
2016-04-09 02:55:45 Anthony Sottile set files: + patch; messages: +
2016-04-09 02:41:20 martin.panter set messages: +
2016-04-09 01:47:58 Anthony Sottile set files: + patch; messages: +
2016-04-08 23:50:19 martin.panter set versions: - Python 3.4; nosy: + martin.panter; messages: +; type: behavior; stage: patch review
2016-04-08 22:34:31 Anthony Sottile set files: + patch; messages: +
2016-04-08 20:53:29 Anthony Sottile set messages: +
2016-04-08 20:48:05 Anthony Sottile create