Issue 26585: Use html.escape to replace _quote_html in http.server (original) (raw)

Created on 2016-03-18 03:18 by xiang.zhang, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
_quote_html_to_html_escape.patch xiang.zhang,2016-03-18 03:18 Use html.escape to replace _quote_html for BaseHTTPRequestHandler review
_quote_html_to_html_escape_v2.patch xiang.zhang,2016-03-18 16:55 review
_quote_html_to_html_escape_v3.patch xiang.zhang,2016-03-18 17:14 review
_quote_html_to_html_escape_v4.patch xiang.zhang,2016-03-19 18:13 review
_quote_html_to_html_escape_v5.patch xiang.zhang,2016-03-20 07:50 review
Messages (17)
msg261943 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-18 03:18
In http.server, _quote_html is used to escape content for BaseHTTPServer. The function has already been implemented by html.escape.
msg261944 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-18 03:47
It's BaseHTTPRequestHandler, not BaseHTTPServer.
msg261950 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-03-18 07:31
Removing the redundant _quote_html() function seems like a good idea to me. I wonder if the code should be using the html.escape(..., quote=False) option though, because it does not need to encode quote signs. It might be good to add a test. It looks like there may not be anything testing the HTML encoding.
msg261951 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-18 07:37
At first I also want to use html.escape(..., quote=False) since the spec only asks to escape quote signs in attribute. But after some search on Google, there are articles recommends escaping quote in content too: https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet
msg261975 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-18 16:55
I add two tests for html escaping. One for the error message and the other for display name in SimpleHTTPRequesthandler.
msg261978 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-18 17:14
make some change to test for html escape in SimpleHTTPRequestHandler
msg262001 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-03-18 22:32
Thanks for the tests. I left a couple comments. About encoding quotes: Personally I don’t see much value unless you are encoding an attribute value, in which case I would prefer to use xml.sax.saxutils.quoteattr(). Encoded quotes would only become useful if the “error_message_format” attribute was modified. A more practical downside is that if “error_content_type” is set to say text/plain, we are adding two somewhat common characters that will get messed up. E.g. the “explain” string for 429 Too Many Requests will include the double-quoted "rate limiting". And an apostrophe could easily be given in a custom error message, e.g. “Can't write a clean error message”.
msg262018 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-19 05:07
Thanks for your review. Set quote to False when html.escape is OK to me. But except for the usage in BaseHTTPRequestHandler, I think we should also set quote to False for the usage in SimpleHTTPRequestHandler since they do not appear in attribute either. And I left a reply to your comment. How do you think about these?
msg262021 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-03-19 06:59
Yeah I would be happy if you want to change to html.escape(quote=False) in list_directory() as well. BTW I am getting Undelivered Mail replies from the review comments. I guess they might be getting rejected due to looking like spam (they tend to be classified as spam by default for me, something about how the server sends them).
msg262040 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-19 11:33
OK. You left the comment preferring using ascii or utf-8 to encode the filename in test. But actually the response SimpeHTTPRequestHandler send is explicitly encoded in filesystemdefaultencoding. So in such a case, am I doing right?
msg262041 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-19 11:40
Oops. You have already left new comment. No notify either. :( I like the idea that extract the actual encoding from response header. I will update my patch then.
msg262057 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-19 18:13
The uploaded patch is updated. Hope you have time to review.
msg262062 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-03-20 07:50
Thanks for the reviews. I have updated the patch.
msg262814 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-03 04:40
I left a couple notes on minor style nits (which I can fix when I commit this). But perhaps we should fix the business with tempdir_name and absolute URL paths (Issue 26609) first.
msg263159 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-04-11 01:20
New changeset bf44913588b7 by Martin Panter in branch 'default': Issue #26585: Eliminate _quote_html() and use html.escape(quote=False) https://hg.python.org/cpython/rev/bf44913588b7
msg263164 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-11 04:04
Thanks for the patch Xiang
msg263170 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-11 07:15
Happy to see it works. Thanks for your reviews too. :)
History
Date User Action Args
2022-04-11 14:58:28 admin set github: 70772
2016-04-11 07:15:56 xiang.zhang set messages: +
2016-04-11 04:04:32 martin.panter set status: open -> closedresolution: fixedmessages: + stage: patch review -> resolved
2016-04-11 01:20:28 python-dev set nosy: + python-devmessages: +
2016-04-03 04:40:49 martin.panter set dependencies: + Wrong request target in test_httpservers.pymessages: +
2016-03-20 07:50:41 xiang.zhang set files: + _quote_html_to_html_escape_v5.patchmessages: +
2016-03-19 18:13:41 xiang.zhang set files: + _quote_html_to_html_escape_v4.patchmessages: +
2016-03-19 11:40:03 xiang.zhang set messages: +
2016-03-19 11:33:01 xiang.zhang set messages: +
2016-03-19 06:59:19 martin.panter set messages: +
2016-03-19 05:07:31 xiang.zhang set messages: +
2016-03-18 22:32:17 martin.panter set messages: +
2016-03-18 17:14:47 xiang.zhang set files: + _quote_html_to_html_escape_v3.patchmessages: +
2016-03-18 16:55:56 xiang.zhang set files: + _quote_html_to_html_escape_v2.patchmessages: +
2016-03-18 07:37:20 xiang.zhang set messages: +
2016-03-18 07:33:57 martin.panter set stage: patch review
2016-03-18 07:31:06 martin.panter set nosy: + martin.pantermessages: +
2016-03-18 03:47:46 xiang.zhang set messages: +
2016-03-18 03🔞06 xiang.zhang create