Created on 2010-05-16 14:47 by dabrahams, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Messages (6) |
|
|
msg105870 |
Author: Dave Abrahams (dabrahams) |
Date: 2010-05-16 14:47 |
According to the RFC, the server is allowed to send back any encoding it likes when no Accept-Encoding header is supplied, but all the examples I can find of urllib2.urlopen usage assume they're getting plain text back. I think it would be better to inject an Accept-Encoding header when none is explicitly supplied so that nobody else trips over this issue. See http://support.github.com/discussions/site/1510 |
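A caller who does not want to depend on library defaults can request an untransformed body explicitly and still cope with a compressed reply. The following is a minimal sketch under stated assumptions, not part of the original report: it uses Python 3's `urllib.request` (the successor to `urllib2`), and the helper names `build_request` and `decode_body` are hypothetical.

```python
import gzip
import urllib.request
import zlib


def build_request(url):
    """Return a Request that explicitly asks for an untransformed body."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "identity"})


def decode_body(body, content_encoding):
    """Undo a gzip or deflate Content-Encoding; pass anything else through."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        # Assumption: the server sends zlib-wrapped deflate data.
        return zlib.decompress(body)
    return body
```

With a response object `resp` from `urlopen(build_request(url))`, the body would be read as `decode_body(resp.read(), resp.headers.get("Content-Encoding"))`.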
|
|
msg105937 |
Author: Senthil Kumaran (orsenthil) *  |
Date: 2010-05-17 20:30 |
The HTTP reference says that a server can send any encoding if the client does not specify an Accept-Encoding header. But if 'identity' is one of the encodings that the server recognizes (?), then it should send the content as identity, which indicates untransformed content. I also see in httplib that Accept-Encoding: identity is added to the headers at the request level. I shall see what is missing here, in case it is not being sent for all requests. BTW, I could not figure out the problem you are facing from the URL mentioned. Specifically, I do not see gzip and non-gzip behaviour alternating between requests made at different points. |
|
|
msg105959 |
Author: Dave Abrahams (dabrahams) |
Date: 2010-05-18 10:02 |
How many tests did you run? My two tests were minutes apart. I have the feeling that this has something to do with caching behavior on the server. |
|
|
msg183573 |
Author: karl (karlcow) * |
Date: 2013-03-06 02:32 |
What was the content of http://support.github.com/discussions/site/1510? I can't find it. Is the issue still going on? |
|
|
msg239926 |
Author: Demian Brecht (demian.brecht) *  |
Date: 2015-04-02 15:32 |
This doesn't seem to be an issue in 3.4+, the following headers are injected in a call to urlopen():

    GET / HTTP/1.1
    Accept-Encoding: identity
    Host: example.com
    User-Agent: Python-urllib/3.4
    Connection: close

However, this is not the same behaviour in 2.7:

    GET / HTTP/1.0
    Host: example.com
    User-Agent: Python-urllib/1.17

That said, I wouldn't see this as a bug but a feature request, so it should be invalid for 2.7. Setting this to pending to close unless anyone has any objections or further details. |
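One way to reproduce this kind of header dump without an external site is to point the client at a throwaway local server that records what it receives. This is only a sketch, assuming Python 3 and a usable loopback interface; `RecordingHandler` and `fetch_and_capture` are hypothetical names, and an opener with an empty `ProxyHandler` stands in for plain `urlopen()` so that proxy environment variables cannot interfere.

```python
import http.server
import threading
import urllib.request

captured = {}


class RecordingHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Remember the request headers exactly as the server parsed them.
        captured.update(self.headers.items())
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_and_capture():
    """Issue one GET against a local server and return the headers it saw."""
    captured.clear()
    server = http.server.HTTPServer(("127.0.0.1", 0), RecordingHandler)
    server.timeout = 5  # do not wait forever if the client never connects
    thread = threading.Thread(target=server.handle_request)
    thread.start()
    try:
        # Same HTTP handler as urlopen(), but with proxies disabled.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        with opener.open(url, timeout=5) as response:
            response.read()
    finally:
        thread.join()
        server.server_close()
    return dict(captured)
```

Calling `fetch_and_capture()` returns a dictionary of header names and values, which can be compared with the 3.4+ listing quoted in the message.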
|
|
msg265526 |
Author: Martin Panter (martin.panter) *  |
Date: 2016-05-14 12:46 |
I suspect for Demian’s 2.7 experiment, he used the older urllib.urlopen(), rather than urllib2.urlopen() as given in the original description. When I use urllib2.urlopen("http://localhost/"), I see

    GET / HTTP/1.1
    Accept-Encoding: identity
    Host: localhost
    Connection: close
    User-Agent: Python-urllib/2.7

Even in the urllib (no 2) case, since it is using HTTP 1.0, I suspect not having Accept-Encoding is not such a problem. The underlying HTTP library has always added “Accept-Encoding: identity” for HTTP 1.1 by default (https://hg.python.org/cpython/annotate/4a3e9871b41b/Lib/httplib.py#l444), so I am closing this. |
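The library default described here can be observed without any network traffic. The sketch below assumes Python 3's `http.client` (the renamed `httplib`) and relies on the connection reusing a pre-assigned `sock` attribute instead of opening a real one; `RecordingSocket` and `raw_request` are hypothetical names introduced for illustration.

```python
import http.client


class RecordingSocket:
    """Stand-in for a socket that only records what would be sent."""

    def __init__(self):
        self.sent = b""

    def sendall(self, data):
        self.sent += data


def raw_request(skip_accept_encoding=False):
    """Return the bytes http.client would send for a bare GET request."""
    conn = http.client.HTTPConnection("example.com")
    conn.sock = RecordingSocket()  # prevents any real connection attempt
    conn.putrequest("GET", "/", skip_accept_encoding=skip_accept_encoding)
    conn.endheaders()
    return conn.sock.sent
```

By default the recorded bytes include an `Accept-Encoding: identity` line; passing `skip_accept_encoding=True` to `putrequest()` is how a caller opts out and supplies its own value.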
|
|
History |
Date | User | Action | Args |
2022-04-11 14:57:01 | admin | set | github: 52978 |
2016-05-14 12:46:10 | martin.panter | set | status: pending -> closed; title: Should urrllib2.urlopen send an Accept-Encoding header? -> Should urllib2.urlopen send an Accept-Encoding header?; nosy: + martin.panter; messages: +; resolution: works for me |
2015-04-02 15:32:23 | demian.brecht | set | status: open -> pending; nosy: + demian.brecht; messages: + |
2013-03-06 02:32:12 | karlcow | set | nosy: + karlcow; messages: + |
2010-12-22 07:48:02 | eric.araujo | set | nosy: + eric.araujo; versions: - Python 2.6 |
2010-05-18 10:02:40 | dabrahams | set | messages: + |
2010-05-17 20:30:17 | orsenthil | set | messages: + |
2010-05-16 18:24:46 | pitrou | set | assignee: orsenthil; type: behavior; nosy: + orsenthil; versions: + Python 3.1, Python 2.7, Python 3.2 |
2010-05-16 14:47:09 | dabrahams | create | |