Issue 10486: http.server doesn't set all CGI environment variables

Created on 2010-11-21 07:41 by v+python, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 23604 merged orsenthil,2020-12-01 22:29
Messages (13)
msg121878 - (view) Author: Glenn Linderman (v+python) * Date: 2010-11-21 07:41
HTTP_HOST HTTP_PORT REQUEST_URI are variables that my CGI scripts use, but which are not available from http.server or CGIHTTPServer (until I added them). There may be more standard variables that are not set, I didn't attempt to enumerate the whole list.
msg122258 - (view) Author: Glenn Linderman (v+python) * Date: 2010-11-24 03:41
Took a little more time to do a little more analysis on this one. Compared a sample query via Apache on Linux vs http.server, then looked up the CGI RFC for more info:

DOCUMENT_ROOT: ...
GATEWAY_INTERFACE: CGI/1.1
HTTP_ACCEPT: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
HTTP_ACCEPT_CHARSET: ISO-8859-1,utf-8;q=0.7,*;q=0.7
HTTP_ACCEPT_ENCODING: gzip,deflate
HTTP_ACCEPT_LANGUAGE: en-us,en;q=0.5
HTTP_CONNECTION: keep-alive
HTTP_COOKIE: ...
HTTP_HOST: ...
HTTP_KEEP_ALIVE: 115
HTTP_USER_AGENT: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10
PATH: /usr/local/bin:/usr/bin:/bin
PATH_INFO: ...
PATH_TRANSLATED: ...
QUERY_STRING:
REMOTE_ADDR: 173.75.100.22
REMOTE_PORT: 50478
REQUEST_METHOD: GET
REQUEST_URI: ...
SCRIPT_FILENAME: ...
SCRIPT_NAME: ...
SERVER_ADDR: ...
SERVER_ADMIN: ...
SERVER_NAME: ...
SERVER_PORT: ...
SERVER_PROTOCOL: HTTP/1.1
SERVER_SIGNATURE: Apache Server at rkivs.com Port 80
SERVER_SOFTWARE: Apache
UNIQUE_ID: TLEs8krc24oAABQ1TIUAAAPN

Above from Apache, below from http.server

GATEWAY_INTERFACE: CGI/1.1
HTTP_USER_AGENT: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
PATH_INFO: ...
PATH_TRANSLATED: ...
QUERY_STRING: ...
REMOTE_ADDR: 127.0.0.1
REQUEST_METHOD: GET
SCRIPT_NAME: ...
SERVER_NAME: ...
SERVER_PORT: ...
SERVER_PROTOCOL: HTTP/1.0
SERVER_SOFTWARE: SimpleHTTP/0.6 Python/3.2a4

Analysis of missing variables between Apache and http.server:

DOCUMENT_ROOT
HTTP_ACCEPT
HTTP_ACCEPT_CHARSET
HTTP_ACCEPT_ENCODING
HTTP_ACCEPT_LANGUAGE
HTTP_CONNECTION
HTTP_COOKIE
HTTP_HOST
HTTP_KEEP_ALIVE
HTTP_PORT
PATH
REQUEST_URI
SCRIPT_FILENAME
SERVER_ADDR
SERVER_ADMIN

Additional variables mentioned in RFC 3875, not used for my test requests:

AUTH_TYPE
CONTENT_LENGTH
CONTENT_TYPE
REMOTE_IDENT
REMOTE_USER
msg158532 - (view) Author: Glenn Linderman (v+python) * Date: 2012-04-17 06:05
Reading the CGI 1.1 spec, it says:

    The QUERY_STRING value provides the query-string part of the Script-URI. (See section 3.3). The server MUST set this variable; if the Script-URI does not include a query component, the QUERY_STRING MUST be defined as an empty string ("").

Therefore the code in run_cgi that says:

    if query:
        env['QUERY_STRING'] = query

should have the conditional removed.
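The proposed change can be sketched as follows. This is only an illustration, assuming the request target is split on the first "?"; `build_query_env` is a hypothetical helper and not a function in http.server.

```python
def build_query_env(path):
    """Return (script_path, env) with QUERY_STRING always defined.

    Hypothetical helper for illustration; not part of http.server.
    """
    # Split the request target into the path and the query on the first "?".
    # When there is no "?", query is the empty string.
    script_path, _, query = path.partition("?")
    env = {}
    # Assign unconditionally, so a request without a query component
    # still gets QUERY_STRING == "" as the CGI 1.1 spec requires.
    env["QUERY_STRING"] = query
    return script_path, env
```

With this shape, a script can read `os.environ["QUERY_STRING"]` without first checking whether the key exists.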
msg235797 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-12 04:58
Issue 5054 is for HTTP_ACCEPT
msg329766 - (view) Author: Rémi Lapeyre (remi.lapeyre) * Date: 2018-11-12 22:10
The reference given in https://github.com/python/cpython/blob/b36b0a3765bcacb4dcdbf12060e9e99711855da8/Lib/http/server.py#L1074 is not accessible anymore. I think we should replace it by https://tools.ietf.org/html/rfc3875#section-4.1
msg329771 - (view) Author: Rémi Lapeyre (remi.lapeyre) * Date: 2018-11-12 23:10
AUTH_TYPE, CONTENT_LENGTH, CONTENT_TYPE and REMOTE_USER are present. REMOTE_IDENT is not, but I'm not sure it's worth adding. I can send a PR to add REMOTE_HOST and remove the condition for QUERY_STRING. Otherwise, I don't think the other environment variables should be added; they are implementation dependent and not defined in RFC 3875. Should we close this issue?
msg329815 - (view) Author: Glenn Linderman (v+python) * Date: 2018-11-13 08:11
Rémi Lapeyre, glad to see your interest here, as this is an old and languishing bug. I would have hoped, based on my input, that had there been anyone maintaining the Python web server code, they might have done a more complete analysis than I did.

I note the document you reference is from 2004 (and I referenced it too), and doesn't include mention of the HTTP_COOKIE header, yet that header is frequently used in practical web applications. Apache supports it (as noted). My point is that it is not clear that conforming to RFC 3875 from 2004 is really sufficient to build a useful web server. While it is true that my references to Apache are to a particular implementation, it is a widespread implementation, which other implementations attempt to be compatible with, indicating that being reasonably compatible with Apache would seem to be a good thing for other web server implementations. A few more environment variables don't cost a lot, and seem to be useful.

I don't know if some or all of the additional environment variables implemented by Apache are standardized by RFC or other standards, or whether they are common practice, or unique to Apache. Nor do I know where such standards might be found, but I would hope a maintainer of the Python web server would be interested in sorting out such environment variables and making that determination, rather than relying on a 14 year old RFC as the definitive source, when web technologies have progressed significantly in the last 14 years. I would agree that variables that are unique to Apache might not want to be implemented, but on the other hand, with other implementations following Apache's lead, there may be few that are unique to Apache.
msg329822 - (view) Author: Rémi Lapeyre (remi.lapeyre) * Date: 2018-11-13 10:11
Hi Glenn, I'm not aware of a document that defines CGI better than the RFC, and I don't know it well enough to digress from the published standard (even if it is not what is done today, as I don't know the current practices well enough). Here are the variables defined by nginx 1.10.3 on Debian:

    fastcgi_param QUERY_STRING $query_string;
    fastcgi_param REQUEST_METHOD $request_method;
    fastcgi_param CONTENT_TYPE $content_type;
    fastcgi_param CONTENT_LENGTH $content_length;
    fastcgi_param SCRIPT_NAME $fastcgi_script_name;
    fastcgi_param REQUEST_URI $request_uri;
    fastcgi_param DOCUMENT_URI $document_uri;
    fastcgi_param DOCUMENT_ROOT $document_root;
    fastcgi_param SERVER_PROTOCOL $server_protocol;
    fastcgi_param REQUEST_SCHEME $scheme;
    fastcgi_param HTTPS $https if_not_empty;
    fastcgi_param GATEWAY_INTERFACE CGI/1.1;
    fastcgi_param SERVER_SOFTWARE nginx/$nginx_version;
    fastcgi_param REMOTE_ADDR $remote_addr;
    fastcgi_param REMOTE_PORT $remote_port;
    fastcgi_param SERVER_ADDR $server_addr;
    fastcgi_param SERVER_PORT $server_port;
    fastcgi_param SERVER_NAME $server_name;
    # PHP only, required if PHP was built with --enable-force-cgi-redirect
    fastcgi_param REDIRECT_STATUS 200;

Someone who knows CGI better than me may know the way forward.
msg329851 - (view) Author: Pierre Quentel (quentel) * Date: 2018-11-13 15:29
The QUERY_STRING value is always set by the code at lines 1135-1137 of http.server:

    for k in ('QUERY_STRING', 'REMOTE_HOST', 'CONTENT_LENGTH',
              'HTTP_USER_AGENT', 'HTTP_COOKIE', 'HTTP_REFERER'):
        env.setdefault(k, "")

The RFC for CGI has not evolved since 2004, probably because the technology is stable, and also because other, more efficient protocols have been defined to avoid the "CGI overhead" (FastCGI for instance).

I think that http.server should only implement the "meta-variables" defined in RFC 3875:

- AUTH_TYPE CONTENT_LENGTH CONTENT_TYPE GATEWAY_INTERFACE PATH_INFO PATH_TRANSLATED QUERY_STRING REMOTE_ADDR REMOTE_HOST REMOTE_IDENT REMOTE_USER REQUEST_METHOD SCRIPT_NAME SERVER_NAME SERVER_PORT SERVER_PROTOCOL SERVER_SOFTWARE. Some of these must always be set (e.g. QUERY_STRING, REQUEST_METHOD, SERVER_NAME...) but for other ones, there are conditions (for instance for CONTENT_LENGTH: "The server MUST set this meta-variable if and only if the request is accompanied by a message-body entity")
- "protocol-specific meta-variables": for HTTP, variables determined by the HTTP request headers such as HTTP_COOKIE (cf. section 4.1.18, Protocol-Specific Meta-Variables)

Other meta-variables are probably beyond the scope of a module in the standard library. In short, in my opinion the issue can be closed.
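The effect of the quoted loop can be checked in isolation. The sketch below wraps it in a hypothetical `apply_cgi_defaults` function (not a name used by http.server) and shows that `dict.setdefault` fills in an empty string only for keys that are absent.

```python
def apply_cgi_defaults(env):
    """Fill in empty-string defaults, mirroring the loop quoted above.

    Hypothetical wrapper for illustration; http.server runs the loop
    inline rather than through a function with this name.
    """
    for k in ('QUERY_STRING', 'REMOTE_HOST', 'CONTENT_LENGTH',
              'HTTP_USER_AGENT', 'HTTP_COOKIE', 'HTTP_REFERER'):
        # setdefault leaves an existing value untouched and only
        # inserts "" when the key is missing.
        env.setdefault(k, "")
    return env


# A request without a query component still ends up with QUERY_STRING == "".
env_without_query = apply_cgi_defaults({"REQUEST_METHOD": "GET"})
# A value that was already set is preserved.
env_with_query = apply_cgi_defaults({"QUERY_STRING": "a=1"})
```

Note that the same loop also defines CONTENT_LENGTH as an empty string for requests without a body, which is looser than the "if and only if" wording quoted from the RFC.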
msg329863 - (view) Author: Glenn Linderman (v+python) * Date: 2018-11-13 19:37
That's interesting, Pierre. I hadn't read the RFC carefully enough to realize that many of the "missing" variables from Apache are HTTP headers, and that section 4.1.18 tells how to convert HTTP headers to meta-variables. The code in server.py 3.6 (sorry, I should check the master branch) picks specific HTTP_ headers to include, rather than including them all per the rules. Doing the latter would go a long way toward being more compatible with Apache. I don't know if Rémi got his NGINX list from source code (looks like it), or if NGINX also defines meta-variables from the HTTP_ headers that are not listed in the header file he seems to be quoting. Unless the code has already been improved for Python 3.7, I think there is still some work to do to make server.py conform even to the RFC, if not be compatible with Apache.
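As a rough sketch of the conversion rule in RFC 3875 section 4.1.18 (upper-case the header name, replace "-" with "_", prepend "HTTP_"), the following hypothetical function maps a list of (name, value) header pairs to meta-variables. The function name, the skip list, and the choice to join repeated headers with ", " are assumptions made for illustration, not what http.server does.

```python
def headers_to_meta_variables(headers):
    """Map (name, value) HTTP header pairs to HTTP_* meta-variables.

    Hypothetical helper sketching RFC 3875 section 4.1.18; not the
    http.server implementation.
    """
    # Headers the RFC suggests leaving out: authentication data, and
    # fields already exposed as CONTENT_TYPE / CONTENT_LENGTH.
    skipped = {"AUTHORIZATION", "PROXY_AUTHORIZATION",
               "CONTENT_TYPE", "CONTENT_LENGTH"}
    env = {}
    for name, value in headers:
        key = name.upper().replace("-", "_")
        if key in skipped:
            continue
        key = "HTTP_" + key
        if key in env:
            # Assumption: repeated headers are merged with ", ".
            env[key] = env[key] + ", " + value
        else:
            env[key] = value
    return env


# Example input; the header values are made up for illustration.
example_env = headers_to_meta_variables([
    ("X-Urwid-Method", "example"),
    ("Content-Type", "text/plain"),
    ("Accept", "text/html"),
    ("Accept", "text/plain"),
])
```

Under this rule a custom request header such as "X-Urwid-Method" would appear to the script as `HTTP_X_URWID_METHOD`.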
msg375207 - (view) Author: Maarten (maarten) * Date: 2020-08-12 02:54
The CGI examples of urwid (see http://urwid.org/manual/displaymodules.html#cgi-web-display-module-web-display) don't work on http.server because of missing meta-variables. Using cgitb, I found out that the webdriver expects the environment variable `HTTP_X_URWID_METHOD` to be set. The javascript sets the "X-Urwid-Method" header (using XmlHttpRequest), but this header is not visible to the CGI python script. So for some scripts, extra meta-variables need to be set. I think section 4.1.18 applies because it is an HTTP header that is being set. The section says that these meta-variables are optional though. I argue that having access to extra headers is useful.
msg375801 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2020-08-22 14:36
Hello Maarten,

> Using cgitb, I found out that the webdriver expects the environment variable `HTTP_X_URWID_METHOD` to be set. The javascript sets the "X-Urwid-Method" header (using XmlHttpRequest), but these are not visible by the CGI python script.
> So for some scripts, extra meta-variables need to be set

Thanks for your comment on this old issue. The topic under discussion was about some existing "more standard" CGI variables rather than special meta-variables. Even if the standard CGI variables get exposed first, I doubt the meta-variables will get added. I will consider the minimal change required to accomplish the task and close the issue.
msg382280 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2020-12-01 22:40
I spent some time reviewing and researching the specification. It also says:

    The server is not required to create meta-variables for all the header fields that it receives.

In this issue, open since 2010, we have seen two different sets of variables, one from Apache and one from Nginx. So it is not certain whether http.server should be aligned with either or both, or whether anything is required at all.

The discussion on QUERY_STRING was noted, but as Pierre pointed out, it was already set:

    for k in ('QUERY_STRING', 'REMOTE_HOST', 'CONTENT_LENGTH',
              'HTTP_USER_AGENT', 'HTTP_COOKIE', 'HTTP_REFERER'):
        env.setdefault(k, "")

For cosmetic purposes, I could remove the existing if condition - https://github.com/python/cpython/pull/23604

I am not sure if we need to add other variables with an empty string value for any reason. As a maintainer, I think we should close this issue. If there is a bug report, like , then that is a valid issue, and we should fix it. If there are any specific issues raised with parsing, or a lack of a "required" meta-variable that caused an application to break, even that could be fixed.

I am closing this issue with a cosmetic change that stemmed from the discussion - https://github.com/python/cpython/pull/23604
History
Date User Action Args
2022-04-11 14:57:09 admin set github: 54695
2020-12-01 22:40:05 orsenthil set status: open -> closed; versions: + Python 3.10, - Python 2.7, Python 3.2, Python 3.3; messages: +; resolution: wont fix; stage: patch review -> resolved
2020-12-01 22:29:09 orsenthil set keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request22473
2020-08-22 14:36:23 orsenthil set messages: +; stage: needs patch
2020-08-12 02:54:55 maarten set nosy: + maarten; messages: +
2018-11-13 19:37:32 v+python set messages: +
2018-11-13 15:29:05 quentel set messages: +
2018-11-13 10:11:44 remi.lapeyre set messages: +
2018-11-13 08:11:29 v+python set messages: +
2018-11-12 23:10:38 remi.lapeyre set messages: +
2018-11-12 22:10:59 remi.lapeyre set messages: +
2018-11-10 21:33:11 quentel set nosy: + quentel
2018-11-10 09:50:34 remi.lapeyre set nosy: + remi.lapeyre
2015-02-12 16:26:39 demian.brecht set nosy: + demian.brecht
2015-02-12 04:58:47 martin.panter set nosy: + martin.panter; messages: +
2012-08-12 12:48:16 berker.peksag set versions: + Python 3.3, - Python 2.6, Python 3.1
2012-04-17 06:05:07 v+python set messages: +
2011-03-18 02:08:24 orsenthil set assignee: orsenthil; nosy: fdrake, facundobatista, orsenthil, v+python
2010-11-24 03:41:16 v+python set messages: +
2010-11-21 16:58:47 pitrou set nosy: + fdrake, facundobatista, orsenthil
2010-11-21 07:41:33 v+python create