Issue 1097597: SimpleHTTPServer sends wrong Content-Length header
On Microsoft Windows, text files use \r\n for newline. The SimpleHTTPServer class's "send_head()" method opens files with "r" or "rb" mode depending on the MIME type. Files opened in "r" mode will have \r\n -> \n translation performed automatically, so the stream of bytes sent to the client will be smaller than the size of the file on disk.
Unfortunately, the send_head() method sets the Content-Length header using the file size on disk, without compensating for the \r\n -> \n translation.
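The mismatch can be reproduced without a Windows machine. What follows is a minimal sketch, assuming Python 3, where text mode with universal newlines (newline=None) applies the same \r\n -> \n translation on reading. The helper name disk_size_and_translated_length is made up for illustration and is not part of SimpleHTTPServer.

```python
import os
import tempfile


def disk_size_and_translated_length(data):
    """Write raw bytes to a temp file, then compare the size on disk
    with the length of the content as read back in text mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
        size_on_disk = os.path.getsize(path)
        # newline=None enables universal newlines: \r\n is read as \n.
        with open(path, "r", encoding="ascii", newline=None) as f:
            translated_length = len(f.read())
        return size_on_disk, translated_length
    finally:
        os.remove(path)


size, length = disk_size_and_translated_length(b"line one\r\nline two\r\n")
print(size, length)  # 20 bytes on disk, 18 characters after translation
```

A Content-Length taken from the first number promises two more bytes than the translated body contains, one for each line ending.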
I remedied this on my copy thusly:
if mode == "r":
    content = f.read()
    contentLength = str(len(content))
    f.seek(0)
else:
    contentLength = str(os.fstat(f.fileno())[6])
self.send_header("Content-Length", contentLength)
This may not be as inefficient as it seems: the entire file was going to be read in anyway for the newline translation.
Hmmm. The code could be slightly simpler:
if mode == "r":
    contentLength = len(f.read())
    f.seek(0)
else:
    contentLength = os.fstat(f.fileno())[6]
self.send_header("Content-Length", str(contentLength))
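For anyone who wants to try the logic outside the handler, here is a standalone sketch of the same idea. It assumes Python 3. The names content_length and measure are hypothetical helpers, not part of SimpleHTTPServer, and index [6] of the fstat result is written as its named field, st_size. One caveat: in Python 3, text-mode read() returns characters rather than bytes, so the count only matches the bytes sent for single-byte encodings such as ASCII.

```python
import os
import tempfile


def content_length(f, mode):
    """Length of what will be read from f: translated text in "r" mode,
    size on disk otherwise. Leaves f positioned at the start."""
    if mode == "r":
        length = len(f.read())
        f.seek(0)  # rewind so the caller can still send the body
    else:
        length = os.fstat(f.fileno()).st_size  # same value as [6]
    return length


def measure(data, mode):
    """Hypothetical test helper: store data in a temp file and measure it."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
        kwargs = {"encoding": "ascii"} if mode == "r" else {}
        with open(path, mode, **kwargs) as f:
            return content_length(f, mode)
    finally:
        os.remove(path)


text_length = measure(b"a\r\nb\r\n", "r")     # newline translation applied
binary_length = measure(b"a\r\nb\r\n", "rb")  # raw bytes, no translation
print(text_length, binary_length)
```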
The documentation for SimpleHTTPServer in Python 2.3.4 for Windows says:
A 'Content-type:' with the guessed content type is output, and then a blank line, signifying end of headers, and then the contents of the file. The file is always opened in binary mode.
Actually, after Content-type, the Content-Length header is sent.
It would probably be nice if "Content-Length" were "Content-length" or if "Content-type" were "Content-Type", for consistency. The latter is probably best, per RFC 2616.
By the way, clients weren't caching the files I sent. I added another line after the Content-Length handling:
self.send_header("Expires", "Fri, 31 Dec 2100 12:00:00 GMT")
This is egregiously wrong in the general case and just fine in my case.
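If the date is to be computed rather than hard-coded, the standard library can produce the same HTTP date format. This is only a sketch, assuming Python 3; http_date is a made-up helper name, while calendar.timegm and email.utils.formatdate are standard library functions.

```python
import calendar
from email.utils import formatdate


def http_date(year, month, day, hour=0, minute=0, second=0):
    """Format a UTC date and time as an HTTP date string ending in GMT."""
    # timegm interprets the tuple as UTC and returns seconds since the epoch.
    timestamp = calendar.timegm((year, month, day, hour, minute, second, 0, 0, 0))
    # usegmt=True makes formatdate emit "GMT" instead of "-0000".
    return formatdate(timestamp, usegmt=True)


expires = http_date(2100, 12, 31, 12)
print(expires)  # Fri, 31 Dec 2100 12:00:00 GMT
```

The resulting string can be passed to self.send_header("Expires", expires) in place of the literal.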
Logged In: YES user_id=341410
Would it be wrong to open all files with a mode of 'rb', regardless of file type?
While I don't know MIME embeddings all that well, I do have experience with email, and most codecs used with MIME embeddings (like base 64, 85, 95, etc.) are \r, \n and \r\n agnostic.
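To illustrate what opening everything in 'rb' would mean for the header, here is a small sketch, assuming Python 3; read_binary_with_size is a hypothetical helper. In binary mode no newline translation happens, so the size reported by fstat always equals the number of bytes read.

```python
import os
import tempfile


def read_binary_with_size(data):
    """Store data in a temp file, reopen it in "rb" mode, and return
    the fstat size together with the bytes actually read."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
        with open(path, "rb") as f:
            size = os.fstat(f.fileno()).st_size
            body = f.read()
        return size, body
    finally:
        os.remove(path)


size, body = read_binary_with_size(b"a\r\nb\r\n")
print(size, len(body))  # both are 6; the \r\n pairs are preserved
```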