[Python-Dev] mimetypes broken on Windows
Terry Jan Reedy tjreedy at udel.edu
Tue Apr 16 20:00:53 CEST 2013
On 4/15/2013 10:04 PM, Ben Hoyt wrote:
Hi folks,
The built-in mimetypes module is broken on Windows, and it has been since Python 2.7 alpha 1. On all Windows systems I've tried, guess_type() returns the wrong mime type for common types like .png and .jpg. For example (on Python 2.7.4 and 3.3.1):

>>> import mimetypes
>>> mimetypes.guess_type('f.png')
('image/x-png', None)
>>> mimetypes.guess_type('f.jpg')
('image/pjpeg', None)

These should be 'image/png' and 'image/jpeg', respectively.

There's an open issue for this: http://bugs.python.org/issue15207. However, it hasn't gotten any love in the last few months, so per r.david.murray's comment, I'm posting it here.

Dave Chambers, who opened the bug, has proposed a fix, which is significantly better (i.e., not totally broken for common types). However, as I mentioned in http://bugs.python.org/issue15207#msg177030, using the Windows registry for this at all is basically a bad idea, because:
The actual mapping is fixed and more or less system independent while the windows registry is for volatile system and user dependent mappings.
1) Important keys like .jpg and .png aren't in the registry anyway.
2) Some that do exist are wrong in the Windows registry. This includes .zip, which is "application/x-zip-compressed" (at least in my registry) but should be "application/zip".
3) It makes the first call to guess_type() slow (~100ms), which isn't terrible, but with the above concerns, not worth it.
4) Perhaps most importantly: the keys in the Windows registry depend on what programs you have installed. And the users and programs can change registry keys at will.
And change what a given key is mapped to.
Obviously one can work around this bug, either by calling mimetypes.init(files=[]) before any calls to guess_type(), or calling init() with your own mime types file. However, "broken out of the box" is going to cause a lot of people headaches. :-)
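The first workaround can be sketched as follows. This is only an illustration: the init(files=[]) call is the one named in the message, but whether it skips the registry may depend on the Python version, so the sketch also pins a few extensions with mimetypes.add_type(). The PINNED_TYPES table is an assumption made up for this example, not something proposed in the thread.

```python
import mimetypes

# Workaround from the message: initialise with an empty file list before
# the first guess_type() call. Whether this bypasses the Windows registry
# is assumed here and may differ between Python versions.
mimetypes.init(files=[])

# Hypothetical safety net: explicitly pin the extensions discussed in the
# thread so later lookups do not depend on registry contents.
PINNED_TYPES = {
    '.png': 'image/png',
    '.jpg': 'image/jpeg',
    '.zip': 'application/zip',
}
for ext, mime in PINNED_TYPES.items():
    mimetypes.add_type(mime, ext)

print(mimetypes.guess_type('f.png'))  # ('image/png', None)
print(mimetypes.guess_type('f.jpg'))  # ('image/jpeg', None)
```

Pinning with add_type() has to be repeated in every program that needs it, which is the "broken out of the box" cost the message complains about.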
So my proposal is simply to get rid of read_windows_registry() altogether, and fall back to the default type mapping in mimetypes.py on Windows systems. This is correct and fast, even if not complete. As always, folks can always use their own mimetypes file if they want.
I basically agree, but am not sure what to do about back-compatibility considerations. But we do not have to reproduce buggy behavior.
In summary: the current behaviour is buggy and broken, the behaviour proposed in Issue 15207 is problematic, getting this from the Windows registry is a bad idea, and we should revert the whole registry thing. :-)

If folks agree with my reasoning above, I can provide a patch to fix this, along with a patch to the Windows unit tests.

-Ben

P.S. Kind of proving my point about the fragility of using the registry, the Python 2.7.4 unit test test_registry_parsing in test_mimetypes.py fails on my machine. It's because I've installed some SQL server, and text/plain in my registry is mapped from .sql (instead of .txt), causing this:

Traceback (most recent call last):
  File "C:\python27\lib\test\test_mimetypes.py", line 85, in test_registry_parsing
    eq(self.db.guess_type("foo.txt"), ("text/plain", None))
AssertionError: Tuples differ: (None, None) != ('text/plain', None)
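The failure in the P.S. can be illustrated with a small simulation. This is a sketch under stated assumptions, not the module's actual code: it assumes a table keyed by MIME type in which each type records exactly one extension, and the two dicts below are made-up data rather than a real registry dump. The helper names build_extension_map and guess are hypothetical.

```python
import os

# Made-up stand-in for a table keyed by MIME type, one extension per type.
registry_before = {"text/plain": ".txt", "image/png": ".png"}

# Simulate an installer re-pointing text/plain at .sql, as in the P.S.
registry_after = dict(registry_before)
registry_after["text/plain"] = ".sql"


def build_extension_map(type_to_ext):
    """Invert a type -> extension table into an extension -> type lookup."""
    return {ext: mime for mime, ext in type_to_ext.items()}


def guess(ext_map, filename):
    """Return the MIME type for filename, or None if its extension is unknown."""
    ext = os.path.splitext(filename)[1].lower()
    return ext_map.get(ext)


print(guess(build_extension_map(registry_before), "foo.txt"))  # text/plain
print(guess(build_extension_map(registry_after), "foo.txt"))   # None
```

Because the table holds a single extension per type, re-pointing text/plain at .sql leaves no entry for .txt, which matches the (None, None) result in the traceback.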