Issue 22347: mimetypes.guess_type("//example.com") misinterprets host name as file name

Created on 2014-09-06 02:52 by martin.panter, last changed 2022-04-11 14:58 by admin.

Files
File name Uploaded Description
mimetypes-host.patch martin.panter,2015-02-24 05:53 review
Pull Requests
URL Status Linked
PR 15522 merged corona10,2019-08-26 16:04
PR 15685 merged miss-islington,2019-09-05 00:34
PR 15687 merged corona10,2019-09-05 00:49
PR 16724 merged maxking,2019-10-12 00:42
PR 16725 closed miss-islington,2019-10-12 05:41
PR 16727 merged maxking,2019-10-12 15:52
PR 16728 merged maxking,2019-10-12 16:04
Messages (15)
msg226467 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2014-09-06 02:52
The documentation says that guess_type() takes a URL, but:

>>> mimetypes.guess_type("http://example.com")
('application/x-msdownload', None)

I suspect the MS download is a reference to *.com files (like DOS's command.com). My current workaround is to strip out the host name from the URL, since I cannot imagine it would be useful for determining the content type. I am also stripping the fragment part. An argument could probably be made for stripping the “;parameters” and “?query” parts as well.

>>> # Workaround for mimetypes.guess_type("//example.com")
... # interpreting host name as file name
... url = urlparse("http://example.com")
>>> url = net.url_replace(url, netloc="", fragment="")
>>> url
'http://'
>>> mimetypes.guess_type(url, strict=False)
(None, None)
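The suspicion about *.com files can be checked in isolation. The following is a minimal sketch, not part of the original report: it uses a private mimetypes.MimeTypes instance and registers ".com" by hand (an assumption for illustration; on real systems that mapping typically comes from a system-wide MIME table). It shows that the guess is keyed on the trailing suffix alone, so a bare host name looks the same as a DOS executable name.

```python
import mimetypes
import posixpath

# Private database, so the result does not depend on system-wide MIME tables.
db = mimetypes.MimeTypes()

# Assumption for illustration: map ".com" the way the report's system did.
db.add_type("application/x-msdownload", ".com")

# The lookup is driven by the suffix after the last dot.
suffix = posixpath.splitext("example.com")[1]

file_guess = db.guess_type("command.com")   # a DOS-style file name
host_guess = db.guess_type("example.com")   # a host name with the same suffix

print(suffix, file_guess, host_guess)
```

Both names end in ".com", so both receive the type registered for that suffix.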
msg236479 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-24 05:53
Posting a patch to fix this. It passes the URL through a urlsplit() → urlunsplit() stage, while removing the scheme://netloc parts.
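The urlsplit() → urlunsplit() idea can be sketched as follows. This is an illustration of the approach described above, not the contents of mimetypes-host.patch; the helper name guess_type_ignoring_host and the hand-registered ".com" entry are hypothetical and exist only for this example.

```python
import mimetypes
from urllib.parse import urlsplit, urlunsplit

# Private database with ".com" registered by hand (assumption for illustration).
db = mimetypes.MimeTypes()
db.add_type("application/x-msdownload", ".com")


def guess_type_ignoring_host(url, strict=True):
    """Hypothetical helper: guess the type with scheme and netloc removed."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    # Rebuild the URL without "scheme://netloc"; the remaining parts are kept.
    stripped = urlunsplit(("", "", path, query, fragment))
    return db.guess_type(stripped, strict=strict)


# The host name no longer reaches the suffix lookup.
print(guess_type_ignoring_host("http://example.com"))
print(guess_type_ignoring_host("//example.com"))
# A real path component is still used for the guess.
print(guess_type_ignoring_host("http://example.com/index.html"))
```

Dropping the scheme also bypasses any scheme-specific handling inside guess_type(), so a real fix has to take more care than this sketch does.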
msg335123 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-02-09 02:15
The proposed patch I mentioned on bpo-35939 also solves the above situation.

Python 3.8.0a1+ (heads/bpo-12317:96d37dbcd2, Feb 8 2019, 12:03:40)
[Clang 9.1.0 (clang-902.0.39.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import mimetypes
>>> mimetypes.guess_type("http://example.com")
(None, None)
>>> mimetypes.guess_type("example.com")
('application/x-msdownload', None)
>>>

I've also added the unit tests from mimetypes-host.patch, and they pass. I think we can close this issue as well once bpo-35939 is closed. Thanks a lot!
msg351156 - (view) Author: miss-islington (miss-islington) Date: 2019-09-05 00:34
New changeset 87bd2071c756188b6cd577889fb1682831142ceb by Miss Islington (bot) (Dong-hee Na) in branch 'master': bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522) https://github.com/python/cpython/commit/87bd2071c756188b6cd577889fb1682831142ceb
msg351157 - (view) Author: miss-islington (miss-islington) Date: 2019-09-05 00:55
New changeset 6d7a786d2e4b48a6b50614e042ace9ff996f0238 by Miss Islington (bot) in branch '3.8': bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522) https://github.com/python/cpython/commit/6d7a786d2e4b48a6b50614e042ace9ff996f0238
msg351158 - (view) Author: miss-islington (miss-islington) Date: 2019-09-05 01:16
New changeset 8873bff2871078e9f23e6c7d942d3a8edbd0921f by Miss Islington (bot) (Dong-hee Na) in branch '3.7': [3.7] bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522) (GH-15687) https://github.com/python/cpython/commit/8873bff2871078e9f23e6c7d942d3a8edbd0921f
msg351162 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-09-05 01:26
@vstinner(my mentor) @maxking Now this issue is solved. I'd like to close this issue. Is it okay?
msg351164 - (view) Author: Abhilash Raj (maxking) * (Python committer) Date: 2019-09-05 01:29
I think so, yes. Also, while you are at it, can you also close bpo-35939 with a comment that points to this issue and the right PR for the fix?
msg351167 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-09-05 01:34
Great! I will close bpo-35939 also.
msg354471 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-10-11 17:18
This change introduces a potential 3.7 regression; see Issue38449.
msg354521 - (view) Author: miss-islington (miss-islington) Date: 2019-10-12 05:41
New changeset 19a3d873005e5730eeabdc394c961e93f2ec02f0 by Miss Islington (bot) (Abhilash Raj) in branch 'master': bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522)" (GH-16724) https://github.com/python/cpython/commit/19a3d873005e5730eeabdc394c961e93f2ec02f0
msg354535 - (view) Author: Abhilash Raj (maxking) * (Python committer) Date: 2019-10-12 16:30
I am going to re-open this since the fixes were reverted in all the branches.
msg354538 - (view) Author: Abhilash Raj (maxking) * (Python committer) Date: 2019-10-12 16:58
New changeset 5a638a805503131f4a9cc2bbc5944611295c1500 by Abhilash Raj in branch '3.8': [3.8] bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs" (GH-16724) (GH-16728) https://github.com/python/cpython/commit/5a638a805503131f4a9cc2bbc5944611295c1500
msg354543 - (view) Author: miss-islington (miss-islington) Date: 2019-10-12 18:50
New changeset 164bee296ab1f87cc05566b39ee8fb9fb64b3e5a by Miss Islington (bot) (Abhilash Raj) in branch '3.7': [3.7] bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15685)" (GH-16724) (GH-16727) https://github.com/python/cpython/commit/164bee296ab1f87cc05566b39ee8fb9fb64b3e5a
msg354697 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-10-15 07:30
New changeset 2a405598bbccbc42710dc5ecf3d44c8de4c16582 by Ned Deily (Abhilash Raj) in branch '3.7': [3.7] bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15685)" (GH-16724) (GH-16727) https://github.com/python/cpython/commit/2a405598bbccbc42710dc5ecf3d44c8de4c16582
History
Date User Action Args
2022-04-11 14:58:07 admin set github: 66543
2019-10-15 07:30:25 ned.deily set messages: +
2019-10-14 12:44:20 vstinner set nosy: - vstinner
2019-10-12 18:50:06 miss-islington set messages: +
2019-10-12 16:58:15 maxking set messages: +
2019-10-12 16:30:21 maxking set status: closed -> open; messages: +
2019-10-12 16:04:59 maxking set pull_requests: + <pull_request16310>
2019-10-12 15:52:49 maxking set pull_requests: + <pull_request16307>
2019-10-12 05:41:53 miss-islington set pull_requests: + <pull_request16305>
2019-10-12 05:41:50 miss-islington set messages: +
2019-10-12 00:42:58 maxking set pull_requests: + <pull_request16303>
2019-10-11 17:18:38 ned.deily set nosy: + ned.deily; messages: +
2019-09-05 12:19:43 corona10 link issue35939 superseder
2019-09-05 01:44:00 corona10 set status: open -> closed; resolution: fixed
2019-09-05 01:34:56 corona10 set stage: patch review -> resolved
2019-09-05 01:34:34 corona10 set messages: +
2019-09-05 01:29:06 maxking set messages: +
2019-09-05 01:26:52 corona10 set nosy: + vstinner, maxking; messages: +
2019-09-05 01:16:41 miss-islington set messages: +
2019-09-05 00:55:01 miss-islington set messages: +
2019-09-05 00:49:06 corona10 set pull_requests: + <pull_request15345>
2019-09-05 00:34:48 miss-islington set pull_requests: + <pull_request15343>
2019-09-05 00:34:39 miss-islington set nosy: + miss-islington; messages: +
2019-08-26 16:04:37 corona10 set stage: patch review; pull_requests: + <pull_request15206>
2019-02-09 02:19:48 corona10 set versions: + Python 3.7, Python 3.8, - Python 3.4
2019-02-09 02:15:26 corona10 set nosy: + corona10; messages: +
2019-02-08 23:25:29 martin.panter set dependencies: + Remove urllib.parse._splittype from mimetypes.guess_type
2015-02-24 05:53:55 martin.panter set files: + mimetypes-host.patch; keywords: + patch; messages: +
2014-09-06 02:52:37 martin.panter create