Issue 36207: robotsparser deny all with some rules

Created on 2019-03-06 09:42 by quentin-maire, last changed 2022-04-11 14:59 by admin.

Messages (6)
msg337285 - Author: wats0ns (quentin-maire) Date: 2019-03-06 09:42
RobotsParser parses a "Disallow: ?" rule as a deny-all, but this is a valid rule that should be interpreted as "Disallow: /?*" or "Disallow: /*?*".
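The report can be checked with a short script. This is a minimal sketch, assuming the class meant here is `urllib.robotparser.RobotFileParser` from the standard library; the helper name `can_fetch_with_rules` and the `example.com` URLs are made up for illustration, and the printed results may differ between Python versions.

```python
from urllib.robotparser import RobotFileParser


def can_fetch_with_rules(lines, url, agent="*"):
    """Parse robots.txt lines and report whether `agent` may fetch `url`.

    Hypothetical helper for this illustration, not part of the stdlib.
    """
    parser = RobotFileParser()
    parser.parse(lines)  # parse() accepts an iterable of robots.txt lines
    return parser.can_fetch(agent, url)


# The rule from the report: a Disallow value consisting only of "?".
rules = ["User-agent: *", "Disallow: ?"]

# A URL without a query string and one with a query string.
result_plain = can_fetch_with_rules(rules, "http://example.com/page")
result_query = can_fetch_with_rules(rules, "http://example.com/page?x=1")

print("plain URL allowed:", result_plain)
print("query URL allowed:", result_query)
```

If the rule is treated as a deny-all, as described above, both calls report that fetching is not allowed.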
msg338293 - Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2019-03-18 22:13
Can you provide a link to documentation showing that "Disallow: ?" shouldn't be the same as deny all? Thanks!
msg338298 - Author: wats0ns (quentin-maire) Date: 2019-03-18 23:20
I can't find any documentation about it, but all of the robots.txt checkers I have found behave like this. You can test on this site: http://www.eskimoz.fr/robots.txt. I believe this is how it's implemented now in most parsers?
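For reference, the interpretation requested in this issue ("Disallow: /?*" or "Disallow: /*?*") relies on wildcard matching. The following is a rough sketch of such matching, assuming the commonly used convention where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The function names are hypothetical, and this is not how `urllib.robotparser` is implemented.

```python
import re


def pattern_to_regex(pattern):
    """Translate a wildcard robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters and a trailing
    '$' anchors the end of the URL. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(pattern, path_and_query):
    """Return True if the pattern matches from the start of the path."""
    return pattern_to_regex(pattern).match(path_and_query) is not None


# "/*?*" matches any URL that contains a query string.
print(is_disallowed("/*?*", "/search?q=python"))
print(is_disallowed("/*?*", "/about"))

# "/?*" only matches a query string directly on the root path.
print(is_disallowed("/?*", "/?page=2"))
print(is_disallowed("/?*", "/search?q=python"))
```

The two candidate patterns are not equivalent: under these assumed semantics, "/*?*" covers every URL with a query string, while "/?*" only covers query strings on the root path.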
msg390073 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-02 15:48
I removed almost all messages of this issue since most of them looked like SPAM. I also blocked the user accounts that posted SPAM. If it was a mistake, contact me. This is the Python bug tracker, not a forum to ask questions about how to use Python, or to report bugs in your website. Multiple comments were written in French, whereas this bug tracker is in English. I even hesitate to close the issue since it got too many SPAM comments.
msg408351 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-12-12 00:11
I restored one non-spam message from the OP that was deleted. Changing to enhancement because this is not a bug (i.e., deviation from documentation). I don't know enough about this to have a view on whether this enhancement request should be accepted.
msg416852 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-04-06 10:21
I removed two comments: none of the mentioned URLs contains a "Disallow: ?" rule and the comments didn't add any value to this issue. It looks like regular spam (SEO).
History
Date User Action Args
2022-04-11 14:59:12 admin set github: 80388
2022-04-06 10:21:58 vstinner set messages: +
2022-04-06 10:21:10 vstinner set messages: -
2022-04-06 10:21:08 vstinner set messages: -
2022-04-06 09:17:05 adiboo67 set messages: +
2022-04-05 10:27:26 adiboo67 set nosy: + adiboo67; messages: +
2021-12-12 00:11:21 iritkatriel set versions: + Python 3.11, - Python 3.5; nosy: + iritkatriel; messages: +; type: behavior -> enhancement
2021-12-12 00:08:06 iritkatriel set nosy: + quentin-maire; messages: +
2021-09-29 17:08:26 vstinner set messages: -
2021-09-29 16:17:35 nico.bonefato set nosy: + nico.bonefato; messages: +
2021-04-02 15:48:12 vstinner set nosy: + vstinner; messages: +
2021-04-02 15:46:06 vstinner set messages: -
2021-04-02 15:46:05 vstinner set messages: -
2021-04-02 15:45:26 vstinner set messages: -
2021-04-02 15:44:53 vstinner set messages: -
2021-04-02 15:44:49 vstinner set messages: -
2021-04-02 15:44:33 vstinner set messages: -
2021-04-02 15:44:05 vstinner set messages: -
2021-04-02 15:42:31 vstinner set messages: -
2021-04-02 15:41:58 vstinner set messages: -
2021-04-02 15:41:37 vstinner set messages: -
2021-04-02 15:41:22 vstinner set messages: -
2021-04-02 15:40:51 vstinner set messages: -
2021-04-02 15:39:54 vstinner set messages: -
2021-04-02 15:39:52 vstinner set messages: -
2021-04-02 15:38:37 vstinner set messages: -
2021-04-02 15:37:42 vstinner set messages: -
2021-04-02 15:36:49 vstinner set title: référencement naturel -> robotsparser deny all with some rules
2021-04-02 15:36:09 vstinner set messages: -
2021-04-02 15:36:07 vstinner set messages: -
2021-04-02 15:33:20 EricG set messages: +
2021-04-02 15:30:36 EricG set nosy: + EricG, - jeanotlapin, nico702, ideeanimationanniversaire; messages: +; title: robotsparser deny all with some rules -> référencement naturel
2021-01-28 13:35:19 jeanotlapin set nosy: + jeanotlapin; messages: +
2020-11-19 17:30:47 ideeanimationanniversaire set nosy: + ideeanimationanniversaire; messages: +
2020-10-25 23:01:14 nico702 set messages: +
2020-10-25 22:55:38 nico702 set nosy: + nico702, - matthieuhemea; messages: +
2020-10-05 18:11:29 matthieuhemea set nosy: + matthieuhemea, - Patrick Valibus 410 Gone, Jmgray47, arnaud, calamina, amiir.mascud, jeanotlapin; messages: +
2020-09-18 15:20:07 jeanotlapin set nosy: + jeanotlapin; messages: +
2020-09-17 15:15:46 amiir.mascud set nosy: + amiir.mascud; messages: +
2020-08-28 12:00:03 calamina set nosy: + calamina; messages: +
2020-07-31 13:24:17 arnaud set nosy: + arnaud; messages: +
2020-07-31 04:34:49 Jmgray47 set nosy: + Jmgray47; messages: +
2020-06-22 20:35:43 Patrick Valibus 410 Gone set nosy: + Patrick Valibus 410 Gone, - cheryl.sabella, quentin-maire, lagustais, artasca, Fred AYERS, mathias44; messages: +
2020-05-28 23:54:56 mathias44 set nosy: + mathias44; messages: +
2020-04-28 17:20:52 Fred AYERS set nosy: + Fred AYERS; messages: +
2020-04-15 12:57:20 artasca set nosy: + artasca; messages: +
2020-04-04 16:46:51 lagustais set nosy: + lagustais; messages: +
2019-03-18 23:20:00 quentin-maire set messages: +
2019-03-18 22:13:37 cheryl.sabella set nosy: + cheryl.sabella; messages: +
2019-03-06 09:42:01 quentin-maire create