Issue 21475: Support the Sitemap extension in robotparser

Created on 2014-05-12 01:35 by rhettinger, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name | Uploaded
robotparser_site_maps_v1.patch | pwirtz, 2015-10-15 19:51
Pull Requests
URL | Status | Linked
PR 6883 | merged | python-dev, 2018-05-15 21:55
Messages (14)
msg218308 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-05-12 01:35
Resources:
* http://en.wikipedia.org/wiki/Robots_exclusion_standard#Nonstandard_extensions
* https://support.google.com/webmasters/answer/183669?hl=en
* https://github.com/seomoz/reppy
msg218318 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2014-05-12 09:26
There is a patch for Crawl-delay in issue 16099.
msg252528 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2015-10-08 10:09
The Crawl-delay part (issue 16099) is now committed.
msg253027 - (view) Author: Peter Wirtz (pwirtz) * Date: 2015-10-15 03:10
I would like to tackle this issue. Should I wait for to be resolved first?
msg253035 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2015-10-15 08:23
issue 25400 is not a blocker of this, so feel free to write a patch.
msg253063 - (view) Author: Peter Wirtz (pwirtz) * Date: 2015-10-15 19:51
Here is a patch that provides support for the Sitemap extension.
msg255225 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2015-11-23 20:50
Add a test with your patch. Thank you
msg255228 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2015-11-23 21:17
Peter didn't write a test because issue 25497 (test_robotparser rewrite) needs to be committed first. See in issue 25400 for more information.
msg311416 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2018-02-01 10:10
Hi @berker and @pwirtz, could you write a test for this issue? Thanks.
msg315362 - (view) Author: Steven Steven (stevensalbert) Date: 2018-04-16 18:29
Kindly add a test for this issue
msg316740 - (view) Author: Lady Red (mcscope@gmail.com) * Date: 2018-05-15 22:06
I wrote a test for this as it seems to have been abandoned, and opened a PR. https://github.com/python/cpython/pull/6878
msg316743 - (view) Author: Lady Red (mcscope@gmail.com) * Date: 2018-05-15 22:09
Sorry, wrong PR number. It is 6883, and it is attached to this ticket.
msg316811 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-05-16 14:52
New changeset 5db5c0669e624767375593cc1a01f32092c91c58 by Ned Deily (Christopher Beacham) in branch 'master': bpo-21475: Support the Sitemap extension in robotparser (GH-6883) https://github.com/python/cpython/commit/5db5c0669e624767375593cc1a01f32092c91c58
msg316813 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-05-16 14:54
Thanks for the patch, Peter, and thanks for the PR and test, Lady Red! Merged for release in 3.8.0.
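For readers arriving from the changelog: the merged change is exposed in Python 3.8 and later as `RobotFileParser.site_maps()`. The following is a minimal sketch of how it can be used, not code taken from the patch or its test. The robots.txt content, the example.com URLs, and the helper name `read_sitemaps` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap-news.xml
"""


def read_sitemaps(robots_txt):
    """Return the Sitemap URLs declared in a robots.txt string, or None.

    read_sitemaps is a hypothetical helper; it feeds the text to parse()
    so that no network request is made.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() requires Python 3.8+; it returns None when the file
    # contains no Sitemap entries.
    return parser.site_maps()


sitemaps = read_sitemaps(ROBOTS_TXT)
print(sitemaps)
```

When the file has no `Sitemap` lines, `site_maps()` returns `None` rather than an empty list, so callers may want to check for that before iterating.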
History
Date User Action Args
2022-04-11 14:58:03 admin set github: 65674
2018-05-16 14:54:05 ned.deily set status: open -> closed; resolution: fixed; messages: +; stage: patch review -> resolved
2018-05-16 14:52:15 ned.deily set nosy: + ned.deily; messages: +
2018-05-15 22:09:13 mcscope@gmail.com set messages: +
2018-05-15 22:06:35 mcscope@gmail.com set nosy: + mcscope@gmail.com; messages: +
2018-05-15 21:55:17 python-dev set pull_requests: + pull_request6556
2018-04-16 18:29:21 stevensalbert set nosy: + stevensalbert; messages: +
2018-02-01 10:10:27 matrixise set messages: +
2018-01-29 20:55:19 rhettinger set versions: + Python 3.8, - Python 3.6
2015-11-25 13:32:45 vstinner set dependencies: + Rewrite test_robotparser
2015-11-23 21:17:14 berker.peksag set messages: +; stage: needs patch -> patch review
2015-11-23 20:50:32 matrixise set nosy: + matrixise; messages: +
2015-10-15 19:51:18 pwirtz set files: + robotparser_site_maps_v1.patch; keywords: + patch; messages: +
2015-10-15 08:23:49 berker.peksag set messages: +
2015-10-15 03:10:23 pwirtz set nosy: + pwirtz; messages: +
2015-10-08 10:09:09 berker.peksag set title: Support the Sitemap and Crawl-delay extensions in robotparser -> Support the Sitemap extension in robotparser; stage: needs patch; messages: +; versions: + Python 3.6, - Python 3.5
2014-05-12 09:26:56 berker.peksag set nosy: + berker.peksag; messages: +
2014-05-12 01:35:57 rhettinger create