[Python-Dev] Question regarding: Lib/_markupbase.py

Guido van Rossum guido at python.org
Mon Feb 11 20:02:04 CET 2013


Warning: see http://bugs.python.org/issue17170. Depending on the length of the string being scanned and the probability of finding the specific character, the proposed change could actually be a pessimization. OTOH if the character occurs many times, the slice will actually cause O(N**2) behavior. So yes, it depends greatly on the distribution of the input data.
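To make the trade-off concrete, here is a minimal sketch of the two variants under discussion, plus a small helper that counts how many characters the slicing variant copies when it is called repeatedly on input full of '>' characters. The function names, the -1 fallback, and the driver loop are assumptions made for illustration; they are not the actual code in Lib/_markupbase.py.

```python
def find_end_slice(rawdata, j):
    """Variant quoted from the file: membership test on a slice, then find()."""
    if '>' in rawdata[j:]:          # rawdata[j:] copies len(rawdata) - j characters
        return rawdata.find(">", j) + 1
    return -1                       # assumed fallback when no '>' is present


def find_end_single(rawdata, j):
    """Proposed variant: one find() call and no slice copy."""
    pos = rawdata.find(">", j)
    if pos != -1:
        return pos + 1
    return -1                       # assumed fallback when no '>' is present


def chars_copied_by_slices(rawdata):
    """Count characters copied by the slice variant over repeated calls.

    Hypothetical driver: restart the search just past each '>' found,
    the way a parser would when it meets many short declarations.
    """
    j = 0
    copied = 0
    while True:
        copied += len(rawdata) - j  # size of the temporary rawdata[j:]
        end = find_end_slice(rawdata, j)
        if end == -1:
            break
        j = end
    return copied
```

Both variants return the same position for the same input. With input consisting of n '>' characters, the helper reports n + (n-1) + ... + 1 = n(n+1)/2 copied characters, which is the quadratic behaviour described above. Whether the single find() is faster in the common case still depends on the input, as noted.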

On Mon, Feb 11, 2013 at 4:37 AM, Oleg Broytman <phd at phdru.name> wrote:

On Mon, Feb 11, 2013 at 12:16:48PM +0000, Developer Developer <_ _justanotherdeveloper at yahoo.de> wrote:
> I was having a look at the file: Lib/_markupbase.py (@ 82151), function: "_parse_doctype_element" and have seen something that has caught my attention:
>
> if '>' in rawdata[j:]:
>     return rawdata.find(">", j) + 1
>
>
> Wouldn't it be better to do the following?
> pos = rawdata.find(">", j)
> if pos != -1:
>     return pos + 1
>
> Otherwise I think we are scanning rawdata[j:] twice.

Is it really a significant optimization? Can you do an experiment and show figures?

Oleg.
-- 
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.
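One way to run such an experiment is sketched below with the standard timeit module. The helper names, input sizes, and repetition count are arbitrary choices for illustration, not taken from the thread, and no timings are claimed here: results will vary with the Python version and the input.

```python
import timeit


def with_slice(rawdata, j):
    # Variant quoted from the file: slice, membership test, then find().
    if '>' in rawdata[j:]:
        return rawdata.find(">", j) + 1
    return -1


def single_find(rawdata, j):
    # Proposed variant: a single find() call.
    pos = rawdata.find(">", j)
    if pos != -1:
        return pos + 1
    return -1


def bench(rawdata, j=0, number=1000):
    """Return total seconds for `number` calls of each variant."""
    return {
        "slice+find": timeit.timeit(lambda: with_slice(rawdata, j), number=number),
        "single find": timeit.timeit(lambda: single_find(rawdata, j), number=number),
    }


if __name__ == "__main__":
    # Hypothetical inputs: '>' found early, found late, and absent.
    cases = {
        "early hit": ">" + "a" * 100_000,
        "late hit": "a" * 100_000 + ">",
        "no hit": "a" * 100_000,
    }
    for label, data in cases.items():
        for variant, seconds in bench(data).items():
            print(f"{label:10s} {variant:12s} {seconds:.6f}s")
```

Varying the string length and the position of the first '>' should show how much the outcome depends on the distribution of the input data.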



-- 
--Guido van Rossum (python.org/~guido)


