[Python-Dev] Issue 2986: difflib.SequenceMatcher is partly broken (original) (raw)
Tres Seaver tseaver at palladion.com
Wed Jul 7 22:11:29 CEST 2010
- Previous message: [Python-Dev] Issue 2986: difflib.SequenceMatcher is partly broken
- Next message: [Python-Dev] Issue 2986: difflib.SequenceMatcher is partly broken
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Antoine Pitrou wrote:
On Wed, 7 Jul 2010 19:44:31 +0200 Eli Bendersky <eliben at gmail.com> wrote:
For what it's worth, my benchmarking showed that modifying the heuristic to only kick in when there are more than 100 kinds of elements (Terry's option A) didn't affect the runtime of matching whatsoever, even when the heuristic does kick in. All it adds, really, is the overhead of a single 'if' statement. So it wouldn't be right to assume that somehow modifying the heuristic or allowing to turn it off will negatively affect performance in the special case Tim originally optimized for. Just because it doesn't affect performance in your tests doesn't mean it won't do so in the general case. Consider a case where Tim's junk optimization kicked in and helped improve performance a lot, but where there are still less than 100 alphabet symbols. The new heuristic will ruin this use case.
That would describe pretty much every C program ever written, for instance, and nearly as high a percentage of all Python modules / scripts ever written: the 'string.printable' set, less formfeed and vertical tab, is 98 characters long.
That's why I'm advocating a dedicated flag instead.
+1.
Tres. - --
Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkw032wACgkQ+gerLs4ltQ5xigCfVLhTzFX733cZAO2Jv6JZQm0i HoIAmQEnTyxa2oLAuE22M7FZHUS00xDu =WYt2 -----END PGP SIGNATURE-----
- Previous message: [Python-Dev] Issue 2986: difflib.SequenceMatcher is partly broken
- Next message: [Python-Dev] Issue 2986: difflib.SequenceMatcher is partly broken
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]