[Python-Dev] Issue 2986: difflib.SequenceMatcher is partly broken
Terry Reedy tjreedy at udel.edu
Wed Jul 14 03:45:25 CEST 2010
Summary: adding an autojunk heuristic to difflib without also adding a way to turn it off was a bug because it disabled running code.
2.6 and 3.1 will most likely have one final release each. Don't fix the problem for these, but add something to the docs explaining the problem and the future fix.
2.7 will have several more versions over several years and will be used by newcomers who might encounter the problem but not know to diagnose it and patch a private copy of the module. So it should have a fix. Solutions thought of so far:
1. Modify the heuristic to somewhat fix the problem. Bad (unacceptable) because this would silently change behavior and could break tests.
2. Add a parameter that defaults to using the heuristic but allows turning it off. Perhaps better, but code that used the new API would crash if run on 2.7.0.
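To make the second option concrete, here is a minimal sketch. It assumes the switch is a keyword argument named `autojunk`, which is the name that later releases (2.7.1 and 3.2) adopted; nothing in this message fixes the name. On an interpreter without the parameter, the second constructor call raises TypeError, which is the crash mentioned above.

```python
from difflib import SequenceMatcher

# Two sequences that differ by one inserted element. The second sequence
# has more than 200 items and a single element dominates it, so the
# heuristic treats that element as "popular" and ignores it.
a = "b" * 200
b = "a" + "b" * 200

# Default behavior: the heuristic is on and the obvious match is lost.
with_heuristic = SequenceMatcher(None, a, b).ratio()

# Assumed keyword `autojunk`: turn the heuristic off.
without_heuristic = SequenceMatcher(None, a, b, autojunk=False).ratio()

print(with_heuristic)               # 0.0
print(round(without_heuristic, 4))  # 0.9975
```

The two ratios show why a silent default matters: the same inputs go from "nothing in common" to "nearly identical" depending only on whether the heuristic runs.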
Tim Peters wrote:
> Think the most pressing thing is to give people a way to turn the damn thing off. An ugly way would be to trigger on an unlikely input-output behavior of the existing isjunk argument. For example, if isjunk("what's the airspeed velocity of an unladen swallow?") returned "don't use auto junk!" and 2.7.1 recognized that as meaning "don't use auto junk", code could be written under 2.7.1 that didn't blow up under 2.7. It could behave differently, although that's true of any way of disabling the auto-junk heuristics.
Ugly, but perhaps crazy brilliant. Use of such a hack would obviously be temporary. Perhaps its use could be made to issue a -3 warning if such were enabled.
I would simplify the suggestion to something like isjunk("disable!heuristic") == True, so one could pass lambda s: s == "disable!heuristic". It should be something easy to document and write. This issue is the only place such a string should appear, so it should be safe.
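A rough sketch of how that could look from both sides. The helper `heuristic_disabled` is hypothetical: it only illustrates the kind of probe a patched 2.7.1 SequenceMatcher might make, and is not code from difflib. The stock module simply sees an ordinary isjunk callable.

```python
from difflib import SequenceMatcher

# Magic string proposed in this thread; no real input should equal it.
SENTINEL = "disable!heuristic"

def disable_autojunk(s):
    """isjunk callback that marks only the sentinel string as junk."""
    return s == SENTINEL

def heuristic_disabled(isjunk):
    """Hypothetical probe: has the caller asked to turn the heuristic off?"""
    return isjunk is not None and isjunk(SENTINEL) is True

# Caller side: this constructor call runs on an unpatched difflib too,
# where the callback just never matches any real element.
sm = SequenceMatcher(disable_autojunk, "abcd", "bcde")

print(heuristic_disabled(disable_autojunk))   # True
print(heuristic_disabled(None))               # False
print(sm.ratio())                             # 0.75
```

Because the callback returns False for every real element, results on an unpatched interpreter are the same as passing None, so the same source runs on 2.7.0 and only changes behavior where the probe is recognized.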
Tim and Antoine: if you two can agree on what to do for 2.7, Eli and I will code it.
This amounts to a suggestion that the fix for 2.7 be decoupled from a better fix for 3.2. I agree. The latter can be discussed once 2.7 is settled.
[copied to the tracker]
Terry Jan Reedy