[Python-Dev] Counting collisions for the win

Guido van Rossum guido at python.org
Fri Jan 20 19:15:21 CET 2012


On Fri, Jan 20, 2012 at 5:10 AM, Barry Warsaw <barry at python.org> wrote:

On Jan 20, 2012, at 01:50 PM, Victor Stinner wrote:

>Counting collision doesn't solve this case, but it doesn't make the
>situation worse than before. Raising quickly an exception is better
>than stalling for minutes, even if I agree that it is not the best
>behaviour.

ISTM that adding the possibility of raising a new exception on dictionary insertion is more backward incompatible than changing dictionary order, which for a very long time has been known to not be guaranteed.

You're running some application, you upgrade Python because you apply all security fixes, and suddenly you're starting to get exceptions in places you can't really do anything about. Yet those exceptions are now part of the documented public API for dictionaries. This is asking for trouble. Bugs will suddenly start appearing in that application's tracker and they will seem to the application developer like Python just added a new public API in a security release.

Dict insertion can already raise an exception: MemoryError. I think we should be safe if the new exception also derives from BaseException. We should actually seriously consider just raising MemoryError, since introducing a new built-in exception in a bugfix release is also very questionable: code explicitly catching or raising it would not work on previous bugfix releases of the same feature release.
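To make the compatibility point concrete, here is a minimal sketch. The names `HashCollisionError`, `insert`, `safe_insert` and the `limit` parameter are invented for illustration and are not part of any Python release; subclassing MemoryError is an assumption, not something proposed above. It only shows that a handler already written against MemoryError would also catch such an exception, whereas code naming a brand-new built-in could not run on earlier bugfix releases where that name does not exist.

```python
# Hypothetical exception type; NOT a real built-in. Subclassing
# MemoryError is an assumption made purely for this illustration.
class HashCollisionError(MemoryError):
    pass


def insert(table, key, value, limit=None):
    # Stand-in for a dict insertion that may fail. The 'limit' check
    # merely simulates "too many collisions"; real dicts have no such
    # parameter.
    if limit is not None and len(table) >= limit:
        raise HashCollisionError("too many collisions (simulated)")
    table[key] = value


def safe_insert(table, key, value, limit=None):
    # A handler written against MemoryError, as existing code might be.
    # It also catches the hypothetical subclass without naming it, so
    # this code runs unchanged on releases that lack the new name.
    try:
        insert(table, key, value, limit)
        return True
    except MemoryError:
        return False
```

By contrast, a clause such as `except HashCollisionError:` that refers to a new built-in name would raise NameError on an older bugfix release as soon as the handler is evaluated.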

OTOH, if you change dictionary order and that breaks the application, then the bugs submitted to the application's tracker will be legitimate bugs that have to be fixed even if nothing else changed.

There are lots of things that are undefined according to the language spec (and quite possibly known to vary between versions or platforms or implementations like PyPy or Jython) but which we would never change in a bugfix release.

So I still think we should ditch the paranoia about dictionary order changing, and fix this without counting. A little bit of paranoia could creep back in by disabling the hash fix by default in stable releases, but I think it would be fine to make that a compile-time option.

I'm sorry, but I don't want to break a user's app with a bugfix release and say "haha your code was already broken you just didn't know it".

Sure, the dict order already varies across Python implementations, possibly across 32/64 bits or operating systems. But many organizations (I know a few :-) have a very large installed software base, created over many years by many people with varying skills, that is kept working in part by very carefully keeping the environment as constant as possible. This means that the target environment is much more predictable than it is for the typical piece of open source software.

Sure, a good Python developer doesn't write apps or tests that depend on dict order. But time and again we see that not everybody writes perfect code every time. Especially users writing "in-house" apps (as opposed to frameworks shared as open source) are less likely to always use the most robust, portable algorithms in existence, because they may know with much more certainty that their code will never be used on certain combinations of platforms. For example, I rarely think about whether code I write might not work on IronPython or Jython, or even CPython on Windows. And if something I wrote suddenly needs to be ported to one of those, well, that's considered a port and I'll just accept that it might mean changing a few things.
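As an illustration of the kind of dependency being discussed, here is a small sketch; the function names `render_fragile` and `render_robust` are made up for this example. The first function's output follows whatever order the dict happens to iterate in, while the second sorts the keys so the result does not change if that order does.

```python
def render_fragile(params):
    # Output order follows the dict's iteration order, which the
    # language did not guarantee in the releases under discussion.
    return "&".join("%s=%s" % (k, v) for k, v in params.items())


def render_robust(params):
    # Sorting the keys makes the output independent of iteration order.
    return "&".join("%s=%s" % (k, params[k]) for k in sorted(params))
```

A test that compares the output of `render_fragile` against a fixed string is exactly the sort of code that can pass for years in a constant environment and then fail when the hash function changes.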

The time to break a dependency on dict order is not with a bugfix release but with a feature release: those are more likely to break other things as well anyway, and users are well aware that they have to test everything and anticipate having to fix some fraction of their code for each feature release. OTOH we have established a long and successful track record of conservative bugfix releases that don't break anything. (I am aware of exactly one thing that was broken by a bugfix release in application code I am familiar with.)

--
--Guido van Rossum (python.org/~guido)