[Python-Dev] [Python-checkins] cpython: Close #14205: dict lookup raises a RuntimeError if the dict is modified during (original) (raw)

Jim Jewett jimjjewett at gmail.com
Fri Mar 9 22:32:33 CET 2012


I do not believe the change set below is valid.

As I read it, the new test verifies that one particular type of Nasty key will provoke a RuntimeError -- but that particular type already did so, by hitting the recursion limit. (It doesn't even really mutate the dict.)

Meanwhile, the patch throws out tests for several different types of mutations that have caused problems -- even segfaults -- in the past, even after the dict implementation code was already "fixed".

Changing these tests to "assertRaises" would be fine, but they should all be kept; if nothing else, they test whether you've caught all mutation avenues.

-jJ

On Mon, Mar 5, 2012 at 7:13 PM, victor.stinner <python-checkins at python.org> wrote:

http://hg.python.org/cpython/rev/934aaf2191d0 changeset:   75445:934aaf2191d0 user:        Victor Stinner <victor.stinner at gmail.com> date:        Tue Mar 06 01:03:13 2012 +0100 summary:  Close #14205: dict lookup raises a RuntimeError if the dict is modified during a lookup.

"if you want to make a sandbox on top of CPython, you have to fix segfaults" so let's fix segfaults! files:  Lib/test/crashers/nastyeqvsdict.py |   47 --  Lib/test/testdict.py                 |   22 +-  Lib/test/testmutants.py              |  291 --------------  Misc/NEWS                             |    5 +-  Objects/dictobject.c                  |   18 +-  5 files changed, 31 insertions(+), 352 deletions(-)

diff --git a/Lib/test/crashers/nastyeqvsdict.py b/Lib/test/crashers/nastyeqvsdict.py deleted file mode 100644 --- a/Lib/test/crashers/nastyeqvsdict.py +++ /dev/null @@ -1,47 +0,0 @@ -# from http://mail.python.org/pipermail/python-dev/2001-June/015239.html - -# if you keep changing a dictionary while looking up a key, you can -# provoke an infinite recursion in C - -# At the time neither Tim nor Michael could be bothered to think of a -# way to fix it. - -class Yuck: -    def init(self): -        self.i = 0 - -    def makedangerous(self): -        self.i = 1 - -    def hash(self): -        # direct to slot 4 in table of size 8; slot 12 when size 16 -        return 4 + 8 - -    def eq(self, other): -        if self.i == 0: -            # leave dict alone -            pass -        elif self.i == 1: -            # fiddle to 16 slots _-            self.filldict(6) -            self.i = 2 -        else: -            # fiddle to 8 slots _-            self.filldict(4) -            self.i = 1 - -        return 1 - _-    def filldict(self, n): -        self.i = 0 -        dict.clear() -        for i in range(n): -            dict[i] = i -        dict[self] = "OK!" - -y = Yuck() -dict = {y: "OK!"} - -z = Yuck() -y.makedangerous() -print(dict[z]) diff --git a/Lib/test/testdict.py b/Lib/test/testdict.py --- a/Lib/test/testdict.py +++ b/Lib/test/testdict.py @@ -379,7 +379,7 @@ x.fail = True self.assertRaises(Exc, d.pop, x) -    def testmutatingiteration(self): +    def testmutatingiteration(self): # changing dict size during iteration d = {} d[1] = 1 @@ -387,6 +387,26 @@ for i in d: d[i+1] = 1 +    def testmutatinglookup(self): +        # changing dict during a lookup +        class NastyKey: +            mutatedict = None + +            def hash(self): +                # hash collision! +                return 1 + +            def eq(self, other): +                if self.mutatedict: +                    self.mutatedict[self] = 1 +                return self == other + +        d = {} +        d[NastyKey()] = 0 +        NastyKey.mutatedict = d +        with self.assertRaises(RuntimeError): +            d[NastyKey()] = None + def testrepr(self): d = {} self.assertEqual(repr(d), '{}') diff --git a/Lib/test/testmutants.py b/Lib/test/testmutants.py deleted file mode 100644 --- a/Lib/test/testmutants.py +++ /dev/null @@ -1,291 +0,0 @@ -from test.support import verbose, TESTFN -import random -import os - -# From SF bug #422121:  Insecurities in dict comparison. - -# Safety of code doing comparisons has been an historical Python weak spot. -# The problem is that comparison of structures written in C naturally -# wants to hold on to things like the size of the container, or "the -# biggest" containee so far, across a traversal of the container; but -# code to do containee comparisons can call back into Python and mutate -# the container in arbitrary ways while the C loop is in midstream.  If the -# C code isn't extremely paranoid about digging things out of memory on -# each trip, and artificially boosting refcounts for the duration, anything -# from infinite loops to OS crashes can result (yes, I use Windows ). -# -# The other problem is that code designed to provoke a weakness is usually -# white-box code, and so catches only the particular vulnerabilities the -# author knew to protect against.  For example, Python's list.sort() code -# went thru many iterations as one "new" vulnerability after another was -# discovered. -# -# So the dict comparison test here uses a black-box approach instead, -# generating dicts of various sizes at random, and performing random -# mutations on them at random times.  This proved very effective, -# triggering at least six distinct failure modes the first 20 times I -# ran it.  Indeed, at the start, the driver never got beyond 6 iterations -# before the test died. - -# The dicts are global to make it easy to mutate tham from within functions. -dict1 = {} -dict2 = {} - -# The current set of keys in dict1 and dict2.  These are materialized as -# lists to make it easy to pick a dict key at random. -dict1keys = [] -dict2keys = [] - -# Global flag telling maybemutate() whether to consider mutating. -mutate = 0 - -# If global mutate is true, consider mutating a dict.  May or may not -# mutate a dict even if mutate is true.  If it does decide to mutate a -# dict, it picks one of {dict1, dict2} at random, and deletes a random -# entry from it; or, more rarely, adds a random element. - -def maybemutate(): -    global mutate -    if not mutate: -        return -    if random.random() < 0.5:_ _-        return_ _-_ _-    if random.random() < 0.5:_ _-        target, keys = dict1, dict1keys_ _-    else:_ _-        target, keys = dict2, dict2keys_ _-_ _-    if random.random() < 0.2:_ _-        # Insert a new key._ _-        mutate = 0   # disable mutation until key inserted_ _-        while 1:_ _-            newkey = Horrid(random.randrange(100))_ _-            if newkey not in target:_ _-                break_ _-        target[newkey] = Horrid(random.randrange(100))_ _-        keys.append(newkey)_ _-        mutate = 1_ _-_ _-    elif keys:_ _-        # Delete a key at random._ _-        mutate = 0   # disable mutation until key deleted_ _-        i = random.randrange(len(keys))_ _-        key = keys[i]_ _-        del target[key]_ _-        del keys[i]_ _-        mutate = 1_ _-_ _-# A horrid class that triggers random mutations of dict1 and dict2 when_ _-# instances are compared._ _-_ _-class Horrid:_ _-    def _init_(self, i):_ _-        # Comparison outcomes are determined by the value of i._ _-        self.i = i_ _-_ _-        # An artificial hashcode is selected at random so that we don't_ _-        # have any systematic relationship between comparison outcomes_ _-        # (based on self.i and other.i) and relative position within the_ _-        # hash vector (based on hashcode)._ _-        # XXX This is no longer effective._ _-        ##self.hashcode = random.randrange(1000000000)_ _-_ _-    def _hash_(self):_ _-        return 42_ _-        return self.hashcode_ _-_ _-    def _eq_(self, other):_ _-        maybemutate()   # The point of the test._ _-        return self.i == other.i_ _-_ _-    def _ne_(self, other):_ _-        raise RuntimeError("I didn't expect some kind of Spanish inquisition!")_ _-_ _-    _lt_ = _le_ = _gt_ = _ge_ = _ne__ _-_ _-    def _repr_(self):_ _-        return "Horrid(%d)" % self.i_ _-_ _-# Fill dict d with numentries (Horrid(i), Horrid(j)) key-value pairs,_ _-# where i and j are selected at random from the candidates list._ _-# Return d.keys() after filling._ _-_ _-def filldict(d, candidates, numentries):_ _-    d.clear()_ _-    for i in range(numentries):_ _-        d[Horrid(random.choice(candidates))] = _ _-            Horrid(random.choice(candidates))_ _-    return list(d.keys())_ _-_ _-# Test one pair of randomly generated dicts, each with n entries._ _-# Note that dict comparison is trivial if they don't have the same number_ _-# of entires (then the "shorter" dict is instantly considered to be the_ _-# smaller one, without even looking at the entries)._ _-_ _-def testone(n):_ _-    global mutate, dict1, dict2, dict1keys, dict2keys_ _-_ _-    # Fill the dicts without mutating them._ _-    mutate = 0_ _-    dict1keys = filldict(dict1, range(n), n)_ _-    dict2keys = filldict(dict2, range(n), n)_ _-_ _-    # Enable mutation, then compare the dicts so long as they have the_ _-    # same size._ _-    mutate = 1_ _-    if verbose:_ _-        print("trying w/ lengths", len(dict1), len(dict2), end=' ')_ _-    while dict1 and len(dict1) == len(dict2):_ _-        if verbose:_ _-            print(".", end=' ')_ _-        c = dict1 == dict2_ _-    if verbose:_ _-        print()_ _-_ _-# Run testone n times.  At the start (before the bugs were fixed), 20_ _-# consecutive runs of this test each blew up on or before the sixth time_ _-# testone was run.  So n doesn't have to be large to get an interesting_ _-# test._ _-# OTOH, calling with large n is also interesting, to ensure that the fixed_ _-# code doesn't hold on to refcounts *too* long (in which case memory would_ _-# leak)._ _-_ _-def test(n):_ _-    for i in range(n):_ _-        testone(random.randrange(1, 100))_ _-_ _-# See last comment block for clues about good values for n._ _-test(100)_ _-_ _-##########################################################################_ _-# Another segfault bug, distilled by Michael Hudson from a c.l.py post._ _-_ _-class Child:_ _-    def _init_(self, parent):_ _-        self._dict_['parent'] = parent_ _-    def _getattr_(self, attr):_ _-        self.parent.a = 1_ _-        self.parent.b = 1_ _-        self.parent.c = 1_ _-        self.parent.d = 1_ _-        self.parent.e = 1_ _-        self.parent.f = 1_ _-        self.parent.g = 1_ _-        self.parent.h = 1_ _-        self.parent.i = 1_ _-        return getattr(self.parent, attr)_ _-_ _-class Parent:_ _-    def _init_(self):_ _-        self.a = Child(self)_ _-_ _-# Hard to say what this will print!  May vary from time to time.  But_ _-# we're specifically trying to test the tpprint slot here, and this is_ _-# the clearest way to do it.  We print the result to a temp file so that_ _-# the expected-output file doesn't need to change._ _-_ _-f = open(TESTFN, "w")_ _-print(Parent()._dict_, file=f)_ _-f.close()_ _-os.unlink(TESTFN)_ _-_ _-##########################################################################_ _-# And another core-dumper from Michael Hudson._ _-_ _-dict = {}_ _-_ _-# Force dict to malloc its table._ _-for i in range(1, 10):_ _-    dict[i] = i_ _-_ _-f = open(TESTFN, "w")_ _-_ _-class Machiavelli:_ _-    def _repr_(self):_ _-        dict.clear()_ _-_ _-        # Michael sez:  "doesn't crash without this.  don't know why."_ _-        # Tim sez:  "luck of the draw; crashes with or without for me."_ _-        print(file=f)_ _-_ _-        return repr("machiavelli")_ _-_ _-    def _hash_(self):_ _-        return 0_ _-_ _-dict[Machiavelli()] = Machiavelli()_ _-_ _-print(str(dict), file=f)_ _-f.close()_ _-os.unlink(TESTFN)_ _-del f, dict_ _-_ _-_ _-##########################################################################_ _-# And another core-dumper from Michael Hudson._ _-_ _-dict = {}_ _-_ _-# let's force dict to malloc its table_ _-for i in range(1, 10):_ _-    dict[i] = i_ _-_ _-class Machiavelli2:_ _-    def _eq_(self, other):_ _-        dict.clear()_ _-        return 1_ _-_ _-    def _hash_(self):_ _-        return 0_ _-_ _-dict[Machiavelli2()] = Machiavelli2()_ _-_ _-try:_ _-    dict[Machiavelli2()]_ _-except KeyError:_ _-    pass_ _-_ _-del dict_ _-_ _-##########################################################################_ _-# And another core-dumper from Michael Hudson._ _-_ _-dict = {}_ _-_ _-# let's force dict to malloc its table_ _-for i in range(1, 10):_ _-    dict[i] = i_ _-_ _-class Machiavelli3:_ _-    def _init_(self, id):_ _-        self.id = id_ _-_ _-    def _eq_(self, other):_ _-        if self.id == other.id:_ _-            dict.clear()_ _-            return 1_ _-        else:_ _-            return 0_ _-_ _-    def _repr_(self):_ _-        return "%s(%s)"%(self._class_._name_, self.id)_ _-_ _-    def _hash_(self):_ _-        return 0_ _-_ _-dict[Machiavelli3(1)] = Machiavelli3(0)_ _-dict[Machiavelli3(2)] = Machiavelli3(0)_ _-_ _-f = open(TESTFN, "w")_ _-try:_ _-    try:_ _-        print(dict[Machiavelli3(2)], file=f)_ _-    except KeyError:_ _-        pass_ _-finally:_ _-    f.close()_ _-    os.unlink(TESTFN)_ _-_ _-del dict_ _-del dict1, dict2, dict1keys, dict2keys_ _diff --git a/Misc/NEWS b/Misc/NEWS_ _--- a/Misc/NEWS_ _+++ b/Misc/NEWS_ _@@ -10,10 +10,13 @@_  _Core and Builtins_  _-----------------_ _+- Issue #14205: dict lookup raises a RuntimeError if the dict is modified_ _+  during a lookup._ _+_  _Library_  _-------_ _-- Issue #14168: Check for presence of Element.attrs in minidom before_ _+- Issue #14168: Check for presence of Element.attrs in minidom before_ _accessing it._  _- Issue #12328: Fix multiprocessing's use of overlapped I/O on Windows._ _diff --git a/Objects/dictobject.c b/Objects/dictobject.c_ _--- a/Objects/dictobject.c_ _+++ b/Objects/dictobject.c_ _@@ -347,12 +347,9 @@_ _return ep;_ _}_ _else {_ _-                /* The compare did major nasty stuff to the_ _-                 * dict:  start over._ _-                 * XXX A clever adversary could prevent this_ _-                 * XXX from terminating._ _-                 */_ _-                return lookdict(mp, key, hash);_ _+                PyErrSetString(PyExcRuntimeError,_ _+                                "dictionary changed size during lookup");_ _+                return NULL;_ _}_ _}_ _freeslot = NULL;_ _@@ -379,12 +376,9 @@_ _return ep;_ _}_ _else {_ _-                /* The compare did major nasty stuff to the_ _-                 * dict:  start over._ _-                 * XXX A clever adversary could prevent this_ _-                 * XXX from terminating._ _-                 */_ _-                return lookdict(mp, key, hash);_ _+                PyErrSetString(PyExcRuntimeError,_ _+                                "dictionary changed size during lookup");_ _+                return NULL;_ _}_ _}_ _else if (ep->mekey == dummy && freeslot == NULL) -- Repository URL: http://hg.python.org/cpython


Python-checkins mailing list Python-checkins at python.org http://mail.python.org/mailman/listinfo/python-checkins



More information about the Python-Dev mailing list