Issue 46615: Use-after-free by mutating set during set operations (original) (raw)
Created on 2022-02-02 18:01 by Dennis Sweeney, last changed 2022-04-11 14:59 by admin.
Messages (13)
Author: Dennis Sweeney (Dennis Sweeney) *
Date: 2022-02-02 18:01
Maybe related to https://bugs.python.org/issue8420
Somewhat obscure, but using only standard Python, and no frame- or gc-hacks, it looks like we can get a use-after-free:
from random import random
BADNESS = 0.0
class Bad: def eq(self, other): if random() < BADNESS: set1.clear() if random() < BADNESS: set2.clear() return True def hash(self): return 42
SIZE = 100 TRIALS = 10_000
ops = [ "|", "|=", "==", "!=", "<", "<=", ">", ">=", # "&", # crash! # "&=", # crash! "^", # "^=", # crash # "-", # crash "-=", ]
for op in ops: stmt = f"set1 {op} set2" print(stmt, "...") for _ in range(TRIALS): BADNESS = 0.00 set1 = {Bad() for _ in range(SIZE)} set2 = {Bad() for _ in range(SIZE)} BADNESS = 0.02 exec(stmt) print("ok.")
Author: Dennis Sweeney (Dennis Sweeney) *
Date: 2022-02-02 18:12
replacing return True
with return random() < 0.5
makes all of the operations crash, except for |
and |=
.
Author: Dennis Sweeney (Dennis Sweeney) *
Date: 2022-02-02 18:17
set1.isdisjoint(set2) also crashes
Author: Raymond Hettinger (rhettinger) *
Date: 2022-02-03 14:22
The likely culprit is the set_next() loop. Perhaps it is never safe to use set_next() because any lookup can callback to eq which can mutate the set.
Since set_isdisjoint() method isn't a mutating method, that is the easiest place to start investigating. Try disabling the exact set fast path to see if the issue persists.
Author: Raymond Hettinger (rhettinger) *
Date: 2022-02-03 16:42
Presumably _PyDict_Next is also suspect. Even the advertised "safe" calls to PyDict_SetItem() for existing keys would be a trigger. Calling clear() in either eq or hash would suffice.
If the next loops are the culprint, the new challenge is figuring out how to fix it without wrecking code clarity and performance (and having to deprecate PyDict_Next() which is part of the stable ABI).
Author: Raymond Hettinger (rhettinger) *
Date: 2022-02-04 01:19
Marking as low priority given that ehe next loop code has been deployed without incident for two decades (a little less for sets and a little more for dicts).
Author: Dennis Sweeney (Dennis Sweeney) *
Date: 2022-02-04 03:12
It looks like usages of the PyDict_Next API assume the resulting references are borrowed and so INCREF them.
Usages of set_next do not, but should.
It should hopefully be a straightforward fix of adding INCREF/DECREFs.
Author: Tim Peters (tim.peters) *
Date: 2022-02-04 05:42
Raised the priority back to normal.
I agree with Dennis's observation that PyDict_Next is safe, provided it's used as intended: it returns borrowed references, but to things that absolutely are legitimate at the time. In the presence of mutations, what it returns isn't defined at all, but I don't see a way for it to blow up (unless its caller screws up by believing it owns the references). It doesn't assume anything about the structure of the dict across calls.
Author: Dennis Sweeney (Dennis Sweeney) *
Date: 2022-02-11 16:25
New changeset 4a66615ba736f84eadf9456bfd5d32a94cccf117 by Dennis Sweeney in branch 'main': bpo-46615: Don't crash when set operations mutate the sets (GH-31120) https://github.com/python/cpython/commit/4a66615ba736f84eadf9456bfd5d32a94cccf117
Author: miss-islington (miss-islington)
Date: 2022-02-11 20:44
New changeset 1f5fe9962f768c8bfd4ed06a22532d31d3424dc9 by Miss Islington (bot) in branch '3.10': bpo-46615: Don't crash when set operations mutate the sets (GH-31120) https://github.com/python/cpython/commit/1f5fe9962f768c8bfd4ed06a22532d31d3424dc9
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2022-02-13 10:30
New changeset c31b8a97a8a7e8255231c9e12ed581c6240c0d6c by Dennis Sweeney in branch '3.9': bpo-46615: Don't crash when set operations mutate the sets (GH-31120) (GH-31312) https://github.com/python/cpython/commit/c31b8a97a8a7e8255231c9e12ed581c6240c0d6c
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2022-02-13 10:41
Thanks Dennis for your report and PRs.
Do you mind to analyze also uses of _PySet_NextEntry(), PyDict_Next() and _PyDict_Next()? Many of them look safe, but _pickle.c seems vulnerable.
Author: Dennis Sweeney (Dennis Sweeney) *
Date: 2022-02-19 05:24
It does look like there are some pickle situations that crash. Attached is a randomized crasher. I haven't done too much careful reasoning about it, but adding INCREFs everywhere seems to fix most of the issues.
History
Date
User
Action
Args
2022-04-11 14:59:55
admin
set
github: 90773
2022-04-06 17:05:20
serhiy.storchaka
unlink
2022-02-19 05:24:06
Dennis Sweeney
set
files: + picklecrasher.py
messages: +
2022-02-13 10:41:21
serhiy.storchaka
set
messages: +
2022-02-13 10:30:20
serhiy.storchaka
set
messages: +
2022-02-13 02:47:09
Dennis Sweeney
set
pull_requests: + <pull%5Frequest29473>
2022-02-11 20:44:24
miss-islington
set
messages: +
2022-02-11 20:09:01
miss-islington
set
nosy: + miss-islington
pull_requests: + <pull%5Frequest29444>
2022-02-11 16:25:17
Dennis Sweeney
set
messages: +
2022-02-11 15:40:10
serhiy.storchaka
link
2022-02-04 05:42:31
tim.peters
set
priority: low -> normal
nosy: + tim.peters
messages: +
2022-02-04 04:24:42
Dennis Sweeney
set
keywords: + patch
stage: patch review
pull_requests: + <pull%5Frequest29301>
2022-02-04 03:12:21
Dennis Sweeney
set
messages: +
2022-02-04 01:19:08
rhettinger
set
priority: normal -> low
messages: +
2022-02-03 16:42:14
rhettinger
set
messages: +
2022-02-03 14:22:04
rhettinger
set
messages: +
2022-02-03 09:06:00
serhiy.storchaka
set
nosy: + serhiy.storchaka
2022-02-02 18:17:01
Dennis Sweeney
set
messages: +
2022-02-02 18:15:04
Dennis Sweeney
set
title: Segfault in set intersection (&) and difference (-) -> Use-after-free by mutating set during set operations
2022-02-02 18:12:23
Dennis Sweeney
set
messages: +
2022-02-02 18:01:22
Dennis Sweeney
create