Issue 46615: Use-after-free by mutating set during set operations (original) (raw)

Created on 2022-02-02 18:01 by Dennis Sweeney, last changed 2022-04-11 14:59 by admin.

Messages (13)

msg412386 - (view)

Author: Dennis Sweeney (Dennis Sweeney) * (Python committer)

Date: 2022-02-02 18:01

Maybe related to https://bugs.python.org/issue8420

Somewhat obscure, but using only standard Python, and no frame- or gc-hacks, it looks like we can get a use-after-free:

from random import random

BADNESS = 0.0

class Bad: def eq(self, other): if random() < BADNESS: set1.clear() if random() < BADNESS: set2.clear() return True def hash(self): return 42

SIZE = 100 TRIALS = 10_000

ops = [ "|", "|=", "==", "!=", "<", "<=", ">", ">=", # "&", # crash! # "&=", # crash! "^", # "^=", # crash # "-", # crash "-=", ]

for op in ops: stmt = f"set1 {op} set2" print(stmt, "...") for _ in range(TRIALS): BADNESS = 0.00 set1 = {Bad() for _ in range(SIZE)} set2 = {Bad() for _ in range(SIZE)} BADNESS = 0.02 exec(stmt) print("ok.")

msg412387 - (view)

Author: Dennis Sweeney (Dennis Sweeney) * (Python committer)

Date: 2022-02-02 18:12

replacing return True with return random() < 0.5 makes all of the operations crash, except for | and |=.

msg412389 - (view)

Author: Dennis Sweeney (Dennis Sweeney) * (Python committer)

Date: 2022-02-02 18:17

set1.isdisjoint(set2) also crashes

msg412444 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2022-02-03 14:22

The likely culprit is the set_next() loop. Perhaps it is never safe to use set_next() because any lookup can callback to eq which can mutate the set.

Since set_isdisjoint() method isn't a mutating method, that is the easiest place to start investigating. Try disabling the exact set fast path to see if the issue persists.

msg412448 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2022-02-03 16:42

Presumably _PyDict_Next is also suspect. Even the advertised "safe" calls to PyDict_SetItem() for existing keys would be a trigger. Calling clear() in either eq or hash would suffice.

If the next loops are the culprint, the new challenge is figuring out how to fix it without wrecking code clarity and performance (and having to deprecate PyDict_Next() which is part of the stable ABI).

msg412487 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2022-02-04 01:19

Marking as low priority given that ehe next loop code has been deployed without incident for two decades (a little less for sets and a little more for dicts).

msg412489 - (view)

Author: Dennis Sweeney (Dennis Sweeney) * (Python committer)

Date: 2022-02-04 03:12

It looks like usages of the PyDict_Next API assume the resulting references are borrowed and so INCREF them.

Usages of set_next do not, but should.

It should hopefully be a straightforward fix of adding INCREF/DECREFs.

msg412494 - (view)

Author: Tim Peters (tim.peters) * (Python committer)

Date: 2022-02-04 05:42

Raised the priority back to normal.

I agree with Dennis's observation that PyDict_Next is safe, provided it's used as intended: it returns borrowed references, but to things that absolutely are legitimate at the time. In the presence of mutations, what it returns isn't defined at all, but I don't see a way for it to blow up (unless its caller screws up by believing it owns the references). It doesn't assume anything about the structure of the dict across calls.

msg413085 - (view)

Author: Dennis Sweeney (Dennis Sweeney) * (Python committer)

Date: 2022-02-11 16:25

New changeset 4a66615ba736f84eadf9456bfd5d32a94cccf117 by Dennis Sweeney in branch 'main': bpo-46615: Don't crash when set operations mutate the sets (GH-31120) https://github.com/python/cpython/commit/4a66615ba736f84eadf9456bfd5d32a94cccf117

msg413097 - (view)

Author: miss-islington (miss-islington)

Date: 2022-02-11 20:44

New changeset 1f5fe9962f768c8bfd4ed06a22532d31d3424dc9 by Miss Islington (bot) in branch '3.10': bpo-46615: Don't crash when set operations mutate the sets (GH-31120) https://github.com/python/cpython/commit/1f5fe9962f768c8bfd4ed06a22532d31d3424dc9

msg413175 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2022-02-13 10:30

New changeset c31b8a97a8a7e8255231c9e12ed581c6240c0d6c by Dennis Sweeney in branch '3.9': bpo-46615: Don't crash when set operations mutate the sets (GH-31120) (GH-31312) https://github.com/python/cpython/commit/c31b8a97a8a7e8255231c9e12ed581c6240c0d6c

msg413176 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2022-02-13 10:41

Thanks Dennis for your report and PRs.

Do you mind to analyze also uses of _PySet_NextEntry(), PyDict_Next() and _PyDict_Next()? Many of them look safe, but _pickle.c seems vulnerable.

msg413530 - (view)

Author: Dennis Sweeney (Dennis Sweeney) * (Python committer)

Date: 2022-02-19 05:24

It does look like there are some pickle situations that crash. Attached is a randomized crasher. I haven't done too much careful reasoning about it, but adding INCREFs everywhere seems to fix most of the issues.

History

Date

User

Action

Args

2022-04-11 14:59:55

admin

set

github: 90773

2022-04-06 17:05:20

serhiy.storchaka

unlink

issue46721 dependencies

2022-02-19 05:24:06

Dennis Sweeney

set

files: + picklecrasher.py

messages: +

2022-02-13 10:41:21

serhiy.storchaka

set

messages: +

2022-02-13 10:30:20

serhiy.storchaka

set

messages: +

2022-02-13 02:47:09

Dennis Sweeney

set

pull_requests: + <pull%5Frequest29473>

2022-02-11 20:44:24

miss-islington

set

messages: +

2022-02-11 20:09:01

miss-islington

set

nosy: + miss-islington
pull_requests: + <pull%5Frequest29444>

2022-02-11 16:25:17

Dennis Sweeney

set

messages: +

2022-02-11 15:40:10

serhiy.storchaka

link

issue46721 dependencies

2022-02-04 05:42:31

tim.peters

set

priority: low -> normal
nosy: + tim.peters
messages: +

2022-02-04 04:24:42

Dennis Sweeney

set

keywords: + patch
stage: patch review
pull_requests: + <pull%5Frequest29301>

2022-02-04 03:12:21

Dennis Sweeney

set

messages: +

2022-02-04 01:19:08

rhettinger

set

priority: normal -> low

messages: +

2022-02-03 16:42:14

rhettinger

set

messages: +

2022-02-03 14:22:04

rhettinger

set

messages: +

2022-02-03 09:06:00

serhiy.storchaka

set

nosy: + serhiy.storchaka

2022-02-02 18:17:01

Dennis Sweeney

set

messages: +

2022-02-02 18:15:04

Dennis Sweeney

set

title: Segfault in set intersection (&) and difference (-) -> Use-after-free by mutating set during set operations

2022-02-02 18:12:23

Dennis Sweeney

set

messages: +

2022-02-02 18:01:22

Dennis Sweeney

create