gh-97514: Authenticate the forkserver control socket. by gpshead · Pull Request #99309 · python/cpython

This adds authentication to the forkserver control socket. Previously, only filesystem permissions protected this socket from code injection into the forkserver process, by limiting access to processes running as the same UID. That protection did not exist when Linux abstract namespace sockets were used (see issue), meaning that any process in the same system network namespace could inject code. We've since stopped using abstract namespace sockets by default, but protecting our control sockets regardless of type seems desirable.

This reuses the HMAC based shared key auth already used by multiprocessing.connection sockets for other purposes.
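For readers unfamiliar with the idea, here is a minimal sketch of a generic HMAC challenge-response exchange with a shared key. It is not the wire format used by multiprocessing.connection or by this PR; the function names and the choice of SHA-256 are assumptions made for illustration only.

```python
import hmac
import os


def make_challenge(nbytes: int = 32) -> bytes:
    """Verifier side: produce a fresh random challenge (hypothetical helper)."""
    return os.urandom(nbytes)


def sign_challenge(authkey: bytes, challenge: bytes) -> bytes:
    """Prover side: prove knowledge of authkey without sending the key itself."""
    # SHA-256 is an assumption for this sketch.
    return hmac.new(authkey, challenge, "sha256").digest()


def verify_response(authkey: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier side: recompute the MAC and compare in constant time."""
    expected = hmac.new(authkey, challenge, "sha256").digest()
    return hmac.compare_digest(expected, response)
```

A peer that holds the same key passes the check, while a peer with a different key fails it, even though the key never crosses the socket.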

This is useful so that filesystem permissions are not relied upon, and so that trust isn't implied by default between all processes running as the same UID with access to the Unix socket.

Tasks remaining

pyperformance benchmarks

No significant changes, including in concurrent_imap, which exercises multiprocessing.Pool.imap in that suite.

Microbenchmarks

This does slightly slow down forkserver use. How much appears to depend on the platform. Modern platforms and simple platforms are less impacted. This PR adds IPC round trips on the control socket each time forkserver is told to spawn a new process. Systems with potentially high-latency IPC are naturally impacted more.
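To get a rough feel for what one blocking request/response round trip costs on a given machine, something like the sketch below can be used. It is an illustration only: `measure_round_trips` is a hypothetical helper, and it uses an echo thread in the same process rather than a separate forkserver process, so the GIL and thread scheduling influence the result.

```python
import socket
import threading
import time


def measure_round_trips(count: int = 1000) -> float:
    """Return the mean seconds per one-byte request/response round trip."""
    client, server = socket.socketpair()

    def echo() -> None:
        # Reply to each one-byte request until the client closes its end.
        with server:
            while True:
                data = server.recv(1)
                if not data:
                    break
                server.sendall(data)

    worker = threading.Thread(target=echo)
    worker.start()
    with client:
        start = time.perf_counter()
        for _ in range(count):
            client.sendall(b"x")
            client.recv(1)
        elapsed = time.perf_counter() - start
    # Closing the client makes the echo thread see EOF and exit.
    worker.join()
    return elapsed / count
```

Multiplying the result by the number of extra round trips per spawn gives only a lower-bound style estimate of the added cost per process.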

Using my multiprocessing process-creation-benchmark.py:

I switched between this PR branch and main with a simple git checkout after my build; the changes are pure Python, so no rebuild is needed.
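The benchmark script itself is not reproduced here. As a rough idea of what such a measurement might look like, here is a minimal sketch. It is not the author's script: `create_processes` and `_noop` are hypothetical names, and the policy of joining the oldest child once the active limit is reached is an assumption.

```python
import multiprocessing
import time


def _noop() -> None:
    """Child body: return immediately so mostly creation cost is measured."""


def create_processes(method: str, total: int, max_active: int) -> float:
    """Start `total` children with the given start method; return procs/sec."""
    ctx = multiprocessing.get_context(method)
    active = []
    start = time.perf_counter()
    for _ in range(total):
        if len(active) >= max_active:
            # Wait for the oldest child before starting another one.
            active.pop(0).join()
        proc = ctx.Process(target=_noop)
        proc.start()
        active.append(proc)
    for proc in active:
        proc.join()
    return total / (time.perf_counter() - start)
```

When run as a script, a call such as `create_processes("forkserver", 128, 7)` belongs under an `if __name__ == "__main__":` guard, since the forkserver and spawn start methods import the main module in the child.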

On an AMD zen4 system:

889 Procs/sec dropped to 874, about 1.7% slower. Insignificant.

AMD 7800X3D single-CCD 8 cores.

% ../b/python process-creation-benchmark.py 5 forkserver
Process Creation Microbenchmark (max 7 active processes) (5 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          666.77      0.049     108.39
128         831.09      0.154      44.05
384         887.16      0.433       9.27
1024        886.02      1.156       1.37
2048        888.99      2.304       2.76

% ./b/python ~/Downloads/process-creation-benchmark.py 5 forkserver
Process Creation Microbenchmark (max 7 active processes) (5 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          640.53      0.052     130.27
128         809.62      0.158      38.79
384         867.22      0.443       7.66
1024        873.75      1.172       2.76
2048        873.57      2.344       2.85

Expand for baseline fork (2659 Procs/sec) and spawn (268) measurements.

```
% ../b/python ~/Downloads/process-creation-benchmark.py 13 fork
Process Creation Microbenchmark (max 7 active processes) (13 iterations)
multiprocessing start method: fork
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32        2,300.78      0.014      78.91
128       2,391.11      0.054     114.68
384       2,650.31      0.145      13.23
1024      2,646.28      0.387      16.47
2048      2,641.08      0.775      13.65
5120      2,659.42      1.925      11.82
% ../b/python ~/Downloads/process-creation-benchmark.py 13 spawn
Process Creation Microbenchmark (max 7 active processes) (13 iterations)
multiprocessing start method: spawn
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          235.96      0.136      13.91
128         259.53      0.493       0.79
384         267.62      1.435       1.00
1024        267.89      3.822       0.35
```

On an Intel Broadwell Xeon E5-2698 v4 system:

828 Procs/sec dropped to 717. ~15% slower. Significant. But when I dropped the active processes from 19 to 9, the difference was far smaller: 414 dropped to 398, ~4% slower. Moderate.
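"Slower" can be read two ways: as the relative drop in throughput, or as the relative increase in time per process. The small sketch below computes both for the rounded rates quoted above; the helper names are hypothetical and which measure was intended is not stated in the original.

```python
def throughput_drop(before: float, after: float) -> float:
    """Fraction by which processes per second decreased."""
    return (before - after) / before


def time_per_process_increase(before: float, after: float) -> float:
    """Fraction by which seconds per process increased (inverse of the rate)."""
    return before / after - 1.0


# Rounded Procs/sec figures from the Broadwell runs: (label, main, PR branch).
for label, before, after in [("19 active", 828, 717), ("9 active", 414, 398)]:
    print(
        f"{label}: throughput -{throughput_drop(before, after):.1%}, "
        f"time per process +{time_per_process_increase(before, after):.1%}"
    )
```

For the 19-process run this gives about a 13.4% throughput drop and a 15.5% increase in time per process; for the 9-process run both measures come out at roughly 4%.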

20 cores, 2 ring buses, 4 memory controllers, single socket. A large-die Broadwell Xeon is complicated. At high parallelism counts, interprocess communication latencies add up. I predict similar results from multi-core-complex-die Zen/EPYC parts and multi-socket systems, and probably also on big.LITTLE mixed power/performance core arrangements.

% ../b/python ~/process-creation-benchmark.py 13 forkserver
Process Creation Microbenchmark (max 19 active processes) (13 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          535.14      0.062      77.23
128         735.49      0.174       6.53
384         798.69      0.481       4.43
1024        820.84      1.248       1.90
2048        827.63      2.475       4.31

% ../b/python ~/process-creation-benchmark.py 13 forkserver
Process Creation Microbenchmark (max 19 active processes) (13 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          449.24      0.073      63.19
128         614.39      0.208      16.66
384         668.49      0.575      11.36
1024        716.77      1.430      18.10
2048        716.73      2.858      13.12

Expand for baseline fork (1265 Procs/sec) and spawn (233) measurements.

% ../b/python ~/process-creation-benchmark.py 13 fork
Process Creation Microbenchmark (max 19 active processes) (13 iterations)
multiprocessing start method: fork
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32        1,241.39      0.026      51.43
128       1,259.44      0.102       5.01
384       1,254.59      0.306       3.86
1024      1,258.45      0.814       6.77
2048      1,265.48      1.618       8.34
% ./b/python ~/process-creation-benchmark.py 13 spawn
Process Creation Microbenchmark (max 19 active processes) (13 iterations)
multiprocessing start method: spawn
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          188.08      0.170       2.58
128         221.20      0.579       0.75
384         227.56      1.687       0.85
1024        233.34      4.388       0.54

On a Raspberry Pi 5:

126 Procs/sec dropped to 121. A ~4% slowdown. Moderate.

Raspberry Pi 5 running 32-bit Raspbian.

% ./python ../process-creation-benchmark.py 
Process Creation Microbenchmark (max 3 active processes) (5 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024, 19:06:56) [GCC 12.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          121.23      0.266       9.82
128         125.45      1.020       0.84
384         125.71      3.055       0.27

% ./python ../process-creation-benchmark.py 
Process Creation Microbenchmark (max 3 active processes) (5 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey:07c01d4, Nov 10 2024, 19:06:56) [GCC 12.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          114.57      0.281      10.29
128         119.70      1.069       0.28
384         120.84      3.178       0.41

Expand for baseline fork (973 Procs/sec) and spawn (32) measurements.

% ./python ../process-creation-benchmark.py 5 fork
Process Creation Microbenchmark (max 3 active processes) (5 iterations)
multiprocessing start method: fork
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024, 19:06:56) [GCC 12.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          933.01      0.034      44.03
128         973.00      0.132       1.33
384         968.48      0.396       1.55
1024        972.78      1.053       0.77
% ./python ../process-creation-benchmark.py 5 spawn
Process Creation Microbenchmark (max 3 active processes) (5 iterations)
multiprocessing start method: spawn
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024, 19:06:56) [GCC 12.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32           31.97      1.001       0.12
128          32.46      3.943       0.02