[Python-Dev] Have a big machine and spare time? Here's a possible Python bug. (original) (raw)

Tim Peters tim.peters at gmail.com
Fri May 24 13:48:30 EDT 2019


[Inada Naoki <songofacandy at gmail.com>]

For the record, result for 10M nodes, Ubuntu 18.04 on AWS r5a.4xlarge:

I'm unclear on what "nodes" means. If you mean you changed 27M to 10M in this line:

for token in random_strings(27_000_000):

that's fine, but there are about 40 times more than that Node objects created by the program. So if you did change that to 10_000_000, you'd have about 400M Node objects, consuming about 80x that many bytes of RAM (or about 32GB).

$ local/bin/python3 t1.py # default 1138.1123778309993 -- end train, start del 688.7927911250008 -- end

$ arena-1m/bin/python3 t1.py # Changed ARENASIZE to 1MiB 1085.3363994170013 -- end train, start del 84.57135540099989 -- end

Nice! I assume these are seconds. On Stackoverflow, the OP reported that boosting ARENA_SIZE the same way cut deallocation time in his original program to about 13 minutes.

$ PYTHONMALLOC=malloc local/bin/python3 t1.py 1157.4882792789995 -- end train, start del 27.919834706000074 -- end

While the OP reported, for the original program:

""" With PYTHONMALLOC=malloc, insertion takes nearly twice as long, but deallocation only takes 2 minutes! """

The "nearly twice as long" for building the tree is in sharp contrast to what you saw, but there's about 3x as much of everything in the original program, and, e.;g., it's certainly possible malloc is tripping over fragmentation then that obmalloc avoids.

$ LDPRELOAD=/usr/lib/x8664-linux-gnu/libjemalloc.so.1 PYTHONMALLOC=malloc local/bin/python3 t1.py 1098.4383037820007 -- end train, start del 117.93938426599925 -- end

In this case, glibc malloc is the fastest. glibc is know to weak about fragmentation. But algorithm to avoid fragmentation is just an overhead in this script.

I can't say.

I'll note that the approach I very briefly sketched before (restructure the list of arenas to partition it into multiple lists partitioned by number of free pools) "should make" obmalloc competitive with malloc here (it would eliminate all searches, except perhaps (depending on how gonzo the code is) a rare need to search for "the first" non-empty list).



More information about the Python-Dev mailing list