[Python-Dev] PEP 454: add a new tracemalloc module (second round)

Victor Stinner victor.stinner at gmail.com
Tue Sep 17 12:36:03 CEST 2013


2013/9/17 Victor Stinner <victor.stinner at gmail.com>:

Issue tracking the implementation: http://bugs.python.org/issue18874

If you want to test the implementation, you can try the following repository: http://hg.python.org/features/tracemalloc

Or try the patch attached to issue #18874 on the Python default branch. Compile Python and use the "-X tracemalloc" command line option to enable the module at startup. Then you can play with tracemalloc.DisplayTopTask.display() and tracemalloc.TakeSnapshot.take_snapshot().

To create Python snapshots easily, modify Lib/test/regrtest.py to use the following block:

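import tracemalloc

take = tracemalloc.TakeSnapshot()
# Store up to 15 frames per traceback; traces are needed for cumulative views
tracemalloc.set_traceback_limit(15)
take.with_traces = True
take.filename_template = "/tmp/tracemalloc-$pid-$counter.pickle"
# Take a snapshot every 10 seconds
take.start(10)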

And then start:

./python -X tracemalloc -m test

"take.with_traces = True" is slower but required if you want to use cumulative views, show the traceback, or group traces by address. Use a lower traceback limit if the test suite is too slow.

When you get a snapshot file, you can analyze it using:

./python -m tracemalloc /tmp/tracemalloc-1564-0001.pickle

Add --help to see all options. I like the --traceback option.

get_filters() function:

Get the filters on Python memory allocations as a list of Filter instances.

I hesitate to add a Filters class which would contain a list of filters. The logic to check whether a list of filters matches is non-trivial. You have to split inclusive and exclusive filters and take care of empty lists of inclusive/exclusive filters. See the code of Snapshot.apply_filters() for example.
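A minimal sketch of that matching logic, under stated assumptions: the Filter tuple below (an "include" flag plus a filename glob) and the filters_match() helper are hypothetical names written for this example, not the classes from the patch. It assumes that an empty inclusive list means "keep everything" and an empty exclusive list means "drop nothing".

```python
import fnmatch
from collections import namedtuple

# Hypothetical stand-in for a filter: "include" selects inclusive vs.
# exclusive, "pattern" is a filename glob. Not the class from the patch.
Filter = namedtuple("Filter", ["include", "pattern"])


def filters_match(filters, filename):
    """Return True if a trace allocated in *filename* should be kept."""
    inclusive = [f for f in filters if f.include]
    exclusive = [f for f in filters if not f.include]

    # With no inclusive filter, everything is included by default.
    if inclusive and not any(
        fnmatch.fnmatchcase(filename, f.pattern) for f in inclusive
    ):
        return False
    # With no exclusive filter, nothing is excluded.
    return not any(fnmatch.fnmatchcase(filename, f.pattern) for f in exclusive)


filters = [Filter(True, "*/Lib/*"), Filter(False, "*/Lib/test/*")]
names = ["/src/Lib/os.py", "/src/Lib/test/regrtest.py", "/src/app.py"]
kept = [name for name in names if filters_match(filters, name)]
```

Here only "/src/Lib/os.py" is kept: the test file is included and then excluded, and "/src/app.py" matches no inclusive filter.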

get_object_trace(obj) function:

Get the trace of a Python object obj as a Trace instance. The function only returns the trace of the memory block directly holding the object. The size attribute of the trace is smaller than the total size of the object if the object is composed of more than one memory block. Return None if the tracemalloc module did not trace the allocation of the object. See also gc.get_referrers() and sys.getsizeof() functions.

The function can be seen as a lie because it does not count all bytes of an object (as explained in the doc above). Maybe the function should be renamed to "get_trace(address)" to avoid the confusion.

DisplayTop class
----------------

Oh, I forgot to document the new "previous_top_stats" attribute. It is used to compare two snapshots.

DisplayTopTask class
--------------------

start(delay: int) method: Start a task using the start_timer() function, calling the display() method every delay seconds.

I should probably repeat here that only one timer can be used at a time. So only one DisplayTopTask or one TakeSnapshot instance can be active at a time.

It's a design choice to keep start_timer() simple: there is no need for a complex scheduler in such a simple debug tool.

You can run the two tasks at the same time by writing your own function:

def mytask(top_task, snapshot_task):
    top_task.display()
    snapshot_task.take_snapshot()

tracemalloc.start_timer(10, mytask, top_task, snapshot_task)
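To illustrate why the combined function is needed, here is a toy model of a single-slot timer. The start_timer() and fire_timer() functions below are stand-ins written for this example (they only model the "one timer at a time" rule and are not the tracemalloc implementation), and display_top() and take_snapshot() are placeholder tasks.

```python
# Toy model: a single timer slot. Registering a new callback replaces the
# previous one. This is an assumption-based stand-in, not tracemalloc code.
_timer = None


def start_timer(delay, func, *args):
    global _timer
    _timer = (delay, func, args)


def fire_timer():
    """Simulate the timer expiring once."""
    _delay, func, args = _timer
    func(*args)


log = []


def display_top():
    log.append("display")


def take_snapshot():
    log.append("snapshot")


# Registering two tasks separately: the second replaces the first.
start_timer(10, display_top)
start_timer(10, take_snapshot)
fire_timer()
after_separate = list(log)


# Wrapping both tasks in one function runs them together.
def both():
    display_top()
    take_snapshot()


start_timer(10, both)
fire_timer()
after_combined = log[len(after_separate):]
```

In this model, after_separate only contains the snapshot task, whereas after_combined contains both tasks in order.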

Snapshot class
--------------

``create(*, with_traces=False, with_stats=True, user_data_callback=None)`` classmethod:

It's the only function using keyword-only parameters. I don't know if it's a good practice that should be used in other methods, or if it should be avoided.

user_data_callback is an optional callable object. Its result should be serializable by the pickle module, or Snapshot.write() would fail. If user_data_callback is set, it is called and the result is stored in the Snapshot.user_data attribute. Otherwise, Snapshot.user_data is set to None.
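A small sketch of how a callback could be checked before it is passed to create(), using only the standard pickle module. The helper is_picklable() and the two sample callbacks are hypothetical names introduced for this example.

```python
import gc
import pickle


def gc_counts():
    # A plain tuple of integers: safe to pickle.
    return gc.get_count()


def open_generator():
    # Generator objects cannot be pickled.
    return (n for n in range(3))


def is_picklable(callback):
    """Call *callback* and report whether its result can be pickled."""
    try:
        pickle.dumps(callback())
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


ok = is_picklable(gc_counts)
bad = is_picklable(open_generator)
```

A callback like open_generator() would make the snapshot impossible to write, so checking it once up front avoids a late failure.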

The idea is to attach arbitrary data to a snapshot. Examples:

I hesitate to use a dictionary for user_data. The problem is to decide how to display such data in DisplayTop. For example, gc.get_count() is a number whereas tracemalloc_size is a size in bytes (it should be formatted using kB, MB, etc. suffixes).
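One possible way to handle this, sketched under assumptions: each entry carries a "kind" tag that selects a formatter. The kind names, the format_size() helper, the FORMATTERS table and the sample values are all made up for this example and are not part of the proposed API.

```python
def format_size(size):
    """Format a size in bytes using B, kB, MB or GB (1 kB = 1024 B)."""
    if abs(size) < 1024:
        return "%d B" % size
    size /= 1024.0
    for unit in ("kB", "MB"):
        if abs(size) < 1024:
            return "%.1f %s" % (size, unit)
        size /= 1024.0
    return "%.1f GB" % size


# Hypothetical mapping from a "kind" tag to a display function.
FORMATTERS = {"int": str, "size": format_size}

# Placeholder values: name -> (kind, value). Only strings and integers,
# so the dictionary stays picklable.
user_data = {
    "gc_objects": ("int", 1200),
    "tracemalloc_size": ("size", 3 * 1024 * 1024),
}

lines = [
    "%s: %s" % (name, FORMATTERS[kind](value))
    for name, (kind, value) in sorted(user_data.items())
]
```

With these placeholder values, lines contains "gc_objects: 1200" and "tracemalloc_size: 3.0 MB".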

What do you think?

Victor


