[Python-Dev] PEP 454: add a new tracemalloc module (second round)
Victor Stinner victor.stinner at gmail.com
Tue Sep 17 12:36:03 CEST 2013
- Previous message: [Python-Dev] PEP 454: add a new tracemalloc module (second round)
- Next message: [Python-Dev] PEP 453: Explicit bootstrapping of pip
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
2013/9/17 Victor Stinner <victor.stinner at gmail.com>:
Issue tracking the implementation: http://bugs.python.org/issue18874
If you want to test the implementation, you can try the following repository: http://hg.python.org/features/tracemalloc
Or apply the patch attached to issue #18874 to the Python default branch. Compile Python and use the "-X tracemalloc" command line option to enable the module at startup. Then you can play with tracemalloc.DisplayTopTask.display() and tracemalloc.TakeSnapshot.take_snapshot().
To create Python snapshots easily, modify Lib/test/regrtest.py to use the following block:
take = tracemalloc.TakeSnapshot()
tracemalloc.set_traceback_limit(15)
take.with_traces = True
take.filename_template = "/tmp/tracemalloc-$pid-$counter.pickle"
take.start(10)
And then start:
./python -X tracemalloc -m test
"take.with_traces = True" is slower but required if you want to use cumulative views, show the traceback, or group traces by address. Use a lower traceback limit if the test suite is too slow.
When you get a snapshot file, you can analyze it using:
./python -m tracemalloc /tmp/tracemalloc-1564-0001.pickle
Add --help to see all options. I like the --traceback option.
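The file name used above (tracemalloc-1564-0001.pickle) is consistent with the template being expanded the way string.Template does it. Here is a minimal sketch of that expansion; the helper name and the 4-digit zero padding of the counter are assumptions read off the example file name, not taken from the implementation.

```python
from string import Template


def expand_snapshot_filename(template, pid, counter):
    """Expand $pid and $counter in a snapshot file name template.

    Assumption: $counter is zero-padded to 4 digits, as suggested by
    the example file name /tmp/tracemalloc-1564-0001.pickle.
    """
    return Template(template).substitute(pid=pid, counter="%04d" % counter)


name = expand_snapshot_filename("/tmp/tracemalloc-$pid-$counter.pickle", 1564, 1)
print(name)
```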
``get_filters()`` function: Get the filters on Python memory allocations as a list of ``Filter`` instances.
I hesitate to add a Filters class which would contain a list of filters. The logic to check whether a list of filters matches is non-trivial. You have to split inclusive and exclusive filters and take care of an empty list of inclusive or exclusive filters. See the code of Snapshot.apply_filters() for an example.
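To make the matching rule concrete, here is a rough sketch of the logic described above. The Filter class below is a hypothetical stand-in with only an include flag and a file name pattern; it is not the tracemalloc Filter class, and the real Snapshot.apply_filters() may differ in details.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Filter:
    # Hypothetical stand-in for illustration, not tracemalloc.Filter.
    include: bool
    filename_pattern: str

    def matches_pattern(self, filename):
        return fnmatchcase(filename, self.filename_pattern)


def filters_match(filters, filename):
    """Return True if filename passes the whole list of filters."""
    # Split the list into inclusive and exclusive filters.
    inclusive = [f for f in filters if f.include]
    exclusive = [f for f in filters if not f.include]
    # An empty inclusive list means "include everything"; otherwise
    # at least one inclusive filter has to match.
    if inclusive and not any(f.matches_pattern(filename) for f in inclusive):
        return False
    # An empty exclusive list excludes nothing; otherwise a single
    # matching exclusive filter rejects the file name.
    return not any(f.matches_pattern(filename) for f in exclusive)


filters = [Filter(True, "/usr/lib/python*"), Filter(False, "*/test/*")]
print(filters_match(filters, "/usr/lib/python3.4/os.py"))
print(filters_match(filters, "/usr/lib/python3.4/test/regrtest.py"))
```

The two empty-list cases are the part that is easy to get wrong, which is the argument for hiding this logic behind a single helper or class.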
``get_object_trace(obj)`` function: Get the trace of a Python object obj as a ``Trace`` instance. The function only returns the trace of the memory block directly holding the object. The ``size`` attribute of the trace is smaller than the total size of the object if the object is composed of more than one memory block. Return ``None`` if the ``tracemalloc`` module did not trace the allocation of the object. See also ``gc.get_referrers()`` and ``sys.getsizeof()`` functions.
The function can be seen as a lie because it does not count all the bytes of an object (as explained in the doc above). Maybe the function should be renamed to "get_trace(address)" to avoid the confusion.
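The same kind of "lie" already exists in sys.getsizeof(), which only counts the memory directly attributed to an object. The sketch below does not use tracemalloc at all; it only illustrates the gap between the size of a list and the size of the list plus its items. The helper name is made up, and it descends one level only.

```python
import sys


def shallow_and_deep_size(container):
    """Return (size of the container alone, size including its items).

    Only one level of items is counted; this is an illustration, not a
    general deep-size function.
    """
    shallow = sys.getsizeof(container)
    deep = shallow + sum(sys.getsizeof(item) for item in container)
    return shallow, deep


# 100 distinct strings, so no item is counted twice.
words = [str(i) * 10 for i in range(100)]
shallow, deep = shallow_and_deep_size(words)
print("list only: %d bytes, list and items: %d bytes" % (shallow, deep))
```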
DisplayTop class
----------------
Oh, I forgot to document the new "previous_top_stats" attribute. It is used to compare two snapshots.
DisplayTopTask class
--------------------
``start(delay: int)`` method: Start a task using the ``start_timer()`` function calling the ``display()`` method every delay seconds.
I should probably repeat here that only one timer can be used at the same time. So only one DisplayTopTask or one TakeSnapshot instance can be used at the same time.
It's a design choice to keep start_timer() simple: there is no need for a complex scheduler for such a simple debug tool.
You can run the two tasks at the same time by writing your own function:
def mytask(top_task, snapshot_task):
    top_task.display()
    snapshot_task.take_snapshot()
tracemalloc.start_timer(10, mytask, top_task, snapshot_task)
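The same idea works for any number of tasks. Below is a small sketch of a helper that bundles several callables into one, so that a single timer is enough. The name combine_tasks and the two lambda tasks are made up for the example; it runs without tracemalloc.

```python
def combine_tasks(*tasks):
    """Return a callable which runs all tasks, in the given order."""
    def run_all():
        for task in tasks:
            task()
    return run_all


# Stand-ins for top_task.display and snapshot_task.take_snapshot.
calls = []
run_both = combine_tasks(lambda: calls.append("display"),
                         lambda: calls.append("snapshot"))
run_both()
print(calls)
```

With real tasks, the combined callable would be the only function registered with the timer, for example combine_tasks(top_task.display, snapshot_task.take_snapshot).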
Snapshot class
--------------
``create(*, with_traces=False, with_stats=True, user_data_callback=None)`` classmethod:
It's the only function using keyword-only parameters. I don't know whether it's a good practice that should be used on other methods, or whether it should be avoided.
user_data_callback is an optional callable object. Its result should be serializable by the ``pickle`` module, or ``Snapshot.write()`` would fail. If user_data_callback is set, it is called and the result is stored in the ``Snapshot.user_data`` attribute. Otherwise, ``Snapshot.user_data`` is set to ``None``.
The idea is to attach arbitrary data to a snapshot. Examples:
- size of Python caches: cache of linecache and re modules
- size of the internal Unicode intern dict
- gc.get_stats()
- gc.get_count()
- len(gc.get_objects())
- ("tracemalloc_size" should maybe be moved to the user_data)
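As an illustration, a callback collecting some of the gc values listed above could look like the sketch below. The function name and the dictionary keys are made up, and a dictionary is only one possible shape for the result. The pickle round trip checks the one hard requirement: the result must be serializable.

```python
import gc
import pickle


def collect_user_data():
    """Hypothetical user data callback: gather a few gc numbers."""
    return {
        "gc_count": gc.get_count(),           # tuple of three counters
        "gc_stats": gc.get_stats(),           # list of per-generation dicts
        "gc_objects": len(gc.get_objects()),  # number of tracked objects
    }


data = collect_user_data()
# The result has to survive pickling, otherwise writing the snapshot fails.
restored = pickle.loads(pickle.dumps(data))
print(sorted(restored))
```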
I hesitate to use a dictionary for user_data. The problem is to decide how to display such data in DisplayTop. For example, gc.get_count() is a number whereas tracemalloc_size is a size in bytes (it should be formatted using kB, MB, etc. suffixes).
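One possible way out, sketched under assumptions: tag each value with its kind, so the display code knows whether to format it as a size or as a plain number. The (kind, value) layout, the function names and the 1024-based units are all choices made for this example, not part of the current implementation.

```python
def format_size(nbytes):
    """Format a byte count with a B, kB, MB or GB suffix (1 kB = 1024 B)."""
    value = float(nbytes)
    for unit in ("B", "kB", "MB"):
        if abs(value) < 1024:
            break
        value /= 1024
    else:
        # The value is still >= 1024 MB after the loop.
        unit = "GB"
    if unit == "B":
        return "%d B" % nbytes
    return "%.1f %s" % (value, unit)


def format_user_data(user_data):
    """Format a dict mapping a name to a (kind, value) pair.

    kind is "size" (bytes) or "count" (plain number).
    """
    lines = []
    for name, (kind, value) in sorted(user_data.items()):
        text = format_size(value) if kind == "size" else str(value)
        lines.append("%s: %s" % (name, text))
    return lines


example = {"tracemalloc_size": ("size", 2048), "gc_objects": ("count", 5000)}
for line in format_user_data(example):
    print(line)
```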
What do you think?
Victor