gh-108634: Py_TRACE_REFS uses a hash table by vstinner · Pull Request #108663 · python/cpython (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation25 Commits3 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

vstinner

Member

@vstinner vstinner commented

Aug 30, 2023

edited by github-actionsbot

Loading

Python built with "configure --with-trace-refs" (tracing references) is now ABI compatible with Python release build and debug build. Moreover, it now also supports the Limited API.

Change Py_TRACE_REFS build:

Test changes for Py_TRACE_REFS:


📚 Documentation preview 📚: https://cpython-previews--108663.org.readthedocs.build/

@vstinner

Py_TRACE_REFS adds _Py_AddToAllObjects() and _Py_ForgetReference() to the private functions. These functions are not called by the public C API (only by the internal C API).

_Py_AddToAllObjects() is called by PyType_Ready() and _Py_NewReference().

_Py_ForgetReference() is called by _Py_Dealloc() and allocator functions using free lists (bytes, tuple, str).

These changes have no impact on the ABI. So the Py_TRACE_REFS is now ABI compatible with release and debug builds.

@vstinner

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @vstinner for commit b52011d 🤖

The command will test the builders whose names match following regular expression: TraceRefs

The builders matched are:

@vstinner

@vstinner

_PyRefchain_Trace() calls Py_FatalError() which kills the program with abort() (which may write a coredump file, depending on the system configuration). That's not great, but I didn't want to change _Py_NewReference() API to report an error only for Py_TRACE_REFS build, whereas this function cannot fail otherwise.

@vstinner

@vstinner

Failed builds:

test_zippath_from_non_installed_posix() and test_upgrade_dependencies() of test_venv failed: known issue #103224, unrelated to this PR.

@erlend-aasland

I'm not qualified to review this, but nevertheless, here's my unqualified opinion: the idea of using a hash map instead of patching the object layout seems good to me; making the trace ref build ABI compatible with ordinary build seems like a big win. I'll leave the proper review to someone who knows the GC better (for example, Pablo was already CC'd on this PR).

The configure changes are fine.

methane

@vstinner

@vstinner

@vstinner

I removed the force parameter and rebased the PR on the main branch.

@pitrou

How often is Py_TRACE_REFS used? Did you measure the overhead of your new implementation compared to the old one?

@vstinner

For a debug build, I don't really care of the memory usage (if it's "reasonable"). For example, enabling tracemalloc to trace Python memory allocation almost doubles the Python memory usage!

This PR increases the peak memory usage of a Py_TRACE_REFS build by +29%: 11 899 kiB => 15 408 kiB (+3 509 kiB). Test on ./python -m test test_sys peak Python memory usage.

How often is Py_TRACE_REFS used? Did you measure the overhead of your new implementation compared to the old one?

As far as I know, the main user is @pablogsal's TraceRefs buildbot which makes sure that Python still builds successfully with Py_TRACE_REFS (and that the test suite pass, as well).

On Python 3.7 and older, ./configure --with-pydebug defined Py_TRACE_REFS. Since Python 3.8, you now have to opt-in for it: ./configure --with-trace-refs. I expect that nobody uses such build, only a few core devs for specific debug tasks (who? I have no idea).


I expect that the main difference of my change is the memory usage. I measured it with:

$ ./python -X tracemalloc -i -m test test_sys
(...)
>>> import tracemalloc; print("peak Python memory usage: %.0f kiB" % (tracemalloc.get_traced_memory()[1] / 1024))

Python built with: ./configure --with-pydebug --with-trace-refs && make. It uses gcc -Og.

Memoy usage of Py_REF_DEBUG + Py_TRACE_REFS:

I ran the command 3 times, the first run has a higher peak memory usage, maybe because the first one used more memory to build PYC files.

By the way, len(sys.getobjects(0)) lists 72 289 objects on this test.


For comparison, memory usage of Py_REF_DEBUG (without Py_TRACE_REFS):

On the main branch, Py_TRACE_REFS adds 1 172 kiB (+11%).

@pitrou

Well, why do we care about improving Py_TRACE_REFS if we think nobody uses it? :-)

@vstinner

Well, why do we care about improving Py_TRACE_REFS if we think nobody uses it? :-)

I would like to convert some stdlib C extensions to the limited C API:

The practical problem is that we do have a TraceRefs buildbot, and I care of having only green buildbots.

Also, since it's unclear to me who uses TraceRefs, I prefer to fix it rather than ask who use it, or propose to deprecate/remove the feature. In the past, I removed COUNT_ALLOCS build, but this one was used by no one, and there was no buildbot :-)

Right now, trying to workaround TraceRefs build error with Py_LIMITED_API causes me headaches... I wrote a trivial PR to convert the _stat extension to the limited C API: PR #108573. But it breaks the Py_TRACE_REFS build, and so will break the TraceRefs buildbot. I tried to skip the _stat extension if Py_TRACE_REFS, it's optional, but then I got new issues: issue #108638...

In short, fixing Py_LIMITED_API support for Py_TRACE_REFS looks simpler to me than trying to workaround Py_TRACE_REFS build errors :-)

Also, well, IMO it's nice to have the same ABI for all Python builds! It should make these builds usable in more cases. I always wanted to give a try to the hash table idea. As you can see, it's not that complicated.

@vstinner

@erlend-aasland

This comment was marked as off-topic.

@pitrou

Hmm, I don't think I have ever used Py_TRACE_REFS for CPython core development.

@vstinner

@erlend-aasland

This comment was marked as off-topic.

@vstinner

methane

@vstinner

I checked that this change does not introduce new memory leak:

vstinner@mona$ PYTHONMALLOC=malloc valgrind ./python -c pass
==2715186== Memcheck, a memory error detector
==2715186== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==2715186== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==2715186== Command: ./python -c pass
==2715186== 
==2715186== 
==2715186== HEAP SUMMARY:
==2715186==     in use at exit: 0 bytes in 0 blocks
==2715186==   total heap usage: 52,597 allocs, 52,597 frees, 5,260,667 bytes allocated
==2715186== 
==2715186== All heap blocks were freed -- no leaks are possible
==2715186== 
==2715186== For lists of detected and suppressed errors, rerun with: -s
==2715186== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

There is no leak. For comparison, if I comment _PyObject_FiniState() in PyInterpreterState_Delete(), leak on purpose for the test:

...
==2715502== LEAK SUMMARY:
==2715502==    definitely lost: 0 bytes in 0 blocks
==2715502==    indirectly lost: 0 bytes in 0 blocks
==2715502==      possibly lost: 0 bytes in 0 blocks
==2715502==    still reachable: 14,448 bytes in 195 blocks
==2715502==         suppressed: 0 bytes in 0 blocks
==2715502== Rerun with --leak-check=full to see details of leaked memory
...

@vstinner

Thanks for the review @methane and @pitrou! I merged my PR which adds more tests and enhance some documentation.

andersk added a commit to andersk/cpython that referenced this pull request

Sep 7, 2023

@andersk

Commit 13a0007 (python#108663) made all Python builds compatible with the Limited API, and removed the LIMITED_API_AVAILABLE flag. However, some tests were still checking for that flag, so they were now being incorrectly skipped. Remove these checks to let these tests run again.

Signed-off-by: Anders Kaseorg andersk@mit.edu

andersk added a commit to andersk/cpython that referenced this pull request

Sep 7, 2023

@andersk

Commit 13a0007 (python#108663) made all Python builds compatible with the Limited API, and removed the LIMITED_API_AVAILABLE flag. However, some tests were still checking for that flag, so they were now being incorrectly skipped. Remove these checks to let these tests run again.

Signed-off-by: Anders Kaseorg andersk@mit.edu

andersk

#ifdef Py_TRACE_REFS
#undef LIMITED_API_AVAILABLE
#else
#define LIMITED_API_AVAILABLE 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vstinner pushed a commit that referenced this pull request

Sep 7, 2023

@andersk

…09046)

Commit 13a0007 (#108663) made all Python builds compatible with the Limited API, and removed the LIMITED_API_AVAILABLE flag. However, some tests were still checking for that flag, so they were now being incorrectly skipped. Remove these checks to let these tests run again.

Signed-off-by: Anders Kaseorg andersk@mit.edu

Reviewers

@andersk andersk andersk left review comments

@methane methane methane approved these changes

@erlend-aasland erlend-aasland Awaiting requested review from erlend-aasland erlend-aasland is a code owner

@corona10 corona10 Awaiting requested review from corona10 corona10 is a code owner

@serhiy-storchaka serhiy-storchaka Awaiting requested review from serhiy-storchaka

@pablogsal pablogsal Awaiting requested review from pablogsal

@pitrou pitrou Awaiting requested review from pitrou

@markshannon markshannon Awaiting requested review from markshannon markshannon is a code owner