bpo-36611: Disable serialno field of debug memory allocators by vstinner · Pull Request #12796 · python/cpython (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation17 Commits1 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

vstinner

Omit serialno field from debug hooks on Python memory allocators to
reduce the memory footprint by 5%.

Enable tracemalloc to get the traceback where a memory block has been
allocated when a fatal memory error is logged to decide where to put
a breakpoint.

Compile Python with PYMEM_DEBUG_SERIALNO defined to get back the
field.

https://bugs.python.org/issue36611

methane

@vstinner

@methane: Just to be sure, you prefer this approach (disable by default, but keep the code) over PR #12795 which simply removes the code?

I have also a preference for this approach: keep the code, by disable by default.

serhiy-storchaka

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Did you test with PYMEM_DEBUG_SERIALNO defined?

@serhiy-storchaka

Just to be sure, you prefer this approach (disable by default, but keep the code) over PR #12795 which simply removes the code?

It is up to you as the author and the main user of this code.

@vstinner

Oh, Travis CI failed with a crash in test_PyThreadState_SetAsyncExc()... I'm not sure if it's related to my change?

test_PyThreadState_SetAsyncExc (test.test_threading.ThreadTests) ...     started worker thread
    trying nonsensical thread id
Fatal Python error: Segmentation fault
Thread 0x00007f7df4ff9700 (most recent call first):
  File "/home/travis/build/python/cpython/Lib/test/test_threading.py", line 230 in run
  File "/home/travis/build/python/cpython/Lib/threading.py", line 917 in _bootstrap_inner
  File "/home/travis/build/python/cpython/Lib/threading.py", line 885 in _bootstrap
Current thread 0x00007f7e290da700 (most recent call first):
  File "/home/travis/build/python/cpython/Lib/test/test_threading.py", line 244 in test_PyThreadState_SetAsyncExc
  File "/home/travis/build/python/cpython/Lib/unittest/case.py", line 680 in run

https://travis-ci.org/python/cpython/jobs/519191613

@vstinner

LGTM. Did you test with PYMEM_DEBUG_SERIALNO defined?

I failed to find a smart way to fix test_capi.test_api_misuse() depending if PYMEM_DEBUG_SERIALNO is defined or not, since this define might be only defined in obmalloc.c (not exported)... But I just found one! Since the test uses a regular expression, I just made the line optional in the expected test!

So yes, I just tested and, with my new commit, test_capi pass.

@zooba

Wait, I didn't know about this! Can I use it to debug the dump refs crash? (Where a tuple element is being deallocated even though there's a strong reference to it)

@methane

@methane: Just to be sure, you prefer this approach (disable by default, but keep the code) over PR #12795 which simply removes the code?

Yes. And if there are no feedback about it, we can remove it in the future.

@vstinner

Oh wow, that's scary: now it's test_array_from_size() of test_multiprocessing_spawn which crashed on Travis CI!

https://travis-ci.org/python/cpython/jobs/519311289

0:07:55 load avg: 4.26 [419/420/9] test_multiprocessing_spawn crashed (Exit code -11) -- running: test_concurrent_futures (2 min 14 sec)
Fatal Python error: Segmentation fault
Current thread 0x00007fc151785700 (most recent call first):
  File "/home/travis/build/python/cpython/Lib/multiprocessing/sharedctypes.py", line 62 in RawArray
  File "/home/travis/build/python/cpython/Lib/multiprocessing/sharedctypes.py", line 88 in Array
  File "/home/travis/build/python/cpython/Lib/multiprocessing/context.py", line 140 in Array
  File "/home/travis/build/python/cpython/Lib/test/_test_multiprocessing.py", line 2019 in test_array_from_size
  File "/home/travis/build/python/cpython/Lib/unittest/case.py", line 680 in run
  File "/home/travis/build/python/cpython/Lib/unittest/case.py", line 740 in __call__

@vstinner

@zooba: "Wait, I didn't know about this! Can I use it to debug the dump refs crash? (Where a tuple element is being deallocated even though there's a strong reference to it)"

I don't understand your question. What do you want to do? This PR is about removing a "serialno" stored in every memory blocks by the debug hooks installed on Python memory allocators. It's lower level than Python objects.

What is the current behavior on your "refs crash"? Which function crash? How?

@vstinner

@vstinner

I tried to reproduce the Travis CI:

$ ./configure --with-pydebug CFLAGS=-O3
$ make # use GCC, whereas Travis CI uses clang

I ran ./python -m test -u all,-gui -j0 -r twice in parallel locally: I don't get any issue!? AppVeyor didn't crash. These 2 crashes on Travis CI are really surprising.

@vstinner

I ran ./python -m test -u all,-gui -j0 -r twice in parallel locally: I don't get any issue!? AppVeyor didn't crash. These 2 crashes on Travis CI are really surprising.

I succeeded to reproduce the crash, but only using clang with -O3. In fact, it's a memory alignement issues. clang makes more assumptions than GCC and Visual Studio. The bug isn't caused by my change, but my change "triggered the bug".

See https://bugs.python.org/issue36618

@vstinner

Omit serialno field from debug hooks on Python memory allocators to reduce the memory footprint by 5%.

Enable tracemalloc to get the traceback where a memory block has been allocated when a fatal memory error is logged to decide where to put a breakpoint.

Compile Python with PYMEM_DEBUG_SERIALNO defined to get back the field.

@vstinner

I pushed my fix for clang: PR #12809. I rebased this PR on top of it.

@bedevere-bot

@vstinner: Please replace # with GH- in the commit message next time. Thanks!

@vstinner