Stop emitting less useful debug sections: .debug_pubnames and .debug_pubtypes · Issue #688 · rust-lang/compiler-team
Proposal
Background
With -C debuginfo=2, the debug info emitted on Linux includes two “pub” sections, .debug_pubnames and .debug_pubtypes. These two sections are meant for faster name lookup in debuggers, but they were designed with flaws and have been superseded by a new debug section in DWARF version 5.
While they have been there since DWARF version 2, debuggers seldom make use of them. Clang doesn't emit them by default, and neither does GCC. GCC/GDB even have their own “enhanced” format of these pub sections, i.e. .debug_gnu_pubnames and .debug_gnu_pubtypes. These sections are useful only when .gdb_index is present, and that requires an explicit flag passed to linker invocations.
Let me quote what other people have said (in no particular order):
- “They are just wasted space” — https://fedoraproject.org/wiki/Features/GdbIndex#Detailed_Description
- “The “.debug_pubnames” and “.debug_pubtypes” formats are not what a debugger needs” — https://llvm.org/docs/SourceLevelDebugging.html
- “The pubnames and pubtypes data described in Section 6.1 are much larger than needed, since they contain redundant information, and the are missing other information, such as local symbol names.” — https://dwarfstd.org/issues/140817.1.html
- “Nope, it's removed completely. No debugger that I know of uses it at all and it's useless for many reasons.” — https://discourse.llvm.org/t/dwarf-info-and-debug-pubnames-section/22411/2
- “Right now LLDB and GDB don’t trust .debug_pubnames or .debug_pubtypes because they don’t index everything.” — https://discourse.llvm.org/t/dwarf-reconstituting-mangled-names-skipping-dw-at-linkage-name/58523/5
- “The main reason for this is we don't trust the “.debug_pubnames” section as it it useless for debuggers. Why? .debug_pubnames is an accelerator table that shows only functions that are externally visible in a program. This means all static functions and data, and any functions that are hidden in a shared library won't be in the list.” — https://discourse.llvm.org/t/some-basic-lldb-usage-questions/16356/2
- “…let's remove debug_pubnames/debug_pubtypes. They are entirely optional (DWARFv4, section 6.1, page 105), anything that can be done with them can be done without them and pubnames has been wrong for a long time.” — cmd/link: .debug_pubnames and .debug_pubtypes not following DWARF4 spec golang/go#30573 (comment)
- “The .debug_pubnames section was always intended as a comprehensive list of symbols that gdb could use for quick lookup, but bugs in gcc have so far prevented gdb from using this section for its intended purpose…” — https://gcc.gnu.org/wiki/DebugFission
- “We propose that GCC simply stop emitting .debug_pubnames and .debug_pubtypes, as experience has shown that they are not very useful. (In fact, on Linux GCC did not even generate .debug_pubtypes until 2009, and no one ever complained.)” — https://gcc.gnu.org/wiki/DebugGNUIndexSection
“Okay, I see they are rarely useful, but why do we want to remove them?”
Because they are huge!
On x86_64-unknown-linux-gnu with rustc 1.76.0-nightly (dd430bc8c 2023-11-14), for example, cargo built in debug mode has 86 MiB of .debug_pubtypes and 45 MiB of .debug_pubnames. Together they take up 31% of the entire 413 MiB cargo binary.
Size of each section of cargo
with pub sections:
target/debug/cargo :
section size addr
.fini_array 8 54931704
.fini 13 45644480
.init_array 24 54931680
.init 27 3465216
.note.ABI-tag 32 968
.note.gnu.property 32 936
.debug_gdb_scripts 34 47631760
.comment 64 0
.interp 83 848
.plt.got 568 3469088
.gnu.version_r 592 20784
.dynamic 624 56279608
.gnu.hash 632 1000
.gnu.version 942 19836
.tbss 1083 54931680
.plt 3840 3465248
.bss 4128 56866560
.rela.plt 5736 3457696
.dynstr 6900 12936
.dynsym 11304 1632
.data 46848 56819712
.got 539472 56280232
.debug_abbrev 1168593 0
.eh_frame_hdr 1321700 47631796
.data.rel.ro 1347896 54931712
.gcc_except_table 1463472 53466184
.rodata 1985936 45645824
.debug_loc 3047693 0
.debug_aranges 3401968 0
.rela.dyn 3436320 21376
.eh_frame 4512688 48953496
.debug_ranges 16426448 0
.debug_line 22984630 0
.text 42174816 3469664
.debug_pubnames 47198910 0
.debug_str 86259041 0
.debug_pubtypes 90834708 0
.debug_info 104595048 0
Total 432782853
without pub sections:
target/debug/cargo :
section size addr
.fini_array 8 54931704
.fini 13 45644480
.init_array 24 54931680
.init 27 3465216
.note.ABI-tag 32 968
.note.gnu.property 32 936
.debug_gdb_scripts 34 47631760
.comment 64 0
.interp 83 848
.plt.got 568 3469088
.gnu.version_r 592 20784
.dynamic 624 56279608
.gnu.hash 632 1000
.gnu.version 942 19836
.tbss 1083 54931680
.plt 3840 3465248
.bss 4128 56866560
.rela.plt 5736 3457696
.dynstr 6900 12936
.dynsym 11304 1632
.data 46848 56819712
.got 539472 56280232
.debug_abbrev 1163721 0
.eh_frame_hdr 1321700 47631796
.data.rel.ro 1347896 54931712
.gcc_except_table 1463472 53466184
.rodata 1985936 45645824
.debug_loc 3047693 0
.debug_aranges 3401968 0
.rela.dyn 3436320 21376
.eh_frame 4512688 48953496
.debug_ranges 16426448 0
.debug_line 22984630 0
.text 42174816 3469664
.debug_str 86259041 0
.debug_info 104595048 0
Total 294744363
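The figures quoted for cargo can be re-derived from the listing with pub sections. The following is a minimal sketch rather than part of the proposal: the helper names `to_mib` and `share_percent` are made up for illustration, and the byte counts are copied from the rows above.

```rust
/// Number of bytes in one MiB.
const MIB: f64 = 1024.0 * 1024.0;

/// Convert a byte count to MiB.
fn to_mib(bytes: u64) -> f64 {
    bytes as f64 / MIB
}

/// Percentage of `total` that `part` accounts for.
fn share_percent(part: u64, total: u64) -> f64 {
    part as f64 * 100.0 / total as f64
}

fn main() {
    // Sizes in bytes, copied from the cargo listing with pub sections.
    let pubnames: u64 = 47_198_910;
    let pubtypes: u64 = 90_834_708;
    let total: u64 = 432_782_853;

    let pub_total = pubnames + pubtypes;
    println!(".debug_pubnames: {:.1} MiB", to_mib(pubnames));
    println!(".debug_pubtypes: {:.1} MiB", to_mib(pubtypes));
    println!("whole binary:    {:.1} MiB", to_mib(total));
    println!("pub share:       {:.1}%", share_percent(pub_total, total));
}
```

Subtracting the two pub sections from the total accounts for nearly all of the difference between the two listings; the remaining few kilobytes come from the slightly smaller .debug_abbrev.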
Another example is the popular CLI tool ripgrep. Its debug build is 63 MiB in total, and the pub sections make up 17 MiB, or 27% of the size.
Size of each section of ripgrep
with pub sections
target/debug/rg :
section size addr
.fini_array 8 13007480
.fini 9 9010744
.init_array 16 13007464
.init 23 678448
.gnu.hash 28 720
.interp 28 624
.note.ABI-tag 32 652
.debug_gdb_scripts 34 9699224
.note.gnu.build-id 36 684
.comment 70 0
.plt 160 678480
.tbss 187 13007464
.data 200 13455360
.rela.plt 216 678232
.gnu.version 246 5520
.gnu.version_r 368 5768
.bss 472 13455560
.dynamic 576 13383304
.dynstr 1816 3704
.dynsym 2952 752
.got 71480 13383880
.gcc_except_table 217276 10690888
.eh_frame_hdr 221724 9699260
.debug_loc 305044 0
.data.rel.ro 375816 13007488
.debug_abbrev 404234 0
.debug_aranges 544864 0
.rela.dyn 672096 6136
.rodata 688408 9010816
.eh_frame 769904 9920984
.debug_ranges 2950368 0
.debug_line 4093857 0
.debug_pubnames 6489064 0
.text 8332101 678640
.debug_pubtypes 11176543 0
.debug_str 12135329 0
.debug_info 17493971 0
Total 66949556
without pub sections
target/debug/rg :
section size addr
.fini_array 8 13007480
.fini 9 9010744
.init_array 16 13007464
.init 23 678448
.gnu.hash 28 720
.interp 28 624
.note.ABI-tag 32 652
.debug_gdb_scripts 34 9699224
.note.gnu.build-id 36 684
.comment 70 0
.plt 160 678480
.tbss 187 13007464
.data 200 13455360
.rela.plt 216 678232
.gnu.version 246 5520
.gnu.version_r 368 5768
.bss 472 13455560
.dynamic 576 13383304
.dynstr 1816 3704
.dynsym 2952 752
.got 71480 13383880
.gcc_except_table 217276 10690888
.eh_frame_hdr 221724 9699260
.debug_loc 305044 0
.data.rel.ro 375816 13007488
.debug_abbrev 401462 0
.debug_aranges 544864 0
.rela.dyn 672096 6136
.rodata 688408 9010816
.eh_frame 769904 9920984
.debug_ranges 2950368 0
.debug_line 4093857 0
.text 8332101 678640
.debug_str 12135329 0
.debug_info 17493971 0
Total 49281177
That is, in general, if we can remove both of these sections, the final binary size can be cut by around 30%.
Removing both sections should also, in principle, lower memory usage considerably during the linking stage, and it saves disk space as well.
People have suffered from OOM kills in the linking stage, and I've traced them down: the cause is usually the output binary size exceeding available RAM.
There is an issue in Rust-for-Linux calling out that Rust binaries are huge and asking for solutions. An existing issue in rust-lang/rust: rust-lang/rust#48762.
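To estimate the saving for another binary, a listing in the same layout can be processed mechanically. This is a rough sketch under assumptions: the input is taken to be whitespace-separated `name size addr` rows ending in a `Total` row, as in the listings above (which look like the output of GNU `size -A`), and the function name `pub_section_bytes` is hypothetical.

```rust
/// Sum the sizes of `.debug_pubnames` and `.debug_pubtypes` and read the
/// `Total` row from a listing in `name size addr` layout.
/// Returns `(pub_bytes, total_bytes)`, or `None` if there is no `Total` row.
fn pub_section_bytes(listing: &str) -> Option<(u64, u64)> {
    let mut pub_bytes = 0u64;
    let mut total = None;
    for line in listing.lines() {
        let mut fields = line.split_whitespace();
        // Skip rows without a name and a numeric size (file name, header, blanks).
        let (Some(name), Some(size)) = (fields.next(), fields.next()) else {
            continue;
        };
        let Ok(size) = size.parse::<u64>() else {
            continue;
        };
        match name {
            ".debug_pubnames" | ".debug_pubtypes" => pub_bytes += size,
            "Total" => total = Some(size),
            _ => {}
        }
    }
    total.map(|t| (pub_bytes, t))
}

fn main() {
    // Excerpt of the ripgrep listing with pub sections; other rows are omitted.
    let listing = "\
target/debug/rg :
section size addr
.debug_pubnames 6489064 0
.text 8332101 678640
.debug_pubtypes 11176543 0
.debug_info 17493971 0
Total 66949556
";
    if let Some((pub_bytes, total)) = pub_section_bytes(listing) {
        println!("pub sections: {pub_bytes} of {total} bytes");
    }
}
```

The sketch matches section names exactly, so the GNU variants (.debug_gnu_pubnames and .debug_gnu_pubtypes) would have to be added to the match arm if they should be counted too.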
Change Proposal
We should turn generation of these sections off since they are not useful.
User-facing impact
If any debuggers were using these sections, initial loading of debuginfo might become slower but there should be no functional regressions (debuggers in general do not use these tables so there should not be any such performance degradation, but that is the most likely conceivable negative outcome).
It is possible that other tools use these sections for various purposes, but we do not know of any such tools at this time.
Alternative
Stabilize -Zdwarf-version, or make DWARF version 5 the default on certain platforms (presumably Linux). Pub sections are replaced in DWARFv5 with a better, more compact and useful .debug_names section.
Mentors or Reviewers
@wesleywiser is willing to review
Process
The main points of the Major Change Process are as follows:
- File an issue describing the proposal.
- A compiler team member or contributor who is knowledgeable in the area can second by writing @rustbot second.
  - Finding a “second” suffices for internal changes. If however, you are proposing a new public-facing feature, such as a -C flag, then full team check-off is required.
  - Compiler team members can initiate a check-off via @rfcbot fcp merge on either the MCP or the PR.
- Once an MCP is seconded, the Final Comment Period begins. If no objections are raised after 10 days, the MCP is considered approved.
You can read more about Major Change Proposals on forge.
Comments
This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.