Stop emitting less useful debug sections: .debug_pubnames and .debug_pubtypes · Issue #688 · rust-lang/compiler-team

Proposal

Background

With -C debuginfo=2, the debug info emitted on Linux includes two “pub” sections, .debug_pubnames and .debug_pubtypes. These two sections are meant to speed up name lookup in debuggers, but they were designed with flaws and have been superseded by a new debug section in DWARF version 5.

While they have been there since DWARF version 2, debuggers seldom make use of them. Neither Clang nor GCC emits them by default. GCC/GDB even have their own “enhanced” format of these pub sections, i.e. .debug_gnu_pubnames and .debug_gnu_pubtypes. These sections are useful only when .gdb_index is present, and that requires an explicit flag passed to the linker.

Let me quote what other people have said (in no particular order):

“Okay, I see they are rarely useful, but why do we want to remove them?”

Because they are huge!

For example, on x86_64-unknown-linux-gnu with rustc 1.76.0-nightly (dd430bc8c 2023-11-14), cargo built in debug mode has 86 MiB of .debug_pubtypes and 45 MiB of .debug_pubnames. Together they take up about 32% of the entire 413 MiB cargo binary.

Section sizes of cargo

with pub sections:

target/debug/cargo  :
section                   size       addr
.fini_array                  8   54931704
.fini                       13   45644480
.init_array                 24   54931680
.init                       27    3465216
.note.ABI-tag               32        968
.note.gnu.property          32        936
.debug_gdb_scripts          34   47631760
.comment                    64          0
.interp                     83        848
.plt.got                   568    3469088
.gnu.version_r             592      20784
.dynamic                   624   56279608
.gnu.hash                  632       1000
.gnu.version               942      19836
.tbss                     1083   54931680
.plt                      3840    3465248
.bss                      4128   56866560
.rela.plt                 5736    3457696
.dynstr                   6900      12936
.dynsym                  11304       1632
.data                    46848   56819712
.got                    539472   56280232
.debug_abbrev          1168593          0
.eh_frame_hdr          1321700   47631796
.data.rel.ro           1347896   54931712
.gcc_except_table      1463472   53466184
.rodata                1985936   45645824
.debug_loc             3047693          0
.debug_aranges         3401968          0
.rela.dyn              3436320      21376
.eh_frame              4512688   48953496
.debug_ranges         16426448          0
.debug_line           22984630          0
.text                 42174816    3469664
.debug_pubnames       47198910          0
.debug_str            86259041          0
.debug_pubtypes       90834708          0
.debug_info          104595048          0
Total                432782853

without pub sections:

target/debug/cargo  :
section                   size       addr
.fini_array                  8   54931704
.fini                       13   45644480
.init_array                 24   54931680
.init                       27    3465216
.note.ABI-tag               32        968
.note.gnu.property          32        936
.debug_gdb_scripts          34   47631760
.comment                    64          0
.interp                     83        848
.plt.got                   568    3469088
.gnu.version_r             592      20784
.dynamic                   624   56279608
.gnu.hash                  632       1000
.gnu.version               942      19836
.tbss                     1083   54931680
.plt                      3840    3465248
.bss                      4128   56866560
.rela.plt                 5736    3457696
.dynstr                   6900      12936
.dynsym                  11304       1632
.data                    46848   56819712
.got                    539472   56280232
.debug_abbrev          1163721          0
.eh_frame_hdr          1321700   47631796
.data.rel.ro           1347896   54931712
.gcc_except_table      1463472   53466184
.rodata                1985936   45645824
.debug_loc             3047693          0
.debug_aranges         3401968          0
.rela.dyn              3436320      21376
.eh_frame              4512688   48953496
.debug_ranges         16426448          0
.debug_line           22984630          0
.text                 42174816    3469664
.debug_str            86259041          0
.debug_info          104595048          0
Total                294744363
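To make the arithmetic behind these figures easy to re-check, here is a minimal sketch that reads a listing in the layout shown above and sums the two pub sections. The function name `pub_section_share` is hypothetical, and the input is only an excerpt of the cargo listing (the rows the sketch actually reads), so treat it as an illustration rather than a measurement tool.

```rust
/// Sum the sizes of `.debug_pubnames` and `.debug_pubtypes` in a section
/// listing and return `(pub_bytes, total_bytes)`.
/// Hypothetical helper: it assumes the "name size addr" layout shown above.
fn pub_section_share(listing: &str) -> (u64, u64) {
    let mut pub_bytes = 0u64;
    let mut total = 0u64;
    for line in listing.lines() {
        let mut fields = line.split_whitespace();
        let (Some(name), Some(size)) = (fields.next(), fields.next()) else {
            continue; // blank line or a line with fewer than two fields
        };
        let Ok(size) = size.parse::<u64>() else {
            continue; // header row or file-name row: size is not a number
        };
        if name == "Total" {
            total = size;
        } else if name == ".debug_pubnames" || name == ".debug_pubtypes" {
            pub_bytes += size;
        }
    }
    (pub_bytes, total)
}

fn main() {
    // Excerpt of the cargo listing with pub sections.
    let listing = "\
section                   size       addr
.debug_pubnames       47198910          0
.debug_pubtypes       90834708          0
Total                432782853
";
    let (pub_bytes, total) = pub_section_share(listing);
    let percent = pub_bytes as f64 / total as f64 * 100.0;
    // prints "138033618 of 432782853 bytes (31.9%)"
    println!("{pub_bytes} of {total} bytes ({percent:.1}%)");
}
```

The difference between the two totals (138038490 bytes) is slightly larger than the sum of the two pub sections because .debug_abbrev also shrinks by 4872 bytes.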

Another example is ripgrep, a popular CLI tool. Its debug build is 64 MiB in total, and the pub sections make up about 26% of that size, which is 17 MiB.

Section sizes of ripgrep

with pub sections:

target/debug/rg  :
section                  size       addr
.fini_array                 8   13007480
.fini                       9    9010744
.init_array                16   13007464
.init                      23     678448
.gnu.hash                  28        720
.interp                    28        624
.note.ABI-tag              32        652
.debug_gdb_scripts         34    9699224
.note.gnu.build-id         36        684
.comment                   70          0
.plt                      160     678480
.tbss                     187   13007464
.data                     200   13455360
.rela.plt                 216     678232
.gnu.version              246       5520
.gnu.version_r            368       5768
.bss                      472   13455560
.dynamic                  576   13383304
.dynstr                  1816       3704
.dynsym                  2952        752
.got                    71480   13383880
.gcc_except_table      217276   10690888
.eh_frame_hdr          221724    9699260
.debug_loc             305044          0
.data.rel.ro           375816   13007488
.debug_abbrev          404234          0
.debug_aranges         544864          0
.rela.dyn              672096       6136
.rodata                688408    9010816
.eh_frame              769904    9920984
.debug_ranges         2950368          0
.debug_line           4093857          0
.debug_pubnames       6489064          0
.text                 8332101     678640
.debug_pubtypes      11176543          0
.debug_str           12135329          0
.debug_info          17493971          0
Total                66949556

without pub sections:

target/debug/rg  :
section                  size       addr
.fini_array                 8   13007480
.fini                       9    9010744
.init_array                16   13007464
.init                      23     678448
.gnu.hash                  28        720
.interp                    28        624
.note.ABI-tag              32        652
.debug_gdb_scripts         34    9699224
.note.gnu.build-id         36        684
.comment                   70          0
.plt                      160     678480
.tbss                     187   13007464
.data                     200   13455360
.rela.plt                 216     678232
.gnu.version              246       5520
.gnu.version_r            368       5768
.bss                      472   13455560
.dynamic                  576   13383304
.dynstr                  1816       3704
.dynsym                  2952        752
.got                    71480   13383880
.gcc_except_table      217276   10690888
.eh_frame_hdr          221724    9699260
.debug_loc             305044          0
.data.rel.ro           375816   13007488
.debug_abbrev          401462          0
.debug_aranges         544864          0
.rela.dyn              672096       6136
.rodata                688408    9010816
.eh_frame              769904    9920984
.debug_ranges         2950368          0
.debug_line           4093857          0
.text                 8332101     678640
.debug_str           12135329          0
.debug_info          17493971          0
Total                49281177

That is, in general, if we remove both of these sections, the size of the final debug binary can be cut by roughly 30%.
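As a quick check of that estimate, the sketch below computes the relative size reduction from the "Total" rows of the listings above. The helper name `reduction_percent` is an assumption introduced for illustration. The byte counts are copied from the cargo and ripgrep listings.

```rust
/// Percentage by which a binary shrinks when it goes from `with_pub` bytes
/// to `without_pub` bytes. Hypothetical helper for illustration only.
fn reduction_percent(with_pub: u64, without_pub: u64) -> f64 {
    // saturating_sub avoids an underflow panic if the inputs are swapped
    with_pub.saturating_sub(without_pub) as f64 / with_pub as f64 * 100.0
}

fn main() {
    // (name, total with pub sections, total without pub sections), in bytes
    let totals = [
        ("cargo", 432_782_853u64, 294_744_363u64),
        ("rg", 66_949_556, 49_281_177),
    ];
    for (name, with_pub, without_pub) in totals {
        // prints "cargo: 31.9% smaller" and then "rg: 26.4% smaller"
        println!("{name}: {:.1}% smaller", reduction_percent(with_pub, without_pub));
    }
}
```

These two binaries are only sample points. The share will vary with how many public names and types a crate graph contains.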

If we remove both of the sections, memory usage during linking should in principle drop substantially, and we save disk space as well.

People have had the linker OOM-killed, and in the cases I have traced down, the cause is usually that the output binary size exceeds the available RAM.
There is an issue in Rust-for-Linux pointing out that Rust binaries are huge and asking for solutions. There is also an existing issue in rust-lang/rust: rust-lang/rust#48762.

Change Proposal

We should turn generation of these sections off since they are not useful.

User-facing impact

If any debuggers were using these sections, initial loading of debuginfo might become slower but there should be no functional regressions (debuggers in general do not use these tables so there should not be any such performance degradation, but that is the most likely conceivable negative outcome).

It is possible that other tools use these sections for various purposes, but we do not know of any such tools at this time.

Alternative

Stabilize -Zdwarf-version, or make DWARF version 5 the default on certain platforms (presumably Linux). In DWARFv5, the pub sections are replaced by .debug_names, a better, more compact, and more useful section.

Mentors or Reviewers

@wesleywiser is willing to review

Process

The main points of the Major Change Process are as follows:

You can read more about Major Change Proposals on forge.

Comments

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.