DebugFission - GCC Wiki
DWARF Extensions for Separate Debug Information Files
Updated January 24, 2013
The "Fission" project was started in response to the problems caused by huge amounts of debug information in large applications. By splitting the debug information into two parts at compile time -- one part that remains in the .o file and another part that is written to a parallel .dwo ("DWARF object") file -- we can reduce the total size of the object files processed by the linker.
Invocation
Fission is implemented in GCC 4.7, and requires support from recent versions of objcopy and the gold linker.
Use the -gsplit-dwarf option to enable the generation of split DWARF at compile time. This option must be used in conjunction with -c; Fission cannot be used when compiling and linking in the same step.
Use the gold linker's --gdb-index option (-Wl,--gdb-index when linking with gcc or g++) at link time to create the .gdb_index section that allows GDB to locate and read the .dwo files as it needs them.
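As a rough illustration of the two steps above, the following Python sketch assembles the compile and link command lines. Only -gsplit-dwarf, -c, and -Wl,--gdb-index come from the text. The file names, the explicit -g, and the use of -fuse-ld=gold to ask the gcc driver for the gold linker are assumptions made for this example.

```python
import shlex


def compile_command(source, compiler="gcc"):
    """Build the argv for compiling one source file with split DWARF."""
    obj = source.rsplit(".", 1)[0] + ".o"
    # -c is required: Fission cannot compile and link in the same step.
    # -g is passed explicitly (assumption) so debug output is enabled.
    return [compiler, "-g", "-gsplit-dwarf", "-c", source, "-o", obj]


def link_command(objects, output, compiler="gcc"):
    """Build the argv for linking with gold and creating .gdb_index."""
    # -fuse-ld=gold is an assumption about how gold is selected here.
    return [compiler, "-fuse-ld=gold", "-Wl,--gdb-index", *objects, "-o", output]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(compile_command("foo.c")))
    print(shlex.join(link_command(["foo.o", "bar.o"], "app")))
```

The compile step is expected to leave foo.dwo next to foo.o; the link step only sees the .o files.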
Problems with Size of the Debug Information
Large applications compiled with debug information experience slow link times, possible out-of-memory conditions at link time, and slow gdb startup times. In addition, they can contribute to significant increases in storage requirements, and additional network latency when transferring files in a distributed build environment.
- Out-of-memory conditions: When the total size of the input files is large, the linker may exceed its total memory allocation during the link and may get killed by the operating system. As a rule of thumb, the link job total memory requirements can be estimated at about 200% of the total size of its input files.
- Slow link times: Link times can be frustrating when recompiling only a small source file or two. Link times may be aggravated when linking on a machine that has insufficient RAM, resulting in excessive page thrashing.
- Slow gdb startup times: The debugger today performs a partial scan of the debug information in order to build its internal tables that allow it to map names and addresses to the debug information. This partial scan was designed to improve startup performance, and avoids a full scan of the debug information, but for large applications, it can still take a minute or more before the debugger is ready for the first command. The debugger now has the ability to save a ".gdb_index" section in the executable and the gold linker now supports a --gdb-index option to build this index at link time, but both of these options still require the initial partial scan of the debug information.
These conditions are largely a direct result of the amount of debug information generated by the compiler. In a large C++ application compiled with -O2 and -g, the debug information accounts for 87% of the total size of the object files sent as inputs to the link step, and 84% of the total size of the output binary.
Recently, the -Wa,--compress-debug-sections option has been made available. This option reduces the total size of the object files sent to the linker by more than a third, so that the debug information now accounts for 70-80% of the total size of the object files. The output file is unaffected: the linker decompresses the debug information in order to link it, and outputs the uncompressed result (there is an option to recompress the debug information at link time, but this step would only reduce the size of the output file without improving link time or memory usage).
What's All That Space Being Used For?
The debugging information in the relocatable object files sent to the linker consists of a number of separate tables (percentages are for uncompressed debug information relative to the total object file size):
- Debug Information Entries - .debug_info (11%): This table contains the debug info for subprograms and variables defined in the program, and many of the trivial types used.
- Type Units - .debug_types (12%): This table contains the debug info for most of the non-trivial types (e.g., structs and classes, enums, typedefs), keyed by a hashed type signature so that duplicate type definitions can be eliminated by the linker. During the link, about 85% of this data is discarded as duplicate. These sections have the same structure as the .debug_info sections.
- Strings - .debug_str (25%): This table contains strings that are not placed inline in the .debug_info and .debug_types sections. The linker merges the string tables to eliminate duplicates, discarding about 93% of the data as duplicate.
- Range tables - .debug_ranges (2%) and .debug_aranges (0.1%): These tables contain range lists to define what pieces of a program's text belong to which subprograms and compilation units.
- Location lists - .debug_loc (2%): These tables contain lists of expressions that describe to the debugger the location of a variable based on the PC value.
- Line number tables - .debug_line (1%): These tables contain a description of the mapping from PC values to source locations.
- Debug abbreviation codes - .debug_abbrev (<1%): These tables provide the definitions for abbreviation codes used in describing the debug info in the .debug_info and .debug_types sections.
- Public names - .debug_pubnames (<1%): These tables provide a list of public names defined in the compilation unit, intended to allow the debugger to find the appropriate compilation units quickly for a given name. (In practice, these tables are unused by gdb.)
- Relocations for debug information (46%): The relocations identify to the linker where all the relocatable references are in the debug information. Of the 46%, about 20 percentage points are for the .debug_info section and about 17 are for the .debug_types section. Nine of ten of these relocations are for references to the .debug_str section; the remaining tenth are mostly references to locations in the program. Another 9 percentage points are for the .debug_ranges and .debug_loc sections; these are entirely references to locations in the program. These relocations are used by the linker and are not copied to the output file.
Using compressed debug sections, the percentages are adjusted as follows:
- Debug Information Entries (4%)
- Type Units (7%)
- Strings (10%)
- Range tables (<<1%) (These compress to almost nothing, because most of the information is in the relocations.)
- Location lists (1%)
- Line number tables (<1%)
- Debug abbreviation codes (<1%)
- Public names (<1%)
- Relocations (60%) (Relocations are not compressed.)
Towards Reducing the Amount of Debug Information Sent to the Linker
The numbers above suggest several possible approaches to improving build performance by reducing the amount of debug information sent to the linker. Of the approaches listed below, the first has already been implemented, and we are planning to proceed with options #3 and #5.
1. Compress debug sections
This option is already in the binutils assembler.
Total estimated benefit: 36% reduction
All estimated benefits below assume the use of compressed debug sections.
2. Intermediate links
We can use intermediate (ld -r) link steps to discard a good fraction of the duplicate type information and strings in the debug information. COMDAT elimination on the .debug_types section would ultimately reduce the total size by 75%, and string merge processing would ultimately reduce the total size of the .debug_str section by 80%.
By using ld -r on the input files, there would be some risk of including object files that would have been passed over during an archive library search, but this could be mitigated by removing duplicate definitions of the same symbol in different libraries.
Total estimated benefit: 6% reduction
3. Eliminate relocations to strings
Relocations in the .debug_info and .debug_types sections that refer to the .debug_str section can be replaced by an extra indirection and a new dedicated table of string offsets. In the debug info, each string reference takes 8 bytes in the debug info plus 24 bytes for the relocation (the 8 bytes compress to 1 byte on average, but the relocations are not compressed). These would be replaced by an average 1-2 byte string index, and an 8-byte string offset in a separate table. The separate table can be implicitly relocated, so no relocations are necessary, and would probably compress by about 80% (estimated). Furthermore, the number of unique strings (within each compilation unit) is only about 55% of the total number of string references, so the separate table would be reduced by another 45%.
Total estimated benefit: 53% reduction
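The per-reference arithmetic behind option 3 can be checked with a short Python sketch. It uses only the estimates quoted above; taking 1.5 bytes as the midpoint of the 1-2 byte index, and treating the index as not further compressed, are assumptions. It shows the cost of a single string reference and is not meant to reproduce the 53% total.

```python
def cost_before(ref_bytes_compressed=1.0, reloc_bytes=24.0):
    """Average bytes per string reference today (compressed debug sections).

    The 8-byte reference compresses to about 1 byte; the 24-byte
    relocation is not compressed.
    """
    return ref_bytes_compressed + reloc_bytes


def cost_after(index_bytes=1.5, offset_bytes=8.0,
               table_compression=0.80, unique_fraction=0.55):
    """Average bytes per string reference with a string offsets table.

    Each reference becomes a small index; the 8-byte table entry is
    shared among duplicate references (unique_fraction) and is assumed
    to compress by table_compression.
    """
    table_bytes = offset_bytes * (1.0 - table_compression) * unique_fraction
    return index_bytes + table_bytes


if __name__ == "__main__":
    print(f"before: {cost_before():.2f} bytes per reference")
    print(f"after:  {cost_after():.2f} bytes per reference")
```

Under these assumptions the cost drops from 25 bytes to roughly 2.4 bytes per string reference, almost all of it from dropping the relocation.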
4. Move type units to a separate repository
The information in .debug_types is largely independent of the rest of the compilation unit; each entry describes a type and is the same in each compilation unit that contains a definition of that type. The compiler computes a unique signature for each type definition, and could store the debug info for that type in an external repository, keyed by the type's signature. This would reduce the object file size by the 7% used directly for the .debug_types sections, by another estimated 5% for the strings referenced by the type entries, and by 22% for the relocations associated with the .debug_types sections.
Total estimated benefit: 34% reduction (less if option 3 is also implemented)
5. Move debug info and type units to a ".dwo" file
Alternatively, we can move the .debug_info, .debug_types, and .debug_str sections from the object (.o) files to a separate DWARF object (.dwo) file (or ".dsym", in Apple's nomenclature). Assuming that we could ignore these .dwo files during the link step, this would remove the bulk of the data that would be sent to the linker, and debug builds would be much closer to non-debug builds in terms of object file size and link speed.
There are two options for how to deal with the separate .dwo files when it's time to debug a program.
The first option is for the debugger to look for the debug info directly in the .dwo files, requiring both that the output binary contains enough information to find the .dwo files, and that the .dwo files must remain available for use for as long as the binary might need debugging. This option requires no additional link step for the debug information.
The second option is to invoke a separate link step to combine the .dwo files into a single .dwo file that can be easily stored and located by the debugger (this is the approach that Apple takes with its dsymutil tool). While this option dramatically reduces the size of the main link, the separate link step for the debug information is still close to the original order of magnitude (debug info being more than 80% of the total size of the object files). A dsymutil-like linker, however, could be made to operate much more efficiently than a full-featured ELF linker.
Total estimated benefit: 70% reduction
Previous Implementations of Separate Debug Information
Sun, HP, and Apple have all implemented similar mechanisms where the debug information is not linked into the final binary. All three implementations simply leave the debug information in the relocatable object (.o) files, with summary information in the final binary to enable the debugger to locate the object files and to apply the relocations to the debug information. Sun's implementation is for stabs only; for DWARF, the linker always copies the full debug information to the output file. HP's implementation is for DWARF, and includes summary information with the names of the original object files, and a link map that allows the debugger to locate each object file's contribution to each section in the output file. Apple's solution is similar, but the summary information is synthesized by the linker in stabs format, and the DWARF information is linked together in a separate link step by the dsymutil utility.
In the Sun and HP implementations, the debug information in the relocatable objects still requires relocation at debug time, and the debugger must read the summary information from the executable file in order to map symbols and sections to the output file when processing and applying the relocations. The Apple implementation avoids this cost at debug time, but at the cost of having a separate link step for the debug information.
As we have seen above, a significant factor in the space used by debug information is the number of relocations, so a solution that minimizes the number of relocations not only reduces the total size of the binary plus its debug information, but also reduces the complexity and cost of reading the debug information at debug time.
Design for Moving Debug Information to ".dwo" Files
In order for the debugger to be able to locate and process the information in raw (unrelocated) .dwo files, some information must still be left behind in the .o files for the linker to combine and relocate. The bulk of the reduction in .o file size will come from moving .debug_info, .debug_types, .debug_loc, and .debug_str sections into the .dwo file (and eliminating most of the associated relocations). The .debug_ranges, .debug_line, .debug_pubnames, .debug_pubtypes, .debug_aranges, and .debug_gdb_scripts sections (and their relocation sections) will remain in the .o file. The .debug_abbrev section, although small, will naturally move with the .debug_info and .debug_types sections.
Instead of a full .debug_info section the .o file will contain a "skeleton" .debug_info section, and a corresponding .debug_abbrev section. This .debug_info section will contain a single DW_TAG_compile_unit DIE, with no children. A DW_AT_dwo_name attribute will provide the name of the .dwo file, and a DW_AT_dwo_id attribute will contain a unique 64-bit ID for the compilation unit. The skeleton compilation unit DIE will contain a few additional attributes as described below.
The .dwo file will follow the ELF format, but will have only a file header, section table, and the debug sections. The names of the debug sections in the .dwo file will all end with ".dwo".
For the sections that will move into a .dwo file, the existing relocations will be handled as follows:
- Each compilation unit header normally has a relocation to the corresponding abbrev table. This relocation is no longer necessary, and the field in the compilation unit header should be interpreted as a direct offset relative to the .debug_abbrev.dwo section in the .dwo file.
- The DW_TAG_compile_unit DIE at the top level of the .debug_info section contains a relocated reference to a line table in the .debug_line section (for the DW_AT_stmt_list attribute). To handle this case, the DW_AT_stmt_list attribute will be placed in the skeleton compilation unit DIE in the .o file instead of the full compilation unit DIE in the .dwo file.
- Likewise, the DW_TAG_type_unit DIE at the top level of the .debug_types section normally contains a DW_AT_stmt_list attribute with a relocated reference to a line table. In this case, the line table serves only to provide a list of directory and file names that are indexed by DW_AT_file attributes. For this purpose, the .dwo file will contain a skeleton .debug_line.dwo section whose only purpose is to list the directories and file names that are indexed by DW_AT_file attributes. Its format will be the same as the .debug_line section, but without the line tables. The value of the DW_AT_stmt_list attribute in a .debug_types.dwo section will be the unrelocated offset to the skeleton line table within the .debug_line.dwo section.
- References to strings in the .debug_str.dwo section will be replaced with an indirect string index (using a new FORM code, DW_FORM_str_index), expressed as an unsigned LEB128 number. A new section in the .dwo file, .debug_str_offsets.dwo, will contain a list of offsets to the strings in the string table, and the indirect string index will select one of the offsets in that list. The entries in .debug_str_offsets.dwo will not need relocations. The .debug_str.dwo section will have the same format as the .debug_str section.
- References to ranges in the .debug_ranges section from DWARF attributes using the form DW_FORM_sec_offset typically have a relocation to an offset relative to a symbol placed in the .debug_ranges section. These symbolic references will be replaced by direct offsets that need no relocation (still using DW_FORM_sec_offset). The skeleton DW_TAG_compile_unit entry in the .o file’s .debug_info section will contain a DW_AT_ranges_base attribute whose (relocated) value points to the base of that compilation unit’s .debug_ranges contribution. All values using DW_FORM_sec_offset for a DW_AT_ranges or DW_AT_start_scope attribute (i.e., those attributes whose values are of class rangelistptr) must be interpreted as offsets relative to the base address given by the DW_AT_ranges_base attribute in the skeleton compilation unit DIE.
- References to addresses in loadable sections (e.g., .text or .data), from DWARF attributes using the form DW_FORM_addr, will be replaced with an indirect symbol index (using a new FORM code, DW_FORM_addr_index), expressed as an unsigned LEB128 number. A new section in the .o file, .debug_addr, will contain a list of relocated addresses, one for each reference needed. Each entry in the .debug_addr section will be an address-sized value, with the relocation that applied to the original reference. The linker will combine the .debug_addr sections from all the input .o files, and will apply the relocations normally. In order for the debugger to find the contribution to the .debug_addr section corresponding to a particular compilation unit, the skeleton DW_TAG_compile_unit entry in the .o file's .debug_info section will also contain a DW_AT_addr_base attribute whose value points to the base of that compilation unit's .debug_addr contribution.
- References to addresses in loadable sections from DWARF location expressions in the .debug_info.dwo or the .debug_loc.dwo section, using the opcode DW_OP_addr, will be replaced with an indirect symbol index (using a new OP code, DW_OP_addr_index), expressed as an unsigned LEB128 number. These references will index the .debug_addr section as above. Likewise, references to relocatable values using the opcodes DW_OP_const4u or DW_OP_const8u (typically used for thread-local storage offsets) will be replaced with an indirection using a new OP code, DW_OP_const_index.
- Each location list entry in the .debug_loc section contains a beginning and ending address offset, which normally are relocated addresses. In the .debug_loc.dwo section, these offsets will be replaced by indices into the .debug_addr section. Each location list entry will begin with a single byte identifying the entry type: DW_LLE_end_of_list_entry (0) indicates an end-of-list entry, DW_LLE_base_address_selection_entry (1) indicates a base address selection entry, DW_LLE_start_end_entry (2) indicates a normal location list entry providing start and end addresses, DW_LLE_start_length_entry (3) indicates a normal location list entry providing a start address and a length, and DW_LLE_offset_pair_entry (4) indicates a normal location list entry providing start and end offsets relative to the base address given by a base address selection entry. An end-of-list entry has no further data. A base address selection entry contains a single unsigned LEB128 number following the entry type byte, which is an index into the .debug_addr section that selects the new base address for subsequent location list entries. A start/end entry contains two unsigned LEB128 numbers following the entry type byte, which are indices into the .debug_addr section that select the beginning and ending addresses. A start/length entry contains one unsigned LEB128 number and a 4-byte unsigned value (as would be represented by the form code DW_FORM_data4). The first number is an index into the .debug_addr section that selects the beginning address, and the second number is the length of the range. Addresses fetched from the .debug_addr section are not relative to the base address. An offset pair entry contains two 4-byte unsigned values (as would be represented by the form code DW_FORM_data4), treated as the beginning and ending offsets, respectively, relative to the base address.
As in the .debug_loc section, the base address is obtained either from the nearest preceding base address selection entry, or, if there is no such entry, from the compilation unit base address (see section 3.1.1 in the DWARF-4 spec). For the latter three types (start/end, start/length, and offset pair), the two operand values are followed by a location description as in a normal location list entry in the .debug_loc section.
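The location list encoding described above can be sketched as a small Python decoder. This is a minimal illustration, not the GDB or GCC code: it assumes little-endian data, and it assumes the trailing location description is encoded as in DWARF v4 .debug_loc entries (a 2-byte length followed by that many expression bytes). The function names are made up for this example.

```python
import struct

DW_LLE_end_of_list_entry = 0
DW_LLE_base_address_selection_entry = 1
DW_LLE_start_end_entry = 2
DW_LLE_start_length_entry = 3
DW_LLE_offset_pair_entry = 4


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def read_u32(data, pos):
    """Read a 4-byte unsigned value (little-endian assumed)."""
    return struct.unpack_from("<I", data, pos)[0], pos + 4


def read_expr(data, pos):
    """Read a location description: 2-byte length, then the bytes."""
    (length,) = struct.unpack_from("<H", data, pos)
    pos += 2
    return data[pos:pos + length], pos + length


def parse_location_list(data, pos=0):
    """Return one location list as (kind, operands, expr) tuples."""
    entries = []
    while True:
        kind = data[pos]
        pos += 1
        if kind == DW_LLE_end_of_list_entry:
            entries.append((kind, (), b""))
            return entries
        if kind == DW_LLE_base_address_selection_entry:
            index, pos = read_uleb128(data, pos)  # .debug_addr index
            entries.append((kind, (index,), b""))
            continue
        if kind == DW_LLE_start_end_entry:
            a, pos = read_uleb128(data, pos)  # start: .debug_addr index
            b, pos = read_uleb128(data, pos)  # end: .debug_addr index
        elif kind == DW_LLE_start_length_entry:
            a, pos = read_uleb128(data, pos)  # start: .debug_addr index
            b, pos = read_u32(data, pos)      # length in bytes
        elif kind == DW_LLE_offset_pair_entry:
            a, pos = read_u32(data, pos)      # start offset from base
            b, pos = read_u32(data, pos)      # end offset from base
        else:
            raise ValueError(f"unknown location list entry type {kind}")
        expr, pos = read_expr(data, pos)
        entries.append((kind, (a, b), expr))
```

Resolving the indices to addresses is a separate step that needs the linked .debug_addr section and the DW_AT_addr_base of the skeleton compilation unit.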
In the initial implementation, we will modify gcc to emit all the debug information into the single .o file, and we will use post-compile processing to move the appropriate sections into the separate .dwo file.
The debug sections to remain in the .o file are:
- .debug_abbrev - Defines the abbreviation codes used by the skeleton .debug_info section.
- .debug_info - Contains the skeleton DW_TAG_compile_unit DIE. This DIE has the following attributes: DW_AT_comp_dir, DW_AT_stmt_list, DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges, DW_AT_dwo_name, DW_AT_dwo_id, DW_AT_ranges_base, DW_AT_addr_base, DW_AT_pubnames (see below), and DW_AT_pubtypes (see below). If DW_AT_ranges is present, DW_AT_low_pc and DW_AT_high_pc are not used, and vice versa. Note the following difference between the current GCC implementation and the DWARF v5 specification: In the current GCC implementation (based on DWARF v4), if DW_AT_ranges is present, the offset into the ranges table is not relative to the value given by DW_AT_ranges_base (i.e., DW_AT_ranges_base is used only for references to the range table from the dwo sections). In DWARF v5, the DW_AT_ranges_base attribute is used for all references to the range table -- both from dwo sections and from skeleton compile units.
- .debug_types - Contains the skeleton DW_TAG_type_unit DIE. This DIE has the following attributes: DW_AT_comp_dir, DW_AT_dwo_name, DW_AT_pubnames (see below), and DW_AT_pubtypes (see below). (It is recommended that the string-valued attributes use DW_FORM_strp, because these strings will be duplicates of the corresponding attributes in the skeleton .debug_info section.) This section will be removed in a future release (and does not appear in the DWARF v5 specification), as skeleton type units are no longer needed by GDB.
- .debug_str - Contains any strings referenced by the .debug_info or .debug_types sections (via DW_FORM_strp).
- .debug_addr - New section to hold references to loadable sections, indexed by attributes of form DW_FORM_addr_index or location expression DW_OP_addr_index opcodes.
- .debug_line - Line tables, unaffected by this design. The list of source file names in this main line table must include all source files contributing to the compilation unit, even if they would only be needed in a skeleton line table for a type unit. (These could be moved to the .dwo file, but in order to do so, each DW_LNE_set_address opcode would need to be replaced by a new opcode that referenced an entry in the .debug_addr section. Furthermore, leaving this section in the .o file allows many debug info consumers to remain unaware of .dwo files, and makes the list of filenames referenced by each CU available to the debugger from the executable.)
- .debug_ranges - Range lists, unaffected by this design.
- .debug_pubnames - Public names for use in building the .gdb_index section at link time. This section will have the same format and use as always, but we will fix gcc to emit all names that need to appear in the index.
- .debug_pubtypes - Public types for use in building the .gdb_index section at link time.
- .debug_aranges - Range table for the compilation unit, for use in building the .gdb_index section at link time. It's likely, but unverified, that the contents of this section as produced by gcc today are sufficient for this use.
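The difference noted above in how DW_AT_ranges_base applies can be made concrete with a small Python sketch. The function and parameter names are hypothetical; the rule it encodes is the one stated in the .debug_info entry of this list.

```python
def ranges_offset(value, ranges_base, from_dwo, dwarf_v5=False):
    """Return the offset into .debug_ranges for a DW_AT_ranges value.

    value       -- the DW_FORM_sec_offset value of the attribute
    ranges_base -- DW_AT_ranges_base from the skeleton compile unit
    from_dwo    -- True if the attribute was read from a .dwo section
    dwarf_v5    -- True to apply the DWARF v5 rule to skeleton units too
    """
    if from_dwo or dwarf_v5:
        # References from .dwo sections are always relative to the base;
        # in DWARF v5 this also holds for the skeleton unit itself.
        return ranges_base + value
    # Current GCC (DWARF v4 based): the skeleton unit's DW_AT_ranges is
    # a relocated offset that is not relative to DW_AT_ranges_base.
    return value
```

A consumer that supports both behaviours needs to know which producer convention applies before interpreting DW_AT_ranges in a skeleton unit.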
The following debug sections will be generated by gcc and the assembler in the .o file, but will be moved to the separate .dwo file in post-compile processing:
- .debug_abbrev.dwo - Defines the abbreviation codes used by the .debug_info.dwo section.
- .debug_info.dwo - Contains the complete debug information for the compilation unit. The top-level DW_TAG_compile_unit DIE will contain the standard attributes, except for DW_AT_stmt_list, DW_AT_low_pc, DW_AT_high_pc, and DW_AT_ranges.
- .debug_types.dwo - Contains the complete debug information for each type unit. (There may be multiple instances of this section. Unlike in a .o file, the .debug_types.dwo sections will not be placed in COMDAT groups.)
- .debug_loc.dwo - Contains the location lists referenced from the .debug_info.dwo section.
- .debug_line.dwo - Contains a skeleton line table providing directory and file names referenced by DW_AT_file attributes in the .debug_types.dwo sections. (There may be multiple line tables, one for each type unit, or there may be one line table shared by the many type units in a .dwo file.)
- .debug_str.dwo - Contains the strings referenced indirectly via DW_FORM_str_index (through the .debug_str_offsets.dwo section) from the .debug_info.dwo and .debug_types.dwo sections.
- .debug_str_offsets.dwo - Contains an array of offsets of strings in the .debug_str.dwo section. Attributes using DW_FORM_str_index will reference an entry in this array.
- .debug_macinfo.dwo or .debug_macro.dwo - Macro information, moved to the .dwo file, but otherwise unaffected by this design.
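The selection made by the post-compile step can be modelled in a few lines of Python. This sketch only decides which section names stay and which move, based on the ".dwo" suffix convention described earlier; the actual moving of section contents would be done by a tool such as objcopy, and the function name is made up for this example.

```python
def split_sections(section_names):
    """Partition section names into (stay_in_o, move_to_dwo)."""
    stay, move = [], []
    for name in section_names:
        # Sections destined for the .dwo file are named with a
        # ".dwo" suffix by the compiler.
        (move if name.endswith(".dwo") else stay).append(name)
    return stay, move


if __name__ == "__main__":
    names = [".text", ".debug_info", ".debug_addr", ".debug_line",
             ".debug_info.dwo", ".debug_str.dwo", ".debug_str_offsets.dwo"]
    stay, move = split_sections(names)
    print("remain in .o:", stay)
    print("move to .dwo:", move)
```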
Building a GDB Index
The .gdb_index section (see GNU Index Section) allows GDB to locate the appropriate compilation unit or type unit quickly, given a name or an address, without having to open all the .dwo files and scan all the debug information at start-up time.
In order to build the .gdb_index section, the linker today needs to do three things:
- Scan the .debug_info sections to extract all the public names mentioned in those sections and build a hash table mapping those names to a list of compilation units that provide definitions for those names.
- Scan the compilation unit headers of the .debug_info sections to build a list of compilation units, and to build a range table that can be used to map an address to a specific compilation unit.
- Scan the .debug_types sections to build a list of type units.
In this new design, the .debug_info and .debug_types sections will not be available for the linker to scan. Instead, the linker will extract the public names from the .debug_pubnames and .debug_pubtypes sections, the range table from the .debug_aranges sections, and the list of type units from the .debug_pubtypes sections.
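The two lookups the index has to support, name to compilation units and address to compilation unit, can be modelled as follows. This is an in-memory sketch only; it does not reproduce the on-disk .gdb_index format, it assumes the address ranges do not overlap, and the input shapes (already-parsed pubnames and aranges sets) are assumptions for this example.

```python
import bisect
from collections import defaultdict


def build_name_table(pub_sets):
    """Map each public name to the list of units that define it.

    pub_sets: iterable of (cu_index, names), as would be read from
    the .debug_pubnames and .debug_pubtypes sets.
    """
    table = defaultdict(list)
    for cu_index, names in pub_sets:
        for name in names:
            if cu_index not in table[name]:
                table[name].append(cu_index)
    return dict(table)


def build_range_table(arange_sets):
    """Build a sorted list of (start, end, cu_index) tuples.

    arange_sets: iterable of (cu_index, [(start, length), ...]), as
    would be read from the .debug_aranges sets.
    """
    ranges = [(start, start + length, cu)
              for cu, pairs in arange_sets
              for start, length in pairs]
    ranges.sort()
    return ranges


def cu_for_address(range_table, addr):
    """Return the unit whose range contains addr, or None."""
    starts = [r[0] for r in range_table]
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0 and addr < range_table[i][1]:
        return range_table[i][2]
    return None
```

With these tables a debugger can decide which .dwo file to open for a given name or PC value without scanning any .debug_info contents.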
The format of the .gdb_index section will remain unchanged (with the possible exception of removing the range tables in favor of having gdb use .debug_aranges directly).
The .debug_pubnames section was always intended as a comprehensive list of symbols that gdb could use for quick lookup, but bugs in gcc have so far prevented gdb from using this section for its intended purpose, and the .gdb_index section has instead been built from a scan over the .debug_info section. We will need to fix gcc so that the .debug_pubnames section contains all of the names that gdb requires in its index.
The .debug_pubtypes section, likewise, was intended as a list of public types, but does not yet include types or public names (e.g., enum constants) defined in type units in the .debug_types sections. We will have to modify gcc to produce the required information.
Each .debug_pubnames and .debug_pubtypes section contains a reference to the compilation unit in the .debug_info section, but there is no efficient way to find the pubname/pubtypes section given a compilation unit or a type unit. In order to simplify the processing necessary to build the .gdb_index section, we will add two new attributes to the DW_TAG_compile_unit and DW_TAG_type_unit DIEs: DW_AT_pubnames and DW_AT_pubtypes, whose values will be references (with form DW_FORM_sec_offset) to the associated pubnames/pubtypes sets. These attributes have the further benefit that the consumer will be able to know for sure that the list of public names in each section is reliable (relative to older versions of GCC).
The .debug_aranges section currently contains all the required information for producing the .gdb_index section (as far as I know), and could in fact be used directly by gdb instead of having the linker reformat its contents into .gdb_index. We can either modify the linker to extract the necessary range information from .debug_aranges to produce .gdb_index, or modify gdb to use the existing information from .debug_aranges instead of expecting it in .gdb_index.
Packaging Debug Information for Release
While the .dwo files are convenient for debugging an application during development, when the .dwo files can all be found in the build tree, once an application is copied outside the build tree, it is desirable to be able to collect the full set of .dwo files into a single package. For this purpose, we will support a dwp utility that will combine a set of .dwo files into a single DWARF package file, for which we will use the ".dwp" extension. The proposed package file format is described in DWARF Package File Format.
Proposed Extensions to the DWARF Specification
We propose adding several new sections, attribute codes, and form codes to the DWARF specification, and extending the .debug_pubtypes format to support type units.
New DWARF sections
- .debug_addr -- This section contains relocated references to loadable sections in the program. No header is required; the base of each compilation unit's contribution to the section is given by the DW_AT_addr_base attribute of the skeleton DW_TAG_compile_unit DIE in the .debug_info section. Each entry is 4 or 8 bytes, depending on the address_size field of the compilation unit header in the .debug_info section.
- .debug_abbrev.dwo -- This section contains the abbreviation table for the .debug_info.dwo section. Its format is identical to the .debug_abbrev section.
- .debug_info.dwo -- This section contains the bulk of the debug information for a compilation unit. Its format is identical to the .debug_info section.
- .debug_types.dwo -- This section contains a type unit. Its format is identical to the .debug_types section. Each type unit is placed in a separate .debug_types.dwo section; unlike .debug_types sections, however, .debug_types.dwo sections are not placed in COMDAT groups.
- .debug_line.dwo -- This section contains a skeleton line table providing file names referenced by DIEs in the .debug_types.dwo section. Its format is the same as the .debug_line section, but it will not contain any line number tables.
- .debug_loc.dwo -- This section contains the location lists referenced from .debug_info.dwo. Its format is like that of the .debug_loc section, except that location list entries are modified as described above. The DW_OP_addr opcode may not be used in this section, and must be replaced with DW_OP_addr_index in order to refer to a location within the loadable text or data of the program.
- .debug_str.dwo -- This section contains the DWARF string table in a .dwo file. Its format is identical to the .debug_str section in the .o file.
- .debug_str_offsets.dwo -- This section contains a list of offsets to strings in the .debug_str.dwo section. References to strings using the DW_FORM_str_index form code refer to an entry in this section. No header is required. The size of each entry is 4 bytes for DWARF-32 compilation units, or 8 bytes for DWARF-64 compilation units, as determined by the unit_length field of the compilation unit header in the .debug_info.dwo section. Entries in this section are not relocated--each entry is an offset relative to the beginning of the .debug_str.dwo section in the same .dwo file.
- .debug_id.dwo -- This section contains a 64-bit signature of the debug information for the compilation unit. The debugger will compare this signature with the value from the DW_AT_GNU_dwo_id attribute found in the skeleton DW_TAG_compile_unit DIE, to verify a match. (Future improvement: We would like to add the signature to the CU header in both the .debug_info section and the .debug_info.dwo section, remove the DW_AT_GNU_dwo_id attribute, and remove this section.)
- .debug_macinfo.dwo or .debug_macro.dwo -- These sections contain macro information generated by the -g3 compiler option, and have the same format as the corresponding .debug_macinfo and .debug_macro sections. The compiler generates .debug_macinfo.dwo for strict DWARF output, or .debug_macro.dwo as a GNU extension.
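The string lookup described for .debug_str_offsets.dwo can be sketched as follows. This is a minimal illustration, not GDB's implementation: the function names, the toy section contents, and the little-endian byte order are all assumptions made for the example.

```python
import struct

def read_cstring(debug_str: bytes, offset: int) -> str:
    """Read a NUL-terminated string starting at `offset` in .debug_str.dwo."""
    end = debug_str.index(b"\x00", offset)
    return debug_str[offset:end].decode("utf-8")

def lookup_string(str_offsets: bytes, debug_str: bytes, index: int,
                  dwarf64: bool = False) -> str:
    """Resolve a DW_FORM_str_index value (hypothetical helper).

    The section has no header, so entry `index` starts at index * entry_size.
    Entries are 4 bytes for DWARF-32 units and 8 bytes for DWARF-64 units.
    Little-endian data is assumed here.
    """
    entry_size = 8 if dwarf64 else 4
    fmt = "<Q" if dwarf64 else "<I"
    (offset,) = struct.unpack_from(fmt, str_offsets, index * entry_size)
    # The entry is an unrelocated offset from the start of .debug_str.dwo.
    return read_cstring(debug_str, offset)

# Toy section contents (made up for illustration).
debug_str = b"main\x00argc\x00argv\x00"
str_offsets32 = struct.pack("<3I", 0, 5, 10)
str_offsets64 = struct.pack("<3Q", 0, 5, 10)
```

Because the entries are not relocated, a consumer needs only the two sections from the same .dwo file to resolve any string index.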
New DWARF attributes
- DW_AT_dwo_name -- A string-valued attribute that identifies the name of the corresponding .dwo file. This attribute is found in the DW_TAG_compile_unit DIE of the .debug_info section, and is used by the debugger to locate the .dwo file that contains the debug information for the compilation unit.
- DW_AT_dwo_id -- A constant-valued attribute that provides a unique ID for the compilation unit. This attribute is found in the DW_TAG_compile_unit DIE in the skeleton .debug_info sections and in the .debug_id.dwo section in the .dwo file, and can be used to verify a match between the .o file (or linked binary) and the .dwo file.
- DW_AT_addr_base -- A reference-valued attribute that refers to the beginning of the compilation unit's contribution to the .debug_addr section. This attribute is found in the DW_TAG_compile_unit DIE of the .debug_info section.
- DW_AT_ranges_base -- A reference-valued attribute that refers to the beginning of the compilation unit's contribution to the .debug_ranges section. This attribute is found in the DW_TAG_compile_unit DIE of the skeleton .debug_info section.
- DW_AT_pubnames -- A reference-valued attribute that refers to the set of names in .debug_pubnames provided by this compilation unit (and by any type units that accompany the compilation unit).
- DW_AT_pubtypes -- A reference-valued attribute that refers to the set of names in .debug_pubtypes provided by this compilation unit (and by any type units that accompany the compilation unit).
New DWARF form codes
- DW_FORM_addr_index -- An unsigned LEB128 value that refers to an entry in the .debug_addr section. This form belongs to the address class, and may be used in a .debug_info.dwo section for any attribute that would have otherwise used DW_FORM_addr to refer to an address in a loadable section of the linked binary. The first entry in the compilation unit's contribution to the .debug_addr section has an index of 0.
- DW_FORM_str_index -- An unsigned LEB128 value that refers to an entry in the .debug_str_offsets.dwo section. This form belongs to the string class, and may be used in a .debug_info.dwo or .debug_types.dwo section for any attribute that would have otherwise used DW_FORM_strp to refer to a string in the .debug_str section. The first entry in the .debug_str_offsets.dwo section has an index of 0.
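Both form codes carry their index as an unsigned LEB128 value. The sketch below decodes such a value and then uses it to fetch an entry from .debug_addr. It assumes that the value of DW_AT_addr_base is a byte offset into .debug_addr, that addresses are 8 bytes wide, and that the data is little-endian; the helper names are hypothetical.

```python
import struct

def decode_uleb128(data: bytes, pos: int = 0):
    """Decode one unsigned LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry payload
        if byte & 0x80 == 0:               # high bit clear marks the last byte
            return result, pos
        shift += 7

def lookup_address(debug_addr: bytes, addr_base: int, index: int,
                   address_size: int = 8) -> int:
    """Fetch entry `index` of a unit's contribution to .debug_addr.

    `addr_base` is assumed to be the byte offset at which the unit's
    contribution begins; index 0 is the first entry of that contribution.
    """
    fmt = "<Q" if address_size == 8 else "<I"
    (addr,) = struct.unpack_from(fmt, debug_addr,
                                 addr_base + index * address_size)
    return addr

# Toy .debug_addr holding two units' contributions of two addresses each.
debug_addr = struct.pack("<4Q", 0x1000, 0x1040, 0x2000, 0x2080)
second_unit_base = 16  # the second unit's contribution starts after 2 * 8 bytes
```

With this layout, index 0 in the second unit resolves to 0x2000 rather than 0x1000, because indices are relative to each unit's own contribution.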
New DWARF location expression opcodes
- DW_OP_addr_index -- Takes an unsigned LEB128 operand that refers to an entry in the .debug_addr section. The first entry in the compilation unit's contribution to the .debug_addr section has an index of 0. The value obtained from the .debug_addr section will be treated as an address by the DWARF consumer, like DW_OP_addr, and may need to be relocated if the load module's load address is different from its link-time address.
- DW_OP_const_index -- Takes an unsigned LEB128 operand that refers to an entry in the .debug_addr section. The first entry in the compilation unit's contribution to the .debug_addr section has an index of 0. The value obtained from the .debug_addr section will be treated as a constant value by the DWARF consumer, like DW_OP_const4u and DW_OP_const8u.
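The two opcodes read the same table and differ only in how the consumer interprets the fetched value. A rough sketch of that difference follows; the function name, the `load_bias` parameter (load address minus link-time address), and the 8-byte little-endian entry layout are assumptions for illustration.

```python
import struct

def eval_index_op(op: str, index: int, debug_addr: bytes, addr_base: int,
                  load_bias: int = 0) -> int:
    """Return the value an indexed opcode would push (hypothetical helper).

    Assumes 8-byte little-endian entries and that `addr_base` is the byte
    offset of the unit's contribution to .debug_addr.
    """
    (value,) = struct.unpack_from("<Q", debug_addr, addr_base + index * 8)
    if op == "DW_OP_addr_index":
        # Treated as an address: adjust when the module was loaded at a
        # different address than it was linked at.
        return value + load_bias
    if op == "DW_OP_const_index":
        # Treated as a plain constant: never adjusted.
        return value
    raise ValueError(f"unsupported opcode: {op}")

# Toy contribution: one address entry and one constant entry.
toy_addr = struct.pack("<2Q", 0x401000, 16)
```

In this toy setup, a module loaded 0x10000 bytes above its link-time address yields 0x411000 for the address entry, while the constant entry stays 16.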
Extension to .debug_pubtypes
The .debug_pubtypes table will also contain the public types defined by type units generated with the compilation unit. Its header will point to the compilation unit in the .debug_info section, but the DW_AT_pubtypes attribute in the DW_TAG_type_unit DIE in the type unit will provide the link between the type unit and the pubtypes table.