Demangled names in debug info (PDB) for Swift or other non-C++ languages
January 18, 2024, 11:02pm 1
Hi,
When the Swift compiler generates PDB (CodeView) files, it currently writes demangled names (the LF_FUNC_ID field) for user-defined functions, but mangled names for compiler-synthesized functions that have no source-level names, such as closures.
We are considering switching some of the compiler-synthesized entities to demangled names to make backtraces more readable for application developers.
For example:
$s7Logging0A6SystemO4lock33_C9FF2A9F0C61C813477A18DFECB0159ALL_WZ
--> one-time initialization function for lock
$s7Logging6LoggerV13MetadataValueO11descriptionSSvgSSAEXEfU_
--> closure #1 (Logger.MetadataValue) -> String in Logger.MetadataValue.description.getter
As we understand it, C++ uses mangled names and the Windows tools rely on UnDecorateSymbolName to demangle them as needed, while Rust currently demangles all the names in the PDB because the tools don't understand the Rust mangling scheme.
Do folks have insights as to whether this change could cause problems?
For example,
- It would increase the PDB file size (demangled names tend to be longer).
- Some tools, such as WPA and WinDbg, might make assumptions about the names in PDBs, for example that they contain no whitespace or special characters.
- Some tools may rely on heuristics about the form of names or declarations to decode additional information (by making assumptions about what the associated compilers emit).
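To make the second concern concrete, here is a minimal sketch of the kind of check a tool could be making implicitly. The helper `hasUnusualChars` and its set of allowed characters are assumptions chosen for illustration; they are not taken from WPA, WinDbg, or any other real tool.

```cpp
#include <cctype>
#include <string_view>

// Hypothetical helper, not part of any PDB tool: returns true when a symbol
// name contains whitespace or any character other than ASCII letters, digits,
// '_', '$', '.', '@' or '?'. The allowed set is an assumption for illustration.
bool hasUnusualChars(std::string_view name) {
    for (unsigned char c : name) {
        if (std::isalnum(c)) continue;
        switch (c) {
        case '_': case '$': case '.': case '@': case '?':
            continue;  // characters commonly seen in mangled names
        default:
            return true;  // space, '#', '(', '-', '>' and so on
        }
    }
    return false;
}
```

Under these assumptions, the two mangled Swift names above pass the check, while their demangled forms do not, because they contain spaces and characters such as `#` and `(`.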
Thank you!
rnk January 22, 2024, 10:43pm 2
What I recall is that classes and namespaces are represented as parent scopes, so you get something like this:
$ cat t.cpp
int foo(int a, int b) {
  return a + b;
}
namespace Bar {
int baz(int a, int b) {
  return a + b;
}
}
struct Qux {
  void method();
};
void Qux::method() {}
$ clang -c t.cpp -g --target=x86_64-windows-msvc && llvm-pdbutil dump -types t.o | grep -B4 -A4 LF_M\\?F
0x0074 (int): `int`
0x1001 | LF_PROCEDURE [size = 16]
return type = 0x0074 (int), # args = 2, param list = 0x1000
calling conv = cdecl, options = None
0x1002 | LF_FUNC_ID [size = 16]
name = foo, type = 0x1001, parent scope = <no type>
0x1003 | LF_STRING_ID [size = 12] ID: <no type>, String: Bar
0x1004 | LF_FUNC_ID [size = 16]
name = baz, type = 0x1001, parent scope = 0x1003
0x1005 | LF_STRUCTURE [size = 36] `Qux`
unique name: `.?AUQux@@`
vtable: <no type>, base list: <no type>, field list: <no type>
options: forward ref | has unique name, sizeof 0
0x1006 | LF_POINTER [size = 12]
referent = 0x1005, mode = pointer, opts = const, kind = ptr64
0x1007 | LF_ARGLIST [size = 8]
0x1008 | LF_MFUNCTION [size = 28]
return type = 0x0003 (void), # args = 0, param list = 0x1007
class type = 0x1005, this type = 0x1006, this adjust = 0
calling conv = cdecl, options = None
0x1009 | LF_FIELDLIST [size = 20]
- LF_ONEMETHOD [name = `method`]
type = 0x1008, vftable offset = -1, attrs = public
0x100A | LF_STRUCTURE [size = 36] `Qux`
unique name: `.?AUQux@@`
--
options: has unique name, sizeof 1
0x100B | LF_STRING_ID [size = 60] ID: <no type>, String: .../t.cpp
0x100C | LF_UDT_SRC_LINE [size = 16]
udt = 0x100A, file = 4107, line = 9
0x100D | LF_MFUNC_ID [size = 20]
name = method, type = 0x1008, class type = 0x1005
0x100E | LF_POINTER [size = 12]
referent = 0x1005, mode = pointer, opts = None, kind = ptr64
0x100F | LF_STRING_ID [size = 56] ID: <no type>, String: .../t.cpp
The LF_FUNC_ID and LF_MFUNC_ID identifiers are the basenames, with no parameter types, and with the scopes represented separately.
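As a rough illustration of that layout, the sketch below models a function ID as a basename plus an optional reference to a scope string, and rebuilds a qualified name from the two. The `FuncId` struct, the `StringIds` map, and the `::` separator are all hypothetical simplifications, not the real CodeView record layout or anything a debugger is known to do.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Simplified, hypothetical model of the ID records in the dump above.
struct FuncId {
    std::string name;                     // basename only, e.g. "baz"
    std::optional<unsigned> parentScope;  // index of a scope string, if any
};

// Maps a type index such as 0x1003 to the scope string it holds ("Bar").
using StringIds = std::unordered_map<unsigned, std::string>;

// Joins the parent scope (when present and known) with the basename.
std::string qualifiedName(const FuncId &f, const StringIds &scopes) {
    if (!f.parentScope) return f.name;
    auto it = scopes.find(*f.parentScope);
    if (it == scopes.end()) return f.name;
    return it->second + "::" + f.name;
}
```

With the records from the dump, `foo` has no parent scope and stays `foo`, while `baz` with parent scope 0x1003 becomes `Bar::baz`.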
I think this might power unqualified name lookup in the debugger, and maybe you should name your Swift-synthesized functions in whatever way would be easiest to name in the debugger.
I hope that helps.