[LLVMdev] First-class debug info IR: MDLocation
Duncan P. N. Exon Smith dexonsmith at apple.com
Fri Oct 24 16:16:28 PDT 2014
I've attached a preliminary patch for MDLocation as a follow-up to the RFC [1] last week. It's not commit-ready -- in particular, it squashes a bunch of commits together and doesn't pass make check -- but I think it's close enough to indicate the direction and work toward consensus.
IMO, the files to focus on are:
include/llvm/IR/DebugInfo.h
include/llvm/IR/DebugLoc.h
include/llvm/IR/Metadata.h
include/llvm/IR/Value.h
lib/AsmParser/LLLexer.cpp
lib/AsmParser/LLParser.cpp
lib/AsmParser/LLParser.h
lib/AsmParser/LLToken.h
lib/Bitcode/Reader/BitcodeReader.cpp
lib/Bitcode/Writer/BitcodeWriter.cpp
lib/Bitcode/Writer/ValueEnumerator.cpp
lib/Bitcode/Writer/ValueEnumerator.h
lib/IR/AsmWriter.cpp
lib/IR/AsmWriter.h
lib/IR/DebugInfo.cpp
lib/IR/DebugLoc.cpp
lib/IR/LLVMContextImpl.cpp
lib/IR/LLVMContextImpl.h
lib/IR/Metadata.cpp
Using Value instead of MDNode
A number of APIs expect MDNode -- previously, the only referenceable type of metadata -- but this patch (and the ones that will follow) introduces referenceable metadata that does not inherit from MDNode. Metadata APIs such as Instruction::getMetadata() and NamedMDNode::getOperand() need to return non-MDNode metadata.
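To illustrate why the return type has to widen, here is a minimal sketch using a toy class hierarchy. None of these classes are LLVM's real ones: Value, MDNode and MDLocation below are simplified stand-ins, the kind tag is an assumption made for illustration, and asLocation() is a hypothetical helper. The only point is that an accessor typed as MDNode* could never hand back an MDLocation, while one typed as the common base can.

```cpp
#include <cassert>

// Toy stand-ins, NOT LLVM's real classes. A kind tag plays the role
// that LLVM-style RTTI would play in the actual code base.
struct Value {
  enum Kind { MDNodeKind, MDLocationKind };
  explicit Value(Kind K) : TheKind(K) {}
  Kind getKind() const { return TheKind; }

private:
  Kind TheKind;
};

struct MDNode : Value {
  MDNode() : Value(MDNodeKind) {}
};

// Deliberately not derived from MDNode, mirroring the text above.
struct MDLocation : Value {
  MDLocation(unsigned Line, unsigned Column)
      : Value(MDLocationKind), Line(Line), Column(Column) {}
  unsigned Line;
  unsigned Column;
};

// Hypothetical helper: callers that receive the base type check the
// kind before downcasting, and get nullptr for anything else.
const MDLocation *asLocation(const Value *V) {
  if (V && V->getKind() == Value::MDLocationKind)
    return static_cast<const MDLocation *>(V);
  return nullptr;
}
```

A caller that used to assume MDNode would instead test the returned pointer, e.g. with asLocation(), and fall back to its old path when the result is null.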
I plan to commit the API changes incrementally so we can fix any issues there before pushing the functionality changes. Unfortunately, this currently adds a lot of noise to the (squashed) patch.
Introducing MDLocation
Of course, this adds MDLocation, the first subclass of MDUser. This is a first-class IR type that has two other representations: DILocation (which now trivially wraps MDLocation instead of MDNode) and DebugLoc.
I've genericised the code in LLParser (and elsewhere) to sketch out how adding other MDUser subclasses will go. Perhaps I used the wrong axis, but we can adjust it as we go.
Usage examples:
!6 = metadata MDLocation(line: 43, column: 7, scope: !4)
!7 = metadata MDLocation(scope: !5, line: 67, inlinedAt: !6)
The fields can be listed in any order. The scope: field is required, but the others are optional (line: and column: default to 0, inlinedAt: defaults to null).
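The field rules above (any order, scope: required, defaults for the rest) could be sketched roughly as follows. This is not the code from the patch: LocationFields, Field and buildLocation() are hypothetical names, the input is assumed to be already lexed into (name, value) pairs, and rejecting duplicate or unknown fields is my assumption rather than something stated above.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plain-data view of an MDLocation's fields.
struct LocationFields {
  unsigned Line = 0;         // line: defaults to 0
  unsigned Column = 0;       // column: defaults to 0
  unsigned Scope = 0;        // scope: required, so no meaningful default
  bool HasInlinedAt = false; // false models "inlinedAt: null"
  unsigned InlinedAt = 0;
};

// A field as (name, metadata ID or integer value).
using Field = std::pair<std::string, unsigned>;

// Returns an empty string on success, or an error message.
// Fields are accepted in any order.
std::string buildLocation(const std::vector<Field> &Fields,
                          LocationFields &Out) {
  LocationFields L;
  std::set<std::string> Seen;
  for (const Field &F : Fields) {
    // Assumption: a field may appear at most once.
    if (!Seen.insert(F.first).second)
      return "field '" + F.first + "' specified more than once";
    if (F.first == "line")
      L.Line = F.second;
    else if (F.first == "column")
      L.Column = F.second;
    else if (F.first == "scope")
      L.Scope = F.second;
    else if (F.first == "inlinedAt") {
      L.HasInlinedAt = true;
      L.InlinedAt = F.second;
    } else
      return "unknown field '" + F.first + "'";
  }
  if (!Seen.count("scope"))
    return "missing required field 'scope'";
  Out = L;
  return std::string();
}
```

Feeding it the two usage examples, in their different field orders, yields the same structure a positional encoding would, with column: left at 0 and inlinedAt: left null wherever they are omitted.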
(Note that in the RFC I referred to this as an MDLineTable, but MDLocation is a better name. If/when this work supersedes the DIDescriptor hierarchy, it'll likely get renamed to DILocation, but for now there's a name clash.)
Where this is heading
Let's look at a concrete example. Here's some simple C++ code:
$ cat t.h
struct T { short a; short b; };
$ cat foo.cpp
#include "t.h"
int foo(T t) { return t.a + t.b; }
$ cat bar.cpp
#include "t.h"
int foo(T t);
int bar(T t) { return foo(t) * 2; }
Looking forward, after refactoring ownership and uniquing and fixing up a few schema issues, I'd expect the above to link into something like the following:
!0 = metadata DIFile(filename: "foo.cpp", directory: "/path/to")
!1 = metadata DIFile(filename: "./t.h", directory: "/path/to")
!2 = metadata DIFile(filename: "bar.cpp", directory: "/path/to")
!3 = metadata DIBaseType(name: "short", size: 16, align: 16)
!5 = metadata DIBaseType(name: "int", size: 32, align: 32)
!6 = metadata DICompositeType(tag: 0x13, name: "T", uniqued: "_ZTS1T",
                              file: !1, line: 1, size: 32, align: 16)
!7 = metadata DIMember(line: 1, file: !1, type: !3,
                       name: "a", size: 16, align: 16, context: !6)
!8 = metadata DIMember(line: 1, file: !1, type: !3,
                       name: "b", size: 16, align: 16, context: !6)
!9 = metadata DISubroutineType(args: [ !5, !6 ])
!10 = metadata DICompileUnit(file: !0, language: 4, kind: FullDebug,
                             producer: "clang version 3.6.0 ",
                             retainedUniqueTypes: [ !6 ])
!11 = metadata DISubprogram(name: "foo", linkageName: "_Z3foo1T",
                            handle: i32(i32)* @_Z3foo1T, file: !0,
                            type: !9, context: !10)
!12 = metadata DIArgVariable(name: "t", arg: 1, line: 2, type: !6,
                             context: !11)
!13 = metadata DILocation(line: 2, column: 11, scope: !11)
!14 = metadata DILocation(line: 2, column: 16, scope: !11)
!15 = metadata DICompileUnit(file: !2, language: 4, kind: FullDebug,
                             producer: "clang version 3.6.0 ",
                             retainedUniqueTypes: [ !6 ])
!16 = metadata DISubprogram(name: "bar", linkageName: "_Z3bar1T",
                            handle: i32 (i32)* @_Z3bar1T, file: !2,
                            type: !9, context: !15)
!17 = metadata DIArgVariable(name: "t", arg: 2, line: 3, type: !6,
                             context: !16)
!18 = metadata DILocation(line: 3, column: 11, scope: !16)
!19 = metadata DILocation(line: 3, column: 23, scope: !16)
Notice that only the links to parents (i.e., context:) are explicit here -- backlinks are implied. For example, !7 and !8 point to !6, but not the reverse.
This has the interesting property of removing all cycles from serialization (assembly and bitcode).
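As a rough illustration of "backlinks are implied": a reader can rebuild each node's list of children from the explicit parent links alone. The sketch below assumes a hypothetical Node record holding just a metadata ID and its context ID; it is not an LLVM type, and NoContext is an invented sentinel for nodes without a parent.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Invented sentinel for "this node has no context:" field.
const int NoContext = -1;

// Hypothetical stand-in for a debug info node: its metadata ID and the
// ID of the parent it points to.
struct Node {
  int ID;
  int Context;
};

// Recover the implied backlinks: map each parent ID to the IDs of the
// nodes naming it as their context, in serialization order.
std::map<int, std::vector<int>>
computeChildren(const std::vector<Node> &Nodes) {
  std::map<int, std::vector<int>> Children;
  for (const Node &N : Nodes) {
    assert(N.ID != N.Context && "a node cannot be its own context");
    if (N.Context != NoContext)
      Children[N.Context].push_back(N.ID);
  }
  return Children;
}
```

For the listing above, this yields !6 -> { !7, !8 } and !11 -> { !12 }, without the serialized form ever storing a pointer from parent to child. (The DILocation nodes reach !11 through scope: rather than context:, so they would need the same treatment applied to that field.)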
Making debug info assembly readable and writable
Moreover, we're now in a place where it's trivial to express the "context" pointer structurally. Here's the same debug info as above, using syntactic sugar to fill the "context" pointers:
!0 = metadata DIFile(filename: "foo.cpp", directory: "/path/to")
!1 = metadata DIFile(filename: "./t.h", directory: "/path/to")
!2 = metadata DIFile(filename: "bar.cpp", directory: "/path/to")
!3 = metadata DIBaseType(name: "short", size: 16, align: 16)
!5 = metadata DIBaseType(name: "int", size: 32, align: 32)
!6 = metadata DICompositeType(tag: 0x13, name: "T", uniqued: "_ZTS1T",
                              file: !1, line: 1, size: 32, align: 16) {
  !7 = metadata DIMember(line: 1, file: !1, type: !3,
                         name: "a", size: 16, align: 16)
  !8 = metadata DIMember(line: 1, file: !1, type: !3,
                         name: "b", size: 16, align: 16)
} ; !6
!9 = metadata DISubroutineType(args: [ !5, !6 ])
!10 = metadata DICompileUnit(file: !0, language: 4, kind: FullDebug,
                             producer: "clang version 3.6.0 ",
                             retainedUniqueTypes: [ !6 ]) {
  !11 = metadata DISubprogram(name: "foo", linkageName: "_Z3foo1T",
                              handle: i32(i32)* @_Z3foo1T, file: !0,
                              type: !9) {
    !12 = metadata DIArgVariable(name: "t", arg: 1, line: 2, type: !6)
    !13 = metadata DILocation(line: 2, column: 11)
    !14 = metadata DILocation(line: 2, column: 16)
  } ; !11
} ; !10
!15 = metadata DICompileUnit(file: !2, language: 4, kind: FullDebug,
                             producer: "clang version 3.6.0 ",
                             retainedUniqueTypes: [ !6 ]) {
  !16 = metadata DISubprogram(name: "bar", linkageName: "_Z3bar1T",
                              handle: i32 (i32)* @_Z3bar1T, file: !2,
                              type: !9) {
    !17 = metadata DIArgVariable(name: "t", arg: 2, line: 3, type: !6)
    !18 = metadata DILocation(line: 3, column: 11)
    !19 = metadata DILocation(line: 3, column: 23)
  } ; !16
} ; !15
This assembly has the following advantages over the status quo:
Fields are named. Aside from readability, this prevents adding/reordering fields in the schema from requiring testcase updates.
Serialization graph becomes a DAG. Aside from readability, this removes most RAUW from assembly (and all RAUW from bitcode).
Structure is clear.
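To give a feel for how little machinery the hierarchical form needs, here is a minimal sketch of a writer that turns flat nodes with parent links into the nested brace syntax. It is not the AsmWriter code from the patch: Node, printNode() and printHierarchical() are hypothetical, each node's text is assumed to be pre-rendered with its context field already dropped, and the two-space indent is an arbitrary choice.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical flat node: ID, pre-rendered body, and parent ID
// (-1 for a top-level node).
struct Node {
  int ID;
  std::string Body; // e.g. "metadata DILocation(line: 2, column: 11)"
  int Context;
};

typedef std::map<int, std::vector<const Node *>> ChildMap;

// Emit one node, then its children inside braces, closing with the
// "} ; !N" comment used in the example above.
static void printNode(const Node &N, const ChildMap &Children,
                      unsigned Depth, std::string &Out) {
  std::string Indent(Depth * 2, ' ');
  Out += Indent + "!" + std::to_string(N.ID) + " = " + N.Body;
  ChildMap::const_iterator It = Children.find(N.ID);
  if (It == Children.end()) {
    Out += "\n";
    return;
  }
  Out += " {\n";
  for (const Node *C : It->second)
    printNode(*C, Children, Depth + 1, Out);
  Out += Indent + "} ; !" + std::to_string(N.ID) + "\n";
}

// Group nodes under their parents, then print from the top-level ones.
std::string printHierarchical(const std::vector<Node> &Nodes) {
  ChildMap Children;
  for (const Node &N : Nodes)
    if (N.Context >= 0)
      Children[N.Context].push_back(&N);
  std::string Out;
  for (const Node &N : Nodes)
    if (N.Context < 0)
      printNode(N, Children, 0, Out);
  return Out;
}
```

Since the flat and nested forms carry the same information, a writer along these lines could plausibly sit behind a command-line flag and emit either one from the same in-memory nodes.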
Bike sheds to paint
Should we trim some boilerplate? E.g., it would be trivial to change:
!6 = metadata MDLocation(line: 43, column: 7, scope: !4)
to:
!6 = MDLocation(line: 43, column: 7, scope: !4)
This would not complicate LLParser. Thoughts?
Which of the two "end goal" syntaxes is better: flat, or hierarchical? Better for what? Why?
The flat one might be better for FileCheck-ing (not sure), but IMO the hierarchical one is much saner for us humans, and that's the main point of assembly. It wouldn't be hard to default to one and write the other based on a command-line flag -- is that a good idea?
Assembly syntax is pretty easy to change, so this doesn't have to be perfect now. Nevertheless, is there a magical syntax that would be easier to read/write/FileCheck?
-------------- next part -------------- A non-text attachment was scrubbed... Name: MDLocation-preview.patch Type: application/octet-stream Size: 608893 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141024/2296c85c/attachment.obj> -------------- next part --------------