[llvm-dev] Range lists, zero-length functions, linker gc

Alexey Lapshin via llvm-dev llvm-dev at lists.llvm.org
Fri May 29 06:31:11 PDT 2020


But this would not completely solve the problem from https://reviews.llvm.org/D59553 (overlapping address ranges). The binutils approach solves the problem if the address range is specified as start_address:end_address: while resolving relocations, it would replace such a range with 1:1.

However, it would not work if address ranges are specified as start_address:length, since the length is not relocated. This case could additionally be fixed by a fast scan of the debug info for HighPC values defined as a length, changing them to 1. This is along the lines of what you suggested here: http://lists.llvm.org/pipermail/llvm-dev/2020-May/141599.html.
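To make the difference between the two encodings concrete, here is a minimal sketch of what a consumer would decode after relocations into dead code are resolved to 1. The helper names (`resolve_dead`, `decode_range`) and the example addresses are hypothetical and only model the idea; this is not linker code.

```python
# Sketch only: models how a consumer would decode a function's address range
# after the linker resolves relocations into discarded code to the value 1.
# `resolve_dead` and `decode_range` are hypothetical names for illustration.

DEAD = 1  # value written for any relocation that targets discarded code


def resolve_dead(is_relocated, original):
    """Relocated fields pointing into dead code become DEAD; constants stay as-is."""
    return DEAD if is_relocated else original


def decode_range(low_pc, high_pc, high_is_length):
    """Return the half-open range [start, end) a consumer would compute."""
    end = low_pc + high_pc if high_is_length else high_pc
    return low_pc, end


# start_address:end_address form: both fields carry relocations.
start_end = decode_range(resolve_dead(True, 0x1000), resolve_dead(True, 0x1040), False)

# start_address:length form: only the start is relocated; the length (0x40)
# is a plain constant and survives unchanged.
start_len = decode_range(resolve_dead(True, 0x1000), resolve_dead(False, 0x40), True)

print(start_end)  # start == end, so the range is empty
print(start_len)  # non-empty range that may overlap live code at low addresses
```

In the second case the range keeps its original length, which is why the length would also have to be patched for this approach to work.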

Hmm, I don't /think/ I intended to suggest anything that would have to parse all the debuginfo, even if just to fix up highpc. I meant that debug_rnglist for the CU at least (rnglist has fewer problems - you can't accidentally terminate it early, but it still has the problem that large functions in programs that use relatively low code addresses can't just be resolved to "addend", because then [0, length) of the large function might overlap into that code address range) could be modified by a DWARF-aware linker to remove the unused chunks.

Right, you did not. That is my suggestion to extend the idea: fix not only debug_rnglist but all other occurrences of HighPC as well.

The DWARF that describes a specific function using lowpc/highpc - it may be split into a .dwo file and unreachable by the linker - so it /needs/ a magic value for the address referenced by the lowpc to indicate that it is invalid.

For split DWARF, a solution that updates HighPC should also patch the .dwo files.

Which all comes back to "we probably need to pick a value that's explicitly invalid" and -2 (max - 1) seems to be about the right thing.

So it looks like the following solution could fix both problems and be relatively fast: "Resolve all relocations from debug sections into dead code to 1. Parse debug sections and replace the HighPC of an address range pointing to dead code and specified as a length with 1".

That second part seems pretty expensive compared to anything else the linker is doing with debug info. I'd try to avoid it if at all possible.

Agreed. Though there are some concerns about -2, which may or may not turn out to be important:

I do not know of any real problems caused by using UINT64_MAX-1 for address ranges pointing to deleted code. Moreover, while testing https://reviews.llvm.org/D59553 I noticed that tools work better with it: lldb, llvm-symbolizer, GNU addr2line, GNU objdump. They report code locations correctly with the patch and incorrectly without it.

But there is a corner case: the address range is specified as start_address:length. After replacing start_address with -2, adding the length wraps around the end of the address space, so LowPC becomes higher than HighPC.

From the point of view of the DWARF standard, this is "undefined behavior": the standard says nothing about this case. Different tools could interpret it differently. Some tools could assume that such a situation is not possible and crash if it occurs. Some could ignore it. Others could report an error and stop working. For example, llvm-dwarfdump --verify reports an error and continues to work.

llvm-dwarfdump --verify : error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
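The numbers in that message can be reproduced with a little 64-bit arithmetic. The sketch below assumes a function length of 6 bytes (inferred from the reported range, not stated anywhere in this thread) and assumes the consumer computes HighPC as LowPC + length with 64-bit wraparound; `range_from_length` is a hypothetical helper.

```python
# Sketch only: reproduces the inverted range from the llvm-dwarfdump message,
# assuming HighPC is computed as LowPC + length modulo 2**64.

MASK64 = (1 << 64) - 1
TOMBSTONE = (-2) & MASK64  # UINT64_MAX - 1 == 0xfffffffffffffffe


def range_from_length(low_pc, length):
    """Compute [low_pc, high_pc) for a start_address:length encoding."""
    high_pc = (low_pc + length) & MASK64  # 64-bit wraparound
    return low_pc, high_pc


# Length 6 is an assumption chosen to match the reported range.
low, high = range_from_length(TOMBSTONE, 6)
print(f"[{low:#018x}, {high:#018x})")
```

Because the sum wraps past the end of the address space, the computed end address is lower than the start address.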

So after implementing this, some tools could potentially stop working. I do not know of any such tools, so I am not sure whether this is a real problem.

Additionally, it is necessary to document this behavior in the DWARF standard to avoid problems in the future (same as for zero-length address ranges):

"A bounded range entry whose beginning address offset is greater than its ending address offset indicates an invalid range and may be ignored."

Note that this does not specify an additional magic value (UINT64_MAX-1). Instead, it describes the general situation (LowPC > HighPC).
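On the consumer side, the proposed wording would amount to a simple filter. The following is a minimal sketch with a hypothetical `usable_ranges` helper and made-up example ranges; real tools may of course handle such ranges differently.

```python
# Sketch only: a consumer-side filter following the proposed wording.
# Ranges are half-open (begin, end) pairs of 64-bit addresses.


def usable_ranges(ranges):
    """Keep only ranges with begin < end; drop zero-length and inverted ones."""
    return [(begin, end) for begin, end in ranges if begin < end]


ranges = [
    (0x1000, 0x1040),           # live function
    (0x2000, 0x2000),           # zero-length range
    (0xfffffffffffffffe, 0x4),  # inverted range produced by the -2 value
]

kept = usable_ranges(ranges)
print([(hex(begin), hex(end)) for begin, end in kept])
```

The check does not mention UINT64_MAX-1 at all: any range with LowPC > HighPC is skipped, whichever value produced it.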

If backward compatibility is not a problem, then using LowPC > HighPC to indicate an invalid address range pointing to deleted code seems to be the fastest solution (it could be implemented by resolving relocations from debug sections into deleted code to UINT64_MAX-1).
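As a rough illustration of the relocation step, the sketch below resolves a relocation to UINT64_MAX-1 whenever its target section was discarded. `Section` and `resolve` are hypothetical names invented for this example and do not correspond to any actual linker API.

```python
# Sketch only: resolving a debug-section relocation, writing UINT64_MAX - 1
# when the target section was garbage-collected. Names are hypothetical.
from dataclasses import dataclass

MASK64 = (1 << 64) - 1
TOMBSTONE = MASK64 - 1  # UINT64_MAX - 1, i.e. -2 as a 64-bit value


@dataclass
class Section:
    address: int  # final address assigned by the linker (unused if not live)
    live: bool    # False if the section was discarded


def resolve(target, addend):
    """Return the value to write for a relocation against `target`."""
    if not target.live:
        return TOMBSTONE
    return (target.address + addend) & MASK64


print(hex(resolve(Section(address=0x1000, live=True), 8)))
print(hex(resolve(Section(address=0, live=False), 8)))
```

No debug info has to be parsed for this: the decision is made per relocation, which is what makes the approach fast.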

If backward compatibility is a problem, then we could use the already standardized "zero-length address range" to mark address ranges pointing to deleted code. That solution would require patching the address range length in the DWARF.

Thank you, Alexey


