[llvm-dev] Do I need to modify the AddrLoc of LLD for ARC target?
Peter Smith via llvm-dev llvm-dev at lists.llvm.org
Tue Sep 19 01:51:23 PDT 2017
Hello Leslie,
The errors from the GNU assembler occur because the file is being assembled in Arm state. To get rid of them, either put a .thumb directive in the file or pass -mthumb to the assembler via arm-linux-gnu-gcc -Wa,-mthumb (I think).
I'm not able to explain what you are seeing in your printout, as it doesn't quite match the map file. Looking at your source diff, I think I may have found a bug:

    uint64_t AddrLoc = getOutputSection()->Addr + Offset;
    RelExpr Expr = Rel.Expr;
    if ((Expr == R_PC || Expr == R_GOT_PC) &&
        (Config->EMachine == EM_ARC_COMPACT ||
         Config->EMachine == EM_ARC_COMPACT2)) {
      uint64_t M = 0;
      if (Type == R_ARC_32_PCREL || Type == R_ARC_PC32 ||
          Type == R_ARC_GOTPC32 || Type == R_ARC_GOTPC)
        M = 4; // bitsize >= 32 ? 4 : 0
      AddrLoc = (getOutputSection()->Addr /* output_section->vma */ +
                 cast<InputSection>(this)->OutSecOff /* output_offset */ +
                 Offset /* reloc_offset */ - M) & ~0x3;
    }
    uint64_t TargetVA = SignExtend64(
        getRelocTargetVA(Type, Rel.Addend, AddrLoc, *Rel.Sym, Expr), Bits);
Your calculation for AddrLoc doesn't seem to match the original. In trunk lld (your diff is against an old version, but I think the line hasn't changed semantically) Offset is

    uint64_t Offset = getOffset(Rel.r_offset);

which for a regular InputSection will expand to

    uint64_t Offset = this->OutSecOff + Rel.r_offset;
Original:

    AddrLoc = getOutputSection()->Addr + this->OutSecOff + Rel.r_offset;

Yours:

    AddrLoc = (getOutputSection()->Addr /* output_section->vma */ +
               cast<InputSection>(this)->OutSecOff /* output_offset */ +
               Offset /* reloc_offset */ - M) & ~0x3;

uses Offset and not Rel.r_offset, so expanding Offset gives me:

    AddrLoc = (getOutputSection()->Addr + this->OutSecOff +
               (this->OutSecOff + Rel.r_offset /* Offset */) - M) & ~0x3;
This looks like you are adding this->OutSecOff twice.
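To make the double counting concrete, here is a minimal sketch with made-up numbers. The Placement struct and the function names are hypothetical stand-ins, not LLD's actual classes; only the arithmetic mirrors the two expressions quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the three values that feed AddrLoc.
struct Placement {
  uint64_t OutSecAddr; // getOutputSection()->Addr
  uint64_t OutSecOff;  // offset of the InputSection within the OutputSection
  uint64_t ROffset;    // Rel.r_offset, offset within the InputSection
};

// What getOffset(Rel.r_offset) expands to for a regular InputSection.
inline uint64_t offsetInOutputSection(const Placement &P) {
  return P.OutSecOff + P.ROffset;
}

// Generic calculation: OutSecOff is counted exactly once.
inline uint64_t addrLocOriginal(const Placement &P) {
  return P.OutSecAddr + offsetInOutputSection(P);
}

// The patched expression from the diff: OutSecOff is added explicitly and
// then a second time, because Offset already contains it.
inline uint64_t addrLocPatched(const Placement &P, uint64_t M) {
  uint64_t Offset = offsetInOutputSection(P);
  return (P.OutSecAddr + P.OutSecOff + Offset - M) & ~uint64_t(0x3);
}
```

With an assumed OutSecAddr of 0x1000, OutSecOff of 0x40 and r_offset of 0x8, the original expression gives 0x1048 while the patched one (with M = 0) gives 0x1088, so the two differ by exactly OutSecOff. When OutSecOff is 0, as for the first InputSection in an OutputSection, both give the same result, which may be why the problem is not visible in every test.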
No idea whether this is the cause of the problem or whether you have fixed this up in the meantime. I recommend that you take a closer look at your changes to the generic parts of lld first to see if you have inadvertently changed something.
Peter
On 19 September 2017 at 04:28, Leslie Zhai <lesliezhai at llvm.org.cn> wrote:
Hi Peter,
Thanks for your kind response!
On 2017-09-18 20:44, Peter Smith wrote:
Hello Leslie,

I don't know quite what to say, as I don't know precisely what your question is. If I am not being precise enough, please can you put some explicit questions in? From what I can see in the output, here are some comments.

From your ARC map files it looks like both linkers have given the .text output section the correct base address given the alignment restrictions. The alignment requirement of .text from liba-memset-bs.o is 4, therefore the alignment requirement of the OutputSection .text should be 4:

LLD:

    Address  Size     Align
    00000000 00000080     4 .text
    00000000 00000004     1         basic-arc.o:(.text)
    00000000 00000000     0                 main
    00000004 0000007c     4         ... (liba-memset-bs.o):(.text)

LD:

    .text   0x0000000000000000   0x80
     *(.text .stub .text.* .gnu.linkonce.t.*)
     .text  0x0000000000000000    0x4 basic-arc.o
     .text  0x0000000000000004   0x7c ... libc.a(liba-memset-bs.o)
            0x0000000000000004        memset
            0x0000000000000060        strncpybzero

    Reloc type=R_ARC_S25W_PCREL, should_relocate = true
      offset = 0x0, addend = 0x0
      Symbol:
        value = 0x00000000
      Symbol Section:
        section name = .text, output_offset 0x00000006, output_section->vma = 0x00000006
        file: liba-memset-bs.o
      Input_section:
        section name = .text, output_offset 0x00000000, output_section->vma = 0x00000006
        changed_address = 0x00000006
        file: basic-arc.o
    RELOC_TYPE = ARC_S25W_PCREL
    FORMULA = ( ME ( ( ( ( S + A ) - P ) >> 2 ) ) )
    S = 0xc
    A = 0
    L = c
    symbol_section->vma = 0xc
    symbol_section->vma = 0x6
    PCL = 0x4
    P = 0x4
    G = 0
    SDA_OFFSET = 0x2188
    SDA_SET = 1
    GOT_OFFSET = 0
    relocation = 0x000002
    before = 0x000802
    data = 00000002 (2) (2)
    after = 0x0000080a

then I need to investigate how LD calculates reloc_data.input_section->output_section->vma; it might differ from LLD even with the same alignment.
https://github.com/llvm-mirror/lld/blob/master/ELF/LinkerScript.cpp#L485
I'm not entirely sure where the Arm example has come from, but it does show an interesting difference. It looks like the linkers are handling the -Ttext option slightly differently when the address of the OutputSection is not 0 modulo the OutputSection alignment. From the map file we can see that lld is aligning the OutputSection to the nearest 4-byte boundary, whereas GNU ld is placing the OutputSection at the requested address but adding padding before the .text section to make sure that the InputSection is aligned in the final executable.

LLD:

    Address  Size     Align Out     In      Symbol
    00011008 00000018     4 .text
    00011008 00000018     4         arm-thumb-undefined-weak.o:(.text)
    00011008 00000000     0                 $t.0
    00011008 00000000     0                 start

LD:

    .text   0x0000000000011006   0x1a
     ...
     *fill* 0x0000000000011006    0x2
     .text  0x0000000000011008   0x18 arm-thumb-undefined-weak.o

The fill is visible as a nop in the disassembly for the LD-produced image. Strictly speaking, I think LD is producing a file that doesn't conform to ELF here, as the sh_addr of the .text OutputSection is not 0 modulo sh_addralign (4). In practice it probably wouldn't make much difference. My preference is for LLD's behaviour here.

It might be the arm-linux-gnu toolchain's issue:

    $ arm-linux-gnu-gcc -o arm-thumb-undefined-weak-ld.o -c arm-thumb-undefined-weak.s
    arm-thumb-undefined-weak.s: Assembler messages:
    arm-thumb-undefined-weak.s:18: Error: width suffixes are invalid in ARM mode -- `beq.w target'
    arm-thumb-undefined-weak.s:20: Error: width suffixes are invalid in ARM mode -- `b.w target'

then arm-linux-gnu-ld might have wrongly relocated R_ARM_THM_CALL for arm-thumb-undefined-weak-lld.o generated by llvm-mc.

Peter

On 18 September 2017 at 03:28, Leslie Zhai <lesliezhai at llvm.org.cn> wrote:

Hi Peter,

Map file about LD for ARC target https://drive.google.com/open?id=0ByE8c-y74luRWpQdUh2c0VXZ1k
LLD for ARC https://drive.google.com/open?id=0ByE8c-y74lueGVuYkR0a3RSWjQ
arm-thumb-undefined-weak.s https://github.com/llvm-mirror/lld/blob/master/test/ELF/arm-thumb-undefined-weak.s

    $ llvm/build/bin/llvm-mc -filetype=obj -triple=thumbv7a-none-linux-gnueabi arm-thumb-undefined-weak.s -o arm-thumb-undefined-weak.o
    $ llvm/build/bin/ld.lld -o arm-thumb-undefined-weak-lld arm-thumb-undefined-weak.o -Ttext=11006
    $ arm-linux-gnu-ld -o arm-thumb-undefined-weak-ld arm-thumb-undefined-weak.o -Ttext=11006
    $ arm-linux-gnu-readelf -r arm-thumb-undefined-weak.o

    Relocation section '.rel.text' at offset 0x8c contains 6 entries:
     Offset     Info    Type              Sym.Value  Sym. Name
    00000000  00000333 R_ARM_THM_JUMP19   00000000   target
    00000004  0000031e R_ARM_THM_JUMP24   00000000   target
    00000008  0000030a R_ARM_THM_CALL     00000000   target
    0000000c  0000030a R_ARM_THM_CALL     00000000   target
    00000010  00000332 R_ARM_THM_MOVT_PR  00000000   target
    00000014  00000331 R_ARM_THM_MOVW_PR  00000000   target

    DEBUG: lld: R_ARM_THM_JUMP19 TargetVA: 0 A: -4 P: 69640 Align: 4 VMA: 69640 Output Offset: 0 Reloc Offset: 0
    DEBUG: lld: R_ARM_THM_JUMP24 TargetVA: 0 A: -4 P: 69644 Align: 4 VMA: 69640 Output Offset: 0 Reloc Offset: 4
    DEBUG: lld: R_ARM_THM_CALL TargetVA: 1 A: -4 P: 69648 Align: 4 VMA: 69640 Output Offset: 0 Reloc Offset: 8
    DEBUG: lld: R_ARM_THM_CALL TargetVA: 1 A: -4 P: 69652 Align: 4 VMA: 69640 Output Offset: 0 Reloc Offset: 12
    DEBUG: lld: R_ARM_THM_MOVT_PREL TargetVA: 0 A: 0 P: 69656 Align: 4 VMA: 69640 Output Offset: 0 Reloc Offset: 16
    DEBUG: lld: R_ARM_THM_MOVW_PREL_NC TargetVA: 0 A: 0 P: 69660 Align: 4 VMA: 69640 Output Offset: 0 Reloc Offset: 20

    DEBUG: arm-linux-gnu-ld: R_ARM_THM_JUMP19: VMA: 69638 Output Offset: 2 Reloc Offset: 0
    DEBUG: arm-linux-gnu-ld: R_ARM_THM_JUMP24: VMA: 69638 Output Offset: 2 Reloc Offset: 4
    DEBUG: arm-linux-gnu-ld: R_ARM_THM_CALL: VMA: 69638 Output Offset: 2 Reloc Offset: 8
    DEBUG: arm-linux-gnu-ld: R_ARM_THM_CALL: VMA: 69638 Output Offset: 2 Reloc Offset: 12
    DEBUG: arm-linux-gnu-ld: R_ARM_THM_MOVT_PREL: VMA: 69638 Output Offset: 2 Reloc Offset: 16
    DEBUG: arm-linux-gnu-ld: R_ARM_THM_MOVW_PREL_NC: VMA: 69638 Output Offset: 2 Reloc Offset: 20

    $ llvm/build/bin/llvm-objdump -triple=thumbv7a-none-linux-gnueabi -d arm-thumb-undefined-weak-lld

    arm-thumb-undefined-weak-lld: file format ELF32-arm-little

    Disassembly of section .text:
    start:
       11008: 00 f0 00 80   beq.w #0 <start+0x4>
       1100c: 00 f0 00 b8   b.w   #0 <start+0x8>
       11010: 00 f0 00 f8   bl    #0
       11014: 00 f0 00 f8   bl    #0
       11018: c0 f2 00 00   movt  r0, #0
       1101c: 40 f2 00 00   movw  r0, #0

My question: why is LD's relocation different from LLD's? Thanks for your explanation :)

    $ llvm/build/bin/llvm-objdump -triple=thumbv7a-none-linux-gnueabi -d arm-thumb-undefined-weak-ld

    arm-thumb-undefined-weak-ld: file format ELF32-arm-little

    Disassembly of section .text:
    .text:
       11006: 00 00         movs  r0, r0
    start:
       11008: 2e f4 fa af   beq.w #-69644
       1100c: 00 e0         b     #0 <start+0x8>
       1100e: 00 bf         nop
       11010: 00 e0         b     #0 <start+0xC>
       11012: 00 bf         nop
       11014: 00 e0         b     #0 <start+0x10>
       11016: 00 bf         nop
       11018: cf f6 fe 70   movt  r0, #65534
       1101c: 4e f6 e4 70   movw  r0, #61412
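The placement difference visible in the two map files for -Ttext=11006 can be sketched as follows. This is only an illustration of the behaviour described in the thread, not code from either linker; the Layout struct and the function names placeLikeLld and placeLikeGnuLd are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of where a linker put .text.
struct Layout {
  uint64_t OutSecAddr; // address of the .text OutputSection
  uint64_t Fill;       // padding bytes before the first InputSection
  uint64_t InSecAddr;  // address of the first InputSection
};

// Round V up to a multiple of A (A must be a power of two).
inline uint64_t alignUp(uint64_t V, uint64_t A) {
  return (V + A - 1) & ~(A - 1);
}

// Behaviour seen in the LLD map file: the OutputSection itself is moved
// to the aligned address, so no fill is needed.
inline Layout placeLikeLld(uint64_t Requested, uint64_t Align) {
  uint64_t Addr = alignUp(Requested, Align);
  return Layout{Addr, 0, Addr};
}

// Behaviour seen in the GNU ld map file: the OutputSection stays at the
// requested address and fill is inserted before the InputSection.
inline Layout placeLikeGnuLd(uint64_t Requested, uint64_t Align) {
  uint64_t InSec = alignUp(Requested, Align);
  return Layout{Requested, InSec - Requested, InSec};
}
```

For a requested address of 0x11006 and an alignment of 4, the first function yields an OutputSection at 0x11008 with no fill, and the second yields an OutputSection at 0x11006 with 2 bytes of fill. In both cases the InputSection ends up at 0x11008, which matches the map files above.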
On 2017-09-15 20:49, Peter Smith wrote:

Just a thought I had about the calculation of P. I think that following the ld approach too closely may be a mistake. I'm speculating that the reason for this change in the value of P is similar to the situation in Arm for a Thumb BLX immediate instruction (Branch with Link and Exchange, where the immediate is an offset from the PC). When calculating the target address, the immediate is added to Align(PC, 4), where Align rounds down to the nearest 4-byte boundary. The linker needs to account for this when resolving the relocation R_ARM_THM_CALL. To handle the alignment difference for this one special case in lld, I accounted for it in relocateOne. You may be able to use a similar method for Arc rather than writing modifyARCAddrLoc. Again, I know nothing about Arc, so you'll need to look at the architecture reference manual to understand how the instruction that the relocation applies to works.

Peter

On 15 September 2017 at 04:19, Leslie Zhai <lesliezhai at llvm.org.cn> wrote:

Hi Peter,

Thanks for your kind response!

On 2017-09-14 17:36, Peter Smith wrote:

Hello Leslie,

I think we are going to need to know a bit more about the ELF ABI for what looks like the ArcCompact before we can help you.

https://github.com/foss-for-synopsys-dwc-arc-processors/arc-ABI-manual

But I prefer to read the bfd linker's source code for ARC instead:

1. Specific eflags https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/include/elf/arc.h
2. Relocation define https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/include/elf/arc-reloc.def
3. Relocation replace function https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/include/opcode/arc-func.h
4. Calculation of S, A, P, PDATA, GOT, etc. https://github.com/foss-for-synopsys-dwc-arc-processors/binutils-gdb/blob/arc-2017.09/bfd/elf32-arc.c#L1156
LLD's calculation of P (the place to be relocated) is as it is in the generic ELF specification. The Rel.Offset corresponds to the ELF r_offset field. This is covered by: "For a relocatable file, the value is the byte offset from the beginning of the section to the storage unit affected by the relocation."

For LLD we are calculating the virtual address (VA) of P; as I understand it, this is equivalent to the vma used in BFD. Assuming that the relocation is relocating a regular InputSection from the basic-arc.o object, the LLD calculation of

    P = getOutputSection()->Addr + getOffset(Rel.Offset);

translates to:

    (VA of OutputSection) + (Offset of InputSection within OutputSection) + (Offset within InputSection given by r_offset)

The BFD linker seems to be doing the equivalent calculation with an extra modification of the (Offset within InputSection given by r_offset), and is rounding down the result to the nearest 4-byte boundary. This looks unfamiliar to me, and could well be specific to ArcCompact. I think that you will need to refer to the ELF ABI documentation, as this should tell you if there are any processor-specific modifications to generic ELF that you have to follow.

I implemented the MOD P for ARC:

    static void modifyARCAddrLoc(uint64_t &AddrLoc, const uint16_t EMachine,
                                 RelExpr Expr, uint32_t Type, uint64_t VMA,
                                 uint64_t OutSecOff, uint64_t RelOff) {
      // Only ARC targets with a PC-relative expression are adjusted. The
      // checks are joined with && so that the early return is not always taken.
      if ((EMachine != EM_ARC_COMPACT && EMachine != EM_ARC_COMPACT2) ||
          (Expr != R_PC && Expr != R_GOT_PC)) {
        return;
      }
      uint64_t M = 0;
      if (Type == R_ARC_32_PCREL || Type == R_ARC_PC32 ||
          Type == R_ARC_GOTPC32 || Type == R_ARC_GOTPC) {
        M = 4; // bitsize >= 32 ? 4 : 0
      }
      AddrLoc = (VMA + OutSecOff + RelOff - M) & ~0x3;
    }

    modifyARCAddrLoc(AddrLoc, Config->EMachine, Expr, Type,
                     getOutputSection()->Addr, // <-- VMA is important!
                     cast<InputSection>(this)->OutSecOff, Rel.Offset);

The other thing that you should do is try and work out why the VA (vma) is 6 in LD and 8 in LLD, and whether this is actually a problem. The VA of the OutputSection is not guaranteed to be the same between different linkers, so it may just be that differences in the order of InputSections or in alignment have caused a different VA. I would check the linker map file to see where it placed the Output and Input Sections, to see what the answer should be.

LLD's getOutputSection()->Addr = https://github.com/llvm-mirror/lld/blob/master/ELF/LinkerScript.cpp#L530
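As a rough illustration of the two ways of computing P discussed in this thread, here is a small sketch. The function names are hypothetical and the code is not taken from LLD or BFD; it only restates the formulas quoted in the messages, and the bitsize value passed for R_ARC_S25W_PCREL is an assumption (the thread only says it is below 32).

```cpp
#include <cassert>
#include <cstdint>

// Generic ELF place: VA of the OutputSection, plus the offset of the
// InputSection within it, plus r_offset.
inline uint64_t placeGeneric(uint64_t Vma, uint64_t OutputOffset,
                             uint64_t RelocOffset) {
  return Vma + OutputOffset + RelocOffset;
}

// arc-ld style place as quoted in the thread: subtract 4 for relocations
// whose bitsize is at least 32, then round down to a 4-byte boundary.
inline uint64_t placeArcLd(uint64_t Vma, uint64_t OutputOffset,
                           uint64_t RelocOffset, unsigned Bitsize) {
  uint64_t M = Bitsize >= 32 ? 4 : 0;
  return (Vma + OutputOffset + (RelocOffset - M)) & ~uint64_t(0x3);
}

// The formula reported for R_ARC_S25W_PCREL: ((S + A) - P) >> 2.
// Right-shifting a negative value is assumed to be an arithmetic shift,
// which holds on common compilers and is guaranteed from C++20.
inline int64_t s25wPcrel(uint64_t S, int64_t A, uint64_t P) {
  return (static_cast<int64_t>(S) + A - static_cast<int64_t>(P)) >> 2;
}
```

Using the numbers from the arc-ld debug output in the original question (vma 6, both offsets 0, S = 12, A = 0), placeArcLd gives P = 4 and the formula gives a relocation value of 2, whereas the generic calculation on the same inputs would give P = 6. For the -Ttext=0 case (S = 4, P = 0) the relocation value is 1, matching the log.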
In summary: it looks like there are some Arc-specific things that might need to be done. Unfortunately I don't have any experience with Arc, and I'm not sure the other people that work on LLD do either. I suggest looking at the public ABI documentation and making any arguments for changes based on that documentation. It is worth assuming that we know nothing about Arc, don't have the documentation to hand and don't know where to find it!

Hope that is of some help. With a bit more context I might be able to help a bit more; unfortunately I can't spend a lot of time learning about Arc.

Peter

On 14 September 2017 at 07:16, Leslie Zhai via llvm-dev <llvm-dev at lists.llvm.org> wrote:

Hi LLVM developers,

basic-arc.s:

    main:
      bl memset

    $ arc-elf32-gcc -mcpu=arc600 -o basic-arc.o -c
    $ arc-elf32-readelf -r basic-arc.o

    Relocation section '.rela.text' at offset 0xd4 contains 1 entries:
     Offset     Info    Type              Sym.Value  Sym. Name + Addend
    00000000  00000611 R_ARC_S25W_PCREL   00000000   memset + 0

High address: 0x0

    $ arc-elf32-ld -o basic-arc basic-arc.o -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1/arc600 -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1/../../../../arc-elf32/lib/arc600 -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1 -L/opt/arc-gnu/lib/gcc/arc-elf32/7.1.1/../../../../arc-elf32/lib --start-group -lgcc -lc -lnosys --end-group -Ttext=0
    DEBUG: arc-ld: R_ARC_S25W_PCREL relocation: 1 S: 4 A: 0 P: 0 = (vma: 0 + output_offset: 0 + reloc_offset: 0 - 0) & ~0x3
    DEBUG: arc-ld: type: R_ARC_S25W_PCREL insn: 2054

    $ ld.lld -o basic-arc-lld basic-arc.o $ARCLINKERLIB -Ttext=0
    DEBUG: lld: R_ARC_S25W_PCREL TargetVA: 4 A: 0 P: 0 <-- same P as arc-ld
    DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2050 Rel: 1
    DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2054 <-- same relocation value as arc-ld

But with several different high addresses that are *not* 0x0, such as 0x6:

    DEBUG: arc-ld: R_ARC_S25W_PCREL relocation: 2 S: 12 A: 0 P: 4 = (vma: 6 + output_offset: 0 + reloc_offset: 0 - 0) & ~0x3
    DEBUG: arc-ld: type: R_ARC_S25W_PCREL insn: 2058
    DEBUG: lld: R_ARC_S25W_PCREL TargetVA: 4 A: 0 P: 8 <-- different P?
    DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2050 Rel: 1
    DEBUG: lld: R_ARC_S25W_PCREL: Insn: 2054 <-- different relocation value

How does arc-ld calculate P?

    P = ((reloc_data.input_section->output_section
            ? reloc_data.input_section->output_section->vma : 0)
         + reloc_data.input_section->output_offset
         + (reloc_data.reloc_offset - (reloc_data.bitsize >= 32 ? 4 : 0))) & ~0x3;

For example, R_ARC_S25W_PCREL's bitsize is < 32, so P = (6 + 0 + 0 - 0) & ~0x3 = 4 when vma is 6 and the output and reloc offsets are 0.

How does LLD calculate P (AddrLoc)?

    P = getOutputSection()->Addr + getOffset(Rel.Offset);

For example, with the same high address 0x6, LLD's P is 8, which differs from arc-ld. So do I need to modify the value of P for the R_PC case in getRelocTargetVA? Please give me some hints, thanks a lot!

PS: arc-ld's FORMULA for R_ARC_S25W_PCREL is ( ( S + A ) - P ) >> 2, and it needs a middle-endian conversion, so:

    Insn = middleEndianConvert(Insn, TRUE);
    Insn = replaceDisp25w(Insn, ((S + A) - P) >> 2);
    Insn = middleEndianConvert(Insn, TRUE);
    write32le(Loc, Insn);

--
Regards,
Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev