[llvm-dev] question about xray tls data initialization
comic fans via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 21 07:32:23 PST 2017
With some dirty hacks, I've made the xray runtime 'build' on Windows, but unfortunately I don't know enough about the linker and the runtime, and the final executable doesn't run. I'd like to share my changes here, hoping somebody can help me make it run on Windows.

In AsmPrinter, copy/paste the xray section setup for the COFF target:

    InstMap = OutContext.getCOFFSection("xray_instr_map", 0, SectionKind::getReadOnlyWithRel());
    FnSledIndex = OutContext.getCOFFSection("xray_fn_idx", 0, SectionKind::getReadOnlyWithRel());

In XRayArgs, allow the Windows platform to use the xray args. With this, the generated code seems to have the sleds and the xray sections.

In the xray runtime, bool atomic_compare_exchange_strong(volatile atomic_sint32_t *a, s32 *cmp, s32 xchg, memory_order mo) is missing for MSVC; I adapted the atomic_uint32_t implementation.
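For readers following along, the missing signed overload could look roughly like the sketch below. It is only an illustration, not the sanitizer_common code: the `sketch` namespace and the stand-in types `s32`, `atomic_sint32_t` and `memory_order` are assumptions made so the snippet compiles on its own. The MSVC branch uses the `_InterlockedCompareExchange` intrinsic, the other branch uses the GCC/Clang builtin `__sync_val_compare_and_swap`, and both ignore the requested memory order and act as a full barrier.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

namespace sketch {
// Hypothetical stand-ins for the types named in the message.
typedef std::int32_t s32;
struct atomic_sint32_t {
  volatile s32 val_dont_use;
};
enum memory_order { memory_order_seq_cst };

// Strong compare-and-swap on a signed 32-bit value.
// Returns true if *a held *cmp and was replaced by xchg; otherwise stores
// the value actually observed into *cmp and returns false.
inline bool atomic_compare_exchange_strong(volatile atomic_sint32_t *a,
                                           s32 *cmp, s32 xchg,
                                           memory_order /*mo, ignored here*/) {
  const s32 expected = *cmp;
#if defined(_MSC_VER)
  // long is 32 bits wide on Windows, so the cast preserves the layout.
  const s32 prev = static_cast<s32>(_InterlockedCompareExchange(
      reinterpret_cast<volatile long *>(&a->val_dont_use),
      static_cast<long>(xchg), static_cast<long>(expected)));
#else
  const s32 prev =
      __sync_val_compare_and_swap(&a->val_dont_use, expected, xchg);
#endif
  if (prev == expected)
    return true;
  *cmp = prev;
  return false;
}

// Small usage example: one successful and one failing exchange.
inline void example() {
  atomic_sint32_t a = {-1};
  s32 expected = -1;
  assert(atomic_compare_exchange_strong(&a, &expected, 3, memory_order_seq_cst));
  expected = 0;
  assert(!atomic_compare_exchange_strong(&a, &expected, 9, memory_order_seq_cst));
  assert(expected == 3);
}
} // namespace sketch
```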
MSVC 14.1 treats BufferQueue::Buffer::Buffer as the constructor instead of a data member, so I renamed the member: Buf.Buffer => Buf.Data

FunctionRecord packing: __attribute__((packed)) => #pragma pack(push,1). MSVC also requires bitfields to have the same type in order to pack them together (all types => uint32_t).
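A minimal sketch of that packing change, assuming a record shaped like the one described (a 1-bit type, a 3-bit kind, a 28-bit function id and a 32-bit TSC delta). The struct name `PackedRecord` and the exact field names are hypothetical, not copied from the xray headers. Giving every bitfield the same underlying type lets MSVC place them in a single 32-bit unit, and `#pragma pack` is understood by MSVC, GCC and Clang.

```cpp
#include <cassert>
#include <cstdint>

#pragma pack(push, 1)
// Hypothetical record layout: every bitfield uses uint32_t so that MSVC
// packs them into the same 32-bit storage unit.
struct PackedRecord {
  std::uint32_t RecordType : 1;
  std::uint32_t RecordKind : 3;
  std::uint32_t FuncId : 28;
  std::uint32_t TSCDelta;
};
#pragma pack(pop)

// One 32-bit unit for the bitfields plus the 32-bit delta.
static_assert(sizeof(PackedRecord) == 8, "record should occupy 8 bytes");

// Small usage example: values survive a round trip through the bitfields.
inline void packedRecordExample() {
  PackedRecord R{};
  R.RecordKind = 5;
  R.FuncId = 0x0ABCDEF;
  assert(R.RecordKind == 5);
  assert(R.FuncId == 0x0ABCDEF);
}
```

With mixed bitfield types (for example uint8_t next to int), MSVC starts a new storage unit at each type change, which is why the record would grow there even under pack(1).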
FD: int => HANDLE. Most of the code logic is still valid (-1 as the invalid value); the read/write APIs are replaced with the Windows ones.

mprotect => VirtualProtect

readTSC in xray_x86_64.inc also works on Windows.

Replace reading the tsc from proc with QueryPerformanceFrequency.

MSVC cannot compile code such as

    void setupNewBuffer(int (*wall_clock_reader)(clockid_t, struct timespec *));

A typedef must be used first. xray uses clock_gettime as the default implementation, which is not friendly to Windows, so I created a fake one based on chrono system_clock (ignoring clockid_t).
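The typedef workaround and the chrono-based stand-in might be sketched as below. All names here (`sketch::clock_id`, `sketch::wall_time`, `wall_clock_reader_t`, `fake_clock_gettime`) are hypothetical and chosen so the snippet does not depend on POSIX headers; the real patch would keep the clockid_t and struct timespec types. It also assumes that system_clock counts from the Unix epoch, which is true in practice but only guaranteed from C++20 on.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

namespace sketch {
// Hypothetical stand-ins for clockid_t and struct timespec.
typedef int clock_id;
struct wall_time {
  std::int64_t tv_sec;
  long tv_nsec;
};

// Name the function-pointer type first, then use it as a parameter type.
typedef int (*wall_clock_reader_t)(clock_id, wall_time *);

// clock_gettime look-alike built on std::chrono::system_clock.
// The clock id is ignored; returns 0 on success and -1 on a null pointer.
inline int fake_clock_gettime(clock_id /*ignored*/, wall_time *ts) {
  if (ts == nullptr)
    return -1;
  using namespace std::chrono;
  const auto since_epoch = system_clock::now().time_since_epoch();
  const auto secs = duration_cast<seconds>(since_epoch);
  const auto nanos = duration_cast<nanoseconds>(since_epoch - secs);
  ts->tv_sec = static_cast<std::int64_t>(secs.count());
  ts->tv_nsec = static_cast<long>(nanos.count());
  return 0;
}

// Takes the reader through the typedef instead of an inline declarator.
inline int setupNewBuffer(wall_clock_reader_t wall_clock_reader,
                          wall_time *out) {
  return wall_clock_reader(0, out);
}

// Small usage example.
inline void example() {
  wall_time now = {0, 0};
  assert(setupNewBuffer(&fake_clock_gettime, &now) == 0);
}
} // namespace sketch
```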
For the TLS destructor part, I've just commented it out (but https://www.codeproject.com/Articles/8113/Thread-Local-Storage-The-C-Way gives a thread-exit callback approach for COFF).

The last thing, which I don't understand, is the weak symbols __start_xray_instr_map[], __stop_xray_instr_map[], __start_xray_fn_idx[] and __stop_xray_fn_idx[]. I replaced them with __declspec(selectany), but I'm not sure it has the same meaning.
some random generated code:

        .text
        .intel_syntax noprefix
        .def     call;
        .scl    2;
        .type   32;
        .endef
        .globl  call                    # -- Begin function call
        .p2align        4, 0x90
    call:                                   # @call
    .seh_proc call
    # BB#0:                                 # %entry
        .p2align        1, 0x90
    .Lxray_sled_0:
        .ascii  "\353\t"
        nop     word ptr [rax + rax + 512]
        sub     rsp, 16
        .seh_stackalloc 16
        .seh_endprologue
        mov     dword ptr [rsp + 12], ecx
        mov     dword ptr [rsp + 8], 0
        mov     dword ptr [rsp + 4], 0
    .LBB0_1:                                # %for.cond
                                            # =>This Inner Loop Header: Depth=1
        mov     eax, dword ptr [rsp + 4]
        cmp     eax, dword ptr [rsp + 12]
        jge     .LBB0_4
    # BB#2:                                 # %for.body
                                            # in Loop: Header=BB0_1 Depth=1
        mov     eax, dword ptr [rsp + 4]
        add     eax, dword ptr [rsp + 8]
        mov     dword ptr [rsp + 8], eax
    # BB#3:                                 # %for.inc
                                            # in Loop: Header=BB0_1 Depth=1
        mov     eax, dword ptr [rsp + 4]
        add     eax, 1
        mov     dword ptr [rsp + 4], eax
        jmp     .LBB0_1
    .LBB0_4:                                # %for.end
        mov     eax, dword ptr [rsp + 8]
        add     rsp, 16
        .p2align        1, 0x90
    .Lxray_sled_1:
        ret
        nop     word ptr cs:[rax + rax + 512]
        .seh_handlerdata
        .text
        .seh_endproc
                                            # -- End function
        .section        xray_instr_map,"y"
    .Lxray_sleds_start0:
        .quad   .Lxray_sled_0
        .quad   call
        .byte   0x00
        .byte   0x00
        .byte   0x00
        .zero   13
        .quad   .Lxray_sled_1
        .quad   call
        .byte   0x01
        .byte   0x00
        .byte   0x00
        .zero   13
    .Lxray_sleds_end0:
        .section        xray_fn_idx,"y"
        .p2align        4, 0x90
        .quad   .Lxray_sleds_start0
        .quad   .Lxray_sleds_end0
        .text
and parts of obj dump:
    SECTION HEADER #5
         /16 name (xray_instr_map)
           0 physical address
           0 virtual address
          40 size of raw data
         198 file pointer to raw data (00000198 to 000001D7)
         1D8 file pointer to relocation table
           0 file pointer to line numbers
           4 number of relocations
           0 number of line numbers
      100000 flags
             1 byte align

    RAW DATA #5
      00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      00000020: 56 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  V...............
      00000030: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

    RELOCATIONS #5
                                              Symbol  Symbol
      Offset    Type    Applied To            Index   Name
      00000000  ADDR64  00000000 00000000     0       .text
      00000008  ADDR64  00000000 00000000     E       call
      00000020  ADDR64  00000000 00000056     0       .text
      00000028  ADDR64  00000000 00000000     E       call

    SECTION HEADER #6
          /4 name (xray_fn_idx)
           0 physical address
           0 virtual address
          10 size of raw data
         200 file pointer to raw data (00000200 to 0000020F)
         210 file pointer to relocation table
           0 file pointer to line numbers
           2 number of relocations
           0 number of line numbers
      500000 flags
             16 byte align

    RAW DATA #6
      00000000: 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00  ........@.......

    RELOCATIONS #6
                                              Symbol  Symbol
      Offset    Type    Applied To            Index   Name
      00000000  ADDR64  00000000 00000000     8       xray_instr_map
      00000008  ADDR64  00000000 00000040     8       xray_instr_map
On Tue, Nov 21, 2017 at 7:46 PM, Dean Michael Berris <dean.berris at gmail.com> wrote:
>> On 17 Nov 2017, at 00:44, comic fans via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> I'm learning the xray library and try if it can be built on windows, in
>> xray_fdr_logging_impl.h line 152 , comment written as
>>
>>     // Using pthread_once(...) to initialize the thread-local data structures
>>
>> but at line 175, 183, code written as
>>
>>     thread_local pthread_key_t key;
>>     // Ensure that we only actually ever do the pthread initialization once.
>>     thread_local bool UNUSED Unused = [] {
>>       new (&TLSBuffer) ThreadLocalData();
>>       auto result = pthread_key_create(&key, +[](void *) {
>>         auto &TLD = *reinterpret_cast<ThreadLocalData *>(&TLSBuffer);
>>
>> I'm confused that pthread_key_t and Unused are both thread_local variable,
>> doesn't it mean the following lambda will run for each thread , and create
>> one pthread_key_t for only one tls data (instead of only one pthread_key_t
>> for all thread) ? also what does the '+' before lambda expression mean ?
>> this may be stupid questions, could somebody kindly helped ?
>
> Yeah, that comment is out-of-date (and the implementation is buggy) -- which
> is a shame really. :/
>
> But, the good news, is I think we've fixed this now in the top-of-trunk with
> https://reviews.llvm.org/D39526 and https://reviews.llvm.org/D40164.
>
> Curiously though, how far did your exploration into getting XRay to build on
> Windows go?
>
> Cheers
>
> -- Dean
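For anyone reading the archive with the same two questions from the quoted message, here is a small self-contained sketch of both points. It is not the xray code; the names `InitCount`, `touchThreadLocal` and `countInitsAcrossThreads` are made up for the illustration. A function-local thread_local variable with a lambda initializer is initialized once in every thread that reaches it, so a key created inside such an initializer would indeed be created per thread. The unary '+' applied to a captureless lambda forces its conversion to a plain function pointer, which is the type a C callback parameter such as the destructor argument of pthread_key_create expects.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <type_traits>
#include <vector>

namespace sketch {
// Counts how many times the thread_local initializer below has run.
static std::atomic<int> InitCount{0};

// The initializer lambda runs the first time each thread reaches this line.
inline void touchThreadLocal() {
  thread_local bool Unused = [] {
    InitCount.fetch_add(1);
    return true;
  }();
  (void)Unused;
}

// Starts N fresh threads, each touching the variable twice, and returns
// how many times the initializer ran in those threads.
inline int countInitsAcrossThreads(int N) {
  InitCount.store(0);
  std::vector<std::thread> Threads;
  for (int I = 0; I < N; ++I)
    Threads.emplace_back([] {
      touchThreadLocal();
      touchThreadLocal(); // second call in the same thread: no re-init
    });
  for (auto &T : Threads)
    T.join();
  return InitCount.load();
}

// Unary plus turns a captureless lambda into an ordinary function pointer.
inline void unaryPlusExample() {
  auto Callback = +[](void *) {};
  static_assert(std::is_same<decltype(Callback), void (*)(void *)>::value,
                "unary plus yields a function pointer");
  assert(Callback != nullptr);
}
} // namespace sketch
```

Building this with GCC or Clang on Linux may need -pthread.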