[llvm-dev] Building LLVM's fuzzers (original) (raw)

Justin Bogner via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 11 11:52:38 PDT 2017


Kostya Serebryany <kcc at google.com> writes:

Justin,

Calling appendToUsed has horrible complexity and if we call it in every function clang consumes tons of memory (6Gb when compiling one of the clang's source files). This killed my machine today :)

The solution is to call appendToUsed once per module, instead of once per function.

Oh right, updating lists in metadata is O(n), so doing it per function is quadratic. This slipped my mind - sorry!

Also, since this does not seem to be required for linux, I've put this under if TargetTriple.isOSBinFormatMachO Submitted r312855, I'll see if this breaks Mac (there seem to be no llvm tests for this, only compiler-rt tests) but please also check if this looks ok.

This looks fine, though I'd rather if we just did it on all platforms for consistency / clear semantic intent. Running appendToUsed once should be cheap, and we do it unguarded in our other instrumentation (like InstrProfiling.cpp).

But this all still sounds bad on linux at least: * with the old bfd linker and -ffunction-sections -Wl,-gc-sections these arrays get removed (as discussed here)

This is sad, and I don't think we have any particularly good ideas to fix this.

* with newer linkers the sanitizer coverage essentially disables gc-sections

I'm not sure I follow. We're making sure the global arrays / instrumentation data doesn't get dead stripped here, but dead stripping of functions should work as normal.

--kcc

On Thu, Aug 24, 2017 at 6:43 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:

On Thu, Aug 24, 2017 at 6:30 PM, Justin Bogner <mail at justinbogner.com> wrote: Peter Collingbourne <peter at pcc.me.uk> writes: > On Thu, Aug 24, 2017 at 3:38 PM, Kostya Serebryany <kcc at google.com> wrote: > >> >> >> On Thu, Aug 24, 2017 at 3:35 PM, Peter Collingbourne <peter at pcc.me.uk> >> wrote: >> >>> On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <_ _>>> llvm-dev at lists.llvm.org> wrote: >>> >>>> >>>> >>>> On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <_ _mail at justinbogner.com> >>>> wrote: >>>> >>>>> I think the simplest fix is something like this: >>>>> >>>>> diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp >>>>> b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp >>>>> index c6f0d17f8fe..e81957ab80a 100644 >>>>> --- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp >>>>> +++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp >>>>> @@ -256,6 +256,7 @@ SanitizerCoverageModule::Creat eSecStartEnd(Module >>>>> &M, const char *Section, >>>>> new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkag >>>>> e, >>>>> nullptr, getSectionEnd(Section)); >>>>> SecEnd->setVisibility(GlobalValue::HiddenVisibility); >>>>> + appendToUsed(M, {SecStart, SecEnd}); >>>>> >>>>> return std::makepair(SecStart, SecEnd); >>>>> } >>>>> >>>>> I'm trying it out now. >>>>> >>>> >>>> LGTM (if this works), thanks! >>>> >>> >>> I wouldn't expect that to work because for ELF targets llvm.used has no >>> effect on the object file (only on the optimizer). >>> >>> Is there a simple way to reproduce the link failure? >>> >> >> >> ninja compiler-rt >> echo 'extern "C" int LLVMFuzzerTestOneInput(const unsigned char *a, >> unsigned long b){return 0; } ' > test.cc >> clang -O3 test.cc -fsanitize=fuzzer # works >> clang -O3 test.cc -Wl,-gc-sections -fsanitize=fuzzer # fails >> > > It seems that the issue is that older versions of ld.bfd have a bug which _> causes it not to define _start and stop symbols if the only reference > to those symbols is from a constructor. It looks like this is a different problem from the one on macOS (and I wasn't able to reproduce it with any bfd ld I had available, they were all too new) I've gone ahead and fixed the issue on macOS in r311742. > If I add an artificial reference to the start symbol from libfuzzer's main > function, the program links correctly. > > diff --git a/compiler-rt/lib/fuzzer/FuzzerMain.cpp > b/compiler-rt/lib/fuzzer/FuzzerMain.cpp > index af8657200be2..c41e28e012db 100644 > --- a/compiler-rt/lib/fuzzer/FuzzerMain.cpp > +++ b/compiler-rt/lib/fuzzer/FuzzerMain.cpp > @@ -16,6 +16,10 @@ extern "C" { > int LLVMFuzzerTestOneInput(const uint8t *Data, sizet Size); > } // extern "C" > > +attribute((weak)) void nop(void *p) {} _> +extern void *start_sancovpcs; > + > int main(int argc, char **argv) { _> + nop(start_sancovpcs); > return fuzzer::FuzzerDriver(&argc, &argv, LLVMFuzzerTestOneInput); > } If we were to do this, we'd have to guard it appropriately - not all _platforms name the start symbols like this. Of course. There's also the issue of how to keep the symbols alive in DSOs. > The problem also goes away if I use "GNU ld (GNU Binutils) > 2.28.51.20170105". 2.27 also doesn't have the issue. I don't know what our minimum version of binutils is, and I'm under the impression most people use gold or lld to link LLVM these days, so it isn't clear to me how big of a problem this is. For the record, the problem reproduces under 2.24, which is shipped by Ubuntu 14.04 LTS, which isn't that old. My view is that if we can find an unintrusive enough workaround, we should deploy it (with a comment to remove it after N years). Peter > Peter > > > >> >> >> >> >> >>> >>> Peter >>> >>> >>>> >>>>> >>>>> Kostya Serebryany <kcc at google.com> writes: >>>>> > With -Wl,-gc-sections I get this: >>>>> > SimpleTest.cpp:(.text.sancov.modulector[sancov.modulector] +0x1b): _>>>>> > undefined reference to _start_sancovpcs'_ _>>>>> > SimpleTest.cpp:(.text.sancov.modulector[sancov.modulector]_ _+0x20):_ _>>>>> > undefined reference to stop_sancovpcs' >>>>> > >>>>> > >>>>> > >>>>> > On Thu, Aug 24, 2017 at 3:07 PM, George Karpenkov <_ _>>>>> ekarpenkov at apple.com> >>>>> > wrote: >>>>> > >>>>> >> >>>>> >> On Aug 24, 2017, at 2:55 PM, Kostya Serebryany <kcc at google.com> >>>>> wrote: >>>>> >> >>>>> >> Interesting. >>>>> >> This is a relatively new addition (fsanitize-coverage=pc-tables, >>>>> which is >>>>> >> now a part of -fsanitize=fuzzer). >>>>> >> The tests worked (did they? On Mac?) so I thought everything is ok. >>>>> >> >>>>> >> >>>>> >> For tests we never compile the tested target with -O3 (and that >>>>> wouldn’t >>>>> >> be sufficient), >>>>> >> and for testing fuzzers I was always building them in debug >>>>> >> >>>>> >> Yea, we need to make sure the pc-tables are not stripped (this is a >>>>> >> separate section with globals). >>>>> >> (I still haven't documented pc-tables, will do soon) >>>>> >> >>>>> >> >>>>> >> Do you know what's the analog of Wl,-deadstrip on Linux? >>>>> >> >>>>> >> >>>>> >> Apparently -Wl,—gc-sections. >>>>> >> For some reason LLVM does not do it for gold, even though it seems to >>>>> >> support this flag as well. >>>>> >> (that could be another reason why you don’t see the failure on Linux) >>>>> >> >>>>> >> 1 if(NOT LLVMNODEADSTRIP) >>>>> >> 2 if(${CMAKESYSTEMNAME} MATCHES "Darwin") >>>>> >> 3 # ld64's implementation of -deadstrip breaks tools that use >>>>> >> plugins. >>>>> >> 4 setproperty(TARGET ${targetname} APPENDSTRING PROPERTY >>>>> >> 5 LINKFLAGS " -Wl,-deadstrip") >>>>> >> 6 elseif(${CMAKESYSTEMNAME} MATCHES "SunOS") >>>>> >> 7 setproperty(TARGET ${targetname} APPENDSTRING PROPERTY >>>>> >> 8 LINKFLAGS " -Wl,-z -Wl,discard-unused=sections") >>>>> >> 9 elseif(NOT WIN32 AND NOT LLVMLINKERISGOLD) >>>>> >> 10 # Object files are compiled with -ffunction-data-sections. >>>>> >> 11 # Versions of bfd ld < 2.23.1 have a bug in --gc-sections_ _that_ _>>>>> >> breaks >>>>> >> 12 # tools that use plugins. Always pass --gc-sections once we >>>>> require >>>>> >> 13 # a newer linker. >>>>> >> 14 setproperty(TARGET ${targetname} APPENDSTRING PROPERTY >>>>> >> 15 LINKFLAGS " -Wl,--gc-sections") >>>>> >> 16 endif() >>>>> >> 17 endif() >>>>> >> >>>>> >> >>>>> >> >>>>> >> --kcc >>>>> >> >>>>> >> >>>>> >> >>>>> >> On Thu, Aug 24, 2017 at 2:49 PM, Justin Bogner <_ _>>>>> mail at justinbogner.com> >>>>> >> wrote: >>>>> >> >>>>> >>> George Karpenkov <ekarpenkov at apple.com> writes: >>>>> >>> > OK so with Kuba’s help I’ve found the error: with optimization, >>>>> dead >>>>> >>> > stripping of produced libraries is enabled, >>>>> >>> > which removes coverage instrumentation. >>>>> >>> > >>>>> >>> > However, this has nothing to do with the move to compiler-rt, so >>>>> I’m >>>>> >>> > quite skeptical on whether it has worked >>>>> >>> > beforehand. >>>>> >>> > >>>>> >>> > A trivial fix is to do: >>>>> >>> > >>>>> >>> > diff --git a/cmake/modules/HandleLLVMOptions.cmake >>>>> >>> b/cmake/modules/HandleLLVMOptions.cmake >>>>> >>> > index 04596a6ff63..5465d8d95ba 100644 >>>>> >>> > --- a/cmake/modules/HandleLLVMOptions.cmake >>>>> >>> > +++ b/cmake/modules/HandleLLVMOptions.cmake >>>>> >>> > @@ -665,6 +665,9 @@ if(LLVMUSESANITIZER) >>>>> >>> > endif() >>>>> >>> > if (LLVMUSESANITIZECOVERAGE) >>>>> >>> > append("-fsanitize=fuzzer-no-link" CMAKECFLAGS >>>>> CMAKECXXFLAGS) >>>>> >>> > + >>>>> >>> > + # Dead stripping messes up coverage instrumentation. >>>>> >>> > + set(LLVMNODEADSTRIP ON) >>>>> >>> > endif() >>>>> >>> > endif() >>>>> >>> > >>>>> >>> > Any arguments against that? >>>>> >>> >>>>> >>> We shouldn't do this. We really only want to prevent dead stripping >>>>> of >>>>> >>> the counters themselves - disabling it completely isn't very nice. >>>>> >>> >>>>> >>> > Apparently, a better way is to follow ASAN instrumentation pass, >>>>> >>> > which uses some magic to protect against dead-stripping. >>>>> >>> >>>>> >>> I thought this was already being done - how else did it work before? >>>>> >>> >>>>> >>> >> On Aug 24, 2017, at 11:29 AM, Justin Bogner <_ _>>>>> mail at justinbogner.com> >>>>> >>> wrote: >>>>> >>> >> >>>>> >>> >> (kcc, george: sorry for the re-send, the first was from a >>>>> non-list >>>>> >>> email >>>>> >>> >> address) >>>>> >>> >> >>>>> >>> >> My configuration for building the fuzzers in the LLVM tree >>>>> doesn't >>>>> >>> seem to >>>>> >>> >> work any more (possibly as of moving libFuzzer to compiler-rt, >>>>> but >>>>> >>> there >>>>> >>> >> have been a few other changes in the last week or so that may be >>>>> >>> related). >>>>> >>> >> >>>>> >>> >> I'm building with a fresh top-of-tree clang and setting >>>>> >>> >> -DLLVMUSESANITIZER=Address and -DLLVMUSESANITIZECOVERAGE=O >>>>> n, >>>>> >>> which >>>>> >>> >> was working before: >>>>> >>> >> _>>>>> >>> >> % cmake -GNinja _ >>>>> >>> >> -DCMAKEBUILDTYPE=Release _-DLLVMENABLEASSERTIONS=On _ _>>>>> >>> >> -DLLVMENABLEWERROR=On _ >>>>> >>> >> -DLLVMUSESANITIZER=Address >>>>> -DLLVMUSESANITIZECOVERAGE=On _>>>>> >>> _ _>>>>> >>> >> -DCMAKECCOMPILER=$HOME/llvm-lkgc/bin/clang _ >>>>> >>> >> $HOME/code/llvm-src >>>>> >>> >> >>>>> >>> >> But when I run any of the fuzzers, it looks like the sanitizer >>>>> coverage >>>>> >>> >> hasn't been set up correctly: >>>>> >>> >> >>>>> >>> >> % ./bin/llvm-as-fuzzer >>>>> >>> 2017-08-24 11:14:33 >>>>> >>> >> INFO: Seed: 4089166883 <(408)%20916-6883> >>>>> >>> >> INFO: Loaded 1 modules (50607 guards): 50607 [0x10e14ef80,_ _>>>>> >>> 0x10e18063c),_ _>>>>> >>> >> INFO: Loaded 1 PC tables (0 PCs): 0_ _[0x10e2870a8,0x10e2870a8),_ _>>>>> >>> >> ERROR: The size of coverage PC tables does not match the_ _number_ _>>>>> of_ _>>>>> >>> instrumented PCs. This might be a bug in the compiler, please_ _>>>>> contact the_ _>>>>> >>> libFuzzer developers._ _>>>>> >>> >>_ _>>>>> >>> >> From the build logs, it looks like we're now building objects_ _>>>>> with_ _>>>>> >>> these_ _>>>>> >>> >> sanitizer flags:_ _>>>>> >>> >>_ _>>>>> >>> >> -fsanitize=address_ _>>>>> >>> >> -fsanitize-address-use-after-scope_ _>>>>> >>> >> -fsanitize=fuzzer-no-link_ _>>>>> >>> >>_ _>>>>> >>> >> We're then linking the fuzzer binaries with these:_ _>>>>> >>> >>_ _>>>>> >>> >> -fsanitize=address_ _>>>>> >>> >> -fsanitize-address-use-after-scope_ _>>>>> >>> >> -fsanitize=fuzzer-no-link_ _>>>>> >>> >> -fsanitize=fuzzer_ _>>>>> >>> >>_ _>>>>> >>> >> Any idea what's wrong or where to start looking?_ _>>>>> >>>_ _>>>>> >>_ _>>>>> >>_ _>>>>> >>_ _>>>>>_ _>>>>_ _>>>>_ _>>>> ________________________ _>>>> LLVM Developers mailing list_ _>>>> llvm-dev at lists.llvm.org >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>>> >>> >>> >>> -- >>> -- >>> Peter >>> >> >> > > > --

-- -- Peter



More information about the llvm-dev mailing list