[LLVMdev] asan coverage
Bob Wilson bob.wilson at apple.com
Fri Mar 28 14:59:03 PDT 2014
On Mar 28, 2014, at 1:33 AM, Kostya Serebryany <kcc at google.com> wrote:
Some more data on code size.
I've built CPU2006/483.xalancbmk with
a) -O2 -fsanitize=address -m64 -gline-tables-only -mllvm -asan-coverage=1
b) -O2 -fsanitize=address -m64 -gline-tables-only -fprofile-instr-generate

The first is 27Mb and the second is 48Mb. The extra size comes from __llvm_prf_* sections. You may be able to make these sections more compact, but you will not make them tiny.

The instrumentation code generated by -asan-coverage=1 is less efficient than -fprofile-instr-generate in several ways (slower, fatter, provides less data). But it does not add any extra sections to the binary and wins in the overall binary size. Ideally, I'd like to see such options for -fprofile-instr-generate as well.

--kcc
It might make sense to move at least some of the counters into the .bss section so they don’t take up space in the executable.
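To illustrate the .bss idea, here is a minimal sketch, not the actual instrumentation. It assumes a typical ELF toolchain, where zero-initialized static storage is placed in .bss and takes no space in the file. The names kNumCounters, g_counters, bumpCounter and readCounter are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical counter count, chosen only for illustration.
constexpr std::size_t kNumCounters = 1 << 16;

// Zero-initialized static array. On typical ELF toolchains this is placed in
// .bss, which records only a size in the file; the zeroed memory is provided
// at load time. Giving it a non-zero initializer would usually move the
// whole array into .data and add its full size to the executable.
static std::uint64_t g_counters[kNumCounters];

// Increment one counter, as an instrumented region would.
inline void bumpCounter(std::size_t index) {
  assert(index < kNumCounters);
  ++g_counters[index];
}

// Read a counter back, as a profile writer would at exit.
inline std::uint64_t readCounter(std::size_t index) {
  assert(index < kNumCounters);
  return g_counters[index];
}
```

Comparing the output of `size` or `readelf -S` for a build with this array against a build where the array has a non-zero initializer should show the difference, though the exact placement depends on the compiler and target.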
We’re also seeing that the instrumentation bloats the code size much more than expected and we’re still investigating to see why that is the case.
On Sat, Feb 22, 2014 at 9:13 AM, Kostya Serebryany <kcc at google.com> wrote:

Our users combine asan and coverage testing and they do it on thousands of machines. (An older blog post about using asan: http://blog.chromium.org/2012/04/fuzzing-for-security.html)

The binaries need to be shipped to virtual machines, where they will be run. The VMs are very short of disk and the network bandwidth has a cost too. We may be able to ship stripped binaries to those machines but this will complicate the logic immensely.

Besides, zipped binaries are stored for several revisions every day and the storage also costs money. Just to give you a taste (https://commondatastorage.googleapis.com/chromium-browser-asan/index.html):

asan-symbolized-linux-release-252010.zip  2014-02-19 14:34:24  406.35MB
asan-symbolized-linux-release-252017.zip  2014-02-19 18:22:54  406.41MB
asan-symbolized-linux-release-252025.zip  2014-02-19 21:35:49  406.35MB
asan-symbolized-linux-release-252031.zip  2014-02-20 00:44:25  406.35MB
asan-symbolized-linux-release-252160.zip  2014-02-20 06:30:16  406.34MB
asan-symbolized-linux-release-252185.zip  2014-02-20 09:21:47  408.52MB
asan-symbolized-linux-release-252188.zip  2014-02-20 12:20:05  408.52MB
asan-symbolized-linux-release-252194.zip  2014-02-20 15:01:05  408.52MB
asan-symbolized-linux-release-252218.zip  2014-02-20 18:00:42  408.54MB
asan-symbolized-linux-release-252265.zip  2014-02-20 21:00:03  408.65MB
asan-symbolized-linux-release-252272.zip  2014-02-21 00:00:40  408.66MB

--kcc

On Sat, Feb 22, 2014 at 8:58 AM, Bob Wilson <bob.wilson at apple.com> wrote:

Why is the binary size a concern for coverage testing?

On Feb 21, 2014, at 8:43 PM, Kostya Serebryany <kcc at google.com> wrote:

I understand why you don't want to rely on debug info and instead produce your own section. We did this with our early version of llvm-based tsan and it was simpler to implement.
But here is a data point to support my suggestion: chromium binary built with asan, coverage and -gline-tables-only is 1.6Gb. The same binary is 1.1Gb when stripped, so, the line tables require 500Mb. Separate line info for coverage will essentially double this amount. The size of binary is a serious concern for our users, please take it into consideration. Thanks! --kcc
On Fri, Feb 21, 2014 at 8:28 PM, Bob Wilson <bob.wilson at apple.com> wrote:

We’re not going to use debug info at all. We’re emitting the counters in the clang front-end. We just need to emit separate info to show how to map those counters to source locations. Mapping to PCs and then using debug info to get from the PCs to the source locations just makes things harder and loses information in the process.

On Feb 21, 2014, at 2:57 AM, Kostya Serebryany <kcc at google.com> wrote:
We may need some additional info.

What kind of additional info?

I haven't put a ton of thought into this, but I'm hoping we can either (a) use debug info as is or add some extra (valid) debug info to support this, or (b) add an extra debug-info-like section to instrumented binaries with the information we need.

I'd try this data format (binary equivalent):

/path/to/binary/or/dso1 numcounters1
pc1 counter1
pc2 counter2
pc3 counter3
...
/path/to/binary/or/dso2 numcounters2
pc1 counter1
pc2 counter2
pc3 counter3
...

I don't see a straightforward way to produce such data today because individual Instructions do not work as labels. But I think this can be supported in LLVM codegen. Here is a raw patch with comments, just to get the idea.

Index: lib/CodeGen/CodeGenPGO.cpp
===================================================================
--- lib/CodeGen/CodeGenPGO.cpp (revision 201843)
+++ lib/CodeGen/CodeGenPGO.cpp (working copy)
@@ -199,7 +199,8 @@
   llvm::Type *Args[] = {
     Int8PtrTy,                          // const char *MangledName
     Int32Ty,                            // uint32_t NumCounters
-    Int64PtrTy                          // uint64_t *Counters
+    Int64PtrTy,                         // uint64_t *Counters
+    Int64PtrTy                          // uint64_t *PCs
   };
   llvm::FunctionType *FTy =
     llvm::FunctionType::get(PGOBuilder.getVoidTy(), Args, false);
@@ -209,9 +210,10 @@
   llvm::Constant *MangledName =
     CGM.GetAddrOfConstantCString(CGM.getMangledName(GD), "__llvm_pgo_name");
   MangledName = llvm::ConstantExpr::getBitCast(MangledName, Int8PtrTy);
-  PGOBuilder.CreateCall3(EmitFunc, MangledName,
+  PGOBuilder.CreateCall4(EmitFunc, MangledName,
                          PGOBuilder.getInt32(NumRegionCounters),
-                         PGOBuilder.CreateBitCast(RegionCounters, Int64PtrTy));
+                         PGOBuilder.CreateBitCast(RegionCounters, Int64PtrTy),
+                         PGOBuilder.CreateBitCast(RegionPCs, Int64PtrTy));
 }
 
 llvm::Function *CodeGenPGO::emitInitialization(CodeGenModule &CGM) {
@@ -769,6 +771,13 @@
                              llvm::GlobalVariable::PrivateLinkage,
                              llvm::Constant::getNullValue(CounterTy),
                              "__llvm_pgo_ctr");
+
+  RegionPCs =
+    new llvm::GlobalVariable(CGM.getModule(), CounterTy, false,
+                             llvm::GlobalVariable::PrivateLinkage,
+                             llvm::Constant::getNullValue(CounterTy),
+                             "__llvm_pgo_pcs");
+
 }
 
 void CodeGenPGO::emitCounterIncrement(CGBuilderTy &Builder, unsigned Counter) {
@@ -779,6 +788,21 @@
   llvm::Value *Count = Builder.CreateLoad(Addr, "pgocount");
   Count = Builder.CreateAdd(Count, Builder.getInt64(1));
   Builder.CreateStore(Count, Addr);
+  // We should put the PC of the instruction that increments __llvm_pgo_ctr
+  // into __llvm_pgo_pcs, which will be passed to llvm_pgo_emit.
+  // This patch is wrong in many ways:
+  //  * We pass the PC of the Function instead of the PC of the Instruction,
+  //    because the latter doesn't work like this. We'll need to support
+  //    Instructions as labels in LLVM codegen.
+  //  * We actually store the PC on each increment, while we should initialize
+  //    this array at link time (need to refactor this code a bit).
+  //
+  Builder.CreateStore(
+      Builder.CreatePointerCast(
+          cast<llvm::Instruction>(Count)->getParent()->getParent(),
+          Builder.getInt64Ty() // FIXME: use a better type
+          ),
+      Builder.CreateConstInBoundsGEP2_64(RegionPCs, 0, Counter));
 }
Index: lib/CodeGen/CodeGenPGO.h
===================================================================
--- lib/CodeGen/CodeGenPGO.h (revision 201843)
+++ lib/CodeGen/CodeGenPGO.h (working copy)
@@ -59,6 +59,7 @@
   unsigned NumRegionCounters;
   llvm::GlobalVariable *RegionCounters;
+  llvm::GlobalVariable *RegionPCs;
   llvm::DenseMap<const Stmt*, unsigned> *RegionCounterMap;
   llvm::DenseMap<const Stmt*, uint64_t> *StmtCountMap;
   std::vector<uint64_t> *RegionCounts;
@@ -66,8 +67,9 @@
 public:
   CodeGenPGO(CodeGenModule &CGM)
-    : CGM(CGM), NumRegionCounters(0), RegionCounters(0), RegionCounterMap(0),
-      StmtCountMap(0), RegionCounts(0), CurrentRegionCount(0) {}
+    : CGM(CGM), NumRegionCounters(0), RegionCounters(0), RegionPCs(0),
+      RegionCounterMap(0), StmtCountMap(0), RegionCounts(0),
+      CurrentRegionCount(0) {}
   ~CodeGenPGO() {}
 
   /// Whether or not we have PGO region data for the current function. This is
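For reference, the per-module pc/counter layout proposed in the quoted message could be written and read back roughly as follows. This is a sketch of a plain-text rendering only (the message asks for a binary equivalent), it is not part of the patch, and it assumes paths contain no whitespace. The names PcCounter, ModuleCounters, writeModules and readModules are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One (pc, counter) pair for an instrumented location.
struct PcCounter {
  std::uint64_t pc;
  std::uint64_t counter;
};

// All pairs recorded for one binary or DSO.
struct ModuleCounters {
  std::string path;
  std::vector<PcCounter> entries;
};

// Emit "<path> <numcounters>" followed by one "<pc> <counter>" line per entry.
// PCs are written in hex with a 0x prefix, counters in decimal.
void writeModules(std::ostream &out, const std::vector<ModuleCounters> &mods) {
  for (const ModuleCounters &m : mods) {
    out << m.path << ' ' << m.entries.size() << '\n';
    for (const PcCounter &e : m.entries)
      out << "0x" << std::hex << e.pc << std::dec << ' ' << e.counter << '\n';
  }
}

// Parse the text produced by writeModules. A module whose entry list is
// truncated is dropped, and parsing stops there.
std::vector<ModuleCounters> readModules(std::istream &in) {
  std::vector<ModuleCounters> mods;
  std::string path;
  std::size_t numCounters = 0;
  while (in >> path >> numCounters) {
    ModuleCounters m{path, {}};
    for (std::size_t i = 0; i < numCounters; ++i) {
      std::string pcText;
      std::uint64_t counter = 0;
      if (!(in >> pcText >> counter))
        return mods;  // truncated input: keep only complete modules
      // Base 16 accepts the optional 0x prefix.
      m.entries.push_back({std::stoull(pcText, nullptr, 16), counter});
    }
    mods.push_back(std::move(m));
  }
  return mods;
}
```

Writing two modules to a std::ostringstream and feeding the result to readModules through a std::istringstream should round-trip the data. A real binary encoding would also avoid the whitespace restriction on paths.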