Switching chrome on Linux from all ICF to safe ICF increases its size only slightly (about 5.2 MB of text, roughly 3%).
All ICF:
text data bss dec hex filename
169314343 8472660 2368965 180155968 abcf640 chrome
Safe ICF:
text data bss dec hex filename
174521550 8497604 2368965 185388119 b0ccc57 chrome
On Windows, chrome.dll grows by around 14 MB (about 12 MB of that is in the .text section).
All ICF:
Size of out\\Default\\chrome.dll is 170.715648 MB
name: mem size , disk size
.text: 141.701417 MB
.rdata: 22.458476 MB
.data: 3.093948 MB, 0.523264 MB
.pdata: 4.412364 MB
.00cfg: 0.000040 MB
.gehcont: 0.000132 MB
.retplne: 0.000108 MB
.rodata: 0.004544 MB
.tls: 0.000561 MB
CPADinfo: 0.000056 MB
\_RDATA: 0.000244 MB
.rsrc: 0.285232 MB
.reloc: 1.324196 MB
Safe ICF:
Size of out\\icf-safe\\chrome.dll is 184.499712 MB
name: mem size , disk size
.text: 153.809529 MB
.rdata: 23.123628 MB
.data: 3.093948 MB, 0.523264 MB
.pdata: 5.367396 MB
.00cfg: 0.000040 MB
.gehcont: 0.000132 MB
.retplne: 0.000108 MB
.rodata: 0.004544 MB
.tls: 0.000561 MB
CPADinfo: 0.000056 MB
\_RDATA: 0.000244 MB
.rsrc: 0.285232 MB
.reloc: 1.379364 MB
If an attribute is used and it affects a symbol's unnamed\_addr, it determines whether the symbol shows up in the .addrsig table. However, all-ICF mode in both ld.lld and lld-link ignores symbols in the .addrsig table if they belong to code sections. So such an attribute would have no effect on disabling ICF for them.
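To make that rule concrete, here is a minimal sketch of my reading of it. The names `IcfMode`, `SectionInfo` and `mayFold` are hypothetical and are not LLD's actual types or functions; the sketch also ignores every other eligibility condition a real linker checks.

```cpp
// Hypothetical model of the folding-eligibility rule described above.
// None of these names come from LLD; they are for illustration only.
enum class IcfMode { None, Safe, All };

struct SectionInfo {
  bool isCode;              // the section holds executable code
  bool addressSignificant;  // a symbol in it is listed in the .addrsig table
};

// Safe ICF keeps every address-significant section unfolded.
// All ICF ignores the .addrsig table for code sections, so an attribute
// that only changes .addrsig membership cannot protect a function there.
constexpr bool mayFold(IcfMode mode, SectionInfo s) {
  return mode == IcfMode::None   ? false
         : mode == IcfMode::Safe ? !s.addressSignificant
                                 : (s.isCode || !s.addressSignificant);
}
```

Under this model, `mayFold(IcfMode::All, {true, true})` is true while `mayFold(IcfMode::Safe, {true, true})` is false, which is the gap the proposed annotation is trying to close.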
On Mon, Mar 22, 2021 at 10:19 PM Fangrui Song <maskray@google.com> wrote:
On 2021-03-22, David Blaikie via llvm-dev wrote:
\>ICF: Identical Code Folding
\>
\>Linker deduplicates functions by collapsing any identical functions
\>together - with icf=safe, the linker looks at a .addrsig section in the
\>object file and any functions listed in that section are not treated as
\>collapsible (eg: because they need to meet C++'s "distinct functions have
\>distinct addresses" guarantee)
The name originated from MSVC link.exe where icf stands for "identical COMDAT folding".
gold named it "identical code folding" - which makes some sense because gold does not fold readonly data.
In LLD, the name is not accurate for two reasons: (1) the feature can
apply to readonly data as well; (2) the folding is by section, not by function.
We define two sections as identical if they have identical content and
their outgoing relocation sets cannot be distinguished: they must have
the same number of relocations, at the same relative locations, with
indistinguishable referenced symbols.
Then, ld.lld --icf={safe,all} works like this:
For a set of identical sections, the linker picks one representative and
drops the rest, then redirects references to the representative.
Note: this can confuse debuggers/symbolizers/profilers easily.
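As a rough illustration of the definition and the folding step above, here is a minimal sketch. `Reloc`, `Section`, `findRep`, `identical` and `foldIdentical` are hypothetical names, not LLD's data structures, and the algorithm is deliberately naive: it compares every pair of sections and repeats until nothing changes, so it will not fold mutually recursive sections the way a real implementation might.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of an input section (not LLD's real types).
struct Reloc {
  std::size_t offset;  // position of the relocation within the section
  std::size_t target;  // index of the section the relocation refers to
};

struct Section {
  std::string content;        // raw section bytes
  std::vector<Reloc> relocs;  // outgoing relocations, in offset order
};

// Follows representative links until reaching a section that represents itself.
std::size_t findRep(const std::vector<std::size_t>& rep, std::size_t i) {
  while (rep[i] != i) i = rep[i];
  return i;
}

// Two sections are identical if their bytes match and their relocations match
// pairwise: same offsets, and targets already folded into the same section.
bool identical(const std::vector<Section>& secs,
               const std::vector<std::size_t>& rep,
               std::size_t a, std::size_t b) {
  const Section& x = secs[a];
  const Section& y = secs[b];
  if (x.content != y.content || x.relocs.size() != y.relocs.size())
    return false;
  for (std::size_t k = 0; k < x.relocs.size(); ++k) {
    assert(x.relocs[k].target < secs.size());
    assert(y.relocs[k].target < secs.size());
    if (x.relocs[k].offset != y.relocs[k].offset) return false;
    if (findRep(rep, x.relocs[k].target) != findRep(rep, y.relocs[k].target))
      return false;
  }
  return true;
}

// Returns rep, where rep[i] is the index of the section kept in place of i.
std::vector<std::size_t> foldIdentical(const std::vector<Section>& secs) {
  std::vector<std::size_t> rep(secs.size());
  for (std::size_t i = 0; i < rep.size(); ++i) rep[i] = i;
  bool changed = true;
  while (changed) {  // folding targets can make more referrers identical
    changed = false;
    for (std::size_t i = 0; i < secs.size(); ++i) {
      if (rep[i] != i) continue;  // already folded into another section
      for (std::size_t j = i + 1; j < secs.size(); ++j) {
        if (rep[j] != j) continue;
        if (identical(secs, rep, i, j)) {
          rep[j] = i;  // drop j, redirect references to i
          changed = true;
        }
      }
    }
  }
  for (std::size_t i = 0; i < rep.size(); ++i) rep[i] = findRep(rep, i);
  return rep;
}
```

In this sketch, two callers whose only relocation points at two different but identical callees end up folded as well, once the callees have been merged in an earlier pass.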
lld-link /opt:icf is different from ld.lld --icf but I haven't looked
into it closely.
I find that the feature's saving is small given its downsides
(including increased link time: LLD's current implementation is inferior,
as it performs a quadratic number of comparisons within an equivalence class).
These are the size differences for the 'lld' executable:
% size lld.{none,safe,all}
text data bss dec hex filename
96821040 7210504 550810 104582354 63bccd2 lld.none
95217624 7167656 550810 102936090 622ae1a lld.safe
94038808 7167144 550810 101756762 610af5a lld.all
% size gold.{none,safe,all}
text data bss dec hex filename
96857302 7174792 550825 104582919 63bcf07 gold.none
94469390 7174792 550825 102195007 6175f3f gold.safe
94184430 7174792 550825 101910047 613061f gold.all
Note that the --icf=all result caps the potential saving of the proposed annotation.
Actually with some large internal targets I get even smaller savings.
ld.lld --icf=safe is safer than gold --icf=safe but probably misses some opportunities.
It may be that clang's codegen/optimizer fails to mark some cases as {,local\_}unnamed\_addr.
I know Chromium and the Windows world can be different:) But I'd still want to
get some numbers first.
Last, I have seen that Chromium has some code like
https://source.chromium.org/chromium/chromium/src/+/master:skia/ext/SkMemory\_new\_handler.cpp
void sk\_abort\_no\_print() {
  // Linker's ICF feature may merge this function with other functions with
  // the same definition (e.g. any function whose sole job is to call abort())
  // and it may confuse the crash report processing system.
  // http://crbug.com/860850
  static int static\_variable\_to\_make\_this\_function\_unique = 0x736b; // "sk"
  base::debug::Alias(&static\_variable\_to\_make\_this\_function\_unique);
  abort();
}
If we want an approach that works with link.exe, I don't know what we can do...
If there is no desire for link.exe compatibility, I can see that a proper way of marking the function
could be useful... but in any case, if an attribute is used, it should probably affect
unnamed\_addr directly instead of being called \*icf\*.
\>On Mon, Mar 22, 2021 at 6:16 PM Philip Reames via llvm-dev <
\>llvm-dev@lists.llvm.org> wrote:
\>
\>> Can you define ICF please? And give a bit of context?
\>>
\>> Philip
\>> On 3/22/21 5:27 PM, Zequan Wu via llvm-dev wrote:
\>>
\>> Hi all,
\>>
\>> Background:
\>> Debugging with ICF has been a longstanding difficulty. Programmers
\>> don't have control over which sections should be folded by ICF and which
\>> sections shouldn't. The existing address-significance table has no
\>> effect on code sections in all-ICF mode in both ld.lld and lld-link.
\>> Switching to safe ICF could mark code sections as unique, but at the cost
\>> of an uncontrolled increase in binary size. So, it would be good if
\>> programmers could selectively disable ICF in source code by annotating
\>> global functions/variables with an attribute, to improve the debugging
\>> experience while keeping control over the binary size increase.
\>>
\>> My plan is to add a new section table (\`.no\_icf\`) to object files. Sections
\>> of all symbols in the table should not be folded in all-ICF mode. And
\>> symbols can only be added to the table by annotating global
\>> functions/variables with a new attribute (\`no\_icf\`) in source code.
\>>
\>> What do you think about this approach?
\>>
\>> Thanks,
\>> Zequan
\>>
\>>
\>> \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_
\>> LLVM Developers mailing list
\>> llvm-dev@lists.llvm.org
\>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
\>>