[llvm-dev] Question: llvm-link type merge behaviour of c++ classes
Alex Denisov via llvm-dev llvm-dev at lists.llvm.org
Sun May 31 05:08:44 PDT 2020
Hi Björn,
I’m not particularly knowledgeable about the implementation of llvm-link, but my guess is that the merging is intended. I had a very similar problem recently: basically, the loss of information caused by IRLinker (the implementation behind llvm-link). I tried to dig into it to find out whether it’s possible to change the behavior, but got stuck pretty quickly, since the code there is not very intuitive (IMHO, of course).
I recall reading somewhere that the intention behind merging structurally equivalent types was to get rid of “duplicated” types. For example, you have a module A and a module B that both include a header defining the same type, resulting in the following bitcode:

    ; Module A
    %struct.Shape = type { … }

    ; Module B
    %struct.Shape = type { … }

However, when the modules are loaded into the same LLVM context, you will see something like this:

    ; Module A
    %struct.Shape = type { … }

    ; Module B
    %struct.Shape.1 = type { … }

So in the end, if you link them together, you'll get:

    ; Module A+B merged
    %struct.Shape = type { … }

Which is desired. What is not desired, IMO, is the situation that you describe, i.e. “different” types being merged regardless of their name.
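The difference between the two merge policies can be made concrete with a toy model. The sketch below is not IRLinker's algorithm and does not use the LLVM API. `ToyStruct`, `baseName` and `mergeTypes` are hypothetical names invented for illustration, and field types are represented as plain strings. It only contrasts merging on layout alone with merging on layout plus the name with its numeric suffix stripped.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an LLVM struct type: a name plus its field types.
// (Hypothetical model, not the LLVM API.)
struct ToyStruct {
    std::string name;
    std::vector<std::string> fields;
};

// Drops a trailing ".<digits>" suffix such as the ".1" in "struct.Shape.1".
std::string baseName(const std::string& name) {
    const std::size_t dot = name.rfind('.');
    if (dot == std::string::npos || dot + 1 == name.size()) return name;
    for (std::size_t i = dot + 1; i < name.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(name[i]))) return name;
    }
    return name.substr(0, dot);
}

// Maps every type name to the name of the type it is merged into.
// requireSameName == false: merge on layout alone.
// requireSameName == true:  merge only if the base names also match.
std::map<std::string, std::string> mergeTypes(const std::vector<ToyStruct>& types,
                                              bool requireSameName) {
    std::vector<ToyStruct> kept;  // one representative per merged group
    std::map<std::string, std::string> mergedInto;
    for (const ToyStruct& t : types) {
        const ToyStruct* match = nullptr;
        for (const ToyStruct& k : kept) {
            const bool sameLayout = k.fields == t.fields;
            const bool sameName = baseName(k.name) == baseName(t.name);
            if (sameLayout && (sameName || !requireSameName)) {
                match = &k;
                break;
            }
        }
        if (match) {
            mergedInto[t.name] = match->name;
        } else {
            kept.push_back(t);
            mergedInto[t.name] = t.name;
        }
    }
    return mergedInto;
}
```

With `class.Bakery`, `class.Shape` and `class.Shape.1` all declared as `{ i32, i32 }`, the layout-only policy maps all three to `class.Bakery`. The name-aware policy keeps `class.Bakery` separate and folds only `class.Shape.1` into `class.Shape`.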
So, in the end, we decided not to use IRLinker at all and to merge the types on our own. You can read more about the approach here: https://lowlevelbits.org/type-equality-in-llvm/
I’m not sure if that helps at all, but you are definitely not the only one out there who is confused by the implementation :)
On 28. May 2020, at 12:33, Björn Fiedler via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Hi LLVM community,

I'd like to ask a question regarding the behavior of llvm-link: My code contains classes which are structurally equivalent, but they are totally unrelated and distinct from a C++ point of view. However, if the compiled IR gets processed by llvm-link, these types are merged together. My question is: Is this expected behavior or a bug?

To explain it in more detail, a reduced example follows.

IR code before llvm-link:

```
...
%class.Bakery = type { i32, i32 }
%class.Container = type { i8 }
%class.Rectangle = type <{ %class.Shape, i8, [3 x i8] }>
%class.Shape = type { i32, i32 }
...
define linkonce_odr dso_local void @_ZN9Container6insertEP5Shape(%class.Container*, %class.Shape*) #1 comdat align 2 { ... }
...
```

IR code after llvm-link:

```
...
%class.Bakery = type { i32, i32 }
%class.Container = type { i8 }
%class.Rectangle = type <{ %class.Bakery, i8, [3 x i8] }>
...
define linkonce_odr dso_local void @_ZN9Container6insertEP5Shape(%class.Container*, %class.Bakery*) #1 comdat align 2 { ... }
...
```
In this example the `Bakery` and `Shape` types get merged. The type definition of `Rectangle` reflects this change, too, but from my intuition, they should stay distinct.

I've found an article from Chris [1] where the new type system is described. There he states that the name becomes part of the type: "This means that LLVM 3.0 doesn't exhibit the previous confusing behavior where two seemingly different structs would be printed with the same name." So, if the name is part of the type and the "confusing behavior" is removed, why do these types get merged? Is this the intended behavior?

My use case and reason for all this comes from writing an analysis tool for IR code which gets stuck in finding matching calls for function pointers. The changing and merging of these types messes up the current logic for finding matching candidates.

To reproduce this, find the C++ code below and use the following invocations:

```
clang++-9 mwe.cc -S -emit-llvm -o beforelink.ll
llvm-link-9 -S beforelink.ll -o afterlink.ll
```
mwe.cc

```
// #include "shapes.h"
// shapes.h content follows
class Shape {
public:
  int width, height;
};
class Rectangle : public Shape {
public:
  bool issquare;
};
class Container {
public:
  void insert(Shape* s){};
};
// end shapes.h

// #include "bakery.h"
// bakery.h content follows:
class Bakery {
public:
  int numovens, numemployees;
};
// end bakery.h

// some instances
Bakery b;
Container c;
Rectangle r;

void dostuff() { c.insert(&r); }
void bake(Bakery* bakery) {}
```
My system:

```
clang++-9 --version
clang version 9.0.1-12
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

llvm-link-9 --version
LLVM (http://llvm.org/):
  LLVM version 9.0.1
  Optimized build.
  Default target: x86_64-pc-linux-gnu
  Host CPU: skylake
```

Thanks in advance
Björn

[1] http://blog.llvm.org/2011/11/llvm-30-type-system-rewrite.html

--
Björn Fiedler, M.Sc. (Scientific Staff)
Leibniz Universität Hannover (LUH)
Fachgebiet System- und Rechnerarchitektur (SRA)
Appelstraße 4
30167 Hannover, Germany
Tel: +49 511 762-19736
Fax: +49 511 762-19733
eMail: fiedler at sra.uni-hannover.de
WWW: https://www.sra.uni-hannover.de
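One detail in the quoted IR may help with the call-matching use case: after linking, the parameter type reads `%class.Bakery*`, but the function's Itanium-mangled name `_ZN9Container6insertEP5Shape` still encodes `Shape`. The sketch below recovers the source-level parameter types from the symbol name instead of from the IR types. It assumes a toolchain that provides `abi::__cxa_demangle` in `<cxxabi.h>` (GCC or Clang on Linux, for instance). The helpers `demangle` and `parameterList` are hypothetical names, and the approach only applies to functions that carry a C++ mangled name.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if demangling fails.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // __cxa_demangle returns a malloc'ed buffer that the caller must free.
    std::unique_ptr<char, void (*)(void*)> text(
        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status),
        std::free);
    return (status == 0 && text) ? std::string(text.get()) : mangled;
}

// Hypothetical helper: extracts the text between the first '(' and the
// last ')' of a demangled signature, or an empty string if there is none.
std::string parameterList(const std::string& demangled) {
    const std::size_t open = demangled.find('(');
    const std::size_t close = demangled.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open) {
        return "";
    }
    return demangled.substr(open + 1, close - open - 1);
}
```

For the symbol from the example, `demangle` yields `Container::insert(Shape*)` and `parameterList` yields `Shape*`, regardless of which struct type the linker substituted in the IR signature.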
LLVM Developers mailing list llvm-dev at lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev