[RFC] Upstreaming ClangIR
Hey folks,
This RFC proposes upstreaming ClangIR: incorporating the llvm/clangir repo from LLVM’s incubator into the mainstream llvm/llvm-project.
Background
A little over a year ago, an RFC introducing ClangIR was published: a new higher-level IR for C/C++. It’s an MLIR-based C/C++ dialect for Clang, generated from the Clang AST, which can be lowered to other IRs – check it out for more background and motivation, and see the FAQ below to understand what has changed since then. The ClangIR page also contains general information, documentation and usage instructions.
A year of progress
Last October, the Evolution of ClangIR talk was presented at the LLVM Dev Meeting (the video should be available on LLVM’s YouTube channel soon). It explores some aspects of the design and some of the past year’s achievements. Given the progress and the community built around it, I believe that CIR is no longer ‘experimental’ in concept, and the working group (the MLIR C/C++ frontend folks) now believes that the dialect and architecture are headed in the right direction.
The project also grew from two contributors in January 2023 to a total of nine by the end of 2023, with four currently active. I also expect that number to increase due to upstreaming (see next section). Some of the achievements include:
- A coroutines-aware C++ lifetime checker based on ClangIR. Deployment for this just started at Meta, where a C++ lifetime bug (which caused major churn in production) was retroactively caught in our codebase.
- Progressive lowering of ClangIR.
- CIRGen: AST to CIR. Direct translation, avoiding early optimizations.
- Passes: Lifetime checker, cleanup pass, lowering prepare, idiom recognizer and library call optimizer.
- LoweringPrepare: Unwrap some abstractions and expand CIR into more basic CIR operations, prior to lowering to LLVM IR. To be more concrete, this is where things like static initializers get their `__cxa_guard_acquire`/`__cxa_guard_release` calls, trivial constructors get expanded to CIR’s `memcpy`/`memmove`, etc. (see the sketch after this list).
- Lowering to LLVM IR: more than half of the SingleSource tests (1000+) pass correctness checks. This is the default ClangIR pipeline.
- Lowering to MLIR in-tree dialects: still in toy shape. Some lowering to `memref`, `arith`, `func`, `scf` and `cf`.
- Community: monthly meetings with other C/C++ MLIR frontend efforts. This has been extremely valuable in getting feedback and direction, building momentum, and figuring out how all of our projects fit together.
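To make the LoweringPrepare bullet above more concrete, here is a minimal, hand-written C++ sketch (not code from the ClangIR repository): a function-local static whose thread-safe initialization is guarded on the Itanium C++ ABI, which is the kind of construct CIR keeps abstract until LoweringPrepare expands it.

```cpp
// Illustrative only: a function-local static with a non-trivial initializer.
int expensive_init() { return 42; }  // stand-in for real initialization work

int cached_value() {
  // On the Itanium C++ ABI, the thread-safe initialization of `v` is guarded
  // by __cxa_guard_acquire/__cxa_guard_release. The idea is that CIR models
  // this initialization abstractly, and LoweringPrepare is where it gets
  // expanded into such lower-level operations before lowering to LLVM IR.
  static int v = expensive_init();
  return v;
}
```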
Why upstream now?
Why is this a good moment in time to include ClangIR into llvm-project?
The project is currently getting contributions from some new interested parties (e.g. see the recent OpenACC RFC), and it’s more convenient for everyone involved for ClangIR collaboration to happen directly upstream, including examples like: OpenACC bits shared between Clang and Flang, and the SYCL/MLIR effort already using a fork from intel/llvm. It also happens that it’s more appealing for some of the entities involved to contribute directly to an upstream llvm-project instead of to a project under the incubator - incorporation into their processes becomes an easier step.
ClangIR is young enough to be actively redesigned. Its evolution so far has been driven by the lifetime checker and LLVM lowering, but there’s more to cover on C/C++ language extensions (GPUs, HPC, …), static analysis, debug info, sanitizers, etc. The project is also mature enough not to cause major breakages/churn to the rest of LLVM, and is of the quality one expects from the LLVM infrastructure.
Stakeholders
The conversation about upstreaming ClangIR has already started among interested parties; here’s a list of community members who would rather see ClangIR upstreamed sooner rather than later:
- OpenACC / OpenMP at NVIDIA. OpenACC’s upstreaming RFC states the interest in a lowering story via ClangIR. They are already sending PRs to ClangIR and have expressed the intent to see this upstream – Erich Keane, David Olsen.
- SYCL. Intel and Codeplay are looking into downstream strategies in order to adopt ClangIR for their SYCL-MLIR project presented at EuroLLVM 2023 (talk, poster). We provided them a merged branch with another fork to start an experiment, but that isn’t ideal long term – Intel and Codeplay: Lukas Sommer, Julian Oppermann, Victor Lomüller, Victor Perez, Ettore Tiotto, Whitney Tsang.
- HLSL. Interested in exploring the potential benefits of lowering HLSL to ClangIR to preserve structured control flow for the SPIR-V backend in the future, which may also allow us to share more common graphics legalization passes with DXIL. HLSL has complex legalization requirements that are onerous to implement solely on ASTs. Today DXC relies on legalization at the LLVM IR layer, which can result in poor-quality diagnostics issued late. ClangIR has the potential to significantly improve the accuracy and quality of this class of diagnostic. Having ClangIR included in llvm-project would make testing this alternate path easier without needing to manage multiple merge branches for frequently updated projects – Google and Microsoft: Diego Novillo, Steven Perron, Natalie Chouinard, Nathan Gauër, Cassandra Beckley, David Neto, Chris Bieneman, Justin Bogner.
- Polygeist. There’s common agreement with the project owners over community meetings that Polygeist would benefit from lowering directly out of ClangIR instead of AST. Having ClangIR in tree will let them start the process – Google: Alex Zinenko.
- VAST. Trail of Bits has expressed interest in using ClangIR to lower VAST IRs down to LLVM IR. As VAST targets high-level program analysis for C/C++, it would benefit everyone not to split the community across multiple representations and to allow interchangeable formats. ClangIR, being the representation that brings this to the table, can serve as a unifying standard, similar to how LLVM IR did for various tools in the past – Henrich Lauko, Lukáš Korenčik and Peter Goodman.
- At NextSilicon, we recognize the significant value of integrating clang-mlir into our workflow. As an accelerated compute company, we primarily use MLIR as the driving force behind our chip optimization efforts. The adoption of ClangIR would introduce an additional optimization layer for our hardware, leveraging high-level abstractions like for loops and multi-dimensional arrays to enhance performance – Or Birenzwige, Johannes de Fine Licht, Christian Ulmann, and Tobias Gysi.
If you are reading this, and I missed your project (or your support), please chime in!
Implementation Strategy
ClangIR’s development follows some guiding principles:
- Follow the proven CodeGen skeleton: reuse the direct AST-to-LLVM codegen skeleton as much as possible, as it has proven to be a correct and safe baseline and is a convenient entry point for newcomers.
- Produce hard errors on unimplemented language features: this prevents silently generating incorrect or incomplete IR for unimplemented features, which would otherwise lead to tricky problems surfacing late. In the near future the plan is to make this more graceful using some form of diagnostics.
- Generate the same LLVM IR at baseline: LLVM IR lowered out of ClangIR should be as close as possible to what Clang currently generates. This helps eliminate canonicalization and phase-ordering issues when investigating codegen quality. If it makes sense, this guiding principle could change once ClangIR is mature.
- Avoid early optimization and premature lowering within the AST-to-ClangIR transformation: traditional Clang codegen does a lot of this eagerly (e.g. replacing ctor calls with memcpy). ClangIR’s raison d’être is to prevent going “too low, too early” (see the sketch after this list).
- ClangIR’s source code is mostly isolated and non-intrusive: no dependency on custom AST changes or ported patches.
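As a hand-written illustration of the “avoid premature lowering” principle (not a ClangIR test case), consider a trivial copy:

```cpp
// Illustrative only: a type whose copies are trivial.
struct Point {
  int x;
  int y;
};

Point duplicate(const Point &p) {
  // Traditional AST-to-LLVM codegen may eagerly lower this copy to a
  // memcpy-like operation; ClangIR's goal is to keep it visible as a
  // higher-level operation so analyses still see "a copy of Point",
  // expanding it only in a later pass such as LoweringPrepare.
  Point q = p;
  return q;
}
```

Keeping the copy as a recognizable high-level operation is what lets passes like the lifetime checker reason about object semantics before everything becomes loads, stores and intrinsics.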
Source code
Most of the new code is in `clang/lib/CIR`, `clang/include/clang/{CIR,CIRFrontendAction}` and `clang/test/CIR`. Additional changes in the codebase include:
- A ClangIR-based clang-tidy infrastructure in `clang-tools-extra/clang-tidy/cir` (used to invoke the lifetime checker from clang-tidy).
- Driver changes in tablegen files to add new flags to activate ClangIR-specific behavior.
Compiler Flags
From clangir.org:
By passing `-fclangir-enable` to the clang driver, the compilation pipeline is modified and CIR gets emitted from the Clang AST and then lowered to LLVM IR, the backend, etc. To get CIR printed out of a compiler invocation, the flag `-emit-cir` can be used to tell the compiler to stop right after CIR is produced.
ClangIR codegen (CIRGen) and passes are hidden behind flags:
- `-fclangir-enable` forces CIR to be enabled in the pipeline and used transparently (e.g. if one asks the compiler to output assembly, then that’s the end result).
- Miscellaneous `-fclangir-*` flags change CIRGen and pipeline behavior (adding passes, disabling verifiers, etc).
- The `-emit-cir` flag, which is the moral equivalent of `-emit-llvm` for CIR.
Prefixing `clangir` in flag names has been our way to mark behavior as experimental, though alternatively these flags could be changed and prefixed with `experimental` - as done by similarly experimental past projects, e.g. the new pass manager.
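As a usage sketch, assuming the flags behave as described above (the exact driver spellings are not verified here):

```cpp
// example.cpp - a trivial translation unit for experimenting with ClangIR.
//
// Assumed invocations, based on the flag descriptions above (exact driver
// spellings may differ, e.g. whether -Xclang is required):
//
//   Route the regular compilation pipeline through CIR:
//     clang++ -fclangir-enable -c example.cpp
//
//   Stop right after CIR is produced and print it (the moral equivalent
//   of -emit-llvm for CIR):
//     clang++ -fclangir-enable -emit-cir example.cpp
int add(int a, int b) { return a + b; }
```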
Builds
Building ClangIR is optional and can be accomplished by setting the proper CMake flag: `CLANG_ENABLE_CIR`. It works very similarly to existing flags like `CLANG_ENABLE_ARCMT` or `CLANG_ENABLE_STATIC_ANALYZER`.
Note that CIR test execution is also tied to the overall CMake enablement, e.g. `ninja check-clang-cir` only works if the proper CMake setup is done.
Git strategy & Timeline
This is probably a more involved discussion, and I’d prefer to first focus on getting approval for the proposal before tackling it (maybe even in its own RFC). So unless this somehow becomes critical to the decision, it’s perhaps best to wait for a follow-up?
FAQ
Is there an easy way to play around with ClangIR?
Yes, Compiler Explorer to the rescue! See an example here: Compiler Explorer. Note that it’s still missing a proper setup with a more up-to-date C++ standard library version in order to play with coroutines and other modern features.
To what extent has the current design of ClangIR changed since the initial RFC?
The initial design has changed since then based on community feedback. The top three changes in ClangIR are:
- Operations are able to hold references back to the Clang AST (inspired by Swift’s SIL).
- We have opted for a more cautious approach, staying closer to LLVM unless there are compelling reasons to raise operations early. This choice will help us reach the finish line faster, as opposed to pursuing a clean room design with higher semantics and representation.
- Focused on direct lowering to the LLVM IR dialect rather than to the standard (in-tree) MLIR dialects. The in-tree dialects are still unstable, and for a project like Clang, an unstable target IR wasn’t something we wanted to commit to. Work on the standard MLIR path is encouraged but hasn’t been the focus of the more active contributors.
How about the Kleckner criteria (build time footprint)?
Reid Kleckner (@rnk) raised some good questions regarding ClangIR’s compile time footprint. For the “C/C++ → CIR → LLVM” path, we have only been able to gather compile-time numbers for the subset of the SingleSource tests from the LLVM test-suite that we’re able to build - the results are noisy, though. Unfortunately, it’s not a reliable performance comparison, as many of these tests are too small.
For the “C/C++ → CIR → C++ lifetime analysis” path there’s currently no good proxy to compare against, especially given CIR codegen is only done for source files being analyzed (no CIRGen for definitions from headers, only declarations are emitted).
The honest answer is that we don’t have reliable numbers to show just yet. Though it’s also worth mentioning that there are possible compile time benefits unique to MLIR around function pass level parallelism.
How much longer does Clang’s build and testing get?
Time to build: the ClangIR-specific code added was in the noise compared to a build that also built both Clang and MLIR. However, the cost of building MLIR itself is pretty significant: the average build-time overhead measured from adding MLIR to the LLVM_ENABLE_PROJECTS list was ~45% compared to just building Clang (config: 2x AMD, 166 cores, 224GB).
Time to run tests (assuming nothing else to build): `ninja check-clang-cir` reports in ~2s for release builds and ~6s for debug builds (~225 tests; config: Apple M1 Max laptop, 64GB).
What’s the progress on static analysis?
The lifetime checker is the only current piece in that direction, and it does very simple analysis - it’s capable of catching low-hanging fruit in modern C++, mainly because the higher-level operations and the AST back-references are really useful in helping the compiler understand C++. Over the past year we (a subset of the MLIR C/C++ frontend folks) had many discussions with, and guidance from, some of the experts in the community (such as Gabor Horvath, Dmytro Hrybenko and Artem Dergachev). Some open project ideas we’d like to see in the future include: teaching the dataflow analysis framework to use ClangIR, and implementing some of Clang’s CFG-based analyses (e.g., AnalysisBasedWarnings) as CIR passes (this would also be great for compile-time evaluation).
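To give a flavor of the class of bug such a checker targets, here is a minimal, hypothetical example (not one of the checker’s actual test cases): a pointer that outlives the local variable it points to.

```cpp
#include <cstdio>

// Illustrative only: the classic use-after-scope bug that a C++ lifetime
// checker aims to flag.
int main() {
  int *p = nullptr;
  {
    int local = 42;
    p = &local;             // p now points into a scope that is about to end
  }                         // `local` is destroyed here; p dangles
  std::printf("%d\n", *p);  // dangling read: undefined behavior
  return 0;
}
```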
Assuming there’s a large amount of code duplication between ClangIR generation (CIRGen) and traditional LLVM IR generation in CodeGen (IRGen), what are the expectations for maintainers (for example, if someone fixes a bug in IRGen, should they also fix it in CIRGen)?
No. CIRGen follows the general skeleton of IRGen; however, there are no plans to merge the two code generators. One area of improvement is sharing the AST queries done by both - there are duplicated helpers that gather information from types and other AST properties, and those should be shared. We currently track a bunch of these and plan to send a specific RFC in the future to discuss proper mechanisms to address them.
On the expectations for maintainers: none. If the developers of IRGen want to be helpful, they can communicate the new gap, but nothing is required. We’ve been operating as a few people playing catch-up for years now, and we’re fine with that until the community decides it’s worth their time to keep up.
Acknowledgements
Thanks to everyone who contributed PRs, created issues and participated in the C/C++ MLIR frontend meetings. Special thanks to folks who contributed to the project in the past year: Nathan Lanza (@lanza), Vinicius Couto Espindola (@sitio-couto), Hongtao Hu (@htyu), David Olsen, Yury Gribov, Oleg Kamenkov, Henrich Lauko (@xheno), Jeremy Kun (@j2kun), Keyi Zhang, Sirui Mu (@Lancern), Roman Rusyaev (@rusyaev-roman), Zhou (@redbopo), Ivan Murashko (@ivanmurashko), Nikolas Klauser (@philnik) and Fabian Mara Cordero (@fabianmc).
RFC accepted in this message.