[PSA] Annotating LLVM Public Interface

Hi everyone! I’m a new contributor to LLVM, and I have been looking into building LLVM as a DLL (shared library) on Windows. To support this option, we are adding annotations to LLVM’s public headers to explicitly describe the set of symbols that should be visible externally.

Details

Code changes will primarily consist of annotating LLVM’s public symbols with the LLVM_ABI macro already defined in llvm/Support/Compiler.h. There are similar macros for annotating C++ template instantiations which are used in some less common situations. A portion of the codebase is already annotated.
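As a rough illustration of what symbol-level annotation looks like, the sketch below annotates a hypothetical class and free function. `Counter` and `describe` are invented for this example and are not part of LLVM. `LLVM_ABI` is defined here as an empty stand-in so the snippet compiles without the LLVM headers; in the real tree the macro comes from llvm/Support/Compiler.h and its expansion depends on the build configuration.

```cpp
#include <string>

// Stand-in so this sketch compiles on its own (assumption: the real macro
// is provided by llvm/Support/Compiler.h and is inactive by default).
#ifndef LLVM_ABI
#define LLVM_ABI
#endif

namespace example {

// Hypothetical class, used only for illustration.
class Counter {
public:
  // Out-of-line members that clients call are annotated individually.
  LLVM_ABI explicit Counter(int Start);
  LLVM_ABI void increment();

  // Defined inline in the header, so it is left unannotated here.
  int value() const { return Value; }

private:
  int Value;
};

// A free function declared in a public header is annotated the same way.
LLVM_ABI std::string describe(const Counter &C);

// Definitions would normally live in a .cpp file.
Counter::Counter(int Start) : Value(Start) {}
void Counter::increment() { ++Value; }
std::string describe(const Counter &C) {
  return "Counter(" + std::to_string(C.value()) + ")";
}

} // namespace example
```

Because the stand-in macro expands to nothing, the code behaves exactly as it would without annotations, which mirrors why adding them incrementally is low-risk.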

Because the macros are inactive by default, adding them throughout the codebase is low-risk and can be done incrementally. Annotations will not become mandatory until the entire codebase has been annotated and there are CI jobs, documentation, and tools in place to catch regressions.

Generally, annotations will be added to individual symbols rather than to entire classes. This method is preferred for a couple of reasons:

The bulk of annotations will be added mechanically using the Interface Definition Scanner tool, which leverages clang’s AST and rewriter libraries.

Previous Efforts

This LLVM Discourse thread from 2021 covers the original proposal in detail and is still mostly relevant. Following that discussion, there was some initial work in 2023 which identified issues and proved out viability. This work resulted in this Discord discussion.

In 2024, the effort was resumed as part of a GSoC project to support clang plugins on Windows. This work is primarily tracked in this issue on GitHub. The project added build options to build LLVM as a DLL, introduced the macros to annotate LLVM’s public surface area, and annotated a portion of the codebase. The work to get LLVM fully building as a DLL is incomplete.

Maintainability

Most LLVM developers do not build on Windows locally, so they may not immediately catch breaks caused by missing symbol annotations. There are a number of things we will do to help identify issues earlier in the development cycle. Annotations will not be mandatory until these pieces are in place.

1. Documentation and Examples

The use cases for LLVM_ABI and related macros will be documented and discoverable. We will document common patterns and situations, with examples, so that developers can easily address annotation issues that arise during development.

2. Windows LLVM DLL CI build job

Adding a Windows LLVM DLL build job to CI will catch unannotated symbols at link time. This job can run either pre- or post-merge. It will not catch unannotated LLVM symbols that are referenced only by projects the job does not build.

We may also consider changing the default Windows build to LLVM DLL. This change would let all existing Windows build jobs catch missing export issues.

3. Approximate DLL export behavior on other shared-library builds

We can achieve similar behavior to Windows DLL exports in other environments by building ELF and Mach-O shared libraries with default hidden symbol visibility. This result is achieved by setting -fvisibility=hidden and re-defining the LLVM_ABI annotation to __attribute__((__visibility__("default"))). The existing annotations in llvm/Support/Compiler.h already behave this way when configured for a non-Windows shared library build.

This mechanism will produce similar behavior to the Windows DLL build and could catch most issues without building for Windows. However, since most developers are using static library builds locally, this change won’t necessarily result in catching missing annotations earlier.
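The visibility-based approach can be sketched with a small export macro. This is a minimal illustration under stated assumptions, not LLVM's actual definition: `EXAMPLE_ABI`, `EXAMPLE_BUILDING_LIBRARY`, `exported_answer`, and `internal_helper` are hypothetical names, and the real macros in llvm/Support/Compiler.h are driven by the build configuration.

```cpp
// Normally set by the build system when compiling the library itself;
// defined here so the sketch is self-contained (hypothetical name).
#define EXAMPLE_BUILDING_LIBRARY 1

#if defined(_WIN32)
#  if defined(EXAMPLE_BUILDING_LIBRARY)
#    define EXAMPLE_ABI __declspec(dllexport) // library side
#  else
#    define EXAMPLE_ABI __declspec(dllimport) // client side
#  endif
#elif defined(__GNUC__) || defined(__clang__)
// Restores default visibility for this symbol when the translation unit
// is compiled with -fvisibility=hidden.
#  define EXAMPLE_ABI __attribute__((__visibility__("default")))
#else
#  define EXAMPLE_ABI
#endif

// Annotated: remains visible to clients of the shared library.
EXAMPLE_ABI int exported_answer();

// Not annotated: hidden from clients under -fvisibility=hidden, much like
// a symbol that is missing from a Windows DLL's export table.
int internal_helper();

int internal_helper() { return 41; }
int exported_answer() { return internal_helper() + 1; }
```

Under this setup, a client that references `internal_helper` across the shared-library boundary would be expected to fail at link time on ELF and Mach-O, which is the kind of break the Windows DLL build would otherwise be the first to report.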

4. Static analysis with the Interface Definition Scanner tool

The Interface Definition Scanner tool will be run on PRs to flag newly introduced symbols that are not properly annotated for export. It can run much faster than a full Windows build of all projects, and can suggest exact fixes to address missing exports.

Once the bulk of symbol annotations have been merged, we can enable IDS to run on all LLVM PRs – there is no need to wait until building LLVM as a Windows DLL is a complete or fully supported configuration.

Additional Background

LLVM can already be built as a shared library on ELF- and Mach-O-based systems; however, building it as a Windows DLL is more involved for several reasons:

Exporting C++ Classes

When defining DLL exports, it is possible to annotate entire C++ classes and structs, rather than their individual members, with __declspec(dllexport). Annotating a class will export every method and static field in the class, including:

Annotating a class does not implicitly export nested classes/structs or any friend class or function declarations. A class with a class-level annotation cannot also have annotated members; it will fail to compile.

The advantage of annotating at the class level is that new members will be automatically exported. However, exporting entire classes can cause significantly more methods to be exported than necessary, and it can lead to tricky-to-debug problems with compiler-generated methods.
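The two styles can be compared side by side in a short sketch. All names here (`EXAMPLE_ABI`, `Shape`, `Builder`, `Label`) are hypothetical and chosen only for illustration; the macro is a simplified stand-in for an export annotation, not LLVM's own definition.

```cpp
// Simplified, hypothetical export macro for a library that is being built.
#if defined(_WIN32)
#  define EXAMPLE_ABI __declspec(dllexport)
#elif defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_ABI __attribute__((__visibility__("default")))
#else
#  define EXAMPLE_ABI
#endif

// Class-level annotation: every method and static field is exported,
// including compiler-generated members.
class EXAMPLE_ABI Shape {
public:
  explicit Shape(int NumSides);
  int sides() const;

  // Nested types are not covered by the outer annotation, so this one
  // carries its own.
  struct EXAMPLE_ABI Builder {
    Shape build(int NumSides) const;
  };

private:
  int Sides;
};

// Member-level annotation: only the chosen members are exported.
class Label {
public:
  EXAMPLE_ABI explicit Label(int Id);
  EXAMPLE_ABI int id() const;

  // Inline member, left unannotated.
  int rawId() const { return Id; }

private:
  int Id;
};

// Definitions would normally live in a .cpp file.
Shape::Shape(int NumSides) : Sides(NumSides) {}
int Shape::sides() const { return Sides; }
Shape Shape::Builder::build(int NumSides) const { return Shape(NumSides); }

Label::Label(int Id) : Id(Id) {}
int Label::id() const { return Id; }
```

In the class-level form, a member added to `Shape` later is exported automatically; in the member-level form, each new out-of-line member of `Label` needs its own annotation, which keeps the exported surface explicit.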