Supporting multiple LLVM profile formats at once

October 1, 2025, 7:07pm 1

Our group has worked on adding llvm-cov support for Linux and Xen. These freestanding systems are not linked with compiler-rt, so they cannot easily use the existing coverage definitions and code (e.g., InstrProfilingBuffer.c). Instead, they have to redefine the relevant data structures and reimplement the code that writes the data to disk (dump() in Xen, llvm_cov_serialize_raw_profile() in Linux PATCH v2 1/4). The key issue with such a reimplementation is that it must be updated whenever the LLVM profile format version changes (e.g., from v4 to v10).

A potential solution would be to generalize the LLVM definitions (InstrProfData.inc) to support multiple versions simultaneously (rather than only the latest version), e.g., using #ifdef-sprinkled code to accommodate the current version plus several previous versions. This change would make it easier for Linux, Xen, and many embedded systems to simply copy-paste the code from LLVM and get all the latest updates. We understand that this change would come at a somewhat higher cost of maintainability for LLVM. A similar discussion was raised by Xen developers.
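As a rough illustration of what we mean by #ifdef-sprinkled code, here is a minimal sketch. Everything in it is hypothetical: the `DEMO_PROF_RAW_VERSION` macro, the `demo_prof_header` struct, its fields, and the version in which the extra field appears are made up for the example and do not mirror the actual contents of InstrProfData.inc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical knob: the embedding project selects the format version
 * it wants to emit. Not a real LLVM macro. */
#ifndef DEMO_PROF_RAW_VERSION
#define DEMO_PROF_RAW_VERSION 10
#endif

/* Hypothetical magic value, used only in this sketch. */
#define DEMO_PROF_MAGIC 0x44454D4FULL

/* Illustrative header; the fields are assumptions, not the real layout. */
struct demo_prof_header {
    uint64_t magic;
    uint64_t version;
    uint64_t num_data;
    uint64_t num_counters;
#if DEMO_PROF_RAW_VERSION >= 9
    uint64_t num_bitmap_bytes; /* pretend this field was added in v9 */
#endif
    uint64_t names_size;
};

/* Zero every field that exists in the selected version, then stamp the
 * magic and version numbers. */
static void demo_prof_fill_header(struct demo_prof_header *h)
{
    assert(h != NULL);
    h->magic = DEMO_PROF_MAGIC;
    h->version = DEMO_PROF_RAW_VERSION;
    h->num_data = 0;
    h->num_counters = 0;
#if DEMO_PROF_RAW_VERSION >= 9
    h->num_bitmap_bytes = 0;
#endif
    h->names_size = 0;
}
```

A project pinned to an older compiler would build the same copied file with, say, `-DDEMO_PROF_RAW_VERSION=8` and get the older layout without touching the source.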

Would you be interested in exploring this change? Should we prepare a draft patch to support multiple versions at once?

@evodius96

Yes, this is my concern, as well as testing. There are multiple versioned formats in InstrProfData.inc (raw profile, indexed profile, and coverage mapping), but the example to which you link concerns the raw profile format, which is not guaranteed to be backward or forward compatible. Just to make sure I am aligned: you suggest using preprocessor directives to “accommodate the current version plus several previous versions”. It sounds like you’re asking for the capability to build llvm-cov (and llvm-profdata?) against an older raw profile format, in order to mitigate the burden placed on you to keep current with format changes. How would we test this upstream?

Maintaining active backward compatibility might be more flexible, in my opinion, and easier to test, but it would probably be much harder to guarantee and would make needed format changes more difficult.

To be honest, I don’t know the history behind why the raw profile format is maintained as it is. I’d like to pull in a few others who may know more about the history or have an opinion.

@petrhosek @MaskRay @gulfemsavrun @vedantk @ellishg

-Alan

royger November 25, 2025, 3:10pm 3

I’ve also raised this as an issue quite some time ago on the LLVM GitHub page:

TL;DR: it would be nice if clang provided a builtin to generate the raw profile output, so that users don't have to care about versioning. One function that returns the required buffer size, plus another that takes a buffer and fills it in, would be enough. But they need to be builtins, not part of the compiler-rt library.
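To make the shape of that interface concrete, here is a minimal sketch of the two-call pattern from the caller's point of view. The names `demo_profile_size()` and `demo_profile_write()` are hypothetical stand-ins, not existing clang builtins, and the byte array is only a placeholder for data the compiler would own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder for the compiler-owned profile data; contents are arbitrary. */
static const uint8_t demo_profile_blob[] = {
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08
};

/* Hypothetical call 1: how many bytes the caller must provide. */
static size_t demo_profile_size(void)
{
    return sizeof(demo_profile_blob);
}

/* Hypothetical call 2: fill the caller's buffer.
 * Returns 0 on success, -1 if the buffer is too small. */
static int demo_profile_write(void *buf, size_t len)
{
    assert(buf != NULL);
    if (len < demo_profile_size())
        return -1;
    memcpy(buf, demo_profile_blob, demo_profile_size());
    return 0;
}
```

With something like this, the kernel or hypervisor would only allocate `demo_profile_size()` bytes, call `demo_profile_write()`, and hand the buffer to userspace, without ever defining the header or record layouts itself.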