RFC: Integrate Clang-Tidy checkers into Clang Static Analyzer

Malavika Samak, Static Security Tools, Apple

Summary

This RFC proposes an integration that allows Clang Static Analyzer (CSA) users to run Clang-Tidy checkers and CSA checkers in a single analysis pass. The integration will allow users to enable, disable, and configure Clang-Tidy checkers directly through CSA’s command-line interface. The integration does not enable Clang-Tidy checkers by default; users explicitly choose which checkers to run and have full control to enable any combination of checkers or disable them entirely. We have already built and deployed the proposed integration internally with positive results, and this proposal seeks to upstream these efforts to benefit the broader Clang community.

Motivation

Clang Static Analyzer (CSA) and Clang-Tidy are both built to help improve code quality by detecting defects and suspicious code patterns. However, they employ different analysis techniques: CSA uses data-flow and path-sensitive analysis, while Clang-Tidy uses faster AST-based pattern matchers. As a result, each tool excels at identifying distinct classes of defects, although some overlap exists where both can detect similar issues. To get the benefits of both, developers currently must run the tools separately and then combine the individual results.

This workflow increases build times because the AST is built twice, once for each tool, and it requires managing a separate configuration for each. Users must also merge the results to get a unified view of code quality issues. Finally, it leaves tool developers unsure which tool is the most appropriate home for a new checker.

To address this, we propose an integration between the two tools that makes it possible to execute Clang-Tidy checkers as part of a CSA run. Running both in one invocation is expected to cut the compilation overhead. The integration will enable unified reporting across all checkers, delivering both sets of results in the existing CSA output formats: Plist, SARIF, and HTML. It should also simplify the user workflow by consolidating everything into one command and one configuration approach. We also expect this to simplify and speed up CI/CD pipelines.

Design Goals

The proposed integration adheres to the following design goals:

It must be resilient to future changes to the Clang-tidy frontend.
The solution will support both command-line invocation and scan-build integration.
It must make clear to users which checks are enabled and how they are configured, eliminating any ambiguity in the analysis setup.
The two analysis engines will execute in tandem without interference, allowing each to leverage its strengths independently while sharing the underlying AST infrastructure.
The Clang-Tidy findings generated via the integration will be reformatted to align with CSA’s reporting conventions, ensuring a consistent and coherent user experience across all analysis results.

Proposed Solution

Checker Enablement and Disablement: The integration extends CSA’s existing checker selection mechanism to support Clang-tidy checkers.
- Enabling Clang-Tidy checkers: Users can enable specific Clang-Tidy checkers or entire modules using the new -analyzer-tidy-checker flag:
  * Individual checker: -analyzer-tidy-checker=bugprone-assert-side-effect
  * Entire module: -analyzer-tidy-checker=bugprone-*
- Disabling Clang-Tidy checkers: Similarly, users can disable specific checkers or modules using the new -analyzer-disable-tidy-checker flag:
  * Individual checker: -analyzer-disable-tidy-checker=bugprone-infinite-loop
  * Entire module: -analyzer-disable-tidy-checker=bugprone-*
  * All Clang-tidy checkers: Use the existing -analyzer-disable-all-checks flag, which will turn off all clang-tidy checkers along with all CSA checkers
Checker configuration: The integration will provide a YAML-based configuration system through the -analyzer-tidy-config flag, which accepts Clang-Tidy configuration in the same YAML format used by .clang-tidy files.
- Per-check options: Users can configure an individual Clang-Tidy checker using the existing CheckOptions category offered by Clang-Tidy.
  * -analyzer-tidy-config "{CheckOptions: {bugprone-sizeof-expression.WarnOnSizeOfPointer: true, bugprone-sizeof-expression.WarnOnSizeOfIntegerExpression: true}}"
- Global options: Similarly, users can set Clang-Tidy's global options.
  * -analyzer-tidy-config "{WarningsAsErrors: 'bugprone-*'}"
Support for scan-build: The integration will maintain compatibility with scan-build, to ensure users can leverage the integration within their existing build workflows. When invoked through scan-build, the integration automatically intercepts compilation commands and applies both CSA and Clang-Tidy analyses to the codebase, requiring no modifications to the underlying build system. Users can control Clang-tidy checker enablement through scan-build’s command-line interface using the newly introduced -enable-tidy-checker and -disable-tidy-checker flags.
Checker execution and reporting: The integration employs a multiplexed AST consumer architecture where both CSA and Clang-Tidy consumers process the same Abstract Syntax Tree in a single compiler invocation. The multiplexed AST consumer routes AST events to both consumers and diagnostics from both tools are collected separately. To achieve unified reporting, Clang-tidy diagnostics are converted to CSA’s PathDiagnostic format by transforming each Clang-Tidy warning into a PathDiagnostic. The merged diagnostics from both tools are then emitted through CSA’s existing PathDiagnosticConsumer infrastructure, producing unified output in all supported CSA formats.
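
The checker selection rules described under "Checker Enablement and Disablement" can be illustrated with a small standalone model. This is only a sketch: matchesPattern and isTidyCheckEnabled are hypothetical names, the glob support is limited to a trailing '*', and the rule that a disable pattern wins over an enable pattern is an assumption, since the RFC does not state how the two flags interact.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper, not part of Clang. A pattern is either an exact check
// name or a prefix followed by a single trailing '*'.
bool matchesPattern(std::string_view Pattern, std::string_view Name) {
  if (!Pattern.empty() && Pattern.back() == '*') {
    std::string_view Prefix = Pattern.substr(0, Pattern.size() - 1);
    return Name.substr(0, Prefix.size()) == Prefix;
  }
  return Pattern == Name;
}

// Models the values collected from -analyzer-tidy-checker (Enabled) and
// -analyzer-disable-tidy-checker (Disabled).
// Assumption: a matching disable pattern takes precedence.
bool isTidyCheckEnabled(std::string_view Name,
                        const std::vector<std::string> &Enabled,
                        const std::vector<std::string> &Disabled) {
  for (const std::string &P : Disabled)
    if (matchesPattern(P, Name))
      return false;
  for (const std::string &P : Enabled)
    if (matchesPattern(P, Name))
      return true;
  return false; // No Clang-Tidy checker is enabled by default.
}
```

Under these assumptions, enabling bugprone-* while disabling bugprone-infinite-loop leaves bugprone-assert-side-effect on and bugprone-infinite-loop off.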

Non-Goals

While this integration aims to provide a unified static analysis experience, the following are out of scope:

This proposal does not seek to merge the underlying analysis engines themselves. CSA and Clang-Tidy will remain architecturally distinct tools with separate codebases.
We are not attempting to create a single unified checker format or API that would require rewriting existing checkers from either tool.
The integration does not aim to replace or deprecate standalone Clang-Tidy usage.
We are not proposing changes to the core analysis algorithms or checker logic of either tool. The focus is purely on integration infrastructure, not on modifying how individual checkers operate.

Overall Architecture

Overview: The Clang-Tidy integration is built on a multiplexed consumer architecture that enables both CSA and Clang-Tidy to analyze the same Abstract Syntax Tree (AST) in a single compiler invocation. This design minimizes overhead by eliminating the duplicate parsing, semantic analysis, and AST construction that would occur when running the tools separately. The figure below provides a high-level overview of the proposed integration.

Overall Data Flow and Control Flow

The integration extends the existing CSA pipeline with new components while preserving the original architecture.

Command-Line Processing and Configuration Loading
The flow begins with command-line parsing in the compiler invocation layer, which already handles CSA flags for checker enablement. The integration will extend this to also parse the newly introduced front end flags. This involves applying the same validation patterns as existing CSA flag handling, checking for empty strings and basic syntax errors before storing raw argument values in two new vectors added to the existing analyzer options class. The parsed options flow into the existing frontend pipeline unchanged: the standard preprocessor, parser, and semantic analysis stages execute exactly as before, producing the same AST representation.

Consumer Creation and Multiplexing
The integration adds a new multiplexer that creates divergence at consumer creation. The AnalysisConsumer, which previously created only a consumer for CSA, will now conditionally create a second consumer if at least one Clang-Tidy checker is enabled. While creating the Clang-Tidy consumer, a newly added helper function creates the configuration for Clang-Tidy. The configuration processor detects the format by string inspection (new logic), wraps simple key=value inputs provided via the -analyzer-tidy-config flag into a YAML CheckOptions structure (new transformation), parses the result with Clang-Tidy’s existing configuration parser, and merges configurations using Clang-Tidy’s existing merge method. The resulting configuration initializes a Clang-Tidy context object (existing Clang-Tidy class used in a new context), which uses existing Clang-Tidy infrastructure to register check factories, instantiate checks, and populate a matcher framework.
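
The format detection and key=value wrapping step might look roughly like the sketch below. normalizeTidyConfig is a hypothetical name, and the detection rule is an assumption: input whose first non-whitespace character is '{' is treated as YAML and passed through, while anything else containing '=' is treated as a single key=value pair.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper illustrating the "detect format, then wrap" step.
// Assumption: YAML input starts with '{'; other input containing '=' is
// one key=value pair that is wrapped into a CheckOptions mapping.
std::string normalizeTidyConfig(const std::string &Raw) {
  std::size_t First = Raw.find_first_not_of(" \t");
  if (First == std::string::npos || Raw[First] == '{')
    return Raw; // Empty or already YAML: pass through unchanged.
  std::size_t Eq = Raw.find('=');
  if (Eq == std::string::npos)
    return Raw; // Not key=value: leave it for the YAML parser to judge.
  std::string Key = Raw.substr(First, Eq - First);
  std::string Value = Raw.substr(Eq + 1);
  return "{CheckOptions: {" + Key + ": " + Value + "}}";
}
```

The output of this step would then be handed to the existing configuration parser, so any validation beyond this simple inspection stays in one place.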

The factory then creates a new multiplexing consumer that wraps both the existing CSA consumer and the new Clang-Tidy consumer, implementing the standard AST consumer interface (existing interface) by forwarding callbacks to both underlying consumers (new multiplexing logic). This multiplexer is the key architectural component enabling unified analysis: it presents a single consumer interface to the frontend while internally routing to two independent analysis engines.
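
A simplified, standalone model of the conditional factory and the forwarding logic is sketched below. The types here (Consumer, NamedConsumer, Multiplexer, createConsumer) are hypothetical stand-ins rather than Clang classes, the Log parameter exists only to make the call order observable, and running the CSA consumer before the Clang-Tidy consumer is an assumption.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for an AST consumer interface; not Clang's actual class.
struct Consumer {
  virtual ~Consumer() = default;
  virtual void handleTranslationUnit(std::vector<std::string> &Log) = 0;
};

// Stand-in for an analysis engine; it only records that it was invoked.
struct NamedConsumer : Consumer {
  std::string Name;
  explicit NamedConsumer(std::string N) : Name(std::move(N)) {}
  void handleTranslationUnit(std::vector<std::string> &Log) override {
    Log.push_back(Name);
  }
};

// Forwards each callback to every wrapped consumer, in order.
struct Multiplexer : Consumer {
  std::vector<std::unique_ptr<Consumer>> Children;
  explicit Multiplexer(std::vector<std::unique_ptr<Consumer>> C)
      : Children(std::move(C)) {}
  void handleTranslationUnit(std::vector<std::string> &Log) override {
    for (auto &Child : Children)
      Child->handleTranslationUnit(Log);
  }
};

// Creates the multiplexer only when at least one Clang-Tidy check is enabled;
// otherwise returns the CSA consumer alone (the original code path).
std::unique_ptr<Consumer> createConsumer(bool AnyTidyCheckEnabled) {
  std::unique_ptr<Consumer> CSA = std::make_unique<NamedConsumer>("csa");
  if (!AnyTidyCheckEnabled)
    return CSA;
  std::vector<std::unique_ptr<Consumer>> Both;
  Both.push_back(std::move(CSA));
  Both.push_back(std::make_unique<NamedConsumer>("clang-tidy"));
  return std::make_unique<Multiplexer>(std::move(Both));
}
```

Because the caller only ever sees the Consumer interface, the frontend does not need to know whether one or two engines sit behind it.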

Analysis Execution
When the frontend invokes the translation unit completion callback on the multiplexer (existing callback mechanism), the multiplexer routes it to both consumers. The CSA consumer executes CSA’s existing analysis pipeline completely unchanged—building control flow graphs, running symbolic execution, generating path diagnostics. Simultaneously (conceptually, though sequentially in implementation), the Clang-tidy consumer executes Clang-Tidy’s existing analysis pipeline—traversing the AST with pattern matchers, invoking check callbacks, generating standard diagnostics. Both analysis engines operate independently on the same AST without information sharing, maintaining clean separation of concerns.

Diagnostic Conversion and Unification
A new diagnostic converter is added to the pipeline, which intercepts Clang-Tidy’s diagnostic objects (existing type) before they reach Clang-Tidy’s normal output. This converter extracts check names using the context’s existing accessor method, constructs bug type strings with new formatting (“Clang-Tidy [check-name]”), applies category mapping using a new lookup table (either mapping to specific CSA categories like “Logic Error” or defaulting to “Clang-Tidy [module]”), converts notes to path events (new transformation), and creates path diagnostic objects (existing CSA type) that are structurally identical to CSA’s output. This conversion ensures that both CSA and Clang-Tidy findings flow through the same reporting infrastructure.
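
The conversion steps above can be sketched with plain structs standing in for the real diagnostic types. Everything named here (TidyWarning, PathReport, moduleOf, convert) is hypothetical. The sketch also assumes that the square brackets in the bug type and category strings are literal, that the lookup table is keyed by full check name, and that the module is the text before the first '-'.

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified stand-ins for a Clang-Tidy diagnostic and a CSA path diagnostic.
struct TidyWarning {
  std::string CheckName;
  std::string Message;
  std::vector<std::string> Notes;
};

struct PathReport {
  std::string BugType;
  std::string Category;
  std::string Message;
  std::vector<std::string> Events;
};

// Assumption: the module is everything before the first '-' in the check name.
std::string moduleOf(const std::string &CheckName) {
  return CheckName.substr(0, CheckName.find('-'));
}

// Builds the bug type, resolves the category through the lookup table with a
// per-module fallback, and carries each note over as a path event.
PathReport convert(const TidyWarning &W,
                   const std::map<std::string, std::string> &CategoryTable) {
  PathReport R;
  R.BugType = "Clang-Tidy [" + W.CheckName + "]";
  auto It = CategoryTable.find(W.CheckName);
  R.Category = It != CategoryTable.end()
                   ? It->second
                   : "Clang-Tidy [" + moduleOf(W.CheckName) + "]";
  R.Message = W.Message;
  R.Events = W.Notes;
  return R;
}
```

For illustration only, a table entry mapping bugprone-infinite-loop to "Logic Error" would give that check a CSA category, while a check with no entry, such as misc-unused-parameters, would fall back to "Clang-Tidy [misc]".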

Output Generation
The converted diagnostics merge with CSA’s path diagnostics in the existing path diagnostic consumer infrastructure. The integration adds a new convenience function that creates multiple output consumers simultaneously: when requested, it instantiates the HTML consumer to generate interactive reports, the plist consumer to serialize to XML, and the text consumer to emit minimal console output, all writing to the same base output path. This combined consumer approach simplifies invocation by allowing users to generate multiple output formats in a single analysis run. Additionally, users can still invoke individual consumers directly: the SARIF consumer (existing, unchanged) generates JSON when explicitly requested. The existing consumers cannot distinguish between CSA-generated and converter-generated path diagnostics, ensuring unified output without modifying any output-generating code. All findings from both tools are placed in the same output files with consistent formatting and structure.

Backward Compatibility
Backward compatibility is maintained by making all new code paths conditional. If no Clang-Tidy checkers are enabled, the factory creates only the existing CSA consumer, the multiplexer is not instantiated, and the system executes the original CSA-only code path with zero overhead. Existing CSA tests run unchanged, existing command-line invocations work identically, and existing output formats remain byte-for-byte compatible when Clang-Tidy integration is not enabled.

Testing Strategy

The integration will be validated through a suite of unit and integration tests that verify correct operation.

Availability tests will confirm that the integration can be invoked through both direct command-line usage with the -analyzer-tidy-checker flags and through scan-build with its -enable-tidy-checker and -disable-tidy-checker options, ensuring that both entry points correctly activate the unified analysis framework.
Diagnostic conversion tests will verify that Clang-Tidy warnings are accurately transformed into CSA’s PathDiagnostic format, preserving all relevant information including checker names, source locations, diagnostic messages, and notes, while confirming that the converted diagnostics appear correctly in all output formats including plist, SARIF, HTML, and text.
Checker enablement tests will validate the wildcard pattern matching for enabling entire checker modules (such as bugprone-*), verify that individual checker enablement works correctly, confirm that the -analyzer-disable-tidy-checker flag properly disables specific checkers or modules, and ensure that -analyzer-disable-all-checks disables both CSA and Clang-Tidy checkers as expected.
Configuration tests will verify that the -analyzer-tidy-config flag correctly parses YAML configuration, that CheckOptions are properly applied to their respective checkers, that multiple configuration flags are merged correctly with later flags overriding earlier ones, that .clang-tidy files are read and merged with command-line configuration according to the documented precedence rules, and that configuration errors produce clear diagnostic messages.
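
The override rule that the configuration tests are meant to check (later flags overriding earlier ones) can be modeled as follows. This is a simplified stand-in that treats each configuration as a flat map of option names to string values; it is not Clang-Tidy's own merge method, and CheckOptions and mergeInOrder are hypothetical names.

```cpp
#include <map>
#include <string>
#include <vector>

// Assumption: each configuration is reduced to a flat option-name -> value map.
using CheckOptions = std::map<std::string, std::string>;

// Merges configurations in command-line order; a later value for the same
// option replaces an earlier one, and options set only once are kept.
CheckOptions mergeInOrder(const std::vector<CheckOptions> &Configs) {
  CheckOptions Merged;
  for (const CheckOptions &C : Configs)
    for (const auto &KV : C)
      Merged[KV.first] = KV.second;
  return Merged;
}
```

A test built on this model would pass two configurations that disagree on one option and assert that the value from the second one survives while unrelated options from the first are preserved.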

Overhead analysis

To better understand the overall benefit of the clang-tidy integration, we gathered the analysis time and peak memory demand for the following three workflows:

Clang static analyzer without integration in default mode
Clang static analyzer with clang-tidy integration and three clang-tidy checkers enabled
Direct clang-tidy invocation with the same three checkers enabled

The table below presents the above mentioned data for two WebKit files: the largest file in the repository (2.8 MB) and an average-sized file (226 KB).

File name	Size	CSA with Clang-Tidy integration (time, peak memory)	CSA alone (time, peak memory)	Clang-Tidy alone (time, peak memory)
ReaderArticleFinderSource.cpp	2.8 MB	5.73 sec, 107.71 MB	4.46 sec, 104.77 MB	5.48 sec, 77.28 MB
FormMetadataJSControllerSource.cpp	226 KB	0.93 sec, 90.49 MB	0.8 sec, 89.35 MB	0.92 sec, 60.13 MB

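Running CSA and Clang-Tidy separately took 9.94 sec (4.46 + 5.48) for the large file and 1.72 sec (0.8 + 0.92) for the average-sized file, compared with 5.73 sec and 0.93 sec with the integration. This indicates that, for most users, the overall analysis time of running CSA and Clang-Tidy individually is expected to be higher than running CSA with the Clang-Tidy integration.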

Limitations

No Cross-Tool Information Sharing
Clang-Tidy checkers do not receive path-sensitive information from CSA. They continue to operate with standard AST-based analysis. The integration provides unified execution but does not enhance Clang-Tidy’s analysis capabilities. Bridging these two architectures (AST matchers vs. symbolic execution) is not the purpose or within scope for this integration.

Fix-It Application Not Integrated
Clang-Tidy fix-its are preserved in diagnostics and SARIF output but are not applied, since the CSA interface currently does not support applying fix-its. Users must run standalone clang-tidy with -fix to apply fixes.

Checker Naming Differences
CSA uses a package structure (core.NullDereference) while Clang-Tidy uses a module-based structure (bugprone-assert-side-effect). The integration will not unify the naming conventions, to ensure minimal disruption to the existing users and integrations of both tools.

No Automatic Deduplication
The integration does not automatically deduplicate findings between CSA and Clang-Tidy. If both tools detect the same issue, both warnings appear in the output.

Conclusion

This integration combines the complementary strengths of CSA and Clang-Tidy in a single, efficient analysis pass, eliminating the overhead and complexity of running separate tools. By unifying these analysis engines under a consistent interface with intelligent defaults and flexible configuration, we believe this integration will significantly improve the static analysis experience for users. We welcome community feedback on this proposal.