Getting Involved — Extra Clang Tools 22.0.0git documentation

clang-tidy has several checks of its own and can run Clang static analyzer checks, but its real power lies in how easily you can write custom checks.

Checks are organized in modules, which can be linked into clang-tidy with minimal or no code changes in clang-tidy.

Checks can plug into the analysis on the preprocessor level using PPCallbacks or on the AST level using AST Matchers. When a check finds a problem, it can report it in a way similar to how Clang diagnostics work. A fix-it hint can be attached to a diagnostic message.

The interface provided by clang-tidy makes it easy to write useful and precise checks in just a few lines of code. If you have an idea for a good check, the rest of this document explains how to do this.

There are a few tools particularly useful when developing clang-tidy checks:

If CMake is configured with CLANG_TIDY_ENABLE_STATIC_ANALYZER=NO, clang-tidy will not be built with support for the clang-analyzer-* checks or the mpi-* checks.

If CMake is configured with CLANG_TIDY_ENABLE_QUERY_BASED_CUSTOM_CHECKS=NO, clang-tidy will not be built with support for query-based checks.

Choosing the Right Place for your Check

If you have an idea for a check, you should decide whether it should be implemented as a:

Preparing your Workspace

If you are new to LLVM development, you should read the Getting Started with the LLVM System, Using Clang Tools and How To Setup Clang Tooling For LLVM documents to check out and build LLVM, Clang and Clang Extra Tools with CMake.

Once you are done, change to the llvm/clang-tools-extra directory, and let’s start!

When you configure the CMake build, make sure that you enable the clang and clang-tools-extra projects to build clang-tidy. Because your new check will have associated documentation, you will also want to install Sphinx and enable it in the CMake configuration. To save build time of the core Clang libraries, you may want to only enable the X86 target in the CMake configuration.

The Directory Structure

clang-tidy source code resides in the llvm/clang-tools-extra directory and is structured as follows:

clang-tidy/                       # Clang-tidy core.
|-- ClangTidy.h                   # Interfaces for users.
|-- ClangTidyCheck.h              # Interfaces for checks.
|-- ClangTidyModule.h             # Interface for clang-tidy modules.
|-- ClangTidyModuleRegistry.h     # Interface for registering of modules.
   ...
|-- google/                       # Google clang-tidy module.
|-+
  |-- GoogleTidyModule.cpp
  |-- GoogleTidyModule.h
        ...
|-- llvm/                         # LLVM clang-tidy module.
|-+
  |-- LLVMTidyModule.cpp
  |-- LLVMTidyModule.h
        ...
|-- objc/                         # Objective-C clang-tidy module.
|-+
  |-- ObjCTidyModule.cpp
  |-- ObjCTidyModule.h
        ...
|-- tool/                         # Sources of the clang-tidy binary.
        ...
test/clang-tidy/                  # Integration tests.
    ...
unittests/clang-tidy/             # Unit tests.
|-- ClangTidyTest.h
|-- GoogleModuleTest.cpp
|-- LLVMModuleTest.cpp
|-- ObjCModuleTest.cpp
    ...

Writing a clang-tidy Check

So you have an idea of a useful check for clang-tidy.

First, if you’re not familiar with LLVM development, read through the Getting Started with the LLVM System document for instructions on setting up your workflow and the LLVM Coding Standards document to familiarize yourself with the coding style used in the project. For code reviews, we currently use LLVM's GitHub, though historically we used Phabricator.

Next, you need to decide which module the check belongs to. Modules are located in subdirectories of clang-tidy/ and contain checks targeting a certain aspect of code quality (performance, readability, etc.), a certain coding style or standard (Google, LLVM, CERT, etc.) or a widely used API (e.g. MPI). Their names are the same as the user-facing check group names described above.

After choosing the module and the name for the check, run the clang-tidy/add_new_check.py script to create the skeleton of the check and plug it into clang-tidy. This is the recommended way of adding new checks.

By default, the new check will apply only to C++ code. If it should apply under different language options, use the script’s --language parameter.

If we want to create a readability-awesome-function-names check, we would run:

$ clang-tidy/add_new_check.py readability awesome-function-names

The add_new_check.py script will:

Let’s look at the check class definition in more detail:

...

#include "../ClangTidyCheck.h"

namespace clang::tidy::readability {

...
class AwesomeFunctionNamesCheck : public ClangTidyCheck {
public:
  AwesomeFunctionNamesCheck(StringRef Name, ClangTidyContext *Context)
      : ClangTidyCheck(Name, Context) {}
  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
  bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
    return LangOpts.CPlusPlus;
  }
};

} // namespace clang::tidy::readability

...

The constructor of the check receives the Name and Context parameters, and must forward them to the ClangTidyCheck constructor.

In our case the check needs to operate on the AST level and it overrides the registerMatchers and check methods. If we wanted to analyze code on the preprocessor level, we’d need instead to override the registerPPCallbacks method.

In the registerMatchers method, we create an AST Matcher (see AST Matchers for more information) that will find the pattern in the AST that we want to inspect. The results of the matching are passed to the check method, which can further inspect them and report diagnostics.

using namespace ast_matchers;

void AwesomeFunctionNamesCheck::registerMatchers(MatchFinder *Finder) {
  // Match every function declaration and bind it to the name "x".
  Finder->addMatcher(functionDecl().bind("x"), this);
}

void AwesomeFunctionNamesCheck::check(const MatchFinder::MatchResult &Result) {
  // Retrieve the node bound in registerMatchers.
  const auto *MatchedDecl = Result.Nodes.getNodeAs<FunctionDecl>("x");
  // Skip functions without a plain identifier and those already prefixed.
  if (!MatchedDecl->getIdentifier() ||
      MatchedDecl->getName().starts_with("awesome_"))
    return;
  diag(MatchedDecl->getLocation(), "function %0 is insufficiently awesome")
      << MatchedDecl
      << FixItHint::CreateInsertion(MatchedDecl->getLocation(), "awesome_");
}

(If you want to see an example of a useful check, look at clang-tidy/google/ExplicitConstructorCheck.h and clang-tidy/google/ExplicitConstructorCheck.cpp).
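The decision made inside the check method does not depend on Clang itself, so it can be tried out in isolation. The following is a minimal standalone sketch, assuming plain strings in place of FunctionDecl and FixItHint. The names needsAwesomePrefix and applyPrefixFix are hypothetical and are not part of clang-tidy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the condition in check(): a function name needs a
// diagnostic when it is non-empty and does not already start with "awesome_".
bool needsAwesomePrefix(std::string_view Name) {
  constexpr std::string_view Prefix = "awesome_";
  if (Name.empty()) // Models the !MatchedDecl->getIdentifier() early exit.
    return false;
  return Name.substr(0, Prefix.size()) != Prefix;
}

// Hypothetical stand-in for FixItHint::CreateInsertion: insert the prefix at
// the offset where the function name begins in the source text.
std::string applyPrefixFix(std::string Source, std::size_t NameOffset) {
  assert(NameOffset <= Source.size() && "offset must lie within the source");
  Source.insert(NameOffset, "awesome_");
  return Source;
}
```

For example, applyPrefixFix("void f();", 5) yields "void awesome_f();", which is the edit the fix-it hint in the real check would describe for a declaration of f.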

If you need to interact with macros or preprocessor directives, you will want to override the method registerPPCallbacks. The add_new_check.py script does not generate an override for this method in the starting point for your new check.

Check development tips

Writing your first check can be a daunting task, particularly if you are unfamiliar with the LLVM and Clang code bases. Here are some suggestions for orienting yourself in the codebase and working on your check incrementally.

Guide to useful documentation

Many of the support classes created for LLVM are used by Clang, such as StringRef and SmallVector. These and other commonly used classes are described in the Important and useful LLVM APIs and Picking the Right Data Structure for the Task sections of the LLVM Programmer’s Manual. You don’t need to memorize all the details of these classes; the generated doxygen documentation has everything if you need it. In the header llvm/ADT/STLExtras.h you’ll find useful versions of the STL algorithms that operate on LLVM containers, such as llvm::all_of.

Clang is implemented on top of LLVM and introduces its own set of classes that you will interact with while writing your check. When a check issues diagnostics and fix-its, these are associated with locations in the source code. Source code locations, source files, ranges of source locations and the SourceManager class provide the mechanisms for describing such locations. These and other topics are described in the “Clang” CFE Internals Manual. Whereas the doxygen generated documentation serves as a reference to the internals of Clang, that manual serves as a guide for developers. Topics in that manual of interest to a check developer are:

Most checks will interact with C++ source code via the AST. Some checks will interact with the preprocessor. The input source file is lexed and preprocessed and then parsed into the AST. Once the AST is fully constructed, the check is run by applying the check’s registered AST matchers against the AST and invoking the check with the set of matched nodes from the AST. Monitoring the actions of the preprocessor is detached from the AST construction, but a check can collect information during preprocessing for later use when nodes are matched in the AST.

Every syntactic (and sometimes semantic) element of the C++ source code is represented by different classes in the AST. You select the portions of the AST you’re interested in by composing AST matcher functions. You will want to study carefully the AST Matcher Reference to understand the relationship between the different matcher functions.

Using the Transformer library

The Transformer library allows you to write a check that transforms source code by expressing the transformation as a RewriteRule. The Transformer library provides functions for composing edits to source code to create rewrite rules. Unless you need to perform low-level source location manipulation, you may want to consider writing your check with the Transformer library. The Clang Transformer Tutorial describes the Transformer library in detail.

To use the Transformer library, make the following changes to the code generated by the add_new_check.py script:

Developing your check incrementally

The best way to develop your check is to start with simple test cases and increase complexity incrementally. The test file created by the add_new_check.py script is a starting point for your test cases. A rough outline of the process looks like this:

The quickest way to prototype your matcher is to use clang-query to interactively build up your matcher. For complicated matchers, build up a matching expression incrementally and use clang-query’s let command to save named matching expressions to simplify your matcher.

clang-query> let c1 cxxRecordDecl()
clang-query> match c1

Alternatively, pressing the tab key after a previous matcher’s opening parenthesis also shows which matchers can be chained with the previous matcher, though some matchers that work may not be listed. Note that tab completion does not currently work on Windows.

Just like breaking up a huge function into smaller chunks with intention-revealing names can help you understand a complex algorithm, breaking up a matcher into smaller matchers with intention-revealing names can help you understand a complicated matcher.

Once you have a working clang-query matcher, the C++ API matchers will be the same or similar to your interactively constructed matcher (there can be cases where they differ slightly). You can use local variables to preserve your intention-revealing names that you applied to nested matchers.

Creating private matchers

Sometimes you want to match a specific aspect of the AST that isn’t provided by the existing AST matchers. You can create your own private matcher using the same infrastructure as the public matchers. A private matcher can simplify the processing in your check method by eliminating complex hand-crafted AST traversal of the matched nodes. Using the private matcher allows you to select the desired portions of the AST directly in the matcher and refer to it by a bound name in the check method.

Unit testing helper code

Private custom matchers are a good example of auxiliary support code for your check that can be tested with a unit test. It will be easier to test your matchers or other support classes by writing a unit test than by writing a FileCheck integration test. The ASTMatchersTests target contains unit tests for the public AST matcher classes and is a good source of testing idioms for matchers.

You can build the Clang-tidy unit tests by building the ClangTidyTests target. Test targets in LLVM and Clang are excluded from the “build all” style action of IDE-based CMake generators, so you need to explicitly build the target for the unit tests to be built.

Making your check robust

Once you’ve covered your check with the basic “happy path” scenarios, you’ll want to torture your check with as many edge cases as you can cover in order to ensure your check is robust. Running your check on a large code base, such as Clang/LLVM, is a good way to catch things you forgot to account for in your matchers. However, the LLVM code base may be insufficient for testing purposes as it was developed against a particular set of coding styles and quality measures. The larger the corpus of code the check is tested against, the higher confidence the community will have in the check’s efficacy and false-positive rate.

Some suggestions to ensure your check is robust:

Documenting your check

The add_new_check.py script creates entries in the release notes, the list of checks and a new file for the check documentation itself. It is recommended that you have a concise summary of what your check does in a single sentence that is repeated in the release notes, as the first sentence in the doxygen comments in the header file for your check class and as the first sentence of the check documentation. Avoid the phrase “this check” in your check summary and check documentation.

If your check relates to a published coding guideline (C++ Core Guidelines, SEI CERT, etc.) or style guide, provide links to the relevant guideline or style guide sections in your check documentation.

Provide enough examples of the diagnostics and fix-its provided by the check so that a user can easily understand what will happen to their code when the check is run. If there are exceptions or limitations to your check, document them thoroughly. This will help users understand the scope of the diagnostics and fix-its provided by the check.

Building the target docs-clang-tools-html will run the Sphinx documentation generator and create HTML documentation files in the tools/clang/tools/extra/docs/html directory in your build tree. Make sure that your check is correctly shown in the release notes and the list of checks. Make sure that the formatting and structure of your check’s documentation look correct.

Registering your Check

(The add_new_check.py script takes care of registering the check in an existing module. If you want to create a new module or know the details, read on.)

The check should be registered in the corresponding module with a distinct name:

class MyModule : public ClangTidyModule {
 public:
  void addCheckFactories(ClangTidyCheckFactories &CheckFactories) override {
    CheckFactories.registerCheck<ExplicitConstructorCheck>(
        "my-explicit-constructor");
  }
};

Now we need to register the module in the ClangTidyModuleRegistry using a statically initialized variable:

static ClangTidyModuleRegistry::Add<MyModule> X("my-module",
                                                "Adds my lint checks.");

When using the LLVM build system, we need to use the following hack to ensure the module is linked into the clang-tidy binary:

Add this near the ClangTidyModuleRegistry::Add<MyModule> variable:

// This anchor is used to force the linker to link in the generated object file
// and thus register the MyModule.
volatile int MyModuleAnchorSource = 0;

And this to the main translation unit of the clang-tidy binary (or the binary you link the clang-tidy library in) clang-tidy/ClangTidyForceLinker.h:

// This anchor is used to force the linker to link the MyModule.
extern volatile int MyModuleAnchorSource;
static int MyModuleAnchorDestination = MyModuleAnchorSource;
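To see why a statically initialized variable is enough to register a module, and why it matters that the linker keeps the object file, it can help to look at a stripped-down model of the pattern. The sketch below is an assumption-laden illustration: Module, ModuleRegistry and instantiate are hypothetical names, and the real ClangTidyModuleRegistry differs in detail.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Simplified, hypothetical model of a module registry. This is not the
// clang-tidy implementation; it only illustrates the registration pattern.
struct Module {
  virtual ~Module() = default;
  virtual std::string describe() const = 0;
};

class ModuleRegistry {
public:
  using Factory = std::function<std::unique_ptr<Module>()>;

  // A function-local static avoids depending on the initialization order of
  // globals in different translation units.
  static std::map<std::string, Factory> &entries() {
    static std::map<std::string, Factory> Entries;
    return Entries;
  }

  // Constructing an Add<T> object registers a factory for T under Name.
  template <typename T> struct Add {
    explicit Add(const std::string &Name) {
      entries()[Name] = [] { return std::make_unique<T>(); };
    }
  };
};

struct MyModule : Module {
  std::string describe() const override { return "Adds my lint checks."; }
};

// The constructor runs before main(), but only if the linker keeps the object
// file that contains this variable. That is what the anchor variables ensure.
static ModuleRegistry::Add<MyModule> X("my-module");

// Looks up a module by name; fails loudly if it was never registered.
std::unique_ptr<Module> instantiate(const std::string &Name) {
  auto It = ModuleRegistry::entries().find(Name);
  assert(It != ModuleRegistry::entries().end() && "module was never registered");
  return It->second();
}
```

In this model, calling instantiate("my-module") succeeds only because the static variable X was initialized. If the object file were dropped at link time, the lookup would fail, which is the situation the anchor hack guards against.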

Configuring Checks

If a check needs configuration options, it can access check-specific options using the Options.get<Type>("SomeOption", DefaultValue) call in the check constructor. In this case, the check should also override the ClangTidyCheck::storeOptions method to make the options provided by the check discoverable. This method lets clang-tidy know which options the check implements and what the current values are (e.g. for the -dump-config command-line option).

class MyCheck : public ClangTidyCheck {
  const unsigned SomeOption1;
  const std::string SomeOption2;

public:
  MyCheck(StringRef Name, ClangTidyContext *Context)
    : ClangTidyCheck(Name, Context),
      SomeOption1(Options.get("SomeOption1", -1U)),
      SomeOption2(Options.get("SomeOption2", "some default")) {}

  void storeOptions(ClangTidyOptions::OptionMap &Opts) override {
    Options.store(Opts, "SomeOption1", SomeOption1);
    Options.store(Opts, "SomeOption2", SomeOption2);
  }
  ...

Assuming the check is registered with the name “my-check”, the option can then be set in a .clang-tidy file in the following way:

CheckOptions:
  my-check.SomeOption1: 123
  my-check.SomeOption2: 'some other value'
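The interplay between reading an option with a default and storing it back can be modelled without any Clang headers. The sketch below assumes the configuration is a flat map keyed by "check-name.OptionName", mirroring the configuration keys shown above. OptionMap and OptionsView are hypothetical names and do not reflect the real clang-tidy classes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified model of check options: a flat map from
// "check-name.OptionName" to a string value.
using OptionMap = std::map<std::string, std::string>;

class OptionsView {
public:
  OptionsView(std::string CheckName, const OptionMap &Config)
      : Prefix(std::move(CheckName) + "."), Config(Config) {
    assert(Prefix.size() > 1 && "check name must not be empty");
  }

  // Returns the configured value, or Default when the option is absent.
  std::string get(const std::string &Name, const std::string &Default) const {
    auto It = Config.find(Prefix + Name);
    return It == Config.end() ? Default : It->second;
  }

  // Same lookup for unsigned options; the stored text is parsed as a number.
  unsigned get(const std::string &Name, unsigned Default) const {
    auto It = Config.find(Prefix + Name);
    return It == Config.end() ? Default
                              : static_cast<unsigned>(std::stoul(It->second));
  }

  // Writes the effective value back so that a configuration dump lists every
  // option the check understands, including those left at their default.
  void store(OptionMap &Out, const std::string &Name,
             const std::string &Value) const {
    Out[Prefix + Name] = Value;
  }

private:
  std::string Prefix;
  const OptionMap &Config; // The caller must keep the map alive.
};
```

With a configuration that sets only my-check.SomeOption1 to 123, get("SomeOption1", -1U) returns 123 while get("SomeOption2", "some default") falls back to the default. Storing both values is what would make the defaulted option visible in a dump.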

If you need to specify check options on a command line, you can use the inline YAML format:

$ clang-tidy -config="{CheckOptions: {a: b, x: y}}" ...

Testing Checks

To run tests for clang-tidy, build the check-clang-tools target. For instance, if you configured your CMake build with the ninja project generator, use the command:

$ ninja check-clang-tools

clang-tidy checks can be tested using either unit tests or lit tests. Unit tests may be more convenient to test complex replacements with strict checks. Lit tests allow using partial text matching and regular expressions which makes them more suitable for writing compact tests for diagnostic messages.

The check_clang_tidy.py script provides an easy way to test both diagnostic messages and fix-its. It filters out CHECK lines from the test file, runs clang-tidy and verifies messages and fixes with two separate FileCheck invocations: once with FileCheck’s directive prefix set to CHECK-MESSAGES, validating the diagnostic messages, and once with the directive prefix set to CHECK-FIXES, running against the fixed code (i.e., the code after generated fix-its are applied). In particular, CHECK-FIXES: can be used to check that code was not modified by fix-its, by checking that it is present unchanged in the fixed code. The full set of FileCheck directives is available (e.g., CHECK-MESSAGES-SAME:, CHECK-MESSAGES-NOT:), though typically the basic CHECK forms (CHECK-MESSAGES and CHECK-FIXES) are sufficient for clang-tidy tests. Note that the FileCheck documentation mostly assumes the default prefix (CHECK), and hence describes the directive as CHECK:, CHECK-SAME:, CHECK-NOT:, etc. Replace CHECK with either CHECK-FIXES or CHECK-MESSAGES for clang-tidy tests.

An additional check enabled by check_clang_tidy.py ensures that if CHECK-MESSAGES: is used in a file then every warning or error must have an associated CHECK in that file. Alternatively, you can use CHECK-NOTES: instead if you also want to ensure that all the notes are checked.

To use the check_clang_tidy.py script, put a .cpp file with the appropriate RUN line in the test/clang-tidy directory. Use CHECK-MESSAGES: and CHECK-FIXES: lines to write checks against diagnostic messages and fixed code.

It’s advised to make the checks as specific as possible to avoid checks matching incorrect parts of the input. Use [[@LINE+X]]/[[@LINE-X]] substitutions and distinct function and variable names in the test code.
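To make the effect of the [[@LINE+X]]/[[@LINE-X]] substitutions concrete, here is a small model of how a directive is resolved relative to the line it appears on. It is only a sketch of the idea: expandLineRefs is a hypothetical helper and this is not how FileCheck is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical model of the [[@LINE]], [[@LINE+X]] and [[@LINE-X]]
// substitutions: each occurrence is replaced by a line number computed
// relative to the line that holds the directive.
std::string expandLineRefs(const std::string &Directive, int DirectiveLine) {
  assert(DirectiveLine >= 1 && "line numbers are 1-based");
  static const std::regex LineRef(R"(\[\[@LINE([+-]\d+)?\]\])");
  std::string Out;
  std::size_t Last = 0;
  auto End = std::sregex_iterator();
  for (auto It = std::sregex_iterator(Directive.begin(), Directive.end(),
                                      LineRef);
       It != End; ++It) {
    const std::smatch &M = *It;
    const auto Pos = static_cast<std::size_t>(M.position(0));
    // Copy the text between the previous match and this one unchanged.
    Out.append(Directive, Last, Pos - Last);
    // A missing offset means the directive's own line.
    const int Offset = M[1].matched ? std::stoi(M[1].str()) : 0;
    Out += std::to_string(DirectiveLine + Offset);
    Last = Pos + static_cast<std::size_t>(M.length(0));
  }
  Out.append(Directive, Last, std::string::npos);
  return Out;
}
```

A directive containing :[[@LINE-1]]:11: on line 5 of a test file therefore refers to line 4, column 11. Because the reference is relative, inserting new test cases earlier in the file does not invalidate it.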

Here’s an example of a test using the check_clang_tidy.py script (the full source code is at test/clang-tidy/checkers/google/readability-casting.cpp):

// RUN: %check_clang_tidy %s google-readability-casting %t

void f(int a) {
  int b = (int)a;
  // CHECK-MESSAGES: :[[@LINE-1]]:11: warning: redundant cast to the same type [google-readability-casting]
  // CHECK-FIXES: int b = a;
}

To check more than one scenario in the same test file, use -check-suffix=SUFFIX-NAME on the check_clang_tidy.py command line or -check-suffixes=SUFFIX-NAME-1,SUFFIX-NAME-2,.... With -check-suffix[es]=SUFFIX-NAME you need to replace your CHECK-* directives with CHECK-MESSAGES-SUFFIX-NAME and CHECK-FIXES-SUFFIX-NAME.

Here’s an example:

// RUN: %check_clang_tidy -check-suffix=USING-A %s misc-unused-using-decls %t -- -- -DUSING_A
// RUN: %check_clang_tidy -check-suffix=USING-B %s misc-unused-using-decls %t -- -- -DUSING_B
// RUN: %check_clang_tidy %s misc-unused-using-decls %t
...
// CHECK-MESSAGES-USING-A: :[[@LINE-8]]:10: warning: using decl 'A' {{.*}}
// CHECK-MESSAGES-USING-B: :[[@LINE-7]]:10: warning: using decl 'B' {{.*}}
// CHECK-MESSAGES: :[[@LINE-6]]:10: warning: using decl 'C' {{.*}}
// CHECK-FIXES-USING-A-NOT: using a::A;$
// CHECK-FIXES-USING-B-NOT: using a::B;$
// CHECK-FIXES-NOT: using a::C;$
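The renaming rule for suffixed directives is mechanical and can be summarized in a few lines. The following sketch assumes only the rule stated above (the suffix is appended to both directive prefixes); DirectivePrefixes and prefixesFor are hypothetical names, not functions of check_clang_tidy.py.

```cpp
#include <cassert>
#include <string>

// Hypothetical pair of directive prefixes used for one test scenario.
struct DirectivePrefixes {
  std::string Messages;
  std::string Fixes;
};

// Maps a -check-suffix value to the directive prefixes a test must use.
// An empty suffix models a RUN line without -check-suffix.
DirectivePrefixes prefixesFor(const std::string &Suffix) {
  if (Suffix.empty())
    return {"CHECK-MESSAGES", "CHECK-FIXES"};
  return {"CHECK-MESSAGES-" + Suffix, "CHECK-FIXES-" + Suffix};
}
```

For the RUN line with -check-suffix=USING-A, this gives CHECK-MESSAGES-USING-A and CHECK-FIXES-USING-A, matching the directives used in the example.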

There are many dark corners in the C++ language, and it may be difficult to make your check work perfectly in all cases, especially if it issues fix-it hints. The most frequent pitfalls are macros and templates:

  1. Code written in a macro body/template definition may have a different meaning depending on the macro expansion/template instantiation.
  2. Multiple macro expansions/template instantiations may result in the same code being inspected by the check multiple times (possibly, with different meanings, see 1), and the same warning (or a slightly different one) may be issued by the check multiple times; clang-tidy will deduplicate _identical_ warnings, but if the warnings are slightly different, all of them will be shown to the user (and used for applying fixes, if any).
  3. Making replacements to a macro body/template definition may be fine for some macro expansions/template instantiations, but easily break some other expansions/instantiations.
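The first and third pitfalls can be reproduced in a few lines of ordinary C++. In the sketch below, Color, convert and AS_INT are made-up names chosen for illustration. The cast is written once but means something different in each instantiation or expansion, so a fix-it that removes it as redundant would be correct in one case and wrong in another.

```cpp
#include <cassert>

enum class Color { Red, Green };

// The cast is written once in the template definition. It is redundant when
// T is int, but required when T is Color, because a scoped enumeration has no
// implicit conversion from int.
template <typename T> T convert(int Value) {
  return (T)Value;
}

// The same applies to a macro body: the cast is a no-op for an int argument
// and a truncating conversion for a double argument.
#define AS_INT(X) ((int)(X))
```

Removing the cast from convert would keep convert<int> working but break convert<Color> at compile time. Removing it from AS_INT would silently change the result of AS_INT(3.9).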

If you need multiple files to exercise all the aspects of your check, it is recommended you place them in a subdirectory named for the check under the Inputs directory for the module containing your check. This keeps the test directory from getting cluttered.

If you need to validate how your check interacts with system header files, a set of simulated system header files is located in the checkers/Inputs/Headers directory. The path to this directory is available in a lit test with the variable %clang_tidy_headers.

Submitting a Pull Request

Before submitting a pull request, contributors are encouraged to run clang-tidy and clang-format on their changes to ensure code quality and catch potential issues. While clang-tidy is not currently enforced in CI, following this practice helps maintain code consistency and prevent common errors.

Here’s a useful command to check your staged changes:

$ git diff --staged -U0 | ./clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py \
  -j $(nproc) -path build/ -p1 -only-check-in-db
$ git clang-format

Note that some warnings may be false positives or require careful consideration before fixing. Use your judgment and feel free to discuss in the pull request if you’re unsure about a particular warning.

Out-of-tree check plugins

Developing an out-of-tree check as a plugin largely follows the steps outlined above, including creating a new module and doing the hacks to register the module. The plugin is a shared library whose code lives outside the clang-tidy build system. Build and link this shared library against LLVM as done for other kinds of Clang plugins. If using CMake, use the keyword MODULE while invoking add_library or llvm_add_library.

The plugin can be loaded by passing -load to clang-tidy in addition to the names of the checks to enable.

$ clang-tidy --checks=-*,my-explicit-constructor -list-checks -load myplugin.so

There are no expectations regarding ABI and API stability, so the plugin must be compiled against the version of clang-tidy that will be loading the plugin.

The plugins can use threads, TLS, or any other facilities available to in-tree code which is accessible from the external headers.

Note that testing out-of-tree checks might involve getting llvm-lit from an LLVM installation compiled from source. See Getting Started with the LLVM System for ways to do so.

Alternatively, get lit following the test-suite guide and get the FileCheck binary, and write a version of check_clang_tidy.py to suit your needs.

Running clang-tidy on LLVM

To test a check, it’s best to try it out on a larger code base. LLVM and Clang are the natural targets as you already have the source code around. The most convenient way to run clang-tidy is with a compile command database. CMake can generate one automatically; for a description of how to enable it, see How To Setup Clang Tooling For LLVM. Once compile_commands.json is in place and a working version of clang-tidy is in PATH, the entire code base can be analyzed with clang-tidy/tool/run-clang-tidy.py. The script executes clang-tidy with the default set of checks on every translation unit in the compile command database and displays the resulting warnings and errors. The script provides multiple configuration flags.

On checks profiling

clang-tidy can collect per-check profiling info, and output it for each processed source file (translation unit).

To enable profiling info collection, use the -enable-check-profile argument. The timings will be output to stderr as a table. Example output:

$ clang-tidy -enable-check-profile -checks=-*,readability-function-size source.cpp
===-------------------------------------------------------------------------===
                          clang-tidy checks profiling
===-------------------------------------------------------------------------===
  Total Execution Time: 1.0282 seconds (1.0258 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.9136 (100.0%)   0.1146 (100.0%)   1.0282 (100.0%)   1.0258 (100.0%)  readability-function-size
   0.9136 (100.0%)   0.1146 (100.0%)   1.0282 (100.0%)   1.0258 (100.0%)  Total

It can also store that data as JSON files for further processing. Example output:

$ clang-tidy -enable-check-profile -store-check-profile=. -checks=-*,readability-function-size source.cpp
$ # Note that there won't be timings table printed to the console.
$ ls /tmp/out/
20180516161318717446360-source.cpp.json
$ cat 20180516161318717446360-source.cpp.json
{
"file": "/path/to/source.cpp",
"timestamp": "2018-05-16 16:13:18.717446360",
"profile": {
  "time.clang-tidy.readability-function-size.wall": 1.0421266555786133e+00,
  "time.clang-tidy.readability-function-size.user": 9.2088400000005421e-01,
  "time.clang-tidy.readability-function-size.sys": 1.2418899999999974e-01
}
}

There is only one argument that controls profile storage: