[RFC] Modern C++ Alternative to TableGen for MLIR Operation Definition

Hello MLIR Community,

I’d like to propose an approach to defining MLIR operations that is based on CRTP (Curiously Recurring Template Pattern) and trait binding, as a potential alternative or supplement to the current TableGen-based system. It relies on selective method overriding and declarative trait binding.

Complete Proposal: GitHub Repository
Working Demo: Compilable C++ code provided in repository
Full RFC: Detailed Technical Document

Core Question

Why generate C++ when we can write better C++ directly?

TableGen generates ~200 lines of C++ from ~10 lines of DSL, but the MLIR ecosystem is predominantly C++-centric:

“Python bindings” are actually pybind11 C++ code
Core infrastructure is C++
Developers debug generated code instead of their actual logic

Proposed Solution

Selective Override Pattern

template<typename Derived>
class Op {
public:
    auto getInput() { return derived()->default_getInput(); }
    LogicalResult verify() { return derived()->default_verify(); }
    
    // Default implementations - users can selectively override
    auto default_getInput() { return getOperand(0); }
    LogicalResult default_verify() { return success(); }
    
private:
    Derived* derived() { return static_cast<Derived*>(this); }
};

// Users only override what they need
class IdentityOp : public Op<IdentityOp> {
public:  // must be public: Op<IdentityOp>::verify() calls derived()->default_verify()
    LogicalResult default_verify() {
        return getInput().getType() == getOutput().getType() ? 
               success() : failure();
    }
    // Everything else uses defaults
};
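The selective-override idea can be tried in isolation with a minimal sketch like the one below. It assumes nothing from MLIR: `LogicalResult`, `FakeOperation`, `OpBase`, and `NopOp` are hypothetical stand-ins introduced only so the snippet compiles on its own, and the op is modeled as a handle to external state rather than a real `mlir::Operation`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins, NOT the real MLIR types.
struct LogicalResult { bool ok; };
inline LogicalResult success() { return {true}; }
inline LogicalResult failure() { return {false}; }

// Hypothetical storage that the op handle points at.
struct FakeOperation { int operandType; int resultType; };

template <typename Derived>
class OpBase {
public:
    explicit OpBase(FakeOperation* state) : state(state) { assert(state != nullptr); }

    // Entry points always dispatch through the derived class.
    LogicalResult verify() { return derived()->default_verify(); }
    std::string name() { return derived()->default_name(); }

    // Defaults; a derived class shadows only the ones it cares about.
    LogicalResult default_verify() { return success(); }
    std::string default_name() { return "unnamed"; }

protected:
    FakeOperation* state;

private:
    Derived* derived() { return static_cast<Derived*>(this); }
};

// Shadows default_verify only; default_name still comes from OpBase.
class IdentityOp : public OpBase<IdentityOp> {
public:
    using OpBase<IdentityOp>::OpBase;
    LogicalResult default_verify() {
        return state->operandType == state->resultType ? success() : failure();
    }
};

// Overrides nothing, so every call falls back to the defaults.
class NopOp : public OpBase<NopOp> {
public:
    using OpBase<NopOp>::OpBase;
};
```

Note that the shadowing method has to be accessible from the base class (here it is `public`), and that the "override" is name hiding resolved at compile time, not a virtual call.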

Declarative Trait Binding

// Framework provides defaults
template<typename T>
struct trait_binding : Type2Type<DefaultTrait<T>> {};

// Users specialize for custom behavior
template<> struct trait_binding<AddOp> : Type2Type<ArithmeticTrait<AddOp>> {};
template<> struct trait_binding<LoadOp> : Type2Type<MemoryTrait<LoadOp>> {};

// Operations automatically get corresponding traits
class AddOp : public Op<AddOp> {
    // Automatically inherits ArithmeticTrait capabilities
};
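For readers who want to compile the trait-binding snippet, here is one possible self-contained reading of it. `Type2Type`, `DefaultTrait`, `ArithmeticTrait`, `NopOp`, and `isCommutative` are assumptions made for illustration; the post does not define them. The sketch also assumes that `Op` derives from `trait_binding<Derived>::type`, which is one plausible way for an op to "automatically" pick up its trait.

```cpp
#include <type_traits>

// Hypothetical helper: wraps a type so it can be read back as ::type.
template <typename T>
struct Type2Type { using type = T; };

// Hypothetical traits, each exposing one compile-time property.
template <typename Op>
struct DefaultTrait {
    static constexpr bool isCommutative() { return false; }
};
template <typename Op>
struct ArithmeticTrait {
    static constexpr bool isCommutative() { return true; }
};

// Primary template: every op gets DefaultTrait unless told otherwise.
template <typename Op>
struct trait_binding : Type2Type<DefaultTrait<Op>> {};

// The op must at least be declared before it can be named in a specialization.
class AddOp;
template <>
struct trait_binding<AddOp> : Type2Type<ArithmeticTrait<AddOp>> {};

// Assumed wiring: the CRTP base inherits from whatever trait is bound.
template <typename Derived>
class Op : public trait_binding<Derived>::type {};

class AddOp : public Op<AddOp> {};   // picks up ArithmeticTrait
class NopOp : public Op<NopOp> {};   // falls back to DefaultTrait

static_assert(std::is_base_of<ArithmeticTrait<AddOp>, AddOp>::value,
              "AddOp is bound to ArithmeticTrait");
static_assert(std::is_base_of<DefaultTrait<NopOp>, NopOp>::value,
              "NopOp uses the default binding");
```

One constraint worth keeping in mind: an explicit specialization has to be declared before the first use that would implicitly instantiate `trait_binding<AddOp>`, so the specialization must be visible wherever `AddOp` is defined.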

Key Advantages

Aspect	TableGen	CRTP Approach
Learning Curve	New DSL syntax	Standard C++
Customization	Fixed extension points	Any function can be overridden
Code Generation	200+ lines per operation	0 lines generated
IDE Support	Limited	Full C++ toolchain
Debugging	Generated code	Your actual code
Template Support	Basic	Full C++ templates
Performance	Zero overhead	Zero overhead

Why This Matters

Zero Learning Curve: Every MLIR developer already knows C++
Complete Flexibility: Override any method, not just predefined extension points
Framework Extension: Code non-invasive, framework functionality invasive
Better Developer Experience: Full IDE support, direct debugging, standard refactoring
Modern C++ Features: Template specialization, constexpr, concepts, etc.
Gradual Migration: Can coexist with TableGen during transition

Template Specialization Elegance

Instead of if constexpr chains, we use elegant type-based dispatch:

// Clean trait design
template<typename ElementType>
struct TypedConstantTraits {
    static const char* getOpName() { return "const_unknown"; }
};

template<> struct TypedConstantTraits<int> {
    static const char* getOpName() { return "const_int"; }
};

template<> struct TypedConstantTraits<float> {
    static const char* getOpName() { return "const_float"; }
};
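A short sketch of how such traits might be consumed follows. The `ConstantOp` template and the `constantOpName` helper are hypothetical names that do not appear in the proposal; the trait definitions are repeated from the snippet above so the example compiles on its own.

```cpp
#include <string>

// Trait definitions as in the post: the primary template is the fallback.
template <typename ElementType>
struct TypedConstantTraits {
    static const char* getOpName() { return "const_unknown"; }
};
template <> struct TypedConstantTraits<int> {
    static const char* getOpName() { return "const_int"; }
};
template <> struct TypedConstantTraits<float> {
    static const char* getOpName() { return "const_float"; }
};

// Hypothetical consumer: the op name is selected by the element type alone,
// with no if constexpr chain inside the op.
template <typename ElementType>
struct ConstantOp {
    ElementType value;
    static const char* getOperationName() {
        return TypedConstantTraits<ElementType>::getOpName();
    }
};

// Hypothetical convenience wrapper returning an owning string.
template <typename ElementType>
std::string constantOpName() {
    return ConstantOp<ElementType>::getOperationName();
}
```

With these definitions, `constantOpName<int>()` selects the `int` specialization, while a type without a specialization, such as `double`, falls through to the primary template.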

Seeking Community Input

I’d love feedback on:

Technical Feasibility: Are there fundamental issues I’m missing?
Migration Strategy: How do we handle transition from TableGen?
Tooling Impact: Which tools would need updates?
Implementation Priority: Should this be explored for new dialects first?

Try It Yourself

The repository contains working demos:

git clone https://github.com/shenxiaolong-code/mlir-crtp-proposal
cd mlir-crtp-proposal
cd test && make

or

# Run basic CRTP demo
g++ -std=c++17 base_crtp_demo.cpp -o base_demo && ./base_demo

# Run full trait_binding demo
g++ -std=c++17 enhanced_crtp_trait_bind_demo.cpp -o enhanced_demo && ./enhanced_demo

Next Steps

If the community is interested, I’m willing to:

Create a proof-of-concept integration with existing MLIR
Convert example dialects to demonstrate feasibility
Develop migration tools and guidelines
Conduct performance comparison benchmarks

Collaboration

This proposal is open source and available for community review, feedback, and contribution. All suggestions and improvements are welcome!

Complete technical analysis, implementation details, and rationale: Full RFC Document

What do you think? Is this direction worth exploring, or am I missing some crucial aspects?

Looking forward to the discussion and your insights!

GitHub Repository: GitHub - shenxiaolong-code/mlir-crtp-proposal: Modern C++ CRTP alternative to TableGen for MLIR operation definition

Hi Xiaolong,

Does your ODS spec support standalone dialects and Python bindings?

Curious.

Also, what’s the compile-time benefit or penalty?

Great questions!

Standalone Dialect Support

Yes, absolutely! The CRTP approach works perfectly for standalone dialects:

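// Your standalone dialect - completely independent
namespace MyDialect {
    class MyCustomOp;                             // forward declaration for the binding below
    template<typename OpT> struct MyCustomTrait;  // your trait, declared in your own namespace
}

// An explicit specialization must be written in the namespace that declares
// trait_binding (assumed here to be the global namespace, as in the earlier snippet)
template<> struct trait_binding<MyDialect::MyCustomOp>
    : Type2Type<MyDialect::MyCustomTrait<MyDialect::MyCustomOp>> {};

namespace MyDialect {
    class MyCustomOp : public Op<MyCustomOp> {
        // Your implementation - zero framework modification needed
    };
}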

The beauty is that you never touch MLIR framework code - everything happens in your own codebase through template specialization.

Python Bindings

Perfect compatibility! The CRTP operations are just regular C++ classes, so they work seamlessly with existing Python binding mechanisms:

// Your CRTP operation
class AddOp : public Op<AddOp> { /* ... */ };

// Python bindings - exactly like current MLIR approach
void bindDialectOps(py::module &m) {
    py::class_<AddOp>(m, "AddOp")
        .def("getInput", &AddOp::getInput)
        .def("verify", &AddOp::verify);
        // Same as always - no changes needed
}

Compile Time

Always faster in any scenario - eliminates TableGen generation step, only compiles what you use.
See details: Compilation Time Analysis - MLIR CRTP RFC

Would love to hear your thoughts after reviewing the technical details!

Important Addition: Core Design Philosophy

Thank you all for the attention! I want to emphasize the most critical innovation of this CRTP approach:

Non-Invasive Code, Invasive Functionality

Key Feature: Users can control and modify framework behavior from their own scope without modifying any framework code.

Code Non-Invasive: Zero changes to framework source code
Functionality Invasive: Complete control over framework behavior
Implementation Mechanism: Declarative trait binding controls the base class of framework operations

This approach inverts the traditional extension model: instead of the framework providing fixed extension points, users declare what they want, and the framework adapts automatically without any change to its own code.

This is the fundamental difference from TableGen and the most valuable aspect of this proposal!

Complete Technical Analysis: GitHub Repository

Supplement: AI-Assisted Learning Recommendations

To help the community better understand this CRTP approach, I recommend using AI tools for interactive learning:

Working through the implementation principles and the various usage patterns with AI assistance can make it much faster to get familiar with this approach and to explore further ways of extending it.

Suggested AI Prompt Examples:


"Explain CRTP patterns and advantages in MLIR operation definition"

"Compare differences between TableGen and CRTP approaches for framework extension"

"Help me deeply understand the 'non-invasive code, invasive functionality' design philosophy"

"Analyze how this CRTP approach controls framework behavior without modifying MLIR framework code"

"Explain the working principles of the trait_binding mechanism in detail"

"Based on my specific requirements (xxx), how can I use this approach's techniques to solve them"

Hey,

Thanks for the discussion. I think it may lead to seeing where we can improve our C++ code or C++ code generation from ODS. Note: ODS is not required, one can use plain C++ today.

This is actually a feature. By limiting these we keep things more consistent and prod folks into better patterns/styles that we can support. It also lets us avoid having to provide stability guarantees on the C++ side. We’ve been able to make multiple changes to the underlying structure without needing to update multiple users; the changes we had to do manually were rather painful compared with changing a parameter in ODS and regenerating.

This is definitely true. If one spans a DSL and another language, especially where extensions are allowed in the other language, one can’t report errors without being able to interpret the extensions (e.g., interpret C++). Which is rather difficult especially with libraries etc.

While we use nanobind to bind to Python, we do generate the op classes from ODS. We even do that for Haskell bindings too. What folks don’t like is that ODS is not self-contained in the sense that our extension points are written in C++ and one needs heuristics to interpret. So these folks would want more limitations on the extension points/more semantic blocks rather than less.

We also use this to generate docs, generate conversion patterns, different utility builders, and rewrite patterns (DRR & PDLL uses/can use). Along with the previously mentioned decoupling point where we can contain uglier bits of compatibility and avoid breaking everyone if we make a C++ change.

It would be interesting to compare on a few non-trivial ops and see if we can’t improve the generation (in optimized compilation the symbol sizes). Currently we have variants for unwrapped and wrapped types; these do make the generated code bigger for the benefit of user convenience, especially since we started before C++17 usage was allowed.

I’m not following this one, one can already define an MLIR dialect downstream without modifying MLIR.

This is showing one can manually write Python bindings using nanobind to C++ classes, but doesn’t address the generation aspect (or convenience builders).

One would still need to know conventions (e.g., default_verify and trait hierarchies). And now you have it in vanilla C++ with some action-at-a-distance: for example, with AddOp automatically inheriting capabilities, discovery is difficult, and it would seem that downstream groups could inject more capabilities into it in an unconstrained manner.

Hi,
I have only just started with MLIR and the TableGen system, and I am studying MLIR with an AI assistant. When it comes to MLIR, you are the expert, and I trust your judgment.

When the AI explained what a .td file is and how it works, I realized that C++ could do its job fully and more gracefully, while providing more features and better flexibility.

Thank you very much again.

Xiaolong
From mobile 2025.6.11

When we started MLIR: almost all the ops were directly written in C++ in the codebase. We used TableGen and improved ODS over and over until we could implement all of the dialects we had with ODS and got rid of manually written C++.
However, as @jpienaar mentioned, you can already define your operation in pure C++ if you’d like to. I’m a bit confused by what you’re actually proposing here. What would you add to MLIR concretely?

If I look at the example of your “Detailed Technical Document”

// Input: ~15 lines of direct C++
class IdentityOp : public Op<IdentityOp> {
    Value input_, output_;
    
public:
    IdentityOp(Value input, Type outputType) 
        : input_(input), output_(createResult(outputType)) {}
    
    static StringRef getOperationName() { return "demo.identity"; }
    Value getInput() { return input_; }
    Value getOutput() { return output_; }
    
    // Only override what needs customization
    LogicalResult default_verify() {
        return getInput().getType() == getOutput().getType() ? 
               success() : failure();
    }
};

I can’t make sense of how this is supposed to work. You’re defining data members in the operation class, which is not possible in MLIR.
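As a rough illustration of this objection: in MLIR, an op class is a lightweight handle around an `Operation *`, and operands, results, and attributes live in the underlying operation rather than in the handle. The sketch below imitates that layout with hypothetical, heavily simplified stand-ins (`Type`, `Operation`, `OpState`, `IdentityOpView`); it is not the real MLIR API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified stand-ins; the real classes are far richer.
struct Type { int id; };
struct Operation {
    std::vector<Type> operandTypes;
    std::vector<Type> resultTypes;
};

// Handle class: the only state is a pointer to the underlying operation.
class OpState {
public:
    explicit OpState(Operation* op) : op(op) { assert(op != nullptr); }
    Operation* getOperation() const { return op; }

private:
    Operation* op;
};

// Accessors read through the pointer instead of caching values in members.
class IdentityOpView : public OpState {
public:
    using OpState::OpState;
    Type getInputType() const { return getOperation()->operandTypes[0]; }
    Type getOutputType() const { return getOperation()->resultTypes[0]; }
    bool verify() const { return getInputType().id == getOutputType().id; }
};

// The handle is pointer-sized: copying it copies no operation data.
static_assert(sizeof(IdentityOpView) == sizeof(Operation*),
              "handle carries no state of its own");
```

Because every handle to the same operation reads the same storage, a change made through the operation is seen by all of them; members such as `input_` and `output_` stored in the handle itself would not have that property.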

I would need to see something more concrete, take for example one of the dialects in MLIR (like arith or SCF) and rewrite them (or part of them) with your proposal.

ftynse June 11, 2025, 11:11am 9

Just to clarify, are you proposing a change you came up with yourself or simply copy-pasting a suggestion from a chatbot?

MLIR already uses CRTP extensively. It is already possible to define ops in C++, but it comes with a lot of boilerplate code we don’t want to write. The fact that we are generating hundreds of lines of C++ is a feature, because we don’t want to write that manually.

The only acute tooling problem I’m aware of is the lack of completion on create<> methods because of deep template parameter pack forwarding. This will not address that problem. Refactoring might be a concern but has been so far successfully mitigated by minimizing the amount of inline C++ in tablegen. We could introduce properties, a fundamental change in op structure, without much code change in op definitions, for example.
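The completion problem mentioned here can be pictured with a small sketch. `BuilderSketch` and `AddOpSketch` are hypothetical names, and the real `create<>` differs in detail; the point is only that the template the user calls exposes nothing but a forwarded parameter pack.

```cpp
#include <utility>

// Hypothetical op with the parameter list a user actually cares about.
struct AddOpSketch {
    int lhs;
    int rhs;
    static AddOpSketch build(int lhs, int rhs) { return AddOpSketch{lhs, rhs}; }
};

// Hypothetical builder: the visible signature is just "Args&&... args",
// so an editor has no parameter names or types to offer at the call site.
struct BuilderSketch {
    template <typename OpTy, typename... Args>
    OpTy create(Args&&... args) {
        return OpTy::build(std::forward<Args>(args)...);
    }
};
```

A call such as `BuilderSketch{}.create<AddOpSketch>(2, 3)` compiles and forwards both arguments, but the tool would have to look through the forwarding layer to find `build(int lhs, int rhs)`. Writing the op by hand with CRTP leaves this layer in place.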

Hi mending,
Perhaps this code isn’t a good example.
I want to demonstrate that CRTP can do TableGen’s work; I think it is better to use C++ to solve C++'s inherent problems.

If we use C++, we can benefit a lot from the existing C++ ecosystem.

Hi, the idea and the example code are mine; the chat tool helped me write the document.

The chatbot tells me that MLIR already uses CRTP, but it confirms that the usage is different.

The document focuses on two points:

Using C++ instead of .td files.
Letting the user change framework behavior without changing the framework (this flexibility might fit users’ requirements).

For the last item, the chatbot tells me there is no similar functionality in MLIR’s current use of CRTP.

rengolin June 11, 2025, 1:15pm 12

I think there’s a misunderstanding here. TableGen does not try to solve a C++ problem, it just generates tables.

We want to generate static, validated and consistent tables to be included inside C++, with the guarantee that they will not have further semantics. C++ templates are notoriously difficult to program and debug, and it is hard to avoid hidden corner cases that just happen to hit some auto or implicit cast rules.

As stated by others, MLIR already uses various tools where they’re helpful, so this isn’t particularly helpful to MLIR directly.

Also note that TableGen isn’t an MLIR technology, but an LLVM one. The idea of replacing it with other tools has been circulating in the community for almost as long as TableGen has existed. C++ templates have been proposed multiple times (a long time ago, by myself), and have always been shown to be an inferior solution.

I remember proposals to replace with some Python or Haskell auto-generation, which all had the same problem: introducing one will give us two problems, and replacing all of table-gen is not viable.

Thanks for your concern.

With the type-to-value trait-binding technique, C++ templates are not so difficult to debug and analyze, especially with the compile-time stack dump.

The call sequence and every parameter are very clear to see.
We analyze the call stack statically, without needing a debugger as we would for runtime code. I think that is an advantage.

The call stack of template code becomes hard to read because of long, varied, and complex template symbols.
If we erase the long symbols with the type-to-value trait-binding technique, I would like to believe that template code becomes very friendly:

The compile-time stack is static, and reads like the stack of ordinary runtime code in a debugger.
The runtime stack is the same as that of ordinary runtime code in the debugger.

The type-to-value trait binding can erase the long template symbols.
See link: mlir-crtp-proposal/advanced_bind_from_value_to_type.hpp at main · shenxiaolong-code/mlir-crtp-proposal · GitHub

Sorry, I missed something.

Yes, I agree with your viewpoint.
Regarding your concern, “resolving C++ problem” was not a good choice of words.
My proposal rests on the claim that C++ CRTP can do the task currently done by TableGen, and do it better.

kuhar June 11, 2025, 2:59pm 15

I think we could have a more productive discussion here by exploring if we can improve the existing C++ dialect definitions using some of the idioms you outlined. Tablegen and C++ are complementary in the sense that making the underlying C++ simpler causes the auto-generated code to be more concise too. If the generated C++ is simple enough, then it may warrant eventually dropping Tablegen.

This exercise would be best done on the existing codebase IMO and not in an isolated repo that oversimplifies the structure of the MLIR IR.

Compiling C++ templates can be a pretty heavy lifting job for the frontend sometimes, I guess that’s why @ai-mannamalai was asking. Could you do some experiments and provide some concrete numbers?