Generic implementation strategies in Carbon and Clang (original) (raw)

Richard Smith

@zygoloid

Carbon / Google

LLVM Developers’ Meeting 2024

C++ templates and Carbon generics

template<typename T> T clamp_nonnegative(const T &a) {
  return a < T() ? T() : a;
}

C++ templates and Carbon generics

fn ClampNonnegative[template T:! type](a: T) -> T {
  return if a < (0 as T) then 0 as T else a;
}
fn ClampNonnegative[T:! Numeric & Ordered](a: T) -> T {
  return if a < (0 as T) then 0 as T else a;
}

C++ template representations

template<typename T> T clamp_nonnegative(const T &a) {
  return a < T() ? T() : a;
}

Two main approaches:

Dependent parse trees

Dependent parse trees

Parse: build normal IR

template<typename T> T clamp_nonnegative(const T &a) {
  return a < T() ? T() : a;
}
FunctionDecl <col:22, line:3:1> line:1:24 clamp_nonnegative 'T (const T &)'
├─ParmVarDecl <col:30, col:39> col:39 referenced a 'const T &'
╰─CompoundStmt <col:42, line:3:1>
  ╰─ReturnStmt <line:2:3, col:26>
    ╰─ConditionalOperator <col:10, col:26> '<dependent type>'
      ├─BinaryOperator <col:10, col:16> '<dependent type>' '<'
      │ ├─DeclRefExpr <col:10> 'const T' lvalue ParmVar 0xd67ffd0 'a' 'const T &'
      │ ╰─CXXUnresolvedConstructExpr <col:14, col:16> 'T' 'T'
      ├─CXXUnresolvedConstructExpr <col:20, col:22> 'T' 'T'
      ╰─DeclRefExpr <col:26> 'const T' lvalue ParmVar 0xd67ffd0 'a' 'const T &'

… with explicit representation for dependent constructs

Dependent parse trees

Instantiate: tree transformation builds a new tree

template<> int clamp_nonnegative<int>(const int &a) {
  return a < int() ? int() : a;
}
FunctionDecl <col:22, line:3:1> line:1:24 clamp_nonnegative 'int (const int &)'
├─ParmVarDecl <col:30, col:39> col:39 used a 'const int &'
╰─CompoundStmt <col:42, line:3:1>
  ╰─ReturnStmt <line:2:3, col:26>
    ╰─ConditionalOperator <col:10, col:26> 'int'
      ├─BinaryOperator <col:10, col:16> 'bool' '<'
      │ ├─DeclRefExpr <col:10> 'const int' lvalue ParmVar 0xddc7958 'a' 'const int &'
      │ ╰─CXXScalarValueInitExpr <col:14, col:16> 'int'
      ├─CXXScalarValueInitExpr <col:20, col:22> 'int'
      ╰─DeclRefExpr <col:26> 'const int' lvalue ParmVar 0xddc7958 'a' 'const int &'

Dependent parse trees

Instantiate: tree transformation builds a new tree

template<> int clamp_nonnegative<int>(const int &a) {
  return a < int() ? int() : a;
}
FunctionDecl <col:22, line:3:1> line:1:24 clamp_nonnegative 'int (const int &)'
├─ParmVarDecl <col:30, col:39> col:39 used a 'const int &'
╰─CompoundStmt <col:42, line:3:1>
  ╰─ReturnStmt <line:2:3, col:26>
    ╰─ConditionalOperator <col:10, col:26> 'int'
      ├─BinaryOperator <col:10, col:16> 'bool' '<'
      │ ├─ImplicitCastExpr <col:10> 'int' <LValueToRValue>
      │ │ ╰─DeclRefExpr <col:10> 'const int' lvalue ParmVar 0xddc7958 'a' 'const int &'
      │ ╰─CXXScalarValueInitExpr <col:14, col:16> 'int'
      ├─CXXScalarValueInitExpr <col:20, col:22> 'int'
      ╰─ImplicitCastExpr <col:26> 'int' <LValueToRValue>
        ╰─DeclRefExpr <col:26> 'const int' lvalue ParmVar 0xddc7958 'a' 'const int &'

Subtree reuse

Clang TreeTransform

template<range R> auto average(const R &v)
    -> range_value_t<R> {
  int n = 0;
  range_value_t<R> sum = 0;
  for (auto &elem : v) {
    sum += elem;
    ++n;
  }
  return sum / (n ? n : 1);
}

Actually reuses:

IntegerLiteral <line:3:11> 'int' 0
IntegerLiteral <line:4:26> 'int' 0
IntegerLiteral <line:9:25> 'int' 1

Subtree reuse

Clang TreeTransform

Why?

Dependent parse trees

Cost of building instantiation

Example: type trait

template<typename T> struct is_const {
  static constexpr bool value = __is_const(T);
};
const bool b1 = is_const<int[1]>::value;
const bool b2 = is_const<int[2]>::value;
...
template<typename T>
constexpr bool is_const = __is_const(T);
const bool b1 = is_const<int[1]>;
const bool b2 = is_const<int[2]>;
...

Directly computing __is_const(int[N]):

Faster? Smaller?

Carbon approach: overlay

Idea: treat instantiation as overlay on template

Parse generic:

Build specific:

Building a generic

fn Average[R:! Range where R.Value impls Numeric](v: `<0>R`)
    -> `<1>R`.Value {
  var n: i32 = 0;
  var sum: `<1>R`.Value = 0;
  for (elem in v) {
    sum += elem;
    ++n;
  }
  return sum / (if n != 0 then n else 1);
}

Building a generic

fn Average[R:! Range where R.Value impls Numeric](v: #0)
    -> `<0>#0.Value` {
  var n: i32 = 0;
  var sum: `<0>#0.Value` = 0;
  for (elem in v) {
    sum += elem;
    ++n;
  }
  return sum / (if n != 0 then n else 1);
}

Building a generic

fn Average[R:! Range where R.Value impls Numeric](v: #0)
    -> #1 {
  var n: i32 = 0;
  var sum: #1 = `0`;
  for (elem in v) {
    sum += elem;
    ++n;
  }
  return sum / (if n != 0 then n else 1);
}

Building a generic

fn Average[R:! Range where R.Value impls Numeric](v: #0)
    -> #1 {
  var n: i32 = 0;
  var sum: #1 = 0.(`ImplicitAs(#1).Convert`)();
  for (elem in v) {
    sum += elem;
    ++n;
  }
  return sum / (if n != 0 then n else 1);
}

R.Value impls Numeric implies that
IntLiteral(0) impls ImplicitAs(R.Value)

Building a generic

fn Average[R:! Range where R.Value impls Numeric](v: #0)
    -> #1 {
  var n: i32 = 0;
  var sum: #1 = 0.(#2)();
  for (elem in v) {
    sum += elem;
    ++n;
  }
  return sum / (if n != 0 then n else 1);
}

Building a generic

fn Average[R:! Range where R.Value impls Numeric](v: #0)
    -> #1 {
  var n: i32 = 0;
  var sum: #1 = 0.(#2)();
  for (elem in v.(#3)() ... v.(#4)()) {
    sum.(#5)(elem);
    ++n;
  }
  return sum.(#6)(if n != 0 then n else 1);
}

(Pseudocode, actually done in SemIR)

Generic representation

These are instructions extracted from the generic:

Can represent this as a block of code

Generic representation

We have computed a compile-time function to form a specific from a generic:

fn MakeAverageSpecific(R:! Range where R.Value impls Numeric) -> <function> {
  let v0:! auto = R;
  let v1:! auto = v0.Value;
  let v2:! auto = (IntLiteral(0) as ImplicitAs(v1)).Convert;
  let v3:! auto = v0.Begin;
  let v4:! auto = v0.End;
  let v5:! auto = (v1 as AddAssign).Op;
  let v6:! auto = (v1 as DivWith(i32)).Op;
  return <make specific>(Average, (v0, v1, v2, v3, v4, v5, v6));
}

Forming a specific from a generic is compile-time function evaluation.

Building a specific

MakeAverageSpecific([i32; 3])
fn MakeAverageSpecific(R:! Range where R.Value impls Numeric) -> <function> {
  let v0:! auto = R;
  let v1:! auto = v1.Value;
  let v2:! auto = (IntLiteral(0) as ImplicitAs(v1)).Convert;
  let v3:! auto = v0.Begin;
  let v4:! auto = v0.End;
  let v5:! auto = (v1 as AddAssign).Op;
  let v6:! auto = (v1 as DivWith(i32)).Op;
  return <make specific>(Average, (v0, v1, v2, v3, v4, v5, v6));
}

Building a specific

MakeAverageSpecific([i32; 3])
fn MakeAverageSpecific(R:! Range where R.Value impls Numeric) -> <function> {
  let v0:! auto = [i32; 3];
  let v1:! auto = i32;
  let v2:! auto = <builtin IntLiteral to i32 conversion>;
  let v3:! auto = [i32; 3].Begin;
  let v4:! auto = [i32; 3].End;
  let v5:! auto = <builtin AddAssign for i32>;
  let v6:! auto = <builtin DivWith for i32>;
  return <make specific>(Average, (v0, v1, v2, v3, v4, v5, v6));
}

Specific representation

Generic

Average[R:! Range where...]

inst[0] = R
inst[1] = #0.Value
inst[2] =
  (IntLiteral(0) as ImplicitAs(#1)).Convert
inst[3] = #0.Begin
inst[4] = #0.End
inst[5] = (#1 as AddAssign).Op
inst[6] = (#1 as DivWith(i32)).Op

Specific

Average with R = [i32; 3]

value[0] = [i32; 3];
value[1] = i32;
value[2] =
  <builtin IntLiteral to i32 conversion>;
value[3] = [i32; 3].Begin;
value[4] = [i32; 3].End;
value[5] = <builtin AddAssign for i32>;
value[6] = <builtin DivWith for i32>;
let a: [i32; 3] = (1, 2, 3);
let b: auto = Average(a);

Templates

So far, only talked about types and constant values that symbolically depend on generic parameters.

What about templates?

Templates

Add another kind of instruction to instantiate a single expression

fn CallF[template T:! type](x: T) {
  `x.F`();
}

Templates

Add another kind of instruction to instantiate a single expression

fn CallF[template T:! type](x: T) {
  `#0()`;
}

Templates

Add another kind of instruction to instantiate a single expression

fn CallF[template T:! type](x: T) {
  #1;
}

Evaluating <instantiate> instruction produces another instruction

Not a dependent parse tree representing the eventual meaning of the program

Templates

Forming a specific is still a compile-time function evaluation

Code complexity cost

Lose orthogonality

Tradeoff

Clang dependent parse tree model:

Carbon toolchain overlay model:

Questions?

Bonus slides: token soup

Token soup

Parse: collect list of tokens

template<typename T> T clamp(const T &a)
  = { "return" "a" "<" "T" "(" ")" "?" "T" "(" ")" ":" "a" ";" }

Instantiate: replay tokens

template<> int clamp<int>(const int &a) {
  return a < int() ? int() : a;
}

Or:

using T = int;
template<> T clamp<T>(const T &a) {
  return a < T() ? T() : a;
}

Token soup

Good:

Bad:

Token soup

int a = 1;
namespace N {
  template<typename T> int f() { return a; }
  int a = 2;
}
int b = N::f<int>();

EDG:

MSVC (old parser):

Token soup

Bonus slides: lowering

Lowering

fn Average[R:! Range where R.Value impls Numeric](v: #1)
    -> #2 {
  var n: i32 = 0;
  var sum: #2 = 0.(#3)();
  for (elem in v.(#4)() ... v.(#5)()) {
    sum.(#6)(elem);
    ++n;
  }
  return sum.(#7)(if n != 0 then n else 1);
}

->

define void @_CAverage.Main.abc123() {
entry:
  %n.var = alloca i32, align 4
  store i32 0, ptr %n.var, align 4
  %sum.var = alloca

Lowering

fn Average[R:! Range where R.Value impls Numeric](v: #1)
    -> #2 {
  var n: i32 = 0;
  var sum: #2 = 0.(#3)();
  for (elem in v.(#4)() ... v.(#5)()) {
    sum.(#6)(elem);
    ++n;
  }
  return sum.(#7)(if n != 0 then n else 1);
}

->

define void @_CAverage.Main.abc123(%v.param: ptr) {
entry:
  %n.var = alloca i32, align 4
  store i32 0, ptr %n.var, align 4
  %sum.var = alloca i32, align 4
  store i32 0, ptr %sum.var, align 4
  ...

Lowering

#1 -> ptr
#2 -> i32
#3 -> <builtin implicit conversion from IntLiteral to i32>
#4 -> @_CBegin.Array.Core.abc123
#5 -> @_CEnd.Array.Core.abc123
#6 -> <builtin AddAssign for i32>
#7 -> <builtin DivWith for i32>

Lowering

Result:

Overlay model gives us the information to do this