Templates across API boundaries: Optimizing template generators

May 16, 2016

scross-May 16, 2016

C++ is controversial. For every discussion about it, a certain percentage of people will love it and another substantial group will hate it. Above all, one feature highlights this divide: templates. Templates have many problems: they slow compile times, force implementations to be specified in headers and can bloat generated code. However they’re also a powerful mechanism for programming against generic types. In the first article in this series, we saw how C++ templates have significant limitations, most notably that they can’t be used across API boundaries. In the second article, we explored how templates can be implemented by using ‘template generators’, which are trees of compiler-generated functions that compute template arguments for a function given its path in the tree. In this final part we’ll see how template generators can be optimized to minimize, or in many cases eliminate, their overhead.

Pass-through path reduction
Consider the following:

int f() {
   return g<int, float, double>(1.0f, 2.0);
}
 
template <typename A, typename B, typename C>
A g(B b, C c) {
   return h<A, B, C>(b, c);
}
 
template <typename A, typename B, typename C>
A h(B b, C c) {
   // ...some code...
}

By the approach described above, we’d have the following paths:

  • To get arguments of g(): ‘1’
  • To get arguments of h(): ‘10’

Here are the template generators: 

void root_template_generator(type* args, path_t path) {
   args[0] = { int_vtable };
   args[1] = { float_vtable };
   args[2] = { double_vtable };
   g_template_generator(args, path, get_path_start(path));
}
 
void g_template_generator(type* args, path_t path, size_t position) {
   if (position == 0) {
       // Path end.
       return;
   }
   h_template_generator(args, path, position - 1);
}
 
void h_template_generator(type* args, path_t path, size_t position) {
   return;
}

However in this case the template arguments to g() are the same as the arguments it passes to h(). Hence we can merge these two transitions, giving only one transition (continue to h_template_generator), which means no bits are required in the path for g(). This gives us:

  • To get arguments of g(): ‘1’
  • To get arguments of h(): ‘1’

This simplifies the template generators to: 

void root_template_generator(type* args, path_t path) {
   args[0] = { int_vtable };
   args[1] = { float_vtable };
   args[2] = { double_vtable };
   g_template_generator(args, path, get_path_start(path));
}
 
void g_template_generator(type* args, path_t path, size_t position) {
   h_template_generator(args, path, position);
}
 
void h_template_generator(type* args, path_t path, size_t position) {
   return;
}

This works because g_template_generator() knows that h_template_generator() must return the template arguments we gave it since the path ends at that point. 

I refer to this situation as ‘pass-through’ and it’s a very common case. In particular it has the nice property that it is a local (non-ABI) change, hence it is an optimization rather than a specified requirement.

There’s an explanation about how we handle pathological cases, such as mutually recursive functions, on the Loci website. In short, direct function calls (calls via function pointers or interfaces cannot be templated) between modules must form a directed acyclic graph (DAG), so templated recursion can only occur within a module; this allows the compiler to perform cycle detection on the templated functions within that module.

Prefix pass-through
Pass-through also applies if the template arguments of the caller are a prefix of the callee, or vice versa, such as:

int f() {
   return g<int, float, double>(1.0f, 2.0);
}
 
template <typename A, typename B, typename C>
A g(B b, C c) {
   return h<A, B>(b);
}
 
template <typename A, typename B>
A h(B b) {
   // ...some code...
}

Here’s how the template generators look after applying the pass-through optimization: 

void root_template_generator(type* args, path_t path) {
   args[0] = { int_vtable };
   args[1] = { float_vtable };
   args[2] = { double_vtable };
   g_template_generator(args, path, get_path_start(path));
}
 
void g_template_generator(type* args, path_t path, size_t position) {
   h_template_generator(args, path, position);
}
 
void h_template_generator(type* args, path_t path, size_t position) {
   return;
}

When h() queries its template arguments, it gets an extra argument it didn’t expect. Fortunately, it completely ignores that, so this works. 

Of course, pass-though works for the original example as well. 

void root_template_generator(type* args, path_t path) {
   args[0] = { int_vtable };
   g_template_generator(args, path, get_path_start(path));
}
 
void g_template_generator(type* args, path_t path, size_t position) {
   args[1] = { float_vtable };
   h_template_generator(args, path, position);
}
 
void h_template_generator(type* args, path_t path, size_t position) {
   return;
}

In this case it’s the other way around; g() is getting an argument it didn’t expect. But again, it ignores it.

Next Page >>

 

< Previous
Page 1 of 3
Next >

Loading comments...

Most Commented