AngelikaLanger.com - Explicit Function Template Argument Specification and STL (original) (raw)
Explicit Function Template Argument Specification and STL
Explicit Function Template Argument Specification and STL
A New Language Feature and Its Impact on Old Programming Techniques
C/C++ Users Journal, December 2000
Klaus Kreft & Angelika Langer
In this installment of the column, we will explain how the new language feature of explicit function template argument specification can break previously safe and sound code that was implemented before the feature was available. In order to avoid potential problems, new programming idioms are needed. We will study the effect taking real-world examples from STL. Many STL implementations were built before the new language feature was supported by compilers, and some implementations have never been updated and still contain problematic function templates, namely an outer function template calling an inner function template relying on automatic template argument deduction.
Type Arguments of Function Templates
Explicit specification of template arguments for function templates is a relatively new language feature, which was added during C++ standardization. When you look at the ARM (Annotated Reference Manual) [1], which is the reference for pre-standard C++, you will find that initially there was no way to tell the compiler which types to use as template arguments for instantiation of a function template. In those days, a template such as the one shown below was illegal.
template T* create();
Every template argument, like the type argument T in the example above, was required to be used in the argument types of the function template. Otherwise the compiler was not capable of deducing the template arguments. In the example above, the function does not have any arguments and for this reason the compiler cannot figure out which type to use for the instantiation of the function template.
The New Language Feature
Today, we can explicitly tell the compiler which types it must use for instantiation of a function template. In the case above, we can invoke the function using explicit argument specification syntax, as shown below:
int n = create();
The language construct create is called explicit specification of a function template argument. The syntax is similar to the instantiation of class templates: the name of the template is followed by the list of template arguments.
Even in cases where explicit specification of the function template argument is unnecessary because the compiler can deduce the actual template arguments from the function argument types, we can circumvent the automatic deduction and use explicit specification instead. Here is an example:
template void add(T t);
add("log.txt"); // automatic argument deduction add("log.txt"); // explicit argument specification
The example also shows that automatic deduction and explicit specification can have different effects. Automatic argument deduction will trigger instantiation of add<char*>, and explicit argument specification will generate a different function, namely add.
The Catch
The new language feature was added to the language in order to solve for instantiation of function templates where the template argument is not used in the function arguments types. It adds extra flexibility to the language, but there is a catch. Formerly safe code might now be problematic. In the old days before explicit function template argument specification, it was perfectly reasonable to implement function templates that pass their arguments to other function templates using automatic argument deduction, as shown below:
template void inner(T t) ;
template void outer(T t) { ... inner(t); ... }
The outer function template passes its function argument to the inner function template, and for invocation of the inner function template, it lets the compiler figure out the template argument.
Today, with explicit function template argument specification, this is a questionable implementation of a function template, because it has the potential to create object slicing problems if the outer function is instantiated for a reference type. It is still reasonable to pass arguments from one function template to another function template, but the syntax for safely doing so is different today.
Automatic Function Template Argument Deduction
vs.
Explicit Function Template Argument Specification
Let us analyze our problematic example from above:
template void inner(T t) ;
template void outer(T t) { ... inner(t); ... }
Why is it hazardous today, but was safe in the early days of C++ before explicit argument specification was invented for instantiation of function templates? It has to do with the extra flexibility that the new feature adds to the language.
The Problem
Using explicit template argument specification, the outer function template can be instantiated for a reference type, as shown below:
class Base; class Derived : public Base {}; Base& ref = new Derived; outer<Base&>(ref);
The generated function outer<Base&> would look like this:
void outer<Base&>(Base& t) { ... inner(t); // calls: void inner(Base t); ... }
When it invokes the inner function template, it relies on automatic template argument deduction, and the compiler instantiates the inner function template for the value type Base instead of the reference type Base&. This might be surprising, but is intended: automatic function argument deduction involves a number of implicit type conversions, one of which is the lvalue-to-rvalue transformation (more details about the conversion later in this article). The consequence is that the function argument t, which is a base class reference pointing to a derived class object, is passed from the outer function to the inner function by value. Only a base class slice of the derived class object is made available to the inner function. This is called object slicing, and it is a well-know problem that arises when copies of base class references are created.
A Solution
In a correct implementation of outer, we would pass the argument tto the inner function as it was received (i.e., by reference when it was received by reference, and by value when it was received by value). This can easily be achieved via explicit argument specification for the inner function template, as shown below:
template void inner(T t) ;
template void outer(T t) { ... inner(t); ... }
The generated function outer<Base&> would trigger instantiation of the inner function template for the reference type Base&.
void outer<Base&>(Base& t) { ... inner<Base&>(t); // calls: void inner<Base&>(Base& t); ... }
The function argument t is passed by reference to the inner function, and no object slicing can occur because no copies are created.
Evaluation
The problem of object slicing in function templates stems from the fact that the template is instantiated for a base class reference type. Now, you can see why the naive implementation of the outer and inner function templates was safe before explicit template argument specification was added to the language: it was just not possible to instantiate a function template for a reference type. For this simple reason, the implementer of outer had no need to prepare for a base class reference being passed as an argument. There was no danger of object slicing because no references were involved. Today, this restriction is gone, and the function template can be instantiated on any arbitrary type, including reference types. Hence, the implementer of a function template must be prepared to cope correctly with any arbitrary type.
An Alternative Solution
In principle, the implementer of the outer function template can take a different approach from our proposed solution. Maybe, he/she does not want to accept arbitrary types and decides to exclude reference types for instantiation of the outer function template and restrict its usability to value types. Here is a conceivable implementation that cannot be instantiated for a reference type:
template void inner(T t) ;
template void outer(T t) { ... typedef T& dummy; inner(t); ... }
The attempt to instantiate outer<Base&> would fail, because references to references are not allowed in C++. The generated function would look like this:
void outer<Base&>(Base& t) { ... typedef Base&& dummy; // error : reference to reference inner(t); ... }
The downside of this solution is that we usually strive for maximum applicability of a template rather that restricting its usability. Unless there is a compelling reason for the restriction to value types, the more flexible solution using explicit argument specification is superior.
Implicit Type Conversions Involved in Template Argument Deduction
For sake of comprehensiveness, let us point out that the lvalue-to-rvalue conversion that was involved in the automatic argument deduction in our example is not the only implicit type conversion that is applied before a template argument is deduced.
Before the template argument type is determined, the compiler performs the following implicit type conversions:
- lvalue transformation
- qualification conversion
- derived class to base class conversion See the C++ Primer ([2], page 500) for comprehensive coverage of this topic.
Simply put, the compiler drops some type properties, like the lvalue property of the reference type in our example. For example, the compiler instantiates a function template for a value type rather than the corresponding reference type. Similarly it instantiates function templates for a pointer type rather than the corresponding array type. It drops const qualifications and would never instantiate a function template for a const type, but always for the corresponding non-const type.
The bottom line is that automatic template argument deduction involves type conversions, and certain type properties get lost when the compiler determines the template arguments automatically. These type properties can be retained when explicit function template argument specification is used.
STL Algorithms
Function templates that invoke inner function templates and rely on template argument deduction for this invocation can be found in many STL implementations. All algorithms in STL are function templates, and often they use other algorithms for their own implementation. The remove_if algorithm is an example. Here is an implementation that can be found in popular STL implementations:
template <class ForwardIterator, class Predicate> ForwardIterator remove_if(ForwardIterator first, ForwardIterator last, Predicate pred) { first = find_if(first, last, pred); ForwardIterator next = first; return first == last ? first : remove_copy_if(++next, last, first, pred); }
The remove_if algorithm invokes find_if and remove_copy_if. In both cases, remove_if relies on automatic argument deduction. The iterators and the predicate are passed by value regardless of the fact that they might be handed in to remove_if by reference.
Is the danger of object slicing likely in this case? How often do we pass iterators or predicates by reference?
Iterators. Well, iterators are required by the standard to exhibit value semantics. An iterator type must be copyable; hence pass-by-value is guaranteed to work. Typically, an iterator type neither contains a lot of data nor any virtual functions; hence passing around references to iterators is unlikely.
Predicates. The requirements for predicates are different. Predicate types requirements imposed by the standard are relatively relaxed. Here is the respective quote from the C++ Standard:
The Predicate parameter is used whenever an algorithm expects a function object that when applied to the result of dereferencing the corresponding iterator returns a value testable as true. In other words, if an algorithm takes Predicate pred as its argument and first as its iterator argument, it should work correctly in the construct if (pred(*first)){...}. The function object pred shall not apply any non-constant function through the dereferenced iterator. This function object may be a pointer to function, or an object of a type with an appropriate function call operator.
In plain English, predicate stands for a type that is either a function pointer type or a functor type. The function (object) must return a value that is convertible to bool and must accept an argument to which the result of dereferencing an iterator is convertible. In addition, the predicate must not modify the container elements. Other than that, the standard does not impose any further requirements on predicate types. Note, a predicate need not even be copyable.
Predicate References and count_if
The weak requirements for predicate types indeed suffice. Typically an algorithm does not do a lot with a predicate: it just invokes it by passing a reference to a container element (via a dereferenced iterator) to the predicate. Here is a typical example, the count_if algorithm, which illustrates how an algorithm uses its predicate:
template <class InputIterator, class Predicate> typename iterator_traits::difference_type count_if(InputIterator first, InputIterator last, Predicate pred) { typename iterator_traits::difference_type n = 0; for ( ; first != last; ++first) if (pred(*first)) ++n; return n; }
The algorithm just calls the predicate, provides a dereferenced iterator as an argument, and uses the predicate's return value in a conditional expression.
Predicate References and remove_if
In contrast, the implementation of remove_if algorithm shown earlier in this article requires more of its predicate than the standard allows. It passes on the predicate by value to other algorithms, which requires predicate type to be copyable in the first place and, in addition, risks object slicing in case of references to base class predicates.
Polymorphic Predicate Types
For illustration of the potential object slicing problem, think of a hierarchy of predicate types with an abstract base class and a number of concrete derived classes [3]. If you want to use it in conjunction with STL algorithms, then you might try to instantiate an STL algorithm for the base class reference type. The code below demonstrates the approach:
template void foo(Container& cont, const predicateBase& pred) { remove_if<typename Container::iterator, const predicateBase&> (cont.begin(),cont.end(),pred); }
The generated remove_if function receives the predicate via a base class reference and, as we know from look into the implementation ofremove_if, passes it by value to find_if and remove_copy_if -- a classical case of object slicing [4].
Predicate Types Containing Data
Another reason for using references to predicates is that predicates have data members and accumulate information in these data members.
Think of a bank application where we have a list of bank accounts, and we need to check whether any balances are below a certain threshold; if so, the client is removed from the list of bank accounts. At the same time, whenever a balance exceeds a certain threshold, the client's name is added to a mailing list. We can achieve this using remove_if with an appropriate predicate that builds up the mailing list and returns true for all clients that must be removed.
There is just one tiny problem: how do we get access to the mailing list after execution of the algorithm, when the mailing list is a data member of the predicate? When the predicate is passed to remove_ifby value then the algorithm works with a temporary copy of our predicate object, and all the accumulated information is discarded before we have a chance to extract it. For this reason, we pass it by reference, but then the algorithm passes it by value to find_if and remove_copy_if– and defeats the purpose of using a reference in the first place.
Summary
There are various reasons for passing function objects to algorithms by reference. Unfortunately, some standard library implementations create copies of the referred to objects and risk object slicing because they assume that an algorithm can never be instantiated for a reference type.
This STL problem is an instructive example of how an extension to the language suddenly requires new programming idioms. Today, we cannot safely make any assumptions about the template arguments that are used for instantiation of a function template. They can be reference types, they can have constqualifications, and they can have other type properties, which they did not have under automatic template argument deduction.
When we pass on function arguments of an "unknown" type to inner function templates, we can take two approaches to avoid the object slicing problem discussed in this article:
- Be restrictive. If we intentionally impose restrictions on the type arguments of a template, then we must document them and should ideally make sure that the template cannot be instantiated for a type that does not meet the requirements. In our example, a dummy typedef that would resolve to a reference to a reference for unwanted reference types would do the trick.
- Stay neutral. Normally we strive for maximum usability of a template and avoid any restrictions if possible, and it is possible to pass on arguments of a function template without dropping type properties along the way: just invoke inner function templates using explicit function template argument specification.
References and Notes
[1] Margaret A. Ellis and Bjarne Stroustrup. The Annotated C++ Reference Manual (Addison-Wesley, 1990).
[2] Stan Lippman and Josée Lajoie. The C++ Primer (Addison-Wesley, 1998).
[3] Hierarchies of polymorphic predicate types can be found in practice because the GOF book [5] suggests this kind of implementation for the Strategy pattern. Predicates in STL are typical strategies in the sense of the GOF strategy pattern.
[4] One might argue that use of polymorphic predicate types in conjunction with STL is not a wise thing to do. Generic programming provides enough alternatives (replace run-time by compile-time polymorphism), and there is no need for passing predicate base class references. True, in principle, yet the implementation of remove_if, which relies on automatic function argument deduction, creates a pitfall.
[5]Gamma, Helm, Johnson, Vlissides. Design Patterns(Addison-Wesley, 1995).
If you are interested to hear more about this and related topics you might want to check out the following seminar: |
---|
Seminar Effective STL Programming - The Standard Template Library in Depth 4-day seminar (open enrollment and on-site) IOStreams and Locales - Standard C++ IOStreams and Locales in Depth 5-day seminar (open enrollment and on-site) |