AngelikaLanger.com - Allocator Types - Angelika Langer Training/Consulting (original) (raw)

Allocator Types
Allocator Types

C++ Report, June 1998
Klaus Kreft & Angelika Langer

Allocators encapsulate memory allocation and are used by the container classes in the standard library for allocation and deallocation of the contained elements. "Container classes" in this context are the STL containers: vector, deque, list, (multi)set, and (multi)map, but also string, which is seen as containers of characters. Some container-related classes such as the container adapters stack, queue, and priority_queue, as well as string streams also use allocators. All remaining classes in the standard library do not use them at all.

Purpose of Allocators

Allocators were introduced to the standard library in order to allow for allocation strategies other than just heap data allocated by newand for alternative pointer types. Basically, an allocator covers two areas:

In addition to the flexibility gained regarding memory allocation strategies, allocators are also intended for alternative pointer types. For example, think of near, far, and huge pointers in certain memory models. Such non-standard memory models introduce new pointer and reference types. Another category of alternative pointer types are user-defined pointer classes such as smart pointers or range-checked pointers. Allocators facilitate use of non-standard pointer types, and the containers use these facilities in their implementation and interface in all places where pointer and reference types are needed. This way non-standard memory models and alternative pointer types can be used in place of built-in C++ pointers.

An Overview of Allocator Types

The C++ standard [1] requires that allocator types have a certain interface:

template class allocator_type {
public:
// constructors, destructors, and the like
allocator() throw();
allocator(const allocator&) throw();
template allocator(const allocator&) throw();
~allocator() throw();
// types and functions for alternative pointer types
typedef _allocator-specific_pointer;
typedef _allocator-specific_const_pointer;
typedef _allocator-specific_reference;
typedef _allocator-specific_const_reference;
typedef _allocator-specific_value_type;
typedef _allocator-specific_size_type;
typedef _allocator-specific_difference_type;
pointer address(reference x) const;
const_pointer address(const_reference x) const;
// types and functions for alternative memory allocation strategies
pointer allocate(size_type, allocator::const_pointer hint = 0);
void deallocate(pointer p, size_type n);
void construct(pointer p, const T& val);
void destroy(pointer p);
size_type max_size() const throw();
template struct rebind { typedef allocator other; };
};

An allocator must define:

The standard library provides a standard allocator type, called allocator. It encapsulates allocation and deallocation of memory from the heap viaoperator new and uses the ordinary, built-in C++ pointer and reference types. The standard allocator is used as a default template argument in all templates that require an allocator type. The vectortemplate, for example, is defined as:

template <class T, class Allocator = allocator > class vector;

Hence a vector of strings declared as a vectoris actually of type vector < string, allocator >, where string is a typedef for type basic_string < char, char_traits, allocator >.

Use of Allocators in Containers

Each container uses its allocator for memory allocation and deallocation of its elements and in all places where the container needs pointers, references, pointer differences, etc. that related to its elements. Here is an example taken from the vector interface:

template <class T, class Allocator = allocator > class vector {
public:
// types:
typedef typename Allocator::reference reference;
typedef typename Allocator::const_reference const_reference;
typedef implementation defined iterator;
typedef implementation defined const_iterator;
typedef implementation defined size_type;
typedef implementation defined difference_type;
typedef T value_type;
typedef Allocator allocator_type;
typedef typename Allocator::pointer pointer;
typedef typename Allocator::const_pointer const_pointer
// element access:
reference operator[](size_type n);
const_reference operator[](size_type n) const;
reference at(size_type n);
const_reference at(size_type n) const;
reference front();
const_reference front() const;
reference back();
const_reference back() const;
// ...
};

We can see that all element access is defined in terms of the allocator's reference types. The vector's operator[]()for instance, does not return a reference to the value type T, but an object of type Allocator::reference.

The allocation and deallocation functions are used whenever elements are created, copied, or discarded. Here is an example from the vector'smember function reserve():

void reserve(size_type newSize)
{ if (capacity() < newSize)
{ iterator newFirst = _allocator.allocate(newSize, this);
uninitialized_copy(_first, _last, newFirst);
for (iterator i = _first; i != _last; ++i) _allocator.destroy(&*i);}
allocator.deallocate(_first);
_endOfStorage = newFirst + newSize;
_last = newFirst + size();
_first = newFirst;
}
}

The function reserve() takes a size argument and extends the vector's capacity to the specified size, i.e. is allocates enough memory, copies the contained elements to the newly allocated area, destroys and deallocates the original elements, and eventually adjusts the vector's internal data accordingly. For allocation and deallocation of memory the allocator's functions allocate()and deallocate() are used.

Details of Allocator Types

After this first overview, let us now take a more in-depth look at the functionality required of an allocator type. We will not discuss constructors, destructors, and the like, because they must have the expected functionality. More exciting are the nested types and member functions that an allocator type must provide. They fall into two categories: one part of the interface describes allocation strategies and the other part represents alternative pointer and reference types.

Allocation Strategies

For the purpose of alternative memory allocation schemes, an allocator type must define a number of member functions. These functions are supposed to reflect the functionality that is usually used by the run-time system for allocation, deallocation, construction, and destruction of objects. When the compiler sees a new expression then it decomposes this expression into two activities: memory allocation and object initialization. To put it simple, the allocation is performed by calling an allocation function, which is operator new() and can be overloaded. The initialization is performed by calling the right constructors. Similarly, deletion of an object falls into destruction and deallocation. Allocators allow the same decomposition. They define the functions allocate()and deallocate() for acquisition and release of (uninitialized) memory, and have functions construct()and destroy() for object initialization and cleanup. Here is the interface of the member functions required for alternative allocation strategies:

template class allocator {
public:
pointer allocate(size_type, allocator::const_pointer hint = 0);
void deallocate(pointer p, size_type n);
void construct(pointer p, const T& val);
void destroy(pointer p);
// ...
};

The second argument hint to theallocate()function is intended to be used by the allocator to aid locality. If for instance a sequence of list elements is allocated, it might be desired to place all these pieces of memory close to each other, e.g. onto the same memory page. By passing in the previous list element as a hint to allocation of the subsequent list element, the allocator can optimize the memory layout. Note, that this optimization is not required of an allocator type; the hint may be ignored.

The default allocator type allocatoris an example of a conforming allocator type. Let us see how it implements these functions:

template class allocator {
public:
pointer allocate(size_t size, allocator::const_pointer hint = 0)
{ return (T*)::operator new(size*sizeof(T)); }
void deallocate(pointer p, size_type n)
{ operator ::delete(p); }
void construct(pointer p, const T& val)
{ new((void *)p) T(val); }
void destroy(pointer p)
{ ((T*)p)->~T(); }
};

Allocation and deallocation boil down to memory allocation and deallocation from the heap by means of the global operators newand delete. Initialization is a call to placement new, which calls the value type's constructor to initialize the previously allocated memory. The function destroy() is an explicit call of the value type's destructor.

There is an additional requirement to allocator types, that was introduced into the standard long after allocators had first been added to the standard library: All instances of a given allocator type are required to be interchangeable. This means that, for any two allocator objects of the same allocator type, it must not make a difference which allocator object is used for allocation or deallocation.

This requirement imposes quite a restriction on allocator types. Consider the example of an allocator type thread_allocatorrepresenting thread-specific memory. Each thread allocator might manage memory specific to another thread. An allocator object of typethread_allocator will therefore contain an identification of the thread which the memory belongs to. Obviously, it makes a significant difference which allocator object is used. Objects of typethread_allocator are not interchangeable and violate the additional requirement. As a consequence, it is not guaranteed that this kind of allocator can be used with standard-compliant container implementations. Why is this restriction?

Let us consider the "prohibited" situation of two containers with unequal allocator objects a little further. Some algorithms operates on two containers. An example of such an operation is the swap()algorithm. It exchanges the content of two objects. Both objects have to be of the same type. Say, they are vectors of strings, which are allocated via thread_allocators. Both vectors are of the same type, namely vector<string, thread_allocator >, but they have different, unequal allocator objects referring to different threads.

vector<string, thread_allocator > vec_1(thread_allocator(threadId_1));
vector<string, thread_allocator > vec_2(thread_allocator(threadId_2));
vec_1.swap(vec_2);

What does it mean to swap the elements of vec_1, lying in memory specific to thread 1, with the elements of vec_2, whose elements are located in memory specific to thread 2? Does it mean that all elements of vec_1 are copied into the thread-specific memory of thread 2 and vice versa? Usually, the vector swap() is implemented much more efficiently and does not involve copying of elements at all. Instead, the vector swap() simply exchanges the vectors' internal data, i.e. the pointers to the first and last element and the pointer to the end of allocated storage. This cannot be the appropriate thing to do, if two different allocator objects are involved. Unfortunately, this question does not only apply to swap(), but to all operations that involve two containers of the same type.

The requirement of interchangeability of allocator objects was introduced into the standard in order to allow for efficient library implementations. For a library implementor the additional requirement means that the swap()algorithm for instance can be implemented in its most efficient form: It does not make a difference how (i.e. by means of which particular allocator object) the vectors were allocated; hence swapping the internal pointers is sufficient. The effort for providing a swap()algorithm for containers with non-interchangeable allocators is significantly higher. The library implementor would have to provide several versions of swap(): an optimized one for interchangeable allocators and , maybe several additional, versions for non-interchangeable allocators.

What does the requirement mean for users of the standard library? Well, it depends on what a user intends to do with allocators. Let's distinguish between implementing an allocator type and using allocators.

We use allocators, implicitly, when we use containers. Imagine a situation similar to the swap()algorithm: we want to use several containers of the same type and create, copy, or destroy container elements. In that case we can take advantage of the additional requirement of interchangeable allocators: We need not care which container's allocator we use. The containers' allocators are all of the same type and, meeting to the requirement, are interchangeable and basically all the same.

The requirement of interchangeable allocators has a fundamentally different meaning for programmers who aim to implement their own allocator types. In that case the requirement is a restriction imposed on the new allocator type. If the allocator type does not meet the requirement, then there is no guarantee that it can be used with containers and algorithms of any conforming standard library implementation.

Pointer and Reference Types

Besides memory allocation strategies, allocators are intended for representing alternative pointer types. For this purpose, an allocator type must define nested types for pointers, references, and related items, as well as member functions called address(). Specifically:

// types:
value_type
pointer
const_pointer
reference
const_reference
size_type
difference_type
// functions:
pointer address(reference x) const;
const_pointer address(const_reference x) const;

Non-standard pointer types must behave like native pointer types. For this reason, these types and conversions defined by an allocator must reflect the relationship among native pointer and reference types. Among built-in pointer and reference types, the C++ type system already defines implicit conversions between a type T and its reference type T&. Also, there are conversions defined from Tand T& to T*by means of the address operator&()and from T* to Tvia the dereferencing operator*().

Figure 2: Conversions of built-in pointer and reference types in the C++ type system

Non-standard pointer and reference types must mimic this behavior: The types Allocator::value_type, Allocator::pointer, and Allocator::reference are required to be convertible to each other, in the same way as T,T*, and T&. This convertibility is achieved by imposing the following requirements on the types defined by an allocator:

Figure 3: Conversions of non-standard pointer and reference types defined by an allocator

Let us see how the default allocator type allocator meets these requirements. The pointer, reference and related types are identical to those in the C++ type system. The address()function for conversion from a reference type to a pointer type boils down to the address operator.

template class allocator {
public:
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef T value_type;
typedef size_t size_type;
typedef ptrdiff_t difference_type;
pointer address(reference x) const { return (&x); }
const_pointer address(const_reference x) const { return (&x); }
// ...
};

Despite of the latitude this model seems to allow at first sight, we are not entirely free to define alternative pointer and reference types. The crux is that almost all operations in the STL are implemented as function templates, and convenient use of these function templates involves automatic function template argument deduction. Unfortunately, the automatic deduction does not work in presence of alternative pointer and reference types. Let us see why.

Assume, we had defined an allocator type with non-standard pointer and reference types. Let us call it smartPtr_allocator. The non-standard types defined by smartPtr_allocatorwould be used, for instance, as the return type of the index operator[]()of a vector<T,smartPtr_allocator >. Consider a function that takes a reference returned by operator[]()as a function argument, say we pick the max()algorithm from the standard library. To our very surprise, a call to max()like in the following code snippet

vector<int,smartPtr_allocator > vec;
... max(vec[0],vec[1]) ... // error

will not compile. The reason is that max()is a function template. Normally, the compiler automatically deduces the function template arguments from the types of arguments passed to the function call. If vec[0] and vec[1]were of type int&, then the compiler would conclude that max()must be called here. In our case of non-standard pointer and reference types, the arguments to max() are of type smartPtr_allocator::reference.The compiler is not capable of understanding that smartPtr_allocator::referencerepresents an alternative reference to type int, and it does not deduce int as the template argument, but the allocator's reference type itself. As we cannot rely on the compiler for template argument deduction under these circumstances, we would have to specify the template argument of max()explicitly and would call max(), or more precisely:

vector<int,smartPtr_allocator > vec;
... max< vector<int,smartPtr_allocator >::value_type>(vec[0],vec[1]) ...

The same inconvenience occurs for the other nested types. For instance, the compiler is capable of deducing Tas a template argument from a T*function argument, but it cannot deduce Allocator::value_typefrom Allocator::pointer. To make automatic function template argument deduction work, the standard makes the following requirements to the typedef members of an allocator type:pointermust be defined as T*,const_pointerasconst T*, reference as T&,const_reference as const T&,value_typeas T,size_typeas size_t, and difference_typeas ptrdiff_t.

This additional requirement was introduced into the standard to guarantee convenient use of standard library operations that use any of the types defined by allocators. Without the requirement, library implementers would have to find ways and means for allowing the desired convenience. They could, for instance, provide specializations of the max()algorithm, so that its use would not involve automatic argument deduction, but would work via function template overloading. Library implementers are encouraged, by the standards committee, to supply libraries that can accept allocator types with alternative pointer and reference types. To our knowledge, none of the available implementations does so.

As a user of allocators, we profit from the imposed requirement, because we, too, can assume that a container's pointer or reference type indeed is a pointer or reference type in the sense of the C++ type system, and that conversions and deductions automatically performed by the compiler for these types work as expected.

If, however, we implement a new allocator type, the requirement basically means that we cannot define alternative pointer and reference types at all.

Summary

Allocators are intended to allow for alternative pointer and reference types as well as for alternative memory allocation schemes. Allocators are used by all containers in the standard library, including the character container class basic_string. The support of allocators is slightly restricted to permit efficient library implementations.

References

Working Paper for Draft Proposed International Standard for Information Systems
Programming Language C++
Accredited Standards Committee X3, INFORMATION PROCESSING SYSTEMS
Doc No:X3J16/97-0079, WG21/N1117
Date: 29 September 1997

If you are interested to hear more about this and related topics you might want to check out the following seminar:
Seminar Effective STL Programming - The Standard Template Library in Depth 4-day seminar (open enrollment and on-site) IOStreams and Locales - Standard C++ IOStreams and Locales in Depth 5-day seminar (open enrollment and on-site)