A list, an associative map, a set, and other data structures, are called containers in C++. A container:
is a generic data structure,
stores elements of any type,
stores elements only of a single type,
stores elements by value, not by reference,
stores elements at the heap,
can grow or shrink dynamically (i.e., at run-time).
A container is generic, because it can store data of any type
container type (e.g.,
std::vector<T>) is templated, because we
have to give a template argument
T at compile-type.
While the container itself (i.e., the object of the container type) can be anywhere in memory, the elements have to be stored at the heap, because the size of the container can change at run-time.
Containers are complemented by iterators and algorithms. The iterator
is a generalization of a pointer. Algorithms, such as
are generalized in that they can be used with various containers,
because access to the container elements is provided with iterators.
Standard containers, iterators, and algorithms fit together because of
the naming and semantic convention. As for the naming convention, for
instance, every container has the
other functions. As for the semantic convention, for instance, we get
the data of iterator
i by dereferencing:
In the early 90’s, containers were:
a cutting-edge technology, and a hot research topic,
originally implemented as the STL, the seminal work by Alexander Stepanov.
Now containers are:
an indispensable tool of every-day use,
a part of the standard library.
There is no excuse, put them to use.
std::vector<T> - the vector,
std::list<T> - the doubly-linked list,
std::deque<T> - the deque, amalgamation of the vector and the
std::forward_list<T> - the singly-linked list,
std::map<K, V> - the associative array (aka dictionary),
std::multimap<K, V> - the associative array with duplicate keys
std::set<T> - the set,
std::multiset<T> - the set with duplicate elements allowed,
std::stack - the stack,
std::queue - the queue,
std::priority_queue<T> - the priority queue.
Container types can be nested, i.e.,
T can be a container type too.
The vector offers:
random access with the random access operator, i.e., the
operator, aka the index operator, or the subscript operator,
memory contiguity: all elements are stored contiguous in memory, which implies:
the random access operator is as fast as can be, because it only has to increase a pointer,
the memory access is faster: the elements of the vector are all stored compactly together, and so the processor cache is used the most effectively,
slow insertion and removal.
The vector may have to reallocate the elements when the currently allocated memory for elements is not enough, as when, e.g., we insert an element into a vector. First, new memory has to be allocated. Next, the elements are copied or moved, depending on whether the type of the elements has the move semantics implemented. Finally, the old memory is released.
The performance of the vector drops not only when elements are frequently reallocated, but also when elements are frequently inserted or removed. When an element is inserted or removed, the elements that follow have to be moved (or copied) one-by-one, because the vector has to guarantee the memory contiguity.
In comparison with other containers, the vector performs very well if the reallocation, insertion and removal do not frequently happen, for instance, if we build a vector, fill it in, and then random-access the elements frequently.
The list does not store its elements contiguously in memory, but stores them in different locations at the heap, which were allocated separately. Then the elements are doubly-linked, which implies:
fast insertion and removal, because elements do not have to be reallocated, and neither the elements that follow have to be moved one-by-one,
iterative access only, because to access some element, we have to get to it through the elements that proceed or follow it.
The list does not offer element random access, even though it could
with a more elaborate implementation. However, C++ aims at providing
fast and lean tools, and such a bloated list would decrease
performance. The list of type
std::list offers the minimal
functionality required, but with superior (time and memory)
The deque (pronounced as “deck”, as in deck of cards) offers:
fast insertion and removal.
How come? Well, insertion and removal are not as fast as in
std::list, and random access is not as fast as in
The deque offers a trade-off between functionality, and efficiency.
Use the deque if you frequently need to random-access, insert and
The deque is implemented with small vectors organized one after another, but without guaranteeing memory contiguity. If element reallocation, insertion or removal is required, then it’s limited to one small vector. However, random access is less efficient in comparison with vector, because finding an address of an element requires more arithmetic.
Use the deque only when the vector and the list won’t do.
std::list is more than we need from a list. Sometimes we
need to forward-iterate only, and can do without the capability to
backward-iterate that is offered by
std::list, the capability that
we pay for with performance.
std::forward_list is even leaner and more performant than
std::list, because it is a singly linked list: we can iterate
forward, but not backward.
The set stores unique elements, and the multiset allows for multiple elements of an equal value. Both container types store the elements sorted. By default the order is ascending, but we can establish any order with a callable.
Interestingly, we cannot modify the elements, because that would destroy the order of elements, and make the container inconsistent. For this reason, the type of the elements stored is made const, even if the template parameter was non-const.
If we want to modify an element, then we have to remove the element first, and insert a new element with a different value next.
Iterators are the glue between the containers, and the algorithms.
std::sort can be used with various containers through
iterators. Iterators provide access to the container elements, so
that we do not care about the type of a container (the container type
was abstracted away).
An iterator is a generalization of a pointer. We could say that a pointer is the iterator of a C-style array. We can initialize the pointer, compare it to some other pointer, dereference it to get to the element it points to, and also increment it. Furthermore, we can random access any element in the C-style array if we increase (with the + operator) the pointer to the element number 0 by the value of the index as in here:
Iterator types are user-defined, e.g., of a structure type. Iterators are wrappers around pointers, where the operators (defined for that type) implement the required functionality. For instance, if in the example above we replace the C-style array with a deque, the rest of the code can remain untouched:
Iterators of the standard library are very small and very efficient. They typically store only a single pointer. Therefore we are free to use them by value, and copy them. We could use iterators by reference too, but that would be just awkward, just as awkward would be using pointers by reference.
For a given container type
T, there are always at least two iterator
non-const iterator type:
const iterator type:
You cannot modify elements of a container with a const iterator. For better code, always use the const iterator if you do not modify the elements.
We know where the elements of a container are with the
end functions. The
begin function returns an iterator to the
first element. The
end function returns a past-the-end iterator
which you would get if you incremented an iterator to the last
element: we could say the
end function returns an iterator to an
imaginary element (imaginary, because that element does not exist)
that would follow the last element. If a container has no elements,
the iterators returned by
end functions return non-const iterators for a
non-const container, and const iterators for a const container. If we
want to iterate with a const iterator over a non-const container, we
can use the
cend functions, which return const
cend are for convenience only, because they are
dispensable. We can achieve the same functionality by calling the
end functions for a non-const container when we
reference the container with a const reference, which we can do with
Iterators can be categorized into a few basic categories (concepts, technically speaking), depending on the provided functionality:
a forward iterator,
a bidirectional iterator,
a random-access iterator.
A forward iterator provides only the two most basic operations:
++i. The iterator of
std::forward_list is a forward iterator.
Here’s an example:
A bidirectional iterator is a forward iterator with an extra operation
--i, i.e., move back by one element. The iterator of
std::list is a bidirectional iterator. For example:
A random-access iterator allows for increasing or decreasing the iterator value by any integer, as we’ve done in the example above with the deque. The iterator of the vector is a random-access iterator. A pointer is a random-access iterator too.
We can forward-iterate over the elements of a container (or, more broadly, a range) in two ways: the old, and the new.
As shown in the example below, you can forward-iterate over the
elements of a container the old way, which is a bit tedious, because
we have to initialize the iteration variable
i, write the loop
condition, and then increment the variable. This loop is also
error-prone, as it’s easy to mistakenly write ‘–i’ instead of ‘++i’,
begin instead of
Since C++11, we can iterate the new way, using the range-based (aka for-each) syntax of the for loop. The semantics is the same as for the old way. The range-based loop is less error-prone, because we have to write less.
The syntax is:
declaration declares the variable that is initialized with the
container elements (or more precisely, the range elements) in every
iteration of the loop. We refer to this variable as the declared
expression is the range expression. Most often, we put the
container here. Having the range expression, we can get the values
of the begin and end iterators.
statement is the statement executed in every iteration of the
loop. In that statement we use the declared variable.
The new range-based loop is translated by a compiler to a regular
loop, where the iteration variable is of an iterator type. The
iteration variable is initialized with a value returned by the
function. The loop continues if the value of the iterator is not
equal to the value returned by the
end function. In every iteration
of the loop, the declared variable is initialized by the
dereferenced iteration variable. After an iteration, the iterator
To use the range-based loops, we need to make sure that:
we can call the
end functions for the range
the type of the values returned by the
should have the following defined:
!= comparison operator,
the dereference operator, i.e.,
the prefix increment operator, i.e.,
Here is an example how we can use that functionality:
We can copy or move an element into a container. We can also move it from a container, except for the associative containers, for which we can extract an element. We can also emplace an element in a container.
Containers have the move semantics implemented. An example:
We can move elements into containers: it’s enough to make sure that the element we insert is used in an rvalue. We can also move from an element of a sequence container. Example:
We cannot move an element from an associative container, because we cannot modify that element, even if we are using a non-const iterator, as the type of the element is const. We shouldn’t even move, because the object we would move from would not be part of the container any longer, as it would have to be in the case of, e.g., a vector. We should extract the element. Extraction is supported only by the associative containers, because it’s needed only there.
Extraction is implemented by unlinking the element from the
container. As the result of the extraction, we get a node handle
(an object of a move-only type) which owns the extracted element: when
a node handle is destroyed while still holding the element, that
element is destroyed too. The node handle can be implemented with the
unique smart pointer, i.e.,
Having a node handle, we can insert the element into a different container of the same type as the container from which the element was extracted. The element remains untouched (still in the same place with the same value) when it’s extracted and inserted; the ownership of the element changes from one container to the other passing through a node handle.
Having a node handle, we can access the owned element with the
function, and move it (move its value) somewhere, e.g., to a different
container of a different type. When the node handle is destroyed, it
will destroy the element from which we moved.
An element can be copied, moved, or emplaced into a container. Copying is needed when we want to keep the source element intact. Moving is faster, and so preferred over copying, if the source won’t be needed later. In both copying and moving, we create an object ourselves, and then pass it to a container. Emplacing creates an object based on the arguments we provide.
Emplacing is the fastest: a container tries to create the element in the required place: the element is created in-place, i.e., in the place (memory location) required by the container. No copying, no moving… if all goes well.
An emplace function takes the arguments of an element constructor, and passes them (forwards, technically speaking) to the constructor when it’s known where (i.e., the place is known) the element should be constructed.
We emplace by calling an
emplace function of a container.
Containers have other functions for emplacing with slight semantic
Emplacement works similar to insertion in that the elements that follow are “pushed to the right”. Therefore emplacement entails the same performance issues as the insertion does.
A container tries to emplace the element. Tries, because the place for the element might be already taken by some other element, e.g., when we emplace at the front of a non-empty vector. If that happens, the new element is created in a different memory location, and then moved into the required place.
Don’t implement the basic data structures, because they are already there.
Use the containers, and get better at using them. Their functionality is quite rich.
With the containers, you can build complex data structures.
With the containers use the standard algorithms, because your own algorithm implementations will most likely perform far worse.
What are the prominent differences between container types?
Why can’t we modify the set elements?
How does emplacement work?