
Introduction

Pointers are indispensable. Pointers:

The pointer support can be:

In C++, it’s best to avoid raw pointers and to use the advanced pointer support instead, in the form of the standard smart pointers.

A reference in Java or C# is a shared pointer with the object member selection syntax (i.e., object.member). A C++ reference is an alias, which at compile time will be either optimized out or turned into a raw pointer.

Motivation: the problems of raw pointers

Raw pointers are easy to use, but also very error-prone: mistakes are easy to make.

The problems

When we have a pointer of type T * which points to a dynamically-allocated memory location, we face these problems:

The type problem

The new and delete operators come in many versions, most notably:

If we allocate data with either the single or the array version of the new operator, we should destroy the data with the same version of the delete operator. However, the type of the pointer is the same for both versions, so it’s easy to mismatch them, which results in undefined behavior.

The ownership problem

The ownership problem can result in:

The exception handling problem

If we manage dynamically-allocated data with raw pointers, exception handling becomes a boring and error-prone task, especially if the data is complex. It’s doable, but who really wants to do it?

An example

The example below shows how easily we can run into the type, ownership, and exception handling problems. The compiler does not report problems with this broken code.

// Who should destroy the allocated data?  Should the data be
// destroyed by the foo function?

void
foo(int *p)
{
  // By mistake the array delete is used.
  delete [] p;
}

int *
factory()
{
  int *p;

  try
    {
      p = new int;

      // Work on the new data.  An exception could be thrown here.
      throw true;

      return p;
    }
  catch(bool)
    {
      // It's easy to forget this delete:
      delete p;      
    }

  return nullptr;
}

int
main()
{
  // The problem is brewing: the type int * doesn't tell whether p
  // points to a single integer or to the beginning of an array of
  // integers.
  int *p = factory();

  // I'm thinking that foo will use, but not destroy the data.
  foo(p);

  // This is the second delete.
  delete p;
}

The smart pointer solution

A smart pointer manages dynamically-allocated data, and so we call a smart pointer the managing object, and the dynamically-allocated data the managed data.

A smart pointer doesn’t copy or move the managed data; it can only destroy the data.

The type of the managed data doesn’t have to be prepared in some special way in order to be managed by smart pointers, e.g., the type doesn’t have to be derived from some special type with the required functionality implemented.

The smart pointers solve:

Every flexible language should support raw pointers, because this low-level functionality is needed to implement high-level functionality, such as smart pointers.

A programmer should have a choice between raw pointers (perhaps for implementing some intricate functionality) and smart pointers (for everyday use).

In C++, for everyday use, a programmer should not resort to raw pointers, let alone to the void * trickery. Those times are long gone.

Smart pointer types

There are three smart pointer types defined in the header file <memory>:

Smart pointer types are thin wrappers around raw pointers. The wrapping is resolved at compile time, so at run time it should not degrade memory or time performance: smart pointers should be as fast, and take as little memory, as code of the same functionality manually crafted with raw pointers.

Smart pointer types are:

There is also type std::auto_ptr, which was deprecated in C++11 and removed in C++17. Don’t use it.

std::unique_ptr

An object of type std::unique_ptr has the exclusive ownership semantics:

The exclusivity implies that std::unique_ptr is a move-only type, and so:

The ownership implies that the managed data should be destroyed when the managing object is:

This is most likely the smart pointer you need when you switch from raw pointers to smart pointers.

An example

Type std::unique_ptr is a templated type: you need to pass the type of managed data as an argument to the template type. We pass the template arguments in the angle brackets, i.e., <>, like this:

std::unique_ptr<managed_data_type> p;

In the example below, the managing object p manages the data of type int, which will be automatically destroyed by p when it goes out of scope.

#include <memory>

int
main()
{
  std::unique_ptr<int> p(new int);
}

Function std::make_unique

Function template std::make_unique was introduced for convenience (we could do without it): it creates both the managing object, and the managed data.

We can create the managed data ourselves with the new operator, and pass its raw pointer to the managing object like this:

unique_ptr<A> up(new A("A1"));

Instead, we can write the equivalent code like this, without typing type A twice:

auto up = make_unique<A>("A1");

Function std::make_unique introduces no overhead: the move constructor will be elided for the return value, and so the managing object will be created directly in the location of up.


unique_ptr<A> up = make_unique<A>("A1");

To use function std::make_unique, you need to pass the template argument, which is the type of the managed data to create, and manage. The arguments (none, one or more) of the function are passed to the constructor of the managed data (a feat accomplished with the variadic templates). In the example above "A1" is the argument passed to the constructor of type A.

No performance overhead

The example below demonstrates that using smart pointers incurs no performance overhead. In more complicated examples there might be some small overhead, which should go away as compilers get better at optimization.

The following example uses both the std::unique_ptr and std::make_unique. Save this file as test1.cc:

#include <memory>

int
main()
{
  auto p = std::make_unique<int>();
}

The following example of the same functionality uses raw pointers. Save this file as test2.cc:

#include <memory>

int
main()
{
  int *p = new int;
  delete p;
}

Now compile them to the assembly code with:

g++ -S -O3 test1.cc test2.cc

Now there are two files with the assembly code: test1.s, and test2.s. Take a look at one of them:

c++filt < test1.s | less

Compare them to see that they are the same, instruction for instruction (almost: there is one small difference), which shows there is no overhead of using std::unique_ptr and std::make_unique:

diff test1.s test2.s

How to use std::unique_ptr

The example below demonstrates the basic usage of std::unique_ptr.

#include <cassert>
#include <iostream>
#include <memory>
#include <string>

using namespace std;

struct A
{
  string m_name;

  A()
  {
    cout << "default ctor\n";
  }

  A(string &&name): m_name(move(name))
  {
    cout << "ctor: " << m_name << endl;
  }

  ~A()
  {
    cout << "dtor: " << m_name << endl;
  }

  // Smart pointers never copy or move their managed data, so we can
  // delete these special member functions, and the code should
  // compile.
  A(const A &) = delete;
  A(A &&) = delete;
  A &operator=(const A &) = delete;
  A &operator=(A &&) = delete;
};

int
main()
{
  // That's an empty pointer.
  std::unique_ptr<A> p1;

  // That's how we test whether a pointer manages some data.
  assert(!p1);
  assert(p1 == nullptr);

  // This pointer manages an object.
  std::unique_ptr<A> p2(new A("A1"));
  assert(p2);
  assert(p2 != nullptr);

  // We can assign a new object to manage, but not this way.
  // p1 = new A("A1'");

  // That's the correct way.  The previously managed object is
  // destroyed.
  p2.reset(new A("A2"));

  // Or better yet:
  p2 = make_unique<A>("A3");

  // We cannot copy-initialize, because the ownership is exclusive.
  // std::unique_ptr<A> p3(p2);
  // auto p3(p2);

  // We cannot copy-assign, because the ownership is exclusive.
  // p2 = p1;

  // We can move-initialize to move the ownership.
  auto p3 = move(p2);

  // We can move-assign to move the ownership.
  p2 = move(p3);

  // That's how we can get access to the managed data.
  cout << p2->m_name << endl;
  cout << (*p2).m_name << endl;
  cout << p2.get()->m_name << endl;

  // The "release" function releases p1 from managing the data.  The
  // managed data is not destroyed.  Luckily, p1 doesn't manage
  // anything, so we don't get a memory leak.
  p1.release();
}

The solutions to the problems

The type problem

The type problem, more specifically the problem of mismatching the single and array versions of the new and delete operators, is solved by two versions of std::unique_ptr (the primary template, and its partial specialization for arrays):

By using the right version of the smart pointer, you don’t have to remember to destroy the managed data with the matching version of the delete operator.

Lurking problems, and how to deal with them.

However, we can still introduce bugs, as in the example below, where we:

Use std::make_unique to do the same more safely, as shown below.

#include <memory>

using namespace std;

int
main()
{
  // Undefined behavior!
  unique_ptr<int> up1(new int[5]);
  unique_ptr<int[]> up2(new int);

  // The preferred way, because it's less error-prone.
  auto up3 = make_unique<int>();
  auto up4 = make_unique<int[]>(5);
}

Use std::array instead!

If you really have to have an array of static size (i.e., the size doesn’t change at run-time), it’s better to use std::array instead of the C-style array. You can use it with smart pointers like this:

#include <array>
#include <memory>

using namespace std;

int
main()
{
  unique_ptr<array<int, 5>> up1(new array<int, 5>);
  auto up2 = make_unique<array<int, 5>>();
}

The ownership problem

The ownership problem is solved: you just move the ownership where you need to, e.g., a function or some structure. You can move the ownership when you pass or return a unique pointer by value in a function call, as shown in the example below.

#include <iostream>
#include <memory>

using namespace std;

struct A
{
  A()
  {
    cout << "ctor\n";
  }

  ~A()
  {
    cout << "dtor\n";
  }
};

auto // C++14
factory()
{
  auto p = make_unique<A>();

  return p; // Return value optimization.
}

void
stash(unique_ptr<A> p)
{
  static unique_ptr<A> stash;
  stash = move(p);
}

int
main()
{
  auto p = factory();
  stash(move(p));
}

The exception handling problem

When an exception is thrown, the data previously allocated (or any other resource acquired) and not required any longer because of the exception, should be deleted (or released). When programming with raw pointers, we can release the memory in the catch block, as shown in the example below. We have to declare p before the try block, so that it’s accessible in the catch block, and that complicates the code.

#include <iostream>

struct A
{
  A()
  {
    std::cout << "ctor\n";
  }

  ~A()
  {
    std::cout << "dtor\n";
  }
};

void
foo()
{
  throw true;
}

int
main(void)
{
  A *p;

  try
    {
      p = new A();
      foo();
      delete p;
    }
  catch (bool)
    {
      // Have to delete.
      delete p;
    }
  
  return 0;
}

The same can be accomplished better with smart pointers:

#include <iostream>
#include <memory>

struct A
{
  A()
  {
    std::cout << "ctor\n";
  }

  ~A()
  {
    std::cout << "dtor\n";
  }
};

void
foo()
{
  throw true;
}

int
main(void)
{
  try
    {
      auto p = std::make_unique<A>();
      foo();
    }
  catch (bool)
    {
    }
  
  return 0;
}

Raw pointers: not so easy, rather error-prone

Because function arguments are not guaranteed to be evaluated in the order they are listed, in the example below we’ve got a memory leak. At least I’ve got it with GCC, and if you don’t, try to swap the arguments in the call to foo. The object of class A is:

#include <iostream>

struct A
{
  A()
  {
    std::cout << "ctor\n";
  }

  ~A()
  {
    std::cout << "dtor\n";
  }
};

void
foo(int, A *p)
{
  delete p;
}

int
index()
{
  throw true;
  return 0;
}

int
main(void)
{
  try
    {
      foo(index(), new A());
    }
  catch (bool)
    {
    }
  
  return 0;
}

The same can be accomplished the safe way with smart pointers. This code works correctly regardless of whether an exception is thrown or not.

#include <iostream>
#include <memory>

struct A
{
  A()
  {
    std::cout << "ctor\n";
  }

  ~A()
  {
    std::cout << "dtor\n";
  }
};

void
foo(int, std::unique_ptr<A> p)
{
}

int
index()
{
  throw true;
  return 0;
}

int
main(void)
{
  try
    {
      foo(index(), std::make_unique<A>());
    }
  catch (bool)
    {
    }
  
  return 0;
}

The first example revisited

Below is the first example fixed with smart pointers. All the problems are gone.

#include <memory>

using namespace std;

void
foo(unique_ptr<int> p)
{
}

unique_ptr<int>
factory()
{
  try
    {
      auto p = make_unique<int>();

      // Work on the new data.  An exception could be thrown here.
      throw true;

      return p;
    }
  catch(bool)
    {
    }

  return nullptr;
}

int
main()
{
  auto p = factory();
  foo(move(p));
}

Conclusion

Quiz

Acknowledgement

The project was financed under the program of the Minister of Science and Higher Education under the name “Regional Initiative of Excellence” in the years 2019 - 2022, project number 020/RID/2018/19, the amount of financing 12,000,000 PLN.