
Introduction

When we dynamically create some data (or any other resource) and use them in other threads or parts of the code, we run into the problem of when to destroy the data. If we:

don’t destroy the data, we get a memory leak,
destroy the data too soon, we get a race condition: a dangling pointer and undefined behavior, because other threads or parts of code still use the data,
destroy the data too late, we get suboptimal performance, because we let the dispensable data linger in memory.

Therefore we should destroy the data at the right time, i.e., when they are no longer needed. However, this right time is hard to pinpoint with raw pointers, because it may depend on:

the data (i.e., their specific values),
the timing of other threads.

The solution to the problem is the shared-ownership semantics:

shared, because the data are managed by a group,
ownership, because the data are destroyed when the group becomes empty.

In Java or C#, a reference has the shared-ownership semantics.

`std::shared_ptr`

#include <memory>
The smart pointer type that implements the shared-ownership semantics.
The objects of this type are the managing objects, and the data allocated dynamically is the managed data.
It’s a template class: the template argument is the type of the managed data.
The counterpart of std::unique_ptr, which implements the exclusive-ownership semantics.
Objects of this class can be copied and moved.
A managing object that takes the ownership of the managed data creates a group of managing objects. Initially, this group has only one managing object.
When we copy a managing object, we create another managing object, which belongs to the same group of managing objects.
The managed data is destroyed when the last managing object of the group is destroyed or stops managing the data.
The managed data don’t know they are managed: the type of the data doesn’t have to be prepared to be managed, e.g., derived from some base class.
As performant, in terms of memory and time, as the same functionality implemented “manually” with raw pointers.
An object of this class takes twice as much memory as a raw pointer.

Details

Usage

The example below shows the basic usage.

#include <cassert>
#include <iostream>
#include <memory>
#include <string>
#include <utility>

using namespace std;

struct A
{
  string m_text;

  A(const string &text): m_text(text)
  {
    cout << "ctor: " << m_text << endl;
  }

  ~A()
  {
    cout << "dtor: " << m_text << endl;
  }
};

int main (void)
{
  // sp takes the ownership.
  shared_ptr<A> sp(new A("A1"));
  assert(sp);

  // We make sp manage a new object.  A1 is destroyed.
  sp.reset(new A("A2"));

  {
    // We copy-initialize the ownership.
    shared_ptr<A> sp2(sp);
    assert(sp);
    assert(sp2);

    shared_ptr<A> sp3;
    // We copy-assign the ownership.
    sp3 = sp2;
    assert(sp2);
    assert(sp3);

    // Even though sp2 and sp3 go out of scope, A2 will not be
    // destroyed, because it's still being managed by sp.
  }

  {
    // We move-initialize the ownership.
    shared_ptr<A> sp2(std::move(sp));
    assert(!sp);
    assert(sp2);

    shared_ptr<A> sp3;
    // We move-assign the ownership.
    sp3 = std::move(sp2);
    assert(!sp2);
    assert(sp3);

    // A2 is destroyed, because sp3 (the sole managing object of A2)
    // goes out of scope.
  }

  // We can't release the managed data from being managed, as we are
  // able to do with unique_ptr, because we can't preempt (strip)
  // other shared_ptr objects of their ownership.

  // sp.release();

  // If we want to reset a pointer, we can use the reset function.
  sp.reset();
}

How it works

The group of managing objects shares a control data structure, which is allocated dynamically by the first object in the group.
A managing object has a pointer to the control data structure of its group.
A reference count (i.e., the size of the group) is a field in the control data structure.
When a managing object is copied, the reference count is incremented.
When a managing object is destroyed, the reference count is decremented.
When the reference count reaches 0, the managed data is destroyed.
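
The counting described above can be observed with the member function use_count, which reports the current size of the group. The sketch below is only an illustration: the names Counts, Probe, observe_counts and destroyed_with_last_owner are hypothetical helpers made up for this example, not part of the library.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: the group size observed at three moments.
struct Counts
{
  long initial;
  long after_copy;
  long after_destroy;
};

Counts observe_counts()
{
  Counts c{};
  // The first managing object creates a group of size 1.
  std::shared_ptr<int> sp(new int(7));
  c.initial = sp.use_count();

  {
    // Copying adds a managing object to the same group.
    std::shared_ptr<int> sp2(sp);
    c.after_copy = sp.use_count();
  }

  // sp2 is destroyed, so the count went back down.
  c.after_destroy = sp.use_count();
  return c;
}

// Hypothetical managed type that records its own destruction.
struct Probe
{
  bool &m_destroyed;
  explicit Probe(bool &destroyed): m_destroyed(destroyed) {}
  ~Probe() { m_destroyed = true; }
};

// Returns true if the data survived the first reset, and was
// destroyed by the second one (when the count reached 0).
bool destroyed_with_last_owner()
{
  bool destroyed = false;
  std::shared_ptr<Probe> sp(new Probe(destroyed));
  std::shared_ptr<Probe> sp2(sp);

  sp.reset();                         // count: 2 -> 1
  const bool too_early = destroyed;   // still alive here
  sp2.reset();                        // count: 1 -> 0

  return !too_early && destroyed;
}
```

Calling observe_counts() from main should give the counts 1, 2 and 1, and destroyed_with_last_owner() should return true.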

From `unique_ptr` to `shared_ptr`

We can move the ownership from unique_ptr to shared_ptr like this:

#include <memory>

using namespace std;

int
main()
{
  auto up = make_unique<int>();
  shared_ptr<int> sp(up.release());
}

But it’s better done this way, which states the intent directly and also carries over the deleter of the unique_ptr:

#include <memory>
#include <utility>

using namespace std;

int
main()
{
  auto up = make_unique<int>();
  shared_ptr<int> sp(std::move(up));
}

We can move the ownership from an rvalue of type unique_ptr, because shared_ptr has a constructor that takes an object of type std::unique_ptr by rvalue reference. Therefore, we can also create a shared_ptr object from a temporary object of type unique_ptr, e.g., one returned by a function like this:

#include <memory>
#include <utility>

using namespace std;

unique_ptr<int>
factory()
{
  return make_unique<int>();
}

int
main()
{
  shared_ptr<int> sp(factory());
}
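
To check what the move actually does, the sketch below inspects both pointers afterwards. The function name ownership_moved is a hypothetical helper introduced only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Moves the ownership from a unique_ptr to a shared_ptr, and reports
// whether the unique_ptr ended up empty and the shared_ptr became
// the sole managing object of the very same data.
bool ownership_moved()
{
  auto up = std::make_unique<int>(42);
  int *raw = up.get();

  std::shared_ptr<int> sp(std::move(up));

  return !up                      // up no longer manages anything
         && sp.get() == raw       // sp manages the same data
         && sp.use_count() == 1   // the group has one member
         && *sp == 42;            // the data is intact
}
```

Called from main, ownership_moved() should return true: the data is not copied, only the responsibility for destroying it changes hands.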

Performance

A shared_ptr object takes twice as much memory as a raw pointer, because it has two fields:

a pointer to the managed data,
a pointer to the control data structure.

On top of this, there is the memory taken by the dynamically allocated control data structure, but it’s not a big deal, because it’s shared among the managing objects of a group.

A pointer to the managed data could be kept in the control data structure, but then getting to the managed data would involve an extra indirect access, thwarting performance.
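
The size claim can be checked with sizeof. Note the assumption: the C++ standard does not prescribe the layout of shared_ptr, so the factor of two is what the mainstream standard library implementations are expected to give, not a guarantee. The function names below are hypothetical helpers for this sketch.

```cpp
#include <cstddef>
#include <memory>

// Size of a raw pointer to int.
constexpr std::size_t raw_pointer_size()
{
  return sizeof(int *);
}

// Size of a managing object: typically a pointer to the managed
// data plus a pointer to the control data structure.
constexpr std::size_t shared_ptr_size()
{
  return sizeof(std::shared_ptr<int>);
}
```

On a typical 64-bit platform, raw_pointer_size() is expected to be 8 and shared_ptr_size() to be 16.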

`std::make_shared`

When creating the managed data and the managing object, we have to write the type of the managed data twice (and can introduce bugs when the two don’t match):

#include <memory>

using namespace std;

int
main()
{
  // We have to type int twice.
  shared_ptr<int> sp(new int);
  // Bug: allocation and deallocation mismatch: new int vs delete[]
  shared_ptr<int[]> sp2(new int);
  // Bug: allocation and deallocation mismatch: new int[5] vs delete
  shared_ptr<int> sp3(new int[5]);
}

But we can use the function make_shared and write the type only once like this (which is less error-prone):

#include <memory>

using namespace std;

int
main()
{
  // We don't have to write the type twice.
  auto sp = make_shared<int>();
  // We can't mismatch the allocation and deallocation.
  // The array form of make_shared requires C++20.
  auto sp2 = make_shared<int[]>(5);
}

Function template make_shared takes the type of the data to manage as its template argument.

Similar to function make_unique, function make_shared creates the managed data and the managing object, and then returns the managing object. There is no performance overhead, since the function will most likely be inlined, and the constructors elided when returning the managing object.

Interestingly, make_shared typically allocates in one piece (i.e., with one memory allocation) the memory for both the managed data and the control data structure, and then creates them in place (i.e., without allocating more memory), which is faster than two separate memory allocations.
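
One way to see the difference is to count the calls to the global operator new. The sketch below replaces that operator with a counting version, which is an assumption of this illustration and not something the library requires. The names g_allocations, allocations_with_new and allocations_with_make_shared are hypothetical. The standard only recommends a single allocation for make_shared, and an optimizing compiler may remove allocations altogether, so the counts are what is expected from the mainstream implementations rather than guaranteed.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Number of calls to the replaced global operator new.
static int g_allocations = 0;

// Counting replacement of the global allocation function.
void *operator new(std::size_t size)
{
  ++g_allocations;
  if (void *p = std::malloc(size == 0 ? 1 : size))
    return p;
  throw std::bad_alloc();
}

// Matching replacements of the global deallocation functions.
void operator delete(void *p) noexcept
{
  std::free(p);
}

void operator delete(void *p, std::size_t) noexcept
{
  std::free(p);
}

// Allocations made when the managed data is created with new, and
// the control data structure is created by the constructor.
int allocations_with_new()
{
  const int before = g_allocations;
  std::shared_ptr<int> sp(new int(1));
  return g_allocations - before;
}

// Allocations made when make_shared creates both.
int allocations_with_make_shared()
{
  const int before = g_allocations;
  auto sp = std::make_shared<int>(1);
  return g_allocations - before;
}
```

Without aggressive optimization, allocations_with_new() is expected to return 2 (the data and the control data structure), and allocations_with_make_shared() is expected to return 1.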

Conclusion

An object of class shared_ptr<T> allows for sharing data of type T that were dynamically allocated.
The objective: destroy the managed data exactly at the time the data is no longer needed.
A managing object of type shared_ptr is twice as large as a raw pointer.
We can easily pass the ownership from unique_ptr to shared_ptr, but not the other way around.

Quiz

What’s the difference between unique_ptr and shared_ptr?
What do we need a control data structure for?
Does the type of the managed data need to have some properties in order to be managed?

Acknowledgement

The project is financed under the program of the Minister of Science and Higher Education under the name “Regional Initiative of Excellence” in the years 2019-2022, project number 020/RID/2018/19, the amount of financing 12,000,000 PLN.
