cpp

Introduction

In C we use a function pointer to ship a piece of code (e.g., that establishes ordering between objects) to some other piece of code (e.g., a sorting function). In C++ a generalization of a function is something that we can call. We call it a callable. Calling a callable has the syntax of calling a function (i.e., applying the () operator), and the function interface: we know the types of the parameters and the return value.

The goal of that generalization is to:

avoid calling a function indirectly, i.e., using a pointer, so that a function being called can be inlined (i.e., compiled into the place of call),
allow for keeping some extra (which are not passed as call arguments) data, which we cannot do with a function pointer.

The standard library is using (passing, keeping as a member field) callables by value (i.e., not by reference), so copying a callable should be fast. Callables passed to standard algorithms (e.g., std::sort) and standard containers (e.g., std::priority_queue) should be copied fast. That means, a callable should not keep a lot of data that would be copied.

Callable is frequently used by value (e.g., by the standard library), but it can be used by reference or by pointer too.

Motivation

We sort integers below. We can do it, because integers have a built-in operator < defined:

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

int
main()
{
  vector v{3, 1, 2};
  sort(v.begin(), v.end());
  for(const auto &e: v)
    cout << e << endl;
}

Below we sort data of a user-defined type, and to this end we need to define the ordering (we could also say to establish ordering) between the objects with the comparison operator, i.e., the < operator. There are a number of comparison operators that we can define for a user-defined type: ==, !=, <, <=, >, =>, <=>, but for the ordering between elements (the objects of the user-defined type), the most important operator is the less-than operator, i.e., the < operator. We could refer to the less-than operator as the comparison operator, but to avoid confusion, we refer to it as the < operator.

We defined the < operator as a member function: this is the first object to compare, and the function parameter is the second. The member comparison operator should be declared as const (because it should not modify its object, i.e., this), and it should take the other operand by a const reference (because it should not change that operand).

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

struct A
{
  int id;
  A(int id): id(id)
  {
  }

  bool
  operator < (const A &a) const
  {
    return this->id < a.id;
  }
};

int
main()
{
  vector v{A(3), A(1), A(2)};
  sort(v.begin(), v.end());
  for(const auto &e: v)
    cout << e.id << endl;
}

Function std::sort is using the < operator by default, i.e., if we do not provide a comparison callable as the last call argument. Well, by default, std::sort is actually using an object of type std::less which in turn uses the < operator. For comparing the sorted elements, the implementation of std::sort does not use the < directly, because it would be stuck with it, and be unfit for customization. Instead, the implementation uses a callable for comparison.

No wonder that we can get the same effect as in the example above if we pass as the last argument an object of type std::less<A> (that’s a callable), because if we didn’t, it would be used anyway:

#include <algorithm>
#include <iostream>
#include <functional>

using namespace std;

struct A
{
  int id;
  A(int id): id(id)
  {
  }

  bool
  operator < (const A &a) const
  {
    return this->id < a.id;
  }
};

int
main()
{
  vector v{A(3), A(1), A(2)};
  sort(v.begin(), v.end(), std::less<A>{});
  for(const auto &e: v)
    cout << e.id << endl;
}

With std::sort, we do not always have to use the < operator, and instead we can pass a callable to establish ordering. We can sort in ascending order if we pass an object of the std::greater type. That type is using the > operator, and so we have to define it instead of the < operator.

#include <algorithm>
#include <iostream>
#include <functional>
#include <vector>

using namespace std;

struct A
{
  int id;
  A(int id): id(id)
  {
  }

  bool
  operator > (const A &a) const
  {
    return this->id > a.id;
  }
};

int
main()
{
  vector v{A(3), A(1), A(2)};
  sort(v.begin(), v.end(), std::greater<A>{});
  for(const auto &e: v)
    cout << e.id << endl;
}

A callable can be passed not only to a function, but also to a constructor, which can store the callable as a member field, as, e.g., the standard priority queue (type std::priority_queue) does. Here’s an example, which we’ll be modifying later:

#include <iostream>
#include <queue>

using namespace std;

int
main(void)
{
  priority_queue<int> q;

  q.push(2);
  q.push(1);
  q.push(3);

  while(!q.empty())
    {
      cout << q.top() << endl;
      q.pop();
    }
  
  return 0;
}

Callable types

A callable is either:

a function (used by pointer or reference),
a functor (used by value, reference or pointer).

A function

An expression that is only the name of a function, e.g., foo, (without the () operator) can be treated as the address of that function, which is called the decay of a function to a pointer. We can also take the address of a function by preceeding the name of the function with the & operator, e.g., &foo. These ways of taking a function address are equivalent, which is inconsistent, because there should be just one way.

Having this address, we can call the function. The only operations we can do with a function pointer is: take a function address, call the function.

In the example below we use a function by pointer and by reference.

#include <iostream>

bool
foo(const int &a, const int &b)
{
  std::cout << "foo: a = " << a << ", b = " << b << '\n';
  return true;
}

int
main()
{
  // Equivalent ways to call a function.
  foo(10, 20); // Is foo decaying into a pointer that we call?
  (&foo)(10, 20);

  // I don't like the C syntax of the pointer to a function.
  bool (*f1a)(const int &, const int &) = foo;
  bool (*f1b)(const int &, const int &) = &foo;
  f1a(10, 20);
  (*f1a)(10, 20);
  f1b(10, 20);
  (*f1b)(10, 20);

  // A reference to a function.
  bool (&f2a)(const int &, const int &) = foo;
  // A reference cannot be initialized with a pointer.
  // bool (&f2b)(const int &, const int &) = &foo;
  f2a(10, 20);
  // The following is wierd, but it compiles. f2a is an alias for foo.
  // So f2a is replaced with foo, and foo decays into a pointer, which
  // we dereference to get a function to call.
  (*f2a)(10, 20);
  
  // The C++ syntax for a function type.
  using foo_type = bool(const int &a, const int &b);
  // Pointers to a function.
  foo_type *f3a = foo;
  foo_type *f3b = &foo;
  // A reference to a function.
  foo_type &f3c = foo;
  f3a(10, 20);
  (*f3a)(10, 20);
  f3b(10, 20);
  (*f3b)(10, 20);
  f3c(10, 20);
  (*f3c)(10, 20);
}

Here we sort descendingly by passing a function pointer:

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

bool
cmp(const int &a, const int &b)
{
  return a > b;
}

int
main()
{
  vector v{3, 1, 2};
  sort(v.begin(), v.end(), cmp);
  for(const auto &e: v)
    cout << e << endl;
}

By default, the priority queue sorts descendlingly, i.e., the returns the largest elemenets. In the example below, we pass a pointer to a comparison function, so that the queue sorts ascendingly.

#include <functional>
#include <iostream>
#include <queue>

using namespace std;

bool
foo(const int &a, const int &b)
{
  return a > b;
}

int
main(void)
{
  priority_queue<int, vector<int>,
                 bool(*)(const int &, const int &)> q(foo);

  q.push(2);
  q.push(1);
  q.push(3);

  while(!q.empty())
    {
      cout << q.top() << endl;
      q.pop();
    }
  
  return 0;
}

A functor

A functor is an object on which we can call the () operator. The exact definition of the () operator depends on how we call a functor. In comparison with a function, a functor is more capable, because it can keep some extra data passed to its constructor.

In the example below we define a simple functor type, create a functor, and call it. The functor is a callable, because we can call it. Since the () operator was defined constant, we can call it for constant functors.

#include <iostream>

struct functor_type
{
  void operator()() const
  {
    std::cout << "Hello World from a functor!\n";
  }
};

int
main()
{
  const functor_type f;
  f();
}

We can store some data in a functor:

#include <iostream>

struct functor_type
{
  int m_n;

  functor_type(int n): m_n(n)
  {
  }

  void operator()() const
  {
    for(int i = 0; i < m_n; ++i)
      std::cout << "Hello World from a functor!\n";
  }
};

int
main()
{
  functor_type f(10);
  f();
}

A functor acts just like a function when it keeps no data. For example, types std::less and std::greater act like functions, because they do not store any data. Such functor types introduce no performance hit: the constructors and destructors are empty, so they generate no code, and the comparison function is inlined. It’s as fast as it can be.

By default, the priority queue uses std::less, and returns the largest element. To make the priority queue return the smallest element, we can pass it a functor of type std::greater:

#include <iostream>
#include <queue>
#include <vector>

using namespace std;

int
main(void)
{
  // HERE'S THE DIFFERENCE!
  priority_queue<int, vector<int>, greater<int>> q;

  // I was hoping the third template argument would be deduced from
  // the constructor argument, but, alas, no.
  // priority_queue<int> q(std::greater<int>{});
  
  q.push(2);
  q.push(1);
  q.push(3);

  while(!q.empty())
    {
      cout << q.top() << endl;
      q.pop();
    }
  
  return 0;
}

Here we define a functor type, which acts like a function:

#include <iostream>
#include <queue>
#include <vector>

using namespace std;

struct CMP
{
  bool
  operator()(const int &a, const int &b)
  {
    return a > b;
  }
};

int
main(void)
{
  // An object of CMP will be default-constructed by q.
  priority_queue<int, vector<int>, CMP> q;

  q.push(2);
  q.push(1);
  q.push(3);

  while(!q.empty())
    {
      cout << q.top() << endl;
      q.pop();
    }
  
  return 0;
}

In the example below we pass to a functor a boolean value that is used by the call operator. This value is passed at run-time. This functionality couldn’t be achieved with a function.

#include <functional>
#include <iostream>
#include <queue>

using namespace std;

struct CMP
{
  bool m_order;

  CMP(bool order): m_order(order)
  {
  }

  bool
  operator()(const int &a, const int &b) const
  {
    return m_order ? a < b : a > b;
  }
};

int
main(void)
{
  bool order;

  cout << "Enter 0 or 1:";
  cin >> order;

  priority_queue<int, vector<int>, CMP> q(CMP{order});
  // The same as above.
  // priority_queue<int, vector<int>, CMP> q(order);

  q.push(2);
  q.push(1);
  q.push(3);

  while(!q.empty())
    {
      cout << q.top() << endl;
      q.pop();
    }

  return 0;
}

Closure

A closure is a functor which is the result of a lambda expression. A lambda (in short for a lambda expression) is syntactic sugar for conveniently creating functors: they help us create functors with less fuss in comparison with creating a functor class. We could do away with lambda expressions, because functor classes have the same functionality. Lambdas are just handy.

Since a closure is an object, it must have some type, but we usually do not need it, and so we say that a closure is of an anonymous type. We can get the type of a closure with the decltype operator.

Syntax

Lambda expressions can be nuanced, and we’ll not cover all the nuances. However, most lambdas are of this syntax:

[capture list](parameter list) mutable {body}

The capture list and the parameter list are comma-separated. If the parameter list is empty, the () can be dropped. Even if the capture list and the body are empty, [] and {} cannot be dropped.

The capture list can have:

a declarator of the default capture policy: either = or &, but not both, e.g., [=], or [&], but not [&, =],
captured variable names that can, but do not have to be, preceded by declarator &, e.g., [&x],
variable initializations name-in-closure = variable-name that can, but do not have to be, preceded by &, e.g., [&x = y].

The parameter list is the list of function parameters, just like for a regular function.

The mutable specifier is optional. By default the member () operator function is defined constant, but we can make it non-const with the mutable specifier.

Therefore the simplest lambda is []{}. Here we create a closure and call it in one go (in one expression):

int
main()
{
  // These two are equivalent.
  [](){};
  []{};

  // Here we also call the closure: notice the trailing ().
  []{}();
}

Expression []{}(), which creates and calls a closure, is equivalent to this code:

int
main()
{
  // The block scope is here, so that object x behaves like a
  // temporary object.  Object x is of anonymous type.
  {
    struct
    {
      void
      operator()() const
      {
      }
    } x;
    x();
  }
}

Semantics

A lambda creates a functor type (a structure or a class), and an object of this type. These are the basic facts:

The captured variables are stored as fields in the functor, and are initialized by the constructor.
The parameter list becomes the parameter list of the member () operator function.
The member () operator function is const unless mutable is specified.
The body becomes the body of the member () operator function. The return type of that function is deduced based on the expression of the return statement in the body. If there is no return statement, the return type is void.

The capture list describes how to capture variables from the scope of the lambda expression, so that they are available in the body. The scope is the fragment of code where variables are accessible, which can be the global scope, the class scope, the function scope, and the block scope.

The capture list can be empty. In that case only the parameters of the parameter list are available in the body. Example:

#include <iostream>

int
main()
{
  int x = 1;

  // Would not compile, because x is not captured, and the parameter
  // list is empty.
  // []{x = 10;}();

  [](int &x){x = 10;}(x);
  std::cout << "x = " << x << std::endl;
}

The code above is equivalent to this code:

#include <iostream>

int
main()
{
  int x = 1;

  {
    struct
    {
      void
      operator()(int &x) const
      {
        x = 10;
      }
    } f;
    f(x);
  }

  std::cout << "x = " << x << std::endl;
}

Ways of capturing variables

A variable can be captured by value or by reference. When a variable is captured by value, the closure keeps in its member field a copy of the value of the captured variable, i.e., the member field was initialized by copying the value of the captured variable. To capture a variable by value, put its name in the capture list.

When a variable is captured by reference, the closure keeps in its member field a reference to the captured variable, i.e., the member reference was initialized with the captured variable. To capture a variable by reference, put its name in the capture list, and preceeded it by declarator &.

For example:

#include <iostream>

int
main()
{
  int x = 1, y = 2;

  // Variable x is captured by value, y by reference.
  [x, &y]() mutable {x = 10, y = 20;}();

  std::cout << "x = " << x << ", y = " << y << std::endl;
}

The code above is equivalent to this code:

#include <iostream>

int
main()
{
  int x = 1, y = 2;

  {
    struct F
    {
      int x;
      int &y;

      F(int x, int &y): x(x), y(y)
      {
      }

      void
      operator()()
      {
        x = 10;
        y = 20;
      }
    };

    F f(x, y);
    f();
  }

  std::cout << "x = " << x << ", y = " << y << std::endl;
}

Default capture policy

The capture list can begin with the default policy of capturing variables either by value or by reference. If a default capture policy is given, all variables used in the body are captured, and we do not have to list them.

We set the default capture-by-value policy with =. For example:

#include <iostream>

int
main()
{
  int x = 1, y = 2;

  // Capture every variable by value.
  [=]() mutable {x = 10; y = 20;}();

  std::cout << "x = " << x << ", y = " << y << std::endl;
}

The code above is equivalent to this code:

#include <iostream>

int
main()
{
  int x = 1, y = 2;

  {
    struct F
    {
      int x;
      int y;

      F(int x, int y): x(x), y(y)
      {
      }

      void
      operator()()
      {
        x = 10;
        y = 20;
      }
    };

    F f(x, y);
    f();
  }

  std::cout << "x = " << x << ", y = " << y << std::endl;
}

We set the default capture-by-reference policy with &. Please note that in the example below, and the next example too, the call operator can be const, because we are not modifying a member reference, but a variable to which the member reference is bound. Here’s an example:

#include <iostream>

int
main()
{
  int x = 1, y = 2;

  // Capture every variable by reference.
  [&]{x = 10; y = 20;}();

  std::cout << "x = " << x << ", y = " << y << std::endl;
}

The code above is equivalent to this code:

#include <iostream>

int
main()
{
  int x = 1, y = 2;

  {
    struct F
    {
      int &x;
      int &y;

      F(int &x, int &y): x(x), y(y)
      {
      }

      void
      operator()() const
      {
        x = 10;
        y = 20;
      }
    };

    F f(x, y);
    f();
  }

  std::cout << "x = " << x << ", y = " << y << std::endl;
}

We can specify the default capture policy, and then list those variables that should be captured the other way. Also, for the member fields, we do not have to use the names of the captured variables, but we can give any name.

#include <iostream>

using namespace std;

int
main()
{
  {
    int x = 1, y = 2;

    // Because of the default capture-by-value policy, x is captured
    // by value, while y is captured by reference.
    [=, &y]() mutable {x = 10, y = 20;}();
    cout << "x = " << x << ", y = " << y << endl;
  }

  {
    int x = 1, y = 2;

    // Because of the default capture-by-reference policy, x is
    // captured by reference, while y is captured by value.
    [&, y]() mutable {x = 10, y = 20;}();
    cout << "x = " << x << ", y = " << y << endl;
  }

  {
    int x = 1, y = 2;

    // We name the captured variables differently: a, b.
    [a = x, &b = y]() mutable {a = 10, b = 20;}();
    cout << "x = " << x << ", y = " << y << endl;
  }
}

Closure Examples

Since a closure has some type, but we just don’t care about it, we write:

auto c = lambda expression;

By using the auto type, we let the compiler deduce (this process is called type deduction) the type of c based on the type of the closure expression. Even though there is the = sign, the line above is not an assignment expression, but an initialization statement. That entails that the closure is not copied, but created directly in the memory location of c with copy elision.

Here’s an example of using a lambda with a priority queue:

#include <functional>
#include <iostream>
#include <queue>
#include <vector>

using namespace std;

int
main(void)
{
  auto f = [](const int &a, const int &b){return a > b;};

  priority_queue<int, vector<int>, decltype(f)> q(f);

  q.push(2);
  q.push(1);
  q.push(3);

  while(!q.empty())
    {
      cout << q.top() << endl;
      q.pop();
    }
  
  return 0;
}

Here we pass an argument to a closure:

#include <functional>
#include <iostream>
#include <queue>
#include <vector>

using namespace std;

int
main(void)
{
  bool order;

  cout << "Enter 0 or 1: ";
  cin >> order;
  
  auto c = [order](const int &a, const int &b)
           {return order ? a < b : a > b;};

  priority_queue<int, vector<int>, decltype(c)> q(c);

  q.push(2);
  q.push(1);
  q.push(3);

  while(!q.empty())
    {
      cout << q.top() << endl;
      q.pop();
    }
  
  return 0;
}

Conclusion

A callable is a generalization of a function. A function and a functor are callables.
Unlike a function, a functor can store data.
Lambdas are nifty and succinct: we can create closures with little writing, and with less room for mistakes (lambdas are less error-prone).

Quiz

In what way can we pass a callable?
What’s the difference between a functor and a closure?
Are lambdas indispensable?

Acknowledgement

The project financed under the program of the Minister of Science and Higher Education under the name “Regional Initiative of Excellence” in the years 2019 - 2022 project number 020/RID/2018/19 the amount of financing 12,000,000 PLN.