
Introduction

Expression categories are fundamental, yet difficult to understand. They come down to the details of lvalues and rvalues, which we rarely think about in our daily programming.

To understand the meaning of lvalues and rvalues, it’s best to go through this text without searching for some deeper meaning at this time. Alice got similar advice from Humpty Dumpty in the novel “Through the Looking-Glass” by Lewis Carroll:

“Must a name mean something?” Alice asks Humpty Dumpty, only to get this answer: “When I use a word… it means just what I choose it to mean – neither more nor less.”

The value of an expression

An expression can be:

The value of an expression is the result of evaluating an expression.

The value of an expression has:

History: CPL, C, C++98

Two expression categories were introduced in the CPL language (about half a century ago): the lvalue and the rvalue.

CPL defined the lvalue and rvalue categories in relation to the assignment operator. These definitions are only of historical importance, and do not apply to C++.

In C, an expression is either an lvalue (for locator value; a locator is something that locates (points to) the value, e.g., the name of a variable) or a non-lvalue, i.e., an expression that is not an lvalue. There is no rvalue in C!

C++98 adopted lvalues from C, and named the expressions that are not lvalues rvalues.

Details

Category of an expression

In C++, the two most important categories of an expression are: the lvalue category and the rvalue category. In short, an lvalue is an expression of the lvalue category, and an rvalue is an expression of the rvalue category.

The expression category determines what we can do with the expression. Some operations we can do only with an lvalue (e.g., &x, i.e., taking the address of variable x), other operations only with an rvalue.

Example operations for expression <expr>:

The definitions of lvalues and rvalues

You can look in vain for a concise and correct definition of lvalues and rvalues in the C++ standard. The C++ standard, which has about 1500 pages, defines them partially in various places, as needed.

Furthermore, modern C++ introduced new expression categories: prvalue, glvalue, and xvalue. However, the most important categories are still lvalue and rvalue.

We need to learn the details of the lvalue and rvalue categories to understand and efficiently use modern C++. For instance, the following is a statement from http://cppreference.com, which is hard to understand without knowing the lvalue and rvalue details:

Even if the variable’s type is an rvalue reference, the expression consisting of its name is an lvalue expression.
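The statement above can be checked with a short sketch, assuming C++11. The function names (category, name_of_rvalue_reference, moved_rvalue_reference) are made up for this illustration. Overload resolution picks the int & overload for an lvalue argument and the int && overload for an rvalue argument, and std::move from the standard library turns the name back into an rvalue.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helpers: overload resolution picks one of them
// depending on the category of the argument expression.
std::string category(int &) { return "lvalue"; }
std::string category(int &&) { return "rvalue"; }

// Reports the category of the expression "r" (just the name).
std::string
name_of_rvalue_reference()
{
  int &&r = 1; // The type of "r" is "rvalue reference to int".
  int *p = &r; // OK: we can take the address of "r".
  assert(p != nullptr);
  return category(r);
}

// Reports the category of the expression "std::move(r)".
std::string
moved_rvalue_reference()
{
  int &&r = 1;
  return category(std::move(r));
}
```

The type of r is the same in both functions. Only the expression passed to category differs, and the category belongs to the expression, not to the variable.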

The lvalue category

It’s hard to find a succinct definition in the C++ standard of the lvalue category, because the meaning of the lvalue category is spread all over the standard. But the following is a good description of the lvalue category.

If &<expr> compiles, then <expr> is an lvalue. That is, if we can take the address of an expression, then this expression is an lvalue.

An expression that consists of just a variable name (e.g., x) is always an lvalue.

The examples of lvalues are:

The definition of an lvalue as anything that can go on the left of the assignment operator does not apply to C++. You can have an lvalue on the left of the assignment operator, and the code will not compile:

int main()
{
  const int i = 1;

  &i; // Expression "i" is an lvalue.
  // &2; // Expression "2" is an rvalue.

  // i = 2; // Error, even though "i" is an lvalue.
}

The rvalue category

An expression is an rvalue if it’s not an lvalue. We can’t take the address of an rvalue.

The examples of rvalues are:

The definition of the rvalue as something that should be on the right of the assignment operator does not apply to C++. You can have an rvalue on the left of the assignment operator, and the code will compile. For instance, A() is an rvalue (that creates a temporary object), and we can assign to it, because we defined the assignment operator in class A:

int main()
{
  struct A
  {
    void
    operator = (int i)
    {
    }
  };

  A() = 1;
  A().operator=(1);
}

From lvalue to rvalue

The C++ standard defines this standard conversion, which is applied without the programmer explicitly requesting it:

An lvalue of a non-function, non-array type T can be converted to an rvalue.

For instance, the + operator for an integer type (e.g., int) requires rvalues as its operands. In the following example the + operator expects rvalues, and so the lvalues x and y are converted to rvalues.

int main()
{
  int x = 1, y = 2;
  x + y;
}

For instance, the unary * operator (i.e., the dereference operator) requires a value of a memory address, which is an rvalue. However, we can use the dereference operator with an lvalue too, because that lvalue will be converted to an rvalue.

int main()
{
  // The dereference operator requires an rvalue.
  *static_cast<int *>(0); // Compiles: a null pointer value, an rvalue.

  int x = 1;
  int *p = &x;
  *p; // OK: "p" is an lvalue, but converted to an rvalue.
}

There is no standard or implicit conversion from an rvalue to an lvalue. For example, the address-of operator (i.e., the unary & operator, a.k.a. the take-the-address-of operator) expects an lvalue. The rvalue that you try to pass will not be converted to an lvalue.

int main()
{
  int *p = static_cast<int *>(0);
  // &static_cast<int *>(0); // Error: lvalue required.
  // &nullptr; // Error: lvalue required.
}

Example of the increment operator

The increment operator (i.e., the ++ operator) requires an lvalue as its operand. This requirement applies to both the prefix and the suffix versions of the operator.

int main()
{
  int x = 1;
  ++x; // The prefix version of the increment operator.
  x++; // The suffix version of the increment operator.
  // ++1; // Error: lvalue needed, no rvalue to lvalue conversion.
  // 1++; // Error: lvalue needed, no rvalue to lvalue conversion.
}

The expression of the increment operator is an lvalue for the prefix version (e.g., ++x), and an rvalue for the suffix version (e.g., x++).

Therefore ++++x compiles, and x++++ doesn’t.

int main()
{
  int x = 1;
  ++++x; // OK: ++x is an lvalue, and ++ wants an lvalue.
  // x++++; // Error: x++ is an rvalue, and ++ wants an lvalue.
}

As a side note, the prefix increment operator has right-to-left associativity, while the suffix increment operator has left-to-right associativity.

The same applies to the decrement operator.
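The categories of the increment and decrement expressions can also be checked at compile time. The sketch below assumes C++11, and relies on a property of decltype: for a parenthesized expression e, decltype((e)) is a type T & when e is an lvalue, and not an lvalue reference when e is an rvalue. The macro IS_LVALUE and the function increment_twice are hypothetical names, not standard facilities.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: true if the expression is an lvalue.
#define IS_LVALUE(...) \
  std::is_lvalue_reference<decltype((__VA_ARGS__))>::value

// Increments a local variable twice with ++++x, and returns the result.
int
increment_twice()
{
  int x = 1;

  // decltype does not evaluate its operand, so "x" is still 1 afterwards.
  static_assert(IS_LVALUE(++x), "++x is an lvalue");
  static_assert(!IS_LVALUE(x++), "x++ is an rvalue");
  static_assert(IS_LVALUE(--x), "--x is an lvalue");
  static_assert(!IS_LVALUE(x--), "x-- is an rvalue");

  ++++x; // OK: parsed as ++(++x).
  assert(x == 3);
  return x;
}
```

If any of the static assertions were wrong, the code would not compile, so the check costs nothing at run time.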

Temporary objects

A temporary object (or just a temporary) is an object that is created when an expression is evaluated. A temporary is automatically destroyed (i.e., you don’t need to explicitly destroy it) when it is not needed anymore.

A temporary is needed when:

A temporary is an object, not an expression, and so a temporary is neither an lvalue nor an rvalue, because an object has no category of expression. An object is used in an expression that is either an lvalue or an rvalue. Usually a temporary is created in rvalue expressions.
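When a temporary is created and destroyed can be observed with a minimal sketch. It assumes a hypothetical struct Noisy that records its constructor and destructor calls in a global string named events (both names are invented for this illustration). A temporary is normally destroyed at the end of the full expression that created it, and it lives longer when it is bound directly to a const reference.

```cpp
#include <cassert>
#include <string>

// Hypothetical log of constructor and destructor calls.
std::string events;

struct Noisy
{
  Noisy() { events += "ctor "; }
  ~Noisy() { events += "dtor "; }
};

// The temporary lives only within the statement that creates it.
std::string
temporary_lifetime()
{
  events.clear();
  Noisy(); // "Noisy()" is an rvalue that creates a temporary.
  assert(events == "ctor dtor "); // The temporary is already gone.
  events += "next ";
  return events;
}

// A const reference bound to the temporary extends its lifetime.
std::string
extended_lifetime()
{
  events.clear();
  {
    const Noisy &n = Noisy();
    (void)n; // Silence the unused-variable warning.
    events += "next ";
  } // The temporary is destroyed here, together with "n".
  return events;
}
```

The first function is expected to return "ctor dtor next ", and the second "ctor next dtor ".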

A temporary as a function argument

An expression with a temporary can be an argument of a function call, in which case that expression is an rvalue. If a function takes an argument by const reference (i.e., the parameter of the function is of a const reference type), the expression with that parameter name is an lvalue even though the reference is bound to an rvalue.

That example follows. The constructor outputs the address of the object, so that we can make sure it’s the same object in function foo.

#include <iostream>

struct A
{
  A()
  {
    std::cout << "ctor: " << this << std::endl;
  }
};

// "a" is a parameter of a const reference type.
void
foo(const A &a)
{
  // "a" is an lvalue.
  std::cout << "foo: " << &a << std::endl;
}

int main()
{
  // "A()" is an rvalue.
  foo(A());
}

A temporary as an exception

An expression with a temporary can be the operand of a throw expression, in which case that operand is an rvalue. If a catch block catches the exception by reference, the expression with that reference name is an lvalue even though the exception was created by an rvalue expression.

That example follows. The constructor outputs the address of the object, so that we can make sure it’s the same object in the catch block:

#include <iostream>

int main()
{
  struct A
  {
    A()
    {
      std::cout << "ctor: " << this << std::endl;
    }
  };

  try
    {
      // "A()" is an rvalue.
      throw A();
    }
  // Catch the exception by reference.  It's a non-const reference!
  catch (A &a)
    {
      // "a" is an lvalue.
      std::cout << "catch: " << &a << std::endl;
    }
}

We should catch an exception by reference: if we catch it by value, we’re going to copy that exception. Change the example so that an exception is caught by value, and you’ll see that we get a copy (you’ll see different addresses).
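The suggested experiment could look like the sketch below. Instead of printing addresses, it counts copies in a hypothetical exception type Counted with a static counter named copies (both names are made up here). Compilers usually elide the copy from the throw operand into the exception object, so the exact numbers can depend on the compiler and its options; the comparison between the two variants is what matters.

```cpp
#include <cassert>

// Hypothetical exception type that counts how often it is copied.
struct Counted
{
  static int copies;
  Counted() {}
  Counted(const Counted &) { ++copies; }
};

int Counted::copies = 0;

// Returns the number of copies made when catching by reference.
int
copies_when_caught_by_reference()
{
  Counted::copies = 0;
  try
    {
      throw Counted();
    }
  catch (Counted &c)
    {
      (void)c; // "c" refers to the exception object itself.
    }
  return Counted::copies;
}

// Returns the number of copies made when catching by value.
int
copies_when_caught_by_value()
{
  Counted::copies = 0;
  try
    {
      throw Counted();
    }
  catch (Counted c)
    {
      (void)c; // "c" is a copy of the exception object.
    }
  return Counted::copies;
}

// Checks that catching by value costs an extra copy.
void
compare()
{
  assert(copies_when_caught_by_value() > copies_when_caught_by_reference());
}
```

With the common compilers, catching by reference typically reports no copy at all, and catching by value reports one.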

Interestingly, and as a side note: in the example above, a non-const reference seems to be bound to an rvalue. C++98 states that only a const reference can bind to an rvalue, so I would expect catch (A &a) to fail to compile, and catch (const A &a) to be required. The explanation is that the reference in the handler is not bound to the operand of throw, but to the exception object, which is initialized from that operand and which the standard treats as an lvalue. Weird at first sight.

Interestingly, and as a side note, a statement block (i.e., {<statements>}) that holds a single statement can usually be replaced with that statement, e.g., {++i;} can be replaced with ++i;. However, the try and catch blocks always have to be blocks, and you cannot remove {} even if the block has a single statement. Weird.

Functions and categories of expressions

Function foo (e.g., void foo(<params>)) can be used in an expression in two ways:

This is an example of a function call that is an lvalue:

int &
loo()
{
  // It compiles even without the return statement!
  return *static_cast<int *>(0);
}

int main()
{
  &loo(); // OK: "loo()" is an lvalue.
  int &l = loo(); // OK: "loo()" is an lvalue.
}

This is an example of a function call that is an rvalue:

int
roo()
{
  return 0;
}

int main()
{
   // &roo(); // Error: "roo()" is an rvalue.
   // int &r = roo(); // Error: "roo()" is an rvalue.
}
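The two cases can also be compared in code that is safe to run. The sketch below assumes C++11, and uses hypothetical functions counter (returns a reference to a static object, so nothing is dereferenced through a null pointer) and answer (returns by value). For a function call, decltype yields int & when the call is an lvalue, and int when the call is an rvalue.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical function that returns a reference to a static object.
int &
counter()
{
  static int value = 0;
  return value;
}

// Hypothetical function that returns by value.
int
answer()
{
  return 42;
}

// The type of a call expression tells its category.
static_assert(std::is_same<decltype(counter()), int &>::value,
              "counter() is an lvalue");
static_assert(std::is_same<decltype(answer()), int>::value,
              "answer() is an rvalue");

// Assigns through the lvalue "counter()", and returns the stored value.
int
assign_through_call()
{
  counter() = 5;       // OK: "counter()" is an lvalue.
  int *p = &counter(); // OK: we can take its address.
  // answer() = 5;     // Error: "answer()" is an rvalue.
  assert(p == &counter());
  return *p;
}
```

The assignment through counter() changes the static object, so a later call to counter() sees the value 5.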

Incomplete types and categories of expressions

An incomplete type is the type that was either:

Expressions of an incomplete type can only be lvalues (and so rvalues can only be of complete types).

class B;

B &
boo()
{
  return *static_cast<B *>(0);
}

int main()
{
  &boo(); // OK: "boo()" is an lvalue.
  // B(); // Error: expression "B()" is an rvalue.
}

Conclusion

An expression has a category. A value of some type (e.g., of class A or type int) has no category.

What we can do with an expression depends on its category.

Every expression is either an lvalue or an rvalue.

We covered only the basics. There is more: glvalue, prvalue, xvalue.