Expression categories are fundamental, yet difficult to understand. It’s all about the details of lvalues and rvalues, which we rarely think about in our daily programming.
To understand the meaning of lvalues and rvalues, it’s best to go through this text without searching for some deeper meaning at this time. Alice got similar advice from Humpty Dumpty in the novel “Through the Looking-Glass” by Lewis Carroll:
“Must a name mean something?” Alice asks Humpty Dumpty, only to get this answer: “When I use a word… it means just what I choose it to mean – neither more nor less.”
An expression can be, for example: 3.14, x, x + y, foo(x).
The value of an expression is the result of evaluating that expression.
An expression has:
a type (e.g., int, bool, class A) known at compile time,
a value of the type (e.g., 5, false, A()) known at run time,
a category (e.g., lvalue, rvalue) known at compile time.
Two expression categories introduced in the CPL language (about half a century ago) were:
lvalue: “left of assignment” value, i.e., any expression that can go on the left of the assignment operator is an lvalue,
rvalue: “right of assignment” value, i.e., any expression that can go on the right of the assignment operator is an rvalue.
CPL defined the lvalue and rvalue categories in relation to the assignment operator. These definitions are only of historical importance, and do not apply to C++.
In C, expressions are either lvalues (for locator value; a locator is something that locates (points to) the value, e.g., the name of a variable) or non-lvalues. A non-lvalue is an expression that is not an lvalue. C does not define an rvalue category!
C++98 adopted lvalues from C, and named every expression that is not an lvalue an rvalue.
In C++, the two most important categories of an expression are: the lvalue category and the rvalue category. In short, an lvalue is an expression of the lvalue category, and an rvalue is an expression of the rvalue category.
The expression category determines what we can do with the expression.
Some operations we can do only with an lvalue (e.g., &x, i.e., taking the address of variable x), other operations only with an rvalue.
Example operations for expression <expr>:
<expr> = 1
<reference type> y = <expr>
&<expr>
*<expr>
++<expr>, <expr>++
You can look in vain for a concise and correct definition of lvalues and rvalues in the C++ standard. The C++ standard, which has about 1500 pages, defines them partially in various places, as needed.
Furthermore, in modern C++ new expression categories were introduced: prvalue, glvalue, and xvalue. However, the most important categories are still lvalue and rvalue.
We need to learn the details of the lvalue and rvalue categories to understand and efficiently use modern C++. For instance, the following is a statement from http://cppreference.com, which is hard to understand without knowing the lvalue and rvalue details:
Even if the variable’s type is an rvalue reference, the expression consisting of its name is an lvalue expression.
It’s hard to find a succinct definition in the C++ standard of the lvalue category, because the meaning of the lvalue category is spread all over the standard. But the following is a good description of the lvalue category.
If &<expr> compiles (with the built-in & operator), then <expr> is an lvalue. That is, if we can take the address of an expression, then this expression is an lvalue.
An expression with a variable name (e.g., x) is always an lvalue.
The examples of lvalues are:
x
foo
"Hello World!"
++i
The definition of an lvalue as anything that can go on the left of the assignment operator does not apply to C++. You can have an lvalue on the left of the assignment operator, and the code will not compile:
int main()
{
const int i = 1;
&i; // Expression "i" is an lvalue.
// &2; // Expression "2" is an rvalue.
// i = 2; // Error, even though "i" is an lvalue.
}
The assignment operator for the integral types expects a modifiable lvalue on the left, so we cannot write 1 = 1. Here is a more elaborate example:
struct A
{
int m_t[3];
int
operator[](unsigned i)
{
return m_t[i];
}
};
int main()
{
A a1;
// The built-in assignment operator for integers expects an lvalue
// on the left-hand side. However, the overloaded operator[]
// function returns a non-reference type, and so its call expression
// is an rvalue. That's why the following equivalent lines of code
// do not compile.
// a1[0] = 1;
// a1.operator[](0) = 1;
}
An expression is an rvalue, if it’s not an lvalue. We can’t take the address of an rvalue.
The examples of rvalues are:
1
std::string("Hello World!")
i++
foo(), if int foo();
The definition of the rvalue as something that should be on the right of the assignment operator does not apply to C++. You can have an rvalue on the left of the assignment operator, and the code will compile. For instance, A() is an rvalue (that creates a temporary object), and we can assign to it, because we defined the assignment operator in class A:
int main()
{
struct A
{
void
operator = (int i)
{
}
};
A() = 1;
A().operator=(1);
}
The C++ standard defines this standard conversion, which is applied without the programmer explicitly requesting it:
An lvalue of a non-function, non-array type T can be converted to an rvalue.
For instance, the + operator for an integer type (e.g., int) requires rvalues as its operands. In the following example the + operator expects rvalues, and so the lvalues x and y are converted to rvalues.
int main()
{
int x = 1, y = 2;
x + y;
}
For instance, the unary * operator (i.e., the dereference operator) requires the value of a memory address, which is an rvalue. However, we can use the dereference operator with an lvalue too, because that lvalue will be converted to an rvalue.
int main()
{
// The dereference operator requires an rvalue. The null pointer
// value static_cast<int *>(0) is an rvalue.
*static_cast<int *>(0);
int x = 1;
int *p = &x;
*p; // OK: "p" is an lvalue, but converted to an rvalue.
}
There is no standard or implicit conversion from an rvalue to an lvalue. For example, the address-of operator (i.e., the unary & operator) expects an lvalue. The rvalue that you try to pass will not be converted to an lvalue.
int main()
{
// static_cast<int *>(0) is a null pointer value, and nullptr is the
// null pointer literal. They both are rvalues.
// &static_cast<int *>(0); // Error: lvalue required.
// &nullptr; // Error: lvalue required.
}
The increment operator (i.e., the ++ operator) requires an lvalue as its operand. This requirement applies to both the prefix and the suffix versions of the operator. The same applies to the decrement operator.
int main()
{
int x = 1;
++x; // The prefix version of the increment operator.
x++; // The suffix version of the increment operator.
// ++1; // Error: lvalue needed, no rvalue to lvalue conversion.
// 1++; // Error: lvalue needed, no rvalue to lvalue conversion.
}
The expression of the increment operator for built-in types is:
an lvalue for the prefix version of the operator, i.e., ++<expr> is an lvalue, because the prefix increment operator returns a reference to the just-incremented object it got as an operand,
an rvalue for the suffix version of the operator, i.e., <expr>++ is an rvalue, because the suffix increment operator returns a temporary copy of the object it got as an operand.
Therefore ++++x compiles, and x++++ doesn’t.
int main()
{
int x = 1;
++++x; // OK: ++x is an lvalue, and ++ wants an lvalue.
// x++++; // Error: x++ is an rvalue, and ++ wants an lvalue.
}
As a side note:
the prefix version has lower precedence than the suffix version,
the prefix version has the right-to-left associativity,
the suffix version has the left-to-right associativity.
In the example below, we define the suffix increment operator for std::string. The loop with the prefix operator would be more complicated.
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
// We have to define the function as non-member, because we cannot
// modify type std::string.
string
operator++(string &s, int)
{
string tmp = s;
next_permutation(s.begin(), s.end());
return tmp;
}
int main()
{
cout << "Permutations for abc:" << endl;
for(string i = "abc"; i++ != "cba";)
cout << i << endl;
}
A temporary object (or just a temporary) is an object that is created when an expression is evaluated. A temporary is automatically destroyed (i.e., you don’t need to explicitly destroy it) when it is not needed anymore.
A temporary is needed when:
evaluating an operation: 1 + 2, string("T") + "4"
passing an argument to a function: foo(A())
returning an object from a function: string x = foo();
throwing an exception: throw A();
A temporary is an object, not an expression, and so a temporary is neither an lvalue nor an rvalue, because an object has no category of expression. An object is used in an expression that is either an lvalue or an rvalue. Usually a temporary is created in rvalue expressions.
An expression with a temporary can be an argument of a function call, in which case that expression is an rvalue. If a function takes an argument by reference (i.e., the parameter of the function is of a const reference type), the expression with that parameter name is an lvalue even though the reference is bound to an rvalue.
An example follows. The constructor outputs the address of the object, so that we can make sure it’s the same object in function foo.
#include <iostream>
struct A
{
A()
{
std::cout << "ctor: " << this << std::endl;
}
};
// "a" is a parameter of a const reference type.
void
foo(const A &a)
{
// "a" is an lvalue.
std::cout << "foo: " << &a << std::endl;
}
int main()
{
// "A()" is an rvalue.
foo(A());
}
An expression with a temporary can be the operand of a throw expression, in which case that operand is an rvalue. If a catch block catches the exception by a reference, the expression with that reference name is an lvalue even though the exception was thrown with an rvalue.
An example follows. The constructor outputs the address of the object, so that we can make sure it’s the same object in the catch block:
#include <iostream>
int main()
{
struct A
{
A()
{
std::cout << "ctor: " << this << std::endl;
}
};
try
{
// "A()" is an rvalue.
throw A();
}
// Catch the exception by reference. It's a non-const reference!
catch (A &a)
{
// "a" is an lvalue.
std::cout << "catch: " << &a << std::endl;
}
}
We should catch an exception by reference: if we catch it by value, we’re going to copy that exception. Change the example so that an exception is caught by value, and you’ll see that we get a copy (you’ll see different addresses).
Interestingly, and as a side note: in the example above, a non-const reference catches an exception that was thrown with an rvalue. C++98 states that only a const reference can bind to an rvalue, so we might expect catch(A &a) to fail to compile, and catch(const A &a) to be required. It compiles, because the reference binds not to the operand of throw, but to the exception object initialized from that operand, and the standard treats the exception object as an lvalue.
Interestingly, and as a side note, a statement block (i.e., {<statements>}) with a single statement can be replaced with that statement, e.g., {++i;} can be replaced with ++i;. However, the try and catch blocks always have to be blocks, and you cannot remove {} even if the block has a single statement. Weird.
A member function can be called for either an lvalue or an rvalue. However, a member function can be declared with a reference qualifier & or && (and therefore be ref-qualified), so that it can be called only for an lvalue (&) or only for an rvalue (&&). Example:
int main()
{
struct A
{
void
operator = (int i) &
{
}
void
operator = (int i) && = delete;
};
A a;
a = 1;
// Does not compile, because the && overload is declared as deleted.
// A() = 1;
}
Function foo (e.g., void foo(<params>)) can be used in an expression in two ways:
by name only:
  the expression: foo,
  that expression is an lvalue,
  we can take the address of that expression: &foo,
by a function call:
  the expression: foo(<args>),
  the category of that expression depends on the return type of the function called:
    if the return type is an lvalue reference type, then that expression is an lvalue,
    if the return type is not a reference type, then that expression is an rvalue.
This is an example of a function call that is an lvalue:
int &
loo()
{
// FYI: It compiles even if we remove the return statement below!
return *static_cast<int *>(0);
}
int main()
{
&loo(); // OK: "loo()" is an lvalue.
int &l = loo(); // OK: "loo()" is an lvalue.
}
This is an example of a function call that is an rvalue:
int
roo()
{
return 0;
}
int main()
{
// &roo(); // Error: "roo()" is an rvalue.
// int &r = roo(); // Error: "roo()" is an rvalue.
}
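An incomplete type is a type that was declared, but not defined. An abstract class is treated similarly here: it is a complete type, but no object of that type alone can be created.
Expressions of an incomplete type (or of an abstract class type) can be only lvalues (and so rvalues can be only of complete types or of type void).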
class B;
B &
boo()
{
return *static_cast<B *>(0);
}
int main()
{
&boo(); // OK: "boo()" is an lvalue.
// B(); // Error: expression "B()" is an rvalue.
}
An expression has a category. A value of some type (e.g., of class A or type int) has no category.
What we can do with an expression depends on its category.
Every expression is either an lvalue or an rvalue.
We covered only the basics; there is more: glvalue, prvalue, xvalue.
Can an expression be both an lvalue and an rvalue at the same time?
Is a temporary object an rvalue?
What does the category of the function-call expression depend on?
Why does int i; ++i++; not compile?
The project financed under the program of the Minister of Science and Higher Education under the name “Regional Initiative of Excellence” in the years 2019 - 2022 project number 020/RID/2018/19 the amount of financing 12,000,000 PLN.