C++23 is feature complete and on track to be released next year. While many people are complaining that it's, after all, a "minor" release (as the pandemic made the Committee work very difficult), C++23 still has a few very significant changes.
In this blog post, I want to talk about what I think is my favorite feature of C++23: the really keyword. This is a brand new keyword that can be used in a number of different scenarios. Let's explore them together to understand its usefulness.
Consider a function like:
void f(int x);
Unfortunately, we all know too well that C++ allows you to call it with non-integer arguments:
f(1234); // OK, intended: called with an int
f(3.14); // Not really intended (called with double), still legal
If you're lucky, your compiler may give you a warning. Otherwise, the calls with non-int arguments (double, long, ...) will be valid, and this may cause unexpected data loss and/or undefined behavior. Could it be possible to have a compile-time error instead?
C++11 and later gave us different strategies:
// C++11:
void f(int x);
template <typename T> void f(T) = delete;
f(42); // OK
f(3.14); // ERROR
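If you don't want to declare your functions twice, you can also use SFINAE (available since C++11) or, in C++20, concepts:
// SFINAE, C++17 spelling (enable_if_t is C++14, is_same_v is C++17);
// C++11 needs typename std::enable_if<std::is_same<T, int>::value>::type
template <typename T>
std::enable_if_t<std::is_same_v<T, int>> f(T x);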
// C++20: concepts
void f(std::same_as<int> auto x);
While the C++20 version is a huge readability improvement over the SFINAE version, it's still a mouthful. Also, this makes the function a function template, so now it can't be out-of-line, it impacts our compilation times, etc.
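To see the deleted-overload strategy at work without triggering a hard error, one option is a small detection trait. This is only a sketch in C++17; can_call_f is a hypothetical helper introduced for illustration, and f is given an empty body so the snippet stands alone.

```cpp
#include <type_traits>
#include <utility>

// The C++11 strategy: accept int, delete every other argument type.
void f(int) {}
template <typename T> void f(T) = delete;

// Hypothetical detection trait: true when f(arg) is well-formed for an
// argument of type T. Selecting a deleted overload is a substitution failure.
template <typename T, typename = void>
struct can_call_f : std::false_type {};

template <typename T>
struct can_call_f<T, std::void_t<decltype(f(std::declval<T>()))>>
    : std::true_type {};

static_assert(can_call_f<int>::value, "int is accepted");
static_assert(!can_call_f<double>::value, "double picks the deleted overload");
static_assert(!can_call_f<long>::value, "long picks the deleted overload");
```

The deleted template wins overload resolution for every non-int argument because it is an exact match, while f(int) would need a conversion.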
In C++23 this will be possible by adding really:
void f(really int x);
f(42); // OK
f(3.14); // ERROR
Simple and effective! You can also apply it to ordinary variables or to return types:
really int i = getInt(); // OK
really int j = getDouble(); // ERROR
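On a compiler that does not accept really, a rough approximation is a wrapper type that converts implicitly from exactly one type. The sketch below assumes C++17; exactly<T> and twice are hypothetical names, not standard facilities.

```cpp
#include <type_traits>

// Hypothetical wrapper: implicitly constructible from T and from nothing else.
template <typename T>
class exactly {
    T value_;

public:
    // U is deduced from the argument before any conversion is considered,
    // so a double argument yields U = double and this constructor drops out.
    template <typename U, std::enable_if_t<std::is_same_v<U, T>, int> = 0>
    constexpr exactly(U v) : value_(v) {}

    constexpr operator T() const { return value_; }
};

// An ordinary, non-template function: it can still be defined out of line.
constexpr int twice(exactly<int> x) { return 2 * static_cast<int>(x); }

static_assert(twice(21) == 42, "int arguments are accepted");
static_assert(std::is_convertible_v<int, exactly<int>>, "int converts");
static_assert(!std::is_convertible_v<double, exactly<int>>, "double does not");
static_assert(!std::is_convertible_v<long, exactly<int>>, "long does not");
```

Unlike the SFINAE and concepts versions, twice stays a plain function, at the price of one extra type in the signature.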
Really Const
Consider this code:
class String {
const char *data;
int len;
public:
void a(int) const;
void b() const;
void c();
};
void String::c()
{
a(len * 42);
a(len * 42);
b();
}
It may look a bit silly, but stay with me. The disassembly of c() (with optimizations enabled) is:
String::c():
push rbp
imul esi, DWORD PTR [rdi+8], 42
mov rbp, rdi
call String::a(int) const
imul esi, DWORD PTR [rbp+8], 42
mov rdi, rbp
call String::a(int) const
mov rdi, rbp
pop rbp
jmp String::b() const
The compiler is doing something quite peculiar here: it's multiplying len by 42, calling a() with the result of the calculation, then it multiplies len again in order to call a() a second time. How come it's doing the same (relatively) expensive operation twice, rather than simply storing the result and using it for both calls? a() is a const member function, after all, so it can't change len -- or can it?
Well, the short answer is that the compiler is not allowed to do otherwise: the value of len could indeed have been changed by the call to a(). Maybe a() ends up calling something else, and that something else changes *this. Or maybe a() is just doing a const_cast<String *>(this) and then mutating the result. Either way, the outcome is the same: the compiler is forced to reload the data members after a member function call, as it can't assume that they haven't been changed. The fact that the target member function is const does not help the compiler in any way.
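Here is a minimal sketch of the second scenario, a const member function that mutates *this through const_cast. The String below is a cut-down stand-in for the class above; last_arg and second_call_argument are hypothetical additions that only exist to make the effect observable.

```cpp
#include <cassert>

struct String {
    int len = 1;
    int last_arg = 0;

    // A const member function that still mutates *this. This is well-defined
    // only because the object it is called on below is not declared const.
    void a(int x) const {
        String *self = const_cast<String *>(this);
        self->last_arg = x;
        self->len += 1;
    }

    void c() {
        a(len * 42);  // len is 1: calls a(42), after which len is 2
        a(len * 42);  // len has to be reloaded: calls a(84), then len is 3
    }
};

// Runs c() on a non-const object and reports the argument of the second call.
int second_call_argument() {
    String s;
    s.c();
    assert(s.len == 3);
    return s.last_arg;
}
```

If the compiler reused the first product for both calls, second_call_argument() would return 42 instead of 84, which is why it has to reload len.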
However, if we declare a() as really const:
class String {
const char *data;
int len;
public:
void a(int) really const;
void b() const;
void c();
};
Then we get this improved codegen:
String::c():
push r12
push rbp
mov rbp, rdi
sub rsp, 8
imul r12d, DWORD PTR [rdi+8], 42
mov esi, r12d
call String::a(int) const
mov rdi, rbp
mov esi, r12d
call String::a(int) const
add rsp, 8
mov rdi, rbp
pop rbp
pop r12
jmp String::b() const
It may look longer, but we have a couple of extra mov instructions in lieu of an imul. The multiplication result is stored and reused for the second call. And that's a huge win!
What really const does is make the compiler treat a() as if it operated on an object declared const. Such objects cannot be mutated, not even by stripping their constness away (that's undefined behavior). This enables better codegen: now the compiler knows that calling a() cannot possibly mutate *this.
Suppose you have an enumeration like this:
enum class MyEnum {
E1, E2, E3
};
Now, suppose you have a function that takes a value of this enumeration. Correctly, you use a switch over the value, in order to handle all the possible cases:
int calculate(MyEnum e)
{
switch (e) {
case MyEnum::E1: return 42;
case MyEnum::E2: return 51;
case MyEnum::E3: return 123;
}
}
Now, for some reason, your compiler will complain about calculate. Written exactly like this, it produces a warning that control may reach the end of a non-void function without hitting a return statement.
How in the world is that possible? There's clearly a switch in there that is covering all the possible enumerators!
Defeated, you'll change your function to something like this:
int calculate(MyEnum e)
{
switch (e) {
case MyEnum::E1: return 42;
case MyEnum::E2: return 51;
case MyEnum::E3: return 123;
// for the love of kittens, DO NOT add a default: !!!
}
assert(false);
return -1; // impossible
}
(Yes, do NOT add a default: to the switch -- a default label will prevent the compiler from warning you that the switch is no longer covering all the enumerators, should you decide some day to extend the enumeration.)
So how is "the impossible" actually possible? Is the compiler wrong?
No, the compiler is right. There is the possibility of passing a value which isn't handled by the switch; and that's because C++ allows casting integers to enumerations, as long as the integer "fits" in the enumeration's range. See the wording in [expr.static.cast].
That means that this is 100% legal C++:
void bad() {
MyEnum nonsense = static_cast<MyEnum>(-1);
int result = calculate(nonsense); // whoops!
}
But just how often is this feature really useful? While, in general, allowing conversions from integer to enumerations makes sense, people are not really supposed to cast arbitrary integers into an enumeration's type.
Enter really enum, or, of course, enum really as it needs to be spelled:
enum class really MyEnum {
E1, E2, E3
};
int calculate(MyEnum e)
{
switch (e) {
case MyEnum::E1: return 42;
case MyEnum::E2: return 51;
case MyEnum::E3: return 123;
}
// no warning, all cases are handled
}
void bad() {
MyEnum nonsense = static_cast<MyEnum>(-1); // UB, -1 is not an enumerator
}
Basically, a really enum is just like an ordinary enum (or enum class, like in this case), except that converting integers to it requires the integer to match the value of one of the enumerators -- otherwise, undefined behavior. The UB is not really worse than what we had before (walking out of a function without returning, or crashing due to the failed assert). On top of that, thanks to things like ubsan, we can catch the problem (= the illegal conversion) at its source rather than later, when it causes havoc.
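In code that has to accept integers from the outside, the "must match one of the enumerators" rule can be enforced by hand with a checked conversion. This is a minimal C++17 sketch; to_my_enum is a hypothetical helper, not something the language provides.

```cpp
#include <optional>

enum class MyEnum { E1, E2, E3 };

// Hypothetical checked conversion: yields the enumerator whose value equals v,
// or an empty optional when v matches none of them.
constexpr std::optional<MyEnum> to_my_enum(int v) {
    // The switch is over an int, so a default label does not hide any
    // enumerator-coverage warning here.
    switch (v) {
    case static_cast<int>(MyEnum::E1): return MyEnum::E1;
    case static_cast<int>(MyEnum::E2): return MyEnum::E2;
    case static_cast<int>(MyEnum::E3): return MyEnum::E3;
    default: return std::nullopt;
    }
}

static_assert(to_my_enum(0) == MyEnum::E1);
static_assert(to_my_enum(2) == MyEnum::E3);
static_assert(!to_my_enum(-1).has_value());
static_assert(!to_my_enum(3).has_value());
```

The trade-off is that this helper must be extended by hand whenever the enumeration grows, since nothing warns about a missing case in a switch over int.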
Let's face it: templates are hard. And deduction rules for function templates, combined with value categories, are even harder!
But let's start with a simple example. Any proficient C++ developer should know that it's possible to overload a function f for lvalues and rvalues:
void f(const std::string &s); // 1) lvalue reference
void f(std::string &&s); // 2) rvalue reference
std::string str = "hello";
f(str); // calls 1
f("hello"s); // calls 2
f(std::move(str)); // calls 2
Now suppose you want to make your f generic, so it can work on std::string but also on other string types. That, of course, means turning f into a function template.
Easy enough, right? Well, raise your hand if you ever fell into this trap:
template <typename T> void f(const T &obj); // 1) lvalue reference
template <typename T> void f(T &&obj); // 2) rvalue reference...?
The double ampersand still means rvalue reference, doesn't it? Turns out that, well, yes, but actually no. Since T is deduced, it doesn't; it now means forwarding (or universal) reference. This code:
std::string s = "hello";
f(s); // call f on a lvalue
is now actually calling overload n. 2 (!!!), after deducing T = std::string &.
This is incredibly annoying and error-prone: the very same syntax, without deduced template arguments, works as expected. Reusing the same syntax with totally different meanings is a major pain point (just think about the teachability of such a feature...).
In order to force the second overload only to take rvalue references, we have to again deploy SFINAE or similar techniques:
template <typename T> void f(const T &obj); // lvalue reference
template <typename T>
std::enable_if_t<!std::is_reference_v<T>>
f(T &&obj); // T must not be a reference => T&& is a rvalue reference
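Both the trap and the SFINAE fix can be checked at compile time. In this sketch, each overload returns a number identifying itself; the trap and fixed namespaces and the helper functions are hypothetical, and int is used instead of std::string only so that everything can be evaluated in a static_assert under C++17.

```cpp
#include <type_traits>
#include <utility>

namespace trap {
template <typename T> constexpr int f(const T &) { return 1; }  // lvalue reference
template <typename T> constexpr int f(T &&) { return 2; }       // forwarding reference
}  // namespace trap

namespace fixed {
template <typename T> constexpr int f(const T &) { return 1; }  // lvalue reference
template <typename T>
constexpr std::enable_if_t<!std::is_reference_v<T>, int>
f(T &&) { return 2; }                                           // rvalues only
}  // namespace fixed

// Each helper passes a non-const object and reports which overload was chosen.
constexpr int trap_with_lvalue() { int v = 0; return trap::f(v); }
constexpr int fixed_with_lvalue() { int v = 0; return fixed::f(v); }
constexpr int fixed_with_rvalue() { int v = 0; return fixed::f(std::move(v)); }

static_assert(trap_with_lvalue() == 2, "T deduced as int&: overload 2 wins");
static_assert(fixed_with_lvalue() == 1, "overload 2 is removed for lvalues");
static_assert(fixed_with_rvalue() == 2, "rvalues still reach overload 2");
```

Note that the trap only springs for non-const lvalues: for a const lvalue both overloads take the same parameter type, and partial ordering then prefers the const T & version.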
I mean... really?! (pun intended) I bet that if we had a time machine, someone would surely propose the usage of different syntax (a triple ampersand?) to specify forwarding references and remove the entire problem.
Luckily for us, in C++23, we can do this:
template <typename T> void f(const T really &obj); // 1) lvalue reference
template <typename T> void f(T really &&obj); // 2) rvalue reference! not forwarding reference
This brings back the "traditional" rules without any of the verbose SFINAE syntax. Technically speaking, it's not needed on 1), but I like the symmetry.
Really Operator Auto
A problem that sometimes affects Qt users is the following:
QString getString();
void print(QString);
void f() {
auto result = QString("The result is: ") + getString();
print(result);
}
What's the type of result? You may think it's a QString -- after all, it's being obtained by concatenating (via operator+) two QString objects.
That's actually correct in some cases, but in (many) others, it's not accurate. Generally speaking, result is actually an object of type QStringBuilder.
What's QStringBuilder, you ask? It's a private class in Qt that is employed to optimize string concatenations. A concatenation like this:
QString a, b, c, d;
QString result = a + b + c + d;
may cause up to 3 memory allocations, if done naively: first, calculate a + b (allocate); then, append c to the result (allocate again); then, append d to the final result (allocate again).
QStringBuilder instead considers the entire "sequence" of operations. When QStringBuilder is in use, a + b does not return a QString directly. It returns a QStringBuilder that remembers a and b (it simply stores references to them internally). Note that no allocations (for the result) or copies are taking place yet.
Then, appending c to the intermediate QStringBuilder yields another QStringBuilder that now remembers c, too. Finally, d is appended, yielding one final QStringBuilder.
This final object is then converted to a QString. That's where the magic kicks in: QStringBuilder can now perform one memory allocation of the right size (by asking all the other strings about their sizes, as it's keeping references to all of them), copy their contents in order to perform the concatenation, and produce result.
So what's wrong in the first example? The answer is the usage of auto:
auto result = QString("The result is: ") + getString(); // whoops!
Here, a QStringBuilder is produced in order to concatenate the two strings. But it's never converted to a QString. Instead, result is an object of type QStringBuilder itself!
That is a problem because, as mentioned before, QStringBuilder merely references the strings that it is supposed to concatenate. These strings are temporaries that will get destroyed at the end of the statement. That means that we now have a QStringBuilder loaded with dangling references!
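The mechanism can be reproduced without Qt. The sketch below uses two hypothetical stand-ins, Str and Builder, which only mimic the idea described above and are not Qt's actual classes; greet is likewise made up for the example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a string class.
struct Str {
    std::string text;
};

// Hypothetical stand-in for the builder: it only references its operands.
struct Builder {
    const Str &lhs;
    const Str &rhs;

    // The concatenation happens here, with a single reservation of the
    // final size.
    operator Str() const {
        Str out;
        out.text.reserve(lhs.text.size() + rhs.text.size());
        out.text += lhs.text;
        out.text += rhs.text;
        return out;
    }
};

inline Builder operator+(const Str &a, const Str &b) { return Builder{a, b}; }

// What auto would deduce for a + b: the builder, not the string.
static_assert(std::is_same_v<decltype(std::declval<Str>() + std::declval<Str>()),
                             Builder>);

// Safe: naming the target type forces the conversion while both temporaries
// are still alive, that is, before the end of the full-expression.
inline std::string greet(const std::string &name) {
    Str result = Str{"The result is: "} + Str{name};
    return result.text;
}
```

Writing auto result = Str{...} + Str{name}; instead would keep a Builder whose two references dangle as soon as the statement ends.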
Again, with a time machine, the solution could be to allow something like this:
class QStringBuilder
{
// invoked when assigning the result to `auto` variable...?
operator auto() const { return QString(*this); }
};
Although operator auto() exists, it's not what the comment above says. operator auto is nothing but an "ordinary" conversion operator, combined with auto deduction for the return type of a function. In other words, the above is just declaring an operator QString() const -- but we already have that one inside QStringBuilder!
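A small sketch of what operator auto means in today's C++ (it is valid since C++14; the type traits below need C++17). Answer and the variable names are hypothetical, chosen only for this example.

```cpp
#include <type_traits>

struct Answer {
    // The return type is deduced from the return statement, so this
    // declares operator int() const.
    constexpr operator auto() const { return 42; }
};

constexpr Answer answer{};
constexpr int as_int = answer;          // goes through the conversion operator
constexpr auto still_answer = answer;   // no conversion: a plain copy

static_assert(as_int == 42);
static_assert(std::is_convertible_v<Answer, int>);
static_assert(std::is_same_v<decltype(still_answer), const Answer>);
```

Initializing an auto variable never invokes the conversion operator, which is exactly why it cannot help with the QStringBuilder case.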
Instead, in C++23 we can do this:
class QStringBuilder
{
// invoked when assigning the result to `auto` variable.
really operator auto() const { return QString(*this); }
};
Note that the position of really is important. Otherwise, it applies to the conversion target type, just like I've shown before:
class C
{
operator really int() const;
};
C c;
int i = c; // OK
double d = c; // ERROR
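A conversion that only works towards exactly one type can be approximated today with a constrained conversion operator template. This is a C++17 sketch, not an equivalent of the syntax above; the returned value 7 is arbitrary.

```cpp
#include <type_traits>

class C {
public:
    // T is deduced from the requested target type. For any target other
    // than int the constraint fails and the operator does not exist.
    template <typename T, std::enable_if_t<std::is_same_v<T, int>, int> = 0>
    constexpr operator T() const { return 7; }
};

constexpr int i = C{};  // OK: T is deduced as int

static_assert(i == 7);
static_assert(std::is_convertible_v<C, int>);
static_assert(!std::is_convertible_v<C, double>);  // double d = c; would not compile
```

With a plain operator int() the conversion to double would be accepted, because the int result could then be converted to double implicitly.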
And that's it! I hope you liked this little presentation about the new keyword. I can't wait to start working on C++23 projects to try this one out (and also the other goodies, which you can check out on cppreference).
If you want to know more about really, feel free to check out the original proposal here.
See you next time!
About KDAB
The KDAB Group is a globally recognized provider for software consulting, development and training, specializing in embedded devices and complex cross-platform desktop applications. In addition to being leading experts in Qt, C++ and 3D technologies for over two decades, KDAB provides deep expertise across the stack, including Linux, Rust and modern UI frameworks. With 100+ employees from 20 countries and offices in Sweden, Germany, USA, France and UK, we serve clients around the world.
12 Comments
1 - Apr - 2022
RJ
This is really great news! Really hope it really happens.
1 - Apr - 2022
Bart
Happy April's Fools.
1 - Apr - 2022
Soroush Rabiei
In the last committee meeting, I remember someone suggested renaming 'really' to 'indeed'. I hope this will be approved before next April.
1 - Apr - 2022
RJ
You're missing the "honestly" proposal. Unlike really, honestly would also allow conversions between types that sound different but are honestly the same, in behavior and layout.
For example, on a MS Windows machine, a long int is honestly an int.
This would also allow conversions from char8_t, which is honestly an unsigned char.
Both "really" and "honestly" could coexist. And honestly, really and honestly would be nice to have.
1 - Apr - 2022
Yacob Cohen-Arazi
I hate this date! ;)
2 - Apr - 2022
nyanpasu64
Well that's already the case since MyEnum doesn't have a fixed underlying type (: int), as ubsan impolitely pointed out to me one day when casting a bit flag to an enum.
2 - Apr - 2022
Giuseppe D'Angelo
MyEnum does in fact have a fixed underlying type because it's a scoped enumeration. See https://eel.is/c++draft/dcl.enum#5:
9 - Apr - 2022
alagner
Didn't you mean std::enable_if_t<!std::is_lvalue_reference_v> instead of std::enable_if_t<!std::is_reference_v>?
10 - Apr - 2022
Giuseppe D'Angelo
Well spotted, and, pedantically, yes -- although usually the idea is to let the template type argument to be deduced (most things, like the std algorithms, don't want users to specify the template arguments).
8 - May - 2022
LorenDB
Now I want this to be a thing. A really keyword would be really (ha!) infinitely handy.
22 - Jun - 2022
Sebastian
Shouldn't syntax highlighting be improved so that 'really' is coloured as a keyword?
22 - Jun - 2022
Giuseppe D'Angelo
Absolutely. It already does, but it works only on April 1st, 2023...