Sign up for the KDAB Newsletter
Stay on top of the latest news, publications, events and more.
Go to Sign-up
Giuseppe D’Angelo
21 May 2024
In the last post of this series we started exploring how to erase an element from the middle of a vector.
erase()
.For instance, a vector could move-assign over the elements to be erased:Alternatively, a vector could use rotations or some other algorithms to move the elements to be destroyed to the end:
In both cases we're then left with an element in the rightmost position that needs to be destroyed. We can destroy it, update the vector's size, and that's it!
memmove
. If the type is trivially relocatable, QVector destroys the element to be erased, and moves bytes from the tail over:We concluded the post wondering: is this optimization legitimate? Why can QVector replace move assignments with a byte copy?
This doubt stems from the fact that our definition of relocation did not cover move assignments; it only covered move construction. Is Qt wrong? Is our definition wrong?
Let's start by analyzing erase()
's behavior once more.
Do you remember our claim that the specific strategy used does not really matter; that is, that they are all equivalent? Well, not so fast! It is actually quite imprecise to say that they are all equivalent.
They may be, as long as we deal with types which have value semantics. If we instead use a type that has reference semantics, the choices are absolutely not equivalent, and will yield different outcomes. This is because the semantics of assignment for (certain) reference types are write-through: they assign through the reference (instead of rebinding the reference).
Since we are implementing erasure in terms of assignments (or swaps, which boil down to assignments), this means that the precise sequence of operations done by erase will be visible due to its side-effects; and it also means that changing the strategy will produce different outcomes!
An example of a Standard Library type with these write-through reference semantics is std::tuple<int &>
.
When one assigns such a tuple over another one, the reference inside the tuple will assign-through. This semantic is absolutely deliberate; it is used, for instance, by std::tie
to build a tuple of references that can be used to unpack a given tuple:
int a; double b;
std::tie(a, b) = getTuple(); // std::tie produces std::tuple<int &, double &>
// assignment will write through, over a and b
For the sake of the example, instead of dealing with std::tuple
, let's strip it down to the bare minimum into a struct IntRef
, so that we can focus on the behavior that we care about:
struct IntRef {
int &m_ref;
IntRef &operator=(const IntRef &other)
{
// write-through semantics for assignment:
m_ref = other.m_ref;
return *this;
}
};
Here's an example of this in action:
int a = 1;
int b = 2;
IntRef ra{a};
IntRef rb{b};
ra = rb;
std::println("{}, {}", a, b); // prints 2, 2; we've "assigned through" ra.
The assignment on line 6 wrote 2 into the variable a
.
Note that I had to explicitly write an assignment operator for IntRef
, and implement the wanted semantics (write-through) myself. This is necessary because of the presence of a reference as a non-static data member (m_ref
): the implicitly-generated assignment operator(s) are actually deleted in this case. This default by C++ is actually good, as reference members are a bad idea, and if you have them in a class you should not support assignment.
Let's now generalize this to a std::vector<IntRef>
:
int a = 1, b = 2, c = 3, d = 4, e = 5;
IntRef ra{a}, rb{b}, rc{c}, rd{d}, re{e};
std::vector<IntRef> v{ra, rb, rc, rd, re};
v.erase(v.begin() + 1);
std::println("{}, {}, {}, {}, {}", a, b, c, d, e);
What does this code print? Which is another way of asking: "what happens when we erase the second element from v
?".
Now this is a very tricky question to answer; depending on the implementation strategy chosen, we may end up with a completely different outcome:
1, 3, 4, 5, 5
.std::rotate
.It could be 1, 3, 4, 5, 2
; but it could be something else as well.Please take a moment to convince yourselves of this :-) Pencil and paper helps, otherwise, here's a convenient Godbolt.
The conclusion is that the implementation strategy does matter for types with write-through reference semantics! This is a very different outcome than what we might have expected (or desired, even).
For this reason, the Standard Library clearly documents the behavior of std::vector::erase
in [vector.modifiers]:
Complexity: The destructor of T is called the number of times equal to the number of the elements erased, but the assignment operator of T is called the number of times equal to the number of elements in the vector after the erased elements.
This pretty much imposes a specific implementation strategy (move assign the tail to the left), so you can rely on the outcome, even in the presence of types with write-through reference semantics.
I'll also add an extra consideration: even if this behavior wasn't fully documented, due to Hyrum's law, we still couldn't "just change" the implementation of erase
, because existing code will be affected -- it will be able to tell the difference.
What about QVector? Does it document its erasure behavior? Well... let's talk about this later, shall we?
IntRef
trivially relocatable?Now, here's a million dollar question: would you say that IntRef
is trivially relocatable or that it's not?
Maybe you're tempted to say "it is not" based on what you've just seen -- IntRef
has a "strange" behavior, different from most types that we "normally" use. That should be enough to disqualify it, shouldn't it?
Let's not rush to the answer just yet! For the benefit of the discussion, I am going to repeat our current definition of relocability: to move construct an object, and destroy the source object. Relocability is trivial when these two operations can be realized by simply copying bytes.
For example:
int
is trivially relocatable. std::unique_ptr<T>
is trivially relocatable. QString
is trivially relocatable.std::string
is usually not trivially relocatable (self-referential due to SSO). std::list
may also not be (the first node and/or a sentinel node of the list may contain a pointer to the std::list
object itself).QObject
isn't relocatable (it is not movable at all).So, what about our IntRef
?
If we stick to this very definition the correct answer is that IntRef
is absolutely, unequivocally trivially relocatable. We can replace a move construction of an IntRef
object, followed by the destruction of the source, with a mere memcpy; the result is completely equivalent.
IntRef satisfies our definition of trivial relocability.
If you think about it, IntRef
simply contains a reference -- that is, a pointer. It's not very different from the memory layout of a std::unique_ptr
from this point of view, and we know we can relocate that.
IntRef
does show its reference semantics on assignment, but remember, our current definition of (trivial) relocability does not talk about assignments at all.
As a corollary, this fact means that it would be perfectly safe to grow a vector<IntRef>
using memcpy
when it needs a reallocation!
If IntRef
is trivially relocatable, can we erase an IntRef
object from a vector, using memmove
to do so?
Again, based on what we have seen, the answer is no, we cannot. Erasure is fully specified to use assignments, which yield side effects for a type like IntRef
. If we just moved bytes around, those side effects would not be realized, breaking our programs.
We have reached a contradiction in our design:
IntRef
is trivially relocatable as it satisfies our definition.memmove
, butIntRef
cannot be erased by copying bytes! It needs to be erased by using move assignments.To solve this contradiction, something has to give in:
IntRef
should not be allowed to be stored in a vector;memmove
when erasing objects of trivially relocatable type (but it can still optimize the reallocation of a vector);We will explore these possibilities in the next blog post. Stay tuned!
Overview about all installments:
About KDAB
The KDAB Group is a globally recognized provider for software consulting, development and training, specializing in embedded devices and complex cross-platform desktop applications. In addition to being leading experts in Qt, C++ and 3D technologies for over two decades, KDAB provides deep expertise across the stack, including Linux, Rust and modern UI frameworks. With 100+ employees from 20 countries and offices in Sweden, Germany, USA, France and UK, we serve clients around the world.
Stay on top of the latest news, publications, events and more.
Go to Sign-up
Upgrade your applications from Qt 5 to Qt 6 with KDAB’s migration services. Get a free migration assessment and join a hands-on workshop to prepare your team for a successful transition!
Learn more
Learn Modern C++
Our hands-on Modern C++ training courses are designed to quickly familiarize newcomers with the language. They also update professional C++ developers on the latest changes in the language and standard library introduced in recent C++ editions.
Learn more