Qt and Trivial Relocation (Part 4)

On trivial relocation and move assignments

23 May 2024

Changing the behavior of vector erasure

As we have already discussed in the previous blog posts, it is possible to implement erasure in a number of ways, which are not equivalent. The Standard Library chose a specific implementation strategy (move-assign the elements after the ones to be destroyed to the left; destroy the moved-from last elements). Changing it now, over 26 years after the fact, sounds extremely scary; std::vector is such a central class that surely such a change would break somebody's code.

That doesn't mean that we can't at least reason about a possible change there!

Also, there is another library that we keep talking about in these blog posts. This other library has a much smaller userbase than the Standard Library, and that has fewer regards with breaking backwards compatibility. We should certainly reason about that library as well. I'm talking about Qt, of course :)

Change the behavior of vector erasure what, exactly?

In order to reconcile our definition of (trivial) relocatability with the behavior of erase, we would need to implement erase in terms of move constructions and destructions.

Thankfully, this can be done. Suppose again we want to erase the second element from a vector:

Same problem: how to erase B from the vector?

We can implement this by destroying the second element, and move-constructing the third element over it:

First steps: B gets destroyed; C is move-constructed over it.

Then we destroy the moved-from third element, and move-construct the fourth element over it.

Destroy the moved-from C object, move-construct D over it

We keep going until the end, where we have just a moved-from element to destroy:

Destroy moved-from D. It's the last element, nothing else to move around. Update bookkeeping and we're done.

Erasure can use relocation

The fact that this particular implementation strategy is viable gives us some very interesting insights.

Let's look at the sequence of operations once again:

Destroy the element at position 1 (B)
Move construct the element at position 2 into position 1 (C)
Destroy the element at position 2 (moved-from C)
Move construct the element at position 3 into position 2 (D)
Destroy the element at position 3 (moved-from D)

We can look at this sequence in two different ways:

firstly, as pairs of "destroy + move construct" operations, leaving a last element to destroy (1+2, 3+4, 5)
or, more interestingly, as an initial destruction, followed by a series of relocations. (1, 2+3, 4+5).

Move assignments for value types

Let's analyze the first way: pairs of "destruction + move construction" operations. This analysis will allow us to grasp the fundamental difference between value types (like QString, or std::unique_ptr<T>) from reference types (like std::tuple<int &>).

We have already established that for a value type it does not matter what the erasure strategy is; the outcome is the same. For a write-through reference type, the strategy matters; using move assignments instead of destructions/move constructions pairs will yield a different result.

We can now appreciate the difference: for a value type, a move assignment is equivalent to destruction of the target, followed by move construction. This is why, when dealing with value types, using the above strategy for erasure gives us the same results as a strategy based on move assignments (like the one that std::vector uses).

For a write-through reference type, this is not true: a move assignment is not equivalent to destroying the target and move-constructing the source in its place. Again, consider std::tuple<int &>:

int a = 1;
int b = 2;

using T = std::tuple<int &>;
T ta(a);
T tb(b); 

#if USE_MOVE_ASSIGN
// move-assignment from tb
ta = std::move(tb);          // a = 2, b = 2
#else
// destroy ta + move-construct from tb
ta.~T();
new (&ta) T(std::move(tb));  // a = 1, b = 2
#endif

It is worth noticing the behavior of move assigments is completely orthogonal to whether a type is trivially relocatable or not. Both QString and std::tuple<int &>: are trivially relocatable (according to our prior definition), but QString has value semantics, std::tuple<int &>: has not. In other words, this is another property of a type which cannot be "detected" from the outside the type itself (just like trivial relocation cannot be detected).

Trivial relocation and erasure

Let's now tackle the second way of looking at erasure: destruction of an element, followed by a series of relocations.

This means that if a type is trivially relocatable, then the erasure strategy shown above can be implemented by an initial destruction followed by a memory copy for each subsequent element. Of course, these byte copies can be coalesced in a single memmove of the tail:

This is precisely what Qt does when erasing elements of a trivially relocatable type! At least now we have a justification for it.

So which erasure strategy does QVector use exactly?

If Qt uses trivial relocation to erase elements from a QVector of trivially relocatable types, does this mean that QVector also uses (non-trivial) relocation when erasing objects of other types?

Well... no, not really. Qt is way, way, way less strict than the C++ Standard Library: the exact strategies for erasure, insertion etc. are simply undocumented and can and will change at any time. (In fact, as I'm writing this blog post, I can see commits that change these algorithms...) The strategies are also inconsistent between different containers.

For instance, consider a non-empty QVector of size S. If I erase the first element, exactly how many operations are going to happen? 𝒪(S), right?

That would be true in Qt 5, but it's not true any more in Qt 6! In Qt 6 the operation is 𝒪(1): QVector in Qt 6 has a "prepend optimization": there may be empty capacity at both ends of the storage. So, in order to erase the first element, it's sufficient to destroy it and update bookkeeping (move the "begin" pointer forward). This is just one amongst many other similar examples where Qt containers have changed behavior between Qt releases; other changes do happen even between minor releases.

The corollary is of this lack of guarantees is quite strong: do not ever store types that don't have value semantics in Qt containers! Qt will break your programs when its containers change behavior. It's a when, not an if.

Thankfully, the majority of types we deal with in Qt applications have value semantics, so they are not affected.

Could we just ban non-value types from containers?

There is one last question left. Who in the world uses std::tuple<int &> as a vector's value type?

While that may sound very unlikely, in reality there are many other types for which assignment is different than destruction+move construction.

For instance: allocator-aware types (std::string, std::vector, etc.) may exhibit non-value behavior with certain allocators. An example of such an allocator is std::pmr::polymorphic_allocator, which does not propagate on move assignment. Erasing a std::pmr::string from a vector must have a very specific outcome, dictated by its reference semantics; but surely we can't outlaw vector of strings!

Even if we could decide to ban "some" types, how do we figure out their assignment semantics?. A Rule Of Five type may or may not have value semantics, and there is no way to know about that "from the outside".

So, this line of reasoning leads to nowhere.

Conclusions

In this post we started with a list of questions. In search for answers, we have have seen that:

for some types ("value" types), move assignment is equivalent to destruction of the target followed by move construction from the source, while for other types this is not true
we can't just ban the latter types from being stored in containers. They're useful, and in general we have no means to detect them!
vector erasure can be implemented in terms of relocations: this fact potentially allows a vector implementation (like QVector) to use trivial relocation when erasing elements
Qt containers assume value semantics from the objects they hold, and do not document the exact strategies used by their algorithms
we can't change std::vector erasure behavior, because it would break existing code.

In the next post we will finish our exploration of trivial relocability, and we'll look at C++ standardization. Thank you for reading!

Overview about all installments:

Tags:

c++qml qt

About KDAB

The KDAB Group is a globally recognized provider for software consulting, development and training, specializing in embedded devices and complex cross-platform desktop applications. In addition to being leading experts in Qt, C++ and 3D technologies for over two decades, KDAB provides deep expertise across the stack, including Linux, Rust and modern UI frameworks. With 100+ employees from 20 countries and offices in Sweden, Germany, USA, France and UK, we serve clients around the world.

Giuseppe D’Angelo

Senior Software Engineer

Senior Software Engineer at KDAB. Giuseppe is a long-time contributor to Qt, having used Qt and C++ since 2000, and is an Approver in the Qt Project. His contributions in Qt range from containers and regular expressions to GUI, Widgets, and OpenGL. A free software passionate and UNIX specialist, before joining KDAB, he organized conferences on opensource around Italy. He holds a BSc in Computer Science.