Qt and Trivial Relocation (Part 3)

Trivial relocability for vector erasure, and types with write-through reference semantics

21 May 2024

The reference semantics backstab

Let's start by analyzing erase()'s behavior once more.

Do you remember our claim that the specific strategy used does not really matter; that is, that they are all equivalent? Well, not so fast! It is actually quite imprecise to say that they are all equivalent.

They may be, as long as we deal with types which have value semantics. If we instead use a type that has reference semantics, the choices are absolutely not equivalent, and will yield different outcomes. This is because the semantics of assignment for (certain) reference types are write-through: they assign through the reference (instead of rebinding the reference).

Since we are implementing erasure in terms of assignments (or swaps, which boil down to assignments), this means that the precise sequence of operations done by erase will be visible due to its side-effects; and it also means that changing the strategy will produce different outcomes!

An example of a Standard Library type with these write-through reference semantics is std::tuple<int &>.

When one assigns such a tuple over another one, the reference inside the tuple will assign-through. This semantic is absolutely deliberate; it is used, for instance, by std::tie to build a tuple of references that can be used to unpack a given tuple:

int a; double b;

std::tie(a, b) = getTuple();  // std::tie produces std::tuple&lt;int &amp;, double &amp;&gt;
                              // assignment will write through, over a and b

For the sake of the example, instead of dealing with std::tuple, let's strip it down to the bare minimum into a struct IntRef, so that we can focus on the behavior that we care about:

struct IntRef {
  int &amp;m_ref;

  IntRef &amp;operator=(const IntRef &amp;other)
  {
    // write-through semantics for assignment:
    m_ref = other.m_ref;
    return *this;
  }
};

Here's an example of this in action:

int a = 1;
int b = 2;
IntRef ra{a};
IntRef rb{b};

ra = rb;
std::println("{}, {}", a, b); // prints 2, 2; we've "assigned through" ra.

The assignment on line 6 wrote 2 into the variable a.

Note that I had to explicitly write an assignment operator for IntRef, and implement the wanted semantics (write-through) myself. This is necessary because of the presence of a reference as a non-static data member (m_ref): the implicitly-generated assignment operator(s) are actually deleted in this case. This default by C++ is actually good, as reference members are a bad idea, and if you have them in a class you should not support assignment.

Let's now generalize this to a std::vector<IntRef>:

int a = 1, b = 2, c = 3, d = 4, e = 5;
IntRef ra{a}, rb{b}, rc{c}, rd{d}, re{e};

std::vector&lt;IntRef&gt; v{ra, rb, rc, rd, re};

v.erase(v.begin() + 1);

std::println("{}, {}, {}, {}, {}", a, b, c, d, e);

What does this code print? Which is another way of asking: "what happens when we erase the second element from v?".

Now this is a very tricky question to answer; depending on the implementation strategy chosen, we may end up with a completely different outcome:

If we move assign the "tail" to the left one element at a time, and erase the last element, then the output is going to be 1, 3, 4, 5, 5.
If we rotate the second element to the end, and destroy the end, then the output is... well, we don't really know, do we? It depends on the implementation of std::rotate.It could be 1, 3, 4, 5, 2; but it could be something else as well.

Please take a moment to convince yourselves of this :-) Pencil and paper helps, otherwise, here's a convenient Godbolt.

Vector erasure must be specified

The conclusion is that the implementation strategy does matter for types with write-through reference semantics! This is a very different outcome than what we might have expected (or desired, even).

For this reason, the Standard Library clearly documents the behavior of std::vector::erase in [vector.modifiers]:

Complexity: The destructor of T is called the number of times equal to the number of the elements erased, but the assignment operator of T is called the number of times equal to the number of elements in the vector after the erased elements.

This pretty much imposes a specific implementation strategy (move assign the tail to the left), so you can rely on the outcome, even in the presence of types with write-through reference semantics.

I'll also add an extra consideration: even if this behavior wasn't fully documented, due to Hyrum's law, we still couldn't "just change" the implementation of erase, because existing code will be affected -- it will be able to tell the difference.

What about QVector? Does it document its erasure behavior? Well... let's talk about this later, shall we?

Is `IntRef` trivially relocatable?

Now, here's a million dollar question: would you say that IntRef is trivially relocatable or that it's not?

Maybe you're tempted to say "it is not" based on what you've just seen -- IntRef has a "strange" behavior, different from most types that we "normally" use. That should be enough to disqualify it, shouldn't it?

Let's not rush to the answer just yet! For the benefit of the discussion, I am going to repeat our current definition of relocability: to move construct an object, and destroy the source object. Relocability is trivial when these two operations can be realized by simply copying bytes.

For example:

int is trivially relocatable. std::unique_ptr<T> is trivially relocatable. QString is trivially relocatable.
std::string is usually not trivially relocatable (self-referential due to SSO). std::list may also not be (the first node and/or a sentinel node of the list may contain a pointer to the std::list object itself).
QObject isn't relocatable (it is not movable at all).

So, what about our IntRef?

If we stick to this very definition the correct answer is that IntRef is absolutely, unequivocally trivially relocatable. We can replace a move construction of an IntRef object, followed by the destruction of the source, with a mere memcpy; the result is completely equivalent.

IntRef satisfies our definition of trivial relocability.

If you think about it, IntRef simply contains a reference -- that is, a pointer. It's not very different from the memory layout of a std::unique_ptr from this point of view, and we know we can relocate that.

IntRef does show its reference semantics on assignment, but remember, our current definition of (trivial) relocability does not talk about assignments at all.

As a corollary, this fact means that it would be perfectly safe to grow a vector<IntRef> using memcpy when it needs a reallocation!

What about erasure?

If IntRef is trivially relocatable, can we erase an IntRef object from a vector, using memmove to do so?

Again, based on what we have seen, the answer is no, we cannot. Erasure is fully specified to use assignments, which yield side effects for a type like IntRef. If we just moved bytes around, those side effects would not be realized, breaking our programs.

The problem

We have reached a contradiction in our design:

IntRef is trivially relocatable as it satisfies our definition.
Qt optimizes the erasure of trivial relocatable types by using memmove, but
IntRef cannot be erased by copying bytes! It needs to be erased by using move assignments.

To solve this contradiction, something has to give in:

maybe vector should use a different strategy when erasing elements;
maybe types like IntRef should not be allowed to be stored in a vector;
maybe Qt should not be using memmove when erasing objects of trivially relocatable type (but it can still optimize the reallocation of a vector);
maybe Qt's definition of trivial relocability does not match ours, and we need to fix our definitions.

We will explore these possibilities in the next blog post. Stay tuned!

Overview about all installments:

Tags:

c++qml qt

About KDAB

The KDAB Group is a globally recognized provider for software consulting, development and training, specializing in embedded devices and complex cross-platform desktop applications. In addition to being leading experts in Qt, C++ and 3D technologies for over two decades, KDAB provides deep expertise across the stack, including Linux, Rust and modern UI frameworks. With 100+ employees from 20 countries and offices in Sweden, Germany, USA, France and UK, we serve clients around the world.

Giuseppe D’Angelo

Senior Software Engineer

Senior Software Engineer at KDAB. Giuseppe is a long-time contributor to Qt, having used Qt and C++ since 2000, and is an Approver in the Qt Project. His contributions in Qt range from containers and regular expressions to GUI, Widgets, and OpenGL. A free software passionate and UNIX specialist, before joining KDAB, he organized conferences on opensource around Italy. He holds a BSc in Computer Science.