Qt and Trivial Relocation (Part 1) What is relocation?

07.05.2024

Giuseppe D'Angelo

The container classes introduced in Qt 4 (Tulip, for the aficionados) had an interesting optimization: the ability to turn certain operations on the contained objects into byte-level manipulations.

Example: vector reallocation

Consider the reallocation of a QVector<T>: when the vector is full and we want to insert a new value (of type T), the vector has to allocate a bigger block of memory.

Simplifying, the way this is done for a generic type T is:

1. Allocate a bigger block of memory.
2. Move-construct the T objects from the current storage into the new memory.

Elements are being moved. The red, underlined elements in the source represent moved-from objects. C will be moved next.

3. Destroy the moved-from objects in the old storage.
4. Deallocate the old block of memory.
5. Update bookkeeping (adjust data pointer, size, capacity, as needed).

In pseudocode, this looks something like this:

template <typename T>
void vector<T>::reallocate_impl(size_t new_capacity)
{
    assert(m_size <= new_capacity);
    T *new_storage = allocate(new_capacity);   // raw, uninitialized memory

    // move-construct each element into the new block, then destroy the sources
    std::uninitialized_move(m_begin, m_begin + m_size, new_storage);
    std::destroy(m_begin, m_begin + m_size);

    deallocate(m_begin);
    m_begin = new_storage;
    m_capacity = new_capacity;
}

Depending on the operation that causes the reallocation (push_back, insert in the middle, simple reserve) a new element may also be added to the newly allocated block of memory; that’s not really relevant here. I am also deliberately ignoring a lot of details, such as exception handling, move_if_noexcept, out of memory, allocators and similar.
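To make the pseudocode concrete, here is a minimal, compilable sketch of the same five steps. `TinyVector` and its members are hypothetical names for illustration only; this is not how QVector is implemented, and it ignores the same details as above (exceptions, allocators, over-aligned types).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>   // only needed for the usage example
#include <utility>

// Hypothetical minimal vector, for illustration only.
template <typename T>
class TinyVector
{
public:
    TinyVector() = default;
    TinyVector(const TinyVector &) = delete;
    TinyVector &operator=(const TinyVector &) = delete;

    ~TinyVector()
    {
        std::destroy(m_begin, m_begin + m_size);
        deallocate(m_begin);
    }

    void push_back(T value)
    {
        if (m_size == m_capacity)
            reallocate_impl(m_capacity == 0 ? 4 : m_capacity * 2);
        ::new (static_cast<void *>(m_begin + m_size)) T(std::move(value));
        ++m_size;
    }

    std::size_t size() const { return m_size; }
    std::size_t capacity() const { return m_capacity; }
    const T &operator[](std::size_t i) const { return m_begin[i]; }

private:
    static T *allocate(std::size_t n)
    {
        return static_cast<T *>(::operator new(n * sizeof(T)));
    }
    static void deallocate(T *p) { ::operator delete(p); }

    void reallocate_impl(std::size_t new_capacity)
    {
        assert(m_size <= new_capacity);
        T *new_storage = allocate(new_capacity);                          // 1. allocate

        std::uninitialized_move(m_begin, m_begin + m_size, new_storage);  // 2. move-construct
        std::destroy(m_begin, m_begin + m_size);                          // 3. destroy sources

        deallocate(m_begin);                                              // 4. deallocate
        m_begin = new_storage;                                            // 5. bookkeeping
        m_capacity = new_capacity;
    }

    T *m_begin = nullptr;
    std::size_t m_size = 0;
    std::size_t m_capacity = 0;
};
```

Pushing ten `std::string` objects into a `TinyVector<std::string>` triggers three reallocations with this growth policy (capacity 4, then 8, then 16), and each one move-constructs and destroys every existing element.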

The optimization

Types like int don’t do anything when they are constructed or destructed. If an int object is simply memcpy-ed to another place in memory, you would get the same integer in another place. Moreover, there would be nothing to do to destroy the original integer.

Therefore, if you have a QVector<int>, the above approach has some steps that are not needed. There’s a much faster way to achieve the same result:

1. Allocate a bigger block of memory.
2. memcpy all the data from the old storage to the new one.
3. Deallocate the old storage.
4. Update bookkeeping.

In pseudocode:

template <typename T>
void vector<T>::reallocate_impl(size_t new_capacity)
{
    assert(m_size <= new_capacity);
    T *new_storage = allocate(new_capacity);

    if constexpr (/* ... magic ... */) {
        // fast path: a single byte copy replaces all the moves and destructions
        std::memcpy(new_storage, m_begin, m_size * sizeof(T));
    } else if constexpr (std::is_nothrow_move_constructible_v<T>) {
        std::uninitialized_move(m_begin, m_begin + m_size, new_storage);
        std::destroy(m_begin, m_begin + m_size);
    } else {
        // ...
    }

    deallocate(m_begin);
    m_begin = new_storage;
    m_capacity = new_capacity;
}

What is going on here?

Qt is exploiting the fact that, for some datatypes, move-construction of an object A into a new object B, followed by the destruction of A, is equivalent to (and can be implemented as) a byte copy (memcpy of A’s object representation) into some suitable storage. In pseudocode:

// Given:
T *ptrA = ~~~;   // points to a valid object
T *ptrB = ~~~;   // points to uninitialized storage

// Then this:
new (ptrB) T(std::move(*ptrA));   // move-construct A into new storage (placement new)
ptrA->~T();                       // destroy A

// can be implemented like this:
memcpy(ptrB, ptrA, sizeof(T));

The combination of move-construction followed by destruction was happening in the generic code for vector reallocation. Instead of that, in the optimized version, we memcpy each object.

Note that in the optimized version we never call the move constructor of T; and we never call the destructor of A. They are both realized by the call to memcpy.

It’s easy to convince ourselves that this works for a type like int, where we can copy its object representation and end up with the same value.
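As a small sanity check of that claim, the two ways of relocating can be written side by side. The function names below are made up for this sketch, and the byte-copy version is deliberately restricted to trivially copyable types, for which C++ already allows memcpy.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <type_traits>
#include <utility>

// Generic relocation: move-construct into `dst`, then destroy the source.
template <typename T>
T *relocate_generic(T *src, void *dst)
{
    T *result = ::new (dst) T(std::move(*src));
    src->~T();
    return result;
}

// Byte-wise relocation: one memcpy, no constructor or destructor calls.
// Restricted to trivially copyable types in this sketch.
template <typename T>
T *relocate_bytes(T *src, void *dst)
{
    static_assert(std::is_trivially_copyable_v<T>,
                  "this sketch only handles trivially copyable types");
    std::memcpy(dst, src, sizeof(T));
    return static_cast<T *>(dst);
}
```

For an `int` holding 42, both functions leave an `int` with the value 42 in the destination storage; the second one simply gets there with less work.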

What about a more complicated type, like `QString`?

It still works! When we memcpy a QString, we end up creating a new QString object that points to the same payload as the original, so that part is fine. Now, normally, when we create a copy of a QString, we would need to increase its reference counter. But here we are not creating a new copy: we are moving from the original string, and destroying the original. The consequence is that the reference counter of the string does not need to change; in the end the total number of QString objects pointing to that payload is the same. This is ensured by the fact that we are not running QString’s destructor over the original object.

QString is trivially relocatable: we can memcpy the QString object (just a pointer, really) and deallocate the original without running its destructor. This achieves the same effects as move-construction + destruction.
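The reference-counting argument can be illustrated with a toy class. `SharedText` and `Payload` are hypothetical stand-ins for an implicitly shared type; they are not QString's real layout. Note also that treating the copied bytes as a live object is exactly the trick under discussion, and it is not something standard C++ currently guarantees for a type with a user-defined destructor; the sketch relies on it working in practice.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical shared payload with a (non-atomic) reference counter.
struct Payload
{
    int refcount = 1;
    const char *text = "";
};

// Hypothetical implicitly shared handle: just a pointer to the payload.
class SharedText
{
public:
    explicit SharedText(const char *text) : d(new Payload{1, text}) {}
    SharedText(const SharedText &other) : d(other.d) { ++d->refcount; }
    SharedText &operator=(const SharedText &) = delete;
    ~SharedText()
    {
        if (--d->refcount == 0)
            delete d;
    }

    int refcount() const { return d->refcount; }
    const char *text() const { return d->text; }

private:
    Payload *d;
};

// Relocates a SharedText with memcpy and reports the reference count
// seen through the relocated object.
int refcount_after_trivial_relocation()
{
    alignas(SharedText) unsigned char source[sizeof(SharedText)];
    alignas(SharedText) unsigned char target[sizeof(SharedText)];

    ::new (static_cast<void *>(source)) SharedText("hello");

    // Trivial relocation: copy the bytes and consider the source gone.
    // Neither the move constructor nor the source's destructor runs.
    std::memcpy(target, source, sizeof(SharedText));
    SharedText *relocated = reinterpret_cast<SharedText *>(target);

    const int count = relocated->refcount();
    relocated->~SharedText();   // the only remaining owner frees the payload
    return count;
}
```

The counter stays at 1 throughout: one handle existed before the relocation, one exists after it, and the payload is freed exactly once.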

We can do even better

We have just established that we can use memcpy instead of move constructing one object (from the old storage into the new) and destroying the old object.

Now, QVector has to repeat this operation for multiple objects; since they are all stored contiguously in memory, QVector can take a shortcut, and simply do a bulk memcpy of its contents!

This can be pushed even further: QVector simply calls realloc when reallocating, which is even better, because it allows the allocated area to grow in place when possible. Otherwise, realloc itself will move the data in memory. Simple, easy and efficient.
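A rough sketch of the realloc idea follows. `grow_with_realloc` is a hypothetical helper, not a Qt function; it assumes the old block came from malloc/realloc (or is null), and it is restricted to trivially copyable types to stay within what standard C++ permits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>

// Grows (or shrinks) a malloc-allocated array to `new_capacity` elements.
// realloc either extends the block in place or allocates a new block,
// copies the bytes over and frees the old one.
template <typename T>
T *grow_with_realloc(T *old_storage, std::size_t new_capacity)
{
    static_assert(std::is_trivially_copyable_v<T>,
                  "this sketch only handles trivially copyable types");
    void *p = std::realloc(old_storage, new_capacity * sizeof(T));
    if (!p)
        throw std::bad_alloc();   // on failure the old block is left untouched
    return static_cast<T *>(p);
}
```

The caller only has to update its data pointer and capacity afterwards; the elements are already in place, whichever way realloc chose to satisfy the request.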

Terminology

In Qt we call a type that can be moved in memory via memcpy relocatable, but this specific jargon means different things to different people. I’ll try to clarify it in the rest of this post, as we go along.

For the moment, let’s try to be slightly more precise than Qt is, and use this vocabulary:

“relocation” means to move-construct an object, and destroy the source;
“trivial relocation” means to do the same, but achieve it via a simple memcpy.

However, please take this definition with a grain of salt. In the next instalments we will need to tweak it, as we discover some interesting properties of relocation.

Does this mean that QVector uses trivial relocation for all types?

No, it doesn’t. While it works for things like int, QPoint, QString, or QPen, this optimization is not safe to do in general: for some types Qt cannot replace move construction + destruction with a memcpy.

For instance, consider a type like a string class which employs the so-called Short/Small String Optimization, or SSO. This would be for instance std::string on most implementations. (For more background about SSO, see here.)

class string
{
    // may point to the heap, or into `m_buffer` if the string
    // is short enough:
    char *m_begin;   
    
    size_t m_size;
    size_t m_capacity;
    
    char m_buffer[SSO_BUFFER_SIZE];
};

string s = "Hello";

What would happen if we tried to replace a move-construction + destruction of such a string class with a memcpy? Well, nothing good! In the source object the m_begin pointer may be pointing into itself.

If we try to relocate that string into another string via a memcpy, the new string’s pointer will point to the old object’s buffer; but the old object has been destroyed!

In fact, the move constructor of this string type would probably look something like this.
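Here is one possible sketch of such a move constructor, built around the layout shown above. `sso_string`, its buffer size of 16 and its helper functions are assumptions made for this example; real std::string implementations differ in many details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical SSO string, for illustration only.
class sso_string
{
public:
    static constexpr std::size_t SSO_BUFFER_SIZE = 16;

    explicit sso_string(const char *text)
        : m_size(std::strlen(text))
    {
        if (m_size < SSO_BUFFER_SIZE) {
            m_begin = m_buffer;                 // short: store inside the object
            m_capacity = SSO_BUFFER_SIZE - 1;
        } else {
            m_begin = new char[m_size + 1];     // long: store on the heap
            m_capacity = m_size;
        }
        std::memcpy(m_begin, text, m_size + 1);
    }

    sso_string(sso_string &&other) noexcept
        : m_size(other.m_size), m_capacity(other.m_capacity)
    {
        if (other.is_short()) {
            // The characters live inside `other`, so copy them and make
            // m_begin point into *this* object's buffer, not other's.
            m_begin = m_buffer;
            std::memcpy(m_buffer, other.m_buffer, m_size + 1);
        } else {
            m_begin = other.m_begin;            // steal the heap block
        }
        // Leave the source as a valid, empty short string.
        other.m_begin = other.m_buffer;
        other.m_buffer[0] = '\0';
        other.m_size = 0;
        other.m_capacity = SSO_BUFFER_SIZE - 1;
    }

    sso_string(const sso_string &) = delete;
    sso_string &operator=(const sso_string &) = delete;

    ~sso_string()
    {
        if (!is_short())
            delete[] m_begin;
    }

    bool is_short() const { return m_begin == m_buffer; }
    const char *c_str() const { return m_begin; }
    std::size_t size() const { return m_size; }
    std::size_t capacity() const { return m_capacity; }

private:
    char *m_begin;
    std::size_t m_size;
    std::size_t m_capacity;
    char m_buffer[SSO_BUFFER_SIZE];
};
```

The key line is the one that re-points `m_begin` at the destination's own `m_buffer`. A raw memcpy of a short `sso_string` cannot do that fix-up: the copy's `m_begin` would keep pointing at the source's buffer.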

The bottom line is that there is no way to know “from the outside” if a generic type T can be moved in memory via memcpy. Qt must assume that it cannot do that, and therefore Qt by default sticks to the vector reallocation algorithm that move-constructs and destroys each element.

The types that cannot be moved in memory via memcpy are, generally speaking, types whose objects’ invariants depend on their specific address in memory. This is the case of self-referential types (like the string type), but also types which are referenced externally (for instance, a node in a linked list is pointed to by the successor and the predecessor in the list, so we can’t just move it around in memory without breaking those links).

So how does Qt know that QString can be trivially relocated?

QString isn’t “hardcoded” to be special; any type can tell Qt that it is safe to be trivially relocated, and it does so by using a macro:

class MyClass { 
    ~~~ 
};

Q_DECLARE_TYPEINFO(MyClass, Q_RELOCATABLE_TYPE);

The macro expands to a template specialization that says that MyClass is opting in to the optimization. For QString the macro is here, just hidden behind another macro.
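The general shape of such an opt-in mechanism can be sketched in plain C++. Everything here (`is_relocatable`, `DECLARE_RELOCATABLE`, `NotOptedIn`) is a hypothetical name chosen for the example; Qt's actual trait and macro are structured differently, so treat this only as an illustration of "a macro that expands to a template specialization".

```cpp
#include <cassert>
#include <type_traits>

// Default answer: only trivially copyable types may be moved with memcpy.
template <typename T>
struct is_relocatable : std::is_trivially_copyable<T> {};

// Opt-in: expands to an explicit specialization for `Type`.
#define DECLARE_RELOCATABLE(Type) \
    template <> struct is_relocatable<Type> : std::true_type {}

// A value class that owns heap data through a single pointer.
class MyClass
{
public:
    MyClass() : d(new int(0)) {}
    MyClass(const MyClass &other) : d(new int(*other.d)) {}
    MyClass &operator=(const MyClass &other) { *d = *other.d; return *this; }
    ~MyClass() { delete d; }

private:
    int *d;
};

// Has a user-provided destructor and never opts in.
class NotOptedIn
{
public:
    ~NotOptedIn() {}
};

DECLARE_RELOCATABLE(MyClass);

static_assert(is_relocatable<int>::value);         // trivially copyable
static_assert(is_relocatable<MyClass>::value);     // opted in explicitly
static_assert(!is_relocatable<NotOptedIn>::value); // conservative default
```

A container can then branch on `is_relocatable<T>::value` in an `if constexpr` to choose between the memcpy path and the move-and-destroy path.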

Since most Qt value classes are pimpled (they just wrap a pointer to the private class, allocated on the heap; for more info, see here and here) it turns out that most of them benefit from this optimization, which is likely why it got introduced in the first place.

Q_DECLARE_TYPEINFO is public API; you are supposed to mark your own trivially relocatable types with it.

What does `Q_RELOCATABLE_TYPE` mean?

According to the documentation:

Q_RELOCATABLE_TYPE specifies that Type has a constructor and/or a destructor but can be moved in memory using memcpy().

This seems to pretty much match our previous definition of a trivially relocatable type. The wording is however a bit imprecise: it does not clarify what “moved in memory” really means; that is, what sequence of operations is realized through the call to memcpy.

Also, to nitpick, the wording should be talking about a type that has copy or move constructors, assignments, and/or a destructor. If any of those are user-defined, then the type is no longer trivially copyable. We are already authorized by C++ to use memcpy for trivially copyable types; in fact, if a type is trivially copyable, then Qt automatically considers it to be trivially relocatable.

I thought it was `Q_MOVABLE_TYPE`?

In Qt 4 the macro was indeed called Q_MOVABLE_TYPE. That name however clashed with “move semantics” (introduced later in C++: Qt 4 was released in 2005!), so Qt wanted to move away from it, in order to avoid any confusion. While that spelling is still around, in Qt 5 and 6 one should prefer the name Q_RELOCATABLE_TYPE.

To nitpick, it should have really been called Q_TRIVIALLY_RELOCATABLE_TYPE, but that ship has sailed.

Does Qt only use trivial relocation for QVector reallocation?

No, it also uses it in a few other places. For instance:

QVarLengthArray is also a vector-like data structure, and uses a number of similar optimizations;
QVariant (Qt’s equivalent of std::any) allocates memory in order to store the object it contains, but it also has a “small object optimization” (SOO), where the held object is stored directly inside the QVariant object. SOO is enabled only if:
1. the target object is small enough to fit (obviously);
2. but also only if the object is trivially relocatable.
In this sense, QVariant’s move constructor is type-erased: if SOO is in use, then the buffer is simply copied over into the new variant (as in, memcpy), and the old one is cleared.
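The two conditions and the type-erased move can be sketched as follows. `InlineSlot`, `can_store_inline`, `store`, `load` and `move_slot` are invented for this example and do not reflect QVariant's real layout or API; trivial copyability is used as a conservative stand-in for "trivially relocatable".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical inline storage of a variant-like class.
struct InlineSlot
{
    static constexpr std::size_t Size = 3 * sizeof(void *);
    alignas(std::max_align_t) unsigned char bytes[Size];
    bool occupied = false;
};

// Inline storage is allowed only if the object fits AND can be moved
// around with memcpy.
template <typename T>
inline constexpr bool can_store_inline =
    sizeof(T) <= InlineSlot::Size
    && alignof(T) <= alignof(std::max_align_t)
    && std::is_trivially_copyable_v<T>;

template <typename T>
void store(InlineSlot &slot, const T &value)
{
    static_assert(can_store_inline<T>);
    std::memcpy(slot.bytes, &value, sizeof(T));
    slot.occupied = true;
}

template <typename T>
T load(const InlineSlot &slot)
{
    T value;
    std::memcpy(&value, slot.bytes, sizeof(T));
    return value;
}

// Type-erased move: it does not know (or need to know) the stored type.
inline void move_slot(InlineSlot &to, InlineSlot &from)
{
    std::memcpy(to.bytes, from.bytes, InlineSlot::Size);
    to.occupied = from.occupied;
    from.occupied = false;   // the source is cleared; no destructor runs
}
```

Because `move_slot` is a plain byte copy, it only works if every type that was allowed into the buffer tolerates being relocated that way, which is why the second condition exists.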

There’s another instance of using trivial relocation which deserves a special discussion: certain algorithms are optimized in Qt for trivially relocatable types. For instance, inserting or erasing elements from a QVector of a trivially relocatable type is optimized as well.
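To give an idea of what such an optimized algorithm might look like, here is a sketch of erasing one element from a contiguous array. `erase_at` is a hypothetical free function, not Qt code, and it again uses trivial copyability as a stand-in for "trivially relocatable".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

// Erases data[index] from an array of `size` elements by destroying it and
// sliding the tail down one slot with a single memmove.
template <typename T>
void erase_at(T *data, std::size_t &size, std::size_t index)
{
    static_assert(std::is_trivially_copyable_v<T>,
                  "stand-in for 'trivially relocatable' in this sketch");
    assert(index < size);

    std::destroy_at(data + index);                        // destroy the erased element
    std::memmove(static_cast<void *>(data + index),       // relocate the tail
                 data + index + 1,
                 (size - index - 1) * sizeof(T));
    --size;
}
```

Erasing index 1 from `{10, 20, 30, 40, 50}` leaves `{10, 30, 40, 50}` in the first four slots. The generic alternative would move-assign each tail element onto its predecessor and then destroy the last one.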

This has some important consequences. Since this post is already overly long, I am going to analyze them in the next instalment of this series. 🙂

Thank you for reading so far.

2 thoughts on “Qt and Trivial Relocation (Part 1)”

Robert 13.06.2024 5:09 pm

I have written a small sample that simply detaches a list with 10k items of a custom type 10k times, and I could not measure any benefit from specifying Q_RELOCATABLE_TYPE for my custom type. This is true both for Debug and Release builds (Release was about 3x faster in any case).
I admit, it was a very quick hack, not a properly set up profiling run, but do you have counter-examples where specifying Q_RELOCATABLE_TYPE really brought measurable benefits?

1. Giuseppe D'Angelo 14.06.2024 4:17 pm
  Hi,
  
  Trivial relocation isn’t going to give you performance benefits for detaching. When you detach a Qt container, a copy of each element needs to be taken, and there are no real shortcuts for that (unless the type is trivially copyable or similar).
  
  Instead, you’re going to get performance benefits when a QVector is reallocated, or when certain operations are applied to it (for instance, erasing an element from the middle). A realistic benchmark is something along these lines:
  
  Create a Rule Of 5 class, where construction, copy construction/assignment, destruction are expensive and out of line. For instance, a pimpled value class, or something like std::shared_ptr.
  
  Create a QList of that class and do some operations on it (e.g. trigger a reallocation), measuring the cost.
  
  Do the same after marking the class trivially relocatable (via Q_DECLARE_TYPEINFO).
  
  The benchmark should show a huge improvement.