Getting rid of "volatile" in (some of) Qt

4 March 2020

`volatile foo` used in place of `std::atomic<foo>`

As usual, the problem here is that volatile in C++ has nothing to do with synchronization/atomicity. (Ok, that's not entirely true on all systems, but it's true enough.)

There are at least two big offenders I found this way.

1: `QQmlIncubationController::incubateWhile(volatile bool *)`

The incubateWhile call takes a pointer to volatile bool. The idea is that the pointed-to boolean gets periodically checked by the incubation controller; another thread can tell the controller to stop incubating by flipping that bool.

To be honest, I'm not 100% convinced by the soundness of this API. For instance, it could've accepted a function object to poll, moving the problem of any synchronization with another thread to the user's domain. When in doubt, making it someone else's problem is usually a good choice.
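A minimal sketch of what such a predicate-based API could look like. Note that this `incubateWhile` is a hypothetical free function written for illustration, not the real QQmlIncubationController member; the `maxSteps` parameter is also an assumption, added only to keep the sketch bounded.

```cpp
#include <cassert>

// Hypothetical stand-in for the controller; NOT the real Qt API.
// Polls the caller-supplied predicate before every unit of work, so whatever
// synchronization the predicate needs is entirely the caller's business.
// Returns the number of steps actually performed.
template <typename Predicate>
int incubateWhile(Predicate keepGoing, int maxSteps)
{
    assert(maxSteps >= 0);
    int steps = 0;
    while (steps < maxSteps && keepGoing())
        ++steps;   // stands in for one unit of incubation work
    return steps;
}
```

A caller could pass a lambda that reads a std::atomic<bool>, checks a deadline, or does anything else it can make thread-safe on its own.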

But I don't want to redesign the QML engine APIs here, so I've just attempted a fix by porting to std::atomic<bool> instead.

The fix will appear in Qt 5.15.
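For reference, this is roughly what the std::atomic<bool> approach looks like in isolation. It's a sketch with made-up names (`workWhile`, `runAndStop`), not the code of the actual Qt patch.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical worker loop: keeps going for as long as *flag is true.
// Returns the number of iterations performed.
long workWhile(const std::atomic<bool> *flag)
{
    assert(flag != nullptr);
    long steps = 0;
    while (flag->load())   // atomic load: no data race with the writer
        ++steps;
    return steps;
}

// Runs the worker on its own thread and stops it from the calling thread.
long runAndStop()
{
    std::atomic<bool> keepGoing{true};
    long steps = -1;
    std::thread worker([&] { steps = workWhile(&keepGoing); });
    keepGoing.store(false);   // well-defined, unlike flipping a volatile bool
    worker.join();            // join synchronizes, so reading steps is safe
    return steps;
}
```

With a plain `volatile bool` in place of the atomic, the concurrent store and load would be a data race as far as the C++ memory model is concerned.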

2: `QObjectPrivate::threadData`

QObjectPrivate::threadData is a raw pointer to QThreadData (i.e. QThreadData *), basically storing the thread affinity of a given QObject, by holding a pointer to that thread's private data.

The problem is that that variable is read and written by multiple threads without synchronization. For instance it's written from here:

void QObjectPrivate::setThreadData_helper(QThreadData *currentData, QThreadData *targetData)
{
    ...

    // set new thread data
    targetData->ref();
    threadData->deref();
    threadData = targetData; // <-- the write

    ...
}

and potentially read from here:

void QCoreApplication::postEvent(QObject *receiver, QEvent *event, int priority)
{
    ...

    QThreadData * volatile * pdata = &receiver->d_func()->threadData; // <-- the read
    ...
}

These calls can happen from different threads, without synchronization — postEvent is documented to be thread-safe. Therefore, we have a data race, and data races are undefined behavior.

This ended up being super-tricky to fix (and I'm still not sure of it).

Sure, the idea is to make QObjectPrivate::threadData an atomic pointer to QThreadData, but then how to fix all the usage points?

The big struggle is that QObject is merely reentrant, and not thread-safe, so most cases don't require any synchronization (and can be turned into relaxed loads) because you're not supposed to be modifying that variable from multiple threads without synchronization. Big exception: the code paths that are documented to be thread-safe, like the postEvent code path above, which thus require acquire loads.
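To make the relaxed/acquire distinction concrete, here's a small sketch. `ThreadData` and `ObjectPrivate` are simplified stand-ins I made up for QThreadData and QObjectPrivate, and the accessor names are assumptions; the real patch is structured differently.

```cpp
#include <atomic>
#include <cassert>

struct ThreadData { int id; };   // hypothetical stand-in for QThreadData

struct ObjectPrivate {           // hypothetical stand-in for QObjectPrivate
    std::atomic<ThreadData *> threadData{nullptr};

    // For code that runs in the object's own thread (or under external
    // synchronization): no concurrent writer exists, so relaxed is enough.
    int ownerThreadId() const
    {
        ThreadData *d = threadData.load(std::memory_order_relaxed);
        assert(d != nullptr);   // this sketch assumes the affinity is set
        return d->id;
    }

    // For thread-safe entry points: the acquire load pairs with the
    // release store below, so the pointed-to data is visible too.
    ThreadData *threadDataAnyThread() const
    { return threadData.load(std::memory_order_acquire); }

    void setThreadData(ThreadData *d)
    { threadData.store(d, std::memory_order_release); }
};
```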

The current fix draft is sitting here.

The most interesting part is probably the old loop in postEvent(), trying to lock receiver's thread event queue:

    QThreadData * volatile * pdata = &receiver->d_func()->threadData;
    QThreadData *data = *pdata;
    if (!data) {
        // posting during destruction? just delete the event to prevent a leak
        delete event;
        return;
    }

    // lock the post event mutex
    data->postEventList.mutex.lock();

    // if object has moved to another thread, follow it
    while (data != *pdata) {
        data->postEventList.mutex.unlock();

        data = *pdata;
        if (!data) {
            // posting during destruction? just delete the event to prevent a leak
            delete event;
            return;
        }

        data->postEventList.mutex.lock();
    }

This loop used volatile once more to "cheat" synchronization: the code needs to get the threadData of an object and lock its event queue. Before that's done, however, the affinity of the object could've been changed, so we need to unlock and try again. The idea is sound, but volatile does not help at all here as far as C++ is concerned; it seems to work because the compiler actually reloads the value of receiver's threadData, since its access happens through a volatile pointer.
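The same lock-and-recheck idea can be expressed with an atomic pointer instead of volatile. The following is a sketch under simplifying assumptions: `ThreadData`, `lockCurrentQueue` and `unlockQueue` are hypothetical names, and the reference counting that keeps the thread data alive in Qt is left out entirely.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct ThreadData {              // hypothetical stand-in for QThreadData
    std::mutex postEventMutex;
};

// Locks the event-queue mutex of whichever ThreadData `slot` currently points
// to, following the object if its affinity changes in the meantime.
// Returns the locked ThreadData, or nullptr (nothing locked) if the slot is empty.
ThreadData *lockCurrentQueue(const std::atomic<ThreadData *> &slot)
{
    ThreadData *data = slot.load(std::memory_order_acquire);
    while (data) {
        data->postEventMutex.lock();
        ThreadData *now = slot.load(std::memory_order_acquire);
        if (now == data)
            return data;                 // affinity unchanged: we hold the right lock
        data->postEventMutex.unlock();   // object moved: retry with the new target
        data = now;
    }
    return nullptr;
}

// Releases a queue previously locked by lockCurrentQueue.
void unlockQueue(ThreadData *data)
{
    assert(data != nullptr);
    data->postEventMutex.unlock();
}
```

The reload after taking the lock is what the old code relied on volatile for; here it's an ordinary atomic load, so the compiler has to perform it and the access is not a data race.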

As a side note: the same loop was completely missing from other places, like QCoreApplication::removePostedEvents, that also need to lock an object's event queue (oops!). So I fixed that too, as a drive-by change...

`volatile` as a way to tell the compiler to... stop thinking

The QLibraryInfo class stores, amongst other things, the path to Qt's own installed "parts" (plugins, translations, etc.). This path is hardcoded at build time; it's one of the reasons why a Qt installation cannot be easily relocated.

In QLibraryInfo there's a mysterious "const char * volatile path" variable:

#ifndef QT_BUILD_QMAKE_BOOTSTRAP
    if (!fromConf) {
        const char * volatile path = 0;
        if (loc == PrefixPath) {
            path = getPrefix(
#ifdef QT_BUILD_QMAKE
                        group
#endif
                   );
        } else if (unsigned(loc) <= sizeof(qt_configure_str_offsets)/sizeof(qt_configure_str_offsets[0])) {
            path = qt_configure_strs + qt_configure_str_offsets[loc - 1];
#ifndef Q_OS_WIN // On Windows we use the registry
        } else if (loc == SettingsPath) {
            path = QT_CONFIGURE_SETTINGS_PATH;
#endif
# ifdef QT_BUILD_QMAKE
        } else if (loc == HostPrefixPath) {
            static const QByteArray hostPrefixPath = getHostPrefixFromHostBinDir().toLatin1();
            path = hostPrefixPath.constData();
# endif
        }

        if (path)
            ret = QString::fromLocal8Bit(path);
    }
#endif

What's it for? It turned out to be a complete hack.

In order to achieve relocation of Qt libraries (for instance, to allow the binary Qt installers to install Qt in any path specified by the user), Qt can be built with a dummy installation path. The installer will then binary patch Qt, replacing this dummy path with the actual installation path.

What's the hack, then? The hack is to prevent the compiler from aggressively inlining the call to QString::fromLocal8Bit below, which involves a call to strlen(path).

With path set to a compile-time string literal, strlen(path) would also be fully evaluated at compile time, and that would make the binary patch not work -- it would end up with path and its length not matching, making fromLocal8Bit return broken data or crash horribly. This has actually happened and has been reported as QTBUG-45307.

How to stop the compiler from calling strlen at compile-time? There you have it, don't use a "const char *", use a "const char * volatile" (a volatile pointer to const char). I have no idea why it actually works!
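Here's the pattern reduced to its bare bones. The likely explanation (my assumption, not something confirmed by the Qt sources) is that a read of a volatile object counts as an observable side effect, so compilers load the pointer at run time and can no longer assume it still points at the literal. `kPath` and the function names are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

static const char kPath[] = "/dummy/install/prefix";   // hypothetical placeholder path

// The compiler can see which literal `path` points to, so it is free to
// fold the strlen call into a constant.
std::size_t lengthFoldable()
{
    const char *path = kPath;
    return std::strlen(path);
}

// `path` itself is volatile: in practice the compiler reloads the pointer and
// calls strlen on whatever it finds there at run time.
std::size_t lengthOpaque()
{
    const char * volatile path = kPath;
    return std::strlen(path);
}

// Both versions agree as long as nobody has patched the binary.
void checkLengthsAgree()
{
    assert(lengthOpaque() == lengthFoldable());
}
```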

Didn't fix it, but left a comment for the future reader.

`volatile` for `setjmp`/`longjmp` protection

Now we go into the legitimate use cases for volatile. C (and thus C++) defines the setjmp and longjmp library functions. They do non-local jumps: longjmp jumps to a setjmp done somewhere before in the call stack.

In C they're used as a poor man's exception handling mechanism -- return from a deep call stack up directly to a caller, unwinding the stack in the process. Since it's C, every object is trivially destructible, so there's nothing special to do to perform the unwinding. In C++ performing a longjmp that would destroy non-trivially destructible objects yields undefined behavior.

volatile has an interesting interaction with setjmp/longjmp: it's used to give well-defined semantics when using function local variables across the jump.

For instance, a snippet like this:

volatile int i = 0;
jmp_buf buf;

if (!setjmp(buf)) {
   // first time, enter here
   i = 42;
   longjmp(buf, 1);
}

// then go here after the longjmp
std::cout << i;

exhibits well-defined behavior (and prints 42) specifically because i is volatile. Without the volatile, i would have an indeterminate value, and producing such an indeterminate value in this context is undefined behavior (I'm sensing a pattern here...). In other words: any local variable that is modified between the setjmp and the longjmp, and read after the longjmp, must be declared volatile.

In layman's terms: this volatile is telling the compiler something like "do not put i in a register and don't optimize it out in any way; always keep it spilled on the stack, so it's safe in case of a longjmp".

Qt has a couple of places where there are setjmp/longjmp calls, namely its image handlers. PNG and JPEG use the respective C libraries (libpng and libjpeg), which seem to be designed around setjmp/longjmp as a way to notify users of errors. That's also where there are more usages of volatile.

PNG handler

In the case of the PNG image handler, a local variable was declared volatile, but that local was actually never accessed after the longjmp; so volatile wasn't even needed in the first place. This has been fixed here.

JPEG handler

For the JPEG image handler, the situation was a bit more spicy. Sure thing, a local variable was again unnecessarily declared volatile (fix).

However, the same code was also showing undefined behavior by not protecting some locals with volatile. For instance here, simplifying the code, we have this pattern:

JSAMPROW row_pointer[1];
row_pointer[0] = 0;
...
if (!setjmp(jerr.setjmp_buffer)) {
    ...
    row_pointer[0] = new uchar[cinfo.image_width*cinfo.input_components];
    ...
}
delete [] row_pointer[0];

The row_pointer local variable was not volatile; it was written after the setjmp (line 615 in the original source file) and then read again after the longjmp (line 716).

The fix in this case couldn't be to simply mark it as volatile: the object in question is passed to libjpeg's API, which simply wants a pointer-to-object, and not a pointer-to-volatile-object (d'oh!). Marking an object with volatile, then const_cast'ing volatileness away and accessing the object is immediately undefined behavior, so don't even think about doing it...

Also, libjpeg's error handling requires the user to use setjmp/longjmp; therefore, the code had to keep the same structure (including the jumps). Basically, in libjpeg, one installs a custom error handling function; once libjpeg calls that handling function, it expects it not to return. So it must either exit the entire process, or longjmp out.

My fix was to make those variables non-local to the function calling setjmp, thus recovering well-defined behavior. You can read more about it in the fix's commit message here.
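The general shape of that idea, as a sketch: the state that changes between the setjmp and the longjmp lives in an object owned by the caller, so it's not an automatic variable of the function that calls setjmp. Everything here (`ErrorContext`, `DecodeState`, `onError`, `decodeRows`, `decode`) is a hypothetical stand-in; none of it is libjpeg's real API or the actual Qt patch.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical stand-ins; NOT libjpeg's real API.
struct ErrorContext { std::jmp_buf jump; };
struct DecodeState  { unsigned char *row = nullptr; int rowsDecoded = 0; };

// Error handler in the libjpeg style: it must not return to the library.
[[noreturn]] static void onError(ErrorContext *ctx) { std::longjmp(ctx->jump, 1); }

// Pretend library call that optionally fails partway through.
static void decodeRows(ErrorContext *ctx, DecodeState *state, bool corrupt)
{
    state->rowsDecoded = 1;
    if (corrupt)
        onError(ctx);
    state->rowsDecoded = 2;
}

// `*state` is owned by the caller, so it is not a local of the function that
// calls setjmp; its contents stay well-defined after the longjmp.
static bool decode(DecodeState *state, bool corrupt)
{
    assert(state != nullptr);
    ErrorContext ctx;
    if (setjmp(ctx.jump) == 0) {
        state->row = new unsigned char[16];
        decodeRows(&ctx, state, corrupt);
        return true;
    }
    return false;   // arrived here via longjmp
}
```

The caller can then inspect `rowsDecoded` and `delete []` the row buffer whether or not the decode failed, without any volatile in sight.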

`longjmp`s, `longjmp`s everywhere

Now the really interesting bit is that the very same code in the very same shape (and with the very same bug) is present everywhere.

For instance, it's present in the equivalent GTK code in GDK that loads JPEG images here:

struct jpeg_decompress_struct cinfo;
...
if (sigsetjmp (jerr.setjmp_buffer, 1)) {
    ...
    jpeg_destroy_decompress (&cinfo);
    ...
}
...
jpeg_create_decompress (&cinfo);

Or in ImageMagick, here:

struct jpeg_compress_struct
    jpeg_info;

  ...

  if (setjmp(error_manager.error_recovery) != 0)
    {
      jpeg_destroy_compress(&jpeg_info);
    }

  ...

  jpeg_create_compress(&jpeg_info);

It turns out that this kind of code goes all the way back to libjpeg version 6b's own examples, dated 1998!

Those examples showcased the non-local return via setjmp/longjmp, and were affected by the same undefined behavior. You can still download them here (check out the sources for libjpeg 6b).

libjpeg-turbo's historical example also had the same problem. Comments are mine:

// 1. declare a non-volatile object
struct jpeg_decompress_struct cinfo;

if (setjmp(jerr.setjmp_buffer)) {
  // 3. and access it after the longjmp
  jpeg_destroy_decompress(&cinfo);
}

// 2. access it after the setjmp
jpeg_create_decompress(&cinfo);

I opened an issue asking for this to be fixed; the fix landed in this commit.

Conclusions

To conclude, let's celebrate the longevity of this problem in all the codebases in the world. In perfect security-drama design: the most important part of an issue is its logo.

Therefore, here's the official logo of the volatile handling in JPEG fiasco, of course made in 1998 style:

Thanks for reading!

Giuseppe D’Angelo

Senior Software Engineer

Senior Software Engineer at KDAB. Giuseppe is a long-time contributor to Qt, having used Qt and C++ since 2000, and is an Approver in the Qt Project. His contributions in Qt range from containers and regular expressions to GUI, Widgets, and OpenGL. A free software passionate and UNIX specialist, before joining KDAB, he organized conferences on opensource around Italy. He holds a BSc in Computer Science.
