Fuzzing Qt for fun and profit

A brief introduction to fuzzing and how we successfully use it in Qt

29 November 2016

Fuzzing

What is AFL? It's a fuzzer: a program that keeps changing the input to a test in order to make it crash (or, in general, misbehave). This "mutation" of the input goes on forever -- AFL never ends, just keeps finding more stuff, and optimizes its own searching process.

AFL gained a lot of popularity because:

it is very fast (it instruments your binaries);
it uses state-of-the-art algorithms to mutate the input in ways that maximize the effect on the target program;
the setup is immediate;
it has a very nice text-based UI.

The results speak for themselves: AFL has found security issues in all major libraries out there. Therefore, I decided to give it a try on Qt.

The setup

Setting up AFL is straightforward: just download it from its website and run make. That's it -- this will produce a series of executables that will act as a proxy for your compiler, instrumenting the generated binaries with information that AFL will need. So, after this step, we will end up with afl-gcc, afl-g++ and so on.

You can go ahead and build an instrumented Qt. If you've never built Qt from source, here's the relevant documentation. On Unix systems it's really a matter of running configure with some options, followed by make and optionally make install. The problem at this step is making Qt use AFL's compilers, not the system ones. This turns out to be very simple, however: just export a few environment variables, pointing them to AFL's binaries:

export CC=/path/to/afl-gcc
export CXX=/path/to/afl-g++
./configure ...
make

And that's it, this will build an instrumented Qt. (A more thorough solution would involve creating a custom mkspec for qmake; this would have the advantage of making the final testcase application also use AFL automatically. For this task, however, I felt it was not worth it.)

Creating a testcase

What you need here is to create a very simple application that takes an input file from the command line (or stdin) and uses it to stress the code paths you want to test.

Now, when looking at a big library like Qt, there are many places where Qt reads untrusted input from the user and tries to parse it: image loading, QML parsing, (binary) JSON parsing, and so on. I decided to give a shot at binary JSON parsing, feeding it with AFL's mutated input. The testcase I built was straightforward:

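#include <QtCore>

int main(int argc, char **argv)
{
    QCoreApplication app(argc, argv);

    // The path of the input file is expected as the first argument.
    const QStringList args = app.arguments();
    if (args.size() < 2)
        return 1;

    QFile file(args.at(1));
    if (!file.open(QIODevice::ReadOnly))
        return 1;

    // Parse the untrusted bytes: this is the code path being fuzzed.
    QJsonDocument jd = QJsonDocument::fromBinaryData(file.readAll());

    return 0;
}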

Together with the testcase, you will also need a few test files to bootstrap AFL's finding process. These files should be extremely small (ideally, 1-2KB at maximum) to let the fuzzer do its magic. For this, just dump a few interesting files somewhere next to your testcase. I've taken random JSON documents, converted them to binary JSON and put the results in a directory.
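To check that the seed files stay within the size suggested above, something along these lines can help. It is only a sketch: the function name oversized_seeds and the 2048-byte default are illustrative choices of mine, not part of AFL.

```shell
# List the files in a seed directory that are larger than a given
# number of bytes (hypothetical helper; default threshold: 2048 bytes).
oversized_seeds() {
    dir=$1
    limit=${2:-2048}
    # "-size +Nc" matches files strictly larger than N bytes.
    find "$dir" -type f -size +"${limit}c"
}

# Example: oversized_seeds testcases/
```

Any file it prints is a candidate for trimming or replacing before the fuzzer starts.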

Running the fuzzer

Once the testcase is ready, you can run it under the fuzzer like this:

afl-fuzz -m memorylimit \
         -t timeoutlimit \
         [master/slave options] \
         -i testcases/ \
         -o findings/ \
         -- ./test @@

A few explanatory remarks:

The testcases directory contains your reference input files, while the findings of the fuzzers will be written in findings.
To avoid blowing up your system, AFL sets very strict limits for execution of your test: it is allowed to allocate at most memorylimit megabytes of virtual memory and it is allowed to run for at most timeoutlimit milliseconds. You will typically want to raise the memory limit from its default (50MB) to something bigger, depending on your system and on the test.
One instance of afl-fuzz is single threaded; in order to maximize the search throughput on a machine with multiple cores/CPUs, you must manually launch it multiple times with the same -i and -o arguments. You should also give each instance a unique name and, if you want, elect one instance to do a deterministic search rather than a random one. This is all expressed through the master/slave options: pass to one instance the -M fuzzername option, and to all the others pass the -S fuzzername option. (All the fuzzernames must be unique).
Last but not least, @@ gets replaced by the name of a file generated by AFL, containing the mutated input.
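The master/slave naming scheme described above can be sketched as a small helper that only prints the command lines, one per instance, so that each can be reviewed and started in its own terminal. The function name print_afl_commands is hypothetical; the limits, directories and fuzzerNN names are just the values used in this post.

```shell
# Print one afl-fuzz command line per instance: the first one is the
# master (-M), all the others are secondaries (-S). Nothing is executed.
print_afl_commands() {
    instances=$1   # number of afl-fuzz processes, e.g. one per core
    shift          # remaining arguments: the target command line
    i=0
    while [ "$i" -lt "$instances" ]; do
        name=$(printf 'fuzzer%02d' "$i")
        if [ "$i" -eq 0 ]; then role=-M; else role=-S; fi
        echo "afl-fuzz -m 512 -t 20 -i testcases -o findings-json $role $name -- $*"
        i=$((i + 1))
    done
}

print_afl_commands 3 ./afl-qjson @@
```

All instances share the same -i and -o directories; only the role and the name differ.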

For reference, I've launched my master like this:

afl-fuzz -m 512 -t 20 -i testcases -o findings-json -M fuzzer00 -- ./afl-qjson @@

The output is a nice colored summary of what's going on, updated in real time:

[Screenshot: AFL running over a testcase.]

Now: go do something else. This is supposed to run for days! So remember to launch it in a screen session, and maybe launch it via nice so that it runs with a lower priority.

Findings

After running for a while, the first findings started to appear: inputs that crashed the test program or made it run for too long. Once AFL sees such inputs, it will save them for later inspection; you will find them under the findings/fuzzername subdirectories:

findings-json/fuzzer00/crashes/id:000000,sig:06,src:000445,op:arith8,pos:168,val:+6
findings-json/fuzzer00/crashes/id:000001,sig:11,src:000445,op:arith8,pos:168,val:+7
findings-json/fuzzer00/crashes/id:000002,sig:11,src:000449,op:arith8,pos:196,val:+6
findings-json/fuzzer00/crashes/id:000003,sig:11,src:000489,op:flip1,pos:435
findings-json/fuzzer01/crashes/id:000000,sig:06,src:000526,op:havoc,rep:2
findings-json/fuzzer01/crashes/id:000001,sig:11,src:000532,op:havoc,rep:2
findings-json/fuzzer01/crashes/id:000002,sig:06,src:000533,op:havoc,rep:4

If you're lucky (well, I guess it depends how you look at it...), you will end up with inputs that indeed crash your testcase. Time to fix something!

You may also get false positives, in the form of crashes because the testcase runs out of memory. Remember that AFL imposes a strict memory limit on your executable, so if your testcase allocates too much memory and does not know how to recover from OOM, it will crash. If you see many inputs crashing under AFL but not crashing when run normally, your testcase may be behaving properly and just running out of memory; increasing the memory limit passed to AFL will fix this.

The sig part in the name of each saved input should give you a hint, telling you which Unix signal caused the crash. In the listing above, signal number 11 is a SIGSEGV, which is indeed a problem. Signal 06 is SIGABRT (that is, an abort), which in this case was raised because the testcase ran out of memory.
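With many saved inputs, reading the sig field by eye gets tedious. A rough sketch of how it could be extracted from the file names follows; crash_signal is a made-up helper name, and it only names the two signals seen in the listing above (numbered as on Linux).

```shell
# Extract the signal number from an AFL crash file name and
# translate the two values discussed in this post.
crash_signal() {
    base=${1##*/}          # strip the directory part
    case $base in
        *,sig:*) ;;
        *) echo "unknown"; return 0 ;;
    esac
    sig=${base#*,sig:}     # drop everything up to and including "sig:"
    sig=${sig%%,*}         # drop everything after the number
    case $sig in
        06) echo "SIGABRT" ;;
        11) echo "SIGSEGV" ;;
        *)  echo "signal $sig" ;;
    esac
}

# Summarize every crash found so far (prints nothing if there are none).
for f in findings-json/*/crashes/id:*; do
    [ -e "$f" ] || continue
    printf '%s\t%s\n' "$(crash_signal "$f")" "$f"
done
```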

To reproduce this last case, just manually run the test over that input, and check that it doesn't misbehave; then rerun it, but this time limiting its available memory via ulimit -v memory_available_in_kilobytes. If the testcase works normally but crashes under a stricter ulimit, it's likely that you're in an out-of-memory scenario. This may or may not require a fix in your code; it really depends whether it makes sense for your application/library to recover from an OOM.
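The ulimit check can be wrapped so that the lowered limit does not stick to your interactive shell. The following is a sketch under the assumption that your shell supports ulimit -v (bash and dash do); run_with_mem_limit is an illustrative name.

```shell
# Run a command with its virtual memory capped at a limit given in
# megabytes (the same unit as afl-fuzz -m).
run_with_mem_limit() {
    limit_mb=$1
    shift
    (
        # ulimit -v takes kilobytes; the subshell keeps the limit local.
        ulimit -v $((limit_mb * 1024)) || exit 125
        "$@"
    )
}

# Example, with a saved input of your choice:
#   run_with_mem_limit 512 ./afl-qjson path/to/saved/input
#   echo "exit status: $?"
```

When the shell reports that a command was killed by a signal, the exit status is 128 plus the signal number, so 134 points at SIGABRT and 139 at SIGSEGV.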

Fixing upstream

After reporting the findings to the Security Team, it was a matter of a few days before a fix was produced, tested and merged into Qt. You can find the patches here and here.

Tips and tricks

If you want to play with AFL, I would recommend doing a few things:

Set your CPU scaling governor to "performance". This is for a couple of reasons: it makes no sense for the kernel to try to throttle down your CPUs if AFL is running; and it is actually a bad thing because it interferes with AFL measurements. AFL complains about this, so keep it happy and disable "powersave" or "ondemand" or similar governors.
Use a ramdisk for the tests. AFL needs to write a new input file every time it runs your application; for the JSON testcase above, AFL was achieving about 1000 executions/second/core. Each of these runs needs a new test file as input; in addition to that, AFL needs to write stuff for its own bookkeeping. This will put your disk under very considerable stress, possibly even wear it out. Now, any modern filesystem will still flush data to disk only a few times every second (at most), but still, why hit the disk at all? One can simply create a ramdisk, and run AFL in there:

$ mkdir afl
# mount -t tmpfs -o size=1024M tmpfs afl/
$ cd afl/
$ afl-fuzz -i inputs -o findings ...

Do not let this run on a laptop or some other computer which may overheat. AFL is tremendously resource intensive and runs for days. If you want to get liquid cooling for your workstation, this is the perfect excuse.
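As for the scaling governor tip, here is a quick way to see which CPUs are not yet set to "performance". It is a sketch that assumes the usual Linux cpufreq layout under /sys/devices/system/cpu; check_governors is a hypothetical helper, and the directory is a parameter so it can be pointed elsewhere.

```shell
# Report the CPUs whose scaling governor is not "performance".
# Returns 0 if all of them are, 1 otherwise.
check_governors() {
    root=${1:-/sys/devices/system/cpu}
    status=0
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue      # no cpufreq support: nothing to check
        gov=$(cat "$f")
        if [ "$gov" != "performance" ]; then
            echo "$f: $gov"
            status=1
        fi
    done
    return $status
}

# Example: check_governors && echo "all CPUs use the performance governor"
```

Changing the governor requires root: typically you write the word performance into each of those scaling_governor files, or use the tool your distribution provides for it.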

Conclusions

Fuzzing is an excellent technique for testing code that needs to accept untrusted inputs. It is straightforward to set up and run, requires no modifications to the tested code, and it can find issues in a relatively short timespan. If your application features parsers (especially of binary data), consider keeping AFL running over it for a while, as it may discover some serious problems. Happy fuzzing!


3 Comments

Nice article, thanks. Have you considered applying AFL to Qt GUI components? Like, perhaps, fuzzing sequences of events onto a dialog that consists of a bunch of widgets?

Hi, that seems an interesting idea, but what would it accomplish? If it's for "monkey testing", that's already available in Squish.

More in general, I'd also like to explore libFuzzer and how to stress test things such as the CSS parser in QStyleSheetStyle, the HTML parser in QTextDocument and other similar text-based inputs.

And libFuzzer http://llvm.org/docs/LibFuzzer.html for QTBUG-57553 https://bugreports.qt.io/browse/QTBUG-57553 AFL can also test for it ;-)

Giuseppe D’Angelo

Senior Software Engineer

Senior Software Engineer at KDAB. Giuseppe is a long-time contributor to Qt, having used Qt and C++ since 2000, and is an Approver in the Qt Project. His contributions in Qt range from containers and regular expressions to GUI, Widgets, and OpenGL. A free software enthusiast and UNIX specialist, he organized open source conferences around Italy before joining KDAB. He holds a BSc in Computer Science.