Skip to content

Full Stack Tracing Part 1 What is it and when is it useful?

Full stack tracing is a tool that should be part of every software engineer’s toolkit. It’s the best way to investigate and solve certain classes of hard problems in optimization and debugging. Because of the power and capability it gives the developer, we’ll be writing a series of blogs about it: when to use it, how to get it set up, how to create traces, and how to interpret results. Our goal is to get you capable enough to use full stack tracing to solve your tough problems too.

Firstly, what is it? Full stack tracing is tracing on the full software stack, from the operating system to the application. By collecting profiling information (timing, process, caller, API, and other info) from the kernel, drivers, software frameworks, application, and JavaScript environments, you’re able to see exactly how the individual components of a system are interacting. That opens up areas of investigation that are impossible to achieve with standard application profilers, kernel debug messages, or even strategically inserted printf() commands. One way to think of full stack tracing is like a developer’s MRI machine that allows you to look into a running system without disturbing it to determine what is happening inside. (And unlike other low-level traces that we’ve written about before, full stack tracing provides a simpler way to view activity up and down the entire software stack.)

Use cases

When do you need full stack tracing? Since it gives you visibility across a running system, it’s a great solution anytime you need to do things like:

  • Gain visibility into the software running below the level of your application and API calls
  • Measure timing and synchronization of events that happen across multiple applications
  • Optimize application startup time, locating startup script inefficiencies, and trimming down system boot time
  • Verify communication between a client and server process, or any other process pairs like publisher and subscriber, or application and watchdog
  • Detect background processes that interfere with an application
  • Pinpoint places in the system that consume excessive CPU cycles
  • Find problematic interactions between multiple device drivers or independently running software stacks
  • Correctly understand the behavior of a complex system before you make bug fixes or alterations

Battling slow boot time

To put all of this in context, let’s walk through an example where we used full stack tracing to solve a show-stopping problem for a client. In this case, it was an automotive customer creating an infotainment system with a rather complex software stack. The main application was created in Qt and ran on a Nvidia Tegra X1, which is an extremely powerful processor for an embedded system. The GPU rivals that of laptops with a quad core CPU. Because the software required integration from many partners, developers unfortunately realized very late during integration that despite all of the power, the infotainment system was taking over five seconds to show the application’s first screen. This was completely unacceptable, and we were asked to diagnose why it was taking so long.

As much of the application was in QML, we first took a look at the code. While it certainly wasn’t crafted for speed – and they probably used far too much JavaScript code – this didn’t seem like the primary problem. We needed to determine if the boot issues were due to an overall sluggishness with too many pieces of software running, or if we could pinpoint the problems to one (or several) specific areas. Firing up the QML profiler highlighted something we didn’t expect: a trivial binding evaluation of a string to an image file in early startup took 180 milliseconds. What was happening underneath it that caused that binding to take so long?

QML profiler trace of binding issue

QML profiler showing long binding in action (similar issue, different app)

 

That’s a question that the QML profiler can’t answer. While you can make many educated guesses to figure out exactly what’s happening, it is this type of problem where full stack tracing is invaluable. It can tell you what Qt, C++, kernel, driver, or hardware interactions are happening underneath the QML so you can see where the time is being spent.

By turning on full stack tracing for the kernel, syscalls, and C++, adding tracepoints within Qt, and rerunning the application, we were able to find that those long bindings were occurring because many libraries were being loaded for image plug-in formats. Since the application only used PNG files, we were able to eliminate a huge number of these unnecessary loads for other file plug-ins. That included the SVG plugin, which was especially heavy. Although the application was a Qt Quick app that didn’t use Qt Widgets, the SVG plugin was pulling in the relatively big Qt Widgets library, which was unneeded. (This is because the plugin was compiled with QtWidgets support enabled.) But while eliminating the plug-ins made a dramatic improvement, it still wasn’t fast enough, and it was taking far longer to boot than it should.

Again, full stack tracing was able to show us that the disk I/O was a huge culprit and many applications were hammering the disk during startup. However, most of those applications accessing the disk were background tasks that weren’t critical to the startup procedure. Furthermore, most of the disk I/O was coming through a single call within syslog: the processes were emitting reams of debug output, going through syslog to write the data out to the flash drive. The CPU and GPU had more than enough horsepower to manage this, but the on-board flash was really cheap…and pretty slow.

While it might have been better to remove unnecessary logging from those background tasks, that change would have required a lot of coordination and testing because many distinct software suppliers were involved. That would have introduced a good deal of risk to a project that was already perilously close to the production deadline. Instead, we settled for bumping up the main application’s priority. The increased priority let the GUI beat out those background tasks during boot up. The background tasks were able to harmlessly finish their work slightly later. This change was now able to bring the splash screen up within one second, meeting the customer requirements to everyone’s great relief.

While full stack tracing was critical to the technical understanding of the problem, it was also necessary for the project management. With the high pressure of the impending deadline, engineers had to convincingly explain that the corrective actions being taken had a high degree of probable success. Full stack tracing indicated precisely what the problems were and exactly where in the software stack they were, avoiding any finger-pointing between suppliers.

Full stack trace of QML binding issue

QML binding shown with full stack tracing, highlighting SDA activity (similar issue, different app)

Until next time

Hopefully you’ll agree that full stack tracing could be a valuable skill to add to your repertoire. Stay tuned – our next blog will show you how to get it set up.

 

If you’d like to learn more about Debugging and Profiling, you might be interested in one of our training classes.  Visit this page to find out some of the areas and correlating tools with which we are used to working, and with which we can help you.

 

 

About KDAB

If you like this blog and want to read similar articles, consider subscribing via our RSS feed.

Subscribe to KDAB TV for similar informative short video content.

KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.

FacebookTwitterLinkedInEmail
created the Massif-Visualizer, heaptrack and hotspot tools now used widely to improve C++ and Qt applications performance. The co-maintainer of the KDevelop IDE, in 2015 he won the KDE Akademy Award for ongoing work in that. Milian has a Masters Degree in Physics and lives in Berlin. He is fluent in English and German.

2 thoughts on “Full Stack Tracing Part 1”

  1. Milian, thank you for the great post and the many past presentations and blog posts!

    Related to this post: Rico Mariani has written Holistic approaches to performance analysis [1] where he describes how to zoom out and investigate performance issues by looking at the whole system. There’s a lot be gained from that kind of approach, and I’m glad to see the tooling improve for Qt/Linux, in no small part thanks to KDAB.

    [1] https://docs.microsoft.com/en-us/archive/blogs/ricom/holistic-approaches-to-performance-analysis

    1. Milian Wolff

      Yes, ETW is a godsend and I would really like to have something similar on Linux. The power of that technology is also nicely exemplified by the blogs from Bruce Dawson e.g.

      And while I agree that the Linux landscape is improving, it is still a long road ahead, until we have something as powerful as UIforETW and WPA.

      Thanks for your comment!

Leave a Reply

Your email address will not be published. Required fields are marked *