
Qt 6 WASM: Uploading & Playing Back Audio Files
Adapting to the limitations of the web platform and Qt Multimedia

This article walks through an implementation using C++17 or later (the minimum standard required by Qt 6), Qt 6.5 or later for WebAssembly (multithreaded), and CMake. The browser environment used was Mozilla Firefox 119.0.1 (64-bit), provided by the Mozilla Firefox snap package for Ubuntu.

Overview & Motivation

Lately, I’ve been working on a small Qt Widgets project to help manage some weekly poker games with my friends, and realized it would be much nicer to distribute copies by web instead of building binaries. This gave me the perfect excuse to test out Qt for WebAssembly, and see how much tinkering I’d have to do with the original source code to make it work in Firefox!

Everything looked and felt great on the web browser when just compiling the desktop code with a Qt WASM CMake kit, but one feature caused some issues. The program plays a notification sound with QMediaPlayer, and allows the user to select any audio file on their computer to use as the sound. The original implementation for both getting a native file dialog and playing back audio files did not work in-browser. Using goofy audio is a must for our poker nights, so I quickly started rewriting.

Fixing the implementation to get the file dialog was simple and required me to replace a very small amount of code, but QMediaPlayer was unusable with file URLs. Firstly, browser code is sandboxed, so the QMediaPlayer can’t set its source to a file URL on the user’s file system. The data would have to be stored in the browser’s internal file system or on a server. Secondly, Qt Multimedia is pretty broken for WASM in general. It’s listed as not working in Qt 5, and untested in Qt 6, with some Qt 6 limitations detailed here.

However, I noticed that the Qt WASM method for opening files provided the file contents as a QByteArray.

I thought, why not just try to play back the audio from the buffer of raw bytes?

Please note that there are other (and usually better) ways to do this. The program I was working on was simple enough that keeping a buffer in a variable on the page was sufficient. In many cases, it would be preferable to store the audio persistently on a server and fetch the stream to play it back, or in some cases to write the file to the browser’s IndexedDB. Either way, knowing how to use C++ to play audio in-browser from a buffer can be useful, so this simple example should cover a lot of the common ground without becoming too complex. Also note that a cross-browser implementation may require a little more code, and production code should perform more checks than you’ll see here.

Getting File Contents

I’ll start off this basic example by creating a regular Qt Widgets application, with a class that inherits from QMainWindow.

#pragma once

#include <QMainWindow>

class MainWindow : public QMainWindow {
    Q_OBJECT

public:
    explicit MainWindow(QWidget *parent = nullptr) noexcept;
    ~MainWindow() override;
};

In the constructor, I’ll create a QPushButton to open a file dialog (dialogButton) and one to play the audio back (playButton).

Let’s connect dialogButton‘s clicked signal to a lambda and get our file dialog.

To do this in a Qt WASM project, we use QFileDialog::getOpenFileContent.

connect(dialogButton, &QPushButton::clicked, this, [this]() {
    QFileDialog::getOpenFileContent(
        tr("Audio files (*.mp3 *.flac *.wav *.aiff *.ogg)"),
        [this](const QString &fileName, const QByteArray &fileContent) {
            if (fileName.isEmpty()) {
                // no file selected
            } else {
                // do stuff with file data
            }
        });
});

Notice that, as I mentioned earlier, the file’s contents are provided as a QByteArray. To play the audio, we need our buffer to contain PCM data, while these bytes are the file’s binary contents in a specific file format. We need to decode the bytes and transform them to PCM.

How do we do that?

Finding the Needed Decoding Method

Let’s stick with the QMediaPlayer idea for now to illustrate some basic concepts. We want to use a buffer of bytes rather than a file URL, so we can set the QMediaPlayer‘s source device to a QBuffer containing PCM data.

To obtain this buffer, we would wrap the QByteArray in an initial QBuffer and give it to a QAudioDecoder, read the decoded data back as a QAudioBuffer, and then create another QBuffer from the QAudioBuffer‘s raw byte data and byte count. That buffer could then be given to a QMediaPlayer by calling setSourceDevice.
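
To make the pipeline concrete, here is a rough, untested sketch of what that desktop-style decoding step might look like (the decodeToPcm helper is hypothetical, ownership and cleanup are omitted for brevity, and it assumes a correct QAudioFormat has already been determined, which is exactly the catch discussed next):

#include <QAudioBuffer>
#include <QAudioDecoder>
#include <QAudioFormat>
#include <QBuffer>

// Hypothetical helper: decode encoded file bytes into raw PCM, assuming the
// right QAudioFormat is already known.
void decodeToPcm(const QByteArray &fileContent, const QAudioFormat &format)
{
    auto *decoder = new QAudioDecoder;

    auto *encoded = new QBuffer(decoder);   // wraps the encoded file bytes
    encoded->setData(fileContent);
    encoded->open(QIODevice::ReadOnly);

    decoder->setAudioFormat(format);        // the decoder must be told the target format
    decoder->setSourceDevice(encoded);

    QObject::connect(decoder, &QAudioDecoder::bufferReady, decoder, [decoder]() {
        const QAudioBuffer audioBuffer = decoder->read();
        QByteArray pcm(audioBuffer.constData<char>(), audioBuffer.byteCount());
        // pcm now holds decoded PCM data; it could be wrapped in another QBuffer
        // and handed to QMediaPlayer::setSourceDevice() -- the step that turns
        // out not to be supported on WASM.
    });
    decoder->start();
}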

Here’s the problem: QAudioDecoder needs a QAudioFormat to know how to decode the data. This is not just the file format – we also need to know the channel count, sample rate, and sample format. We want to be able to upload files in different formats, with different sample rates, and in both mono and stereo, so we would have to extract all of this information manually somehow to decode correctly.
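
For reference, this is roughly the kind of information a QAudioFormat carries (the values here are made up; in practice they would have to be extracted from the uploaded file somehow):

QAudioFormat format;
format.setSampleRate(44100);                     // e.g. 44.1 kHz
format.setChannelCount(2);                       // stereo
format.setSampleFormat(QAudioFormat::Int16);     // 16-bit signed samples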

On top of that, this limitation is mentioned in the Qt 6 WASM Multimedia page:

Playing streaming bytes instead of a url. e.g. setSource(QIOStream) is also not currently supported. Some advanced features may or may not work at this time.

Even if we do all the decoding properly, we can’t play from our buffer (by setSource(QIOStream) they mean setSourceDevice(QIODevice *))! We need to find another way.

Luckily, JavaScript’s Web Audio API has a great solution. BaseAudioContext has a nice decodeAudioData method that takes an ArrayBuffer of data in any supported audio format and decodes it into an AudioBuffer. That buffer can then be played with an AudioBufferSourceNode connected to an AudioContext’s destination node.

But wait, how do we get our QByteArray data into a JavaScript ArrayBuffer?

Using the Emscripten API

We need to use emscripten’s C++ API for this. On Qt for WASM, it’s already linked and you can include its headers out of the box.

The idea is that we want to not only invoke JS functions and use JS vars, but also have C++ objects representing JS objects. Then we can create them, perform other operations in our C++ context, and then invoke methods on the JS objects later, even passing data from C++ objects to JavaScript and vice-versa.

Our best option for this is to use emscripten’s embind, instead of writing inline JavaScript with something like EM_JS or EM_ASM. This way, our JS objects are accessible in whatever scope their corresponding C++ objects are defined in.

To start, we need to #include <emscripten/bind.h>.

We’ll be working with the C++ type emscripten::val, essentially emscripten’s wrapper for a JS var, which additionally has a static member function to access global objects and functions.
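
As a tiny illustration of what this looks like (not specific to this project), here is how you could call the browser’s console.log from C++ through emscripten::val:

#include <emscripten/val.h>
#include <string>

// Grab the global console object and call one of its methods from C++.
void logHello()
{
    emscripten::val console = emscripten::val::global("console");
    console.call<void>("log", std::string("hello from C++"));
}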

Since Qt 6.5, in Qt WASM, QByteArray conveniently has a member function toEcmaUint8Array() that represents its data in the layout of a JS Uint8Array and returns it as an emscripten::val. Since we receive fileContent as a const reference to a QByteArray and toEcmaUint8Array is not a const function, we’ll copy fileContent to a new QByteArray:

QFileDialog::getOpenFileContent(
    tr("Audio files (*.mp3 *.flac *.wav *.aiff *.ogg)"),
    [this](const QString &fileName, const QByteArray &fileContent) {
        if (fileName.isEmpty()) {
            // no file selected
        } else {
            auto contentCopy = fileContent;
            auto buf = contentCopy.toEcmaUint8Array();
            // ...
            // ...
            // ...
        }
    });

Nice. Now we have to get the buffer property from our buf object and call decodeAudioData. For that, we need to initialize an AudioContext.

An easy way is to just give our MainWindow class a private data member emscripten::val mAudioContext and initialize it in the constructor. To get the embind equivalent of the JS expression mAudioContext = new AudioContext() we do this:

mAudioContext = emscripten::val::global("AudioContext").new_();

We can go ahead and initialize it in our constructor’s initializer list:

MainWindow::MainWindow(QWidget *parent) noexcept
    : QMainWindow(parent)
    , mAudioContext(emscripten::val::global("AudioContext").new_())
{
    // ...
    // ...
    // ...
}

So why this syntax?

In JavaScript, functions are first-class objects. For a browser that complies with the Web Audio API, the constructor AudioContext is a property of globalThis, so it can be accessed as a global object with emscripten::val::global("AudioContext"). Then we can call new_(Args&&... args) on it, which calls the constructor with args using the new keyword.

Now let’s get back to setting up our AudioBuffer.

Finally Decoding the Audio

Recall that we have this:

auto contentCopy = fileContent;
auto buf = contentCopy.toEcmaUint8Array();

We need to access the property buffer of buf to get an ArrayBuffer object that we can pass to decodeAudioData. To get the property, we just use operator[] like so: buf["buffer"]. Easy.

To call decodeAudioData, we just invoke the template function ReturnValue call(const char *name, Args&&... args) const like so:

auto decodedBuffer = mAudioContext.call<emscripten::val>("decodeAudioData", /* arguments */);

But wait a second, decodeAudioData can be used in two ways. There is the newer promise-based syntax, where you pass just the buffer and await the decoded result, and the older callback syntax, where you also pass a callback function that receives the decoded buffer as an argument. How can we do JS promise-await or pass JS callback functions as arguments in C++?

Well, to do promise-await syntax, we can just call await() on our return value.

auto decodedBuffer = mAudioContext.call<emscripten::val>("decodeAudioData", buf["buffer"]).await();

Please keep in mind this await() is only available when Asyncify is enabled. If you’re using the multithreaded version of Qt WASM, you can enable this. Just add the following to your CMakeLists.txt:

target_link_options(<target name> PUBLIC -sASYNCIFY -O3)

In C++20, you can also use coroutines to do promise-await syntax by placing a co_await operator to the left of the function call.
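
Here is a rough sketch of what that might look like (it assumes an Emscripten version whose emscripten::val supports coroutines and co_await, and the decodeAsync helper is hypothetical, not part of the example class shown later):

// Hypothetical C++20 coroutine variant; emscripten::val can act as a
// coroutine return type and be awaited when the toolchain supports it.
emscripten::val MainWindow::decodeAsync(emscripten::val arrayBuffer)
{
    // Suspends until the Promise returned by decodeAudioData settles.
    emscripten::val decoded =
        co_await mAudioContext.call<emscripten::val>("decodeAudioData", arrayBuffer);
    co_return decoded;
}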

As for the callback function: Function is itself a global object, so you can call the Function constructor with new_(Args&&... args), passing the parameter names and the function body as strings, and store the returned function as an emscripten::val.
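
For illustration, building such a callback from C++ might look something like this (the body string is just an example that logs the decoded buffer’s length):

// Equivalent to: new Function("decoded", "console.log('decoded length', decoded.length);")
auto jsCallback = emscripten::val::global("Function")
                      .new_(std::string("decoded"),
                            std::string("console.log('decoded length', decoded.length);"));
// jsCallback could then be passed as the second argument to decodeAudioData.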

The problem is that the callback function takes the decoded buffer as an argument, so we get it in JS rather than C++. It is possible to call a C++ function from the callback and forward the decoded buffer to it, which involves exposing the function to emscripten’s Module object using EMSCRIPTEN_BINDINGS. However, if we need it to be a member function, we get type errors on the function pointer.

I am still looking at how the callback syntax would work, but I believe it involves exposing the entire class to Module.

So, let’s just go with the promise-await syntax for now.

Since we will want to play from the buffer on a button press or some other event, let’s make a data member emscripten::val mDecodedBuffer so it’s in the class scope.

We now have this:

QFileDialog::getOpenFileContent(
    tr("Audio files (*.mp3 *.flac *.wav *.aiff *.ogg)"),
    [this](const QString &fileName, const QByteArray &fileContent) {
        if (fileName.isEmpty()) {
            // no file selected
        } else {
            auto contentCopy = fileContent;
            auto buf = contentCopy.toEcmaUint8Array();

            mDecodedBuffer = mAudioContext.call<emscripten::val>("decodeAudioData", buf["buffer"]).await();
        }
    });

Cool, we can store the data from an audio file in a buffer in PCM format!

Let’s move on to playing the audio from that buffer.

Playing Back the Audio

When we want to play, we first need to create an AudioBufferSourceNode from the AudioContext, and set its buffer to mDecodedBuffer. We then connect the source node to our audio context’s destination node (our audio device). After this, we can call start on the source node to play the sound!

Here’s what that looks like with embind:

auto source = mAudioContext.call<emscripten::val>("createBufferSource");
source.set("buffer", mDecodedBuffer);
source.call<void>("connect", mAudioContext["destination"]);
source.call<void>("start");

A new AudioBufferSourceNode needs to be created every time you want to play the sound, but they are inexpensive to construct.

I have this code to play on the QPushButton click:

connect(playButton, &QPushButton::clicked, this, [this]() {
    auto source = mAudioContext.call<emscripten::val>("createBufferSource");
    source.set("buffer", mDecodedBuffer);
    source.call<void>("connect", mAudioContext["destination"]);
    source.call<void>("start");
});

To make sure the buffer will be populated successfully before this code executes, I make the button initially disabled and use Qt signals readyToPlay and notReady to enable and disable the button respectively.

Complete Code & Final Thoughts

Here is my full header for this basic example:

#pragma once

#include <QMainWindow>
#include <emscripten/bind.h>

class MainWindow : public QMainWindow {
    Q_OBJECT

public:
    explicit MainWindow(QWidget *parent = nullptr) noexcept;
    ~MainWindow() override;

signals:
    void notReady();
    void readyToPlay();

private:
    emscripten::val mAudioContext;
    emscripten::val mDecodedBuffer;
};

and my entire implementation file (it’s a very basic example so I just left everything in the constructor):

#include "mainwindow.h"

#include <QFileDialog>
#include <QHBoxLayout>
#include <QPushButton>

MainWindow::MainWindow(QWidget *parent) noexcept
    : QMainWindow(parent)
    , mAudioContext(emscripten::val::global("AudioContext").new_())
{
    auto *centerWidget = new QWidget(this);
    auto *layout = new QHBoxLayout(centerWidget);

    auto *dialogButton = new QPushButton(centerWidget);
    dialogButton->setText(tr("Choose file"));

    auto *playButton = new QPushButton(centerWidget);
    playButton->setText(tr("Play"));
    playButton->setEnabled(false);

    layout->addWidget(dialogButton);
    layout->addWidget(playButton);
    centerWidget->setLayout(layout);
    setCentralWidget(centerWidget);

    connect(dialogButton, &QPushButton::clicked, this, [this]() {
        QFileDialog::getOpenFileContent(
            tr("Audio files (*.mp3 *.flac *.wav *.aiff *.ogg)"),
            [this](const QString &fileName, const QByteArray &fileContent) {
                if (fileName.isEmpty()) {
                    emscripten::val::global().call<void>(
                        "alert", std::string("no file selected"));
                } else {
                    emit notReady();

                    auto contentCopy = fileContent;
                    auto buf = contentCopy.toEcmaUint8Array();

                    mDecodedBuffer = mAudioContext.call<emscripten::val>("decodeAudioData", buf["buffer"]).await();

                    emit readyToPlay();
                }
            });
    });

    connect(playButton, &QPushButton::clicked, this, [this]() {
        auto source = mAudioContext.call<emscripten::val>("createBufferSource");
        source.set("buffer", mDecodedBuffer);
        source.call<void>("connect", mAudioContext["destination"]);
        source.call<void>("start");
    });

    connect(this, &MainWindow::notReady, this, [playButton]() {
        if (playButton->isEnabled())
            playButton->setEnabled(false);
    });

    connect(this, &MainWindow::readyToPlay, this, [playButton]() {
        if (!playButton->isEnabled())
            playButton->setEnabled(true);
    });
}

MainWindow::~MainWindow() = default;

This example was simple, but I hope readers can use these building blocks in more complete codebases to work with audio in Qt WASM!

Note: if you want to stream your audio, look into the Media Capture and Streams API and use a MediaStreamAudioSourceNode instead of an AudioBufferSourceNode.

