Automated porting from Qt 4 to Qt 5

Like many companies in tech, KDAB allows employees some time to spend on ‘personal education’, which must be somewhat job-related, but not necessarily Qt-related, and which must be reported on to colleagues. Sometimes that involves reading a book, investigating a new technology, or writing some interesting new tooling, which is how GammaRay was born.

While porting KDE code from Qt 4 to Qt 5, I initially wrote some sed scripts to assist with porting, but then started investigating semantic tooling to help with the task. The result is a tool which can automate with precision some of the boring API renames which are a necessary part of any Qt 4 to 5 port.

Initially I called it ‘Captcha’ for ‘Clang Assisted Porting Tool using Compilation, Heuristics and Algorithms’, and because it makes the computer do a better porting job than humans.

For now, let’s call it qt4to5. We are of course interested in making it useful for broader collaboration and use throughout the Qt ecosystem. Find the code on the KDAB GitHub.

Introduction

Several years ago Bertjan Broeksema worked at KDAB on estimation and visualization of porting effort on large codebases and wrote a master’s thesis on the subject. The code required for estimating the effort of completing a port has some commonalities with automating a port, as Bertjan showed with some porting done in KDE.

I tried to reuse Bertjan’s porting tooling to create a porting tool usable for the Qt 4 to Qt 5 transition. Such a tool would give results which are far more likely to be correct, and less likely to need manual fixing after porting, than those of a sed or other string processing based tool. Bertjan’s thesis documents the motivation well.

However, the KDevelop based refactoring code did not have a complete enough understanding of the code to achieve what I was trying to do. It turns out Google has been working for some time already on refactoring tooling based on clang. They have created a C++ API for describing how to find code which needs to be ported, and for actually rewriting the code in-place (this is called source-to-source translation).

The code was not available at the time the video first emerged, but it exists in a branch that will be merged into the mainline some day, I suppose. The tool was also mentioned during the GoingNative conference.

Proof of concept

A few weeks ago I got started with one of the porting steps involved in Qt 4 to 5, porting from Qt::escape(const QString &) to QString::toHtmlEscaped(). The former does still exist in Qt 5, but is deprecated, so we would prefer to port away from it.

This is a difficult problem to automate because of all of the implicit conversions that happen. Starting with this test case (calling Qt::escape with const char *, QLatin1String, QString, QByteArrays, temporaries, method calls, operator+, etc):

    QString foo("foo");

    Qt::escape(QLatin1String("foo"));
    Qt::escape(QString("foo"));
    Qt::escape(foo);
    Qt::escape(foo.trimmed());
    Qt::escape(QString(QLatin1String("foo")).trimmed());
    Qt::escape(foo.trimmed() + foo);
    Qt::escape(foo.trimmed() + "bar");
    Qt::escape("bar" + foo.trimmed());
    Qt::escape(QString::fromLatin1("foo"));
    Qt::escape(QString::fromUtf8("foo"));

    Qt::escape(foo.trimmed().toLatin1() + "bar");
    Qt::escape(foo.trimmed().toLatin1());
    Qt::escape("bar" + foo.trimmed().toLatin1());
    Qt::escape("foo");
    Qt::escape(foo.trimmed().toLatin1().constData());

my porting tool can automatically port it to this when invoked with ‘qt4to5 -port-qt-escape’:

    QString(QLatin1String("foo")).toHtmlEscaped();
    QString("foo").toHtmlEscaped();
    foo.toHtmlEscaped();
    foo.trimmed().toHtmlEscaped();
    QString(QLatin1String("foo")).trimmed().toHtmlEscaped();
    QString(foo.trimmed() + foo).toHtmlEscaped();
    QString(foo.trimmed() + "bar").toHtmlEscaped();
    QString("bar" + foo.trimmed()).toHtmlEscaped();
    QString::fromLatin1("foo").toHtmlEscaped();
    QString::fromUtf8("foo").toHtmlEscaped();

    QString(foo.trimmed().toLatin1() + "bar").toHtmlEscaped();
    QString(foo.trimmed().toLatin1()).toHtmlEscaped();
    QString("bar" + foo.trimmed().toLatin1()).toHtmlEscaped();
    QString("foo").toHtmlEscaped();
    QString(foo.trimmed().toLatin1().constData()).toHtmlEscaped();

-port-qt-escape is only one of the source-to-source translations that the tool can do. What is important to note is that the tool knows when the argument to Qt::escape is a QString (in which case we only need to add .toHtmlEscaped()), or a QByteArray (in which case we also need to wrap it in QString()). That is information that a sed based tool, or any other string processing based tool does not have. By basing the porting tool on an actual C++ compiler, we have access to all of the information that the compiler has about the source code (that is – everything relevant in a C++ context).
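To make the limitation of text-only tools concrete, here is a minimal sketch of a naive, regex-based rewrite using only the C++ standard library. The names naiveRewrite and naiveRewriteExamples and the regular expression are illustrative assumptions and not part of qt4to5. The point is only that a pattern cannot see the type or the shape of the argument.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical text-only rewrite: Qt::escape(X); -> X.toHtmlEscaped();
// It never looks at the type of X, which is exactly the problem.
std::string naiveRewrite(const std::string &line)
{
    static const std::regex call(R"(Qt::escape\((.*)\);)");
    return std::regex_replace(line, call, "$1.toHtmlEscaped();");
}

void naiveRewriteExamples()
{
    // Happens to be right: foo is already a QString.
    assert(naiveRewrite("Qt::escape(foo);") == "foo.toHtmlEscaped();");

    // Wrong: a string literal has no member functions.
    assert(naiveRewrite("Qt::escape(\"foo\");") == "\"foo\".toHtmlEscaped();");

    // Wrong: .toHtmlEscaped() binds to "bar" only, not to the whole sum.
    assert(naiveRewrite("Qt::escape(foo.trimmed() + \"bar\");")
           == "foo.trimmed() + \"bar\".toHtmlEscaped();");
}
```

Compare the last two results with the QString(...) wrapping in the tool output above: getting those right needs the type and expression structure that only a compiler front end provides.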

Portable porting

When doing any translation, the -create-ifdefs switch can also be provided to instead change the code to:

  #if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)
    Qt::escape(QLatin1String("foo"));
  #else
    QString(QLatin1String("foo")).toHtmlEscaped();
  #endif

which helps if we’re supposed to be able to compile our source with both Qt 4 and Qt 5 (KDE is my testcase for this tool, so it is a real-world scenario).

Those ifdefs are then easier to remove with a perl, sed or python script at the end of the porting process. If a QByteArray::toHtmlEscaped ever exists, that can be integrated into this porting tool too.
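As a rough illustration of how mechanical that cleanup step is, the following sketch keeps only the Qt 5 branch of such guards. It is written in C++ rather than perl, sed or python purely to stay in one language; keepQt5Branch is a hypothetical helper, and it assumes the guard line appears exactly as shown above and that guarded blocks contain no nested preprocessor conditionals.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical cleanup pass: drop the Qt 4 branch and the guard lines,
// keep everything else untouched.
std::string keepQt5Branch(const std::string &source)
{
    enum class State { Outside, Qt4Branch, Qt5Branch };
    State state = State::Outside;
    std::istringstream in(source);
    std::string line, out;
    while (std::getline(in, line)) {
        // Compare against the line with leading indentation removed.
        const std::string::size_type first = line.find_first_not_of(" \t");
        const std::string trimmed =
            first == std::string::npos ? std::string() : line.substr(first);

        if (state == State::Outside
            && trimmed == "#if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)") {
            state = State::Qt4Branch;
        } else if (state == State::Qt4Branch && trimmed == "#else") {
            state = State::Qt5Branch;
        } else if (state != State::Outside && trimmed == "#endif") {
            state = State::Outside;
        } else if (state != State::Qt4Branch) {
            out += line + '\n';   // outside a guard, or in the Qt 5 branch
        }
    }
    return out;
}

void keepQt5BranchExample()
{
    const std::string ported =
        "  #if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)\n"
        "    Qt::escape(QLatin1String(\"foo\"));\n"
        "  #else\n"
        "    QString(QLatin1String(\"foo\")).toHtmlEscaped();\n"
        "  #endif\n";
    assert(keepQt5Branch(ported)
           == "    QString(QLatin1String(\"foo\")).toHtmlEscaped();\n");
}
```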

The only case that is not already ported properly is:

    Qt::escape(Qt::escape(foo));

which appears in KDE. I think it’s something that could be handled, but is a rare enough case to not matter so much.

Other porting options

QMetaMethod::signature

The above Qt::escape change is not strictly needed for a Qt 4 to 5 port. Qt::escape is still there in Qt 5, but deprecated, so it should be ported away from at some point. A change which is strictly necessary is porting from QMetaMethod::signature to QMetaMethod::methodSignature.

Apart from the method being renamed, the return type was changed from const char * to QByteArray. In fact, the method name was changed because the return type changed just to serve as a warning to anyone using it that they would have to check the lifetime of the result.

  const char *sigName = mm.methodSignature();

would result in sigName pointing to deleted memory after the line was executed because of the operator cast from the temporary QByteArray to const char * (for people who don’t use QT_NO_CAST_FROM_BYTEARRAY). However,

  const QByteArray sigName = mm.methodSignature();

is ok, as is calling another API which takes a const char * and does not take ownership of it.
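The same lifetime rules can be shown without Qt. In this sketch std::string stands in for QByteArray and c_str() for the conversion to const char *; methodSignature and charStarAPI here are stand-in functions written for the example, not the Qt API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for QMetaMethod::methodSignature(): returns an owning object by value.
std::string methodSignature() { return "clicked(bool)"; }

// Stand-in for an API that reads a const char * and does not keep the pointer.
std::size_t charStarAPI(const char *s) { return std::strlen(s); }

std::size_t safeUses()
{
    // Dangling: the temporary is destroyed at the end of the full expression,
    // so the pointer would refer to freed memory on the next line.
    // const char *sigName = methodSignature().c_str();   // do not do this

    // OK: the owning object outlives every use of its data.
    const std::string sig = methodSignature();
    const std::size_t a = charStarAPI(sig.c_str());

    // Also OK: the temporary lives until the call has returned.
    const std::size_t b = charStarAPI(methodSignature().c_str());

    assert(a == b);
    return a + b;
}
```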

The porting tool knows all of this, so it ports this:

  {
    const char *methodName = mm.signature();
    QString s = mm.signature();
    QByteArray ba = mm.signature();
  }
  {
    const char *methodName;
    QString s;
    QByteArray ba;
    methodName = mm.signature();
    s = mm.signature();
    ba = mm.signature();
  }
  charStarAPI(mm.signature());
  stringAPI(mm.signature());
  byteArrayAPI(mm.signature());

into this:

  {
    const char *methodName = mm.signature();
#if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)
    QString s = mm.signature();
#else
    QString s = mm.methodSignature();
#endif
#if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)
    QByteArray ba = mm.signature();
#else
    QByteArray ba = mm.methodSignature();
#endif
  }
  {
    const char *methodName;
    QString s;
    QByteArray ba;
    methodName = mm.signature();
#if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)
    s = mm.signature();
#else
    s = mm.methodSignature();
#endif
#if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)
    ba = mm.signature();
#else
    ba = mm.methodSignature();
#endif
  }
#if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)
  charStarAPI(mm.signature());
#else
  charStarAPI(mm.methodSignature());
#endif
#if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)
  stringAPI(mm.signature());
#else
  stringAPI(mm.methodSignature());
#endif
#if QT_VERSION < QT_VERSION_CHECK(5, 0, 0)
  byteArrayAPI(mm.signature());
#else
  byteArrayAPI(mm.methodSignature());
#endif

Note that the cases in which the result was assigned to a const char * were not ported. They would have to be ported manually. It would be possible for the tool to change the type in that case to QByteArray, but then the tool would also have to look through where it was used, and add a constData() to those places as appropriate, etc.

That would also be possible, but I haven’t worked on it yet. So far I’m just going for the (relatively) low-hanging fruit.

Renaming class methods and functions

Something else that is necessary in a Qt 4 to 5 port is porting away from obsolete methods. The QT3_SUPPORT methods have been removed in Qt 5, so any Qt 4 code which uses them needs to be ported. Additionally, some code which was deprecated early in the Qt 4 cycle has been removed. Mostly these are trivial porting steps – renaming method calls.

As noted in the video above, this is tricky to get right with non-semantic/language aware tools like perl, but very easy and repeatable with this tool. For example, QRegion::intersect() and QRect::intersect() must be changed to intersected(), but QSet::intersect() should not be changed. It gets more complex with virtual methods and with non-virtual overrides. For example, QPaintDevice has colorCount, but it is not virtual. QImage inherits QPaintDevice and also has colorCount, but it is not an override. This is unusual, but there are certainly cases where it is not a bug. We have to rename the correct method.

The qt4to5 tool checks all of this stuff, so given:

  class Foo
  {
  public:
    // Old API:
    virtual void m1() {}
    void m2() {}
    void m3() {}
    virtual void m5() {}

    // Replacement API:
    virtual void n1() {}
    void n2() {}
    void n3() {}
    virtual void n5() {}
  };

  class Bar : public Foo
  {
  public:
    // Old API:
    void m1() {}
    void m2() {}
    void m4() {}

    // Replacement API:
    void n1() {}
    void n2() {}
    void n4() {}
  };

This code:

    Bar i;

    i.m2();

    Foo d = i;
    d.m2();

is ported using the command line switches ‘-rename-class=Bar -rename-old=m2 -rename-new=n2’ to this result:

    Bar i;

    i.n2();

    Foo d = i;
    d.m2();

So, the Foo invocation was not changed, because it did not refer to an invocation of the Bar::m2 method. It can be changed separately in the event that it should also be changed as part of the porting.

Similarly, ‘qt4to5 -rename-class=Foo -rename-old=m1 -rename-new=n1’ will change all invocations of m1 to n1 anywhere it is invoked polymorphically. Calls of m3() and m5() on Bar objects will be correctly ported to n3 and n5 because they are inherited.
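The distinctions the tool has to respect are ordinary C++ name lookup and dispatch rules. The sketch below is a cut-down version of the Foo and Bar classes from above, with return values added (an assumption made only so the behaviour can be observed), showing which method each call site actually refers to.

```cpp
#include <cassert>
#include <string>

struct Foo {
    virtual ~Foo() = default;
    virtual std::string m1() const { return "Foo::m1"; }  // virtual
    std::string m2() const { return "Foo::m2"; }          // non-virtual
    std::string m3() const { return "Foo::m3"; }          // declared only in Foo
};

struct Bar : Foo {
    std::string m1() const override { return "Bar::m1"; } // overrides Foo::m1
    std::string m2() const { return "Bar::m2"; }          // hides Foo::m2
};

void dispatchExamples()
{
    Bar i;
    assert(i.m2() == "Bar::m2");   // a Bar::m2 call site

    Foo d = i;                     // slices: d is a plain Foo
    assert(d.m2() == "Foo::m2");   // a Foo::m2 call site, left alone by a Bar rename

    const Foo &ref = i;
    assert(ref.m1() == "Bar::m1"); // polymorphic call written against Foo::m1
    assert(ref.m2() == "Foo::m2"); // non-virtual: resolved from the static type
    assert(i.m3() == "Foo::m3");   // inherited: a Foo::m3 call on a Bar object
}
```

A textual search for "m2(" sees four identical-looking calls here, while the compiler sees two different methods.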

Functions can also be renamed by not specifying the -rename-class argument. qMemCopy is deprecated in Qt 5 and should be replaced with memcpy, so ‘qt4to5 -rename-old=qMemCopy -rename-new=memcpy’ will do that port.

Renaming also works with operator casts, so we can also rename uses of QAtomicInt::operator int() (which is removed in Qt 5) to QAtomicInt::loadAcquire() (which does not exist in Qt 4).
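What makes the operator cast case hard for text-based tools is that the call being renamed is invisible in the source. As an analogy only, here is the same situation with std::atomic<int> from the standard library, which also offers both an implicit conversion and an explicit load; it is a stand-in for QAtomicInt, not the Qt class.

```cpp
#include <atomic>
#include <cassert>

// Before: the read happens through an implicit conversion operator.
// There is no method name in the source for a text search to find.
int implicitRead(const std::atomic<int> &counter)
{
    int value = counter;
    return value;
}

// After: the same read spelled as an explicit method call.
int explicitRead(const std::atomic<int> &counter)
{
    return counter.load(std::memory_order_acquire);
}

void atomicReadExample()
{
    std::atomic<int> counter(42);
    assert(implicitRead(counter) == explicitRead(counter));
}
```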

Renaming enums

Renaming enums is similar, but a bit more tricky. Given this, which is in Qt 5:

  namespace QSsl {
    enum SslProtocol {
      TlsV1_0,
  #ifndef QT_NO_DEPRECATED
      TlsV1 = TlsV1_0,
  #endif
      TlsV3,
    };
  }

and used like so:

    int i = QSsl::TlsV1;

You will notice that the enum name doesn’t appear, but the compiler knows about it, and our porting tool needs to know about it too.

I don’t think the clang tooling can handle wildcards in names yet (probably something I could request), so when porting TlsV1, we need to specify both the actual scope of the value (‘QSsl::SslProtocol::’) and the printed scope (‘QSsl::’).
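The two scopes are both valid ways to name the same enumerator since C++11, which is why the tool has to be told about each. A minimal sketch, using a cut-down stand-in for the declaration above rather than the real Qt header:

```cpp
#include <cassert>

// Stand-in for the declaration quoted above; not the real Qt header.
namespace QSsl {
enum SslProtocol {
    TlsV1_0,
    TlsV1 = TlsV1_0   // old name kept as an alias
};
}

// The "printed" scope and the actual scope name the same enumerator.
static_assert(QSsl::TlsV1 == QSsl::SslProtocol::TlsV1,
              "both spellings refer to one enumerator");

int asWrittenInUserCode()
{
    int i = QSsl::TlsV1;   // the enum name SslProtocol never appears here
    assert(i == QSsl::SslProtocol::TlsV1_0);
    return i;
}
```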

Modifying virtual method signatures

In Qt 5, the QAbstractItemView::dataChanged virtual method got an extra QVector argument (with a default value). All reimplementations of that method need to get that additional parameter. The porting tool will do that using the -port-qabstractitemview-datachanged switch. This code:

  class MyView : public QAbstractItemView
  {
  public:
    void dataChanged(const QModelIndex&,
                     const QModelIndex&);
  };

  class MyView2 : public QAbstractItemView
  {
  public:
    void dataChanged(const QModelIndex&,
                     const QModelIndex&)
    {
      // inline implementation
    }
  };

  class MyView3 : public NotAView
  {
  public:
    void dataChanged(const QModelIndex&,
                     const QModelIndex&);
  };

  void MyView::dataChanged(const QModelIndex&,
                           const QModelIndex&)
  {
  }

  void MyView3::dataChanged(const QModelIndex&,
                            const QModelIndex&)
  {
  }

will be ported to:

  class MyView : public QAbstractItemView
  {
  public:
    void dataChanged(const QModelIndex&,
                     const QModelIndex&,
                     const QVector<int>& = QVector<int>());
  };

  class MyView2 : public QAbstractItemView
  {
  public:
    void dataChanged(const QModelIndex&,
                     const QModelIndex&,
                     const QVector<int>& = QVector<int>())
    {
      // inline implementation
    }
  };

  class MyView3 : public NotAView
  {
  public:
    void dataChanged(const QModelIndex&,
                     const QModelIndex&);
  };

  void MyView::dataChanged(const QModelIndex&,
                           const QModelIndex&,
                           const QVector<int>&)
  {
  }

  void MyView3::dataChanged(const QModelIndex&,
                            const QModelIndex&)
  {
  }

Note that the one which does not inherit QAbstractItemView is not ‘ported’.
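The reason every reimplementation needs the extra parameter is that a method with the old signature still compiles against the new base class, but it no longer overrides anything. The sketch below shows this with a mock base class; View, Unported, Ported and notify are hypothetical names, and std::vector<int> stands in for QVector<int>.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mock of the Qt 5 base class: the virtual method has gained a third,
// defaulted parameter.
struct View {
    virtual ~View() = default;
    virtual std::string dataChanged(int, int,
                                    const std::vector<int> & = std::vector<int>())
    { return "View"; }
};

// Qt 4 style reimplementation: compiles, but only hides the base method.
struct Unported : View {
    std::string dataChanged(int, int) { return "Unported"; }
};

// Ported reimplementation: matches the new signature, so it overrides.
struct Ported : View {
    std::string dataChanged(int, int,
                            const std::vector<int> & = std::vector<int>()) override
    { return "Ported"; }
};

// The framework calls through the base class.
std::string notify(View &v) { return v.dataChanged(0, 0); }

void overrideExamples()
{
    Unported u;
    Ported p;
    assert(notify(u) == "View");     // the old reimplementation is silently skipped
    assert(notify(p) == "Ported");
}
```

Marking reimplementations with override, where the compiler supports it, turns this silent behaviour change into a compile error.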

QImage::text and QImage::setText

Also in the area of argument manipulation are the QImage::text and QImage::setText methods. I think that early in Qt 4 they took a const char * argument for language, which was rarely used, so overloads were provided without the language at some point. Code written before the overload is usually called with a 0 in that position:

    image.text( "Thumb::URI", 0 );

In Qt 5 it is deprecated, so it should become:

    image.text( "Thumb::URI" );

The porting tool detects whether the literal zero is used in the call, so it can port only the invocations that include the literal zero and leave alone invocations which actually pass a language.

Conclusion

This clang based tooling was fun to write and the porting API in clang is certainly interesting (the subject of a future blog post). A tool like this is capable of doing many boring parts of what is required in porting, such as renaming methods, and it works quite well, but it will never port from QtWidgets to QML for example.

The automatable part of a Qt 4 to 5 port will always be a subset of the same possibilities. Some existing code will use QRegion::intersect, while other code will not. The tool can of course test the codebase for what is required, so it can be used in a ‘fire and forget’ manner.


6 thoughts on “Automated porting from Qt 4 to Qt 5”

  1. That’s really cool.

    However, I’ve just had a skim through the mailing list thread and the thesis you linked to, and noticed that the semantic patch tool “Coccinelle”/“spatch”[0][1] doesn’t appear to have been mentioned or evaluated against other approaches? Was it just not brought up, or was it not suitable in some way for Qt porting? (I don’t know how well it copes with C++, given that it was primarily written for C code bases – specifically the Linux kernel)

    [0] http://coccinelle.lip6.fr/sp.php
    [1] https://lwn.net/Articles/315686/

    • Thanks for the links, Karellen.

      I indeed didn’t know about that tool. I was more trying to evaluate if clang can be used to do what we want, given that there is quite a bit of momentum behind clang in general, and refactoring tools with clang in general too.

      However, as that tool seems to be focussed on C instead of C++ (from reading the LWN article), it might not be suitable for Qt porting.

      Also an interesting link in the LWN article is to the Pork tool, which I didn’t know about before:

      http://blog.mozilla.org/tglek/2009/01/29/semantic-rewriting-of-code-with-pork-a-bitter-recap/

      Thanks,

  2. Karellen, I have looked at coccinelle before starting to work on the clang AST matchers and tooling infrastructure, and we found it not the perfect fit for multiple reasons at the time.

    One of my main concerns is how you express implicit constructs, like narrowing conversions, implicit constructor calls, implicit this pointer types, derived-from relationships etc.

  3. Pingback: Cross Qt Development! | One Alone Bit

  4. i try’d this today and got errors:

    Qt4To5.cpp: In function ‘int portMethod(const clang::tooling::CompilationDatabase&)’:
    llvm/tools/clang/tools/qt4to5/Qt4To5.cpp:491:49: error: ‘function’ was not declared in this scope

    and a lot more. How can I fix this?

    Thx, jan
