Assessing the automatic Qt 4 to 5 porting tool

25.06.2012

steveire


One of the useful outcomes of the work Bertjan did on tooling for program understanding and refactoring is a list of considerations we can use to assess the suitability of new tools.

Requirements for a porting system

Section 1.3.5 of his thesis details the requirements for a similar porting system:

GR1: Scalability

The qt4to5 porting tool is scalable. It was designed by Google engineers to be used on large codebases, and to operate on multiple translation units at a time. As the tool is based on a compiler, if you have the resources to compile your code, you have the resources to port it.

GR2: C++ Language understanding

The tool has a full understanding of the C++ language. This does not really mean that everything can easily be ported however. For example:

    template<typename T> T doSomething(T a, const T &b) {
      // Which intersect() this calls is only known once a caller chooses T.
      return a.intersect(b);
    }

In cases where T is a QRect, the intersect() should be changed to intersected(), but not if T is a QSet. This is determined by the caller, and the porting tool cannot port such cases automatically (see also FX 6).

However, the intent of the tool is not to port everything automatically, but to port the boring and easily automatable parts automatically.

GR3: Simplicity of use

The tool works, but the command line arguments are not very convenient, and probably never can be. I’ve written a Python script to make it easier, but a graphical tool would be better:

  renameMethod("QPaintDevice", "numColors", "colorCount")
  renameMethod("QImage", "numColors", "colorCount")
  renameMethod("QImage", "setNumColors", "setColorCount")

The graphical tool would be able to tell the engineer using it which refactoring steps can be done before starting the switch to Qt 5 (e.g. porting away from Qt 3 support), so that the code can be recompiled in between.

A Qt based tool could also link statically to the tooling framework and avoid calling external processes. It would be possible to integrate such tooling into Qt Creator.

The tooling is already integrated with CMake. The tool needs access to the actual command line that is used to build each translation unit. Since version 2.8.5, CMake provides that when it is run with -DCMAKE_EXPORT_COMPILE_COMMANDS=ON, by writing a compile_commands.json file into the build directory.
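To make the role of that file concrete, here is a minimal sketch of reading a compilation database and looking up the command for one source file. It assumes the JSON layout clang documents for compile_commands.json (a list of objects with "directory", "command" and "file" keys); the function name load_compile_commands is made up for illustration and is not part of the porting tool.

```python
import json
import os


def load_compile_commands(build_dir):
    """Return {absolute source path: compile command} for a build directory."""
    db_path = os.path.join(build_dir, "compile_commands.json")
    with open(db_path) as f:
        entries = json.load(f)

    commands = {}
    for entry in entries:
        # "file" may be given relative to "directory", so join before normalising.
        source = os.path.normpath(os.path.join(entry["directory"], entry["file"]))
        commands[source] = entry["command"]
    return commands
```

A tool can then ask for commands["/path/to/file.cpp"] to learn the include paths and defines that translation unit is really built with.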

Currently this only works with the CMake “Unix Makefiles” and Ninja generators. Creating such a compilation database could presumably be added to qmake as well, but that has complications: for some porting steps you need to be compiling against Qt 4 (see FX 1 below), so it would need to be added to the qmake in Qt 4 and probably Qt 5 too.

Alternatively, I also created a compiler wrapper which creates such a compilation database when invoked. There may even be a simpler solution by using the porting tool as the actual compiler.
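The idea behind such a wrapper can be sketched as follows. This is not the wrapper I wrote, only an illustration under simple assumptions: the helper name record_invocation and the list of source suffixes are hypothetical, and the sketch ignores concurrent writes from parallel builds.

```python
import json
import os
import shlex

# Assumption: arguments with these suffixes are the source files being compiled.
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx")


def record_invocation(db_path, directory, argv):
    """Append one entry per source file in argv to the JSON database at db_path."""
    entries = []
    if os.path.exists(db_path):
        with open(db_path) as f:
            entries = json.load(f)

    # Quote each argument so the stored command can be re-parsed by a shell.
    command = " ".join(shlex.quote(arg) for arg in argv)
    for arg in argv[1:]:
        if arg.endswith(SOURCE_SUFFIXES):
            entries.append({"directory": directory, "command": command, "file": arg})

    with open(db_path, "w") as f:
        json.dump(entries, f, indent=2)
    return entries
```

A real wrapper would call this with os.getcwd() and its own argument list (with the real compiler substituted as the first element), and then run the real compiler so that the build still succeeds.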

GR4: Customizability

Although the porting tool can port enums, the upstream tooling code in the clang repository cannot yet match them.

I had to extend it, which was quite trivial:

  namespace clang {
  namespace ast_matchers {
  const internal::VariadicDynCastAllOfMatcher<
    clang::Decl,
    clang::EnumConstantDecl> EnumeratorConstant;
  }
  }

I created the EnumeratorConstant matcher, which matches AST nodes of type EnumConstantDecl, that is, the declaration of an individual enumerator. Wrapped in a DeclarationReference matcher, it finds uses of that enumerator.

EnumeratorConstant can be used just like the rest of the matcher syntax, and can contain inner matchers:

  DeclarationReference(
    To(EnumeratorConstant(
      HasName("QSsl::SslProtocol::TlsV1"))
    )
  )

will find all uses of QSsl::SslProtocol::TlsV1.

So, the tooling and API are fully extensible.

GR5: Predictable minimal impact

The tool only makes changes on parts of the code that are specified in the matching expressions (such as the enum matcher above).

Therefore, the predictability of the impact of running the tool depends on the exactness of the matching expressions.

The tool might also try to edit your system headers, for example when renaming a virtual method or changing its arguments. To prevent that, the tool takes a source directory as an argument and verifies that edited files are below it. That feature is not fully used yet, however, and there is scope for clang upstream to add such a feature to ensure correct support for symlinks, relative paths and so on.
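The kind of check meant here can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in the tool (which is C++); the function name is_below_source_dir is hypothetical. Resolving both paths first is what makes symlinks and relative paths behave.

```python
import os


def is_below_source_dir(path, source_dir):
    """True if path resolves to a location inside source_dir."""
    # realpath makes the path absolute and resolves ".." and symlinks.
    real_path = os.path.realpath(path)
    real_root = os.path.realpath(source_dir)
    # Compare whole path components, so /src/project2 is not "below" /src/project.
    return os.path.commonpath([real_path, real_root]) == real_root
```

An edit to a file would only be applied when this returns True for the file being edited, so a header reached through a symlink into a system include directory is left alone.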

GR6: Transparency

The Python wrapper script provided in the repository creates git commits with commit messages describing the change being made at each step (micro commits).

The user of the tool can review all commits in gitk, or build any step by checking it out (or build all steps by running a git script).

Any other graphical user tool could do the same thing. The result of such an execution of the script can be seen in kdelibs.
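As a rough sketch of the micro commit idea, each porting step could end with something like the following. The message format and the names describe_rename and commit_step are invented for illustration and do not necessarily match what the wrapper script in the repository does.

```python
import subprocess


def describe_rename(class_name, old_name, new_name):
    """Build a human readable description of one rename step."""
    return "rename %s::%s() to %s()" % (class_name, old_name, new_name)


def commit_step(description, run=subprocess.check_call):
    """Commit the changes made by one porting step as its own commit."""
    message = "Port: " + description
    # "git commit -a" stages modifications to tracked files before committing.
    # It exits with an error if the step changed nothing, which check_call raises.
    run(["git", "commit", "-a", "-m", message])
    return message
```

The run parameter exists only so that the function can be exercised without a repository, for example commit_step(describe_rename("QImage", "numColors", "colorCount"), run=print).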

Requirements of a fact extraction system

Section 2.2 details requirements for a fact extraction system. This is a necessary component of a porting tool, as it relies on having accurate information to complete a port.

However, not every requirement of a fact extraction system is a requirement of a porting tool.

FX1: Fault tolerance

The tool has some fault tolerance, which comes from clang. If you remember Chandler’s GoingNative talk, he mentioned that if clang encounters an error, it tries to ignore it and keep processing.

Indeed, the Python porting script currently only runs cmake, but not make, which means that dependency scanning and file generation are not invoked. That means that moc files and ui_ files are not generated. When running the tool, clang will give an error about missing includes for the moc files, but will continue to process the file.

However, while there is some fault tolerance, the specific code to be ported does have to be correct. I think mostly we’d be porting code which already compiles with Qt 4, so I don’t think fault tolerance is a huge problem for a porting tool (as opposed to a fact extraction system).

FX2: Completeness and correctness

All parseable code in the software targeted for porting is extracted correctly, because the tool is based on a compiler.

However, this is only true for one particular target platform at a time (see also FX 9).

When used on Linux, anything in #ifdef Q_OS_WIN, for example, will not be ported. It may be possible to use Wine headers or a cross-compilation build to port such code on a Linux host.

FX3: Compliance

The parser is compliant with at least most of C++03 and parts of C++11. The C++11 parts might be relevant if we ever have to port code which uses C++11 and Qt 4, which is not unheard of.

While clang can parse C++11 features such as lambdas, the new tooling system does not yet have an API for matching or processing them.

FX4: Cross references

This item is not relevant to our porting tool. It is only relevant to fact extraction systems, for example to find the number of uses of a method. The porting tool compiles each translation unit in isolation.

In our case the fact extraction framework is clang, and it extracts all available information.

FX5: Preprocessing

My investigation of the new clang tooling APIs did not include playing with preprocessor constructs much, but I know that clang stores all information extracted from the preprocessor.

FX6: Coverage

The C++ grammar is partially context dependent, and that can lead to ambiguity as described before when using templates for example:

  template<typename T> int numColors(const T &t) {
    return t.numColors();
  }

Is T a QImage or a QPaintDevice?

FX7: Output completeness

This item is not relevant to porting tools. This is only relevant to the fact extraction framework. In our case the fact extraction framework is clang, and it extracts all available information.

FX8: Performance and scalability

The Python wrapper script provided in the repository assumes the existence of a git repository, and uses git grep to find uses of a method which needs to be ported. This way, each porting step only needs to process the files which contain something to be ported or a false positive (e.g. QSet::intersect), rather than all files.
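A minimal sketch of that pre-filtering step might look like this. The function names and the choice of file patterns are assumptions made for illustration; only the git grep options themselves (-l, -w, -F) are real.

```python
import subprocess


def git_grep_command(identifier):
    """Build the git grep invocation listing files that mention identifier."""
    # -l: print only file names, -w: match whole words, -F: fixed string, not a regex.
    return ["git", "grep", "-l", "-w", "-F", identifier, "--", "*.cpp", "*.h"]


def candidate_files(identifier, cwd="."):
    """Files in the repository at cwd that are worth running the porting tool on."""
    result = subprocess.run(git_grep_command(identifier), cwd=cwd,
                            stdout=subprocess.PIPE, universal_newlines=True)
    # git grep exits with status 1 when nothing matches; that is just an empty list.
    return result.stdout.splitlines()
```

Because the match is purely textual, candidate_files("intersect") also returns files that only use QSet::intersect. The porting tool itself then decides, with full type information, that nothing needs to change there.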

This works well for renaming, but perhaps other tricks and heuristics would be needed for more complex porting steps. For example, grep can’t find uses of QAtomicInt::operator int(), because an implicit conversion leaves no name in the source text at the point of use. But if we grep for QAtomicInt, which might be used in a header file, and then grep for uses of that header file, we might get all relevant files that need to be ported to use QAtomicInt::loadAcquire().
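The two-stage search for the QAtomicInt case can be sketched in plain Python over an in-memory map of file names to contents. This is a heuristic illustration only: the function names are hypothetical, it follows a single level of inclusion, and it matches headers by base name.

```python
import os
import re


def files_mentioning(sources, identifier):
    """Names of files whose text contains identifier as a whole word."""
    pattern = re.compile(r"\b%s\b" % re.escape(identifier))
    return {name for name, text in sources.items() if pattern.search(text)}


def files_to_process(sources, class_name):
    """Files naming class_name, plus files that include a header naming it."""
    direct = files_mentioning(sources, class_name)
    headers = {os.path.basename(name) for name in direct if name.endswith(".h")}

    include = re.compile(r'#\s*include\s*[<"]([^>"]+)[>"]')
    indirect = set()
    for name, text in sources.items():
        included = {os.path.basename(target) for target in include.findall(text)}
        if included & headers:
            indirect.add(name)
    return direct | indirect
```

For a counter.h that declares a QAtomicInt member and a main.cpp that only includes counter.h and reads the member as an int, the second stage is what brings main.cpp into the set, even though it never mentions QAtomicInt by name.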

Any user interface tool should keep this git dependency in order to create clean patches. Even for code bases which do not use git, the patches can still be created and reviewed in a local git repository created with git init.

FX9: Portability

Clang can currently only generate code on Unix systems, so it can’t be used to generate Windows binaries.

There have been some Windows-specific patches on the mailing list though, and it’s possible that the parser (the only part we need) works just fine on Windows.

This still needs to be fully investigated.

FX10: Availability

clang and the required tooling for the porting tool are covered by a permissive, non-copyleft licence (the University of Illinois/NCSA Open Source License). It is not reciprocal, so it does not prevent the code from being part of proprietary forks. The license choice assumes that doing so is not worth it anyway, as a social measure if not a legal one.

The actual code of the porting tooling is currently in a branch, not in clang trunk, which is a minor temporary inconvenience.

Categories: Porting Qt4 to Qt5