Creating PDF/A documents with Qt
Because conformance raised some complex issues in Qt's PDF document creation, KDAB let me spend some time digging into it so we could make sure that Qt’s PDF engine generates documents that meet the ISO standard.
Nowadays, many official institutions are required to archive their data digitally, and PDF is a first-class citizen for this purpose. However, the full PDF standard is too complex to guarantee that documents using all of its features can still be rendered in their entirety in the future, so the PDF/A standard restricts the allowed features to a sensible minimum.
Qt has supported the generation of PDF documents since Qt 4.1: you create a QPrinter object, set its output format to QPrinter::PdfFormat and then use a QPainter to draw arbitrary text, graphics and images onto it. In more recent Qt versions you can also use the QPdfWriter class directly to avoid the dependency on QPrinter. The text that you draw with QPainter::drawText() is stored as actual characters inside the PDF file, so it is searchable and scalable. The same applies to the graphics you create with the various QPainter::draw*() calls. Only QPainter::drawPixmap() calls store the passed images as bitmaps inside the PDF document, so you cannot scale them without aliasing artifacts.
While the original version of Qt’s PDF engine just creates documents that conform to PDF version 1.4, nowadays it is more and more important to conform to the PDF/A standard. If you haven’t heard about it yet, PDF/A is an ISO standard that describes a subset of PDF which should be used for archiving digital documents. This standard defines which features are prohibited in a PDF document because they would be ill-suited for long-term archiving.
There are various levels of conformance, which differ in the required PDF base version and the prohibited features. The oldest and most restrictive version is PDF/A-1, with its sub-versions PDF/A-1b and PDF/A-1a. The newest version is PDF/A-3 (for more details, see Wikipedia).
Apparently the PDF/A-1b standard is a common requirement nowadays when generating PDF documents, so it was about time to extend Qt’s PDF engine to generate standards-compliant documents, and this is exactly what KDAB did over the last few weeks :).
To verify that a PDF/A document is compliant, there are a couple of tools available, some commercial ones and some open source tools. The most mature and strict validator seems to be veraPDF, an open source tool that has been developed in cooperation with the Open Preservation Foundation and the PDF Association (the creators of the PDF/A standard).
Since PDF/A-1b is based on PDF-1.4, we didn’t have to change much of the basic generation code in Qt’s PDF engine; only a few smaller syntax fixes were needed to make the validator happy.
Additionally, transparency in embedded images is not allowed. This means that if you call QPainter::drawImage() or QPainter::drawPixmap() with an image that contains an alpha channel, the Qt PDF engine will replace the transparent pixels with white ones and store the result inside the PDF. Semi-transparent colors are also not allowed for drawing. So if you configure QPainter’s pen or brush with a color or gradient that contains an alpha channel, the channel is removed.
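As a rough illustration of what flattening transparency onto white means for a single pixel, here is a small standalone sketch. It uses ordinary alpha compositing over a white background with straight (non-premultiplied) 8-bit channels, which is an assumption on my side: it is not the code from Qt's PDF engine, and the Rgba, Rgb, overWhite and flattenOnWhite names are invented for this example.

```cpp
#include <cstdint>

struct Rgba { std::uint8_t r, g, b, a; };  // straight alpha, 0 = fully transparent
struct Rgb  { std::uint8_t r, g, b; };     // opaque result

// Blend one channel over white (255), rounding to the nearest integer.
constexpr std::uint8_t overWhite(std::uint8_t c, std::uint8_t a)
{
    return static_cast<std::uint8_t>((c * a + 255 * (255 - a) + 127) / 255);
}

// Drop the alpha channel by compositing the pixel onto a white background.
constexpr Rgb flattenOnWhite(Rgba p)
{
    return Rgb{overWhite(p.r, p.a), overWhite(p.g, p.a), overWhite(p.b, p.a)};
}
```

With this formula a fully opaque pixel keeps its color, a fully transparent pixel becomes pure white, and everything in between is lightened towards white.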
PDF/A-1b also has some requirements that Qt’s PDF engine did not support until now, namely:
- metadata must be embedded in the Extensible Metadata Platform format (XMP)
- ICC color profiles for used color spaces must be embedded
Embedding this information into the PDF document makes it slightly larger than its PDF-1.4 equivalent, which is why QPdfWriter does not use the PDF/A-1b format by default.
From Qt 5.10 on, you can enable PDF/A-1b support in QPdfWriter by setting its pdfVersion property to PdfVersion_A1b instead of the default PdfVersion_1_4.
So if you need PDF/A-1b support in your application, feel free to test the code from the current dev branch and validate the resulting PDF document with veraPDF. If you find any validation errors, please send me the document and I’ll try to fix