Creating PDF/A documents with Qt

As there were some complex issues around standards conformance in PDF document creation within Qt, KDAB let me spend some time digging into it so we could make sure that Qt’s PDF engine generates documents that meet the ISO standard.

Nowadays, many official institutions are required to archive their data digitally, and PDF is a first-class citizen for this purpose. However, the full PDF standard is too complex to guarantee that documents using all of its features can still be rendered in their entirety in the future. The archival variant of the standard, PDF/A, therefore restricts the allowed features to a sensible minimum.

Qt has supported the generation of PDF documents since Qt 4.1, by using a QPrinter object, setting its output format to QPrinter::PdfFormat and then using a QPainter to draw arbitrary text, graphics and images onto it. In more recent Qt versions you can also use the QPdfWriter class directly, to avoid the dependency on QPrinter. The text that you draw with QPainter::drawText() is stored as actual characters inside the PDF file, so it is searchable and scalable. The same applies to the graphics you create with the various QPainter::draw*() calls. Only QPainter::drawImage() and QPainter::drawPixmap() calls will store the passed images as bitmaps inside the PDF document, so you cannot scale them without aliasing artifacts.

While the original version of Qt’s PDF engine just creates documents that conform to PDF version 1.4, nowadays it is more and more important to conform to the PDF/A standard. If you haven’t heard about it yet, PDF/A is an ISO standard that describes a subset of PDF which should be used for archiving digital documents. This standard defines which features are prohibited in a PDF document, since they would be ill-suited for long-term archiving.

Those features range from dependencies on external resources (e.g. linking font files), through dynamic content (JavaScript, audio and video) up to the usage of proprietary compression algorithms or encryption.

There are various levels of conformance, which differ in the required PDF base version and the prohibited features. The oldest and most restrictive version is PDF/A-1, with its sub-versions PDF/A-1b and PDF/A-1a. The newest version is PDF/A-3 (for more details, see Wikipedia).

Apparently the PDF/A-1b standard is a common requirement nowadays when generating PDF documents, so it was about time to extend Qt’s PDF engine to generate standard-compliant documents, and this is exactly what KDAB did over the last few weeks :).

To verify that a PDF/A document is compliant, there are a couple of tools available, some commercial ones and some open source tools. The most mature and strict validator seems to be veraPDF, an open source tool that has been developed in cooperation with the Open Preservation Foundation and the PDF Association (the creators of the PDF/A standard).

Since PDF/A-1b is based on PDF-1.4, we didn’t have to change much of the basic generation code in Qt’s PDF engine; only some smaller syntax fixes were needed to make the validator happy.

Additionally, transparency in embedded images is not allowed. This means that if you call QPainter::drawImage() or QPainter::drawPixmap() with an image that contains an alpha channel, the Qt PDF engine will replace the transparent pixels with white ones and store the result inside the PDF. Semi-transparent colors are also not allowed for drawing. So if you configure QPainter's pen or brush with a color or gradient that contains an alpha channel, the channel is removed.
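To illustrate what replacing transparency with white can mean per pixel, here is a small sketch in plain C++. It assumes that semi-transparent pixels are blended over a white background ("source over"), which is one plausible reading of the behaviour described above and not a copy of Qt's implementation. The names Rgba, overWhite and flattenOntoWhite are hypothetical.

```cpp
#include <cstdint>

// One 8-bit-per-channel pixel with straight (non-premultiplied) alpha.
struct Rgba {
    std::uint8_t r, g, b, a;
};

// Blends a single color channel over a white (255) background,
// rounding to the nearest integer. The arithmetic is done in int,
// so it cannot overflow the 8-bit channel type.
constexpr std::uint8_t overWhite(std::uint8_t channel, std::uint8_t alpha)
{
    return static_cast<std::uint8_t>(
        (channel * alpha + 255 * (255 - alpha) + 127) / 255);
}

// Returns a fully opaque pixel: transparent areas become white,
// opaque areas keep their color, everything in between is mixed.
constexpr Rgba flattenOntoWhite(Rgba p)
{
    return { overWhite(p.r, p.a), overWhite(p.g, p.a), overWhite(p.b, p.a), 255 };
}
```

With this formula a fully transparent pixel ends up pure white and a fully opaque pixel is unchanged. Note that this only models the image case; for pens and brushes the text above says the alpha channel is simply removed.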

There are also some requirements in PDF/A-1b that Qt’s PDF engine had not supported so far, namely:

  • metadata must be embedded in the Extensible Metadata Platform format (XMP)
  • ICC color profiles for used color spaces must be embedded

Embedding this information into the PDF document makes it slightly larger than its PDF-1.4 equivalent, which is why QPrinter and QPdfWriter do not use the PDF/A-1b format as default.

From Qt 5.10 on, you can enable PDF/A-1b support in QPrinter/QPdfWriter by setting their pdfVersion property to PdfVersion_A1b instead of the default PdfVersion_1_4.

So if you need PDF/A-1b support in your application, feel free to test the code from the current dev branch and validate the resulting PDF document with veraPDF. If you find any validation errors, please send me the document and I’ll try to fix QPdfWriter accordingly.

8 thoughts on “Creating PDF/A documents with Qt”

  1. Thank you very much for this Tobias! I will be starting a project requiring PDFs soon, so this is really good information.

    I haven’t dug into the details of the PDF format yet, but I know one of my requirements will be digital signing, and according to the Wikipedia article, digital signatures seem to be part of “PDF/A-2”. If it’s not possible with QPdfWriter right now, any thoughts on how tricky that would be to implement?

    1. Tobias Koenig

      Hej Andy,

      the QPdfWriter has no support for digital signing, because its initial goal was to simply export QPainter paint commands into a PDF file. I don’t really see that digital signing support will be added to QPdfWriter either, because that would drag in dependencies on crypto libraries into QtGui, which most users won’t need. The best approach for the moment might be to first generate the PDF with QPdfWriter and afterwards sign it with some external library/tool.

      1. Ok – thanks Tobias. That makes sense. And thanks again for your work on this! Much appreciated.

    1. Tobias Koenig

      Hej Alain,

      a valid PDF/A-1b document is basically also a valid PDF/A-3b document (except for a small change in the PDF header), because the later versions of the standard have lifted some restrictions that are in place in PDF/A-1b. I have not really planned to add PDF/A-3b explicitly, but I can give it a try if time permits 😉

      1. Is there any way to store custom metadata inside a PDF/A-1b document using Qt? It seems possible in PDF/A-1b, but I can’t find any API to do that from Qt.
        It could be useful to keep a machine-friendly format like JSON/XML stored inside the PDF. For example, if you generate reports from sensor data and render charts, you may want to store the original data that produced the charts together with the report.

        1. Tobias Koenig

          Hej,

          theoretically you can embed arbitrary files into a PDF document (not PDF/A-1b specific), but the Qt API to generate PDF documents doesn’t provide such functionality, since QPdfWriter is only meant to redirect painting commands into a PDF document. For more advanced PDF creation, you’d have to integrate some 3rd-party library into your application, which would post-process the generated PDF document.
