PDF or AFP? XML or XSL-FO? And what about HTML5?
The following article discusses the advantages and disadvantages of the standard formats in output management
The topic of data formats in output management is sure fodder for controversy. Which format is better? The point is moot. All have their purpose – both page-oriented formats like AFP and PDF, and content-oriented formats à la HTML and XML. What really counts is which assumptions a company makes and how its document processing is structured. An operation that prints just a few hundred pages a day has no reason to adopt, say, AFP.
On the other hand, a company with a daily print volume of several million pages is an altogether different story. For it, there’s no way around AFP. The format originally developed by IBM is the de facto standard for reliable, high-volume production printing. And with good reason: AFP has features that others do not (also see glossary). AFP, for example, offers the most advanced form of print monitoring. If the content output is incorrect or incomplete, an error message is automatically generated. It’s no accident that AFP is the most widely used format in the industrial production of invoices, account statements, securities/portfolio statements, transfer vouchers, insurance policies, and the like.
AFP for reliable, high-quality bulk printing
Moreover, AFP offers definitive management of multi-page display, paper tray control, and simplex-duplex printing – in other words, everything essential for bulk production. Its comprehensive and flexible resource management and compact data stream make AFP a user favorite. Product and application developers value its carefully designed and well-documented architecture. Even serious contenders like PDF can’t come close to AFP print quality, not even when the PDF/VT specification is used. In the end, bundling the bulk-printing benefits of AFP with the extreme flexibility of PDF is just a compromise (see glossary for benefits and specifications of PDF).
Yet due to its extreme compatibility, PDF with all its different specifications is a recognized international standard and has firmly established itself as the format for long-term, read-only archiving (PDF/A) and the creation of barrier-free documents. Ultimately the specific situation will dictate the choice between AFP and PDF. A company that wants or needs to archive large quantities of documents in their original layout may choose to output them directly in PDF or PDF/A, thus avoiding the conversion from AFP to PDF that would otherwise be necessary.
PDF and HTML5 are not competitors
Be that as it may, both AFP and PDF stick rigorously to the standard-size page format, making them the obvious choice for A4 document processing. But neither format works for display on the Web or mobile end-devices. This is where HTML5 comes in. The W3C standard is currently the most intelligent format for the creation and display of documents, regardless of size or output channel.
It supports reformatting, converting page- to text-oriented formats; extracting single data items (including retrieval of invoice items); and building tables of contents and index lists. Moreover, with HTML5, even audiovisual elements, Web links and charts can be embedded. This creates not only multi-channel-capable documents, but also intelligent documents that offer users added value beyond mere text display.
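To make the point about extracting single data items concrete, here is a minimal sketch in Python that pulls invoice items out of an HTML5 document using only the standard library. The sample invoice, the `class="item"` marker, and the `data-sku` and `data-amount` attributes are illustrative assumptions; a real document would follow whatever markup conventions the company has defined.

```python
from html.parser import HTMLParser

# Hypothetical HTML5 invoice. The class name and data-* attributes are
# assumptions made for this example, not part of any standard.
INVOICE_HTML = """<!DOCTYPE html>
<html>
<body>
<h1>Invoice</h1>
<table>
<tr class="item" data-sku="A-100" data-amount="19.90"><td>Paper</td></tr>
<tr class="item" data-sku="B-200" data-amount="5.10"><td>Toner</td></tr>
</table>
</body>
</html>"""


class InvoiceItemParser(HTMLParser):
    """Collects (sku, amount) pairs from table rows marked class="item"."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "tr" and attributes.get("class") == "item":
            self.items.append(
                (attributes["data-sku"], float(attributes["data-amount"]))
            )


def extract_items(html_text):
    """Return the invoice items found in the given HTML text."""
    parser = InvoiceItemParser()
    parser.feed(html_text)
    return parser.items


items = extract_items(INVOICE_HTML)
total = sum(amount for _, amount in items)
```

Because the items are carried in the markup itself, the same document can be shown in a browser and read by a downstream program without a separate data file.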
The development of HTML5 represents a quantum leap in functionality. The new version has become the “language of the Web” and can also be used as a print version with relatively little effort. Unfortunately, HTML5 and PDF are still seen as competitors, especially when it comes to preserving structural information. That assumption could not be more wrong. After all, HTML5 is the lowest common denominator for channel-independent display and output of documents. PDF will not disappear. Quite the opposite: each format depends on the other. In document processing, for example, HTML5 can be the preliminary step for PDF, because PDF/A will continue to be needed for certain processes such as archiving.
Put an end to the “religious wars!”
Speaking of multi-channel output management, sooner or later the discussion will turn to yet another format that is gaining ground: XSL-FO. This XML-based mark-up language has one major advantage over HTML: It not only supports creation and output of documents regardless of page size, it also features a number of sophisticated functions for page design. XSL-FO enables generating premium print products. In contrast to the XHTML/HTML format, which is particularly good for browser applications, XSL-FO is used especially for printing and archiving documents with many pages.
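What page design means in XSL-FO is easiest to see in a small document. The following Python sketch builds a minimal XSL-FO file with one A4 page master and one block of text. The element and attribute names come from the XSL-FO vocabulary; the master name `A4-portrait`, the font size, and the helper functions are assumptions chosen for illustration. Turning the result into print output would still require an FO processor such as Apache FOP.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def build_fo_document(text):
    """Build a one-block XSL-FO document laid out on an A4 page."""
    root = ET.Element(fo("root"))

    # Page geometry lives in the layout master set, separate from content.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4-portrait",  # hypothetical name
        "page-width": "210mm",
        "page-height": "297mm",
    })
    ET.SubElement(page, fo("region-body"))

    # The page sequence refers to the page master by name.
    sequence = ET.SubElement(
        root, fo("page-sequence"), {"master-reference": "A4-portrait"}
    )
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    block = ET.SubElement(flow, fo("block"), {"font-size": "11pt"})
    block.text = text

    return ET.tostring(root, encoding="unicode")


fo_document = build_fo_document("Account statement")
```

Changing the page size here means editing two attributes on the page master; the content in the flow stays untouched.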
That leaves XML, the W3C-standardized markup language that has become the standard for transferring data from specialist applications to a company’s output instance. XML technologies are now so advanced that no special software components are needed for data extraction. This also applies to the other formats. Whether the output format is AFP, PDF, HTML5, or XSL-FO, current IT solutions now support the popular standards used in modern output management so completely that it should be no problem for a company to develop and establish an overall architecture that supports every scenario, and even keep costs manageable.
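As a rough illustration of data extraction without special software components, the sketch below reads an XML hand-over file with Python's standard library and totals the pages per output channel. The element names (`documents`, `document`, `pages`), the `channel` attribute, and the sample data are invented for this example and do not reflect any particular product's schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical hand-over file from a specialist application.
SAMPLE_XML = """<documents>
  <document channel="print"><recipient>Alice</recipient><pages>3</pages></document>
  <document channel="email"><recipient>Bob</recipient><pages>1</pages></document>
  <document channel="print"><recipient>Carol</recipient><pages>2</pages></document>
</documents>"""


def pages_per_channel(xml_text):
    """Sum the page counts of all documents, grouped by output channel."""
    root = ET.fromstring(xml_text)
    totals = {}
    for document in root.findall("document"):
        channel = document.get("channel")
        pages = int(document.findtext("pages"))
        totals[channel] = totals.get(channel, 0) + pages
    return totals


channel_totals = pages_per_channel(SAMPLE_XML)
```

Figures like these feed directly into the questions of expected document volumes and the ratio of physical to electronic output.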
It’s time to put an end to the “religious wars” over the best format. The decision is a basic one, namely the company’s strategic orientation for document processing. Which communications channels will play a role in future and to what degree? What are the expected document volumes? How will the ratio of physical to electronic documents evolve over time? The answers to these questions will determine which formats make sense and when. They all have strengths as well as weaknesses. It’s the application scenarios that are important. They alone determine the relevance of any given format for output management within a company.