Compart - Document- and Output-Management

Development and Technology

HTML5 Instead of PDF: The End of the US-Letter Format

Compart

Today, Web sites are increasingly designed according to the principles of "responsive design," usually with a "mobile first" approach. Sites are thus built first for mobile devices with small displays and touchscreens, and only then for larger displays such as PC monitors. CSS 3 (Cascading Style Sheets), the style sheet language standardized by the World Wide Web Consortium (W3C), allows the layout to adapt to the given display size.

While a large screen can accommodate information in multiple columns and with illustrations, the smartphone is limited to the most essential information displayed in a single column. This is accomplished via media queries, which apply different style rules to the same HTML document depending on the output medium.
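To make the idea concrete, here is a minimal sketch in Python of a server-side helper that emits an HTML5 page with a mobile-first style sheet. The 600px breakpoint, the class names, and the `build_page` helper are illustrative assumptions, not anything prescribed by the article.

```python
import html

# Mobile-first style sheet: the single-column rules are the default,
# and the media query adds a second column on wider viewports.
# The 600px breakpoint and the class names are illustrative assumptions.
STYLE = """
.content { display: grid; grid-template-columns: 1fr; gap: 1rem; }
.illustration { display: none; }
@media (min-width: 600px) {
  .content { grid-template-columns: 2fr 1fr; }
  .illustration { display: block; }
}
"""


def build_page(title: str, body_html: str) -> str:
    """Wrap trusted body markup in an HTML5 page that carries the style sheet."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n<head>\n<meta charset="utf-8">\n'
        '<meta name="viewport" content="width=device-width, initial-scale=1">\n'
        f"<title>{html.escape(title)}</title>\n"
        f"<style>{STYLE}</style>\n</head>\n"
        f'<body>\n<div class="content">\n{body_html}\n</div>\n</body>\n</html>\n'
    )


page = build_page("Invoice 2024-001", "<p>Amount due: 119.00 EUR</p>")
```

On a narrow display only the default rules apply, so the content stays in one column and the illustration is hidden. Wider viewports pick up the rules inside the `@media` block.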

HTML Future Proofed

HTML5 is the first version of HTML to promote Scalable Vector Graphics (SVG) as the universal vector graphics format, yet another milestone in the retreat from pixel-perfect Web sites that do not always perform well in the age of mobile devices. A small pie chart that requires 32 KB as a PNG bitmap may need only 4 KB as an uncompressed SVG graphic, which in turn can be scaled to any size without loss of quality. Supporting high-resolution displays is no longer a major challenge.
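The size argument can be illustrated with a short Python sketch that writes a pie chart as SVG markup and reports its byte count. The function name, the colors, and the sample values are assumptions chosen for illustration. The resulting size depends on the chart and is not meant to reproduce the 4 KB figure above.

```python
import math


def pie_chart_svg(values, colors, size=200):
    """Return a pie chart as SVG markup. Assumes at least two positive values."""
    total = sum(values)
    cx = cy = size / 2
    radius = size / 2 - 2
    angle = -math.pi / 2  # start at the 12 o'clock position
    paths = []
    for value, color in zip(values, colors):
        sweep = 2 * math.pi * value / total
        x1 = cx + radius * math.cos(angle)
        y1 = cy + radius * math.sin(angle)
        angle += sweep
        x2 = cx + radius * math.cos(angle)
        y2 = cy + radius * math.sin(angle)
        large_arc = 1 if sweep > math.pi else 0
        # Move to the center, line to the arc start, arc clockwise, close.
        paths.append(
            f'<path d="M{cx:.1f},{cy:.1f} L{x1:.1f},{y1:.1f} '
            f'A{radius:.1f},{radius:.1f} 0 {large_arc} 1 {x2:.1f},{y2:.1f} Z" '
            f'fill="{color}"/>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">'
        + "".join(paths)
        + "</svg>"
    )


svg = pie_chart_svg([45, 30, 25], ["#1f77b4", "#ff7f0e", "#2ca02c"])
print(len(svg.encode("utf-8")), "bytes")
```

Because the chart is described by a `viewBox` and paths rather than pixels, the same markup renders sharply at any display size.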

Other innovations introduced with HTML5 include audio and video streaming as well as additional interfaces (JavaScript APIs) for new functions inside the browser, such as querying the geographical position (geolocation) and browser-based data storage. These features open up whole new worlds for Web applications, and plug-ins like Flash, which are often not available on mobile devices, are falling out of favor as a result. HTML5 also appears to have greatly raised awareness of browser interoperability, which significantly cuts overall Web development costs.

In sum, HTML5 and CSS 3 have contributed greatly to the presence of the Internet in everyone's pocket. We buy our train or movie tickets via smartphone more and more often. A steady stream of new payment systems that use Bluetooth or near field communication (NFC) is arriving on the market, and we will probably use our mobile "companion" for purchases even more in the future. The EC card and our good old wallet are facing competition.

Summary

  • HTML5 - an alternative to PDF?
  • Document content matters
  • Digital transformation in document processing

Slow Retreat From the Printed Document

But what does this mean for high-volume document processing? The fact remains that billions of Letter-sized hard copy pages are still being produced every year. We send and receive invoices, delivery slips, dunning letters, insurance policies, notices, etc. in printed form. But B2B commerce is also turning more frequently to electronic document exchange, including document processing (e.g., using formats like EDI or UBL).

Digital delivery is advancing. Most banks now support the receipt of e-invoices. These do not arrive in traditional letter form but are sent electronically to the bank's own e-banking system, where they just need approval for payment. The PDF of the invoice can be downloaded via a link. Some invoice issuers have already begun charging additional fees for paper invoices.

Many large firms offer portals that allow their customers to view their most recent invoices and download them as PDF files. The problem is that even when these portals are optimized for mobile devices, the customer has to request access to each. Besides navigating the inherent password jungle, this also entails a lengthy 'hunting and gathering' session for documents from all the possible sources.

Furthermore, as the above examples show, 8½ × 11 inches (US Letter) was and remains the base format, which is hardly ideal for mobile devices. Viewing a PDF file on a smartphone or a small tablet is simply no fun, which makes HTML5 the ideal candidate for such documents. Yet that would mean offering three format options for every document: XML for raw data, HTML5 for mobile devices, and PDF for traditional letter-sized output.
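As a rough sketch of what serving several formats from one data source could look like, the following Python example renders the same invoice record as raw XML and as an HTML5 page. The field names and the `to_xml` and `to_html` helpers are hypothetical and do not follow EDI or UBL. PDF output is omitted because it would need a third-party library.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical invoice record; the field names are illustrative only.
invoice = {
    "number": "2024-001",
    "customer": "John Smith",
    "total": "119.00",
    "currency": "EUR",
}


def to_xml(record: dict) -> str:
    """Serialize the record as raw data, one element per field."""
    root = ET.Element("invoice")
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def to_html(record: dict) -> str:
    """Render the same record as a small HTML5 page for on-screen viewing."""
    rows = "".join(
        f"<tr><th>{html.escape(key)}</th><td>{html.escape(value)}</td></tr>"
        for key, value in record.items()
    )
    title = html.escape(f"Invoice {record['number']}")
    return (
        '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">'
        f"<title>{title}</title></head>"
        f"<body><table>{rows}</table></body></html>"
    )


print(to_xml(invoice))
print(to_html(invoice))
```

Keeping the record separate from its renderings means each additional output channel only adds a rendering function, not a second copy of the data.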

This, in turn, incurs additional development costs for HTML if print output also needs to be accommodated. In certain cases, these costs can be reduced by converting PDF files to HTML, although that process does not always produce a satisfactory result. Not all layouts are amenable to automatic conversion to a readable HTML display.

Will HTML5 Replace PDF?

Wouldn't it be great to produce documents, right from the beginning, for intelligent display over every device or channel? Or even better, to receive payment information in a standardized format (e.g., XML or EDI/XML) that supports direct transfer of funds from your smartphone or file forwarding to an e-banking system?

In a perfect world, such documents would not depend on Internet access, since you may want to view a document two years later, after company "X" has changed its name to "Y" and moved to a new URL. That means that CSS style sheets, pictures, signatures, and even the JavaScript code would need to be embedded in the file.
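One way to embed such resources is to inline the style sheet and encode images as `data:` URIs. The Python sketch below shows the idea. The helper names and the tiny SVG logo are assumptions for illustration, and signatures and JavaScript are left out.

```python
import base64
import html


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode binary content as a data: URI so it can live inside the HTML file."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def self_contained_page(title: str, css: str, logo: bytes, body_html: str) -> str:
    """Return a single HTML5 file with the style sheet and logo embedded."""
    logo_uri = to_data_uri(logo, "image/svg+xml")
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n<head>\n<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        f"<style>{css}</style>\n</head>\n<body>\n"
        f'<img alt="Logo" src="{logo_uri}">\n'
        f"{body_html}\n</body>\n</html>\n"
    )


# Placeholder logo; a real document would read the image bytes from a file.
logo = (
    b'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10">'
    b'<circle cx="5" cy="5" r="4"/></svg>'
)
page = self_contained_page(
    "Invoice 2024-001", "body { font-family: sans-serif; }", logo, "<p>Total: 119.00 EUR</p>"
)
```

The resulting file can be opened years later without fetching anything from the sender's server. The cost is a larger file, since Base64 encoding inflates binary content by about a third.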

What Matters Is the Content

A digital signature is the best fit for this scenario, both to verify the sender and to check that the file has not been changed. The document also needs to be fully accessible to people with physical and cognitive limitations. The EPUB (electronic publication) format is one standard that already takes these requirements into account. Although it is seen primarily as an eBook format, it would be well suited to multi-channel display of any type and format of document. For instance, the current EPUB 3.0 version supports the declaration of alternative presentations of one and the same document (e.g., visual HTML5 and raw data as XML). Even a PDF could be embedded in a ZIP-based container, though that is hardly a sensible alternative.
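The container idea can be sketched with Python's standard `zipfile` module. The example follows the EPUB convention that the first archive entry is an uncompressed `mimetype` file, but it is not a complete, valid EPUB: the `META-INF/container.xml` file and the package document that EPUB requires are omitted, and the file names are hypothetical.

```python
import io
import zipfile


def build_container(files: dict[str, bytes]) -> bytes:
    """Pack alternative presentations of one document into a ZIP-based container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # EPUB convention: "mimetype" comes first and is stored uncompressed.
        archive.writestr(
            zipfile.ZipInfo("mimetype"),
            "application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        for name, data in files.items():
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


# Hypothetical file names: one visual rendering and one raw-data rendering.
container = build_container(
    {
        "OEBPS/invoice.xhtml": b"<html><body><p>Total: 119.00 EUR</p></body></html>",
        "OEBPS/invoice.xml": b"<invoice><total>119.00</total></invoice>",
    }
)

with zipfile.ZipFile(io.BytesIO(container)) as archive:
    print(archive.namelist())
```

A signature over the container, or over the individual files inside it, would then cover every presentation of the document at once.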

In the final analysis, the way documents are sent is not so important. An email attachment suffices, although the sender cannot reliably determine whether the document has arrived. De-Mail, IncaMail, or standardized HTTPS interfaces exposed by online archives are certainly alternatives. You would need to define only the URL that the invoicer should use for your documents, e.g., "mailto:john.smith@test.org" or "https://onlinearch.iv/john.smith".

This is just one scenario of how HTML5 could affect output management in the future. The fact remains that document exchange is more about content and less about presentation; ultimately it's the data that counts. Presentation is merely a means to an end, and HTML5 covers a lot of territory; so much so, that it will inevitably turn traditional print-oriented output management on its head.

Features of HTML5 at a Glance

Extension of layout-related elements

  • Stronger separation of semantics and layout (CSS)
  • Stringent markup of selected sections of a web site
  • Additional elements for frequently used page areas such as <footer> and <section>

SVG

Scalable Vector Graphics (SVG) is a specification recommended by W3C for creating complex two-dimensional vector graphics in documents. Because SVG is an XML-based format, the content of SVG files is easily accessible for computer-assisted translation and other downstream processes. They can also be edited directly in a text editor.
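Because SVG is plain XML, a downstream process can read its text labels with any XML parser. The following Python sketch collects the contents of all `<text>` elements, for example to hand them to a translation tool. The sample graphic and the `extract_text` helper are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Illustrative sample graphic with two text labels.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 100">
  <rect x="10" y="10" width="80" height="40" fill="#1f77b4"/>
  <text x="20" y="35">Revenue</text>
  <rect x="110" y="10" width="80" height="40" fill="#ff7f0e"/>
  <text x="120" y="35">Costs</text>
</svg>"""


def extract_text(svg_source: str) -> list[str]:
    """Return the text content of every <text> element, in document order."""
    root = ET.fromstring(svg_source)
    return [
        "".join(node.itertext()).strip()
        for node in root.iter(f"{SVG_NS}text")
    ]


print(extract_text(SAMPLE))  # ['Revenue', 'Costs']
```

The same approach works in reverse: replacing the text of those nodes and serializing the tree again yields a translated graphic without touching the drawing itself.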

MathML

Mathematical Markup Language (MathML) is a format for depicting mathematical formulas on the Internet.

Canvas

Programmers use the <canvas> element to generate precise pixel-based graphics in the browser window. Driven by JavaScript, Canvas can render complex animations, games, and dynamic business graphics that previously required the Adobe Flash plug-in.

Video

With the new <video> element, videos can be embedded in web sites without having to use external plug-ins such as Apple QuickTime or Adobe Flash Player.

Geolocation

The new Geolocation JavaScript API enables a web site to determine the location of the user accessing it, typically from a mobile device. This allows location-based services to be offered, for example showing the user nearby businesses or his or her own position on a map.
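The browser side of this is a JavaScript call. What a location-based service does with the reported coordinates can be sketched in Python: the example below uses the haversine formula to find places within a given radius. The function names, the sample places, and the coordinates are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(user, places, radius_km):
    """Return the names of places within radius_km of the user, nearest first."""
    user_lat, user_lon = user
    hits = []
    for name, lat, lon in places:
        d = distance_km(user_lat, user_lon, lat, lon)
        if d <= radius_km:
            hits.append((d, name))
    return [name for _, name in sorted(hits)]


# Hypothetical data: a user position as reported by the browser, and two places.
user_position = (52.5200, 13.4050)
places = [
    ("Corner cafe", 52.5210, 13.4060),
    ("Distant branch", 48.1370, 11.5750),
]
print(nearby(user_position, places, radius_km=5))
```

Treating the Earth as a sphere is accurate enough for a "what is near me" list. Applications that need survey-grade distances would use an ellipsoidal model instead.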

Offline Web applications

Web sites that are also usable offline can be developed using HTML5. The web server just has to let the user’s browser know what data needs to be downloaded. The data is synchronized automatically as soon as the user is back online.

Microdata

Microdata lets authors annotate web site content with additional, machine-readable semantic information. Contact information marked up this way can, for example, be converted into a vCard.
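A minimal Python sketch of that conversion follows. It reads `itemprop` attributes from flat markup and emits a vCard 4.0 record. The sample markup, the property names (taken from the schema.org Person vocabulary), and the helper names are assumptions. Nested items and vCard value escaping are not handled.

```python
from html.parser import HTMLParser

# Illustrative contact markup; the phone number is a placeholder.
SAMPLE = """
<div itemscope itemtype="https://schema.org/Person">
  <span itemprop="name">John Smith</span>
  <span itemprop="telephone">+1 555 0100</span>
  <a itemprop="email" href="mailto:john.smith@test.org">john.smith@test.org</a>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collect the text of elements that carry an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        self._current = dict(attrs).get("itemprop")

    def handle_data(self, data):
        if self._current and data.strip():
            self.properties[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


def to_vcard(properties: dict) -> str:
    """Build a vCard 4.0 record; only FN is mandatory besides VERSION."""
    lines = ["BEGIN:VCARD", "VERSION:4.0", f"FN:{properties['name']}"]
    if "telephone" in properties:
        lines.append(f"TEL:{properties['telephone']}")
    if "email" in properties:
        lines.append(f"EMAIL:{properties['email']}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"


parser = MicrodataParser()
parser.feed(SAMPLE)
print(to_vcard(parser.properties))
```

The point of the markup is that the page stays readable for people while the `itemprop` annotations give software an unambiguous way to pick out the name, phone number, and email address.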