Why PDF? – Future Text Publishing

I am a graphic designer by training. As a graphic designer, I dislike Visual-Meta because it is ugly. It’s ugly! I don’t want that ugly stuff in my beautiful publications, and I can’t imagine that other book designers would want it in their books either.

The small text size and the structure of Visual-Meta show that it is primarily intended to be read by a computer. If that is so, we should put it in a PDF’s XML metadata (or a similar format that is not part of the rendered page). That keeps the ugly stuff off our beautifully rendered pages while leaving it readable by software applications.

Accessibility is not a good reason to put Visual-Meta on the rendered pages, since a PDF’s XML metadata is just as accessible to a computer. It is even accessible to a human, who can open the file in a plain-text editor if the metadata is left uncompressed and unencrypted, which is possible. Tagged PDFs are another example of this principle: their structure tags live in the file, not on the rendered page.
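To make the point concrete, here is a minimal Python sketch of how software could read that metadata without touching the rendered pages. It assumes the metadata is an XMP packet stored uncompressed and unencrypted in UTF-8, and that the title is recorded with Dublin Core; the function names and the sample bytes are invented for illustration and are not taken from any real publication.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URIs used by XMP for Dublin Core fields and RDF containers.
DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def extract_xmp_packet(pdf_bytes):
    """Return the first uncompressed XMP packet in the raw bytes, or None.

    Assumption: the packet is stored as plain UTF-8 text between the
    standard <?xpacket begin ...?> and <?xpacket end ...?> markers.
    """
    match = re.search(
        rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>",
        pdf_bytes,
        re.DOTALL,
    )
    if match is None:
        return None
    return match.group(1).decode("utf-8")


def dublin_core_titles(xmp_xml):
    """Collect the text of every rdf:li found inside a dc:title element."""
    root = ET.fromstring(xmp_xml)
    titles = []
    for title in root.iter("{%s}title" % DC):
        for item in title.iter("{%s}li" % RDF):
            if item.text:
                titles.append(item.text.strip())
    return titles


# Hypothetical stand-in for the bytes of a PDF file with uncompressed metadata.
SAMPLE = (
    b"%PDF-1.7\n1 0 obj\n<< /Type /Metadata /Subtype /XML >>\nstream\n"
    b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b'<dc:title><rdf:Alt><rdf:li xml:lang="x-default">Example Book</rdf:li>'
    b"</rdf:Alt></dc:title>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>\n"
    b'<?xpacket end="w"?>\nendstream\nendobj\n'
)

if __name__ == "__main__":
    packet = extract_xmp_packet(SAMPLE)
    print(dublin_core_titles(packet))
```

A real toolchain would more likely use a PDF library, which can also handle compressed metadata streams. The sketch only shows that nothing has to be printed on the page for the data to be machine-readable.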

Much publishing today, perhaps even most publishing, happens in a variety of formats. Publishers have toolchains that automatically transform the same information into multiple output formats. Academic publishers make their works available as Web pages, EPUB files, PDF files, and other formats. Open standards for metadata should be as inter-translatable between these formats as possible.

HTML is not just a Web technology, not just for connections between Web sites; it is also used in non-Web publishing toolchains, including for producing PDFs (Prince XML, WeasyPrint, etc.). We can make HTML documents without external links and embed semantic data in HTML with RDFa and other formats. HTML is used as the basis for file formats that do not require a Web browser (for example, EPUB: open it in your ebook reader, or unzip it and open it in your Web browser). I expect standalone HTML file formats such as EPUB to continue to evolve in exciting ways while adhering to open standards. For all these reasons, it is not true that PDF is somehow more suited to long-term storage than HTML and XML formats.
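As a rough illustration of semantic data embedded in a standalone HTML document, the Python sketch below reads RDFa-style `property` attributes using only the standard library. It is a simplification, not a conforming RDFa processor: it ignores `vocab`, prefixes, and nesting, and the class name, function name, and sample document are invented for this example.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from RDFa-style attributes.

    Simplification: the value is the `content` attribute when present,
    otherwise the element's text. Nested elements with the same tag
    name are not handled.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open = None  # (property, tag) whose text is being collected
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        if "content" in attributes:
            self.pairs.append((prop, attributes["content"]))
        else:
            self._open = (prop, tag)
            self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[1]:
            self.pairs.append((self._open[0], "".join(self._text).strip()))
            self._open = None


def extract_properties(html_text):
    """Return all (property, value) pairs found in the HTML text."""
    collector = PropertyCollector()
    collector.feed(html_text)
    collector.close()
    return collector.pairs


# Hypothetical standalone document with no external links.
SAMPLE_HTML = """
<html><head><title>Example</title></head>
<body vocab="http://schema.org/" typeof="Book">
<h1 property="name">Example Book</h1>
<span property="author">A. Writer</span>
<meta property="datePublished" content="2022-01-01">
</body></html>
"""

if __name__ == "__main__":
    for prop, value in extract_properties(SAMPLE_HTML):
        print(prop, "=", value)
```

The same three properties could be carried into an EPUB or a PDF's metadata by a publishing toolchain, which is the kind of inter-translatability argued for here.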

I am betting against Visual-Meta (which I have never seen used except in your books) and would urge you to find a better solution. My advice is to get rid of the visible ugliness of Visual-Meta and pay more attention to the embedded semantic metadata formats that are already widely used in HTML/XML and that can be used in all formats in an easily inter-translatable way.

I could say more, but I hope this is helpful even if you disagree.