Last winter I started a first attempt at identifying preservation risks in PDF files using the Apache Preflight PDF/A validator. This work was later followed up by others in two SPRUCE hackathons in Leeds (see this blog post by Peter Cliff) and London (described here). Much of this later work tacitly assumes that Apache Preflight is able to successfully identify features in PDF that are a potential risk for long-term access. This Wiki page on uses and abuses of Preflight (created as part of the final SPRUCE hackathon) even goes as far as stating that “Preflight is thorough and unforgiving (as it should be)”.

But what evidence do we have to support such claims? The only evidence that I’m aware of is the set of results obtained from a small test corpus of custom-created PDFs. Each PDF in this corpus was created in such a way that it includes only one specific feature that is a potential preservation risk (e.g. encryption, non-embedded fonts, and so on). However, PDFs that exist ‘in the wild’ are usually more complex. Also, the PDF specification often allows you to implement similar features in subtly different ways. For these reasons, it is essential to obtain additional evidence of Preflight’s ability to detect ‘risky’ features before relying on this tool in any operational setting.

Shortly after I completed my initial tests, Adobe released the Acrobat Engineering website, which contains a large volume of test documents that are used by Adobe for testing their products. Although the test documents are not fully annotated, they are subdivided into categories such as Multimedia & 3D Tests and Font tests. This makes these files particularly useful for additional tests on Preflight.

The general methodology I used to analyse these files is identical to what I did in my 2012 report: first, each PDF was validated using Apache Preflight. As a control I also validated the PDFs with the Preflight component of Adobe Acrobat, using the PDF/A-1b profile. The table below lists the software versions used:

[Table not recovered. Column header: Software.]

Re-analysis of PDF Cabinet of Horrors corpus

Because the current analysis is based on a more recent version of Apache Preflight than the one used in the 2012 report (which was 1.8.0), I first re-ran the analysis of the PDFs in the PDF Cabinet of Horrors corpus. The main differences with respect to that earlier version are:

- Apache Preflight now has an option to produce output in XML format (as suggested by William Palmer following the Leeds SPRUCE hackathon).
- Better reporting of non-embedded fonts (see also this issue).
- Unlike the earlier version, Preflight 2.0.0 does not give any meaningful output in case of encrypted and password-protected PDFs! This is probably a bug, for which I submitted a report here.

Since the Acrobat Engineering site hosts a lot of PDFs, I only focused on a limited subset for the current analysis:

- all files in the General section of the Font Testing category.
- all files in the Classic Multimedia section of the Multimedia & 3D Tests category.

The results are summarized in two tables (see next sections). For each test file, the tables list:

- the error(s) reported by Adobe Acrobat Preflight.
- the error code(s) reported by Apache Preflight (see Preflight’s source code for a listing of all possible error codes).
- the error description(s) reported by Apache Preflight in the details output element.

For the sake of readability, the tables only list those error messages/codes that are directly related to font problems, multimedia, encryption and JavaScript. The full output for all tested files can be found here.

The table below summarizes the results of the PDFs in the Font Testing category:

[Table not recovered. Column header: Test file. Surviving error messages:]
- Font not embedded (and text rendering mode not 3)
- Glyphs missing in embedded font
- TrueType font has differences to standard encodings but is not a symbolic font
- Wrong encoding for non-symbolic TrueType font
- Invalid Font definition, FontFile entry is missing from FontDescriptor for HeiseiKakuGo-W5
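The filtering step described in this post (keeping only the Apache Preflight error codes and details that relate to a particular kind of problem, such as fonts) could be sketched in Python roughly as follows. This is a minimal illustration, not the script used for the analysis. The XML layout (`error`, `code` and `details` elements), the sample document, the pairing of code `3.1.3` with the font message, and the idea that font-related codes share the prefix `3.` are all assumptions made for the example; check them against the actual Preflight output and source code before relying on them. The function name `errors_with_prefix` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of Apache Preflight XML output. The element names and
# the error codes are assumptions for illustration, not a verified schema.
SAMPLE_XML = """<preflight name="example.pdf">
  <isValid type="PDF/A1-b">false</isValid>
  <errors count="2">
    <error count="1">
      <code>3.1.3</code>
      <details>Invalid Font definition, FontFile entry is missing from FontDescriptor for HeiseiKakuGo-W5</details>
    </error>
    <error count="1">
      <code>7.1</code>
      <details>Hypothetical metadata problem</details>
    </error>
  </errors>
</preflight>"""


def errors_with_prefix(xml_text, prefix="3."):
    """Return (code, details) pairs whose error code starts with `prefix`.

    The default prefix "3." is assumed to correspond to font-related errors.
    """
    root = ET.fromstring(xml_text)
    matches = []
    for error in root.iter("error"):
        # findtext returns None when the child element is absent
        code = (error.findtext("code") or "").strip()
        details = (error.findtext("details") or "").strip()
        if code.startswith(prefix):
            matches.append((code, details))
    return matches


if __name__ == "__main__":
    for code, details in errors_with_prefix(SAMPLE_XML):
        print(code, details)
```

Matching on a code prefix rather than on the free-text details keeps the filter stable if the wording of a message changes between Preflight versions, at the cost of depending on the code numbering staying the same.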