***apologies for cross-posting***

 

Dear all,

I am currently working on a research project for BDLSS at the University of Oxford. The project is looking to identify which flavour(s) of PDF/A will best suit the content and repository needs for theses ingested into the Oxford University Research Archive.

My approach has been to conform PDF files to different flavours of PDF/A using Adobe Acrobat [2015] and pdfaPilot [v. 7]. Alternatively, I convert files of other types (e.g., docx, doc) to PDF/A flavours using Adobe Acrobat [2015], pdfaPilot [v. 7], LibreOffice [v. 5.2], PDF Studio [v. 12], and PDF/A Live [v. 6.2]. For born-digital documents, I create or conform to PDF/A-2a. For digitized documents, I conform to PDF/A-1b. When I validate with veraPDF, many of the PDF/A-2a documents fail due to the presence of non-Unicode characters. The theses in the collection that contain non-Latin characters include scientific papers with mathematical formulas and language papers with non-Latin scripts.

Since so many files fail validation with veraPDF, my project team and I are considering whether to disregard some aspects of non-conformance.

I would love to hear about other institutions’ approaches to PDF/A validation, as well as any issues you have encountered in the process of PDF/A creation and conformance.

· Has your institution integrated veraPDF into the workflow for PDF/A validation?

· If a PDF/A does not validate, what non-conformances does your institution allow? (Please provide a list of exceptions; e.g., glyph-related non-conformances)

Thank you in advance for any institutional practices you are willing to share to assist my research.

Cheers,

Anna

 

----

Anna Oates

MSLIS Candidate, University of Illinois at Urbana-Champaign

NDNP Coordinator Graduate Assistant, Preservation Services