PDF Association "The Meeting-Place of the Global PDF Industry"
Portable Document Format (PDF)
is the world’s chosen final-form digital document format.
Founded in 2006, the PDF Association promotes understanding, adoption and
implementation of International Standards for PDF-based technology.
Ever since Adobe transferred the PDF technology for ISO standardization
PDF is further developed in a completely open specification rather
than as a proprietary implementation.
The PDF Association is a global industry initiative for developers of PDF solutions;
companies that work with PDF in document management systems (DMS) and
enterprise content management (ECM) systems; interested individuals; and
users who want to implement PDF technology in their organizations.
As the leading industry and technical body for the PDF industry,
the PDF Association knows more about PDF and what PDF can deliver
than anyone else.
The PDF Association uniquely influences the future of smart standards:
The standardization of PDF is a joint effort between PDF Association and ISO:
"ISO: International Standards offer practical solutions we all can stand behind.
By using international standards, you become a part of the solution":
The PDF Association supports the Theme of World Standards Day 2021:
Mission Statement:
"Delivering a vendor-neutral platform for developing
open specifications and standards for PDF technology"
Participate in the continued development of the PDF technology
PDF Association, Membership and Benefits
Wrap-up / The Years in PDF
After 30: PDF’s bright future - Continues (December 21, 2023)
- PDF turns 30 years:
Podcast: SE Radio interview on 30 years of PDF (October 13, 2022)
PDF Week 2024, Prague, November 4--8, 2024
PDF Week Fall 2023, San Francisco, October 16--20, 2023
PDF Week, Paris, May 2--5, 2023
2022 in PDF, and what's coming in 2023
2021 in PDF, and what's coming in 2022
2020 saw a new high-point for PDF’s mindshare worldwide
Upcoming and earlier events
PDF Association helping government and businesses to understand
the Portable Document Format (PDF)
From recommendations to regulations,
PDF Association is here to help.
The PDF Association engages in many activities as it
follows its mission of promoting the adoption of
ISO standardized PDF technology around the world.
PDF technology is a pervasive feature of the world's
communications infrastructure.
With a unique and unmatched feature-set;
no other technology comes close.
We're not going back to paper, so it's long past time for
governments and businesses to focus just a little on
this ubiquitous format that's never going away.
The PDF Association provides information and resources to
government agencies and regulators to help them develop
reference materials, guidelines, regulations and laws.
The significance of PDF for Information Governance (IG)
Better PDF = Better IG
In 2010, Gartner defined Information Governance (IG) as:
"the specification of decision rights and an accountability
framework to encourage desirable behavior in the valuation,
creation, storage, use, archival and deletion of information.
It includes the processes, roles, standards and metrics that
ensure the effective and efficient use of information in enabling
an organization to achieve its goals".
Today PDF, and PDF/A in particular, plays a critical role as
the standardized digital file format of choice for storing and
managing large quantities of an organization’s information.
The modern workplace of today is saturated with web technology.
From end-user technology (websites, office applications,...)
to enterprise-spanning systems to infrastructure and social media.
The range of information inputs, flows, products and systems
within organizations, large and small, continues to expand in
both volume and diversity.
The challenges for information producers, users, analysts,
managers and retainers of information in all businesses
continue to multiply.
For organizations, some pretty fundamental questions
are beginning to nag, such as:
- "When does a web page become a record?"
- "Does a screen-shot of a web-page constitute a record?"
- "The website doesn’t look like that anymore!"
PDF persists as the choice for formal documents.
Accelerated commitments to web- and cloud-based technologies
do not dampen the need for digital documents;
paginated, deliverable content that works everywhere,
whether on a local computer or in the cloud.
PDF remains the medium of choice for formal documents,
graphically-rich content, and any content destined for
print or publishing.
Indeed, PDF is really the ONLY choice;
there’s simply no other general-purpose reliable
digital document format, nor any challenger on the horizon.
Think PDF, think big, and get more from PDF.
Think about how much your organization relies on PDF
for documents, contracts, invoices, presentations, receipts,
reports, case files, archives and in many other roles.
Ask yourself:
- Are we creating the best PDFs we can to meet our IG goals?
- How could we create better PDFs to guarantee reliability and authenticity,
make content easier to find and reuse, make it more accessible
for users with disabilities, and more?
The significance of PDF for Information Governance (IG):
Initiative Linking Research and Industry.
Open Invitation to Academic Research Institutions:
Interesting?
PDF Association Industry Working Groups
(earlier named "PDF Association Competence Centers")
Maintains and further develops PDF and PDF substandards
The original PDF/A Competence Center was founded in 2006 with
the goal of establishing a common interpretation of ISO 19005.
In 2011 the PDF/A Competence Center became
the PDF Association, overseeing all standards based on PDF technology,
each with a dedicated Competence Center.
These Competence Centers have since
evolved into separate and even more focused
Industry Working Groups (WGs):
- technical functions (Technical Working Groups/TWGs)
Engaged with developing or maintaining technical specifications
or guidance on the interpretation thereof.
Open to all PDF Association members.
- liaison functions (Liaison Working Groups/LWGs)
Limited to a specific task or the concerns of a vertical marketplace.
LWGs are open to non-members such as end users who would not
otherwise join the PDF Association.
- marketing functions (Marketing Working Groups/MWGs)
Considers end-user awareness and education.
Open to all PDF Association members.
These working groups are "the workshops" where the actual executive work
with a specific PDF standard or application area takes place.
Industry Working Groups operate on the basis of interest without a specific remit.
They have grown to include a variety of objectives, including:
- Promoting exchange between developers focusing
on various subdomains
- Oversight and policies for industry-accepted validation software;
such as veraPDF
- Research and development of new PDF extensions and use cases
- Development of industry standards, best practices,
test suites and other aids to interoperability
- Developing informational resources for PDF developers and users
PDF Association Technical Resources
Glossaries for PDF
Get to know the language, acronyms and terms of PDF.
The PDF glossaries below help to make
PDF Association's technical resources
more understandable and navigable for
the less technically experienced user
as well as the knowledgeable developer.
These glossaries describe commonly encountered
acronyms and terms for PDF and PDF-based substandards
(like PDF/UA for Accessible PDF, PDF/X for print, and
PDF/A for archiving and long-term preservation)
with easily understood lay-person definitions.
Typical non-technical users, such as ordinary end-users,
consumers, and business decision makers will benefit
from these glossaries when encountering
the more technical aspects of PDF,
when communicating with PDF vendors and/or
design/production/communication agencies,
or when they need to understand ISO standards
for various PDF-based technologies:
Free PDF Peer-Review Service
Initiative Linking Research and Industry
Open Invitation to Academic Research Institutions
To help both academic and industry researchers achieve
high-quality and accurate PDF-oriented research outcomes,
the PDF Association is now making available a new
free peer-review service.
This service will link acknowledged experts in the PDF file format with
journal editors, academic publishers, conference steering committees and
researchers to provide expert peer-review of pre-print/pre-publish articles,
whitepapers and presentations in relation to statements made about
PDF format and PDF technology.
PDF technology per se is, of course, not an academic domain.
Nonetheless, every year many universities and research organizations
publish papers and present research work that focuses on the format
from across a diverse range of domains that utilize PDF.
Topics range from the more obvious software engineering,
cyber-security, accessibility, data mining, archival studies,
and document understanding to specialized areas of
health informatics, medicine, and education.
Most research publications are oriented towards making unique
contributions within their primary domain, however in some cases,
the research is weakened by a lack of PDF knowledge and expertise.
This is understandable, especially when the research is conducted
by academics who lack deep experience with PDF and are not
PDF experts themselves.
This can result in papers with shortcomings such as:
- misunderstandings about PDF lexical rules, syntax and features;
- referencing out-of-date PDF specifications;
- relying on incorrect information from previously published work;
- being unaware of specialized PDF publications;
- use of old or incomplete implementations;
- limitations in the design and selection of PDF-based corpora, and
- confusion between PDF as a file format specification
and behavior of specific implementations.
As a consequence,
conclusions and future areas for research are often weakened.
But this is precisely where PDF experts, such as PDF Association members,
can “cross pollinate” and assist researchers to create better and
more relevant research outcomes for the benefit of everyone.
The Portable Document Format (PDF)
Overview and introduction to ISO Standards based on PDF Technology,
information material for download, and recorded webinars
PDF, PDF/A, PDF/X, PDF/UA, PDF/E, PDF/VCR, PDF/VT, 3D PDF,...
and PDF 2.0, the new core PDF standard on which
other PDF substandards have been based since 2020
PDF: The de facto Digital Document Technology
PDF is a basic technology for digital documents with a wide range of uses
PDF today
The Portable Document Format (commonly known as “PDF”)
is a file format developed in the early 1990s as a way to share
computer documents, including text formatting and inline images.
PDF technology was designed to allow for presentation of documents
independent of the application software, operating system and
hardware used to create them.
PDF, designed as a general-purpose,
page-based digital document technology,
is the world’s chosen digital document format,
with applications far beyond conveying rendered pages.
PDF files encapsulate a complete description of a fixed-layout document,
including the text, fonts, graphics, and other information needed to display it.
PDF files may also include a wide variety of other content,
from hyperlinks to metadata to logical structure to JavaScript and
attached files, that allow the format to meet a wide variety of
functional and workflow requirements for digital documents.
Today PDF spans workflows in publishing, manufacturing,
financial services, government, accounting, litigation,
human-resources, logistics and many others,
by users on every continent.
Today, PDF is the quintessential and ubiquitous "digital document",
with billions made each year.
Adobe Document Cloud alone opens more than 300 billion PDF files each year.
PDF just works!
PDF is developed to fit into workflows.
The format’s innate ability to glide through and between multiple
workflows, a function of its built-in essential "self-contained-ness",
is unique and critical to its success.
That’s because PDF embodies fundamental ideas about what’s
important in communications, ideas that led to the invention of
writing, then paper, then PDF.
These ideas are so basic that we don’t really have good words for them.
Users think about the document’s contents, not the document itself.
Ask them about PDF and they’ll say;
"it’s easy", "it looks the same", "it’s reliable", and so on.
PDF just works and is suited to a wide range of purposes,
as reflected in the broad choice of software that creates and uses PDF files.
The range is eclectic;
not just software for server, nor desktop,
nor accessibility or print, or security,
but some, all, and none of the above.
PDF is a page-based technology,
thus the term "page" is frequently used.
But many sectors and PDF applications don’t
print, share or publish pages.
For example, you may be designing for labels, packaging,
industrial print, book covers, signs, or even textiles.
So, if you are not printing, sharing or publishing pages,
you can translate "pages" to the format most
appropriate for your use case.
PDF is a cross-section of means for addressing diverse business
processes and workflows; for using digital documents to solve
problems, reduce costs, invent new solutions and enable other
opportunities in every activity and industry sector.
The one commonality; they do it with PDF.
ISO Standards based on PDF Technology
Figure: PDF - ISO 32000 itself is a Standard
PDF Substandards for particular use
Figure: PDF Substandards for particular use PDF/A, PDF/X, PDF/UA, PDF/E, PDF/VCR, PDF/VT,...
PDF standards explained;
with a focus on the newest (Sept. 15, 2021)
Can a PDF easily comply with PDF/A, PDF/X and PDF/UA?
PDF standards are not mutually exclusive (March 15, 2018)
Yes, of course! A PDF can easily comply with
PDF/A, PDF/X and PDF/UA (June 13, 2018)
A PDF document can simultaneously meet several standards,
for example both PDF/UA and PDF/A, both PDF/A and PDF/X,
or all three of PDF/UA and PDF/A and PDF/X.
Practical use case - Long-term archiving of accessible PDF:
An accessible PDF has to comply with PDF/UA.
PDF/UA has a concise set of rules and should
be used to test whether a PDF file is accessible.
For long-term archiving of this accessible PDF,
the PDF is also required to comply with at least
PDF/A-2a, or PDF/A-2b, or PDF/A-1b.
PDF/A-2 is much more in line with the requirements for PDF/UA.
The combination of the two is just easier to achieve and
more logical than it would be with PDF/A-1.
PDF/A-1 is no longer recommended because it is a standard that
was designed based on what was possible almost 20 years ago.
IT and PDF standards have evolved a lot since then!
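A PDF file states which standards it claims to meet in its XMP metadata.
The following is a minimal sketch, not a validator: it only reads the
claims from the raw bytes of an XMP packet. The property names
pdfaid:part, pdfaid:conformance and pdfuaid:part come from the PDF/A and
PDF/UA identification schemas; pdfxid:GTS_PDFXVersion is assumed here for
PDF/X-4 and later. The function name, the sample packets and the simple
regular expression are illustrative assumptions.

```python
import re

# XMP properties that carry conformance claims (see the lead-in for caveats).
CLAIM_PROPERTIES = {
    "PDF/A part": rb"pdfaid:part",
    "PDF/A conformance": rb"pdfaid:conformance",
    "PDF/UA part": rb"pdfuaid:part",
    "PDF/X version": rb"pdfxid:GTS_PDFXVersion",
}

# Matches element form (<p:name>value<) and attribute form (p:name="value").
VALUE = rb"""(?:>|=["'])([^<"']+)"""


def declared_standards(data: bytes) -> dict[str, str]:
    """Return the conformance claims found in raw XMP bytes."""
    found = {}
    for label, prop in CLAIM_PROPERTIES.items():
        match = re.search(prop + VALUE, data)
        if match:
            found[label] = match.group(1).decode("utf-8", "replace").strip()
    return found


# Hypothetical XMP fragment claiming PDF/A-2a and PDF/UA-1 at the same time.
SAMPLE_XMP = b"""<rdf:Description rdf:about=""
    xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
    xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
  <pdfaid:part>2</pdfaid:part>
  <pdfaid:conformance>A</pdfaid:conformance>
  <pdfuaid:part>1</pdfuaid:part>
</rdf:Description>"""

if __name__ == "__main__":
    print(declared_standards(SAMPLE_XMP))
```

A claim found this way only shows what the file declares. Whether the
file actually conforms has to be checked with validation software
such as veraPDF.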
ISO standards based on PDF technology - details
Family tree of
PDF specifications and related ISO standards
Source:
The German Printing and Media Industries Federation (BVDM)
|
PDF Association Flyers
Introduction and Overview of PDF-based ISO Standards
PDF Association - PDF/A - PDF/UA - NVDA Goes PDF/UA - PDF/VT
Click on any picture below for download
or order your own printed copy free of charge from
NewFormat AB
|
Promoting ISO Standards for PDF Technology
|
PDF/A - ISO 19005: Standards for long-term digital archiving of digital documents
|
PDF/UA - ISO 14289-1: The standard for universally accessible PDF documents and PDF forms
|
NVDA Screen Reader Goes PDF/UA
|
PDF/VT - ISO 16612-2: The PDF Standard for Personalized Print
|
PDF 2.0 - ISO 32000-2
Interop Workshops 2017 Preparing for the Next-Generation PDF
|
Recommended reading on PDF Standards Free Booklet Downloads
Click on any picture below for download
or order your own printed copy free of charge from
NewFormat AB
The ISO Standard PDF/A - Long-term Preservation/Archiving
From PDF/A-1 to PDF/A-3
(May 13, 2013)
The ISO Standard PDF/UA - Accessible PDF documents
(Aug. 2, 2013)
The ISO Standard PDF/X - PDF for Printing
(May 16, 2017)
PDF in Manufacturing
The future of 3D documentation
(May 13, 2020)
PDF is at the heart of manufacturing and engineering communications.
PDF technology supports manufacturing worldwide, conveying ideas,
plans, communications, agreements, specifications, contracts…
and of course, 2D and 3D drawings and supporting content
throughout complex workflows and across corporate,
organizational and process boundaries.
PDF Association PDF Products and Services Guide 2019
Products, Solutions and Services available from
PDF Association Members
(May 30, 2019)
PDF Declarations
(Sept. 5, 2019)
(For download of the guide, click on the picture above)
ISO-standardized subsets of PDF such as PDF/A, PDF/UA and PDF/X
already include identification mechanisms.
However, in many cases users of PDF files would like to leverage
3rd party standards or other profiles of PDF to meet specific needs.
The PDF Declarations mechanism allows creation and editing software
to declare, via a PDF Declaration, a PDF file to be in conformance with a
3rd party specification or profile that may not be related to PDF technology.
The 3rd party specification or profile may describe or require properties
specific to some or all content in the PDF document.
Cases include, but are not limited to specifications or profiles that:
- Mandate properties
(e.g., accessibility specifications)
- Mandate degree of accuracy
(e.g., engineering specifications)
- Set limits on content types
(e.g., that all images use a specific encoding)
- Make an accountable policy statement regarding document content
(e.g., pertaining to privacy regulations)
- Profile PDF for specific purposes
(e.g., to archive email)
By itself, the presence of a PDF Declaration does not guarantee
that the document conforms to the 3rd party specification or profile.
|
PDF Association Cheat Sheets
|
PDF Association PDF Cheat Sheets
for Developers of Software Tools and Solutions
Based on PDF-technology
Aug. 25, 2023
PDF Cheat Sheets are quick-reference tools intended to help
developers work more efficiently while ensuring that their
knowledge of PDF is technically correct.
To help developers whose relationship with PDF’s specification
is casual or tangential, these free PDF cheat sheets provide
aid in remembering key terms and concepts without constantly
referring to the ISO 32000 Specification for PDF.
What these Cheat Sheets are not:
These cheat sheets don’t introduce PDF technology,
and don't substitute for an initial learning phase.
Nor do they displace the official core PDF specification.
Where the cheat sheets don’t address a subject in sufficient detail,
developers should always refer to ISO 32000-2,
the latest edition of the core PDF specification.
What these Cheat Sheets are
These cheat sheets are designed to help developers
work more efficiently while ensuring that their
knowledge of PDF is technically correct.
They are highly condensed summaries,
with logical groupings of information to help
jog one’s memory about nuances or details that are
often forgotten without regular and repeated use.
These cheat sheets use simple sentences,
illustrations and color coding wherever possible to
optimize support for the global community of PDF developers.
For downloads of these PDF files, click on the links below.
PDF - Basics - Cheat Sheet
Topics covered:
The PDF Basics cheat sheet summarizes the essence of PDF,
including principal terminology, acronyms, specifications,
lexical rules, syntax, file layout, and document structure,
along with a glossary of common and PDF-centric terms.
It is useful not only to software engineers,
but also product managers and anyone needing
to navigate the PDF ecosystem.
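As an illustration of the file layout that the Basics cheat sheet
summarizes, here is a minimal sketch that assembles a one-page, empty
PDF by hand: header, body objects, cross-reference table, trailer.
The function name and the page size are assumptions made for this
example; real-world files usually add fonts, content streams and
metadata, and often use compressed cross-reference streams instead.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a one-page, empty PDF: header, body, xref table, trailer."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /Resources << >> "
        b"/MediaBox [0 0 612 792] >>",
    ]

    out = bytearray(b"%PDF-1.7\n")  # header line with the PDF version
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one 20-byte entry per object, entry 0 is free.
    xref_offset = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    # Trailer points at the catalog; startxref points at the xref table.
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)


if __name__ == "__main__":
    with open("minimal.pdf", "wb") as handle:
        handle.write(build_minimal_pdf())
```

A reader starts at the end of the file, follows startxref to the
cross-reference table, and uses the byte offsets there to locate
each object.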
Download:
PDF - Graphic Operators and Operands - Cheat Sheet
Topics covered:
The PDF Graphic Operators and Operands cheat sheet
covers the graphic commands that can be used in
PDF content streams.
If you need to create or modify PDF graphics
or debug a content stream, this is for you.
For those more familiar with other page description languages
or graphics languages, the categorization of commands into
vector (line art), text, color, etc. can help you locate
the corresponding or equivalent PDF information rapidly.
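To make the categorization concrete, the sketch below tokenizes a small
hand-written content stream and labels each operator with its category.
The operators themselves (q, Q, w, m, l, re, S, f, rg, RG, BT, ET, Tf,
Td, Tj and so on) are defined in ISO 32000; the table is only a subset,
and the tokenizer, the names and the sample stream are simplified
assumptions that do not handle nested strings, hex strings, comments
or inline images.

```python
import re

# A subset of content stream operators, grouped by category.
CATEGORIES = {
    "graphics state": {"q", "Q", "cm", "w"},
    "path construction": {"m", "l", "c", "h", "re"},
    "path painting": {"S", "f", "B", "n"},
    "color": {"rg", "RG", "g", "G", "k", "K"},
    "text": {"BT", "ET", "Tf", "Td", "Tj", "TJ"},
}
OPERATOR_CATEGORY = {op: cat for cat, ops in CATEGORIES.items() for op in ops}

# Simplified tokenizer: a literal string without nesting, or a run of
# characters that is neither whitespace nor a parenthesis.
TOKEN = re.compile(rb"\((?:\\.|[^\\()])*\)|[^\s()]+")

# Hypothetical content stream: a red line, a blue rectangle, one text line.
SAMPLE_STREAM = b"""
q
1 0 0 RG
2 w
72 72 m
300 72 l
S
0 0 1 rg
72 100 200 50 re
f
BT
/F1 12 Tf
72 200 Td
(Hello PDF) Tj
ET
Q
"""


def classify_operators(stream: bytes) -> list[tuple[str, str]]:
    """Return (operator, category) pairs in the order they appear."""
    result = []
    for token in TOKEN.findall(stream):
        text = token.decode("latin-1")
        if text in OPERATOR_CATEGORY:
            result.append((text, OPERATOR_CATEGORY[text]))
    return result


if __name__ == "__main__":
    for operator, category in classify_operators(SAMPLE_STREAM):
        print(f"{operator:>3}  {category}")
```

Operands come before the operator they belong to, so "72 72 m" moves
to the point (72, 72) and "(Hello PDF) Tj" shows a text string.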
Download:
PDF - Common Objects - Cheat Sheet
PDF - Color Processing & Blend Modes - Cheat Sheet
Topics covered:
The 3-page PDF Color cheat sheet
summarizes the encoding of color in PDF.
It includes information on blend modes, color processing,
function objects and patterns and shadings:
- Blend Modes in RGB
- Blend Modes in CMYK
- PDF Color Processing
- Approximating sRGB with CalRGB
PDF Fragment Identifiers
- PDF Function Objects
- Patterns and Shadings
Download:
Recommended webinars
|
PDF Days Europe 2022
It’s your PDF Association (2022)
Duff Johnson,
CEO, PDF Association
|
PDF Days Europe 2022
PDF’s mindshare (2022)
"PDF is just like electricity...
You never want to think
about how it works.
You can't live without it."
Duff Johnson
CEO, PDF Association
Complementing Facts,
Google Trends Reports on:
PDF's Popularity Online / On the Web
|
PDF Days Europe 2022
Navigating the PDF ecosystem (2022)
Peter Wyatt
CTO, PDF Association
|
Keynote at PDF Days Europe 2022
The future: Re-imagining the best possible definition of PDF
Peter Wyatt
CTO, PDF Association
See also:
SafeDocs Phase 3:
Revolutionizing file format specifications,
beginning with PDF
|
Keynote at PDF Days Europe 2022
Is Digital Transformation the end of PDF?
No, but it will be different
Kenny Swope
Senior Manager
Enterprise Interoperability Standards
Engineering, Test & Technology
Boeing
|
PDF/Portable Document Format (2015)
What is it, Who owns it, Why it matters
|
PDF 101 - Introduction to PDF (2015)
Leonard Rosenthol
Senior Principal Scientist
PDF Architect, and CAI Architect for Adobe
|
This PDF – what is it for?
A story from PDF’s early days (2016)
|
PDF Workflow (2016)
|
PDF’s ISO-standardized subsets - a tour (2018)
PDF/X, PDF/A, PDF/E, PDF/UA, PDF/VT,...
|
PDF standards
and why you can't afford to ignore them
prepress workflow automation (2024)
PDF/X, PDF/VT, PDF/R,
Print Product Metadata, and Processing Steps,...
|
Introduction to PDF/A for Long-Term Archiving (2018)
(PDF/A-1, PDF/A-2, PDF/A-3)
|
PDF/A-3 as preservation format (2015)
|
PDF & Open Data (2018):
PDF Beyond Final Form Visual Content
(incl. interactive e-invoices
based on PDF/A-3)
by
Dov Isaacs,
Principal Scientist,
Adobe Systems Inc.
|
PDF Days Europe 2022
10 Years of PDF/A-3 Based
Electronic Invoicing (2022)
Dr. Bernd Wild
intarsys consulting GmbH
The presentation looks at the development
of the various standard versions, such as
ZUGFeRD / Factur-X, Order-X and Delivery-X,
and shares experiences from practical use of
PDF/A-3 and XML file attachments
to create and manage millions of electronic invoices
which can be understood by both people and IT systems.
|
How PDF/A-2 and PDF/A-4 are better than PDF/A-1 (2021)
|
PDF Days Europe 2022
Email Archiving in PDF (2022)
Prof. Chris Prom
University of Illinois, Urbana-Champaign
Peter Wyatt
CTO, PDF Association
This presentation looks at email archiving in PDF.
PDF provides a new and scalable approach that can be used
to archive email messages, folders, and even user accounts.
This session provides an update on activities of
the EA-PDF Liaison Working Group (LWG).
The LWG is developing a detailed technical specification for
the proposed EA-PDF (Email Archiving in PDF) file format,
including requirements for EA-PDF viewing software and
implementer guidance.
|
veraPDF (2015)
PDF/A validation
with support of the PDF industry.
(Note!
The term "definitive PDF/A validator"
is not used any longer by this project for obvious reasons)
|
veraPDF (2018)
Real world adoption of veraPDF
and industry needs for more
PDF standards
|
PDF Preflight Standards (2014) (PDF/X and other standards)
|
3D PDF (2018)
The power and future in terms of
an ISO Standard
|
PDF Days Europe 2022
Aligning PDF 1.7 and 2.0 through ISO 32005
and using these for content
Reflow, Reuse, Extraction and Accessibility
Matthew Hardy
Director of Engineering
Mobile and Desktop software
Adobe Inc.
|
Podcast (in Swedish) about
Open Document Formats, PDF and ODF,
and specifically Accessible PDF according to
the ISO Standard ISO 14289 (a.k.a. PDF/UA)
with Ola Karlsson (Branschkoll) and Kent Åberg (NewFormat)
PDF/UA - ISO 14289-1: The standard for universally accessible PDF documents and PDF forms
This podcast can also be downloaded for listening from:
Spotify
Apple Podcasts
|
Accessible PDF
PDF 101
Introduction to PDF/UA (2015)
|
Accessible PDF
Entangled in
the tagged PDF jungle (2014)
|
Accessible PDF
Tagging Page Content (2015) (to PDF/UA compliance)
|
Accessible PDF
What makes a tagged PDF properly
tagged PDF/UA document (2021)
|
Accessible PDF
PDF/UA Basics - Born to be Accessible (2014)
|
Accessible PDF
How does a blind person navigate
PDF documents and forms? (2015)
|
Accessible PDF
PDF/UA for Design Agencies (2018)
|
PDF 2.0 and the future of
accessible PDF (PDF/UA) (2015)
|
Developing PDF / PDF 2.0 (2018)
(What’s happening in
the Next-Generation of PDF)
|
PDF/R The Imaging File Format of the Future
(2021)
|
PDF Days Europe 2022
PDF/R
How PDF/R helps transforming
image capture for mobile and cloud
(2022)
|
PDF/R Introduction (2020)
|
PDF/R (2020)
Looking for an alternative to TIFF?
Try PDF/raster!
|
OCR for PDFs – old news? (2019)
PDF makes it possible to embed
OCR results in scanned documents
ensuring that they are
fully text searchable
|
How to help AI get the most
from legacy archives (2019)
|
Intro to EPUB (2015) (for PDF developers)
Detailed information on PDF Standards
and intended application areas
|
PDF, PDF/A, PDF/X, PDF/UA, PDF/E, PDF/VCR, PDF/VT, 3D PDF,...
and PDF 2.0, the new core PDF standard on which
other PDF substandards have been based since 2020
(Click on preferred icon or link)
|
PDF
|
PDF/A
|
PDF/UA
|
PDF 2.0
|
PDF/Raster
|
3D PDF
|
PDF/VCR
|
PDF/VT
|
PDF/X
|
PDF/E
(2018: Replaced by PDF/A-4e)
|
Use Cases
|
Next-Generation PDF
Deriving HTML from PDF
|
For PDF/X, see also
Ghent Workgroup
|
For PDF/UA, see also:
PDF/UA is a prerequisite for
Access to Digital Public Services
|
PDF/Reuse
|
PDF Forms
|
Software tools supporting
a specific PDF standard or several PDF standards
ISO 32000, the International Standard for
the Portable Document Format (PDF)
Approved international standard since 2008
PDF is widely recognized as
the richest and most robust document format.
ISO 32000 is the family of ISO standards
that defines the core PDF specification,
as identified by the PDF version number.
All other PDF subset specifications
depend upon a specific core PDF version.
The use of PDF is very high.
Follow/participate in the work with the future development of PDF:
PDF Association - PDF Industry Working Group
The Purpose of PDF
The purpose of PDF is to enable users to exchange and view
all kinds of digital documents easily and reliably,
independent of the environment in which they were created
or the environment in which they are viewed or printed.
PDF is a Platform
Platform technologies provide the infrastructure
for the applications end users actually use.
Far beyond a document format, PDF offers a sophisticated
foundation for many types of user applications.
Everyone accepts PDF.
A fixed-layout, shareable, self-contained document
meets a fundamental customer need.
Why PDF?
PDF is characterized by:
- Ease of use
- Portability
- Flexibility
- Security
- Authentication
- Semantics
- Non-Proprietary
- Accessibility
- Reusability
What’s unique about PDF?
And why it will last forever
The Portable Document Format possesses a variety of attributes
that taken together describe a format of such flexibility and power
that it will define the essential digital document concept forever.
PDF: The document format for everything
Overview of all PDF Standards, how and where to use them:
Files inside PDF
How do "files" end up inside PDF files?
There are many ways…
End-users are sometimes perplexed as to why they cannot
simply "download" the JPEG file of a photo in a PDF,
even though they might be able to grab an Excel file from it.
Accurately understanding the kinds of data that can be in a PDF
and how it is stored can be very important.
The core ISO standard for PDF (ISO 32000-2:2020)
is primarily a file format specification and does not mandate
user interface terminology, user interface metaphors,
how users interact with "files", or even whether
file extraction is a required feature.
These are implementation decisions left to developers to best
address the specific needs of their users and target industries.
Diverse interfaces foster various end-user understandings
of what "files" exist inside any given PDF.
This is especially true when we consider the many similar terms in use.
Understandings differ or are ambiguous for end users depending on
their background, and the terms themselves may have
different technical meanings.
Some of these terms used to describe "files"
in the PDF context include:
- File
- Embedded file
- File attachment
- Image files
- Asset
- Collection
- Portfolio
- Package
- Related file
- Associated file
- Stream
- File specification
This variety of terms can be very confusing,
and can lead to misunderstandings.
Ideally, vendors should provide clear documentation
on their use of terminology.
Although PDF content can be created from "files",
this does not mean that a source file can always be
recovered from a PDF even in the simplest of cases.
Although the PDF file format supports many concepts
that can be thought of as "files", PDF’s flexibility can
confuse users due to differing terminology, user interfaces,
metaphors, and expectations derived from other formats.
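The difference between an image on a page and an attached file can be
seen in the file's own objects. The sketch below is a rough illustration
under stated assumptions: it scans raw bytes for the dictionary keys that
ISO 32000 uses for embedded file streams (/Type /EmbeddedFile), file
specifications (/Type /Filespec) and image XObjects (/Subtype /Image).
The function name and the sample objects are hypothetical. The /Type key
is optional on embedded file streams, and dictionaries stored in
compressed object streams are not visible to a byte scan, so the counts
are indicative only.

```python
import re

# Dictionary keys that identify "file-like" things in a PDF.
PATTERNS = {
    "embedded file streams": rb"/Type\s*/EmbeddedFile\b",
    "file specifications": rb"/Type\s*/Filespec\b",
    "image XObjects": rb"/Subtype\s*/Image\b",
}


def count_file_like_objects(data: bytes) -> dict[str, int]:
    """Count occurrences of each kind of object in raw PDF bytes."""
    return {
        label: len(re.findall(pattern, data))
        for label, pattern in PATTERNS.items()
    }


# Hypothetical object fragments: one page image and one attached file.
SAMPLE_OBJECTS = (
    b"4 0 obj << /Type /XObject /Subtype /Image /Width 2 /Height 2 "
    b"/ColorSpace /DeviceGray /BitsPerComponent 8 /Length 4 >>\n"
    b"5 0 obj << /Type /EmbeddedFile /Length 7 >>\n"
    b"6 0 obj << /Type /Filespec /F (data.csv) /EF << /F 5 0 R >> >>\n"
)

if __name__ == "__main__":
    for label, count in count_file_like_objects(SAMPLE_OBJECTS).items():
        print(f"{label}: {count}")
```

In the sample, only the attached file has a file specification with a
file name. The image is stored as pixel data inside an image XObject,
which is why a viewer may offer to save the attachment but not
"the JPEG file".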
How PDF contributes to greater sustainability
PDF has many advantages.
In view of global climate change and the scarcity of raw materials,
sustainability and green technologies are becoming increasingly important.
Sustainability is not a buzzword; more and more enterprises are striving for CO₂ neutrality,
which is appreciated by customers.
At the same time, PDF is one of the greenest office technologies ever invented.
The format is particularly relevant in the context of the often
promoted, but still unachieved paperless office.
Introduced in 1993, PDF offers a stable foundation for replacing physical paper,
according to the principle: instead of printing paper documents,
generate digital paper documents using PDF.
PDF's Popularity Online / On the Web
The Only Digital Document Format
PDF has replaced paper documents with digital analogues.
Some expected the web to replace digital documents as well,
but all indications are that PDF continues to grow. Many websites are, let’s face it,
mostly navigation to help visitors find a specific PDF.
Maybe that's why,
and according to "Statistics of Common Crawl Monthly Archives",
The basic reality is clear:
PDF continues to predominate in digital document formats and
has become indispensable for the vast majority of operations;
for the business sector, for the public sector,
as well as for private individuals.
End-users feel that they can really understand PDF;
it works exactly as you’d expect digital paper to work;
simple, reliable and effective. And it’s SO popular!
Google's Trends data clearly shows that PDF is
not only far more relevant in 2023 than it was in 2004;
"PDF" also accounts for a far higher proportion of web searches,
even though the total search volume on Google has increased
dramatically since that time.
"How much longer do you think we'll use PDF?"
Interest in PDF is pretty steady relative to web searches in general.
Compared to searches for other technologies, that's pretty remarkable:
Google's Trends data shows clearly that the number of searches for
PDF documents relative to all other searches continues going up:
PDF is ubiquitous and essential:
PDF is fundamental to business operations.
Everyone who works in an office these days
can be expected to recognize PDF.
PDF is the "Coca-Cola" of digital document formats
because everyone knows it:
Navigating the Future:
Unveiling the Path of PDF Technology in the Next 25 Years.
In the fast-evolving landscape of technology,
few formats have stood the test of time like
the PDF (Portable Document Format).
Born out of the need to digitize paper documents,
the PDF has become an obvious standard for sharing and
archiving information over the last 30 years.
So, what's to come?
We’re just as curious as you, so join us on a journey through our
top five predictions of what the future holds for PDF-based technology:
PDF - Have we passed ‘peak PDF’?
How Google Trends sees “PDF” in 2022
PDF is more popular worldwide in 2022 than in 2018
The big picture.
Although the curve is flattening,
worldwide searches for “pdf” continue to grow in popularity,
indicating that the popular appetite for documents remains healthy.
Users may be banking online, but their searches for
PDF-based documents continue to increase.
How do we gain insight into how users' views of documents are shifting
without spending egregious sums on dubious market-research?
One increasingly interesting source is Google Trends.
This service aggregates Google’s search data to produce a metric
describing search term popularity (relative to itself) over time.
(Trends: https://trends.google.com
example: compare use of document formats: .pdf vs .html)
When users add "pdf" to a Google query it means that they
are not thinking about an HTML web page...
except, perhaps as a place from which to download
the pdf that they seek.
Web pages are more dynamic than ever before,
but they are no more “documents” than they were in 2018.
Although web pages can serve “document-like” functions in many ways;
in practice it's "pdf" that comes to users' minds.
30 years since the PDF format launched in 1993
PDF applications and usage continue to grow;
the ecosystem becomes larger and more vibrant each year.
PDF is key to digital transformation precisely because a
self-contained, reliable and capable portable document format
does what other web technologies do not.
PDF complements web technology; it does not compete with it.
PDF's staying power is impressive.
Few technologies show as much resilience.
The trend is clear,
PDF is more popular worldwide in 2022 than in 2018:
"PDF is just like electricity...
You never want to think about how it works.
You can't live without it."
Presentation on "PDF mindshare" at PDF Days Europe 2022:
3 ways developers can impress the boss with PDF
More than just a fancy way to paint a picture.
3 smart PDF features for software developers of
web-connected document applications:
1. Linearized PDF
2. PDF Annotations
3. Tagged PDF
PDF is part of the de facto platform
Open Web Platform (OWP) (Sept. 25, 2015)
The Power of the Page
It’s a question that vexes vendors of web-based solutions everywhere:
Why do people still insist on PDF files?
And why does PDF’s mindshare keep going up?
PDF in 2016:
Broader, deeper, richer
Bridging the page and the web, there's still nothing like PDF.
Interest in PDF continues to climb.
The world’s portable document format continues to go from
strength to strength, with more specifications, more files, more users,
more implementations and more developers worldwide.
How you see PDFs versus
how a search engine sees PDFs (Aug. 1, 2019)
How to instantly search very large amounts (terabytes)
of documents, web, and databases:
PDF to end the era of ECM vendor lock-in
Making information management real
A common portable container, PDF, to end the era of ECM vendor lock-in.
Mostly, it’s the fact that a standardized, fully-supported and globally
broadly-accepted portable container format would provide users
with powerful technology independent of any specific vendor,
ending the era of vendor lock-in.
ECM Vendors don’t like that, but customers do.
Over the next 5-10 years,
expect to see PDF become the common portable container for
a new era of smart, interconnected document and
information management systems.
What ECM/e-archive professionals
must know about PDF
Ask your ECM/e-archive vendor to detail their support for PDF,
or risk unnecessary costs, increased risks and missed opportunities.
Although PDF represents the bulk of content in ECM/e-archive systems
the majority of such implementations do not handle PDF documents
much differently than the way they’ve handled TIFF images
for the past 25 years.
Not all PDF creation software is equal.
Exclude software that’s dangerous to your documents.
Use ECM/e-archive software that understands PDF.
Ensuring that PDF documents do not contain Personally Identifiable Information (PII)
or other privacy or security-related content is a critical aspect of releasing
sensitive documents to 3rd parties or into the public domain.
Note! For redaction tools:
Be sure your search software can find all the information you need to remove.
Just putting a black box on top of sensitive information does not "remove"
anything (e.g. the document is still leaking sensitive information).
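The point about black boxes can be illustrated with a minimal sketch.
It assumes a toy, uncompressed page content stream (hypothetical data, not
taken from any real document) and a deliberately naive extractor that only
handles simple literal strings shown with the Tj operator:

```python
import re

# Toy, uncompressed page content stream (hypothetical data).
# The text is drawn first, then a filled black rectangle is painted over it.
content_stream = (
    b"BT /F1 12 Tf 100 700 Td (Account: 12345678) Tj ET\n"
    b"0 0 0 rg 95 695 160 18 re f\n"
)

def literal_strings_shown(stream: bytes) -> list[str]:
    """Return literal strings passed to Tj (naive: no escapes, no TJ arrays)."""
    return [m.decode("latin-1") for m in re.findall(rb"\(([^()]*)\)\s*Tj", stream)]

# The rectangle hides the text visually, but the text operand is still in the file.
leaked = literal_strings_shown(content_stream)
print(leaked)
```

A real redaction tool has to remove the text operand itself (and any copies
in bookmarks, metadata or attachments), not merely paint over it.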
PDF supports work-from-home (WFH) and school-from-home (SFH)
Work-from-home (WFH) and school-from-home (SFH)
open up new ways to use existing PDF solutions.
Find out how PDF can help companies and institutions to cope.
October 19, 2022
Adobe has announced:
"End of life for Type 1 fonts in
authoring applications by January 2023".
Redaction of Document Content
Redaction is the process of removing content from a document.
There are various ways to achieve redaction in digital documents,
ranging from removal of content from an original source document
to printing and re-scanning after redaction.
Need redaction?
Then there is no way around PDF, and a good PDF redaction tool.
Why the need to redact implies using PDF:
How do you redact content in HTML?
The HTML format doesn’t really include that concept;
there is no model for redaction in web content or other formats.
In both principle and practice, committing to a rendering is definitive;
final, portable and generic; it’s what the use case demands.
PDF is the only digital document format that fully supports redaction.
Lack of basic knowledge of modern digital document technology
can lead to serious consequences, as is clear from the following
internationally recognized examples:
1) Special U.S. Counsel Robert S. Mueller’s indictments and reports:
2) U.S. President Donald Trump's call with
the President of Ukraine Volodymyr Zelensky:
3) Potential "manipulation / deepfakes" of PDF:
4) PDF redaction – AstraZeneca EU contracts:
The Potential for Deepfakes with PDF
Hunter Biden’s "email" and
the potential for deepfakes with PDF (October 19, 2020)
This article is intended for journalists, researchers, attorneys,
law-enforcement, application developers and other professionals.
Akin to the earlier series on the redaction of The Mueller Report PDF,
this article provides cultural framing and technical background for
considering the evidence provided to-date by the New York Post
regarding Hunter Biden’s alleged email from Ukraine.
On Wednesday, October 14, 2020 the New York Post published
an article in which they claimed to be in possession of a copy of
a hard-drive belonging to Hunter Biden,
the son of the former Vice President of the United States and
current Democratic candidate for president, Joe Biden.
Some journalists are covering this PDF document as if it represents
an email from Hunter Biden’s computer; it may or it may not.
However, many people question the origin of the email.
The question that should be asked is: Who actually created the PDF file?
The PDF Association highlights here that this case may have other
explanations and that PDF technology may have been used
to create a manipulated PDF document;
this article offers interesting insights for digital forensics technicians
about the possibilities to manipulate documents with PDF technology.
PDF redaction – AstraZeneca EU Contracts (Feb. 9, 2021)
Redaction of contract AstraZeneca - EU
After correctly redacting the text passages on the PDF page,
the PDF's bookmarks referring to redacted content were overlooked.
The confidential information was removed from the page as intended,
but was unfortunately disclosed in the PDF bookmarks!
PDF or EPUB?
Give users what they want,
and why EPUB can't replace PDF
EPUB can't possibly substitute for PDF when it comes to a general-purpose
digital document format usable for publishing, and all the other
purposes to which PDF documents are put
(formal documents, record-keeping, transaction records, etc.)
- EPUB can’t do fixed layout and be accessible at the same time.
- EPUB cannot deal with the case of a document
that combines pages from various sources
(Word, Excel, CAD software, scanner, etc).
- EPUB has no model for color-management,
which is not infrequently important to publishers.
- EPUB cannot accommodate the application of accessibility
structures to arbitrary graphics content, as PDF can.
- EPUB lacks security and digital signature facilities;
features that are native to PDF.
Even for publications, support for the EPUB specification varies
between EPUB readers from different vendors;
thus, users can’t get a consistent display result of publications,
which in itself is totally unacceptable for publishing.
If your readers will prefer EPUB, give them EPUB.
If they will prefer PDF, give them PDF.
If they prefer a choice of PDF or EPUB, then give them that choice.
Why PDF?, Why EPUB? How to Choose:
ISO 32000-2:2020 - PDF 2.0
Approved international standard since 2020
ISO 32000-2:2020, also known as PDF 2.0,
was first released in 2017,
and was updated and published in December 2020.
The Next-Generation of PDF Standards is Already Here!
PDF 2.0 (a.k.a. "post-Adobe PDF") is the basis for
a new generation of PDF standards.
PDF 2.0 is an evolution in the PDF family
that maintains backwards compatibility to
the strongest degree possible.
Important to know!
ISO 32000-2 defines PDF 2.0 and is the first PDF specification
entirely developed under the ISO open consensus-based process.
ISO now holds the copyright to the PDF specification.
ISO 32000-2:2020 does not include any
proprietary technologies as normative references!
PDF 2.0 is completely disconnected from Adobe.
Thus, Adobe does not own any IP or license rights for PDF 2.0.
The use of PDF 2.0 will be very high.
Follow/participate in the work with the future development of PDF 2.0:
PDF Association - PDF 2.0 Industry Working Group
PDF 2.0 provides numerous enhancements relevant to "Tagged PDF",
the mechanism in PDF that facilitates digital accessibility.
These same accessibility requirements for PDF 2.0
are elevated into the ISO 14289-2 Standard (PDF/UA-2):
Breaking News! - July 24, 2024
ISO 32000-2 PDF 2.0 'Errata Collection 2'
is now available for download at no cost
from PDF Association
This latest bundle of PDF 2.0 Errata Collection 2
is a complete update and replacement for
the previous no-cost sponsored edition of
ISO 32000-2:2020 (PDF 2.0)
as announced April 5, 2023 (see below)
Note!
PDF is a modern evolving technology,
so corrections and clarifications to
PDF’s specification are a fact of life.
Implementing errata resolutions ensures
PDF software is both correct and interoperable
to the greatest degree possible.
The resulting ISO 32000-2 bundle includes:
- PDF’s core specification, and
- four ISO Technical Specification extensions to PDF
enabling modern cryptography and
digital signature technologies in PDF 2.0.
With downloads of this bundle the PDF Association and
its sponsors have successfully placed the latest and
most up-to-date PDF specification into the hands of
developers worldwide, helping to improve interoperability
and reduce malformed PDF files.
Features and benefits:
- ISO-approved and industry-resolved errata corrections to
the ISO 32000-2:2020 specification up to July 20, 2024
- 265 total errata corrections
- ISO TS 32001 errata correction
- 17 new / replacement pages
- Updated Arlington Model
- Updated Annex L embedded file attachment annotations
- Introducing machine-readable EBNF into the PDF specification
- Navigation enhancements
- An instructions page
Note! You will now need a capable PDF 2.0 viewer.
PDF 2.0 capable PDF viewers are necessary for use of this bundle,
as the errata corrections will otherwise not be visible.
Viewers must support, at a minimum, PDF markup and
file attachment annotations.
Depending on the PDF viewer it may be necessary to
enable a “Comment” or similarly-named pane or window
in order to make the markup visible.
For more details:
Note!
The edition announced below is now completely superseded by
Errata bundle edition above, dated July 24, 2024!
Breaking News! - April 5, 2023
The specification for the latest PDF standard
ISO 32000-2 (PDF 2.0)
is now available for download at no cost
from PDF Association
(Chapter 14 focuses on
Document Interchange,
Accessible PDF (ISO 14289-2 / PDF/UA-2))
Why PDF 2.0 is the new PDF bible
How could the PDF 2.0 specification be useful
even if we haven’t decided to support PDF 2.0?
The PDF 2.0 specification is the best reference to use
regardless of which PDF version you intend to support.
It provides a clearer understanding of what’s expected
than did previous editions of the specification, and therefore,
provides better guidance on how to implement PDF correctly:
How to get started with PDF 2.0
PDF is a large, complex specification!
Developers often look for first steps in its implementation.
This article offers some suggestions:
Adoption of new PDF standards
The Journey of PDF Standards:
Balancing heritage and innovation is key!
From its 1993 inception, PDF reshaped print and publishing.
Surprisingly, we still rely on decades-old standards,
like PDF/X-1a and PDF/X-4!
Why not now embrace and benefit from modern alternatives,
like ISO 32000-2 and PDF/X-6?
For a standard to be adopted,
people in production and management,
need to understand what the standard is about,
and what the implications are both of adopting,
and not adopting the standard:
PDF Cheat Sheets for Developers
PDF 2.0 (ISO 32000-2:2020)
The PDF 2.0 Standard now released (December, 2020)
ISO 32000-2:2020 replaces the first edition of PDF 2.0 (ISO 32000-2:2017)
PDF 2.0 includes critical updates to the normative references and
character collections that underlie all PDF technology.
All PDF developers should procure this edition from ISO!
PDF 2.0 spawns a set of updated subset standards designed to leverage
the first "post-Adobe PDF" in the next-generation of archival, accessibility,
engineering, raster image and other specifications based on PDF 2.0.
A variety of PDF subset specifications are based on the PDF 2.0:
ISO 19005-4:2020 - PDF/A-4
PDF/A-4 provides normative guidance on long-term archiving of
PDF files based on new features and other changes in PDF 2.0.
ISO 15930-9:2020 - PDF/X-6
PDF/X-6 adds support for new features in PDF 2.0,
while relaxing some earlier requirements for file exchange.
ISO 16612-3:2020 - PDF/VT-3
Builds on PDF/X-6 to provide support for the PDF 2.0 imaging model
in the variable and transactional printing context.
ISO 23504-1:2020 - PDF/R-1
Standard for raster image data interchange
ISO 21757-1:2020 - ECMAScript for PDF
Incorporates JavaScript (ECMAScript)
to support a variety of interactive functionality.
PDF 2.0 examples now available (Aug., 2017)
The first PDF 2.0 example files are now made available to the public.
This initial set of PDF 2.0 examples was crafted by hand and
intentionally made simple in construction to serve as
teaching tools for learning PDF file structure and syntax.
PDF 2.0: The worldwide standard for digital documents
has evolved (Aug. 30, 2017)
The Portable Document Format is perhaps
the most common example of a de facto standard, so much
so that Wikipedia features PDF on its "de facto standards" page.
From Ethernet and 802.11 to HTTP and CSS,
the modern computing stack consists of hundreds of standards.
The way in which PDF exemplifies the specific value of
standards is almost unique, for PDF’s value proposition
- the reason why PDF is today’s worldwide
de facto standard for digital documents -
is the fact of standardization itself.
PDF 2.0 - What will it bring? (2015)
To put it simply:
PDF 2.0 makes it easier for developers to
create tools to manage digital documents
with more and better features at a reduced cost.
For organizations that procure PDF technology, PDF 2.0
makes it easier to insist that vendors are delivering
the highest-quality, most accessible and most
capable PDF technology solutions available.
Are your software tools ready for PDF 2.0?
Check if your PDF and workflow tools are compliant with PDF 2.0
PDF 2.0 adds value to many workflows,
including those for production printing,
but it does also bring a small amount of risk.
If a file has used some of the new features in PDF 2.0
those will usually be silently ignored by an older reader.
PDF was designed to be very flexible and to allow
custom and proprietary data to be embedded virtually
anywhere in the file structure.
It does that by saying that a reader should simply
ignore anything it doesn’t recognize.
To a PDF 1.7 reader, most new PDF 2.0 features are just objects
that it won’t recognize and should therefore ignore.
The most common exception to that rule is around security;
if the PDF file uses the new AES-256 security introduced in PDF 2.0
then an older reader will probably be unable to read that file at all.
The biggest risk area when considering when and how to
roll out PDF 2.0 support is therefore that a PDF 2.0 file using
new features may have those new features silently ignored.
Some PDF readers will emit a warning that the file you’re opening
has a PDF version number that is not explicitly supported.
That’s helpful, but it can never be more than a hint to take care
because that older reader doesn’t know anything about any
new features in the file; it cannot possibly know if they’re
important to you or to your workflow.
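As a first triage step before rolling out PDF 2.0, a workflow can at least
flag files that declare version 2.0 in their header. The following is a
minimal sketch; the function names are hypothetical, and it only inspects
the "%PDF-x.y" header line. It assumes the header is authoritative, which is
a simplification, since a PDF's document catalog can declare a later version
than the header does.

```python
import re

def header_version(data: bytes):
    """Return (major, minor) from the %PDF-x.y header, or None if absent."""
    match = re.search(rb"%PDF-(\d)\.(\d)", data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def declares_pdf2(data: bytes) -> bool:
    """True when the header declares PDF 2.0 or later (header check only)."""
    version = header_version(data)
    return version is not None and version >= (2, 0)

print(header_version(b"%PDF-2.0\n%\xe2\xe3\xcf\xd3\n"))
print(declares_pdf2(b"%PDF-1.7\n"))
```

Like a viewer's version warning, this can only be a hint: it says nothing
about which PDF 2.0 features a file actually uses.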
Before you start thinking about upgrading the file creation tools,
confirm that your print providers and converters can
process PDF 2.0 files.
Start at the end of the workflow and work upstream.
Software tools supporting PDF 2.0:
Understanding UTF-8 in PDF 2.0 (ISO 32000-2:2020)
A brief introduction:
Text strings in PDF are intended for character strings that could be
presented to a human, such as in a graphical user interface or in
the output from command-line utilities.
The relationship between string types as illustrated in
Figure 7 from ISO 32000-2:
UTF-8 is a variable-width character encoding used for electronic communication.
Defined by the Unicode Standard, the name is derived from Unicode
(or Universal Coded Character Set) Transformation Format - 8-bit.
Because modern PDF text strings support Unicode they can reliably
represent any character, symbol or pictograph from any language
or symbol set supported by Unicode.
Unlike when PDF 1.7 was released, today, UTF-8 dominates
the web and has become the de facto character encoding
for operating systems and programming languages.
PDF 2.0 first added support for UTF-8 back in 2017.
As the adoption of PDF 2.0 increases, it is important for all users
to know that their PDF technology platforms and investments
correctly support the presentation of navigation and interactive
elements that can be encoded as UTF-8.
Most PDF viewers should therefore today be
expected to correctly handle UTF-8 text strings.
Testing of PDF UTF-8 support in various interactive PDF viewers:
Output from various interactive PDF viewers shows differing levels of
PDF 2.0 UTF-8 support for outlines, layers and document information.
PDF viewers not supporting UTF-8 typically display "junk" characters:
The decimal byte values 239, 187 and 191 in PDFDocEncoding represent
i dieresis (ï), right-pointing double angle quotation mark (»),
and inverted question mark (¿) respectively.
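The three "junk" characters are the UTF-8 byte order mark (bytes 239, 187,
191) being read as single-byte characters. The sketch below shows how a
viewer might choose the decoding from the leading bytes. It is an
illustration under stated assumptions: the function names are hypothetical,
and Latin-1 is used as a stand-in for PDFDocEncoding (the two agree for the
three byte values discussed here, but not for every code point).

```python
def decode_text_string(raw: bytes) -> str:
    """Decode a PDF text string, choosing the encoding from its leading bytes."""
    if raw.startswith(b"\xfe\xff"):        # UTF-16BE byte order mark
        return raw[2:].decode("utf-16-be")
    if raw.startswith(b"\xef\xbb\xbf"):    # UTF-8 byte order mark (PDF 2.0)
        return raw[3:].decode("utf-8")
    return raw.decode("latin-1")           # stand-in for PDFDocEncoding

def legacy_decode(raw: bytes) -> str:
    """What a viewer unaware of UTF-8 text strings would effectively do."""
    if raw.startswith(b"\xfe\xff"):
        return raw[2:].decode("utf-16-be")
    return raw.decode("latin-1")

title = b"\xef\xbb\xbf" + "Übersicht".encode("utf-8")
print(decode_text_string(title))
print(repr(legacy_decode(title)[:3]))
```

The legacy path shows the three characters listed above at the start of the
string, followed by further mis-decoded bytes for any non-ASCII text.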
PDF technology platforms and investments correctly support
the presentation of navigation and interactive elements
that can be encoded as UTF-8:
PDF 2.0 introduces non-rectangular links to PDF
PDF’s Link annotations enable PDF to support various forms of hyperlinking.
Link annotations do not themselves contain any visible content;
their purpose is to define the location and shape of a "hot area"
on a PDF page, and the actions to occur when a user clicks
within the hot area.
Actions triggered by Link annotations can include hyperlinking to a location
within the same PDF document, a location in another PDF document,
an external URL, or invoking JavaScript or other PDF actions.
Link annotations also allow document authors to define various
appearance properties of the hot area such as border and highlight
styles used by PDF viewers as visual indicators of interactive areas.
Unlike HTML image maps, however, the page graphics underlying
Link annotations are not limited to bitmap images but can also be
device-independent vector graphics or text, or any combination.
This provides a far better on-screen experience when zooming.
Non-Rectangular Links in PDF 2.0:
Until now PDF has never had a purely viewer-only equivalent to
the HTML concept of image maps that allow specification of
arbitrarily shaped hot areas.
The new PDF 2.0 non-rectangular link capability isn’t just for
architecture, engineering and construction (AEC) organizations;
it’s useful for any irregular-shaped area in any PDF
that requires hyperlinking.
Usage examples might include:
- a PDF with a map of the world could have complex Paths
along each country’s border, with each country hyperlinked
to different chapters in the same PDF file.
- the non-rectangular symbols in flowcharts or process diagrams
could be hyperlinked to related content in the same or separate PDFs.
- a circular or irregular company logo could be linked to
the company’s website with some appropriate visual effect
(e.g., a link border in the company’s palette).
For more details on Non-Rectangular Links in PDF 2.0:
PDF 2.0 - Doing PDF Right
Benefit from the body of PDF industry knowledge of "PDF-issues"
Reporting and resolving identified errors and issues
in any PDF 2.0-related standard ensures that PDF
continually improves as an unambiguous interoperable
file format with a clear and reliable appearance model and
commonly defined expected behaviors across implementations.
This helps everyone in the PDF ecosystem,
from PDF developers to end-users.
Whether PDF is your core technology or key to a larger solution
this information is critical to ensure interoperability.
An invaluable source of education and information on a wide range
of technical PDF topics is the public GitHub repository:
Although the resolved issues are expressed as marked-up changes
applied to the latest PDF 2.0 specification (ISO 32000-2:2020),
many corrections are also highly relevant to earlier PDF specifications.
This is because PDF is a backwards-compatible format and a lot of
wording has been retained, or is only slightly adapted,
from earlier PDF specifications.
Clause numbering has largely remained unchanged between
PDF 1.7 (ISO 32000-1:2008) and PDF 2.0 (ISO 32000-2:2020).
PDF developers are therefore easily able to identify and
map such corrections back to earlier specifications
relevant to their implementations.
PDF 2.0 modernizes cryptographic support
The first ISO-approved PDF 2.0 extensions update the set of
supported hash algorithms and extend PDF digital signatures
to include modern elliptic curves.
These extensions use well-known and widely-implemented existing cryptographic algorithms
and are built on top of the digital signature and encryption frameworks
that already exist in PDF.
Learn about:
PDF 2.0 and ISO 19445 XMP metadata
for image and document proofing
"ISO 19445 Graphic technology, Metadata for graphic arts workflow,
XMP metadata for image and document proofing" is a metadata
standard for PDF technology developed by ISO TC 130 WG 2 that
"specifies the set of metadata to be used to communicate
the approval status, proof preparation and viewing parameters
for images and documents that are used in
the graphic arts print production workflow".
ISO 19445 is readily applicable to PDF 2.0-based PDF/X-6 files,
just as it is to PDF 1.x-based PDF/X files.
This article provides recommendations to industry as to how
ISO 19445 might be used with PDF 2.0 and related technology
updates relevant to the graphic arts industry.
Learn about:
Discovering PDF metadata.
Easy access to XMP metadata in
PDF files is more important than ever.
Soon, ISO will also publish dated revisions to both
PDF/A-4 and PDF/X-6 which will be indicated via
new values in existing ISO-defined XMP metadata fields.
It is common practice for many PDF applications to provide
banner-style indicators when PDF files declare conformance
with certain ISO standards such as PDF/A or PDF/X.
In addition, these same applications may decide to protect these
PDFs by opening the files in a read-only manner to help users
avoid accidental edits that may invalidate the file’s conformance.
However, some PDF applications have not yet generalized their
support to detect new versions or dated revisions of ISO standards!
This means that such software does not protect these files
until the vendor releases updates to their software.
The design of XMP metadata for each existing family of ISO standards
for subsets of PDF is both forward- and backward-compatible.
This means that even if old software is accidentally used to open
a newer PDF then that software can know that a file is declaring
conformance with a standard, even if the software is unaware
of the specifics.
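As an illustration of this compatibility, the sketch below reads a PDF/A
declaration from an XMP packet without knowing which parts or revisions
exist. The namespace URI and the property names (part, conformance, rev) are
not given in the text above; they are stated here from general knowledge of
PDF/A and should be checked against the relevant ISO 19005 part. The sample
packet and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID = "http://www.aiim.org/pdfa/ns/id/"   # assumed PDF/A identification namespace

def pdfa_declaration(xmp: str) -> dict:
    """Collect pdfaid properties, whether written as attributes or child elements."""
    root = ET.fromstring(xmp)
    found = {}
    for desc in root.iter(f"{{{RDF}}}Description"):
        for name in ("part", "conformance", "rev"):
            key = f"{{{PDFAID}}}{name}"
            value = desc.get(key)                 # attribute form
            if value is None:
                child = desc.find(key)            # element form
                value = child.text if child is not None else None
            if value is not None:
                found[name] = value.strip()
    return found

SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
   <pdfaid:part>4</pdfaid:part>
   <pdfaid:rev>2020</pdfaid:rev>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

declaration = pdfa_declaration(SAMPLE)
print(declaration)
# Any non-empty result means the file declares PDF/A conformance, so an
# application could open it read-only even if the part or revision is unknown to it.
```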
Learn about:
PDF Fragment Identifiers
allow access to
specific content / Fragments of Documents
in longer PDF documents
This article is specifically for web designers, content creators,
webmasters, and web browser developers that want to improve
the user experience for website visitors.
When referencing PDF documents from web pages,
it is common to be linked to large PDFs where finding
referenced information might be complex.
This is typical across a variety of situations,
including FAQs, referencing manuals,
product information catalogs,
references to specific chapters in books,
articles in collections, etc.
PDF 2.0 introduces new capabilities providing
specific support for fragments of documents.
PDF Fragment Identifiers help to improve
the user experience for website visitors needing to
access specific content in longer PDF documents.
Websites can use URI fragments for PDF references so that
when website visitors need to interact with PDF content,
the precise content can be referenced,
for a quick and helpful experience instead of a generic and unfriendly:
"it’s somewhere in this long PDF document - work it out for yourself".
A URI fragment occurs after the URL and starts with a # character.
Technically speaking, it refers to a subordinate resource of
the primary resource identified by the URL.
URI fragments are extremely common with HTML
as this is how intra-page navigation works using
the anchor tag and IDs.
In the case of PDF, the main resource is the PDF file itself,
while subordinate resources can be specific pages, destinations,
and other types of targets.
Modern web browsers provide built-in PDF viewing capabilities
frequently used by many people.
All web browsers already understand URI fragments
because they are a core part of navigating the web.
It is simply a matter of augmenting the URL to a PDF
by appending the desired fragment identifiers.
Browser development teams now need to pay closer attention to
the needs of end-users when accessing web-delivered PDF content.
This includes fully supporting a broader set of ISO-standardized
PDF fragment identifiers in their default configurations.
Adding appropriate PDF fragment identifiers to the end of URLs
to target specific locations in longer PDF documents can provide
a far better and immediate user experience, including for users
who are less savvy at navigating PDF files.
Given the rapid growth in applications generating
PDF logical structure (content semantics),
it makes long-term sense to define business rules for
key content locations in documents that can persist
across multiple updates to that document.
Referencing by page number can change
if content is added, deleted, or moved.
But by using URLs with the "nameddest" parameter and
a controlled value, URL maintenance can be reduced.
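A minimal sketch of augmenting a URL in this way follows. The helper name and
the example URLs are hypothetical; "nameddest" is the parameter named above,
and "page" is assumed here as the page-number parameter. Whether a given
browser honors either parameter depends on its built-in PDF viewer.

```python
from typing import Optional
from urllib.parse import quote, urldefrag

def pdf_fragment_url(url: str, *, page: Optional[int] = None,
                     nameddest: Optional[str] = None) -> str:
    """Append a PDF fragment identifier, replacing any existing fragment."""
    base, _ = urldefrag(url)
    if nameddest is not None:
        # A named destination survives page renumbering, so it is preferred.
        return f"{base}#nameddest={quote(nameddest, safe='')}"
    if page is not None:
        return f"{base}#page={page}"
    return base

print(pdf_fragment_url("https://example.com/manual.pdf", page=12))
print(pdf_fragment_url("https://example.com/manual.pdf", nameddest="Chapter 3"))
```

Preferring the named destination over the page number reflects the
maintenance argument above: the destination name can stay fixed while the
page it points to changes between document updates.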
For more information,
please read this article from PDF Association:
An Update on PDF Errata
PDF Association’s core mission is to deliver a vendor-neutral platform
for developing open specifications and standards for PDF technology,
and therefore maintains a public errata process for addressing issues
with the PDF 2.0 core specification (ISO 32000-2:2020).
The errata process includes the initial set of ISO standards for PDF based on
PDF 2.0 (PDF/A-4, PDF/X-6, PDF/VT-3 and ECMAScript for PDF 2.0),
and has broadened the scope further to support all ISO standards for PDF.
Learn about:
ISO 19005 (PDF/A)
the international standard for
long-term preservation/archiving of PDF documents
Approved international standard since 2005
Ensures that digital documents can be reproduced in the future.
Please also view the free booklet:
"PDF/A in a Nutshell 2.0, PDF for long-term archiving"
The use of PDF/A is very high.
Follow/participate in the work with the future development of PDF/A:
The Purpose of PDF/A
PDF/A is a file format based on PDF
that provides a mechanism for representing digital documents
in a manner that preserves their visual appearance over time,
independent of the tools and systems used for creating,
storing or rendering the files.
PDF/A does not allow external dependencies and circumstances,
such as time dependencies, JavaScript, ...
PDF/A for Beginners
What is PDF/A?
The A stands for archiving and thus
PDF/A is the standard for long-term archiving,
which guarantees a unique representation of
digital PDF documents in an unknown future.
The basic principle for achieving this goal is that
the PDF/A file must contain all the elements necessary
to correctly display the document.
A simple example: the fonts must be embedded.
Why should you use PDF/A?
Documents need to be stored securely and for a long time.
Retention periods vary from one industry to another, for example,
in healthcare 30 years is common, for banks 50 years,
and for insurance companies 80 years.
This is a challenge for documents,
which are often only available in digital form!
The motivation for PDF/A can be easily understood
using the example of Airbus:
- As early as 2000, Airbus had a requirement that
aircraft design plans must be available for 99 years.
- At that time, a working group examined the options and
quickly came to the conclusion that PDF is basically a
very good document format, but that it offers too many
functions for the objective of long-term archiving.
- The result at that time was "minimal PDF" and
this was also the basic idea for PDF/A.
What is PDF/A important for?
For all important documents in your organization that
require a long life span and where it is critical that they
are displayed clearly and correctly in the future.
How to create PDF/A?
For single documents, many common office applications allow you
to set a "PDF/A" option in the "Save as" dialog.
Many PDF editors also offer a conversion function
to convert conventional PDF files into PDF/A.
For mass processing of documents, a distinction can be made
between scanned documents and digital documents:
Software tools supporting PDF/A:
Conformance levels for PDF/A-1, PDF/A-2, PDF/A-3 and PDF/A-4:
Conformance levels: a, b, u
The different conformance levels reflect the quality of
the archived document and depend on the input material
and the document's purpose.
- Level a (Accessible):
Meets all requirements for the standard,
including the logical structure of the document
and its correct reading order.
Text must be extractable and the logical structure
must match the natural reading order.
Fonts used must meet stringent requirements.
This PDF/A level can usually only be met
by converting born-digital documents.
PDF/A-a requires a tag structure.
- Level b (Basic):
Guarantees that the content of the document
can be unambiguously reproduced.
Level b files are easier to create than Level a,
but Level b does not guarantee 100%
text extraction or search ability.
It does not necessarily mean that the content
can be reused without any problems.
Scanned paper documents can usually be converted to
PDF/A Conformance Level b without any extra work.
PDF/A-b does not require a tag structure.
Examples:
PDF/A-2b: Where "b" means that the PDF file must be
a correct reproduction of the original document;
such a file may consist solely of scanned pages.
PDF/A-2u: Where "u" stands for "Unicode".
Must comply with the "b" variant, correct reproduction of
the original document, and all text must additionally
have a Unicode mapping.
PDF/A-2a: Where "a" stands for "Accessible".
Must meet the "u" variant, with Unicode mapping for all text,
and also be a structured document (tagged).
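The relationship between the levels in the examples above (each of "u" and "a" adds a requirement to the previous level) can be sketched in a few lines of Python. This is only an illustrative model: the names LEVEL_REQUIREMENTS, LEVELS_BY_PART, satisfies and parse_flavour are assumptions made for this sketch and are not part of any PDF library or of the standard itself.

```python
# Illustrative model only; all names here are assumptions for this sketch.
# Each level lists the requirements described in this section (b -> u -> a).
LEVEL_REQUIREMENTS = {
    "b": {"reliable visual reproduction"},
    "u": {"reliable visual reproduction", "Unicode text mapping"},
    "a": {"reliable visual reproduction", "Unicode text mapping", "tagged structure"},
}

# Levels available per part, as listed in this section (PDF/A-1 has no level u).
LEVELS_BY_PART = {1: {"a", "b"}, 2: {"a", "b", "u"}, 3: {"a", "b", "u"}}


def satisfies(claimed: str, required: str) -> bool:
    """Return True if a file at level `claimed` also meets level `required`."""
    return LEVEL_REQUIREMENTS[required] <= LEVEL_REQUIREMENTS[claimed]


def parse_flavour(label: str) -> tuple[int, str]:
    """Split a label such as 'PDF/A-2u' into (part, level)."""
    prefix = "PDF/A-"
    if not label.startswith(prefix):
        raise ValueError(f"not a PDF/A label: {label!r}")
    part, level = int(label[len(prefix):-1]), label[-1]
    if level not in LEVELS_BY_PART.get(part, set()):
        raise ValueError(f"level {level!r} is not defined for PDF/A-{part}")
    return part, level
```

For example, satisfies("a", "u") is true because a level a file also has Unicode mappings, while satisfies("b", "u") is false.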
PDF/A-4e and PDF/A-4f:
PDF/A Rules for Document Attachments
PDF/A-1 (ISO 19005-1:2005)
No attachments allowed
Example: Conversion of email to PDF/A-1:
Attachments become additional PDF/A pages.
Conformance levels:
a - Accessible, b - Basic.
PDF/A-2 (ISO 19005-2:2011)
Attachments as PDF/A allowed
PDF/A-2 restricts
embedded file streams (attachments) to PDF/A files
Example: Conversion of email to PDF/A-2:
Attachments are converted to PDF/A and embedded in document.
Conformance levels:
a - Accessible, b - Basic, u - Unicode.
PDF/A-3 (ISO 19005-3:2012)
Attachments in arbitrary formats,
PDF/A and other formats, are allowed
PDF/A-3 has no specific restriction on attachments
Example: Conversion of email with attachments to PDF/A-3:
Attachments as PDF/A and (in addition) embedded in original format.
Conformance levels:
a - Accessible, b - Basic, u - Unicode.
PDF/A-4 (ISO 19005-4:2020)
Attachments as PDF/A are allowed;
other formats only with PDF/A-4e and PDF/A-4f
PDF/A-4 restricts the attachments to PDF/A-1, PDF/A-2 and PDF/A-4
PDF/A-4e and PDF/A-4f allow any type of attachment
PDF/A-4 is based on PDF 2.0 (ISO 32000-2:2017, updated in 2020),
allowing it to take advantage of new PDF 2.0 features.
PDF/A-4 provides normative guidance on long-term archiving of
PDF files based on new features and other changes in PDF 2.0;
including page level output intents and improvements to tagged PDF.
PDF/A-4 supports long-term archiving of PDF 2.0 files
without loss of PDF 2.0 features.
PDF/A-4 introduces ISO-standardized means of archiving
certain types of non-static content common to PDF documents,
such as form fields, 3D content and JavaScript.
By using a conformance class to distinguish files containing
interactive content, the specification responds to market demand
by facilitating the preservation of more information in the file.
This new capability via additional conformance levels supports
the long-term preservation of live forms,
engineering and CAD type documents
(where PDF/A-4e is replacing the need for PDF/E)
and in no way diminishes the traditional archival uses of PDF/A.
PDF/A-2 and PDF/A-3 comprise three different conformance levels, a/b/u,
which often causes confusion for many end-users.
PDF/A-4 simplifies this as PDF/A-4 documents
may or may not contain tags.
No dedicated conformance level is required for
tagged PDF/A-4 documents, effectively eliminating
the previous a/b/u conformance levels.
Similarly, PDF/A-4 documents may or may not contain file attachments.
Attached files must conform to PDF/A-1, PDF/A-2 or PDF/A-4.
PDF/A-4 encourages but does not require addition of higher-level
logical structures, and it requires Unicode mappings for all fonts.
While abandoning the a/b/u conformance levels,
PDF/A-4 introduces two new conformance levels:
PDF/A-4f: Allows any other format / non-PDF/A
file attachments to be embedded;
similar to how PDF/A-3 extends PDF/A-2.
PDF/A-4e: Is targeted at the engineering community.
PDF/A-4e is the successor to the PDF/E-1 standard ISO 24517-1,
which is based on PDF 1.6.
The initial plan to define a new flavor, PDF/E-2,
was cancelled in 2018.
Instead, PDF/A-4e adds RichMedia annotations for 3D
content in U3D or PRC format to the base PDF/A-4 format,
as well as embedded files to create a PDF/A version
compatible with modern geospatial, construction and
engineering workflows.
As with the other PDF/A specifications,
PDF/A-4 does not require or provide mechanisms for authentication;
it’s strictly intended to facilitate long-term preservation.
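The attachment rules described in this section for PDF/A-1 through PDF/A-4 can be condensed into a small lookup. The following Python sketch is illustrative only: the function name allowed_attachments and its return strings are hypothetical, chosen for this example, and the rules are taken from the summary in this section rather than from the ISO texts.

```python
# Hypothetical helper for illustration; the flavour strings and return
# values are assumptions chosen for this sketch.
def allowed_attachments(flavour: str) -> str:
    """Summarize which embedded files a PDF/A flavour permits."""
    part = flavour.removeprefix("PDF/A-")
    if part.startswith("1"):
        return "none"                      # PDF/A-1: no attachments allowed
    if part.startswith("2"):
        return "PDF/A only"                # PDF/A-2: attachments must be PDF/A
    if part.startswith("3"):
        return "any format"                # PDF/A-3: no restriction on format
    if part in ("4e", "4f"):
        return "any format"                # PDF/A-4e / PDF/A-4f
    if part == "4":
        return "PDF/A-1, PDF/A-2 or PDF/A-4 only"
    raise ValueError(f"unknown PDF/A flavour: {flavour!r}")
```

A workflow that has to embed an XML file, for instance, would need a flavour for which this function returns "any format".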
How do you choose between variants of PDF/A standards:
PDF/A-1, PDF/A-2, PDF/A-3 or PDF/A-4?
(callas software blog, March 15, 2023)
How PDF/A-2 and PDF/A-4
are better than PDF/A-1 (2021)
callas software presents advantages with PDF/A-2 and PDF/A-4 vs PDF/A-1;
to avoid problems in PDF/A-1 conversion:
Use cases where PDF/A-3 and PDF/A-4f make a difference!
First, a short reminder - PDF/A Rules for Document Attachments:
Examples of use cases with PDF/A-3 and PDF/A-4f:
PDF/A-3 and PDF/A-4f:
Attachments in arbitrary formats,
PDF/A, XML and other formats,
are allowed
The PDF-based e-invoices are also fully adapted and conforming
to the European Standard EN 16931 for e-invoicing.
The EN 16931 standard for e-invoicing
is a completely technology-neutral standard, see:
E-invoicing services implemented according to EN 16931
can therefore be based on various alternative techniques.
Highlighted approvals of PDF/A-3 + XML-based e-invoicing:
- October 12, 2023:
ZUGFeRD approved in Germany as
PDF-based electronic invoice format
by The German Federal Ministry of Finance (BMF)!
- July 1, 2024:
Factur-X approved in France as
PDF-based electronic invoice format!
Both ZUGFeRD and Factur-X are generally suitable
for the exchange of e-invoices between
public administration, companies/suppliers and consumers.
ZUGFeRD / Factur-X, e-Invoice Formats Based on PDF/A-3 and XML:
- 10 Years of PDF/A-3 Based Electronic Invoicing.
The success-story continues with new hybrid document types.
Almost 10 years ago, ZUGFeRD 1.0,
the first e-invoice data format based on the public standards
UN/CEFACT CII and PDF/A-3, was published.
The stated intention was to digitize invoice exchange and make
the transition from paper to data for SMEs and single users
as smooth as possible without losing efficiency.
The idea of using PDF/A-3 as a carrier format and wrapper for
the XML-based invoice data laid the foundation for the success
of digital e-invoices based on hybrid documents.
Apply the ISO Standard PDF/UA as well and your
PDF invoices are de facto accessible PDF e-invoices.
In 2016, the e-invoicing forums of Germany and France decided
to develop a common hybrid e-invoice standard
to facilitate e-invoicing: ZUGFeRD / Factur-X.
- ZUGFeRD = Factur-X.
Since March 1, 2021, the specification for
ZUGFeRD / Factur-X has been fully adapted to and consistent
with the European Standard EN 16931 for e-invoicing.
PDF and XML-based e-invoices, such as ZUGFeRD / Factur-X,
can be created/issued, transmitted and received in a
structured electronic format that can be processed
automatically and electronically and they thereby comply
with the European Standard EN 16931.
The ZUGFeRD / Factur-X Standard is a
hybrid electronic invoice format,
that manages both structured and non-structured data,
and consists of two components, a PDF/A-3 file and
an embedded XML file (with identical invoice data)
as attachment:
- The PDF/A-3 file represents the visual part of
the invoice and is therefore readable by humans.
- The XML file contains structured invoice data that
is processed automatically and by machines.
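To make the "machine-readable" half concrete, the following Python sketch reads a few fields from a UN/CEFACT CII document using only the standard library. It is a minimal illustration under stated assumptions: the sample XML is invented and heavily shortened (a real invoice contains many more mandatory elements), the element names reflect the CII structure as understood here and should be checked against the ZUGFeRD / Factur-X specification, and extracting the XML from the PDF/A-3 container is not shown.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Namespaces used by UN/CEFACT Cross Industry Invoice (CII) documents.
NS = {
    "rsm": "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100",
    "ram": "urn:un:unece:uncefact:data:standard:"
           "ReusableAggregateBusinessInformationEntity:100",
}

# Invented, shortened sample for illustration only (not a valid invoice).
SAMPLE_XML = """<rsm:CrossIndustryInvoice
    xmlns:rsm="urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100"
    xmlns:ram="urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100">
  <rsm:ExchangedDocument>
    <ram:ID>INV-2024-001</ram:ID>
  </rsm:ExchangedDocument>
  <rsm:SupplyChainTradeTransaction>
    <ram:ApplicableHeaderTradeSettlement>
      <ram:InvoiceCurrencyCode>EUR</ram:InvoiceCurrencyCode>
      <ram:SpecificationTradeSettlementHeaderMonetarySummation>
        <ram:GrandTotalAmount>119.00</ram:GrandTotalAmount>
      </ram:SpecificationTradeSettlementHeaderMonetarySummation>
    </ram:ApplicableHeaderTradeSettlement>
  </rsm:SupplyChainTradeTransaction>
</rsm:CrossIndustryInvoice>"""


def read_invoice_summary(xml_text: str) -> dict:
    """Pull invoice number, currency and grand total out of CII XML."""
    root = ET.fromstring(xml_text)
    number = root.findtext("rsm:ExchangedDocument/ram:ID", namespaces=NS)
    currency = root.findtext(".//ram:InvoiceCurrencyCode", namespaces=NS)
    total = root.findtext(".//ram:GrandTotalAmount", namespaces=NS)
    return {
        "number": number,
        "currency": currency,
        # Decimal avoids floating-point rounding on monetary amounts.
        "total": Decimal(total) if total is not None else None,
    }


summary = read_invoice_summary(SAMPLE_XML)
```

An ERP system would work with values such as these directly, while a human recipient reads the same information from the visual PDF page.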
The format contains different profiles of the invoice data,
which are identical in ZUGFeRD and Factur-X.
This tailors the data of the sender to
the requirements of the recipient.
The e-invoice recipient can choose between processing
the invoice as an ordinary PDF and letting computers
process the embedded XML data.
Both formats can be used both as data in
ERP systems and for visual representation in
workflow and archive systems.
It is clear that hybrid document formats,
such as ZUGFeRD / Factur-X,
make the exchange of electronic invoices between
companies and between companies and
public administrations much faster,
more comfortable and easier.
Also, if the PDF part is designed according to
the ISO Standard 14289, PDF/UA, the e-invoices become
de facto digitally accessible PDF-based e-invoices.
This is fully in line with the requirements of
the EU Accessibility Directive and the Swedish
"Act on accessibility to certain products and services",
which enters into force on 28 June 2025.
The law applies to new products and services
for the consumer market.
E-invoice formats, such as ZUGFeRD / Factur-X,
can then easily be e-mailed by authorities,
companies and other businesses directly to
individuals/consumers who in turn easily can
receive and consume the content of the PDF part
with their ordinary and preferred PDF reader.
The below notices/announcements in Germany and France
are good news for all users of ZUGFeRD, Factur-X and
PDF/A-3, or PDF/A-4f, and XML formats and their communities:
Announcement / Breaking News - October 12, 2023:
ZUGFeRD approved in Germany as
PDF-based electronic invoice format!
The German Bundesministerium der Finanzen, BMF
(The German Federal Ministry of Finance)
has now provided initial information about whether
the already known ZUGFeRD format meets
the legal requirements for electronic invoices.
BMF came to the conclusion that the ZUGFeRD
invoice format and corresponding syntax are in
accordance with Directive 2014/55/EU of 16 April 2014
(OJ L 133, 6.5.2014, p. 1).
For more details see:
Announcement / Breaking News - July 1, 2024
Factur-X approved in France as
PDF-based electronic invoice format!
From 2024 onwards, all companies must be able to
retrieve and store invoices received in the Factur-X format
(which is based on the ISO PDF/A-3 document format).
All French invoicing flows will pass through a Public Billing Portal (PPF),
a portal set up by the French government.
Only the invoice formats XML UBL, XML CII and Factur-X
will be authorised and accepted.
For more details see:
Meanwhile in Sweden,
The Agency for Digital Government (DIGG, www.digg.se)
has for many years systematically published on its website,
in its other channels, and in official documents,
the following untrue and completely incorrect claims
about PDF-based e-invoices:
"Is PDF an e-invoice?
No, a PDF invoice is not an e-invoice.
According to the regulations of Ordinance (2003:770)
on electronic information exchange by government authorities
an e-invoice is an invoice that is issued, sent and received in
a structured electronic format which makes it possible to
process it automatically and electronically.
A PDF invoice does not meet these requirements and thus
cannot be counted as an e-invoice according to
the e-invoice act."
What evidence, if any, does DIGG have for its baseless claims?
The fact is that for many years
- ISO Standard 19005-3:2012 ("PDF/A-3"), and
- ISO Standard 19005-4:2020 ("PDF/A-4f")
- in combination with XML format attachments
have offered precisely the basic features and functionality that are
in demand to enable e-invoicing according to EN 16931.
PDF and XML-based e-invoices, such as ZUGFeRD / Factur-X,
can be created/issued, transmitted and received in a
structured electronic format that can be processed
automatically and electronically and they thereby comply
with the European Standard EN 16931.
Other EU countries fully accept these characteristics and now
implement PDF-based e-invoicing according to EN 16931
between public administration, companies and consumers.
In addition,
e-invoicing according to the technology prescribed by DIGG
for e-invoicing, Peppol, is expressly intended only for
e-invoicing from/between authorities, companies and
other businesses.
Peppol is not designed/intended for e-invoicing
towards private consumers.
Therefore, a human-oriented e-invoice representation is an
absolute prerequisite for a successful implementation
in Sweden of the EU's Accessibility Directive in 2025.
Individuals/consumers will then also need to be able to
receive, read and manage incoming e-invoices from
authorities, companies and other businesses.
PDF-based e-invoices
(based on PDF/A-3 or PDF/A-4f with XML attachment)
are the technical solution that already today ensures and
enables exactly this for individuals/consumers.
- Our formal demands and claims to
The Swedish Authority for Digital Administration (DIGG):
More facts on
e-Invoices based on PDF Standards
Source: PDF Association
Hybrid Invoice Formats
The role of PDF/A-3 for ZUGFeRD and Factur-X
Source: PDF Association
Hybrid Invoice Formats
The role of XML for ZUGFeRD and Factur-X
Source: PDF Association
Hybrid Invoice Formats
Early Milestones, ZUGFeRD and Factur-X
Source: PDF Association
Hybrid Invoice Formats
Now ZUGFeRD = Factur-X
Source: PDF Association
Hybrid Invoice Formats
ZUGFeRD / Factur-X uses PDF and XML,
XRechnung uses only XML
The idea of "visual data" (PDF) and "machine-readable data" (XML)
in one file (as in ZUGFeRD and Factur-X) is so convincing that
it is now also being used for other business documents
such as orders (Order-X) and delivery notes (Delivery-X).
This presentation looks at the development of the various
standard versions (ZUGFeRD / Factur-X, Order-X and Delivery-X),
and describes the experience gained in this context with
PDF/A-3, XMP and file attachments in practical use with
millions of electronic invoices which can be understood
by both humans and IT systems.
- Forum for Electronic Invoicing Germany (FeRD) brochure:
"Electronic Invoices – Practical Guidelines for Companies"
This brochure presents the rules and regulations applying to both
the paper invoice and the e-invoice, and highlights the special
provisions that apply specifically to e-invoices in the areas of
transmission, approval, correction and record keeping.
The Forum for Electronic Invoicing (FeRD):
"ZUGFeRD - The Format for Electronic Invoicing
in the Public and Private Sector"
ZUGFeRD / Factur-X is a kind of translation of
the European legal requirements
(EU Directive 2014/55/EU, European Standard 16931)
and is not application software.
This translation or structural description of a data set and
the associated dependencies must be implemented in
the software used by a business.
Invoice creation with ZUGFeRD / Factur-X:
The integration can be done, for example,
via standard software systems (e.g. ERP or EDI systems)
or in-house IT departments can independently integrate
ZUGFeRD / Factur-X into their individual software.
Many accounting and ERP software systems already
support ZUGFeRD / Factur-X.
By also providing the PDF part in compliance with
the ISO Standard PDF/UA for universally accessible PDF,
the invoice also becomes an accessible PDF.
Accessibility devices (such as screen readers) can then accurately
reproduce the invoice content for the human invoice recipient.
The key strengths of ZUGFeRD / Factur-X are:
- The human-oriented representation using the trusted PDF
to reliably communicate accurate information.
Human-oriented representation is prerequisite for
implementing the EU Accessibility Directive in 2025.
- The machine-oriented EDI information stored within the PDF
for automatic processing of ICT systems.
Benefits of ZUGFeRD / Factur-X:
- Save costs on printing, envelopes and postage
- No need to copy, scan, OCR invoices (fewer errors)
- Approval process can be done digitally
- Faster processing = faster payment
- No need to file invoices as paper documents
- Different software systems only need to
understand a single format (choice!)
- SMEs can meet requirements of large corporations
without prior agreement
- Mails with ZUGFeRD / Factur-X attachment could be
detected, processed and filed automatically
- Banks could read ZUGFeRD / Factur-X invoices
and process them immediately
Press Release - March 1, 2022:
The new version ZUGFeRD / Factur-X is published.
Germany and France are growing together with
the common Factur-X / ZUGFeRD 2.2 standard.
The reference profile makes it easier for companies to
implement e-invoicing, because they are now able to map
any country-specific version of the European standard for
electronic invoices within one single form.
The Factur-X / ZUGFeRD e-invoice format is freely available.
The technical specification is based on the international
UN/CEFACT standard Cross Industry Invoice (CII) and
on the ISO standard PDF/A-3.
This complies with the European standard EN 16931,
which specifies the standards and technical rules for
electronic invoicing in Europe, thus ensuring interoperability
and compliance with legal regulations.
In addition, Factur-X / ZUGFeRD integrates a large EXTENDED
profile which constitutes a standard library of additional
invoice data which could be necessary for
specific needs or use cases.
Factur-X is one of the three formats which all companies and
certified platforms will have to accept on reception in
July 2024 in France, especially for SMEs use, which represent
99 % of companies and more than 50 % of invoice flows.
Factur-X / ZUGFeRD gives companies a tool that helps them
in the best possible way to get ready for future developments
in the field of digitalization of the supply chain.
This new level of standardisation of the common e-invoice format
in Germany and France provides the opportunity to complete
the EN 16931 to address 100 % of use cases.
Additional information on e-invoicing within EU:
Practical Use Case
Dynamic generation of high volumes of
ZUGFeRD e-invoices with callas pdfChip
Picture:
Typical workflow for creation of
ZUGFeRD e-invoices consisting of
a PDF/A-3 file with an embedded XML invoice as attachment
PDF Digital Signatures
Digital Signatures in PDF
Source: PDF Association
ISO 32000-1 + RFCs
ISO 32000-2 + ETSI CAdES/PAdES
Digital Signing of PDF/A Documents
In principle it’s a good idea to always make
the conversion to PDF/A-2/A-3/A-4 before the signing.
Any PDF/A conversion made after signing breaks all existing signatures,
and, much worse, you don’t get any information on
the modifications executed by the converter and
their consequences on the validity of the signature(s).
In some cases it’s not possible to convert the PDF before the signing.
Nevertheless the archivists require a long-term format for their archives.
A possible solution could be to make the conversion as the last step,
producing a PDF/A-3 (based on PDF 1.7),
or even a PDF/A-4 (based on PDF 2.0),
and embed the originally signed PDF as an attachment.
This works well, provided all participating parties agree on this
way of satisfying long-term archiving (LTA) and digital signing (DigSig).
But, always try to avoid such constructs with signing components and
redesign the workflow to match the "conversion before signing" goal.
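Why a later conversion invalidates a signature can be shown with a simplified Python sketch. It assumes only the general principle that a digital signature is bound to a digest of the signed bytes; real PDF signatures use CMS/PAdES structures over a defined byte range, which this sketch does not implement. The byte strings and function names are invented for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def signature_still_valid(current: bytes, recorded_digest: str) -> bool:
    """Simplified check: the bytes must still match the signed digest."""
    return digest(current) == recorded_digest


# Placeholder content standing in for a signed PDF file.
signed_bytes = b"%PDF-1.7 original content"
recorded = digest(signed_bytes)  # value fixed at signing time

# A converter rewrites the file (fonts, metadata, colour profiles, ...),
# so the bytes covered by the signature change.
converted_bytes = signed_bytes.replace(b"original", b"converted")

unchanged_ok = signature_still_valid(signed_bytes, recorded)
converted_ok = signature_still_valid(converted_bytes, recorded)
```

Here unchanged_ok is True and converted_ok is False. Embedding the originally signed file unchanged as an attachment keeps its bytes, and therefore its signature, intact.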
Follow/participate in the work with
the future development of digital signing and validation:
Application of the PDF/A Standard in Sweden
Binding rules apply for all Swedish government agencies and bodies
keeping public documents from state archives.
For digital archiving of office documents and digital documents,
Swedish government authorities must follow
Riksarkivet's (The National Archives') regulations and rules:
- RA-FS 2009:1 general guidelines for electronic documents
- RA-FS 2009:2 technical requirements for electronic documents.
As of 2016, for long-term archiving of office documents and
digital documents these regulations and guidelines prescribe
the use of the file format: PDF/A-1.
(Comments/Please be aware:
Riksarkivet/The National Archives is expected to revise its
regulations and rules as PDF/A-1 is no longer recommended
because it is a standard that was designed based on
what was possible almost 20 years ago.
IT and PDF standards have evolved a lot since then!)
PDF is Here to Stay - It Will Never Go Away
PDF technology is a pervasive feature of
the world's communications infrastructure.
With a unique and unmatched feature-set;
no other technology comes close.
We're not going back to paper,
so it's long past time for governments and businesses to focus
just a little on this ubiquitous format that's never going away.
PDF in the U.S. Federal Archiving Community,
Library of Congress (USA) blog post (2020):
PDF/A and Enterprise Content Management Systems
Do Complement Each Other Perfectly!
Not all tools create PDF/A compliant files suitable for storing in ECM systems.
There are numerous tools on the market, including freeware,
with which users can create PDFs and store them in ECM systems.
But are the results of this always satisfactory?
With PDF/A the archive becomes the “Noah’s Ark” for every document
If only documents in original formats
(MS Word, MS Excel, PDF, HTML, TIFF, JPG, AFP and PCL,...) are stored
the archive's ability to deliver usable content will always be in doubt.
Instead, with every original document, a PDF/A document
should also go "on board" to ensure functionality in an
unknown environment after the "flood".
Memorializing Online Transactions with PDF Documents
What to do when RDBMS systems fail to memorialize transactions?
By capturing the visual representation (in PDF/A!)
at the time that the transaction is processed it is
guaranteed that the data used in creating the document is
current and valid and the visual representation of the transaction
matches the expectations of all the parties involved in the transaction.
Future generations' access to and rendering of vintage email
Packaging Email Archives Using PDF (EA-PDF)
Archiving email isn't easy or obvious.
Commonly, solutions are vendor-specific and email clients are required;
not an ideal solution for static records.
In 2019 the University of Illinois was awarded a grant by
the Andrew W. Mellon Foundation to develop conversion criteria
and requirements for archiving email into PDF containers.
The EA-PDF Working Group, expert members from government,
academia and industry, has filed a report that establishes
high-level functional requirements for an idealized use of
ISO 32000 Portable Document Format (PDF) technology
as a model for packaging email for archival or other purposes.
These requirements provide a framework within which interested people
from the archives, library, museum, digital preservation, and developer
communities can collaborate to develop a technically detailed
specification and implementation reference model.
The EA-PDF concept integrates the capture of EML or MBOX content
with PDF as a packaging, representation and distribution model
for individual emails up to complete mailboxes.
EA-PDF Working Group:
EA-PDF Working Group Report:
“A Specification for Using PDF to Package and Represent Email"
EA-PDF Working Group Presentation at PDF Days Europe 2022
Future generations' access to and rendering of vintage email?
Archives around the world are filled with handwritten letters and typed memos.
But what about correspondence of a later vintage?
How should governments, universities, business, and archives
ensure the future generations can access and render email?
Emails for eternity (July 14, 2021)
Digital messages often contain valuable knowledge that must be retained.
But how can e-mails be elegantly archived?
To date, there is no supreme solution.
However, for a number of reasons,
the PDF/A route currently seems to be the most practical.
The good news is that e-mails are digital per se and already contain metadata.
This makes it fundamentally easier to archive them than
paper-based communications.
However, in many cases, there are no company guidelines in this regard,
so users decide individually how to handle their e-mails.
As a result, there is a high risk that business-relevant messages are lost.
Emails are handled by various specialized systems that enable
the creation, transport, viewing and storage of these digital messages
(lifecycle: client, server, relay, archiving system).
For more on secure archiving of emails in the PDF/A format,
we need to take a deep dive into what an email consists of:
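As a rough orientation, an email consists of header fields (the metadata), one or more body parts, and optional attachments. The Python sketch below uses the standard library email package to build a small sample message and then list these components, which are the pieces an archiving converter has to carry over into a PDF/A package. The addresses, subject and file name are invented sample values, and the function names are hypothetical.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_sample_email() -> bytes:
    """Create a small sample message with one attachment (invented data)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "archive@example.com"
    msg["Subject"] = "Quarterly report"
    msg.set_content("Please find the report attached.")
    msg.add_attachment(
        b"%PDF-1.7 sample bytes",
        maintype="application",
        subtype="pdf",
        filename="report.pdf",
    )
    return msg.as_bytes()


def describe_email(raw: bytes) -> dict:
    """List the header metadata, plain-text body and attachment names."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "headers": {name: str(msg[name]) for name in ("From", "To", "Subject")},
        "body": body.get_content().strip() if body else "",
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
    }


info = describe_email(build_sample_email())
```

Under the PDF/A attachment rules, the attachment would then be converted to PDF/A pages, embedded as PDF/A, or embedded in its original format, depending on the PDF/A part chosen.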
PDF/A Tools
"PDF/A-Ready" Software Tools:
Many "PDF/A-Ready" tools are available to support
all aspects of PDF/A production environments.
The current PDF/A specifications are well established and mature
as far as software developers are concerned, among them:
actino DRM and DPS
Cloud- / Web-based Solutions for PDF Processing
axaio MadeToTag
Creation of accessible PDF from within Adobe InDesign
according to the ISO Standards PDF/UA-1 and PDF/A-2a
callas pdfaPilot
Optimizes and standardizes PDF documents and email
in automated workflows to PDF/A for long-term archiving
as well as e-invoices based on PDF/A-3 or PDF/A-4 and XML.
Also very useful for validation and fixes of PDF files
for conformance to PDF/UA
callas pdfChip
Dynamic conversion of HTML content to the desired format
for distribution through any communication channels.
Typically used in automated workflows for
print, publishing and e-archiving to dynamically create
customized high-quality PDF files in large quantities from HTML
(including PDF/A and PDF/X); such as housing/property information,
tickets, receipts, order data, prescriptions, invoices/e-invoices based on
PDF/A-3 or PDF/A-4 and XML.
Foxit PDF Compressor and Rendition Server
Solutions for Conversion and Compression of
Documents and E-mail to PDF/A for e-Archiving
Laidback Solution - FileTrain
Solutions for Automation of Any Type of Workflow
ISO 14289-1 (PDF/UA-1)
international standard for universally accessible PDF
(based on PDF 1.7)
Approved international standard since 2012
(minor revision in 2014)
Specifies the use of ISO 32000-1 (PDF 1.7)
to produce accessible digital documents.
PDF/UA-1 is of interest to organizations concerned
with conformance to regulations requiring
accessible digital content based on PDF 1.7
The use of PDF/UA-1 is very high.
Since its release, PDF/UA-1 has been broadly implemented
in software and is both referenced directly and
implied in regulations around the world.
ISO 14289-2 (PDF/UA-2)
international standard for universally accessible PDF
based on PDF 2.0
Approved international standard since 2024
Specifies the use of ISO 32000-2 (PDF 2.0)
to produce accessible digital documents.
PDF/UA-2 is of interest to organizations concerned
with conformance to regulations requiring
accessible digital content based on PDF 2.0
The use of PDF/UA-2 will be very high.
Breaking News! (August 12, 2024)
No-cost access to PDF’s accessibility standards!
Starting today, PDF users everywhere can leverage
ISO-standardized accessibility requirements for content in PDF.
The PDF Association and leading PDF accessibility companies
make the ISO standards for accessibility in PDF technology,
- ISO 14289-1 (PDF/UA-1), ISO 14289-2 (PDF/UA-2) and ISO TS 32005 -
available for download at no cost.
These ISO standards define "the gold standard" for accessible PDF files.
Although users worldwide benefit from PDF’s accessibility features,
adoption of accessibility standards for PDF technology has lagged
because a significant portion of the ecosystem can’t or won’t pay
for expensive ISO publications.
Without a no-cost specification it’s very hard to reach
all these developers, remediators and other users.
All types of PDF users can begin to leverage ISO-standardized
accessibility requirements for content in the PDF file format:
- End-users who need assistive technology in order to
navigate and read digital documents can use these
standards to foster improved products and services.
- Organizations can use these standards to set goals for
the documents they create, ingest, publish, share or manage.
- Document remediators can use these standards in performing
professional services related to ensuring PDF files are accessible.
- Software developers can use these standards as guidance in
developing software to create and process tagged PDF.
Download:
The ISO documents covered by this announcement include:
ISO 14289-1:2014
Better known as “PDF/UA-1” (UA stands for “universal accessibility”),
this document provides critical accessibility requirements for
PDF documents conforming to PDF 1.7.
ISO 14289-2:2024
The "next-generation" accessibility standard for PDF,
PDF/UA-2 provides accessibility requirements specific to PDF 2.0,
and thus opens the door to a wide variety of
accessibility enhancements enabled with PDF 2.0.
ISO TS 32005
This Technical Specification is published by ISO to provide
rules for integrating structure elements defined in PDF 1.7
with those defined in PDF 2.0.
The PDF Association’s work to advance digital accessibility
A major focus of the PDF Association’s work is to
increase awareness and adoption of standards and
best practices for digital accessibility;
from advancing accessible PDF to promoting
accessibility in ISO standards documents:
Please also view the free booklet:
Follow/participate in the work with the future development of PDF/UA:
Accessible PDF
A fully PDF/UA compliant PDF can be just as
accessible as a WCAG compliant website
The Purpose of PDF/UA
A digital or electronic medium is accessible when it can be easily opened, read,
understood and navigated by everyone, with or without disabilities.
The purpose of PDF/UA is to define a complete set of
requirements for universally accessible PDF documents.
Rather than applying to the PDF file format alone,
these clear specifications also define compliant
assistive technology and PDF reading software.
PDF/UA for Beginners
What is PDF/UA?
The UA stands for Universal Accessibility.
PDF/UA is the ISO standard for accessible PDF documents.
Ensuring access to information for people with disabilities
is in many cases a legal requirement.
PDF technology includes a feature known as "tagged PDF"
that makes accessible PDF files possible.
A good example of the need for tagged PDF is a person
who cannot see the text or images in a document.
The Tagged PDF feature allows authors to provide
the sequence and nature (heading, paragraph, list, etc.)
of the text, and alternative descriptions for images.
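The idea of "sequence and nature" can be pictured as a tree of structure elements that is read out in order. The Python sketch below is a simplified stand-in, not the actual PDF data model: the class StructElem and the function reading_order are hypothetical names, and the sample content is invented. The tag names H1, P and Figure are standard structure types in tagged PDF.

```python
from dataclasses import dataclass, field


@dataclass
class StructElem:
    """Simplified stand-in for a tagged-PDF structure element."""
    tag: str                      # structure type, e.g. "H1", "P", "Figure"
    text: str = ""                # text content, if any
    alt: str = ""                 # alternative description, used for images
    children: list["StructElem"] = field(default_factory=list)


def reading_order(elem: StructElem):
    """Yield (tag, spoken text) pairs in depth-first document order."""
    # For figures, assistive technology presents the alternative text.
    spoken = elem.alt if elem.tag == "Figure" else elem.text
    if spoken:
        yield elem.tag, spoken
    for child in elem.children:
        yield from reading_order(child)


document = StructElem("Document", children=[
    StructElem("H1", text="Annual report"),
    StructElem("Figure", alt="Company logo"),
    StructElem("P", text="Revenue grew during the year."),
])

spoken_items = list(reading_order(document))
```

A screen reader working from such a structure can announce the heading as a heading, describe the image through its alternative text, and then read the paragraph, in the order the author intended.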
Why should you use PDF/UA?
For government agencies in many countries accessibility
is simply a legal requirement, as countries owe their citizens
equal access to information.
It's also often required of companies offering digital public services.
Good tagging produces better documents in ways
that go beyond accessibility.
Including document structural information helps in
the optimal display of documents on different devices,
and helps to categorize content for document analysis
applications such as those using artificial intelligence.
What is PDF/UA important for?
For meeting legal requirements, and especially for PDF documents
that are published on websites, both public and private.
The effort needed is worth it; especially for
important documents that have a longer lifespan.
A common approach in practice is to start with low-barrier documents,
which represent a compromise between cost and benefit aspects.
How to create PDF/UA?
For digital documents, the "secret" is in the source.
PDF/UA files should be made using software that
supports Tagged PDF, as editing an inaccessible PDF
afterwards requires considerably more effort.
Most documents are created in Office applications,
where well-prepared templates help a lot.
Office packages such as Microsoft Office offer (rudimentary) checking functions.
Conversion to PDF/UA is then at best an uncomplicated procedure.
Technically, there are many specific requirements to meet for a
PDF document to be a PDF/UA-compliant accessible document.
From an organizational point of view,
in addition to training on accessible authoring,
practice has shown that experts are needed
to prepare and maintain templates,
allowing users to concentrate on the document content.
Scanned documents are a separate area.
A scanned document is not accessible at first.
There are tools for single documents as well as
for mass processing that perform auto-tagging.
These tools typically recognize a lot of the structures
in the document, but only achieve a low-barrier result.
The remaining manual rework required for full accessibility
is often significantly reduced.
Also available are tagging services to manually optimize documents.
Software tools supporting PDF/UA:
PDF/UA Defines Technical Requirements
for Universally Accessible PDF
PDF/UA defines the technical requirements that must be
met when a PDF document is created
for it to be universally accessible.
The standard specifies HOW relevant PDF content
(such as semantic content, text content, images,
form fields, comments, bookmarks, and metadata)
may be used in PDF/UA-compliant documents.
Properly tagged PDFs are essential and a prerequisite
for accessibility so that screen reader devices for
visually impaired people or reading software for
users with learning disabilities can provide
rich access to a PDF’s content.
PDF tags are also an effective method to improve
Search Engine Optimization (SEO).
Even automated text extraction from PDF documents
is easier with well-tagged documents.
Introduction to PDF/UA
The ISO standard for universal accessibility
PDF Association
"The Matterhorn Protocol"
To promote adoption of PDF/UA,
the ISO standard for accessible PDF,
by software developers, service bureaus and
those interested in document accessibility,
the PDF Association's PDF/UA Technical Working Group
has developed the Matterhorn Protocol,
a list of all the possible ways to fail the PDF/UA-1 Standard.
"PDF/UA-Ready" software tools verify/confirm PDF/UA conformance
based on the Matterhorn Protocol's set of checks to facilitate
the exchange of detailed information on PDF/UA conformance.
PDF/UA conformance requires validation of both syntax and semantics.
The Matterhorn Protocol specifies a common set of
31 "Checkpoints" with 136 failure conditions, whereof
- 87 failure conditions can be checked by software alone,
- 47 failure conditions usually require human judgment,
- 2 failure conditions have no specific tests.
Some failure conditions pertain to the document,
some to the page and most to individual objects such as
tags, tables or annotations.
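As a quick arithmetic check, the three categories quoted above add up to the stated total of 136 failure conditions. The short sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Failure-condition counts as quoted in the text above.
checked_by_software = 87
needs_human_judgment = 47
no_specific_test = 2

total = checked_by_software + needs_human_judgment + no_specific_test
machine_share = checked_by_software / total

print(total)                   # 136
print(f"{machine_share:.0%}")  # 64%
```

Roughly two thirds of the failure conditions can therefore be handled by software alone, with the rest left to a human reviewer.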
The 47 checks that may require human judgment boil down to:
- Confirming that the document's semantics
as indicated by the tags are accurate
- Confirming that the order of semantic content is logical
- Confirming that any role-mappings in use are valid
- Several checks that apply equally to other forms of content
(color, contrast, metadata, alternate text for images, language)
- Checks pertaining to JavaScript,
or other content-specific checks
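To illustrate the split between machine checks and human judgment for role-mappings, here is a minimal sketch. It assumes the role map has already been read from the PDF into a plain Python dict, and it uses a small subset of the standard structure types; the function name and the sample tags are hypothetical. Software can verify that every custom tag eventually resolves to a standard type without looping, while deciding whether the chosen mapping is semantically appropriate remains a human task.

```python
# Illustrative subset of standard structure types (not the full list).
STANDARD_TYPES = {
    "Document", "Part", "Sect", "H1", "H2", "H3", "P",
    "L", "LI", "Lbl", "LBody", "Table", "TR", "TH", "TD",
    "Figure", "Span", "Link",
}


def resolve_role(tag, role_map):
    """Follow role_map from a custom tag until a standard type is reached.

    Raises ValueError if the chain is circular or never reaches
    a standard type.
    """
    seen = set()
    current = tag
    while current not in STANDARD_TYPES:
        if current in seen:
            raise ValueError(f"circular role mapping at {current!r}")
        seen.add(current)
        if current not in role_map:
            raise ValueError(f"{current!r} does not map to a standard type")
        current = role_map[current]
    return current


# Hypothetical role map: "Chapter" -> "Heading" -> "H1".
role_map = {"Chapter": "Heading", "Heading": "H1"}
print(resolve_role("Chapter", role_map))  # H1
```

A mapping such as "Chapter" to "P" would pass this check just as well, which is why a reviewer still has to confirm that the mapping makes sense.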
Recommended reading:
Matterhorn Protocol v1.1 - PDF/UA Conformance Testing Model (2021)
The 1.1 release of the Matterhorn Protocol, released in April 2021,
adds a new failure condition and provides several clarifications.
The PDF file is tagged to reflect current best-practice in tagging
PDF documents for accessibility and reuse.
Note: The Matterhorn Protocol 1.1 conforms to ISO Standards
PDF/UA-1 (ISO 14289-1) and to PDF/A-2a (ISO 19005-2),
and is presented as a reference-quality PDF/UA file.
PDF techniques for accessibility:
A new model
(October 27, 2023)
In 2018 the PDF Association’s PDF Accessibility Liaison Working Group
began a project to develop a set of definitive techniques for
accessibility in PDF files.
The PDF Association’s Accessibility Techniques
are designed to provide guidance to two key groups:
- End users can use these Techniques to learn
how to tag or check PDF files.
- Developers can use these Techniques to understand
the technical requirements of accessibility.
The PDF Association will shortly launch
its initial set of “Fundamental Techniques”.
To provide a taste of what is to come,
PDF Association now presents examples of PASS and FAIL techniques.
This short article outlines how PDF Association has set out to
improve upon W3C/WCAG’s techniques in several specific ways:
PDF Association Reference Suite, V1.1 (September 17, 2020)
Accessible PDF Documents in Compliance with
the ISO Standard 14289-1 for Universally Accessible PDF - PDF/UA-1
PDF Association Reference Suite V1.1 adheres to these recommendations:
The PDF/UA Reference Suite serves as a reference for
software developers and practitioners interested in
best-practices for creating tagged and accessible PDF files.
Ranging from publications to transactional records
the collection represents a cross-section of document types
reflecting the wide variety of uses for PDF technology.
Documents included in the PDF/UA Reference Suite demonstrate
correct tagging in a number of sophisticated use cases, including:
- Content spanning multiple pages
- Complex table structures
- Interactive forms
- Links targeting structure elements
- Scanned documents
In addition to conformity with PDF/UA-1, some files are also in
conformity with PDF/A-2, the ISO archival standard for PDF.
Some files additionally demonstrate that PDF and PDF/UA-1
support the use of structure elements for diverse purposes
so long as they do not impact interpretation or representation
of the document’s logical structure.
Why tagged PDF is an important prerequisite for giving
everyone access to digital information in PDF documents
The IT and PDF Industry drives Tagged PDF forward
In PDF documents, as in HTML, content semantics are
expressed via tags, hence “tagged PDF”.
Tagged PDF allows for semantically accurate extraction
and reuse of text and annotations enabling accessibility,
reflow and other applications.
Tagged PDF is an optional feature in the PDF file format
and thus not every PDF file is tagged.
However, modern tools such as Apple’s office suite automatically add tags
when exporting to PDF, and Google’s Chrome now creates tagged PDF as well.
Older tools may require an explicit option to be enabled when exporting.
On the one hand, the fact that tags are optional means that PDF is
extraordinarily flexible in accommodating every type of content from
every source imaginable, even when the original source lacks semantics.
On the other hand, tags require a knowledgeable document author
and capable software to achieve good results.
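Because tagging is optional, software often needs to find out whether a given file claims to be tagged at all. The sketch below assumes the document catalog has already been parsed into a plain Python dict by some PDF library (the dict and the function name are hypothetical stand-ins); the key names /MarkInfo, /Marked and /StructTreeRoot are the ones used by the Tagged PDF feature. Passing this check says nothing about the quality of the tags.

```python
def is_tagged(catalog: dict) -> bool:
    """Return True if the catalog declares Tagged PDF.

    Looks for a /MarkInfo dictionary whose /Marked entry is true
    and for a structure tree root.
    """
    mark_info = catalog.get("/MarkInfo") or {}
    return bool(mark_info.get("/Marked")) and "/StructTreeRoot" in catalog


# Hypothetical parsed catalogs for two files.
tagged = {"/Type": "/Catalog", "/MarkInfo": {"/Marked": True}, "/StructTreeRoot": {}}
untagged = {"/Type": "/Catalog"}

print(is_tagged(tagged))    # True
print(is_tagged(untagged))  # False
```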
This article offers an overview of PDF industry
activities pertaining to tagged PDF:
The Value of (Correctly) Tagged PDF
Tagged PDF offers a lot more than access to users with disabilities.
From search engines to mobile devices, tagged PDF offers
powerful options to make content "accessible for all"
thanks to reuse of page-based content.
PDF was originally intended to serve as digital paper;
a properly rendered page irrespective of software or operating system.
Pages, however, aren’t just for reading.
Since people like to add notes, draw lines and fill forms, Adobe Systems,
the inventors of PDF, decided to cater to these uses as well. PDF rapidly
accumulated new features beyond faithfulness to the rendered page
- it began to mirror the interactive capabilities of real paper.
The first generation of interactive PDF features consisted of
annotations of various types. Some allowed users to add text,
others allowed users to draw lines and boxes onto the page.
Still others go beyond the paradigm of the page,
making it possible to add hyperlinks, audio and movies to PDF.
The second generation of interactive PDF brought the ability to
deploy a PDF’s content outside the page-based world.
Tagged PDF provides the means to effectively deploy a
final-form document to a mobile device.
It’s the same means by which PDF files may be made accessible.
One of the primary motivations for tagged PDF was to achieve
compliance with regulations that require digital documents
to be accessible to users with disabilities, but implementers can
leverage tagged PDF to accomplish or enhance a wide range of
end user activities.
Guide on how to correctly tag a PDF file for accessibility:
Correctly tagged PDF is a prerequisite for display of PDF
on mobile devices / small screens:
The disadvantages of untagged PDF content vs
the benefits of correctly tagged PDF content:
Semantics or ordering:
- Untagged Content:
No semantic types or ordering;
content is ordered solely for rendering purposes
- Tagged Content:
Semantic type and order is determined,
content may be reused accordingly
Search engines:
- Untagged Content:
Search engines cannot reliably access words and phrases
- Tagged Content:
Search engines get reliable access to content
Reflow of page content:
- Untagged Content:
No reliable means of reflowing page content onto smaller devices
- Tagged Content:
Includes information necessary for reflow
Real content and artifacts:
- Untagged Content:
“Real” content and “artifacts” aren’t distinguished
- Tagged Content:
Consuming software can choose to utilize or ignore artifacts
Content copying and extraction:
- Untagged Content:
Content copying and extraction is unreliable
- Tagged Content:
Content may be extracted with confidence
PDF/A conformance level A:
- Untagged Content:
Not eligible for PDF/A conformance level A
- Tagged Content:
May conform with PDF/A conformance level A
WCAG 2.0 or U.S. Section 508 Compliance:
- Untagged Content:
Cannot comply with WCAG 2.0 or U.S. Section 508
- Tagged Content:
May comply with WCAG 2.0, U.S. Section 508 and
other accessibility regulations
Accessibility:
- Untagged Content:
Inaccessible to disabled users
- Tagged Content:
Accessible to those with PDF-aware Assistive Technology
Guides to Well-Tagged PDF Documents (WTPDF)
Tagged PDF Best Practice Guide: Syntax
(V1.0.1 - January 14, 2023)
This document is intended for developers
implementing tagged PDF and PDF/UA.
Others (including authors with some technical
knowledge of PDF’s accessibility mechanisms)
may also benefit from this document.
For example, this guide is intended to be useful for
those performing detailed accessibility testing on
PDF documents claiming conformance with PDF/UA,
or on PDF documents claiming to be accessible
according to some other specification.
The guide includes detailed guidance
for all structure element types and attributes,
and provides guidance for PDF 2.0.
Well-Tagged PDF (WTPDF)
Using Well-Tagged PDF for Accessibility and Reuse in PDF 2.0
PDF 2.0, the most recent specification of the PDF file format,
introduced powerful new capabilities to Tagged PDF that enhance
PDF’s capacity to deliver reusable and accessible content.
With the introduction of WTPDF, the PDF Association addresses
the critical needs of both reuse and accessibility,
unlocking the full power of PDF 2.0.
The Well-Tagged PDF 2.0 (WTPDF) specification
(V1.0.0 - February 28, 2024)
This document describes a usage of PDF 2.0 (ISO 32000-2)
that is compatible with PDF/UA-2 (ISO 14289-2).
A common specification for both reuse and accessibility
is a massive leap forward for PDF technology.
The primary purpose of this specification is to
define how to represent electronic documents
in the PDF format in a manner that allows
the file to be both reusable and accessible across
a wide spectrum of possible use-cases.
Well-Tagged PDF (WTPDF) provides developers with
comprehensive requirements for software that seeks to
create fully reusable and accessible PDF 2.0 files
in an interoperable manner.
If you support WTPDF in your daily business,
you also support PDF/UA-2.
This specification’s rules regarding accessibility
are mirrored in the ISO 14289-2 (PDF/UA-2) specification,
thus this specification is the canonical reference for
both PDF reuse and PDF accessibility in PDF 2.0.
Use cases for this specification include:
- ensuring accessibility of PDF 2.0 files;
- managing reflow of content
(e.g., for responsive layout on mobile devices);
- derivation to other formats, including HTML;
- interoperable structuring of unstructured content;
- content and data extraction (e.g., copy-and-paste);
- selection, annotation and redaction;
- enhancing searchability;
- unlocking content and semantics for use by AI;
- change-tracking;
- round-trip editing
(e.g., word processor → PDF → word processor).
Using PDF/UA in accessibility checklists (2018)
PDF/UA simplifies the accessibility process.
Applying PDF/UA to accessibility-validation processes allows one
to package sets of tests together, streamlining the validation process.
The relationship between PDF/UA and WCAG
All web content, including PDF files,
must meet the guidelines of WCAG 2.x at the AA level.
But, WCAG's recommendations alone are not enough
to make a universally accessible PDF.
The challenge with WCAG in relation to PDF files is that
"W" stands for "Web" and "G" for "Guidelines".
WCAG is HTML oriented and WCAG's guidelines do not provide
many opportunities to physically test PDF files with respect to
digital accessibility.
For the technical implementation of accessible PDF,
compliance with the ISO Standard PDF/UA is also a requirement;
a PDF file can be compliant both with WCAG and PDF/UA.
PDF/UA is a required complement, not an alternative, to WCAG.
PDF/UA is consistent with WCAG, but far more technically specific,
and provides a clear-cut means of affirming that a given
PDF document meets high standards for digital accessibility.
A fully PDF/UA compliant PDF can be just as
accessible as a WCAG compliant website.
To get the benefit of PDF/UA-1 users will need software
that supports PDF 1.7 and PDF/UA-1.
What is the relationship between
WTPDF, PDF/UA-2 and WCAG 2.x?
To get the benefit of Well-Tagged PDF (WTPDF) or PDF/UA-2
users will need software that supports PDF 2.0 and PDF/UA-2.
WCAG establishes generic accessibility norms for web technologies
(including PDF) focused on end-user outcomes,
whereas WTPDF and PDF/UA focus entirely on the construction of
PDF files for reusability and accessibility.
As such, WCAG, WTPDF and PDF/UA are entirely complementary:
- WCAG provides requirements regarding content accessibility;
- WTPDF and PDF/UA provide requirements
that ensure accessibility in the PDF context.
The best practice for document authors
producing accessible PDF files is to:
- consider WCAG’s requirements
when designing and creating content;
- use software capable of meeting
WTPDF and PDF/UA requirements to produce the PDF files.
Why would I use
WTPDF and PDF/UA-2 instead of PDF/UA-1?
PDF 2.0 introduces new capabilities providing
specific solutions for the following types of content:
- Mathematical expressions
- Fragments of documents
- Headings which skip levels
- More than 6 levels of headings
- Sub-divisions of block elements (e.g., lines of code)
- Documents including “side” content
- Documents with both titles and headings
- Lists separated in sections with other content between list items
- Links targeting headings or other content
- Content that uses emphasis
- Page numbers, line numbers, Bates numbers
- Redactions
- Watermarks
WTPDF and/or PDF/UA-2 are required to take advantage of these
PDF 2.0 capabilities in a consistent, reusable and accessible manner.
In addition, WTPDF and PDF/UA-2 add comprehensive new rules for
reuse and accessibility of existing PDF features,
including layout attributes and annotations,
that PDF/UA-1 did not fully address.
Achieving WCAG 2.x with PDF/UA
Why aren’t the PDF Techniques for WCAG 2.x sufficient?
Creators and vendors who deliver PDF files are in many cases asked
to deliver PDF files in conformance with WCAG 2.x.
For many vendors this is unknown territory, and WCAG 2.x does not
provide sufficient PDF-specific technical information to achieve similar
results between situations or implementations.
WCAG and PDF/UA complement each other.
The AIIM guide, "Achieving WCAG 2.0 with PDF/UA",
shows what’s necessary to create, process and validate,
(in PDF file-format and conforming reader terms),
a PDF/UA conforming document and reader
to meet all applicable WCAG 2.0 Success Criteria.
The AIIM guide is here:
Breaking News! (September, 2018)
PDF Association helps W3C’s Web Accessibility Initiative
to modernize the W3C's PDF Techniques for Accessibility!
The ISO Standard PDF/UA (accessible PDF) to form the basis of a
revised and updated set of W3C's PDF Techniques for WCAG 2.1
More details:
U.S. Access Board Affirms:
PDF/UA required for “modern” PDF software (2019)
Accessibility best-practices for websites and digital documents
increasingly specify WCAG for HTML/CSS/JavaScript content and video,
and PDF/UA for digital documents.
New U.S. Section 508 rules apply to all forms of federal ICT,
regardless of file format or method of distribution.
U.S. Section 508 applies to all ICT / all forms of digital communication.
Not just websites, but documents, media, blogs, social media, etc.,
for all public-facing ICT, plus 9 categories of non-public-facing ICT
including personnel actions, questionnaires or surveys,
templates or forms, education or training materials,
web-based intranets.
U.S. Section 508 defines by reference international accessibility standards:
- WCAG for websites and HTML information, and
- PDF/UA-1 for PDF files
WCAG and PDF/UA complement each other.
PDF/UA is consistent with WCAG, but far more technically specific,
and provides a clear-cut means of affirming that a given PDF document
meets high standards for accessibility.
New Section 508 rules require PDF/UA
for PDF 1.7 documents (2017).
The U.S. Access Board has issued new rules updating its
“U.S. Section 508” accessibility requirements.
PDF/UA-1 support is required for PDF creation software producing PDF 1.7 files.
U.S. Access Board Affirms:
PDF/UA support required by PDF software (2019)
U.S. Americans with Disabilities Act (ADA)
also valid for commercial web content (2019)
October 7, 2019, will be remembered in
the accessibility community for a long time.
As of this date, websites and mobile applications in
the U.S. will be assessed as "public accommodations"
rather than merely as one of many ways in which a
consumer might access a retailer’s offerings.
As such, the accessibility requirements (and penalties for non-compliance)
of the Americans with Disabilities Act (ADA) will apply.
It is no longer only federal, state and local governments and their
contractors that are required to ensure their digital content
is accessible for everyone.
The power of the ADA may now be leveraged to force corrective
action by virtually any commercial organization offering
a public accommodation.
See also:
Application of the PDF/UA Standard in Sweden
Access to digital information is a fundamental right for everyone.
Sweden as a nation stands by:
- the UN Declaration of Human Rights,
- the UN Convention on the Rights of Persons with Disabilities,
- the Swedish Discrimination Act, and
- the Swedish Act on Accessibility to Digital Public Service,
based on the EU Web Accessibility Directive.
PDF/UA is applicable for Swedish government agencies and bodies
in making public sector documents universally accessible for all.
See also:
Entirely barrier-free:
Accessible PDF (PDF/UA) for Accessible eGovernment
Access to digital information is a fundamental right for everyone.
Making information easily accessible to citizens is undoubtedly
a big part of eGovernment and is sought after by federal and
state authorities as well as districts, cities and municipalities.
Increasingly, information is only offered and passed on in digital form,
whereby the reliable and user-friendly Portable Document Format (PDF)
has established itself worldwide as the preferred file format.
To ensure unrestricted access in every respect,
PDF files must meet certain requirements.
These are defined in PDF/UA as the ISO standard for accessible PDF documents.
It ensures that even citizens with greatly diminished vision,
insufficient command of written language or motor limitations can
capture and interactively use documents without outside help.
More in this blog article:
Services and tools to create
accessible PDF documents and forms
according to the ISO Standard PDF/UA
PDF Accessibility Liaison Working Group (LWG)
Follow/participate in the work with:
PDF Accessibility Working Group (LWG)
Mission
PDF Accessibility LWG is focusing on developing techniques for
achieving PDF/UA and WCAG compliance.
PDF Accessibility LWG meets weekly to review example
"pass" and "fail" PDF files ("techniques"),
and develop appropriate metadata.
Objectives
Techniques for achieving both PDF/UA and WCAG compliance
are fundamental prerequisites for successful digitization.
The PDF Accessibility LWG is working to answer:
What are the fundamental accessibility techniques for PDF,
and how should a document’s components, such as
title, table row header, table spanning more than one page,
nested list, caption or formula, be tagged for digital accessibility?
When searching for clear-cut answers to everyday tagging situations, however,
practitioners often find locating the right content to be cumbersome,
sometimes ambiguous, or simply inconclusive.
The PDF Accessibility Liaison Working Group is working towards
the development of a full set of techniques and examples to demonstrate
correct solutions for all tagging issues:
Who can participate?
The PDF Accessibility LWG is open to all PDF Association members
and, by invitation, to non-member accessibility professionals and
end user subject matter experts.
PDF/UA Processor Liaison Working Group (LWG)
"PDF/UA Processor"
Software or hardware processing PDF/UA files
Follow/participate in the work with:
PDF/UA Processor Liaison Working Group (LWG)
Mission
To establish principles and a framework,
and then develop requirements for
PDF/UA processors and assistive technology (AT)
Improving accessibility support for PDF documents means
improving the way PDF viewers, other PDF processors and
assistive technology (AT) handle tagged PDF.
Although PDF/UA-1 provides conceptual requirements for processors,
these have not been formally adopted as broadly as have
PDF/UA’s file format requirements.
PDF/UA-2, published in 2024, differs from PDF/UA-1 in that it
focuses exclusively on file format requirements,
with requirements for processors to be developed
in a separate dedicated specification.
Objective
Today, users who require assistive technology (AT) to read PDFs
do not enjoy a similar or consistent experience across diverse
hardware and software platforms.
The PDF/UA Processor LWG was created with two main objectives in mind:
- The first objective is to help developers who are more familiar
with web technology be able to readily understand and use
PDF’s accessibility features.
- The second is to encourage vendors to move towards
a standardized solution instead of relying on
implementation-specific approaches.
To ensure end users can obtain accurate,
consistent results when interacting with digital documents
regardless of software or platform, the PDF/UA Processor LWG
will draft a technical specification for processing PDF/UA files.
The business case
Who will benefit from a specification for processing PDF/UA files?
PDF creators:
More capable viewing software will improve
demand for PDF/UA creation.
PDF viewers:
Support for PDF/UA (and tagged PDF in general)
will be easier to add and maintain with confidence,
even for developers without specific expertise in
supporting accessibility.
Incidental PDF processors:
Awareness of PDF/UA (and tagged PDF in general)
will make it easier for developers who
don’t intend to present the document
(e.g., anti-virus or document management software)
to nonetheless avoid damaging it.
AT developers:
A standardized API, for example, would make enabling
PDF accessibility much less costly, increase the scope for
user-specific features and options, and would greatly
improve the end-user experience.
Document authors:
When content is presented as intended,
authors’ content will be more accurately and
consistently conveyed to all the document’s readers.
Policy managers and regulators:
With better software support for PDF/UA,
policy creators, and regulators will be able to
set clear procurement standards for their vendors.
AT users:
Those reading and/or interacting with content in PDF,
using assistive technology, will enjoy a comprehensive,
consistent experience.
Background
Since PDF 1.4 was published in 2001, the PDF format
has included syntax to enable accessibility through
the feature "Tagged PDF" (ISO 32000, 14.8).
PDF was introduced in 1993,
and was designed for use with desktop application software.
The first accessibility specification for PDF, PDF/UA,
therefore focused on desktop applications.
Today, PDF operates in a multi-platform world,
but accessibility specifications have not kept up with
current-generation software.
It is now critical to define and achieve an equivalent
experience for all end-users regardless of platform.
Meanwhile, commonly-used web technologies have also evolved,
and now target delivery of consistent results across platforms and devices.
This has been made possible, in part, due to the development and adoption
of high-quality APIs that have allowed AT to work across platforms.
The same, unfortunately, cannot be said of PDF.
Today, there is no common accessibility API for processing tagged PDF.
Even worse, some accessibility vendors use entirely
implementation-specific approaches.
Consequently, end-users employing various devices
typically do not receive an equivalent or consistent experience,
a cornerstone of accessibility.
Possible approaches
Past efforts in authoring processor and AT requirements
have included a variety of approaches, all of which involved
mapping tagged PDF to something consumable by AT.
Some possible approaches are outlined below,
but other ideas are very welcome:
- Requirements for processing PDF/UA files
for content delivery to APIs/AT
- Requirements for AT’s role and behaviors
- "An accessibility tree for PDF"
- building on Adobe’s PDF DOM, etc.
- Leverage the web accessibility tech
(HTML DOM ++) defined by W3C
- A User Agent Accessibility Guidelines for PDF, similar to:
- Mapping to your favorite accessibility API(s),
such as platform standards
Other possibilities abound!
The approach thus far and next steps
Because commonly used web technologies have evolved,
and now deliver generally consistent results for web pages,
the PDF/UA Processor LWG is "borrowing those wheels",
as opposed to inventing new ones.
For the past year the PDF/UA Processor LWG has focused on
an examination of the various accessibility API role mappings
for HTML elements and WAI-ARIA (and DPub) attributes to
map these features to their functional equivalents in PDF
(tags, attributes, properties, etc.).
Next steps:
Who can participate?
The PDF/UA Processor LWG hopes to involve a diverse group of experts,
including developers whose focus is on accessibility.
While developing the specification, the PDF/UA Processor LWG
will welcome new ideas and provide a workplace for development of
other industry-agreed outcomes (e.g., test files) aligned with our mission.
The LWG’s intended participants are:
- Developers providing support for accessibility needs,
including remediation
- Developers working on PDF processors that interface with
end users, including PDF viewers and web browsers
- Assistive Technology (AT) experts
PDF/UA Processor LWG community will work closely with:
and others to ensure that the entire ecosystem of
those working on developing Tagged PDF technology
are aware of and may contribute to these efforts.
Next-Generation PDF
Document Expectations Through the Ages
For many millennia, humans have committed their thoughts to media
with the idea of capturing them (as documents) in time.
Readers’ expectations have changed constantly
and continually through the ages.
PDF and the "Any Screen" Challenge
PDF on Mobile Devices / Small Screens
(a.k.a. "Responsive PDF"!)
The PDF Association is developing an industry-based model
for addressing the "any screen" challenge
PDF in the mobile world
Based on the premise of a fixed layout,
the page-description model better known as PDF
was developed at a time when documents were
exclusively viewed on desktop monitors, or printed.
The advent of much smaller screens, and screens of many sizes,
presented a variety of challenges and opportunities
for the PDF paradigm.
Thus, mobile PDF readers are now increasingly including
the capability to reflow, so that the document adjusts
to various screen sizes.
A PDF must be properly tagged to reflow reliably.
An accessible PDF, correctly tagged according to
the PDF/UA standard, is also a mobile-friendly PDF.
So besides making documents accessible to assistive
technology users, PDF accessibility is vitally important for the
huge and rapidly growing number of mobile device users.
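The link between tagging and reflow can be sketched with a toy model. It assumes the tagged content has already been extracted as a list of (tag, text) pairs in logical reading order; the data, the function name and the use of an "Artifact" marker for page furniture are illustrative assumptions, not how any particular PDF reader works. Because the order and the block boundaries are known, the same content can be rewrapped to any screen width.

```python
import textwrap


def reflow(elements, width):
    """Wrap each tagged block to the given width, in logical order."""
    lines = []
    for tag, text in elements:
        if tag == "Artifact":
            continue  # page furniture such as running headers is skipped
        lines.extend(textwrap.wrap(text, width=width))
        lines.append("")  # blank line between blocks
    return lines[:-1] if lines else lines


# Hypothetical extracted content of one page.
page = [
    ("Artifact", "Annual Report 2023 - Page 4"),
    ("H1", "Accessible documents"),
    ("P", "A PDF must be properly tagged to reflow reliably on small screens."),
]

for line in reflow(page, width=24):
    print(line)
```

Without tags, a reader only has the drawing order of the page content to work from, so neither the block boundaries nor the reading order used here would be known.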
Next-Generation PDF - Deriving HTML from PDF
When it comes to print,
PDF is today the standard format used just about everywhere.
When it comes to the web,
HTML and CSS have obtained a similar position.
However, in our changing world,
print and web flow together in all kinds of interesting ways.
Having a format capable of catering to both worlds,
with the strengths of both worlds, would be very exciting!
The PDF Association Deriving HTML from PDF Technical Working Group
has been working on exactly such a technology project to
develop extensions to PDF.
This TWG is dedicated to continuing to explore opportunities and
challenges in advanced reuse of PDF content with a focus on
pathways to HTML expression of PDF content.
The goal is technology that marries the reliability and robustness of PDF
with the fluidity and elegance of HTML, providing the best possible
user experience for each type of device and use case.
"Next-Generation PDF" is the code-name for extensions to
PDF technology currently under development.
These PDF extensions marry PDF’s core capabilities
to the flexibility of web technologies.
Note! A "responsive (reflow) web / HTML page" or
a "responsive PDF document" does not automatically mean
that the web page or PDF document is digitally accessible.
It is also required that both the web page and the PDF document
are well tagged, as well as that they meet current digital accessibility
standards, WCAG and PDF/UA respectively.
Put simply, they must also contain an internal technical description of
the content so that assistive technology can accurately
reproduce the content.
PDF Association unveiled this radical development in
PDF technology at PDF Days Europe 2017:
Deriving HTML from PDF
Guide: Deriving HTML from PDF
(June 11, 2019)
In the modern world of small devices, IoT and connected systems,
where interchange and reuse of data is critical, it is reasonable to
question the continued relevance of PDF’s core value proposition.
In particular, search engines, machine learning and artificial
intelligence systems focus on accessing information
contained in documents over visual representation.
In other cases, document producers wish to deliver data
in a form that is suitable for automated processing
while using a PDF file as a record for trust purposes.
End users also want digital documents that adapt
smoothly to viewing on diverse small devices.
This guide describes the algorithm that produces
conforming HTML from a tagged PDF, including how
well-tagged PDF documents, containing both traditional
fixed-layout content and the semantic structures leveraged
by modern devices and software, can be reliably and
consistently reused as HTML to support better user
experiences and renew PDF’s value proposition.
Follow/participate in the work with the future
development of Deriving HTML from PDF:
Next-Generation PDF - "Responsive PDF"!
Demonstration Site
Experiment with the conversion from PDF to HTML in a few clicks
without leaving your browser.
Introduction to the concept of HTML and embedded CSS in PDF documents.
About "The Next-Generation PDF Demonstration Site":
The Next-Generation PDF Demonstration Site is a demo site for a new
technology to "Derive HTML from Tagged PDF" in a predictable manner.
The technology, the derivation algorithm, is developed by the PDF Association.
So-called "Tagged PDF" documents contain an additional invisible layer with
semantic information of all visual elements in the document, which is used
by the Derivation Algorithm to represent the same content in HTML.
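As a rough illustration of the idea (not the PDF Association's actual derivation algorithm), the sketch below walks a toy structure tree and emits HTML. The `StructElem` class and the `TAG_TO_HTML` table are hypothetical names invented for this example; a real implementation reads the structure tree from the PDF file and follows the published guide.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List

# Hypothetical mapping from a few PDF structure types to HTML elements.
# This is an assumption for the sketch, not the mapping defined in the guide.
TAG_TO_HTML = {"Document": "div", "H1": "h1", "P": "p",
               "L": "ul", "LI": "li", "Span": "span"}


@dataclass
class StructElem:
    """Toy stand-in for a structure element in a tagged PDF."""
    tag: str
    text: str = ""
    kids: List["StructElem"] = field(default_factory=list)


def derive_html(elem: StructElem) -> str:
    """Recursively turn a structure element and its children into HTML."""
    html_tag = TAG_TO_HTML.get(elem.tag, "div")  # unknown tags fall back to <div>
    inner = escape(elem.text) + "".join(derive_html(kid) for kid in elem.kids)
    return f"<{html_tag}>{inner}</{html_tag}>"


doc = StructElem("Document", kids=[
    StructElem("H1", "Report"),
    StructElem("P", "Totals & notes"),
])
print(derive_html(doc))
```

Because the output depends only on the tags and their order, the same tagged input always yields the same HTML, which is the sense in which such a derivation is "predictable".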
ngPDF Editor
The ngPDF Editor provides detailed information
on the logical structure tag tree in the PDF document
uploaded to the ngPDF demonstration site
Illustration from iText White Paper: "Web-Friendly PDFs with ngPDF"
Key Features:
- Turn Tagged PDF into HTML using the Derivation algorithm
- See the derived HTML code next to your PDF for
comparison and immediate adjustments
- Inspect the tagged structure tree of the PDF document
- Manage PDF classes and their attributes
- Manage embedded files associated with structure elements
- Create and modify the mapping between PDF tags (so-called "RoleMaps")
- Edit the CSS to adjust the HTML presentation.
Embed the resulting CSS directly into your
source PDF document for further reuse
- Download the modified PDF back to your local file system
- Full support for PDF 1.7 and PDF 2.0 specifications
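The role-mapping feature in the list above can be pictured as a lookup that follows custom tag names until a standard structure type is reached. The sketch below is a minimal illustration, assuming the role map is available as a plain dictionary; the `STANDARD_TYPES` set is deliberately abbreviated, and the function name `resolve_role` is hypothetical and not part of ngPDF.

```python
from typing import Dict, Optional

# Abbreviated, illustrative subset of standard structure types (assumption).
STANDARD_TYPES = {"Document", "Sect", "H1", "H2", "P",
                  "L", "LI", "Table", "Figure", "Span"}


def resolve_role(tag: str, role_map: Dict[str, str]) -> Optional[str]:
    """Follow role-map entries from a custom tag to a standard type.

    Returns None when the chain ends at an unmapped custom tag or loops.
    """
    seen = set()
    while tag not in STANDARD_TYPES:
        if tag in seen or tag not in role_map:
            return None
        seen.add(tag)
        tag = role_map[tag]
    return tag


# Hypothetical role map: "Lead" maps to "Intro", which maps to standard "P";
# "A" and "B" map to each other and therefore never resolve.
role_map = {"Chapter": "Sect", "Lead": "Intro", "Intro": "P", "A": "B", "B": "A"}
print(resolve_role("Lead", role_map))
print(resolve_role("A", role_map))
```

The `seen` set guards against circular mappings, so a malformed role map ends the lookup instead of looping forever.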
PDF original version
native without embedded CSS
HTML version
native without embedded CSS
PDF version
enhanced with embedded CSS
HTML version
enhanced with embedded CSS
Upload your correctly tagged test files here:
Additional information on ngPDF:
Interesting?
Next-Generation PDF - "Responsive PDF"!
The Future of PDF is Based on Well-Tagged PDF
The ICT Industry implements support for well-tagged PDF and PDF/UA.
Industry solutions providing access to PDF/UA documents
on smartphones and other mobile devices with smaller screens:
Next-Generation PDF - "Responsive PDF"!
Adobe introduces Liquid Mode for Acrobat Mobile
(September 23, 2020)
Consuming content on mobile has long been a painful experience
- especially if a document is long and wordy.
Liquid Mode is a display tool to flow PDF content and
thereby make PDFs more readable on mobile devices.
Liquid Mode for Acrobat Mobile delivers a breakthrough reading experience
that enables a much easier way to read documents on mobile.
Liquid Mode reformats a static PDF into a more
dynamic and customizable experience.
Liquid Mode is built on top of the rich capabilities of PDF,
including the semantics of Tagged PDF.
With the push of a button, Liquid Mode automatically reformats text, images,
and tables for quick navigation and consumption on small screens.
Liquid Mode simultaneously creates an intelligent outline,
collapsible and expandable sections, and searchable text
for quick navigation.
Users can even tailor font size and spacing between words,
characters, and lines to suit their specific reading preferences.
This is especially useful for those who may see text as too small,
squished together, tight, or jumbled.
With Liquid Mode, pinching and zooming is no longer necessary.
Words are resizable and reflowable, images are tappable and
expandable, and tables are responsive.
Adobe launches PDF Extract API:
PDF extraction and document generation APIs for developers
(June 22, 2021)
PDF Extract API unlocks the content and data trapped in your PDFs.
There have been countless PDFs created over the last several decades,
with an estimated 2.5 trillion PDFs created every year.
Can you imagine the amount of critical data inside of those PDFs?
Last year, Adobe introduced Liquid Mode, which uses Adobe Sensei,
Adobe's AI and machine learning platform, to understand the structure of PDF.
Besides serving as a display tool, the Liquid Mode technology helps the new API
tag and add structure to content going into and out of PDFs.
With the release of APIs for developers available on AWS Marketplace
Adobe deconstructs the PDF creation and content extraction processes.
The new Adobe PDF Extract API builds on Liquid Mode.
It’s a cloud API that analyzes the structure from both scanned and
native PDFs and extracts all elements of a PDF including text,
table data, and images, with an understanding of relative positioning
and reading order across columns and page breaks.
What sets PDF Extract API apart is that it can extract all PDF elements,
unlike many other extraction technologies that are limited to one type,
such as tables.
Also, many providers are tied to specific platforms.
Staying true to the principle of being platform-agnostic and unlike others,
all of Adobe's APIs, including the PDF Extract API, offer the flexibility to
use any modern programming language or platform.
Organizations can use PDF Extract API to quickly and accurately
extract data for use in machine learning models, analysis,
indexing or storage, to automate downstream processes
using technologies like Robotic Process Automation (RPA) and
Natural Language Processing (NLP), as well as republish
PDF content across different media.
Adobe, a long-time partner with and member of the PDF Association,
continues to evolve PDF, co-working through collaborative groups such as
the Next Generation PDF TWG, PDF/UA TWG and PDF Reuse TWG.
Next-Generation PDF - "Responsive PDF"!
Google Chrome Adds Support for Tagged PDF
(July 29, 2020)
Google pre-announcement:
Starting with version 85, Chrome, the world's leading browser,
will automatically generate a tagged PDF when using the "Save as PDF" option.
Note!
Although tagged PDF is a first step and prerequisite for accessible PDF,
this announcement does not specifically tell to what extent a Chrome 85
saved PDF file will be in compliance with the PDF/UA Standard.
PDF/UA from Google Docs is Coming Soon! (2016)
Next-Generation PDF - "Responsive PDF"!
Apple Tags PDF
(September 30, 2019)
Apple has published a new page indicating that version 8.2 of
its suite of productivity apps; Pages, Keynote and Numbers,
now supports creation of tagged (and thus, accessible and reusable) PDF,
and not just on MacOS, but on iOS/iPadOS as well.
Note!
Although Apple's recommendations are all excellent,
complete accessibility in PDF or any other format
requires attention to a variety of criteria as specified in
W3C/WCAG 2.1 (generally) and ISO PDF/UA (PDF-specific considerations).
Next-Generation PDF - "Responsive PDF"!
Microsoft vs PDF/UA - 2016, 2019 and 2024
PDF and AI:
Does ChatGPT support PDF?
PDF and AI:
What AI should be doing with PDF documents?
Today, most AIs are fed with lousy and/or dumbed-down data
that contributes to bias and untrustworthy results,
and this is all before the problem of malice.
How should AI be integrated for use with documents?
To become truly reliable, AIs need schemes for preserving
rich semantics and data when they encounter it.
Let the author’s AI help the author to provide this richness,
and let consumer AIs leverage it when it’s provided.
It’s time for AI integrators to help authors make semantically rich
documents and step up their game on PDF inputs:
|
|
PDF Forms
The PDF Forms TWG focuses on advancing the current PDF Forms technologies
through the introduction of new declarative models with integrated semantics.
Follow/participate in the work with the future development of PDF Forms:
PDF Association - PDF Forms Industry Working Group
The Purpose of PDF Forms
The purpose of PDF Forms is to modernize the PDF Forms technology.
Digital PDF Forms is key.
While Adobe made additions to the native PDF forms technology
to bring it to functional parity with HTML, little else has changed
for quite a long time.
Although companies such as DocuSign, Adobe, Dropbox and others
have created their own extensions to PDF to enable rich workflows
it is imperative that these capabilities make their way into
the core PDF standard.
The PDF Forms TWG is now evaluating the forms technologies
specified in ISO 32000 and setting a path forward to advancing
forms in PDF.
This community is dedicated to advancing the current
PDF Forms technologies through the introduction of new
declarative models with integrated semantics.
These capabilities will not only bring PDF in alignment with
modern HTML forms, but re-establish PDF’s leadership in
the forms and workflow world.
The community works closely with
to ensure that those groups' input is heard.
Specific areas in which the PDF Forms TWG plans to invest:
- Connecting Forms to "derivation to HTML" concepts
- Replacing reliance on JavaScript for common concepts
(e.g. validation & formatting)
- Modernizing form data exchange (e.g. no XML)
Conversion of Fillable/Interactive PDF Forms to PDF/A
PDF/A forbids some of the features that
will be needed by most fillable forms.
This makes it impractical to have a fillable PDF form
that also at the same time is PDF/A conforming.
The way around this is as follows:
- make the fillable PDF form as much PDF/A conforming as possible
- send it out for being filled in
- once filled in, send it through a PDF/A conversion process
(preferably with callas pdfaPilot Desktop/Server or CLI);
this last step will make minor adjustments to align
the PDF form with the PDF/A requirements
(e.g. remove JavaScript, adjust certain properties of
form fields and so on, but without changing
the visual appearance, and, most importantly,
without removing the payload/the data as filled in)
|
|
PDF/R - ISO 23504-1:2020
The new standard for raster image data interchange
Approved international standard since 2020
PDF/R (a.k.a. PDF/raster)
TWAIN Working Group and PDF Association Announce PDF/R
The Next-Generation Format for Digital Imaging
A PDF technology-based ISO Standard
The use of PDF/R will be very high.
Follow/participate in the work with the future development of PDF/R:
PDF Association - PDF/R Industry Working Group
PDF/R: The Imaging File Format of the Future (April 15, 2021)
The PDF/R format is designed expressly to support
modern standards-based document imaging workflows.
PDF/R is an ISO-standardized format for storing,
transporting and exchanging multi-page raster-image documents,
especially scanned documents and photographs.
PDF/R takes advantage of the widespread support of PDF
for viewing, printing and processing files.
PDF/R provides the portability of PDF while
offering the core functionality of TIFF.
The format supports bitonal, grayscale and
true-color (RGB) images, stored uncompressed or with JPEG or
lossless CCITT Group 4 fax compression.
PDF/R fits well into existing workflows, and
either existing libraries or newly developed frameworks
can be used, including in embedded systems such as firmware.
PDF/R includes support for encryption and authentication,
and is as extensible as PDF itself.
PDF/R can be employed in scanning applications,
as a standalone format or as part of a TWAIN Direct initiative.
This simple and highly compressed format is ideal for use with
IoT technology and helps to optimize cloud applications
with minimal integration time and effort and
no expensive library licensing costs.
Before PDF/R, document scanning systems were
based on image formats instead of document formats.
PDF/R delivers the advantages of PDF to all imaging workflows,
allowing even low-cost scanners to produce PDF documents
complete with metadata, encryption and digital signatures,
if desired, straight from the scanner.
The PDF/R standard is a great replacement for
the traditional TIFF and JPEG image formats supported
by traditional scanning devices and applications.
PDF/R delivers compact, high quality images from
image acquisition devices providing efficient and
secure delivery of documents over a network.
PDF/raster - Portable and Feature-rich (August 30, 2017)
PDF/raster provides the portability of PDF
while offering the core functionality of TIFF.
PDF/R can help modernize and secure scanned image data transfer,
especially in the age of cloud and mobile business workflows.
PDF/raster 1.0 Documentation
This document describes PDF/raster,
a strict subset of the PDF file format designed
for storing, transporting and exchanging
multi-page raster-image documents.
TWAIN Direct with PDF/raster Released
The TWAIN Working Group, a liaison member of the PDF Association,
has just announced the release of TWAIN Direct,
their next-generation open source image-acquisition technology.
TWAIN Direct supports direct network communication between
desktop or mobile applications and scanning devices.
|
|
PDF REUSE
The PDF REUSE TWG is dedicated to exploring
the technologies and practices that facilitate
reliable reuse of document content and semantics
on diverse devices and the broadest-possible range of applications.
Follow/participate in the work with the future development of PDF/REUSE:
PDF Association - PDF Reuse Industry Working Group
The Purpose of PDF REUSE
The purpose of PDF REUSE TWG is to define a complete set of
requirements for "well-tagged PDF" (WTPDF).
Today, consumers of PDF documents can choose from a variety of screens,
which presents real challenges to authors of fixed-layout documents.
Beyond use on diverse displays, users increasingly want their
PDF documents to work well with technologies that depend less on
the page’s layout, but more on the content, including search engines,
text-to-speech solutions, translation engines, 3D, video and other
features increasingly used to enhance digital document content.
The initial project animating the PDF REUSE TWG is to develop
and maintain a specification for “well-tagged PDF” or WTPDF, that is,
PDF documents that leverage “Tagged PDF” (ISO 32000, 14.8)
to enable reliable reuse of document content on diverse
devices and software applications.
Reuse as HTML is a key target; as such WTPDF will complement
the derivation algorithm specified in:
PDF accessibility is a subset of PDF reuse that includes additional
requirements beyond those required for strictly reuse purposes.
To ensure continuity between these uses WTPDF will mirror equivalent
provisions of PDF/UA-2, and will be developed in close cooperation with:
PDF REUSE is of interest to organizations that want to take part in
the development and publication of this new subset specification for
using ISO Standard 32000-2.
|
|
ISO 24517 (PDF/E)
international standard for engineering documents
such as construction drawings and usually derived from CAD files.
Approved international standard since 2008
A successor, PDF/E-2 based on PDF 2.0, was originally planned
to provide an archival model for engineering content including 3D.
The industry was more interested in making this a part of PDF/A
than in adopting a new PDF/E-2 standard.
ISO therefore stopped work on the PDF/E-2 standard
and is making it a part of:
The use of PDF/E is low.
The Purpose of PDF/E
PDF/E ("PDF Engineering") is based on the PDF format and
specifies how PDF should be used for the creation of documents
in engineering workflows; including 3D in the PDF/E context and
archiving of engineering content.
The PDF/E Competence Center is a point of contact for the benefits of PDF/E
in almost every engineering field.
It is a platform for information and discussion
for experts in 3D technology, architects and construction specialists,
as well as developers of PLM applications,
and for all engineers who use PDF technology as an
integral component of their day-to-day work.
Key benefits of PDF/E
Benefits with PDF/E:
- Dramatically reduces requirements for
expensive proprietary software.
- Lowers storage and exchange costs as compared to paper.
- Facilitates trustworthy exchange and markup
across multiple applications and platforms.
- Vendor-independent; PDF/E is developed and
maintained by the PDF/E ISO committee.
|
|
3D PDF
The Portable Document Format for Engineering
3D PDF is a PDF file with 3D geometry inside
Please also view the free booklet:
"PDF in Manufacturing, The future of 3D documentation"
The use of 3D PDF is high.
Follow/participate in the work with the future development of 3D PDF:
PDF Association - 3D PDF Industry Working Group
The Purpose of 3D PDF
3D PDF specifies how PDF should be used for the creation of
documents in engineering workflows; including 3D in
the context of archiving of engineering content.
In the PDF context, 3D models are referred to as
3D artwork (ISO 32000-2, 13.6).
PDF files containing 3D artwork / 3D geometry
are commonly known as 3D PDF files.
PDF allows authors to combine dynamic, rich 3D artwork with
metadata, text, images, video and forms in a 3D PDF document.
PDF is at the heart of manufacturing and engineering communications.
PDF technology supports manufacturing worldwide, conveying ideas,
plans, communications, agreements, specifications, contracts…
and of course, 2D and 3D drawings and supporting content
throughout complex workflows and across corporate,
organizational and process boundaries.
3D PDF and PDF Complement Each Other
A PDF file is a self-contained document file.
PDF files can be displayed for reading using various PDF viewers.
A 3D PDF is a little special: inside the PDF document there is
a 3D viewing window, where you can rotate, zoom and pan
the contents of a 3D scene.
The actual data for the 3D view is embedded inside the PDF file.
The file actually contains a 3D geometric representation of the scene,
not just images from different viewpoints.
3D PDF files are powerful documents that support the following features:
- 3D artwork is visualized or printed as part of a page
- 3D artwork can be interactive and programmatically
manipulated using JavaScript
- 3D artwork can be displayed, or instanced,
in multiple places in a document
- 2D PDF content such as title block, revision block,
list of materials, and other information that must be
placed on a drawing sheet can be overlaid on 3D artwork
ISO 32000 (PDF) has supported embedded collections of
three-dimensional objects (3D models), such as those used
by CAD software, since the capability was introduced
with PDF 1.6 in 2005.
Since then the use of 3D content in PDF has exploded as
vendors, manufacturers and customers demand greater access
to detailed technical information at every stage of the design,
development, manufacturing and support processes.
Among other segments, 3D PDF in manufacturing is a
critical application of PDF and 3D technology servicing
the entire product lifecycle, architecture and civil engineering
needs for rich, data-driven and interactive documentation.
Because they are PDF files, 3D PDF files are compact,
secure and easy to share.
3D PDF documents are completely interactive and
can be annotated and measured.
This powerful, easy to use format is transforming
how we communicate engineering data today.
Most 3D CAD applications have some level of support
for creating 3D PDF files. Additionally, there are a number of
applications that can create 3D PDF files from the most popular
3D CAD formats without requiring an expensive CAD software license.
3D PDF Geometry Standards
3D PDF is a PDF file with 3D geometry inside.
3D PDF is an interactive 3D Model
embedded into an interactive PDF document.
(Click on picture)
Picture source:
Presentation on 3D PDF by Adobe Systems GmbH
at PDF Days Europe 2018
3D PDF Geometry Standards
The 3D portion within the PDF can be composed of
either a PRC, a U3D or a STEP 3D encoding type.
PDF 2.0 therefore now supports three 3D formats:
- U3D, as defined by ECMA-363,
Universal 3D File Format, 3rd Edition (U3D), June 2006.
Approved international standard since 2006.
- PRC, as defined by ISO 14739-1:2014 Document management,
3D use of Product Representation Compact (PRC) format,
Part 1: PRC 10001
Approved international standard since 2014.
- STEP AP 242, as defined by ISO 10303-242,
Industrial automation systems and integration,
Product data representation and exchange,
Part 242: Application protocol:
Managed model-based 3D engineering.
Approved international standard since 2014.
PRC, U3D and STEP represent the foundation of
3D interactive data in the PDF context.
All three interactive 3D visualization model formats
can utilize PDF 2.0’s RichMedia annotation framework,
but PDF 1.6 and PDF 1.7 files can only use
the U3D format with 3D annotations.
STEP 3D Model File Format, ISO 10303-242
STEP 3D AP 242, ISO 10303-242,
industrial automation systems and integration,
Product data representation and exchange,
Part 242: Application protocol:
Managed model-based 3D engineering.
More information on support of STEP in PDF 2.0:
Product Representation Compact (PRC) File Format, ISO 14739
PRC provides mechanisms for the main constructs of 3D CAD models.
PRC is an accurate, highly compressible format optimized to
support different representations of a 3D CAD model.
PRC was developed from inception as a file format
capable of representing 3D model data from all of
the popular CAD authoring applications.
PRC data files contain product structure data and can
optionally contain precise 3D geometry, visualization,
metadata and Product Manufacturing Information (PMI).
3D models can be stored within PDF as the model's exact and
accurate BREP geometry, tessellated data or both.
Because of this, PRC models can be both visualized by people and
exported from a PDF for use in Computer Aided Design (CAD);
Computer-Aided Manufacturing (CAM) and
Computer-Aided Engineering (CAE) systems.
Suitability:
PRC is the best format to use for representing
three dimensional technical data in a PDF document.
PRC offers the following advantages over U3D:
- Data structures for CAD data including assemblies,
precise geometry, tessellation, PMI, text, annotations, etc.
U3D is limited to mesh data.
- Better compression than U3D.
- International standard (ISO 14739)
Applications:
PRC is well suited for communicating technical data for
most general engineering processes, including:
- Reporting
- Design review
- Quality planning and reports
- CAE reports
- Supply chain collaboration
- Training
- Archiving
Varying levels of compression can be applied to the 3D CAD data
when it is converted to the PRC format using a proper 3D PDF Converter.
The 3D data stored in PRC format in a PDF is interoperable
with many industry applications for CAM and CAE.
Key features:
- Tessellated meshes
- Precise B-rep geometry
- Product Structure
- Product manufacturing information (PMI)
- Properties / Metadata
- Highly compressible
- Animation through JavaScript
Universal 3D (U3D) File Format, ECMA 363
Universal 3D (U3D) is a standard for a compressed binary
file format for 3D computer graphics data that can be
embedded in PDF documents since PDF 1.6 (and thus in ISO 32000-1, PDF 1.7).
U3D is no longer in active development.
U3D was designed as a general-purpose visualization format
with features such as keyframe animation.
The format is optimized to store triangle meshes,
lines and points with hierarchical structure, metadata,
color and texture.
Suitability:
Unlike PRC, U3D lacks the CAD specific data structures for
geometry, topology, text and PMI.
In PDF, U3D is best suited to animation in 3D PDF.
Applications:
- Technical publications
- Marketing Materials
The U3D format is natively supported by the PDF format and
3D objects in U3D format can be inserted into PDF documents and
interactively visualized by PDF viewers.
Key features:
- Tessellated meshes
- Structure
- Light sources
- Textures
- Animation (keyframe)
3D PDF showcases
The demonstration 3D PDF files below are
provided by PDF Association members.
These showcases illustrate the broad range of
workflows and use cases for PDF as a
delivery-vehicle for 3D representations.
Demonstrations are in the following categories:
- Bill of Materials (BOM)
- Change request
- Product data quality (PDQ)
- Service documentation
- Technical Data Package (TDP)
- Technical documentation
- Work in progress
- Work instructions
Explore and evaluate all 3D PDF files:
Archiving of 3D Documents
Archiving of 3D PDF using PDF/A-3 or PDF/A-4
PDF isn’t just a container for 2D and 3D information;
it’s also an archival solution for final-form document content.
PDF/A-4, a subset of PDF 2.0 supporting long term preservation
of PDF files, allows 3D content and associated JavaScript,
making it a viable solution for archiving manufacturing content.
- Background:
3D is a complex world of its own.
In order to create 3D models, 3D programs are needed,
like Catia, AutoCAD, etc.
Each industry has their own set of preferred tools,
solutions and document/file formats.
Basically there is not an agreed common exchange format
between these tools (i.e. not the same as the PDF format
that acts as a digital exchange format/"digital paper").
PDF 2.0 (ISO 32000-2) supports two 3D formats directly
within PDF’s framework for 3D constructs on PDF pages:
The first 3D format is:
The second 3D format is:
Both formats can be embedded in PDF making it more user friendly,
especially considering that free PDF reader software tools are
available to view them or interact with these formats.
More general tools will be available to handle certain format aspects,
such as syntax validation of 3D models, reading and writing metadata
in 3D models, …
- Recommendations on how to manage and archive 3D documents.
First and foremost, you have to determine the format in which
the 3D information is present: PRC or U3D?
Options for archiving 3D documents:
Next step is to decide on which PDF standard format
to use to embed the 3D information:
PDF/A-3 (based on PDF 1.7, 2012), or
PDF/A-4 (based on PDF 2.0, 2020) (newer)
Both formats, PDF/A-3 and PDF/A-4, allow embedded
attachments in PDF/A format or another format.
However, there is no guarantee of reusability for anything that is
embedded but is not itself PDF-based information.
For many years the PDF/E (ISO 24517:2008) served to provide
an archival model for engineering content including 3D;
representation of construction drawings and diagrams with
moving 3D models and usually derived from CAD files.
The initial plan to define a new flavor,
PDF/E-2 based on PDF 2.0, was cancelled in 2018
to instead make it a part of PDF/A-4.
PDF/A-4:
- PDF/A-4e makes it possible to embed 3D models in
PRC or U3D formats inside PDF/A-4
- PDF/A-4 is explicitly positioned
as a standard for 3D archiving
The Value of PDF for Archiving of 3D PDF
PDF is the only open format capable of archiving
3D PDF engineering data, documents, and records.
3D PDF Tools
"3D PDF-Ready" Software Tools:
Many "3D PDF-Ready" tools are available to support
all aspects of 3D PDF production environments, including:
3D PDF Creation
3D PDF compliant files can be created
directly from professional software tools.
The current 3D PDF specifications are
well established and mature as far as software developers
are concerned, among them:
|
|
ISO 16613-1:2017 (PDF/VCR)
international standard for Variable Content Replacement
Approved international standard since 2017
The PDF/VCR standard builds on PDF/X-4
to provide support for variable content replacement.
The use of PDF/VCR is low.
Follow/participate in the work with the future development of PDF/VCR:
PDF Association - PDF/VCR Industry Working Group
The Purpose of PDF/VCR
PDF/VCR enables variable data printing applications
using PDF template-based variable content substitution and
a framework for in-RIP variable content merging.
PDF/VCR (ISO 16613)
In 2017 ISO published the standard
"ISO 16613-1:2017 Graphic technology - Variable content replacement
Part 1: Using PDF/X for variable content replacement (PDF/VCR-1)".
PDF/VCR enables variable data printing applications
using PDF template-based variable content substitution where:
- a PDF template file containing pages with
variable content substitution fields (placeholders)
is delivered ahead of a print production run and may be
reused across multiple print production runs, and
- PDF-based variable data substitution content is provided
during running print production and merged with the PDF
template to produce final form variable content page output.
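The template-plus-data model described above can be illustrated with a deliberately simplified sketch. It uses plain text and Python's `string.Template` in place of PDF pages and placeholder objects, so it mirrors only the workflow, not the file formats defined in ISO 16613-1. The field names and records are invented for illustration.

```python
from string import Template

# Stand-in for a template delivered ahead of the print run.
# Assumption: a real PDF/VCR template is a PDF file with placeholder
# objects, not a text string.
page_template = Template("Dear $name, your card ending in $last4 is enclosed.")

# Stand-in for substitution content supplied during the run,
# one record per output page (hypothetical data).
data_sequence = [
    {"name": "A. Jensen", "last4": "1234"},
    {"name": "B. Okoye", "last4": "5678"},
]


def merge(template, records):
    """Yield one final-form page per record by filling the placeholders."""
    for record in records:
        # substitute() raises KeyError if a placeholder has no matching field
        yield template.substitute(record)


pages = list(merge(page_template, data_sequence))
for page in pages:
    print(page)
```

The same `page_template` can be merged with a different `data_sequence` in a later run, which corresponds to reusing one template across multiple print production runs.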
PDF/VCR (PDF for variable content replacement),
is a set of base technical requirements for a PDF template file format,
a PDF-based variable data substitution content format and a framework
for in-RIP variable content merging.
PDF/VCR is broadly similar to:
The difference is that the variable data comes from a CSV file and
is applicable to high-performance, real-time data printing, for example:
reading data from a credit card and using it when printing the envelope.
The PDF/VCR base technical requirements do not include
writer and processor conformance, however ISO 16613-1:2017
also defines the PDF/VCR-1 conformance level which is based on
the PDF/VCR base technical requirements and defines requirements for:
- the PDF/VCR-1 template file format;
- the PDF/VCR-1 data sequence format,
a variable data substitution content format;
- a PDF/VCR-1 writer,
a software application which can generate PDF/VCR-1 template files;
- a PDF/VCR-1 data provider,
a software application which can generate PDF/VCR-1 data sequences;
- a PDF/VCR-1 processor,
a software application which can perform substitution (replacement)
of PDF/VCR-1 template placeholder objects with substitution content
provided within a PDF/VCR-1 data sequence.
|
|
ISO 16612-3:2020 (PDF/VT)
international standard for Personalized Print / Variable Data
Approved international standard since 2010
The PDF/VT-3 standard builds on PDF/X-6
to provide support for the PDF 2.0 imaging model
in the variable and transactional printing context.
The use of PDF/VT is low.
Follow/participate in the work with the future development of PDF/VT:
PDF Association - PDF/VT Industry Working Group
The Purpose of PDF/VT
PDF/VT is based on PDF format to support variable data printing.
PDF/VT is optimized for the specific needs of
Variable (“V”) and Transactional (“T”) workflows.
PDF/VT efficiently addresses the requirements of modern
Variable Data Printing (VDP), bringing all the well-known
advantages of PDF workflow to the world of personalized print.
PDF/VT (ISO 16612-2 and 16612-3)
In 2010 the International Organization for Standardization (ISO)
published a new standard called
“ISO 16612-2:2010 - Graphic technology - Variable data exchange
Using PDF/X-4 and PDF/X-5 (PDF/VT-1 and PDF/VT-2)”.
It’s designed specifically to support robust delivery and production
of modern variable data print jobs.
ISO 16612-3 (PDF/VT-3) was published at the end of 2020.
Standards for variable data transactional publishing
based on PDF/X - PDF/VT
By building on PDF/X, and therefore on PDF,
these standards enable the use of many of the features that
graphic designers have come to expect to be able to use
for work in commercial print, publication, etc., and therefore
wished to use for complementary advertising in direct mail and
transpromo campaigns, and in labels and packaging.
By also including document metadata that can convey
the designer's/purchaser's requirements, it allows for more
complete automation of production in support of today’s
increasingly complex and demanding requirements around
page count and separate components to be delivered together.
PDF/VT requirements and conformance levels
These PDF/VT standards define four conformance levels,
reflecting both different use cases for variable data,
and changes in the print industry over time.
- PDF/VT-1
All content for a print job is included in a single PDF file,
which must also conform to PDF/X-4 (ISO 15930-7:2010).
The vast majority of current PDF/VT production is PDF/VT-1,
and until the publication of PDF/VT-3 at the end of 2020,
this was the only PDF/VT standard recommended for workflows
unless all parties in that workflow agree to use one of the others.
- PDF/VT-2
Designed to support a "chunking" workflow to allow
something almost indistinguishable from streaming,
that is where the first pages of the job are being printed before
the last ones have been created by the composition engine.
It does this by providing a method whereby large assets
such as images that are used multiple times
(for example, once for each of many recipients)
can be saved into a single PDF file, known as a target file.
A series of "chunks", each defining a range of pages to be
printed and saved as a PDF/VT-2 file, is then produced.
Each PDF/VT-2 file includes references to the assets
in the target file(s), which means that those large assets
don’t need to be repeated in every PDF/VT-2 file.
PDF/VT-2 is not widely implemented or used.
- PDF/VT-2s
A variant of PDF/VT-2 where both the target files
containing re-used assets and the PDF/VT-2 files
themselves are wrapped into a single MIME stream.
The intention is to simplify the delivery of a stream for printing
where there isn’t a shared file system accessible to both
the submission tool and the receiving digital front end.
PDF/VT-2s is even less widely implemented than
PDF/VT-2 and should be avoided.
- PDF/VT-3
Was published in late 2020 and is based on PDF/X-6,
which, in turn, is based on PDF 2.0.
PDF/VT-3 allows:
- both fully self-contained PDF/X-6 models
as well as external data-dependent
PDF/X-6p and PDF/X-6n workflows
- use of graphic object definitions to specify
graphical content data only once, independent of
the number of times it is referenced in the file,
and better color management of jobs that are
printed on multiple different media
- unification of PDF generation for multi-channel delivery
with excellent accessibility capabilities
- including hinting information allowing for a
variety of processing optimisation strategies
Just like PDF/X-6, PDF/VT-3 is expected to become
the most commonly used PDF/VT standard over time,
replacing PDF/VT-1.
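The saving behind the PDF/VT-2 target file described above can be illustrated with simple arithmetic. The following is only a sketch: the function names and all sizes are hypothetical, the overhead of the references themselves is ignored, and it assumes that without a target file every chunk would carry its own copy of the shared assets.

```python
MB = 1_000_000


def size_with_repeated_assets(chunks, page_bytes_per_chunk, shared_asset_bytes):
    """Total bytes when every chunk embeds its own copy of the shared assets."""
    return chunks * (page_bytes_per_chunk + shared_asset_bytes)


def size_with_target_file(chunks, page_bytes_per_chunk, shared_asset_bytes):
    """Total bytes when the shared assets are stored once in a target file
    and each chunk only references them."""
    return shared_asset_bytes + chunks * page_bytes_per_chunk


# Hypothetical job: 200 chunks, 2 MB of page content each, 50 MB of shared images.
repeated = size_with_repeated_assets(200, 2 * MB, 50 * MB)
shared = size_with_target_file(200, 2 * MB, 50 * MB)
print(repeated // MB, "MB repeated vs", shared // MB, "MB with a target file")
```

With these made-up numbers the job shrinks from 10,400 MB to 450 MB, which is why the approach matters mainly for large, frequently re-used assets.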
The Value of PDF/VT
Just like PDF/X, the real value of PDF/VT is more in simplifying
communication of requirements and best practice than in
defining anything significantly different from
what can be achieved in baseline PDF.
In a sense it relieves the graphic designer and composition tool operator
of the need to consider some of these constraints when they make a file;
just select “PDF/VT” in the menu when generating the file for print and
it will be done for you.
But the PDF/VT standard concentrates on providing support for
predictable and repeatable output and for automation;
it does not focus on how the desired elements should be written
into that PDF file in order to maximize the efficiency of processing.
So using PDF/VT is a very good way of improving the PDF document
delivery workflow in many ways and is definitely recommended.
But it’s not the whole story.
There are many things that users can do to optimize
the processing of those jobs as well and to help
avoid last-minute problems.
Key advantages of PDF/VT
Using PDF/VT files instead of pragmatically defined “optimized PDF” files
provides a number of distinct benefits for both creators and printers:
- PDF/VT builds on the work done for static artwork delivery
for both conventional and digital print in the PDF/X family
of standards, which have become an extremely common way
of enforcing best practices and simplifying the creation of
preflight profiles etc.
- PDF/VT provides the framework for a composition engine
to include a hierarchical tree of metadata in the file,
to encapsulate the intents and expectations of
the designer/purchaser.
- A PDF/VT file may include hints and steering information
that can be used for processing in an optimized workflow.
- PDF/VT is portable.
It provides a reliable container for blind exchange of
final-form, graphically rich, variable content.
- PDF/VT takes full advantage of the PDF imaging model for
printing graphically rich personalised communication
(e.g. variable transparency effects).
- PDF/VT enables caching for recurring elements in VDP jobs.
- PDF/VT can be preflighted with standard off-the-shelf tools.
- PDF/VT enables reliable proofing and distributed
review/approval workflows prior to printing,
using readily available PDF viewing software.
- PDF/VT enables predictable color for VDP jobs,
based on modern ICC color management.
- PDF/VT provides a robust metadata infrastructure to enable
sophisticated/dynamic/granular runtime controls for
VDP print production (e.g. filtering, rules-based imposition,
audit trail, barcoding, checkpoint re-start).
- PDF/VT is device-independent and object-oriented,
and enables VDP jobs to be dynamically repurposed,
refactored, or retargeted to different presses.
- PDF/VT benefits direct marketing campaigns,
and also enhances management of high-volume
print runs (e.g. "TransPromo").
Designing Optimized PDF/VT Files
for Efficiency is Important
The use of PDF is ubiquitous across most sectors of the print industry.
The use of variable data to add personalization with variable text,
graphic, and image content is also expanding across multiple sectors
of the print industry, from its roots in transactional, through commercial,
wide format, labels, packaging, and into industrial print.
The growth of PDF and variable content is one of
the key reasons for the adoption of digital printing.
Digital printing can help everyone in the design and
print supply chain increase their profitability.
Increasing profitability requires that digital presses are
kept running at full engine speed, and that is where
design decisions can make a difference.
The way in which a variable data PDF file is designed and
constructed can then have a significant influence on
the efficiency of digital presses.
Avoiding inefficient PDF files in a print production workflow
boils down to a very simple maxim:
- Don’t ask a workflow processing component
to do more than it has to!
PDF for Static Production vs Variable Content Jobs
For static print production the process of saving
a well-designed PDF file for print is important,
requiring attention to the settings available
in the design tool.
Regardless of the print environment, analog or digital,
there is value in creating an efficient PDF file.
The faster a file is processed, the more efficient
the workflow can be.
Variable content jobs may be built from pages that use
a template with variable data, like name, address and
billing amount or discount offer amount.
The pages look the same and share a design language,
but the content of pages may be variable and unique;
as a job, this is a variable content job.
These types of jobs place additional demands on
the processing power available because each page
is different and must be processed separately.
While minor inefficiencies in a job may only add
a few milliseconds to the processing of each page,
when multiplied by the number of pages in the job,
the result can show up as a delay that grows to
minutes and sometimes hours.
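How a few milliseconds per page grow into minutes or hours can be sketched with a short calculation. The function name and the 5 ms figure are illustrative assumptions, and the sketch assumes pages are processed one after another with no parallelism.

```python
def added_delay_seconds(pages, extra_ms_per_page):
    """Total extra processing time for a job, assuming sequential processing."""
    return pages * extra_ms_per_page / 1000.0


# Hypothetical inefficiency of 5 ms per page at two job sizes.
for pages in (10_000, 1_000_000):
    seconds = added_delay_seconds(pages, extra_ms_per_page=5)
    print(f"{pages} pages: {seconds / 60:.1f} minutes of extra processing")
```

At one million pages, the assumed 5 ms per page adds 5,000 seconds, well over an hour of extra processing.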
There are often many ways of achieving the same
visual appearance which can vary significantly in
the amount of processing required to print them.
Sometimes the most efficient method for the print company
requires a little more prepress work for the designer.
Sometimes there’s a win-win where improved print performance
can be gained by making a few changes that also result in a PDF file
that can be shared more efficiently on the web and on mobile devices.
While some optimization is under the direct control of
the graphic and document composition tool vendors,
there are steps the designer can take while in prepress mode,
for instance understanding what preflighting options the design tools
(layout, templating, or other software) offer.
Catching obvious preflight errors early in the design process can save
valuable time downstream for the production team.
callas software blog on:
PDF/VT - Application Notes
The PDF/VT Application Notes discuss topics that aid
implementers of PDF/VT workflow tools and demonstrate
the various design features of the PDF/VT file format.
Recommended reading:
"Best Practice in creating PDF files for Variable Data Printing (VDP)":
All PDF/VT developers and designers will benefit from these guides
on how to create efficient and optimized PDF files appropriate for
today’s high-speed production requirements.
- Developer Edition
Helps developers with the information they need to ensure that
PDF files can be processed (rendered and printed) as quickly
as possible without compromising their visual appearance.
(September 2, 2022)
(September 2, 2022)
PDF/VT Tools
"PDF/VT-Ready" Software Tools:
Many "PDF/VT-Ready" tools are available to support
all aspects of PDF/VT production environments, including:
PDF/VT Creation
PDF/VT compliant files can be created directly from
professional page layout packages.
The current PDF/VT-3 specification is supported
by software developers and vendors, among them:
ISO 15930 (PDF/X)
international standard for
prepress digital data and graphics exchange
Approved international standard since 2001
Please also view the free booklet:
"PDF/X in a Nutshell, PDF for Printing - The ISO standard"
The use of PDF/X is very high.
Follow or participate in the future development of PDF/X:
PDF Association - PDF/X Industry Working Group
December 1, 2021, marks the twentieth anniversary of
the publication of the first ISO PDF/X standard.
Since then PDF/X has heavily impacted the print publishing industry
and the development of other PDF-based standards.
Dov Isaacs, for over 31 years Principal Scientist at Adobe Inc., shares in this
multi-part article how PDF/X paved the way for future PDF standardization and
how the PDF/X family of standards revolutionized the graphic arts industry,
as well as lessons learned along the way:
Martin Bailey, Global Graphics,
has worked on PDF subset standards since 1994, starting with PDF/X:
Adobe PostScript - an early precursor to PDF.
In 1984, PostScript was launched as an
object-based page description language to describe how
text, graphics and images should be displayed.
Public Access to an early version of
PostScript Source Code (December 1, 2022):
The purpose of PDF/X
The purpose of PDF/X is to facilitate graphics exchange ("blind exchange").
In this context "blind exchange" refers to a common standard for
both the creator (designer) and the receiver (print shop)
and everyone in between.
In the early days of the digital graphics workflow era, software and hardware
vendors within the graphics industry used their own proprietary formats for
data exchange, which caused overhead and didn't scale well.
Today, PDF/X is the accepted global format and cornerstone for all
graphic arts workflows. It is endorsed, supported and used by all
industry players: small creative shops, large global agencies,
software vendors, manufacturers of heavy printing machinery, and others.
Core principles of PDF/X
PDF/X was designed to constrain PDF files in order to cater for
specific use-cases in the graphics/print industry.
Therefore PDF/X has a series of printing-related requirements
which do not apply to standard PDF files.
One principle of PDF/X is that conforming files must be complete;
fully self-contained, and everything on a PDF page has to be printable.
Nothing may appear on a PDF/X page that is either:
- not printable at all (e.g. video), or where
- print output is not fully defined (e.g. font not embedded)
These requirements apply in all parts and conformance levels of PDF/X:
- An Output Intent must be present that uses an ICC profile
to specify the intended printing conditions
(print device type, paper type) when colors (or shades of grey)
are defined.
- Spot colors may only be used if they have an alternate color,
and this alternate color must be the same for all occurrences of
the respective spot color.
- Fonts must be embedded (either fully embedded,
or as an embedded subset in which all characters
used in the text are present).
- Images must be present in the PDF
(no external graphical content is allowed)
- No password protection of any type
- No transfer curves (since they modify appearance of colors).
- No alternative images (e.g. no low-resolution alternates)
- If the bleed zone is defined,
the BleedBox must fully enclose the TrimBox (the final trimmed page).
- No use of LZW compression.
- No annotations in the print area.
- No audio, video or 3D annotations.
- No form fields or JavaScript.
- No embedded files.
- PDF metadata must indicate whether the PDF has been trapped, or not.
- PDF metadata must claim conformance to PDF/X and to
which part and conformance level of the PDF/X standard.
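Several of the rules above lend themselves to a simple checklist. The sketch below is an illustration only, not a real preflight engine: `PdfSummary` and `pdfx_violations` are hypothetical names, the summary is assumed to have been extracted by some other tool, and only a subset of the listed requirements is covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PdfSummary:
    """Hypothetical summary of a file, as a preflight engine might report it."""
    has_output_intent: bool = True
    fonts_embedded: bool = True
    external_images: int = 0
    password_protected: bool = False
    uses_lzw: bool = False
    has_form_fields_or_javascript: bool = False
    embedded_files: int = 0
    states_trapped: bool = True        # metadata says whether the file is trapped
    states_pdfx_version: bool = True   # metadata names the PDF/X part and level


def pdfx_violations(pdf: PdfSummary) -> List[str]:
    """Return the violated rules, in the order they are checked."""
    checks = [
        (pdf.has_output_intent, "no Output Intent with ICC profile"),
        (pdf.fonts_embedded, "font not embedded"),
        (pdf.external_images == 0, "external image content"),
        (not pdf.password_protected, "password protection"),
        (not pdf.uses_lzw, "LZW compression"),
        (not pdf.has_form_fields_or_javascript, "form fields or JavaScript"),
        (pdf.embedded_files == 0, "embedded files"),
        (pdf.states_trapped, "trapping status missing from metadata"),
        (pdf.states_pdfx_version, "PDF/X conformance claim missing from metadata"),
    ]
    return [message for passed, message in checks if not passed]


print(pdfx_violations(PdfSummary(fonts_embedded=False, uses_lzw=True)))
```

Keeping each rule as a (condition, message) pair makes it easy to see which requirement a failing file violates.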
What is not in PDF/X
PDF/X only defines the general requirements for a reliable exchange of
prepress data; the ISO standard itself does not specify quality requirements.
These requirements are typically different for each
- printing process:
like sheetfed offset, web offset, newspaper printing,
flexo printing, screen printing, etc.
and
- market segment:
like magazines, newspapers, art books, etc.
For example, PDF/X does not define a minimum resolution for images.
It simply requires that images are embedded
(since external references are not allowed).
Although these provisions are important,
the PDF/X ISO Committee made an early decision not to attempt
to include quality requirements for every PDF/X use case.
Among other benefits, this approach makes quality requirements easier
to update than if they were part of the official ISO standard.
The task of defining quality requirements was instead taken over by
The Ghent Workgroup.
PDF/X Requirements
PDF/X has expanded into a family of standards supporting a
wide variety of print production workflows and use cases.
Each part of PDF/X builds on the previous part
providing flexibility while ensuring reliable exchange:
PDF/X-1a: Complete exchange
PDF/X-1a was the first and most restrictive
member of the PDF/X family.
PDF/X-1a aims for “complete exchange”;
a single file must contain all information needed
for printing the document as intended by the sender.
Printing a PDF/X-1a file must be possible
without requiring prior color correction.
Therefore, print elements can only use CMYK,
greyscale or spot colors; no RGB or
device-independent color spaces are permitted.
PDF/X-3: Color management
In PDF/X-3, graphics can use CMYK, greyscale,
RGB, Lab and ICC based color spaces.
It requires, however, that device color spaces may be
used only if the same color space is used for
the ICC profile in the Output Intent.
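The device color space rule can be expressed as a small check. This sketch follows only the simplified wording above; the standard itself contains further detail that is not modeled here. The function name and the string labels for the output intent color model are assumptions made for illustration, and spot colors are deliberately left out.

```python
# Device color spaces and the output intent color model each one matches.
DEVICE_SPACE_MODEL = {
    "DeviceGray": "Gray",
    "DeviceRGB": "RGB",
    "DeviceCMYK": "CMYK",
}

# Device-independent color spaces named in the text.
COLOR_MANAGED_SPACES = {"Lab", "ICCBased"}


def allowed_in_pdfx3(color_space: str, output_intent_model: str) -> bool:
    """Apply the simplified PDF/X-3 rule to one color space."""
    if color_space in COLOR_MANAGED_SPACES:
        return True
    if color_space in DEVICE_SPACE_MODEL:
        # A device space is acceptable only if the output intent uses the same model.
        return DEVICE_SPACE_MODEL[color_space] == output_intent_model
    raise ValueError(f"color space not covered by this sketch: {color_space}")


print(allowed_in_pdfx3("DeviceCMYK", "CMYK"))  # device space matches the output intent
print(allowed_in_pdfx3("DeviceRGB", "CMYK"))   # device space does not match
print(allowed_in_pdfx3("Lab", "CMYK"))         # color-managed space
```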
PDF/X-2: Partial exchange
The strict requirement of including all resources inside
a single file is not appropriate for every workflow.
PDF/X-2 addresses this need; it allows the use of
proxy elements referencing external graphics.
Otherwise, PDF/X-2 is the same as PDF/X-3,
so it allows color managed elements next to spot colors
and device colors prepared for the specified output intent.
PDF/X-4: Transparency
The previous PDF/X variants do not support the features of
more modern (beyond PDF 1.4) versions of PDF.
By 2008, it was time to bring PDF/X up to date
with current PDF specifications.
PDF/X-4 is based on PDF 1.6, published in 2004.
This specification added support for new features, including layers,
JPEG2000, OpenType fonts, and 16-bit images.
In addition, PDF/X-4 allows the use of transparency,
a PDF 1.4 feature forbidden in PDF/X until PDF/X-4.
PDF/X-4 includes two variations known as “conformance levels”:
- PDF/X-4
Inherits the rules of PDF/X-3 for complete exchange
in color managed workflows, with the requirement of
always embedding the output intent ICC profile, and
- PDF/X-4p
Provides a form of partial exchange;
it allows the ICC profile to be maintained externally.
This ensures better efficiency in workflows where
many files share the same output intent,
or where embedding the ICC profile would
substantially increase the file size.
PDF/X-5: More flexibility
PDF/X-5 is a set of three conformance levels,
all geared towards different workflows.
Each conformance level expands on PDF/X-4 or PDF/X-4p:
- PDF/X-5n
Allows for n-colorant color spaces that are used where
the traditional four print process colors (CMYK) are not enough.
n-colorant color spaces may be required to enable a larger
color gamut (e.g. CMYK plus Green, Violet, Orange) to allow
for more accurate skin tones, pastel colors or the like.
Another use of PDF/X-5n is in packaging,
in which certain product-specific spot colors
are also used for imagery as process colors.
- PDF/X-5g
Extends the PDF/X-4 standard with the ability
to use external raster and vector graphics.
Like the older PDF/X-2, a PDF/X-5g file can contain
temporary placeholders that reference an external resource.
- PDF/X-5pg
Takes PDF/X-5g one step further.
It offers the same method for external graphics as PDF/X-5g,
and combines it with the PDF/X-4p’s option of the output
intent referenced as an external ICC profile.
PDF/X-6 (ISO 15930-9:2020):
PDF/X-6 builds on ISO 32000-2, better known as PDF 2.0.
It incorporates all of the features and benefits of
the PDF/X‐1a, PDF/X‐3 and PDF/X‐4 specifications
while adding support for new features in PDF 2.0.
PDF/X-6 relaxes some requirements.
New to PDF 2.0 are page-level Output Intents and
better support for multi-channel print color spaces
(more channels than just CMYK), as are increasingly
used in packaging or on digital printing devices.
Annotations may be used within the print area
if they have a printable appearance that complies with
the same requirements as any other page content.
The new optional conformance levels in PDF/X-6
(conformance levels n and p) accommodate a wider
variety of process optimizations and workflows as they
allow ICC profiles to be maintained externally.
For the first time PDF/X-6 permits PDF/X files
to have annotations, including digital signatures,
form fields and videos, reducing complexity in
multi-channel workflows.
PDF/X-6 Application Notes.
Intended for users and implementers of
PDF/X-6, PDF/X-6p, and PDF/X-6n:
Written by two members of ISO TC 130/WG 2/TF 2,
which is responsible for the PDF/X specifications.
Since this document is not an official ISO document,
it has been published by the American APTech
(Association for PRINT Technologies).
PDF/X: Users/Industry segments
Which PDF/X version to use?
For a PDF/X-based workflow, it’s critical that the chosen PDF/X version
fits the capabilities and objectives of that workflow.
If you have a modern workflow and output RIP,
then you are more than capable of handling a PDF/X-4 file.
In an older workflow system that has difficulty digesting
some of the newer PDF functionality such as transparency,
it’s probably not a good idea to attempt adopting PDF/X-4;
an older PDF/X version is probably more suitable.
It is best to first evaluate your own workflow for PDF/X-4 readiness;
the Ghent Workgroup Output Suite is a free dedicated test suite.
When testing, all applications within the workflow must be considered,
including (but not limited to): imposition software, color servers,
ink saving software, trapping software, and output RIPs.
User segments
Designers, creators and advertising agencies
The benefit for a design company in working with PDF/X
is that it’s easier than coping with a myriad of PDF creation
settings from different printing companies and suppliers.
The output settings needed to create valid PDF/X files are
pre-configured into most professional page layout and
design applications, so generating a PDF/X file is just
a matter of selecting the required PDF/X version,
and ensuring the file is compliant after creation.
If an artwork creator supplies a conforming PDF/X file,
then any print service provider should have the tools
and knowledge to be able to process and print
that file without problems.
Magazines and newspapers
Magazines and newspapers often integrate content
produced elsewhere (e.g., advertising) into their products.
Typically, these publishers produce very detailed specifications
on how a PDF file should be created and checked
before they receive it.
Due to the sheer volume of content they receive and deliver,
and the deadlines they work to, they normally expect any
incoming advertising files to be correct when delivered.
The production for these types of publications
is split into two distinct areas:
- receiving files for advertising
PDF files of advertising content are
received from external suppliers;
these files are checked and then incorporated
with editorial content in a layout application to
create the final pages of the publication.
- delivering final pages for print
The completed publication is exported
as a PDF file and sent to a print site.
To date, most newspaper and magazine publishers have
adopted PDF/X-1a and in turn deliver their final pages to
the printer as PDF/X-1a files.
Why haven’t periodical publishers embraced
the newer versions of PDF/X, such as PDF/X-4?
The answer is straightforward; their current workflows
are working predictably and correctly.
A key driver for the change to PDF/X-4 is that PDF/X-1a files
are not very useful when it comes to re-purposing content,
while newspaper and magazine publishers increasingly
need to develop cross-media content that’s optimized for
smart phones, tablets or online publication.
This requirement, along with the gradual acceptance of
modern production techniques, is driving the newspaper
and magazine publishing industry towards PDF/X-4.
Commercial print and digital printing
For many companies PDF/X-1a was the standard
they used for a long time for the same reasons as
the periodical publishers discussed earlier:
predictability and responsibility.
When PDF/X-1a was first released, live transparency caused
a major problem for printing companies, as their output RIPs
and workflows were not capable of handling it correctly.
However, since PDF/X-4 added support for transparency in 2008,
today’s commercial and digital printers
are switching to PDF/X-4.
If you have a modern workflow and output RIP,
then you are more than capable of handling a PDF/X-4 file.
One aspect of PDF/X-4 that often causes concern,
particularly in sheetfed and web offset, is the fact that
color spaces such as RGB and Lab are allowed.
Many printers are not confident in handling files that contain
these color spaces, and prefer to handle only CMYK
and spot color based files.
It is, however, perfectly possible to use PDF/X-4 based
preflight configurations that forbid these color spaces
(i.e., permitting only a subset of PDF/X-4).
The newer Ghent Workgroup Preflight Technical Specifications
are all based on PDF/X-4.
In digital printing, particularly with output engines that have
large color gamuts, RGB based files are beneficial, as they
can use the full color gamut available in the press rather
than constraining the color gamut to that of a conventional
CMYK based process.
Large format printing
PDF/X-1a, PDF/X-3 and PDF/X-4 are all relevant to
large format printing, but there are certain aspects of
each that should be recognized.
The choice of format will depend on the type of work,
the workflow and knowledge of the printing company in question.
One attribute of a PDF file that can be a requirement for
large format printing is the ‘user unit’. A PDF file
(prior to PDF 2.0 which allows pages measurable in kilometers)
has a technical size limitation of 200 x 200 inches,
which is fine for most commercial printing,
but when you want to print a poster that covers
the side of a building, this limitation becomes an issue.
To overcome this size limitation the PDF 1.6 specification
included a function called ‘UserUnit’ which effectively enables
the size of the PDF to be scaled by a multiplication factor,
allowing the creation of larger page sizes.
The PDF/X-4 specification is based on PDF 1.6,
so if it’s a requirement that PDF files are supplied
at their correct size, then PDF/X-4 would be needed.
However very often in this market, files are supplied
at a smaller size than the final required size,
and are enlarged on output.
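The effect of UserUnit on page size is plain multiplication and can be sketched as follows. By default one unit of PDF user space is 1/72 inch, so the 200 inch limit corresponds to 14,400 units. The function name is a hypothetical helper, not part of any PDF library, and the UserUnit value of 10 is just an example.

```python
POINTS_PER_INCH = 72  # default PDF user space: 1 unit = 1/72 inch


def page_size_inches(media_box, user_unit=1.0):
    """Physical page size for a MediaBox [llx, lly, urx, ury] and a UserUnit factor."""
    llx, lly, urx, ury = media_box
    width = (urx - llx) * user_unit / POINTS_PER_INCH
    height = (ury - lly) * user_unit / POINTS_PER_INCH
    return width, height


print(page_size_inches([0, 0, 14400, 14400]))        # (200.0, 200.0)
print(page_size_inches([0, 0, 14400, 14400], 10.0))  # (2000.0, 2000.0)
```

The same MediaBox numbers describe a page ten times larger in each direction once UserUnit is set to 10, which is how full-size building wraps become representable.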
Digital large format devices very often have large
multi-color ink sets to deliver a wide color gamut.
Some devices have up to 12 inks to maximize
the quality of printing, and can produce most available
spot colors (excepting special inks such as metallic).
The output RIPs on these devices often have very sophisticated
color management functionality in order to work with
these ink sets, and it makes sense that PDF files
being printed should maximize this capability.
In this case PDF/X-3 or PDF/X-4 can be useful as
they allow color-managed color spaces such as Lab,
CalRGB or use of an embedded ICC profile.
When investigating PDF/X for large format,
a key consideration is the output RIP driving the printer.
There are a large variety of different large format RIPs
available, with different quality and functionality.
Thorough testing is advisable to ensure the output of
the required PDF/X level is correct and predictable,
before implementing a PDF/X based workflow.
A useful tool for testing is the Ghent Workgroup Output Suite.
Labels and packaging printing
Label and packaging printing differs from other methods of
print production for several reasons.
A key distinction is that the size of the final job is often not a
square or a rectangle, so it cannot be defined by a PDF page box.
Additionally, in packaging, the use of multiple spot color inks rather
than just CMYK is very common, with spot colors frequently used
in image separations as well as in text and vector graphics.
Additionally, within modern packaging production,
extended gamut printing is becoming more prevalent,
especially with digital devices.
Extended gamut printing uses a fixed ink set of CMYK, plus
additional spot colors (orange, violet and green are typical)
to produce a very large color gamut, allowing a large range
of spot colors to be produced without the need to run
individual spot color inks.
All PDF/X formats require that an output intent is defined
that uses an ICC profile to characterize the intended output.
Output intents normally use CMYK ICC profiles, but for
PDF/X-4 or PDF/X-3 that can also be RGB or even Gray profiles.
To fully support multi-channel workflows with PDF/X,
a multi-channel color profile is required.
Multi-channel profiles are not supported by any of
the previously mentioned PDF/X standards.
The only PDF/X version that allows for multi-channel
profile support is PDF/X-5n.
As of this writing, 2017, PDF/X usage in the label and
packaging market is not widespread, but with PDF 2.0 and
the upcoming PDF/X-6, functionality will be added to
make adoption easier and more beneficial.
PDF/X Tools
"PDF/X-Ready" Software Tools:
Many "PDF/X-Ready" tools are available to support
all aspects of PDF/X production environments, including:
PDF/X Creation
PDF/X files can be created directly from
professional page layout packages.
When exporting to PDF, the user can simply select
the required PDF/X-1a, PDF/X-3, or PDF/X-4 version and
the software will guide the user, allowing only configuration
settings that will produce a valid PDF/X file.
It is not possible to directly export a valid PDF/X file using
the output options within office applications such as
Microsoft Word or Apache OpenOffice.
However, it is possible to export a PDF file that can then be
converted to PDF/X using an additional application capable
of correcting the file to meet one of the PDF/X standards.
These solutions can be desktop or server-based,
depending on the volume of files that need to be processed.
These solutions generally begin by checking PDF/X conformance,
and subsequent correction to PDF/X is part of this process.
PDF/X Conformance and Correction
Quality control and PDF/X conformance are a
key part of the production process.
It doesn’t matter if you are supplying files to
a print service provider, or processing PDF files
within a print company; quality control is paramount.
Failure to ensure that a PDF file meets the required standard
can result in missed deadlines, wasted time and material, and extra cost.
The later a problem with a PDF file is detected,
the more expensive that problem is to fix.
The graphic arts industry uses a specific term
for this quality control process: "Preflight".
A print service provider will thoroughly preflight/check a
PDF file before it enters the production process to ensure
it is of sufficient quality for the required printed product.
Most PDF preflight solutions offer the opportunity for a
Print Service Provider (PSP) to correct a lot of the issues
that can arise within PDF files.
This can be done as part of the service the PSP provides
to its customer, or can be chargeable.
In newspaper or magazine production, it is not uncommon
for publications to insist on a "print ready" PDF/X file.
These publications are not willing to take the responsibility for any
potential issues that may arise if they correct the file themselves.
PDF/X preflight and correction solutions are available
in several different types of applications.
Desktop solutions
For users who have a relatively low number of files
to process, a manual application will probably be
the most appropriate.
Server-based solutions
For users who must check and correct hundreds or thousands
of PDF/X files a day, hot-folder driven and server-based
preflight solutions are available.
These applications are often also available as
Command Line Interface (CLI) software capable of
driving the quality control process programmatically.
These allow high volume automated production,
and can be driven by external systems using
database connections or XML job tickets to allow
the preflight check to be specific to the customer’s
order or advertising booking.
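A hot-folder driven check can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular product works: the `preflight` function is a hypothetical stand-in that only looks for the PDF file header, where a real system would call a preflight engine, and the folder layout is made up.

```python
import shutil
from pathlib import Path


def preflight(pdf_path: Path) -> bool:
    """Hypothetical stand-in for a real preflight engine:
    it only verifies that the file starts with the PDF header."""
    with pdf_path.open("rb") as handle:
        return handle.read(5) == b"%PDF-"


def process_hot_folder(inbox: Path, passed: Path, failed: Path) -> dict:
    """Check every PDF in the inbox once and move it to 'passed' or 'failed'."""
    passed.mkdir(parents=True, exist_ok=True)
    failed.mkdir(parents=True, exist_ok=True)
    results = {}
    for pdf_path in sorted(inbox.glob("*.pdf")):
        ok = preflight(pdf_path)
        target_dir = passed if ok else failed
        shutil.move(str(pdf_path), str(target_dir / pdf_path.name))
        results[pdf_path.name] = ok
    return results
```

A scheduler or file system watcher would call `process_hot_folder` repeatedly; a job ticket could be used to choose which checks to run for each file.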
PDF workflow and output
PDF/X conformance and preflight are just two of
the prepress production processes that a PDF/X file
must go through to be successfully printed.
When working with PDF/X, it’s important that all pieces
and processes in a print production workflow system are
configured appropriately to handle the PDF/X version in use;
it is not sufficient to just use a PDF/X preflight check.
Many workflow vendors provide data sheets explaining how
workflows must be configured to handle PDF/X files correctly.
Programming libraries
Programming libraries allow developers to integrate
PDF/X functionality into their own applications without
having to develop the technology from scratch.
Some desktop or server-based products are also
available as programming libraries.
With these "Software Development Kits",
companies can add PDF/X functionality with minimal effort,
and bring solutions to market very quickly.
These libraries offer PDF/X creation,
PDF/X preflight and/or correction.
The current PDF/X specifications are well established and mature
as far as software developers are concerned, among them: