Network Working Group                                           M. Hardy
Internet-Draft                                               L. Masinter
Obsoletes: 3778 (if approved)                                D. Markovic
Intended status: Informational                Adobe Systems Incorporated
Expires: August 27, 2017                                      D. Johnson
                                                         PDF Association
                                                               M. Bailey
                                                         Global Graphics
                                                       February 23, 2017


                     The application/pdf Media Type
                        draft-hardy-pdf-mime-05

Abstract

   The Portable Document Format (PDF) is an ISO standard (ISO
   32000-1:2008) defining a final-form document representation language
   in use for document exchange, including on the Internet, since 1993.
   This document provides an overview of the PDF format and updates the
   media type registration of "application/pdf". It obsoletes RFC 3778.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 27, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  History
   3.  Fragment Identifiers
   4.  Subset Standards
   5.  PDF Versions
   6.  PDF Implementations
   7.  Security Considerations
   8.  IANA Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Changes since RFC 3778
   Authors' Addresses

1.  Introduction

   This document provides updated information on the registration of
   the MIME Media Type "application/pdf" for documents defined in the
   PDF [ISOPDF], "Portable Document Format", syntax. It obsoletes
   [RFC3778].

   PDF was originally envisioned as a way to reliably communicate and
   view printed information electronically across a wide variety of
   machine configurations, operating systems, and communication
   networks.

   PDF is used to represent "final form" formatted documents. PDF pages
   may include text, images, graphics, and multimedia content such as
   video and audio. PDF is also capable of containing auxiliary
   structures including annotations, bookmarks, file attachments,
   hyperlinks, logical structure, and metadata. These features are
   useful for navigation, for building collections of related
   documents, and for reviewing and commenting on documents. A rich
   JavaScript model has been defined for interacting with PDF documents.

   PDF uses the imaging model of the PostScript [PS] page description
   language to render complex text, images, and graphics in a device-
   and resolution-independent manner.

   PDF supports encryption and digital signatures. The encryption
   capability is combined with access control information to facilitate
   management of the functionality available to the recipient. PDF
   supports the inclusion of document and object-level metadata through
   the eXtensible Metadata Platform [XMP].

2.  History

   PDF is used widely in the Internet community. The first version of
   PDF, 1.0, was published in 1993 by Adobe Systems Incorporated. Since
   then, PDF has grown to be a widely used format for capturing and
   exchanging formatted documents electronically across the Web, via
   e-mail, and via virtually every other document exchange mechanism.
   In 2008, PDF 1.7 was published as an ISO standard [ISOPDF], ISO
   32000-1:2008. It was adopted using the ISO Fast-Track process and is
   technically identical to Adobe Portable Document Format version 1.7
   [AdobePDF] referenced by [RFC3778].

   The ISO TC-171 committee is presently working on a refresh of PDF,
   known as ISO 32000-2 (PDF 2.0), which is expected to be published in
   2017.

   In addition to ISO 32000-1:2008 and 32000-2, several subset standards
   have been defined to address specific use cases and standardized by
   the ISO. These standards include PDF for Archival (PDF/A) [ISOPDFA],
   PDF for Engineering (PDF/E) [ISOPDFE], PDF for Universal
   Accessibility (PDF/UA) [ISOPDFUA], PDF for Variable Data and
   Transactional Printing (PDF/VT) [ISOPDFVT], and PDF for Prepress
   Digital Data Exchange (PDF/X) [ISOPDFX]. Files conforming to the
   subset standards are fully compliant PDF files capable of being
   displayed in a general PDF viewer.

3.  Fragment Identifiers

   Fragment identifiers appear at the end of a URI, and provide a way to
   reference an anchor to subordinate content within the target of the
   URI, or additional parameters to the process of opening the
   identified content.
   The syntax and semantics of fragment identifiers are referenced in
   the media type definition.

   The specification of fragment identifiers for PDF appeared originally
   in [RFC3778], but will now be included in ISO 32000-2 [ISOPDF2].
   This section is a summary of that material. Any disagreements
   between that document and this one should be resolved in favor of
   the ISO 32000-2 definition, once that has been approved.

   A fragment identifier for PDF has one or more parameters, separated
   by the ampersand (&) or pound (#) character. Each parameter consists
   of the parameter name, "=" (equal), and the parameter value; lists of
   values are comma-separated, and parameter value strings may be
   URI-encoded ([RFC3986]). Parameters are processed left to right.

   Coordinate values are expressed in the default user space coordinate
   system of the document: 1/72 of an inch measured down and to the
   right from the upper-left corner of the (current) page ([ISOPDF2]
   8.3.2.3 "User Space").

   The following parameters identify subordinate content of a PDF file,
   but may also be used to set the document view to make (the start of)
   the identified content visible:

   page=<pageNum>
      Identifies a specified (physical) page; the first page in the
      document has a pageNum value of 1.

   nameddest=<name>
      Identifies a named destination ([ISOPDF2] 12.3.2.4 "Named
      destinations").

   structelem=<structID>
      Identifies the structure element whose ID key, within a
      StructElem dictionary of the document, matches structID.
      structID is a byte string with URI encoding.

   comment=<commentID>
      The commentID is the value of an annotation name, which is defined
      by the NM key in the corresponding annotation dictionary of the
      selected page ([ISOPDF2] 12.5.2 "Annotation dictionaries").

   ef=<name>
      Identifies the embedded file where the parameter string <name>
      matches a file specification dictionary in the EmbeddedFiles name
      tree.
      If the "ef" parameter is not at the end of the fragment
      identifier, then the rest of the fragment identifier (after the
      ampersand or pound delimiter) is applied to the embedded file
      according to its own media type. This allows identification of
      content within the embedded file (which itself might be a PDF
      file).

      NOTE: When opening a PDF file that is not from a trusted source,
      a processor may choose to prompt the user or even prevent opening
      of the file.

   The following parameters operate on the view of the PDF document
   when it is opened:

   zoom=<scale>,<left>,<top>
      <scale> is the percentage to which the document should be zoomed,
      where a value of 100 corresponds to a zoom of 100%. <left> and
      <top> are optional, but both must be specified if either is
      included.

   view=<keyword>,<position value>
      The arguments correspond to those found in [ISOPDF2] 12.3.2.2
      "Explicit destinations". keyword is one of the keywords defined
      in [ISOPDF2] "Table 149: Destination syntax" with appropriate
      position values.

   viewrect=<left>,<top>,<wd>,<ht>
      Sets the view rectangle.

   highlight=<lt>,<rt>,<top>,<btm>
      Highlights the specified rectangle.

   search=<wordList>
      Opens the document and searches for one or more words, selecting
      the first matching word in the document. wordList is a string
      enclosed in quotation marks where individual words are separated
      by the space character (or %20).

   fdf=<URI>
      Imports data into PDF form fields. The URI is either a relative
      or absolute URI to an FDF or XFDF file. The fdf parameter should
      be specified as the last parameter to a given URI.

4.  Subset Standards

   Several subsets of PDF have been published as distinct ISO standards:

   o  PDF/X, initially released in 2001 as PDF/X-1a [ISOPDFX], specifies
      how to use PDF for graphics exchange, with the aim to facilitate
      correct and predictable printing by print service providers.
      The standard has gone through multiple revisions over the years
      and has several published parts, the most recently released being
      part 8, specifying different levels of conformance:
      PDF/X-1a:2001, PDF/X-3:2002, PDF/X-1a:2003, PDF/X-3:2003, PDF/X-4,
      PDF/X-4p, PDF/X-5g, PDF/X-5pg, and PDF/X-5n.

   o  PDF/A, initially released in 2005, specifies how to use PDF for
      long-term preservation (archiving) of electronic documents. It
      prohibits PDF features which are not well suited to long-term
      archiving of documents, including JavaScript or executable file
      launches. Its requirements for PDF/A viewers include color
      management guidelines and support for embedded fonts. There are
      three parts of this standard and a total of eight conformance
      levels: PDF/A-1a, PDF/A-1b, PDF/A-2a, PDF/A-2b, PDF/A-2u,
      PDF/A-3a, PDF/A-3b, and PDF/A-3u.

   o  PDF/E, initially released in 2008 as PDF/E-1 [ISOPDFE], specifies
      how to use PDF in engineering workflows, such as manufacturing,
      construction, and geospatial analysis. Future revisions of PDF/E
      are expected to include support for 3D PDF workflows.

   o  PDF/VT, initially released in 2010, specifies how to use PDF in
      variable and transactional printing. It is based on PDF/X, and
      adds additional restrictions on PDF content elements and
      supporting metadata. It specifies three conformance levels:
      PDF/VT-1, PDF/VT-2, and PDF/VT-2s [ISOPDFVT].

   o  PDF/UA, initially released in 2012 as PDF/UA-1 [ISOPDFUA],
      specifies how to create accessible electronic documents. It
      requires use of ISO 32000's Tagged PDF feature, and adds many
      requirements regarding semantic correctness in applying logical
      structures to content in PDF documents.

   All of these subset standards use the "application/pdf" media type.
   The subset standards are generally not exclusive, so it is possible
   to construct a PDF file which conforms to, for example, both the
   PDF/A-2b and PDF/X-4 subset standards.

   PDF documents claiming conformance to one or more of the subset
   standards use XMP metadata to identify levels of conformance. PDF
   processors should examine document metadata streams for such subset
   standards identifiers and, if appropriate, label documents as such
   when presenting them to the user.

5.  PDF Versions

   The PDF format has gone through several revisions, primarily for the
   addition of features. PDF features have generally been added in a
   way that lets older viewers "fail gracefully" by ignoring features
   they do not recognize. Even so, the older the PDF version produced,
   the more legacy viewers will support that version, but the fewer
   features will be enabled. See [ISOPDF] Annex I, "PDF Versions and
   Compatibility".

6.  PDF Implementations

   PDF files are experienced through a reader or viewer of PDF files.
   For most of the common platforms in use (iOS, OS X, Windows, Android,
   ChromeOS, Kindle) and for most browsers (Edge, Safari, Chrome,
   Firefox), PDF viewing is built-in. In addition, there are many PDF
   viewers available for download and installation. The PDF
   specification has been published and freely available since the
   format was introduced in 1993, so hundreds of companies and
   organizations make tools for PDF creation, viewing, and manipulation.

7.  Security Considerations

   PDF is certainly a complex media type as per Section 4.6 of
   [RFC6838], which sets requirements for security analysis of media
   type registrations. [RFC3778] (which this document obsoletes)
   contained a detailed analysis of some of the security issues for PDF
   implementations known at the time.
   While that analysis isn't necessarily wrong, its threat analysis is
   much too limited, and its mitigations are somewhat out of date.
   There is now extensive literature on security threats involving PDF
   implementations and how to avoid them, consistent with broad
   implementation over decades. We are not registering a new media type
   but rather making a primarily administrative update. With those
   caveats:

   The PDF file format allows several constructs which may compromise
   security if handled inadequately by PDF processors. For example:

   o  PDF may contain scripts to customize the displaying and processing
      of PDF files. These scripts are expressed in a version of
      JavaScript and are intended for execution by the PDF processor.

   o  A PDF file may refer to other PDF files for portions of content.
      PDF processors are expected to find these external files and load
      them in order to display the document.

   o  PDF may act as a container for various files embedded in it (for
      example, as attached files). PDF processors may offer
      functionality to open and display such files or store them on the
      system, such as with the "ef" open action. The PDF specification
      places no restrictions on types of files which may be embedded, so
      PDF processors should be extremely careful to prevent unwanted
      execution of attached executables or decompression of attached
      archives which may store dangerous files in the host file system.

   o  PDF files may contain links to content on the Internet. PDF
      processors may offer functionality to show such content upon
      following the link.

   o  The fragment identifier syntax (Section 3) contains directives for
      opening ("ef") or including ("fdf") additional material.

   PDF interpreters executing any scripts or programs related to these
   constructs must be extremely careful to ensure that untrusted
   software is executed in a protected environment.
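
   As an illustration of the last item in the list above, the following
   Python sketch splits a fragment identifier (Section 3) into its
   parameters, in left-to-right order, and reports which of them would
   open ("ef") or include ("fdf") additional material, so that a
   processor could prompt the user before acting on them. The function
   names and the SENSITIVE_PARAMETERS set are hypothetical; they are
   not defined by this document or by ISO 32000-2. The sketch makes
   simplifying assumptions: delimiter characters inside values are
   assumed to be URI-encoded, quoted "search" word lists get no special
   treatment, and the remainder of a fragment after "ef" is not handed
   to the embedded file's own media type.

```python
import re
from urllib.parse import unquote

# Hypothetical policy set: parameters that Section 7 singles out as
# opening ("ef") or including ("fdf") additional material.
SENSITIVE_PARAMETERS = {"ef", "fdf"}


def parse_pdf_fragment(uri):
    """Return the fragment's parameters as (name, [values]) pairs.

    Parameters are separated by "&" or "#", each has the form
    name=value, and lists of values are comma-separated. Order is
    preserved because parameters are processed left to right.
    """
    # Everything after the first "#" is the fragment identifier.
    _, _, fragment = uri.partition("#")
    parameters = []
    for part in re.split(r"[&#]", fragment):
        if not part:
            continue  # skip empty segments such as a trailing "&"
        name, has_value, raw = part.partition("=")
        # Split on literal commas first, then undo URI encoding, so an
        # encoded comma (%2C) stays inside its value.
        values = [unquote(v) for v in raw.split(",")] if has_value else []
        parameters.append((name, values))
    return parameters


def sensitive_parameters(parameters):
    """List, in order, the parameter names that warrant a user prompt."""
    return [name for name, _ in parameters if name in SENSITIVE_PARAMETERS]


if __name__ == "__main__":
    uri = "http://example.com/doc.pdf#page=3&zoom=200,10,20&fdf=data.fdf"
    params = parse_pdf_fragment(uri)
    print(params)
    print(sensitive_parameters(params))
```

   A processor following this approach would inspect the result of
   sensitive_parameters() before opening an embedded file or importing
   form data, and would treat any unrecognized parameter name as
   something to ignore rather than act upon.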

   In addition, the PDF processor itself, as well as its plugins,
   scripts, etc., may be a source of insecurity, by either obvious or
   subtle means.

8.  IANA Considerations

   This document updates the registration of "application/pdf", a media
   type registration as defined in [RFC6838]:

   Type name: application

   Subtype name: pdf

   Required parameters: none

   Optional parameters: none

   Encoding considerations: binary

   Security considerations: See Section 7 of this document.

   Interoperability considerations: See Section 5 of this document.

   Published specification: ISO 32000-1:2008 (PDF 1.7) [ISOPDF]. ISO
      32000-2 (PDF 2.0) [ISOPDF2] is currently under development.

   Applications which use this media type: See Section 6 of this
      document.

   Fragment identifier considerations: See Section 3 of this document.

   Additional information:

      Deprecated alias names for this type: none

      Magic number(s): All PDF files start with the characters "%PDF-"
         followed by the PDF version number, e.g., "%PDF-1.7". These
         characters are in US-ASCII encoding.

      File extension(s): .pdf

      Macintosh file type code(s): "PDF "

   Person & email address to contact for further information: Duff
      Johnson, Peter Wyatt, ISO 32000 Project Leaders

   Intended usage: COMMON

   Restrictions on usage: none

   Author: Authors of this document

   Change controller: ISO; in particular, ISO 32000 is by ISO/TC 171/SC
      02/WG 08, "PDF specification". Duff Johnson and Peter Wyatt

9.  References

   [PS]       Adobe Systems Incorporated, "PostScript Language
              Reference, third edition", 1999.

   [AdobePDF] Adobe Systems Incorporated, "PDF Reference, sixth
              edition", 2006.

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
              Specifications and Registration Procedures", BCP 13,
              RFC 6838, DOI 10.17487/RFC6838, January 2013.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, DOI 10.17487/RFC3986, January 2005.

   [RFC3778]  Taft, E., Pravetz, J., Zilles, S., and L. Masinter, "The
              application/pdf Media Type", RFC 3778,
              DOI 10.17487/RFC3778, May 2004.

Appendix A.  Changes since RFC 3778

   This specification replaces RFC 3778, which previously defined the
   "application/pdf" Media Type. Differences include:

   o  To reflect the transition from a proprietary specification by
      Adobe to an open ISO Standard, the Change Controller has changed
      from Adobe to ISO, and references have been updated.

   o  The overview of PDF capabilities, the history of PDF, and the
      descriptions of PDF subsets were updated to reflect more recent
      relevant history.

   o  The section on Fragment identifiers was updated to closely reflect
      the material which has been added to ISO 32000-2.

   o  The status of popular PDF implementations was updated.

   o  The Security Considerations were updated to match the current
      understanding of PDF vulnerabilities.

   o  The registration template was updated to match RFC 6838.

Authors' Addresses

   Matthew Hardy
   Adobe Systems Incorporated
   345 Park Ave
   San Jose, CA 95110
   USA

   Email: mahardy@adobe.com


   Larry Masinter
   Adobe Systems Incorporated
   345 Park Ave
   San Jose, CA 95110
   USA

   Email: masinter@adobe.com
   URI:   http://larry.masinter.net


   Dejan Markovic
   Adobe Systems Incorporated
   345 Park Ave
   San Jose, CA 95110
   USA

   Email: dmarkovi@adobe.com


   Duff Johnson
   PDF Association
   Neue Kantstrasse 14
   Berlin 14057
   Germany

   Email: duff.johnson@pdfa.org


   Martin Bailey
   Global Graphics
   2030 Cambourne Business Park
   Cambridge CB23 6DW
   UK

   Email: martin.bailey@globalgraphics.com
   URI:   http://www.globalgraphics.com