Network Working Group                                        L. Masinter
Request for Comments: 2397                             Xerox Corporation
Category: Standards Track                                    August 1998


                         The "data" URL scheme

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements. Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1998). All Rights Reserved.

1. Abstract

   A new URL scheme, "data", is defined. It allows inclusion of small
   data items as "immediate" data, as if they had been included
   externally.

2. Description

   Some applications that use URLs also need to embed (small) media
   type data directly inline. This document defines a new URL scheme
   that works like 'immediate addressing'. The URLs are of the form:

                    data:[<mediatype>][;base64],<data>

   The <mediatype> is an Internet media type specification (with
   optional parameters). The appearance of ";base64" means that the data
   is encoded as base64. Without ";base64", the data (as a sequence of
   octets) is represented using ASCII encoding for octets inside the
   range of safe URL characters and using the standard %xx hex encoding
   of URLs for octets outside that range. If <mediatype> is omitted, it
   defaults to text/plain;charset=US-ASCII. As a shorthand,
   "text/plain" can be omitted while the charset parameter is still
   supplied.

   The "data:" URL scheme is only useful for short values. Note that
   some applications that use URLs may impose a length limit; for
   example, URLs embedded within <A> anchors in HTML have a length limit
   determined by the SGML declaration for HTML [RFC1866].
   The LITLEN (1024) limits the number of characters which can appear in
   a single attribute value literal, the ATTSPLEN (2100) limits the sum
   of all lengths of all attribute value specifications which appear in
   a tag, and the TAGLEN (2100) limits the overall length of a tag.

   The "data" URL scheme has no relative URL forms.

3. Syntax

       dataurl    := "data:" [ mediatype ] [ ";base64" ] "," data
       mediatype  := [ type "/" subtype ] *( ";" parameter )
       data       := *urlchar
       parameter  := attribute "=" value

   where "urlchar" is imported from [RFC2396], and "type", "subtype",
   "attribute" and "value" are the corresponding tokens from [RFC2045],
   represented using URL escaped encoding of [RFC2396] as necessary.

   Attribute values in [RFC2045] are allowed to be either represented as
   tokens or as quoted strings. However, within a "data" URL, the
   "quoted-string" representation would be awkward, since the quote mark
   is itself not a valid urlchar. For this reason, parameter values
   should use the URL Escaped encoding instead of quoted string if the
   parameter values contain any "tspecial".

   The ";base64" extension is distinguishable from a content-type
   parameter by the fact that it doesn't have a following "=" sign.

4. Examples

   A data URL might be used for arbitrary types of data. The URL

                          data:,A%20brief%20note

   encodes the text/plain string "A brief note", which might be useful
   in a footnote link.
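
   As an informal illustration (not part of the scheme definition), the
   syntax in Section 3 can be decoded along the following lines. This
   is a minimal Python sketch under stated assumptions: the names
   parse_data_url and DEFAULT_MEDIATYPE are hypothetical, the ";base64"
   marker is matched case-sensitively as written in the grammar, and
   the media type is returned as an unparsed string without validation
   against [RFC2045].

```python
import base64
from urllib.parse import unquote_to_bytes

# Default applied when <mediatype> is omitted entirely (Section 2).
DEFAULT_MEDIATYPE = "text/plain;charset=US-ASCII"


def parse_data_url(url):
    """Split a "data" URL into (mediatype string, payload bytes).

    Hypothetical helper for illustration only.
    """
    scheme, colon, rest = url.partition(":")
    if not colon or scheme.lower() != "data":
        raise ValueError("not a data URL")

    # The first "," ends the header. This assumes that any comma inside
    # a parameter value has been URL escaped.
    header, comma, data = rest.partition(",")
    if not comma:
        raise ValueError('missing "," before the data')

    # ";base64" has no following "=", which is what distinguishes it
    # from a media type parameter.
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[: -len(";base64")]

    if not header:
        mediatype = DEFAULT_MEDIATYPE
    elif header.startswith(";"):
        # Shorthand: "text/plain" omitted, parameters still supplied.
        mediatype = "text/plain" + header
    else:
        mediatype = header

    # Undo %xx escaping first, then base64 if it was signalled.
    payload = unquote_to_bytes(data)
    if is_base64:
        payload = base64.b64decode(payload)
    return mediatype, payload
```

   With this sketch, parse_data_url("data:,A%20brief%20note") returns
   the default media type together with the octets of "A brief note".
   Splitting at the first comma relies on commas inside parameter
   values being URL escaped, as Section 3 recommends for any
   "tspecial".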

   The HTML fragment:

   <IMG
   SRC="data:image/gif;base64,R0lGODdhMAAwAPAAAAAAAP///ywAAAAAMAAw
   AAAC8IyPqcvt3wCcDkiLc7C0qwyGHhSWpjQu5yqmCYsapyuvUUlvONmOZtfzgFz
   ByTB10QgxOR0TqBQejhRNzOfkVJ+5YiUqrXF5Y5lKh/DeuNcP5yLWGsEbtLiOSp
   a/TPg7JpJHxyendzWTBfX0cxOnKPjgBzi4diinWGdkF8kjdfnycQZXZeYGejmJl
   ZeGl9i2icVqaNVailT6F5iJ90m6mvuTS4OK05M0vDk0Q4XUtwvKOzrcd3iq9uis
   F81M1OIcR7lEewwcLp7tuNNkM3uNna3F2JQFo97Vriy/Xl4/f1cf5VWzXyym7PH
   hhx4dbgYKAAA7"
   ALT="Larry">

   could be used for a small inline image in an HTML document. (The
   embedded image is probably near the limit of utility. For anything
   larger, data URLs are likely to be inappropriate.)

   A data URL's media type specification can include other parameters;
   for example, one might specify a charset parameter.

      data:text/plain;charset=iso-8859-7,%be%d3%be

   can be used for a short sequence of Greek characters.

   Some applications may use the "data" URL scheme in order to provide
   setup parameters for other kinds of networking applications. For
   example, one might create a media type

      application/vnd-xxx-query

   whose content consists of a query string and a database identifier
   for the "xxx" vendor's databases. A URL of the form:

      data:application/vnd-xxx-query,select_vcount,fcol_from_fieldtable/local

   could then be used in a local application to launch the "helper" for
   application/vnd-xxx-query and give it the immediate data included.

5. History

   This idea was originally proposed in August 1995. Some versions of
   the data URL scheme have been used in the definition of VRML, and a
   version has appeared as part of a proposal for embedded data in HTML.
   Various changes have been made, based on requests, to elide the media
   type, pack the indication of the base64 encoding more tightly, and
   eliminate "quoted printable" as an encoding since it would not easily
   yield valid URLs without additional %xx encoding, which itself is
   sufficient. The "data" URL scheme is in use in VRML, new applications
   of HTML, and various commercial products. It is being used for object
   parameters in Java and ActiveX applications.

6. Security

   Interpretation of the data within a "data" URL has the same security
   considerations as any implementation of the given media type. An
   application should not interpret the contents of a data URL which is
   marked with a media type that has been disallowed for processing by
   the application's configuration.

   Sites which use firewall proxies to disallow the retrieval of certain
   media types (such as application script languages or types with known
   security problems) will find it difficult to screen against the
   inclusion of such types using the "data" URL scheme. However, they
   should be aware of the threat and take whatever precautions are
   considered necessary within their domain.

   The effect of using long "data" URLs in applications is currently
   unknown; some software packages may exhibit unreasonable behavior
   when confronted with data that exceeds their allocated buffer size.

7. References

   [RFC2396] Berners-Lee, T., Fielding, R., and L. Masinter,
             "Uniform Resource Identifiers (URI): Generic Syntax", RFC
             2396, August 1998.

   [RFC1866] Berners-Lee, T., and D. Connolly, "Hypertext Markup
             Language - 2.0", RFC 1866, November 1995.

   [RFC2045] Freed, N., and N. Borenstein, "Multipurpose Internet Mail
             Extensions (MIME) Part One: Format of Internet Message
             Bodies", RFC 2045, November 1996.

Author contact information:

   Larry Masinter
   Xerox Palo Alto Research Center
   3333 Coyote Hill Road
   Palo Alto, CA 94304

   EMail: masinter@parc.xerox.com

Full Copyright Statement

   Copyright (C) The Internet Society (1998). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.