One document matched: draft-abarth-mime-sniff-00.txt
Working Group A. Barth
Internet-Draft U.C. Berkeley
Expires: July 13, 2009 I. Hickson
Google, Inc.
January 9, 2009
Content-Type Processing Model
draft-abarth-mime-sniff-00
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on July 13, 2009.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Barth & Hickson Expires July 13, 2009 [Page 1]
Internet-Draft Content-Type Processing Model January 2009
Abstract
Many Web servers supply incorrect Content-Type headers with their
HTTP responses. In order to be compatible with these Web servers,
Web browsers must consider the content of HTTP responses as well as
the Content-Type header when determining the effective mime type of
the response. This document describes an algorithm for determining
the effective mime type of HTTP responses that balances security and
compatibility considerations.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Web Pages . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4. Text or Binary . . . . . . . . . . . . . . . . . . . . . . . . 8
5. Unknown Type . . . . . . . . . . . . . . . . . . . . . . . . . 10
6. Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
7. Feed or HTML . . . . . . . . . . . . . . . . . . . . . . . . . 15
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 18
Barth & Hickson Expires July 13, 2009 [Page 2]
Internet-Draft Content-Type Processing Model January 2009
1. Introduction
The HTTP Content-Type header indicates the mime type of an HTTP
responses. However, many HTTP servers supply a Content-Type that
does not match the actual contents of the response. Historically,
Web browsers have been tolerated these servers by examining the
content of HTTP responses in addition to the Content-Type header to
determine the effective mime type of the response.
Without a clear specification of how to "sniff" the mime type, each
browser vendor was forced to reverse engineer the behavior of the
other borwsers and to developed their own algorithm. These divergent
algorithms have lead to a lack of interoperability between browsers
and to security issues when the site intends an HTTP response to be
interpreted as one mime type but the browser interpretes the
responses as another mime type.
These security issues are must severe when a Web site lets users
upload files and then serves the contents of those files with a low-
privilege mime type (such as text/plain or image/jpeg). In the
absense of mime sniffing, this user-generated content will not be
able to run JavaScript, but if the browser treats the response as
text/html, then the user can mount a cross-site scripting attack by
including JavaScript code in the uploaded file.
This document describes a mime sniffing algorithm that carefully
balances the compatibility needs of browser vendors with the security
constraints. The algorithm has been constructed with reference to
mime sniffing algorithms present in popular Web browsers, an
extensive database of Web content, and metrics collected from
implementations deployed to a sizable number of Web users.
Warning! It is imperative that the algorithm in this document be
followed exactly. When a user agent uses different heuristics for
content type detection than the server expects, security problems can
occur. For example, if a server believes that the client will treat
a contributed file as an image (and thus treat it as benign), but a
Web browser believes the content to be HTML (and thus execute any
scripts contained therein), the end user can be exposed to malicious
content, making the user vulnerable to cookie theft attacks and other
cross-site scripting attacks.
Barth & Hickson Expires July 13, 2009 [Page 3]
Internet-Draft Content-Type Processing Model January 2009
2. Metadata
What explicit Content-Type metadata is associated with the resource
(the resource's type information) depends on the protocol that was
used to fetch the resource.
For HTTP resources, only the first Content-Type HTTP header, if any,
contributes any type information; the explicit type of the resource
is then the value of that header, interpreted as described by the
HTTP specifications. If the Content-Type HTTP header is present but
the value of the first such header cannot be interpreted as described
by the HTTP specifications (e.g. because its value doesn't contain a
U+002F SOLIDUS ('/') character), then the resource has no type
information (even if there are multiple Content-Type HTTP headers and
one of the other ones is syntactically correct). [HTTP]
For resources fetched from the file system, user agents should use
platform-specific conventions, e.g. operating system extension/type
mappings.
Extensions must not be used for determining resource types for
resources fetched over HTTP.
For resources fetched over most other protocols, e.g. FTP, there is
no type information.
The algorithm for extracting an encoding from a Content-Type, given a
string s, is as follows. It either returns an encoding or nothing.
1. Find the first seven characters in s that are an ASCII case-
insensitive match for the word "charset". If no such match is
found, return nothing.
2. Skip any U+0009, U+000A, U+000C, U+000D, or U+0020 characters
that immediately follow the word 'charset' (there might not be
any).
3. If the next character is not a U+003D EQUALS SIGN ('='), return
nothing.
4. Skip any U+0009, U+000A, U+000C, U+000D, or U+0020 characters
that immediately follow the equals sign (there might not be any).
5. Process the next character as follows:
* If it is a U+0022 QUOTATION MARK ('"') and there is a later
U+0022 QUOTATION MARK ('"') in s, or
Barth & Hickson Expires July 13, 2009 [Page 4]
Internet-Draft Content-Type Processing Model January 2009
* If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027
APOSTROPHE ("'") in s
Return the string between this character and the next
earliest occurrence of this character.
* If it is an unmatched U+0022 QUOTATION MARK ('"'),
* If it is an unmatched U+0027 APOSTROPHE ("'"), or
* If there is no next character
Return nothing.
* Otherwise
Return the string from this character to the first U+0009,
U+000A, U+000C, U+000D, U+0020, or U+003B character or the
end of s, whichever comes first.
Note: The above algorithm is a willful violation of the HTTP
specification. [RFC2616]
Barth & Hickson Expires July 13, 2009 [Page 5]
Internet-Draft Content-Type Processing Model January 2009
3. Web Pages
The sniffed type of a resource must be found as follows:
1. If the user agent is configured to strictly obey Content-Type
headers for this resource, then jump to the last step in this set
of steps.
2. If the resource was fetched over an HTTP protocol and there is an
HTTP Content-Type header and the value of the first such header
has bytes that exactly match one of the following lines:
+-------------------------------+--------------------------------+
| Bytes in Hexadecimal | Textual representation |
+-------------------------------+--------------------------------+
| 74 65 78 74 2f 70 6c 61 69 6e | text/plain |
+-------------------------------+--------------------------------+
| 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=ISO-8859-1 |
| 3b 20 63 68 61 72 73 65 74 3d | |
| 49 53 4f 2d 38 38 35 39 2d 31 | |
+-------------------------------+--------------------------------+
| 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=iso-8859-1 |
| 3b 20 63 68 61 72 73 65 74 3d | |
| 69 73 6f 2d 38 38 35 39 2d 31 | |
+-------------------------------+--------------------------------+
| 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=UTF-8 |
| 3b 20 63 68 61 72 73 65 74 3d | |
| 55 54 46 2d 38 | |
+-------------------------------+--------------------------------+
...then jump to the "text or binary" section below.
3. Let official type be the type given by the Content-Type metadata
for the resource, ignoring parameters. If there is no such type,
jump to the unknown type step below. Comparisons with this type,
as defined by MIME specifications, are done in an ASCII case-
insensitive manner. [RFC2046]
4. If official type is "unknown/unknown" or "application/unknown",
jump to the unknown type step below.
5. If official type ends in "+xml", or if it is either "text/xml" or
"application/xml", then the sniffed type of the resource is
official type; return that and abort these steps.
6. If official type is an image type supported by the user agent
(e.g. "image/png", "image/gif", "image/jpeg", etc), then jump to
the "images" section below, passing it the official type.
Barth & Hickson Expires July 13, 2009 [Page 6]
Internet-Draft Content-Type Processing Model January 2009
7. If official type is "text/html", then jump to the feed or HTML
section below.
8. The sniffed type of the resource is official type.
Barth & Hickson Expires July 13, 2009 [Page 7]
Internet-Draft Content-Type Processing Model January 2009
4. Text or Binary
1. The user agent may wait for 512 or more bytes of the resource to
be available.
2. Let n be the smaller of either 512 or the number of bytes already
available.
3. If n is 4 or more, and the first bytes of the resource match one
of the following byte sets:
+----------------------+--------------+
| Bytes in Hexadecimal | Description |
+----------------------+--------------+
| FE FF | UTF-16BE BOM |
| FF FE | UTF-16LE BOM |
| EF BB BF | UTF-8 BOM |
+----------------------+--------------+
...then the sniffed type of the resource is "text/plain". Abort
these steps.
4. If none of the first n bytes of the resource are binary data
bytes then the sniffed type of the resource is "text/plain".
Abort these steps.
+-------------------------+
| Binary data byte ranges |
+-------------------------+
| 0x00 -- 0x08 |
| 0x0B |
| 0x0E -- 0x1A |
| 0x1C -- 0x1F |
+-------------------------+
5. If the first bytes of the resource match one of the byte
sequences in the "pattern" column of the table in the unknown
type section below, ignoring any rows whose cell in the
"security" column says "scriptable" (or "n/a"), then the sniffed
type of the resource is the type given in the corresponding cell
in the "sniffed type" column on that row; abort these steps.
Warning! It is critical that this step not ever return a
scriptable type (e.g. text/html), as otherwise that would
allow a privilege escalation attack.
Barth & Hickson Expires July 13, 2009 [Page 8]
Internet-Draft Content-Type Processing Model January 2009
6. Otherwise, the sniffed type of the resource is "application/
octet-stream".
Barth & Hickson Expires July 13, 2009 [Page 9]
Internet-Draft Content-Type Processing Model January 2009
5. Unknown Type
1. The user agent may wait for 512 or more bytes of the resource to
be available.
2. Let stream length be the smaller of either 512 or the number of
bytes already available.
3. For each row in the table below:
* If the row has no "WS" bytes:
1. Let pattern length be the length of the pattern (number of
bytes described by the cell in the second column of the
row).
2. If stream length is smaller than pattern length then skip
this row.
3. Apply the "and" operator to the first pattern length bytes
of the resource and the given mask (the bytes in the cell
of first column of that row), and let the result be the
data.
4. If the bytes of the data matches the given pattern bytes
exactly, then the sniffed type of the resource is the type
given in the cell of the third column in that row; abort
these steps.
* If the row has a "WS" byte:
1. Let index_pattern be an index into the mask and pattern
byte strings of the row.
2. Let index_stream be an index into the byte stream being
examined.
3. Loop: If indexstream points beyond the end of the byte
stream, then this row doesn't match, skip this row.
4. Examine the indexstreamth byte of the byte stream as
follows:
- If the index_patternth byte of the pattern is a normal
hexadecimal byte and not a "WS" byte:
If the "and" operator, applied to the index_streamth
byte of the stream and the index_patternth byte of
Barth & Hickson Expires July 13, 2009 [Page 10]
Internet-Draft Content-Type Processing Model January 2009
the mask, yield a value different that the
index_patternth byte of the pattern, then skip this
row.
Otherwise, increment index_pattern to the next byte
in the mask and pattern and index_stream to the next
byte in the byte stream.
- Otherwise, if the indexpatternth byte of the pattern is
a "WS" byte:
"WS" means "whitespace", and allows insignificant
whitespace to be skipped when sniffing for a type
signature.
If the index_streamth byte of the stream is one of
0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0C (ASCII FF),
0x0D (ASCII CR), or 0x20 (ASCII space), then
increment only the index_stream to the next byte in
the byte stream.
Otherwise, increment only the index_pattern to the
next byte in the mask and pattern.
5. If index_pattern does not point beyond the end of the mask
and pattern byte strings, then jump back to the loop step
in this algorithm.
6. Otherwise, the sniffed type of the resource is the type
given in the cell of the third column in that row; abort
these steps.
4. If none of the first n bytes of the resource are binary data
bytes then the sniffed type of the resource is "text/plain".
Abort these steps.
5. Otherwise, the sniffed type of the resource is "application/
octet-stream".
The table used by the above algorithm is:
+-------------------+-------------------+-----------------+------------+
| Mask in Hex | Pattern in Hex | Sniffed type | Security |
+-------------------+-------------------+-----------------+------------+
| FF FF DF DF DF DF | 3C 21 44 4F 43 54 | text/html | Scriptable |
| DF DF DF FF DF DF | 59 50 45 20 48 54 | | |
| DF DF | 4D 4C | | |
| |
Barth & Hickson Expires July 13, 2009 [Page 11]
Internet-Draft Content-Type Processing Model January 2009
| Comment: The string "<!DOCTYPE HTML" in US-ASCII or compatible |
| encodings, case-insensitively. |
+-------------------+-------------------+-----------------+------------+
| FF FF DF DF DF DF | WS 3C 48 54 4D 4C | text/html | Scriptable |
| |
| Comment: The string "<HTML" in US-ASCII or compatible encodings, |
| case-insensitively, possibly with leading spaces. |
+-------------------+-------------------+-----------------+------------+
| FF FF DF DF DF DF | WS 3C 48 45 41 44 | text/html | Scriptable |
| |
| Comment: The string "<HEAD" in US-ASCII or compatible encodings, |
| case-insensitively, possibly with leading spaces. |
+-------------------+-------------------+-----------------+------------+
| FF FF DF DF DF DF | WS 3C 53 43 52 49 | text/html | Scriptable |
| DF DF | 50 54 | | |
| |
| Comment: The string "<SCRIPT" in US-ASCII or compatible |
| encodings, case-insensitively, possibly with leading |
| spaces. |
+-------------------+-------------------+-----------------+------------+
| FF FF FF FF FF | 25 50 44 46 2D | application/pdf | Scriptable |
| |
| Comment: The string "%PDF-", the PDF signature. |
+-------------------+-------------------+-----------------+------------+
| FF FF FF FF FF FF | 25 21 50 53 2D 41 | application/ | Safe |
| FF FF FF FF FF | 64 6F 62 65 2D | postscript | |
| |
| Comment: The string "%!PS-Adobe-", the PostScript signature. |
+-------------------+-------------------+-----------------+------------+
| FF FF 00 00 | FE FF 00 00 | text/plain | n/a |
| |
| Comment: UTF-16BE BOM |
+-------------------+-------------------+-----------------+------------+
| FF FF 00 00 | FF FF 00 00 | text/plain | n/a |
| |
| Comment: UTF-16LE BOM |
+-------------------+-------------------+-----------------+------------+
| FF FF FF 00 | EF BB BF 00 | text/plain | n/a |
| |
| Comment: UTF-8 BOM |
+-------------------+-------------------+-----------------+------------+
| FF FF FF FF FF FF | 47 49 46 38 37 61 | image/gif | Safe |
| |
| Comment: The string "GIF87a", a GIF signature. |
+-------------------+-------------------+-----------------+------------+
| FF FF FF FF FF FF | 47 49 46 38 39 61 | image/gif | Safe |
| |
| Comment: The string "GIF89a", a GIF signature. |
Barth & Hickson Expires July 13, 2009 [Page 12]
Internet-Draft Content-Type Processing Model January 2009
+-------------------+-------------------+-----------------+------------+
| FF FF FF FF FF FF | 89 50 4E 47 0D 0A | image/png | Safe |
| FF FF | 1A 0A | | |
| |
| Comment: The PNG signature. |
+-------------------+-------------------+-----------------+------------+
| FF FF FF | FF D8 FF | image/jpeg | Safe |
| |
| Comment: A JPEG SOI marker followed by a byte of another marker. |
+-------------------+-------------------+-----------------+------------+
| FF FF | 42 4D | image/bmp | Safe |
| |
| Comment: The string "BM", a BMP signature. |
+-------------------+-------------------+-----------------+------------+
| FF FF FF FF | 00 00 01 00 | image/vnd. | Safe |
| | | microsoft.icon | |
| |
| Comment: A 0 word following by a 1 word, a Windows Icon signature. |
+-------------------+-------------------+-----------------+------------+
Note: I'd like to add types like MPEG, AVI, Flash, Java, etc, to the
above table.
User agents may support further types if desired, by implicitly
adding to the above table. However, user agents should not use any
other patterns for types already mentioned in the table above, as
this could then be used for privilege escalation (where, e.g., a
server uses the above table to determine that content is not HTML and
thus safe from XSS attacks, but then a user agent detects it as HTML
anyway and allows script to execute).
The column marked "security" is used by the algorithm in the "text or
binary" section, to avoid sniffing text/plain content as a type that
can be used for a privilege escalation attack.
Barth & Hickson Expires July 13, 2009 [Page 13]
Internet-Draft Content-Type Processing Model January 2009
6. Image
If the resource's official type is "image/svg+xml", then the sniffed
type of the resource is its official type (an XML type).
Otherwise, if the first bytes of the resource match one of the byte
sequences in the first column of the following table, then the
sniffed type of the resource is the type given in the corresponding
cell in the second column on the same row:
+-------------------------+--------------------------+----------+
| Bytes in Hexadecimal | Sniffed type | Comment |
+-------------------------+--------------------------+----------+
| 47 49 46 38 37 61 | image/gif | "GIF87a" |
| 47 49 46 38 39 61 | image/gif | "GIF89a" |
| 89 50 4E 47 0D 0A 1A 0A | image/png | |
| FF D8 FF | image/jpeg | |
| 42 4D | image/bmp | "BM" |
| 00 00 01 00 | image/vnd.microsoft.icon | |
+-------------------------+--------------------------+----------+
Otherwise, the sniffed type of the resource is the same as its
official type.
Barth & Hickson Expires July 13, 2009 [Page 14]
Internet-Draft Content-Type Processing Model January 2009
7. Feed or HTML
1. The user agent may wait for 512 or more bytes of the resource to
be available.
2. Let s be the stream of bytes, and let s[i] represent the byte in
s with position i, treating s as zero-indexed (so the first byte
is at i=0).
3. If at any point this algorithm requires the user agent to
determine the value of a byte in s which is not yet available,
or which is past the first 512 bytes of the resource, or which
is beyond the end of the resource, the user agent must stop this
algorithm, and assume that the sniffed type of the resource is
"text/html".
Note: User agents are allowed, by the first step of this
algorithm, to wait until the first 512 bytes of the resource
are available.
4. Initialize pos to 0.
5. If s[0] is 0xEF, s[1] is 0xBB, and s[2] is 0xBF, then set pos to
3. (This skips over a leading UTF-8 BOM, if any.)
6. Loop start: Examine s[pos].
* If it is 0x09 (ASCII tab), 0x20 (ASCII space), 0x0A (ASCII
LF), or 0x0D (ASCII CR)
Increase pos by 1 and repeat this step.
* If it is 0x3C (ASCII "<")
Increase pos by 1 and go to the next step.
* If it is anything else
The sniffed type of the resource is "text/html". Abort
these steps.
7. If the bytes with positions pos to pos+2 in s are exactly equal
to 0x21, 0x2D, 0x2D respectively (ASCII for "!--"), then:
1. Increase pos by 3.
2. If the bytes with positions pos to pos+2 in s are exactly
equal to 0x2D, 0x2D, 0x3E respectively (ASCII for "-->"),
Barth & Hickson Expires July 13, 2009 [Page 15]
Internet-Draft Content-Type Processing Model January 2009
then increase pos by 3 and jump back to the previous step
(the step labeled loop start) in the overall algorithm in
this section.
3. Otherwise, increase pos by 1.
4. Return to step 2 in these substeps.
8. If s[pos] is 0x21 (ASCII "!"):
1. Increase pos by 1.
2. If s[pos] equal 0x3E, then increase pos by 1 and jump back
to the step labeled loop start in the overall algorithm in
this section.
3. Otherwise, return to step 1 in these substeps.
9. If s[pos] is 0x3F (ASCII "?"):
1. Increase pos by 1.
2. If s[pos] and s[pos+1] equal 0x3F and 0x3E respectively,
then increase pos by 1 and jump back to the step labeled
loop start in the overall algorithm in this section.
3. Otherwise, return to step 1 in these substeps.
10. Otherwise, if the bytes in s starting at pos match any of the
sequences of bytes in the first column of the following table,
then the user agent must follow the steps given in the
corresponding cell in the second column of the same row.
Barth & Hickson Expires July 13, 2009 [Page 16]
Internet-Draft Content-Type Processing Model January 2009
+----------------------+-----------------------------------+-----------+
| Bytes in Hexadecimal | Requirement | Comment |
+----------------------+-----------------------------------+-----------+
| 72 73 73 | The sniffed type of the resource | "rss" |
| | is "application/rss+xml"; abort | |
| | these steps. | |
+----------------------+-----------------------------------+-----------+
| 66 65 65 64 | The sniffed type of the resource | "feed" |
| | si "application/atom+xml"; abort | |
| | these steps. | |
+----------------------+-----------------------------------+-----------+
| 72 64 66 3A 52 44 46 | Continue to the next step in this | "rdf:RDF" |
| | algorithm. | |
+----------------------+-----------------------------------+-----------+
If none of the byte sequences above match the bytes in s
starting at pos, then the sniffed type of the resource is "text/
html". Abort these steps.
11. ???? If, before the next ">", you find two xmlns* attributes
with http://www.w3.org/1999/02/22-rdf-syntax-ns# and
http://purl.org/rss/1.0/ as the namespaces, then the sniffed
type of the resource is "application/rss+xml", abort these
steps. (maybe we only need to check for http://purl.org/rss/1.0/
actually) ????
12. Otherwise, the sniffed type of the resource is "text/html".
For efficiency reasons, implementations may wish to implement this
algorithm and the algorithm for detecting the character encoding of
HTML documents in parallel.
Barth & Hickson Expires July 13, 2009 [Page 17]
Internet-Draft Content-Type Processing Model January 2009
Authors' Addresses
Adam Barth
Univeristy of California, Berkeley
Email: abarth@eecs.berkeley.edu
URI: http://www.adambarth.com/
Ian Hickson
Google, Inc.
Email: ian@hixie.ch
URI: http://ln.hixie.ch/
Barth & Hickson Expires July 13, 2009 [Page 18]
| PAFTECH AB 2003-2026 | 2026-04-24 01:28:58 |