<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type='text/xsl' href='rfc2629xslt/rfc2629.xslt' ?>
<?rfc header="Documentation"?>
<!--?rfc private="RFC2629 through XSLT"?-->
<?rfc toc="yes"?>
<!-- <?rfc topblock="no"?> -->
<?rfc strict="no"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc-ext allow-markup-in-artwork="yes" ?>
<?rfc-ext include-references-in-index="yes" ?>
<rfc docName="draft-ruellan-headerdiff-00">
<front>
<title abbrev="HTTP Header Diff">Header Diff: A compact HTTP header representation
for HTTP/2.0</title>
<author initials="H" surname="Ruellan" fullname="Hervé Ruellan">
<organization abbrev=""></organization>
<address>
<postal>
<street></street>
<city></city>
<region></region>
<code></code>
<country></country>
</postal>
<phone></phone>
<email>herve.ruellan@crf.canon.fr</email>
<uri></uri>
</address>
</author>
<author initials="J" surname="Fujisawa" fullname="Jun Fujisawa">
<organization abbrev="">Canon, Inc.</organization>
<address>
<postal>
<street>3-30-2 Shimomaruko</street>
<city>Ohta-ku, Tokyo </city>
<region></region>
<code>146-8501</code>
<country>Japan</country>
</postal>
<phone></phone>
<email>fujisawa.jun@canon.co.jp</email>
<uri></uri>
</address>
</author>
<author initials="R" surname="Bellessort" fullname="Romain Bellessort">
<organization abbrev=""></organization>
<address>
<postal>
<street></street>
<city></city>
<region></region>
<code></code>
<country></country>
</postal>
<phone></phone>
<email>romain.bellessort@crf.canon.fr</email>
<uri></uri>
</address>
</author>
<author initials="Y" surname="Fablet" fullname="Youenn Fablet">
<organization abbrev=""></organization>
<address>
<postal>
<street></street>
<city></city>
<region></region>
<code></code>
<country></country>
</postal>
<phone></phone>
<email>youenn.fablet@crf.canon.fr</email>
<uri></uri>
</address>
</author>
<date month="March" year="2013" />
<keyword>HTTP</keyword>
<keyword>Header</keyword>
<abstract>
<t>This document describes a format adapted to efficiently represent
HTTP headers in the context of HTTP/2.0.</t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>
This document describes a format adapted to efficiently represent
HTTP headers in the context of HTTP/2.0.
</t>
</section>
<section title="Overview" anchor="overview">
<section title="Design Principles">
<t>
HTTP headers can be represented in various ways.
As shown by SPDY,
Deflate compresses HTTP headers very well.
However, the use of Deflate has been found to cause security issues.
In particular, compressing sensitive data
together with other data controlled by an attacker
may leak the sensitive data.
The processing and memory costs may also
be too high for some classes of devices.
</t>
<t>
Having a lightweight compact HTTP header representation is therefore useful.
To design this representation, the focus was put on the following points:
<list style="symbols">
<t>Simplicity: the representation should have a small number of
options that allow handling any kind of header;
in particular, the use of dedicated codecs for each type of header
value is not considered here.</t>
<t>Efficiency: the representation should provide good compression
at a small
encoding/decoding cost for both processing and memory.</t>
<t>Flexibility: the representation should be compatible with
constrained devices,
but also provide improved efficiency when
more capable devices are used.</t>
<t>Deflate-friendly: Deflate has proven its efficiency for
encoding HTTP headers.
A good HTTP header representation should be
efficient as a
pre-compression step prior to applying Deflate.</t>
</list>
</t>
</section>
<section title="Outline">
<t>
The HTTP header representation described in this document is based on
indexing tables that store (name,value) pairs,
called header tables in the remainder of this document.
Header tables are incrementally updated during the whole HTTP/2.0
session.
Two independent header tables are used during an HTTP/2.0 session, one
for HTTP request headers and one for HTTP response headers.
</t>
<t>
The encoder is responsible for deciding which headers to insert as
(name,value) pairs in the header table.
The decoder follows exactly what the encoder prescribes.
This enables decoders to remain simple and
understand a wide variety of encoders.
</t>
<t>
A header may be represented as a literal, an index or a delta.
If represented as a literal or a delta, the representation specifies
whether this header is used to update the indexing table.
The different representations are described in
<xref target="header.representation" />.
</t>
<t>
To make the literal header representation more compact, header names
are indexed in a specific name table.
Two independent name tables are used during an HTTP/2.0 session, one
for HTTP request headers and another for HTTP response headers.
</t>
<t>
An example illustrating the use of the different tables to represent
headers is available in <xref target="example"/>.
Once a set of headers is represented using the available representations,
it can optionally be compressed with Deflate.
</t>
</section>
<section title="Integration within HTTP/2.0 ">
<t>
The headers are inserted in the HTTP/2.0 frames at the same place as
defined in SPDY
(the following chart is adapted from draft-ietf-httpbis-http2-00).
<figure>
<artwork>
+------------------------------------+
|1| version | 1 |
+------------------------------------+
| Flags (8) | Length (24 bits) |
+------------------------------------+
|X| Stream-ID (31bits) |
+------------------------------------+
|X| Associated-To-Stream-ID (31bits) |
+------------------------------------+
| Pri|Unused | Slot | |
+-------------------+ |
| Number of Name/Value pairs (int32) |
+------------------------------------+
| Encoded name/value pair | <+
+------------------------------------+ | (*)
| (repeats) | <+
(*) This section is the "Name/Value Header Block",
and may be compressed.
</artwork>
</figure>
</t>
<t>
The modifications to SPDY are the following:
<list style="symbols">
<t>
The headers are not represented as string tokens but using one of
the possible representations described in
<xref target="detailed.format" />.
</t>
<t>The Deflate step is made optional.</t>
</list>
</t>
<section title="Deflate Usage" anchor="deflate">
<t>
The header representation described in
<xref target="detailed.format" />
is amenable to Deflate compression.
The Deflate algorithm improves the compression at the expense of
additional processing.
</t>
<t>
At least two potential drawbacks have been identified when using
Deflate.
First, security issues may arise when using Deflate, like the
<eref
target="http://lists.w3.org/Archives/Public/ietf-http-wg/2012JulSep/1273.html">CRIME attack</eref>.
Second, it may increase the workload of network intermediaries: they may need to uncompress and recompress the headers
of all messages, even though they only need to process a few of them.
</t>
<t>
The use of Deflate may still be envisioned if properly set up.
Several approaches are available and should be studied:
<list>
<t>
Restricting Deflate to Huffman-only coding is an option.
This is supported by many Deflate implementations such as zlib.
It may be used in environments
subject to CRIME attacks.
This approach should be compared to the direct use of hand-tailored
Huffman coding.
</t>
<t>
The use of indexing mechanisms prior to Deflate may solve
some security issues.
A more precise analysis of the security impact of using indexing
mechanisms prior to Deflate should be carried out, as described in
<xref target="security.issues" />.
</t>
<t>
Partial use of Deflate on a selected subset of headers may also be
an option as described in
<xref target="deflate.partial" />.
</t>
<t>
Restricting the use of Deflate to safe cases,
such as controlled environments (widgets, native applications), anonymous connections and so on can be envisioned.
For instance, restricting the use of Deflate to HTTP response headers should not enable CRIME-like attacks.
</t>
</list>
</t>
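<t>
As an illustration of the Huffman-only option, the following Python sketch
uses the Z_HUFFMAN_ONLY strategy exposed by the standard zlib module.
The function names, the raw Deflate framing (negative window bits) and the
sample header block are assumptions made for this example only; they are
not defined by this document.
</t>
<figure>
<artwork><![CDATA[
```python
import zlib


def deflate_huffman_only(data):
    """Compress bytes with Deflate restricted to Huffman coding.

    Assumption: a raw Deflate stream (wbits=-15) is used, without the
    zlib header and checksum.
    """
    compressor = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION,  # level
        zlib.DEFLATED,               # method
        -15,                         # raw Deflate, 32 KiB window
        8,                           # memLevel (zlib default)
        zlib.Z_HUFFMAN_ONLY,         # no string matching, Huffman only
    )
    return compressor.compress(data) + compressor.flush()


def inflate(data):
    """Decompress a raw Deflate stream produced above."""
    return zlib.decompress(data, -15)


# Hypothetical header block, used only to exercise the functions.
block = b"accept: text/html\r\naccept-encoding: gzip\r\n"
packed = deflate_huffman_only(block)
```
]]></artwork>
</figure>
<t>
Since no back-references to earlier data are emitted with this strategy,
the output size depends only on symbol frequencies, which is the
property of interest when considering CRIME-like attacks.
</t>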
</section>
</section>
</section>
<section title="Indexing Strategies" anchor="indexing.strategies">
<section title="Indexing Tables" anchor="indexing.tables">
<section title="Header Table" anchor="header.table">
<t>
A header table consists of an ordered list of (name, value) pairs.
Once a header pair is inserted in the header table, its index does not change until the pair gets removed.
A pair is either inserted at the end of the table or replaces an existing pair depending on the chosen representation.
</t>
<t>
Header names should be represented as lower-case strings.
A header name matches a pair name if they are equal using
a character-based,
<spanx>case insensitive</spanx>
comparison.
A header value matches a pair value if they are equal using
a character-based,
<spanx>case sensitive</spanx>
comparison.
A header matches a (name,value) pair if both its name and its value
match.
</t>
<t>
The header table is progressively updated
based on headers represented as literal (as defined in <xref target="literal" />) or delta (as defined in <xref target="delta" />).
Two update mechanisms are defined:
<list style="symbols">
<t>
Incremental indexing: the represented header is inserted at the end
of the header table as a (name, value) pair.
The inserted pair index is set to the next free index in the table:
it is equal to the number of headers
in the table before its insertion.
</t>
<t>
Substitution indexing: the represented header contains
an index to an existing (name,value) pair.
The existing pair value is replaced by the header value.
</t>
</list>
Incremental and substitution indexing are optional.
If none of them is selected in a header representation,
the header table is not updated.
In particular, no update happens on the header table
when processing an indexed representation.
</t>
<t>
The header table size can be bounded so as to limit the memory
requirements.
The header table size is defined as the sum of the length
(as defined in <xref target="string.literal.representation" />)
of the values of all header table pairs.
Header names are not counted in the header table size.
</t>
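<t>
The header table behaviour described in this section can be sketched as
follows, in Python. The class and method names are hypothetical.
The sketch assumes that substitution replaces the whole (name, value)
pair at the given index and that the size is counted in UTF-8 bytes.
</t>
<figure>
<artwork><![CDATA[
```python
class HeaderTable:
    """Ordered list of (name, value) pairs (illustrative sketch)."""

    def __init__(self):
        self.pairs = []

    def insert_incremental(self, name, value):
        # The new pair takes the next free index, i.e. the number of
        # pairs present in the table before the insertion.
        index = len(self.pairs)
        self.pairs.append((name.lower(), value))
        return index

    def substitute(self, index, name, value):
        # Assumption: the pair at `index` is replaced as a whole.
        # The indexes of all other pairs are unchanged.
        self.pairs[index] = (name.lower(), value)
        return index

    def find(self, name, value):
        # Names compare case-insensitively, values case-sensitively.
        wanted = name.lower()
        for index, (n, v) in enumerate(self.pairs):
            if n == wanted and v == value:
                return index
        return None

    def size(self):
        # Only values count toward the table size; names do not.
        return sum(len(v.encode("utf-8")) for _, v in self.pairs)
```
]]></artwork>
</figure>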
</section>
<section title="Name Table" anchor="name.table">
<t>
A name table is an ordered list of name entries
that is used to efficiently represent header names.
A header name matches a name table entry if they are equal using
a character-based, <spanx>case insensitive</spanx> comparison.
</t>
<t>
If a header name matches a name table entry, it is
represented as an integer based on the index of the entry,
as described in <xref target="integer.representation" />.
If a header name does not match any name table entry,
it is represented as a string, as described in
<xref target="string.literal.representation" />.
A new entry containing the name is then inserted at the end of the name table.
Once inserted in the name table,
a header name is never removed and its index never changes.
</t>
<t>
To optimize the representation of the headers exchanged at the beginning of the
HTTP/2.0 session, the header name table is initially populated with common header names.
The initial header names list is provided in <xref target="initial.headers" />.
</t>
</section>
</section>
<section title="Header Representation" anchor="header.representation">
<section title="Literal Representation" anchor="literal">
<t>
The literal representation defines a header
independently of the header table.
A literal header is represented as:
<list style="symbols">
<t>
A header name, represented using the name table, as described
in <xref target="name.table" />.
</t>
<t>
The header value, represented as a literal string,
as described in <xref target="string.literal.representation" />.
</t>
</list>
</t>
</section>
<section title="Indexed Representation" anchor="indexed">
<t>
The indexed representation defines a header as a match
to a (name,value) pair in the header table.
An indexed header is represented as:
<list style="symbols">
<t>
An integer representing the index of the matching
(name,value) pair, as described in <xref target="integer.representation" />.
</t>
</list>
</t>
</section>
<section title="Delta Representation" anchor="delta">
<t>
The delta representation defines a header as a reference to a (name,value) pair
contained in the header table.
The name of the represented header must match the name of the reference pair.
The value of the represented header and the value of the reference pair should share a common prefix.
</t>
<t>
A delta header is represented as:
<list style="symbols">
<t>
An integer representing the index of the reference (name,
value) pair, as described in <xref target="integer.representation" />.
The pair name must match the name of the header.</t>
<t>
An integer representing the length of the common prefix shared
between the header value and the pair value,
as described in <xref target="integer.representation" />.</t>
<t>
A string representing the suffix value to append to the common prefix to
obtain the header value,
as defined in <xref target="string.literal.representation" />.
</t>
</list>
</t>
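<t>
The delta computation described above can be sketched as follows, in
Python. The function names and the sample values are hypothetical.
The sketch assumes that the common prefix length is counted in bytes of
the UTF-8 representation, which this document does not state explicitly.
</t>
<figure>
<artwork><![CDATA[
```python
def common_prefix_length(reference, value):
    """Number of leading bytes shared by two byte strings."""
    count = 0
    for a, b in zip(reference, value):
        if a != b:
            break
        count += 1
    return count


def make_delta(reference_value, header_value):
    """Return (prefix_length, suffix) for a header value.

    Assumption: lengths are measured on UTF-8 bytes.
    """
    ref = reference_value.encode("utf-8")
    val = header_value.encode("utf-8")
    length = common_prefix_length(ref, val)
    return length, val[length:]


def apply_delta(reference_value, prefix_length, suffix):
    """Rebuild the header value from the reference pair value."""
    ref = reference_value.encode("utf-8")
    return (ref[:prefix_length] + suffix).decode("utf-8")
```
]]></artwork>
</figure>
<t>
For instance, with a reference value "/index.html" and a header value
"/index.php", the common prefix is "/index." and only the suffix "php"
has to be transmitted.
</t>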
</section>
</section>
</section>
<section title="Detailed Format" anchor="detailed.format">
<section title="Low-level representations" anchor="string.encoding">
<section title="Integer representation" anchor="integer.representation">
<t>
Integers are used to represent name indexes, pair indexes or string lengths.
The integer representation keeps byte-alignment as much
as possible as this allows various processing
optimizations as well as efficient use of DEFLATE.
For that purpose, an integer representation
always finishes at the end of a byte.
</t>
<t>
An integer is represented in two parts: a prefix that fills the current byte
and an optional list of bytes that are used if the integer value does not fit in the prefix.
The number of bits of the prefix (called N) is a parameter of the integer representation.
</t>
<t>
The N-bit prefix allows filling the current byte.
If the value is small enough (strictly less than 2^N-1), it is encoded within the N-bit prefix.
Otherwise all the bits of the prefix are set to 1
and the value is encoded using an <eref target="http://en.wikipedia.org/wiki/Variable-length_quantity">
unsigned variable length integer</eref> representation.
</t>
<t>
The algorithm to represent an integer I is as follows:
<list style="numbers">
<t>If I &lt; 2^N - 1, encode I on N bits</t>
<t>Else, encode 2^N - 1 on N bits and do the following steps:</t>
<t><list style="numbers">
<t> Set I to I - (2^N - 1) and Q to 1</t>
<t>While Q > 0</t>
<t><list style="numbers">
<t>Compute Q and R, quotient and remainder of I divided by 2^7</t>
<t>If Q is strictly greater than 0, write one 1 bit;
otherwise, write one 0 bit</t>
<t>Encode R on the next 7 bits</t>
<t>I = Q</t>
</list></t>
</list></t>
</list>
</t>
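<t>
The algorithm above can be sketched as follows, in Python.
The function names and the (prefix value, extra bytes) return shape are
hypothetical. The sketch assumes that the offset subtracted from I is
(2^N - 1), so that the value 2^N - 1 itself is followed by a single zero
byte, and that each extra byte carries the continuation bit in its most
significant position.
</t>
<figure>
<artwork><![CDATA[
```python
def encode_integer(value, prefix_bits):
    """Return (prefix_value, extra_bytes) for a non-negative integer.

    prefix_value is to be written on prefix_bits bits; extra_bytes
    follow it when the value does not fit in the prefix.
    """
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return value, b""
    value -= limit  # assumption: the offset is 2^N - 1
    extra = bytearray()
    while True:
        quotient, remainder = divmod(value, 128)
        # Continuation bit is 1 while more groups of 7 bits follow.
        extra.append((0x80 if quotient > 0 else 0x00) | remainder)
        if quotient == 0:
            return limit, bytes(extra)
        value = quotient


def decode_integer(prefix_value, prefix_bits, extra):
    """Inverse of encode_integer."""
    limit = (1 << prefix_bits) - 1
    if prefix_value < limit:
        return prefix_value
    value = 0
    shift = 0
    for byte in extra:
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    return limit + value
```
]]></artwork>
</figure>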
<section title="Example 1: Encoding 10 using a 5-bit prefix"
anchor="integer.representation.example1">
<t>
The value 10 is to be encoded with a 5-bit prefix.
<list style="symbols">
<t>
10 is less than 31 (= 2^5 - 1)
and is represented using the 5-bit prefix.
</t>
</list>
</t>
<figure>
<artwork>
+-----+-----+-----+-----+-----+-----+-----+-----+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+-----+-----+-----+-----+-----+-----+-----+-----+
| X | X | X | 0 | 1 | 0 | 1 | 0 | 10 stored on 5 bits
+-----+-----+-----+-----+-----+-----+-----+-----+
</artwork>
</figure>
</section>
<section title="Example 2: Encoding 1337 using a 5-bit prefix"
anchor="integer.representation.example2">
<t>
The value I=1337 is to be encoded with a 5-bit prefix.
<list style="symbols">
<t>1337 is greater than 31 (= 2^5 - 1).</t>
<t><list style="symbols">
<t>The 5-bit prefix is filled with its max value (31).</t>
</list></t>
<t>The value to represent on next bytes is I = 1337 - (2^5 - 1) =
1306.</t>
<t><list style="symbols">
<t>1306 = 128*10 + 26, i.e. Q=10 and R=26.</t>
<t>Q is strictly greater than 0, bit 8 is set to 1.</t>
<t>The remainder R=26 is encoded on next 7 bits.</t>
<t>I is replaced by the quotient Q=10.</t>
</list></t>
<t>The value to represent on next bytes is I = 10.</t>
<t><list style="symbols">
<t>10 = 128*0 + 10, i.e. Q=0 and R=10.</t>
<t>Q is equal to 0, bit 16 is set to 0.</t>
<t>The remainder R=10 is encoded on next 7 bits.</t>
<t>I is replaced by the quotient Q=0.</t>
</list></t>
<t>The process ends.</t>
</list>
</t>
<figure>
<artwork>
+-----+-----+-----+-----+-----+-----+-----+-----+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+-----+-----+-----+-----+-----+-----+-----+-----+
| X | X | X | 1 | 1 | 1 | 1 | 1 | Prefix = 31
| 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | Q>=1, R=26
| 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | Q=0 , R=10
+-----+-----+-----+-----+-----+-----+-----+-----+
</artwork>
</figure>
</section>
</section>
<section title="String literal representation" anchor="string.literal.representation">
<t>
Literal strings can represent header names,
header values or, in the case of delta coding, header value suffixes.
They are encoded in two parts:
<list style="numbers">
<t>The string length, defined as the number of bytes needed to
store its UTF-8 representation, is represented as an integer with
a zero-bit prefix.
If the string length is strictly less than 128, it is represented
as one byte.</t>
<t>The string value represented as a list of UTF-8 characters.</t>
</list>
</t>
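<t>
A minimal Python sketch of this string literal encoding follows.
The function names are hypothetical. With a zero-bit prefix, the length
is carried entirely by the variable length bytes of the integer
representation, which is what encode_length produces here.
</t>
<figure>
<artwork><![CDATA[
```python
def encode_length(value):
    """Integer with a zero-bit prefix: variable length bytes only."""
    out = bytearray()
    while True:
        value, remainder = divmod(value, 128)
        # Continuation bit is 1 while more groups of 7 bits follow.
        out.append((0x80 if value > 0 else 0x00) | remainder)
        if value == 0:
            return bytes(out)


def encode_string(text):
    """Length in UTF-8 bytes, followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_length(len(data)) + data


def decode_string(buffer):
    """Return (text, number_of_bytes_consumed)."""
    length = 0
    shift = 0
    pos = 0
    while True:
        byte = buffer[pos]
        pos += 1
        length |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    data = buffer[pos:pos + length]
    return data.decode("utf-8"), pos + length
```
]]></artwork>
</figure>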
</section>
</section>
<section title="Indexed Header Representation">
<t>
An indexed header is represented as a short indexed header if
the index of the matching pair is strictly below 64.
Otherwise it is represented as a long indexed header.
</t>
<section title="Short Indexed Header">
<figure>
<artwork>
+-------+-------+-------+-------+-------+-------+-------+-------+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+-------+-------+-------+-------+-------+-------+-------+-------+
| 1 | 0 | 00 0000 - 11 1111 |
| | | Matching pair index |
| | | (if strictly lower than 64) |
+-------+-------+-------+-------+-------+-------+-------+-------+
</artwork>
</figure>
<t>
This representation starts with the '10' 2-bit pattern,
followed by the index of the matching pair,
represented on 6 bits.
A short indexed header is always coded in one byte.
</t>
</section>
<section title="Long Indexed Header">
<figure>
<artwork>
+-------+-------+-------+-------+-------+-------+-------+-------+
| 0 | 1 | 2 | 3 | ... | ... | e | f |
+-------+-------+-------+-------+-------+-------+-------+-------+
| 1 | 1 | 00 0000 0000 0000 - 11 1111 1111 1111 |
| | | Matching pair index |
| | | (if equal to or greater than 64) |
+-------+-------+-------+-------+-------+-------+-------+-------+
</artwork>
</figure>
<t>
This representation starts with the '11' 2-bit pattern,
followed by the value of the index of the matching pair minus 64, represented as an integer with a 14-bit prefix.
A long indexed header is coded in two bytes if the index minus 64 is strictly below 16383.
</t>
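<t>
The short and long indexed header representations can be sketched as
follows, in Python. The function name is hypothetical. The sketch
assumes that the 14-bit prefix is written most significant bits first
across the two bytes, and that larger values continue with the variable
length bytes of the integer representation.
</t>
<figure>
<artwork><![CDATA[
```python
def encode_indexed(index):
    """Encode an indexed header for a given header table index."""
    if index < 64:
        # Short form: '10' pattern followed by the 6-bit index.
        return bytes([0x80 | index])
    value = index - 64
    limit = (1 << 14) - 1  # 16383
    if value < limit:
        # Long form on two bytes: '11' pattern + 14-bit prefix.
        return bytes([0xC0 | (value >> 8), value & 0xFF])
    # Prefix saturated: all 14 bits set, then variable length bytes.
    out = bytearray([0xFF, 0xFF])
    value -= limit
    while True:
        value, remainder = divmod(value, 128)
        out.append((0x80 if value > 0 else 0x00) | remainder)
        if value == 0:
            return bytes(out)
```
]]></artwork>
</figure>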
</section>
</section>
<section title="Literal Header Representation">
<section title="Literal Header without Indexing">
<figure>
<artwork>
+-------+-------+-------+-------+-------+-------+-------+-------+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+-------+-------+-------+-------+-------+-------+-------+-------+
| | | | 0 0000 |
| | | | New header name symbol |
| 0 | 0 | 0 |---------------------------------------|
| | | | 0 0001 - 1 1111 |
| | | | Index of matching header name |
+-------+-------+-------+-------+-------+-------+-------+-------+
</artwork>
</figure>
<t>
This representation, which does not involve updating the header table, starts with the '000' 3-bit pattern.
</t>
<t>
If the header name matches a header name entry whose index is IN,
the value (IN+1) is represented
as an integer with a 5-bit prefix.
Note that if the index is strictly below 30, one byte is used.
</t>
<t>
If the header name does not match a header name entry,
the value 0 is represented on 5 bits
followed by the header name, represented as a literal string.
</t>
<t>
Header name representation is followed by the header value represented as a literal string.
</t>
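<t>
As an illustration, a literal header without indexing could be
produced as sketched below, in Python. The function names and the
name_table list are hypothetical. The sketch assumes that an unmatched
name is appended to the name table, as described in
<xref target="name.table" />.
</t>
<figure>
<artwork><![CDATA[
```python
def varint(value):
    """Variable length bytes of the integer representation."""
    out = bytearray()
    while True:
        value, remainder = divmod(value, 128)
        out.append((0x80 if value > 0 else 0x00) | remainder)
        if value == 0:
            return bytes(out)


def string_literal(text):
    """Length in UTF-8 bytes, then the UTF-8 bytes."""
    data = text.encode("utf-8")
    return varint(len(data)) + data


def literal_without_indexing(name, value, name_table):
    """'000' pattern, name on a 5-bit prefix, then the value."""
    name = name.lower()
    if name in name_table:
        # Known name: (IN + 1) as an integer with a 5-bit prefix.
        number = name_table.index(name) + 1
        if number < 31:
            head = bytes([number])
        else:
            head = bytes([31]) + varint(number - 31)
    else:
        # New name: 0 on 5 bits, then the name as a literal string.
        head = bytes([0]) + string_literal(name)
        name_table.append(name)
    return head + string_literal(value)
```
]]></artwork>
</figure>
<t>
The header table is left untouched by this representation; only the
name table may grow.
</t>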
</section>
<section title="Literal Header with Indexing">
<figure>
<artwork>
+-------+-------+-------+-------+-------+-------+-------+-------+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+-------+-------+-------+-------+-------+-------+-------+-------+
| | | | | 0000 |
| | | |Indexing| New header name symbol |
| 0 | 0 | 1 | |-------------------------------|
| | | | Mode | 0001 - 1111 |
| | | | | Index of matching header name |
+-------+-------+-------+-------+-------+-------+-------+-------+
</artwork>
</figure>
<t>
This representation starts with the '001' 3-bit pattern.
The fourth bit sets the indexing mode: 0 for incremental indexing, 1
for substitution indexing.
</t>
<t>
If the header name matches a header name entry whose index is IN,
the value (IN+1) is represented as an integer with a 4-bit prefix.
Note that if the index is strictly below 14, one byte is used.
</t>
<t>
If the header name does not match a header name entry,
the value 0 is represented on 4 bits
followed by the header name, represented as a literal string.
</t>
<t>
The header name representation is followed by the header value, represented as a literal string as
described in <xref target="string.literal.representation" />.
In the case of substitution indexing, the index of the substituted (name,value)
pair is inserted before the header value, as an integer with a zero-bit prefix.
</t>
</section>
</section>
<section title="Delta Header Representation">
<section title="Delta Header without Indexing">
<figure>
<artwork>
+-------+-------+-------+-------+-------+-------+-------+-------+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+-------+-------+-------+-------+-------+-------+-------+-------+
| 0 | 1 | 0 | 0 0000 - 1 1111 |
| | | | Index of reference pair |
+-------+-------+-------+-------+-------+-------+-------+-------+
</artwork>
</figure>
<t>
This representation starts with the '010' 3-bit pattern.
</t>
<t>
It continues with the index IR of the reference header pair.
The value IR is represented as an integer with a 5-bit prefix.
Note that if the index is strictly below 31, one byte is used.
</t>
<t>
Index value is followed by:
<list style="numbers">
<t>
the length of the common prefix shared between the header value
and the pair value, represented as an integer with a zero-bit prefix.
</t>
<t>
the header value suffix represented as a literal string.
</t>
</list>
</t>
</section>
<section title="Delta Header with Indexing">
<figure>
<artwork>
+-------+-------+-------+-------+-------+-------+-------+-------+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+-------+-------+-------+-------+-------+-------+-------+-------+
| 0 | 1 | 1 |Indexing| 0000 - 1111 |
| | | | Mode | Index of reference pair |
+-------+-------+-------+-------+-------+-------+-------+-------+
</artwork>
</figure>
<t>
This representation starts with the '011' 3-bit pattern.
The fourth bit sets the indexing mode, 0 for incremental indexing and
1 for substitution indexing.
</t>
<t>
It continues with the index IR of the reference header pair.
The value IR is represented as an integer with a 4-bit prefix.
Note that if the index is strictly below 15, one byte is used.
</t>
<t>
Index value is followed by:
<list style="numbers">
<t>
the length of the common prefix shared between the header value
and the pair value, represented as an integer with a zero-bit
prefix.
</t>
<t>
the header value suffix, represented as a literal string.
</t>
</list>
</t>
</section>
</section>
</section>
<section title="Parameter Negotiation">
<t>
Two parameters may be used to accommodate the client and server processing and memory requirements:
<list style="symbols">
<t>A parameter Nh that configures the size of the header table.
The size can be computed as 2^Nh.
Nh is exchanged as an unsigned integer.
</t>
<t>A parameter Nd that configures the Deflate step.
If Nd is equal to zero, no Deflate step is used.
Otherwise, Deflate is used with a sliding window equal to 2^Nd.
Huffman-only coding is advertised using the Deflate block initial bits.
Nd is exchanged as an unsigned integer.
</t>
</list>
</t>
<t>
This section should be further completed, including but not limited to the following points:
<list style="symbols">
<t>Define default values?</t>
<t>Define when negotiation happens: at the beginning only, any time
during the session...</t>
<t>Define how these parameters are exchanged,
probably using SETTINGS frames, through the definition of 4 settings
(outgoing-Nd, outgoing-Nh, preferred-incoming-Nd,
preferred-incoming-Nh).</t>
</list>
</t>
</section>
<section title="Open Questions">
<section title="Typed Codecs">
<t>
Typed codecs may be useful to represent header values, especially
on the response side.
An additional typed header representation could be defined,
adding support for a small number of codecs such as:
<list style="symbols">
<t>An Integer codec: may be useful for headers such as 'Age' and
'Content-Length'.</t>
<t>A Date codec: may be useful for headers such as 'Date',
'Expires', 'If-Modified-Since', 'Last-Modified'.</t>
</list>
</t>
</section>
<section title="Specific header processing">
<t>
Some (name,value) pairs may be singled out to ease processing by
network nodes, such as:
<list style="symbols">
<t>The request line (verb, version and URL) for HTTP requests.</t>
<t>The response line (status and version) for HTTP responses.</t>
</list>
For those headers, the specification may define specific rules that
can improve the processing cost, at the expense of some compression loss:
<list style="symbols">
<t>All those headers are placed as the first headers in the
SYN_STREAM frame, with a predefined order.</t>
<t>Verb, version and status are represented as integers with
zero-bit prefix.</t>
<t>Indexed, delta or literal representation may be used for URL values.
In the case of delta and literal representation, only the
substitution mode is used so that
a processor only needs to store the URL of the previous message to
compute the URL of the current message.
</t>
</list>
</t>
</section>
<section title="Security Issues" anchor="security.issues">
<t>Adequate use of the delta and indexed representations
before applying Deflate
is expected to mitigate security issues such as the CRIME attack.
For instance, if cookie headers are represented as indexed headers as much as possible,
attackers may be prevented from progressively learning their values.
</t>
<t>
This point should be confirmed with deeper analysis.
Additional study should also be done to evaluate whether
the proposed indexed and delta representations
create any new security issues.
</t>
</section>
<section title="Deflate Partial Usage" anchor="deflate.partial">
<t>
To circumvent Deflate issues related to both security and network
intermediaries,
the header set of a given message can be split in two buckets.
A first bucket would be sent without using Deflate, while the second
bucket would be further compressed using Deflate.
The decision would be made by the encoder.
The specification could define a minimum set of
headers that SHOULD never be compressed using Deflate,
for instance URLs and cookies.
</t>
<t>
Another possibility would be to define Deflate as a specific representation.
The use of Deflate would then be decided on a per-header basis.
That would enable excluding any header that may contain sensitive data.
The overall scheme would be less efficient though (padding bits and so on).
</t>
<t>
Additional analysis of the complexity and benefit of these
approaches would be needed to go further.
For instance, these approaches should be compared to
the use of Deflate restricted to Huffman-only coding
in terms of simplicity and compression benefits.
</t>
</section>
<section title="Max length and entry numbers">
<t>
The integer representation allows representation of unbounded
values.
If the number of table entries or the string lengths are bounded, the integer
encoding may be further optimized.
</t>
<t><list style="symbols">
<t>Decide whether to limit the length of strings to a max value and
if so which value, 32768?</t>
<t>Decide whether to limit name table entries to 256?</t>
<t>Decide whether to limit header table entries to 16384?</t>
</list></t>
</section>
</section>
<section anchor="Security" title="Security Considerations">
<t>This section should be completed according to the previous sections.</t>
</section>
<section anchor="IANA" title="IANA Considerations">
<t>This memo includes no request to IANA.</t>
</section>
</middle>
<back>
<!--references title="Informative References">
</references-->
<section title="Initial header names" anchor="initial.headers">
<section title="Requests">
<t>Indexes strictly lower than 14 are always encoded on 1 byte.
Hence, the 14 most frequent names should be placed in the first 14 positions.
This table may be updated based on statistical analysis
of header name frequency and specific HTTP/2.0 header rules (such as removal of 'proxy-connection', url being split or not...).</t>
<figure>
<artwork>
+---------+------------------------------------+
| Index | Header Name |
+---------+------------------------------------+
| 0 | accept |
+---------+------------------------------------+
| 1 | accept-charset |
+---------+------------------------------------+
| 2 | accept-encoding |
+---------+------------------------------------+
| 3 | accept-language |
+---------+------------------------------------+
| 4 | cookie |
+---------+------------------------------------+
| 5 | method |
+---------+------------------------------------+
| 6 | host |
+---------+------------------------------------+
| 7 | if-modified-since |
+---------+------------------------------------+
| 8 | keep-alive |
+---------+------------------------------------+
| 9 | url |
+---------+------------------------------------+
| 10 | user-agent |
+---------+------------------------------------+
| 11 | version |
+---------+------------------------------------+
| 12 | proxy-connection |
+---------+------------------------------------+
| 13 | referer |
+---------+------------------------------------+
| 14 | accept-datetime |
+---------+------------------------------------+
| 15 | authorization |
+---------+------------------------------------+
| 16 | allow |
+---------+------------------------------------+
| 17 | cache-control |
+---------+------------------------------------+
| 18 | connection |
+---------+------------------------------------+
| 19 | content-length |
+---------+------------------------------------+
| 20 | content-md5 |
+---------+------------------------------------+
| 21 | content-type |
+---------+------------------------------------+
| 22 | date |
+---------+------------------------------------+
| 23 | expect |
+---------+------------------------------------+
| 24 | from |
+---------+------------------------------------+
| 25 | if-match |
+---------+------------------------------------+
| 26 | if-none-match |
+---------+------------------------------------+
| 27 | if-range |
+---------+------------------------------------+
| 28 | if-unmodified-since |
+---------+------------------------------------+
| 29 | max-forwards |
+---------+------------------------------------+
| 30 | pragma |
+---------+------------------------------------+
| 31 | proxy-authorization |
+---------+------------------------------------+
| 32 | range |
+---------+------------------------------------+
| 33 | te |
+---------+------------------------------------+
| 34 | upgrade |
+---------+------------------------------------+
| 35 | via |
+---------+------------------------------------+
| 36 | warning |
+---------+------------------------------------+
</artwork>
</figure>
</section>
<section title="Responses">
<t>
Indexes strictly lower than 14 are always encoded on 1 byte.
Hence, the 14 most frequent names should be placed in the first 14 positions.
This table may be updated based on statistical analysis
of header name frequency and specific HTTP/2.0 header rules.
</t>
<figure>
<artwork>
+---------+------------------------------------+
| Index   | Header Name                        |
+---------+------------------------------------+
| 0       | age                                |
+---------+------------------------------------+
| 1       | cache-control                      |
+---------+------------------------------------+
| 2       | content-length                     |
+---------+------------------------------------+
| 3       | content-type                       |
+---------+------------------------------------+
| 4       | date                               |
+---------+------------------------------------+
| 5       | etag                               |
+---------+------------------------------------+
| 6       | expires                            |
+---------+------------------------------------+
| 7       | last-modified                      |
+---------+------------------------------------+
| 8       | server                             |
+---------+------------------------------------+
| 9       | set-cookie                         |
+---------+------------------------------------+
| 10      | status                             |
+---------+------------------------------------+
| 11      | vary                               |
+---------+------------------------------------+
| 12      | version                            |
+---------+------------------------------------+
| 13      | via                                |
+---------+------------------------------------+
| 14      | access-control-allow-origin        |
+---------+------------------------------------+
| 15      | accept-ranges                      |
+---------+------------------------------------+
| 16      | allow                              |
+---------+------------------------------------+
| 17      | connection                         |
+---------+------------------------------------+
| 18      | content-disposition                |
+---------+------------------------------------+
| 19      | content-encoding                   |
+---------+------------------------------------+
| 20      | content-language                   |
+---------+------------------------------------+
| 21      | content-location                   |
+---------+------------------------------------+
| 22      | content-md5                        |
+---------+------------------------------------+
| 23      | content-range                      |
+---------+------------------------------------+
| 24      | link                               |
+---------+------------------------------------+
| 25      | location                           |
+---------+------------------------------------+
| 26      | p3p                                |
+---------+------------------------------------+
| 27      | pragma                             |
+---------+------------------------------------+
| 28      | proxy-authenticate                 |
+---------+------------------------------------+
| 29      | refresh                            |
+---------+------------------------------------+
| 30      | retry-after                        |
+---------+------------------------------------+
| 31      | strict-transport-security          |
+---------+------------------------------------+
| 32      | trailer                            |
+---------+------------------------------------+
| 33      | transfer-encoding                  |
+---------+------------------------------------+
| 34      | warning                            |
+---------+------------------------------------+
| 35      | www-authenticate                   |
+---------+------------------------------------+
</artwork>
</figure>
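<t>
As a non-normative illustration, the following Python sketch shows why the
one-byte boundary falls at index 14. The byte layout it uses is an assumption
inferred from the byte values shown in the Example section (0x2A for name
index 9, 0x2F 0x17 for name index 37), and the function name
encode_name_reference is hypothetical.
</t>
<figure>
<artwork><![CDATA[
```python
def encode_name_reference(index):
    """Bytes that reference a name-table index in a literal header
    with indexing.

    Assumed layout (inferred from the example bytes, not normative):
    the low 4 bits of the first byte carry index + 1, with 0 meaning
    "new name"; the value 15 is an escape and is followed by one
    extra byte holding index + 1 - 15.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    value = index + 1
    if value < 15:
        # Indexes 0..13 fit in the 4-bit field: one byte in total.
        return bytes([0x20 | value])
    extra = value - 15
    if extra > 127:
        # How larger values continue is outside this sketch.
        raise ValueError("index too large for this sketch")
    return bytes([0x2F, extra])


# Encoded size for a few indexes around the boundary.
sizes = {i: len(encode_name_reference(i)) for i in (0, 13, 14, 37)}
```
]]></artwork>
</figure>
<t>
Under these assumptions, a name placed in positions 0 to 13 costs one byte
per reference, while any later position costs at least two.
</t>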
</section>
<section title="Example" anchor="example">
<t>Here is an example that illustrates different representations and how tables are updated.</t>
<section title="First header set">
<t>
The first header set to represent is the following:
<figure><artwork>
url: http://www.example.org/my-example/index.html
user-agent: my-user-agent
x-my-header: first
</artwork></figure>
The header table is empty, so all headers are represented as literal headers with indexing.
The 'x-my-header' header name is not in the header name table and is encoded literally.
This gives the following representation:
<figure><artwork>
0x2A (literal header with indexing, name index = 9)
0x2C (header value string length = 44)
http://www.example.org/my-example/index.html
0x2B (literal header with indexing, name index = 10)
0x0D (header value string length = 13)
my-user-agent
0x20 (literal header with indexing, new name)
0x0B (header name string length = 11)
x-my-header
0x05 (header value string length = 5)
first
</artwork></figure>
After these headers are processed, the tables are as follows:
<figure><artwork>
Name table
+---------+---------------------------------------------+
| Index   | Header Name                                 |
+---------+---------------------------------------------+
| 0       | accept                                      |
+---------+---------------------------------------------+
| 1       | accept-charset                              |
+---------+---------------------------------------------+
| ...     | ...                                         |
+---------+---------------------------------------------+
| 36      | warning                                     |
+---------+---------------------------------------------+
| 37      | x-my-header                                 | added name
+---------+---------------------------------------------+
</artwork></figure>
<figure><artwork>
Header table
+----+-------------+------------------------------------+
| 0  | url         | http://www.example.org/            | added pair
|    |             | my-example/index.html              |
+----+-------------+------------------------------------+
| 1  | user-agent  | my-user-agent                      | added pair
+----+-------------+------------------------------------+
| 2  | x-my-header | first                              | added pair
+----+-------------+------------------------------------+
</artwork></figure>
</t>
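<t>
The byte sequence above can be reproduced with a short Python sketch. It is
non-normative and rests on assumptions inferred from this example only: the
first byte is 0x20 plus (name index + 1), or 0x20 alone for a new name, and
each string is preceded by a single length byte. The function name
literal_with_indexing is hypothetical, and only name indexes below 14 and
short strings are handled.
</t>
<figure>
<artwork><![CDATA[
```python
def literal_with_indexing(value, name_index=None, new_name=None):
    """Serialize one literal header with indexing (sketch).

    Assumptions, not normative: one-byte name reference for
    indexes below 14, and a single length byte before each string.
    """
    out = bytearray()
    if new_name is not None:
        # New name: marker byte, then the name as a literal string.
        name_bytes = new_name.encode("ascii")
        if len(name_bytes) > 127:
            raise ValueError("name too long for this sketch")
        out.append(0x20)
        out.append(len(name_bytes))
        out += name_bytes
    else:
        if name_index is None or not 0 <= name_index < 14:
            raise ValueError("sketch only covers indexes 0..13")
        out.append(0x20 | (name_index + 1))
    value_bytes = value.encode("ascii")
    if len(value_bytes) > 127:
        raise ValueError("value too long for this sketch")
    out.append(len(value_bytes))
    out += value_bytes
    return bytes(out)


# The three headers of the first header set; the name indexes 9 and
# 10 are the ones given in the example.
first_set = (
    literal_with_indexing(
        "http://www.example.org/my-example/index.html", name_index=9)
    + literal_with_indexing("my-user-agent", name_index=10)
    + literal_with_indexing("first", new_name="x-my-header")
)
```
]]></artwork>
</figure>
<t>
Each length byte is computed from the string it precedes, so the lengths in
the listing can be checked directly against the literal strings.
</t>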
</section>
<section title="Second header set">
<t>
The second header set to represent is the following:
<figure><artwork>
url: http://www.example.org/my-example/resources/script.js
user-agent: my-user-agent
x-my-header: second
</artwork></figure>
The url header is represented as a delta header with substitution.
The user-agent header is represented as an indexed header.
The x-my-header header is represented as a literal header with indexing.
<figure><artwork>
0x70 (delta header with substitution, header index = 0)
0x22 (common prefix length = 34)
0x13 (suffix value length = 19)
resources/script.js
0x81 (indexed header, index = 1)
0x2f 0x17 (literal header with indexing, name index = 37)
0x06 (header value string length = 6)
second
</artwork></figure>
The name table remains unchanged. The header table is updated as follows:
<figure><artwork>
+----+-------------+------------------------------------+
| 0  | url         | http://www.example.org/            | substituted
|    |             | my-example/resources/script.js     | pair
+----+-------------+------------------------------------+
| 1  | user-agent  | my-user-agent                      |
+----+-------------+------------------------------------+
| 2  | x-my-header | first                              |
+----+-------------+------------------------------------+
| 3  | x-my-header | second                             | added pair
+----+-------------+------------------------------------+
</artwork></figure>
</t>
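<t>
The delta computation and the resulting table update can be sketched in
Python as follows. This is a non-normative illustration: the helper names
delta and apply_delta are hypothetical, and the header table is modelled as
a plain list of (name, value) pairs.
</t>
<figure>
<artwork><![CDATA[
```python
def delta(reference, value):
    """Return (common prefix length, suffix) of value relative to
    the reference value."""
    n = 0
    limit = min(len(reference), len(value))
    while n < limit and reference[n] == value[n]:
        n += 1
    return n, value[n:]


def apply_delta(reference, prefix_length, suffix):
    """Rebuild the value on the decoding side."""
    return reference[:prefix_length] + suffix


# Header table as it stands after the first header set.
header_table = [
    ("url", "http://www.example.org/my-example/index.html"),
    ("user-agent", "my-user-agent"),
    ("x-my-header", "first"),
]

old_url = header_table[0][1]
new_url = "http://www.example.org/my-example/resources/script.js"
prefix_length, suffix = delta(old_url, new_url)

# Delta header with substitution: entry 0 is replaced in place.
rebuilt = apply_delta(old_url, prefix_length, suffix)
header_table[0] = ("url", rebuilt)

# Indexed header (index 1): the table is left unchanged.

# Literal header with indexing: a new pair is appended.
header_table.append(("x-my-header", "second"))
```
]]></artwork>
</figure>
<t>
Substitution keeps the number of entries constant for the url header, whereas
indexing the second x-my-header value grows the table by one entry.
</t>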
</section>
</section>
</section>
</back>
</rfc>