One document matched: draft-rpeon-httpbis-header-compression-00.txt
Network Working Group R. Peon
Internet-Draft Google, Inc
Intended status: Informational Oct 12, 2012
Expires: April 15, 2013
Header Delta-Compression for HTTP/2.0
draft-rpeon-httpbis-header-compression-00
Abstract
This document describes a mechanism for compressing streams of groups
of key-value pairs, often known as Headers in an HTTP session. See
RFC 2616 [RFC2616] or successors for more information about headers.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 15, 2013.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Peon Expires April 15, 2013 [Page 1]
Internet-Draft HTTP/2 Header Compression Oct 2012
Table of Contents
1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Example-data-model . . . . . . . . . . . . . . . . . . . . . . 3
4. Compression-overview . . . . . . . . . . . . . . . . . . . . . 4
5. Decompression-overview . . . . . . . . . . . . . . . . . . . . 6
6. On-the-wire-overview . . . . . . . . . . . . . . . . . . . . . 7
7. Operations . . . . . . . . . . . . . . . . . . . . . . . . . . 9
8. Security Considerations . . . . . . . . . . . . . . . . . . . 10
9. Requirements Notation . . . . . . . . . . . . . . . . . . . . 10
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10
11. Normative References . . . . . . . . . . . . . . . . . . . . . 10
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 10
Peon Expires April 15, 2013 [Page 2]
Internet-Draft HTTP/2 Header Compression Oct 2012
1. Overview
There have been several problems pointed out with the use of the gzip
compressor in SPDY [SPDY]. The biggest of these problems is that it
is possible for a smart attacker to inject content into the
compressor, and then to test hypotheses about the prior contents of
the compressor by examining the output size after each such content
injection. The other issue is that gzip often consumes more CPU than
many would like, especially in situations where one is doing forward
or reverse proxying. The compressor proposed here intends to solve
the first issue and significantly mitigate the second, while
providing compression that is not too much worse than gzip.
2. Definitions
user-agent: The program or device which a human interacts with
directly and which typically initiates the transport layer
connection or session
client: Synonym for user-agent
server: The computer or device which typically accepts a
connection, stores, and serves data
proxy: An entity acting as a server for the client, and a
client for the server
header: An complete set of key-value pairs, either request-
headers, or response headers as defined in RFC2616 [RFC2616]
section 5.3 or 6.2, respectively
3. Example-data-model
The author's current experimental implementation uses something very
close to the following data-model for doing compression.
Peon Expires April 15, 2013 [Page 3]
Internet-Draft HTTP/2 Header Compression Oct 2012
struct ValEntry {
KeyMapIterator krt;
ValMapIterator vrt;
LRUIterator lrt;
Index val_idx;
};
typedef map<string, ValEntry*> ValMap;
struct KeyEntry {
int refcnt;
Index key_idx;
ValMap valmap;
}
typedef map<string, KeyEntry> KeyMap;
typedef set<Index> HeaderGroup;
typedef map<Index, HeaderGroup> HeaderGroups;
struct CompressorState {
HeaderGroups header_groups;
KeyMap key_map;
list<ValEntry*> lru;
};
typedef map<KeyIdx, string>; KeyIdxMap;
typedef map<ValIdx, pair<KeyIdxMap::iterator,string>; ValIdxMap;
struct DecompressorState {
KeyIdxMap key_idx_map;
ValIdxMap val_idx_map;
list<KeyIdxMap::iterator> lru;
};
4. Compression-overview
Note that this only describes conceptually how the algorithm works
and is very much pseudo-code
User-agents and servers SHOULD NOT NOT use EREF operations.
User-agents and servers SHOULD use huffman-coding when so doing
reduces the size of the bytes on the wire
Proxies MAY use EREF operations when throughput is more important
than client experienced latency.
Peon Expires April 15, 2013 [Page 4]
Internet-Draft HTTP/2 Header Compression Oct 2012
Proxies MAY choose to encode all strings as literal strings and skip
huffman coding.
Compress(headers):
first_used_val_idx = CurrentValIdx()
old_keymap = keymap
For header in headers:
if key in keymap:
keymap[key].refcnt += 1
for header in headers:
key_map_iterator = lookup(header.key)
if (!key_map_iterator):
key_iterator = insert_new_key(header.key)
insert_new_value_from_key(key_iterator, header.val)
instructions.append(KVSto(key,val))
else if (val_iterator = lookup_val_in_keymap_entry(key_map_iterator,
header.val)):
// we've found that the key,value pair already exists in the compressor
// we now must check to see if it is already visible, in which case
// there is nothing to do. If it isn't visible, however, we'll toggle
// the visibility on.
if (!header_group.includes(val_iterator_>val_idx)):
instructions.append(Toggle(val_iterator_>val_idx));
else:
// we know we found a key, but not the entire key_value.
// thus, we emit a clone() instructions.
instructions.append(Clone(key_iterator_>key_id, header.val)
if key in old_keymap:
keymap[key].refcnt += 1
for header_group_entry in header_group:
if header_group_entry not in header:
instructions.append(header_group_entry_>idx())
for instruction in instructions:
ExecuteInstruction(instructions)
SerializeInstructions(output_buffer, instructions, first_used_val_idx)
ExecuteInstruction(instruction):
while (instruction.size_delta() + state_size > max_state_size):
RemoveOldestLRUItem()
if instruction.opcode == "clone":
InsertKV(instruction.lookup_key(), instruction.val(), GetNextLRUIdx())
elif instruction.opcode == "kvsto":
InsertKV(instruction.key(), instruction.val(), GetNextLRUIdx())
Peon Expires April 15, 2013 [Page 5]
Internet-Draft HTTP/2 Header Compression Oct 2012
elif instruction.opcode == "togglerange":
for (i=instruction.start; i <= instruction.last; ++i)
bool prev_state = ToggleIdx(i)
if prev_state == NOT_VISIBLE:
MoveToFrontOfLRU(i)
elif instruction.opcode == "toggle":
bool prev_state = ToggleIdx(instruction.index)
if prev_state == NOT_VISIBLE:
MoveToFrontOfLRU(instruction.index)
SerializeInstructions(output_buffer, instructions, first_val_idx):
// Note that huffman-encoding is optional-- a bit (somwhere) would
// indicate that it was or was not used for any particular string.
output_buffer.WriteValIdx(first_val_idx)
for toggle in filter(TOGGLE, instructions):
output_buffer.WriteBits(TOGGLE)
output_buffer.WriteValIdx(toggle.val_idx)
for toggle in filter(TOGGLE_RANGE, instructions):
output_buffer.WriteBits(TOGGLE_RANGE)
output_buffer.WriteValIdx(toggle.start)
output_buffer.WriteValIdx(toggle.last)
for clone in filter(CLONE, instructions):
output_buffer.WriteBits(CLONE)
output_buffer.WriteKeyIdx(clone.key_idx)
output_buffer.WriteString(huffman.encode(clone.val))
for kvsto in filter(KVSTO, instructions):
output_buffer.WriteString(huffman.encode(kvsto.key))
output_buffer.WriteString(huffman.encode(kvsto.val))
for eref in filter(EREF, instructions):
output_buffer.WriteString(huffman.encode(kvsto.key))
output_buffer.WriteString(huffman.encode(kvsto.val))
5. Decompression-overview
Note that this only describes conceptually how the algorithm works
and is very much pseudo-code
Peon Expires April 15, 2013 [Page 6]
Internet-Draft HTTP/2 Header Compression Oct 2012
Decode():
header_group = headers.header_group // as received from sender
val_idx = headers_frame.first_val_idx // as received from sender.
header_groups[header_group].ephemereal_headers.clear()
for instruction in instructions:
ExecuteInstruction(instruction, header_group)
ExecuteInstruction(instruction, header_group):
while (instruction.size_delta() + state_size > max_state_size):
RemoveOldestLRUItem()
if instruction.opcode == TOGGLE:
if instruction.idx in header_groups[header_group]:
header_groups[header_group].remove(instruction.idx)
else:
header_groups[header_group].insert(instruction.idx)
MoveToHeadOfLRU(instruction.idx)
elif instruction.opcode == TOGGLE_GROUP:
for i=instruction.begin; i <= instruction.last; ++i:
ExecuteInstruction(Toggle(i), header_group);
elif instruction.opcode == CLONE:
KeyIdxMap::iterator kimi = key_idx_map.lookup(instruction.key_idx)
val_idx_map.insert(++val_idx, pair(kimi, instruction.val))
elif instruction.opcode == KVSTO:
key_idx = GetNextKeyIdx()
KeyIdxMap::iterator kimi = key_idx_map.insert(key_idx,
instruction.key)
val_idx_map.insert(++val_idx, pair(kimi, instruction.val))
elif instruction.opcode == EREF:
header_groups[header_group].ephemereal_headers.append(
pair(instruction.key, instruction.val))
// This would return the complete set of headers, e.g. for an HTTP
// request or response
InflateHeaders(header_group):
Headers headers
for idx in header_groups[header_group]:
headers.append(ValIdxMap[idx].first->first, ValIdxMap[idx].second)
headers.append(header_groups[header_group].ephemereal_headers);
return headers
6. On-the-wire-overview
This would replace the current HEADERS frame described in the SPDY/2
draft specification.
Peon Expires April 15, 2013 [Page 7]
Internet-Draft HTTP/2 Header Compression Oct 2012
The header-group-id MUST be emitted immediately after the HEADERS
frame boilerplate.
A field indicating the start of the ID-sequence space MUST be
immediately after the header-group-id
A sequence of 'Toggle' or 'ToggleRange' operations MUST follow, if
any of these operations are to be emitted at all
A sequence of 'Clone' or 'KVSto' operations MUST then follow, if any
of these operations are to be emitted at all
A sequence of 'ERef' operations MUST then follow, if any of these are
to be emitted.
If the size of the aggregate set of instructions is larger than the
frame-length, the 'finished' bit of the HEADERS frame boilerplate
MUST be set to 0. This will indicate that the next subsequent
HEADERS frame on that stream is a continuation of a previous HEADERS
frame
If all of the 'Toggle', 'ToggleRange', 'Clone', and 'KVSTo'
operations have been emitted, the 'finished' bit of the HEADERS frame
which encapsulates the last instruction MUST be set to 1.
The rough ABNF for this would be:
HEADERS_FRAME: HEADER_FRAME_BOILERPLATE GROUP-ID FIRST-VAL-IDX INSTRUCTIONS
HEADER_FRAME_BOILERPLATE: LENGTH STREAM_ID HEADERS_OPCODE HEADER_DONE_FLAG
HEADER_DONE_FLAG: (TRUE | FALSE)
INSTRUCTIONS : *(TOGGLE | TOGGLERANGE) *( CLONE | KVSTO) *(EREF)
TOGGLE: TOGGLE_OPCODE VAL_IDX
TOGGLERANGE: TOGGLERANGE_OPCODE VAL_IDX_START VAL_IDX_END
CLONE: CLONE_OPCODE KEY_IDX VAL_STRING
KVSTO: KVSTO_OPCODE KEY_STRING VAL_STRING
EREF: EREF_OPCODE KEY_STRING VAL_STRING
KEY_STRING: MAYBE_COMPRESSED_STRING
VAL_STRING: MAYBE_COMPRESSED_STRING
MAYBE_COMPRESSED_STRING: TRUE HUFFMAN_COMPRESSED_STRING |
FALSE STRING_LITERAL
VAL_IDX_START: int
VAL_IDX_END: int
VAL_IDX: int
A complete set of headers has only been expressed when the HEADERS
frame has the HEADERS_DONE bit set to true. If the headers being
Peon Expires April 15, 2013 [Page 8]
Internet-Draft HTTP/2 Header Compression Oct 2012
serialized do not fit into one HEADERS frame, the HEADERS_DONE flag
for all but the last HEADERS frame MUST be set to false
There are alternate serializations which have various interesting
properties. Here are some interesting variations among these
alternatives:
Serialize all constant-length information before serializing
any string data. This allows for a clear separation of byte-
aligned vs. potentially non-byte-aligned data
Have explicit sections for each opcode type, with a sentinel
denoting the end of each range of opcode. Since the number of
opcode types is small, this could reduce the overhead consumed
in repeating the opcode.
As above, but have a count for the number of each opcode
instead of using a sentinel
Use huffman coding on the entire payload of the HEADERS frame,
not just the strings. TOGGLES are far more common than other
operations, and using fewer bits for these could be beneficial
7. Operations
Toggle: This encodes a single ValIdx, which, if already present in
the HeaderGroup associated with the current HEADERS frame, will be
removed, else added. If added, then the LRU entry which the index
references MUST be moved to the head of the LRU.
ToggleRange: This incodes two ValIdx, denoting an inclusive range of
ValIdx which shall be toggled as per the Toggle operation, above.
Clone: This encodes a KeyIdx which references a pre-existing key, and
references a new string. The Clone operation inserts a new ValEntry
into the ValMap associated with the KeyMap entry identified by the
KeyIdx. The ValIdx for this new ValEntry is the ++current_val_idx;
The new ValEntry MUST be inserted at the head of the LRU.
KVSto: This encodes a new key string and new value string. The key
will be inserted into the KeyMap, and assigned a new KeyIdx. The
value string will be inserted into the Valmap associated with the
KeyMap entry that was just added. As with Clone, a new unique ValIdx
will be associated with it in a monotonically increasing sequence by
using the value of ++current_val_idx. The new ValEntry MUST be
inserted at the head of the LRU.
Peon Expires April 15, 2013 [Page 9]
Internet-Draft HTTP/2 Header Compression Oct 2012
Eref: This encodes a key and value literal which, while interpreted
as part of the header frame, doesn't cause any persistent change in
the compressor state. ERef stands for "ephemereal reference".
8. Security Considerations
The compressor algorithm described here is expected to be immune to
the current attacks against encrypted stream-based compressors such
as TLS+gzip, but more scrutiny is warranted. The reason that it is
believed that the algorithm(s) expressed here is immune is that any
backreference to a header key or value always requires a whole-text
match, and thus any probe of the compression context confirms no
hypothesis unless the attacker has guessed the entire plaintext.
9. Requirements Notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
10. Acknowledgements
11. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
[SPDY] Belshe, M. and R. Peon, "SPDY PROTOCOL",
<http://tools.ietf.org/html/draft-mbelshe-httpbis-spdy>.
Author's Address
Roberto Peon
Google, Inc
Email: fenix@google.com
Peon Expires April 15, 2013 [Page 10]
| PAFTECH AB 2003-2026 | 2026-04-24 07:04:17 |