One document matched: draft-snell-httpbis-bohe-00.xml


<?xml version="1.0"?> 
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [ 
  <!ENTITY rfc2119 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
]>
<?rfc toc="yes"?> 
<?rfc strict="yes"?> 
<?rfc symrefs="yes" ?> 
<?rfc sortrefs="yes"?> 
<?rfc compact="yes"?> 
<rfc category="info" ipr="trust200811" docName="draft-snell-httpbis-bohe-00"> 
  <front> 
    <title abbrev="application/merge-patch"> 
      HTTP/2.0 Discussion: Binary Optimized Header Encoding
    </title> 
 
    <author initials="J.M." surname="Snell" fullname="James M Snell"> 
      <address> 
        <email>jasnell@gmail.com</email> 
      </address> 
    </author> 
    
    <date month="August" year="2012" /> 
 
    <keyword>I-D</keyword> 
    <keyword>http</keyword>
    <keyword>spdy</keyword>
 
    <abstract> 
      <t>This memo describes a proposed alternative encoding for
      headers within SPDY SYN_STREAM, SYN_REPLY and HEADERS 
      frames.</t> 
    </abstract> 
 
  </front> 
  
  <middle> 

    <section title="Binary Optimized Header Encoding">
    
      <t>Binary Optimized Header Encoding is a proposed
      alternative serialization for headers within SPDY SYN_STREAM, 
      SYN_REPLY and HEADERS frames that is designed to optimize 
      generation, consumption and processing of the most commonly 
      used HTTP headers.</t>
      
<figure><preamble>Alternate Header Block Serialization:</preamble><artwork>
   +------------------------------------+
   |      Number of Headers (8bit)      |
   +------------------------------------+
   |T|            Header                |
   +------------------------------------+
   | ...                                |
</artwork></figure>

      <t>Within the existing SPDY Header Block, a 32-bit value is 
      used to identify the number of headers within the block. 
      For all practical purposes, it is exceedingly unlikely that a 
      single block of headers will contain anywhere near 
      4,294,967,295 distinct headers. Obviously a 32-bit integer
      is significant overkill for this purpose. As an alternative,
      an 8-bit value suggested.</t>
      
      <t>The header block consists of zero or more distinct headers,
      each of which begin with a single Type-bit whose value indicates
      the type of header. There are two header types: Registered and
      Extension. The specific structure of the remaining header 
      depends on the header type.</t>
      
      <t>The header block MAY be compressed as described within
      [draft-montenegro-httpbis-speed-mobility-02].</t>
        
      <section title="Registered Headers">
        
        <t>Registered Headers are well-known and well-defined 
        headers for which there is a published RFC and IANA 
        registration. Each is assigned an unsigned 12-bit integer
        identifier and an unsigned 3-bit integer codepage. If the codepage 
        is 0, the implication is that the header MUST be understood 
        in order for the request or response message to be handled 
        properly. Codepages 1-5 represent "MUST-IGNORE" headers; that 
        is, such headers MUST be ignored by processors if they are 
        unrecognized by the processing application. Codepages 6 and 7 
        are reserved for "Private Use", with Codepage 6 being used 
        for "MUST UNDERSTAND PRIVATE USE" headers.</t>
        
<figure><preamble>The structure of Registered Headers:</preamble><artwork>
  +------------------------------+
  |0| cp(3-bit) |   id (12-bit)  |
  +------------------------------+
  | flags(8-bit) |  len (16-bit) |
  +------------------------------+
  |          value...            |
  +------------------------------+
</artwork></figure>

          <t>The first single bit within the structure is the Type-bit. 
          When this bit is off, the header is a Registered Header.</t>
          
          <t>The next three bits identify the headers codepage. The value is 
          interpreted as an unsigned integer in the range 0-7.</t>
          
          <t>The next twelve bits specify the header's specific numeric 
          identifier within the codepage.</t>
          
          <t>Following the identifier are 8 reserved flag bits.
            <list style="symbols">
              <t>Bit 0x1 indicates that the header value contains 
              UTF-8 encoded character content. If the bit is not 
              set, the value is assumed to contain non-character-based
              binary data.</t>
              <t>Bit 0x2 indicates that the header specifies multiple
              NUL (0) separated values. When set, processors MUST treat
              NUL (0) octets within the value as a delimiter and 
              not as part of the value itself.</t>
            </list>
          </t>
          
          <t>The remaining content of the structure consists of a 16-bit 
          unsigned integer specifying the remaining length of the header 
          value. The value MAY be zero length.</t>
          
          <t>The minimum length of a registered header is 5-octets (40-bits).</t>
          
          <t>When Flag 0x2 is set, the header may contain multiple 
          values separated by a single NUL (0) byte. Each distinct value 
          MUST NOT be zero-length. When Flag 0x2 is not set, and 
          Flag 0x1 is also not set, NUL bytes contained within the value are 
          to be considered part of the value. The use of NUL bytes within 
          character-based values is not permitted except when used as a 
          delimiter separating multiple values.</t>
          
          <t>When multiple values are included, the value length field MUST
          specify the total length, in octets, of all values plus the 
          number of NUL (0) byte separators. For example, for a header 
          value consisting of the two strings "foo" and "bar", the total
          value length would be 7.</t>
          
        </section>
      
        <section title="Extension Headers">
          
          <t>Extension Headers are simple name+value pairs essentially
          as they exist today, but with a number of important modifications.</t>
          
<figure><preamble>The structure of Extension Headers</preamble><artwork>
  +------------------------------+
  |1| flags(7-bit) | namelen (8) |
  +------------------------------+
  | name | val len (16) | value  |
  +------------------------------+
</artwork></figure>

          <t>The first single bit is the Type-bit. When this bit is on,
          the header is an Extension Header.</t>
          
          <t>The next seven bits are reserved flags.
            <list style="symbols">
              <t>Bit 0x1 indicates that the header value contains 
              UTF-8 encoded character content. If the bit is not 
              set, the value is assumed to contain non-character-based
              binary data.</t>
              <t>Bit 0x2 indicates that the header specifies multiple
              NUL (0) separated values. When set, processors MUST treat
              NUL (0) octets within the value as a value-separated and 
              not as part of the value itself.</t>
            </list>
          </t>
          
          <t>The next 8-bits specify the length in octets of the ASCII-encoded 
          header name as unsigned integer, followed by the name itself. The 
          name MUST conform to the field-name construction as defined in 
          [draft-ietf-httpbis-p1-messaging-2].</t>
          
          <t>The length of the remaining value is specified as an unsigned 16-bit integer,
          followed by the value itself. Zero length values are permitted.</t>
          
          <t>When Flag 0x2 is set, the header may contain multiple 
          values separated by a single NUL (0) byte. Each distinct value 
          MUST be zero-length.</t>
          
          <t>When multiple values are included, the value length field MUST
          specify the total length, in octets, of all values plus the 
          number of NUL (0) byte separators. For example, for a header 
          value consisting of the two strings "foo" and "bar", the total
          value length would be 7.</t>
          
        </section>
              
      <section title="Binary vs. Character Values">
        
        <t>Specific header values can be encoded as either a stream of 
        binary octets or as UTF-8 encoded character data.</t>
        
        <t>For example, within the existing SPDY specification, the 
        HTTP Version is represented as a header using the field-name
        ":version" with the version number represented as an ASCII string,
        consuming 19-bytes in all.</t>
        
<figure><preamble>Version Header using the existing SPDY encoding:</preamble><artwork>
  00 00 00 08 3a 76 65 72  |....:ver|
  73 69 6f 6e 00 00 00 03  |sion....|
  31 2e 31                 |2.0|
</artwork></figure>

        <t>Using the Binary Optimized Header Encoding, this can be reduced
        to a compact 7 or 8 bytes using either binary or character data:</t>
        
<figure><preamble>Version Header using Character Data:</preamble><artwork>
  00 01 01 00 03 31 2e 31  |.....2.0|
</artwork></figure>

<figure><preamble>Version Header using Binary Data:</preamble><artwork>
  00 01 00 00 02 02 00     |.......|
</artwork></figure>

        <t>Likewise, SPDY uses a ":method" header to specify the HTTP Method
        used for a particular request, with the value represented as an 
        ASCII string, consuming 18 bytes for GET requests.</t>

<figure><preamble>Method Header using the existing SPDY encoding:</preamble><artwork>
  00 00 00 07 3a 6d 65 74  |....:met|
  68 6f 64 00 00 00 03 47  |hod....G|
  45 54                    |GET|
</artwork></figure>

        <t>Using optimized encoding, this can be reduced to a compact
        6 or 8 bytes using either binary or character data:</t>
        
<figure><preamble>Method Header using Character Data:</preamble><artwork>
  00 02 01 00 03 47 45 54  |.....GET|
</artwork></figure>

<figure><preamble>Method Header using Binary Data, assuming the value 0x1 is
defined to represent the GET method:</preamble><artwork>
  00 02 00 00 01 01       |......|
</artwork></figure>

        <t>There are many headers used within HTTP applications for which 
        binary encodings would be difficult or unnecessary. For those, 
        utilizing the character encoding option would be appropriate.
        With some work it should be possible to define optimized
        binary encodings for many of the existing complex headers.</t>
        
      </section>
    
      <section title="Example Headers">
        
        <t>Assume the following registered headers:</t>
        
        <texttable>
          <ttcol>HTTP Header</ttcol>
          <ttcol>Codepage</ttcol>
          <ttcol>ID</ttcol>
          <c>Version</c>
          <c>0</c>
          <c>1</c>
          <c>Method</c>
          <c>0</c>
          <c>2</c>
          <c>Host</c>
          <c>0</c>
          <c>3</c>
          <c>Path (Request URI)</c>
          <c>0</c>
          <c>4</c>
          <c>Accept-Language</c>
          <c>1</c>
          <c>1</c>
        </texttable>
        
        <t>And the following values representing known HTTP Methods:</t>
        
        <texttable>
          <ttcol>Method</ttcol>
          <ttcol>Value</ttcol>
          <c>GET</c>
          <c>1</c>
          <c>POST</c>
          <c>2</c>
          <c>PUT</c>
          <c>3</c>
          <c>DELETE</c>
          <c>4</c>
          <c>PATCH</c>
          <c>5</c>
          <c>HEAD</c>
          <c>6</c>
          <c>OPTIONS</c>
          <c>7</c>
          <c>CONNECT</c>
          <c>8</c>
        </texttable>

<figure><preamble>The Version header can be encoded as (7-bytes):</preamble><artwork>
  00 01 00 00 02 02 00    |.......|
</artwork></figure>

<figure><preamble>The GET Method header can be encoded as (6-bytes):</preamble><artwork>
  00 02 00 00 01 01       |......|
</artwork></figure>

<figure><preamble>The Host Header can be encoded as (20-bytes):</preamble><artwork>
  00 03 01 00 0f 77 77 77 |.....www|
  2e 65 78 61 6d 70 6c 65 |.example|
  2e 6f 72 67             |.org|
</artwork></figure>

<figure><preamble>A simple Accept-Lang header would be encoded as (10-bytes):</preamble><artwork>
  10 01 01 00 05 65 6e 2d |.....en-|
  55 53                   |US|
</artwork></figure>
        
<figure><preamble>A Path header encoding the request URI (45-bytes):</preamble><artwork>
  00 04 01 00 28 2f 74 68  |...../th|
  69 73 2f 69 73 2f 74 68  |is/is/th|
  65 2f 72 65 71 75 65 73  |e/reques|
  74 3f 69 73 3d 69 74 26  |t?is=it&|
  6e 6f 74 3d 62 65 61 75  |not=beau|
  74 69 66 75 6c           |tiful|
</artwork></figure>
        
        <t>The combined serialization of the five headers into a single block
        requires a total of 89 bytes. By comparison, the equivalent serialization
        using the existing SPDY encoding requires 150 bytes sans compression
        (28 bytes of which are wasted by the unnecessary use of int32).</t>
        
<figure><preamble>The equivalent SPDY encoding:</preamble><artwork>
  00 00 00 05 00 00 00 08  |........|
  3a 76 65 72 73 69 6f 6e  |:version|
  00 00 00 03 31 2e 31 00  |....1.1.|
  00 00 07 3a 6d 65 74 68  |...:meth|
  6f 64 00 00 00 03 47 45  |od....GE|
  54 00 00 00 05 3a 68 6f  |T....:ho|
  73 74 00 00 00 0f 77 77  |st....ww|
  77 2e 65 78 61 6d 70 6c  |w.exampl|
  65 2e 6f 72 67 00 00 00  |e.org...|
  0f 41 63 63 65 70 74 2d  |.Accept-|
  4c 61 6e 67 75 61 67 65  |Language|
  00 00 00 05 65 6e 2d 55  |....en-U|
  53 00 00 00 05 3a 70 61  |S....:pa|
  74 68 00 00 00 28 2f 74  |th..../t|
  68 69 73 2f 69 73 2f 74  |his/is/t|
  68 65 2f 72 65 71 75 65  |he/reque|
  73 74 3f 69 73 3d 69 74  |st?is=it|
  26 6e 6f 74 3d 62 65 61  |&not=bea|
  75 74 69 66 75 6c        |utiful|
</artwork></figure>

        <t>Note that the equivalent information encoded within 
        an HTTP/1.1 request message requires 102 bytes.</t>
        
      </section>

    </section>
    
    <section title="Security Considerations">
      <t>TBD</t>
    </section>
        
  </middle> 

  <back>
    <references title="Normative References"> 
  &rfc2119;
    </references>    
    
    <section title="Additional Examples">
      
        <t>Assuming the following (intentionally incomplete) header 
        registrations adapted from the existing http-bis specifications.</t>
      
        <texttable>
          <ttcol>HTTP Header</ttcol>
          <ttcol>Codepage</ttcol>
          <ttcol>ID</ttcol>
          <c>Version</c>
          <c>0</c>
          <c>1</c>
          <c>Method</c>
          <c>0</c>
          <c>2</c>
          <c>Host</c>
          <c>0</c>
          <c>3</c>
          <c>Path (Request URI)</c>
          <c>0</c>
          <c>4</c>
          <c>Status</c>
          <c>0</c>
          <c>5</c>
          <c>Status-Text</c>
          <c>0</c>
          <c>6</c>
          <c>Content-Length</c>
          <c>0</c>
          <c>7</c>
          <c>Content-Type</c>
          <c>0</c>
          <c>8</c>
          <c>Content-Encoding</c>
          <c>0</c>
          <c>9</c>
          <c>Expect</c>
          <c>0</c>
          <c>10</c>
          <c>Location</c>
          <c>0</c>
          <c>11</c>
          <c>Last-Modified</c>
          <c>0</c>
          <c>12</c>
          <c>ETag</c>
          <c>0</c>
          <c>13</c>
          <c>If-Match</c>
          <c>0</c>
          <c>14</c>
          <c>If-None-Match</c>
          <c>0</c>
          <c>15</c>
          <c>If-Modified-Since</c>
          <c>0</c>
          <c>16</c>
          <c>If-Unmodified-Since</c>
          <c>0</c>
          <c>17</c>
          <c>Age</c>
          <c>0</c>
          <c>18</c>
          <c>Cache-Control</c>
          <c>0</c>
          <c>19</c>
          <c>Expires</c>
          <c>0</c>
          <c>20</c>
          <c>Vary</c>
          <c>0</c>
          <c>21</c>
          
          <c>Accept</c>
          <c>1</c>
          <c>1</c>
          <c>Accept-Language</c>
          <c>1</c>
          <c>2</c>
          <c>Accept-Charset</c>
          <c>1</c>
          <c>3</c>
          <c>Accept-Encoding</c>
          <c>1</c>
          <c>4</c>
          <c>Allow</c>
          <c>1</c>
          <c>5</c>
          <c>Content-Language</c>
          <c>1</c>
          <c>6</c>
          <c>Content-Location</c>
          <c>1</c>
          <c>7</c>
          <c>Date</c>
          <c>1</c>
          <c>8</c>
          <c>From</c>
          <c>1</c>
          <c>9</c>
          <c>Warning</c>
          <c>1</c>
          <c>10</c>
          
        </texttable>
      
        <t>And the following values representing known HTTP Methods:</t>
      
        <texttable>
          <ttcol>Method</ttcol>
          <ttcol>Value</ttcol>
          <c>GET</c>
          <c>1</c>
          <c>POST</c>
          <c>2</c>
          <c>PUT</c>
          <c>3</c>
          <c>DELETE</c>
          <c>4</c>
          <c>PATCH</c>
          <c>5</c>
          <c>HEAD</c>
          <c>6</c>
          <c>OPTIONS</c>
          <c>7</c>
          <c>CONNECT</c>
          <c>8</c>
        </texttable>
      
        <t>We can derive the following optimized encodings:</t>
        
<figure><preamble>Version Header:</preamble><artwork>
  00 01 00 00 02 02 00    |.......|
</artwork></figure>

<figure><preamble>Method Header (GET Request)</preamble><artwork>
  00 02 00 00 01 01       |......|
</artwork></figure>

<figure><preamble>Method Header (PATCH Request)</preamble><artwork>
  00 02 00 00 01 05       |......|
</artwork></figure>

<figure><preamble>Method Header (Custom "FOO" Method)</preamble><artwork>
  00 02 01 00 03 46 4F 4F |.....FOO|
</artwork></figure>

<figure><preamble>Host Header:</preamble><artwork>
  00 03 01 00 0f 77 77 77 |.....www|
  2e 65 78 61 6d 70 6c 65 |.example|
  2e 6f 72 67             |.org|
</artwork></figure>

<figure><preamble>Representation of HTTP Response Status ("200 OK"):</preamble><artwork>
  00 05 00 00 01 C8 00 06 |........|
  01 00 02 4F 4B          |...OK|
</artwork></figure>

<t>The status above is represented as two separate headers, one containing the 
status code, the other containing the status text.</t>

<figure><preamble>Content-Length Header (value encoded as uint32):</preamble><artwork>
  00 07 00 00 04 00 00 00 |........|
  C8                      |.|
</artwork></figure>

<figure><preamble>Content-Type Header:</preamble><artwork>
  00 08 01 00 0A 69 6d 61 |.....ima|
  67 65 2f 6a 70 65 67    |ge/jpeg|
</artwork></figure>

<figure><preamble>Expect Header (Expect: 100):</preamble><artwork>
  00 0A 00 00 01 64       |......|
</artwork></figure>

<figure><preamble>Last-Modified (Using RFC3339 Format):</preamble><artwork>
  00 0C 01 00 19 32 30 31 |.....201|
  32 2d 30 38 2d 30 31 54 |2-08-01T|
  30 34 3a 32 33 3a 31 32 |04:23:12|
  2e 31 32 33 34 5a       |.1234Z|
</artwork></figure>

<figure><preamble>ETag (Strong Entity-Tag, String-format):</preamble><artwork>
  00 0D 01 00 07 22 61 62 |....."ab|
  63 64 65 22             |cde"|
</artwork></figure>

<figure><preamble>If-None-Match:</preamble><artwork>
  00 0F 01 00 07 22 61 62 |....."ab|
  63 64 65 22             |cde"|
</artwork></figure>

<figure><preamble>If-None-Match (Multiple values)</preamble><artwork>
  00 0F 03 00 0F 22 61 62 |....."ab|
  63 64 65 22 00 22 61 62 |cde"."ab|
  63 64 66 22             |cdf"|
</artwork></figure>

<figure><preamble>Allow (GET, POST, FOO):</preamble><artwork>
  10 05 02 00 07 01 00 02 |........|
  00 46 4f 4f             |.FOO|
</artwork></figure>

    </section>
    
  </back>
</rfc> 
 

PAFTECH AB 2003-20262026-04-24 05:40:12