<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">

<rfc ipr="trust200902" category="exp" docName="draft-trammell-tcpm-timestamp-interval-01.txt">
<?rfc toc="yes"?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>

  <front>
    <title abbrev="Timestamp Intervals">
      Encoding of Time Intervals for the TCP Timestamp Option
    </title>

    <author fullname="Richard Scheffenegger" 
            initials="R." surname="Scheffenegger">
     <organization>NetApp, Inc.</organization>
     <address>
       <postal>
         <street>Am Euro Platz 2</street>
         <city>1120 Vienna</city>
         <country>Austria</country>
       </postal>
       <phone>+43 1 3676811 3146</phone>
       <email>rs@netapp.com</email>
     </address>
    </author>
  
    <author fullname="Mirja Kuehlewind" 
            initials="M." surname="Kuehlewind">
      <organization>University of Stuttgart</organization>
      <address>
        <postal>
          <street>Pfaffenwaldring 47</street>
          <city>70569 Stuttgart</city>
          <country>Germany</country>
        </postal>
        <email>mirja.kuehlewind@ikr.uni-stuttgart.de</email>
      </address>
    </author>

    <author fullname="Brian Trammell" 
            initials="B." surname="Trammell">
      <organization abbrev="ETH Zurich">
        Swiss Federal Institute of Technology Zurich
      </organization>
      <address>
        <postal>
          <street>Gloriastrasse 35</street>
          <city>8092 Zurich</city>
          <country>Switzerland</country>
        </postal>
        <phone>+41 44 632 70 13</phone>
        <email>trammell@tik.ee.ethz.ch</email>
      </address>
    </author>

    <date month="July" day="15" year="2013"/>
    <area>Transport</area>
    <workgroup>TCP Maintenance and Minor Extensions (tcpm)</workgroup>

    <keyword>TCP</keyword>
    <keyword>timestamp</keyword>
    <keyword>interval</keyword>
    
    <abstract>

      <t>The TCP Timestamp option would be useful for additional measurements
      if it could be assumed that the interval between ticks of the timestamp
      clock is regular, and if that interval were known. In practice, many
      implementations do use a timestamp clock source that has a regular
      interval. This draft specifies a compact encoding for exposing the
      timestamp interval to a receiver, and discusses applications therefor.</t>

    </abstract>
  </front>  
  <middle>
    
    <!-- <section anchor="dtd" title="Discussion and To Do">
      
      <t>What we really want: timestamp side channeling</t>
      
      <t>This mechanism should imply SACK semantics if SACK available</t>

      <t>Plan: TS Nego becomes TS Cap PS.</t>
      
      <t>Timestamp interval exposure is defined to be valid after the ACK of
      the SYN</t>
      
      <t>Timestamp interval exposure can be in the SYN; TS subsequent.</t>
      
      <t>Question: do we want masking?</t>
      
    </section> -->

    <section anchor="intro" title="Introduction">

      <t>The Timestamp option originally introduced in <xref target="RFC1323"/>
      was designed to support only two very specific mechanisms, round trip
      time measurement (RTTM), and protection against wrapped sequence numbers
      (PAWS), assuming a particular TCP algorithm (Reno).</t>

      <t>While <xref target="RFC1323"/> specifies only that timestamps "must be
      at least approximately proportional to real time" to support RTTM, many
      implementations generate timestamp values from a regular timing source.
      Determining the real-time interval represented by a single tick makes
      additional measurements possible. In addition to easing passive
      measurements using the timestamp option, it also makes possible the
      measurement of inter-departure time; comparing inter-departure time
      to inter-arrival time yields a one-way delay variation measurement,
      which is useful for congestion control algorithms as well as in QoS
      applications.</t>

      <t>This document specifies a compact encoding for timestamp intervals
      that can be exported via any number of mechanisms: through a new
      TCP option, by piggybacking on the timestamp option as in <xref
      target="I-D.scheffenegger-tcpm-timestamp-negotiation"/>, or through other
      in- or out-of-band means. It also specifies an experimental TCP
      option for experiments with interval exposure separate from any other
      mechanism.</t>
        
    </section>

    <section anchor="terms" title="Terminology">

      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"/>.</t>

      <t>Terms defined in <xref target="RFC1323"/> are used in this document as
      defined there.</t>

      <t>This document defines the following additional term:</t>
      
      <t><list style="hanging">

        <t hangText="Timestamp interval"><vspace />
          The interval between two ticks of the timestamp clock source running
          at a constant frequency. Note that the timestamp clock is not
          required to be identical with the TCP clock, even though most
          implementations use the same clock for practical purposes. </t>
          </list></t>

    </section>
    
    <section title="Timestamp interval exposure">

      <t>This section describes the requirements for interval encoding, then
      specifies an interval encoding that meets these requirements, based on a 16-bit
      reduced-precision encoding of a 42-bit fixed-point unsigned integer.</t>

      <section title="Interval encoding requirements">

        <t>The choice of a timestamp interval is generally
        implementation-specific, and there are a small number of commonly
        chosen intervals. However, a general solution must support not only
        common cases, but uncommon ones, and provide future flexibility to
        allow an implementation to dynamically choose new timestamp intervals
        for new sockets, based on network conditions and specific requirements
        for timestamp measurements.</t>
        
        <t>There are some sensible bounds on the range of timestamp intervals
        that must be reasonably supported. The minimum inter-packet interval
        for 64-byte packets (i.e., back-to-back ACK segments) on a future 400
        Gigabit Ethernet would be about 1 ns; smaller intervals need not be
        supported with current technology, even for applications for which a
        unique timestamp for every packet would be useful. At the other end of
        the scale, low-bandwidth, high-latency links may operate with timestamp
        intervals on the order of seconds.</t>
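
        <t>As a rough check of the lower bound, the following Python sketch
        computes the serialization time of a minimum-size packet. The function
        name is illustrative, and the calculation assumes exactly 64 octets on
        the wire, ignoring preamble and inter-frame gap, which the text above
        does not specify.</t>

        <figure><artwork><![CDATA[
```python
def serialization_time_ns(octets, link_bps):
    """Time in nanoseconds to put `octets` on a link of `link_bps` bit/s."""
    return octets * 8 / link_bps * 1e9


# 64-octet packet on a 400 Gbit/s link: 512 bits / 400e9 bit/s
print(round(serialization_time_ns(64, 400e9), 2))  # 1.28
```
]]></artwork></figure>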

        <t>The precision required by timestamp interval export, on the other
        hand, is determined by the applications for which the information will
        be used and the precision of the underlying clock source. As many clock
        sources may provide less than maximum precision (due to e.g. interrupt
        jitter), there should be some way to represent variable precision.</t>
        
        <t>As a timestamp interval will need to be bound to a connection
        in-band at runtime, a space-efficient encoding is necessary.</t>
        
        <t>These requirements indicate a reduced-precision encoding of a
        fixed-point interval, expressed in seconds, as described in the next
        subsection.</t>
        
      </section>
      
      <section title="Interval encoding specification" anchor="sec-encoding">

        <t>A 42-bit fixed-point unsigned integer with 4 bits before the
        binary point and 38 bits after, expressed in seconds, is sufficient
        to encode an interval range from just under 16 seconds
        (0x3ff ffff ffff) down to 2^-38 s or 3.64 ps
        (0x000 0000 0001), meeting the range requirement. Sufficient
        precision for the applications envisioned by this document is provided
        by exporting just the 11 most significant bits of the interval value
        (here, the "value"), coupled with a 5-bit "scale" which locates the
        least significant bit of the value within the larger field: a scale of
        31 places the value field between bits 41 and 31 inclusive of the
        fixed-point integer for the largest intervals, while a scale of 0
        places the value field between bits 10 and 0 inclusive. By using a
        scale such that the most significant bit of the value is not 1, fewer
        than 11 bits of precision can be signaled as well; implementations
        SHOULD NOT represent more precision in an exported timestamp interval
        than they actually support. Full precision export is available down to
        2^-27 s (or 7.45 ns) with diminishing precision down to 3.64 ps. This
        arrangement therefore allows the representation of timestamp intervals
        spanning nearly 13 orders of magnitude, with up to 11 bits of precision, in only two
        octets. The details of this encoding are illustrated in <xref
        target="fig-int-encoding"/>.</t>

<?rfc needLines='18'?>
        <t><figure anchor="fig-int-encoding" title="Timestamp interval encoding using scaled fixed-point integer" align="center">
        <artwork align="center"><![CDATA[
MSb                                          LSb
 41   37     31       23       15       7     0
+----+------+--------+--------+--------+-------+
| int|                frac                     |   full value
+----+------+--------+--------+--------+-------+
                 /             \
              +-+               \
             /                   \
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |  scale  |        value        |                encoded interval
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   15     11 10                  0
]]></artwork></figure></t>

      <t>This encoded 16-bit interval is then exported for a given
      connection, either as the standalone TCP option described in the
      following subsection or as part of an extended timestamp negotiation.</t>
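
      <t>The encoding specified in this section can be sketched in Python as
      follows. This is a minimal illustration, not a normative algorithm: the
      function name encode_interval, the precision_bits parameter, and the
      choice of round-to-nearest are assumptions of the sketch. The three
      printed values correspond to the 1 s and 10 ms entries of <xref
      target="tab-common-intervals-11bit"/> and the 1 s entry of <xref
      target="tab-common-intervals-9bit"/>.</t>

      <figure><artwork><![CDATA[
```python
from fractions import Fraction

FRAC_BITS = 38   # bits after the binary point of the 42-bit value
TOTAL_BITS = 42
VALUE_BITS = 11  # width of the exported value field
MAX_SCALE = 31   # largest value of the 5-bit scale field


def encode_interval(seconds, precision_bits=VALUE_BITS):
    """Return the 16-bit encoding (scale << 11 | value) of an interval.

    precision_bits is the number of significant bits to expose (1..11);
    round-to-nearest is an assumption of this sketch.
    """
    if not 1 <= precision_bits <= VALUE_BITS:
        raise ValueError("precision_bits must be in 1..11")
    fixed = round(Fraction(seconds) * (1 << FRAC_BITS))  # units of 2^-38 s
    if not 0 < fixed < (1 << TOTAL_BITS):
        raise ValueError("interval outside the representable range")
    # Choose the scale so that the top set bit lands at the top of the
    # claimed precision; small intervals bottom out at scale 0.
    scale = min(max(fixed.bit_length() - precision_bits, 0), MAX_SCALE)
    value = (fixed + ((1 << scale) >> 1)) >> scale  # round to nearest
    if value >> precision_bits and scale < MAX_SCALE:
        value >>= 1   # rounding carried into one extra bit
        scale += 1
    value = min(value, (1 << VALUE_BITS) - 1)
    return (scale << VALUE_BITS) | value


print(hex(encode_interval(1)))      # 0xe400
print(hex(encode_interval(0.01)))   # 0xad1f
print(hex(encode_interval(1, 9)))   # 0xf100
```
]]></artwork></figure>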

      <t>A sender explicitly signals that it uses an irregular timestamp clock
      by sending zero for both scale and value (i.e., 0x0000). Combinations of
      a value of zero and a non-zero scale are reserved for future use. These
      values MUST NOT be sent as a timestamp interval, and SHOULD presently be
      interpreted by the receiver as exposing an irregular timestamp clock.</t>
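
      <t>A receiver-side counterpart can be sketched in Python as follows.
      The function name decode_interval and the use of None to represent an
      irregular clock are assumptions of this sketch; treating the reserved
      combinations like 0x0000 follows the SHOULD in the paragraph above.</t>

      <figure><artwork><![CDATA[
```python
from fractions import Fraction


def decode_interval(encoded):
    """Return the interval in seconds as a Fraction, or None.

    None means the sender exposes an irregular timestamp clock.
    """
    if not 0 <= encoded <= 0xFFFF:
        raise ValueError("encoded interval must fit in 16 bits")
    scale = encoded >> 11      # upper 5 bits
    value = encoded & 0x7FF    # lower 11 bits
    if value == 0:
        # 0x0000 signals an irregular clock; a zero value with a non-zero
        # scale is reserved and is presently interpreted the same way.
        return None
    return Fraction(value << scale, 1 << 38)


print(decode_interval(0xE400))          # 1
print(decode_interval(0xDC00))          # 1/2
print(float(decode_interval(0xC666)))   # close to 0.1
print(decode_interval(0x0000))          # None
```
]]></artwork></figure>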
      
      <t>For implementations that support only a single timestamp interval for
      all flows in all situations, the encoded interval can be implemented as a
      constant. Encodings for common timestamp intervals with maximum precision
      are given in <xref target="tab-common-intervals-11bit"/>. Encodings for
      9-bit precision, the maximum available from common software interrupt
      clock sources, are given in <xref
      target="tab-common-intervals-9bit"/>.</t>

<t><vspace blankLines='66'/></t>

<?rfc needLines='19' ?>
<texttable anchor="tab-common-intervals-11bit" align="center"
  title="Encodings for common timestamp intervals with maximum precision">
  <ttcol align="right">interval</ttcol>
  <ttcol align="right">frequency</ttcol>
  <ttcol align="center">scale</ttcol>
  <ttcol align="center">value</ttcol>
  <ttcol align="center">combined</ttcol>
  <c>16 s </c>  <c>0.06 Hz </c> <c>0x1f</c> <c>0x7ff</c> <c>0xffff</c>
  <c>1 s </c>   <c> 1 Hz </c>   <c>0x1c</c> <c>0x400</c> <c>0xe400</c>
  <c>0.5 s </c> <c> 2 Hz </c>   <c>0x1b</c> <c>0x400</c> <c>0xdc00</c>
  <c>100 ms</c>      <c>10 Hz </c>   <c>0x18</c> <c>0x666</c> <c>0xc666</c>
  <c> 10 ms</c>      <c>100 Hz </c>  <c>0x15</c> <c>0x51f</c> <c>0xad1f</c>
  <c>  4 ms</c>      <c>250 Hz </c>  <c>0x14</c> <c>0x419</c> <c>0xa419</c>
  <c>  1 ms</c>      <c> 1 kHz</c>        <c>0x12</c> <c>0x419</c> <c>0x9419</c>
  <c>200 us</c>      <c> 5 kHz</c>        <c>0x0f</c> <c>0x68e</c> <c>0x7e8e</c>
  <c> 50 us</c>      <c>20 kHz</c>        <c>0x0d</c> <c>0x68e</c> <c>0x6e8e</c>
  <c>  1 us</c>      <c> 1 MHz</c>        <c>0x08</c> <c>0x432</c> <c>0x4432</c>
  <c> 60 ns</c>      <c>16.7 MHz</c>      <c>0x04</c> <c>0x407</c> <c>0x2407</c>
  <c> none </c> <c>--------</c>      <c>0x00</c> <c>0x000</c> <c>0x0000</c>
</texttable>
    
<?rfc needLines='18' ?>
<texttable anchor="tab-common-intervals-9bit" align="center"
  title="Encodings for common timestamp intervals with 9-bit precision">
  <ttcol align="right">interval</ttcol>
  <ttcol align="right">frequency</ttcol>
  <ttcol align="center">scale</ttcol>
  <ttcol align="center">value</ttcol>
  <ttcol align="center">combined</ttcol>
  <c>1.0 s</c>   <c>1 Hz</c>     <c>0x1e</c><c>0x100</c><c>0xf100</c>
  <c>0.5 s</c>   <c>2 Hz</c>     <c>0x1d</c><c>0x100</c><c>0xe900</c>
  <c>100 ms</c>  <c>10 Hz</c>    <c>0x1a</c><c>0x199</c><c>0xd199</c>
  <c> 10 ms</c>  <c>100 Hz</c>   <c>0x17</c><c>0x147</c><c>0xb947</c>
  <c>  4 ms</c>  <c>250 Hz</c>   <c>0x16</c><c>0x106</c><c>0xb106</c>
  <c>  1 ms</c>  <c>1 kHz</c>    <c>0x14</c><c>0x106</c><c>0xa106</c>
  <c>200 us</c>  <c>5 kHz</c>    <c>0x11</c><c>0x1a3</c><c>0x89a3</c>
  <c> 50 us</c>  <c>20 kHz</c>   <c>0x0f</c><c>0x1a3</c><c>0x79a3</c>
  <c>  1 us</c>  <c>1 MHz</c>    <c>0x0a</c><c>0x10c</c><c>0x510c</c>
  <c> 60 ns</c>  <c>16.7 MHz</c> <c>0x06</c><c>0x101</c><c>0x3101</c>
  <c> none </c> <c>--------</c>      <c>0x00</c> <c>0x000</c> <c>0x0000</c>
</texttable>

<!--<t><vspace blankLines='66'/></t>-->

        </section>

        <section title="Timestamp Interval experimental TCP option" anchor="sec-option">

          <t>This section specifies an experimental TCP option, using an ExID
          and magic number as described in <xref
          target="I-D.ietf-tcpm-experimental-options"/>, for exporting
          timestamp intervals. This option MAY appear in any TCP segment after
          the SYN segment to advertise the sender's timestamp interval, encoded
          as in <xref target="sec-encoding"/> above. If the receiver uses
          timestamp interval information, it stores the interval for the
          duration of the connection, or until a subsequent Timestamp Interval
          option is received. The receiver may assume the Timestamp Interval is
          applicable from the point of receipt of the option; i.e., that all
          subsequently received segments with the same or a later sequence
          number than the segment containing the option carry timestamps
          generated with the stated interval.</t>
          
          <t>If a sender has previously sent a timestamp interval to a
          receiver, and changes the timestamp interval on the connection, it
          MUST send a new Timestamp Interval option.</t>
          
          <t>This option MUST NOT appear in a segment that does not also
          carry a TCP Timestamp option.</t>

          <figure anchor="fig-option" title="Structure of Timestamp Interval Experimental TCP option for interval export" align="center">
          <artwork align="center"><![CDATA[
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Kind = 253  |  Length = 8   |         ExID = 0x75ec         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      more magic = 0xffee      |   encoded advertised interval |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork></figure>

          <t>Should timestamp interval exposure prove useful, and a separate
          TCP Option be chosen as the preferred method to send it in-band, this
          option would have a length of 4, and the use of an ExID and magic
          number would preserve word alignment in implementations transitioning
          from experimental to production TCP Option usage.</t>
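
          <t>The option layout in <xref target="fig-option"/> can be built and
          parsed as in the following Python sketch. The function and constant
          names are illustrative assumptions; only the field values and their
          order are taken from the figure.</t>

          <figure><artwork><![CDATA[
```python
import struct

TCPOPT_EXPERIMENTAL = 253
OPTION_LENGTH = 8
EXID = 0x75EC
MAGIC = 0xFFEE
LAYOUT = "!BBHHH"  # network byte order: kind, length, ExID, magic, interval


def pack_interval_option(encoded):
    """Build the 8-octet experimental option carrying a 16-bit interval."""
    return struct.pack(LAYOUT, TCPOPT_EXPERIMENTAL, OPTION_LENGTH,
                       EXID, MAGIC, encoded)


def parse_interval_option(data):
    """Return the encoded interval, or None if `data` is another option."""
    if len(data) != OPTION_LENGTH:
        return None
    kind, length, exid, magic, encoded = struct.unpack(LAYOUT, data)
    expected = (TCPOPT_EXPERIMENTAL, OPTION_LENGTH, EXID, MAGIC)
    if (kind, length, exid, magic) != expected:
        return None
    return encoded


option = pack_interval_option(0xE400)   # advertise a 1 s interval
print(option.hex())                     # fd0875ecffeee400
print(hex(parse_interval_option(option)))  # 0xe400
```
]]></artwork></figure>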

 </section>
 <!--
        <section title="Interval export during TS negotiation"> 

          <t>[EDITOR'S NOTE: bind to new revision of the TS negotiation draft;
          requires TS negotiation that can flexibly add 16 bits of content to
          the negotiation handshake.]</t>
      
       </section>
   -->
      </section>
<!--
      <section title="Timestamp interval negotiation">
        
        <t>[EDITOR'S NOTE: describe here how a receiver could ask a sender for
        a specific TS rate: an option with two encoded intervals could be
        handled as consisting of an advertised interval (first interval) and a
        requested interval (second interval). A sender that gets an interval
        request must then send a ts interval option which advertises the
        closest interval it is willing to support. This mechanism could also be
        used to implicitly request that timestamps be turned on, if it is
        decided that 1323 could be updated to support mid-connection
        initialization of TS.]</t>

      </section>
-->

      <section title="Guidelines for defined-interval timestamp export">

        <t>As noted above, implementations SHOULD NOT indicate more precision
        than they support. As common software interrupt clock sources provide
        about 9 bits of precision, these should be indicated with 2 leading
        zero bits in the value field. Low-variance software clocks (e.g., CPU
        cycle counters) should be indicated with a single leading zero bit,
        and hardware injecting the timestamp into the header with high
        precision should use the full precision. Similarly, if the clock
        source exhibits a very high variability (e.g. when running in a
        virtualized environment), 3 or more leading zeros should be used in
        the value field.</t>
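
        <t>One way to apply this guideline to an already encoded interval is
        sketched below in Python. The function name limit_precision and the
        use of truncation are assumptions of the sketch; with truncation, the
        two examples reproduce the 1 s and 100 ms entries of <xref
        target="tab-common-intervals-9bit"/> from those of <xref
        target="tab-common-intervals-11bit"/>.</t>

        <figure><artwork><![CDATA[
```python
def limit_precision(encoded, bits):
    """Re-encode an interval so the value has at most `bits` significant bits.

    Excess low-order bits are dropped by truncation and the scale is raised
    to compensate, leaving leading zero bits in the 11-bit value field.
    """
    if not 1 <= bits <= 11:
        raise ValueError("bits must be in 1..11")
    scale, value = encoded >> 11, encoded & 0x7FF
    excess = value.bit_length() - bits
    if excess <= 0:
        return encoded                 # already within the requested precision
    shift = min(excess, 31 - scale)    # the scale field cannot exceed 31
    return ((scale + shift) << 11) | (value >> shift)


print(hex(limit_precision(0xE400, 9)))  # 0xf100
print(hex(limit_precision(0xC666, 9)))  # 0xd199
```
]]></artwork></figure>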

       <t>Timestamp intervals faster than about 1 ms SHOULD be implemented by
       inserting the timestamp "late" before transmitting a segment to avoid
       unnecessary timing jitter.</t>

       <t>Intervals on the order of 1 us or less are intended for use
       with hardware-assisted implementations, e.g., direct use of a
       (shifted) CPU cycle counter as clock source.</t>

      </section>


    <!-- <section title="Timestamp interval estimation">
      
      <t>[EDITOR'S NOTE: discuss ways to estimate timestamp intervals at the
      receiver without additional export (e.g., algorithms based on PLLs or the
      like); discuss how this can be used to defend against bad or malicious
      timestamp interval export - adapt the following text?]</t>

        <t>Today, each TCP host may use an arbitrary, locally defined clock
        source to derive the timestamp value from. Even though only a handful
        of typically used clock rates are implemented in common TCP stacks,
        this does not guarantee that any future stack will choose the same
        clock rate. This poses a problem for current state of the art
        heuristics, which try to determine the senders timestamp clock rate by
        pure passive observation of the TCP stream, and affects both advanced
        heuristics in the partner host of a TCP session, or arbitrarily located
        passive observation points to estimate TCP session parameters.</t>

        <t>The proposed mechanism would reveal this information explicitly,
        even though other environmental factors, such as the operation of a TCP
        stack in a virtualized environment, may result in some deviations in
        the actually used clock rate.</t>

        <t>High-speed and real-time stacks would be expected to operate with
        higher clock rates, while the observed variance in (known) timestamp
        clock vs. reference clock could help in determining between physical
        and virtual end hosts, for example.</t>
      
    </section> -->

    <section title="IANA Considerations">
      
      <t>This document uses the Experimental Option Experiment Identifier
      (ExID) 0x75ec ffee to identify the Timestamp Interval experimental option
      in <xref target="sec-option"/>; an application for this codepoint in the
      IANA TCP Experimental Option ExID registry has already been submitted.</t>
      
    </section>

    <section title="Security Considerations">
      <t>[EDITOR'S NOTE: discuss implications of misuse -- what can I break by
      sending a bad interval?]</t>
    </section>

  </middle>
  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.1323" ?>
      <?rfc include="reference.RFC.2119" ?>
      <?rfc include="reference.I-D.scheffenegger-tcpm-timestamp-negotiation" ?>
      <?rfc include="reference.I-D.ietf-tcpm-experimental-options" ?>
    </references>
    <references title="Informative References">
      <!-- <?rfc include="reference.RFC.3522" ?> -->
      <!-- <?rfc include="reference.RFC.4015" ?> -->
      <!-- <?rfc include="reference.RFC.6013" ?> -->
      <!-- <?rfc include="reference.RFC.6247" ?> -->

      <!-- <?rfc include="reference.I-D.ietf-tcpm-tcp-security.xml"?> -->
      <!-- <?rfc include="reference.I-D.ietf-tcpm-1323bis.xml"?> -->
      <!-- <?rfc include="reference.I-D.ietf-tcpm-anumita-tcp-stronger-checksum"?> -->
      <?rfc include="reference.I-D.ietf-ledbat-congestion"?>
      <!-- <?rfc include="reference.I-D.blanton-tcp-reordering"?> -->
      <!-- <?rfc include="reference.I-D.sabatini-tcp-sack"?> -->

      <reference anchor="Chirp"
          target="http://bobbriscoe.net/projects/netsvc_i-f/chirp_pfldnet10.pdf">
        <front>
          <title>Chirping for Congestion Control - Implementation Feasibility</title>
        
          <author initials="M." surname="Kuehlewind">
            <organization>University of Stuttgart</organization>
          </author>
          <author initials="B." surname="Briscoe">
            <organization>British Telecom</organization>
          </author>
          <date month="Nov" year="2010"/>
        </front>
      </reference>

      <!-- <reference anchor="Cho08"
          target="http://ubinet.yonsei.ac.kr/v2/publication/hpmn_papaers/ic/2008_Enhanced%20Response%20Algorithm%20for%20Spurious%20TCP.pdf">
        <front>
          <title>Enhanced Response Algorithm for Spurious TCP
            Timeout (ER-SRTO)</title>
        
          <author initials="I." surname="Cho">
            <organization>Yonsei University</organization>
          </author>
          <author initials="J." surname="Han">
            <organization>Yonsei University</organization>
          </author>
          <author initials="J." surname="Lee">
            <organization>Yonsei University</organization>
          </author>
          <date month="Jan" year="2008"/>
        </front>
      </reference> -->
      
      <!-- <reference anchor="CUBIC"
          target="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.3152&rep=rep1&type=pdf">
        <front>
          <title>CUBIC: A New TCP-Friendly High-Speed
            TCP Variant</title>
        
          <author initials="I." surname="Rhee">
            <organization>NC State University</organization>
          </author>
          <author initials="S." surname="Ha">
            <organization>NC State University</organization>
          </author>
          <author initials="L." surname="Xu">
            <organization>University of Nebraska</organization>
          </author>
          <date month="Feb" year="2005"/>
        </front>
      </reference> -->
     
      <!-- <reference anchor="BSD10"
          target="http://caia.swin.edu.au/reports/100219A/CAIA-TR-100219A.pdf">
        <front>
          <title>Timing enhancements to the FreeBSD kernel to
            support delay and rate based TCP mechanisms</title>
        
          <author initials="D." surname="Hayes">
            <organization>Swinburne University of Technology</organization>
          </author>
          
          <date month="Feb" year="2010"/>
        </front>
      </reference> -->
          
<!--       <reference anchor="Linux"
          target="http://www.cs.clemson.edu/~westall/853/linuxtcp.pdf">
        <front>
          <title>Linux TCP</title>
        
          <author initials="P." surname="Sarolahti">
            <organization>Nokia</organization>
          </author>
          
          <date month="Apr" year="2007"/>
        </front>
      </reference> -->
      
      
      <!-- <reference anchor="PH04"
          target="citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.76.2748&rep=rep1&type=pdf">
        <front>
          <title>The Peak-Hopper: A New End-to-End
            Retransmission Timer for Reliable Unicast Transport</title>
        
          <author initials="H." surname="Eckstroem">
            <organization>Ericsson</organization>
          </author>
          <author initials="R." surname="Ludwig">
            <organization>Ericsson</organization>
          </author>
          
          <date month="Apr" year="2004"/>
        </front>
      </reference> -->
      
        
      <!-- <reference anchor="Path99"
          target="http://www.icir.org/mallman/papers/estimation.ps">
        <front>
          <title>On Estimating End-to-End Network Path Properties</title>
        
          <author initials="M." surname="Allman">
            <organization>NASA</organization>
          </author>
          <author initials="V." surname="Paxson">
            <organization>ICSI</organization>
          </author>
          
          <date month="Sep" year="1999"/>
   
        </front>
      </reference> -->
    </references>
    
      <section title="Methodology for one-way delay variation measurement using
                      known timestamp intervals">

        <t>Congestion control algorithms that react to measured one-way delay
        variation are currently being proposed (see <xref
        target="I-D.ietf-ledbat-congestion"/>, <xref target="Chirp"/>). This
        control variable is updated after each received ACK as follows:</t>

        <t>C(t) = TSval(t) - TSecr(t)</t>

        <t>V(t) = C(t) - C(t-1)</t>
        
        <t>This calculation is valid provided that the timestamp clocks at
        both ends are running at roughly the same rate. Without prior
        knowledge of the timestamp clock interval used by the partner, a
        sender can try to learn this interval by observing the exchanged
        segments for a duration of a few RTTs. However, such a scheme fails
        if the partner uses some form of implicit integrity check of the
        timestamp values, which would appear either as random scrambling of
        the least significant bits of the timestamp or as much shorter
        clock intervals than are actually used. If the partner uses some
        form of segment counting as the timestamp value, without any direct
        relationship to wall-clock time, the above formula will fail to
        yield meaningful results. Finally, the network conditions need to
        remain stable during any such training phase, so that the sender
        can arrive at a reasonable estimate of the partner's timestamp clock
        tick duration.</t>
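
        <t>The formulas for C(t) and V(t) above can be sketched in Python as
        follows. This is an illustration under the stated condition that both
        timestamp clocks tick at roughly the same rate; the class and method
        names, and the use of 32-bit modular arithmetic to tolerate timestamp
        wraparound, are assumptions of the sketch rather than part of this
        document.</t>

        <figure><artwork><![CDATA[
```python
MASK32 = 0xFFFFFFFF  # TSval and TSecr are 32-bit fields


def to_signed32(x):
    """Interpret a 32-bit modular difference as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


class DelayVariation:
    """Tracks C(t) = TSval(t) - TSecr(t) and V(t) = C(t) - C(t-1)."""

    def __init__(self):
        self.prev_c = None

    def update(self, tsval, tsecr):
        """Process one ACK; return V(t) in ticks, or None for the first."""
        c = (tsval - tsecr) & MASK32
        v = None if self.prev_c is None else to_signed32(c - self.prev_c)
        self.prev_c = c
        return v


dv = DelayVariation()
print(dv.update(1000, 500))   # None: no previous sample yet
print(dv.update(1010, 507))   # 3: C went from 500 to 503
print(dv.update(1020, 519))   # -2: C went from 503 to 501
```
]]></artwork></figure>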
        
        <t>[EDITOR'S NOTE: the following refers to a mask field which doesn't
        exist anymore, needs a rewrite. Shouldn't we define C(t) = (TSecr(t) - TSval(t)) * (TSinterval(remote) / TSinterval(local))?]</t>
        
        <t>This note addresses these concerns by providing a means by which
        both hosts are required to use a timestamp clock that is closely
        related to wall-clock time, with a known clock rate, and a means
        by which a host can signal the use of a few least significant bits
        for timestamp value integrity checks. To arrive at a valid one-way
        delay (OWD) variation, the timestamp received from the partner first
        has to be right-shifted by a known number of bits, as defined by the
        mask field. Next, the local and remote timestamp values need to be
        normalized to a common base clock interval (typically, the local
        clock interval):</t>
                
<figure><artwork align="left"><![CDATA[
                                                      remote interval
C  = (TSecr >> local mask) - (TSval >> remote mask) * ---------------
 t                                                    local interval
]]></artwork></figure>

        <t>V(t) = C(t) - C(t-1)</t>

        <t>[EDITOR'S NOTE: the following refers to field definitions from the old TS nego draft; needs a rewrite.]</t>
        
        <t>The adjustment factor can be calculated once during the timestamp
        capability negotiation phase, and pure integer arithmetic can be used
        during per-segment processing:</t>
        
        <t>EXP.min = min(EXP.loc, EXP.rem)</t>

        <t>EXP.rem -= EXP.min</t>
        
        <t>EXP.loc -= EXP.min</t>
        
        <t>FRAC.rem = (0x800 | FRAC.rem) << EXP.rem</t>
        
        <t>FRAC.loc = (0x800 | FRAC.loc) << EXP.loc</t>
        
        <t>and, assuming that the local clock tick duration is the shorter of the two,</t>
        
        <t>ADJ = FRAC.rem / FRAC.loc</t>
        
        <t>with ADJ being an integer variable. For higher precision, two
        appropriately calculated integers can be used.</t>
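
        <t>Because the EXP and FRAC names above come from an earlier draft,
        the following Python sketch applies the same idea to the 16-bit
        encoding of <xref target="sec-encoding"/> instead. It is an
        adaptation made for illustration only: the function name
        interval_ratio is hypothetical, the 11-bit value field is used as
        is (without the implied 0x800 bit shown above), and the result is
        returned as two integers rather than a single ADJ.</t>

        <figure><artwork><![CDATA[
```python
from fractions import Fraction


def interval_ratio(remote, local):
    """Return (num, den) with num/den = remote interval / local interval.

    Both arguments are 16-bit encoded intervals (scale << 11 | value).
    """
    s_rem, v_rem = remote >> 11, remote & 0x7FF
    s_loc, v_loc = local >> 11, local & 0x7FF
    if v_rem == 0 or v_loc == 0:
        raise ValueError("irregular or reserved interval has no ratio")
    s_min = min(s_rem, s_loc)          # common part of both scales
    ratio = Fraction(v_rem << (s_rem - s_min), v_loc << (s_loc - s_min))
    return ratio.numerator, ratio.denominator


# Remote ticks every 1 s (0xe400), local ticks every 0.5 s (0xdc00).
num, den = interval_ratio(0xE400, 0xDC00)
print(num, den)         # 2 1
print(7 * num // den)   # 14: seven remote ticks in local tick units
```
]]></artwork></figure>

        <t>Computing the pair once per connection keeps per-segment
        processing to one integer multiplication and one integer
        division.</t>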
          
        <t>Any previously required training on the remote clock interval can be
        removed, resulting in a simpler and more dependable algorithm.
        Furthermore, transient network effects during the training phase which
        may result in a wrong inference of the remote clock interval are
        eliminated completely.</t>
        
        <t>Though specified for endpoint usage for congestion control, the
        difference between inter-arrival and inter-departure time used by this
        algorithm is applicable for passive measurement of jitter, as well.</t>
          
      </section>
  </back>  
</rfc>

<!-- additional content from nego draft -->

<!--        <t>The shortest meaningful interval was found to be a 64 byte  
          packet (i.e. ACK segment on Ethernet-like encoding) sent at a 
          rate of 100 Gbit/s. This corresponds to a maximum timestamp clock 
          rate of around 200 MHz, or an interval between clock ticks of 
          around 5 ns.
        </t>
        <t>The wide range of indicated timestamp clock intervals (spanning  
          9 orders of (decimal) magnitude, or 28 binary digits, and the 
          limitation to no more than 24 bits requires the use of a logarithmic 
          encoding. Since the precision of the timestamp clock value is most
          valuable at low frequencies (long tick durations), the clock rate
          is encoded as a time duration. This results in full precision for
          common used timestamp clock intervals, while allowing even
          shorter intervals at reduced precision. A format was chosen 
          that is simple to implement and poses no risk of confusion with
          common floating point representations.
        </t>
        <t>Conceptually, the timestamp clock
          interval can be represented as a unsigned integer with 42 bits 
          length. In this form, the least significant bit represents an 
          interval of 2^-38 sec (3.64 ps), while still allowing a maximum interval of 
          16 sec. This value is then shifted to the right, until it can be 
          represented by only 11 significant bits, and the number of shift operations 
          is stored as scaling adjustment factor (ADJ).
        </t>
        <t>A value of
          zero (both ADJ and INT are set to zero) is supported and indicates
          that the timestamp values are NOT correlated to wall-clock
          time (i.e. the sender may perform some form of segment counting
          or sequence number extension instead). A host receiving an
          interval of zero from the other end host MUST NOT perform
          time-based heuristics which take the received TSval into
          account, but SHOULD apply the regular PAWS test. The sender MUST
          guarantee the monotonic increase of timestamp values just as
          with <xref target="RFC1323"/>.
        </t>
        <t>Timestamp clock periods shorter than 1 ms SHOULD be implemented
          by inserting the timestamp "late", immediately before transmitting
          a segment, to avoid unnecessary timing jitter. The shortest clock
          periods, with intervals of only a few microseconds or less, are
          provided for hardware-assisted implementations, e.g. direct use of a
          (shifted) CPU cycle counter as clock source.
        </t>
        <t>The range of possible values runs from 15.99 s to 7.45 ns with
          full precision, and down to 3.64 ps with decreasing precision;
          3.64 ps is also the smallest difference in tick duration that
          can be resolved. These limits equate to clock frequencies of
          0.06 Hz, 134 MHz and 275 GHz, respectively.
        </t>
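        <t>These limits can be checked by decoding the corresponding field
          values. The sketch below assumes that the indicated interval
          is INT * 2^ADJ * 2^-38 sec; the function name decode_interval is
          hypothetical and exact fractions are used only to avoid rounding
          in the illustration.
        </t>
        <figure><artwork><![CDATA[
```python
from fractions import Fraction

UNIT = Fraction(1, 1 << 38)   # one least significant bit, 2**-38 s


def decode_interval(adj, int_field):
    """Tick duration in seconds, exactly: INT * 2**ADJ * 2**-38."""
    return (int_field << adj) * UNIT


for label, adj, int_field in [
    ("longest interval (ADJ=31, INT=0x7FF)", 31, 0x7FF),
    ("ADJ=0 with all INT bits set", 0, 0x7FF),
    ("smallest step (ADJ=0, INT=0x001)", 0, 0x001),
]:
    seconds = decode_interval(adj, int_field)
    print(f"{label}: {float(seconds):.3e} s, {float(1 / seconds):.3e} Hz")
```
]]></artwork></figure>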
        <t>Despite the provision of such
          a large dynamic range, a receiver should consider that a
          timestamp clock may deviate from the indicated rate by a large
          fraction. Similarly, a sender SHOULD refrain from signaling
          the clock interval with too much precision (too many significant
          bits) if the clock cannot be sampled with low variance over time.
          This can be achieved by shifting the value further to the right, so
          that INT contains a number of leading zero bits.
        </t>
        <t>If a sender is using a less precise clock source, 
          fewer significant bits can be used to implicitly signal this. For example,
          a timestamp clock interval of approximately 1 ms (1/1024th sec) can 
          be represented by both (INT=0x001, ADJ=28) and (INT=0x400, ADJ=18). A
          more accurate representation of 1 ms would be (INT=0x418, ADJ=18).
          The latter representation carries more significant bits, indicating a
          more stable clock source with low jitter.
        </t>
        <t>As rough guidance, common software interrupt clock sources should
          be indicated with 2 leading zero bits in the INT field, as in the 
          examples below. Low variance software clocks (e.g. CPU cycle 
          counters) may be indicated with a single leading zero bit, and
          hardware injecting the timestamp into the header with high precision
          may use the full precision. Similarly, if the clock source exhibits
          a very high variability (e.g. when running in a virtualized 
          environment), 3 or more leading zeros should be used in
          the INT field. 
        </t>
        <t>Example of a timestamp capability negotiation, indicating that the
          sender's timestamp clock (TCP clock) is running at 1 ms per tick and
          uses a clock source of typical quality (e.g. software timer interrupt):
        </t>
        <t>SYN, TSval=<X>, TSecr=EXO|VER=0|MSK=0|RES=0|ADJ=20|INT=0x106 
        </t>
        <t>For a similar clock source running at 10 ms, the fields would look like this:
        </t>
        <t>SYN, TSval=<X>, TSecr=EXO|VER=0|MSK=0|RES=0|ADJ=23|INT=0x147 
        </t>
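        <t>The reduction in signaled precision used in the two examples above
          can be sketched as a further right shift of an already encoded
          value. The function name with_leading_zeros is hypothetical; the
          sketch assumes an 11-bit INT field and an ADJ value of at most 31,
          and it stops early if ADJ reaches that maximum.
        </t>
        <figure><artwork><![CDATA[
```python
INT_BITS = 11   # width of the INT field
MAX_ADJ = 31    # largest value the ADJ field is assumed to hold


def with_leading_zeros(adj, int_field, zeros):
    """Shift right until INT has at least `zeros` leading zero bits."""
    limit = 1 << (INT_BITS - zeros)
    while int_field >= limit and adj < MAX_ADJ:
        int_field >>= 1   # drop one significant bit
        adj += 1          # and account for it in the scaling factor
    return adj, int_field


# 1 ms at full precision, reduced to two leading zero bits.
adj, int_field = with_leading_zeros(18, 0x418, 2)
print(f"ADJ={adj}, INT=0x{int_field:03X}")
```
]]></artwork></figure>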
    
        <texttable anchor="Tab1" align="center"
          title="Commonly used TCP Timestamp Clock intervals">
          <ttcol align="right">tick interval</ttcol>
          <ttcol align="right">tick frequency</ttcol>
          <ttcol align="center">encoding at highest precision</ttcol>
          <ttcol align="center">encoding at lowest precision</ttcol>
          <c>16 s </c>  <c>0.06 Hz </c> <c>ADJ=31, INT=0x7FF</c> <c>ADJ=31, INT=0x7FF</c>
          <c>1 s </c>   <c> 1 Hz </c>   <c>ADJ=28, INT=0x400</c> <c>ADJ=31, INT=0x080</c>
          <c>0.5 s </c> <c> 2 Hz </c>   <c>ADJ=27, INT=0x400</c> <c>ADJ=31, INT=0x040</c>
          <c>100 ms</c>      <c>10 Hz </c>   <c>ADJ=24, INT=0x666</c> <c>ADJ=31, INT=0x00C</c>
          <c> 10 ms</c>      <c>100 Hz </c>  <c>ADJ=21, INT=0x51F</c> <c>ADJ=31, INT=0x001</c>
          <c>  4 ms</c>      <c>250 Hz </c>  <c>ADJ=20, INT=0x419</c> <c>ADJ=30, INT=0x001</c>
          <c>  1 ms</c>      <c> 1 kHz</c>        <c>ADJ=18, INT=0x418</c> <c>ADJ=28, INT=0x001</c>
          <c>200 us</c>      <c> 5 kHz</c>        <c>ADJ=15, INT=0x68E</c> <c>ADJ=25, INT=0x001</c>
          <c> 50 us</c>      <c>20 kHz</c>        <c>ADJ=13, INT=0x68E</c> <c>ADJ=23, INT=0x001</c>
          <c>  1 us</c>      <c> 1 MHz</c>        <c> ADJ=8, INT=0x432</c>  <c>ADJ=18, INT=0x001</c>
          <c> 60 ns</c>      <c>16.7 MHz</c>      <c> ADJ=4, INT=0x407</c>  <c>ADJ=14, INT=0x001</c>
        </texttable>
        <t>The timestamp clock a host is using need not
          run synchronously with the internal TCP clock. Different clock
          sources, such as an NTP stratum, RTC, CPU cycle counters, or other
          independent clocks can be used to derive the TSval. This allows the
          decoupling of the coarse-grained TCP clock used for retransmission
          and delayed ACK timeouts from the clock intervals indicated in
          the TSval itself.
        </t>
        <t><xref target="RFC1323"/> only defines a single use of the timestamp
          for timing purposes. Using the TCP timer directly as timestamp clock
          source is useful for RTT measurement and calculation of the RTO,
          because it minimizes subsequent RTT conversions.
        </t>
        <t>Most stacks will not support dynamically adjusted timestamp clock
          intervals (e.g. for local, sub-sea, or satellite links). Therefore, the
          indicated clock duration can be a static, compile-time value. To
          use the indicated clock interval, for example to perform one-way
          delay variation calculations, simple integer operations can be
          used after an initial conversion of the wire representation to
          longer (i.e. 32 or 64 bit) integer values (see
          <xref target="owd"/> for some details).
        </t>
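        <t>As an illustration of such integer operations, the sketch below
          converts the wire fields once into an integer tick duration in
          units of 2^-38 sec and then scales TSval differences to nanoseconds.
          The function names are hypothetical. Python integers are unbounded;
          an implementation using fixed-width integers would need a wider
          intermediate or a coarser unit, since a 32-bit tick count
          multiplied by a 42-bit duration does not fit into 64 bits.
        </t>
        <figure><artwork><![CDATA[
```python
def tick_units(adj, int_field):
    """Tick duration as an integer count of 2**-38 s units (at most 42 bits)."""
    return int_field << adj


def elapsed_ns(tsval_old, tsval_new, adj, int_field):
    """Sender time between two TSvals in nanoseconds, integers only."""
    ticks = (tsval_new - tsval_old) & 0xFFFFFFFF   # 32-bit wraparound
    return (ticks * tick_units(adj, int_field) * 1_000_000_000) >> 38


def owd_variation_ns(send_delta_ns, recv_delta_ns):
    """Change in one-way delay: receiver spacing minus sender spacing."""
    return recv_delta_ns - send_delta_ns


# Two segments sent 10 ticks apart on a clock indicated as ADJ=18, INT=0x418,
# arriving 10.2 ms apart according to the receiver's own clock.
sent = elapsed_ns(5000, 5010, 18, 0x418)
print(owd_variation_ns(sent, 10_200_000))
```
]]></artwork></figure>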
    -->
