One document matched: draft-fairhurst-tcpm-newcwv-04.xml


<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<rfc category="std" docName="draft-fairhurst-tcpm-newcwv-04" ipr="trust200902"
     obsoletes="2861" updates="5681">
  <!-- Updates SDP -->

  <?rfc toc="yes"?>

  <!-- generate a table of contents -->

  <?rfc symrefs="yes"?>

  <!-- use anchors instead of numbers for references -->

  <?rfc sortrefs="yes" ?>

  <!-- alphabetize the references -->

  <?rfc compact="yes" ?>

  <!-- conserve vertical whitespace -->

  <?rfc subcompact="no" ?>

  <!-- but keep a blank line between list items -->

  <front>
    <title abbrev="new-CWV">Updating TCP to support Application-Limited
    Traffic</title>

    <author fullname="Godred Fairhurst" initials="G." surname="Fairhurst">
      <organization>University of Aberdeen</organization>

      <address>
        <postal>
          <street>School of Engineering</street>

          <street>Fraser Noble Building</street>

          <city>Aberdeen</city>

          <region>Scotland</region>

          <code>AB24 3UE</code>

          <country>UK</country>
        </postal>

        <email>gorry@erg.abdn.ac.uk</email>

        <uri>http://www.erg.abdn.ac.uk</uri>
      </address>
    </author>

    <author fullname="Arjuna Sathiaseelan" initials="A."
            surname="Sathiaseelan">
      <organization>University of Aberdeen</organization>

      <address>
        <postal>
          <street>School of Engineering</street>

          <street>Fraser Noble Building</street>

          <city>Aberdeen</city>

          <region>Scotland</region>

          <code>AB24 3UE</code>

          <country>UK</country>
        </postal>

        <email>arjuna@erg.abdn.ac.uk</email>

        <uri>http://www.erg.abdn.ac.uk</uri>
      </address>
    </author>

    <date day="06" month="August" year="2012" />

    <area>Transport</area>

    <workgroup>TCPM Working Group</workgroup>

    <keyword>CWV</keyword>

    <keyword>TCP</keyword>

    <abstract>
      <t>This document addresses issues that arise when TCP is used to support
      traffic that exhibits periods where the transmission rate is limited by
      the application rather than the congestion window. It updates TCP to
      allow a TCP sender to restart quickly following either an idle or
      application-limited interval. The method is expected to benefit
      application-limited TCP applications, while also providing an
      appropriate response if congestion is experienced.</t>

      <t>It also evaluates TCP Congestion Window Validation, CWV, an IETF
      experimental specification defined in RFC 2861, and concludes that CWV
      sought to address important issues, but failed to deliver a widely used
      solution. This document therefore proposes an update to the status of
      RFC 2861 by recommending it is moved from Experimental to Historic
      status, and that it is replaced by the current specification.</t>
    </abstract>
  </front>

  <middle>
    <!-- text starts here -->

    <section title="Introduction" toc="include">
      <t>TCP is used to support a range of application behaviours. The TCP
      congestion window (cwnd) controls the number of packets/bytes that a TCP
      flow may have in the network at any time. A bulk application will always
      have data available to transmit. The rate at which it sends is therefore
      limited by the maximum permitted by the receiver and congestion windows.
      In contrast, a rate-limited application will experience periods when the
      sender is either idle or is unable to send at the maximum rate permitted
      by the cwnd. This latter case is called application-limited. The focus
      of this document is on the operation of TCP in such an idle or
      application-limited case.</t>

      <t>Standard TCP <xref target="RFC5681"></xref> requires the cwnd to be
      reset to the restart window (RW) when an application becomes idle. <xref
      target="RFC2861"></xref> noted that this TCP behaviour was not always
      observed in current implementations. Recent experiments <xref
      target="Bis08"></xref> confirm this to still be the case.</t>

      <t>Standard TCP does not control growth of the cwnd when a TCP sender is
      application-limited. An application-limited sender may therefore grow a
      cwnd beyond that corresponding to the current transmit rate, resulting
      in a value that does not reflect current information about the state of
      the network path the flow is using. Use of such an invalid cwnd may
      result in reduced application performance and/or could significantly
      contribute to network congestion.</t>

      <t><xref target="RFC2861"></xref> proposed a solution to these issues in
      an experimental method known as Congestion Window Validation (CWV). CWV
      was intended to help reduce cases where TCP accumulated an invalid cwnd.
      The use and drawbacks of using CWV with an application are discussed in
      Section 2.</t>

      <t>Section 4 specifies an alternative to CWV that seeks to address the
      same issues, but does this in a way that is expected to mitigate the
      impact on an application that varies its transmission rate. The method
      described applies to both an application-limited and an idle
      condition.</t>
    </section>

    <section title="Reviewing experience with TCP-CWV">
      <t>RFC 2861 described a simple modification to the TCP congestion
      control algorithm that decayed the cwnd after the transition to a
      “sufficiently-long” idle period. This used the slow-start
      threshold (ssthresh) to save information about the previous value of the
      congestion window. The approach relaxed the standard TCP behaviour <xref
      target="RFC5681"></xref> for an idle session, intended to improve
      application performance. CWV also modified the behaviour for an
      application-limited session where a sender transmitted at a rate less
      than allowed by cwnd.</t>

      <t>RFC 2861 has been implemented in some mainstream operating systems as
      the default behaviour <xref target="Bis08"></xref>. Analysis (e.g. <xref
      target="Bis10"></xref>) has shown that a TCP sender using CWV is able to
      use available capacity on a shared path after an idle period. This can
      benefit some applications, especially over long delay paths, when
      compared to slow-start restart specified by standard TCP. However, CWV
      would only benefit an application if the idle period were less than
      several Retransmission Time Out (RTO) intervals <xref
      target="RFC6298"></xref>, since the behaviour would otherwise be the
      same as for standard TCP, which resets the cwnd to the RW after this
      period.</t>

      <t>Experience with CWV suggests that although CWV benefits the network
      in an application-limited scenario (reducing the probability of network
      congestion), the behaviour can be too conservative for many common
      rate-limited applications. This mechanism does not therefore offer the
      desirable increase in application performance for rate-limited
      applications and it is unclear whether applications actually use this
      mechanism in the general Internet.</t>

      <t>It is therefore concluded that CWV is often a poor solution for many
      rate-limited applications. It has the correct motivation, but has the
      wrong approach to solving this problem.</t>
    </section>

    <section title="Terminology" toc="include">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"></xref>.</t>

      <t>The document assumes familiarity with the terminology of TCP
      congestion control <xref target="RFC5681"></xref>.</t>
    </section>

    <section title="An updated TCP response to idle and application-limited periods  "
             toc="include">
      <t>This section proposes an update to the TCP congestion control
      behaviour during an idle or application-limited period. The new method
      permits a TCP sender to preserve the cwnd when an application becomes
      idle for a period of time (set in this specification to 5 minutes, see
      section 5). This period, where actual usage is less than allowed by
      cwnd, is named the non-validated phase. The method allows an application
      to resume transmission at a previous rate without incurring the delay of
      slow-start. However, if the TCP sender experiences congestion using the
      preserved cwnd, it is required to immediately reset the cwnd to an
      appropriate value specified by the method. If a sender does not take
      advantage of the preserved cwnd within five minutes, the value of cwnd
      is reduced, ensuring the value then reflects the capacity that was
      recently actually used.</t>

      <t>The method requires that the TCP SACK option is enabled. This allows
      the sender to select a cwnd following a congestion event that is based
      on the measured path capacity, better reflecting the fair-share. A
      similar approach was proposed by TCP Jump Start <xref
      target="Liu07"></xref>, as a congestion response after more rapid
      opening of a TCP connection.</t>

      <t>It is expected that this update will satisfy the requirements of many
      rate-limited applications and at the same time provide an appropriate
      method for use in the Internet. It also reduces the incentive for an
      application to send data simply to keep transport congestion state.
      (This is sometimes known as "padding").</t>

      <t>The new method does not differentiate between times when the sender
      has become idle or application-limited. This is partly a response to
      recognition that some applications wish to transmit at a rate-limited,
      and that it can be hard to make a distinction between
      application-limited and idle behaviour. This is expected to encourage
      applications and TCP stacks to use standards-based congestion control
      methods. It may also encourage the use of long-lived connections where
      this offers benefit (such as persistent http).</t>

      <t>The method is specified in following subsections.</t>

      <section title="A method for preserving cwnd in idle and application-limited periods."
               toc="include">
        <t>The method described in this document updates <xref
        target="RFC5681"></xref>. Use of the method REQUIRES a TCP sender and
        the corresponding receiver to enable the TCP SACK option <xref
        target="RFC3517"></xref>.</t>

        <t><xref target="RFC5681"></xref> defines a variable FlightSize , that
        indicates the amount of outstanding data in the network. This equal to
        the value of Pipe calculated based on the pipe algorithm <xref
        target="RFC3517"></xref>. In RFC5681 this value is used during loss
        recovery, whereas in this method it is also used during normal data
        transfer. A sender is not required to continuously track this value,
        but SHOULD measure the volume of data in the network with a sampling
        period of not less than one RTT period.</t>
      </section>

      <section title="The nonvalidated phase" toc="include">
        <t>The updated method creates a new TCP sender phase that captures
        whether the cwnd reflects a validated or non-validated value. The
        phases are defined as:</t>

        <t><list style="symbols">
            <t>Validated phase: FlightSize >=(3/4)*cwnd. This is the normal
            phase, where cwnd is expected to be an approximate indication of
            the available capacity currently available along the network path,
            and the standard methods are used to increase cwnd (currently
            <xref target="RFC5681"></xref>).</t>

            <t>Non-validated phase: FlightSize <(1/4)*cwnd. This is the
            phase where the cwnd has a value based on a previous measurement
            of the available capacity, and the usage of this capacity has not
            been validated in the previous RTT. That is, when it is not known
            whether the cwnd reflects the currently available capacity
            available along the network path. The mechanisms to be used in
            this phase seek to determine whether any resumed rate remains safe
            for the Internet path, i.e., it quickly reduces the rate if the
            flow is known to induce congestion. These mechanisms are specified
            in section 4.3.</t>
          </list>The values 1/4 and 3/4 were selected to reduce the effects of
        variations in the measured FlightSize.</t>
      </section>

      <section title="TCP congestion control during the nonvalidated phase"
               toc="include">
        <t>A TCP sender that enters the non-validated phase MUST preserve the
        cwnd (i.e., this neither grows nor reduces while the sender remains in
        this phase). The phase is concluded after a fixed period of time (five
        minutes, as explained in section or when the sender transmits using
        the full cwnd (i.e. it is no longer application-limited).</t>

        <t>The behaviour in the non-validated phase is specified as:</t>

        <t><list style="symbols">
            <t>If the sender consumes all the available space within the cwnd
            (i.e., the remaining unused cwnd in bytes is less than one Sender
            Maximum Segment Size, SMSS), then the sender MUST exit the
            non-validated phase. The threshold value of cwnd required for the
            sender to enter the non-validated phase is intentionally different
            to that required to leave the phase. This introduces hysteresis to
            avoid rapid oscillation between the phases. Note that a change
            between phases does not significantly impact an
            application-limited sender, but serves to determine its behaviour
            if it substantially increases its transmission rate.</t>

            <t>If the sender receives an indication of congestion while in the
            non-validated phase (i.e. detects loss, or an Explicit Congestion
            Notification, ECN, mark <xref target="RFC3168"></xref>), the
            sender MUST exit the non-validated phase (reducing the cwnd as
            defined in section 4.3.1).</t>

            <t>If the Retransmission Time Out (RTO) expires while in the
            non-validated phase, the sender MUST exit the non-validated phase.
            It then resumes using the Standard TCP RTO mechanism <xref
            target="RFC5681"></xref>. (The resulting reduction of cwnd
            describe din section 4.3.2 is appropriate, since any accumulated
            path history is considered unreliable).</t>
          </list></t>

        <section title="Response to congestion in the nonvalidated phase"
                 toc="include">
          <t>Reception of congestion feedback while in the non-validated phase
          is interpreted as an indication that it was inappropriate for the
          sender to use the preserved cwnd. The sender is therefore required
          to quickly reduce the rate to avoid further congestion. Since the
          cwnd does not have a validated value, a new cwnd value must be
          selected based on the utilised rate.</t>

          <t>A sender that detects a packet-drop or receives an ECN marked
          packet MUST calculate a safe cwnd, by setting it to the value
          specified in Section 3.2 of <xref target="RFC5681"></xref>.</t>

          <t>At the end of the recovery phase, the TCP sender MUST reset the
          cwnd using the method below:</t>

          <figure>
            <artwork><![CDATA[        cwnd = ((FlightSize - R)/2).]]></artwork>
          </figure>

          <t></t>

          <t>Where, R is the volume of data that was reported as
          unacknowledged by the SACK information. This follows the method
          proposed for Jump Start [<xref target="Liu07"></xref>.</t>

          <t>The inclusion of the term R makes this adjustment is more
          conservative than standard TCP. This is required, since the sender
          may have sent more segments than Standard TCP would have done.</t>

          <t>If the sender implements a method that allows it to identify the
          number of ECN-marked segments within a windowthat were observed by
          the receiver, the sender SHOULD use the method above, further
          reducing R by the number of marked segments.</t>
        </section>

        <section title="Adjustment at the end of the nonvalidated phase  "
                 toc="default">
          <t>During the non-validated phase, the sender may produce bursts of
          data of up to the cwnd in size. While this is no different to
          standard TCP, it is desirable to control the maximum burst size,
          e.g. by setting a burst size limit, using a pacing algorithm, or
          some other method <xref target="Hug01"></xref>.</t>

          <t>An application that remains in the non-validated phase for a
          period greater than five minutes is required to adjust its
          congestion control state. At the end of the non-validated phase, the
          sender MUST update the ssthresh:</t>

          <figure>
            <artwork><![CDATA[        sthresh = max(ssthresh, 3*cwnd/4).
]]></artwork>
          </figure>

          <t></t>

          <t>(This adjustment of ssthresh ensures that the sender records that
          it has safely sustained the present rate. The change is beneficial
          to application-limited flows that encounter occasional congestion,
          and could otherwise suffer an unwanted additional delay in
          recovering the transmission rate.)</t>

          <t>The sender MUST then update cwnd:</t>

          <figure>
            <artwork><![CDATA[          cwnd = max(FlightSize*2, IW).

]]></artwork>
          </figure>

          <t>Where IW is the TCP inital window <xref
          target="RFC5681"></xref>.</t>

          <t>(This allows an application to continue to send at the currently
          utilised rate, and not incur delay should it increase to twice the
          utilised rate.)</t>

          <t>After completing this adjustment, the sender MAY re-enter the
          non-validated phase, if required (see section 4.2).</t>
        </section>
      </section>
    </section>

    <section title="Determining a safe period to preserve cwnd" toc="include">
      <t>This section documents the rationale for selecting the maximum period
      that cwnd may be preserved.</t>

      <t>Preserving cwnd avoids undesirable side effects that would result if
      the cwnd were to be preserved for an arbitrary long period, which was a
      part of the problem that CWV originally attempted to address. The period
      a sender may safely preserve the cwnd, is a function of the period that
      a network path is expected to sustain the capacity reflected by cwnd.
      There is no ideal choice for this time.</t>

      <t>The period of five minutes was chosen as a compromise that was larger
      than the idle intervals of common applications, but not sufficiently
      larger than the period for which the capacity of an Internet path may
      commonly be regarded as stable. The capacity of wired networks is
      usually relatively stable for periods of several minutes and that load
      stability increases with the capacity. This suggests that cwnd may be
      preserved for at least a few minutes.</t>

      <t>There are cases where the TCP throughput exhibits significant
      variability over a time less than five minutes. Examples could include
      wireless topologies, where TCP rate variations may fluctuate on the
      order of a few seconds as a consequence of medium access protocol
      instabilities. Mobility changes may also impact TCP performance over
      short time scales. Senders that observe such rapid changes in the path
      characteristic may also experience increased congestion with the new
      method, however such variation would likely also impact TCP’s
      behaviour when supporting interactive and bulk applications.</t>

      <t>Routing algorithms may modify the network path, disrupting the RTT
      measurement and changing the capacity available to a TCP connection,
      however such changes do not often occur within a time frame of a few
      minutes.</t>

      <t>The value of five minutes is therefore expected to be sufficient for
      most current applications. Simulation studies also suggest that for many
      practical applications, the performance using this value will not be
      significantly different to that observed using a non-standard method
      that does not reset the cwnd after idle.</t>

      <t>Finally, other TCP sender mechanisms have used a 5 minute timer, and
      there could be simplifications in some implementations by reusing the
      same interval. TCP defines a default user timeout of 5 minutes <xref
      target="RFC0793"></xref> i.e. how long transmitted data may remain
      unacknowledged before a connection is forcefully closed.</t>
    </section>

    <section title="Security Considerations" toc="include">
      <t>General security considerations concerning TCP congestion control are
      discussed in <xref target="RFC5681"></xref>. This document describes an
      algorithm that updates one aspect of the congestion control procedures,
      and so the considerations described in RFC 5681 also apply to this
      algorithm.</t>
    </section>

    <section title="IANA Considerations" toc="include">
      <t>There are no IANA considerations.</t>
    </section>

    <section title="Acknowledgments">
      <t>The authors acknowledge the contributions of Dr I Biswas and Dr R
      Secchi in supporting the evaluation of CWV and for their help in
      developing the mechanisms proposed in this draft. We also acknowledge
      comments received from the Internet Congestion Control Research Group,
      in particular Yuchung Cheng, Mirja Kuehlewind, and Joe Touch.</t>

      <t><!----></t>
    </section>

    <section title="Other related work - Author Notes">
      <t>There are several issues to be discussed more widely:</t>

      <t><list style="empty">
          <t>• Should the method explicitly state a procedure for
          limiting burstiness or pacing?</t>

          <t><list style="empty">
              <t>This is often regarded as good practice, but isn't a formal
              part of TCP. draft-hughes-restart-00.txt provides some
              discussion of this topic.</t>
            </list></t>

          <t>• There are potential interaction with the proposal to raise
          the TCP initial Window to ten segments, do these cases need to be
          elaborated?</t>

          <t><list style="empty">
              <t>This relates to draft-ietf-tcpm-initcwnd.</t>

              <t>The two methods have different functions and different
              response to loss/congestion.</t>

              <t>IW=10 proposes an experimental update to TCP that would allow
              faster opening of the cwnd, and also a large (same size) restart
              window. This approach is based on the assumption that many
              forward paths can sustain bursts of up to ten segments without
              (appreciable) loss. Such a significant increase in cwnd must be
              matched with an equally large reduction of cwnd if
              loss/congestion is detected, and such a congestion indication is
              likely to require future use of IW=10 to be disabled for this
              path for some time. This guards against the unwanted behaviour
              of a series of short flows continuously flooding a network path
              without network congestion feedback.</t>

              <t>In contrast, new-CWV proposes a standards-track update with a
              rationale that relies on recent previous path history to select
              an appropriate cwnd after restart.</t>

              <t>The behaviour differs in three ways:</t>

              <t>1) For applications that send little initially, new-cwv may
              constrain more than IW=10, but would not require the connection
              to reset any path information when a restart incurred loss. In
              contrast, new-cwv would allow the TCP connection to preserve the
              cached cwnd, any loss, would impact cwnd, but not impact other
              flows.</t>

              <t>2) For applications that utilise more capacity than provided
              by a cwnd=10, this method would permit a larger restart window
              compared to a restart using IW=10. This is justified by the
              recent path history.</t>

              <t>3) new-CWV is attended to also be used for
              application-limited use, where the application sends, but does
              not seek to fully utilise the cwnd. In this case, new-cwv
              constrains the cwnd to that justified by the recent path
              history. The performance trade-offs are hence different, and it
              would be possible to enable new-cwv when also using IW=10, and
              yield the benefits of this.</t>
            </list></t>

          <t>• There is potential overlap with the Laminar proposal
          (draft-mathis-tcpm-tcp-laminar)</t>

          <t><list style="empty">
              <t>The current draft was intended as a standards-track update to
              TCP, rather than a new transport variant. At least, it would be
              good to understand how the two interact and whether there is a
              possibility of a single method.</t>
            </list></t>
        </list></t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc sortrefs="yes"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.0793.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2861.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.6298.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.3168.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.3517.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.5681.xml"?>
    </references>

    <references title="Informative References">
      <reference anchor="Bis08">
        <front>
          <title>A Practical Evaluation of Congestion Window Validation
          Behaviour, 9th Annual Postgraduate Symposium in the Convergence of
          Telecommunications, Networking and Broadcasting (PGNet), Liverpool,
          UK</title>

          <author fullname="I." surname="Biswas">
            <organization></organization>
          </author>

          <author fullname="G." surname="Fairhurst">
            <organization></organization>
          </author>

          <date day="01" month="June" year="2008" />
        </front>
      </reference>

      <reference anchor="Liu07">
        <front>
          <title>Congestion Control without a Startup Phase, 5th International
          Workshop on Protocols for Fast Long-Distance Networks (PFLDnet), Los
          Angeles, California, USA</title>

          <author fullname="D" surname="Liu">
            <organization></organization>
          </author>

          <author fullname="M." surname="Allman">
            <organization></organization>
          </author>

          <author fullname="S" surname="Jiny">
            <organization></organization>
          </author>

          <author fullname="L." surname="Wang">
            <organization></organization>
          </author>

          <date day="01" month="February" year="2007" />
        </front>
      </reference>

      <reference anchor="Bis10">
        <front>
          <title>Analysing TCP for Bursty Traffic, Int'l J. of Communications,
          Network and System Sciences, 7(3)</title>

          <author fullname="I." surname="Biswas">
            <organization></organization>
          </author>

          <author fullname="A." surname="Sathiaseelan">
            <organization></organization>
          </author>

          <author fullname="R." surname="Secchi">
            <organization></organization>
          </author>

          <author fullname="G." surname="Fairhurst">
            <organization></organization>
          </author>

          <date day="01" month="June" year="2010" />
        </front>
      </reference>

      <reference anchor="Hug01">
        <front>
          <title>Issues in TCP Slow-Start Restart After Idle
          (Work-in-Progress)</title>

          <author fullname="A.S." surname="Hughes">
            <organization></organization>
          </author>

          <author fullname="J." surname="Touch">
            <organization></organization>
          </author>

          <author fullname="J." surname="Heidemann">
            <organization></organization>
          </author>

          <date day="01" month="December" year="2001" />
        </front>
      </reference>
    </references>
  </back>
</rfc>

PAFTECH AB 2003-20262026-04-23 16:29:56