<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- Some of the more generally applicable PIs that most I-Ds might want to use -->
<!-- Try to enforce the ID-nits conventions and DTD validity -->
<?rfc strict="yes" ?>
<!-- Items used when reviewing the document -->
<?rfc comments="no" ?>
<!-- Controls display of <cref> elements -->
<?rfc inline="no" ?>
<!-- When no, put comments at end in comments section,
                                 otherwise, put inline -->
<?rfc editing="no" ?>
<!-- When yes, insert editing marks: editing marks consist of a 
                                 string such as <29> printed in the blank line at the 
                                 beginning of each paragraph of text. -->
<!-- Create Table of Contents (ToC) and set some options for it.  
         Note the ToC may be omitted for very short documents,but idnits insists on a ToC 
         if the document has more than 15 pages. -->
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<!-- If "yes" eliminates blank lines before main section entries. -->
<?rfc tocdepth="3"?>
<!-- Sets the number of levels of sections/subsections... in ToC -->
<!-- Choose the options for the references. 
         Some like symbolic tags in the references (and citations) and others prefer 
         numbers. The RFC Editor always uses symbolic tags.
         The tags used are the anchor attributes of the references. -->
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<!-- If "yes", causes the references to be sorted in order of tags.
                                 This doesn't have any effect unless symrefs is "yes" also. -->
<!-- These two save paper: Just setting compact to "yes" makes savings by not starting each 
         main section on a new page but does not omit the blank lines between list items. 
         If subcompact is also "yes" the blank lines between list items are also omitted. -->
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<!-- end of list of popular I-D processing instructions -->
<!-- end of list of processing instructions -->
<rfc category="bcp" docName="draft-ietf-aqm-recommendation-01"
     ipr="trust200902" obsoletes="2309">
  <front>
    <title abbrev="Active Queue Management Recommendations">IETF
    Recommendations Regarding Active Queue Management</title>

    <author fullname="Fred Baker" initials="F." role="editor" surname="Baker">
      <organization>Cisco Systems</organization>

      <address>
        <postal>
          <street></street>

          <city>Santa Barbara</city>

          <code>93117</code>

          <region>California</region>

          <country>USA</country>
        </postal>

        <email>fred@cisco.com</email>
      </address>
    </author>

    <author fullname="Godred Fairhurst" initials="G." role="editor"
            surname="Fairhurst">
      <organization>University of Aberdeen</organization>

      <address>
        <postal>
          <street>School of Engineering</street>

          <street>Fraser Noble Building</street>

          <city>Aberdeen</city>

          <region>Scotland</region>

          <code>AB24 3UE</code>

          <country>UK</country>
        </postal>

        <email>gorry@erg.abdn.ac.uk</email>

        <uri>http://www.erg.abdn.ac.uk</uri>
      </address>
    </author>

    <date day="30" month="January" year="2014" />

    <area>Internet Engineering Task Force</area>

    <workgroup></workgroup>

    <abstract>
      <t>This memo presents recommendations to the Internet community
      concerning measures to improve and preserve Internet performance. It
      presents a strong recommendation for testing, standardization, and
      widespread deployment of active queue management (AQM) in network
      devices, to improve the performance of today's Internet. It also urges a
      concerted effort of research, measurement, and ultimate deployment of
      AQM mechanisms to protect the Internet from flows that are not
      sufficiently responsive to congestion notification.</t>

      <t>The note largely repeats the recommendations of RFC 2309, updated
      after fifteen years of experience and new research.</t>
    </abstract>

    <!--
    <note title="Note to IESG">
      <t></t>
      <t></t>
    </note>
  -->

    <!--
      <texttable anchor="table_example" title="A Very Simple Table">
      <preamble>Tables use ttcol to define column headers and widths.
      Every cell then has a "c" element for its content.</preamble>
          <ttcol align="center">ttcol #1</ttcol>
                                    <ttcol align="center">ttcol #2</ttcol>
                      <c>c #1</c>		<c>c #2</c>
                      <c>c #3</c>		<c>c #4</c>
                      <c>c #5</c>		<c>c #6</c>
      <postamble>which is a very simple example.</postamble>
      </texttable>
    -->
  </front>

  <middle>
    <!--		
      <t>There are multiple list styles: "symbols", "letters", "numbers",
"hanging", "format", etc.</t>
      <t>
	<list style="symbols">
	    <t>First bullet</t>
	    <t>Second bullet</t>
	</list>
     </t>
-->

    <!--
<figure anchor="reference" title="Figure">
<artwork align="center">
<![CDATA[
	ASCII artwork goes here... 
]]>
</artwork>
</figure>
-->

    <section title="Introduction">
      <t>The Internet protocol architecture is based on a connectionless
      end-to-end packet service using the Internet Protocol, whether <xref
      target="RFC0791">IPv4</xref> or <xref target="RFC2460">IPv6</xref>. The
      advantages of its connectionless design, flexibility and robustness,
      have been amply demonstrated. However, these advantages are not without
      cost: careful design is required to provide good service under heavy
      load. In fact, lack of attention to the dynamics of packet forwarding
      can result in severe service degradation or "Internet meltdown". This
      phenomenon was first observed during the early growth phase of the
      Internet in the mid-1980s <xref target="RFC0896"></xref> <xref
      target="RFC0970"></xref>, and is technically called "congestive
      collapse".</t>

      <t>The original fix for Internet meltdown was provided by Van Jacobson.
      Beginning in 1986, Jacobson developed the congestion avoidance
      mechanisms that are now required in TCP implementations <xref
      target="Jacobson88"></xref> <xref target="RFC1122"></xref>. These
      mechanisms operate in Internet hosts to cause TCP connections to "back
      off" during congestion. We say that TCP flows are "responsive" to
      congestion signals (i.e., marked or dropped packets) from the network.
      It is primarily these TCP congestion avoidance algorithms that prevent
      the congestive collapse of today's Internet.</t>

      <t>However, that is not the end of the story. Considerable research has
      been done on Internet dynamics since 1988, and the Internet has grown.
      It has become clear that the <xref target="RFC5681">TCP congestion
      avoidance mechanisms</xref>, while necessary and powerful, are not
      sufficient to provide good service in all circumstances. Basically,
      there is a limit to how much control can be accomplished from the edges
      of the network. Some mechanisms are needed in the network devices to
      complement the endpoint congestion avoidance mechanisms. These
      mechanisms may be implemented in network devices that include routers,
      switches, and other network middleboxes.</t>

      <t>It is useful to distinguish between two classes of algorithms related
      to congestion control: "queue management" versus "scheduling"
      algorithms. To a rough approximation, queue management algorithms manage
      the length of packet queues by marking or dropping packets when
      necessary or appropriate, while scheduling algorithms determine which
      packet to send next and are used primarily to manage the allocation of
      bandwidth among flows. While these two classes of mechanism are closely
      related, they address different performance issues.</t>

      <t>This memo highlights two performance issues:</t>

      <t>The first issue is the need for an advanced form of queue management
      that we call "active queue management." <xref target="Section2"></xref>
      summarizes the benefits that active queue management can bring. A number
      of Active Queue Management (AQM) procedures are described in the
      literature, with different characteristics. This document does not
      recommend any of them in particular, but does make recommendations that
      ideally would affect the choice of procedure used in a given
      implementation.</t>

      <t>The second issue, discussed in <xref target="Section4"></xref> of
      this memo, is the potential for future congestive collapse of the
      Internet due to flows that are unresponsive, or not sufficiently
      responsive, to congestion indications. Unfortunately, there is no
      consensus solution to controlling congestion caused by such aggressive
      flows; significant research and engineering will be required before any
      solution will be available. It is imperative that this work be
      energetically pursued, to ensure the future stability of the
      Internet.</t>

      <t><xref target="Section5"></xref> concludes the memo with a set of
      recommendations to the Internet community concerning these topics.</t>

      <t>The discussion in this memo applies to "best-effort" traffic, which
      is to say, traffic generated by applications that accept the occasional
      loss, duplication, or reordering of traffic in flight. It also applies
      to other traffic, such as real-time traffic that can adapt its sending
      rate to reduce loss and/or delay. It is most effective for <xref
      target="RFC1633">elastic traffic</xref>, when the adaptation occurs on
      time scales of a single RTT or a small number of RTTs.</t>

      <t><xref target="RFC2309"></xref> resulted from past discussions of
      end-to-end performance, Internet congestion, and Random Early Discard
      (RED) in the End-to-End Research Group of the Internet Research Task
      Force (IRTF). This update results from experience with this and other
      algorithms, and the AQM discussion within the IETF <xref
      target="AQM-WG"></xref>.</t>

      <section title="Requirements Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119"></xref>.</t>
      </section>
    </section>

    <section anchor="Section2" title="The Need For Active Queue Management">
      <t>The traditional technique for managing the queue length in a network
      device is to set a maximum length (in terms of packets) for each queue,
      accept packets for the queue until the maximum length is reached, then
      reject (drop) subsequent incoming packets until the queue decreases
      because a packet from the queue has been transmitted. This technique is
      known as "tail drop", since the packet that arrived most recently (i.e.,
      the one on the tail of the queue) is dropped when the queue is full.
      This method has served the Internet well for years, but it has two
      important drawbacks:<list style="numbers">
          <t>Lock-Out <vspace blankLines="1" />In some situations tail drop
          allows a single connection or a few flows to monopolize queue space,
          preventing other connections from getting room in the queue. This
          "lock-out" phenomenon is often the result of synchronization or
          other timing effects.</t>

          <t>Full Queues <vspace blankLines="1" />The tail drop discipline
          allows queues to maintain a full (or, almost full) status for long
          periods of time, since tail drop signals congestion (via a packet
          drop) only when the queue has become full. It is important to reduce
          the steady-state queue size, and this is perhaps the most important
          goal for queue management. <vspace blankLines="1" />The naive
          assumption might be that there is a simple tradeoff between delay
          and throughput, and that the recommendation that queues be
          maintained in a "non-full" state essentially translates to a
          recommendation that low end-to-end delay is more important than high
          throughput. However, this does not take into account the critical
          role that packet bursts play in Internet performance. Even though
          TCP constrains the congestion window of a flow, packets often arrive
          at network devices in bursts <xref target="Leland94"></xref>. If the
          queue is full or almost full, an arriving burst will cause multiple
          packets to be dropped. This can result in a global synchronization
          of flows throttling back, followed by a sustained period of lowered
          link utilization, reducing overall throughput. <vspace
          blankLines="1" />The point of buffering in the network is to absorb
          data bursts and to transmit them during the (hopefully) ensuing
          bursts of silence. This is essential to permit the transmission of
          bursty data. Normally small queues are preferred in network devices,
          with sufficient queue capacity to absorb the bursts. The
          counter-intuitive result is that maintaining normally-small queues
          can result in higher throughput as well as lower end-to-end delay.
          In summary, queue limits should not reflect the steady state queues
          we want to be maintained in the network; instead, they should
          reflect the size of bursts that a network device needs to
          absorb.</t>
        </list></t>

      <t>Besides tail drop, two alternative queue disciplines that can be
      applied when a queue becomes full are "random drop on full" and "drop
      front on full". Under the random drop on full discipline, a network
      device drops a randomly selected packet from the queue (which can be an
      expensive operation, since it naively requires an O(N) walk through the
      packet queue) when the queue is full and a new packet arrives. Under the
      "drop front on full" discipline <xref target="Lakshman96"></xref>, the
      network device drops the packet at the front of the queue when the queue
      is full and a new packet arrives. Both of these solve the lock-out
      problem, but neither solves the full-queues problem described above.</t>
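
      <t>As an illustration only, the three "on full" disciplines described
      above can be sketched in a few lines of Python. The function name, the
      discipline labels, and the use of a plain list as the queue are
      assumptions made for this sketch; they do not describe any particular
      implementation.</t>

      <figure>
        <artwork><![CDATA[
```python
import random


def enqueue(queue, packet, limit, discipline="tail", rng=random):
    """Offer `packet` to a bounded FIFO held in the list `queue`.

    Returns the packet that was dropped, or None if nothing was dropped.
    The labels "tail", "front" and "random" are hypothetical names for
    tail drop, drop front on full and random drop on full.
    """
    if len(queue) < limit:
        queue.append(packet)
        return None
    if discipline == "tail":
        # Queue is full: reject the arriving packet.
        return packet
    if discipline == "front":
        # Queue is full: discard the oldest packet, admit the new one.
        dropped = queue.pop(0)
        queue.append(packet)
        return dropped
    if discipline == "random":
        # Queue is full: discard a randomly chosen queued packet.
        dropped = queue.pop(rng.randrange(len(queue)))
        queue.append(packet)
        return dropped
    raise ValueError("unknown discipline: %r" % (discipline,))
```
]]></artwork>
      </figure>

      <t>In every branch the queue still holds "limit" packets after the
      arrival has been handled, which is why none of these disciplines
      addresses the full-queues problem.</t>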

      <t>We know in general how to solve the full-queues problem for
      "responsive" flows, i.e., those flows that throttle back in response to
      congestion notification. In the current Internet, dropped packets
      provide a critical mechanism for notifying hosts of congestion. The
      solution to the full-queues problem is for network devices to
      drop packets before a queue becomes full, so that hosts can respond to
      congestion before buffers overflow. We call such a proactive approach
      AQM. By dropping packets before buffers overflow, AQM allows network
      devices to control when and how many packets to drop.</t>
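
      <t>A minimal sketch of dropping before the queue is full is shown
      below. It is loosely modeled on the RED idea of a drop probability that
      rises with a smoothed queue length, and it is deliberately simplified.
      The class name, the two thresholds, the maximum early-drop probability,
      and the averaging weight are all illustrative assumptions rather than
      recommended values, and this memo does not recommend any particular
      algorithm.</t>

      <figure>
        <artwork><![CDATA[
```python
import random


class EarlyDropQueue:
    """FIFO that starts dropping before it is full (simplified sketch).

    min_th, max_th, max_p and weight are assumed, illustrative parameters.
    """

    def __init__(self, limit, min_th, max_th, max_p=0.1, weight=0.2,
                 rng=None):
        self.limit = limit      # hard bound on the queue length
        self.min_th = min_th    # below this average, never drop early
        self.max_th = max_th    # at or above this average, always drop
        self.max_p = max_p      # early-drop probability just below max_th
        self.weight = weight    # gain of the moving average
        self.rng = rng or random.Random()
        self.packets = []
        self.avg = 0.0

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        span = self.max_th - self.min_th
        return self.max_p * (self.avg - self.min_th) / span

    def enqueue(self, packet):
        """Return True if the packet was queued, False if it was dropped."""
        # Update the moving average of the instantaneous queue length.
        self.avg += self.weight * (len(self.packets) - self.avg)
        if len(self.packets) >= self.limit:
            return False  # the hard limit still applies
        if self.rng.random() < self.drop_probability():
            return False  # early drop, before the queue is full
        self.packets.append(packet)
        return True

    def dequeue(self):
        return self.packets.pop(0) if self.packets else None
```
]]></artwork>
      </figure>

      <t>Because the decision uses an average rather than the instantaneous
      length, a short burst arriving at a mostly empty queue is admitted,
      while a persistently long queue causes drops well before the hard limit
      is reached.</t>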

      <t>In summary, an active queue management mechanism can provide the
      following advantages for responsive flows. <list style="numbers">
          <t>Reduce number of packets dropped in network devices <vspace
          blankLines="1" />Packet bursts are an unavoidable aspect of packet
          networks <xref target="Willinger95"></xref>. If all the queue space
          in a network device is already committed to "steady state" traffic
          or if the buffer space is inadequate, then the network device will
          have no ability to buffer bursts. By keeping the average queue size
          small, AQM will provide greater capacity to absorb
          naturally-occurring bursts without dropping packets. <vspace
          blankLines="1" />Furthermore, without AQM, more packets will be
          dropped when a queue does overflow. This is undesirable for several
          reasons. First, with a shared queue and the tail drop discipline,
          this can result in unnecessary global synchronization of flows,
          resulting in lowered average link utilization, and hence lowered
          network throughput. Second, unnecessary packet drops represent a
          possible waste of network capacity on the path before the drop
          point. <vspace blankLines="1" />While AQM can manage queue lengths
          and reduce end-to-end latency even in the absence of end-to-end
          congestion control, it will be able to reduce packet drops only in
          an environment that continues to be dominated by end-to-end
          congestion control.</t>

          <t>Provide a lower-delay interactive service <vspace
          blankLines="1" />By keeping a small average queue size, AQM will
          reduce the delays experienced by flows. This is particularly
          important for interactive applications such as short Web transfers,
          Telnet traffic, or interactive audio-video sessions, whose
          subjective (and objective) performance is better when the end-to-end
          delay is low.</t>

          <t>Avoid lock-out behavior <vspace blankLines="1" />AQM can prevent
          lock-out behavior by ensuring that there will almost always be a
          buffer available for an incoming packet. For the same reason, AQM
          can prevent a bias against low capacity, but highly bursty, flows.
          <vspace blankLines="1" />Lock-out is undesirable because it
          constitutes a gross unfairness among groups of flows. However, we
          stop short of calling this benefit "increased fairness", because
          general fairness among flows requires per-flow state, which is not
          provided by queue management. For example, in a network device using
          AQM with only FIFO scheduling, two TCP flows may receive very
          different shares of the network capacity simply because they have
          different round-trip times <xref target="Floyd91"></xref>, and a
          flow that does not use congestion control may receive more capacity
          than a flow that does. To achieve general fairness, a router may
          instead maintain per-flow state, using a per-flow scheduling
          algorithm such as Fair Queueing (FQ) <xref target="Demers90"></xref>
          or a class-based scheduling algorithm such as Class-Based Queueing
          (CBQ) <xref target="Floyd95"></xref>. <vspace blankLines="1" />In contrast, AQM
          is needed even for network devices that use per-flow scheduling
          algorithms such as FQ or class-based scheduling algorithms, such as
          CBQ. This is because per-flow scheduling algorithms by themselves do
          not control the overall queue size or the size of individual queues.
          AQM is needed to control the overall average queue sizes, so that
          arriving bursts can be accommodated without dropping packets. In
          addition, AQM should be used to control the queue size for each
          individual flow or class, so that they do not experience
          unnecessarily high delay. Therefore, AQM should be applied across
          the classes or flows as well as within each class or flow. <vspace
          blankLines="1" /></t>
        </list>In short, scheduling algorithms and queue management should be
      seen as complementary, not as replacements for each other.</t>
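
      <t>The division of labor can be sketched as follows, assuming a simple
      round-robin scheduler over per-flow FIFOs. The class and parameter
      names are hypothetical, and a plain length bound stands in for whatever
      AQM algorithm would manage each queue in practice.</t>

      <figure>
        <artwork><![CDATA[
```python
from collections import OrderedDict, deque


class RoundRobinWithLimits:
    """Per-flow FIFOs served round-robin, each with its own length bound."""

    def __init__(self, per_flow_limit, total_limit):
        self.per_flow_limit = per_flow_limit  # bound within each flow
        self.total_limit = total_limit        # bound across all flows
        self.queues = OrderedDict()           # flow id -> deque of packets
        self.total = 0

    def enqueue(self, flow, packet):
        """Queue management: decide whether the packet is admitted."""
        q = self.queues.setdefault(flow, deque())
        if len(q) >= self.per_flow_limit or self.total >= self.total_limit:
            return False
        q.append(packet)
        self.total += 1
        return True

    def dequeue(self):
        """Scheduling: decide which queued packet is sent next."""
        for flow in list(self.queues):
            q = self.queues[flow]
            # The flow just visited goes to the back of the service order.
            self.queues.move_to_end(flow)
            if q:
                self.total -= 1
                return flow, q.popleft()
        return None
```
]]></artwork>
      </figure>

      <t>The scheduler alone would let every per-flow queue grow without
      bound; the admission test in "enqueue" is what keeps both the
      individual queues and the overall occupancy small.</t>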

      <!--Proposed new text block :-->

      <t>It is also important to differentiate the choice of buffer size for a
      queue in a switch/router or other network device, and the threshold(s)
      and other parameters that determine how and when an AQM algorithm
      operates. On the one hand, the optimum buffer size is a function of
      operational requirements and should generally be large enough
      to buffer the largest normal traffic burst that is expected. This size
      depends on the amount and burstiness of traffic arriving at the queue
      and the rate at which traffic leaves the queue. Different types of
      traffic and deployment scenarios will lead to different requirements. On
      the other hand, the choice of AQM algorithm and associated parameters is
      a function of the way in which congestion is experienced and the
      required reaction to achieve acceptable performance. This latter topic
      is the primary topic of the following sections.</t>
    </section>

    <section anchor="Section4" title="Managing Aggressive Flows">
      <t>One of the keys to the success of the Internet has been the
      congestion avoidance mechanisms of TCP. Because TCP "backs off" during
      congestion, a large number of TCP connections can share a single,
      congested link in such a way that link bandwidth is shared reasonably
      equitably among similarly situated flows. The equitable sharing of
      bandwidth among flows depends on all flows running compatible congestion
      avoidance algorithms, i.e., methods conformant with the current TCP
      specification <xref target="RFC5681"></xref>.</t>

      <t>We call a flow "TCP-friendly" when it has a congestion response that
      approximates the average response expected of a TCP flow. One example
      method of a TCP-friendly scheme is the TCP-Friendly Rate Control
      algorithm <xref target="RFC5348"></xref>. In this document, the term is
      used more generally to describe this and other algorithms that meet
      these goals.</t>
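
      <t>To make the notion of "the average response expected of a TCP flow"
      concrete, the sketch below uses the widely quoted simplified
      steady-state model in which throughput is (MSS / RTT) * sqrt(3 / (2p))
      for a packet loss rate p. This is an assumption chosen for brevity: it
      ignores retransmission timeouts and is not the fuller throughput
      equation used by <xref target="RFC5348"></xref>. The function names and
      the "tolerance" parameter are hypothetical.</t>

      <figure>
        <artwork><![CDATA[
```python
import math


def tcp_friendly_rate(mss_bytes, rtt_seconds, loss_rate):
    """Approximate steady-state TCP throughput in bytes per second.

    Simplified model: rate = (MSS / RTT) * sqrt(3 / (2 * p)).
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return (mss_bytes / rtt_seconds) * math.sqrt(3.0 / (2.0 * loss_rate))


def is_tcp_friendly(measured_rate, mss_bytes, rtt_seconds, loss_rate,
                    tolerance=1.0):
    """True if a flow uses no more than `tolerance` times the estimate."""
    limit = tolerance * tcp_friendly_rate(mss_bytes, rtt_seconds, loss_rate)
    return measured_rate <= limit
```
]]></artwork>
      </figure>

      <t>Under this model, doubling the RTT halves the estimated rate, which
      is consistent with the earlier observation that two TCP flows sharing a
      FIFO queue can receive very different shares of capacity when their
      round-trip times differ.</t>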

      <t>It is convenient to divide flows into three classes: (1) TCP-friendly
      flows, (2) unresponsive flows, i.e., flows that do not slow down when
      congestion occurs, and (3) flows that are responsive but are not
      TCP-friendly. The last two classes contain more aggressive flows that
      pose significant threats to Internet performance, which we will now
      discuss. <list style="numbers">
          <t>TCP-Friendly flows <vspace blankLines="1" />A TCP-friendly flow
          responds to congestion notification within a small number of path
          Round Trip Times (RTT), and in steady-state it uses no more capacity
          than a conformant TCP running under comparable conditions (drop
          rate, RTT, MTU, etc.). This is described in the remainder of the
          document.</t>

          <t>Non-Responsive Flows <vspace blankLines="1" />The User Datagram
          Protocol (UDP) <xref target="RFC0768"></xref> provides a minimal,
          best-effort transport to applications and upper-layer protocols
          (both simply called "applications" in the remainder of this
          document) and does not itself provide mechanisms to prevent
          congestion collapse and establish a degree of fairness <xref
          target="RFC5405"></xref>. <vspace blankLines="1" />There is a
          growing set of UDP-based applications whose congestion avoidance
          algorithms are inadequate or nonexistent (i.e., whose flows do not
          throttle their sending rate when they experience congestion). Examples
          include some UDP streaming applications for packet voice and video,
          and some multicast bulk data transport. If no action is taken, such
          unresponsive flows could lead to a new congestive collapse <xref
          target="RFC2309"></xref>.<vspace blankLines="1" />In general,
          UDP-based applications need to incorporate effective congestion
          avoidance mechanisms <xref target="RFC5405"></xref>. Further
          research and development of ways to accomplish congestion avoidance
          for presently unresponsive applications continue to be
          important. Network devices need to be able to protect themselves
          against unresponsive flows, and mechanisms to accomplish this must
          be developed and deployed. Deployment of such mechanisms would
          provide an incentive for all applications to become responsive by
          either using a congestion-controlled transport (e.g. TCP, SCTP,
          DCCP) or by incorporating their own congestion control in the
          application <xref target="RFC5405"></xref>.</t>

          <t>Non-TCP-friendly Transport Protocols <vspace blankLines="1" />A
          second threat is posed by transport protocol implementations that
          are responsive to congestion, but, either deliberately or through
          faulty implementation, are not TCP-friendly. Such applications may
          gain an unfair share of the available network capacity. <vspace
          blankLines="1" />For example, the popularity of the Internet has
          caused a proliferation in the number of TCP implementations. Some of
          these may fail to implement the TCP congestion avoidance mechanisms
          correctly because of poor implementation. Others may deliberately be
          implemented with congestion avoidance algorithms that are more
          aggressive in their use of capacity than other TCP implementations;
          this would allow a vendor to claim to have a "faster TCP". The
          logical consequence of such implementations would be a spiral of
          increasingly aggressive TCP implementations, leading back to the
          point where there is effectively no congestion avoidance and the
          Internet is chronically congested. <vspace blankLines="1" />Another
          example could be an RTP/UDP video flow that uses an adaptive codec,
          but responds incompletely to indications of congestion or responds
          only over an excessively long time period. Such flows are
          unlikely to be responsive to congestion signals in a time frame
          comparable to a small number of end-to-end transmission delays.
          However, over a longer timescale, perhaps seconds in duration, they
          could moderate their speed, or increase their speed if they
          determine capacity to be available. <vspace
          blankLines="1" />Tunneled traffic aggregates carrying multiple
          (short) TCP flows can be more aggressive than standard bulk TCP.
          Applications (e.g. web browsers and peer-to-peer file-sharing) have
          exploited this by opening multiple connections to the same
          endpoint.</t>
        </list></t>

      <t>The projected increase in the fraction of total Internet traffic for
      more aggressive flows in classes 2 and 3 clearly poses a threat to
      future Internet stability. There is an urgent need for measurements of
      current conditions and for further research into the ways of managing
      such flows. This raises many difficult issues in identifying and
      isolating unresponsive or non-TCP-friendly flows at an acceptable
      overhead cost. Finally, there is as yet little measurement or simulation
      evidence available about the rate at which these threats are likely to
      be realized, or about the expected benefit of algorithms for managing
      such flows.</t>

      <t>Another topic requiring consideration is the appropriate granularity
      of a "flow" when considering a queue management method. There are a few
      "natural" answers: 1) a transport (e.g. TCP or UDP) flow (source
      address/port, destination address/port, DSCP); 2) a source/destination
      host pair (IP addresses, DSCP); 3) a given source host or a given
      destination host. We suggest that the source/destination host pair gives
      the most appropriate granularity in many circumstances. However, it is
      possible that different vendors/providers could set different
      granularities for defining a flow (as a way of "distinguishing"
      themselves from one another), or that different granularities could be
      chosen for different places in the network. It may be the case that the
      granularity is less important than the fact that a network device needs
      to be able to deal with more unresponsive flows at *some* granularity.
      The granularity of flows for congestion management is, at least in part,
      a question of policy that needs to be addressed in the wider IETF
      community.</t>
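
      <t>The candidate granularities listed above can be expressed as
      different ways of deriving a key from a packet's header fields. The
      sketch below is illustrative only; the "Packet" record, its field
      names, and the granularity labels are assumptions introduced for this
      example.</t>

      <figure>
        <artwork><![CDATA[
```python
from collections import namedtuple

# Hypothetical, minimal view of the header fields named in the text.
Packet = namedtuple("Packet", "src_addr src_port dst_addr dst_port dscp")


def flow_key(packet, granularity):
    """Map a packet to the key of the "flow" it belongs to."""
    if granularity == "transport":
        # 1) a transport flow: addresses, ports and DSCP
        return (packet.src_addr, packet.src_port,
                packet.dst_addr, packet.dst_port, packet.dscp)
    if granularity == "host_pair":
        # 2) a source/destination host pair: addresses and DSCP
        return (packet.src_addr, packet.dst_addr, packet.dscp)
    if granularity == "source":
        # 3a) everything sent by a given source host
        return (packet.src_addr,)
    if granularity == "destination":
        # 3b) everything sent to a given destination host
        return (packet.dst_addr,)
    raise ValueError("unknown granularity: %r" % (granularity,))
```
]]></artwork>
      </figure>

      <t>With the host-pair key, an application that opens several
      connections to the same endpoint is still accounted for as a single
      flow, whereas the transport key treats each connection separately.</t>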
    </section>

    <section anchor="Section5" title="Conclusions and Recommendations">
      <t>The IRTF, in publishing <xref target="RFC2309"></xref>, and the IETF,
      in subsequent discussion, have developed a set of specific
      recommendations regarding the implementation and operational use of AQM
      procedures. This document updates these to include: <list
          style="numbers">
          <t>Network devices SHOULD implement some AQM mechanism to manage
          queue lengths, reduce end-to-end latency, and avoid lock-out
          phenomena within the Internet.</t>

          <t>Deployed AQM algorithms SHOULD support Explicit Congestion
          Notification (ECN) as well as loss to signal congestion to
          endpoints.</t>

          <t>The algorithms that the IETF recommends SHOULD NOT require
          operational (especially manual) configuration or tuning.</t>

          <t>AQM algorithms SHOULD respond to measured congestion, not
          application profiles.</t>

          <t>AQM algorithms SHOULD NOT interpret specific transport protocol
          behaviours.</t>

          <t>Transport protocol congestion control algorithms SHOULD maximize
          their use of available capacity (when there is data to send) without
          incurring undue loss or undue round trip delay.</t>

          <t>Research, engineering, and measurement efforts are needed
          regarding the design of mechanisms to deal with flows that are
          unresponsive to congestion notification or are responsive, but are
          more aggressive than present TCP.</t>
        </list></t>

      <t>These recommendations are expressed using the word "SHOULD". This is
      in recognition that there may be use cases that have not been envisaged
      in this document in which the recommendation does not apply. However,
      care should be taken in concluding that one's use case falls in that
      category; during the life of the Internet, such use cases have been
      rarely if ever observed and reported on. To the contrary, available
      <xref target="Papagiannaki">research</xref> indicates that even
      high-speed links in network cores, which are normally very stable in
      depth and behavior, experience occasional issues that need moderation.</t>

      <section anchor="useAQM"
               title="Operational deployments SHOULD use AQM procedures">
        <t>AQM procedures are designed to minimize delay induced in the
        network by queues that have filled as a result of host behavior.
        Marking and loss behaviors provide a signal that buffers within
        network devices are becoming unnecessarily full, and that the sender
        would do well to moderate its behavior.</t>
      </section>

      <section anchor="signaling" title="Signaling to the transport endpoints">
        <t>There are a number of ways a network device may signal to the
        endpoint that the network is becoming congested and trigger a
        reduction in rate. The signaling methods include:</t>

        <t><list style="symbols">
            <t>Delaying transport segments (packets) in flight, such as in a
            queue.</t>

            <t>Dropping transport segments (packets) in transit.</t>

            <t>Marking transport segments (packets), such as using Explicit
            Congestion Notification (ECN) <xref target="RFC3168"></xref> <xref
            target="RFC4301"></xref> <xref target="RFC4774"></xref> <xref
            target="RFC6040"></xref> <xref target="RFC6679"></xref>.</t>
          </list></t>

        <t>The use of scheduling mechanisms, such as priority queuing,
        classful queuing, and fair queuing, is often effective in networks to
        help a network serve the needs of a range of applications. Network
        operators can use these methods to manage traffic passing a choke
        point. This is discussed in <xref target="RFC2474"></xref> and <xref
        target="RFC2475"></xref>.</t>

        <t>Increased network latency can be used as an implicit signal of
        congestion. For example, in TCP, additional delay can affect ACK
        Clocking and has the result of reducing the rate of transmission of
        new data. In
        RTP, network latency impacts the RTCP-reported RTT and increased
        latency can trigger a sender to adjust its rate. Methods such as
        LEDBAT <xref target="RFC6817"></xref> assume increased latency as a
        primary signal of congestion.</t>

        <t>It is essential that all Internet hosts respond to loss <xref
        target="RFC5681"></xref> <xref target="RFC5405"></xref> <xref
        target="RFC2960"></xref> <xref target="RFC4340"></xref>. Packet
        dropping by network devices that are under load has two effects.
        First, it protects the network, which is the primary reason that
        network devices drop packets. Second, the detection of loss provides
        a signal to a reliable transport (e.g. TCP, SCTP) that there is
        potential congestion, using a pragmatic heuristic: "when the network
        discards a message in flight, it may imply the presence of faulty
        equipment or media in a path, and it may imply the presence of
        congestion. To be conservative, a transport must assume the latter."
        Unreliable transports (e.g. using UDP) need to similarly react to
        loss <xref target="RFC5405"></xref>.</t>

        <t>Network devices SHOULD use an AQM algorithm to determine the
        packets that are marked or discarded due to congestion.</t>

        <t>Loss also has an effect on the efficiency of a flow and can
        significantly impact some classes of application. In reliable
        transports the dropped data must be subsequently retransmitted. While
        other applications/transports may adapt to the absence of lost data,
        this still implies inefficient use of available capacity and the
        dropped traffic can affect other flows. Hence, loss is not entirely
        positive; it is a necessary evil.</t>

        <section title="AQM and ECN">
          <t>Explicit Congestion Notification (ECN) <xref
          target="RFC4301"></xref> <xref target="RFC4774"></xref> <xref
          target="RFC6040"></xref> <xref target="RFC6679"></xref> is a
          network-layer function that allows a transport to receive network
          congestion information from a network device without incurring the
          unintended consequences of loss. ECN includes both transport
          mechanisms and functions implemented in network devices; the latter
          rely upon AQM to decide whether to ECN-mark.</t>

          <t>Congestion for ECN-capable transports is signalled by a network
          device setting the "Congestion Experienced (CE)" codepoint in the IP
          header. This codepoint is noted by the remote receiving end point
          and signalled back to the sender using a transport protocol
          mechanism, allowing the sender to trigger timely congestion control.
          The decision to set the CE codepoint requires an AQM algorithm
          configured with a threshold. Packets of non-ECN-capable flows (the
          default) are dropped under congestion.</t>

          <t>Network devices SHOULD use an AQM algorithm that marks
          ECN-capable traffic when making decisions about the response to
          congestion. Network devices need to implement this method by marking
          ECN-capable traffic or by dropping non-ECN-capable traffic.</t>

          <t>Safe deployment of ECN requires that network devices drop
          excessive traffic, even when marked as originating from an
          ECN-capable transport. This is necessary because: (1) a
          non-conformant, broken or malicious receiver could conceal an ECN
          mark and not report it to the sender; (2) a non-conformant, broken
          or malicious sender could ignore a reported ECN mark, just as it
          could ignore a loss when not using ECN; and (3) a malfunctioning or
          non-conforming network device may similarly "hide" an ECN mark. In
          normal operation such cases should be very uncommon.</t>

          <t>Network devices SHOULD use an algorithm to drop excessive
          traffic, even when marked as originating from an ECN-capable
          transport.</t>
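
          <t>The mark-or-drop behaviour recommended above can be sketched as
          follows. This is a minimal illustration, not a definitive
          implementation: the function name aqm_action, the use of queueing
          delay as the congestion measure, and both threshold values are
          assumptions introduced here and are not defined by this document.
          Deployed AQM algorithms typically signal probabilistically rather
          than at fixed thresholds.</t>

          <figure>
            <artwork><![CDATA[
```python
# Hypothetical thresholds, chosen only for illustration.
MARK_THRESHOLD_MS = 5.0       # above this, start signalling congestion
OVERLOAD_THRESHOLD_MS = 50.0  # above this, traffic is treated as excessive


def aqm_action(queue_delay_ms, ecn_capable):
    """Return 'forward', 'mark' or 'drop' for one arriving packet."""
    if queue_delay_ms >= OVERLOAD_THRESHOLD_MS:
        # Excessive traffic is dropped even if the packet claims to
        # originate from an ECN-capable transport.
        return "drop"
    if queue_delay_ms >= MARK_THRESHOLD_MS:
        # Congestion: set CE on ECN-capable packets, drop the others.
        return "mark" if ecn_capable else "drop"
    return "forward"
```
]]></artwork>
          </figure>

          <t>In this sketch an ECN-capable packet is marked under moderate
          congestion and dropped only once the queue is in overload, so a
          sender or receiver that hides ECN marks cannot avoid the congestion
          signal indefinitely.</t>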
        </section>
      </section>

      <section anchor="autotuning"
               title="AQM algorithms deployed SHOULD NOT require operational tuning">
        <t>A number of AQM algorithms have been proposed. Many require some
        form of tuning or setting of parameters for initial network
        conditions. This can make these algorithms difficult to use in
        operational networks.</t>

        <t><!--Proposed new text block :-->AQM algorithms need to consider
        both "initial conditions" and "operational conditions". The former
        includes values that exist before any experience is gathered about the
        use of the algorithm, such as the configured speed of the interface,
        support for full duplex communication, interface MTU and other
        properties of the link. The latter includes information observed from
        monitoring the size of the queue, experienced queueing delay, rate of
        packet discards, etc.</t>

        <t>This document therefore recommends that AQM algorithms proposed for
        deployment in the Internet:</t>

        <t><list style="symbols">
            <t>SHOULD NOT require tuning of initial or configuration
            parameters. An algorithm needs to provide a default behaviour that
            auto-tunes to a reasonable performance for typical network
            conditions. This is expected to ease deployment and operation.</t>

            <t>MAY support further manual tuning that could improve
            performance in a specific deployed network. Algorithms that lack
            such variables are acceptable, but if such variables exist, they
            SHOULD be externalized (made visible to the operator). Guidance
            needs to be provided on the cases where autotuning is unlikely to
            achieve satisfactory performance and to identify the set of
            parameters that can be tuned. This is expected to enable the
            algorithm to be deployed in networks that have specific
            characteristics (variable/larger delay; networks where capacity is
            impacted by interactions with lower layer mechanisms, etc.).</t>

            <t>MAY provide logging and alarm signals to assist in identifying
            if an algorithm using manual or auto-tuning is functioning as
            expected (e.g., this could be based on an internal consistency
            check between input, output, and mark/drop rates over time). This
            is expected to encourage deployment by default and allow operators
            to identify potential interactions with other network
            functions.</t>
          </list>Hence, self-tuning algorithms are to be preferred. Algorithms
        recommended for general Internet deployment by the IETF need to be
        designed so that they do not require operational (especially manual)
        configuration or tuning.</t>
      </section>

      <section title="AQM algorithms SHOULD respond to measured congestion, not application profiles">
        <t>Not all applications transmit packets of the same size. Although
        applications may be characterised by particular profiles of packet
        size, this should not be used as the basis for AQM (see next section).
        Other methods exist, e.g. Differentiated Services queueing,
        Pre-Congestion Notification (PCN) <xref target="RFC5559"></xref>, that
        can be used to differentiate and police classes of application.
        Network devices may combine AQM with these traffic classification
        mechanisms and perform AQM only on specific queues within a network
        device.</t>

        <t>An AQM algorithm should not deliberately try to prejudice the size
        of packet that performs best (i.e. preferentially drop/mark based only
        on packet size). Procedures for selecting packets to mark/drop SHOULD
        observe actual or projected time a packet is in a queue (bytes at a
        rate being an analog to time). When an AQM algorithm decides whether
        to drop (or mark) a packet, it is RECOMMENDED that the size of the
        particular packet should not be taken into account <xref
        target="Byte-pkt"></xref>.</t>
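
        <t>The remark that "bytes at a rate" is an analog to time can be
        illustrated with a small sketch. It is offered under stated
        assumptions rather than as a recommended algorithm: the function
        names, the 5 ms target delay and the example link rate are all
        hypothetical. The size of the individual packet is deliberately not
        an input to the decision.</t>

        <figure>
          <artwork><![CDATA[
```python
def projected_queue_delay_s(backlog_bytes, link_rate_bps):
    """Time a newly queued packet would wait: queued bytes at a rate."""
    if link_rate_bps <= 0:
        raise ValueError("link rate must be positive")
    return backlog_bytes * 8 / link_rate_bps


def should_signal(backlog_bytes, link_rate_bps, target_delay_s=0.005):
    """Decide whether to mark/drop from projected delay alone.

    target_delay_s is an assumed value, not taken from this document.
    """
    delay = projected_queue_delay_s(backlog_bytes, link_rate_bps)
    return delay > target_delay_s
```
]]></artwork>
        </figure>

        <t>For example, a backlog of 125000 bytes on a 100 Mb/s link
        corresponds to a projected delay of 10 ms, which exceeds the assumed
        target regardless of whether the arriving packet is small or
        large.</t>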

        <t>Applications (or transports) generally know the packet size that
        they are using and can hence make their judgments about whether to use
        small or large packets based on the data they wish to send and the
        expected impact on the delay or throughput, or other performance
        parameter. When a transport or application responds to a dropped or
        marked packet, the size of the rate reduction should be proportionate
        to the size of the packet that was sent <xref
        target="Byte-pkt"></xref>.</t>

        <!--New Text block :-->

        <t>An AQM-enabled system MAY instantiate different instances of an AQM
        algorithm to be applied within the same traffic class. Traffic classes
        may be differentiated based on an Access Control List (ACL), the
        packet DiffServ Code Point (DSCP) <xref target="RFC2474"></xref>,
        setting of the ECN field <xref target="RFC3168"></xref> <xref
        target="RFC4774"></xref> or an equivalent codepoint at a lower layer.
        This recommendation goes beyond what is defined in RFC 3168, by
        allowing more than one instance of an AQM to handle both ECN-capable
        and non-ECN-capable packets.</t>
      </section>

      <section anchor="alltraffic"
               title="AQM algorithms SHOULD NOT be dependent on specific transport protocol behaviours">
        <t>In deploying AQM, network devices need to support a range of
        Internet traffic and SHOULD NOT make implicit assumptions about the
        characteristics desired by the set of transports/applications the network
        supports. That is, AQM methods should be opaque to the choice of
        transport and application.</t>

        <t>AQM algorithms are often evaluated by considering <xref
        target="RFC0793">TCP</xref> with a limited number of applications.
        Although TCP is the predominant transport in the Internet today, this
        no longer represents a sufficient selection of traffic for
        verification. There is significant use of <xref
        target="RFC0768">UDP</xref> in voice and video services, and some
        applications find utility in <xref target="RFC4960">SCTP</xref> and
        <xref target="RFC4340"> DCCP </xref>. Hence, AQM algorithms should
        also demonstrate operation with transports other than TCP and need to
        consider a variety of applications. Selection of AQM algorithms also
        needs to consider use of tunnel encapsulations that may carry traffic
        aggregates.</t>

        <t>AQM algorithms SHOULD NOT target or derive implicit assumptions
        about the characteristics desired by specific transports/applications.
        Transports and applications need to respond to the congestion signals
        provided by AQM (i.e. dropping or ECN-marking) in a timely manner
        (within a few RTT at the latest).</t>
      </section>

      <section anchor="tcpcc"
               title="Interactions with congestion control algorithms">
        <t>Applications and transports need to react to received implicit or
        explicit signals that indicate the presence of congestion. This
        section identifies issues that can impact the design of transport
        protocols when using paths that use AQM.</t>

        <t>Transport protocols and applications need timely signals of
        congestion. The time taken to detect and respond to congestion is
        increased when network devices queue packets in buffers. It can be
        difficult to detect tail losses at a higher layer, and this may
        sometimes require transport timers or probe packets to detect and
        respond to
        such loss. Loss patterns may also impact timely detection, e.g. the
        time may be reduced when network devices do not drop long runs of
        packets from the same flow.</t>

        <t>A common objective is to deliver data from its source end point to
        its destination in the least possible time. When speaking of TCP
        performance, the terms "knee" and "cliff" are defined by <xref
        target="Jain94"></xref>. They respectively refer to the minimum
        congestion window that maximises throughput and the maximum congestion
        window that avoids loss. An application that transmits at the rate
        determined by this window has the effect of maximizing the rate or
        throughput. For the sender, exceeding the cliff is ineffective, as it
        (by definition) induces loss; operating at a point close to the cliff
        has a negative impact on other traffic and applications, triggering
        operator activities, such as those discussed in <xref
        target="RFC6057"></xref>. Operating below the knee reduces the
        throughput, since the sender fails to use available network capacity.
        As a result, the behavior of any elastic transport congestion control
        algorithm designed to minimise delivery time should seek to use an
        effective window at or above the knee and well below the cliff. Choice
        of an appropriate rate can significantly impact the loss and delay
        experienced not only by a flow, but by other flows that share the same
        queue.</t>

        <t>Some applications may send less than permitted by the congestion
        control window (or rate). Examples include multimedia codecs that
        stream at some natural rate (or set of rates) or an application that
        is naturally interactive (e.g., some web applications, gaming,
        transaction-based protocols). Such applications may have different
        objectives. They may not wish to maximise throughput, but may desire a
        lower loss rate or bounded delay.</t>

        <t>The correct operation of an AQM-enabled network device MUST NOT
        rely upon specific transport responses to congestion signals.</t>
      </section>

      <section anchor="research" title="The need for further research">
        <t><xref target="RFC2309">The second recommendation of </xref> called
        for further research into the interaction between network queues and
        host applications, and the means of signaling between them. This
        research has occurred, and we as a community have learned a lot.
        However, we are not done.</t>

        <t>We have learned that the problems of congestion, latency and
        buffer-sizing have not gone away, and are becoming more important to
        many users. A number of self-tuning AQM algorithms have been found
        that offer significant advantages for deployed networks. There is also
        renewed interest in deploying AQM and the potential of ECN.</t>

        <t>In 2013, an obvious example of further research is the need to
        consider the use of Map/Reduce applications in data centers: do we
        need to extend our taxonomy of TCP/SCTP sessions to include not only
        "mice" and "elephants", but also "lemmings"? "Lemmings" are flash
        crowds of "mice" that the network inadvertently tries to signal to as
        if they were elephant flows, resulting in head of line blocking in
        data center applications.</t>

        <t>Examples of other required research include:</t>

        <t><list style="symbols">
            <t>Research into new AQM and scheduling algorithms.</t>

            <t>Research into the use of and deployment of ECN alongside
            AQM.</t>

            <t>Tools for enabling AQM (and ECN) deployment and measuring the
            performance.</t>

            <t>Methods for mitigating the impact of non-conformant and
            malicious flows.</t>

            <!--New text block :-->

            <t>Research to understand the implications of using new network
            and transport methods on applications.</t>
          </list>Hence, this document reiterates the call of RFC
        2309: we need continuing research as applications develop.</t>
      </section>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This memo asks the IANA for no new parameters.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>While security is a very important issue, it is largely orthogonal to
      the performance issues discussed in this memo.</t>

      <t>Many deployed network devices use queueing methods that allow
      unresponsive traffic to capture network capacity, denying access to
      other traffic flows. This could potentially be used as a
      denial-of-service attack. This threat could be reduced if network
      devices deploy AQM or some form of scheduling. We note, however, that a
      denial-of-service attack may create unresponsive traffic flows that may
      be indistinguishable from other traffic flows (e.g. tunnels carrying
      aggregates of short flows, high-rate isochronous applications). New
      methods therefore may remain vulnerable, and this document recommends
      that ongoing research should consider ways to mitigate such attacks.</t>
    </section>

    <section anchor="Privacy" title="Privacy Considerations">
      <t>This document, by itself, presents no new privacy issues.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>The original recommendation in <xref target="RFC2309"></xref> was
      written by the End-to-End Research Group, which is to say Bob Braden,
      Dave Clark, Jon Crowcroft, Bruce Davie, Steve Deering, Deborah Estrin,
      Sally Floyd, Van Jacobson, Greg Minshall, Craig Partridge, Larry
      Peterson, KK Ramakrishnan, Scott Shenker, John Wroclawski, and Lixia
      Zhang. This is an edited version of that document, with much of its text
      and arguments unchanged.</t>

      <t>The need for an updated document was agreed to in the tsvarea meeting
      at IETF 86. This document was reviewed on the aqm@ietf.org list.
      Comments came from Colin Perkins, Richard Scheffenegger, and Dave
      Taht.</t>

      <t>Gorry Fairhurst was in part supported by the European Community under
      its Seventh Framework Programme through the Reducing Internet Transport
      Latency (RITE) project (ICT-317700).</t>
    </section>
  </middle>

  <back>
    <!-- references split to informative and normative -->

    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include="reference.RFC.3168" ?>

      <?rfc include="reference.RFC.6679" ?>

      <?rfc include="reference.RFC.4301" ?>

      <?rfc include="reference.RFC.4774" ?>

      <?rfc include="reference.RFC.5405" 
?>

      <?rfc include="reference.RFC.5681" 
?>

      <?rfc include="reference.RFC.6040" ?>

      <reference anchor="Byte-pkt">
        <front>
          <title>Byte and Packet Congestion Notification
          (draft-ietf-tsvwg-byte-pkt-congest)</title>

          <author fullname="Bob Briscoe">
            <organization></organization>

            <address>
              <postal>
                <street></street>

                <city></city>

                <region></region>

                <code></code>

                <country></country>
              </postal>

              <phone></phone>

              <facsimile></facsimile>

              <email></email>

              <uri></uri>
            </address>
          </author>

          <author fullname="Jukka Manner">
            <organization>Internet Engineering Task Force, Work in
            Progress</organization>
          </author>

          <date day="1" month="July" year="2013" />
        </front>
      </reference>
    </references>

    <references title="Informative References">
      <?rfc include="reference.RFC.0768" ?>

      <?rfc include="reference.RFC.0791" ?>

      <?rfc include="reference.RFC.0793" ?>

      <?rfc include="reference.RFC.0896" ?>

      <?rfc include="reference.RFC.0970" ?>

      <?rfc include="reference.RFC.1122" ?>

      <?rfc include="reference.RFC.1633"?>

      <?rfc include="reference.RFC.2309"?>

      <?rfc include="reference.RFC.2460" ?>

      <?rfc include="reference.RFC.2474" ?>

      <?rfc include="reference.RFC.2475"?>

      <?rfc include="reference.RFC.2960"?>

      <?rfc include="reference.RFC.4340" ?>

      <?rfc include="reference.RFC.4960" ?>

      <?rfc include="reference.RFC.5348"?>

      <?rfc include="reference.RFC.5559"?>

      <?rfc include="reference.RFC.6057" ?>

      <?rfc include="reference.RFC.6817" ?>

      <reference anchor="Floyd91">
        <front>
          <title>Connections with Multiple Congested Gateways in
          Packet-Switched Networks Part 1: One-way Traffic.</title>

          <author fullname="S. Floyd" initials="S" surname="Floyd">
            <organization></organization>
          </author>

          <date month="October" year="1991" />
        </front>

        <seriesInfo name="Computer Communications Review" value="" />
      </reference>

      <reference anchor="Floyd95">
        <front>
          <title>Link-sharing and Resource Management Models for Packet
          Networks</title>

          <author fullname="S. Floyd" initials="S" surname="Floyd">
            <organization></organization>
          </author>

          <author fullname="Van Jacobson" initials="V" surname="Jacobson">
            <organization></organization>
          </author>

          <date month="August" year="1995" />
        </front>

        <seriesInfo name="IEEE/ACM Transactions on Networking" value="" />
      </reference>

      <reference anchor="Demers90">
        <front>
          <title>Analysis and Simulation of a Fair Queueing Algorithm,
          Internetworking: Research and Experience</title>

          <author fullname="A. Demers" initials="A" surname="Demers">
            <organization></organization>
          </author>

          <author fullname="S. Keshav" initials="S" surname="Keshav">
            <organization></organization>
          </author>

          <author fullname="S. Shenker" initials="S" surname="Shenker">
            <organization></organization>
          </author>

          <date year="1990" />
        </front>

        <seriesInfo name="SIGCOMM Symposium proceedings on Communications architectures and protocols"
                    value="" />
      </reference>

      <reference anchor="Willinger95">
        <front>
          <title>Self-Similarity Through High-Variability: Statistical
          Analysis of Ethernet LAN Traffic at the Source Level</title>

          <author fullname="W. Willinger" initials="W" surname="Willinger">
            <organization></organization>
          </author>

          <author fullname="M. Taqqu" initials="M" surname="Taqqu">
            <organization></organization>
          </author>

          <author fullname="R. Sherman" initials="R" surname="Sherman">
            <organization></organization>
          </author>

          <author fullname="D.V. Wilson" initials="D" surname="Wilson">
            <organization></organization>
          </author>

          <author fullname="Van Jacobson" initials="V" surname="Jacobson">
            <organization></organization>
          </author>

          <date month="August" year="1995" />
        </front>

        <seriesInfo name="SIGCOMM Symposium proceedings on Communications architectures and protocols"
                    value="" />
      </reference>

      <reference anchor="Jacobson88">
        <front>
          <title>Congestion Avoidance and Control</title>

          <author fullname="Van Jacobson" initials="V" surname="Jacobson">
            <organization>Lawrence Berkeley Network Labs</organization>
          </author>

          <date month="August" year="1988" />
        </front>

        <seriesInfo name="SIGCOMM Symposium proceedings on Communications architectures and protocols"
                    value="" />
      </reference>

      <reference anchor="Lakshman96">
        <front>
          <title>The Drop From Front Strategy in TCP Over ATM and Its
          Interworking with Other Control Features</title>

          <author fullname="T. V. Lakshman" initials="TV" surname="Lakshman">
            <organization></organization>
          </author>

          <author fullname="Arnie Neidhardt" initials="A" surname="Neidhardt">
            <organization></organization>
          </author>

          <author fullname="Teunis Ott" initials="T" surname="Ott">
            <organization></organization>
          </author>

          <date year="1996" />
        </front>

        <seriesInfo name="IEEE Infocom" value="" />
      </reference>

      <reference anchor="Leland94">
        <front>
          <title>On the Self-Similar Nature of Ethernet Traffic (Extended
          Version)</title>

          <author fullname="W. Leland" initials="W" surname="Leland">
            <organization></organization>
          </author>

          <author fullname="M. Taqqu" initials="M" surname="Taqqu">
            <organization></organization>
          </author>

          <author fullname="W. Willinger" initials="W" surname="Willinger">
            <organization></organization>
          </author>

          <author fullname="D. Wilson" initials="D" surname="Wilson">
            <organization></organization>
          </author>

          <date month="February" year="1994" />
        </front>

        <seriesInfo name="IEEE/ACM Transactions on Networking" value="" />
      </reference>

      <reference anchor="Jain94">
        <front>
          <title>Congestion avoidance scheme for computer networks</title>

          <author fullname="Rajendra K. Jain" initials="Raj" surname="Jain">
            <organization>Digital Equipment Corporation</organization>
          </author>

          <author fullname="KK Ramakrishnan" initials="KK"
                  surname="Ramakrishnan">
            <organization>Digital Equipment Corporation</organization>
          </author>

          <author fullname="Dah-Ming Chiu" initials="D" surname="Chiu">
            <organization>Digital Equipment Corporation</organization>
          </author>

          <date day="27" month="December" year="1994" />
        </front>

        <seriesInfo name="US Patent Office" value="5377327" />
      </reference>

      <reference anchor="Papagiannaki"
                 target="http://www.ieee-infocom.org/2004/Papers/37_4.PDF">
        <front>
          <title>Analysis of Point-To-Point Packet Delay In an Operational
          Network</title>

          <author fullname="B-Y. Choi">
            <organization>Sprint ATL</organization>
          </author>

          <author fullname="S. Moon">
            <organization>KAIST</organization>
          </author>

          <author fullname="Z-L. Zhang">
            <organization>University of Minnesota</organization>
          </author>

          <author fullname="K. Papagiannaki">
            <organization>Sprint ATL</organization>
          </author>

          <author fullname="C. Diot">
            <organization>Intel Research</organization>
          </author>

          <date month="March" year="2004" />
        </front>

        <seriesInfo name="IEEE Infocom" value="2004" />
      </reference>

      <reference anchor="AQM-WG">
        <front>
          <title>IETF AQM WG</title>

          <author>
            <organization></organization>
          </author>

          <date />
        </front>
      </reference>
    </references>

    <?rfc ?>

    <section anchor="log" title="Change Log">
      <t><list style="hanging">
          <t hangText="Initial Version:">March 2013</t>

          <t
          hangText="Minor update of the algorithms that the IETF recommends SHOULD NOT require operational (especially manual) configuration or tuning:">April
          2013</t>

          <t
          hangText="Major surgery.  This draft is for discussion at IETF-87 and expected to be further updated.">July
          2013</t>

          <t
          hangText="-00 WG Draft - Updated transport recommendations; revised deployment configuration section; numerous minor edits.">Oct
          2013</t>

          <t
          hangText="-01 WG Draft - Updated transport recommendations; revised deployment configuration section; numerous minor edits.">Jan
          2014 - Feedback from WG.</t>
        </list></t>
    </section>
  </back>
</rfc>