<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?xml-stylesheet type='text/xsl' href='http://xml.resource.org/authoring/rfc2629.xslt' ?>
<!-- Alterations to I-D/RFC boilerplate -->
<?rfc private="" ?>
<!-- Default private="" Produce an internal memo 2.5pp shorter than an I-D or RFC -->
<?rfc rfcprocack="yes" ?>
<!-- Default rfcprocack="no" add a short sentence acknowledging xml2rfc -->
<?rfc strict="no" ?>
<!-- Default strict="no" Don't check I-D nits -->
<?rfc rfcedstyle="yes" ?>
<!-- Default rfcedstyle="yes" attempt to closely follow finer details from the latest observable RFC-Editor style -->
<!-- IETF process -->
<?rfc iprnotified="no" ?>
<!-- Default iprnotified="no" I haven't disclosed existence of IPR to IETF -->
<!-- ToC format -->
<?rfc toc="yes" ?>
<!-- Default toc="no" No Table of Contents -->
<!-- Cross referencing, footnotes, comments -->
<?rfc symrefs="yes" ?>
<!-- Default symrefs="no" Don't use anchors, but use numbers for refs -->
<?rfc sortrefs="yes"?>
<!-- Default sortrefs="no" Don't sort references into order -->
<?rfc comments="no" ?>
<!-- Default comments="yes" Don't render comments -->
<?rfc inline="no" ?>
<!-- Default inline="no" if comments is "yes", then render comments inline; otherwise render them in an `Editorial Comments' section -->
<?rfc editing="no"?>
<!-- Default editing="no" Don't insert editing marks for ease of discussing draft versions -->
<!-- Pagination control -->
<?rfc compact="yes"?>
<!-- Default compact="no" Start sections on new pages -->
<?rfc subcompact="no"?>
<!-- Default subcompact="(as compact setting)" yes/no is not quite as compact as yes/yes -->
<!-- HTML formatting control -->
<?rfc emoticonic="yes" ?>
<!-- Default emoticonic="no" Doesn't prettify HTML format -->
<rfc category="historic" docName="draft-briscoe-conex-re-ecn-motiv-01"
     ipr="trust200902">
  <front>
    <title abbrev="Re-ECN: Framework">Re-ECN: A Framework for adding
    Congestion Accountability to TCP/IP</title>

    <author fullname="Bob Briscoe" initials="B." role="editor"
            surname="Briscoe">
      <organization>BT</organization>

      <address>
        <postal>
          <street>B54/77, Adastral Park</street>

          <street>Martlesham Heath</street>

          <city>Ipswich</city>

          <code>IP5 3RE</code>

          <country>UK</country>
        </postal>

        <phone>+44 1473 645196</phone>

        <email>bob.briscoe@bt.com</email>

        <uri>http://bobbriscoe.net/</uri>
      </address>
    </author>

    <author fullname="Arnaud Jacquet" initials="A." surname="Jacquet">
      <organization>BT</organization>

      <address>
        <postal>
          <street>B54/70, Adastral Park</street>

          <street>Martlesham Heath</street>

          <city>Ipswich</city>

          <code>IP5 3RE</code>

          <country>UK</country>
        </postal>

        <phone>+44 1473 647284</phone>

        <email>arnaud.jacquet@bt.com</email>

        <uri></uri>
      </address>
    </author>

    <author fullname="Toby Moncaster" initials="T." surname="Moncaster">
      <organization>Moncaster.com</organization>

      <address>
        <postal>
          <street>Dukes</street>

          <street>Layer Marney</street>

          <city>Colchester</city>

          <code>CO5 9UZ</code>

          <country>UK</country>
        </postal>

        <email>toby@moncaster.com</email>
      </address>
    </author>

    <author fullname="Alan Smith" initials="A." surname="Smith">
      <organization>BT</organization>

      <address>
        <postal>
          <street>B54/76, Adastral Park</street>

          <street>Martlesham Heath</street>

          <city>Ipswich</city>

          <code>IP5 3RE</code>

          <country>UK</country>
        </postal>

        <phone>+44 1473 640404</phone>

        <email>alan.p.smith@bt.com</email>

        <!--                <uri>?</uri> -->
      </address>
    </author>

    <date day="22" month="October" year="2012" />

    <area>Transport</area>

    <workgroup>Transport Area Working Group</workgroup>

    <keyword>Quality of Service</keyword>

    <keyword>QoS</keyword>

    <keyword>Congestion Control</keyword>

    <keyword>Differentiated Services</keyword>

    <keyword>Integrated Services</keyword>

    <keyword>Admission Control</keyword>

    <keyword>Signalling</keyword>

    <keyword>Protocol</keyword>

    <keyword>Pre-emption</keyword>

    <abstract>
      <t>This document describes a framework for using a new protocol called
      re-ECN (re-inserted explicit congestion notification), which can be
      deployed incrementally around unmodified routers. Re-ECN allows accurate
      congestion monitoring throughout the network thus enabling the upstream
      party at any trust boundary in the internetwork to be held responsible
      for the congestion they cause, or allow to be caused. So, networks can
      introduce straightforward accountability for congestion and policing
      mechanisms for incoming traffic from end-customers or from neighbouring
      network domains. As well as giving the motivation for re-ECN this
      document also gives examples of mechanisms that can use the protocol to
      ensure data sources respond correctly to congestion. And it describes
      example mechanisms that ensure the dominant selfish strategy of both
      network domains and end-points will be to use the protocol honestly.</t>

      <t>Note concerning Intended Status: If this draft were ever published as
      an RFC it would probably have historic status. There is limited space in
      the IP header, so re-ECN had to compromise by requiring the receiver to
      be ECN-enabled; otherwise the sender could not use re-ECN. Re-ECN was a
      precursor to chartering of the IETF's Congestion Exposure (ConEx)
      working group, but during chartering there were still too few ECN
      receivers enabled, therefore it was decided to pursue other compromises
      in order to fit a similar capability into the IP header.</t>
    </abstract>
  </front>

  <middle>
    <!-- ================================================================ -->

    <note title="Authors' Statement: Status (to be removed by the RFC Editor)">
      <t>Although the re-ECN protocol is intended to make a simple but
      far-reaching change to the Internet architecture, the most immediate
      priority for the authors is to delay any move of the ECN nonce to
      Proposed Standard status. The argument for this position is developed in
      <xref target="retcp_Nonce_Limitation" />.</t>
    </note>

    <!-- ================================================================ -->

    <section anchor="retcp_Introduction" title="Introduction">
      <t>This document aims to: <list style="symbols">
          <t>Describe the motivation for wanting to introduce re-ECN;</t>

          <t>Provide a very brief description of the protocol;</t>

          <t>Outline the framework within which the protocol sits;</t>

          <t>Show how a number of hard problems become much easier to solve
          once re-ECN is available in IP.</t>
        </list></t>

      <t>This introduction starts with a run-through of these four points.</t>

      <section anchor="retcp_Introduction_motivation" title="Motivation">
        <t>Re-ECN is proposed as a means of allowing accurate monitoring of
        congestion throughout the Internet. The current Internet relies on the
        vast majority of end-systems running TCP and reacting to detected
        congestion by reducing their sending rates. Thus congestion control is
        conducted by the collaboration of the majority of end-systems.</t>

        <t>In this situation it is possible for applications that are
        unresponsive to congestion to take whatever share of bottleneck
        resources they want from responsive flows: the responsive flows reduce
        their sending rate in the face of congestion and effectively get out
        of the way of unresponsive flows. An increasing proportion of such
        applications could make congestion collapse more common <xref
        target="RFC3714" />. Each network has no visibility of whole path
        congestion and can only respond to congestion on a local basis.</t>

        <t>Using re-ECN will allow any point along a path to calculate
        congestion both upstream and downstream of that point. As a
        consequence, policing of congestion /could/ be carried out in
        the network if end-systems fail to do so. Re-ECN enables both flows
        and users to be policed, and enables policing to happen at network
        ingress and at network borders.</t>
      </section>

      <section anchor="retcp_Introduction_protocol_in_brief"
               title="Re-ECN Protocol in Brief">
        <t>In re-ECN each sender makes a prediction of the congestion that
        each flow will cause and signals that prediction within the IP headers
        of that flow. The prediction is based on, but not limited to, feedback
        received from the receiver. Sending a prediction of the congestion
        gives network equipment a view of the congestion downstream and
        upstream.</t>

        <t>In order to explain this mechanism we introduce the notion of IP
        packets carrying different, notional values dependent on the state of
        their header flags: <list style="symbols">
            <t>Negative - packets marked by queues when incipient congestion
            is detected. This is exactly the same as ECN <xref
            target="RFC3168" />;</t>

            <t>Positive - packets sent by the sender in proportion to the
            number of bytes in packets that have been marked negative,
            according to feedback received from the receiver;</t>

            <t>Cautious - packets sent whenever the sender cannot be sure of
            the correct amount of positive bytes to inject into the network,
            for example at the start of a flow, to indicate that feedback has
            not been established;</t>

            <t>Cancelled - packets sent by the sender as positive that get
            marked as negative by queues in the network due to incipient
            congestion;</t>

            <t>Neutral - normal IP packets that nevertheless show queues that
            they can be marked negative.</t>
          </list></t>

        <t>A flow starts to transmit packets. No feedback has been established
        so a number of cautious packets are sent (see the protocol definition
        <xref target="Re-TCP" /> for an analysis of how many cautious packets
        should be sent at flow start). The rest are sent as neutral.</t>

        <t>The packets traverse a congested queue. A fraction are marked
        negative as an indication of incipient congestion.</t>

        <t>The packets are received by the receiver. The receiver feeds back
        to the sender a count of the number of packets that have been marked
        negative. This feedback can be provided either by the transport (e.g.
        TCP) or by higher-layer control messages.</t>

        <t>The sender receives the feedback and then sends a number of
        positive packets in proportion to the bytes represented by packets
        that have been marked negative. It is important to note that
        congestion is revealed by the fraction of marked packets rather than a
        field in the IP header. This is due to the limited code points
        available and includes use of the last unallocated bit (sometimes
        called the evil bit <xref target="RFC3514" />). Full details of the
        code points used are given in <xref target="Re-TCP" />. This lack of
        code points is, however, the case with IPv4. ECN is similarly
        restricted.</t>

        <t>The number of bytes inside the negative packets and positive
        packets should therefore be approximately equal at the termination
        point of the flow. To put it another way, the balance of negative and
        positive should be zero.</t>
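
        <t>As a rough illustration of the cycle described above, the following
        Python sketch shows one way a sender might choose packet markings so
        that positive bytes track the negative bytes reported in feedback. It
        is only a sketch under stated assumptions: the class, its methods and
        the single cautious packet at flow start are hypothetical and are not
        taken from the protocol definition in <xref target="Re-TCP" />.</t>

        <figure>
          <artwork><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class SenderSketch:
    """Hypothetical sender that re-echoes negative bytes as positive."""
    cautious_left: int = 1  # assumed number of cautious packets at start
    owed_bytes: int = 0     # negative bytes fed back, not yet re-echoed

    def on_feedback(self, negative_bytes: int) -> None:
        """Record bytes that the receiver reported as marked negative."""
        self.owed_bytes += negative_bytes

    def mark_next(self, size: int) -> str:
        """Choose the marking for the next outgoing packet of `size` bytes."""
        if self.cautious_left > 0:
            self.cautious_left -= 1
            return "cautious"        # no feedback established yet
        if self.owed_bytes > 0:
            self.owed_bytes -= size  # may dip below zero; credit carries over
            return "positive"
        return "neutral"


sender = SenderSketch()
marks = [sender.mark_next(1000), sender.mark_next(1000)]
sender.on_feedback(2000)  # receiver reports 2000 bytes marked negative
marks += [sender.mark_next(1000) for _ in range(3)]
print(marks)
```
]]></artwork>
        </figure>

        <t>With 1000-byte packets and feedback reporting 2000 negative bytes,
        the sketch emits two positive packets and then reverts to neutral, so
        the positive and negative bytes balance.</t>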
      </section>

      <section anchor="retcp_Introduction_re_ecn_framework"
               title="The Re-ECN Framework">
        <t>The introduction of the protocol enables three things:<list
            style="symbols">
            <t>It gives a view of whole path congestion;</t>

            <t>It enables policing of flows;</t>

            <t>It allows networks to monitor the flow of congestion across
            their borders.</t>
          </list></t>

        <t>At any point in the network a device can calculate the upstream
        congestion by calculating the ratio of bytes in negative packets to
        bytes in all packets. It could already do this using ECN, by
        calculating the fraction of packets marked Congestion Experienced.</t>

        <t>Using re-ECN a device in the network can calculate downstream
        congestion by subtracting the fraction of negative packets from the
        fraction of positive packets.</t>

        <t>A user can be restricted to only causing a certain amount of
        congestion. A Policer could be introduced at the ingress of a network
        that counts the number of positive packets being sent and limits the
        sender if that sender tries to transmit more positive packets than
        their allowance.</t>
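
        <t>The policer idea can be sketched in a few lines of Python. This is
        a minimal illustration, not a specified mechanism: the class name, the
        fixed per-user allowance and the choice to refuse excess positive
        packets are all assumptions made for the example. How an allowance is
        replenished over time, and what "limiting" means in practice, are left
        open here.</t>

        <figure>
          <artwork><![CDATA[
```python
class IngressPolicerSketch:
    """Hypothetical per-user policer that counts positive packets."""

    def __init__(self, allowance_packets: int) -> None:
        self.allowance_packets = allowance_packets  # assumed allowance
        self.positive_seen = 0                      # positive packets so far

    def admit(self, marking: str) -> bool:
        """Return True to forward the packet, False to limit the sender."""
        if marking != "positive":
            return True              # only positive packets are counted
        if self.positive_seen >= self.allowance_packets:
            return False             # allowance exhausted: limit the sender
        self.positive_seen += 1
        return True


policer = IngressPolicerSketch(allowance_packets=2)
decisions = [policer.admit(m)
             for m in ["positive", "neutral", "positive", "positive"]]
print(decisions)
```
]]></artwork>
        </figure>

        <t>In this sketch the third positive packet exceeds the allowance of
        two and is refused, while neutral packets pass uncounted. Counting
        bytes rather than packets would be an equally plausible variation.</t>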

        <t>A user could deliberately ignore some or all of the feedback and
        transmit packets with a zero or much lower proportion of positive
        packets than negative packets. To solve this a Dropper is proposed.
        This would be placed at the egress of a network. If the number of
        negative packets exceeds the number of positive packets then the flow
        could be dropped or some other sanction enacted.</t>
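
        <t>A dropper of this kind might be sketched as follows. The class and
        method names are hypothetical, and counting cautious packets on the
        positive side is an assumption based on the +1 value this document
        assigns to them. A deployable dropper would presumably need some
        tolerance for the round trip it takes feedback to reach the sender;
        that is deliberately left out.</t>

        <figure>
          <artwork><![CDATA[
```python
from collections import defaultdict


class EgressDropperSketch:
    """Hypothetical egress monitor keeping a per-flow packet balance."""

    def __init__(self) -> None:
        self.positive = defaultdict(int)  # flow id -> positive/cautious count
        self.negative = defaultdict(int)  # flow id -> negative count

    def observe(self, flow_id: str, marking: str) -> bool:
        """Count one packet; return True if the flow should be sanctioned."""
        if marking in ("positive", "cautious"):
            self.positive[flow_id] += 1
        elif marking == "negative":
            self.negative[flow_id] += 1
        # Sanction when negative packets outnumber positive ones.
        return self.negative[flow_id] > self.positive[flow_id]


dropper = EgressDropperSketch()
verdicts = [dropper.observe("flow-a", m)
            for m in ["cautious", "negative", "negative", "positive"]]
print(verdicts)
```
]]></artwork>
        </figure>

        <t>Here the flow is flagged only while its negative count exceeds its
        positive count, and it recovers once a further positive packet
        arrives.</t>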

        <t>Policers and droppers could be used between networks in order to
        police bulk traffic. A whole network harbouring users causing
        congestion in downstream networks can be held responsible or policed
        by its downstream neighbour.</t>
      </section>

      <section anchor="retcp_Introduction_hard_problems"
               title="Solving Hard Problems">
        <t>We have already shown that flows can be policed once they are made
        to declare the level of congestion they are causing. More
        specifically, these are the kinds of problem that can be solved: <list
            style="symbols">
            <t>mitigating distributed denial of service (DDoS);</t>

            <t>simplifying differentiation of quality of service (QoS);</t>

            <t>policing compliance to congestion control;</t>

            <t>inter-provider service monitoring;</t>

            <t>etc.</t>
          </list></t>

        <t>Uniquely, re-ECN manages to enable solutions to these problems
        without unduly stifling innovative new ways to use the Internet. This
        was a hard balance to strike, given it could be argued that DDoS is an
        innovative way to use the Internet. The most valuable insight was to
        allow each network to choose the level of constraint it wishes to
        impose. Also re-ECN has been carefully designed so that networks that
        choose to use it conservatively can protect themselves against the
        congestion caused in their network by users on other networks with
        more liberal policies.</t>

        <t>For instance, some network owners want to block applications like
        voice and video unless their network is compensated for the extra
        share of bottleneck bandwidth taken. These real-time applications tend
        to be unresponsive when congestion arises. Whereas elastic TCP-based
        applications back away quickly, ending up taking a much smaller share
        of congested capacity for themselves. Other network owners want to
        invest in large amounts of capacity and make their gains from
        simplicity of operation and economies of scale.</t>

        <t>While we have designed re-ECN so that networks can choose to deploy
        stringent policing, this does not imply we advocate that every network
        should introduce tight controls on those that cause congestion. Re-ECN
        has been specifically designed to allow different networks to choose
        how conservative or liberal they wish to be with respect to policing
        congestion. But those that choose to be conservative can protect
        themselves from the excesses that liberal networks allow their
        users.</t>

        <t>Re-ECN allows the more conservative networks to police out flows
        that have not asked to be unresponsive to congestion---not because
        they are voice or video---just because they don't respond to
        congestion. But it also allows other networks to choose not to police.
        Crucially, when flows from liberal networks cross into a conservative
        network, re-ECN enables the conservative network to apply penalties to
        its neighbouring networks for the congestion they allow to be caused.
        And these penalties can be applied to bulk data, without regard to
        flows.</t>

        <t>Then, if unresponsive applications become so dominant that some of
        the more liberal networks experience congestion collapse <xref
        target="RFC3714" />, they can change their minds and use re-ECN to
        apply tighter controls in order to bring congestion back under
        control.</t>

        <t>Re-ECN reduces the need for complex network equipment to perform
        these functions.</t>
      </section>

      <section anchor="retcp_Introduction_compass"
               title="The Rest of this Document">
        <t>This document is structured as follows. First the motivation for
        the new protocol is given (<xref target="retcp_Motivation" />),
        followed by the incentive framework that is possible with the protocol
        (<xref target="retcp_Incentive_Framework" />). <xref
        target="retcp_Other_Applications" /> then describes other important
        applications of re-ECN, such as policing DDoS, QoS and congestion
        control. Although these applications do not require standardisation
        themselves, they are described in a fair degree of detail in order to
        explain how re-ECN can be used. Given re-ECN proposes to use the last
        undefined bit in the IPv4 header, we felt it necessary to outline the
        potential that re-ECN could release in return for being given that
        bit.</t>

        <t>Deployment issues discussed throughout the document are brought
        together in <xref target="retcp_Incremental_Deployment" />, which is
        followed by a brief section explaining the somewhat subtle rationale
        for the design from an architectural perspective (<xref
        target="retcp_Architectural_Rationale" />). We end by describing
        related work (<xref target="retcp_Related_Work" />), listing security
        considerations (<xref target="retcp_Security_Considerations" />) and
        finally drawing conclusions (<xref target="retcp_Conclusions" />).</t>
      </section>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Reqs_notation" title="Requirements notation">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119" />.</t>

      <t>This document first specifies a protocol, then describes a framework
      that creates the right incentives to ensure compliance to the protocol.
      This could cause confusion because the second part of the document
      considers many cases where malicious nodes may not comply with the
      protocol. When such contingencies are described, if any of the above
      keywords are not capitalised, that is deliberate. So, for instance, the
      following two apparently contradictory sentences would be perfectly
      consistent: i) x MUST do this; ii) x may not do this.</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Motivation" title="Motivation">
      <!-- ________________________________________________________________ -->

      <section anchor="retcp_Policing_Congestion_Response"
               title="Policing Congestion Response">
        <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

        <section anchor="retcp_Policing_Problem" title="The Policing Problem">
          <t>The current Internet architecture trusts hosts to respond
          voluntarily to congestion. Limited evidence shows that the large
          majority of end-points on the Internet comply with a TCP-friendly
          response to congestion. But telephony (and increasingly video)
          services over the best effort Internet are attracting the interest
          of major commercial operations. Most of these applications do not
          respond to congestion at all. Those that can respond do so by
          switching to lower-rate codecs.</t>

          <t>Of course, the Internet is intended to support many different
          application behaviours. But the problem is that this freedom can be
          exercised irresponsibly. The greater problem is that we will never
          be able to agree on where the boundary is between responsible and
          irresponsible. Therefore re-ECN is designed to allow different
          networks to set their own view of the limit to irresponsibility, and
          to allow networks that choose a more conservative limit to push back
          against congestion caused in more liberal networks.</t>

          <t>As an example of the impossibility of setting a standard for
          fairness, mandating TCP-friendliness would set the bar too high for
          unresponsive streaming media, but still some would say the bar was
          too low <xref target="relax-fairness" />. Even though all known
          peer-to-peer filesharing applications are TCP-compatible, they can
          cause a disproportionate amount of congestion, simply by using
          multiple flows and by transferring data continuously relative to
          other short-lived sessions. On the other hand, if we swung the other
          way and set the bar low enough to allow streaming media to be
          unresponsive, we would also allow denial of service attacks, which
          are typically unresponsive to congestion and consist of multiple
          continuous flows.</t>

          <t>Applications that need (or choose) to be unresponsive to
          congestion can effectively take (some would say steal) whatever
          share of bottleneck resources they want from responsive flows.
          Whether or not such free-riding is common, inability to prevent it
          increases the risk of poor returns for investors in network
          infrastructure, leading to under-investment. An increasing
          proportion of unresponsive or free-riding demand coupled with
          persistent under-supply is a broken economic cycle. Therefore, if
          the current, largely co-operative consensus continues to erode,
          congestion collapse could become more common in more areas of the
          Internet <xref target="RFC3714" />.</t>

          <t>While we have designed re-ECN so that networks can choose to
          deploy stringent policing, this does not imply we advocate that
          every network should introduce tight controls on those that cause
          congestion. Re-ECN has been specifically designed to allow different
          networks to choose how conservative or liberal they wish to be with
          respect to policing congestion. But those that choose to be
          conservative can protect themselves from the excesses that liberal
          networks allow their users.</t>
        </section>

        <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

        <section anchor="retcp_Case_Against_Bottleneck_Policing"
                 title="The Case Against Bottleneck Policing">
          <t>The state of the art in rate policing is the bottleneck policer,
          which is intended to be deployed at any forwarding resource that may
          become congested. Its aim is to detect flows that cause
          significantly more local congestion than others. Although operators
          might solve their immediate problems by deploying bottleneck
          policers, we are concerned that widespread deployment would make it
          extremely hard to evolve new application behaviours. We believe the
          IETF should offer re-ECN as the preferred protocol on which to base
          solutions to the policing problems of operators, because it would
          not harm evolvability and, frankly, it would be far more effective
          (see later for why).</t>

          <t>Approaches like <xref target="XCHOKe" /> and <xref
          target="pBox" /> are neat ways of rate policing traffic
          without the benefit of whole path information (such as could be
          provided by re-ECN). But they must be deployed at bottlenecks in
          order to work. Unfortunately, a large proportion of traffic
          traverses at least two bottlenecks (in two access networks),
          particularly with the current traffic mix where peer-to-peer
          file-sharing is prevalent. If ECN were deployed, we believe it would
          be likely that these bottleneck policers would be adapted to combine
          ECN congestion marking from the upstream path with local congestion
          knowledge. But then the only useful placement for such policers
          would be close to the egress of the internetwork.</t>

          <t>But then, if these bottleneck policers were widely deployed
          (which would require them to be more effective than they are now),
          the Internet would find itself with one universal rate adaptation
          policy (probably TCP-friendliness) embedded throughout the network.
          Given TCP's congestion control algorithm is already known to be
          hitting its scalability limits and new algorithms are being
          developed for high-speed congestion control, embedding TCP policing
          into the Internet would make evolution to new algorithms extremely
          painful. If a source wanted to use a different algorithm, it would
          have to first discover then negotiate with all the policers on its
          path, particularly those in the far access network. The IETF has
          already traveled that path with the Intserv architecture and found
          it constrains scalability <xref target="RFC2208" />.</t>

          <t>Anyway, if bottleneck policers were ever widely deployed, they
          would be likely to be bypassed by determined attackers. They
          inherently have to police fairness per flow or per
          source-destination pair. Therefore they can easily be circumvented
          either by opening multiple flows (by varying the end-point port
          number); or by spoofing the source address but arranging with the
          receiver to hide the true return address at a higher layer.</t>
        </section>

        <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->
      </section>

      <!-- ________________________________________________________________ -->
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Incentive_Framework"
             title="Re-ECN Incentive Framework">
      <t>The aim is to create an incentive environment that ensures optimal
      sharing of capacity despite everyone acting selfishly (including lying
      and cheating). Of course, the mechanisms put in place for this can lie
      dormant wherever co-operation is the norm.</t>

      <section anchor="retcp_revealing_congestion"
               title="Revealing Congestion Along the Path">
        <t>Throughout this document we focus on path congestion. But some
        forms of fairness, particularly TCP's, also depend on round trip time.
        If TCP-fairness is required, we also propose to measure downstream
        path delay using re-feedback. We give a simple outline of how this
        could work in <xref target="retcp_Re-TTL" />. However, we do not
        expect this to be necessary, as researchers tend to agree that only
        congestion control dynamics need to depend on RTT, not the rate that
        the algorithm would converge on after a period of stability.</t>

        <t>Recall that re-ECN can be used to measure path congestion at any
        point on the path. End-systems know the whole path congestion. The
        receiver knows this by the ratio of negative packets to all the
        packets it observes. The sender knows this same information via the
        feedback.</t>

        <?rfc needLines="19" ?>

        <figure anchor="retcp_Fig_Up_Down_Congestion_Imprecise"
                title="A 2-Queue Example (Imprecise)">
          <artwork><![CDATA[
                                       
    +---+  +----+                +----+  +---+   
    | S |--| Q1 |----------------| Q2 |--| R |
    +---+  +----+                +----+  +---+
      .      .                      .      .
    ^ .      .                      .      .
    | .      .                      .      .
    | .     positive fraction       .      .
 3% |-------------------------------+======= 
    | .      .                      |      .
 2% | .      .                      |      .
    | .      .  negative fraction   |      .
 1% | .      +----------------------+      .
    | .      |                      .      .
 0% +--------------------------------------->
      ^          ^                      ^
      L          M                      N    Observation points
 
]]></artwork>
        </figure>

        <t><xref target="retcp_Fig_Up_Down_Congestion_Imprecise" /> uses a
        simple network to illustrate how re-ECN allows queues to measure
        downstream congestion. The receiver counts negative packets as being
        3% of all received packets. This fraction is fed back to the sender.
        The sender sets 3% of its packets to be positive to match this. This
        fraction of positive packets can be observed along the path. This is
        shown by the horizontal line at 3% in the figure. The negative
        fraction is shown by the stepped line, which rises to meet the
        positive fraction line with a step at each queue where packets are
        marked negative. Two queues are shown (Q1 and Q2) that are currently
        congested. Each time packets pass through a congested queue, a
        fraction are marked negative (1% at Q1 and 2% at Q2). The approximate
        downstream congestion can be measured at the observation points shown
        along the path by subtracting the negative fraction from the positive
        fraction, as shown in the table below (<xref target="Re-TCP" />
        derives these approximations from a precise analysis).</t>

        <?rfc needLines="9" ?>

        <texttable anchor="retcp_Tab_Downstream_Congestion_Example"
                   title="Downstream Congestion Measured at Example Observation Points">
          <ttcol align="center">Observation point</ttcol>

          <ttcol align="center">Approx downstream congestion</ttcol>

          <c>L</c>

          <c>3% - 0% = 3%</c>

          <c>M</c>

          <c>3% - 1% = 2%</c>

          <c>N</c>

          <c>3% - 3% = 0%</c>
        </texttable>
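
        <t>The arithmetic in the table can be reproduced with a short Python
        sketch. The per-queue marking fractions and the positions of the
        observation points are read off the figure; the function name and the
        data layout are assumptions made for illustration. Like the figure,
        the sketch simply adds marking fractions, which is only an
        approximation.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Tuple

# Marking fraction added at each queue, in percent (from the figure).
QUEUE_MARKING = [("Q1", 1), ("Q2", 2)]

# Number of queues upstream of each observation point in the figure.
QUEUES_UPSTREAM = {"L": 0, "M": 1, "N": 2}


def congestion_at(point: str) -> Tuple[int, int]:
    """Return (upstream, downstream) congestion in percent at a point."""
    # Whole-path congestion equals the positive fraction set by the sender.
    whole_path = sum(mark for _, mark in QUEUE_MARKING)
    # Upstream congestion is the negative fraction accumulated so far.
    passed = QUEUE_MARKING[:QUEUES_UPSTREAM[point]]
    upstream = sum(mark for _, mark in passed)
    return upstream, whole_path - upstream  # positive minus negative


for point in QUEUES_UPSTREAM:
    up, down = congestion_at(point)
    print(f"{point}: upstream {up}%, downstream {down}%")
```
]]></artwork>
        </figure>

        <t>For the three observation points this yields downstream congestion
        of 3%, 2% and 0% respectively, matching the table.</t>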

        <t>All along the path, whole-path congestion remains unchanged so it
        can be used as a reference against which to compare upstream
        congestion. The difference predicts downstream congestion for the rest
        of the path. Therefore, measuring the fractions of negative and
        positive packets at any point in the Internet will reveal upstream,
        downstream and whole path congestion.</t>
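
        <t>As a worked illustration of the subtraction above, the following
        minimal Python sketch recomputes the values in <xref
        target="retcp_Tab_Downstream_Congestion_Example" /> from the marking
        fractions in the example. The function and variable names are
        hypothetical and are not part of the re-ECN protocol; the inputs are
        path averages, not values a protocol handler would work with
        directly.</t>

        <figure>
          <artwork><![CDATA[
```python
from fractions import Fraction


def downstream_congestion(positive_fraction, negative_fraction):
    """Approximate downstream congestion at one observation point."""
    return positive_fraction - negative_fraction


# Whole-path congestion: the positive fraction set by the sender,
# which stays constant along the path in the example.
POSITIVE = Fraction(3, 100)

# Upstream congestion: the negative fraction seen at each point
# (0% before Q1, 1% after Q1, 3% after Q2).
NEGATIVE_AT = {
    "L": Fraction(0),
    "M": Fraction(1, 100),
    "N": Fraction(3, 100),
}

for point, negative in NEGATIVE_AT.items():
    downstream = downstream_congestion(POSITIVE, negative)
    print(f"{point}: upstream {float(negative):.0%}, "
          f"downstream {float(downstream):.0%}")
```
]]></artwork>
        </figure>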

        <t>Note: to be absolutely clear these fractions are averages that
        would result from the behaviour of the protocol handler mechanically
        sending positive packets in direct response to incoming feedback---we
        are not saying any protocol handler has to work with these average
        fractions directly.</t>

        <!--<t>{ToDo: Consider whether this para is necessary.} Indeed, it would actually be incorrect for the protocol handlers to work with marking fractions, because TCP congestion control typically halves the packet rate every time there is congestion feedback. Too few packets would re-echo congestion if 3% of the halved packet rate was re-echoed in response to 3% of the earlier, higher packet rate being marked. The re-ECN algorithm for TCP specified by this document balances congestion markings and re-echoed markings octet for octet (which for a TCP with constant size packets also implies packet for packet). 
</t> -->

        <!--______________________________________________________________________-->

        <section anchor="retcp_pos_neg_flows"
                 title="Positive and Negative Flows">
          <t>In <xref target="retcp_Introduction_protocol_in_brief" /> we
          introduced the notion of IP packets having different values
          (negative, positive, cautious, cancelled and neutral). Positive
          and cautious packets have a value of +1, negative packets -1, and
          cancelled and neutral packets zero.</t>

          <t>In the rest of this document we will loosely talk of positive or
          negative flows. A negative flow is one where more negative bytes
          than positive bytes arrive at the receiver. Likewise, a positive
          flow is one where more positive bytes arrive than negative bytes.
          Both of these indicate that the wrong amount of positive bytes has
          been sent.</t>
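
          <t>The loose definition above can be made concrete with a small
          Python sketch. It assumes a flow is represented as a list of
          (marking, size in bytes) pairs; that representation, the function
          names and the label "balanced" for a flow whose positive and
          negative bytes cancel are all hypothetical, introduced only for
          illustration.</t>

          <figure>
            <artwork><![CDATA[
```python
# Worth of each packet marking, as listed in the text above.
WORTH = {
    "positive": 1,
    "cautious": 1,
    "negative": -1,
    "cancelled": 0,
    "neutral": 0,
}


def flow_balance(packets):
    """Sum worth-weighted bytes over (marking, size_in_bytes) pairs."""
    return sum(WORTH[marking] * size for marking, size in packets)


def classify_flow(packets):
    """Label a flow by the sign of its byte balance at the receiver."""
    balance = flow_balance(packets)
    if balance < 0:
        return "negative"   # too few positive bytes were sent
    if balance > 0:
        return "positive"   # too many positive bytes were sent
    return "balanced"       # hypothetical label for a zero balance


honest = [("neutral", 1500), ("negative", 1500), ("positive", 1500)]
print(classify_flow(honest))
```
]]></artwork>
          </figure>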
        </section>
      </section>

      <section anchor="retcp_re_ecn_framework_detail"
               title="Incentive Framework Overview">
        <t><xref target="retcp_Fig_Incentive_Framework" /> sketches the
        incentive framework that we will describe piece by piece throughout
        this section. We will do a first pass in overview, then return to each
        piece in detail. We re-use the earlier example of how downstream
        congestion is derived by subtracting upstream congestion from path
        congestion (<xref target="retcp_Fig_Up_Down_Congestion_Imprecise" />)
        but depict multiple trust boundaries to turn it into an internetwork.
        For clarity, only downstream congestion is shown (the difference
        between the two earlier plots). The graph displays downstream path
        congestion seen in a typical flow as it traverses an example path from
        sender S to receiver R, across networks N1, N2 and N3. Everyone is
        shown using re-ECN correctly, but we intend to show why everyone would
        /choose/ to use it correctly, and honestly.</t>

        <t>Three main types of self-interest can be identified: <list
            style="symbols">
            <t>Users want to transmit data across the network as fast as
            possible, paying as little as possible for the privilege. In this
            respect, there is no distinction between senders and receivers,
            but we must be wary of potential malice by one on the other.</t>

            <t>Network operators want to maximise revenues from the resources
            they invest in. They compete amongst themselves for the custom of
            users.</t>

            <t>Attackers (whether users or networks) want to use any
            opportunity to subvert the new re-ECN system for their own gain or
            to damage the service of their victims, whether targeted or
            random.</t>
          </list></t>

        <?rfc needLines="13" ?>

        <figure anchor="retcp_Fig_Incentive_Framework"
                title="Incentive Framework">
          <artwork><![CDATA[

       policer                        dropper
        |                                |
        |                                |
      S <-----N1----> <---N2---> <---N3--> R    domain
                     |          |                                
                     |          |                
                   Border Gateways
 

]]></artwork>
        </figure>

        <t>
          <list style="hanging">
            <t hangText="Source congestion control:">We want to ensure that
            the sender will throttle its rate as downstream congestion
            increases. Whatever the agreed congestion response (whether
            TCP-compatible or some enhanced QoS), to some extent it will
            always be against the sender's interest to comply.</t>

            <t hangText="Ingress policing:">But it is in all the network
            operators' interests to encourage fair congestion response, so
            that their investments are employed to satisfy the most valuable
            demand. The re-ECN protocol ensures packets carry the necessary
            information about their own expected downstream congestion so that
            N1 can deploy a policer at its ingress to check that S is
            complying with whatever congestion control it should be using
            (<xref target="retcp_Rate_Policing" />). If N1 is extremely
            conservative it could police each flow, but it is likely to just
            police the bulk amount of congestion each customer causes without
            regard to flows, or if it is extremely liberal it need not police
            congestion control at all. In any case, it is always preferable to
            police traffic at the very first ingress into an internetwork,
            before non-compliant traffic can cause any damage.</t>

            <t hangText="Edge egress dropper:">If the policer ensures the
            source has less right to a high rate the higher it declares
            downstream congestion, the source has a clear incentive to
            understate downstream congestion. But, if flows of packets are
            understated when they enter the internetwork, they will have
            become negative by the time they leave. So, we introduce a dropper
            at the last network egress, which drops packets in flows that
            persistently declare negative downstream congestion (see <xref
            target="retcp_Dropper" /> for details).</t>

            <t hangText="Inter-domain traffic policing:">But next we must ask,
            if congestion arises downstream (say in N3), what is the ingress
            network's (N1's) incentive to police its customers' response? If
            N1 turns a blind eye, its own customers benefit while other
            networks suffer. This is why all inter-domain QoS architectures
            (e.g. Intserv, Diffserv) police traffic each time it crosses
            a trust boundary. We have already shown that re-ECN gives a
            trustworthy measure of the expected downstream congestion that a
            flow will cause by subtracting negative volume from positive at
            any intermediate point on a path. N3 (say) can use this measure to
            police all the responses to congestion of all the sources beyond
            its upstream neighbour (N2), but in bulk with one very simple
            passive mechanism, rather than per flow, as we will now
            explain.</t>

            <t
            hangText="Emulating policing with inter-domain congestion penalties:">Between
            high-speed networks, we would rather avoid per-flow policing, and
            we would rather avoid holding back traffic while it is policed.
            Instead, once re-ECN has arranged headers to carry downstream
            congestion honestly, N2 can contract to pay N3 penalties in
            proportion to a single bulk count of the congestion metrics
            crossing their mutual trust boundary (<xref
            target="retcp_Inter-domain_Policing" />). In this way, N3 puts
            pressure on N2 to suppress downstream congestion, for every flow
            passing through the border interface, even though they will all
            start and end in different places, and even though they may all be
            allowed different responses to congestion. The figure depicts this
            downward pressure on N2 by the solid downward arrow at the egress
            of N2. Then N2 has an incentive either to police the congestion
            response of its own ingress traffic (from N1) or to emulate
            policing by applying penalties to N1 in turn on the basis of
            congestion counted at their mutual boundary. In this recursive
            way, the incentives for each flow to respond correctly to
            congestion trace back with each flow precisely to each source,
            despite the mechanism not recognising flows (see <xref
            target="retcp_E2e_QoS" />).</t>

            <!-- <t hangText="A digression: Ingress edge synthesis of end-to-end QoS:"> In fact, by deliberately allowing some customers a more lax response to congestion, an ingress network can synthesise differentiated service. If the ingress network charges a higher but /flat/ subscription to its customers for this privilege, it would recover the higher but unpredictable downstream charges (see <xref target="retcp_Per-user_Policing" />). The ingress effectively brokers the risk of downstream congestion charges. Charging to reduce the risk of having to respond to congestion is equivalent to offering enhanced quality of service (see <xref target="retcp_Rate_Policing" /> and the caveat in <xref target="retcp_E2e_QoS" />). 
</t> 
-->

            <t hangText="Inter-domain congestion charging diversity:">Any two
            networks are free to agree any of a range of penalty regimes
            between themselves but they would only provide the right
            incentives if they were within the following reasonable
            constraints. N2 should expect to have to pay penalties to N3 where
            penalties monotonically increase with the volume of congestion and
            negative penalties are not allowed. For instance, they may agree
            an SLA with tiered congestion thresholds, where higher penalties
            apply the higher the threshold that is broken. But the most
            obvious (and useful) form of penalty is where N3 levies a charge
            on N2 proportional to the volume of downstream congestion N2 dumps
            into N3. In the explanation that follows, we assume this specific
            variant of volume charging between networks - charging
            proportionate to the volume of congestion.</t>

            <t>We must make clear that we are not advocating that everyone
            should use this form of contract. We are well aware that the IETF
            tries to avoid standardising technology that depends on a
            particular business model. And we strongly share this desire to
            encourage diversity. Our aim is merely to show that border
            policing can at least work with this one model; we can then assume
            that operators might experiment with the metric in other models
            (see <xref target="retcp_Inter-domain_Policing" /> for examples).
            Of course, operators are free to complement this usage element of
            their charges with traditional capacity charging, and we expect
            they will, as predicted by economics.</t>

            <t hangText="No congestion charging to users:">Bulk congestion
            penalties at trust boundaries are passive and extremely simple,
            and lose none of their per-packet precision from one boundary to
            the next (unlike Diffserv all-address traffic conditioning
            agreements, which dissipate their effectiveness across long
            topologies). But at any trust boundary, there is no imperative to
            use congestion charging. Traditional traffic policing can be used,
            if its complexity and cost are preferred. In particular, at the
            boundary with end customers (e.g. between S and N1), traffic
            policing will most likely be more appropriate. Policer complexity
            is less of a concern at the edge of the network. And end-customers
            are known to be highly averse to the unpredictability of
            congestion charging.</t>

            <t hangText="NOTE WELL:">This document neither advocates nor
            requires congestion charging for end customers; it advocates but
            does not require inter-domain congestion charging.</t>

            <t
            hangText="Competitive discipline of inter-domain traffic engineering:">With
            inter-domain congestion charging, a domain seems to have a
            perverse incentive to fake congestion; N2's profit depends on the
            difference between congestion at its ingress (its revenue) and at
            its egress (its cost). So, overstating internal congestion seems
            to increase profit. However, smart border routing <xref
            target="Smart_rtg" /> by N1 will bias its routing towards the
            least cost routes. So, N2 risks losing all its revenue to
            competitive routes if it overstates congestion (see <xref
            target="retcp_Traffic_Engineering" />). In other words, if N2 is
            the least congested route, its ability to raise excess profits is
            limited by the congestion on the next least congested route.</t>

            <t hangText="Closing the loop:">All the above elements conspire to
            trap everyone between two opposing pressures, ensuring the
            downstream congestion metric arrives at the destination neither
            above nor below zero. So, we have arrived back where we started in
            our argument. The ingress edge network can rely on downstream
            congestion declared in the packet headers presented by the sender.
            So it can police the sender's congestion response accordingly.</t>

            <t hangText="Evolvability of congestion control:">We have seen
            that re-ECN enables policing at the very first ingress. We have
            also seen that, as flows continue on their path through further
            networks downstream, re-ECN removes the need for further
            per-domain ingress policing of all the different congestion
            responses allowed to each different flow. This is why the
            evolvability of re-ECN policing is so superior to bottleneck
            policing or to any policing of different QoS for different flows.
            Even if all access networks choose to conservatively police
            congestion per flow, each will want to compete with the others to
            allow new responses to congestion for new types of application.
            With re-ECN, each can introduce new controls independently,
            without coordinating with other networks and without having to
            standardise anything. But, as we have just seen, by making
            inter-domain penalties proportionate to bulk downstream congestion,
            downstream networks can be agnostic to the specific congestion
            response for each flow, but they can still apply more penalty the
            more liberal the ingress access network has been in the response
            to congestion it allowed for each flow.</t>
          </list>
        </t>
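
        <t>The constraints on inter-domain penalty regimes given above under
        "Inter-domain congestion charging diversity" (penalties increase
        monotonically with congestion volume and are never negative) can be
        sketched as follows. This is only an illustration under stated
        assumptions: the function names, the price unit and the tier
        representation are hypothetical, and the tiered variant assumes the
        contract lists higher penalties for higher thresholds.</t>

        <figure>
          <artwork><![CDATA[
```python
def proportional_penalty(congestion_volume_bytes, price_per_byte):
    """Charge proportional to the congestion volume crossing a border.

    A negative bulk count earns no refund, because negative
    penalties are not allowed.
    """
    return max(0, congestion_volume_bytes) * price_per_byte


def tiered_penalty(congestion_volume_bytes, tiers):
    """Penalty of the highest congestion threshold that was broken.

    tiers: list of (threshold_bytes, penalty) pairs, assumed to
    have penalties that rise with the threshold.
    """
    penalty = 0
    for threshold, tier_penalty in sorted(tiers):
        if congestion_volume_bytes > threshold:
            penalty = tier_penalty
    return penalty


# Hypothetical SLA between N2 (payer) and N3 (payee).
SLA_TIERS = [(1_000, 10), (5_000, 50)]
print(proportional_penalty(2_000_000, 3))
print(tiered_penalty(6_000, SLA_TIERS))
```
]]></artwork>
        </figure>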

        <!-- {ToDo: Leads to optimality. Proportional fairness between
      networks, but they may choose the congestion response they make their
      users keep to.} -->

        <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

        <t>We now take a second pass over the incentive framework, filling in
        the detail.</t>
      </section>

      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_Dropper" title="Egress Dropper">
        <t>As traffic leaves the last network before the receiver (domain N3
        in <xref target="retcp_Fig_Incentive_Framework" />), the fraction of
        positive octets in a flow should match the fraction of negative octets
        introduced by congestion marking (red packets), leaving a balance of
        zero. If it is less (a negative flow), it implies that the source is
        understating path congestion (which will reduce the penalties that N2
        owes N3).</t>

        <t>If flows are positive, N3 need take no action---this simply means
        its upstream neighbour is paying more penalties than it needs to, and
        the source is going slower than it needs to. But, to protect itself
        against persistently negative flows, N3 will need to install a dropper
        at its egress. <xref target="retcp_Alg_Sanction_Negative" /> gives a
        suggested algorithm for this dropper. There is no intention that the
        dropper algorithm needs to be standardised; it is merely provided to
        show that an efficient, robust algorithm is possible. But whatever
        algorithm is used must meet the criteria below: <list style="symbols">
            <t>It SHOULD introduce minimal false positives for honest
            flows;</t>

            <t>It SHOULD quickly detect and sanction dishonest flows (minimal
            false negatives);</t>

            <t>It SHOULD be invulnerable to state exhaustion attacks from
            malicious sources. For instance, if the dropper uses flow-state,
            it should not be possible for a source to send numerous packets,
            each with a different flow ID, to force the dropper to exhaust its
            memory capacity (rationale for SHOULD: Continuously sending
            keep-alive packets might be perfectly reasonable behaviour, so we
            can't distinguish a deliberate attack from reasonable levels of
            such behaviour. Therefore it is strictly impossible to be
            invulnerable to such an attack);</t>

            <t>It MUST introduce sufficient loss in goodput so that malicious
            sources cannot play off losses in the egress dropper against
            higher allowed throughput. Salvatori <xref target="CLoop_pol" />
            describes this attack, which involves the source understating path
            congestion then inserting forward error correction (FEC) packets
            to compensate for expected losses;</t>

            <t>It MUST NOT be vulnerable to `identity whitewashing', where a
            transport can label a flow with a new ID more cheaply than paying
            the cost of continuing to use its current ID.</t>
          </list></t>

        <t>Note that the dropper operates on flows but we would like it not to
        require per-flow state. This is why we have been careful to ensure
        that all flows MUST start with a cautious packet. If a flow does not
        start with a cautious packet, a dropper is likely to treat it
        unfavourably. This risk makes it worth sending a cautious packet at
        the start of a flow, even though there is a cost to the sender of
        doing so (positive `worth'). Indeed, with cautious packets, the rate
        at which a sender can generate new flows can be limited (<xref
        target="retcp_Policer_Implementations" />). In this respect, cautious
        packets work like Handley's state set-up bit <xref
        target="Steps_DoS" />.</t>

        <t><xref target="retcp_Alg_Sanction_Negative" /> also gives an example
        dropper implementation that aggregates flow state. Dropper algorithms
        will often maintain a moving average across flows of the fraction of
        positive packets. When maintaining an average across flows, a dropper
        SHOULD only allow flows into the average if they start with a cautious
        packet, but it SHOULD NOT include cautious packets in the positive
        packet average. A sender sends cautious packets when it does not have
        the benefit of feedback from the receiver. So, counting cautious
        packets would be likely to make the average unnecessarily positive,
        providing headroom (or should we say footroom?) for dishonest
        (negative) traffic.</t>

        <t>If the dropper detects a persistently negative flow, it SHOULD drop
        sufficient negative and neutral packets to force the flow to not be
        negative. Drops SHOULD be focused on just sufficient packets in
        misbehaving flows to remove the negative bias while doing minimal
        extra harm.</t>
      </section>

      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_Rate_Policing" title="Ingress Policing">
        <t>Access operators who wish to limit the congestion that a sender is
        able to cause can deploy policers at the very first ingress to the
        internetwork. Re-ECN has been designed to avoid the need for
        bottleneck policing so that we can avoid a future where a single rate
        adaptation policy is embedded throughout the network. Instead, re-ECN
        allows the particular rate adaptation policy to be solely agreed
        bilaterally between the sender and its ingress access provider ([ref
        other document] discusses possible ways to signal between them), which
        allows congestion control to be policed, but maintains its
        evolvability, requiring only a single, local box to be updated.</t>

        <t><xref target="retcp_Policer_Implementations" /> gives examples of
        per-user policing algorithms. But there is no implication that these
        algorithms are to be standardised, or that they are ideal. The ingress
        rate policer is the part of the re-ECN incentive framework that is
        intended to be the most flexible. Once endpoint protocol handlers for
        re-ECN and egress droppers are in place, operators can choose exactly
        which congestion response they want to police, and whether they want
        to do it per user, per flow or not at all.</t>

        <t>The re-ECN protocol allows these ingress policers to easily perform
        bulk per-user policing (<xref target="retcp_Per_User_Policing" />).
        This is likely to provide sufficient incentive to the user to
        correctly respond to congestion without needing the policing function
        to be overly complex. If an access operator chose, it could use
        per-flow policing according to the widely adopted TCP rate adaptation
        (<xref target="retcp_Per_Flow_Policing" />) or other alternatives,
        but this would introduce extra complexity to the system.</t>

        <t>If a per-flow rate policer is used, it should use path (not
        downstream) congestion as the relevant metric, which is represented by
        the fraction of octets in packets with positive worth (positive and
        cautious packets) and in cancelled packets. Of course, re-ECN provides
        all the information a policer needs directly in the packets being
        policed. So, even policing TCP's AIMD algorithm is relatively
        straightforward (<xref target="retcp_Per_Flow_Policing" />).</t>

        <t>Note that we have included cancelled packets in the measure of path
        congestion. Cancelled packets arise when the sender sends a positive
        packet in response to feedback, but then this positive packet just
        happens to be congestion marked itself. One would not normally expect
        many cancelled packets at the first ingress because one would not
        normally expect much congestion marking to have been necessary that
        soon in the path. However, a home network or campus network may well
        sit between the sending endpoint and the ingress policer, so some
        congestion may occur upstream of the policer. And if congestion does
        occur upstream, some cancelled packets should be visible, and should
        be taken into account in the measure of path congestion.</t>

        <t>But a much more important reason for including cancelled packets in
        the measure of path congestion at an ingress policer is that a sender
        might otherwise subvert the protocol by sending cancelled packets
        instead of neutral packets. Like neutral, cancelled packets are worth
        zero, so the sender knows they won't be counted against any quota it
        might have been allowed. But unlike neutral packets, cancelled packets
        are immune to congestion marking, because they have already been
        congestion marked. So, it is both correct and useful that cancelled
        packets should be included in a policer's measure of path congestion,
        as this removes the incentive the sender would otherwise have to mark
        more packets as cancelled than it should.</t>
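
        <t>The path congestion metric described above for a per-flow policer
        can be sketched as follows. This is a minimal illustration, not one
        of the policer algorithms referenced in this section; the packet
        representation as (marking, size in octets) pairs and all names are
        hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
# Markings that count towards path congestion at an ingress
# policer: positive worth (positive and cautious) plus cancelled.
PATH_CONGESTION_MARKINGS = {"positive", "cautious", "cancelled"}


def path_congestion_fraction(packets):
    """Fraction of octets in positive, cautious or cancelled packets.

    packets: iterable of (marking, size_in_octets) pairs.
    Returns 0.0 if no octets have been seen.
    """
    packets = list(packets)
    total = sum(size for _, size in packets)
    if total == 0:
        return 0.0
    counted = sum(size for marking, size in packets
                  if marking in PATH_CONGESTION_MARKINGS)
    return counted / total


# 100 equal-sized packets: 97 neutral, 2 positive, 1 cancelled.
sample = ([("neutral", 1000)] * 97
          + [("positive", 1000)] * 2
          + [("cancelled", 1000)])
print(path_congestion_fraction(sample))
```
]]></artwork>
        </figure>

        <t>In this sketch, a sender that relabelled neutral packets as
        cancelled would raise its own measured path congestion, which is the
        disincentive the preceding paragraph describes.</t>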

        <t>An ingress policer should also ensure that flows are not already
        negative when they enter the access network. As with cancelled
        packets, the presence of negative packets will typically be unusual.
        Therefore it will be easy to detect negative flows at the ingress by
        just detecting negative packets then monitoring the flow they belong
        to.</t>

        <t>Of course, even if the sender does operate its own network, it may
        arrange not to congestion mark traffic. Whether the sender does this
        or not is of no concern to anyone else except the sender. Such a
        sender will not be policed against its own network's contribution to
        congestion, but the only resulting problem would be overload in the
        sender's own network.</t>

        <t>Finally, we must not forget that an easy way to circumvent re-ECN's
        defences is for the source to turn off re-ECN support, by setting the
        Not-RECT codepoint, implying RFC3168 compliant traffic. Therefore an
        ingress policer should put a general rate-limit on Not-RECT traffic,
        which SHOULD be lax during early, patchy deployment, but will have to
        become stricter as deployment widens. Similarly, flows starting
        without a cautious packet can be confined by a strict rate-limit used
        for the remainder of flows that haven't proved they are well-behaved
        by starting correctly (therefore they need not consume any flow
        state---they are just confined to the `misbehaving' bin if they carry
        an unrecognised flow ID).</t>

        <!-- {ToDo:
      Weighted Policer} -->
      </section>

      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_Inter-domain_Policing"
               title="Inter-domain Policing">
        <t>One of the main design goals of re-ECN is for border security
        mechanisms to be as simple as possible, otherwise they will become the
        pinch-points that limit scalability of the whole internetwork. We want
        to avoid per-flow processing at borders and to keep to passive
        mechanisms that can monitor traffic in parallel to forwarding, rather
        than having to filter traffic inline---in series with forwarding. Such
        passive, off-line mechanisms are essential for future high-speed
        all-optical border interconnection where packets cannot be buffered
        while they are checked for policy compliance.</t>

        <t>So far, we have been able to keep the border mechanisms simple,
        despite having had to harden them against some subtle attacks on the
        re-ECN design. The mechanisms are still passive and avoid per-flow
        processing.</t>

        <t>The basic accounting mechanism at each border interface simply
        involves accumulating the volume of packets with positive worth
        (positive and cautious packets), and subtracting the volume of those
        with negative worth (red packets). Even though this mechanism takes no
        regard of flows, over an accounting period (say a month) this
        subtraction will account for the downstream congestion caused by all
        the flows traversing the interface, wherever they come from, and
        wherever they go to. The two networks can agree to use this metric
        however they wish to determine some congestion-related penalty against
        the upstream network. Although the algorithm could hardly be simpler,
        it is spelled out using pseudo-code in <xref
        target="retcp_Bulk_Alg_Metering" />.</t>
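
        <t>For orientation only, the bulk accounting described above might
        look like the following Python sketch. It does not replace the
        pseudo-code referenced above; the class name, method names and packet
        representation are hypothetical, and the length of the accounting
        period is left to the two networks.</t>

        <figure>
          <artwork><![CDATA[
```python
# Worth of each marking: positive and cautious packets count
# +1 per octet, negative (red) packets -1, all others zero.
WORTH = {
    "positive": 1,
    "cautious": 1,
    "negative": -1,
    "cancelled": 0,
    "neutral": 0,
}


class BorderMeter:
    """Passive bulk meter for one border interface.

    Keeps a single counter and no per-flow state.
    """

    def __init__(self):
        self.downstream_congestion_octets = 0

    def observe(self, marking, size_octets):
        """Account for one packet crossing the border."""
        self.downstream_congestion_octets += WORTH[marking] * size_octets

    def close_period(self):
        """Return the total for the period and reset the counter."""
        total = self.downstream_congestion_octets
        self.downstream_congestion_octets = 0
        return total


meter = BorderMeter()
for marking in ("positive", "cautious", "negative", "neutral"):
    meter.observe(marking, 1500)
print(meter.close_period())
```
]]></artwork>
        </figure>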

        <!-- {ToDo:
      Replace the xml from here to just before "Note that the guiding
      principle..." with that in
      draft-briscoe-tsvwg-re-ecn-border-cheat-02a_fragment.xml} -->

        <t>Various attempts to subvert the re-ECN design have been made. In
        all cases their root cause is persistently negative flows. But, after
        describing these attacks we will show that we don't actually have to
        get rid of all persistently negative flows in order to thwart the
        attacks.</t>

        <t>In honest flows, downstream congestion is measured as positive
        minus negative volume. So if all flows are honest (i.e. not
        persistently negative), adding all positive volume and all negative
        volume without regard to flows will give an aggregate measure of
        downstream congestion. But such simple aggregation is only possible if
        no flows are persistently negative. Unless persistently negative flows
        are completely removed, they will reduce the aggregate measure of
        congestion. The aggregate may still be positive overall, but not as
        positive as it would have been had the negative flows been
        removed.</t>

        <t>In <xref target="retcp_Dropper" /> we discussed how to sanction
        traffic to remove, or at least to identify, persistently negative
        flows. But, even if the sanction for negative traffic is to discard
        it, unless it is discarded at the exact point it goes negative, it
        will wrongly subtract from aggregate downstream congestion, at least
        at any borders it crosses after it has gone negative but before it is
        discarded.</t>

        <t>We rely on sanctions to deter dishonest understatement of
        congestion. But even the ultimate sanction of discard can only be
        effective if the sender is bothered about the data getting through to
        its destination. A number of attacks have been identified where a
        sender gains from sending dummy traffic or it can attack someone or
        something using dummy traffic even though it isn't communicating any
        information to anyone: <list style="symbols">
            <t>A host can send traffic with no positive packets towards its
            intended destination, aiming to transmit as much traffic as any
            dropper will allow <xref target="Bauer06" />. It may add
            forward error correction (FEC) to repair as much drop as it
            experiences.</t>

            <t>A host can send dummy traffic into the network with no positive
            packets and with no intention of communicating with anyone, but
            merely to cause higher levels of congestion for others who do want
            to communicate (DoS). So, to ride over the extra congestion,
            everyone else has to spend more of whatever rights to cause
            congestion they have been allowed.</t>

            <t>A network can simply create its own dummy traffic to congest
            another network, perhaps causing it to lose business at no cost to
            the attacking network. This is a form of denial of service
            perpetrated by one network on another. The preferential drop
            measures in [ref other document] provide crude protection against
            such attacks, but we are not overly worried about more accurate
            prevention measures, because it is already possible for networks
            to DoS other networks on the general Internet, but they generally
            don't because of the grave consequences of being found out. We are
            only concerned if re-ECN increases the motivation for such an
            attack, as in the next example.</t>

            <t>A network can just generate negative traffic and send it over
            its border with a neighbour to reduce the overall penalties that
            it should pay to that neighbour. It could even initialise the TTL
            so it expired shortly after entering the neighbouring network,
            reducing the chance of detection further downstream. This attack
            need not be motivated by a desire to deny service and indeed need
            not cause denial of service. A network's main motivator would most
            likely be to reduce the penalties it pays to a neighbour. But, the
            prospect of financial gain might tempt the network into mounting a
            DoS attack on the other network as well, given the gain would
            offset some of the risk of being detected.</t>
          </list></t>

        <!-- {ToDo: Consider adding Bauer's other (esoteric) attack:
      Bauer et al argue there may even be cases where it is in a user's direct
      self-interest to send dummy traffic if they are playing a strategic game
      based on the assumption that a network operator will eventually upgrade
      capacity if congestion persists. If a user knows she will have high
      future demand, she may calculate it is worth sending a small amount of
      dummy traffic to push up congestion for other users with current demand.
      This may lead the network to upgrade earlier than it otherwise would,
      resulting in her higher future demand not causing any congestion, so
      that overall she spends less of her rights to cause congestion by
      strategically sending dummy traffic. } -->

        <t>The first step towards a solution to all these problems with
        negative flows is to be able to estimate the contribution they make to
        downstream congestion at a border and to correct the measure
        accordingly. Although ideally we want to remove negative flows
        themselves, perhaps surprisingly, the most effective first step is to
        cancel out the polluting effect negative flows have on the measure of
        downstream congestion at a border. It is more important to get an
        unbiased estimate of their effect, than to try to remove them all. A
        suggested algorithm to give an unbiased estimate of the contribution
        from negative flows to the downstream congestion measure is given in
        <xref target="retcp_Inflation_Negative_Flows" />.</t>
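
        <t>To illustrate the polluting effect, and what cancelling it out
        means, the following Python sketch computes downstream congestion at
        a border with and without the contribution of negative flows. It is
        only an idealised illustration and is not the algorithm referenced
        above: it assumes the border keeps a byte balance for every flow,
        which a practical border mechanism would want to avoid. The function
        name, the flow identifiers and the marking labels are all
        hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
from collections import defaultdict


def border_congestion(packets):
    """Return (raw, corrected) downstream congestion in bytes.

    packets: iterable of (flow_id, size_bytes, marking) tuples, where
    marking is "positive", "negative" or anything else for neutral.
    These labels are assumptions made for this sketch.
    """
    balance = defaultdict(int)  # flow_id -> positive minus negative bytes
    for flow_id, size, marking in packets:
        if marking == "positive":
            balance[flow_id] += size
        elif marking == "negative":
            balance[flow_id] -= size
    # Raw measure: negative flows subtract from the aggregate.
    raw = sum(balance.values())
    # Corrected measure: flows whose balance is negative contribute zero.
    corrected = sum(b for b in balance.values() if b > 0)
    return raw, corrected


packets = [
    ("honest", 1500, "positive"),
    ("honest", 1500, "negative"),
    ("honest", 1500, "positive"),
    ("dummy", 1500, "negative"),
    ("dummy", 1500, "negative"),
]
raw, corrected = border_congestion(packets)
```
]]></artwork>
        </figure>

        <t>In this toy example the dummy flow drags the raw aggregate below
        zero, while the corrected measure reflects only the flow that is
        still being held to account.</t>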

        <t>Although making an accurate assessment of the contribution from
        negative flows may not be easy, just the single step of neutralising
        their polluting effect on congestion metrics removes all the gains
        networks could otherwise make from mounting dummy traffic attacks on
        each other. This puts all networks on the same side (only with respect
        to negative flows of course), rather than being pitched against each
        other. The network where a flow goes negative and all the networks
        downstream of it lose out by not being reimbursed for any congestion
        that flow causes. So they all have an interest in getting rid of
        such negative flows. Networks forwarding a flow before it goes
        negative aren't strictly on the same side, but they are disinterested
        bystanders---they don't care that the flow goes negative downstream,
        but at least they can't actively gain from making it go negative. The
        problem becomes localised so that once a flow goes negative, all the
        networks from where it happens and beyond downstream each have a small
        problem, each can detect it has a problem and each can get rid of the
        problem if it chooses to. But negative flows can no longer be used for
        any new attacks.</t>

        <t>Once an unbiased estimate of the effect of negative flows can be
        made, the problem reduces to detecting and preferably removing flows
        that have gone negative as soon as possible. But importantly, complete
        eradication of negative flows is no longer critical---best endeavours
        will be sufficient.</t>

        <t>For instance, let us consider the case where a source sends traffic
        with no positive packets at all, hoping to at least get as much
        traffic delivered as network-based droppers will allow. The flow is
        likely to go at least slightly negative in the first network on the
        path (N1 if we use the example network layout in <xref
        target="retcp_Fig_Incentive_Framework" />). If all networks use the
        algorithm in <xref target="retcp_Inflation_Negative_Flows" /> to
        inflate penalties at their border with an upstream network, they will
        remove the effect of negative flows. So, for instance, N2 will not be
        paying a penalty to N1 for this flow. Further, because the flow
        contributes no positive packets at all, a dropper at the egress will
        completely remove it.</t>

        <t>The remaining problem is that every network is carrying a flow that
        is causing congestion to others but not being held to account for the
        congestion it is causing. Whenever the fail-safe border algorithm
        (<xref target="retcp_Fail-safes" />) or the border algorithm to
        compensate for negative flows (<xref
        target="retcp_Inflation_Negative_Flows" />) detects a negative flow,
        it can instantiate a focused dropper for that flow locally. It may be
        some time before the flow is detected, but the more strongly negative
        the flow is, the more quickly it will be detected by the fail-safe
        algorithm. But, in the meantime, it will not be distorting border
        incentives. Until it is detected, if it contributes to drop anywhere,
        its packets will tend to be dropped before others if queues use the
        preferential drop rules in [ref other document], which discriminate
        against non-positive packets. All networks below the point where a
        flow goes negative (N1, N2 and N3 in this case) have an incentive to
        remove this flow, but the queue where it first goes negative (in N1)
        can of course remove the problem for everyone downstream.</t>

        <t>In the case of DDoS attacks, <xref
        target="retcp_DDoS_Mitigation" /> describes how re-ECN mitigates their
        force.</t>
      </section>

      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_Fail-safes" title="Inter-domain Fail-safes">
        <t>The mechanisms described so far create incentives for rational
        network operators to behave. That is, one operator aims to make
        another behave responsibly by applying penalties and expects a
        rational response (i.e. one that trades off costs against benefits).
        It is usually reasonable to assume that other network operators will
        behave rationally (policy routing can avoid those that might not). But
        this approach does not protect against the misconfigurations and
        accidents of other operators.</t>

        <t>Therefore, we propose the following two mechanisms at a network's
        borders to provide "defence in depth". Both are similar: <list
            style="hanging">
            <t hangText="Highly positive flows:">A small sample of positive
            packets should be picked randomly as they cross a border
            interface. Then subsequent packets matching the same source and
            destination address and DSCP should be monitored. If the fraction
            of positive packets is well above a threshold (to be determined by
            operational practice), a management alarm SHOULD be raised, and
            the flow MAY be automatically subject to focused drop.</t>

            <t hangText="Persistently negative flows:">A small sample of
            congestion marked (red) packets should be picked randomly as they
            cross a border interface. Then subsequent packets matching the
            same source and destination address and DSCP should be monitored.
            If the balance of positive packets minus negative packets
            (measured in bytes) is persistently negative, a management alarm
            SHOULD be raised, and the flow MAY be automatically subject to
            focused drop.</t>
          </list></t>

        <t>Both these mechanisms rely on the fact that, when samples are
        picked randomly solely from positive (or negative) packets, highly
        positive (or negative) flows will appear in the sample more
        quickly.</t>
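
        <t>The fail-safe for persistently negative flows might be sketched in
        Python as follows. This is a minimal illustration under stated
        assumptions, not a specified algorithm: the class name, the sampling
        probability and the use of a fixed byte threshold to stand in for
        `persistently negative' are all hypothetical choices, and the marking
        labels are placeholders for the real codepoints.</t>

        <figure>
          <artwork><![CDATA[
```python
import random


class NegativeFlowMonitor:
    """Fail-safe sketch: watch flows sampled from negative packets."""

    def __init__(self, sample_prob=0.01, alarm_bytes=-30000, rng=None):
        self.sample_prob = sample_prob    # assumed sampling probability
        self.alarm_bytes = alarm_bytes    # assumed alarm threshold (bytes)
        self.rng = rng or random.Random()
        self.balance = {}                 # (src, dst, dscp) -> byte balance
        self.alarms = set()

    def observe(self, src, dst, dscp, size, marking):
        """Process one packet; return True if its flow has raised an alarm."""
        key = (src, dst, dscp)
        if key not in self.balance:
            # Start monitoring only from a randomly sampled negative packet.
            sampled = (marking == "negative"
                       and self.rng.random() < self.sample_prob)
            if sampled:
                self.balance[key] = 0     # subsequent packets are counted
            return False
        if marking == "positive":
            self.balance[key] += size
        elif marking == "negative":
            self.balance[key] -= size
        if self.balance[key] <= self.alarm_bytes:
            self.alarms.add(key)          # candidate for focused drop
        return key in self.alarms
```
]]></artwork>
        </figure>

        <t>Because monitoring only ever starts from a negative packet, a flow
        that sends many negative packets is picked up sooner than one that
        sends few, without the border having to hold state for every
        flow.</t>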
      </section>

      <section anchor="retcp_Case_against_Classic_Feedback"
               title="The Case against Classic Feedback">
        <t>A system that produces an optimal outcome as a result of everyone's
        selfish actions is extremely powerful, especially one that enables
        evolvability of congestion control. But why do we have to change to
        re-ECN to achieve it? Can't classic congestion feedback (as used
        already by standard ECN) be arranged to provide similar incentives and
        similar evolvability? Superficially it can. Kelly's seminal work
        showed how we can allow everyone the freedom to evolve whatever
        congestion control behaviour is in their application's best interest
        but still optimise the whole system of networks and users by placing a
        price on congestion to ensure responsible use of this
        freedom (<xref target="Evol_cc" />). Kelly used ECN with its
        classic congestion feedback model as the mechanism to convey
        congestion price information. The mechanism could be thought of as
        volume charging; except only the volume of packets marked with
        congestion experienced (CE) was counted.</t>

        <t>However, below we explain why relying on classic feedback
        /required/ congestion charging to be used, while re-ECN achieves the
        same powerful outcome (given it is built on Kelly's foundations), but
        does not /require/ congestion charging. In brief, the problem with
        classic feedback is that the incentives have to trace the indirect
        path back to the sender---the long way round the feedback loop. For
        example, if classic feedback were used in <xref
        target="retcp_Fig_Incentive_Framework" />, N2 would have had to
        influence N1 via all of N3, R &amp; S rather than directly. <list
            style="hanging">
            <t hangText="Inability to agree what is happening downstream:">In
            order for a network to police its upstream neighbour's congestion
            response, the two neighbours should be able to agree on the
            congestion to be
            responded to. Whatever the feedback regime, as packets change
            hands at each trust boundary, any path metrics they carry are
            verifiable by both neighbours. But, with a classic path metric,
            they can only agree on the /upstream/ path congestion.</t>

            <t hangText="Inaccessible back-channel:">The network needs a
            whole-path congestion metric if it wants to control the source.
            Classically, whole path congestion emerges at the destination, to
            be fed back from receiver to sender in a back-channel. But, in any
            data network, back-channels need not be visible to relays, as they
            are essentially communications between the end-points. They may be
            encrypted, asymmetrically routed or simply omitted, so no network
            element can reliably intercept them. The congestion charging
            literature solves this problem by charging the receiver and
            assuming this will cause the receiver to refer the charges to the
            sender. But, of course, this creates unintended
            side-effects...</t>

            <t hangText="`Receiver pays' unacceptable:">In connectionless
            datagram networks, receivers and receiving networks cannot prevent
            reception from malicious senders, so `receiver pays' opens them to
            `denial of funds' attacks.</t>

            <t
            hangText="End-user congestion charging unacceptable in many societies:">Even
            if `denial of funds' were not a problem, we know that end-users
            are highly averse to the unpredictability of congestion charging
            and anyway, we want to avoid restricting network operators to just
            one retail tariff. But with classic feedback only an upstream
            metric is available, so we cannot avoid having to wrap the
            `receiver pays' money flow around the feedback loop, necessarily
            forcing end-users to be subjected to congestion charging.</t>
          </list></t>

        <t>To summarise so far, with classic feedback, policing congestion
        response without losing evolvability /requires/ congestion charging of
        end-users and a `receiver pays' model, whereas, with re-ECN, it is
        still possible to influence incentives using congestion charging but
        using the safer `sender pays' model. However, congestion charging is
        only likely to be appropriate between domains. So, without losing
        evolvability, re-ECN enables technical policing mechanisms that are
        more appropriate for end users than congestion pricing.</t>
      </section>

      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_Simulations" title="Simulations">
        <t>Simulations of policer and dropper performance done for the
        multi-bit version of re-feedback have been included in section 5
        "Dropper Performance" of <xref target="Re-fb" />. Simulations of
        policer and dropper for the re-ECN version described in this document
        are work in progress.</t>
      </section>
    </section>

    <!-- =============================================================== -->

    <section anchor="retcp_Other_Applications"
             title="Other Applications of Re-ECN">
      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_DDoS_Mitigation" title="DDoS Mitigation">
        <t>A flooding attack is inherently about congestion of a resource.
        Because re-ECN ensures the sources causing network congestion
        experience the cost of their own actions, it acts as a first line of
        defence against DDoS. As load focuses on a victim, upstream queues
        grow, requiring honest sources to pre-load their traffic with a higher
        fraction of positive packets. Once downstream queues are so congested
        that they are dropping traffic, they will be marking 100% of the
        traffic they do forward as negative. Honest sources will therefore be
        sending 100% positive packets (and will therefore be severely
        rate-limited at the ingress).</t>

        <t>Senders under malicious control can either do the same as honest
        sources, and be rate-limited at ingress, or they can understate
        congestion by sending more neutral RECT packets than they should. If
        sources understate congestion (i.e. do not re-echo sufficient positive
        packets) and the preferential drop ranking is implemented on queues
        ([ref other document]), these queues will preserve positive traffic
        until last. So, the neutral traffic from malicious sources will all be
        automatically dropped first. Either way, the malicious sources cannot
        send more than honest sources.</t>
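
        <t>The effect of preferential drop on understated traffic can be
        sketched in Python as below. The actual drop ranking is defined
        elsewhere, so this sketch assumes only the two-level rule implied
        here, namely that non-positive packets are discarded before positive
        ones. The function name and marking labels are hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
def choose_drops(queue, n_drop):
    """Return indices of the n_drop packets to discard.

    queue: list of marking labels in arrival order ("positive",
    "neutral", "negative"); the labels are assumptions of this sketch.
    """
    # False sorts before True, so non-positive packets come first.
    # sorted() is stable, so arrival order breaks ties within a class.
    order = sorted(range(len(queue)), key=lambda i: queue[i] == "positive")
    return sorted(order[:n_drop])


queue = ["positive", "neutral", "positive", "neutral", "negative"]
```
]]></artwork>
        </figure>

        <t>With this queue, a need to discard two packets removes only the
        neutral ones; positive packets are touched only once every
        non-positive packet has gone.</t>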

        <t>Further, hosts under malicious control will tend to be re-used for
        many different attacks. They will therefore build up a long-term
        history of causing congestion. So, as long as the population of
        potentially compromisable hosts around the Internet is limited, the
        per-user policing algorithms in <xref
        target="retcp_Per_User_Policing" /> will gradually throttle down
        zombies and other launchpads for attacks. Widespread deployment of
        re-ECN could thus considerably dampen the force of DDoS.
        Certainly, zombie armies could hold their fire for long enough to be
        able to build up enough credit in the per-user policers to launch an
        attack. But they would then still be limited to no more throughput
        than other, honest users.</t>

        <t>Inter-domain traffic policing (see <xref
        target="retcp_Inter-domain_Policing" />) ensures that any network that
        harbours compromised `zombie' hosts will have to bear the cost of the
        congestion caused by traffic from zombies in downstream networks. Such
        networks will be incentivised to deploy per-user policers that
        rate-limit hosts that are unresponsive to congestion so they can only
        send very slowly into congested paths. As well as protecting other
        networks, the extremely poor performance at any sign of congestion
        will incentivise the zombie's owner to clean it up. However, the host
        should behave normally when using uncongested paths.</t>

        <t>Uniquely, re-ECN handles DDoS traffic without relying on the
        validity of identifiers in packets. Certainly the egress dropper
        relies on uniqueness of flow identifiers, but not their validity. So
        if a source spoofs another address, re-ECN works just as well, as long
        as the attacker cannot imitate all the flow identifiers of another
        active flow passing through the same dropper (see <xref
        target="retcp_Limitations" />). Similarly, the ingress policer relies
        on uniqueness of flow IDs, not their validity. This is because a new
        flow will
        only be allowed any rate at all if it starts with a cautious packet,
        and the more cautious packets there are starting new flows, the more
        they will be limited. Essentially a re-ECN policer limits the bulk of
        all congestion entering the network through a physical interface;
        limiting the congestion caused by each flow is merely an optional
        extra.</t>

        <!--            <t>Note, however, that delay in detecting attacks does leave re-feedback briefly vulnerable. -->

        <!--            </t> -->
      </section>

      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_E2e_QoS" title="End-to-end QoS">
        <t>{ToDo: (Section 3.3.2 of <xref target="Re-fb" /> entitled `Edge
        QoS' gives an outline of the text that will be added here).}</t>
      </section>

      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_Traffic_Engineering" title="Traffic Engineering">
        <t>Classic feedback makes congestion-based traffic engineering
        inefficient too. Network N3 can see which of its two alternative
        upstream networks N2 and N3 are less congested. But it is N1 that
        makes the routing decision. This is why current traffic engineering
        requires a continuous message stream from congestion monitors to the
        routing controller. And even then the monitors can only be trusted for
        /intra-/domain traffic engineering. The trustworthiness of re-ECN
        enables /inter-/domain traffic engineering without messaging overhead.
        {ToDo: Elaborate}</t>
      </section>

      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_Inter-Provider_Monitoring"
               title="Inter-Provider Service Monitoring">
        <t>{ToDo: }</t>
      </section>

      <!-- {ToDo: Routing Certain Flag Incentivising Slow-Start} -->
    </section>

    <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

    <section anchor="retcp_Limitations" title="Limitations">
      <t>{ToDo:See also: slide of limitations}</t>

      <t>The known limitations of the re-ECN approach are: <list
          style="symbols">
          <t>We still cannot defend against the attack described in <xref
          target="retcp_Security_Considerations" /> where a malicious source
          sends negative traffic through the same egress dropper as another
          flow and imitates its flow identifiers, allowing a malicious source
          to cause an innocent flow to experience heavy drop.</t>

          <t>Re-feedback for TTL (re-TTL) would also be desirable at the same
          time as re-ECN. Unfortunately this requires a further standards
          action for the mechanisms briefly described in <xref
          target="retcp_Re-TTL" />.</t>

          <t>Traffic must be ECN-capable for re-ECN to be effective. The only
          defence against malicious users who turn off ECN capability is that
          networks are expected to rate limit Not-ECT traffic and to apply
          higher drop preference to it during congestion. Although these are
          blunt instruments, they at least represent a feasible scenario for
          the future Internet where Not-ECT traffic co-exists with re-ECN
          traffic, but as a severely hobbled under-class. We recommend (<xref
          target="retcp_Deployment_Features" />) that while accommodating a
          smooth initial transition to re-ECN, policing policies should
          gradually be tightened to rate limit Not-ECT traffic more strictly
          in the longer term.</t>

          <t>When checking whether a flow is balancing positive packets with
          negative packets (measured in bytes), re-ECN can only account for
          congestion marking, not drops. So, whenever a sender experiences
          drop, it does not have to re-echo the congestion event by sending
          positive packet(s). Nonetheless, it is hardly any advantage to be
          able to send faster than other flows only if your traffic is dropped
          and the other traffic isn't.</t>

          <t>We are considering the issue of whether it would be useful to
          truncate rather than drop packets that appear to be malicious, so
          that the feedback loop is not broken but useful data can be
          removed.</t>
        </list></t>

      <t>{ToDo: Monopolies over Routes}</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Incremental_Deployment"
             title="Incremental Deployment">
      <!-- ________________________________________________________________ -->

      <section anchor="retcp_Deployment_Features"
               title="Incremental Deployment Features">
        <t>The design of the re-ECN protocol started from the fact that the
        current ECN marking behaviour of queues was sufficient and that
        re-feedback could be introduced around these queues by changing the
        sender behaviour but not the routers. Otherwise, if we had required
        routers to be changed, the chance of encountering a path that had
        every router upgraded would be vanishingly small during early
        deployment,
        giving no incentive to start deployment. Also, as there is no new
        forwarding behaviour, routers and hosts do not have to signal or
        negotiate anything.</t>

        <t>However, networks that choose to protect themselves using re-ECN do
        have to add new security functions at their trust boundaries with
        others. They distinguish legacy traffic by its ECN field. Traffic from
        Not-ECT transports is distinguishable by its Not-ECT marking. Traffic
        from RFC3168 compliant ECN transports is distinguished from re-ECN by
        which of ECT(0) or ECT(1) is used. We chose to use ECT(1) for re-ECN
        traffic deliberately. Existing ECN sources set ECT(0) on either 50%
        (the nonce) or 100% (the default) of packets, whereas re-ECN does not
        use ECT(0) at all. We can use this distinguishing feature of RFC3168
        compliant ECN traffic to separate it out for different treatment at
        the various border security functions: egress dropping, ingress
        policing and border policing.</t>
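
        <t>As a rough illustration of how a border function might separate
        traffic by ECN field, consider the Python sketch below. The codepoint
        values are those of RFC3168; the function name and the treatment
        class labels are hypothetical. The sketch deliberately looks only at
        the two-bit ECN field and ignores the RE flag, and it assumes that a
        CE mark on its own does not reveal which ECT codepoint the sender
        originally set.</t>

        <figure>
          <artwork><![CDATA[
```python
# ECN field codepoints from RFC 3168 (two least significant bits of the
# IPv4 TOS / IPv6 Traffic Class octet).
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def classify(tos_octet):
    """Map an octet to a (codepoint name, treatment class) pair."""
    ecn = tos_octet & 0b11
    if ecn == ECT_1:
        return "ECT(1)", "re-ecn"
    if ecn == ECT_0:
        return "ECT(0)", "legacy-ecn"
    if ecn == NOT_ECT:
        return "Not-ECT", "legacy-not-ect"
    # Assumption: a CE mark alone does not reveal which ECT was set.
    return "CE", "undetermined"
```
]]></artwork>
        </figure>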

        <t>The general principle we adopt is that an egress dropper will not
        drop any legacy traffic, but ingress and border policers will limit
        the bulk rate of legacy traffic (Not-ECT, ECT(0) and those marked with
        the unused codepoint as defined in <xref target="Re-TCP" />) that can
        enter each network. Then, during early re-ECN deployment, operators
        can set very permissive (or non-existent) rate-limits on legacy
        traffic, but once re-ECN implementations are generally available,
        legacy traffic can be rate-limited increasingly harshly. Ultimately,
        an operator might choose to block all legacy traffic entering its
        network, or at least only allow through a trickle.</t>

        <t>Then, the more strictly the limits are set, the more RFC3168 ECN
        sources will gain by upgrading to re-ECN. Thus, towards the end of the
        voluntary incremental deployment period, RFC3168 compliant transports
        can be given progressively stronger encouragement to upgrade.</t>
      </section>

      <!-- ________________________________________________________________ -->

      <section anchor="retcp_Deployment_Incentives"
               title="Incremental Deployment Incentives">
        <t>It would only be worth standardising the re-ECN protocol if there
        existed a coherent story for how it might be incrementally deployed.
        In order for it to have a chance of deployment, everyone who needs to
        act must have a strong incentive to act, and the incentives must arise
        in the order that deployment would have to happen. Re-ECN works around
        unmodified ECN routers, but we can't just discuss why and how re-ECN
        deployment might build on ECN deployment, because there is precious
        little to build on in the first place. Instead, we aim to show that
        re-ECN deployment could carry ECN with it. We focus on commercial
        deployment incentives, although some of the arguments apply equally to
        academic or government sectors. <list style="hanging">
            <t hangText="ECN deployment:" />

            <t>ECN is largely implemented in commercial routers, but generally
            not as a supported feature, and it has largely not been deployed
            by commercial network operators. ECN has been implemented in most
            Unix-based operating systems for some time. Microsoft first
            implemented ECN in Windows Vista, but it is only on by default for
            the server end of a TCP connection. Unfortunately the client end
            had to be turned off by default, because a non-zero ECN field
            triggers a bug in a legacy home gateway which makes it crash. For
            detailed deployment status, see <xref target="ECN-Deploy" />. We
            believe the reason ECN deployment has not happened is twofold:
            <list style="symbols">
                <t>ECN requires changes to both routers and hosts. If someone
                wanted to sell the improvement that ECN offers, they would
                have to co-ordinate deployment of their product with others.
                An ECN server only gives any improvement on an ECN network. An
                ECN network only gives any improvement if used by ECN devices.
                Deployment that requires co-ordination adds cost and delay and
                tends to dilute any competitive advantage that might be
                gained.</t>

                <t>ECN `only' gives a performance improvement. Making a
                product a bit faster (whether the product is a device or a
                network), isn't usually a sufficient selling point to be worth
                the cost of co-ordinating across the industry to deploy it.
                Network operators tend to avoid re-configuring a working
                network unless launching a new product.</t>
              </list></t>

            <t hangText="ECN and Re-ECN for Edge-to-edge Assured QoS:" />

            <t>We believe the proposal to provide assured QoS sessions using a
            form of ECN called pre-congestion notification (PCN) <xref
            target="RFC5559" /> is most likely to break the deadlock in ECN
            deployment first. It only requires edge-to-edge deployment so it
            does not require endpoint support. It can be deployed in a single
            network, then grow incrementally to interconnected networks. And
            it provides a different `product' (internetworked assured QoS),
            rather than merely making an existing product a bit faster.</t>

            <t>Not only could this assured QoS application kick-start ECN
            deployment, it could also carry re-ECN deployment with it; because
            re-ECN can enable the assured QoS region to expand to a large
            internetwork where neighbouring networks do not trust each other.
            <xref target="Re-PCN" /> argues that re-ECN security should be
            built in to the QoS system from the start, explaining why and
            how.</t>

            <t>If ECN and re-ECN were deployed edge-to-edge for assured QoS,
            operators would gain valuable experience. They would also clear
            away many technical obstacles such as firewall configurations that
            block all but the RFC3168 settings of the ECN field and the RE
            flag.</t>

            <t hangText="ECN in Access Networks:" />

            <t>The next obstacle to ECN deployment would be extension to
            access and backhaul networks, where considerable link layer
            differences make implementation non-trivial, particularly on
            congested wireless links. ECN and re-ECN work fine during partial
            deployment, but they will not be very useful if the most congested
            elements in networks are the last to support them. Access network
            support is one of the weakest parts of this deployment story. All
            we can hope is that, once the benefits of ECN are better
            understood by operators, they will push for the necessary link
            layer implementations as deployment proceeds.</t>

            <t hangText="Policing Unresponsive Flows:" />

            <t>Re-ECN allows a network to offer differentiated quality of
            service as explained in <xref target="retcp_E2e_QoS" />. But we do
            not believe this will motivate initial deployment of re-ECN,
            because the industry is already set on alternative ways of doing
            QoS. Despite being much more complicated and expensive, the
            alternative approaches are here and now.</t>

            <t>But re-ECN is critical to QoS deployment in another respect. It
            can be used to prevent applications from taking whatever bandwidth
            they choose without asking.</t>

            <t>Currently, applications that remain resolute in their lack of
            response to congestion are rewarded by the TCP applications they
            compete with. In
            other words, TCP is naively friendly, in that it reduces its rate
            in response to congestion whether it is competing with friends
            (other TCPs) or with enemies (unresponsive applications).</t>

            <t>Therefore, those network owners that want to sell QoS will be
            keen to ensure that their users can't help themselves to QoS for
            free. Given the very large revenues at stake, we believe effective
            policing of congestion response will become highly sought after by
            network owners.</t>

            <t>But this does not necessarily argue for re-ECN deployment.
            Network owners might choose to deploy bottleneck policers rather
            than re-ECN-based policing. However, under Related Work (<xref
            target="retcp_Related_Work" />) we argue that bottleneck policers
            are inherently vulnerable to circumvention.</t>

            <t>Therefore we believe there will be a strong demand from network
            owners for re-ECN deployment so they can police flows that do not
            ask to be unresponsive to congestion, in order to protect their
            revenues from flows that do ask (QoS). In particular, we suspect
            that the operators of cellular networks will want to prevent VoIP
            and video applications being used freely on their networks as a
            more open market develops in GPRS and 3G devices.</t>

            <t>Initial deployments are likely to be isolated to single
            cellular networks. Cellular operators would first place
            requirements on device manufacturers to include re-ECN in the
            standards for mobile devices. In parallel, they would put out
            tenders for ingress and egress policers. Then, after a while they
            would start to tighten rate limits on Not-ECT traffic from
            non-standard devices and they would start policing whatever
            non-accredited applications people might install on mobile devices
            with re-ECN support in the operating system. This would force even
            independent mobile device manufacturers to provide re-ECN support.
            Early standardisation across the cellular operators is likely,
            including interconnection agreements with penalties for excess
            downstream congestion.</t>

            <t>We suspect some fixed broadband networks (whether cable or DSL)
            would follow a similar path. However, we also believe that larger
            parts of the fixed Internet would not choose to police on a
            per-flow basis. Some might choose to police congestion on a
            per-user basis in order to manage heavy peer-to-peer file-sharing,
            but it seems likely that a sizeable majority would not deploy any
            form of policing.</t>

            <t>This hybrid situation raises the question, "How does re-ECN
            work for networks that choose to use policing if they connect with
            others that don't?" Traffic from non-ECN capable sources will
            arrive from other networks and cause congestion within the
            policed, ECN-capable networks. So networks that chose to police
            congestion would rate-limit Not-ECT traffic throughout their
            network, particularly at their borders. They would probably also
            set higher usage prices in their interconnection contracts for
            incoming Not-ECT and Not-RECT traffic. We assume that
            interconnection contracts between networks in the same tier will
            include congestion penalties before contracts with provider
            backbones do.</t>

            <t>A hybrid situation could remain for all time. As was explained
            in the introduction, we believe in healthy competition between
            policing and not policing, with no imperative to convert the whole
            world to the religion of policing. Networks that chose not to
            deploy egress droppers would leave themselves open to being
            congested by senders in other networks. But that would be their
            choice.</t>

            <t>The important aspect of the egress dropper though is that it
            most protects the network that deploys it. If a network does not
            deploy an egress dropper, sources sending into it from other
            networks will be able to understate the congestion they are
            causing. Whereas, if a network deploys an egress dropper, it can
            know how much congestion other networks are dumping into it, and
            apply penalties or charges accordingly. So, whether or not a
            network polices its own sources at ingress, it is in its interests
            to deploy an egress dropper.</t>

            <t hangText="Host support:" />

            <t>In the above deployment scenario, host operating system support
            for re-ECN came about through the cellular operators demanding it
            in device standards (i.e. 3GPP). Of course, increasingly, mobile
            devices are being built to support multiple wireless technologies.
            So, if re-ECN were stipulated for cellular devices, it would
            automatically appear in those devices connected to the wireless
            fringes of fixed networks if they coupled cellular with WiFi or
            Bluetooth technology, for instance. Also, once implemented in the
            operating system of one mobile device, it would tend to be found
            in other devices using the same family of operating system.</t>

            <t>Therefore, whether or not a fixed network deployed ECN, or
            deployed re-ECN policers and droppers, many of its hosts might
            well be using re-ECN over it. Indeed, they would be at an
            advantage when communicating with hosts across re-ECN policed
            networks that rate limited Not-RECT traffic.</t>

            <t hangText="Other possible scenarios:" />

            <t>The above is thankfully not the only plausible scenario we can
            think of. One of the many clubs of operators that meet regularly
            around the world might decide to act together to persuade a major
            operating system manufacturer to implement re-ECN. And they may
            agree between them on an interconnection model that includes
            congestion penalties.</t>

            <t>Re-ECN provides an interesting opportunity for device
            manufacturers as well as network operators. Policers can be
            configured loosely when first deployed. Then as re-ECN take-up
            increases, they can be tightened up, so that a network with re-ECN
            deployed can gradually squeeze down the service provided to
            RFC3168 compliant devices that have not upgraded to re-ECN. Many
            device vendors rely on replacement sales. And operating system
            companies rely heavily on new release sales. Also support services
            would like to be able to force stragglers to upgrade. So, the
            ability to throttle service to RFC3168 compliant operating systems
            is quite valuable.</t>

            <t>Also, policing unresponsive sources may not be the only or even
            the first application that drives deployment. It may be policing
            causes of heavy congestion (e.g. peer-to-peer file-sharing). Or it
            may be mitigation of denial of service. Or we may be wrong in
            thinking simpler QoS will not be the initial motivation for re-ECN
            deployment. Indeed, the combined pressure for all these may be the
            motivator, but it seems optimistic to expect such a level of
            joined-up thinking from today's communications industry. We
            believe a single application alone must be a sufficient
            motivator.</t>

            <t>In short, everyone gains from adding accountability to TCP/IP,
            except the selfish or malicious. So, deployment incentives tend to
            be strong.</t>
          </list></t>
      </section>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Architectural_Rationale"
             title="Architectural Rationale">
      <t>In the Internet's technical community, the danger of not responding
      to congestion is well-understood, as well as its attendant risk of
      congestion collapse <xref target="RFC3714" />. However, one side of
      the Internet's commercial community considers that the very essence of
      IP is to provide open access to the internetwork for all applications.
      They see congestion as a symptom of over-conservative investment, and
      rely on revising application designs to find novel ways to keep
      applications working despite congestion. They argue that the Internet
      was never intended to be solely for TCP-friendly applications.
      Meanwhile, another side of the Internet's commercial community believes
      that it is worthwhile providing a network for novel applications only if
      it has sufficient capacity, which can happen only if a greater share of
      application revenues can be /assured/ for the infrastructure provider.
      Otherwise the major investments required would carry too much risk and
      wouldn't happen.</t>

      <t>The lesson articulated in <xref target="Tussle" /> is that we
      shouldn't embed our view on these arguments into the Internet at design
      time. Instead we should design the Internet so that the outcome of these
      arguments can get decided at run-time. Re-ECN is designed in that
      spirit. Once the protocol is available, different network operators can
      choose how liberal they want to be in holding people accountable for the
      congestion they cause. Some might boldly invest in capacity and not
      police its use at all, hoping that novel applications will result.
      Others might use re-ECN for fine-grained flow policing, expecting to
      make money selling vertically integrated services. Yet others might sit
      somewhere half-way, perhaps doing coarse, per-user policing. All might
      change their minds later. But re-ECN always allows them to interconnect
      so that the careful ones can protect themselves from the liberal
      ones.</t>

      <t>The incentive-based approach used for re-ECN is based on Gibbens and
      Kelly's arguments <xref target="Evol_cc" /> on allowing endpoints
      the freedom to evolve new congestion control algorithms for new
      applications. They ensured responsible behaviour despite everyone's
      self-interest by applying pricing to ECN marking, and Kelly had proved
      stability and optimality in an earlier paper.</t>

      <t>Re-ECN keeps all the underlying economic incentives, but rearranges
      the feedback. The idea is to allow a network operator (if it chooses) to
      deploy engineering mechanisms like policers at the front of the network
      which can be designed to behave /as if/ they are responding to
      congestion prices. Rather than having to subject users to congestion
      pricing, networks can then use more traditional charging regimes (or
      novel ones). But the engineering can constrain the overall amount of
      congestion a user can cause. This provides a buffer against completely
      outrageous congestion control, but still makes it easy for novel
      applications to evolve if they need different congestion control to the
      norms. It also allows novel charging regimes to evolve.</t>

      <t>Despite being achieved with a relatively minor protocol change,
      re-ECN is an architectural change. Previously, Internet congestion could
      only be controlled by the data sender, because it was the only one both
      in a position to control the load and in a position to see information
      on congestion. Re-ECN levels the playing field. It recognises that the
      network also has a role to play in moderating (policing) congestion
      control. But policing is only truly effective at the first ingress into
      an internetwork, whereas path congestion was previously only visible at
      the last egress. So, re-ECN democratises congestion information. Then
      the choice over who actually controls congestion can be made at
      run-time, not design time, a bit like an aircraft with dual controls.
      And different operators can make different choices. We believe
      non-architectural approaches to this problem are unlikely to offer more
      than partial solutions (see <xref target="retcp_Related_Work" />).</t>

      <t>Importantly, re-ECN does not require assumptions about specific
      congestion responses to be embedded in any network elements, except at
      the first ingress to the internetwork if that level of control is
      desired by the ingress operator. But such tight policing will be a
      matter of agreement between the source and its access network operator.
      The ingress operator need not police congestion response at flow
      granularity; it can simply hold a source responsible for the aggregate
      congestion it causes, perhaps keeping it within a monthly congestion
      quota. Or if the ingress network trusts the source, it can do
      nothing.</t>
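      <t>As a rough illustration of the aggregate option, the following
      sketch keeps a per-source congestion quota. It is a minimal sketch under
      stated assumptions, not part of the re-ECN specification: the class
      name, the method names and the idea of counting the bytes of packets
      that declare congestion are all hypothetical, and what the operator does
      once the quota is exceeded (throttle, charge or nothing) is deliberately
      left out.</t>

      <figure>
        <artwork><![CDATA[
```python
class CongestionQuota:
    """Hypothetical per-source allowance of congestion-volume for one period."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self.used_bytes = 0

    def record(self, packet_bytes, declares_congestion):
        """Count a forwarded packet only if it declares congestion."""
        if declares_congestion:
            self.used_bytes += packet_bytes

    def within_quota(self):
        """True while the source has not used more than its allowance."""
        return self.used_bytes <= self.quota_bytes

    def start_new_period(self):
        """Reset the counter, for example at the start of each month."""
        self.used_bytes = 0
```
]]></artwork>
      </figure>

      <t>Note that traffic sent over an uncongested path never consumes the
      allowance in this sketch, however high its rate, and that no per-flow
      state is needed.</t>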

      <t>Therefore, the aim of the re-ECN protocol is NOT solely to police
      TCP-friendliness. Re-ECN preserves IP as a generic network layer for all
      sorts of responses to congestion, for all sorts of transports. Re-ECN
      merely ensures truthful downstream congestion information is available
      in the network layer for all sorts of accountability applications.</t>

      <t>The end to end design principle does not say that all functions
      should be moved out of the lower layers, only those functions that are
      not generic to all higher layers. Re-ECN adds a function to the network
      layer that is generic, but was omitted: accountability for causing
      congestion. Accountability is not something that an end-user can provide
      to themselves. We believe re-ECN adds no more than is sufficient to hold
      each flow accountable, even if it consists of a single datagram.</t>

      <t>"Accountability" implies being able to identify who is responsible
      for causing congestion. However, at the network layer it would NOT be
      useful to identify the cause of congestion by adding individual or
      organisational identity information, NOR by using source IP addresses.
      Rather than bringing identity information to the point of congestion, we
      bring downstream congestion information to the point where the cause can
      be most easily identified and dealt with. That is, at any trust boundary
      congestion can be associated with the physically connected upstream
      neighbour that is directly responsible for causing it (whether
      intentionally or not). A trust boundary interface is exactly the place
      to police or throttle in order to directly mitigate congestion, rather
      than having to trace the (ir)responsible party in order to shut them
      down.</t>

      <t>Some considered that ECN itself was a layering violation. The
      reasoning went that the interface to a layer should provide a service to
      the higher layer and hide how the lower layer does it. However, ECN
      reveals the state of the network layer and below to the transport layer.
      A more positive way to describe ECN is that it is like the return value
      of a function call to the network layer. It explicitly returns the
      status of the request to deliver a packet, by returning a value
      representing the current risk that a packet will not be served. Re-ECN
      has similar semantics, except the transport layer must try to guess the
      return value, then it can use the actual return value from the network
      layer to modify the next guess.</t>

      <t>The guiding principle behind all the discussion in <xref
      target="retcp_Inter-domain_Policing" /> on Policing is that any gain
      from subverting the protocol should be precisely neutralised, rather
      than punished. If a gain is punished to a greater extent than is
      sufficient to neutralise it, it will most likely open up a new
      vulnerability, where the amplifying effect of the punishment mechanism
      can be turned on others.</t>

      <t>For instance, if possible, flows should be removed as soon as they go
      negative, but any attempt to discard such flows further upstream while
      they are still positive is NOT RECOMMENDED. Such over-zealous
      push-back is unnecessary and potentially dangerous. These flows have
      paid their `fare' up to the point they go negative, so there is no harm
      in delivering them that far. If someone downstream asks for a flow to be
      dropped as near to the source as possible, because they say it is going
      to become negative later, an upstream node cannot test the truth of this
      assertion. Rather than have to authenticate such messages, re-ECN has
      been designed so that flows can be dropped solely based on locally
      measurable evidence. A message hinting that a flow should be watched
      closely to test for negativity is fine, but a message claiming that a
      positive flow will go negative later, and so should be dropped now, is
      not.</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Related_Work" title="Related Work">
      <t>{Due to lack of time, this section is incomplete. The reader is
      referred to the Related Work section of <xref target="Re-fb" /> for a
      brief selection of related ideas.}</t>

      <!-- {ToDo: Ancestry Under perfect competition, charges will tend to cost,
    so all prices will be equal. But under less than perfect competition,
    /value pricing/ can be applied, particularly at the retail edges of the
    network. In this sense, re-ECN can be loosely thought of as a realisation
    of MacKie-Mason & Varian's `smart pricing' idea that was intended to
    be a thought experiment rather than a practical
    proposition~\cite{MacKieVar93:Pricing_the_Internet}.} -->

      <!-- ________________________________________________________________ -->

      <section anchor="retcp_Policing_Rate_Response_to_Congestion"
               title="Policing Rate Response to Congestion">
        <t>ATM network elements send congestion back-pressure
        messages <xref target="ITU-T.I.371" /> along each connection,
        duplicating any end to end feedback because they don't trust it. In
        contrast, re-ECN ensures information in forwarded packets can be used
        for congestion management without requiring a connection-oriented
        architecture. It does so by re-using the overhead of fields that are
        already set aside for end to end congestion control (and routing loop
        detection in the case of re-TTL in <xref target="retcp_Re-TTL" />).</t>

        <t>We borrowed ideas from policers in the literature <xref
        target="pBox" />, <xref target="XCHOKe" />, AFD etc. for our rate
        equation policer. However, without the benefit of re-ECN they don't
        police the correct rate for the condition of their path. They detect
        unusually high /absolute/ rates, but only while the policer itself is
        congested, because they work by detecting prevalent flows in the
        discards from the local RED queue. These policers must sit at every
        potential bottleneck, whereas our policer need only be located at each
        ingress to the internetwork. As Floyd &amp; Fall explain <xref
        target="pBox" />, the limitation of their approach is that a high
        sending rate might be perfectly legitimate, if the rest of the path is
        uncongested or the round trip time is short. Commercially available
        rate policers cap the rate of any one flow. Or they enforce monthly
        volume caps in an attempt to control high volume file-sharing. They
        limit the value a customer derives. They might also limit the
        congestion customers can cause, but only as an accidental side-effect.
        They actually punish traffic that fills troughs as much as traffic
        that causes peaks in utilisation. In practice network operators need
        to be able to allocate service by cost during congestion, and by value
        at other times.</t>
      </section>

      <!-- ________________________________________________________________ -->

      <section anchor="retcp_Congestion_Notification_Integrity"
               title="Congestion Notification Integrity">
        <t>The choice of two ECT code-points in the ECN field <xref
        target="RFC3168" /> permitted future flexibility, optionally allowing
        the sender to encode the experimental ECN nonce <xref
        target="RFC3540" /> in the packet stream. This mechanism has since
        been included in the specifications of DCCP <xref
        target="RFC4340" />.</t>

        <t>The ECN nonce is an elegant scheme that allows the sender to detect
        if someone in the feedback loop - the receiver especially - tries to
        claim no congestion was experienced when in fact congestion led to
        packet drops or ECN marks. For each packet it sends, the sender
        chooses between the two ECT codepoints in a pseudo-random sequence.
        Then, whenever the network marks a packet with CE, if the receiver
        wants to deny congestion happened, she has to guess which ECT
        codepoint was overwritten. She has only a 50:50 chance of being
        correct each time she denies a congestion mark or a drop, which
        ultimately will give her away.</t>
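        <t>The 50:50 argument can be illustrated with a small simulation. This
        is a minimal sketch under simplifying assumptions, not the RFC 3540
        wire protocol: the receiver echoes a one-bit running sum (XOR) of the
        nonces it claims to have seen, the sender checks that sum once per
        round, and all function names, the round size of eight packets and the
        codepoint labels are hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
import random

ECT0, ECT1, CE = 0, 1, "CE"  # illustrative labels, not wire values


def send_nonces(n, rng):
    """Sender: choose ECT(0) or ECT(1) pseudo-randomly for each packet."""
    return [rng.choice((ECT0, ECT1)) for _ in range(n)]


def mark(nonces, marked_indices):
    """Network: overwrite the chosen packets with CE, erasing their nonce."""
    return [CE if i in marked_indices else nonce
            for i, nonce in enumerate(nonces)]


def concealing_receiver_sum(received, rng):
    """Receiver that denies every mark: it must guess each erased nonce."""
    total = 0
    for codepoint in received:
        total ^= rng.choice((ECT0, ECT1)) if codepoint == CE else codepoint
    return total


def expected_sum(nonces):
    """Sender: the one-bit sum it expects if no packet was marked."""
    total = 0
    for nonce in nonces:
        total ^= nonce
    return total


def survival_rate(marks_per_round, rounds, trials, seed=1):
    """Fraction of trials in which every concealed mark goes undetected."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        caught = False
        for _ in range(rounds):
            nonces = send_nonces(8, rng)
            received = mark(nonces, set(range(marks_per_round)))
            if concealing_receiver_sum(received, rng) != expected_sum(nonces):
                caught = True
                break
        survived += not caught
    return survived / trials
```
]]></artwork>
        </figure>

        <t>With no marks to conceal, the sums always agree. With one concealed
        mark in each of k rounds, the receiver escapes detection with
        probability 1/2 per round, so roughly 2^-k overall, which is what
        "ultimately will give her away" amounts to in this model.</t>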

        <t>The purpose of a network-layer nonce should primarily be protection
        of the network, while a transport-layer nonce would be better used to
        protect the sender from cheating receivers. Now, the assumption behind
        the ECN nonce is that a sender will want to detect whether a receiver
        is suppressing congestion feedback. This is only true if the sender's
        interests are aligned with the network's, or with the community of
        users as a whole. This may be true for certain large senders, who are
        under close scrutiny and have a reputation to maintain. But we have to
        deal with a more hostile world, where traffic may be dominated by
        peer-to-peer transfers, rather than downloads from a few popular
        sites. Often the `natural' self-interest of a sender is not aligned
        with the interests of other users. The sender often wants to transfer
        data quickly just as much as the receiver wants to receive it
        quickly.</t>

        <t>In contrast, the re-ECN protocol enables policing of an agreed
        rate-response to congestion (e.g. TCP-friendliness) at the
        sender's interface with the internetwork. It also ensures downstream
        networks can police their upstream neighbours, to encourage them to
        police their users in turn. But most importantly, it requires the
        sender to declare path congestion to the network and it can remove
        traffic at the egress if this declaration is dishonest. So it can
        police correctly, irrespective of whether the receiver tries to
        suppress congestion feedback or whether the sender ignores genuine
        congestion feedback. Therefore the re-ECN protocol addresses a much
        wider range of cheating problems, which includes the one addressed by
        the ECN nonce.</t>

        <!-- {ToDo: Ensure we address the early ACK
    problem.} -->
      </section>

      <!-- ________________________________________________________________ -->

      <section anchor="retcp_Identifying_Upstream_Downstream"
               title="Identifying Upstream and Downstream Congestion">
        <t>Purple <xref target="Purple" /> proposes that queues should
        use the CWR flag in the TCP header of ECN-capable flows to work out
        path congestion and therefore downstream congestion in a similar way
        to re-ECN. However, because CWR is in the transport layer, it is not
        always visible to network layer routers and policers. Purple's
        motivation was to improve AQM, not policing. But, of course, nodes
        trying to avoid a policer would not be expected to allow CWR to be
        visible.</t>

        <!-- {ToDo: Clark~\cite{Clark96:Combining_sndr+rcvr_payments}
    proposed a decrementing field representing payment as a packet traversed a
    path, with receiver-initiated messages able to meet it in the middle to
    make up any shortfall. We argue that network layer fields should represent
    verifiable properties of the path. Then operators can choose to apply
    pricing to them to determine cost or value (or choose not
    to).} -->
      </section>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Security_Considerations"
             title="Security Considerations">
      <t>{ToDo: enrich this section} {ToDo: Describe attacks by networks on
      flows (and by spoofing sources).} {ToDo: Re-ECN &amp; DNS servers}</t>

      <t>Nearly the whole of this document concerns security.</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_IANA_Considerations" title="IANA Considerations">
      <t>This memo includes no request to IANA.</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Conclusions" title="Conclusions">
      <t>{ToDo:}</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Acknowledgements" title="Acknowledgements">
      <t>S&#233;bastien Cazalet and Andrea Soppera contributed to the idea
      of re-feedback. All the following have given helpful comments: Andrea
      Soppera, David Songhurst, Peter Hovell, Louise Burness, Phil Eardley,
      Steve Rudkin, Marc Wennink, Fabrice Saffre, Cefn Hoile, Steve Wright,
      John Davey, Martin Koyabe, Carla Di Cairano-Gilfedder, Alexandru Murgu,
      Nigel Geffen, Pete Willis, John Adams (BT), Sally Floyd (ICIR), Joe
      Babiarz, Kwok Ho-Chan (Nortel), Stephen Hailes, Mark Handley (who
      developed the attack with cancelled packets), Adam Greenhalgh (who
      developed the attack on DNS) (UCL), Jon Crowcroft (Uni Cam), David
      Clark, Bill Lehr, Sharon Gillett, Steve Bauer (who complemented our own
      dummy traffic attacks with others), Liz Maida (MIT), and comments from
      participants in the CRN/CFP Broadband and DoS-resistant Internet working
      groups. A special thank you to Alessandro Salvatori for coming up with
      fiendish attacks on re-ECN.</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Comments_Solicited" title="Comments Solicited">
      <t>Comments and questions are encouraged and very welcome. They can be
      addressed to the IETF Transport Area working group's mailing list
      &lt;tsvwg@ietf.org&gt;, and/or to the authors.</t>
    </section>
  </middle>

  <back>
    <!-- ================================================================ -->

    <references title="Normative References">
      <?rfc include="reference.RFC.2119" ?>

      <?rfc include="reference.RFC.3168" ?>
    </references>

    <references title="Informative References">
      <?rfc include="localref.Briscoe05d.Re-fb_policing" ?>

      <?rfc include="localref.Bauer06.Strat_cong_pric" ?>

      <?rfc include="localref.Clark02.Tussle.xml" ?>

      <!--      <?rfc include="reference.I-D.briscoe-re-pcn-border-cheat" ?>-->

      <reference anchor="Re-PCN">
        <front>
          <title>Emulating Border Flow Policing using Re-PCN on Bulk
          Data</title>

          <author fullname="Bob Briscoe" initials="B" surname="Briscoe">
            <organization></organization>
          </author>

          <date day="26" month="October" year="2009" />

          <abstract>
            <t>Scaling per flow admission control to the Internet is a hard
            problem. The approach of combining Diffserv and pre-congestion
            notification (PCN) provides a service slightly better than Intserv
            controlled load that scales to networks of any size without
            needing Diffserv's usual overprovisioning, but only if domains
            trust each other to comply with admission control and rate
            policing. This memo claims to solve this trust problem without
            losing scalability. It provides a sufficient emulation of per-flow
            policing at borders but with only passive bulk metering rather
            than per-flow processing. Measurements are sufficient to apply
            penalties against cheating neighbour networks.</t>
          </abstract>
        </front>

        <seriesInfo name="Internet-Draft"
                    value="draft-briscoe-re-pcn-border-cheat-03" />

        <format target="http://www.ietf.org/internet-drafts/draft-briscoe-re-pcn-border-cheat-03.txt"
                type="TXT" />
      </reference>

      <!--      <?rfc include="localref.I-D.briscoe-tsvwg-re-ecn-tcp" ?>-->

      <reference anchor="Re-TCP">
        <front>
          <title>Re-ECN: Adding Accountability for Causing Congestion to
          TCP/IP</title>

          <author fullname="Bob Briscoe" initials="B" surname="Briscoe">
            <organization></organization>
          </author>

          <author fullname="Arnaud Jacquet" initials="A" surname="Jacquet">
            <organization></organization>
          </author>

          <author fullname="T Moncaster" initials="T" surname="Moncaster">
            <organization></organization>
          </author>

          <author fullname="Alan Smith" initials="A" surname="Smith">
            <organization></organization>
          </author>

          <date day="22" month="October" year="2012" />

          <abstract>
            <t>This document introduces a new protocol for explicit congestion
            notification (ECN), termed re-ECN, which can be deployed
            incrementally around unmodified routers. The protocol works by
            arranging an extended ECN field in each packet so that, as it
            crosses any interface in an internetwork, it will carry a truthful
            prediction of congestion on the remainder of its path. The purpose
            of this document is to specify the re-ECN protocol at the IP layer
            and to give guidelines on any consequent changes required to
            transport protocols. It includes the changes required to TCP both
            as an example and as a specification. It briefly gives examples of
            mechanisms that can use the protocol to ensure data sources
            respond correctly to congestion, and these are described more fully
            in a companion document.</t>
          </abstract>
        </front>

        <seriesInfo name="Internet-Draft"
                    value="draft-briscoe-conex-re-ecn-tcp-01" />

        <format target="http://www.ietf.org/internet-drafts/draft-briscoe-conex-re-ecn-tcp-01.txt"
                type="TXT" />
      </reference>

      <reference anchor="relax-fairness">
        <front>
          <title>Problem Statement: Transport Protocols Don't Have To Do
          Fairness</title>

          <author fullname="Bob Briscoe" initials="B" surname="Briscoe">
            <organization></organization>
          </author>

          <author fullname="T Moncaster" initials="T" surname="Moncaster">
            <organization></organization>
          </author>

          <author fullname="Lou Burness" initials="L" surname="Burness">
            <organization></organization>
          </author>

          <date day="" month="July" year="2008" />
        </front>

        <seriesInfo name="Internet-Draft"
                    value="draft-briscoe-tsvwg-relax-fairness-01" />

        <format target="http://www.ietf.org/internet-drafts/draft-briscoe-tsvwg-relax-fairness-01.txt"
                type="TXT" />
      </reference>

      <?rfc include="localref.Floyd-ecn-deploy" ?>

      <?rfc include="localref.Gibbens99.Evol_cc" ?>

      <?rfc include="localref.Golden04.Smart_routing_multihome" ?>

      <?rfc include="localref.Handley04.Steps_DoS_Arch" ?>

      <?rfc include="localref.ITU-T.I.371_ATMTrafficMgmt" ?>

      <?rfc include="localref.Pletka03.Purple" ?>

      <?rfc include="localref.Floyd99.Penalty_box" ?>

      <?rfc include="localref.Chhabra02.XCHOKe" ?>

      <?rfc include="localref.Mathis97.TCP_Macroscopic" ?>

      <?rfc include="localref.Jiang02.RTT_estimation" ?>

      <?rfc include="localref.Salvatori05a.Re-fb_closed_loop_policing" ?>

      <?rfc include="localref.Savage99.Mis_rcvr" ?>

      <?rfc include="reference.RFC.2208" ?>

      <?rfc include="reference.RFC.3514" ?>

      <?rfc include="reference.RFC.3540" ?>

      <?rfc include="reference.RFC.3714" ?>

      <?rfc include="reference.RFC.4340" ?>

      <?rfc include="reference.RFC.4341" ?>

      <?rfc include="reference.RFC.4342" ?>

      <?rfc include="reference.RFC.5559" ?>
    </references>

    <!-- ================================================================ -->

    <section anchor="retcp_Alg_Sanction_Negative"
             title="Example Egress Dropper Algorithm">
      <t>{ToDo: Write up the basic algorithm with flow state, then the
      aggregated one.}</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Policer_Implementations"
             title="Policer Designs to ensure Congestion Responsiveness">
      <!-- ________________________________________________________________ -->

      <section anchor="retcp_Per_User_Policing" title="Per-user Policing">
        <t>User policing requires a policer on the ingress interface of the
        access router associated with the user. At that point, the traffic of
        the user hasn't diverged on different routes yet; nor has it mixed
        with traffic from other sources.</t>

        <t>In order to ensure that a user doesn't generate more congestion in
        the network than her due share, a modified bulk token-bucket is
        maintained with the following parameters: <list style="symbols">
            <t>b_0 the initial token level</t>

            <t>r the filling rate</t>

            <t>b_max the bucket depth</t>
          </list></t>

        <t>The same token bucket algorithm is used as in many areas of
        networking, but how it is used is very different: <list
            style="symbols">
            <t>all traffic from a user over the lifetime of their subscription
            is policed in the same token bucket.</t>

            <t>only positive, cautious and cancelled packets consume
            tokens</t>
          </list></t>

        <t>Such a policer will allow network operators to throttle the
        contribution of their users to network congestion. This will require
        the appropriate contractual terms to be in place between operators and
        users. For instance: a condition for a user to subscribe to a given
        network service may be that she should not cause more than a volume
        C_user of congestion over a reference period T_user, although she may
        carry forward up to N_user times her allowance at the end of each
        period. These terms directly set the parameters of the user policer:
        <list style="symbols">
            <t>b_0 = C_user</t>

            <t>r = C_user/T_user</t>

            <t>b_max = b_0 * (N_user +1)</t>
          </list></t>
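
        <t>As a non-normative illustration, the per-user policer described
        above could be sketched in Python as follows. The class name, the
        method name, the string codepoint names and the choice of bytes and
        seconds as units are assumptions made for this example only. Dropping
        a packet that would take the token level below zero is one possible
        sanction, not the only one.</t>

        <figure>
          <artwork><![CDATA[
from dataclasses import dataclass

# Codepoints that consume tokens (string names are illustrative).
TOKEN_CONSUMING = {"positive", "cautious", "cancelled"}

@dataclass
class UserCongestionPolicer:
    c_user: float      # congestion allowance per period (bytes)
    t_user: float      # reference period (seconds)
    n_user: float      # periods of allowance that can be carried over
    last: float = 0.0  # time of the last update (seconds)

    def __post_init__(self):
        self.rate = self.c_user / self.t_user         # r
        self.depth = self.c_user * (self.n_user + 1)  # b_max
        self.level = self.c_user                      # b_0

    def forward(self, size, codepoint, now):
        """Return True to forward the packet, False to drop it."""
        # Fill at rate r since the last update, capped at b_max.
        filled = self.level + self.rate * (now - self.last)
        self.level = min(self.depth, filled)
        self.last = now
        if codepoint not in TOKEN_CONSUMING:
            return True       # other packets consume no tokens
        if self.level < size:
            return False      # would drive the level negative
        self.level -= size
        return True
]]></artwork>
        </figure>

        <t>For example, with C_user = 1000 bytes, T_user = 100 seconds and
        N_user = 1, this sketch yields r = 10 bytes per second and b_max =
        2000 bytes.</t>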

        <t>Besides the congestion budget policer above, another user policer
        may be necessary to further rate-limit cautious packets, if they are
        to be marked rather than dropped (see discussion in [ref other
        document]). Rate-limiting cautious packets will prevent high bursts
        of new flow arrivals, which is a very useful feature in DoS
        prevention. A condition to subscribe to a given network service would
        have to be that a user should not generate more than C_cautious
        cautious packets, over a reference period T_cautious, with no option
        to carry forward any of the allowance at the end of each period. These
        terms directly set the parameters of the cautious packet policer:
        <list style="symbols">
            <t>b_0 = C_cautious</t>

            <t>r = C_cautious/T_cautious</t>

            <t>b_max = b_0</t>
          </list></t>

        <t>T_cautious should be a much shorter period than T_user: for
        instance T_cautious could be in the order of minutes while T_user
        could be in the order of weeks.</t>
      </section>

      <!-- ________________________________________________________________ -->

      <section anchor="retcp_Per_Flow_Policing"
               title="Per-flow Rate Policing">
        <t>Whilst we believe that simple per-user policing would be sufficient
        to ensure senders comply with congestion control, some operators may
        wish to police the rate response of each flow to congestion as well.
        Although we do not believe this will be necessary, we include this
        section to show how one could perform per-flow policing using
        enforcement of TCP-fairness as an example. Per-flow policing aims to
        enforce congestion responsiveness on the shortest information
        timescale on a network path: packet roundtrips.</t>

        <t>This again requires that the appropriate terms be agreed between a
        network operator and its users, where a congestion responsiveness
        policy might be required for the use of a given network service
        (perhaps unless the user specifically requests otherwise).</t>

        <t>As an example, we describe below how a rate adaptation policer can
        be designed when the applicable rate adaptation policy is
        TCP-compliance. In that context, the average throughput of a flow will
        be expected to be bounded by the value of the TCP throughput during
        congestion avoidance, given in Mathis' formula <xref
        target="Mathis97"></xref> <list style="empty">
            <t>x_TCP = k * s / ( T * sqrt(m) )</t>
          </list> where: <list style="symbols">
            <t>x_TCP is the throughput of the TCP flow in units of data per
            second (the same data unit as s, for instance bytes),</t>

            <t>k is a constant upper-bounded by sqrt(3/2),</t>

            <t>s is the average packet size of the flow,</t>

            <t>T is the roundtrip time of the flow,</t>

            <t>m is the congestion level experienced by the flow.</t>
          </list></t>

        <t>We define the marking period N=1/m which represents the average
        number of packets between two positive or cancelled packets. Mathis'
        formula can be re-written as: <list style="empty">
            <t>x_TCP = k*s*sqrt(N)/T</t>
          </list></t>

        <t>We can then get the average inter-mark time in a compliant TCP
        flow, dt_TCP, by solving (x_TCP/s)*dt_TCP = N which gives <list
            style="empty">
            <t>dt_TCP = sqrt(N)*T/k</t>
          </list></t>
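
        <t>The relationship between these formulae can be checked numerically.
        The following Python sketch is illustrative only: the function names
        and the sample values (1500 byte packets, 100 ms roundtrip time, 1%
        congestion) are assumptions chosen for the example, and k is set to
        its upper bound.</t>

        <figure>
          <artwork><![CDATA[
import math

K = math.sqrt(3 / 2)   # upper bound on the constant k

def tcp_rate(s, T, m, k=K):
    """x_TCP = k*s / (T*sqrt(m)), in units of s per second."""
    return k * s / (T * math.sqrt(m))

def inter_mark_time(T, m, k=K):
    """dt_TCP = sqrt(N)*T/k, with marking period N = 1/m."""
    return math.sqrt(1 / m) * T / k

s, T, m = 1500, 0.1, 0.01      # bytes, seconds, marking fraction
x = tcp_rate(s, T, m)          # bytes per second
dt = inter_mark_time(T, m)     # seconds between marks
n_check = (x / s) * dt         # packets sent between two marks
]]></artwork>
        </figure>

        <t>With these values the bound is roughly 184 kilobytes per second
        with a mark about every 0.82 seconds, and n_check recovers N = 100 up
        to floating-point rounding.</t>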

        <t>We rely on this equation for the design of a rate-adaptation
        policer as a variation of a token bucket. In that case a policer has
        to be set up for each policed flow. This may be triggered by cautious
        packets, with the remainder of flows being all rate limited together
        if they do not start with a cautious packet.</t>

        <t>Where maintaining per flow state is not a problem, for instance on
        some access routers, systematic per-flow policing may be considered.
        Should per-flow state be more constrained, rate adaptation policing
        could be limited to a random sample of flows exhibiting positive or
        cancelled packets.</t>

        <t>As in the case of user policing, only positive or cancelled packets
        will consume tokens; however, the number of tokens consumed will depend
        on the congestion signal.</t>

        <t>When a new rate adaptation policer is set up for flow j, the
        following state is created: <list style="symbols">
            <t>a token bucket b_j of depth b_max starting at level b_0</t>

            <t>a timestamp t_j = timenow()</t>

            <t>a counter N_j = 0</t>

            <t>a roundtrip estimate T_j</t>

            <t>a filling rate r</t>
          </list></t>

        <t>When the policing node forwards a packet of flow j that is neither
        positive nor cancelled: <list style="symbols">
            <t>the counter is incremented: N_j += 1</t>
          </list></t>

        <t>When the policing node forwards a packet of flow j that is
        positive or cancelled: <list style="symbols">
            <t>the counter is incremented: N_j += 1</t>

            <t>the token level is adjusted: b_j += r*(timenow()-t_j) -
            sqrt(N_j)* T_j/k</t>

            <t>the counter is reset: N_j = 0</t>

            <t>the timestamp is reset: t_j = timenow()</t>
          </list></t>
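
        <t>The per-flow state and the two update rules above might be
        sketched as follows. This Python fragment is a hedged illustration
        rather than a recommended implementation: the class and method names
        and the boolean 'marked' argument (true for a packet that consumes
        tokens) are assumptions, k is taken at its upper bound, and capping
        the level at b_max is assumed from the description of the bucket
        depth.</t>

        <figure>
          <artwork><![CDATA[
import math

K = math.sqrt(3 / 2)    # constant k from Mathis' formula (upper bound)

class FlowRatePolicer:
    def __init__(self, now, rtt, b_0, b_max, r=1.0):
        self.level = b_0      # token level b_j, starts at b_0
        self.b_max = b_max    # bucket depth
        self.t = now          # timestamp t_j
        self.n = 0            # counter N_j
        self.rtt = rtt        # roundtrip estimate T_j (seconds)
        self.r = r            # filling rate (tokens per second)

    def on_packet(self, marked, now):
        """Update state; return True while the level is non-negative."""
        self.n += 1                        # N_j += 1 for every packet
        if marked:                         # token-consuming packet
            fill = self.r * (now - self.t)
            cost = math.sqrt(self.n) * self.rtt / K
            self.level = min(self.b_max, self.level + fill - cost)
            self.n = 0                     # reset the counter
            self.t = now                   # reset the timestamp
        return self.level >= 0
]]></artwork>
        </figure>

        <t>In this sketch a flow that sends at the TCP-compliant rate leaves
        the level unchanged on average, whereas a flow sending faster for the
        same marking period pays more per mark than it is refilled and
        eventually drives the level negative.</t>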

        <t>An implementation example will be given in a later draft that
        avoids having to extract the square root.</t>

        <t>Analysis: For a TCP flow, for r = 1 token/sec, on average, <list
            style="empty">
            <t>r*(timenow()-t_j)-sqrt(N_j)* T_j/k = dt_TCP - sqrt(N)*T/k =
            0</t>
          </list></t>

        <t>This means that the token level will fluctuate around its initial
        level. The depth b_max of the bucket sets the timescale on which the
        rate adaptation policy is performed while the filling rate r sets the
        trade-off between responsiveness and robustness: <list style="symbols">
            <t>the higher b_max, the longer it will take to catch greedy
            flows</t>

            <t>the higher r, the fewer false positives (greedy verdict on
            compliant flows) but the more false negatives (compliant verdict
            on greedy flows)</t>
          </list></t>

        <t>This rate adaptation policer requires the availability of a
        roundtrip estimate which may be obtained for instance from the
        application of re-feedback to the downstream delay <xref
        target="retcp_Re-TTL"></xref> or passive estimation <xref
        target="Jiang02"></xref>.</t>

        <t>When the bucket of a policer located at the access router (whether
        it is a per-user policer or a per-flow policer) becomes empty, the
        access router SHOULD drop at least all packets causing the token level
        to become negative. The network operator MAY take further sanctions if
        the token level of the per-flow policers associated with a user
        becomes negative.</t>
      </section>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Alg_Metering"
             title="Downstream Congestion Metering Algorithms">
      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_Bulk_Alg_Metering"
               title="Bulk Downstream Congestion Metering Algorithm">
        <t>To meter the bulk amount of downstream congestion in traffic
        crossing an inter-domain border, an algorithm is needed that
        accumulates the size of positive packets and subtracts the size of
        negative packets. We maintain two counters: <list style="empty">
            <t>V_b: accumulated congestion volume</t>

            <t>B: total data volume (in case it is needed)</t>
          </list></t>

        <t>A suitable pseudo-code algorithm for a border router is as
        follows:</t>

        <figure>
          <artwork><![CDATA[
====================================================================
V_b = 0
B   = 0
for each Re-ECN-capable packet {
    b = readLength(packet)      /* set b to packet size          */
    B += b                      /* accumulate total volume       */
    eecn = readEECN(packet)     /* read the extended ECN field   */
    if (eecn == positive || eecn == cautious) {
        V_b += b                /* increment...                  */
    } else if (eecn == negative) {
        V_b -= b                /* ...or decrement V_b...        */
    }                           /* ...depending on EECN field    */
}
====================================================================
]]></artwork>
        </figure>

        <t>At the end of an accounting period this counter V_b represents the
        congestion volume that penalties could be applied to, as described in
        <xref target="retcp_Inter-domain_Policing"></xref>.</t>

        <t>For instance, the accumulated volume of congestion through a border
        interface over a month might be V_b = 5PB (petabyte = 10^15 byte).
        This might have resulted from an average downstream congestion level
        of 1% on an accumulated total data volume of B = 500PB.</t>
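
        <t>For readers who prefer an executable form, the bulk metering
        pseudo-code could be rendered in Python roughly as below. The function
        name, the representation of a packet as a (size, codepoint) pair and
        the string codepoint names are assumptions made for this sketch; the
        sample traffic is invented purely to show the arithmetic.</t>

        <figure>
          <artwork><![CDATA[
def meter_bulk(packets):
    """packets: iterable of (size_in_bytes, eecn_codepoint) pairs."""
    v_b = 0        # accumulated congestion volume V_b
    b_total = 0    # total data volume B
    for size, eecn in packets:
        b_total += size
        if eecn in ("positive", "cautious"):
            v_b += size          # positive volume adds to V_b
        elif eecn == "negative":
            v_b -= size          # negative volume subtracts from V_b
    return v_b, b_total

# Invented sample: 100 packets of 1500 bytes each.
sample = ([(1500, "neutral")] * 97
          + [(1500, "positive")] * 2
          + [(1500, "negative")])
v_b, b_total = meter_bulk(sample)
level = v_b / b_total            # downstream congestion level
]]></artwork>
        </figure>

        <t>In this invented sample V_b is 1500 bytes out of a total of 150000
        bytes, which corresponds to a downstream congestion level of 1%, the
        same ratio as in the example above.</t>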

        <t>{ToDo: Include algorithm for precise downstream congestion.}</t>
      </section>

      <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  -->

      <section anchor="retcp_Inflation_Negative_Flows"
               title="Inflation Factor for Persistently Negative Flows">
        <t>The following process is suggested to complement the simple
        algorithm above in order to protect against the various attacks from
        persistently negative flows described in <xref
        target="retcp_Inter-domain_Policing"></xref>. As explained in that
        section, the most important and first step is to estimate the
        contribution of persistently negative flows to the bulk volume of
        downstream pre-congestion and to inflate this bulk volume as if these
        flows weren't there. The process below has been designed to give an
        unbiased estimate, but it may be possible to define other processes
        that achieve similar ends.</t>

        <t>While the above simple metering algorithm is counting the bulk of
        traffic over an accounting period, the meter should also select a
        subset of the whole flow ID space that is small enough to be able to
        realistically measure but large enough to give a realistic sample.
        Many different samples of different subsets of the ID space should be
        taken at different times during the accounting period, preferably
        covering the whole ID space. During each sample, the meter should
        count the volume of positive packets and subtract the volume of
        negative, maintaining a separate account for each flow in the sample.
        Each sample should run a lot longer than the large majority of flows, to avoid
        a bias from missing the starts and ends of flows, which tend to be
        positive and negative respectively.</t>

        <t>Once the accounting period finishes, the meter should calculate the
        total of the accounts V_{bI} for the subset of flows I in the sample,
        and the total of the accounts V_{fI} excluding flows with a negative
        account from the subset I. Then the weighted mean of all these samples
        should be taken: a_S = sum_{forall I} V_{fI} / sum_{forall I}
        V_{bI}.</t>

        <t>If V_b is the result of the bulk accounting algorithm over the
        accounting period (<xref target="retcp_Bulk_Alg_Metering"></xref>) it
        can be inflated by this factor a_S to get a good unbiased estimate,
        a_S * V_b, of the volume of downstream congestion over the accounting
        period, without being polluted by the effect of persistently negative
        flows.</t>
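
        <t>The calculation of the inflation factor can be illustrated with a
        short Python sketch. The function name, the use of one dictionary of
        per-flow accounts for each sampled subset I, the flow identifiers and
        the numbers are all assumptions for the example. The sketch also
        assumes that the sum of all the sampled accounts is greater than
        zero.</t>

        <figure>
          <artwork><![CDATA[
def inflation_factor(samples):
    """samples: list of dicts, one per sampled subset I, each mapping
    a flow ID to its account (positive minus negative volume)."""
    total_all = 0       # sum over I of V_{bI}
    total_nonneg = 0    # sum over I of V_{fI}
    for accounts in samples:
        values = list(accounts.values())
        total_all += sum(values)
        # V_{fI} leaves out the flows with a negative account.
        total_nonneg += sum(v for v in values if v >= 0)
    return total_nonneg / total_all

# Invented accounts (bytes) for two sampled subsets of the ID space.
samples = [{"a": 300, "b": -100}, {"c": 200, "d": 0}]
a_s = inflation_factor(samples)
v_b = 8000                       # bulk result for the period (bytes)
inflated = a_s * v_b             # estimate of downstream congestion
]]></artwork>
        </figure>

        <t>Here the sampled accounts sum to 400 bytes, or 500 bytes once the
        negative flow is excluded, so a_S = 1.25 and a bulk result of 8000
        bytes is inflated to 10000 bytes.</t>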
      </section>
    </section>

    <!-- ================================================================ -->

    <section anchor="retcp_Re-TTL" title="Re-TTL">
      <t>This Appendix gives an overview of a proposal to be able to overload
      the TTL field in the IP header to monitor downstream propagation delay.
      This is included to show that it would be possible to take account of
      RTT if it was deemed desirable.</t>

      <t>Delay re-feedback can be achieved by overloading the TTL field,
      without changing IP or router TTL processing. A target value for TTL at
      the destination would need standardising, say 16. If the path hop count
      increased by more than 16 during a routing change, it would temporarily
      be mistaken for a routing loop, so this target would need to be chosen
      to exceed typical hop count increases. The TCP wire protocol and
      handlers would need modifying to feed back the destination TTL and
      initialise it. It would be necessary to standardise the unit of TTL in
      terms of real time (as was the original intent in the early days of the
      Internet).</t>

      <t>In the longer term, precision could be improved if routers
      decremented TTL to represent exact propagation delay to the next router.
      That is, for a router to decrement TTL by, say, 1.8 time units it would
      alternate the decrement of every packet between 1 and 2 at a ratio of
      1:4. Although this might sometimes require a seemingly dangerous null
      decrement, a packet in a loop would still decrement to zero after 255
      time units on average. As more routers were upgraded to this more
      accurate TTL decrement, path delay estimates would become increasingly
      accurate despite the presence of some unmodified routers that
      continued to always decrement the TTL by 1.</t>
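
      <t>One way a router might realise such a fractional decrement is to
      carry the unapplied remainder from packet to packet. The Python sketch
      below is only an illustration under that assumption: the class and
      method names are hypothetical, and a real router could equally choose
      the decrement at random with the appropriate probabilities.</t>

      <figure>
        <artwork><![CDATA[
from fractions import Fraction

class FractionalDecrementer:
    def __init__(self, delay_units):
        self.delay = Fraction(delay_units)  # e.g. "1.8" time units
        self.carry = Fraction(0)            # remainder not yet applied

    def next_decrement(self):
        """Integer TTL decrement to apply to the next packet."""
        self.carry += self.delay
        dec = int(self.carry)               # whole units accumulated
        self.carry -= dec
        return dec

d = FractionalDecrementer("1.8")
seq = [d.next_decrement() for _ in range(5)]   # five packets
half = FractionalDecrementer("0.5")
seq_half = [half.next_decrement() for _ in range(4)]
]]></artwork>
      </figure>

      <t>For a delay of 1.8 time units the five decrements are 1, 2, 2, 2 and
      2, which is the 1:4 ratio described above and sums to 9 units over 5
      packets. For a delay of 0.5 time units the decrements alternate between
      0 and 1, which shows where the null decrement arises.</t>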
    </section>

    <section anchor="retcp_Nonce_Limitation"
             title="Argument for holding back the ECN nonce">
      <t>The ECN nonce is a mechanism that allows a /sending/ transport to
      detect if drop or ECN marking at a congested router has been suppressed
      by a node somewhere in the feedback loop---another router or the
      receiver.</t>

      <t>Space for the ECN nonce was set aside in <xref
      target="RFC3168"></xref> (currently proposed standard) while the full
      nonce mechanism is specified in <xref target="RFC3540"></xref>
      (currently experimental). The DCCP specification <xref
      target="RFC4340"></xref> (currently proposed standard) requires that
      "Each DCCP sender SHOULD set ECN Nonces on its packets...". It also
      mandates as a requirement for all CCID profiles that "Any newly defined
      acknowledgement mechanism MUST include a way to transmit ECN Nonce
      Echoes back to the sender.", therefore: <list style="symbols">
          <t>The CCID profile for TCP-like Congestion Control <xref
          target="RFC4341"></xref> (currently proposed standard) says "The
          sender will use the ECN Nonce for data packets, and the receiver
          will echo those nonces in its Ack Vectors."</t>

          <t>The CCID profile for TCP-Friendly Rate Control (TFRC) <xref
          target="RFC4342"></xref> recommends that "The sender [use] Loss
          Intervals options' ECN Nonce Echoes (and possibly any Ack Vectors'
          ECN Nonce Echoes) to probabilistically verify that the receiver is
          correctly reporting all dropped or marked packets."</t>
        </list></t>

      <t>The primary function of the ECN nonce is to protect the integrity of
      the information about congestion: ECN marks and packet drops. However,
      when the nonce is used to protect the integrity of information about
      packet drops, rather than ECN marks, a transport layer nonce will always
      be sufficient (because a drop loses the transport header as well as the
      ECN field in the network header), which would avoid using scarce IP
      header codepoint space. Similarly, a transport layer nonce would protect
      against a receiver sending early acknowledgements <xref
      target="Savage99"></xref>.</t>

      <t>If the ECN nonce reveals integrity problems with the information
      about congestion, the sending transport can use that knowledge for two
      functions: <list style="symbols">
          <t>to protect its own resources, by allocating them in proportion to
          the rates that each network path can sustain, based on congestion
          control,</t>

          <t>and to protect congested routers in the network, by
          drastically slowing down its connection to the destination with
          corrupt congestion information.</t>
        </list></t>

      <t>If the sending transport chooses to act in the interests of congested
      routers, it can reduce its rate if it detects that some malicious party
      in the feedback loop may be suppressing ECN feedback. But this would
      only be useful to congested routers when /all/ senders using them are
      trusted to act in the interest of the congested routers.</t>

      <t>In the end, the only essential use of a network layer nonce is when
      sending transports (e.g. large servers) want to allocate their /own/
      resources in proportion to the rates that each network path can sustain,
      based on congestion control. In that case, the nonce allows senders to
      be assured that they aren't being duped into giving more of their own
      resources to a particular flow. And if congestion suppression is
      detected, the sending transport can rate limit the offending connection
      to protect its own resources. Certainly, this is a useful function, but
      the IETF should carefully decide whether such a single, very specific
      case warrants IP header space.</t>

      <t>In contrast, Re-ECN allows all routers to fully protect themselves
      from such attacks, without having to trust anyone - senders, receivers,
      neighbouring networks. Re-ECN is therefore proposed in preference to the
      ECN nonce on the basis that it addresses the generic problem of
      accountability for congestion of a network's resources at the IP
      layer.</t>

      <t>Delaying the ECN nonce is justified because the applicability of the
      ECN nonce seems too limited for it to consume a two-bit codepoint in the
      IP header. It therefore seems prudent to give time for an alternative
      way to be found to do the one function the nonce is essential for.</t>

      <t>Moreover, while we have re-designed the Re-ECN codepoints so that
      they do not prevent the ECN nonce progressing, the same is not true the
      other way round. If the ECN nonce started to see some deployment
      (perhaps because it was blessed with proposed standard status),
      incremental deployment of Re-ECN would effectively be impossible,
      because Re-ECN marking fractions at inter-domain borders would be
      polluted by unknown levels of nonce traffic.</t>

      <t>The authors are aware that Re-ECN must prove it has the potential it
      claims if it is to displace the nonce. Therefore, every effort has been
      made to complete a comprehensive specification of Re-ECN so that its
      potential can be assessed. We therefore seek the opinion of the Internet
      community on whether the Re-ECN protocol is sufficiently useful to
      warrant standards action.</t>
    </section>
  </back>
</rfc>
