<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
    which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
    There has to be one entity for each item to be referenced. 
    An alternate method (rfc include) is described in the references. -->


<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC3168 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3168.xml">
<!ENTITY RFC3540 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3540.xml">
<!ENTITY RFC5562 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5562.xml">
<!ENTITY RFC5681 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5681.xml">
<!ENTITY RFC5690 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5690.xml">

]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
    please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
    (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
    (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="info" docName="draft-ietf-tcpm-accecn-reqs-02" ipr="trust200902">
	<!-- updates="3186" -->
 <!-- category values: std, bcp, info, exp, and historic
    ipr values: trust200902, noModificationTrust200902, noDerivativesTrust200902,
       or pre5378Trust200902
    you can add the attributes updates="NNNN" and obsoletes="NNNN" 
    they will automatically be output with "(if approved)" -->

 <!-- ***** FRONT MATTER ***** -->

 <front>
   <!-- The abbreviated title is used in the page header - it is only necessary if the 
        full title is longer than 39 characters -->

   <title abbrev="Requirements for More Accurate ECN">Problem Statement and Requirements for a More Accurate ECN Feedback</title>

   <!-- add 'role="editor"' below for the editors if appropriate -->

   <!-- Another author who claims to be an editor -->

   <author fullname="Mirja Kühlewind" initials="M." role="editor"
	   surname="Kühlewind">
    	<organization>University of Stuttgart</organization>
    	<address>
    		<postal>
    			<street>Pfaffenwaldring 47</street>
    			<code>70569</code>
    			<city>Stuttgart</city>
    			<country>Germany</country>
    		</postal>
    		<email>mirja.kuehlewind@ikr.uni-stuttgart.de</email>
    	</address>
    </author>
    
    <author fullname="Richard Scheffenegger" initials="R."
           surname="Scheffenegger">
     <organization>NetApp, Inc.</organization>
     <address>
       <postal>
         <street>Am Euro Platz 2</street>
         <code>1120</code>
         <city>Vienna</city>
         <region></region>
         <country>Austria</country>
       </postal>
       <phone>+43 1 3676811 3146</phone>
       <email>rs@netapp.com</email>
     </address>
    </author>

   <date year="2013" />


   <area>Transport</area>
   
   <workgroup>TCP Maintenance and Minor Extensions (tcpm)</workgroup>
   
   <keyword>Internet-Draft</keyword>
   <keyword>I-D</keyword>

   <abstract>
     <t>Explicit Congestion Notification (ECN) is an IP/TCP mechanism where network nodes 
     can mark IP packets instead of dropping them to indicate congestion to the end-points. 
     An ECN-capable receiver feeds this information back to the sender. ECN is specified for
     TCP in such a way that only one feedback signal can be transmitted per Round-Trip Time (RTT).
     Recently proposed TCP mechanisms like ConEx or DCTCP need more accurate ECN feedback information in the case 
     where more than one marking is received in one RTT. 
     This document specifies requirements for a different ECN feedback scheme in the TCP header 
     that provides more than one feedback signal per RTT.
     </t>
    
   </abstract>
 </front>

 <middle>
   <section title="Introduction">
   <t>Explicit Congestion Notification (ECN) <xref target="RFC3168"/> is 
	   an IP/TCP mechanism where
	   network nodes can mark IP packets instead of dropping them to indicate congestion to
	   the end-points. An ECN-capable receiver feeds this information back to the sender.
	   ECN is specified for TCP in such a way that only one feedback signal can be
	   transmitted per Round-Trip Time (RTT). This is sufficient for current congestion control
	   mechanisms, as only one reduction in sending rate is performed per RTT, independent of the 
	   number of ECN congestion marks.
	   However, recently proposed mechanisms like Congestion Exposure (ConEx) or DCTCP 
	   <xref target="Ali10"/> need more accurate ECN feedback information 
	   to work correctly in the case where more than one marking is received in one RTT.
     </t>
     <t> The following scenarios briefly illustrate where accurate feedback 
	     is needed or provides additional value:
	     <list hangIndent="8" style="hanging">
	     <t hangText="A Standard (RFC5681) TCP sender that supports ConEx:"><vspace/>
		     In this case the congestion control algorithm still ignores multiple 
		     marks per RTT, while the ConEx mechanism uses the extra information 
		     per RTT to re-echo more precise congestion information. </t>
	     <t hangText="A sender using DCTCP congestion control without ConEx:"><vspace/>
		     The congestion control algorithm uses the extra info per RTT to 
		     perform its decrease depending on the number of congestion marks.</t>
	     <t hangText="A sender that uses DCTCP congestion control and supports ConEx:"><vspace/>
		     Both the congestion control algorithm and ConEx use the accurate 
		     ECN feedback mechanism.</t>
	     <t hangText="A standard TCP sender (using RFC5681 congestion control algorithm) without ConEx:"><vspace/>
		     No accurate feedback is necessary here. The congestion control 
		     algorithm still reacts to only one signal per RTT. But it is best 
		     to have one generic feedback mechanism, whether it is used or not.</t>
	     </list>
     	</t>
     
     <t>
     This document summarizes the requirements for a new, more accurate ECN feedback scheme. 
     While a new feedback scheme should still deliver performance identical to that of classic ECN, this document
     also clarifies what else has to be taken into consideration. The listed requirements
     should therefore be addressed in the specification of a more accurate ECN feedback scheme.
     Moreover, as a large set of proposals already exists, a few high-level design choices are sketched
     and briefly discussed to demonstrate some of the benefits and drawbacks of each of these potential
     schemes.
     </t>
     
     <section title="Requirements Language">
	     <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
		     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
		     document are to be interpreted as described in <xref
									 target="RFC2119">RFC 2119</xref>.</t>
	     <t>We use the following terminology from <xref target="RFC3168"/> and <xref target="RFC3540"/>:</t>
	     <t>The ECN field in the IP header:
		     <list hangIndent="10" style="empty"><t>
				     <list hangIndent="9" style="hanging">
					     <t hangText="CE:">the Congestion Experienced codepoint,</t>
					     <t hangText="ECT(0):">the first ECN-capable Transport codepoint, and</t>
					     <t hangText="ECT(1):">the second ECN-capable Transport codepoint.</t>
				     </list>
		     </t></list>
	       The ECN flags in the TCP header:
		     <list hangIndent="10" style="empty"><t>
				     <list hangIndent="9" style="hanging">
					     <t hangText="CWR:">the Congestion Window Reduced flag,</t>
					     <t hangText="ECE:">the ECN-Echo flag, and</t>
					     <t hangText="NS:">the ECN Nonce Sum flag.</t>
				     </list>
		     </t></list>
     	     </t>
	     <t> In this document, the ECN feedback scheme as specified 
		     in <xref target="RFC3168"/> is called 'classic ECN' 
		     and any new proposal the 'more accurate ECN feedback' scheme. 
		     A 'congestion mark' is defined as an IP packet where the CE codepoint is set.
		     A 'congestion event' refers to one or more congestion marks belonging to the same 
		     overload situation in the network (usually during one RTT). A TCP segment with the
		     acknowledgment flag set is simply called an ACK.
	     </t>
     </section>
   </section>
     
     <section title="Overview of ECN and ECN Nonce in IP/TCP">
       <t>ECN requires two bits in the IP header. The ECN capability of a 
         packet is indicated when either one of the two bits is set. An 
         ECN sender can set one or the other bit to indicate 
         an ECN-capable transport (ECT) which results in two signals, 
         ECT(0) and ECT(1). A network node can set both bits simultaneously 
         when it experiences congestion. When both bits are set the 
         packet is regarded as "Congestion Experienced" (CE).
	     </t>
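       <t>The bit combinations described above can be sketched as follows. 
         This is a minimal illustration, assuming the codepoint values from 
         <xref target="RFC3168"/> (Not-ECT = 00, ECT(1) = 01, ECT(0) = 10, 
         CE = 11) in the two low-order bits of the IPv4 TOS or IPv6 Traffic 
         Class octet. The function and constant names are hypothetical and 
         not part of any specification.
	     </t>
	     <figure><artwork><![CDATA[
```python
# ECN field codepoints (RFC 3168): the two low-order bits of the
# IPv4 TOS / IPv6 Traffic Class octet. Names are illustrative only.
NOT_ECT = 0b00  # sender is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport, second codepoint
ECT_0 = 0b10    # ECN-capable transport, first codepoint
CE = 0b11       # Congestion Experienced (both bits set)

NAMES = {NOT_ECT: "Not-ECT", ECT_1: "ECT(1)", ECT_0: "ECT(0)", CE: "CE"}


def ecn_codepoint(tos_octet: int) -> int:
    """Extract the two-bit ECN field from the octet."""
    return tos_octet & 0b11


def mark_ce(tos_octet: int) -> int:
    """Sketch of a congested node marking a packet.

    Only ECN-capable packets are marked; a Not-ECT packet is returned
    unchanged here (a real node would drop it instead).
    """
    if ecn_codepoint(tos_octet) == NOT_ECT:
        return tos_octet
    return tos_octet | CE
```
]]></artwork></figure>
	     <t>For example, marking a packet that carries ECT(0) or ECT(1) yields 
         CE in both cases, which is why the receiver can no longer tell 
         which of the two ECT codepoints the sender originally used.
	     </t>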
		     
	     <t>In the TCP header the first two bits in byte 14 are defined 
       	for the use of ECN. The TCP mechanism for signaling the reception 
       	of a congestion mark uses the ECN-Echo (ECE) flag in the TCP header. 
       	To enable the TCP receiver to determine when to stop setting the 
       	ECN-Echo flag, the CWR flag is set by the sender upon reception of 
       	the feedback signal. This always leads to a full RTT of ACKs with 
       	ECE set. Thus any additional CE markings arriving within this RTT 
       	cannot be signaled back. 
	     </t>
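	     <t>The ECE/CWR behavior described above can be illustrated with a 
       	small simulation. This is a sketch only: it assumes one ACK per 
       	data segment (no delayed ACKs, no loss), and the function name and 
       	input format are hypothetical.
	     </t>
	     <figure><artwork><![CDATA[
```python
def classic_ecn_echo(segments):
    """Simulate the classic ECN receiver.

    segments: list of (ce_marked, cwr_flag) tuples, one per arriving
    data segment, in arrival order.
    Returns the ECE flag value carried by the ACK for each segment.
    """
    ece_latched = False
    acks = []
    for ce_marked, cwr_flag in segments:
        if cwr_flag:
            ece_latched = False  # sender has reacted: stop echoing
        if ce_marked:
            ece_latched = True   # (re)start echoing
        acks.append(ece_latched)
    return acks


# One CE mark in the RTT, then the sender's CWR arrives.
one_mark = [(True, False), (False, False), (False, False),
            (False, False), (False, True), (False, False)]
# Three CE marks in the same RTT, then CWR.
three_marks = [(True, False), (False, False), (True, False),
               (True, False), (False, True), (False, False)]
```
]]></artwork></figure>
	     <t>Both inputs produce the same ACK sequence, so under these 
       	assumptions the sender cannot distinguish one congestion mark from 
       	three within the same RTT.
	     </t>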
	     <t>
         ECN-Nonce <xref target="RFC3540"/> is an optional addition to ECN 
         that is used to protect the TCP sender against accidental or 
         malicious concealment of marked or dropped packets. This addition 
         defines the last bit of byte 13 in the TCP header as the Nonce 
         Sum (NS) bit. With ECN-Nonce a nonce sum is maintained that counts 
         the occurrence of ECT(1) packets.
	     </t>
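	     <t>The following sketch illustrates the idea behind the nonce sum. 
         It assumes, following <xref target="RFC3540"/>, that the sum is a 
         single bit, i.e. the count of ECT(1) packets modulo 2. The 
         function name and the string representation of codepoints are 
         hypothetical.
	     </t>
	     <figure><artwork><![CDATA[
```python
def nonce_sum(codepoints):
    """One-bit nonce sum over a sequence of ECN codepoints.

    ECT(1) contributes 1, every other codepoint contributes 0;
    the sum is kept modulo 2 (XOR).
    """
    total = 0
    for codepoint in codepoints:
        total ^= 1 if codepoint == "ECT(1)" else 0
    return total


# What the sender transmitted and therefore expects to be summed.
sent = ["ECT(1)", "ECT(0)", "ECT(1)", "ECT(1)"]
# The third packet was CE-marked in the network, erasing its nonce.
received = ["ECT(1)", "ECT(0)", "CE", "ECT(1)"]
```
]]></artwork></figure>
	     <t>Here the sum over the received packets differs from the sum the 
         sender expects, so a receiver that wants to conceal the CE mark 
         has to guess the erased nonce and is wrong half of the time.
	     </t>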
	     <figure anchor="TCPHdr" align="center" title="The (post-ECN Nonce) definition of the TCP header flags">
<artwork align="center"><![CDATA[						 
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|               |           | N | C | E | U | A | P | R | S | F |
| Header Length | Reserved  | S | W | C | R | C | S | S | Y | I |
|               |           |   | R | E | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
]]></artwork>
  	   </figure>
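	     <t>As an illustration of the flag layout in the figure above, the 
         following sketch extracts the header length and the flag bits 
         from a raw TCP header. It assumes a header of at least 14 bytes 
         in network byte order; the function and dictionary names are 
         hypothetical.
	     </t>
	     <figure><artwork><![CDATA[
```python
import struct

# Bit masks within the 16-bit word at byte offset 12 of the TCP header,
# following the figure above (bit 0 is the most significant bit).
FLAG_BITS = {
    "NS": 0x100, "CWR": 0x080, "ECE": 0x040, "URG": 0x020,
    "ACK": 0x010, "PSH": 0x008, "RST": 0x004, "SYN": 0x002,
    "FIN": 0x001,
}


def parse_flags(tcp_header: bytes):
    """Return (header length in bytes, set of flag names that are set)."""
    (word,) = struct.unpack_from("!H", tcp_header, 12)
    header_len = (word >> 12) * 4  # field counts 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if word & bit}
    return header_len, flags


# Example: a 20-byte header with NS, ECE and ACK set.
example = bytes(12) + struct.pack("!H", (5 << 12) | 0x150) + bytes(6)
```
]]></artwork></figure>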
   </section>
     
     <section title="Requirements">
     <t>An accurate ECN feedback protocol, as needed by e.g. ConEx or DCTCP, 
     	 requires signaling that is fairly accurate (not necessarily 
     	 perfect), timely and protected. This leads to the following 
     	 requirements, which should be discussed for any proposed more accurate 
	 ECN feedback scheme:</t>
     
     <t><list hangIndent="8" style="hanging">
     <t hangText="Resilience"><vspace/>The ECN feedback signal is carried within the 
     ACK. TCP ACKs can get lost. Moreover, delayed ACKs are mostly used 
     with TCP. That means in most cases only every second data packet triggers an ACK. 
     In a high congestion situation where most of the packets are marked with CE, an 
     accurate feedback mechanism must still be able to signal sufficient congestion 
     information. Thus the accurate ECN feedback extension has to take delayed ACKs and 
     ACK loss into account.</t>
     
    <t hangText="Timeliness"><vspace/>The CE mark is induced by a network node on the 
     transmission path and echoed by the receiver in the TCP ACK. Thus when 
     this information arrives at the sender, it is naturally already about one RTT old. 
     With a sufficient ACK rate, a further delay of a small number of ACKs can be 
     tolerated, but with large delays this information will be outdated due to the high 
     dynamics in the network. TCP congestion control, which introduces part of these 
     dynamics, operates on a time scale of one RTT. Thus the congestion feedback 
     information should be delivered in a timely manner (within one RTT).</t>
     
     <t hangText="Integrity"><vspace/>With ECN Nonce, a misbehaving receiver or network node 
       can be detected with good probability. If the accurate ECN feedback 
       reuses the NS bit, it is encouraged to ensure integrity at least as good as 
       that of ECN Nonce. If this is not possible, alternative approaches should be provided 
       that describe how a mechanism using the accurate ECN feedback extension can re-ensure 
       integrity or give strong incentives for the receiver and network node to 
       cooperate honestly. <!--If and what kind of 
     enforcements a sender should do, when detecting wrong feedback information, is 
     out-of-scope.--></t>
     
     <t hangText="Accuracy"><vspace/><!--In TCP usually delayed ACKs are used. Thats means in 
     most cases only for every second data packets an acknowledgment is sent. Moreover, 
     an ACK can get lost.-->Classic ECN feeds back one congestion notification per RTT, as 
     this is intended for TCP congestion control, which reduces the sending 
     rate at most once per RTT. The accurate ECN feedback scheme has to ensure that, 
     if a congestion event occurs, at least one congestion notification is echoed and 
     received per RTT, as classic ECN would do. Of course, the goal of this extension is to 
     reconstruct the number of CE markings (more) accurately and in the best case even to reconstruct 
     the (exact) number of payload bytes that a CE-marked packet was carrying. However, a sender should 
     not assume that it gets the exact number of congestion markings or marked bytes in all situations.</t>
     
     <t hangText="Complexity"><vspace/>Of course, the more accurate ECN feedback can 
       also be used even if only one ECN feedback signal per RTT is needed. The 
       implementation should be as simple as possible and only a minimum of additional 
       state information should be needed. <!--A proposal fulfilling this for a more
       accurate ECN feedback can then also be the standard ECN feedback mechanism.--></t>
     </list></t>
     </section>
     
     <section title="Design Approaches">
	     <t>All discussed approaches aim to provide accurate ECN feedback information as long as
		     no ACK loss occurs and the congestion rate is reasonable. Otherwise the proposed schemes 
		     have different resilience characteristics depending on the number of bits used for the 
		     encoding. While classic ECN provides reliable (but inaccurate) feedback of at most one 
		     congestion signal per RTT, the proposed schemes do not implement any 
		     acknowledgement mechanism.
	     </t>
	<section title="Re-use of ECN/NS Header Bits">
     <t>The three ECN/NS header bits, ECE, CWR and NS, are re-used (not only for additional 
       capability negotiation during the TCP handshake exchange but) to signal the current value of
       a CE counter at the receiver. This approach only provides limited resilience against ACK loss, depending on the number of bits used.
     </t>
     <t>Several codings have been proposed so far: A one-bit scheme sends one ECE 
	     for each CE received (while the CWR could be used to introduce redundant information in the next ACK to increase the robustness against ACK loss). A 3-bit
	     counter scheme uses all three bits for continuously feeding back the three least
	     significant bits of a CE counter. 
	     A 3-bit codepoint scheme encodes either a CE counter or an ECT(1) counter in 8 codepoints.
     </t>
     <t>The proposed schemes provide accumulated information on ECN-CE-marking feedback, similar to the number
	     of acknowledged bytes in the TCP header. Due to the limited number of bits, the 
	     ECN feedback information will wrap around more often (than the acknowledgement number). Thus even a small number of 
	     ACK losses can cause feedback information to be lost. The resilience could be 
	     increased by introducing redundancy, e.g. sending each counter increase twice or 
	     more times. Of course any of these additional mechanisms will increase the complexity.
	     If the congestion rate is larger than the ACK rate (multiplied by the number of feedback
	     signals that can be carried per ACK), the congestion information cannot be fed back 
	     correctly. 
	     Thus an accurate ECN feedback mechanism also needs to cover the worst-case situation
	     where every packet is CE marked. This can potentially be realized by dynamically adapting the
	     ACK rate and redundancy, which again increases complexity and potentially also the signaling
	     overhead. For all schemes, an integrity check is only provided if ECN Nonce can be supported.
     </t>
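     <t>The wrap-around problem can be illustrated with a small sketch of a 3-bit 
	     counter scheme. It assumes that the fed-back field carries the receiver's 
	     CE counter modulo 8 and that the sender reconstructs the number of new marks 
	     from the difference between two successively received fields. The names 
	     are hypothetical and do not correspond to any specific proposal.
     </t>
     <figure><artwork><![CDATA[
```python
COUNTER_BITS = 3
MODULUS = 1 << COUNTER_BITS  # 8 distinct field values


def encode(ce_count: int) -> int:
    """Receiver side: the field carries the CE counter modulo 8."""
    return ce_count % MODULUS


def decode_delta(prev_field: int, new_field: int) -> int:
    """Sender side: new CE marks since the previously received ACK.

    Correct only if fewer than 8 marks occurred in between.
    """
    return (new_field - prev_field) % MODULUS


# Counter went from 6 to 9 between two received ACKs: the field
# wraps from 6 to 1, but the delta of 3 is still recovered.
small_gap = decode_delta(encode(6), encode(9))

# ACK loss: counter went from 0 to 9 before the next ACK arrived.
# The sender sees a delta of 1 and misses 8 marks.
large_gap = decode_delta(encode(0), encode(9))
```
]]></artwork></figure>
     <t>With only three bits, eight or more CE marks between two ACKs that reach 
	     the sender are undercounted, which is the case the redundancy and ACK-rate 
	     adaptation mentioned above would have to address.
     </t>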
     </section>
     <section title="Use of Other Header Bits">
	<t>As seen in <xref target="TCPHdr"/>, there are currently three unused flag bits
	in the TCP header. The proposed 3 bit or codepoint schemes could be extended by one or
	more bits, to add higher resilience against ACK loss. The relative gain 
	would be proportionally higher resilience against ACK loss, while the respective 
     	drawbacks would remain identical.
        </t>
	<t>Moreover, the Urgent Pointer could be used if the Urgent Flag is not set. As this is
		often the case, the resilience could be increased without additional signaling overhead.
	</t>
     </section>
     <section title="TCP Option">
     <t>Alternatively, a new TCP option could be introduced to help maintain the 
       accuracy and integrity of the ECN feedback between receiver and sender. 
       Such an option could provide higher resilience and even more information. 
       For example, ECN for RTP/UDP explicitly provides 
       the number of ECT(0), ECT(1), CE, non-ECT marked and lost packets. 
       However, deploying new TCP options has its own challenges. Moreover, to actually
       achieve high resilience, this option would need to be carried by either all or a large number
       of ACKs. Thus this approach would introduce considerable signaling overhead, while ECN feedback is not
       such critical information (as in the worst case, loss will still be available to provide a strong
       congestion feedback signal).
       In any case, such a TCP option 
       could also be used in addition to a more accurate ECN feedback scheme in the TCP header or in addition to classic ECN, only when available and needed.
     </t>
     </section>
     <!--<t>Combining the idea of <xref target="eci_mode"/> and <xref target="cp_mode"/>,
     further extending it to a one-octet option, would allow the signaling of two 
     values, each with 4 bit. The gains in worst case ACK loss, delayed ACK ratios
     and maintaining ECN Nonce would scale accordingly.
    </t>
    <t>Alternatively, if timestamp capability negotiation is supported, a few
     bits could be extracted from the timestamp value, to provide extended
     signaling. However, processing TCP options (or overloaded TCP options) is
     more complex than processing of header flags.
    </t>-->

     </section>
     
     
   

   <section title="Acknowledgements">

   </section>

   <section anchor="IANA" title="IANA Considerations">
     <t>This memo includes no request to IANA.</t>
     <!--<t> If this memo was to progress to standards track, it would update RFC3168 
     	and RFC3540, to add new combinations of flags in the TCP header for capability
      negotiation (see <xref target="TCPNeg"/>) and a change in TCP ECN semantics 
      (see <xref target="TCPSig"/>).</t>-->
   </section>

   <section anchor="Security" title="Security Considerations">
	   <t>If this scheme is used as input for congestion control, the respective algorithm
	   might not react appropriately if ECN feedback information is lost. As those schemes should 
   	still react appropriately to loss, this drawback cannot lead to congestion collapse, though.</t>
	<t>Providing wrong feedback information could otherwise lead to throttling of certain connections. This problem is identical in the classic ECN feedback scheme and should be addressed by an additional integrity check like ECN Nonce.</t>
   </section>
   
 </middle>

 <!--  *****BACK MATTER ***** -->

 <back>

   <references title="Normative References">
     <!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?-->
     &RFC2119;
     &RFC3168;
     &RFC3540;
     
   </references>
   
   <references title="Informative References">
	   <?rfc include="reference.I-D.briscoe-tsvwg-re-ecn-tcp.xml"?>
	   <?rfc include="reference.I-D.kuehlewind-tcpm-accurate-ecn-option.xml"?>

     &RFC5562;
     &RFC5681;
     &RFC5690;
	   
	<reference anchor="Ali10">
		<front>
			<title>DCTCP: Efficient Packet Transport for the Commoditized Data Center</title>
			
			<author initials="M" surname="Alizadeh">
				<organization></organization></author>
			<author initials="A" surname="Greenberg">
				<organization></organization></author>
			<author initials="D" surname="Maltz">
				<organization></organization></author>
			<author initials="J" surname="Padhye">
				<organization></organization></author>
			<author initials="P" surname="Patel">
				<organization></organization></author>
			<author initials="B" surname="Prabhakar">
				<organization></organization></author>
			<author initials="S" surname="Sengupta">
				<organization></organization></author>
			<author initials="M" surname="Sridharan">
				<organization></organization></author>
			<date month="Jan" year="2010"/>
		</front>
	</reference>
		
    </references>
    
 
 </back>
</rfc>
