<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
     which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
     There has to be one entity for each item to be referenced. 
     An alternate method (rfc include) is described in the references. -->

<!ENTITY RFC2068 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2068.xml">
<!ENTITY RFC2616 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2616.xml">
<!ENTITY RFC3260 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3260.xml">
]>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
     please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
     (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
     (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="info" docName="draft-gettys-iw10-considered-harmful-00" ipr="trust200902">
  <!-- category values: std, bcp, info, exp, and historic
     ipr values: full3667, noModification3667, noDerivatives3667
     you can add the attributes updates="NNNN" and obsoletes="NNNN" 
     they will automatically be output with "(if approved)" -->

  <!-- ***** FRONT MATTER ***** -->

  <front>
    <!-- The abbreviated title is used in the page header - it is only necessary if the 
         full title is longer than 39 characters -->

    <title abbrev="IW10 Considered Harmful">IW10 Considered Harmful
    </title>

    <!-- add 'role="editor"' below for the editors if appropriate -->

    <!-- Another author who claims to be an editor -->

    <author fullname="Jim Gettys" initials="J."
            surname="Gettys">
      <organization>Alcatel-Lucent Bell Labs</organization>

      <address>
        <postal>
          <street>21 Oak Knoll Road</street>
          <!-- Reorder these if your country does things differently -->
          <city>Carlisle</city>
          <region>Massachusetts</region>
          <code>01741</code>
          <country>USA</country>
        </postal>

        <phone>+1 978 254-7060</phone>
        <email>jg@freedesktop.org</email>

        <!-- uri and facsimile elements may also be added -->
      </address>
    </author>

    <date year="2011" />

    <!-- If the month and year are both specified and are the current ones, xml2rfc will fill 
         in the current day for you. If only the current year is specified, xml2rfc will fill 
	 in the current day and month for you. If the year is not the current one, it is 
	 necessary to specify at least a month (xml2rfc assumes day="1" if not specified for the 
	 purpose of calculating the expiry date).  With drafts it is normally sufficient to 
	 specify just the year. -->

    <!-- Meta-data Declarations -->

    <area>General</area>

    <workgroup>Internet Engineering Task Force</workgroup>

    <!-- WG name at the upperleft corner of the doc,
         IETF is fine for individual submissions.  
	 If this element is not present, the default is "Network Working Group",
         which is used by the RFC Editor as a nod to the history of the IETF. -->

    <keyword>bufferbloat</keyword>
    <keyword>congestion window</keyword>

    <!-- Keywords will be incorporated into HTML output
         files in a meta tag but they have no effect on text or nroff
         output. If you submit your draft to the RFC Editor, the
         keywords will be used for the search engine. -->

    <abstract>

      <t>The proposed change to the initial window to 10
	in draft-ietf-tcpm-initcwnd must be considered deeply harmful;
	not because the proposed change is evil taken in
	isolation, but because, combined with other changes in web browsers and
	web sites that have occurred over the last decade, it makes
	the problem of transient congestion at a user's
	broadband connection two and a half times worse.  This result has been
	hidden by the already widespread bufferbloat present in broadband connections.
	Packet loss in isolation is no longer a useful metric of a path's quality.
	The very drive to improve latency of web page rendering is already destroying
	other low latency applications, such as VOIP and gaming, and
	will prevent reliable rich real time web applications such
	as those contemplated by the IETF rtcweb working group.  </t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">

      <t>In the second half of the 2000s, competition in web browsers
	reappeared, and its focus changed from strictly features to speed (meaning
	latency, at least as seen from data centers, which can be
	highly misleading), with the discovery (most clearly
	understood by Google) that web sites are stickier the faster
	(lower latency) they are. Perhaps Sergey Brin
	and Larry Page knew Stuart Cheshire at
	Stanford <xref target="Cheshire"></xref>?</t>

      <t>The problem, in short, is the multiplicative effect of the following:</t>
      <t><list style="symbols">
          <t>Browsers ignoring the requirement in <xref target="RFC2068">RFC 2068</xref>
      and <xref target="RFC2616">RFC 2616</xref> to use no more 
	    than two simultaneous TCP connections, with current browsers often using 6 
	    or sometimes many more simultaneous TCP connections</t>
          <t>"Sharded" web sites that sometimes deliberately hide the path to servers 
	    actually located in the same data center, to encourage browsers to 
	    use even more simultaneous TCP connections</t>
	  <t>The proposed change to the TCP congestion window, to allow 
	    each fresh TCP connection to send as much as 2.5 times as 
	    much data as in the past.</t>
	  <t>Current broadband connections having a single queue available to customers, 
	    which is usually badly over-buffered, hiding packet loss</t>
	  <t>Web pages having large numbers of embedded objects</t>
	  <t>Web servers having such large memory caches and so much processing power,
	    even when generating objects on the fly, that responses are
	    often transmitted effectively instantaneously at line rate</t>
        </list>The result can easily be a horrifyingly large impulse of packets 
	sent effectively simultaneously to the user as a continuous packet train, landing in 
	and clogging the one queue in their broadband connection and/or home router for extended 
	periods of time. Any chance for your VOIP call to work correctly, or to avoid being
        fragged in your game, evaporates.</t>
    </section>
    <section anchor="Discussion" title="Discussion">

      <t> The original reasons for the two TCP connection rule in
	<xref target="RFC2068">RFC 2068</xref>
      and <xref target="RFC2616">RFC 2616</xref> in section 8.1.4 are
      long gone.  In the 1990s, dial-up modem banks were often
      badly underbuffered, and multiple simultaneous connections could
      easily cause excessive packet loss due to self-congestion either on
	the dial-up port itself, or the dial-up bank overall.</t>

	<t>
	Since the 1990s, memory has become very cheap, and we now
	have the opposite problem: buffering in broadband
	equipment is much, much too large, larger than any sane
	amount, as shown by the Netalyzr <xref target="Netalyzr"></xref> and FCC
	data <xref target="Sundaresan"></xref>, a phenomenon I christened
	"bufferbloat" as we had lacked a good term for this problem.</t>
	<t>
	What is more, broadband equipment usually provides only
	a single unmanaged, bloated queue to the subscriber; a
	large impulse of packets for a single user will block other
	packets from other applications of that user, or from other
	users who share that connection. This buffering is so large
	that slow start is badly damaged (TCP will attempt to run many
	times faster than it should until packet loss finally brings
	it back under control), and congestion avoidance is no longer
	stable, as I discovered in 2010 <xref target="Gettys"></xref>.</t>

	<t>I had expected and hoped that high performance would be achieved via
	HTTP pipelining <xref target="HTTPPerf"></xref> and that web traffic would
	have longer TCP sessions. HTTP pipelining is painful
	due to HTTP's lack of any multiplexing layer and the lack
	of response numbering to allow out of order responses; "poor
	man's multiplexing" is possible, but complex. The benefits of pipelining
	to the length of TCP sessions are somewhat less than one might naively
	presume, as significantly fewer packets are ultimately
	necessary. But HTTP pipelining has never seen widespread
      browser deployment (though it is supported by a high fraction of
      web servers). You will seldom see packet loss from using many TCP
      connections simultaneously in today's Internet, as buffers are
      now so large they can absorb huge transients, sometimes even a megabyte
	or more.</t>

      <t>Web browsers (seeing no packet loss) changed from obeying the RFC 2616
      requirement of two TCP connections to using 6 or even 15 TCP connections from a browser
      to a web server.  What is more, some web sites, called "sharded
      web sites," deliberately split themselves across multiple names
      to trick web browsers into even more profligate use of TCP
      connections, and there is no way for a web client to easily determine
      that a web site has been so engineered.</t>

      <t>A browser may then render a page with many embedded objects
	(e.g. images).  Current web browsers will therefore
	simultaneously open or reuse 6 or often many more TCP
	connections at once, and an initial congestion window of 4
	packets per connection may be sent from the same data center simultaneously
	to a user's broadband connection. These packets currently
	enter the single queue in most broadband systems between the
	Internet and the home and, with no QOS or other fair queuing
	present, induce transient latency; your VOIP or gaming packet
	will be stuck behind this burst of web traffic until the burst
	drains. Similarly, current home routers often lack any QOS or
	sophisticated queuing to ensure fairness between different
	users.  The proposal by Chu, et al. <xref target="Chu"></xref> to raise the initial
	congestion window from four to 10 makes the blocking and
	resulting latency and jitter problem up to 2.5 times worse.</t>

      <t> Note that broadband equipment is not the only overbuffered
	equipment in most users' paths.  Home routers, 3G
	wireless and users' operating systems are usually even
	worse than broadband equipment. In a user's home, whenever the wireless
	bandwidth happens to be below that of the broadband
	connection, the bottleneck link is the wireless hop, and so
	the problem may occur there rather than in the broadband
	connection. It is the bottleneck link that matters, not
	the theoretical bandwidth of the links. 802.11g is at best
	20Mbps, but often much worse. Other bottleneck points in the
	user's path may also be lacking AQM.</t>

      <t> I believe the performance analysis in
	draft-ietf-tcpm-initcwnd is flawed not by being incorrect in
	what it presents, but by overlooking the latency and jitter
	inflicted on other traffic that is sharing the broadband link,
	due to the large buffering in these links and their typically single queue. 
	The issue is the damage the IW change would do to other real time
	applications sharing that link (including rtcweb applications), or what 
	others sharing that link would do to you. </t>

<t>
Simple arithmetic to compute induced transient latency, even ignoring all overhead, 
comes up with scary results:</t>

<t>Latency table</t>
<texttable anchor="Latency_table" title="Unloaded Latency">
          <preamble></preamble>
<ttcol align="right"># conn</ttcol>
<ttcol align="right">ICW=4 (bytes)</ttcol>
<ttcol align="right">Time @1Mbps</ttcol>
<ttcol align="right">Time @50Mbps</ttcol>
<ttcol align="right">ICW=10 (bytes)</ttcol>
<ttcol align="right">Time @1Mbps</ttcol>
<ttcol align="right">Time @50Mbps</ttcol>
<c>2</c>
<c>12000</c>
<c>96ms</c>
<c>1.92ms</c>
<c>30000</c>
<c>240ms</c>
<c>4.8ms</c>
<c>6</c>
<c>36000</c>
<c>288ms</c>
<c>5.76ms</c>
<c>90000</c>
<c>720ms</c>
<c>14.4ms</c>
<c>15</c>
<c>90000</c>
<c>720ms</c>
<c>14.4ms</c>
<c>225000</c>
<c>1800ms</c>
<c>36ms</c>
<c>30</c>
<c>180000</c>
<c>1440ms</c>
<c>28.8ms</c>
<c>450000</c>
<c>3600ms</c>
<c>72ms</c>
          <postamble></postamble>
</texttable>
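
<t>The arithmetic behind the table can be sketched in a few lines of Python.
  This is an illustrative sketch only: it assumes 1500 byte packets (the size
  implied by the byte counts in the table), ignores all overhead as the table
  does, and the function names are hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
# Illustrative sketch only. PACKET_BYTES = 1500 is an assumption
# inferred from the byte counts in the table; protocol and
# link-layer overhead are ignored, as in the table itself.
PACKET_BYTES = 1500


def burst_bytes(connections, initial_window):
    """Bytes sent if every connection emits a full initial window."""
    return connections * initial_window * PACKET_BYTES


def drain_time_ms(num_bytes, link_mbps):
    """Milliseconds needed to drain num_bytes through the link."""
    # 1 Mbps is 1000 bits per millisecond.
    return num_bytes * 8 / (link_mbps * 1000)


def latency_table(connection_counts=(2, 6, 15, 30),
                  windows=(4, 10), rates_mbps=(1, 50)):
    """Return rows of (conn, icw, bytes, [ms at each rate])."""
    rows = []
    for conn in connection_counts:
        for icw in windows:
            size = burst_bytes(conn, icw)
            times = [drain_time_ms(size, r) for r in rates_mbps]
            rows.append((conn, icw, size, times))
    return rows


if __name__ == "__main__":
    for conn, icw, size, times in latency_table():
        cells = "  ".join(f"{t:8.2f}ms" for t in times)
        print(f"{conn:3d} conn ICW={icw:2d} {size:7d}B  {cells}")
```
]]></artwork>
</figure>
<t>For example, 6 connections at ICW=10 give 90000 bytes, which takes 720ms
  to drain at 1Mbps under these assumptions.</t>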

<t> 1Mbps may be your fair share of a loaded 802.11
  link. 50Mbps is near the top end of today's broadband.  Available bandwidths in
  other parts of the world are often much, much lower than in parts
  of the world where broadband has been deployed. </t>

  <t> Simple experiments over 50Mbps home cable service against Google images
  confirm latencies that reach or sometimes double those in the table.
  Steady-state competing TCP traffic will multiply these times correspondingly; even
  at 50Mbps, reliable, low latency VOIP can therefore be problematic.
  From this table, it becomes obvious that QOS in shared wireless
  networks has become essential, if only because of this change in web
  browser behavior. Note that even the 2 connection rule still results in
  roughly 100ms latencies on 1Mbps connections, which is already very problematic for
  VOIP because of the jitter it induces. 
  Two TCP connections are capable of driving a megabit link at
  saturation over most paths today even from a cold start with ICW=4.
</t>
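
<t>One rough way to read the saturation claim is to compare the first flight of
  packets with the bandwidth-delay product (BDP) of the path: if the first flight
  is at least one BDP, the link has little chance to go idle before more data
  arrives. The sketch below is a simplification under stated assumptions, not a
  measurement: the 1500 byte packet size and the round trip times are
  assumed values, and the function names are hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
# Illustrative sketch. PACKET_BYTES = 1500 and the RTT values used
# below are assumptions, not measurements from this document.
PACKET_BYTES = 1500


def bdp_bytes(link_mbps, rtt_ms):
    """Bandwidth-delay product of the path, in bytes."""
    # 1 Mbps is 1000 bits per millisecond.
    return link_mbps * 1000 * rtt_ms / 8


def initial_flight_bytes(connections, initial_window):
    """Bytes in flight if each connection sends a full window."""
    return connections * initial_window * PACKET_BYTES


def fills_pipe(connections, initial_window, link_mbps, rtt_ms):
    """True if the first flight alone is at least one BDP."""
    flight = initial_flight_bytes(connections, initial_window)
    return flight >= bdp_bytes(link_mbps, rtt_ms)


if __name__ == "__main__":
    for rtt_ms in (20, 50, 96, 200):  # hypothetical path RTTs
        print(rtt_ms, fills_pipe(2, 4, 1, rtt_ms))
```
]]></artwork>
</figure>
<t>Under these assumptions, two connections with ICW=4 put 12000 bytes in flight,
  which equals the BDP of a 1Mbps path with a 96ms round trip time; shorter
  paths are filled by the first flight alone.</t>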

<t>In the effort to maximise speed (as seen by a data center), web
  browsers and servers have turned web traffic into a delta function
  congesting the user's queue for extended periods. Since the
  broadband edge is badly over-buffered, as shown first by Netalyzr,
  packets are usually not lost, but instead fill the one queue separating people from the
  rest of the Internet until they drain.</t>

<t> Many carriers' telephony services are not blocked by this web traffic
  since the carriers have generally provisioned voice channels
  independently of data service; but most competing services such as Vonage or Skype
  will be blocked, as they must use the single, oversized queue. While I do 
  not believe this advantage was by design, it is an
  effect of bufferbloat and of current broadband supporting only a single queue, at
  most accelerating acks ahead of other bulk data packets. In the presently deployed
  broadband infrastructure, any other queues are usually unavailable for use by time
  sensitive traffic, and DiffServ <xref target="RFC3260"></xref> is not implemented 
  in broadband head end
  equipment.  Therefore time sensitive packets share the same queue as non-time sensitive
  bulk data (HTTP) traffic.</t>


    </section>
    <section anchor="solutions" title="Solutions">

      <t>If HTTP pipelining were deployed, it would result in lower actual page load times for most users:
      fewer bytes are needed due to sharing packets among objects and
      requests, with much lower packet overhead, lower ack traffic,
      and significantly better TCP congestion behavior. While increasing the initial window
      someday may indeed make sense, it is truly frightening
      to raise the ICW during this arms race given already deployed
      HTTP/1.1 implementations. <xref target="SPDY">SPDY</xref> should have similar (or better)
      results, but requires server side support that will take time to
      develop and deploy, whereas most deployed web servers have supported
      pipelining for over a decade (sometimes with bugs, which is part of
      why it is painful to deploy web client HTTP pipelining).</t>

      <t>A full discussion of solutions that would improve latency
      for general web browsing without destroying realtime
      applications is beyond the scope of this ID.  I note a few
      quickly (which are not mutually exclusive) that can and should
      be pursued. They all have differing time scales and costs; all
      are desirable in my view, but the discussion would be much more
      than I can cover here.
	</t>
	<t> <list style="symbols"> 
	   <t>Deployment of HTTP/1.1 pipelining (with reduction of the number of simultaneous 
	     connections back to RFC 2616 levels)</t>
	   <t>Deployment of SPDY </t>
	   <t>DiffServ deployment in the broadband edge and its use by 
	     applications</t>
	   <t>DiffServ deployment in home routers (which, often unbeknownst to
	      those not in the gaming industry, has already partially
	      occurred due to its inclusion in the default Linux PFIFO_FAST queueing
	      discipline).</t>
	   <t>Some sort of "per user" or "per machine" queuing mechanism on
	     broadband connections such that complete starvation of 
	     service for extended periods can be avoided.</t>
	  </list>
	   At a deeper and more fundamental level, 
	   individual applications (such as web browsers) may game the 
	   network with concurrent bad results to the user (or other
	   users sharing that edge connection), and with the
	   advent of WebSockets, even individual web applications may similarly
	   game the network's behavior. With the huge 
	   dynamic range of today's edge environments, we have no good way 
	   to know how large a "safe" initial impulse of packets a server may send 
	   into the network in a given situation. 
	   Today there is no disincentive to applications
	   abusing the network.  Congestion exposure mechanisms such
	   as ConEx <xref target="ConEx"></xref> 
	  are badly needed, as is some way to enable users (and
	   their applications, on their behalf) to be aware of and react to 
	   badly behaved applications.
	 </t>
	 </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This memo includes no request to IANA.</t>
    </section>

    <section anchor="Security" title="Security Considerations">

      <t>Current practice of web browsers, in concert with "sharded"
      web sites, changes to the initial congestion window, and the
      currently deployed broadband infrastructure, can be considered a
      denial of (low latency) service attack on consumers' broadband
      service.</t>

    </section>
  </middle>

  <!--  *****BACK MATTER ***** -->

  <back>
    <!-- References split into informative and normative -->

    <!-- There are 2 ways to insert reference entries from the citation libraries:
     1. define an ENTITY at the top, and use "ampersand character"RFC2629; here (as shown)
     2. simply use a PI "less than character"?rfc include="reference.RFC.2119.xml"?> here
        (for I-Ds: include="reference.I-D.narten-iana-considerations-rfc2434bis.xml")

     Both are cited textually in the same manner: by using xref elements.
     If you use the PI option, xml2rfc will, by default, try to find included files in the same
     directory as the including file. You can also define the XML_LIBRARY environment variable
     with a value containing a set of directories to search.  These can be either in the local
     filing system or remote ones accessed by http (http://domain/dir/... ).-->

    <references title="Informative References">
      <!-- Here we use entities that we defined at the beginning. -->

      &RFC2068;

      &RFC2616;

      &RFC3260;

      <!-- References written by individual authors. -->

      <reference anchor="Cheshire"
        target="http://rescomp.stanford.edu/~cheshire/rants/Latency.html">
        <front>
          <title>It's the Latency, Stupid</title>
          <author fullname="Stuart Cheshire" surname="Cheshire">
            <organization>Stanford University</organization>
          </author>
          <date year="1996" /> 
        </front>
      </reference>


      <reference anchor="Netalyzr"
		 target="http://www.icir.org/christian/publications/2010-imc-netalyzr.pdf">
	<front>
	  <title>Netalyzr: illuminating the edge network.</title>
	  <author fullname='Christian Kreibich' surname='Kreibich'>
	    <organization abbrev='ICSI'>
	      International Computer Science Institute
	    </organization>
	  </author>
	  <author fullname='Nicholas Weaver' surname='Weaver'>
	    <organization abbrev='ICSI'>
	      International Computer Science Institute
	    </organization>
	  </author>
	  <author fullname='Boris Nechaev' surname='Nechaev'>
	    <organization>
	     HIIT &amp; Aalto University
	    </organization>
	  </author>
	  <author fullname='Vern Paxson' surname='Paxson'>
	    <organization>
	      ICSI &amp; U.C. Berkeley
	    </organization>
	  </author>
	  <date month='November' year='2010'/>
	</front>
      </reference>

      <reference anchor="Chu"
	 target="http://datatracker.ietf.org/doc/draft-ietf-tcpm-initcwnd/">
	<front>
	  <title>Increasing TCP's Initial Window</title>
	  <author fullname='H.K. Jerry Chu' surname='Chu'>
	    <organization abbrev='Google'>Google, Inc.</organization>
	  </author>
	  <author fullname='Nandita Dukkipati' surname='Dukkipati'>
	    <organization abbrev='Google'>Google, Inc.</organization>
	  </author>
	  <author fullname='Yuchung Cheng' surname='Cheng'>
	    <organization abbrev='Google'>Google, Inc.</organization>
	  </author>
	  <author fullname='Matt Mathis' surname='Mathis'>
	    <organization abbrev='Google'>Google, Inc.</organization>
	  </author>
          <date year="2011" /> 
        </front>
      </reference>

      <reference anchor="Sundaresan"
	 target="http://gtnoise.net/papers/2011/sundaresan:sigcomm2011.pdf">
	<front>
	  <title>Broadband Internet Performance: A View From the Gateway, Proceedings of SIGCOMM 2011</title>
	  <author fullname='Srikanth Sundaresan' surname='Sundaresan'>
	    <organization abbrev='Georgia Tech'>Georgia Tech, Atlanta, USA</organization>
	  </author>
	  <author fullname='Walter de Donato' surname='de Donato'>
	    <organization abbrev='U. Napoli'>University of Napoli Federico II, Napoli, Italy</organization>
	  </author>
	  <author fullname='Nick Feamster' surname='Feamster'>
	    <organization abbrev='Georgia Tech'>Georgia Tech, Atlanta, USA</organization>
	  </author>
	  <author fullname='Renata Teixeira' surname='Teixeira'>
	    <organization abbrev='CNRS/UPMC'>CNRS/UPMC Sorbonne Univ., Paris, France</organization>
	  </author>
	  <author fullname='Sam Crawford' surname='Crawford'>
	    <organization abbrev='SamKnows'>SamKnows, London, UK</organization>
	  </author>
	  <author fullname='Antonio Pescape' surname='Pescape'>
	    <organization abbrev='U. Napoli'>University of Napoli Federico II, Napoli, Italy</organization>
	  </author>
          <date month="August" year="2011" />
        </front>
      </reference>

      <reference anchor="Gettys"
	 target="http://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-throw-stones-at-another/">
	<front>
	  <title>Whose house is of glasse, must not throw stones at another</title>
	  <author fullname='Jim Gettys' surname='Gettys'>
	    <organization abbrev='Bell Labs'>Alcatel-Lucent Bell Labs</organization>
	  </author>
          <date month="December" year="2010" />
        </front>
      </reference>

      <reference anchor="SPDY"
	 target="http://dev.chromium.org/spdy">
	<front>
	  <title>SPDY: An experimental protocol for a faster web</title>
	  <author fullname='Mike Belshe' surname='Belshe'>
	    <organization abbrev='Google'>Google, Inc.</organization>
	  </author>
          <date year="2011" />
        </front>
      </reference>

      <reference anchor="ConEx"
	 target="http://www.bobbriscoe.net/projects/refb/">
	<front>
	  <title>congestion exposure (ConEx), re-feedback and re-ECN</title>
	  <author fullname='Bob Briscoe' surname='Briscoe'>
	    <organization abbrev='BT'>British Telecom, Inc.</organization>
	  </author>
          <date year="2005" />
        </front>
      </reference>

      <reference anchor="HTTPPerf" target="http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html">
	<front>
	  <title>Network Performance Effects of HTTP/1.1, CSS1, and PNG</title>
	  <author fullname='Henrik Frystyk Nielsen' surname='Nielsen'>
	    <organization abbrev='W3C'>
	      World Wide Web Consortium
	    </organization>
	  </author>
	  <author fullname='Jim Gettys' surname='Gettys'>
	    <organization abbrev='W3C/Digital'>
	    World Wide Web Consortium/Digital Equipment Corporation
	    </organization>
	  </author>
	  <author fullname='Anselm Baird-Smith' surname='Baird-Smith'>
	    <organization abbrev='W3C'>
	      World Wide Web Consortium
	    </organization>
	  </author>
	  <author fullname="Eric Prud'hommeaux" surname="Prud'hommeaux">
	    <organization abbrev='W3C'>
	      World Wide Web Consortium
	    </organization>
	  </author>
	  <author fullname='Hakon Wium Lie' surname='Lie'>
	    <organization abbrev='W3C'>
	      World Wide Web Consortium
	    </organization>
	  </author>
	  <author fullname='Chris Lilley' surname='Lilley'>
	    <organization abbrev='W3C'>
	      World Wide Web Consortium
	    </organization>
	  </author>
	  <date month='June' year='1997'/>
	</front>
      </reference>

    </references>
  </back>
</rfc>
