Internet Engineering Task Force S. Dawkins
INTERNET DRAFT G. Montenegro
M. Kojo
V. Magret
N. Vaidya
July 14, 2000
End-to-end Performance Implications of Links with Errors
draft-ietf-pilc-error-04.txt
Status of This Memo
This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.
Comments should be submitted to the PILC mailing list at
pilc@grc.nasa.gov.
Distribution of this memo is unlimited.
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as ``work in
progress.''
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
The rapidly-growing Internet is being accessed by an
increasingly wide range of devices over an increasingly wide
variety of links. At least some of these links do not provide
the reliability that hosts expect, and this expansion into
unreliable links causes some Internet protocols, especially TCP
[RFC793], to perform poorly.
Expires January 19, 2001 [Page 1]
INTERNET DRAFT PILC - Links with Errors July 2000
Specifically, TCP congestion control [RFC2581], while
appropriate for connections that lose traffic primarily
because of congestion and buffer exhaustion, interacts badly
with connections that traverse links with high uncorrected
error rates. Senders may spend an excessive amount of time
waiting for acknowledgements that never arrive, whether because
data was lost in the forward path or acknowledgements were lost
in the return path. Then, although these losses are not due to
congestion-related buffer exhaustion, the sending TCP transmits
at substantially reduced traffic levels while it probes the
network to determine "safe" traffic levels.
This document discusses the specific TCP mechanisms that are
problematic in these environments, and discusses what can be done
to mitigate the problems without introducing intermediate devices
into the connection.
This document does not address issues with other transport
protocols, for example, UDP.
Table of Contents
1.0 Introduction .................................................. 4
1.1 Relationship of this recommendation and [PILC-PEP] ......... 4
1.2 Relationship of this recommendation and [PILC-LINK] ........ 5
2.0 Errors and Interactions with TCP Mechanisms ................... 5
2.1 Slow Start and Congestion Avoidance [RFC2581] .............. 6
2.2 Fast Retransmit and Fast Recovery [RFC2581] ................ 6
2.3 Selective Acknowledgements [RFC2018, SACK-EXT] ............. 8
3.0 Summary of Recommendations .................................... 9
4.0 Topics For Further Work ....................................... 10
4.1 Achieving, and maintaining, large windows .................. 10
5.0 Acknowledgements .............................................. 11
Changes ........................................................... 11
References ........................................................ 12
Authors' addresses ................................................ 15
Appendix A: When TCP Defers Recovery to the Link Layer ............ 16
Appendix B: Detecting Transmission Errors With Explicit
Notifications ..................................................... 17
Appendix C: Appropriate Byte Counting [ALL99] (Experimental) ...... 18
1.0 Introduction
It has been axiomatic that most losses on the Internet are due to
congestion, as routers run out of buffers and discard incoming
traffic. This observation is the basis for current TCP
congestion avoidance strategies - if losses are due to congestion,
there is no need for an explicit "congestion encountered"
notification to the sender.
Quoting Van Jacobson in 1988: "If packet loss is (almost) always
due to congestion and if a timeout is (almost) always due to a
lost packet, we have a good candidate for the `network is congested'
signal." [VJ-DCAC]
This axiom has served the Internet community well, because it
made possible the deployment of TCPs that have allowed the Internet
to accommodate explosive growth in link speeds and traffic levels.
This same explosive growth has attracted users of networking
technologies that DON'T have low uncorrected error rates -
including many satellite-connected users, and many wireless Wide
Area Network-connected users. Users connected to these networks may
not be able to transmit and receive at anything like available
bandwidth because their TCP connections are spending time in
congestion avoidance procedures, or even slow-start procedures, that
were triggered by transmission error in the absence of congestion.
This document makes recommendations about what the participants
in connections that traverse high error-rate links may wish
to consider doing to improve utilization of available bandwidth
in ways that do not threaten the stability of the Internet.
Applications use TCP in very different ways, and these have
interactions with TCP's behavior [HPF-CWV]. Nevertheless,
it is possible to make some basic assumptions about TCP
flows. Accordingly, the mechanisms discussed here are applicable
for all uses of TCP, albeit in varying degrees according to
different scenarios (as noted where appropriate).
This document does not address issues with non-TCP transport
protocols, for example, UDP.
1.1 Relationship of this recommendation and [PILC-PEP]
This document discusses end-to-end mechanisms that do not require
TCP-level awareness by intermediate nodes. This places severe
limitations on what the end nodes can know about the nature of
losses that are occurring between the end nodes. Attempts to
apply heuristics to distinguish between congestion and transmission
error have not been successful [BV97, BV98, BV98a]. A companion
PILC document on Performance-Enhancing Proxies, [PILC-PEP],
relaxes this restriction; because PEPs can be placed on boundaries
where network characteristics change dramatically, PEPs have an
additional opportunity to improve performance over links with
uncorrected errors.
However, generalized use of PEPs contravenes the end-to-end
principle and is highly undesirable given their deleterious
implications with respect to the following [PILC-PEP]: fate
sharing (a PEP adds a third point of failure besides the
endpoints themselves), end-to-end reliability and diagnostics,
security (particularly, network layer security such as IPsec),
mobility (handoffs are much more complex because state must
be transferred), asymmetric routing (PEPs typically require
being on both the forward and reverse paths of a connection),
scalability (PEPs add more state to maintain), QoS transparency
and guarantees, etc.
Not every type of PEP has all the drawbacks listed
above. Nevertheless, the use of PEPs may have very serious
consequences which must be weighed carefully.
1.2 Relationship of this recommendation and [PILC-LINK]
This recommendation is for use with TCP over subnetwork technologies
that have already been deployed. A companion PILC recommendation,
[PILC-LINK], is for designers of subnetworks that are intended to
carry Internet protocols, and have not been completely specified,
so that the designers have the opportunity to reduce the number of
uncorrected errors TCP will encounter.
2.0 Errors and Interactions with TCP Mechanisms
A TCP sender adapts its use of bandwidth based on feedback from
the receiver. When TCP is not able to distinguish between losses
due to congestion and losses due to uncorrected errors, it is
not able to accurately determine available bandwidth.
Some TCP mechanisms, targeting recovery from losses due to
congestion, coincidentally assist in recovery from losses due to
uncorrected errors as well.
2.1 Slow Start and Congestion Avoidance [RFC2581]
Slow Start and Congestion Avoidance [RFC2581] are essential to
the Internet's stability. These mechanisms were designed to
accommodate networks that didn't provide explicit congestion
notification. Although experimental mechanisms like [RFC2481]
are moving in the direction of explicit notification, the effect
of ECN on ECN-aware TCPs is essentially the same as the effect
of implicit congestion notification through congestion-related
loss.
TCP connections experiencing high error rates interact badly
with Slow Start and with Congestion Avoidance, because high
error rates make the interpretation of losses ambiguous -
the sender cannot know intuitively whether detected losses are
due to congestion or to data corruption. TCP makes the "safe"
choice - assume that the losses are due to congestion.
- Whenever TCP's retransmission timer expires, the sender
assumes that the network is congested and invokes slow start.
- During slow start, the sender increases its window in
units of segments. This is why it is important to use an
appropriately sized MTU - and less reliable link layers
often use smaller MTUs.
Recommendation: Slow Start and Congestion Avoidance are MUSTs in
[RFC1122], itself a full Internet Standard. Recommendations in this
document will not interfere with these mechanisms.
2.2 Fast Retransmit and Fast Recovery [RFC2581]
TCPs deliver data as a reliable byte-stream to applications, so
when a segment is lost (whether due to congestion or to
transmission loss), delivery of data to the receiving application
must wait until the missing data is received. The receiver detects
missing segments when segments arrive with out-of-order
sequence numbers.
TCPs SHOULD immediately send an acknowledgement when data
is received out-of-order [RFC2581], sending the next expected
sequence number with no delay, so that the sender can retransmit
the required data and the receiver can resume delivery of data
to the receiving application. When an acknowledgement carries
the same expected sequence number as an acknowledgement that
has already been sent for the last in-order segment received,
it is called a "duplicate ACK".
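The receiver behavior described above can be sketched as follows.
This is a minimal illustration, not text from [RFC2581]: segments
are numbered 0, 1, 2, ... instead of using byte sequence numbers,
and the function name is hypothetical.

```python
def cumulative_acks(arrivals, first_expected=0):
    """Return the ACK number sent in response to each arriving segment.

    Each ACK carries the number of the next segment expected in order,
    so a gap in the sequence produces repeated (duplicate) ACKs.
    """
    received = set()
    expected = first_expected
    acks = []
    for seg in arrivals:
        received.add(seg)
        # Advance past every segment that is now available in order.
        while expected in received:
            expected += 1
        acks.append(expected)
    return acks
```

For arrivals 0, 1, 3, 4, 5, 2 (segment 2 delayed), the ACKs are
1, 2, 2, 2, 2, 6: the first "2" is an ordinary ACK and the next
three are duplicate ACKs.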
Because IP networks are allowed to reorder packets, the receiver
may send duplicate acknowledgements for segments that arrive
out of order due to routing changes, link-level retransmission,
etc. When a TCP sender receives three duplicate ACKs, fast
retransmit [RFC2581] allows it to infer that a segment was
lost. The sender retransmits what it considers to be this lost
segment without waiting for the full retransmission timeout,
thus saving time.
After a fast retransmit, a sender halves its congestion window
and invokes the fast recovery [RFC2581] algorithm, whereby
it invokes congestion avoidance, but not slow start from a
one-segment congestion window. This also saves time.
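A toy model of the sender side may help make the difference between
the two recovery paths concrete. The class below is a sketch under
simplifying assumptions: windows are counted in segments, the
temporary window inflation that [RFC2581] performs during fast
recovery is omitted, and all names are hypothetical.

```python
class RenoSender:
    """Tracks only cwnd, ssthresh and duplicate ACKs (in segments)."""

    DUPACK_THRESHOLD = 3

    def __init__(self, cwnd=16.0):
        self.cwnd = cwnd
        self.ssthresh = float("inf")
        self.last_ack = None
        self.dupacks = 0
        self.retransmitted = []

    def on_ack(self, ack):
        if ack == self.last_ack:
            self.dupacks += 1
            if self.dupacks == self.DUPACK_THRESHOLD:
                # Fast retransmit: resend the segment the receiver wants.
                self.retransmitted.append(ack)
                # Fast recovery: halve the window instead of restarting
                # from a one-segment window.
                self.ssthresh = max(self.cwnd / 2, 2)
                self.cwnd = self.ssthresh
        else:
            self.last_ack = ack
            self.dupacks = 0

    def on_timeout(self):
        # Retransmission timer expiry: fall back to slow start.
        self.ssthresh = max(self.cwnd / 2, 2)
        self.cwnd = 1.0
```

Feeding the ACK sequence 1, 2, 2, 2, 2 to a sender with a window of
16 segments triggers one fast retransmit and leaves a window of 8,
whereas a timeout would have left a window of 1.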
It's important to be realistic about the maximum throughput that
TCP can have over a connection that traverses a high error-rate
link. Even using Fast Retransmit/Fast Recovery, the sender will
halve the congestion window each time a window contains one or
more segments that are lost, and will re-open the window by only
about one additional segment for each round trip in which a full
window of data is acknowledged [RFC2581]. If a connection path
traverses a link that loses one or more segments during recovery,
the one-half reduction takes place again, this time on a reduced
congestion window - and this downward spiral will continue until
the connection is able to recover completely without experiencing
loss.
In general, TCP can increase its congestion window beyond the
delay-bandwidth product. In links with high error rates, the
TCP window may remain rather small for long periods of time
due to any of the following reasons:
1. TCP's congestion avoidance strategy is additive-increase,
multiplicative-decrease, which means that if additional
errors are encountered before the congestion window
recovers completely from a 50-percent reduction, the
effect can be a "downward spiral" of the congestion window
due to additional 50-percent reductions. This "downward
spiral" will hold the congestion window below the capacity
of the path between the endpoints until the error rate
decreases, allowing full recovery by additive increase. Of
course, no downward spiral occurs if the error rate is
constantly high and the congestion window always remains
small.
2. If a network path with high uncorrected error rates DOES
cross a highly congested wireline Internet path,
congestion losses on the Internet have the same effect as
losses due to corruption.
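The "downward spiral" can be illustrated with a small
additive-increase, multiplicative-decrease model. This is only a
sketch: it works in whole round trips, counts the window in
segments, and takes the loss pattern as an input instead of
modelling a real link. The function name and defaults are
assumptions made for the example.

```python
def aimd_trace(loss_per_rtt, start_cwnd=32.0, min_cwnd=1.0):
    """Return the congestion window after each round trip.

    loss_per_rtt is a sequence of booleans; True means at least one
    loss was detected in that round trip.
    """
    cwnd = start_cwnd
    trace = []
    for lost in loss_per_rtt:
        if lost:
            cwnd = max(cwnd / 2, min_cwnd)  # multiplicative decrease
        else:
            cwnd += 1                       # additive increase
        trace.append(cwnd)
    return trace
```

Starting from 32 segments, the pattern loss, ok, ok, loss, ok, loss
yields windows of 16, 17, 18, 9, 10 and 5 segments: each halving
arrives long before additive increase has undone the previous one.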
Not all causes of small windows are related to errors. For
example, HTTP commonly closes TCP connections to indicate
boundaries between requested resources. This means that these
applications are constantly closing "trained" TCP connections
and opening "untrained" TCP connections which will execute slow
start, beginning with one or two segments. This is true even
with HTTP/1.1 in spite of its support for persistent connections.
A small window - especially a window of less than four segments -
effectively prevents the sender from taking advantage of Fast
Retransmits. Moreover, efficient recovery from multiple losses
within a single window requires adoption of new proposals
(NewReno [RFC2582]).
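The four-segment figure follows from simple counting, sketched
below. The sketch assumes a single loss in the window, no
reordering, and no new data sent while the loss is outstanding;
the function names are hypothetical.

```python
def dupacks_after_loss(window_segments, lost_position):
    """Duplicate ACKs generated when one segment of a window is lost.

    lost_position is 0-based within the window. Each segment that is
    sent after the lost one and delivered triggers one duplicate ACK.
    """
    return window_segments - lost_position - 1


def fast_retransmit_possible(window_segments, lost_position, threshold=3):
    """True if enough duplicate ACKs arrive to trigger fast retransmit."""
    return dupacks_after_loss(window_segments, lost_position) >= threshold
```

With a three-segment window even the loss of the first segment
produces only two duplicate ACKs, so the sender must wait for the
retransmission timer. With four segments, fast retransmit works
only if the first segment is the one lost.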
Recommendation: Implement Fast Retransmit and Fast Recovery at
this time. This is a widely-implemented optimization and is
currently at Proposed Standard level. [RFC2488] recommends
implementation of Fast Retransmit/Fast Recovery in satellite
environments. In cases where SACK (see next section) cannot be
enabled for both sides of a connection, NewReno [RFC2582] may be
used by TCP senders to better handle partial ACKs and multiple
losses in a single window.
2.3 Selective Acknowledgements [RFC2018, SACK-EXT]
Selective Acknowledgements allow the repair of multiple segment
losses per window without requiring one (or more) round-trips
per loss.
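To show what the additional information looks like, the sketch
below derives the cumulative ACK point, the SACK blocks, and the
resulting list of holes from the set of segments a receiver holds.
It is illustrative only: segments are numbered from 0 in place of
byte sequence numbers, the limit on how many blocks fit in the TCP
option space is ignored, and the function name is hypothetical.

```python
def sack_view(received_segments):
    """Return (cumulative_ack, sack_blocks, missing).

    sack_blocks are half-open (start, end) ranges of segments held
    above the cumulative ACK point.
    """
    held = set(received_segments)
    cum_ack = 0
    while cum_ack in held:
        cum_ack += 1
    blocks = []
    for seg in sorted(s for s in held if s > cum_ack):
        if blocks and blocks[-1][1] == seg:
            blocks[-1][1] = seg + 1      # extend the current block
        else:
            blocks.append([seg, seg + 1])  # start a new block
    highest = max(held) if held else -1
    missing = [s for s in range(cum_ack, highest) if s not in held]
    return cum_ack, [tuple(b) for b in blocks], missing
```

If segments 0, 1, 3, 4, 6, 7 and 8 have arrived, the cumulative ACK
is 2, the SACK blocks are (3, 5) and (6, 9), and the sender can see
from a single acknowledgement that both segment 2 and segment 5
need to be retransmitted.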
[SACK-EXT] proposes an extension to SACK that allows receivers
to provide more information about the order of delivery of
segments, allowing "more robust operation in an environment of
reordered packets, ACK loss, packet replication, and/or early
retransmit timeouts". [SACK-EXT] has been approved for proposed
standard as a minor but useful update to Selective
Acknowledgements. Unless explicitly stated otherwise, in this
document "Selective Acknowledgements" (or "SACK") refers to the
combination of [RFC2018] and [SACK-EXT].
Selective acknowledgements are most useful in LFNs ("Long Fat
Networks"), because of the long round trip times that may be
encountered in these environments, according to Section 1.1 of
[RFC1323], and are especially useful if large windows are
required, because there is a higher probability of multiple
segment losses per window.
In low-speed, high error-rate environments (for example, the
wireless WAN environment), TCP windows are much smaller, and
burst errors must be much longer in duration in order to damage
multiple segments. Accordingly, the complexity of SACK may not
be justifiable.
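A back-of-the-envelope calculation shows why link speed matters
here. The sketch assumes back-to-back segments of equal size and a
single contiguous error burst; the link speeds and segment sizes in
the usage note are hypothetical values chosen for illustration.

```python
import math


def segment_time_s(segment_bytes, link_bps):
    """Time needed to transmit one segment on the link, in seconds."""
    return segment_bytes * 8 / link_bps


def segments_hit_range(burst_s, segment_bytes, link_bps):
    """Return (fewest, most) segments a burst of burst_s can damage.

    The result depends on where the burst starts relative to the
    segment boundaries.
    """
    ratio = burst_s / segment_time_s(segment_bytes, link_bps)
    return math.floor(ratio) + 1, math.ceil(ratio) + 1
```

A 100 ms burst on a 9600 bit/s link carrying 296-byte segments
damages one or two segments, while the same burst on a 10 Mbit/s
link carrying 1460-byte segments damages 86 or 87.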
On the other hand, if error rates are generally low but
occasionally increase due to interference, TCP will have the
opportunity to increase its window to larger values. When
interference occurs, multiple losses within a window are likely
to occur. In this case, SACK would provide benefits in speeding
the recovery and preventing unnecessary extra reduction of
window size.
Recommendation: SACK as specified in [RFC2018] and updated by
[SACK-EXT] is a Proposed Standard. Implement SACK now for
compatibility with other TCPs.
3.0 Summary of Recommendations
Because existing TCPs have only one implicit loss feedback
mechanism, it is not possible to use this mechanism to
distinguish between congestion loss and transmission error
without additional information. Because congestion affects all
traffic on a path while transmission loss affects only the
specific traffic encountering uncorrected errors, avoiding
congestion has to take precedence over quickly repairing
transmission error. This means that the best that can be
achieved without new feedback mechanisms is minimizing the
amount of time spent unnecessarily in congestion avoidance.
Fast Retransmit/Fast Recovery allows quick repair of loss
without giving up the safety of congestion avoidance. In order
for Fast Retransmit/Fast Recovery to work, the window size must
be large enough to cause the receiver to send three duplicate
acknowledgements before the retransmission timeout interval
expires; if the timer expires first, the sender is forced into
full TCP slow start.
Selective Acknowledgements (SACK) extend the benefit of Fast
Retransmit/Fast Recovery to situations where multiple segment
losses in the window need to be repaired more quickly than can
be accomplished by executing Fast Retransmit for each segment
loss, only to discover the next segment loss.
These mechanisms cover both wireless and wireline environments.
This general applicability attracts more attention and analysis
from the research community.
All of these mechanisms continue to work in the presence
of IPsec.
4.0 Topics For Further Work
Delayed Duplicate Acknowledgements is an attractive scheme,
especially when link layers use fixed retransmission timer
mechanisms that may still be trying to recover when TCP-level
retransmission timeouts occur, adding additional traffic to the
network. This proposal is worthy of additional study, but is not
recommended at this time, because we don't know how to calculate
appropriate amounts of delay for an arbitrary network topology.
It is not possible to use explicit congestion notification
as a surrogate for explicit transmission error notification
(no matter how much we wish it was!). Some mechanism to
provide explicit notification of transmission error would
be very helpful. This might be more easily provided in a
PEP environment, especially when the PEP is the "first hop"
in a connection path, because current checksum mechanisms
do not distinguish between transmission error to a payload
and transmission error to the header - and, if the header is
damaged it's problematic to send explicit transmission error
notification to the right endpoints.
Losses that take place on the ACK stream, especially while a TCP
is learning network characteristics, can make the data stream
quite bursty (resulting in losses on the data stream, as well).
"Appropriate Byte Counting" [ALL99] has been proposed as a
way of limiting this burstiness.
4.1 Achieving, and maintaining, large windows
The recommendations described in this document will aid TCPs in
injecting packets into "errored" connections as fast as possible,
but no faster. The goal here is to optimize the use of available
bandwidth in a way that does not destabilize the Internet.
In addition to these TCP-level recommendations, there is still
additional work to do at the application level, especially with
the dominant application protocol on the World Wide Web, HTTP.
HTTP/1.0 (and its predecessor, HTTP/0.9) used TCP connection
closing to signal a receiver that all of a requested resource
had been transmitted. Because WWW objects tend to be small
in size [MOGUL], TCPs carrying HTTP/1.0 traffic experience
difficulty in "training" on available bandwidth (a substantial
portion of the transfer had already happened, by the time the
TCPs got out of slow start).
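The effect of object size can be estimated with an idealized model
of slow start in which the window doubles every loss-free round
trip. The sketch ignores delayed ACKs, connection setup, ssthresh
and the receiver's advertised window; the default segment size and
initial window are assumptions, as is the function name.

```python
def slow_start_round_trips(object_bytes, mss_bytes=1460, initial_segments=2):
    """Round trips of data needed to send an object during slow start."""
    segments_left = -(-object_bytes // mss_bytes)  # ceiling division
    window = initial_segments
    round_trips = 0
    while segments_left > 0:
        segments_left -= window  # one window of segments per round trip
        window *= 2              # idealized doubling
        round_trips += 1
    return round_trips
```

Under these assumptions a 10240-byte object is finished after three
round trips, when the window has only just reached eight segments,
while a 1,000,000-byte object needs nine.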
Several HTTP modifications have been introduced to improve this
interaction with TCP ("persistent connections" in HTTP/1.0,
with improvements in HTTP/1.1 [RFC2616]). For a variety of
reasons, many HTTP interactions are still HTTP/1.0-style -
relatively short-lived.
Proposals which reuse TCP congestion information across
connections, like TCP Control Block Interdependence [RFC2140],
or the more recent Congestion Manager [BS99] proposal, will have
the effect of making multiple parallel connections impact the
network as if they were a single connection, "trained" after
a single startup transient. These proposals are critical to
the long-term stability of the Internet, because today's users
always have the choice of clicking on the "reload" button in
their browsers and cutting off TCP's exponential backoff -
replacing connections which are building knowledge of the
available bandwidth with connections with no knowledge at all.
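The general idea of reusing congestion information can be sketched
as a per-destination cache. This is not the mechanism specified in
[RFC2140] or [BS99]; it is a hypothetical illustration that seeds a
new connection with whatever the previous connection to the same
destination learned, and the class and method names are invented
for the example.

```python
class SharedCongestionState:
    """Per-destination cache of congestion state (windows in segments)."""

    def __init__(self, default_cwnd=2):
        self.default_cwnd = default_cwnd
        self._by_destination = {}

    def open_connection(self, destination):
        # A new connection starts from cached state when it exists,
        # and from the default initial window otherwise.
        cached = self._by_destination.get(destination)
        if cached is None:
            return {"cwnd": self.default_cwnd, "ssthresh": None}
        return dict(cached)

    def close_connection(self, destination, cwnd, ssthresh):
        # Remember what the closing connection learned about the path.
        self._by_destination[destination] = {"cwnd": cwnd, "ssthresh": ssthresh}
```

A real design would also have to age cached values and share the
window among concurrent connections, which this sketch does not
attempt.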
5.0 Acknowledgements
This recommendation has grown out of RFC 2757, "TCP Over Long
Thin Networks", which was in turn based on work done in the IETF
TCPSAT working group. The authors are indebted to the active
members of the PILC working group. In particular, Mark Allman
gave us copious and insightful feedback. Also, Jamshid Mahdavi
provided text replacements.
Changes
Changes between versions 03 and 04:
Other editorial changes and corrections.
Changes between versions 02 and 03:
Restructure document into discussion of standard mechanisms, work
remaining to be done, and appendices on experimental mechanisms.
Change "Explicit Corruption Notification" to "Explicit Transmission
Error Notification", in order to avoid confusion with "Explicit
Congestion Notification".
Other editorial changes and corrections.
Changes between versions 03 and 04:
Incorporated lots of comments from Mark Allman, too numerous to list
here.
Also incorporated some changes suggested by Jamshid Mahdavi.
SACK-EXT is now approved for Proposed Standard. Reflected this change in
status in the text by treating SACK-EXT in the same way as
SACK.
Changed section name from Delayed Duplicate Acknowledgements to
"When TCP Defers Recovery to the Link Layer" and mentioned
Reiner Ludwig's Eifel algorithm.
Added reference to link-outage in the appendix.
References
[ALL99] Mark Allman, "TCP Byte Counting Refinements," ACM
Computer Communication Review, Volume 29, Number 3, July 1999.
http://www.acm.org/sigcomm/ccr/archive/1999/jul99/ccr-9907-allman.pdf
[BBKVP96] Bakshi, B., Krishna, P., Vaidya, N., Pradhan, D.K.,
"Improving Performance of TCP over Wireless Networks," Technical
Report 96-014, Texas A&M University, 1996.
[BPSK96] Balakrishnan, H., Padmanabhan, V., Seshan, S., Katz, R.,
"A Comparison of Mechanisms for Improving TCP Performance over
Wireless Links," in ACM SIGCOMM, Stanford, California, August
1996.
[BS99] Hari Balakrishnan, Srinivasan Seshan, "The Congestion
Manager", July, 2000. Work in progress, available at
http://www.ietf.org/internet-drafts/draft-ietf-ecm-cm-00.txt
[BV97] Biaz, S., Vaidya, N., "Using End-to-end Statistics to
Distinguish Congestion and Corruption Losses: A Negative Result,"
Texas A&M University, Technical Report 97-009, August 18, 1997.
[BV98] Biaz, S., Vaidya, N., "Sender-Based heuristics for
Distinguishing Congestion Losses from Wireless Transmission
Losses," Texas A&M University, Technical Report 98-013, June
1998.
[BV98a] Biaz, S., Vaidya, N., "Discriminating Congestion Losses
from Wireless Losses using Inter-Arrival Times at the Receiver,"
Texas A&M University, Technical Report 98-014, June 1998.
[HPF-CWV] Handley, M., Padhye, J., Floyd, S., "TCP Congestion
Window Validation," March 2000. Approved for informational rfc,
available at
http://search.ietf.org/internet-drafts/draft-handley-tcp-cwv-02.txt.
[LINK-OUTAGE] G. Montenegro, "Link Outage ICMP Notification,"
July 2000. Work in progress, available at
http://www.ietf.org/internet-drafts/
draft-montenegro-pilc-link-outage-00.txt
[LK00] Reiner Ludwig and Randy Katz, "The Eifel Algorithm:
Making TCP Robust Against Spurious Retransmissions, " ACM
Computer Communication Review, Volume 30, number 1, January
2000. Available at
http://www.acm.org/sigcomm/ccr/archive/2000/jan00/
ccr-200001-ludwig.pdf
[MD95] Gabriel Montenegro and Steve Drach, "System Isolation and
Network Fast-Fail Capability in Solaris," Second USENIX
Symposium on Mobile and Location-Independent Computing, April 1995.
http://www.usenix.org/publications/library/proceedings/mob95/montenegro.html
[MV97] Mehta, M., Vaidya, N., "Delayed Duplicate-Acknowledgements:
A Proposal to Improve Performance of TCP on Wireless Links," Texas
A&M University, December 24, 1997.
Available at http://www.cs.tamu.edu/faculty/vaidya/mobile.html
[PILC-LINK] Phil Karn, Aaron Falk, Joe Touch, Marie-Jose
Montpetit, Jamshid Mahdavi, Gabriel Montenegro, Dan Grossman,
Gorry Fairhurst, "Advice for Internet Subnetwork Designers",
July 2000. Work in progress, available at http://
www.ietf.org/internet-drafts/draft-ietf-pilc-link-design-03.txt
[PILC-PEP] J. Border, M. Kojo, Jim Griner, G. Montenegro,
"Performance Implications of Link-Layer Characteristics: Performance
Enhancing Proxies", July 2000. Work in progress, available
at http://www.ietf.org/internet-drafts/draft-ietf-pilc-pep-03.txt
[PILC-SLOW] S. Dawkins, G. Montenegro, M. Kojo, V. Magret,
"Performance Implications of Link-Layer Characteristics: Slow
Links", July 2000. Work in progress, available at
http://www.ietf.org/internet-drafts/draft-ietf-pilc-slow-04.txt
[P-HTTP] "The Case for Persistent-Connection HTTP", Jeffrey
C. Mogul, Research Report 95/4, May 1995, available as
http://www.research.digital.com/wrl/techreports/abstracts/95.4.html
[RFC793] Jon Postel, "Transmission Control Protocol", September 1981.
RFC 793.
[RFC1122] Braden, R., Requirements for Internet Hosts --
Communication Layers, October 1989. RFC 1122.
[RFC1323] Van Jacobson, Robert Braden, and David Borman. TCP
Extensions for High Performance, May 1992. RFC 1323.
[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and Romanow, A.,
"TCP Selective Acknowledgment Options," RFC 2018, October 1996.
[RFC2140] J. Touch, "TCP Control Block Interdependence", RFC 2140,
April 1997.
[RFC2309] Braden, B. Clark, D., Crowcroft, J., Davie, B., Deering,
S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge,
C., Peterson, L., Ramakrishnan, K.K., Shenker, S., Wroclawski, J.,
Zhang, L., "Recommendations on Queue Management and Congestion
Avoidance in the Internet," RFC 2309, April 1998.
[RFC2481] Ramakrishnan, K.K., Floyd, S., "A Proposal to add Explicit
Congestion Notification (ECN) to IP", RFC 2481, January 1999.
[RFC2488] Mark Allman, Dan Glover, Luis Sanchez. "Enhancing TCP
Over Satellite Channels using Standard Mechanisms," RFC 2488
(BCP 28), January 1999.
[RFC2581] M. Allman, V. Paxson, W. Stevens, "TCP Congestion
Control," April 1999. RFC 2581.
[RFC2582] Floyd, S., Henderson, T., "The NewReno Modification to
TCP's Fast Recovery Algorithm," April 1999. RFC 2582.
[RFC2616] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter,
P. Leach, T. Berners-Lee. "Hypertext Transfer Protocol -- HTTP/1.1",
RFC 2616, June 1999. (Draft Standard)
[SACK-EXT] Sally Floyd, Jamshid Mahdavi, Matt Mathis, Matthew
Podolsky, Allyn Romanow, "An Extension to the Selective
Acknowledgement (SACK) Option for TCP", August 1999. Approved
for proposed standard, available at
http://www.ietf.org/internet-drafts/draft-floyd-sack-00.txt
[SF98] Nihal K. G. Samaraweera and Godred Fairhurst, "Reinforcement
of TCP error Recovery for Wireless Communication", Computer
Communication Review, volume 28, number 2, April 1998. Available at
http://www.acm.org/sigcomm/ccr/archive/1998/apr98/
ccr-9804-samaraweera.pdf
[VJ-DCAC] Van Jacobson, "Dynamic Congestion Avoidance / Control"
e-mail dated February 11, 1988, available from
http://www.kohala.com/~rstevens/vanj.88feb11.txt
[VMPM99] N. H. Vaidya, M. Mehta, C. Perkins, G. Montenegro,
"Delayed Duplicate Acknowledgements: A TCP-Unaware Approach to
Improve Performance of TCP over Wireless," Technical Report
99-003, Computer Science Dept., Texas A&M University, February
1999.
Authors' addresses
Questions about this document may be directed to:
Spencer Dawkins
Fujitsu Network Communications
2801 Telecom Parkway
Richardson, Texas 75082
Voice: +1-972-479-3782
E-Mail: spencer.dawkins@fnc.fujitsu.com
Gabriel E. Montenegro
Sun Labs Networking and Security Group
Sun Microsystems, Inc.
901 San Antonio Road
Mailstop UMPK 15-214
Mountain View, California 94303
Voice: +1-650-786-6288
Fax: +1-650-786-6445
E-Mail: gab@sun.com
Markku Kojo
University of Helsinki/Department of Computer Science
P.O. Box 26 (Teollisuuskatu 23)
FIN-00014 HELSINKI
Finland
Voice: +358-9-7084-4179
Fax: +358-9-7084-4441
E-Mail: kojo@cs.helsinki.fi
Vincent Magret
Corporate Research Center
Alcatel Network Systems, Inc
1201 Campbell
Mail stop 446-310
Richardson Texas 75081 USA
M/S 446-310
Voice: +1-972-996-2625
Fax: +1-972-996-5902
E-mail: vincent.magret@aud.alcatel.com
Nitin Vaidya
Dept. of Computer Science
Texas A&M University
College Station, TX 77843-3112
Voice: +1 409-845-0512
Fax: +1 409-847-8578
Email: vaidya@cs.tamu.edu
Appendix A: When TCP Defers Recovery to the Link Layer
When link layers try aggressively to correct a high underlying
error rate, it is imperative to prevent interaction between
link-layer retransmission and TCP retransmission as these layers
duplicate each other's efforts. It may be preferable to allow a
local mechanism to resolve a local problem, instead of invoking
TCP's end-to-end mechanism and incurring the associated costs,
both in terms of wasted bandwidth and in terms of its effect on
TCP's window behavior. In such an environment it may make sense
to delay TCP's efforts so as to give the link-layer a chance to
recover. With this in mind, the Delayed Dupacks [MV97, VMPM99]
scheme selectively delays duplicate acknowledgements at the
receiver.
At this time, it is not well understood how long the receiver
should delay the duplicate acknowledgments. In particular, the
impact of medium access control (MAC) protocol on the
choice of delay parameter needs to be studied. The MAC
protocol may affect the ability to choose the appropriate
delay (either statically or dynamically). In general,
significant variabilities in link-level retransmission times
can have an adverse impact on the performance of the Delayed
Dupacks scheme.
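The interaction between the delay parameter and link-level recovery
time can be explored with a small event model. The sketch below is
a simplified variant that withholds every duplicate acknowledgement
for a fixed delay, which is not necessarily how [MV97] or [VMPM99]
define the scheme. Times are integers in milliseconds, segments are
numbered from 0, and the function name is hypothetical.

```python
def delayed_dupacks(arrivals, delay):
    """Return the (send_time, ack_number) pairs a receiver emits.

    arrivals is a list of (time, segment_number) sorted by time.
    Duplicate ACKs are held for `delay` and discarded if the gap is
    filled before the delay expires.
    """
    held = set()
    expected = 0
    pending = []  # (release_time, ack_number) for withheld dupacks
    sent = []
    for now, seg in arrivals:
        # Release withheld dupacks whose delay has expired.
        sent.extend(p for p in pending if p[0] <= now)
        pending = [p for p in pending if p[0] > now]
        held.add(seg)
        before = expected
        while expected in held:
            expected += 1
        if expected > before:
            # New in-order data: withheld dupacks are now obsolete.
            pending = []
            sent.append((now, expected))
        else:
            pending.append((now + delay, expected))
    sent.extend(pending)  # flush dupacks still waiting at the end
    return sent
```

Suppose segment 1 is repaired by the link layer and arrives at
40 ms, after segments 2, 3 and 4. With a 50 ms delay the sender
never sees a duplicate ACK. With a 10 ms delay three duplicate ACKs
escape, enough to trigger an unnecessary fast retransmit.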
Delayed dupacks makes very few assumptions about the TCP
implementations. If, however, one assumes that the
implementations support TCP timestamps, then other schemes are
possible. For example, the Eifel algorithm [LK00] uses
timestamps to eliminate the deleterious effect of a spurious
timeout on TCP performance. Additionally, Eifel uses two of the
currently unused bits in the TCP header and makes certain
assumptions about the link-layer being able to treat flows that
benefit from reliable service differently from real-time flows.
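The timestamp comparison at the heart of this kind of detection is
small enough to sketch. The function below reflects our reading of
the idea in [LK00], not its full algorithm: it ignores timestamp
wrap-around and the response the sender takes after detection, and
its name and parameters are hypothetical.

```python
def retransmit_was_spurious(first_retransmit_tsval, ack_tsecr):
    """Return True if an ACK echoes a timestamp older than the retransmit.

    first_retransmit_tsval: timestamp carried by the first
        retransmission of the segment.
    ack_tsecr: timestamp echoed by the first ACK covering the segment.

    An older echoed timestamp means the ACK was triggered by the
    original transmission, so the retransmission was not needed.
    """
    return ack_tsecr < first_retransmit_tsval
```

For example, if the original segment carried timestamp 1000 and the
retransmission carried 1500, an ACK echoing 1000 indicates a
spurious timeout, while an ACK echoing 1500 does not.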
Recommendation: Delaying duplicate acknowledgements and the
Eifel Algorithm are not standards-track mechanisms. They may be
useful in specific network topologies, but a general
recommendation requires further research and experience.
Appendix B: Detecting Transmission Errors With Explicit Notifications
As noted above, today's TCPs assume that any loss is due
to congestion, and encounter difficulty in distinguishing
between congestion loss and corruption loss because this
"implicit notification" mechanism can't carry both meanings
at once. [SF98] reports simulation results showing that
performance improvements are possible when TCP can correctly
distinguish between losses due to congestion and losses due to
corruption.
With explicit notification from the network it is possible to
determine when a loss is due to corruption. Several proposals
along these lines include:
- Explicit Loss Notification (ELN) [BPSK96]
- Explicit Bad State Notification (EBSN) [BBKVP96]
- Explicit Loss Notification to the Receiver (ELNR), and
Explicit Delayed Dupack Activation Notification (EDDAN)
[MV97]
- Space Communication Protocol Specification - Transport
Protocol (SCPS-TP), which uses explicit "negative
acknowledgements" to notify the sender that a damaged
packet has been received.
Just as it is useful to notify about corruption affecting specific
packets, it is useful to report sustained interruptions in
link connectivity. These conditions can be reported with an ICMP
Host Unreachable message [LINK-OUTAGE]. IP is required to pass
any such messages up to transport layers like UDP and TCP, and
these, in turn, to applications above them [RFC1122]. What is
not clearly defined is what code within an ICMP Host Unreachable
message should be used to notify of an error condition. For
conditions of network outage, a currently unused 'host isolated'
(code 8) was introduced for routers (actually, IMP's) to inform
hosts of an outage. Additionally, [MD95] argues for the
application of 'host isolated' for notifications emanating from
a host's lower layers.
In summary, these notifications to upper layers can originate
either from within a host itself or in another host altogether.
ICMP includes the necessary information to determine the sender
of the notification, as well as a part of the datagram which
encountered the error.
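For concreteness, the layout of such a message can be sketched as
follows. The sketch builds an ICMP Destination Unreachable message
(type 3) with code 8 and quotes the start of the offending
datagram. It assumes a 20-byte IPv4 header without options, and the
function names are hypothetical; it is not taken from
[LINK-OUTAGE].

```python
import struct

ICMP_DEST_UNREACHABLE = 3
CODE_SOURCE_HOST_ISOLATED = 8


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_unreachable(code: int, offending_datagram: bytes) -> bytes:
    """Build the ICMP message: 8-byte header plus quoted datagram."""
    # Quote the IP header (assumed 20 bytes) plus the next 8 bytes.
    quoted = offending_datagram[:28]
    header = struct.pack("!BBHI", ICMP_DEST_UNREACHABLE, code, 0, 0)
    checksum = internet_checksum(header + quoted)
    header = struct.pack("!BBHI", ICMP_DEST_UNREACHABLE, code, checksum, 0)
    return header + quoted
```

The quoted bytes are what allow the receiving host to match the
notification to a particular transport connection. A receiver can
validate the message by checking that the checksum computed over
the whole message is zero.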
These proposals offer promise, but none have been proposed as
standards-track mechanisms for adoption in IETF.
Recommendation: Researchers should continue to investigate true
corruption-notification mechanisms, especially mechanisms like
ELNR and EDDAN [MV97], in which the only systems that need to be
modified are the base station and the mobile device. We also note
that the requirement that the base station be able to examine TCP
headers at link speeds raises performance issues with respect to
IPsec-encrypted packets.
Appendix C: Appropriate Byte Counting [ALL99] (Experimental)
Researchers have pointed out an interaction between delayed
acknowledgements and TCP acknowledgement-based self-clocking, and
various proposals have been made to improve bandwidth utilization
during slow start. One proposal, called "Appropriate Byte Counting",
increases cwnd based on the number of bytes acknowledged, instead of
the number of ACKs received. This proposal, a refinement of earlier
proposals, limits the increase in cwnd so that cwnd does not "spike"
in the presence of "stretch ACKs", which cover more than two segments
(whether this is intentional behavior by the receiver or the result
of lost ACKs), and limits cwnd growth based on byte counting to the
initial slow-start exchange.
This proposal is still at the experimental stage, but implementors
may wish to follow this work, because the effect is that cwnd is
opening more aggressively when ACKs are lost during the initial
slow-start exchange, but this aggressiveness does not act to the
detriment of other flows.
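The difference between the two counting rules can be sketched in a
few lines. The functions below cover only the slow-start increase,
work in bytes, and use a limit of two segments per ACK as an
assumed parameter; the names are hypothetical and the details of
[ALL99] may differ.

```python
def increase_per_ack(cwnd, mss):
    """Conventional slow start: one segment per ACK received."""
    return cwnd + mss


def increase_byte_counting(cwnd, bytes_acked, mss, limit_segments=2):
    """Byte counting: grow by the bytes acknowledged, up to a limit."""
    return cwnd + min(bytes_acked, limit_segments * mss)
```

With 1000-byte segments and a 4000-byte window, a delayed ACK
covering two segments opens the window to 5000 bytes under per-ACK
counting and to 6000 bytes under byte counting. A stretch ACK
covering four segments also yields 6000 bytes, because the limit
prevents the window from jumping by the full 4000 bytes.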