Internet Engineering Task Force S. Dawkins
INTERNET DRAFT G. Montenegro
M. Kojo
V. Magret
N. Vaidya
September 22, 2000
End-to-end Performance Implications of Links with Errors
draft-ietf-pilc-error-05.txt
Status of This Memo
This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.
Comments should be submitted to the PILC mailing list at
pilc@grc.nasa.gov.
Distribution of this memo is unlimited.
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as ``work in
progress.''
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
The rapidly-growing Internet is being accessed by an
increasingly wide range of devices over an increasingly wide
variety of links. At least some of these links do not provide
the reliability that hosts expect, and this expansion into
unreliable links causes some Internet protocols, especially TCP
[RFC793], to perform poorly.
Expires March 22, 2001 [Page 1]
INTERNET DRAFT PILC - Links with Errors September 2000
      Specifically, TCP congestion control [RFC2581], while
      appropriate for connections that lose traffic primarily
      because of congestion and buffer exhaustion, interacts badly
      with connections that traverse links with high uncorrected
      error rates. Senders may spend an excessive amount of time
      waiting for acknowledgements that are not coming, whether the
      losses occur on the forward data path or on the return
      acknowledgement path. Then, although these losses are not due
      to congestion-related buffer exhaustion, the sending TCP
      transmits at substantially reduced traffic levels as it
      probes the network to determine "safe" traffic levels.
This document discusses the specific TCP mechanisms that are
problematic in these environments, and discusses what can be done
to mitigate the problems without introducing intermediate devices
into the connection.
This document does not address issues with other transport
protocols, for example, UDP.
Table of Contents
1.0 Introduction .................................................. 4
1.1 Relationship of this recommendation and [PILC-PEP] ......... 4
1.2 Relationship of this recommendation and [PILC-LINK] ........ 5
1.3 Should you be reading this recommendation? ................ 5
2.0 Errors and Interactions with TCP Mechanisms ................... 6
2.1 Slow Start and Congestion Avoidance [RFC2581] .............. 6
2.2 Fast Retransmit and Fast Recovery [RFC2581] ................ 7
2.3 Selective Acknowledgements [RFC2018, SACK-EXT] ............. 9
3.0 Summary of Recommendations .................................... 10
4.0 Topics For Further Work ....................................... 10
4.1 Achieving, and maintaining, large windows .................. 11
5.0 Acknowledgements .............................................. 12
Changes ........................................................... 12
References ........................................................ 13
Authors' addresses ................................................ 16
Appendix A: When TCP Defers Recovery to the Link Layer ............ 18
Appendix B: Detecting Transmission Errors With Explicit Notifi-
cations ........................................................... 18
Appendix C Appropriate Byte Counting [ALL99] (Experimental) ....... 20
1.0 Introduction
It has been axiomatic that most losses on the Internet are due to
congestion, as routers run out of buffers and discard incoming
traffic. This observation is the basis for current TCP
congestion avoidance strategies - if losses are due to congestion,
there is no need for an explicit "congestion encountered"
notification to the sender.
Quoting Van Jacobson in 1988: "If packet loss is (almost) always
due to congestion and if a timeout is (almost) always due to a
lost packet, we have a good candidate for the `network is congested'
signal." [VJ-DCAC]
This axiom has served the Internet community well, because it
allowed the deployment of TCPs that have allowed the Internet to
   accommodate explosive growth in link speeds and traffic levels.
This same explosive growth has attracted users of networking
technologies that DON'T have low uncorrected error rates -
including many satellite-connected users, and many wireless Wide
Area Network-connected users. Users connected to these networks may
not be able to transmit and receive at anything like available
bandwidth because their TCP connections are spending time in
congestion avoidance procedures, or even slow-start procedures, that
were triggered by transmission error in the absence of congestion.
This document makes recommendations about what the participants
in connections that traverse high error-rate links may wish
to consider doing to improve utilization of available bandwidth
in ways that do not threaten the stability of the Internet.
Applications use TCP in very different ways, and these have
interactions with TCP's behavior [HPF-CWV]. Nevertheless,
it is possible to make some basic assumptions about TCP
flows. Accordingly, the mechanisms discussed here are applicable
for all uses of TCP, albeit in varying degrees according to
different scenarios (as noted where appropriate).
This document does not address issues with non-TCP transport
protocols, for example, UDP.
1.1 Relationship of this recommendation and [PILC-PEP]
This document discusses end-to-end mechanisms that do not require
TCP-level awareness by intermediate nodes. This places severe
limitations on what the end nodes can know about the nature of
losses that are occurring between the end nodes. Attempts to
apply heuristics to distinguish between congestion and transmission
error have not been successful [BV97, BV98, BV98a]. A companion
PILC document on Performance-Enhancing Proxies, [PILC-PEP],
relaxes this restriction; because PEPs can be placed on boundaries
where network characteristics change dramatically, PEPs have an
additional opportunity to improve performance over links with
uncorrected errors.
However, generalized use of PEPs contravenes the end-to-end
   principle and is highly undesirable given their deleterious
implications with respect to the following [PILC-PEP]: fate
sharing (a PEP adds a third point of failure besides the
endpoints themselves), end-to-end reliability and diagnostics,
security (particularly, network layer security such as IPsec),
mobility (handoffs are much more complex because state must
be transferred), asymmetric routing (PEPs typically require
being on both the forward and reverse paths of a connection),
scalability (PEPs add more state to maintain), QoS transparency
   and guarantees, etc.
Not every type of PEP has all the drawbacks listed
above. Nevertheless, the use of PEPs may have very serious
consequences which must be weighed carefully.
1.2 Relationship of this recommendation and [PILC-LINK]
This recommendation is for use with TCP over subnetwork technologies
that have already been deployed. A companion PILC recommendation,
[PILC-LINK], is for designers of subnetworks that are intended to
carry Internet protocols, and have not been completely specified,
so that the designers have the opportunity to reduce the number of
uncorrected errors TCP will encounter.
1.3 Should you be reading this recommendation?
All known subnetwork technologies provide an "imperfect"
subnetwork service - the bit error rate is non-zero. But there's
no obvious way for end stations to tell the difference between
losses due to congestion and losses due to transmission errors.
It may be obvious if a directly-attached subnetwork reports
transmission errors. But both hosts won't be directly attached
to the same subnetwork in all but the most trivial networks, so
even if one host receives specific error reports, the other host
probably won't.
Another way of deciding if a subnetwork should be considered to
have a "high error rate" is by appealing to mathematics.
A formula giving an upper bound on the performance of any
   additive-increase, multiplicative-decrease algorithm likely to
be implemented in TCP in the future was derived in [MSMO97]:
                   MSS     1
       BW = 0.93  ---  -------
                   RTT  sqrt(p)
where
MSS is the segment size being used by the connection
RTT is the end-to-end round trip time of the TCP connection
p is the packet loss rate for the path
(i.e. .01 if there is 1% packet loss)
   If one plugs in an observed packet loss rate and the predicted
   bandwidth is greater than the link speed, the connection will not
   benefit from the recommendations in this document, because the
   level of packet loss being encountered does not limit the ability
   of TCP to utilize the link. If, however, the predicted bandwidth
   is less than the link speed, packet losses are limiting the
   ability of TCP to utilize the link, and if further investigation
   reveals a subnetwork with significant transmission error rates,
   the recommendations in this document will improve the ability of
   TCP to utilize the link.
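   The check described above can be done numerically. The following
   Python sketch is illustrative only (it is not part of this
   recommendation, and the function names and parameter values are
   assumptions chosen for the example); it applies the [MSMO97] bound
   and compares the result to the link speed.

```python
import math

def mathis_bound(mss_bytes, rtt_s, loss_rate):
    """Upper bound on TCP throughput in bytes/second:
    BW = 0.93 * (MSS / RTT) * (1 / sqrt(p)), per [MSMO97]."""
    return 0.93 * (mss_bytes / rtt_s) / math.sqrt(loss_rate)

def errors_limit_tcp(link_speed_bps, mss_bytes, rtt_s, loss_rate):
    """True if the predicted bandwidth is below the link speed,
    i.e. packet losses are limiting TCP's utilization of the link."""
    return mathis_bound(mss_bytes, rtt_s, loss_rate) * 8 < link_speed_bps

# Illustrative values: 1460-byte MSS, 500 ms satellite RTT,
# 1% packet loss, 2 Mb/s link.
print(errors_limit_tcp(2_000_000, 1460, 0.5, 0.01))   # -> True
print(errors_limit_tcp(2_000_000, 1460, 0.5, 1e-8))   # -> False
```

   In the first case losses cap TCP well below 2 Mb/s, so the
   recommendations here apply; in the second, the loss rate is too
   low to matter.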
2.0 Errors and Interactions with TCP Mechanisms
A TCP sender adapts its use of bandwidth based on feedback from
the receiver. When TCP is not able to distinguish between losses
due to congestion and losses due to uncorrected errors, it is
not able to accurately determine available bandwidth.
Some TCP mechanisms, targeting recovery from losses due to
congestion, coincidentally assist in recovery from losses due to
uncorrected errors as well.
2.1 Slow Start and Congestion Avoidance [RFC2581]
Slow Start and Congestion Avoidance [RFC2581] are essential to
the Internet's stability. These mechanisms were designed to
accommodate networks that didn't provide explicit congestion
notification. Although experimental mechanisms like [RFC2481]
are moving in the direction of explicit notification, the effect
of ECN on ECN-aware TCPs is essentially the same as the effect
of implicit congestion notification through congestion-related
loss.
TCP connections experiencing high error rates interact badly
with Slow Start and with Congestion Avoidance, because high
error rates make the interpretation of losses ambiguous -
the sender cannot know intuitively whether detected losses are
due to congestion or to data corruption. TCP makes the "safe"
choice - assume that the losses are due to congestion.
- Whenever TCP's retransmission timer expires, the sender
assumes that the network is congested and invokes slow start.
- Less-reliable link layers often use small link MTUs. This slows
the rate of increase in the sender's window size during slow
start, because the sender's window is increased in units of
segments. Small link MTUs alone don't improve things unless
Path MTU discovery is also used to prevent fragmentation.
Path MTU discovery allows the most rapid opening
of the sender's window size during slow start, but a number of
round trips may still be required to open the window completely.
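   The effect of segment size on slow start ramp-up can be seen with
   a small calculation. This Python sketch is illustrative (not from
   this recommendation); it assumes the window doubles each round
   trip and ignores delayed acknowledgements.

```python
def rtts_to_open(target_bytes, mss_bytes, initial_segments=1):
    """Round trips for slow start to grow the congestion window from
    `initial_segments` segments to at least `target_bytes`, assuming
    the window doubles each RTT (one MSS increment per ACK)."""
    cwnd = initial_segments * mss_bytes
    rtts = 0
    while cwnd < target_bytes:
        cwnd *= 2
        rtts += 1
    return rtts

# Reaching a 64 KB window with a typical Ethernet-derived MSS versus
# the small MSS a low-MTU link might impose (both values illustrative):
print(rtts_to_open(65536, 1460))  # larger MSS: fewer round trips
print(rtts_to_open(65536, 296))   # small-MTU MSS: more round trips
```

   The smaller MSS needs several additional round trips to open the
   same window, which is the effect described above.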
Recommendation: Slow Start and Congestion Avoidance are MUSTs in
[RFC1122], itself a full Internet Standard. Recommendations in this
document will not interfere with these mechanisms.
2.2 Fast Retransmit and Fast Recovery [RFC2581]
TCPs deliver data as a reliable byte-stream to applications, so
   when a segment is lost (whether due to congestion or
transmission loss), delivery of data to the receiving application
must wait until the missing data is received. Missing segments are
detected by the receiver by segments arriving with out-of-order
sequence numbers.
TCPs SHOULD immediately send an acknowledgement when data is
received out-of-order [RFC2581], sending the next expected
sequence number with no delay, so that the sender can retransmit
the required data and the receiver can resume delivery of data
to the receiving application. When an acknowledgement carries
the same expected sequence number as an acknowledgement that
has already been sent for the last in-order segment received,
these acknowledgements are called "duplicate ACKs".
Because IP networks are allowed to reorder packets, the receiver
may send duplicate acknowledgements for segments that arrive
out of order due to routing changes, link-level retransmission,
etc. When a TCP sender receives three duplicate ACKs, fast
retransmit [RFC2581] allows it to infer that a segment was
lost. The sender retransmits what it considers to be this lost
segment without waiting for the full retransmission timeout,
thus saving time.
After a fast retransmit, a sender halves its congestion window
and invokes the fast recovery [RFC2581] algorithm, whereby
it invokes congestion avoidance, but not slow start from a
one-segment congestion window. This also saves time.
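   The duplicate-ACK trigger can be sketched as follows. This Python
   fragment is illustrative only (the helper name is hypothetical,
   and it abstracts away the rest of [RFC2581]); it counts duplicate
   acknowledgements and reports the sequence number a sender would
   fast-retransmit after the third duplicate.

```python
DUP_ACK_THRESHOLD = 3  # third duplicate ACK triggers fast retransmit

def fast_retransmit_trigger(ack_numbers):
    """Return the sequence number to fast-retransmit, or None.
    A duplicate ACK repeats the previous cumulative ACK number."""
    dup_count = 0
    last_ack = None
    for ack in ack_numbers:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                return ack  # segment starting here is presumed lost
        else:
            last_ack = ack
            dup_count = 0
    return None

# The segment at 2000 was lost; the receiver keeps acking 2000 as
# later, out-of-order segments arrive.
print(fast_retransmit_trigger([1000, 2000, 2000, 2000, 2000]))  # -> 2000
print(fast_retransmit_trigger([1000, 2000, 3000]))              # -> None
```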
It's important to be realistic about the maximum throughput that
TCP can have over a connection that traverses a high error-rate
link. Even using Fast Retransmit/Fast Recovery, the sender will
halve the congestion window each time a window contains one or
   more segments that are lost, and will re-open the window by one
additional segment for each acknowledgement that is received. If
a connection path traverses a link that loses one or more segments
during recovery, the one-half reduction takes place again, this time
on a reduced congestion window - and this downward spiral will
continue until the connection is able to recover completely without
experiencing loss.
In general, TCP can increase its congestion window beyond the
delay-bandwidth product. In links with high error rates, the
TCP window may remain rather small for long periods of time
due to any of the following reasons:
1. TCP's congestion avoidance strategy is additive-increase,
multiplicative-decrease, which means that if additional
errors are encountered before the congestion window
recovers completely from a 50-percent reduction, the
effect can be a "downward spiral" of the congestion window
due to additional 50-percent reductions. This "downward
spiral" will hold the congestion window below the capacity
of the path between the endpoints until the error rate
decreases, allowing full recovery by additive increase. Of
course, no downward spiral occurs if the error rate is
constantly high and the congestion window always remains
small.
2. If a network path with high uncorrected error rates DOES
cross a highly congested wireline Internet path,
congestion losses on the Internet have the same effect as
losses due to corruption.
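   The "downward spiral" in item 1 can be illustrated numerically.
   This Python sketch makes simplifying assumptions (one segment of
   additive increase per loss-free round trip, a loss at a fixed
   interval) and is not part of this recommendation.

```python
def aimd_spiral(capacity_segments, loss_every_n_rtts):
    """Congestion window (in segments) per RTT under periodic loss:
    halve on each loss, add one segment per loss-free RTT."""
    cwnd = capacity_segments
    history = []
    for rtt in range(1, 31):
        if rtt % loss_every_n_rtts == 0:
            cwnd = max(1, cwnd // 2)                   # multiplicative decrease
        else:
            cwnd = min(capacity_segments, cwnd + 1)    # additive increase
        history.append(cwnd)
    return history

# With a loss every 3 RTTs, a 32-segment window is halved faster than
# additive increase can restore it.
print(aimd_spiral(32, 3)[:12])
```

   The printed series drops from 32 toward a small window and stays
   there: each halving outweighs the two segments regained between
   losses.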
Not all causes of small windows are related to errors. For
example, HTTP/1.0 commonly closes TCP connections to indicate
boundaries between requested resources. This means that these
applications are constantly closing "trained" TCP connections
and opening "untrained" TCP connections which will execute slow
start, beginning with one or two segments. This can happen even
with HTTP/1.1, if webmasters configure their HTTP/1.1 servers to
close connections instead of waiting to see if the connection will
be useful again.
A small window - especially a window of less than four segments -
effectively prevents the sender from taking advantage of Fast
Retransmits. Moreover, efficient recovery from multiple losses
within a single window requires adoption of new proposals
(NewReno [RFC2582]).
Recommendation: Implement Fast Retransmit and Fast Recovery at
this time. This is a widely-implemented optimization and is
currently at Proposed Standard level. [RFC2488] recommends
implementation of Fast Retransmit/Fast Recovery in satellite
   environments. In cases where SACK (see next section) cannot be
enabled for both sides of a connection, NewReno [RFC2582] may be
used by TCP senders to better handle partial ACKs and multiple
losses in a single window.
2.3 Selective Acknowledgements [RFC2018, SACK-EXT]
Selective Acknowledgements allow the repair of multiple segment
losses per window without requiring one (or more) round-trips
per loss.
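   The receiver-side bookkeeping behind SACK can be sketched as
   follows. This Python fragment is illustrative (a hypothetical
   helper, not the [RFC2018] option encoding); it coalesces received
   byte ranges into the isolated blocks a receiver would report.

```python
def sack_blocks(received_ranges):
    """Coalesce received (start, end) byte ranges into isolated
    contiguous blocks, as a SACK-capable receiver would report them."""
    blocks = []
    for start, end in sorted(received_ranges):
        if blocks and start <= blocks[-1][1]:
            # Adjacent or overlapping: extend the previous block.
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], end))
        else:
            blocks.append((start, end))
    return blocks

# Bytes 2000-2999 were lost; everything else in the window arrived.
print(sack_blocks([(0, 1000), (1000, 2000), (3000, 4000)]))
# -> [(0, 2000), (3000, 4000)]
```

   The first block would be covered by the cumulative ACK; the block
   (3000, 4000) is the information a SACK option adds, letting the
   sender repair the single hole without a round trip per loss.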
[SACK-EXT] proposes an extension to SACK that allows receivers
to provide more information about the order of delivery of
segments, allowing "more robust operation in an environment of
reordered packets, ACK loss, packet replication, and/or early
retransmit timeouts". [SACK-EXT] has been approved for proposed
standard as a minor but useful update to Selective
Acknowledgements. Unless explicitly stated otherwise, in this
document "Selective Acknowledgements" (or "SACK") refers to the
combination of [RFC2018] and [SACK-EXT].
Selective acknowledgements are most useful in LFNs ("Long Fat
Networks"), because of the long round trip times that may be
encountered in these environments, according to Section 1.1 of
[RFC1323], and are especially useful if large windows are
required, because there is a higher probability of multiple
segment losses per window.
On the other hand, if error rates are generally low but
occasionally increase due to interference, TCP will have the
opportunity to increase its window to larger values. When
interference occurs, multiple losses within a window are likely
to occur. In this case, SACK would provide benefits in speeding
the recovery and preventing unnecessary extra reduction of
window size.
Recommendation: SACK as specified in [RFC2018] and updated by
[SACK-EXT] is a Proposed Standard. Implement SACK now for
compatibility with other TCPs.
3.0 Summary of Recommendations
Because existing TCPs have only one implicit loss feedback
mechanism, it is not possible to use this mechanism to
distinguish between congestion loss and transmission error
without additional information. Because congestion affects all
traffic on a path while transmission loss affects only the
specific traffic encountering uncorrected errors, avoiding
congestion has to take precedence over quickly repairing
transmission error. This means that the best that can be
achieved without new feedback mechanisms is minimizing the
amount of time spent unnecessarily in congestion avoidance.
Fast Retransmit/Fast Recovery allows quick repair of loss
without giving up the safety of congestion avoidance. In order
for Fast Retransmit/Fast Recovery to work, the window size must
be large enough to force the receiver to send three duplicate
acknowledgements before the retransmission timeout interval
expires, forcing full TCP slow-start.
Selective Acknowledgements (SACK) extend the benefit of Fast
Retransmit/Fast Recovery to situations where multiple segment
losses in the window need to be repaired more quickly than can
be accomplished by executing Fast Retransmit for each segment
loss, only to discover the next segment loss.
These mechanisms cover both wireless and wireline environments.
This general applicability attracts more attention and analysis
from the research community.
All of these mechanisms continue to work in the presence
of IPsec.
4.0 Topics For Further Work
Delayed Duplicate Acknowledgements is an attractive scheme,
especially when link layers use fixed retransmission timer
mechanisms that may still be trying to recover when TCP-level
retransmission timeouts occur, adding additional traffic to the
network. This proposal is worthy of additional study, but is not
recommended at this time, because we don't know how to calculate
appropriate amounts of delay for an arbitrary network topology.
It is not possible to use explicit congestion notification
as a surrogate for explicit transmission error notification
(no matter how much we wish it was!). Some mechanism to
provide explicit notification of transmission error would
be very helpful. This might be more easily provided in a
PEP environment, especially when the PEP is the "first hop"
in a connection path, because current checksum mechanisms
do not distinguish between transmission error to a payload
and transmission error to the header - and, if the header is
damaged it's problematic to send explicit transmission error
notification to the right endpoints.
Losses that take place on the ACK stream, especially while a TCP
is learning network characteristics, can make the data stream
quite bursty (resulting in losses on the data stream, as well).
Several ways of limiting this burstiness have been proposed,
including "Appropriate Byte Counting" (ABC) [ALL99], TCP transmit
pacing at the sender, and ACK rate control within the network.
ABC can lead to behavior that is less bursty than standard TCP,
because the congestion window is opened by the number of bytes that
   have been successfully transferred to the receiver, giving more
appropriate behavior for application protocols that initiate
connections with relatively short packets. For SMTP, for instance,
the client might send a short HELO packet, a short MAIL packet, one
or more short RCPT packets, and a short DATA packet - followed by
   the entire mail body sent as maximum-length packets. ABC would not
   let the ACKs for each of these short packets increase the
   congestion window by a full segment, as per-ACK counting would.
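   The contrast between per-ACK counting and byte counting can be
   illustrated for the SMTP exchange above. This Python sketch is
   illustrative only (the command-packet sizes and initial window are
   assumptions, and it models ABC with a one-MSS-per-ACK cap).

```python
def cwnd_growth(acked_sizes, mss, byte_counting):
    """Congestion window (bytes) after a series of ACKs in slow start.
    Per-ACK counting adds one MSS per ACK regardless of how little
    data each ACK covers; Appropriate Byte Counting [ALL99] adds the
    number of bytes acknowledged (capped here at one MSS per ACK)."""
    cwnd = 2 * mss  # assumed initial window of two segments
    for acked in acked_sizes:
        cwnd += min(acked, mss) if byte_counting else mss
    return cwnd

# ACKs covering four short SMTP command packets (sizes assumed).
short_acks = [20, 20, 30, 10]
print(cwnd_growth(short_acks, 1460, byte_counting=False))  # -> 8760
print(cwnd_growth(short_acks, 1460, byte_counting=True))   # -> 3000
```

   Per-ACK counting lets 80 bytes of commands open the window by four
   full segments; byte counting opens it only by the bytes actually
   transferred, the less bursty behavior described above.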
4.1 Achieving, and maintaining, large windows
The recommendations described in this document will aid TCPs in
injecting packets into ERRORed connections as fast as possible
without destabilizing the Internet, and so optimizing the use of
available bandwidth.
In addition to these TCP-level recommendations, there is still
additional work to do at the application level, especially with
the dominant application protocol on the World Wide Web, HTTP.
HTTP/1.0 (and its predecessor, HTTP/0.9) used TCP connection
closing to signal a receiver that all of a requested resource
had been transmitted. Because WWW objects tend to be small
   in size [P-HTTP], TCPs carrying HTTP/1.0 traffic experience
difficulty in "training" on available bandwidth (a substantial
portion of the transfer had already happened, by the time the
TCPs got out of slow start).
Several HTTP modifications have been introduced to improve this
interaction with TCP ("persistent connections" in HTTP/1.0,
with improvements in HTTP/1.1 [RFC2616]). For a variety of
reasons, many HTTP interactions are still HTTP/1.0-style -
relatively short-lived.
Proposals which reuse TCP congestion information across
connections, like TCP Control Block Interdependence [RFC2140],
or the more recent Congestion Manager [BS99] proposal, will have
the effect of making multiple parallel connections impact the
network as if they were a single connection, "trained" after
a single startup transient. These proposals are critical to
the long-term stability of the Internet, because today's users
always have the choice of clicking on the "reload" button in
their browsers and cutting off TCP's exponential backoff -
replacing connections which are building knowledge of the
available bandwidth with connections with no knowledge at all.
5.0 Acknowledgements
This recommendation has grown out of RFC 2757, "TCP Over Long
Thin Networks", which was in turn based on work done in the IETF
TCPSAT working group. The authors are indebted to the active
members of the PILC working group. In particular, Mark Allman
gave us copious and insightful feedback. Also, Jamshid Mahdavi
provided text replacements.
Changes
Changes between versions 02 and 03:
Restructure document into discussion of standard mechanisms, work
remaining to be done, and appendices on experimental mechanisms.
Change "Explicit Corruption Notification" to "Explicit Transmission
Error Notification", in order to avoid confusion with "Explicit
Congestion Notification".
Other editorial changes and corrections.
Changes between versions 03 and 04:
   Incorporated many comments from Mark Allman, too numerous to list
   here.
Also incorporated some changes suggested by Jamshid Mahdavi.
   SACK-EXT has been approved for Proposed Standard. Reflected this
   change in status in the text by treating SACK-EXT in the same way
   as SACK.
Changed section name from Delayed Duplicate Acknowledgements to
"When TCP Defers Recovery to the Link Layer" and mentioned
Reiner Ludwig's Eifel algorithm.
Added reference to link-outage in the appendix.
Changes between versions 04 and 05:
Added section 1.3.
References
[ALL99] Mark Allman, "TCP Byte Counting Refinements," ACM
Computer Communication Review, Volume 29, Number 3, July 1999.
http://www.acm.org/sigcomm/ccr/archive/1999/jul99/ccr-9907-allman.pdf
   [BBKVP96] Bakshi, B., Krishna, P., Vaidya, N., Pradhan, D.K.,
"Improving Performance of TCP over Wireless Networks," Technical
Report 96-014, Texas A&M University, 1996.
[BPSK96] Balakrishnan, H., Padmanabhan, V., Seshan, S., Katz, R.,
"A Comparison of Mechanisms for Improving TCP Performance over
Wireless Links," in ACM SIGCOMM, Stanford, California, August
1996.
[BS99] Hari Balakrishnan, Srinivasan Seshan, "The Congestion
Manager", July, 2000. Work in progress, available at
http://www.ietf.org/internet-drafts/draft-ietf-ecm-cm-00.txt
[BV97] Biaz, S., Vaidya, N., "Using End-to-end Statistics to
   Distinguish Congestion and Corruption Losses: A Negative Result,"
Texas A&M University, Technical Report 97-009, August 18, 1997.
[BV98] Biaz, S., Vaidya, N., "Sender-Based heuristics for
Distinguishing Congestion Losses from Wireless Transmission
Losses," Texas A&M University, Technical Report 98-013, June
1998.
[BV98a] Biaz, S., Vaidya, N., "Discriminating Congestion Losses
from Wireless Losses using Inter-Arrival Times at the Receiver,"
Texas A&M University, Technical Report 98-014, June 1998.
[HPF-CWV] Handley, M., Padhye, J., Floyd, S., "TCP Congestion
   Window Validation," March 2000. Approved for Informational RFC,
available at
http://search.ietf.org/internet-drafts/draft-handley-tcp-cwv-02.txt.
[LINK-OUTAGE] G. Montenegro, "Link Outage ICMP Notification,"
July 2000. Work in progress, available at
http://www.ietf.org/internet-drafts/
draft-montenegro-pilc-link-outage-00.txt
[LK00] Reiner Ludwig and Randy Katz, "The Eifel Algorithm:
Making TCP Robust Against Spurious Retransmissions, " ACM
Computer Communication Review, Volume 30, number 1, January
2000. Available at
http://www.acm.org/sigcomm/ccr/archive/2000/jan00/
ccr-200001-ludwig.pdf
[MD95] Gabriel Montenegro and Steve Drach, "System Isolation and
Network Fast-Fail Capability in Solaris," Second USENIX
   Symposium on Mobile and Location-Independent Computing, April 1995.
http://www.usenix.org/publications/library/proceedings/mob95/
montenegro.html
[MSMO97] M. Mathis, J. Semke, J. Mahdavi, T. Ott, "The Macroscopic
Behavior of the TCP Congestion Avoidance Algorithm", Computer
Communication Review, volume 27, number 3, July 1997. Available at
http://www.acm.org/sigcomm/ccr/archive/1997/jul97/
ccr-9707-mathis.html
[MV97] Mehta, M., Vaidya, N., "Delayed Duplicate-Acknowledgements:
A Proposal to Improve Performance of TCP on Wireless Links," Texas
A&M University, December 24, 1997.
Available at http://www.cs.tamu.edu/faculty/vaidya/mobile.html
[PILC-LINK] Phil Karn, Aaron Falk, Joe Touch, Marie-Jose
Montpetit, Jamshid Mahdavi, Gabriel Montenegro, Dan Grossman,
Gorry Fairhurst, "Advice for Internet Subnetwork Designers",
July 2000. Work in progress, available at http://
www.ietf.org/internet-drafts/draft-ietf-pilc-link-design-03.txt
[PILC-PEP] J. Border, M. Kojo, Jim Griner, G. Montenegro,
"Performance Implications of Link-Layer Characteristics: Performance
Enhancing Proxies", July 2000. Work in progress, available
at http://www.ietf.org/internet-drafts/draft-ietf-pilc-pep-03.txt
[PILC-SLOW] S. Dawkins, G. Montenegro, M. Kojo, V. Magret,
"Performance Implications of Link-Layer Characteristics: Slow
Links", July 2000. Work in progress, available at
http://www.ietf.org/internet-drafts/draft-ietf-pilc-slow-04.txt
[P-HTTP] "The Case for Persistent-Connection HTTP", Jeffrey
C. Mogul, Research Report 95/4, May 1995, available as
http://www.research.digital.com/wrl/techreports/abstracts/95.4.html
[RFC793] Jon Postel, "Transmission Control Protocol", September 1981.
RFC 793.
[RFC1122] Braden, R., "Requirements for Internet Hosts --
Communication Layers", October 1989. RFC 1122.
[RFC1323] Van Jacobson, Robert Braden, and David Borman. "TCP
Extensions for High Performance", May 1992. RFC 1323.
[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and Romanow, A.,
"TCP Selective Acknowledgment Options," October, 1996.
[RFC2140] J. Touch, "TCP Control Block Interdependence", RFC 2140,
April 1997.
[RFC2309] Braden, B. Clark, D., Crowcroft, J., Davie, B., Deering,
S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge,
C., Peterson, L., Ramakrishnan, K.K., Shenker, S., Wroclawski, J.,
Zhang, L., "Recommendations on Queue Management and Congestion
Avoidance in the Internet," RFC 2309, April 1998.
[RFC2481] Ramakrishnan, K.K., Floyd, S., "A Proposal to add Explicit
Congestion Notification (ECN) to IP", RFC 2481, January 1999.
[RFC2488] Mark Allman, Dan Glover, Luis Sanchez. "Enhancing TCP
Over Satellite Channels using Standard Mechanisms," RFC 2488
(BCP 28), January 1999.
[RFC2581] M. Allman, V. Paxson, W. Stevens, "TCP Congestion
Control," April 1999. RFC 2581.
[RFC2582] Floyd, S., Henderson, T., "The NewReno Modification to
TCP's Fast Recovery Algorithm," April 1999. RFC 2582.
   [RFC2616] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter,
P. Leach, T. Berners-Lee. "Hypertext Transfer Protocol -- HTTP/1.1",
RFC 2616, June 1999. (Draft Standard)
[SACK-EXT] Sally Floyd, Jamshid Mahdavi, Matt Mathis, Matthew
Podolsky, Allyn Romanow, "An Extension to the Selective
   Acknowledgement (SACK) Option for TCP", August 1999. Approved
   for Proposed Standard, available at
http://www.ietf.org/internet-drafts/draft-floyd-sack-00.txt
[SF98] Nihal K. G. Samaraweera and Godred Fairhurst, "Reinforcement
of TCP error Recovery for Wireless Communication", Computer
Communication Review, volume 28, number 2, April 1998. Available at
http://www.acm.org/sigcomm/ccr/archive/1998/apr98/
ccr-9804-samaraweera.pdf
[VJ-DCAC] Van Jacobson, "Dynamic Congestion Avoidance / Control"
   e-mail dated February 11, 1988, available from
http://www.kohala.com/~rstevens/vanj.88feb11.txt
[VMPM99] N. H. Vaidya, M. Mehta, C. Perkins, G. Montenegro,
"Delayed Duplicate Acknowledgements: A TCP-Unaware Approach to
Improve Performance of TCP over Wireless," Technical Report
99-003, Computer Science Dept., Texas A&M University, February
1999.
Authors' addresses
Questions about this document may be directed to:
Spencer Dawkins
Fujitsu Network Communications
2801 Telecom Parkway
Richardson, Texas 75082
Voice: +1-972-479-3782
E-Mail: spencer.dawkins@fnc.fujitsu.com
Gabriel E. Montenegro
Sun Labs Networking and Security Group
Sun Microsystems, Inc.
901 San Antonio Road
Mailstop UMPK 15-214
Mountain View, California 94303
Voice: +1-650-786-6288
Fax: +1-650-786-6445
E-Mail: gab@sun.com
Markku Kojo
University of Helsinki/Department of Computer Science
P.O. Box 26 (Teollisuuskatu 23)
FIN-00014 HELSINKI
Finland
Voice: +358-9-7084-4179
Fax: +358-9-7084-4441
E-Mail: kojo@cs.helsinki.fi
Vincent Magret
Corporate Research Center
Alcatel Network Systems, Inc.
1201 Campbell
Mail stop 446-310
Richardson, Texas 75081 USA
Voice: +1-972-996-2625
Fax: +1-972-996-5902
E-mail: vincent.magret@aud.alcatel.com
Nitin Vaidya
Dept. of Computer Science
Texas A&M University
College Station, TX 77843-3112
Voice: +1 409-845-0512
Fax: +1 409-847-8578
Email: vaidya@cs.tamu.edu
Appendix A: When TCP Defers Recovery to the Link Layer
When link layers try aggressively to correct a high underlying
error rate, it is imperative to prevent interaction between
link-layer retransmission and TCP retransmission as these layers
duplicate each other's efforts. It may be preferable to allow a
local mechanism to resolve a local problem, instead of invoking
TCP's end-to-end mechanism and incurring the associated costs,
both in terms of wasted bandwidth and in terms of its effect on
TCP's window behavior. In such an environment it may make sense
to delay TCP's efforts so as to give the link-layer a chance to
recover. With this in mind, the Delayed Dupacks [MV97, VMPM99]
scheme selectively delays duplicate acknowledgements at the
receiver.
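The receiver-side idea behind Delayed Dupacks can be sketched as
follows. This is an illustrative sketch only, not the [MV97, VMPM99]
implementation: the delay value, the queue discipline, and the
omission of out-of-order data buffering are simplifications made
for clarity.

```python
import time
from collections import deque

class DelayedDupacks:
    """Sketch: hold duplicate ACKs for a short interval so a
    link-layer retransmission has a chance to fill the gap first.
    For simplicity, out-of-order data is not buffered here; only
    the dupack-delay logic is shown."""

    def __init__(self, delay_s=0.1, send_ack=print):
        self.delay_s = delay_s      # how long to hold dupacks (choosing
                                    # this value is the open issue)
        self.send_ack = send_ack    # callback that transmits an ACK
        self.pending = deque()      # (release_time, ack_number) pairs
        self.next_expected = 0

    def on_segment(self, seq, length, now=None):
        now = time.monotonic() if now is None else now
        if seq == self.next_expected:
            # In-order data: the cumulative ACK advances, and any held
            # dupacks for the old gap are obsolete, so discard them.
            self.next_expected = seq + length
            self.pending.clear()
            self.send_ack(self.next_expected)
        elif seq > self.next_expected:
            # Out-of-order data would normally trigger an immediate
            # duplicate ACK; instead, queue it for delayed release.
            self.pending.append((now + self.delay_s, self.next_expected))
        self.flush(now)

    def flush(self, now):
        # Release dupacks whose delay has expired, i.e. the link
        # layer failed to recover the missing segment in time.
        while self.pending and self.pending[0][0] <= now:
            _, ack = self.pending.popleft()
            self.send_ack(ack)
```

If the link layer repairs the loss within the delay, the queued
duplicate ACKs are discarded and TCP's fast retransmit is never
triggered; otherwise they are released and end-to-end recovery
proceeds as usual.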
At this time, it is not well understood how long the receiver
should delay the duplicate acknowledgments. In particular, the
impact of medium access control (MAC) protocol on the
choice of delay parameter needs to be studied. The MAC
protocol may affect the ability to choose the appropriate
delay (either statically or dynamically). In general,
significant variabilities in link-level retransmission times
can have an adverse impact on the performance of the Delayed
Dupacks scheme.
The Delayed Dupacks scheme makes very few assumptions about the
TCP implementation. If, however, one assumes that the
implementations support TCP timestamps, then other schemes are
possible. For example, the Eifel algorithm [LK00] uses timestamps
(alternatively two of the currently four unused bits in the TCP
header) to make TCP more robust in the face of spurious timeouts
and packet re-orderings.
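The core of the Eifel detection idea can be sketched as follows.
The class name and interface are illustrative, not taken from
[LK00]; only the timestamp comparison and the restoration of saved
congestion state reflect the algorithm described there.

```python
class EifelSketch:
    """Sketch of Eifel-style spurious timeout detection using TCP
    timestamps. Congestion state is represented abstractly."""

    def __init__(self):
        self.retrans_ts = None
        self.saved_cwnd = None
        self.saved_ssthresh = None

    def on_timeout(self, now_ts, cwnd, ssthresh):
        # Remember the timestamp stamped on the retransmission and
        # the congestion state in effect before the (possibly
        # spurious) backoff.
        self.retrans_ts = now_ts
        self.saved_cwnd, self.saved_ssthresh = cwnd, ssthresh

    def on_first_ack(self, echoed_ts):
        # If the first ACK after the timeout echoes a timestamp older
        # than the retransmission's, it was generated by the ORIGINAL
        # transmission: the timeout was spurious, and the saved
        # congestion state may be restored.
        if self.retrans_ts is not None and echoed_ts < self.retrans_ts:
            return ('spurious', self.saved_cwnd, self.saved_ssthresh)
        return ('genuine', None, None)
```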
Recommendation: Delaying duplicate acknowledgements and the
Eifel Algorithm are not standards-track mechanisms. They may be
useful in specific network topologies, but a general
recommendation requires further research and experience.
Appendix B: Detecting Transmission Errors With Explicit Notifications
As noted above, today's TCPs assume that any loss is due
to congestion, and encounter difficulty in distinguishing
between congestion loss and corruption loss because this
"implicit notification" mechanism can't carry both meanings
at once. [SF98] reports simulation results showing that
performance improvements are possible when TCP can correctly
distinguish between losses due to congestion and losses due to
corruption.
With explicit notification from the network it is possible to
determine when a loss is due to corruption. Several proposals
along these lines include:
- Explicit Loss Notification (ELN) [BPSK96]
- Explicit Bad State Notification (EBSN) [BBKVP96]
- Explicit Loss Notification to the Receiver (ELNR), and
Explicit Delayed Dupack Activation Notification (EDDAN)
[MV97]
- Space Communication Protocol Specification - Transport
Protocol (SCPS-TP), which uses explicit "negative
acknowledgements" to notify the sender that a damaged
packet has been received.
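The common idea behind these proposals can be sketched as follows.
The function and the sender object are hypothetical (cwnd and
ssthresh measured in segments for simplicity); only the contrast
between the two responses reflects the proposals above.

```python
from types import SimpleNamespace

def handle_loss(sender, corruption_notified):
    """Sketch: a loss explicitly flagged as corruption is repaired
    by retransmission alone, while an unflagged loss is treated as
    congestion in the usual way."""
    if corruption_notified:
        # Corruption loss: retransmit, leave congestion state alone.
        return ('retransmit', sender.cwnd, sender.ssthresh)
    # Presumed congestion loss: halve the window, RFC 2581 style.
    sender.ssthresh = max(sender.cwnd // 2, 2)
    sender.cwnd = sender.ssthresh
    return ('retransmit', sender.cwnd, sender.ssthresh)
```

The benefit reported in [SF98] comes precisely from taking the
first branch when the network can vouch that the loss was not a
congestion signal.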
In addition to notification of corruption affecting specific
packets, it is useful to report sustained interruptions in
link connectivity. These conditions can be reported with an ICMP
Host Unreachable message [LINK-OUTAGE]. IP is required to pass
any such messages up to transport layers like UDP and TCP, and
these, in turn, to applications above them [RFC1122]. What is
not clearly defined is what code within an ICMP Host Unreachable
message should be used to notify of an error condition. For
conditions of network outage, a currently unused code, 'host
isolated' (code 8), was originally introduced for routers
(actually, IMPs) to inform hosts of an outage. Additionally,
[MD95] argues for the
application of 'host isolated' for notifications emanating from
a host's lower layers.
In summary, these notifications to upper layers can originate
either from within a host itself or in another host altogether.
ICMP includes the necessary information to determine the sender
of the notification, as well as a part of the datagram which
encountered the error.
These proposals offer promise, but none has been proposed as a
standards-track mechanism for adoption in the IETF.
Recommendation: Researchers should continue to investigate true
corruption-notification mechanisms, especially mechanisms like
ELNR and EDDAN [MV97], in which the only systems that need to be
modified are the base station and the mobile device. We also note
that the requirement that the base station be able to examine TCP
headers at link speeds raises performance issues with respect to
IPsec-encrypted packets.
Appendix C: Appropriate Byte Counting [ALL99] (Experimental)
Researchers have pointed out an interaction between delayed
acknowledgements and TCP acknowledgement-based self-clocking, and
various proposals have been made to improve bandwidth utilization
during slow start. One proposal, called "Appropriate Byte Counting",
increases cwnd based on the number of bytes acknowledged, instead of
the number of ACKs received. This proposal refines earlier
proposals in two ways: it limits the increase in cwnd so that
cwnd does not "spike" in the presence of "stretch ACKs" (ACKs
that cover more than two segments, whether through intentional
behavior by the receiver or as the result of lost ACKs), and it
limits cwnd growth based on byte counting to the initial
slow-start exchange.
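The byte-counting rule for slow start can be sketched in one line.
This is a minimal sketch, not the [ALL99] specification: the
function name is invented, and the per-ACK cap of 2*SMSS is one of
the limits discussed for the proposal.

```python
def abc_cwnd_increase(cwnd, bytes_acked, smss, limit=2):
    """Sketch of Appropriate Byte Counting during slow start:
    cwnd grows by the number of bytes newly acknowledged, capped
    at limit*SMSS per ACK so that a "stretch ACK" covering many
    segments cannot make cwnd spike. All values are in bytes."""
    return cwnd + min(bytes_acked, limit * smss)
```

With this rule, an ACK for one segment grows cwnd by one SMSS as
usual, while a stretch ACK covering five segments grows it by at
most two SMSS.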
This proposal is still at the experimental stage, but implementors
may wish to follow this work: its effect is that the congestion
window opens more aggressively when ACKs are lost during the
initial slow-start exchange, but this aggressiveness does not act
to the detriment of other flows.