Internet Engineering Task Force A. Zimmermann
Internet-Draft A. Hannemann
Intended status: Experimental RWTH Aachen University
Expires: February 27, 2010 August 26, 2009
Make TCP more Robust to Long Connectivity Disruptions
draft-zimmermann-tcp-lcd-02
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on February 27, 2010.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document.
Abstract
Disruptions in end-to-end path connectivity which last longer than
one retransmission timeout cause suboptimal TCP performance. The
reason for the performance degradation is that TCP interprets segment
Zimmermann & Hannemann Expires February 27, 2010 [Page 1]
Internet-Draft Make TCP more Robust to LCDs August 2009
loss induced by connectivity disruptions as a sign of congestion,
resulting in repeated backoffs of the retransmission timer. This
leads in turn to a deferred detection of the re-establishment of the
connection since TCP waits until the next retransmission timeout
occurs before attempting the retransmission.
This document describes how standard ICMP messages can be exploited
to disambiguate true congestion loss from non-congestion loss caused
by long connectivity disruptions. Moreover, a revert strategy of the
retransmission timer is specified that enables a more prompt
detection of whether the connectivity to a previously disconnected
peer node has been restored or not. The specified algorithm is a TCP
sender-only modification that effectively improves TCP performance in
the presence of connectivity disruptions.
Table of Contents
1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Connectivity Disruption Indication . . . . . . . . . . . . . . 5
4. Connectivity Disruption Reaction . . . . . . . . . . . . . . . 6
4.1. Basic Idea . . . . . . . . . . . . . . . . . . . . . . . . 6
4.2. The Algorithm . . . . . . . . . . . . . . . . . . . . . . 7
4.3. Discussion . . . . . . . . . . . . . . . . . . . . . . . . 9
4.4. Protecting Against Misbehaving Routers (the Safe
Variant) . . . . . . . . . . . . . . . . . . . . . . . . . 11
5. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 11
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13
7. Security Considerations . . . . . . . . . . . . . . . . . . . 13
8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 13
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
9.1. Normative References . . . . . . . . . . . . . . . . . . . 13
9.2. Informative References . . . . . . . . . . . . . . . . . . 14
Appendix A. TODO list . . . . . . . . . . . . . . . . . . . . . . 16
Appendix B. Changes from previous versions of the draft . . . . . 16
B.1. Changes from draft-zimmermann-tcp-lcd-01 . . . . . . . . . 16
B.2. Changes from draft-zimmermann-tcp-lcd-00 . . . . . . . . . 16
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 17
1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
As defined in [RFC0793], the term "acceptable acknowledgment (ACK)"
refers to a TCP segment that acknowledges previously unacknowledged
data. The Transmission Control Protocol (TCP) sender state variable
"SND.UNA" and the current segment variable "SEG.SEQ" are used as
defined in [RFC0793]. SND.UNA holds the segment sequence number of
the earliest segment that has not been acknowledged by the TCP receiver
(the oldest outstanding segment). SEG.SEQ is the segment sequence
number of a given segment.
We use both the term "retransmission timer" and the term
"retransmission timeout (RTO)" as defined in [RFC2988].
2. Introduction
Connectivity disruptions can occur in many different situations.
Their frequency depends on the properties of the end-to-end path
between the communicating hosts. While connectivity disruptions can
occur in traditional wired networks too, e.g., simply due to an
unplugged network cable, the likelihood of occurrence is
significantly higher in wireless (multi-hop) networks. In
particular, end-host mobility, network topology changes, and wireless
interference are crucial factors. In the case
of the Transmission Control Protocol (TCP) [RFC0793], the performance
of the connection can exhibit a significant reduction compared to a
permanently connected path [SESB05]. This is because TCP, which was
originally designed to operate in fixed and wired networks, generally
assumes that the end-to-end path connectivity is relatively stable
over the connection's lifetime.
According to Schuetz et al. [I-D.schuetz-tcpm-tcp-rlci],
connectivity disruptions can be classified into two groups: "short"
and "long" connectivity disruptions. A connectivity disruption is
short if connectivity returns before the retransmission timer fires
for the first time. In this case, TCP recovers lost data segments
through Fast Retransmit and lost acknowledgments (ACK) through
successfully delivered later ACKs. Connectivity disruptions are
declared as "long" for a given TCP connection, if the retransmission
timer fires at least once before connectivity returns. Whether or
not path characteristics like the round trip time (RTT) or the
available bandwidth have changed when the connectivity returns after
a disruption is another important aspect for TCP's retransmission
scheme [I-D.schuetz-tcpm-tcp-rlci].
This document focuses on TCP's behavior in the face of long
connectivity disruptions in the time "before" connectivity is
restored. In particular this memo does not describe any additional
modification to detect if the path characteristics remain unchanged
in order to improve TCP's behavior "after" connectivity is restored.
Therefore, TCP's congestion control mechanisms
[I-D.ietf-tcpm-rfc2581bis] will be unchanged.
When a long connectivity disruption occurs on a TCP connection, the
TCP sender stops receiving acknowledgments. After the retransmission
timer expires, the TCP sender enters the timeout-based loss recovery
and declares the oldest outstanding segment (SND.UNA) as lost. Since
TCP tightly couples reliability and congestion control, the
retransmission of SND.UNA is triggered together with a reduction of
the sending rate, which is based on the assumption that loss is an
indication of congestion [I-D.ietf-tcpm-rfc2581bis]. As long as the
connectivity disruption persists, TCP will repeat the procedure until
the oldest outstanding segment is successfully acknowledged, or the
connection times out. TCP implementations that follow the
recommended retransmission timeout (RTO) management of RFC 2988
[RFC2988] double the RTO after each retransmission attempt. However,
the RTO growth may be bounded by an upper limit, the maximum RTO,
which is at least 60s, but may be longer: Linux for example uses
120s. If the connectivity is restored between two retransmission
attempts, TCP still has to wait until the retransmission timer
expires before resuming transmission, since it simply does not have
any means to know when the connectivity is re-established.
Therefore, depending on when connectivity becomes available again,
this can waste up to one maximum RTO of possible transmission time.
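The cost of this behavior can be illustrated with a short sketch.
The following Python fragment is not part of the specification. It
assumes an initial RTO of 1 s and a maximum RTO of 60 s, uses
hypothetical helper names (backoff_schedule, wasted_time), and only
lists when a standard sender would retransmit and how long it would
remain idle after connectivity returns.

```python
def backoff_schedule(rto_base, max_rto, attempts):
    """Return (time_of_retransmission, rto_after_backoff) per attempt."""
    schedule = []
    now = 0.0
    rto = rto_base
    for _ in range(attempts):
        now += rto                    # timer expires, SND.UNA is retransmitted
        rto = min(2 * rto, max_rto)   # exponential backoff with an upper bound
        schedule.append((now, rto))
    return schedule


def wasted_time(schedule, restored_at):
    """Idle time between connectivity returning and the next retransmission."""
    for fire_time, _ in schedule:
        if fire_time >= restored_at:
            return fire_time - restored_at
    raise ValueError("connectivity restored after the last listed attempt")


# Assumed values for illustration only: RTO of 1 s, maximum RTO of 60 s.
sched = backoff_schedule(rto_base=1.0, max_rto=60.0, attempts=8)
for fire_time, rto in sched:
    print(f"retransmit at t={fire_time:6.1f} s, next RTO={rto:5.1f} s")
print("idle time if the path returns at t=64 s:", wasted_time(sched, 64.0))
```

In this example the path returns one second after the retransmission
at t=63 s, yet the sender stays silent for a further 59 s.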
This retransmission behavior is not efficient, especially in
scenarios or networks like wireless (multi-hop) networks where
connectivity disruptions are frequent. In the ideal case, TCP would
attempt a retransmission as soon as connectivity to its peer is re-
established. This document describes how the standard Internet
Control Message Protocol (ICMP) can be exploited to identify
non-congestion loss caused by connectivity disruptions. A revert
strategy of the retransmission timer is specified that enables, due
to higher-frequency retransmissions, a prompt detection of whether
connectivity to a previously disconnected peer node has been
restored. The specified scheme is a TCP sender-only modification,
i.e., neither intermediate routers nor the TCP receiver have to be
modified. Furthermore, if the network allows it, i.e., no
congestion is present, the proposed algorithm approaches the ideal
behavior.
3. Connectivity Disruption Indication
As long as the queue of an intermediate router experiencing a link
outage is deep enough, i.e., it can buffer all incoming packets, a
connectivity disruption will only cause variation in delay which is
handled well by contemporary TCP implementations with the help of
Eifel [RFC3522] or forward RTO (F-RTO) [I-D.ietf-tcpm-rfc4138bis].
However, if the link outage lasts too long, the router experiencing
the link outage is forced to drop packets and finally to discard the
corresponding route. Means to detect such link outages include
reacting to failed Address Resolution Protocol (ARP) [RFC0826]
queries, unsuccessful link sensing, and the like. However, this is
solely the responsibility of the respective router.
Note: The focus of this memo is on introducing a method by which ICMP
messages may be exploited to improve TCP's performance; how
different physical and link layer mechanisms underneath the
network layer may trigger ICMP destination unreachable messages
is out of scope of this memo.
The removal of the route usually goes along with a notification to
the corresponding TCP sender about the dropped packets via ICMP
destination unreachable messages of code 0 (net unreachable) or code
1 (host unreachable) [RFC1812]. Therefore, since ICMP destination
unreachable messages of these codes provide evidence that packets
were dropped due to a link outage, they can be used by a TCP as an
indication for a connectivity disruption.
Note that there are also other ICMP destination unreachable messages
with different codes. Some of them are candidates for connectivity
disruption indications too, but need further investigation, for
example ICMP destination unreachable messages with code 5 (source
route failed), code 11 (net unreachable for TOS), or code 12 (host
unreachable for TOS) [RFC1812]. On the other hand, codes that flag
hard errors are of no use for the proposed scheme, since TCP should
abort the connection when those are received [RFC1122]. In the
following, the term "ICMP unreachable message" is used as a synonym for
ICMP destination unreachable messages of code 0 or code 1.
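The classification above can be sketched as a small lookup. The
function name classify and the returned labels are assumptions made
for illustration. Only the type and code values are taken from the
ICMP specifications cited in the text.

```python
ICMP_DEST_UNREACHABLE = 3   # ICMP type "destination unreachable" [RFC0792]

# Codes this memo treats as connectivity disruption indications.
DISRUPTION_CODES = {0: "net unreachable", 1: "host unreachable"}

# Codes the text names as candidates that need further investigation.
CANDIDATE_CODES = {
    5: "source route failed",
    11: "net unreachable for TOS",
    12: "host unreachable for TOS",
}


def classify(icmp_type, icmp_code):
    """Return how the scheme in this memo would treat an ICMP message."""
    if icmp_type != ICMP_DEST_UNREACHABLE:
        return "ignored"
    if icmp_code in DISRUPTION_CODES:
        return "disruption"      # an "ICMP unreachable message" in this memo
    if icmp_code in CANDIDATE_CODES:
        return "candidate"       # not used by the scheme as specified
    return "ignored"             # includes hard errors, handled elsewhere


print(classify(3, 1))    # host unreachable
print(classify(3, 11))   # net unreachable for TOS
```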
The accurate interpretation of ICMP unreachable messages as a
connectivity disruption indication is complicated by the following
two peculiarities of ICMP messages. Firstly, they do not necessarily
operate on the same timescale as the packets, i.e., in the given case
TCP segments, which elicited them. When a router drops a packet due
to a missing route it will not necessarily send an ICMP unreachable
message immediately, but rather queues it for later delivery.
Secondly, ICMP messages are subject to rate limiting, e.g., when a
router drops a whole window of data due to a link outage, it will
hardly send as many ICMP unreachable messages as it dropped TCP
segments. Depending on the load of the router it may even send no
ICMP unreachable messages at all. Both peculiarities originate from
[RFC1812].
Fortunately, according to [RFC0792] ICMP unreachable messages are
obliged to contain in their body the Internet Protocol (IP) header
[RFC0791] of the datagram eliciting the ICMP unreachable messages
plus the first 64 bits of the payload of that datagram. Hence, in
case of TCP both port numbers and the sequence number are included.
This allows the originating TCP to identify the connection which an
ICMP unreachable message is reporting an error about. Moreover, it
allows the originating TCP to identify which segment of the
respective connection triggered the ICMP unreachable message,
provided that there are not several segments in flight with the same
sequence number. This may very well be the case when TCP is
recovering lost segments (see Section 4.3).
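A minimal sketch of this extraction is given below. It assumes IPv4
and that the input starts directly after the 8-byte ICMP header. The
function name extract_tcp_info, the dictionary keys, and the sample
addresses and ports are hypothetical.

```python
import ipaddress
import struct

TCP_PROTOCOL_NUMBER = 6


def extract_tcp_info(icmp_body: bytes) -> dict:
    """Parse the embedded IPv4 header plus the first 64 bits of TCP header.

    `icmp_body` is assumed to start right after the 8-byte ICMP header.
    """
    if len(icmp_body) < 20:
        raise ValueError("embedded IPv4 header is truncated")
    version = icmp_body[0] >> 4
    header_len = (icmp_body[0] & 0x0F) * 4     # IHL counts 32-bit words
    if version != 4 or header_len < 20:
        raise ValueError("not a valid IPv4 header")
    if icmp_body[9] != TCP_PROTOCOL_NUMBER:
        raise ValueError("embedded datagram does not carry TCP")
    if len(icmp_body) < header_len + 8:
        raise ValueError("fewer than 64 bits of payload included")
    # First 64 bits of a TCP header: source port, destination port,
    # and the 32-bit sequence number.
    src_port, dst_port, seq = struct.unpack_from("!HHI", icmp_body, header_len)
    return {
        "src": str(ipaddress.IPv4Address(icmp_body[12:16])),
        "dst": str(ipaddress.IPv4Address(icmp_body[16:20])),
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
    }


# Hypothetical sample: a minimal 20-byte IPv4 header and 8 bytes of TCP.
sample_ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64,
                        TCP_PROTOCOL_NUMBER, 0,
                        bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]))
sample_tcp = struct.pack("!HHI", 40000, 80, 123456789)
info = extract_tcp_info(sample_ip + sample_tcp)
print(info)
```

The address pair and port pair identify the connection, and the
sequence number is the value later compared against SND.UNA.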
A connectivity disruption indication in form of an ICMP unreachable
message associated with a presumably lost TCP segment provides strong
evidence that the segment was not dropped due to congestion but
instead was successfully delivered to the temporary end-point of the
employed path, i.e., the reporting router. It therefore did not
witness any congestion, at least on that part of the path which
was traveled by both the TCP segment eliciting the ICMP unreachable
message and the ICMP unreachable message itself.
4. Connectivity Disruption Reaction
In Section 4.1 the basic idea of the algorithm is given. The
complete algorithm is specified in Section 4.2. In Section 4.3 the
algorithm is discussed in detail.
4.1. Basic Idea
The goal of the algorithm is to promptly detect when the
connectivity to a previously disconnected peer node has been restored
after a long connectivity disruption while retaining appropriate
behavior in case of congestion. The proposed algorithm exploits
standard ICMP unreachable messages to increase the TCP's
retransmission frequency during timeout-based loss recovery by
undoing one retransmission timer backoff whenever an ICMP unreachable
message reports on a presumably lost retransmission.
This approach has the advantage of appropriately reducing the probing
rate in case of congestion. If either the (re-)transmission itself,
or the corresponding ICMP message is dropped the conventional backoff
is performed and not undone, effectively halving the probing rate.
4.2. The Algorithm
A TCP sender using RFC 2988 [RFC2988] to compute TCP's retransmission
timer MAY employ the following scheme to avoid over-conservative
backoffs of the retransmission timer in case of long connectivity
disruptions. If a TCP sender does implement the scheme, the
following steps MUST be taken, but only upon initiation of a timeout-
based loss recovery, i.e., upon the first timeout of the oldest
outstanding segment (SND.UNA). The algorithm MUST NOT be re-
initiated after a timeout-based loss recovery has already been
started but not completed. In particular, it must not be re-
initiated upon subsequent timeouts for the same segment.
A TCP sender that does not employ RFC 2988 [RFC2988] to compute TCP's
retransmission timer SHOULD NOT use the scheme. We envision that the
scheme could be easily adapted to other algorithms than RFC 2988.
However, we leave this as future work.
The scheme specified in this document uses the "Backoff_cnt"
variable, whose initial value is zero. The variable is used to count
the number of performed retransmission timer backoffs during one
timeout-based loss recovery. Moreover, the "RTO_base" variable is
used to recover the previous RTO in case the retransmission timer
backoff was unnecessary. The variable is initialized with the RTO
upon initiation of timeout-based loss recovery.
(1) Before the variable RTO gets updated when timeout-based loss
recovery is initiated, set the variable "Backoff_cnt" and the
variable "RTO_base" as follows:
Backoff_cnt := 0;
RTO_base := RTO.
Proceed to step (R).
(R) This is a placeholder for the behavior that a standard TCP must
execute at this point in case the retransmission timer is
expired. In particular if RFC 2988 [RFC2988] is used, steps
(5.4) - (5.6) of that algorithm go here. Proceed to step (2).
(2) If the retransmission timer was backed off in the previous step
(R), then increment the variable "Backoff_cnt" by one to account
for the new backoff
Backoff_cnt := Backoff_cnt + 1.
(3) Wait either
for the expiration of the retransmission timer. When the
retransmission timer expires, proceed to step (R);
or for the arrival of an acceptable ACK. When an acceptable
ACK arrives, proceed to step (A);
or for the arrival of an ICMP unreachable message. When the
ICMP unreachable message ICMP_DU arrives, proceed to step
(4).
(4) If "Backoff_cnt > 0", i.e., an undoing of the last
retransmission timer backoff is allowed, then
proceed to step (5);
else
proceed to step (3).
(5) Extract the TCP segment header included in the ICMP destination
unreachable message ICMP_DU
SEG := Extract(ICMP_DU).
(6) If "SEG.SEQ == SND.UNA", i.e., the ICMP unreachable ICMP_DU
message reports on the oldest outstanding segment, then undo the
last retransmission timer backoff
Backoff_cnt := Backoff_cnt - 1;
RTO := RTO_base * 2^(Backoff_cnt).
(7) If the retransmission timer expires due to the undoing in the
previous step (6), then
proceed to step (R);
else
proceed to step (3).
(A) This is a placeholder for the standard TCP behavior that must be
executed at this point in the case an acceptable ACK has
arrived. No further processing.
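The steps above can be sketched as a small event-driven model. This
is an illustration under stated assumptions, not a reference
implementation: the class name LcdSender, its method names, and the
explicit time arguments are hypothetical, the maximum RTO of 60 s is
an assumed value, and the expiry check in step (7) is interpreted as
measuring the new RTO from the time of the last retransmission.

```python
class LcdSender:
    """Toy model of steps (1)-(7), (R) and (A); times are in seconds."""

    def __init__(self, rto, snd_una, max_rto=60.0):
        self.rto = rto
        self.snd_una = snd_una
        self.max_rto = max_rto        # assumed upper bound on the RTO
        self.rto_base = None
        self.backoff_cnt = 0
        self.in_recovery = False
        self.timer_start = 0.0
        self.retransmissions = []     # times at which SND.UNA was resent

    def on_timeout(self, now):
        if not self.in_recovery:      # step (1): first timeout only
            self.backoff_cnt = 0
            self.rto_base = self.rto
            self.in_recovery = True
        self._retransmit_and_backoff(now)

    def _retransmit_and_backoff(self, now):
        # Step (R): retransmit, back off the timer, restart the timer.
        self.retransmissions.append(now)
        backed_off = self.rto < self.max_rto
        self.rto = min(2 * self.rto, self.max_rto)
        self.timer_start = now
        if backed_off:                # step (2)
            self.backoff_cnt += 1

    def on_icmp_unreachable(self, seg_seq, now):
        if not self.in_recovery or self.backoff_cnt == 0:   # step (4)
            return
        if seg_seq != self.snd_una:                         # steps (5), (6)
            return
        self.backoff_cnt -= 1
        self.rto = self.rto_base * 2 ** self.backoff_cnt
        if now - self.timer_start >= self.rto:              # step (7)
            self._retransmit_and_backoff(now)

    def on_acceptable_ack(self, new_snd_una):               # step (A)
        self.snd_una = new_snd_una
        self.in_recovery = False


sender = LcdSender(rto=1.0, snd_una=1000)
sender.on_timeout(now=1.0)                      # RTO 1 -> 2, Backoff_cnt = 1
sender.on_icmp_unreachable(1000, now=1.2)       # undo: RTO back to 1
sender.on_icmp_unreachable(1000, now=1.3)       # ignored, Backoff_cnt is 0
sender.on_timeout(now=2.0)                      # RTO 1 -> 2
sender.on_icmp_unreachable(999, now=2.1)        # ignored, not SND.UNA
sender.on_timeout(now=4.0)                      # RTO 2 -> 4
sender.on_icmp_unreachable(1000, now=6.5)       # undo, timer already expired
print(sender.retransmissions, sender.rto, sender.backoff_cnt)
```

In the last event the undone RTO of 2 s has already elapsed since the
retransmission at t=4 s, so the model retransmits immediately and
backs off again, as in steps (7) and (R).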
When a TCP in steady-state detects a segment loss using the
retransmission timer it enters the timeout-based loss recovery and
initiates the algorithm (step 1). It adjusts the slow start
threshold (ssthresh), sets the congestion window (CWND) to one
segment, backs off the retransmission timer, and retransmits the first
unacknowledged segment (step R) [I-D.ietf-tcpm-rfc2581bis] [RFC2988].
In case the retransmission timer expires again (step 3a) a TCP will
repeat the retransmission of the first unacknowledged segment and
back off the retransmission timer once more (step R). If a maximum
value is placed on the RTO (rule 2.5 in [RFC2988]) and that maximum
value is already reached, the TCP will not back off the retransmission
timer in this step and thus "Backoff_cnt" MUST NOT be incremented.
However, the "last step" to reach this maximum RTO is still
considered as a backoff in the scope of this algorithm and
"Backoff_cnt" MUST be incremented, even if the RTO is not strictly
doubled.
If the first received packet after the retransmission(s) is an
acceptable ACK (step 3b), a TCP will proceed as normal, i.e., slow
start the connection and terminate the algorithm (step A). Later
ICMP unreachable messages from the just terminated timeout-based loss
recovery are of no use and therefore ignored since the ACK clock is
already restarting due to the successful retransmission.
On the other hand, if the first received packet after the
retransmission(s) is an ICMP unreachable message (step 3c), a TCP
SHOULD, if allowed (step 4), undo one backoff for each ICMP unreachable
message reporting an error on a retransmission. To decide if an ICMP
unreachable message reports on a retransmission, the sequence number
therein is exploited (step 5, step 6). The undo is done by re-
calculating the RTO with the previously reduced "Backoff_cnt". This
calculation explicitly matches the exponential backoff specified in
[RFC2988] (rule 5.5).
Upon receipt of an ICMP unreachable message which legitimately undoes
one backoff, there is the possibility that the newly started
retransmission timer has expired already (step 7). Then, a TCP
SHOULD retransmit immediately, i.e., perform an ICMP-message-clocked
retransmission. In case the newly started retransmission timer has not
expired yet, TCP MUST wait accordingly.
4.3. Discussion
It is important to note that the proposed algorithm only reacts to
connectivity disruption indications in form of ICMP destination
unreachable messages during the phase of RTO induced loss recovery.
That is, TCP's behavior is not altered when no ICMP unreachable
messages are received, or the retransmission timer of the TCP sender
did not yet expire since the last successfully received ACK. Thereby
the algorithm is by definition only triggered in the case of long
connectivity disruptions.
Only those ICMP unreachable messages that report on the
sequence number of the retransmission (SND.UNA) are evaluated by the
proposed algorithm. All other ICMP unreachable messages are ignored.
If an ICMP unreachable message arrives for a retransmission, it
provides evidence that neither the retransmission nor the
corresponding ICMP unreachable message itself experienced any
congestion. In other words, it has been proved that the
retransmission was not lost due to congestion, but due to a
connectivity disruption instead.
One could argue that if an ICMP unreachable message arrives for an
RTO induced retransmission, the RTO should be reset, and the next
retransmission sent out immediately similar to what is done when an
ACK arrives after an RTO induced recovery phase. This would allow
for a much higher probing frequency based on the round trip time of
the router where the connectivity is disrupted. However, we consider
our proposed scheme a good trade-off between conservative behavior
and a fast detection of connectivity re-establishment.
Of course there is an ambiguity as to which (re-)transmission an ICMP
unreachable message reports on. However, for our purposes this is not
considered to be a problem, because the assumption that such an ICMP
message provides evidence that one link loss was wrongly considered
a congestion loss still holds. There is also the option to make
use of the TCP Timestamps option to obtain a stricter mapping between
segments and ICMP messages (see Section 4.4).
Besides the ambiguity of whether the first unacknowledged sequence number
refers to the original transmission or to any of the retransmissions,
there is another source of ambiguity about the sequence numbers
contained in the ICMP unreachable messages. For high bandwidth paths
like modern gigabit links the sequence space may wrap rather quickly,
thereby allowing the possibility that a late ICMP unreachable message
reporting on an old error may coincidentally fit as input in the
scheme explained above. As a result, the scheme would wrongly undo
one backoff. Chances for this to happen are minuscule, since a
particular ICMP message would need to contain the exact sequence
number of SND.UNA, while at the same time TCP is coincidentally in
timeout-based loss recovery. Moreover, as the scheme is tailored
most conservatively, no threat to the network may arise from this
issue.
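To give a feeling for how quickly the sequence space may wrap, the
following back-of-the-envelope calculation can be used. It assumes
that a single connection fills the link continuously and it ignores
header overhead, so the results are lower bounds on the wrap time.
The function name wrap_time_seconds and the chosen link rates are
assumptions for illustration.

```python
SEQ_SPACE_BYTES = 2 ** 32    # TCP sequence numbers are 32 bits wide


def wrap_time_seconds(rate_bits_per_second):
    """Time to send 2^32 bytes at the given rate (lower bound on wrap time)."""
    return SEQ_SPACE_BYTES * 8 / rate_bits_per_second


for label, rate in [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
    print(f"{label:>10}: sequence space wraps after {wrap_time_seconds(rate):6.1f} s")
```

At 1 Gbit/s the sequence space wraps after roughly 34 s, which is
shorter than a commonly used maximum RTO of 60 s.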
Finally, the scheme explicitly does not call for a differentiation of
ICMP unreachable messages originating from different routers, as the
evidence of no congestion still holds even if the reporting router
changed.
Another exploitation of ICMP unreachable messages in the context of
TCP congestion control might seem appropriate in case the ICMP
unreachable message is received while TCP is in steady-state and the
message refers to a segment from within the current window of data.
As the RTT up to the router which generates the ICMP unreachable
message is likely to be substantially shorter than the overall RTT to
the destination, the ICMP unreachable message may very well reach the
originating TCP while it is transmitting the current window of data.
In case the remaining window is large, it might seem appropriate to
refrain from transmitting the remaining window as there is timely
evidence that it will only trigger further ICMP unreachable messages
at the very router. Although this might seem appropriate from a
wastage perspective, it may be counterproductive from a security
perspective since ICMP messages are easy to spoof, thereby allowing an
easy attack on TCP by simply forging such ICMP messages.
An additional consideration is the following: in the presence of
multi-path routing even the receipt of a legitimate ICMP unreachable
message cannot be exploited accurately because there is the option
that only one of the multiple paths to the destination is suffering
from a connectivity disruption which causes ICMP unreachable messages
to be sent. Then however, there is the possibility that the path
along which the connectivity disruption occurred contributed
considerably to the overall bandwidth, such that a congestion
response is very well reasonable. However, this is not necessarily
the case. Therefore, a TCP has no means except for its inherent
congestion control to decide on this matter. All in all, it seems
that for a connection in steady-state, i.e., not in RTO induced
recovery, reacting to ICMP unreachable messages with regard to
congestion control is not appropriate. For the case of RTO-based
retransmissions, however, there is a reasonable congestion response,
which is skipping further backoffs of the retransmission timer
because there is no congestion indication - as described above.
4.4. Protecting Against Misbehaving Routers (the Safe Variant)
Given that the TCP Timestamps option [I-D.ietf-tcpm-1323bis] is
enabled for a connection, a TCP sender MAY use the following
algorithm to protect against misbehaving routers.
5. Related Work
In the literature there are several methods that address TCP's problems
in the presence of connectivity disruptions. Some of them try to
improve TCP's performance by modifying lower layers. For example
[SM03] introduces a "smart link layer" that buffers one segment for
each ongoing connection and replays these segments on connectivity
re-establishment. This approach has a serious drawback: previously
stateless intermediate routers have to be modified in order to
inspect TCP headers, to track the end-to-end connection and to
provide additional buffer space. All in all, this leads to an
additional need for memory and processing power.
On the other hand, stateless link layer schemes, as proposed in
[RFC3819], which unconditionally buffer some small number of packets
may have another problem: if a packet is buffered longer than the
maximum segment lifetime (MSL) of 2 min [RFC0793], i.e., the
disconnection lasts longer than MSL, TCP's assumption that such
segments will never be received will no longer be true, violating
TCP's semantics [I-D.eggert-tcpm-tcp-retransmit-now].
Other approaches like TCP-F [CRVP01] or the Explicit Link Failure
Notification (ELFN) [HV02] inform the TCP sender about a disrupted
path by special messages generated from intermediate routers. In
case of a link failure they stop sending segments and freeze TCP's
retransmission timers. TCP-F stays in this state and remains silent
until either a "route establishment notification" is received or an
internal timer expires. In contrast, ELFN periodically probes the
network to detect connectivity re-establishment. Both proposals rely
on changes to intermediate routers, whereas the scheme proposed in
this document is a sender-only modification. Moreover, ELFN also
does not consider congestion and may impose serious additional load
on the network, depending on the probe interval.
The authors of ATCP [LS01] propose enhancements to identify different
types of packet loss by introducing a layer between TCP and IP. They
utilize ICMP destination unreachable messages to set TCP's receiver
advertised window to zero and thus forcing the TCP sender to perform
zero window probing with an exponential backoff. ICMP destination
unreachable messages, which arrive during this probing period, are
ignored. This approach is nearly orthogonal to this document, which
exploits ICMP messages to undo a retransmission timer backoff when
TCP is already probing. In principle, both mechanisms could be
combined; however, due to security considerations it does not seem
appropriate to adopt ATCP's reaction as discussed in Section 4.3.
Schuetz et al. describe in [I-D.schuetz-tcpm-tcp-rlci] a set of TCP
extensions that improve TCP's behavior when transmitting over paths
whose characteristics can change on short time-scales. Their
proposed extensions modify the local behavior of TCP and introduce a
new TCP option to signal locally received connectivity-change
indications (CCIs) to remote peers. Upon reception of a CCI, they
re-probe the path characteristics either by performing a speculative
retransmission or by sending a single segment of new data, depending
on whether the connection is currently stalled in exponential backoff
or transmitting in steady-state, respectively. The authors focus on
specifying TCP response mechanisms; nevertheless, underlying layers
would have to be modified to explicitly send CCIs to make these
immediate responses possible.
6. IANA Considerations
This memo includes no request to IANA.
7. Security Considerations
The proposed algorithm is considered to be secure. For example, an
attacker cannot make a TCP modified with the proposed scheme flood the
network just by sending forged ICMP unreachable messages to attempt
to maliciously shorten the retransmission timer. An attacker would
need to guess the correct sequence number of the current
retransmission, which seems very unlikely. Even in case of an
omniscient attacker, the impact on network load would be low, since
the retransmission frequency is limited by the RTO which was computed
before TCP has entered the timeout-based loss recovery. (The highest
probing frequency is expected to be even lower than once per minimum
RTO, that is 1s as specified by [RFC2988].)
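The bound on the probing rate can be checked with a small model of
the worst case described above. The sketch assumes an attacker who
knows SND.UNA and delivers a forged ICMP unreachable message
immediately after every retransmission. The function name
attacked_retransmission_times and the parameter values are
hypothetical.

```python
def attacked_retransmission_times(rto_base, horizon, max_rto=60.0):
    """Retransmission times when every backoff is undone right away."""
    times = []
    rto = rto_base
    backoff_cnt = 0
    now = rto_base                    # first expiry of the timer
    while now <= horizon:
        times.append(now)             # step (R): retransmit SND.UNA
        if rto < max_rto:             # steps (R), (2): back off and count it
            rto = min(2 * rto, max_rto)
            backoff_cnt += 1
        if backoff_cnt > 0:           # steps (4)-(6): forged ICMP is accepted
            backoff_cnt -= 1
            rto = rto_base * 2 ** backoff_cnt
        now += rto                    # timer was restarted by the retransmission
    return times


times = attacked_retransmission_times(rto_base=1.0, horizon=5.0)
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
print(times, gaps)
```

Further forged messages have no effect once "Backoff_cnt" is zero, so
in this model consecutive retransmissions are never closer together
than the RTO that was in effect when loss recovery started.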
8. Acknowledgments
We would like to thank Timothy Shepard and Joe Touch for feedback on
earlier versions of this draft. We also thank Michael Faber, Daniel
Schaffrath, and Damian Lukowski for implementing and testing the
algorithm in Linux. Special thanks go to Ilpo Jarvinen, who gave
valuable feedback regarding the Linux implementation.
This document was written with the xml2rfc tool described in
[RFC2629].
9. References
9.1. Normative References
[I-D.ietf-tcpm-1323bis]
Borman, D., Braden, R., and V. Jacobson, "TCP Extensions
for High Performance", draft-ietf-tcpm-1323bis-01 (work in
progress), March 2009.
[I-D.ietf-tcpm-rfc2581bis]
Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", draft-ietf-tcpm-rfc2581bis-07 (work in
progress), July 2009.
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5,
RFC 792, September 1981.
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
RFC 793, September 1981.
[RFC1812] Baker, F., "Requirements for IP Version 4 Routers",
RFC 1812, June 1995.
[RFC2988] Paxson, V. and M. Allman, "Computing TCP's Retransmission
Timer", RFC 2988, November 2000.
[RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control
Message Protocol (ICMPv6) for the Internet Protocol
Version 6 (IPv6) Specification", RFC 4443, March 2006.
9.2. Informative References
[CRVP01] Chandran, K., Raghunathan, S., Venkatesan, S., and R.
Prakash, "A feedback-based scheme for improving TCP
performance in ad hoc wireless networks", IEEE Personal
Communications vol. 8, no. 1, pp. 34-39, February 2001.
[HV02] Holland, G. and N. Vaidya, "Analysis of TCP performance
over mobile ad hoc networks", Wireless Networks vol. 8,
no. 2-3, pp. 275-288, March 2002.
[I-D.eggert-tcpm-tcp-retransmit-now]
Eggert, L., "TCP Extensions for Immediate
Retransmissions", draft-eggert-tcpm-tcp-retransmit-now-02
(work in progress), June 2005.
[I-D.ietf-tcpm-rfc4138bis]
Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata,
"Forward RTO-Recovery (F-RTO): An Algorithm for Detecting
Spurious Retransmission Timeouts with TCP",
draft-ietf-tcpm-rfc4138bis-04 (work in progress),
October 2008.
[I-D.schuetz-tcpm-tcp-rlci]
Schuetz, S., Koutsianas, N., Eggert, L., Eddy, W., Swami,
Y., and K. Le, "TCP Response to Lower-Layer Connectivity-
Change Indications", draft-schuetz-tcpm-tcp-rlci-03 (work
in progress), February 2008.
[LS01] Liu, J. and S. Singh, "ATCP: TCP for mobile ad hoc
networks", IEEE Journal on Selected Areas in
Communications vol. 19, no. 7, pp. 1300-1315, July 2001.
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
September 1981.
[RFC0826] Plummer, D., "Ethernet Address Resolution Protocol: Or
converting network protocol addresses to 48.bit Ethernet
address for transmission on Ethernet hardware", STD 37,
RFC 826, November 1982.
[RFC1122] Braden, R., "Requirements for Internet Hosts -
Communication Layers", STD 3, RFC 1122, October 1989.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
June 1999.
[RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm
for TCP", RFC 3522, April 2003.
[RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
Wood, "Advice for Internet Subnetwork Designers", BCP 89,
RFC 3819, July 2004.
[RFC4884] Bonica, R., Gan, D., Tappan, D., and C. Pignataro,
"Extended ICMP to Support Multi-Part Messages", RFC 4884,
April 2007.
[SESB05] Schuetz, S., Eggert, L., Schmid, S., and M. Brunner,
"Protocol enhancements for intermittently connected
hosts", SIGCOMM Computer Communication Review vol. 35, no.
3, pp. 5-18, December 2005.
[SM03] Scott, J. and G. Mapp, "Link layer-based TCP optimisation
for disconnecting networks", SIGCOMM Computer
Communication Review vol. 33, no. 5, pp. 31-42,
October 2003.
Appendix A. TODO list
o Extend the security discussion in Sections 4.4 and 7.
o Extend the discussion in Section 4.3:
* ICMPv6. See [RFC4443] and [RFC4884].
* Explicit Congestion Notification (ECN).
* More about congestion in general.
o Mention the possible side-effect on TCP implementations that
measure the thresholds R1 and R2 (Section 4.2.3.5 of [RFC1122]) as
a count of retransmissions instead of time units.
o Discuss the influence of packet duplication on the algorithm
(Thanks to Ilpo).
Appendix B. Changes from previous versions of the draft
B.1. Changes from draft-zimmermann-tcp-lcd-01
o The algorithm in Section 4.2 was slightly changed. Instead of
reverting the RTO by halving it, it is recalculated with help of
the "Backoff_cnt" variable. This fixes an issue that occurred
when the retransmission timer was backed off but bounded by a
maximum value. The algorithm in the previous version of the
draft would have "reverted" to half of that maximum value
instead of to the value the RTO had before it was doubled (and
then bounded).
o Miscellaneous editorial changes.
o Extended the TODO list (Appendix A).
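The difference between the two revert strategies described in the
first item above can be sketched as follows. This is a minimal
illustration, not the normative algorithm of Section 4.2: the names
RTO_MAX and rto_base, the 60s bound, and the helper functions are
assumptions introduced here for the example.

```python
RTO_MAX = 60.0  # assumed upper bound on the RTO, in seconds


def backed_off_rto(rto_base, backoff_cnt):
    """RTO after backoff_cnt doublings of rto_base, bounded by RTO_MAX."""
    return min(rto_base * 2 ** backoff_cnt, RTO_MAX)


def revert_by_halving(rto):
    """Revert as in the previous draft version: halve the current RTO."""
    return rto / 2


def revert_by_recalculation(rto_base, backoff_cnt):
    """Revert by undoing one backoff and recomputing the RTO.

    Returns the new (backoff_cnt, rto) pair. No revert is performed
    when backoff_cnt is already zero.
    """
    if backoff_cnt > 0:
        backoff_cnt -= 1
    return backoff_cnt, backed_off_rto(rto_base, backoff_cnt)


# Example: a base RTO of 24s backed off twice. The unbounded value
# would be 96s, so the timer is clamped to RTO_MAX.
rto = backed_off_rto(24.0, 2)
halved = revert_by_halving(rto)
cnt, recalculated = revert_by_recalculation(24.0, 2)
```

In this example the clamped timer is 60s. Halving yields 30s, whereas
the value before the last doubling was 48s, which is what the
recalculation returns.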
B.2. Changes from draft-zimmermann-tcp-lcd-00
o Miscellaneous editorial changes in Sections 1, 2, and 3.
o The document was restructured in Sections 1, 2, and 3 for easier
reading. The motivation for the algorithm was changed to reflect
TCP's inability to distinguish congestion loss from non-congestion
loss.
o Added Section 4.1.
o The algorithm in Section 4.2 was restructured and simplified:
* The special case of the first received ICMP destination
unreachable message after an RTO was removed.
* The "Backoff_cnt" variable was introduced so it is no longer
possible to perform more reverts than backoffs.
o The discussion in Section 4.3 was improved and expanded according
to the algorithm changes.
o Added Section 4.4.
Authors' Addresses
Alexander Zimmermann
RWTH Aachen University
Ahornstrasse 55
Aachen, 52074
Germany
Phone: +49 241 80 21422
Email: zimmermann@cs.rwth-aachen.de
Arnd Hannemann
RWTH Aachen University
Ahornstrasse 55
Aachen, 52074
Germany
Phone: +49 241 80 21423
Email: hannemann@nets.rwth-aachen.de