<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc5681 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5681.xml'>
<!ENTITY rfc6298 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6298.xml'>
<!ENTITY rfc4821 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4821.xml'>
<!ENTITY rfc3168 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3168.xml'>
<!ENTITY rfc5905 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5905.xml'>
<!ENTITY rfc6297 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6297.xml'>
]>
<rfc category="exp" ipr="trust200902"
docName="draft-ietf-ledbat-congestion-10.txt">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes" ?>
<?rfc iprnotified="no" ?>
<?rfc strict="yes" ?>
<?rfc compact="yes" ?>
<front>
<title abbrev="LEDBAT">Low Extra Delay Background Transport (LEDBAT)</title>
<author initials='S' surname="Shalunov" fullname='Stanislav Shalunov'>
<organization>BitTorrent Inc</organization>
<address>
<postal>
<street>612 Howard St, Suite 400</street>
<city>San Francisco</city> <region>CA</region> <code>94105</code>
<country>USA</country>
</postal>
<email>shalunov@shlang.com</email>
<uri>http://shlang.com</uri>
</address>
</author>
<author initials='G' surname="Hazel" fullname='Greg Hazel'>
<organization>BitTorrent Inc</organization>
<address>
<postal>
<street>612 Howard St, Suite 400</street>
<city>San Francisco</city> <region>CA</region> <code>94105</code>
<country>USA</country>
</postal>
<email>greg@bittorrent.com</email>
</address>
</author>
<author initials='J' surname="Iyengar" fullname='Janardhan Iyengar'>
<organization>Franklin and Marshall College</organization>
<address>
<postal>
<street>415 Harrisburg Ave.</street>
<city>Lancaster</city> <region>PA</region> <code>17603</code>
<country>USA</country>
</postal>
<email>jiyengar@fandm.edu</email>
</address>
</author>
<author initials='M' surname="Kühlewind" fullname='Mirja Kühlewind'>
<organization>University of Stuttgart</organization>
<address>
<postal>
<street></street>
<code></code><city>Stuttgart</city>
<country>DE</country>
</postal>
<email>mirja.kuehlewind@ikr.uni-stuttgart.de</email>
</address>
</author>
<date/>
<area>Transport</area>
<workgroup>LEDBAT WG</workgroup>
<abstract>
<t>LEDBAT is an experimental delay-based congestion control algorithm
that seeks to utilize the available bandwidth on an end-to-end path
while limiting the consequent increase in queueing delay on that path.
LEDBAT uses changes in one-way delay measurements
to limit congestion that the flow itself induces in the network.
LEDBAT is designed for use by background bulk-transfer applications
to be no more aggressive than standard TCP congestion control (as specified in RFC5681)
and to yield in the presence of competing flows,
thus limiting interference with the network performance of competing flows.</t>
</abstract>
</front>
<middle>
<section title="Requirements notation">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
"SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
and "OPTIONAL" in this document are to be interpreted as
described in <xref target="RFC2119"/>.</t>
</section>
<section title="Introduction">
<t>
TCP congestion control <xref target="RFC5681"/>
seeks to share bandwidth at a bottleneck link equitably
among flows competing at the bottleneck,
and it is
the predominant congestion control mechanism used on the Internet.
However, not all applications seek an equitable share
of network throughput.
"Background" applications,
such as software updates or file-sharing applications,
seek to operate
without interfering with the performance of
more interactive and delay- and/or bandwidth-sensitive "foreground"
applications. Standard TCP congestion control,
as specified in <xref target="RFC5681"/>,
may be too aggressive for use with such background applications.
</t>
<t> LEDBAT is an experimental delay-based congestion control mechanism
that reacts early to congestion in the network,
thus enabling "background" applications to use the network
while avoiding interference with the network performance of competing flows.
A LEDBAT sender
uses one-way delay measurements to estimate the amount of queueing on the data path,
controls the LEDBAT flow's congestion window based on this estimate, and
minimizes interference with competing flows
by adding low extra queueing delay on the end-to-end path.
</t>
<t>
Delay-based congestion control protocols,
such as TCP-Vegas <xref target="Bra94"/><xref target="Low02"/>,
are generally designed to achieve more, not less, throughput than
standard TCP,
and often outperform TCP under particular network settings.
For further discussion on Lower-than-Best-Effort (BE) transport
protocols see <xref target="RFC6297"/>.
In contrast, LEDBAT is designed to be no more aggressive than TCP
<xref target="RFC5681"/>;
LEDBAT is a "scavenger" congestion control mechanism
that seeks to utilize all available bandwidth
and yields quickly when competing with standard TCP at a bottleneck link.
</t>
<t>
In the rest of this document, we refer to <xref target="RFC5681"/>-specified congestion control as "standard TCP".
</t>
<!--
Other than TCP Vegas, the LEDBAT appraoch is based on changes in
the one-way delay (OWD) instead of Round-Trip Time (RTT). This avoids that
additional delays on the backchannel influence the sent-out decision.
Additionally, LEDBAT maintains an low extra delay to operate more stable.
Moreover, LEDBAT is designed to be not more aggressive than standard
TCP as LEDBAT will additionally react to loss as a congestion signal.
</t>
<t>
TCP Vegas [Bra94] is one of the first congestion control mechanisms known to have a
smaller sending rate than standard TCP when both protocols share a bottleneck
[Kur00] - yet it was designed to achieve more, not less throughput than
standard TCP. Indeed, when it is the only protocol on the bottleneck, the
throughput of TCP Vegas is greater than the throughput of standard TCP.
</t>
The predominant congestion control mechanism used on the Internet,
TCP congestion control [XXXRFC2581],
requires data loss to detect congestion.
A TCP sender increases its congestion window [XXXRFC2581]
until a loss occurs,
which,
in the absence of Active Queue Management (AQM),
occurs only
when the queue at the
bottleneck link
on the end-to-end path overflows.
Even with AQM,
The queueing delay at the bottleneck link
increases significantly before
TCP responds to congestion at the tight link.
This increased delay can be significant;
default parameters on customer-side ADSL modems [XXXcite]
can result in seconds of queueing delay on the ADSL uplink alone.
While these large queueing delays have no known
benefit, they have substantial drawbacks for interactive
applications - "lag" increases for interactive games,
and voice/video communication suffers from the consequently high roundtrip times.
-->
<!--
It has been deployed by BitTorrent
in the wild first with the BitTorrent DNA client (a P2P-based CDN) and now with the uTorrent
client. This mechanism not only allows to keep delay across a bottleneck low, but also
yields quickly in the presence of competing traffic with loss-based congestion
control.</t>
<t>Beyond its utility for P2P, LEDBAT enables other advanced networking applications to
better get out of the way of interactive apps.</t>
<t>In addition to direct and immediate benefits for P2P and other application that can
benefit from scavenger service, LEDBAT could point the way for a possible future evolution
of the Internet where loss is not part of the designed behavior and delay is
minimized.</t>
-->
<section title="Design Goals">
<t>LEDBAT congestion control seeks to achieve the following goals:
<list style="numbers">
<t> to utilize end-to-end available bandwidth, and to maintain
low queueing delay when no other traffic is present, </t>
<t> to add limited queuing delay to that induced by concurrent flows, and</t>
<t> to yield quickly to standard TCP flows
that share the same bottleneck link.</t>
</list>
</t>
</section>
<section title="Applicability">
<t> LEDBAT is a "scavenger" congestion control mechanism that is
motivated primarily by background bulk-transfer applications,
such as
large file transfers (as with file-sharing applications)
and
software updates.
It can be used with any application
that seeks to minimize its impact on the network
and on other interactive delay- and/or bandwidth-sensitive network applications.
LEDBAT is expected to work well when the sender and/or receiver
is connected via a residential access network.
</t>
<t> LEDBAT can be used as part of a
transport protocol or as part of an application,
as long as the data transmission
mechanisms are capable of carrying timestamps and
acknowledging data frequently.
LEDBAT can be used
with TCP, SCTP, and DCCP,
with appropriate extensions where necessary,
and with proprietary application protocols,
such as those built on top of UDP for P2P applications.
</t>
<t>
When used with an ECN-capable framing protocol,
LEDBAT should react to an ECN mark as it would to a loss,
as specified in <xref target="RFC3168"/>.</t>
<t> LEDBAT is designed to reduce build-up
of a standing queue by long-lived LEDBAT flows
at a link with a tail-drop FIFO queue,
so as to avoid persistently delaying other flows
sharing the queue.
If Active Queue Management (AQM) is configured to
drop or ECN-mark packets before the
LEDBAT flow starts reacting to persistent queue build-up,
LEDBAT reverts to standard TCP behavior rather than
yielding to other TCP flows.
However, such an AQM is still desirable since it keeps
queuing delay low, achieving an outcome that is in line with LEDBAT's goals.
Additionally,
a LEDBAT transport that supports ECN enjoys
the advantages that an ECN-capable TCP enjoys over
an ECN-agnostic TCP: avoiding losses and possible retransmissions.
Weighted Fair Queuing (WFQ), as employed by some home gateways,
seeks to isolate and protect delay-sensitive flows from delays due to
standing queues built up by concurrent long-lived flows.
Consequently, while it prevents LEDBAT from yielding to other TCP flows,
it again achieves an outcome
that is in line with LEDBAT's goals <xref target="Sch10"/>.
</t>
</section>
</section>
<section title="LEDBAT Congestion Control">
<section title="Overview">
<t> A standard TCP sender increases its congestion window
until a loss occurs <xref target="RFC5681"/> or
an ECN mark is received <xref target="RFC3168"/>,
which, in the absence of any AQM and
link errors in the network, occurs only when the queue at the bottleneck link
on the end-to-end path overflows.
Since packet loss or marking at the bottleneck link is expected to
be preceded by an increase in the queueing delay at the bottleneck link,
LEDBAT congestion control uses this increase in queueing delay as an early
signal of congestion,
enabling it to respond to congestion earlier than standard TCP,
and enabling it to yield bandwidth to a competing TCP flow.
</t>
<t> LEDBAT employs one-way delay measurements
to estimate queueing delay.
When the estimated queueing delay
is less than a pre-determined target,
LEDBAT infers that the network is not yet congested,
and increases its sending rate to utilize any spare capacity in the network.
When the estimated queueing delay
becomes greater than the pre-determined target,
LEDBAT decreases its sending rate
as a response to potential congestion in the network.
</t>
</section>
<section title="Preliminaries">
<t>
<!-- (JRI: We don't need to assume fixed-size segments since bytes_acked takes care of it.)
For the purposes of explaining LEDBAT,
we assume a transport sender that uses fixed-size
segments and a receiver that acknowledges each segment separately.
It is straightforward to apply the mechanisms described here
with variable-sized segments
and with delayed acknowledgments. -->
A LEDBAT sender uses a congestion window (cwnd)
to gate the amount of data that the sender can send into the network in one roundtrip time (RTT).
A sender MAY maintain its cwnd in bytes or in packets;
this document uses cwnd in bytes.
LEDBAT requires that each data segment carries a "timestamp" from the sender,
based on which the receiver computes the one-way delay from the sender
and sends this computed value back to the sender.</t>
<t> In addition to the LEDBAT mechanism described below,
we note that a slow start mechanism can be used as specified in <xref target="RFC5681"/>.
Since slow start leads to faster increase in the window than
that specified in LEDBAT,
conservative congestion control implementations employing LEDBAT
may skip slow start altogether
and start with an initial window of INIT_CWND * MSS.
(INIT_CWND is described later in
<xref target='params'/>.)
</t>
<t> The term "MSS", or the sender's Maximum Segment Size,
used in this document
refers to the size of the
largest segment that the sender can transmit.
The value of MSS can be based on the path MTU discovery <xref target="RFC4821"/> algorithm
and/or on other factors.
</t>
</section>
<section title="Receiver-Side Operation">
<t> A LEDBAT receiver calculates the one-way delay from the sender to the receiver based on
its own system time and timestamps in the received data packets. The receiver then
feeds the computed one-way delay back to the
sender in the next acknowledgement.
A LEDBAT receiver operates as follows:
<figure><artwork><![CDATA[
on data_packet:
remote_timestamp = data_packet.timestamp
acknowledgement.delay = local_timestamp() - remote_timestamp
# fill in other fields of acknowledgement
acknowledgement.send()
]]></artwork></figure></t>
<t> A receiver may choose to delay sending an ACK and may combine
acknowledgements for more than one data packet into a single ACK packet,
as is done with delayed ACKs in standard TCP, for example.
In such cases,
the receiver
MAY bundle all the delay samples into one ACK packet
and MUST transmit the samples in the order generated.
When multiple delay samples are bundled within a single ACK,
the sender applies these bundled delay samples at once during its cwnd adjustment
(discussed in the next section).
Since the sender's adjustment
may be sensitive to the order in which
the delay samples are applied,
the computed delay samples
should be available to the sender
in the order they were generated at the receiver.
</t>
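<t> The receiver behavior described above can be sketched in Python as follows.
This is a minimal illustration rather than a normative implementation:
the class name Receiver, the methods on_data_packet() and flush_ack(),
and the dictionary used to represent an ACK are all hypothetical,
and timestamps are assumed to be plain numbers in a common unit.
</t>
<figure><artwork><![CDATA[
```python
from collections import deque


class Receiver:
    """Hypothetical LEDBAT receiver that bundles delay samples into ACKs."""

    def __init__(self, clock):
        self.clock = clock        # callable returning the local time
        self.pending = deque()    # delay samples not yet acknowledged

    def on_data_packet(self, remote_timestamp):
        # One-way delay sample = local receive time - sender timestamp.
        self.pending.append(self.clock() - remote_timestamp)

    def flush_ack(self):
        # Bundle every pending sample into one ACK, preserving the
        # order in which the samples were generated.
        samples = list(self.pending)
        self.pending.clear()
        return {"delays": samples}
```
]]></artwork></figure>
<t> Calling flush_ack() after every data packet corresponds to the
per-packet acknowledgement shown earlier; calling it less often
corresponds to delayed ACKs with bundled samples.
</t>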
</section>
<section title="Sender-Side Operation">
<section title="An Overview">
<t>As a first approximation, a LEDBAT sender operates as shown below;
the complete algorithm is specified later in <xref target="full-algo"/>.
TARGET is the maximum queueing delay that LEDBAT itself may introduce in the network,
and GAIN determines the rate at which the cwnd responds to changes in queueing delay;
both constants are specified later.
off_target is a normalized value representing the difference between
the measured current queueing delay and the pre-determined TARGET delay.
off_target can be positive or negative;
consequently, cwnd increases or decreases in proportion to off_target.
</t>
<figure><artwork><![CDATA[
on initialization:
base_delay = +INFINITY
on acknowledgement:
current_delay = acknowledgement.delay
base_delay = min(base_delay, current_delay)
queuing_delay = current_delay - base_delay
off_target = (TARGET - queuing_delay) / TARGET
cwnd += GAIN * off_target * bytes_newly_acked * MSS / cwnd
]]></artwork></figure>
<t>The simplified mechanism above ignores multiple delay samples in an acknowledgment,
noise filtering, base delay expiration, and sender idle times,
which we now take into account in our complete sender algorithm below.
</t>
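<t> For illustration, the simplified controller can be transcribed into Python
roughly as shown below. The class name SimpleSender and the MSS value of
1500 bytes are assumptions made for this sketch; TARGET and GAIN are set to
the upper bounds that this document permits, and delays are in seconds.
</t>
<figure><artwork><![CDATA[
```python
import math

TARGET = 0.100   # seconds; this document requires 100 ms or less
GAIN = 1.0       # this document requires 1 or less
MSS = 1500       # bytes; assumed value for illustration only


class SimpleSender:
    """Sketch of the simplified (first-approximation) controller."""

    def __init__(self, cwnd):
        self.base_delay = math.inf
        self.cwnd = float(cwnd)   # congestion window in bytes

    def on_ack(self, delay, bytes_newly_acked):
        # Track the smallest one-way delay seen so far.
        self.base_delay = min(self.base_delay, delay)
        queuing_delay = delay - self.base_delay
        # Normalized distance from the target; may be negative.
        off_target = (TARGET - queuing_delay) / TARGET
        self.cwnd += (GAIN * off_target * bytes_newly_acked
                      * MSS / self.cwnd)
        return self.cwnd
```
]]></artwork></figure>
<t> With no queueing delay the window grows, with queueing delay equal to
TARGET it holds steady, and with queueing delay above TARGET it shrinks.
</t>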
</section>
<section anchor="full-algo" title="The Complete Sender Algorithm">
<t>
update_current_delay() maintains a list of one-way delay measurements,
of which a filtered value is used as an estimate of the current end-to-end delay.
update_base_delay() maintains a list of one-way delay minima over a number of one-minute intervals,
to measure and to track changes in the base delay of the end-to-end path.
Both of these lists are maintained per LEDBAT flow.
<!--
Note that while this document uses the minimum to filter any noise in the one-way delay,
a different and more sophisticated filter MAY be used.
-->
</t>
<t> We note this algorithm assumes that
slight random fluctuations exist in inter-packet arrival times at the bottleneck queue,
to allow a LEDBAT sender to correctly measure the base delay.
See section <xref target="fairness"/> for a more complete discussion.
</t>
<t>
The full sender-side algorithm is given below:</t>
<figure><artwork><![CDATA[
on initialization:
# cwnd is the amount of data that is allowed to be
# outstanding in an RTT and is defined in bytes.
# CTO is the Congestion Timeout value.
create current_delays list with CURRENT_FILTER elements
create base_delays list with BASE_HISTORY number of elements
initialize elements in base_delays to +INFINITY
initialize elements in current_delays according to FILTER()
last_rollover = -INFINITY # More than a minute in the past
flightsize = 0
cwnd = INIT_CWND * MSS
CTO = 1 second
]]></artwork></figure>
<figure><artwork><![CDATA[
on acknowledgment:
# flightsize is the amount of data outstanding before this ACK
# was received and is updated later;
# bytes_newly_acked is the number of bytes that this ACK
# newly acknowledges, and it MAY be set to MSS.
for each delay sample in the acknowledgment:
delay = acknowledgement.delay
update_base_delay(delay)
update_current_delay(delay)
queuing_delay = FILTER(current_delays) - MIN(base_delays)
off_target = (TARGET - queuing_delay) / TARGET
cwnd += GAIN * off_target * bytes_newly_acked * MSS / cwnd
max_allowed_cwnd = flightsize + ALLOWED_INCREASE * MSS
cwnd = min(cwnd, max_allowed_cwnd)
cwnd = max(cwnd, MIN_CWND * MSS)
flightsize = flightsize - bytes_newly_acked
update_CTO()
]]></artwork></figure>
<figure><artwork><![CDATA[
on data loss:
# at most once per RTT
cwnd = min (cwnd, max (cwnd/2, MIN_CWND * MSS))
if data lost is not to be retransmitted:
flightsize = flightsize - bytes_not_to_be_retransmitted
if no ACKs are received within a CTO:
# extreme congestion, or significant RTT change.
# set cwnd to 1 MSS and back off the congestion timer.
cwnd = 1 * MSS
CTO = 2 * CTO
]]></artwork></figure>
<figure><artwork><![CDATA[
update_CTO()
# implements an RTT estimation mechanism using data
# transmission times and ACK reception times,
# which is used to implement a congestion timeout (CTO).
# If implementing LEDBAT in TCP, sender SHOULD use
# mechanisms described in RFC 6298 [RFC6298],
# and the CTO would be the same as the RTO.
]]></artwork></figure>
<figure><artwork><![CDATA[
update_current_delay(delay)
# Maintain a list of CURRENT_FILTER last delays observed.
delete first item in current_delays list
append delay to current_delays list
]]></artwork></figure>
<figure><artwork><![CDATA[
update_base_delay(delay)
# Maintain BASE_HISTORY delay-minima.
# Each minimum is measured over a period of a minute.
# 'now' is the current system time
if round_to_minute(now) != round_to_minute(last_rollover)
last_rollover = now
delete first item in base_delays list
append delay to base_delays list
else
base_delays.tail = MIN(base_delays.tail, delay)
]]></artwork></figure>
<t> The LEDBAT sender seeks to extract the actual delay estimate
from the current_delay samples by implementing FILTER() to eliminate any outliers.
Different types of filters MAY be used for FILTER() ---
a NULL filter, that does not filter at all, is a reasonable candidate as well,
since LEDBAT's use of a linear controller for cwnd increase and decrease
may allow it to recover quickly from errors induced by bad samples.
Another example of a filter is the
Exponentially-Weighted Moving Average (EWMA) function,
with weights that enable agile tracking of changing network delay.
A simple MIN filter applied over a small window (much smaller than BASE_HISTORY)
may also provide robustness to large delay peaks,
as may occur with delayed ACKs in TCP.
Care should be taken that the filter used, while providing robustness to noise,
remains sensitive to persistent congestion signals.
</t>
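<t> The three candidate filters mentioned above might look like the following
in Python. This is only a sketch: the function names are hypothetical,
the NULL filter is interpreted here as returning the most recent sample,
and the EWMA weight alpha of 0.125 is an assumed default, not a value
taken from this document. Each function expects a non-empty list of
delay samples ordered from oldest to newest.
</t>
<figure><artwork><![CDATA[
```python
def null_filter(samples):
    """No filtering: use the most recent delay sample as-is."""
    return samples[-1]


def min_filter(samples):
    """Smallest sample in the window; discards large delay peaks."""
    return min(samples)


def ewma_filter(samples, alpha=0.125):
    """Exponentially weighted moving average, oldest sample first.

    alpha is the weight given to each new sample (assumed default).
    """
    estimate = samples[0]
    for sample in samples[1:]:
        estimate = (1 - alpha) * estimate + alpha * sample
    return estimate
```
]]></artwork></figure>
<t> Given the samples 10, 12, 50, and 11, the MIN filter ignores the spike
of 50 entirely, whereas the EWMA only damps it; a larger alpha tracks
changes faster at the cost of passing more noise.
</t>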
<t> We note that
when multiple delay samples are bundled within a single ACK,
the sender's resulting cwnd may be slightly different
than when the samples are sent individually in separate ACKs.
The cwnd is adjusted based on the total number of bytes ACKed
and the final filtered value of queuing_delay,
irrespective of the number of delay samples in an ACK.
</t>
<t>To implement an approximate
minimum over the past few minutes,
a LEDBAT sender stores BASE_HISTORY separate minima---one each
for the last BASE_HISTORY-1 minutes,
and one for the running current minute.
At the end of the current minute, the window moves---the
earliest minimum is dropped and the latest minimum is added.
If the connection is idle for a given minute,
no data is available for the one-way delay and,
therefore,
a value of +INFINITY has to be stored in the list.
If the connection has been idle for BASE_HISTORY minutes,
all minima in the list are thus set to +INFINITY and measurement begins
anew.
LEDBAT thus requires that during idle periods,
an implementation must maintain the base delay list.</t>
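<t> One way to keep the base delay list consistent across idle periods is
sketched below in Python. The class BaseDelayTracker and its methods are
hypothetical, time is assumed to be in seconds, and the handling of skipped
minutes (filling them with +INFINITY when the next sample arrives) is one
possible reading of the requirement above, not the only one.
</t>
<figure><artwork><![CDATA[
```python
import math
from collections import deque

BASE_HISTORY = 10   # recommended size of the base_delays list


class BaseDelayTracker:
    """Sketch of a per-minute history of one-way delay minima."""

    def __init__(self):
        # maxlen makes append() drop the oldest minimum automatically.
        self.base_delays = deque([math.inf] * BASE_HISTORY,
                                 maxlen=BASE_HISTORY)
        self.last_rollover = None   # minute index of the last rollover

    def update(self, delay, now):
        minute = int(now // 60)
        if minute == self.last_rollover:
            # Same minute: refine the running minimum.
            self.base_delays[-1] = min(self.base_delays[-1], delay)
            return
        if self.last_rollover is not None:
            # Minutes with no samples are recorded as +INFINITY.
            skipped = min(minute - self.last_rollover - 1, BASE_HISTORY)
            for _ in range(skipped):
                self.base_delays.append(math.inf)
        self.last_rollover = minute
        self.base_delays.append(delay)

    def base_delay(self):
        return min(self.base_delays)
```
]]></artwork></figure>
<t> After an idle gap of BASE_HISTORY minutes or more, every stored minimum
has been replaced, so the first new sample alone defines the base delay.
</t>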
<t>
LEDBAT restricts cwnd growth after a period of inactivity.
When the sender is application-limited,
the sender's cwnd is clamped down
using max_allowed_cwnd
to a little more than flightsize.
To be TCP-friendly,
LEDBAT halves its cwnd on data loss.
</t>
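<t> The per-ACK window update, including the clamp to a little more than
flightsize and the MIN_CWND floor, can be sketched in Python as below.
The function name and its keyword defaults (an MSS of 1500 bytes in
particular) are assumptions for illustration; queuing_delay is taken as
an input that has already been filtered.
</t>
<figure><artwork><![CDATA[
```python
def window_on_ack(cwnd, flightsize, queuing_delay, bytes_newly_acked,
                  target=0.100, gain=1.0, mss=1500,
                  allowed_increase=1, min_cwnd=2):
    """Return (new cwnd, new flightsize) after one ACK; sizes in bytes."""
    off_target = (target - queuing_delay) / target
    cwnd += gain * off_target * bytes_newly_acked * mss / cwnd
    # An application-limited sender may not grow cwnd far beyond
    # what was actually in flight before this ACK arrived.
    max_allowed_cwnd = flightsize + allowed_increase * mss
    cwnd = min(cwnd, max_allowed_cwnd)
    # Never drop below the minimum window.
    cwnd = max(cwnd, min_cwnd * mss)
    flightsize -= bytes_newly_acked
    return cwnd, flightsize
```
]]></artwork></figure>
<t> For example, a sender with a 30000-byte cwnd but only 3000 bytes in
flight is clamped to 4500 bytes on the next ACK, even though the delay
controller alone would have allowed further growth.
</t>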
<t> LEDBAT uses a congestion timeout (CTO) to avoid transmitting
data during periods of heavy congestion, and to avoid congestion collapse.
A CTO is used to detect heavy congestion indicated by loss of all outstanding data or acknowledgments,
resulting in reduction of the cwnd to 1 MSS
and an exponential backoff of the CTO interval.
This backoff of the CTO value avoids sending more data into an overloaded queue,
and also allows the sender to cope with sudden changes in the RTT of the path.
The function of a CTO is similar to that of a retransmission timeout (RTO) in TCP <xref target='RFC6298'/>,
but
since LEDBAT separates reliability from congestion control,
a retransmission need not be triggered by a CTO.
LEDBAT, however, does not preclude a CTO from triggering retransmissions,
as could be the case if LEDBAT congestion control were to be used with TCP framing and reliability.
</t>
<t> The CTO is a gating mechanism that ensures
exponential backoff of sending rate under heavy congestion,
and it may
be implemented with or without a timer.
An implementation choosing to avoid timers may consider
using a "next-time-to-send" variable,
set based on the CTO,
to control the earliest time a sender may transmit
without receiving any ACKs.
A maximum value MAY be placed on the CTO, and if placed, it MUST be at least 60 seconds.
</t>
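<t> A timer-free CTO along the lines suggested above might be sketched in
Python as follows. The class CtoGate, its method names, and the choice to
cap the CTO at exactly 60 seconds are assumptions for this illustration;
the sender is expected to call check() whenever it considers transmitting.
</t>
<figure><artwork><![CDATA[
```python
MAX_CTO = 60.0   # optional cap; if used it must be at least 60 seconds


class CtoGate:
    """Sketch of a congestion timeout without a timer (hypothetical)."""

    def __init__(self, now, cto=1.0):
        self.cto = cto
        # Earliest time the sender may transmit without any ACKs.
        self.next_time_to_send = now + cto

    def on_ack(self, now, cto):
        # An ACK arrived: adopt the current CTO estimate and re-arm.
        self.cto = cto
        self.next_time_to_send = now + cto

    def check(self, now, cwnd, mss):
        """Return the cwnd (bytes) the sender should use at time now."""
        if now < self.next_time_to_send:
            return cwnd
        # No ACK within a CTO: collapse to 1 MSS and back off.
        self.cto = min(2 * self.cto, MAX_CTO)
        self.next_time_to_send = now + self.cto
        return 1 * mss
```
]]></artwork></figure>
<t> Each expiry doubles the interval before the next unacknowledged
transmission is allowed, which gives the exponential backoff described
above without scheduling a timer callback.
</t>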
</section>
</section>
<section anchor="params" title="Parameter Values">
<t>TARGET MUST be 100 milliseconds or less,
and this choice of value is explained further in <xref target="target"/>.
Note that using the same TARGET
value across LEDBAT flows enables equitable sharing
of the bottleneck bandwidth.
A flow with a higher TARGET value than other competing LEDBAT flows
may get a larger share of the bottleneck bandwidth.
It is possible to consider
the use of different TARGET values
for implementing a relative priority between
two competing LEDBAT flows
by setting a higher TARGET value for the higher-priority flow.
</t>
<t>
ALLOWED_INCREASE SHOULD be 1, and it MUST be greater than 0.
An ALLOWED_INCREASE of 0 results in no cwnd growth at all,
and an ALLOWED_INCREASE of 1 allows and limits the cwnd increase
based on flightsize in the previous RTT.
An ALLOWED_INCREASE greater than 1 MAY be used when
interactions between LEDBAT and the framing protocol
provide a clear reason for doing so.
<!--
ALLOWED_INCREASE allows for congestion window growth where the
flightsize consistently remains lower than the congestion window.
For instance,
standard TCP
for a sender to artificially inflate the congestion window
during loss recovery
partially since duplicate acks do not convey ,
-->
</t>
<t> GAIN MUST be set to 1 or less.
A GAIN of 1 limits the
maximum cwnd ramp-up to the same rate as
TCP Reno in Congestion Avoidance.
While this document specifies the use
of the same GAIN for both cwnd increase (when off_target is greater than zero) and
decrease (when off_target is less than zero),
implementations MAY use a higher GAIN for cwnd decrease than for the increase;
our justification follows.
When a competing non-LEDBAT flow increases its
sending rate,
the LEDBAT sender may only measure a small amount of additional delay
and decrease the sending rate slowly.
To ensure no
impact on a competing non-LEDBAT flow,
the LEDBAT flow should decrease its sending rate
at least as quickly as the competing flow increases its sending rate.
A higher decrease-GAIN MAY be used to allow the LEDBAT flow
to decrease its sending rate faster than
the competing flow's increase rate.
</t>
<t>
The size of the base_delays list, BASE_HISTORY, SHOULD be 10.
If the actual base delay decreases,
due to a route change for instance,
a LEDBAT sender adapts immediately,
irrespective of the value of BASE_HISTORY.
If the actual base delay increases, however,
a LEDBAT sender will take BASE_HISTORY
minutes to adapt and may wrongly infer a little more extra delay than intended (TARGET) in the meantime.
A value for BASE_HISTORY is thus a tradeoff:
a higher value may yield a more accurate measurement when the base delay is unchanging,
and a lower value results in a quicker response to actual increase in base delay.
</t>
<t>
A LEDBAT sender uses the current_delays list to
maintain only delay measurements made within an RTT amount of time in the past,
seeking to eliminate noise spikes
in its measurement of the current one-way delay through the network.
The size of this list, CURRENT_FILTER, may be variable,
and depends on the FILTER() function as well as the number of successful measurements made within
an RTT amount of time in the past.
The sender should seek to gather enough delay samples in each RTT
so as to have statistical confidence in the measurements.
While the number of delay samples required for such confidence will vary
depending on network conditions,
we recommend that the sender SHOULD use at least 4 samples in each RTT,
unless the number of samples is lower due to a small congestion window.
Thus, subject to congestion window constraints,
the number of delay samples in each RTT SHOULD be at least 4.
The value of CURRENT_FILTER will depend on the filter being employed,
but CURRENT_FILTER MUST be limited such that samples in the list are
not older than an RTT in the past.
</t>
<t>
INIT_CWND and
MIN_CWND SHOULD both be 2.
An INIT_CWND of 2 should
help seed FILTER() at the sender
when there are no samples at the beginning of a flow,
and
a MIN_CWND of 2
allows FILTER() to use more than a single instantaneous delay estimate
while not being too aggressive.
Slight deviations
may be warranted, for example,
when these values of INIT_CWND and MIN_CWND interact poorly with the framing protocol.
However,
INIT_CWND and MIN_CWND
MUST be no larger than
the corresponding values specified for TCP <xref target='RFC5681'/>.
</t>
<!--MIN_CWND SHOULD be 2, and it MUST be at least 1.
INIT_CWND SHOULD be 2, and it MUST be at least 1.
The choice of MIN_CWND and INIT_CWND
are strongly connected to the framing protocol,
and the acknowledgment mechanisms used;
a larger MIN_CWND and/or INIT_CWND MAY be used
if the framing protocol allows it.
For instance, TCP senders may use
a larger INIT_CWND as specified in <xref target='RFC3390'/>. -->
</section>
</section>
<section title="Understanding LEDBAT Mechanisms">
<t>This section describes the
delay estimation and window management mechanisms
used in LEDBAT.
</t>
<section title="Delay Estimation">
<t> LEDBAT estimates congestion in the direction of the data flow,
and to avoid measuring additional delay from e.g. queue build-up
on the reverse path (or ACK path) or reordering,
LEDBAT uses one-way delay estimates.
LEDBAT assumes measurements are done with data packets,
thus avoiding the need for separate measurement packets
and avoiding the pitfall of
measurement packets being treated
differently from the data packets in the network.</t>
<t> End-to-end delay
can be decomposed into transmission (or serialization) delay,
propagation (or speed-of-light) delay,
queueing delay,
and processing delay.
On any given path,
barring some noise,
all delay components except for queueing delay are constant.
To observe an increase in the queueing delay in the network,
a LEDBAT sender separates the queueing delay component
from the rest of the end-to-end delay,
as described below.
</t>
<section title="Estimating Base Delay">
<t>
Since queuing delay is always additive to the end-to-end delay,
LEDBAT estimates the
sum of the constant delay components,
which we call "base delay",
to be the minimum delay observed on the end-to-end path.
<!--Using the minimum observed delay
also allows LEDBAT to eliminate noise in the delay estimation,
such as due to spikes in processing delay at a node on the path.--></t>
<t> To respond to true changes in the base delay,
as can be caused by a route change,
LEDBAT uses only recent measurements in estimating
the base delay. The duration of the observation window itself
is a tradeoff between
robustness of measurement and responsiveness to change---a
larger observation window increases the chances that
the true base delay will be detected (as long as the
true base delay is unchanged),
whereas a smaller observation window results
in faster response to true changes in the base delay.</t>
</section>
<section title="Estimating Queueing Delay">
<t>Assuming that the base delay is constant
(in the absence of any route changes),
the queueing delay is represented by the variable component
of the measured end-to-end delay.
LEDBAT measures queueing delay as simply the
difference between an end-to-end delay measurement
and the current estimate of base delay.
The queueing delay should be filtered (depending on the usage
scenario) to eliminate noise in the delay estimation,
such as due to spikes in processing delay at a node on the
path.
</t>
</section>
</section>
<section title="Managing the Congestion Window">
<t>LEDBAT uses a simple linear controller to determine the sending rate
as a function of the delay estimate, where the
response of the sender is proportional to
the difference between the
current queueing delay estimate and the target.
</t>
<section title="Window Increase: Probing For More Bandwidth">
<t> When the queuing delay is smaller than a delay target value,
as specified by the TARGET parameter in this document,
a LEDBAT sender will increase its congestion window
proportionally to the relative difference
between the current queueing delay
and the delay target.
As the current queuing delay gets closer to TARGET,
LEDBAT's window growth gets slower.
To compete fairly with concurrent TCP flows,
we set the highest rate of LEDBAT's window growth
(when the current queueing delay estimate is zero)
to be the same as TCP's (increase of one packet per RTT).
In other words,
a LEDBAT flow thus
never ramps up faster than a competing TCP flow over the same path.
The TARGET value specifies the maximum extra queuing delay that LEDBAT
will induce. If the current queuing delay equals the TARGET value,
LEDBAT tries to maintain this extra delay.
This is done by a very slow increase rate (one packet every couple of RTTs) in this state.
</t>
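<t> To make the proportional behavior concrete, the Python sketch below
computes the approximate window change over one RTT for a given queueing
delay. It assumes that roughly cwnd bytes are acknowledged per RTT and
that cwnd changes little within that RTT, so the per-ACK increments sum
to GAIN * off_target * MSS; the function name and the MSS of 1500 bytes
are illustrative assumptions.
</t>
<figure><artwork><![CDATA[
```python
def growth_per_rtt(queuing_delay, target=0.100, gain=1.0, mss=1500):
    """Approximate cwnd change in bytes over one RTT.

    Assumes about cwnd bytes are acknowledged during the RTT, so the
    bytes_newly_acked / cwnd factors of the per-ACK rule sum to 1.
    """
    off_target = (target - queuing_delay) / target
    return gain * off_target * mss
```
]]></artwork></figure>
<t> Under these assumptions the growth is one MSS per RTT when the queueing
delay is zero, half an MSS per RTT at half of TARGET, and the update rule
itself yields no change when the estimate equals TARGET exactly.
</t>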
</section>
<section title="Window Decrease: Responding To Congestion">
<t> When a sender's queueing delay estimate is higher than the target,
the LEDBAT flow's rate should be reduced.
LEDBAT's linear controller allows the sender to decrease
the window proportionally to the difference between the
current queueing delay and the target.
<!--only slightly
while probing and leads to a quite stable state
with high link utilization. -->
<!--In limited experiments with Bittorrent nodes,
this controller seems to work well.-->
</t>
<t> Unlike TCP-like loss-based
congestion control, LEDBAT
seeks to
avoid losses and so
a LEDBAT sender is not expected to normally
rely on losses to determine the sending rate.
However, when data loss does occur,
LEDBAT must respond as standard
TCP does;
even if the queueing delay estimates indicate otherwise,
a loss is assumed to be a strong indication of congestion.
Thus,
to deal with severe congestion when
packets are dropped in the network,
and to provide a fallback against
incorrect queuing delay estimates,
a LEDBAT sender halves its congestion window
when a loss event is detected.
As with TCP New-Reno,
LEDBAT reduces its cwnd by half at
most once per RTT.
</t>
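<t>A minimal sketch of the loss response follows.
The class and method names are hypothetical,
and the floor of two segments for MIN_CWND is an assumed value used for illustration.</t>
<figure><artwork><![CDATA[
```python
MSS = 1500            # assumed segment size in bytes
MIN_CWND = 2 * MSS    # assumed floor for the congestion window


class LossResponder:
    """Halves cwnd on loss, at most once per RTT."""

    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.last_cut = None  # time of the most recent halving, in seconds

    def on_loss(self, now, rtt):
        # Ignore further losses reported within one RTT of the last halving.
        if self.last_cut is not None and now - self.last_cut < rtt:
            return self.cwnd
        self.cwnd = max(self.cwnd // 2, MIN_CWND)
        self.last_cut = now
        return self.cwnd
```
]]></artwork></figure>
<t>Two losses detected 50 ms apart on a path with a 100 ms RTT
therefore cause a single halving in this sketch.</t>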
</section>
</section>
<section anchor="target" title="Choosing The Queuing Delay Target">
<t>
The International Telecommunication Union's (ITU's)
Recommendation G.114 defines a one-way
delay of 150 ms to be acceptable for most user voice
applications <xref target="g114"/>.
Thus the delay induced by LEDBAT must be
well below 150 ms to limit its impact on concurrent
delay-sensitive traffic sharing the same bottleneck queue.
A target that is too low, on the other hand,
increases the sensitivity of the sender's algorithm to
noise in the one-way delays and in the delay measurement process,
and may lead to reduced throughput for the LEDBAT flow
and to under-utilization of the bottleneck link.
</t>
<t> Our recommendation of 100 ms or less as the target
is a tradeoff between these considerations.
Anecdotal evidence indicates that this value works well. LEDBAT
has been implemented and successfully
deployed with a target value of 100 ms
in two BitTorrent implementations: as the
exclusive congestion control mechanism in
BitTorrent Delivery Network Accelerator (DNA),
and as an experimental mechanism in uTorrent <xref target="uTorrent"/>.
</t>
</section>
</section>
<section title="Discussion">
<section title="Framing and ACK Frequency Considerations">
<t>While the actual framing and wire format of the protocols
using LEDBAT are outside the scope of this document,
we briefly consider the
data framing and ACK frequency needs
of LEDBAT mechanisms.</t>
<t> To compute the data path's one-way delay,
our discussion of LEDBAT assumes a framing that
allows the sender
to timestamp packets and
for the receiver to convey the measured one-way delay
back to the sender in ACK packets.
LEDBAT does not require this particular method,
but it does require unambiguous delay estimates using data and ACK packets.
</t>
<t>A LEDBAT receiver may send an ACK as frequently as
one for every data packet received
or less frequently;
however, the receiver MUST transmit
at least one ACK in every RTT.
</t>
</section>
<section title="Competing With TCP Flows" anchor="competing_flows">
<t> LEDBAT is designed to
respond to congestion indications earlier
than loss-based standard TCP <xref target="RFC5681"/>.
A LEDBAT flow gets more aggressive as the
queueing delay estimate gets lower;
since the queueing delay estimate is non-negative,
LEDBAT is most aggressive when the queueing delay
estimate is zero.
In this case,
LEDBAT ramps up its congestion window at the same rate as standard TCP <xref target="RFC5681"/>.
LEDBAT may reduce its rate earlier than standard TCP
and always halves its congestion window on loss.
Thus,
in the worst case, where the delay estimates are completely and consistently off,
a LEDBAT flow falls back to standard TCP behavior,
and is no more aggressive than standard TCP <xref target="RFC5681"/>.</t>
<!-- If LEDBAT can ramp down faster than the
loss-based connection ramps up, LEDBAT will
yield. LEDBAT ramps down when queuing delay estimate
exceeds the target: the more the excess, the faster the
ramp-down. When the loss-based connection is standard
TCP, LEDBAT will yield at precisely the same rate as TCP
is ramping up when the queuing delay is double the
target.
-->
</section>
<section title="Competing With Non-TCP Flows" anchor="competing_others">
<t> While LEDBAT yields to all high-load flows, both TCP and non-TCP,
LEDBAT may not yield to low-load and latency-sensitive traffic
that does not induce a measurable delay at the bottleneck queue,
such as VoIP traffic.
While such flows will experience
additional delay due to any concurrent LEDBAT flows,
the TARGET delay sets a limit to the total amount of additional delay
that all the concurrent LEDBAT flows will jointly induce.
If the TARGET delay is higher than what the bottleneck queue can sustain,
the LEDBAT flows should experience loss and will fall back to standard
loss-based TCP behavior.
Thus, in the worst case, LEDBAT will add no more latency than standard TCP
when competing with non-TCP flows.
In the common case however, we expect LEDBAT flows to add TARGET amount of delay,
which ought to be within the delay tolerance for most latency-sensitive applications,
including VoIP applications.
<!--
LEDBAT might not yield to non TCP flows with a very low sending
rate as those flows do not induce a measurable delay.
For this case LEDBAT is designed to not induce more than TARGET delay.
While loss-based TCP (without AQM) will always fill up the queue and
thus induce a maximum delay based on the queue size, LEDBAT is bounded
to a maximum delay of TARGET. If the queue size corresponds to a smaller
delay than TARGET, LEDBAT will fall back to a standard loss-based TCP
congestion control behavior. Thus the maximum delay will be the same than
it would be when using standard TCP.
-->
</t>
<!--
<t>
LEDBAT flows sustain a standing queue of TARGET.
When the maximum queue size at the bottleneck is not much larger
than TARGET,
it is possible for the average delay at the bottleneck
to be higher with LEDBAT than with TCP flows,
especially when the amount of statistical multiplexing at the bottleneck is low.
Let us consider the case when the queue size is selected such that
the maximum delay that could be induce when the queue is full, is
just above TARGET. In this case LEDBAT maintains a standing queue of
TARGET (if no TCP or more aggressive cross traffic is present) while
standard TCP will empty the queue partly form time to time (whenever
there is a window halving after a loss). This can lead to the
situation where the average delay with LEDBAT is slightly higher than
with standard TCP. But the maximum delay, that is most important for
interactive applications like VoIP, will still be smaller with
LEDBAT.
</t> -->
</section>
<section anchor="fairness" title="Fairness Among LEDBAT Flows">
<t>The primary design goals of LEDBAT are focused on the aggregate
behavior of LEDBAT flows when they compete with standard
TCP. Since LEDBAT is designed for
background traffic, we consider link utilization to be
more important than fairness amongst LEDBAT flows.
Nevertheless, we now consider fairness issues that might arise
amongst competing LEDBAT flows.</t>
<t>LEDBAT as described so far lacks a mechanism
specifically designed to equalize utilization
amongst LEDBAT flows. Anecdotally observed behavior of existing
implementations indicates that a rough equalization
does occur, since in most environments some amount
of randomness in the inter-packet transmission times exists, as explained further below.</t>
<t>Delay-based congestion control systems suffer from the
possibility of late-comers incorrectly
measuring and using a higher base-delay than an active flow that started earlier.
Consider that a bottleneck is saturated
by a single LEDBAT flow,
and the flow therefore maintains the bottleneck queue at TARGET delay.
When a new LEDBAT flow arrives at the bottleneck,
it might incorrectly include the steady queueing delay
in its measurement of the base delay on the path.
The new flow has an inflated estimate of the base delay,
and may now effectively build on top of the existing,
already maximal,
queueing delay.
As the late-comer flow builds up,
the old flow sees the true queueing delay and backs off,
while the late-comer keeps building up,
using up the entire link's capacity,
and effectively shutting the old flow out.
This advantage is called the "late-comer's advantage".
</t>
<t> In the worst case, if the first flow yields at the same rate as the new flow
increases its sending rate,
the new flow will see constant end-to-end delay, which it assumes is the base delay,
until the first flow backs off completely.
As a result,
by the time the second flow stops increasing its cwnd,
it would have added twice the target queueing delay to the network.
</t>
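<t>The worst-case figure above follows from simple arithmetic, sketched below.
The function name and the 20 ms path delay are illustrative assumptions;
the point is only that an error in the base delay estimate adds directly
to the queueing delay at which a flow believes it has reached the target.</t>
<figure><artwork><![CDATA[
```python
TARGET_MS = 100.0  # delay target


def steady_state_queue_ms(true_base_ms, estimated_base_ms):
    """True queueing delay at which a flow's own estimate equals TARGET.

    A flow stops growing when measured_delay - estimated_base == TARGET,
    and measured_delay == true_base + queue, so
    queue == TARGET + (estimated_base - true_base).
    """
    return TARGET_MS + (estimated_base_ms - true_base_ms)


# First flow measured the base delay on an empty queue.
first_flow = steady_state_queue_ms(true_base_ms=20.0, estimated_base_ms=20.0)

# Late-comer mistook a standing queue of TARGET for part of the base delay.
late_comer = steady_state_queue_ms(true_base_ms=20.0,
                                   estimated_base_ms=20.0 + TARGET_MS)
```
]]></artwork></figure>
<t>Here the first flow settles at a 100 ms queue, whereas the late-comer
keeps increasing until the queue reaches 200 ms, twice the target.</t>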
<t> This advantage can be reduced
if the first flow yields and empties the bottleneck queue
faster than the incoming flow increases its occupancy in the queue.
In such a case, the late-comer might correctly measure a delay that is closer to the base delay.
While such a reduction might be achieved through a multiplicative decrease
of the congestion window,
this may cause strong fluctuations in flow throughput during the flow's steady state.
Thus we do not recommend a multiplicative decrease scheme.
</t>
<t>We note that in certain use-case scenarios,
it is possible for a later LEDBAT flow to gain an
unfair advantage over an existing one <xref target="Car10"/>.
In practice, this concern ought to be
alleviated by the burstiness of network traffic: all
that is needed to measure the base delay is one small gap
in transmission schedules between the LEDBAT
flows. These gaps can occur for a number of reasons such
as latency introduced due to application sending patterns,
OS scheduling at the
sender, processing delay at the sender or any network
node, and link contention. When such a gap occurs in the
first sender's transmission while the late-comer is
starting, the base delay is immediately measured correctly.
With a small number of LEDBAT flows,
system noise may sufficiently regulate the late-comer's
advantage.
</t>
</section>
</section>
<section title="Open Areas for Experimentation">
<t>
We now outline some areas that need experimentation in the Internet
and under different network scenarios. These experiments should help
the community understand LEDBAT's dynamics and should help
towards further standardization of LEDBAT and LEDBAT-related documents.
</t>
<section title="Network Effects and Monitoring">
<t> Further study is required to fully understand the behavior and
convergence properties of LEDBAT in networks with non-tail-drop,
non-FIFO queues, in networks with frequent route changes, and in networks
with network-level load balancing. These studies should have two broad goals:
(i) to understand the effects of different network mechanisms on LEDBAT,
and (ii) to understand the impact of LEDBAT on the network.</t>
<t> Network mechanisms and dynamics can influence LEDBAT flows in unintended ways.
For instance, frequent route changes that result in increasing base delays
may, in the worst case, throttle a LEDBAT flow's throughput significantly.
The influence of different network traffic management
mechanisms on LEDBAT throughput should be studied.
</t>
<t> An increasing number of LEDBAT flows in the network
will likely result in operator-visible network effects as well
and should thus be studied.
For instance,
as long as the bottleneck queue in a network
is larger than TARGET (in terms of delay),
we expect that both the average queueing delay and loss rate
in the network should reduce as LEDBAT traffic increasingly dominates
the traffic mix in the network.
Note that for bottleneck queues that are smaller than TARGET,
LEDBAT will appear to behave very similarly to standard TCP
and its flow-level behavior may not be distinguishable from that of standard TCP.
</t>
<t> We note that a network operator may be able to verify the operation of a LEDBAT
flow by monitoring per-flow behavior and queues in the network: when the queueing delay
at a bottleneck queue is above TARGET as specified in this document, LEDBAT flows
should be expected to back off and reduce their sending rate.</t>
</section>
<section title="Parameter Values">
<t>
The throughput and response of LEDBAT to the proposed parameter values of
TARGET, decrease-GAIN, BASE_HISTORY, INIT_CWND, and MIN_CWND
should be evaluated with different types of competing traffic
in different network settings,
including with different AQM schemes at the bottleneck queue.
TARGET controls LEDBAT's added latency, while
decrease-GAIN controls LEDBAT's response to competing traffic.
Since LEDBAT is intended to be minimally intrusive to competing traffic,
the impact of TARGET and decrease-GAIN
on delay-sensitive traffic should be studied.
TARGET also impacts the growth rate of the congestion window
when off_target is smaller than 1.
This impact of TARGET on the rate of cwnd growth should be studied.
The amount of history maintained by the base delay estimator,
BASE_HISTORY,
influences the responsiveness of LEDBAT to changing network conditions.
LEDBAT's responsiveness and throughput should be evaluated in the wide area
and under conditions where abrupt changes in base delay might occur,
such as with route changes and with cellular handovers.
The impact and efficacy of these parameters
should be carefully studied with tests over the Internet.
</t>
</section>
<section title="Filters">
<t>
LEDBAT's effectiveness depends on a sender's ability to
accurately estimate end-to-end queueing delay from delay samples.
Consequently, the filtering algorithm used for this estimation,
FILTER(),
is an important candidate for experiments.
This document suggests the use of NULL, EWMA and MIN filters
for estimating the current delay;
the efficacy of these and other possible filters for this estimate
should be investigated.
FILTER() may also impact cwnd dynamics when delay samples are bundled in ACKs,
since cwnd adaptation is done once per ACK irrespective of the number of delay samples in the ACK.
This impact should be studied when the different filters are considered.
</t>
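<t>For experimentation, the three filters named above might be sketched as follows.
The function names, the EWMA weight of 1/8, and the four-sample MIN window
are assumptions for illustration and are not values prescribed by this document.</t>
<figure><artwork><![CDATA[
```python
def null_filter(samples):
    """NULL filter: use the most recent delay sample unchanged."""
    return samples[-1]


def ewma_filter(samples, weight=0.125):
    """EWMA filter: exponentially weighted moving average of the samples."""
    estimate = samples[0]
    for sample in samples[1:]:
        estimate = (1 - weight) * estimate + weight * sample
    return estimate


def min_filter(samples, window=4):
    """MIN filter: smallest of the last `window` samples."""
    return min(samples[-window:])


delays_ms = [10.0, 12.0, 11.0, 60.0]  # last sample is a spike
null_est = null_filter(delays_ms)
ewma_est = ewma_filter(delays_ms)
min_est = min_filter(delays_ms)
```
]]></artwork></figure>
<t>On this input the NULL filter follows the spike fully,
the EWMA filter follows it partially, and the MIN filter ignores it,
which illustrates the different trade-offs between responsiveness and noise rejection.</t>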
</section>
<section title="Framing">
<t>
This document defines only a congestion control algorithm
and assumes that framing mechanisms for exchanging delay information
exist within the protocol in which LEDBAT is being implemented.
If implemented in a new protocol,
both the sender and receiver may be LEDBAT-aware,
but if implemented in an existing protocol which is capable of
providing one-way delay information,
LEDBAT may be implemented as a sender-side-only modification.
In either case, the parent protocol may interact with LEDBAT's algorithms;
for instance, the rate of ACK feedback to the data sender may be dictated
by other protocol parameters, but will interact with the LEDBAT flow's dynamics.
Careful experimentation is necessary to understand
and integrate LEDBAT into both new and existing protocols.
<!--LEDBAT can be implemented in the
application layer (using a transport without congestion control
like UDP) as BitTorrent is doing. In this case changes on the
sender and the receiver code are implied. A document which is
defining LEDBAT together with the framing protocol should aim
for Proposed Standard then.
Otherwise LEDBAT could also be used within the transport layer.
If feedback is already available that provides information on
the one-way delay variations, LEDBAT could be implemented
sender-side only.-->
</t>
</section>
</section>
<section title="IANA Considerations">
<t>There are no IANA considerations for this document.</t>
</section>
<section title="Security Considerations">
<t> LEDBAT's aggressiveness is contingent on the delay estimates and on the TARGET delay value.
If these parameter values at the sender are compromised such that
delay estimates are artificially set to zero
and the TARGET delay value is set to +INFINITY,
the LEDBAT algorithm deteriorates to TCP-like behavior.
Thus, while LEDBAT is sensitive to these parameters,
the algorithm is fundamentally limited in the worst case
to be as aggressive as standard TCP.
</t>
<t>A man-in-the-middle may be able to change queueing delay on a network path,
and/or modify the timestamps transmitted by a LEDBAT sender
and/or modify the delays reported by a LEDBAT receiver,
thus causing a LEDBAT flow to back off even when there's no congestion.
A protocol using LEDBAT ought to minimize the risk of such man-in-the-middle attacks
by at least authenticating the timestamp field in the data packets
and the delay field in the ACK packets.
</t>
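<t>One way a protocol might authenticate the timestamp field is with a keyed MAC, sketched below.
The field layout (a 32-bit sequence number followed by a 64-bit microsecond timestamp),
the 16-byte truncated tag, the function names, and the existence of a pre-shared key
are all hypothetical; this document does not define a wire format.</t>
<figure><artwork><![CDATA[
```python
import hashlib
import hmac
import struct

TAG_LEN = 16  # assumed truncated tag length in bytes


def seal_timestamp(key, seq, timestamp_us):
    """Serialize (seq, timestamp) and append an HMAC-SHA256 tag."""
    field = struct.pack("!IQ", seq, timestamp_us)
    tag = hmac.new(key, field, hashlib.sha256).digest()[:TAG_LEN]
    return field + tag


def open_timestamp(key, blob):
    """Return (seq, timestamp) if the tag verifies, otherwise None."""
    field, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, field, hashlib.sha256).digest()[:TAG_LEN]
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(tag, expected):
        return None
    return struct.unpack("!IQ", field)
```
]]></artwork></figure>
<t>The same approach could cover the delay field in ACK packets.
Note that authentication of this kind does not prevent an on-path attacker
from changing the actual queueing delay.</t>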
<t>
LEDBAT is not known to introduce any new concerns with privacy, integrity,
or other security issues for flows that use it.
LEDBAT is compatible with use of IPsec and TLS/DTLS.
</t>
</section>
<section title="Acknowledgements">
<t> We thank folks in the LEDBAT working group for their comments and feedback.
Special thanks to Murari Sridharan and Rolf Winter for their patient and untiring shepherding.
</t>
</section>
</middle>
<back>
<references title='Normative References'>
&rfc2119;&rfc6298;&rfc4821;&rfc5681;&rfc3168;
</references>
<references title='Informative References'>
&rfc5905;&rfc6297;
<reference anchor="uTorrent">
<front>
<title>uTorrent Transport Protocol library</title>
<author initials="G" surname="Hazel"><organization></organization></author>
<date year="http://github.com/bittorrent/libutp, July 2012" />
</front>
</reference>
<reference anchor="g114">
<front>
<title>SERIES G: TRANSMISSION SYSTEMS AND MEDIA,
DIGITAL SYSTEMS AND NETWORKS;
International telephone connections and circuits - General;
Recommendations on the transmission quality for an
entire international telephone connection;
One-way transmission time</title>
<author></author>
<date year="ITU-T Recommendation G.114, 05/2003" />
</front>
</reference>
<reference anchor="Bra94">
<front>
<title>TCP Vegas: New techniques for congestion detection and avoidance</title>
<author initials="L" surname="Brakmo"><organization></organization></author>
<author initials="S" surname="O'Malley"><organization></organization></author>
<author initials="L" surname="Peterson"><organization></organization></author>
<date year="Proceedings of SIGCOMM '94, pages 24-35, August 1994" />
</front>
</reference>
<reference anchor="Low02">
<front>
<title>Understanding TCP Vegas: A Duality Model</title>
<author initials="S" surname="Low"><organization></organization></author>
<author initials="L" surname="Peterson"><organization></organization></author>
<author initials="L" surname="Wang"><organization></organization></author>
<date year="JACM 49 (2), March 2002" />
</front>
</reference>
<reference anchor="Car10">
<front>
<title>Rethinking Low Extra Delay Background Transport Protocols</title>
<author initials="G" surname="Carofiglio"><organization></organization></author>
<author initials="L" surname="Muscariello"><organization></organization></author>
<author initials="D" surname="Rossi"><organization></organization></author>
<author initials="C" surname="Testa"><organization></organization></author>
<author initials="S" surname="Valenti"><organization></organization></author>
<date year="arXiv:1010.5623v1, September 2010" />
</front>
</reference>
<reference anchor="Sch10">
<front>
<title>Out of my Way -- Evaluating Low Extra Delay Background Transport in an ADSL Access Network</title>
<author initials="J" surname="Schneider"><organization></organization></author>
<author initials="J" surname="Wagner"><organization></organization></author>
<author initials="R" surname="Winter"><organization></organization></author>
<author initials="H" surname="Kolbe"><organization></organization></author>
<date year="Proceedings of 22nd International Teletraffic Congress (ITC22), September 2010" />
</front>
</reference>
<!--
<reference anchor="Kur00">
<front>
<title>Fairness Comparisons Between TCP Reno and TCP Vegas for
Future Deployment of TCP Vegas</title>
<author initials="K" surname="Kurata"><organization></organization></author>
<author initials="G" surname="Hasegawa"><organization></organization></author>
<author initials="M" surname="Murata"><organization></organization></author>
<date year="Proceedings of INET 2000, July 2000" />
</front>
</reference>
-->
</references>
<section anchor="app-additional" title="Measurement Errors">
<t> LEDBAT measures and uses one-way delays, and we now consider
measurement errors in timestamp generation and use.
In this Section, we use the same locally-linear clock model and
the same terminology as Network Time Protocol (NTP) <xref target="RFC5905"/>.
<!-- This model is valid for any differentiable clock. -->
In particular,
NTP uses
the terms "offset" to refer to the difference between measured time and true time,
and "skew" to refer to the difference between the clock rate and the true rate.
A clock thus has two time measurement errors:
a fixed offset from the true time,
and a skew.
We now consider these errors in the context of LEDBAT.</t>
<section title="Clock Offset">
<t> The offset of the clocks, both the sender's and the receiver's,
shows up as a fixed error in
LEDBAT's one-way delay measurement.
The offset in the measured one-way delay
is simply the difference in offsets
between the receiver's and the sender's clocks.
LEDBAT however
does not use this estimate directly,
but uses the
difference between the measured one-way delay
and a measured base delay.
Since the
offset error (difference of clock offsets) is the same for the
measured one-way delay and the base delay,
the offsets cancel each other out in the queuing delay estimate,
which LEDBAT uses for its window computations.
Clock offset error thus has no impact on LEDBAT.</t>
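<t>The cancellation can be checked with a small numeric sketch.
The function name and the offsets used (a sender clock two hours fast,
a receiver clock 35 seconds slow) are arbitrary illustrative values;
all times are in integer milliseconds.</t>
<figure><artwork><![CDATA[
```python
def measured_owd(send_true, recv_true, sender_offset, receiver_offset):
    """One-way delay as computed from two unsynchronized clocks (ms)."""
    send_stamp = send_true + sender_offset    # sender's timestamp
    recv_stamp = recv_true + receiver_offset  # receiver's timestamp
    return recv_stamp - send_stamp


SENDER_OFFSET = 7_200_000   # sender clock is 2 hours fast
RECEIVER_OFFSET = -35_000   # receiver clock is 35 seconds slow

# Packet on an empty queue: true one-way delay of 40 ms.
base = measured_owd(0, 40, SENDER_OFFSET, RECEIVER_OFFSET)

# Later packet: true one-way delay of 65 ms (25 ms of queueing).
current = measured_owd(10_000, 10_065, SENDER_OFFSET, RECEIVER_OFFSET)

queueing = current - base
```
]]></artwork></figure>
<t>Both measurements are wildly wrong in absolute terms,
yet their difference recovers the 25 ms of queueing delay exactly.</t>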
</section>
<section title="Clock Skew">
<t> Clock skew generally shows up as a
linearly changing error in a time estimate.
Similar to the offset,
the skew of LEDBAT's one-way delay estimate
is thus the difference between the two clocks' skews.
Unlike the offset however, skew does not cancel out when
the queuing delay estimate is computed,
since it causes the two clocks' offsets to change over time.
</t>
<t>
While the offset could be large, with some clocks off by minutes or even
hours or more,
skew is typically small.
Typical skews of untrained clocks
seem to be around 100-200 PPM <xref target="RFC5905"/>,
where a skew of 100 PPM translates to
an error accumulation of 6 milliseconds per minute.
This accumulation is limited in LEDBAT,
since any error accumulation is limited to the amount of history maintained
by the base delay estimator, as dictated by the BASE_HISTORY parameter.
The effects of clock skew error on LEDBAT
should generally be insignificant
unless the skew is unusually high,
or unless extreme values have been chosen for TARGET (extremely low)
and BASE_HISTORY (extremely large).
<!--so that the clock skew error in the measured one-way delay
within base_delay amount of time is larger than the TARGET.-->
Nevertheless, we now consider the possible impact of skew on LEDBAT behavior.
</t>
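<t>The accumulation figures above can be reproduced with the following sketch.
The function name is hypothetical, and the 10-minute history used in the second
call is an assumed value for illustration only.</t>
<figure><artwork><![CDATA[
```python
def skew_error_ms(skew_ppm, elapsed_s):
    """Accumulated timing error in ms for a given skew over elapsed_s seconds.

    skew_ppm * 1e-6 is the fractional rate error; multiplied by the elapsed
    time in seconds and converted to ms, this reduces to skew_ppm * s / 1000.
    """
    return skew_ppm * elapsed_s / 1000.0


per_minute = skew_error_ms(100, 60)         # 100 PPM over one minute
over_history = skew_error_ms(100, 10 * 60)  # assumed 10-minute history
```
]]></artwork></figure>
<t>The first value is the 6 ms per minute quoted above.
The second shows how the bound scales linearly with the amount of history kept,
which is why extreme combinations of a large BASE_HISTORY and a low TARGET
deserve attention.</t>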
<t>Clock skew can manifest in two ways:
the sender's clock can be faster than the receiver's clock,
or the receiver's clock can be faster than the sender's clock.
In the first case,
the measured one-way delay will decrease as the sender's clock drifts forward.
While this drift can lead to an artificially low estimate of the queueing delay,
the drift should also lead to a lower base delay measurement,
which consequently absorbs the erroneous reduction in the one-way delay estimates.
</t>
<!-- If a competing flow introduces additional queueing delay
LEDBAT will anyway get out of the way quickly and an overestimated
one-way delay will just speed-up the back-off.</t> -->
<t>In the second case,
the one-way delay estimate will artificially increase with time.
This increase can reduce a LEDBAT flow's throughput
unnecessarily. In this case, a skew correction mechanism can
be beneficial.</t>
<t> We now discuss an example clock skew correction mechanism.
In this example, the receiver sends back raw (sending
and receiving) timestamps. Using this information,
the sender can estimate one-way delays in both
directions,
and the sender can also compute and maintain an estimate of the base delay as
would be observed by the receiver.
If the sender
detects the receiver reducing its estimate of the base delay,
it may infer that this reduction is due to
clock drift. The sender then compensates by increasing its base
delay estimate by the same amount. To apply this mechanism,
timestamps need to be transmitted in both directions.</t>
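<t>A minimal sketch of this example mechanism is shown below.
The class and attribute names are hypothetical, delays are in integer milliseconds,
and the sketch attributes every reduction in the reverse-direction base delay to clock drift,
which is the inference described above and may not always hold.</t>
<figure><artwork><![CDATA[
```python
class SkewCompensator:
    """Sender-side base delay tracking with drift compensation."""

    def __init__(self):
        self.fwd_base = None  # base delay, sender to receiver
        self.rev_base = None  # base delay as the receiver would observe it

    def on_ack(self, fwd_delay, rev_delay):
        """Process one pair of raw one-way delays; return the queueing delay."""
        if self.fwd_base is None:
            self.fwd_base, self.rev_base = fwd_delay, rev_delay
        else:
            self.fwd_base = min(self.fwd_base, fwd_delay)
            if rev_delay < self.rev_base:
                # The receiver's base delay dropped: assume clock drift and
                # raise the sender's base delay by the same amount.
                drift = self.rev_base - rev_delay
                self.rev_base = rev_delay
                self.fwd_base += drift
        return fwd_delay - self.fwd_base


comp = SkewCompensator()
# Receiver clock gains 1 ms per step; true delay is 40 ms each way, no queue.
drift_only = [comp.on_ack(40 + k, 40 - k) for k in range(5)]
# Next step adds 20 ms of real forward queueing on top of the drift.
with_queue = comp.on_ack(45 + 20, 35)
```
]]></artwork></figure>
<t>Without the compensation, the drift alone would have appeared as a steadily
growing queueing delay; with it, only the real 20 ms of queueing is reported.</t>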
<t> We now outline a few other ideas that can be used for skew correction.
<list style="symbols">
<t>Skew correction with faster virtual clock:<vspace blankLines="0" />
Since having a faster clock on the sender will result in continuous updates of
the base delay, a faster virtual clock can be used for sender timestamping. This
virtual clock can be computed from the default machine clock through
a linear transformation. For instance, with a 500 PPM speed-up the sender's clock
is very likely to be faster than a receiver's clock. Consequently, LEDBAT will benefit
from the implicit correction when updating the base delay.<vspace blankLines="1" />
</t>
<t>Skew correction with estimating drift:<vspace blankLines="0" />
A LEDBAT sender maintains a history of base delay minima.
This history can provide a base to compute the clock skew difference between the
two hosts. The slope of a linear function fitted to the set of base delay minima
gives an estimate of the clock skew. This estimation can be used to correct
the clocks. If the other endpoint is doing the same, the clock should be
corrected by half of the estimated skew amount.<vspace blankLines="1" />
</t>
<t>Byzantine skew correction:<vspace blankLines="0" />
When it is known that each host maintains long-lived connections to a
number of different other hosts, a byzantine scheme can be used to
estimate the skew with respect to the true time. Namely, a host calculates
the skew difference for each of the peer hosts as described with the
previous approach, and then takes the median of the skew differences.
While this scheme is not universally applicable, it combines well
with other schemes, since it is essentially a clock training
mechanism. The scheme also corrects fast, since state is
preserved between connections.
</t>
</list></t>
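<t>The drift-estimation and median-based ideas above might be sketched as follows.
The function names are hypothetical, the fit is an ordinary least-squares line,
and the sample data (one base delay minimum per minute, per-peer skew values)
are invented for illustration.</t>
<figure><artwork><![CDATA[
```python
from statistics import median


def fitted_slope(xs, ys):
    """Slope of the least-squares line through the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def skew_ppm(minutes, base_minima_ms):
    """Skew difference between two hosts, from per-minute base delay minima."""
    slope_ms_per_min = fitted_slope(minutes, base_minima_ms)
    # 1 ms per minute is 1 ms per 60000 ms, i.e. 1000/60 PPM.
    return slope_ms_per_min * 1000.0 / 60.0


# Base delay minima drifting upward by 6 ms per minute.
peer_skew = skew_ppm([0, 1, 2, 3, 4], [40.0, 46.0, 52.0, 58.0, 64.0])

# Byzantine-style estimate: median of skew differences across several peers.
own_skew = median([peer_skew, 20.0, -30.0])
```
]]></artwork></figure>
<t>The median is used so that a single peer with an unusually skewed clock
does not dominate the host's estimate of its own skew.</t>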
</section>
</section>
</back>
</rfc>