One document matched: draft-chroboczek-babel-doesnt-care-00.xml
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" []>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes"?>
<?rfc tocdepth="2"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="no" ?>
<rfc category="info" docName="draft-chroboczek-babel-doesnt-care-00"
ipr="trust200902">
<front>
<title abbrev="Babel doesn't care">Babel doesn't care</title>
<author fullname="Juliusz Chroboczek" initials="J." surname="Chroboczek">
<organization>PPS, University of Paris-Diderot</organization>
<address>
<postal>
<street>Case 7014</street>
<city>75205 Paris Cedex 13</city>
<region></region>
<code></code>
<country>France</country>
</postal>
<email>jch@pps.univ-paris-diderot.fr</email>
</address>
</author>
<date day="4" month="April" year="2015"/>
<abstract><t>Where we claim that, unlike typical IGPs, the Babel routing
protocol just doesn't care.</t></abstract>
</front>
<middle>
<section title="Introduction: why Babel doesn't care">
<t>The core of the Babel routing protocol <xref target="RFC6126"/>
consists of a distance vector protocol based on a distributed variant of
the Bellman-Ford algorithm combined with a loop avoidance and a blackhole
elimination mechanism. As long as these mechanisms are left intact, Babel
will push packets in roughly the right direction following a loop-free
path.</t>
<t>Babel doesn't care what happens in the network, in the metric, or in
the route selection policy, or whether routing information is propagated
in a timely manner through the network. In particular, Babel has been
shown to work on unstable networks, on wireless networks, on mesh
networks, on hybrid networks (networks composed of an unstable meshy bit
combined with a stable wired infrastructure), on overlay networks, and has
been extended with source-specific routing. Some of these extensions
involved fundamental changes to the structure of metrics or even the
forwarding semantics, but, to a great extent, Babel just didn't care.</t>
<t>This is in contrast with familiar link-state IGPs, such as OSPF and
IS-IS. These link-state protocols deeply care that all nodes in a given
area have their link state databases synchronised in a timely manner, lest
routing loops and other pathologies occur. OSPF and IS-IS therefore
require careful (see?) implementation of timely and reliable flooding, do
not support dynamically computed metrics without careful attention to
stability issues, do not allow any form of filtering or non-default route
selection except at area boundaries, and require a distributive
metric.</t>
<section title="Consequences and applicability of the protocol">
<t>Babel doesn't care where the routing information comes from —
whatever information it is given, it maintains a set of loop-free paths.
As we show in <xref target="examples"/>, this has allowed Babel to either
work out of the box or be easily adapted to a wide range of situations
that are difficult or outright impossible to solve with the familiar
link-state routing protocols.</t>
<t>This enormous flexibility of the protocol makes Babel particularly
adapted to tricky networks where the features of familiar routing
protocols (shortest-path next-hop routing over transitive links with
statically defined metrics) are not good enough. This can include mesh
networks, overlay networks, radio networks, etc. Since all of these kinds
of networks are handled by a single protocol, Babel is particularly
applicable to hybrid networks, whose different parts need different
algorithms, for example networks composed of a meshy part and a stable,
wired part.</t>
<t>Babel is not necessarily the best choice in large, stable, carefully
administered networks, where a more rigid protocol such as OSPF or IS-IS
will have lower overhead and be easier to debug.</t>
</section>
</section>
<section title="Structure of the Babel routing protocol" anchor="structure">
<t>The Babel routing protocol <xref target="RFC6126"/> is a loop-avoiding
distance vector protocol. The core Babel protocol is reasonably simple
(it is described in 30 pages in RFC 6126 and has been reimplemented
from scratch in slightly over 20 hours by Markus Stenberg), and consists
of the following components:
<list style="symbols">
<t>a distance vector routing protocol that is metric-agnostic and
supports flexible routing policies;</t>
<t>explicit mechanisms (Hello/IHU) for link sensing and bidirectional
reachability detection;</t>
<t>a loop-avoidance mechanism that makes strong guarantees even in
transient state;</t>
<t>a provably complete blackhole avoidance mechanism.</t>
</list></t>
<t>In the rest of this section, we describe these mechanisms in more
detail.</t>
<section title="Distance-vector protocol">
<t>At the core of Babel lies a distance-vector routing protocol, based on
a distributed variant of the Bellman-Ford algorithm, similiar to the core
of BGP or EIGRP.</t>
<t>Distance-vector IGPs have become unfashionable sometime in the 1990s.
However, distance vector has a number of very pleasant properties: it is
simple to implement, reasonably robust in the presence of implementation
mistakes, and allows the use of redundant routing tables (routing tables
with multiple routes to a single destination), which makes reconvergence
after a link failure almost instantaneous (no packet exchanges) when the
alternate routes are already available.</t>
<t>Another important property is that, unlike link-state, distance vector
builds its routes consistently with the direction of forwarding in
next-hop routing. This makes it robust in the presence of routing
inconsistencies, a property on which Babel relies when it delays sending
updates or uses a non-distributive metric (see <xref
target="non-distributive"/> below).</t>
</section>
<section title="Link sensing">
<t>Like most modern routing protocols, Babel has a simple and lightweight
mechanism for detecting neighbours and eliminating assymetric neighbours.
Every node sends periodic Hello TLVs; somewhat less often, every node
sends IHU ("I Heard You") TLVs that contain the list of all the nodes from
which it has heard a Hello.</t>
<t>Hello TLVs carry a sequence number (not to be confused with the
sequence number used for loop avoidance). Additionally, both Hello and
IHU TLVs carry an explicit interval — an upper bound on the time
after which a new Hello or IHU can be expected. This allows nodes with
different parameters to reliably establish neighbour relationships, since
the hold time for a neighbour can be determined dynamically from the
explicit interval. Additionally, the Hello seqno allows a node to vary
its Hello to send unscheduled Hellos without confusing its neigbours, for
example after detecting a mobility event.</t>
<t>Both Hello and IHU TLVs are extensible. For example, the delay-based
routing extension <xref target="BABEL-RTT"/> uses Hellos and IHUs to
carry timestamps and timestamp echoes respectively.</t>
</section>
<section title="Metric agnosticity">
<t>Distance vector's robustness and the extensibility of Babel's packet
format conspire to make Babel metric-agnostic: a metric in Babel doesn't
need to be an integer, it can be any sort of algebra that is equipped with
a strictly-monotonic ordering (Section 3.5.2 of <xref target="RFC6126"/>).
In particular, a Babel metric need not be distributive — a property
that is used by Babel's radio interference extension <xref
target="BABEL-Z"/>. More about this in <xref target="non-distributive"/>
below.</t>
<t>What is more, the loop-avoidance features of Babel ensure that a packet
is pushed in roughly the right direction, following a loop-free path, even
when the metric is extremely unstable. This property is what makes
delay-based routing work reliably even in the presence of congestion.</t>
</section>
<section title="Flexible route selection policies">
<t>Distance vector is flexible, in the sense that routes can be dropped
("filtered") out at any point in the network, and that a node can that is
unwilling to forward packets may advertise an artificially inflated metric
("prepending"). This is very different from link state, where filtering
can only be done at area boundaries. The reference implementation of
Babel includes a powerful language for filtering and prepending, and this
feature has proven to be extremely useful to work around flaws in the data
plane (e.g. for discarding NATed or firewalled routes).</t>
<t>What is more, route selection need not choose the route with smallest
metric -- any route with finite metric will do, as long as the choice is
consistent with the data plane. This property is used by Babel to improve
stability by applying hysteresis during route selection.</t>
</section>
<section title="Loop avoidance">
<t>Like EIGRP and BGP, Babel is a loop-avoiding protocol: only those
routes are accepted that can be proven to be loop-free. A route that has
this property is said to be "feasible".</t>
<t>Babel's feasibility condition is based on the one used in EIGRP <xref
target="DUAL"/>, but is structured similarly to DSDV <xref
target="DSDV"/>. Thus, Babel avoids EIGRP's active phase (not needed with
DSDV-style feasibility) while also avoiding DSDV's issues with retractions
(the even-odd mechanism of DSDV is not necessary with EIGRP-style
feasibility) and minor metric fluctuations (EIGRP-style feasibility can
cushion up to one hop of fluctuation).</t>
<t>Babel's loop-freedom properties are well understood (Section 1.1 of
<xref target="RFC6126"/>) and have been proven (the proof has not been
published yet).</t>
</section>
<section title="Blackhole elimination">
<t>It is well known that DSDV-style loop avoidance, as used in Babel, has
an issue: it may sometimes cause spurious blackholes. This situation is
known as "starvation" in Babel (Section 2.5 of <xref target="RFC6126"/>).</t>
<t>DSDV attempted to work around the issue by updating a route's seqno
periodically, which made DSDV unsuitable for high-mobility scenarios:
after moving a DSDV laptop to a new place, the user would need to wait for
up to 30 seconds for a new seqno to be generated and propagate across the
network. What is more, the need to propagate new seqnos in a timely
manner put an upper bound on DSDV's update interval.</t>
<t>Babel uses a different mechanism for clearing spurious blackholes,
which is based on the observation that it should be the starving node that
triggers a seqno update. When a node detects that it is suffering from
starvation (all the routes to a given destination are unfeasible), it
sends an end-to-end "seqno request" that requests that the source increase
its seqno number. Since seqno requests follow routes that have not been
proven to be loop-free, they are equipped with a hop count; combined with
duplicate request detection, this mechanism is complete and reasonably
efficient.</t>
</section>
</section>
<section title="Examples of how Babel doesn't care" anchor="examples">
<t>In the rest of this document, we describe a number of situations in
which Babel just doesn't care.</t>
<section title="Lossy networks">
<t>Babel was designed to support lossy networks well, and one of the
metrics suggested in Appendix A of <xref target="RFC6126"/> depends on
packet loss. This metric tends to fluctuate in normal usage, but two
features of Babel conspire to make it work well: Babel's feasibility
function is able to keep a route feasible when the metric doesn't
fluctuate by more than the cost of one hop, and Babel's loop avoidance
ensures that packets are pushed in the right direction even when the
metric fluctuates.</t>
<t>The metric may fluctuate — but Babel doesn't care.</t>
</section>
<section title="Non-transitive (mesh) networks">
<t>A non-transitive (or mesh) network is one in which there is no
well-defined notion of link — where the neighbour relationship
between interfaces is not transitive. Routing protocols such as OSPF and
IS-IS require a transitive network. Babel just doesn't care.</t>
<t>(In fact, it does care a little — the split horizon optimisation
can be enabled on a transitive, lossless link.)</t>
</section>
<section title="Non-default route selection policies">
<t>Babel doesn't require that a router always pick the route with smallest
metric. The route selection procedure uses this property to prefer stable
routes (routes that have been around for a while) to potentially unstable
ones (routes that have popped up recently), and could potentially take
into account information that has been obtained from other sources than
the protocol (say, battery charge information).</t>
<t>Choosing a path other than the shortest one would break a link-state
protocol. Babel doesn't care.</t>
</section>
<section title="Non-distributive metrics" anchor="non-distributive">
<t>In most routing protocols, metrics consist of some set of bounded
integers equipped with the usual addition. But there is no requirement
that the metric have such a simple structure. Babel will work with any
metric that is strictly monotonic on the left:</t>
<t>m < n.m</t>
<t>Link-state protocols make one further requirement on the metric, called
left distributivity or isotonicity, which is not required by distance
vector protocols:</t>
<t>m <= m' ---> n.m <= n.m'</t>
<t>To see how a non-distributive metric can arise, consider Babel's radio
interference extension <xref target="BABEL-Z"/>, which aims to minimise
the number of times a route crosses a given frequency. Suppose that A is
a single radio router, while routers B and C have two radios; suppose
further that the dashed frequency has less packet loss than the dotted
frequency:</t>
<figure><artwork><![CDATA[
q
p -----
A ----- B ..... C
q'
]]></artwork></figure>
<t>Then the preferred route from A to C is p.q', while the preferred route
from B to C is q. The metric is non-distributive:</t>
<t>q <= q' but p.q > p.q'</t>
<t>(A note for BGP specialists: this is analogous to what happens in BGP
when q is a customer route, and is therefore preferred by B, while q' is
a shorter peer route, and would therefore be preferable for A. Many
familiar BGP policies can be modelled by non-distributive metrics.)</t>
<t>Using a non-distributive metric is impossible in link-state protocols,
where it may lead to persistent routing loops. Babel doesn't care, and
converges to a Nash equilibrium. (This assertion is currently unproven,
but similar properties are known to hold for Bellman-Ford.)</t>
</section>
<section title="Delay-based routing">
<t>Overlay networks — networks built over tunnels or VPNs —
present a complex network topology as a single hop. Efficient routing in
such networks requires distinguishing between local and remote hops, which
is usually done by a human administrator manually tweaking the metric.</t>
<t>The delay-based extension to Babel <xref target="BABEL-RTT"/> <xref
target="DELAY-BASED"/> uses RTT measurements to automatically prefer local
routes. Such an approach leads to a negative feedback loop between the
control plane and the data plane, which naturally leads to routing
oscillations (a route is preferred when it has low RTT, which causes
traffic to flow through it, which in turn causes its RTT to increase,
which causes a different route to be preferred). In a link-state
protocol, such oscillations would lead to repeated refloodings and SPF
recomputation. Babel, on the other hand, doesn't care: even in the
presence of oscillations, it is pushing packets in the right direction
following loop-free paths.</t>
<t>(Oscillations cause packet reordering, which is something the transport
layer cares about, so while Babel itself might not care, we have equipped
our implementation with a number of mechanisms to limit the amount of
oscillation.)</t>
</section>
<section title="Source-specific routing">
<t>On advice from the Homement working group, we have extended Babel with
support for source-specific routing <xref target="BABEL-SS"/> <xref
target="SS-ROUTING"/>. This is a sizeable extension: it changes the
behaviour of the data plane (packets are forwarded according to both
source and destination), it changes the notion of most-specific route, and
it requires no less than three new TLVs to be carried by the protocol.</t>
<t>In this particular case, Babel does care, to a certain extent. The new
TLVs needed to be very carefully encoded so that even in the presence of
non-source-specific routers, the data plane remains consistent with the
control plane, and we needed to make some efforts to ensure that the
blackhole elimination mechanism wouldn't be broken by source-specific
routing. From a different point of view, however, Babel still doesn't
care: the loop avoidance mechanism applied with no changes to the new kind
of routes.</t>
</section>
<section title="Interoperability">
<t>All of the extensions to Babel described above interoperate, to the
extent possible. For example, a delay-based router will reliably exchange
routes with a non-delay-based one, but will not perform any RTT
estimation. Similarly, a source-specific router will reliably exchange
non-specific routes with a non-source-specific router.</t>
<t>While achieving this requires careful encoding of the extension data so
that just the right bits are discarded by routers that don't understand
the extension (Section 4 of <xref target="BABEL-EXT"/>) and checking that
the forwarding behaviour remains consistent with the routing data, Babel
itself doesn't care — it just computes paths and applies the loop
avoidance mechanism, whatever the source of the routing information.</t>
</section>
</section>
<section title="Conclusion">
<t>We have solved within the Babel routing protocol a number of
interesting routing problems that are difficult or outright impossible to
solve in other routing protocols, and we have done that without breaking
interoperability between the various variants of the protocol. We claim
that this makes Babel particularly suitable for the kind of messy networks
where such problems arise, and in particular to hybrid networks, where
different parts of the network require different mechanisms to work well
but don't necessarily wish to run multiple routing protocols.</t>
<t>But that's us. Babel doesn't care.</t>
</section>
</middle>
<back>
<references>
<reference anchor="RFC6126"><front>
<title>The Babel Routing Protocol</title>
<author fullname="Juliusz Chroboczek" initials="J." surname="Chroboczek"/>
<date month="February" year="2011"/>
</front>
<seriesInfo name="RFC" value="6126"/>
</reference>
<reference anchor="BABEL-EXT"><front>
<title>Extension Mechanism for the Babel Routing Protocol</title>
<author fullname="Juliusz Chroboczek" initials="J." surname="Chroboczek"/>
<date month="March" year="2015"/>
</front>
<seriesInfo name="Internet-Draft" value="draft-chroboczek-babel-extension-mechanism-04"/>
</reference>
<reference anchor="BABEL-SS"><front>
<title>Source-Specific Routing in Babel</title>
<author fullname="Matthieu Boutier" initials="M." surname="Boutier"/>
<author fullname="Juliusz Chroboczek" initials="J." surname="Chroboczek"/>
<date year="2014" month="November"/>
</front>
<seriesInfo name="Internet-Draft" value="draft-boutier-babel-source-specific-00"/>
</reference>
<reference anchor="SS-ROUTING" target="http://arxiv.org/abs/1403.0445">
<front>
<title>Source-sensitive routing</title>
<author initials="M." surname="Boutier" fullname="Matthieu Boutier"/>
<author initials="J." surname="Chroboczek" fullname="Juliusz Chroboczek"/>
<date year="2015" month="March"/>
</front>
<annotation>To appear in Proc. IFIP Networking 2015.</annotation>
</reference>
<reference anchor="BABEL-Z"><front>
<title abbrev="Babel Diversity Routing">Diversity Routing for the Babel
Routing Protocol</title>
<author fullname="Juliusz Chroboczek" initials="J." surname="Chroboczek"/>
<date year="2014" month="July"/>
</front>
<seriesInfo name="Internet-Draft" value="draft-chroboczek-babel-diversity-routing-00"/>
</reference>
<reference anchor="BABEL-RTT"><front>
<title>Delay-based Metric Extension for the Babel Routing Protocol</title>
<author fullname="Baptiste Jonglez" initials="B." surname="Jonglez"/>
<author fullname="Juliusz Chroboczek" initials="J." surname="Chroboczek"/>
<date year="2014" month="July"/>
</front>
<seriesInfo name="Internet Draft" value="draft-jonglez-babel-rtt-extension-00"/>
</reference>
<reference anchor="DELAY-BASED" target="http://arxiv.org/abs/1403.3488">
<front>
<title>A delay-based routing metric</title>
<author fullname="Baptiste Jonglez" initials="B." surname="Jonglez"/>
<author fullname="Matthieu Boutier" initials="M." surname="Boutier"/>
<author fullname="Juliusz Chroboczek" initials="J." surname="Chroboczek"/>
<date year="2015" month="March"/>
</front>
<annotation>Submitted for publication.</annotation>
</reference>
<reference anchor="DSDV"><front>
<title>Highly Dynamic Destination-Sequenced Distance-Vector Routing
(DSDV) for Mobile Computers</title>
<author fullname="Charles Perkins" initials="C." surname="Perkins"/>
<author fullname="Pravin Bhagwat" initials="P." surname="Bhagwat"/>
<date year="1994"/>
</front>
<seriesInfo name="ACM SIGCOMM'94 Conference on Communications
Architectures, Protocols and Applications" value="234-244"/>
</reference>
<reference anchor="DUAL"><front>
<title>Loop-Free Routing Using Diffusing Computations</title>
<author fullname="J. J. Garcia Luna Aceves"
initials="J. J." surname="Garcia Luna Aceves"/>
<date month="February" year="1993"/>
</front>
<seriesInfo name="IEEE/ACM Transactions on Networking" value="1:1"/>
</reference>
</references>
</back>
</rfc>
| PAFTECH AB 2003-2026 | 2026-04-24 07:12:28 |