<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc comments="yes"?>
<?rfc compact="yes"?>
<?rfc inline="yes"?>
<?rfc sortrefs="yes"?>
<?rfc subcompact="yes"?>
<?rfc symrefs="yes"?>
<?rfc toc="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc tocompact="yes"?>
<rfc category="std" docName="draft-ymbk-idr-rs-bfd-01" ipr="noDerivativesTrust200902">
<front>
<title>Making Route Servers Aware of Data Link Failures at IXPs</title>
<author fullname="Randy Bush" initials="R." surname="Bush">
<organization>Internet Initiative Japan</organization>
<address>
<postal>
<street>5147 Crystal Springs</street>
<city>Bainbridge Island</city>
<region>Washington</region>
<code>98110</code>
<country>US</country>
</postal>
<email>randy@psg.com</email>
</address>
</author>
<author fullname="Jeffrey Haas" initials="J." surname="Haas">
<organization>Juniper Networks, Inc.</organization>
<address>
<postal>
<street>1194 N. Mathilda Ave.</street>
<city>Sunnyvale</city>
<region>CA</region>
<code>94089</code>
<country>US</country>
</postal>
<email>jhaas@juniper.net</email>
</address>
</author>
<author fullname="John G. Scudder" initials="J." surname="Scudder">
<organization>Juniper Networks, Inc.</organization>
<address>
<postal>
<street>1194 N. Mathilda Ave.</street>
<city>Sunnyvale</city>
<region>CA</region>
<code>94089</code>
<country>US</country>
</postal>
<email>jgs@juniper.net</email>
</address>
</author>
<author fullname="Arnold Nipper" initials="A." surname="Nipper">
<organization>DE-CIX Management GmbH</organization>
<address>
<postal>
<street>Lichtstrasse 43i</street>
<city>Cologne</city>
<code>50825</code>
<country>Germany</country>
</postal>
<email>arnold.nipper@de-cix.net</email>
</address>
</author>
<author fullname="Thomas King" initials="T." surname="King" role="editor">
<organization>DE-CIX Management GmbH</organization>
<address>
<postal>
<street>Lichtstrasse 43i</street>
<city>Cologne</city>
<code>50825</code>
<country>Germany</country>
</postal>
<email>thomas.king@de-cix.net</email>
</address>
</author>
<date month="March" year="2015" />
<abstract>
<t>
When route servers are used, the data plane is not congruent with
the control plane. Therefore, the peers on the Internet exchange can
lose data connectivity without the control plane being aware of it,
and packets are dropped on the floor. This document proposes the use
of BFD between the two peering routers to detect a data plane
failure, and the use of BGP next hop cost to signal the state of the
data link to the route server(s).
</t>
</abstract>
<note title="Requirements Language">
<t>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
are to
be interpreted as described in
<xref target="RFC2119" />
only when they appear in all upper
case. They may also appear in
lower or mixed case as English
words, without normative meaning.
</t>
</note>
</front>
<middle>
<section anchor="intro" title="Introduction">
<t>
In configurations (typically Internet exchanges) where EBGP routing
information is exchanged between client routers through the agency of a route
server <xref target="I-D.ietf-idr-ix-bgp-route-server" />, but traffic is exchanged directly, operational
issues can arise when partial data plane connectivity exists among the route
server client routers.
This is because, as the data plane is not congruent with the control
plane, the client routers on the Internet exchange can lose data
connectivity without the
control plane - the route server - being aware of it, and
packets are dropped on the floor.
</t>
<t>
To remedy this, two basic problems need to be solved:
</t>
<t>
1. Client routers must have a means of verifying connectivity
amongst themselves, and
</t>
<t>
2. Client routers must have a means of communicating the knowledge so
gained back to the route server.
</t>
<t>
The first can be solved by application of Bidirectional Forwarding Detection
<xref target="RFC5880" />. The second can be solved
by use of BGP NH-SAFI <xref target="I-D.ietf-idr-bgp-nh-cost" />. There is a subsidiary problem that
must also be solved. Since one of the key value propositions offered by
a route server is that client routers need not be configured to peer with each
other:
</t>
<t>
3. Client routers must have a means (other than configuration)
to know of one another's existence.
</t>
<t>
This can also be solved by an application of BGP NH-SAFI.
</t>
<t>
Throughout this document, we generally assume that the route server
being discussed is able to represent different RIBs towards
different clients, as discussed in
<xref target="I-D.ietf-idr-ix-bgp-route-server">section 2.3.2.1</xref>.
These procedures (other than the use of BFD to track next hop
reachability) have limited value if this is not the case.
</t>
</section>
<section anchor="operation" title="Operation">
<t>
Below, we detail procedures where a route server tells its client routers
about other client routers (by sending them the other clients' next hops
using NH-SAFI), each client router verifies connectivity to those other
client routers (using BFD) and communicates its findings back to the
route server (again using NH-SAFI). The route server uses the received
NH-SAFI routes as input to the route selection process it performs on
behalf of the client.
</t>
<section anchor="discovery"
title="Mutual Discovery of Route Server Client Routers">
<t>
Strictly speaking, what is needed is not for a route server client router
to know of other (control-plane) client routers, but rather to know
(so
that it can validate) all the next hops the route server might
choose to send the client router, i.e. to know of potential forwarding
plane relationships.
</t>
<t>
In effect, this requirement amounts to knowing the BGP next hops the
route server is aware of in its Adj-RIBs-In. Fortunately, <xref target="I-D.ietf-idr-bgp-nh-cost" />
defines a construct that contains exactly this data, the
"Next-Hop Information Base", or NHIB, as well as procedures for a
BGP speaker to communicate its NHIB to its peer. Thus, the problem
can be solved by the route server advertising its NHIB to its
client router, following those procedures.
</t>
<t>
We observe that (as per NH-SAFI) the cost advertised in the
route server's Adj-NHIB-Out need not reflect a "real" IGP cost, the
only requirement being that the advertised costs are commensurate. A
route server MAY choose to advertise any fixed cost other than
all-ones (which is a reserved value in NH-SAFI). This specification
does not suggest semantics be imputed to the NH-SAFI advertised by
the route server and received by the client, other than "this next
hop is present in the control plane, you might like to track it". The
route server is not allowed to advertise a next hop as NH_UNREACHABLE.
</t>
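<t>
As an illustration only, the route server side of this discovery step
can be sketched as follows. The data layout, the function name
build_rs_adj_nhib_out, the 32-bit width assumed for the all-ones value
and the decision to skip a client's own next hops are assumptions made
for this sketch; the actual encoding is defined by NH-SAFI.
</t>
<figure><artwork><![CDATA[
```python
# Assumption: the cost is modelled as a 32-bit value whose all-ones
# pattern is reserved. The real field is defined by the NH-SAFI draft.
NH_UNREACHABLE = 0xFFFFFFFF
FIXED_COST = 1  # any fixed value other than all-ones would do


def build_rs_adj_nhib_out(adj_ribs_in, client):
    """Return {next_hop: cost} to advertise to one client.

    adj_ribs_in maps a peer name to a list of (prefix, next_hop)
    routes, standing in for the route server's Adj-RIBs-In.
    """
    nhib = {}
    for peer, routes in adj_ribs_in.items():
        if peer == client:
            # Assumption: a client does not need to track next hops
            # that it announced itself.
            continue
        for _prefix, next_hop in routes:
            # The route server never advertises NH_UNREACHABLE here;
            # it only says "this next hop exists, track it".
            nhib[next_hop] = FIXED_COST
    return nhib
```
]]></artwork></figure>
<t>
Every next hop receives the same cost, which matches the statement above
that no semantics are attached to the value advertised by the route
server.
</t>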
<t>
A route server client SHOULD use BFD (or other means beyond the
scope of this document) to track forwarding plane connectivity <xref target="RFC5880"/>
to each next hop depicted in the received NH-SAFI.
</t>
<t>
<!-- Comment John:
XXX want to say something about tracking connectivity to plain old
next hops too? Just in case the RS gets it wrong?
-->
</t>
</section>
<section anchor="tracking" title="Tracking Connectivity">
<t>
For each next hop in the Adj-NHIB-In received from the route server,
the client router SHOULD use some means to confirm that data plane
connectivity does exist to that next hop.
</t>
<t>
For each next hop in the Adj-NHIB-In received from the route server,
the client router SHOULD set up a BFD session to it, if one is not
already available, and track the reachability of this next hop.
</t>
<t>
For each next hop being tracked, a corresponding NH-SAFI route
should be placed in the client router's own Adj-NHIB-Out to be
advertised to the route server. Any next hop for which connectivity
has failed should have its cost advertised as NH_UNREACHABLE. (This
may also be done as a result of policy even if connectivity
exists.)
Any other next hop should have some feasible cost advertised. The
values advertised may be all equal, or may be set according to
policy or other implementation-specific means.
</t>
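<t>
A minimal sketch of how a client router might derive its Adj-NHIB-Out
from the tracked state is shown below. The function name, the
dictionary layout, the default cost of 10 and the 32-bit all-ones value
are hypothetical and chosen only for illustration.
</t>
<figure><artwork><![CDATA[
```python
# Assumption: 32-bit all-ones stands in for NH_UNREACHABLE.
NH_UNREACHABLE = 0xFFFFFFFF
DEFAULT_COST = 10  # hypothetical feasible cost


def build_client_adj_nhib_out(tracked, policy_blocked=frozenset(),
                              cost_policy=None):
    """Return {next_hop: cost} to advertise to the route server.

    tracked maps each tracked next hop to True when connectivity
    (for example a BFD session) is up and False when it has failed.
    policy_blocked lists next hops declared unreachable by policy.
    cost_policy optionally overrides the cost of reachable next hops.
    """
    cost_policy = cost_policy or {}
    out = {}
    for next_hop, is_up in tracked.items():
        if not is_up or next_hop in policy_blocked:
            out[next_hop] = NH_UNREACHABLE
        else:
            out[next_hop] = cost_policy.get(next_hop, DEFAULT_COST)
    return out
```
]]></artwork></figure>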
<t>
If the test of connectivity between one client router and another client
router has failed, the client router that detected this failure should
repeat the connectivity test on a regular basis (e.g. every 5 minutes)
for a configurable amount of time (preferably 24 hours). If connectivity
cannot be restored during this time, no more testing is performed and the
next hop is advertised as NH_UNREACHABLE until this is manually changed or
the client router is rebooted.
</t>
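<t>
The retry schedule described above can be illustrated with a short
sketch. The helper names are hypothetical, times are plain seconds, and
the defaults simply mirror the 24 hour window and 5 minute interval
mentioned in the text.
</t>
<figure><artwork><![CDATA[
```python
RETRY_WINDOW_S = 24 * 60 * 60  # preferred window: 24 hours
RETRY_INTERVAL_S = 5 * 60      # example interval: 5 minutes


def retry_times(failed_at, window=RETRY_WINDOW_S,
                interval=RETRY_INTERVAL_S):
    """Return the times at which connectivity is re-tested."""
    first = failed_at + interval
    last = failed_at + window
    return list(range(first, last + 1, interval))


def should_keep_testing(failed_at, now, window=RETRY_WINDOW_S):
    """True while the retry window after a failure is still open."""
    return now - failed_at <= window
```
]]></artwork></figure>
<t>
With these defaults a failed next hop is re-tested 288 times before
testing stops and manual intervention is required.
</t>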
<!--To be replaced by section best path determination. <t>
A client router tracking next hop reachability should also use that
determination as input to its own bestpath determination, as per
<xref target="RFC4271">section 9.1</xref>.
</t>-->
</section>
</section>
<section anchor="advertising" title="Advertising Client Router Connectivity to the Route Server">
<t>
As discussed above, a client router will advertise its Adj-NHIB-Out
to the route server. The route server should use this information as
input to its own decision process when computing the Adj-RIB-Out for
this peer.
This peer-dependent Adj-RIB-Out is then advertised to this peer. In
particular, the route server MUST
exclude any routes whose next hops the client has declared to be
NH_UNREACHABLE. The route server MAY also consider the advertised
cost to be the "IGP cost"
<xref target="RFC4271">section 9.1</xref>
when doing this
computation.
</t>
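<t>
The following sketch illustrates the filtering step on the route
server. It is deliberately simplified: the cost reported by the client
is the only selection criterion, whereas a real implementation runs the
full decision process of <xref target="RFC4271" />. The function name,
the default cost for unreported next hops and the 32-bit all-ones value
are assumptions.
</t>
<figure><artwork><![CDATA[
```python
# Assumption: 32-bit all-ones stands in for NH_UNREACHABLE.
NH_UNREACHABLE = 0xFFFFFFFF
DEFAULT_COST = 10  # hypothetical cost for next hops not yet reported


def compute_adj_rib_out(candidate_routes, client_nhib):
    """Return {prefix: next_hop} for one client.

    candidate_routes is a list of (prefix, next_hop) pairs.
    client_nhib maps next hops to the cost reported by this client.
    """
    best = {}
    for prefix, next_hop in candidate_routes:
        # A next hop the client has not reported is assumed reachable.
        cost = client_nhib.get(next_hop, DEFAULT_COST)
        if cost == NH_UNREACHABLE:
            continue  # MUST be excluded for this client
        # Simplification: lowest reported cost wins, first seen on ties.
        if prefix not in best or cost < best[prefix][0]:
            best[prefix] = (cost, next_hop)
    return {prefix: nh for prefix, (_cost, nh) in best.items()}
```
]]></artwork></figure>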
</section>
<section anchor="routeselectionprocess" title="Utilizing Next Hop Unreachability Information at Client Routers">
<t>
A client router detecting an unreachable next hop signals this information
to the route server as described above.
In addition, it treats the routes using that next hop as unresolvable as per
<xref target="RFC4271">section 9.1.2.1</xref> and proceeds with route selection as normal.
</t>
<t>
Implementations should take care that changes in next hop reachability learned via these
mechanisms do not cause unnecessary route flapping. Similar mechanisms exist in IGP implementations
and should be applied to this scenario.
</t>
</section>
<section anchor="recommendations" title="Recommendations for Using BFD">
<t>
The RECOMMENDED way for a client router to confirm data plane
connectivity to its next hops is to use BFD in
asynchronous mode. Echo mode MAY be used if both client routers running
a BFD session support it. The use of authentication in BFD is
OPTIONAL, as there is a certain level of trust between the
operators of the client routers at a particular IXP. If trust cannot
be assumed, it is recommended to use pair-wise keys (how this can be
achieved is outside the scope of this document).
The TTL/Hop Limit values described in
<xref target="RFC5881">section 5</xref>
MUST be obeyed in order to secure BFD sessions from packets
coming from outside the IXP.
</t>
<t>
There is interdependence between the functionality described in
this document and BFD from an administrative point of view.
To streamline the behaviour of different implementations, the following
is RECOMMENDED:
<list style='symbols'>
<t>If BFD is administratively shut down by the administrator of
a client router then the functionality described in this document
MUST also be administratively shut down.</t>
<t>If the administrator enables the functionality described in
this document on a client router then BFD MUST be automatically
enabled.</t>
</list>
</t>
<t>
The following values of the BFD configuration of client routers
(see <xref target="RFC5880">section 6.8.1</xref>) are RECOMMENDED in
order to allow a fast detection of lost data plane connectivity:
<list style='symbols'>
<t>DesiredMinTxInterval: 1,000,000 (microseconds)</t>
<t>RequiredMinRxInterval: 1,000,000 (microseconds)</t>
<t>DetectMult: 3</t>
</list>
</t>
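<t>
To make the effect of these values concrete, the sketch below computes
the asynchronous mode detection time as described in
<xref target="RFC5880">section 6.8.4</xref>: the Detect Mult received
from the remote system multiplied by the greater of the local
RequiredMinRxInterval and the remote DesiredMinTxInterval. The function
name is hypothetical.
</t>
<figure><artwork><![CDATA[
```python
def detection_time_us(remote_detect_mult, local_required_min_rx_us,
                      remote_desired_min_tx_us):
    """Asynchronous mode detection time in microseconds."""
    # The remote system transmits at the slower of the two rates.
    agreed_interval_us = max(local_required_min_rx_us,
                             remote_desired_min_tx_us)
    return remote_detect_mult * agreed_interval_us


# Both client routers use the values recommended above.
recommended = detection_time_us(3, 1_000_000, 1_000_000)
```
]]></artwork></figure>
<t>
With the recommended values on both sides, a loss of data plane
connectivity is detected after 3,000,000 microseconds, i.e. 3 seconds.
</t>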
<t>
The configuration values above are a trade-off between fast detection of lost data plane connectivity and
the load client routers must handle to keep their BFD sessions running. Selecting smaller
DesiredMinTxInterval and RequiredMinRxInterval values generates a large number of BFD packets,
especially at larger IXPs with many hundreds of client routers.
</t>
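<t>
A rough estimate of this load is sketched below. It assumes that every
client router runs one BFD session to every other client router and
ignores the transmit jitter that BFD applies, so real packet rates are
somewhat higher. The function name and the example IXP size of 500
client routers are illustrative only.
</t>
<figure><artwork><![CDATA[
```python
def bfd_load(num_clients, tx_interval_us=1_000_000):
    """Estimate BFD load for a full mesh of client routers.

    Returns (sessions per router, packets sent per second per
    router, total sessions at the IXP).
    """
    sessions_per_router = num_clients - 1
    pps_per_router = sessions_per_router * 1_000_000 / tx_interval_us
    total_sessions = num_clients * (num_clients - 1) // 2
    return sessions_per_router, pps_per_router, total_sessions


recommended = bfd_load(500)          # 1 second interval
aggressive = bfd_load(500, 100_000)  # 100 millisecond interval
```
]]></artwork></figure>
<t>
Under these assumptions each router at a 500 client IXP sends roughly
499 BFD packets per second with the recommended interval, and ten times
as many with a 100 millisecond interval.
</t>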
<t>
The configuration values above are also selected so that brief interruptions on the data plane are tolerated.
Otherwise, a BFD session that detects a brief data plane interruption to a particular client router
would cause the detecting client router to signal the route server that it should remove routes from this
client router, and shortly afterwards to tell it to add the routes again. This is disruptive and
computationally expensive on the route server.
</t>
<t>
The choice of configuration values is also partially constrained by the time BGP needs to advertise changes
in reaction to events from BFD. If the values are selected so that BFD detects data plane interruptions
much faster than BGP can advertise them, BFD could detect flapping data plane connectivity while
the route server is not informed about it, because BGP is not able to transport this information fast enough.
</t>
<t>
As discussed, finding good configuration values is hard, so a client router administrator MAY
select values better suited to the needs of the particular deployment.
</t>
<!-- BFD flapping handling mechanism -->
<!-- Comment Jeff: Whether we need to specify that the remote nexthop reachability mechanism is damped in the spec is an open question. -->
</section>
<section anchor="bootstrapping" title="Bootstrapping">
<t>
When the route server starts, it does not know anything about the connectivity
state between client routers. The route server therefore optimistically assumes
that all client routers are able to reach each other unless told otherwise.
</t>
</section>
<section anchor="other" title="Other Considerations">
<t>
For purposes of routing stability, implementations may wish to apply
hysteresis ("holddown") to next hops that have transitioned from
reachable to unreachable and back.
</t>
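<t>
One possible form of such a holddown is sketched below: a next hop is
declared unreachable immediately, but is only declared reachable again
after it has stayed up for a hold time. The class name, the 30 second
hold time and the optimistic initial state are assumptions made for
this sketch, not requirements of this document.
</t>
<figure><artwork><![CDATA[
```python
class HoldDown:
    """Delay the 'reachable again' signal for one next hop."""

    def __init__(self, hold_s=30.0):
        self.hold_s = hold_s        # hypothetical hold time
        self.advertised_up = True   # assumption: optimistic start
        self._up_since = None       # start of the current up period

    def update(self, bfd_up, now):
        """Feed the current BFD state; return the advertised state."""
        if not bfd_up:
            # Failures are signalled at once and restart the timer.
            self.advertised_up = False
            self._up_since = None
        elif not self.advertised_up:
            if self._up_since is None:
                self._up_since = now
            if now - self._up_since >= self.hold_s:
                self.advertised_up = True
                self._up_since = None
        return self.advertised_up
```
]]></artwork></figure>
<t>
A next hop that flaps faster than the hold time therefore stays
advertised as unreachable, which spares the route server repeated
withdraw and re-advertise cycles.
</t>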
</section>
</middle>
<back>
<references title="Normative References">
<?rfc include="reference.RFC.2119"?>
<?rfc include="reference.RFC.2439"?>
<?rfc include="reference.RFC.4271"?>
<?rfc include="reference.RFC.5880"?>
<?rfc include="reference.RFC.5881"?>
<?rfc include="reference.I-D.ietf-idr-ix-bgp-route-server"?>
<?rfc include="reference.I-D.ietf-idr-bgp-nh-cost"?>
</references>
<!--- <references title="Informative References"> <?rfc include="reference.I-D.ietf-sidr-bgpsec-protocol"?>
</references> -->
</back>
</rfc>