One document matched: draft-bgp-path-marking-00.xml
<?xml version="1.0"?>
<?rfc comments="yes"?>
<?rfc compact="yes"?>
<?rfc inline="yes"?>
<?rfc sortrefs="yes"?>
<?rfc subcompact="no"?>
<?rfc symrefs="yes"?>
<?rfc toc="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc tocompact="yes"?>
<rfc category="std" docName="draft-bgp-path-marking-00" ipr="pre5378Trust200902">
<front>
<title abbrev="Path-Marking">
BGP Path Marking
</title>
<author initials="C.C." surname="Camilo Cardona" fullname="Camilo Cardona">
<organization>IMDEA Networks</organization>
<address>
<postal>
<street>Avenida del Mar Mediterraneo</street>
<city>Leganes</city> <code>28919</code>
<country>Spain</country>
</postal>
<!--><uri>http://inl.info.ucl.ac.be/pfr</uri><-->
<email>juancamilo.cardona@imdea.org</email>
</address>
</author>
<author initials="P.F." surname="Pierre Francois" fullname="Pierre Francois">
<organization>IMDEA Networks</organization>
<address>
<postal>
<street>Avenida del Mar Mediterraneo</street>
<city>Leganes</city> <code>28919</code>
<country>Spain</country>
</postal>
<!--><uri>http://inl.info.ucl.ac.be/pfr</uri><-->
<email>pierre.francois@imdea.org</email>
</address>
</author>
<author initials="S.R." surname="Ray" fullname="Saikat Ray">
<organization>Cisco Systems</organization>
<address>
<postal>
<street>170 W. Tasman Drive</street>
<city>San Jose, CA</city> <code>95134</code>
<country>USA</country>
</postal>
<!--><uri>http://inl.info.ucl.ac.be/pfr</uri><-->
<email>sairay@cisco.com</email>
</address>
</author>
<author initials="K.P." surname="Patel" fullname="Keyur Patel">
<organization>Cisco Systems</organization>
<address>
<postal>
<street>170 W. Tasman Drive</street>
<city>San Jose, CA</city> <code>95134</code>
<country>USA</country>
</postal>
<!--><uri>http://inl.info.ucl.ac.be/pfr</uri><-->
<email>keyupate@cisco.com</email>
</address>
</author>
<author initials="P.L." surname="Paolo Lucente" fullname="Paolo Lucente">
<organization>Cisco Systems</organization>
<address>
<postal>
<street>170 W. Tasman Drive</street>
<city>San Jose, CA</city> <code>95134</code>
<country>USA</country>
</postal>
<!--><uri>http://inl.info.ucl.ac.be/pfr</uri><-->
<email>plucente@cisco.com</email>
</address>
</author>
<author fullname="Pradosh Mohapatra" initials="P.M." surname="Mohapatra">
<organization>Cumulus Networks</organization>
<address>
<postal>
<street>140 C. Whisman Rd.</street>
<city>Mountain View</city>
<region>CA</region>
<code>94041</code>
<country>USA</country>
</postal>
<email>pmohapat@cumulusnetworks.com</email>
</address>
</author>
<date month="July" year="2013" />
<area>General</area>
<keyword>I-D</keyword>
<keyword>Internet-Draft</keyword>
<abstract>
<t>
The potential advertisement of non-best paths by a BGP speaker
supporting the add-path or the best-external extensions makes
it difficult for other BGP speakers to identify the paths that
have been selected as best by those who advertise them. This
information is required for proper operation of some
applications. Towards that end, this document proposes marking
the paths using extended communities that encode the path
type.
</t>
</abstract>
</front>
<middle>
<section title="Introduction" anchor="sec:intro">
<t>
When there are multiple paths for a given address prefix, BGP
chooses one of the paths as the "best-path" according to the
best-path selection rules prescribed in <xref
target="RFC4271"></xref> and installs the best-path in its
forwarding table. Classically, each BGP speaker advertises
only the best-path to its peers. So when a BGP speaker
receives a path from one of its peers, it is assured that the
path is used by the peer for forwarding and all other peers
have received the same path from this peer. This leads to
consistent routing in a BGP network.
</t>
<t>
The classical advertisement rule of sending only the best-path
does not convey the full routing state of an address prefix
present on a BGP speaker to its peers.
<list style="symbols">
<t>
In order to improve link bandwidth utilization, most BGP
implementations choose additional paths, that satisfy
certain conditions, as "multi-path", and install them
in the forwarding table. Incoming packets for that
destination are load-balanced across the best-path and
the multi-path(s). I.e., there may be paths installed in
the forwarding table that are not advertised to the
peers.
</t>
<t>
When an Autonomous System (AS) deploys a route-reflector
(<xref target="RFC4456"></xref>) instead of using full
IBGP mesh, the BGP speakers receive only the route
reflector's best-path and therefore lose information
about the best-paths of other IBGP peers.
</t>
<t>
If an IBGP path is chosen as the best-path by a
non-route-reflector BGP speaker, then the best-path is
not sent to its IBGP peers. Thus the IBGP peers learn
nothing from this BGP speaker even though it might have
other EBGP paths for that address prefix.
</t>
<t>
Even when a BGP speaker selects an EBGP path as the
best-path and advertises it to its peers, it may have
additional EBGP paths for the address prefix. Should
those paths be advertised a priori, they could be used
by the peers in the event of loss of reachability of the
best-path resulting in faster convergence.
</t>
</list>
</t>
<t>
There are extensions to the classical BGP advertisement rule
to provide additional information about the routing state of
an address prefix. A BGP speaker supporting the <xref
target="I-D.ietf-idr-best-external">best-external</xref>
extension sends its best external path to its IBGP peers when
the best-path is an IBGP path. A BGP speaker supporting the
<xref target="I-D.ietf-idr-add-paths">add-path</xref>
extension advertises multiple paths for a given address
prefix.
</t>
<t>
With best-external or add-path extensions in use, when a BGP
speaker receives a path from a peer, that path may not be the
best-path, or it may not be installed in the peer's forwarding
table. In some scenarios, knowledge of the path type - i.e.,
whether the path is the best-path, or whether the path is
installed in the forwarding table - is essential.
</t>
<t>
For instance, in a typical dual-homed VPN in primary-backup
configuration, the backup path is created by advertising the
best-external path from the backup PE with worse
LOCAL_PREF. However, when the customer adds a site in another
AS, the LOCAL_PREF information does not reach that site. As a
result, data traffic coming from that site may incorrectly be
forwarded over the backup link instead of the primary link.
</t>
<t>
Similarly when an add-path enabled peer receives multiple
paths from a peer, it does not know which one among those
paths is the best-path and which ones are installed in the
forwarding table. An exogenous monitoring system, e.g., would
require that information to properly tweak the policies on the
router to effect desired forwarding optimization.
</t>
<t>
This draft proposes marking the advertised paths by an
extended community, called Path Type community, that
encodes the path type. The path type provides the necessary
information to the BGP speakers about how the path is used by
the sender when add-path or best-external extensions are in
use.
</t>
</section>
<section title ="The BGP Path Type Community" anchor="sec:path-type">
<t>
The BGP Path Type Community is an IPv4 Address Extended Community (<xref
target="RFC4360"></xref>) defined as follows:
</t>
<t>
<list style="hanging" hangIndent="4">
<t hangText="Type Field:">
<vspace />
<vspace />
The value of the high-order octet of the extended Type
Field is 0x01, which indicates that it is transitive. The
value of low-order octet of the extended type field for
this community is TBD.
</t>
<t hangText="Value Field:">
<vspace />
<vspace />
The Value field contains two sub-fields, described below:
</t>
</list>
</t>
<figure align="center">
<artwork>
<![CDATA[
+---------------------+
| Router-ID (4 octet) |
+---------------------+
| Path type (2 octet) |
+---------------------+
]]>
</artwork>
</figure>
<t>
The Router-ID field contains the BGP identifier of the BGP
speaker that adds the Path Type community to a path.
</t>
<t>
The Path type field contains a bitfield where each bit encodes
a specific role of the path. Multiple bits may be set when a
path is used in multiple roles.
</t>
<texttable anchor="path-type-val" title="Path Type Values">
<ttcol align='center'>Value</ttcol>
<ttcol align='left'>Path type</ttcol>
<c>0x0000</c>
<c>Unknown</c>
<c>0x0001</c>
<c>Best-path</c>
<c>0x0002</c>
<c>Best-external path</c>
<c>0x0004</c>
<c>Multi-path</c>
<c>0x0008</c>
<c>Backup path</c>
<c>0x0010</c>
<c>Uninstalled path</c>
<c>0x0020</c>
<c>Unreachable path</c>
</texttable>
<t>
The best-path is defined in <xref target="RFC4271"></xref> and
the best-external path is defined in <xref
target="I-D.ietf-idr-best-external"></xref>.
</t>
<t>
A multi-path is not the best-path but installed in the
forwarding table and used for forwarding packets. We use the
convention that the best-path is not considered a multi-path.
</t>
<t>
A backup path is installed in the forwarding table, but it is not
used for forwarding until all multipath(s) and the best-path
become unreachable. Backup paths are used for fast convergence
in the event of failures.
</t>
<t>
All other reachable paths are marked as 'Uninstalled'.
</t>
<t>
Lastly, all paths that are considered unreachable are marked
as 'Unreachable'. Unreachable paths may be sent only in
special cases (such as to a monitoring application).
</t>
</section>
<section title = "Rules" anchor = "sec:rules">
<t>
<list style="symbols">
<t>
A BGP speaker MAY add the Path Type community to an
originated path.
</t>
<t>
When a BGP speaker receives a path from a peer and
propagates it without changing the NEXT_HOP to self:
<list style="symbols">
<t>
If the path contained a Path Type community, it MUST
be retained in the propagated path.
</t>
<t>
If the path did not contain a Path Type community, the
speaker MAY add a Path Type community with 'Unknown'
value.
</t>
</list>
</t>
<t>
When a path received from a peer is propagated after
changing the NEXT_HOP to self:
<list style="symbols">
<t>
If the path did not contain a Path Type community, the
Path Type community indicating the path role MAY be
added.
</t>
<t>
If the path contained a Path Type community:
<list style="symbols">
<t>
If data traffic entering the router for the given
destination may be forwarded over other paths
(e.g., for doing load balancing), then the
existing Path Type community MUST be removed. The
BGP speaker MAY add its own Path Type community.
</t>
<t>
If data traffic entering the router for the given
destination is forwarded only along the given
path, then the existing Path Type community MAY be
retained.
</t>
</list>
</t>
</list>
</t>
</list>
</t>
<t>
In all cases, when a BGP speaker adds its own Path Type
community, it sets its own router-id in the community. Note
that BGP router-id need not be unique across ASes.
</t>
<t>
The above rule-set prevents a route reflector from modifying
the Path Type community set by its client (unless the route
reflector is changing the NEXT_HOP to self).
</t>
<t>
When a peer is capable of sending only one path for a given
address prefix and it sends the path without any Path Type
community, the path MAY be considered as the best-path of the
peer. In all other cases, a path without any Path Type
community SHOULD be considered to have an 'Unknown' Path type.
</t>
<t>
A local policy might modify the above rules. For instance, if
a monitoring application peers with a BGP speaker with
add-path capability for the sole purpose of learning its paths
and their types, then the speaker may always add its own Path
Type community when it advertises the paths to that peer even
if it does not change the NEXT_HOP to self. Such overriding
policies should be used with caution if the advertised paths
may impact forwarding decisions in the network.
</t>
</section>
<section title="Operational Considerations">
<t>
If a speaker receives a path with a Path Type community with
an invalid combination of bits (e.g., both 'Multi-path' and
'Backup' bits are set), the path MUST NOT be considered
invalid. Such error cases SHOULD be logged through other
means.
</t>
<t>
An implementation SHOULD provide a configurable option for the
user to indicate whether a path should be readvertised when
its type is changed. If the user does not configure the
option, the BGP speaker MUST NOT readvertise a path just to
update its Path Type community (e.g., when a path type changes
from 'Multi-path' to 'Uninstalled' due to a change in IGP
metric).
</t>
<t>
An implementation SHOULD provide a configurable option for
removing Path Type communities from paths that are advertised
to untrusted peers.
</t>
<t>
An implementation SHOULD mark all paths for a given address
prefix consistently. If one of the paths is marked, then all
other paths SHOULD be marked.
</t>
<t>
An implementation MAY modify its best-path selection algorithm
to take path type information into account. For instance,
paths with type 'Best-path' MAY be preferred over paths of
other types. Similarly, paths of type 'Best-external' MAY be
considered ineligible for being a multipath.
</t>
</section>
<section title ="Applications" anchor="sec:applications">
<t>
In this section, we illustrate some applications that benefit
from the Path Type community proposed in this draft.
</t>
<section title ="Avoiding suboptimal routing in Inter-AS VPN"
anchor="sec:depref">
<figure anchor="fig:best-ext" title="Inter-AS VPN scenario" align="center">
<artwork>
<![CDATA[
(RD1)A/B +---+ +---+
LP=200 |RR1| |RR2|
+---+ ,,-+---+-.. _.-+---+-._
,|PE1|' `. / \ (RD3)A/B
+---+,' +---+ +---+ +---+ +---+ -> PE1 (LP=100)
A/B |CE1|. | AS1 |AR1|---|AR2| AS2 |PE3| -> PE2 (LP=100)
+---+ \ +---+ +---+ +---+ +---+
>|PE2|._ _,' `. ,'
+---+ `-....,-' `--...--'
(RD2)A/B
LP=150
]]>
</artwork>
</figure>
<t>
<xref target="fig:best-ext"/> depicts an L3VPN network that
spans two ASes: AS1 and AS2. The ASes may be connected using
either Option-B or Option-C techniques <xref target=
"RFC4364"></xref>. A customer site with equipment CE1 is
dual-homed in AS1, connected to PE1 and PE2. For prefix
A/B, the customer prefers to use the link between CE1 and
PE1. This routing preference is expressed by setting the
LOCAL_PREF of the prefix advertised by PE1 to a higher value
than that of the prefix advertised by PE2. This causes PE2
to use PE1's route as the best-path and its own EBGP path
becomes the best-external path. PE2 is configured to
advertise its best-external path. Therefore, both PEs
continue to advertise their own EBGP path. The provider uses
unique route-distinguishers for its VPNs. So PE1 and PE2
advertises different VPN prefixes: (RD1)A/B and
(RD2)A/B. Both these prefixes are advertised to PE3 in AS2.
PE3 imports both paths to its own VPN with
route-distinguisher RD3.
</t>
<t>
<list style="hanging" hangIndent="6">
<t hangText="Existing behavior:">
<vspace blankLines="1"/>
Since LOCAL_PREF is not sent across AS boundary, both
paths on PE3 have the default LOCAL_PREF of 100. As a
result the best-path selection on PE3 may boil down to
tie breaking steps and the path towards PE2, which is
the best-external path, may be chosen. Alternately, the
path from PE2 may be chosen as the multipath and may be
used for load-balancing. Therefore, some or all data
traffic entering PE3 would reach CE1 via PE2, which is
not what the customer desired.
</t>
<t hangText="Behavior with Path Type Community:">
<vspace blankLines="1"/>
When PE2 advertises its path, it adds the best-external
Path Type community. This community is preserved across
AS boundary. If option C is used, then RR1 or RR2 does
not change the NEXT_HOP and hence the community is
preserved according to the rule-set (<xref
target="sec:rules"/>). If option B is used, then the
community reaches AR1 since RR1 does not change the
NEXT_HOP. At AR1, (RD2)A/B has only one path and
forwarding traffic entering AR1 from AR2 for this
destination (determined by the outer label) would use
this path. Therefore, AR1 retains the Path Type
community set by PE2. The same applies to AR2. So at
PE3, the path to PE2 has the best-external Path Type
community and therefore PE3 can choose to not use this
path for forwarding.
</t>
</list>
</t>
<t>
If the best-path algorithm takes the Path Type community
values into account, it eliminates the need for setting
LOCAL_PREF to deprefer the bext-external path even within a
single AS. This simplifies the network design and management.
</t>
<t>
Instead of using Path Type communities, it is possible to
use policies on the border routers (AR1 and AR2 for option
B, or RR1 and RR2 for option C) to recreate the LOCAL_PREF
in AS2 (e.g., by matching on the RD and the
prefix). However, the recreated LOCAL_PREF may interfere
with the local policies set in AS2 (e.g., if there are other
paths in AS2 for A/B that the customer wants to use as
secondary paths). In addition, such policies are error-prone
and complex to manage, especially when the customer is
allowed to change the primary/backup relationships between
PE1 and PE2 on its own. The standardized mechanism of Path
Type community is free from such drawbacks.
</t>
</section>
<section title ="Monitoring applications" anchor="sec:monitoring">
<t>
A modern Service Provider (SP) network may contain thousands
of BGP routers. For planning, proper engineering and
operation of a backbone, it is a good practice to
continuously monitor the routers' states and perhaps keep a
history. Many Network Management Systems (NMS) establish
IBGP sessions with BGP speakers to collect the paths the
speaker has. When the speaker supports add-path (or
best-external), the NMS receives non-best-paths. There are
also monitoring protocols such as <xref
target="I-D.ietf-grow-bmp">BMP</xref> that similarly
receives all paths from a speaker.
</t>
<t>
When an NMS receives multiple paths for a destination, it is
important for its operation to know which path is the
best-path, which paths are installed in forwarding table,
which path is used as a backup, etc. The NMS system may run
the best-path algorithm on those paths on its own. However,
its information, especially on IGP metric, local policies,
etc., may be incomplete and hence its own calculations may
not match that of the router's. It is also noted that even
if the NMS system collected additional information to run
the best-path algorithm from the point-of-view of the router,
it would have to do so for every router in the network,
which would impose a very high computational burden on the
NMS.
</t>
<t>
When Path Type community is in use, the router provides the
required information directly, thus avoiding computational
load on the NMS as well as potential discrepancies between
the point-of-view of the router and that of the NMS.
</t>
</section>
<section title ="SDN applications" anchor="sec:sdn">
<t>
Similar to the monitoring applications, a "Software Defined
Networking" application monitors the routing state and based
on it, may change the policies on the router, or inject
additional paths, to influence the forwarding. When a BGP
speaker supports Path Type communities and add-path, an SDN
application can simply peer with the router to receive its
routing state in real-time even if the router does not
provide vendor-specific APIs for doing the same.
</t>
</section>
<section title ="Selective Best-path" anchor="sec:selectivebestpath">
<t>
When the classical BGP advertisement rule is followed, all
paths a BGP speaker considers for best-path are already
installed in the forwarding table of the peer. However, when
add-path, or best-external extensions are used, that no longer
holds. If the BGP speakers support the Path Type communities,
then the classical behavior can be reinstated by considering
only those paths in the best-path algorithm that are marked as
best-path or multi-path. Detailed discussions on the rules and
benefits of such an approach are outside the scope of this
draft.
</t>
</section>
</section>
<section title = "IANA Considerations">
<t>
<xref target="sec:path-type"></xref> defines an IPv4 Address
specific transitive extended community called the Path Type
extended community. IANA is requested to assign a sub-type
value for the Path Type extended community. The last 2 bytes
of the value field of the Path Type extended community
contains a bitfield that encodes the type of the advertised
path. IANA is expected to maintain a registry for these
bits. <xref target="sec:path-type"></xref> defines 6 of those
bits. The rest of the bits are to be assigned by IANA using
the "IETF Consensus" policy defined in <xref
target="RFC2434"></xref>.
</t>
</section>
<section title = "Security Considerations">
<t>
This document introduces no new security concerns to BGP or
other specifications referenced in this document.
</t>
</section>
<section title ="Contributors">
<t>
Adam Simpson
<vspace />
Alcatel-Lucent<vspace />
600 March Road<vspace />
Ottawa, Ontario K2K 2E6<vspace />
Canada<vspace />
Email: adam.simpson@alcatel-lucent.com<vspace />
</t>
<t>
Roberto Fragassi<vspace />
Alcatel-Lucent<vspace />
600 Mountain Avenue<vspace />
Murray Hill, New Jersey<vspace />
USA<vspace />
Email: roberto.fragassi@alcatel-lucent.com</t>
</section>
<section title ="Acknowledgments">
<t>
We would like to thank Bruno Decraene for his feedback on this work.
</t>
</section>
</middle>
<back>
<references title="Normative References">
<?rfc include="reference.RFC.2119"?>
<?rfc include="reference.RFC.4364"?>
<?rfc include="reference.RFC.2434"?>
</references>
<references title="Informative References">
<?rfc include="reference.RFC.4360"?>
<?rfc include="reference.RFC.4271"?>
<?rfc include="reference.RFC.4456"?>
<?rfc include="reference.I-D.ietf-idr-add-paths.xml"?>
<?rfc include="reference.I-D.ietf-idr-best-external.xml"?>
<?rfc include="reference.I-D.ietf-grow-bmp.xml"?>
</references>
</back>
</rfc>
| PAFTECH AB 2003-2026 | 2026-04-24 05:38:12 |