<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfcfoo.dtd">
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std"
ipr="trust200902"
updates="3031, 3107, 3209, 5036"
docName='draft-ietf-mpls-entropy-label-06'>
<front>
<title abbrev="MPLS Entropy Labels">
The Use of Entropy Labels in MPLS Forwarding
</title>
<author fullname="Kireeti Kompella" initials="K." surname="Kompella">
<organization>Juniper Networks</organization>
<address>
<postal>
<street>1194 N. Mathilda Ave.</street>
<city>Sunnyvale</city>
<region>CA</region>
<code>94089</code>
<country>US</country>
</postal>
<email>kireeti.kompella@gmail.com</email>
</address>
</author>
<author fullname="John Drake" initials="J." surname="Drake">
<organization>Juniper Networks</organization>
<address>
<postal>
<street>1194 N. Mathilda Ave.</street>
<city>Sunnyvale</city>
<region>CA</region>
<code>94089</code>
<country>US</country>
</postal>
<email>jdrake@juniper.net</email>
</address>
</author>
<author fullname="Shane Amante" initials="S." surname="Amante">
<organization>Level 3 Communications, LLC</organization>
<address>
<postal>
<street>1025 Eldorado Blvd</street>
<city>Broomfield</city>
<region>CO</region>
<code>80021</code>
<country>US</country>
</postal>
<email>shane@level3.net</email>
</address>
</author>
<author fullname="Wim Henderickx" initials="W." surname="Henderickx">
<organization>Alcatel-Lucent</organization>
<address>
<postal>
<street>Copernicuslaan 50</street>
<city>2018 Antwerp</city>
<country>Belgium</country>
</postal>
<email>wim.henderickx@alcatel-lucent.com</email>
</address>
</author>
<author fullname="Lucy Yong" initials="L." surname="Yong">
<organization>Huawei USA</organization>
<address>
<postal>
<street>5340 Legacy Dr.</street>
<city>Plano</city>
<region>TX</region>
<code>75024</code>
<country>US</country>
</postal>
<email>lucy.yong@huawei.com</email>
</address>
</author>
<date year="2012"/>
<area>Routing</area>
<keyword>Internet-Draft</keyword>
<keyword>entropy hash ecmp load balancing</keyword>
<abstract>
<t>
Load balancing is a powerful tool for engineering traffic
across a network. This memo suggests ways of improving load
balancing across MPLS networks using the concept of "entropy
labels". It defines the concept, describes why entropy labels
are useful, enumerates properties of entropy labels that allow
maximal benefit, and shows how they can be signaled and used
for various applications. This document updates RFCs 3031,
3107, 3209 and 5036.
</t>
</abstract>
</front>
<middle>
<section anchor="intro" title="Introduction">
<t>
Load balancing, or multi-pathing, is an attempt to balance traffic
across a network by allowing the traffic to use multiple
paths. Load balancing has several benefits: it eases capacity
planning; it can help absorb traffic surges by spreading them
across multiple paths; it allows better resilience by offering
alternate paths in the event of a link or node failure.
</t>
<t>
As providers scale their networks, they use several techniques to
achieve greater bandwidth between nodes. Two widely used
techniques are: Link Aggregation Group (LAG) and Equal-Cost
Multi-Path (ECMP). LAG is used to bond together several physical
circuits between two adjacent nodes so they appear to higher-layer
protocols as a single, higher bandwidth 'virtual' pipe. ECMP is
used between two nodes separated by one or more hops, to allow
load balancing over several shortest paths in the network. This
is typically obtained by arranging IGP metrics such that there are
several equal cost paths between source-destination pairs. Both
of these techniques may, and often do, co-exist in various parts
of a given provider's network, depending on various choices made
by the provider.
</t>
<t>
A very important requirement when load balancing is that packets
belonging to a given 'flow' must be mapped to the same path, i.e.,
the same exact sequence of links across the network. This is to
avoid jitter, latency and re-ordering issues for the flow. What
constitutes a flow varies considerably. A common example of a
flow is a TCP session. Other examples are an L2TP session
corresponding to a given broadband user, or traffic within an ATM
virtual circuit.
</t>
<t>
To meet this requirement, a node uses certain fields, termed
'keys', within a packet's header as input to a load balancing
function (typically a hash function) that selects the path for all
packets in a given flow. The keys chosen for the load balancing
function depend on the packet type; a typical set (for IP packets)
is the IP source and destination addresses, the protocol type, and
(for TCP and UDP traffic) the source and destination port numbers.
An overly conservative choice of fields may lead to many flows
mapping to the same hash value (and consequently poorer load
balancing); an overly aggressive choice may map a flow to multiple
values, potentially violating the above requirement.
</t>
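<t>
As an illustration only, the following Python sketch shows one way
the keys listed above could be fed to a hash to select a path.  The
function name, the choice of SHA-256, and the modulo mapping are
assumptions made for this example; real forwarding hardware
typically uses much cheaper hash functions.
</t>
<figure>
<artwork><![CDATA[
```python
import hashlib


def select_path(src_ip, dst_ip, protocol, src_port, dst_port, num_paths):
    """Map a flow's keys to one of num_paths equal-cost paths.

    Hypothetical helper: every packet of a flow carries the same keys,
    so every packet of that flow is mapped to the same path index.
    """
    key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 32 bits of the digest as the load balancing value.
    return int.from_bytes(digest[:4], "big") % num_paths
```
]]></artwork>
</figure>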
<t>
For MPLS networks, most of the same principles (and benefits)
apply. However, finding useful keys in a packet for the purpose
of load balancing can be more of a challenge. In many cases, MPLS
encapsulation may require fairly deep inspection of packets to
find these keys at transit Label Switching Routers (LSRs).
</t>
<t>
One way to eliminate the need for this deep inspection is to have
the ingress LSR of an MPLS Label Switched Path extract the
appropriate keys from a given packet, input them to its load
balancing function, and place the result in an additional label,
termed the 'entropy label', as part of the MPLS label stack it
pushes onto that packet.
</t>
<t>
The packet's entire MPLS label stack can then be used by transit
LSRs to perform load balancing, as the entropy label introduces
the right level of "entropy" into the label stack.
</t>
<t>
There are five key reasons why this is beneficial:
<list style="numbers">
<t>
at the ingress LSR, MPLS encapsulation hasn't yet occurred, so
deep inspection is not necessary;
</t>
<t>
the ingress LSR has more context and information about
incoming packets than transit LSRs;
</t>
<t>
ingress LSRs usually operate at lower bandwidths than transit
LSRs, allowing them to do more work per packet;
</t>
<t>
transit LSRs do not need to perform deep packet inspection and
can load balance effectively using only a packet's MPLS label
stack; and
</t>
<t>
transit LSRs, not having the full context that an ingress LSR
does, have the hard choice between potentially misinterpreting
fields in a packet as valid keys for load balancing (causing
packet ordering problems) or adopting a conservative approach
(giving rise to sub-optimal load balancing). Entropy labels
relieve them of making this choice.
</t>
</list>
</t>
<t>
This memo describes why entropy labels are needed and defines the
properties of entropy labels; in particular how they are generated
and received, and the expected behavior of transit LSRs. Finally,
it describes in general how signaling works and what needs to be
signaled, as well as specifics for the signaling of entropy labels
for LDP (<xref target="RFC5036"/>), BGP (<xref
target="RFC3107"/>), and RSVP-TE (<xref target="RFC3209"/>).
</t>
<section anchor="conv" title="Conventions used">
<t>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described
in <xref target="RFC2119"/>.
</t>
<t>
The following acronyms are used:
<list>
<t>BoS: Bottom of Stack</t>
<t>CE: Customer Edge device</t>
<t>ECMP: Equal Cost Multi-Path</t>
<t>EL: Entropy Label</t>
<t>ELC: Entropy Label Capability</t>
<t>ELI: Entropy Label Indicator</t>
<t>FEC: Forwarding Equivalence Class</t>
<t>LAG: Link Aggregation Group</t>
<t>LER: Label Edge Router</t>
<t>LSP: Label Switched Path</t>
<t>LSR: Label Switching Router</t>
<t>PE: Provider Edge Router</t>
<t>PW: Pseudowire</t>
<t>PHP: Penultimate Hop Popping</t>
<t>TC: Traffic Class</t>
<t>TTL: Time-to-Live</t>
<t>UHP: Ultimate Hop Popping</t>
<t>VPLS: Virtual Private LAN (Local Area Network) Service</t>
<t>VPN: Virtual Private Network</t>
</list>
</t>
<t>
The term ingress (or egress) LSR is used interchangeably with
ingress (or egress) LER. The term application throughout the
text refers to an MPLS application (such as a VPN or VPLS).
</t>
<t>
A label stack (say of three labels) is denoted by <L1, L2,
L3>, where L1 is the "outermost" label and L3 the innermost
(closest to the payload). Packet flows are depicted left to
right, and signaling is shown right to left (unless otherwise
indicated).
</t>
<t>
The term 'label' is used both for the entire 32-bit label stack
entry and the 20-bit label field within a label stack entry. It
should be clear from the context which is meant.
</t>
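<t>
To make the distinction between the two senses of 'label' concrete,
the sketch below packs and unpacks a 32-bit label stack entry using
the RFC 3032 layout (20-bit label, 3-bit TC, 1-bit Bottom of Stack,
8-bit TTL).  The function names are hypothetical and chosen only
for this example.
</t>
<figure>
<artwork><![CDATA[
```python
import struct


def encode_lse(label, tc, bos, ttl):
    """Pack one label stack entry into 4 bytes in network byte order."""
    if not (0 <= label < 2**20 and 0 <= tc < 8 and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (tc << 9) | (int(bool(bos)) << 8) | ttl
    return struct.pack("!I", word)


def decode_lse(data):
    """Unpack 4 bytes into a (label, tc, bos, ttl) tuple."""
    (word,) = struct.unpack("!I", data)
    return (word >> 12, (word >> 9) & 0x7, bool((word >> 8) & 1), word & 0xFF)
```
]]></artwork>
</figure>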
</section>
<section title="Motivation">
<t>
MPLS is a very successful generic forwarding substrate that
transports several dozen types of protocols, most notably: IP,
PWs, VPLS and IP VPNs. Within each type of protocol, there
typically exist several variants, each with a different set of
load balancing keys, e.g., for IP: IPv4, IPv6, IPv6 in IPv4,
etc.; for PWs: Ethernet, ATM, Frame-Relay, etc. There are also
several different types of Ethernet over PW encapsulation, ATM
over PW encapsulation, etc. as well. Finally, given the
popularity of MPLS, it is likely that it will continue to be
extended to transport new protocols.
</t>
<t>
Currently, each transit LSR along the path of a given LSP has to
try to infer the underlying protocol within an MPLS packet in
order to extract appropriate keys for load balancing.
Unfortunately, if the transit LSR is unable to infer the MPLS
packet's protocol (as is often the case), it will typically use
the topmost (or all) MPLS labels in the label stack as keys for
the load balancing function. The result may be an extremely
inequitable distribution of traffic across equal-cost paths
exiting that LSR. This is because MPLS labels are generally
fairly coarse-grained forwarding labels that typically describe
a next-hop, or provide some form of demultiplexing and/or forwarding
function, and do not describe the packet's underlying protocol.
</t>
<t>
On the other hand, an ingress LSR (e.g., a PE router) has
detailed knowledge of a packet's contents, typically through a
priori configuration of the encapsulation(s) that are expected
at a given PE-CE interface, (e.g., IPv4, IPv6, VPLS, etc.).
Ingress LSRs also have more flexible forwarding hardware. PE routers
need this information and these capabilities to:
<list>
<t>
a) apply the required services for the CE;
</t>
<t>
b) discern the packet's CoS forwarding treatment;
</t>
<t>
c) apply filters to forward or block traffic to/from the CE;
</t>
<t>
d) forward routing/control traffic to an onboard
management processor; and,
</t>
<t>
e) load-balance the traffic on its uplinks to transit LSRs
(e.g., P routers).
</t>
</list>
By knowing the expected encapsulation types, an ingress LSR
can apply a more specific set of payload parsing routines
to extract the keys appropriate for a given protocol. This
allows for significantly improved accuracy in determining the
appropriate load balancing behavior for each protocol.
</t>
<t>
If the ingress LSR were to capture the flow information so
gathered in a convenient form for downstream transit LSRs,
transit LSRs could remain completely oblivious to the contents
of each MPLS packet, and use only the captured flow information
to perform load balancing. In particular, there will be no
reason to duplicate an ingress LSR's complex packet/payload
parsing functionality in a transit LSR. This will result in
less complex transit LSRs, enabling them to more easily scale to
higher forwarding rates, larger port density, lower power
consumption, etc. The idea in this memo is to capture this flow
information as a label, the so-called entropy label.
</t>
<t>
Ingress LSRs can also adapt more readily to new protocols and
extract the appropriate keys to use for load balancing packets
of those protocols. This means that deploying new protocols or
services in edge devices requires fewer concomitant changes in
the core, resulting in higher edge service velocity and at the
same time more stable core networks.
</t>
</section>
</section>
<section title="Approaches">
<t>
There are two main approaches to encoding load balancing
information in the label stack. The first allocates multiple
labels for a particular Forwarding Equivalence Class (FEC). These
labels are equivalent in terms of forwarding semantics, but having
multiple labels allows flexibility in assigning labels to flows
belonging to the same FEC. This approach has the advantage that
the label stack has the same depth whether or not one uses
label-based load balancing; and so, consequently, there is no
change to forwarding operations on transit and egress LSRs.
However, it has a major drawback in that there is a significant
increase in both signaling and forwarding state.
</t>
<t>
The other approach encodes the load balancing information as an
additional label in the label stack, thus increasing the depth of
the label stack by one. With this approach, there is minimal
change to signaling state for a FEC; also, there is no change in
forwarding operations in transit LSRs, and no increase of
forwarding state in any LSR. The only purpose of the additional
label is to increase the entropy in the label stack, so this is
called an "entropy label". This memo focuses solely on this
approach.
</t>
<t>
This latter approach uses upstream generated entropy labels, which
may conflict with downstream allocated application labels. There
are a few approaches to deal with this: 1) allocate a pair of
labels for each FEC, one that must have an entropy label below it,
and one that must not; 2) use a label (the "Entropy Label
Indicator") to indicate that the next label is an entropy label;
and 3) allow entropy labels only where there is no possible
confusion. The first doubles control and data plane state in the
network; the last is too restrictive. The approach taken here is
the second. In making both the above choices, the trade-off is to
increase label stack depth rather than control and data plane
state in the network.
</t>
<t>
Finally, one may choose to associate ELs with MPLS tunnels (LSPs),
or with MPLS applications (e.g., VPNs). (What this entails is
described in later sections.) We take the former approach, for
the following reasons:
<list style='numbers'>
<t>
There are a small number of tunneling protocols for MPLS, but
a large and growing number of applications. Defining ELs on a
tunnel basis means simpler standards, lower development,
interoperability and testing efforts.
</t>
<t>
As a consequence, there will be much less churn in the network
as new applications (services) are defined and deployed.
</t>
<t>
Processing application labels in the data plane is more
complex than processing tunnel labels. Thus, it is preferable
to burden the latter rather than the former with EL
processing.
</t>
<t>
Associating ELs with tunnels makes it simpler to deal with
hierarchy, be it LDP-over-RSVP-TE or Carrier's Carrier VPNs.
Each layer in the hierarchy can choose independently whether
or not it wants ELs.
</t>
</list>
The cost of this approach is that ELIs will be mandatory; again,
the trade-off is the size of the label stack. To summarize, the
net increase in the label stack to use entropy labels is two: one
reserved label for the ELI, and the entropy label itself.
</t>
</section>
<section title="Entropy Labels and Their Structure" anchor='el-struct'>
<t>
An entropy label (as used here) is a label:
<list style="numbers">
<t>that is not used for forwarding;</t>
<t>that is not signaled; and</t>
<t>
whose only purpose in the label stack is to provide 'entropy'
to improve load balancing.
</t>
</list>
</t>
<t>
Entropy labels are generated by an ingress LSR, based entirely on
load balancing information. However, they MUST NOT have values in
the reserved label space (0-15) [IANA MPLS Label Values].
</t>
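<t>
One possible way for an ingress LSR to turn the output of its load
balancing function into a label value that stays outside the
reserved range is sketched below.  The function name and the modulo
mapping are assumptions for illustration; this memo does not
mandate any particular mapping.
</t>
<figure>
<artwork><![CDATA[
```python
RESERVED_MAX = 15      # labels 0-15 are reserved
LABEL_SPACE = 2 ** 20  # the label field is 20 bits wide


def entropy_label_from_lb(lb):
    """Map a non-negative load balancing value to a label in 16..2^20-1."""
    usable = LABEL_SPACE - (RESERVED_MAX + 1)
    return (RESERVED_MAX + 1) + (lb % usable)
```
]]></artwork>
</figure>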
<t>
Since entropy labels are generated by an ingress LSR, an egress
LSR MUST be able to distinguish unambiguously between entropy
labels and application labels. To accomplish this, it is REQUIRED
that the label immediately preceding an entropy label (EL) in the
MPLS label stack be an 'entropy label indicator' (ELI), where
preceding means closer to the top of the label stack (farther from
bottom of stack indication). The ELI is a reserved label with
value (TBD by IANA). How to set values of the TTL, TC and 'Bottom
of Stack' (BoS) fields (<xref target='RFC3032'/>) for the ELI and
for ELs is discussed in <xref target='ingress-lsr'/>.
</t>
<t>
Entropy labels are useful for pseudowires (<xref
target="RFC4447"/>). <xref target="RFC6391"/> explains how
entropy labels can be used for RFC 4447-style pseudowires, and
thus is complementary to this memo, which focuses on how entropy
labels can be used for tunnels, and thus for all other MPLS
applications.
</t>
</section>
<section title="Data Plane Processing of Entropy Labels">
<section anchor="egress-lsr" title="Egress LSR">
<t>
Suppose egress LSR Y is capable of processing entropy labels for
a tunnel. Y indicates this to all ingresses via signaling (see
<xref target='sig'/>). Y MUST be prepared to deal both with
packets with an imposed EL and those without; the ELI will
distinguish these cases. If a particular ingress chooses not to
impose an EL, Y's processing of the received label stack (which
might be empty) is as if Y chose not to accept ELs.
</t>
<t>
If an ingress X chooses to impose an EL, then Y will receive a
tunnel termination packet with label stack <TL, ELI, EL>
<remaining packet header>. Y recognizes TL as the label
it distributed to its upstreams for the tunnel, and pops it.
(Note that TL may be the implicit null label, in which case it
doesn't appear in the label stack.) Y then recognizes the ELI
and pops two labels: the ELI and the EL. Y then processes the
remaining packet header as normal; this may require further
processing of tunnel termination, perhaps with further ELI+EL
pairs. When processing the final tunnel termination, Y MAY
enqueue the packet based on that tunnel TL's or ELI's TC value,
and MAY use the tunnel TL's or ELI's TTL to compute the TTL of
the remaining packet header. The EL's TTL MUST be ignored.
</t>
<t>
If any ELI processed by Y has the BoS bit set, Y MUST discard the
packet, and MAY log an error. The EL's BoS bit will indicate
whether or not there are more labels in the stack.
</t>
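<t>
The egress behavior described above can be sketched as follows for
a single level of tunnel termination.  This is a minimal
illustration, not a normative algorithm: the LSE tuple, the
function and exception names, and the eli_value parameter (a
stand-in for the ELI value to be assigned by IANA) are all
assumptions made for the example.
</t>
<figure>
<artwork><![CDATA[
```python
from collections import namedtuple

# One label stack entry; stacks are lists ordered top-first.
LSE = namedtuple("LSE", "label tc bos ttl")


class DiscardPacket(Exception):
    """Raised when the packet must be discarded."""


def egress_pop(stack, tunnel_label, eli_value):
    """Return the label stack left after Y terminates one tunnel.

    eli_value is a placeholder for the ELI value (TBD by IANA).
    """
    rest = list(stack)
    # TL is absent if Y advertised implicit null and PHP was performed.
    if rest and rest[0].label == tunnel_label:
        rest.pop(0)
    if rest and rest[0].label == eli_value:
        if rest[0].bos:
            raise DiscardPacket("ELI has the BoS bit set")
        # Pop two labels: the ELI and the EL that follows it.
        del rest[:2]
    return rest
```
]]></artwork>
</figure>
<t>
Further levels of tunnel hierarchy would repeat the same steps on
the returned stack.
</t>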
</section>
<section anchor="ingress-lsr" title="Ingress LSR">
<t>
If an egress LSR Y indicates via signaling that it can process
ELs on a particular tunnel, an ingress LSR X can choose whether
or not to insert ELs for packets going into that tunnel. Y MUST
handle both cases.
</t>
<t>
The steps that X performs to insert ELs are as follows:
<list style='numbers'>
<t>
On an incoming packet, identify the application to which the
packet belongs; based on this, pick appropriate fields as
input to the load balancing function; apply the load
balancing function to these input fields, and let LB be the
output.
</t>
<t>
Determine the application label AL (if any). Push
<AL> onto the packet.
</t>
<t>
Based on the application, the load balancing output LB and
other factors, determine the egress LSR Y, the tunnel to Y,
the specific interface to the next hop, and thus the tunnel
label TL. Use LB to generate the entropy label EL.
</t>
<t>
If, for the chosen tunnel, Y has not indicated that it can
process ELs, push <TL> onto the packet. If Y has
indicated that it can process ELs for the tunnel, push
<TL, ELI, EL> onto the packet. X SHOULD put the same
TTL and TC fields for the ELI as it does for TL. X MAY
choose different values for the TTL and TC fields if it is
known that the ELI will not be exposed as the top label at
any point along the LSP (as may happen in cases where PHP is
used and the ELI and EL are not stripped at the penultimate
hop; see <xref target='php-lsr'/>). The BoS bit for the ELI
MUST be zero. The TTL for the EL MUST be zero to ensure
that it is not used inadvertently for forwarding. The TC
for the EL may be any value. The BoS bit for the EL depends
on whether or not there are more labels in the label stack.
</t>
<t>
X then determines whether further tunnel hierarchy is
needed; if so, X goes back to step 3, possibly with a new
egress Y for the new tunnel. Otherwise, X is done, and
sends out the packet.
</t>
</list>
</t>
<t>
Notes:
<list style='letters'>
<t>
X computes load balancing information and generates the EL
based on the incoming application packet, even though the
signaling of EL capability is associated with tunnels.
</t>
<t>
X MAY insert several entropy labels in the stack (each, of
course, preceded by an ELI), potentially one for each
hierarchical tunnel, provided that the egress for that
tunnel has indicated that it can process ELs for that
tunnel.
</t>
<t>
X MUST NOT include an entropy label for a given tunnel
unless the egress LSR Y has indicated that it can process
entropy labels for that tunnel.
</t>
<t>
The signaling and use of entropy labels in one direction
(signaling from Y to X, and data path from X to Y) is
completely independent of the signaling and use of entropy
labels in the reverse direction (signaling from X to Y, and
data path from Y to X).
</t>
</list>
</t>
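<t>
As a rough sketch of steps 2 through 4 for a single tunnel, the
following Python function builds the label stack that X pushes.
The names, the LSE tuple, the default TTL and TC values, the modulo
mapping from LB to EL, and the eli_value parameter (a stand-in for
the ELI value to be assigned by IANA) are assumptions made for this
example.
</t>
<figure>
<artwork><![CDATA[
```python
from collections import namedtuple

# One label stack entry; stacks are lists ordered top-first.
LSE = namedtuple("LSE", "label tc bos ttl")


def ingress_stack(tunnel_label, app_label, lb, egress_elc, eli_value,
                  ttl=255, tc=0):
    """Build the label stack X pushes for one tunnel.

    app_label may be None; lb is the load balancing output;
    egress_elc says whether Y signaled ELC for this tunnel.
    """
    stack = []
    if app_label is not None:
        stack.append(LSE(app_label, tc, True, ttl))
    if egress_elc:
        el = 16 + lb % (2 ** 20 - 16)  # keep EL out of the reserved range
        el_is_bottom = not stack
        # ELI: same TTL and TC as TL, BoS = 0.  EL: TTL = 0.
        stack[:0] = [LSE(eli_value, tc, False, ttl),
                     LSE(el, 0, el_is_bottom, 0)]
    # TL is at the bottom only if nothing else was pushed below it.
    stack.insert(0, LSE(tunnel_label, tc, not stack, ttl))
    return stack
```
]]></artwork>
</figure>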
</section>
<section anchor="transit-lsr" title="Transit LSR">
<t>
Transit LSRs MAY operate with no change in forwarding behavior.
The following are suggestions for optimizations that improve
load balancing, reduce the amount of packet data processed,
and/or enhance backward compatibility.
</t>
<t>
If a transit LSR recognizes the ELI, it MAY choose to load
balance solely on the following label (the EL); otherwise, it
SHOULD use as much of the whole label stack as feasible as keys
for the load balancing function. In any case, reserved labels
MUST NOT be used as keys for the load balancing function.
</t>
<t>
Some transit LSRs look beyond the label stack for better load
balancing information. This is a simple, backward compatible
approach in networks where some ingress LSRs impose ELs and
others don't. However, this is of limited incremental value if
an EL is indeed present, and requires more packet processing
from the LSR. A transit LSR MAY choose to parse the label stack
for the presence of the ELI, and look beyond the label stack
only if it does not find it, thus retaining the old behavior
when needed, yet avoiding unnecessary work if not needed.
</t>
<t>
As stated in <xref target='egress-lsr'/> and <xref
target='sig'/>, an egress LSR that signals both ELC and implicit
null MUST pop the ELI and the next label if it encounters a
packet with the ELI as the topmost label. Any other LSR
(including PHP LSRs) MUST drop such packets, as per section 3.18
of <xref target='RFC3031'/>.
</t>
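<t>
The key selection described above for a transit LSR could look
roughly like the sketch below.  The function name and the
eli_value parameter (a stand-in for the ELI value to be assigned by
IANA; None models an LSR that does not recognize the ELI) are
assumptions for illustration.  The sketch does not model the case
of an ELI received as the topmost label.
</t>
<figure>
<artwork><![CDATA[
```python
def load_balancing_keys(labels, eli_value=None):
    """Choose load balancing keys from label values listed top-first."""
    if eli_value is not None:
        # An ELI-aware LSR may hash solely on the label after the ELI.
        for i, label in enumerate(labels[:-1]):
            if label == eli_value:
                return (labels[i + 1],)
    # Otherwise use the whole stack, skipping reserved labels (0-15).
    return tuple(label for label in labels if label > 15)
```
]]></artwork>
</figure>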
</section>
<section anchor='php-lsr' title='Penultimate Hop LSR'>
<t>
No change is needed at penultimate hop LSRs. However, a PHP LSR
that recognizes the ELI MAY choose to pop the ELI and following
label (which should be an entropy label) in addition to popping
the tunnel label, provided that doing so doesn't diminish its
ability to load balance on the next hop.
</t>
</section>
</section>
<section anchor="sig" title="Signaling for Entropy Labels">
<t>
An egress LSR Y can signal to ingress LSR(s) its ability to
process entropy labels (henceforth called "Entropy Label
Capability" or ELC) on a given tunnel. In particular, even if Y
signals an implicit null label, indicating that PHP is to be
performed, Y MUST be prepared to pop the ELI and EL.
</t>
<t>
Note that Entropy Label Capability may be asymmetric: if LSRs X
and Y are at opposite ends of a tunnel, X may be able to process
entropy labels, whereas Y may not. The signaling extensions below
allow for this asymmetry.
</t>
<t>
For an illustration of signaling and forwarding with entropy
labels, see <xref target='sig-forw'/>.
</t>
<section anchor="ldp" title="LDP Signaling">
<t>
A new LDP TLV (<xref target="RFC5036"/>) is defined to signal an
egress's ability to process entropy labels. This is called the
ELC TLV, and may appear as an Optional Parameter of the Label
Mapping Message TLV.
</t>
<t>
The presence of the ELC TLV in a Label Mapping Message indicates
to ingress LSRs that the egress LSR can process entropy labels
for the associated LDP tunnel. The ELC TLV has Type (TBD by
IANA) and Length 0.
</t>
<t>
<figure anchor="el_sub_tlv" title="Entropy Label Capability TLV">
<preamble>
The structure of the ELC TLV is shown below.
</preamble>
<artwork>
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|U|F| Type (TBD) | Length (0) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
</artwork>
</figure>
</t>
<t>
where:
<list style="empty">
<t>
U: Unknown bit. This bit MUST be set to 1. If the ELC TLV
is not understood by the receiver, then it MUST be ignored.
</t>
<t>
F: Forward bit. This bit MUST be set to 1. Since
the ELC TLV is going to be propagated hop-by-hop, it should
be forwarded even by nodes that may not understand it.
</t>
<t>
Type: Type field. To be assigned by IANA.
</t>
<t>
Length: Length field. This field specifies the total length
in octets of the ELC TLV, and is currently defined to be 0.
</t>
</list>
</t>
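<t>
For illustration, the ELC TLV shown in the figure could be
serialized as in the sketch below.  The function name is
hypothetical, and the TLV type is passed as a parameter because the
actual value is to be assigned by IANA.
</t>
<figure>
<artwork><![CDATA[
```python
import struct


def encode_elc_tlv(tlv_type):
    """Serialize an ELC TLV: U=1, F=1, 14-bit Type, 16-bit Length of 0.

    tlv_type is a placeholder for the Type value (TBD by IANA).
    """
    if not 0 <= tlv_type < 2 ** 14:
        raise ValueError("Type must fit in 14 bits")
    first_half = (1 << 15) | (1 << 14) | tlv_type  # U bit, F bit, Type
    return struct.pack("!HH", first_half, 0)
```
]]></artwork>
</figure>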
<section title='Processing the ELC TLV'>
<t>
An LSR that receives a Label Mapping with the ELC TLV but does
not understand it MUST propagate it intact to its neighbors
and MUST NOT send a notification to the sender (following the
meaning of the U- and F-bits).
</t>
<t>
An LSR X may receive multiple Label Mappings for a given FEC F
from its neighbors. In its turn, X may advertise a Label
Mapping for F to its neighbors. If X understands the ELC TLV,
and if any of the advertisements it received for FEC F does
not include the ELC TLV, X MUST NOT include the ELC TLV in its
own advertisements of F. If all the advertised Mappings for F
include the ELC TLV, then X MUST advertise its Mapping for F
with the ELC TLV. If any of X's neighbors resends its
Mapping, sends a new Mapping or Withdraws a previously
advertised Mapping for F, X MUST re-evaluate the status of ELC
for FEC F, and, if there is a change, X MUST re-advertise its
Mapping for F with the updated status of ELC.
</t>
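<t>
The propagation rule above amounts to a logical AND over the
Mappings received for a FEC.  A minimal sketch follows, in which
the function name and the dictionary representation are assumptions
for the example.  It returns False when no Mapping has been
received; an LSR that originates the Mapping as egress decides
based on its own capability, which this sketch does not model.
</t>
<figure>
<artwork><![CDATA[
```python
def advertise_elc(received_mappings):
    """Decide whether X includes the ELC TLV in its Mapping for FEC F.

    received_mappings maps each neighbor to True if that neighbor's
    Label Mapping for F carried the ELC TLV, and to False otherwise.
    """
    return bool(received_mappings) and all(received_mappings.values())
```
]]></artwork>
</figure>
<t>
Re-running this function whenever a neighbor resends, replaces, or
withdraws its Mapping gives the re-evaluation that the text
requires.
</t>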
</section>
</section>
<section anchor="bgp" title="BGP Signaling">
<t>
When BGP <xref target="RFC4271"/> is used for distributing
Network Layer Reachability Information (NLRI) as described in,
for example, <xref target="RFC3107"/>, the BGP UPDATE message
may include the ELC attribute as part of the Path Attributes.
This is an optional, transitive BGP attribute of type (to be
assigned by IANA). The inclusion of this attribute with an NLRI
indicates that the advertising BGP router can process entropy
labels as an egress LSR for all routes in that NLRI.
</t>
<t>
A BGP speaker S that originates an UPDATE should include the
ELC attribute only if both of the following are true:
<list style='format A%d:'>
<t>
S sets the BGP NEXT_HOP attribute to itself; AND
</t>
<t>
S can process entropy labels.
</t>
</list>
</t>
<t>
Suppose a BGP speaker T receives an UPDATE U with the ELC
attribute. T has two choices. T can simply re-advertise U
with the ELC attribute if either of the following is true:
<list style='format B%d:'>
<t>
T does not change the NEXT_HOP attribute; OR
</t>
<t>
T simply swaps labels without popping the entire label
stack and processing the payload below.
</t>
</list>
An example of the use of B1 is Route Reflectors.
</t>
<t>
However, if T changes the NEXT_HOP attribute for U and in the
data plane pops the entire label stack to process the payload, T
MAY include an ELC attribute for UPDATE U' if both of the
following are true:
<list style='format C%d:'>
<t>
T sets the NEXT_HOP attribute of U' to itself; AND
</t>
<t>
T can process entropy labels.
</t>
</list>
Otherwise, T MUST remove the ELC attribute.
</t>
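<t>
The conditions B1, B2, C1 and C2 above can be summarized as a
small decision function.  This is only a sketch: the function and
parameter names are hypothetical, and the result indicates whether
T is permitted to keep the ELC attribute, not whether it is
obliged to.
</t>
<figure>
<artwork><![CDATA[
```python
def may_keep_elc(changes_next_hop, swaps_only, next_hop_self,
                 can_process_els):
    """Return True if T may re-advertise the route with the ELC attribute."""
    # B1: NEXT_HOP unchanged, or B2: T only swaps labels.
    if not changes_next_hop or swaps_only:
        return True
    # T pops the whole stack: C1 and C2 must both hold.
    return next_hop_self and can_process_els
```
]]></artwork>
</figure>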
</section>
<section title="RSVP-TE Signaling" anchor='rsvp-te'>
<t>
Entropy Label support is signaled in RSVP-TE <xref
target="RFC3209"/> using the Entropy Label Capability (ELC) flag
in the Attribute Flags TLV of the LSP_ATTRIBUTES object <xref
target="RFC5420"/>. The presence of the ELC flag in a Path
message indicates that the ingress can process entropy labels in
the upstream direction; this only makes sense for a
bidirectional LSP and MUST be ignored otherwise. The presence
of the ELC flag in a Resv message indicates that the egress can
process entropy labels in the downstream direction.
</t>
<t>
The bit number for the ELC flag is to be assigned by IANA.
</t>
</section>
<section title="Multicast LSPs and Entropy Labels"
anchor='mlsp'>
<t>
Multicast LSPs <xref target="RFC4875"/>, <xref
target='RFC6388'/> typically do not use ECMP for load balancing,
as the combination of replication and multipathing can lead to
duplicate traffic delivery. However, these LSPs can traverse
bundled links <xref target="RFC4201"/> and LAGs. In both these
cases, load balancing is useful, and hence entropy labels can be
of value for multicast LSPs.
</t>
<t>
The methodology defined for entropy labels here will be used for
multicast LSPs; however, the details of signaling and processing
ELs for multicast LSPs will be specified in a companion
document.
</t>
</section>
</section>
<section title="Operations, Administration, and Maintenance (OAM) and Entropy Labels">
<t>
Generally OAM comprises a set of functions operating in the data
plane to allow a network operator to monitor its network
infrastructure and to implement mechanisms in order to enhance the
general behavior and the level of performance of its network,
e.g., the efficient and automatic detection, localization,
diagnosis and handling of defects.
</t>
<t>
Currently defined OAM mechanisms for MPLS include LSP
Ping/Traceroute <xref target="RFC4379"/> and Bidirectional Forwarding
Detection (BFD) for MPLS <xref target="RFC5884"/>. The latter
provides connectivity verification between the endpoints of an
LSP, and recommends establishing a separate BFD session for every
path between the endpoints.
</t>
<t>
The LSP traceroute procedures of <xref target="RFC4379"/> allow an
ingress LSR to obtain label ranges that can be used to send
packets on every path to the egress LSR. It works by having the
ingress LSR sequentially ask the transit LSRs along a particular
path to a given egress LSR to return a label range such that the
inclusion of a label in that range in a packet will cause the
replying transit LSR to send that packet out the egress interface
for that path. The ingress provides the label range returned by
transit LSR N to transit LSR N + 1, which returns a label range
which is less than or equal in span to the range provided to it.
This process iterates until the penultimate transit LSR replies to
the ingress LSR with a label range that is acceptable to it and to
all LSRs along the path preceding it for forwarding a packet along the
path.
</t>
<t>
However, the LSP traceroute procedures do not specify where in the
label stack the value from the label range is to be placed,
whether deep packet inspection is allowed and if so, which keys
and key values are to be used.
</t>
<t>
This memo updates LSP traceroute by specifying that the value from
the label range is to be placed in the entropy label. Deep packet
inspection is thus not necessary, although an LSR may use it,
provided it does so consistently, i.e., if the label range to go to
a given downstream LSR is computed with deep packet inspection,
then the data path should use the same approach and the same keys.
</t>
<t>
In order to have a BFD session on a given path, a value from the
label range for that path should be used as the EL value for BFD
packets sent on that path.
</t>
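<t>
As an illustration of placing a value from the label range in the
entropy label, the sketch below builds the label stack for a BFD
packet using the 32-bit label stack entry layout of
<xref target="RFC3032"/>. Several things here are assumptions made
for the example: the ELI value (which this memo leaves to IANA), the
EL carrying a TTL of zero and sitting at the bottom of the stack, and
the helper names lse and bfd_label_stack.
<figure>
<artwork><![CDATA[
```python
import struct
from typing import Tuple

# Assumption: the ELI value is still to be assigned by IANA in this memo;
# 7 is used here purely as a placeholder.
ELI = 7


def lse(label: int, tc: int = 0, bottom: bool = False, ttl: int = 0) -> bytes:
    """Encode one label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def bfd_label_stack(tunnel_label: int, label_range: Tuple[int, int],
                    ttl: int = 255) -> bytes:
    """Build <TL, ELI, EL>, taking the EL from the path's label range."""
    low, high = label_range
    if low > high:
        raise ValueError("empty label range")
    el = low  # any value in [low, high] selects the same path
    # Assumption: the EL is the bottom of the stack and carries TTL 0.
    return (lse(tunnel_label, ttl=ttl)
            + lse(ELI, ttl=ttl)
            + lse(el, bottom=True, ttl=0))


stack = bfd_label_stack(tunnel_label=100, label_range=(2000, 4000))
labels = [word >> 12 for (word,) in struct.iter_unpack("!I", stack)]
print(labels)  # [100, 7, 2000]
```
]]></artwork>
</figure>
A separate BFD session per path would call bfd_label_stack once per
label range obtained for that path.
</t>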
</section>
<section title="MPLS-TP and Entropy Labels">
<t>
Since MPLS-TP does not use ECMP, entropy labels are not applicable to
an MPLS-TP deployment.
</t>
</section>
<section title="Entropy Labels in Various Scenarios" anchor='sig-forw'>
<t>
This section describes the use of entropy labels in various
scenarios.
</t>
<t>
In the figures below, the following conventions are used to depict
processing between X and Y. Note that control plane signaling
goes right to left, whereas data plane processing goes left to
right.
<figure>
<artwork>
                 Protocols
 Y:  <--- [L, E]                       Y signals L to X
 X ------------- Y
 LS: <L, ELI, EL>                      Label stack
 X:  +<L, ELI, EL>                     X pushes <L, ELI, EL>
 Y:  -<L, ELI, EL>                     Y pops <L, ELI, EL>
</artwork>
</figure>
This means that Y signals to X label L for an LDP tunnel. E can
be one of:
<list>
<t>
0: meaning egress is NOT entropy label capable, or
</t>
<t>
1: meaning egress is entropy label capable.
</t>
</list>
The line with LS: shows the label stack on the wire. Below that
is the operation that each LSR does in the data plane, where +
means push the following label stack, - means pop the following
label stack, L~L' means swap L with L', and * means that the
operation is not depicted.
</t>
<section anchor="tunnels" title="LDP Tunnel">
<t>
The following illustrates several simple intra-AS LDP tunnels.
The first diagram shows ultimate hop popping (UHP) with the ingress
inserting an EL; the second, UHP with no ELs; the third, PHP with
ELs; the fourth, PHP with no ELs, but with an application label AL
(which could, for example, be a VPN label); and the fifth, PHP with
both ELs and an application label AL.
</t>
<t>
Note that, in all the cases below, the MPLS application does not
matter; it may be that X pushes some more labels (perhaps for a
VPN or VPLS) below the ones shown, and Y pops them.
<figure title='LDP with UHP; ingress inserts ELs'>
<artwork>
A: <--- [TL4, 1]
B: <-- [TL3, 1]
...
W: <-- [TL1, 1]
Y: <-- [TL0, 1]
X --------------- A --------- B ... W ---------- Y
LS: <TL4, ELI, EL> <TL3,ELI,EL> <TL0,ELI,EL>
X: +<TL4, ELI, EL>
A: TL4~TL3
B: TL3~TL2
...
W: TL1~TL0
Y: -<TL0, ELI, EL>
</artwork>
</figure>
<figure title='LDP with UHP; ingress does not insert ELs'>
<artwork>
A: <--- [TL4, 1]
B: <-- [TL3, 1]
...
W: <-- [TL1, 1]
Y: <-- [TL0, 1]
X --------------- A --------- B ... W ---------- Y
LS: <TL4> <TL3> <TL0>
X: +<TL4>
A: TL4~TL3
B: TL3~TL2
...
W: TL1~TL0
Y: -<TL0>
</artwork>
</figure>
<figure title='LDP with PHP; ingress inserts ELs'>
<artwork>
A: <--- [TL4, 1]
B: <-- [TL3, 1]
...
W: <-- [TL1, 1]
Y: <-- [3, 1]
X --------------- A --------- B ... W ---------- Y
X: +<TL4, ELI, EL>
A: TL4~TL3
B: TL3~TL2
...
W: -TL1
Y: -<ELI, EL>
</artwork>
</figure>
<figure title='LDP with PHP + VPN; ingress does not insert ELs'>
<artwork>
A: <--- [TL4, 1]
B: <-- [TL3, 1]
...
W: <-- [TL1, 1]
Y: <-- [3, 1]
VPN: <------------------------------------------ [AL]
X --------------- A --------- B ... W ---------- Y
LS: <TL4, AL> <TL3, AL> <AL>
X: +<TL4, AL>
A: TL4~TL3
B: TL3~TL2
...
W: -TL1
Y: -<AL>
</artwork>
</figure>
<figure title='LDP with PHP + VPN; ingress inserts ELs'>
<artwork>
A: <--- [TL4, 1]
B: <-- [TL3, 1]
...
W: <-- [TL1, 1]
Y: <-- [3, 1]
VPN: <--------------------------------------------- [AL]
X --------------- A ------------ B ... W ---------- Y
LS: <TL4,ELI,EL,AL> <TL3,ELI,EL,AL> <ELI,EL,AL>
X: +<TL4,ELI,EL,AL>
A: TL4~TL3
B: TL3~TL2
...
W: -TL1
Y: -<ELI,EL,AL>
</artwork>
</figure>
</t>
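<t>
The data plane operations in the figures above can be replayed with
a small sketch. Labels are modeled as strings, the elided hops
("...") are collapsed into a single swap, and the helper names push,
swap and pop are hypothetical; the sketch covers only the first and
last figures.
<figure>
<artwork><![CDATA[
```python
from typing import List, Optional

ELI = "ELI"  # symbolic stand-in for the Entropy Label Indicator


def push(payload: List[str], tunnel: str,
         el: Optional[str] = None) -> List[str]:
    """Ingress X: push <TL> or <TL, ELI, EL> above any application labels."""
    head = [tunnel] if el is None else [tunnel, ELI, el]
    return head + payload


def swap(stack: List[str], old: str, new: str) -> List[str]:
    """Transit LSR: L~L' on the top label only; ELI and EL are untouched."""
    if stack[:1] != [old]:
        raise ValueError("unexpected top label")
    return [new] + stack[1:]


def pop(stack: List[str], expected: List[str]) -> List[str]:
    """Pop the labels in `expected` from the top of the stack."""
    if stack[:len(expected)] != expected:
        raise ValueError("unexpected label stack")
    return stack[len(expected):]


# UHP, ingress inserts ELs: Y pops <TL0, ELI, EL>.
s = push([], "TL4", el="EL")
for old, new in [("TL4", "TL3"), ("TL3", "TL2"),
                 ("TL2", "TL1"), ("TL1", "TL0")]:
    s = swap(s, old, new)
uhp = pop(s, ["TL0", ELI, "EL"])

# PHP + VPN, ingress inserts ELs: W pops TL1, Y pops <ELI, EL, AL>.
s = push(["AL"], "TL4", el="EL")
for old, new in [("TL4", "TL3"), ("TL3", "TL2"), ("TL2", "TL1")]:
    s = swap(s, old, new)
at_y = pop(s, ["TL1"])
php = pop(at_y, [ELI, "EL", "AL"])
print(at_y, uhp, php)  # ['ELI', 'EL', 'AL'] [] []
```
]]></artwork>
</figure>
In both cases the transit LSRs only ever touch the top label, so the
ELI and EL reach the egress exactly as the ingress pushed them.
</t>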
</section>
<section anchor="ldp-rsvp" title="LDP Over RSVP-TE">
<t>
The following illustrates "LDP over RSVP-TE" tunnels. X and Y are
the ingress and egress (respectively) of the LDP tunnel; A and W
are the ingress and egress of the RSVP-TE tunnel. It is assumed
that both the LDP and RSVP-TE tunnels have PHP.
<figure title="LDP over RSVP-TE Tunnels" anchor='ldp-rvsp-fig'>
<artwork>
LDP with ELs, RSVP-TE without ELs
LDP: <--- [L4, 1] <------- [L3, 1] <--- [3, 1]
RSVP-TE: <-- [Rn, 0]
<-- [3, 0]
X --------------- A --------- B ... W ---------- Y
LS: <L4, ELI, EL> <Rn,L3,ELI,EL> ... <ELI, EL>
DP: +<L4, ELI, EL> L4~<Rn, L3> * -L1 -<ELI, EL>
</artwork>
</figure>
</t>
</section>
<section title='MPLS Applications'>
<t>
An ingress LSR X must keep state per unicast tunnel as to
whether the egress for that tunnel can process entropy labels.
X does not have to keep state per application running over that
tunnel. However, an ingress PE can choose on a per-application
basis whether or not to insert ELs. For example, X may have an
application for which it does not wish to use ECMP (e.g.,
circuit emulation), or for which it does not know which keys to
use for load balancing (e.g., AppleTalk over a pseudowire). In
either of those cases, X may choose not to insert entropy
labels, but may choose to insert entropy labels for an IP VPN
over the same tunnel.
</t>
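<t>
The split between per-tunnel state and per-application choice might
be modeled as in the sketch below. The class and field names are
hypothetical, and the application names are placeholders; the only
state learned from signaling is whether each egress can process
entropy labels.
<figure>
<artwork><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Ingress:
    # Per-tunnel state: egress LSR -> whether it signaled EL capability.
    egress_elc: Dict[str, bool] = field(default_factory=dict)
    # Local policy (not learned from signaling): applications for which
    # this ingress chooses not to insert ELs.
    no_el_apps: Set[str] = field(default_factory=set)

    def insert_el(self, egress: str, app: str) -> bool:
        """Insert an EL only if the egress can process it and policy allows."""
        capable = self.egress_elc.get(egress, False)
        return capable and app not in self.no_el_apps


x = Ingress(egress_elc={"Y": True, "Z": False},
            no_el_apps={"circuit-emulation"})
print(x.insert_el("Y", "ip-vpn"))             # True
print(x.insert_el("Y", "circuit-emulation"))  # False
print(x.insert_el("Z", "ip-vpn"))             # False
```
]]></artwork>
</figure>
</t>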
</section>
</section>
<section anchor="sec-con" title="Security Considerations">
<t>
This document describes advertisement of the capability to support
receipt of entropy labels which an ingress LSR may insert in MPLS
packets in order to allow transit LSRs to attain better load
balancing across LAG and/or ECMP paths in the network.
</t>
<t>
This document does not introduce new security vulnerabilities to
LDP, BGP or RSVP-TE. Please refer to the Security Considerations
section of these protocols (<xref target="RFC5036"/>, <xref
target='RFC4271'/> and <xref target='RFC3209'/>) for security
mechanisms applicable to each.
</t>
<t>
Because end users have no control over the values used for
entropy labels, there is little risk of entropy label forgery
that could cause uneven load balancing in the network.
</t>
<t>
If Entropy Label Capability is not signaled from an egress PE to
an ingress PE, due to, for example, malicious configuration
activity on the egress PE, then the ingress PE will fall back to not
using entropy labels for load-balancing traffic over LAG or ECMP
paths, which is in general no worse than the behavior observed in
current production networks. That said, it is recommended that
operators monitor changes to PE configurations and, more
importantly, the fairness of load distribution over LAG or ECMP
paths. If the fairness of load distribution over a set of paths
changes, that could indicate a misconfiguration, bug, or other
non-optimal behavior on their PEs, and they should take corrective
action.
</t>
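<t>
This memo does not say how fairness of load distribution should be
measured. As one possible approach, the sketch below applies Jain's
fairness index to per-link traffic counters and flags a drop between
two samples. The choice of index, the tolerance value and the
function names are all assumptions made for the example.
<figure>
<artwork><![CDATA[
```python
from typing import Sequence


def jain_index(loads: Sequence[float]) -> float:
    """Jain's fairness index: 1.0 is perfectly even; 1/n means one link
    carries everything."""
    n = len(loads)
    total = sum(loads)
    squares = sum(x * x for x in loads)
    if n == 0 or squares == 0:
        return 1.0  # no links or no traffic: nothing to flag
    return (total * total) / (n * squares)


def fairness_dropped(before: Sequence[float], after: Sequence[float],
                     tolerance: float = 0.1) -> bool:
    """True if the index fell by more than `tolerance` between samples."""
    return jain_index(before) - jain_index(after) > tolerance


print(jain_index([25, 25, 25, 25]))  # 1.0
print(jain_index([100, 0, 0, 0]))    # 0.25
print(fairness_dropped([25, 25, 25, 25], [100, 0, 0, 0]))  # True
```
]]></artwork>
</figure>
</t>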
</section>
<section anchor="iana-con" title="IANA Considerations">
<section title="Reserved Label for ELI">
<t>
IANA is requested to allocate a reserved label for the Entropy
Label Indicator (ELI) from the "Multiprotocol Label Switching
Architecture (MPLS) Label Values" Registry.
</t>
</section>
<section title="LDP Entropy Label Capability TLV">
<t>
IANA is requested to allocate the next available value from the
IETF Consensus range (0x0001-0x07FF) in the LDP TLV Type Name
Space Registry as the "Entropy Label Capability TLV".
</t>
</section>
<section title="BGP Entropy Label Capability Attribute">
<t>
IANA is requested to allocate the next available Path Attribute
Type Code from the "BGP Path Attributes" registry as the "BGP
Entropy Label Capability Attribute".
</t>
</section>
<section title="RSVP-TE Entropy Label Capability flag">
<t>
IANA is requested to allocate a new bit from the "Attribute
Flags" sub-registry of the "RSVP TE Parameters" registry.
</t>
<t>
<figure>
<artwork>
Bit | Name                     | Attribute  | Attribute  | RRO
No  |                          | Flags Path | Flags Resv |
----+--------------------------+------------+------------+-----
TBD | Entropy Label Capability | Yes        | Yes        | No
</artwork>
</figure>
</t>
</section>
</section>
<section title="Acknowledgments">
<t>
We wish to thank Ulrich Drafz for his contributions, as well as
the entire 'hash label' team for their valuable comments and
discussion.
</t>
<t>
Sincere thanks to Nischal Sheth for his many suggestions and
comments, and his careful reading of the document, especially with
regard to data plane processing of entropy labels.
</t>
</section>
</middle>
<back>
<references title="Normative References">
<?rfc include='reference.RFC.2119'?>
<?rfc include='reference.RFC.3031'?>
<?rfc include='reference.RFC.3032'?>
<?rfc include='reference.RFC.3107'?>
<?rfc include='reference.RFC.3209'?>
<?rfc include='reference.RFC.5036'?>
<?rfc include='reference.RFC.5420'?>
</references>
<references title="Informative References">
<?rfc include='reference.RFC.4201'?>
<?rfc include='reference.RFC.4271'?>
<?rfc include='reference.RFC.4379'?>
<?rfc include='reference.RFC.4447'?>
<?rfc include='reference.RFC.4875'?>
<?rfc include='reference.RFC.5884'?>
<?rfc include='reference.RFC.6388'?>
<?rfc include='reference.RFC.6391'?>
</references>
<section title="Applicability of LDP Entropy Label Capability TLV">
<t>
In the case of unlabeled IPv4 (Internet) traffic, the Best
Current Practice is for an egress LSR to propagate eBGP-learned
routes within an SP's Autonomous System after resetting the BGP
next-hop attribute to one of its Loopback IP addresses. That
Loopback IP address is injected into the Service Provider's IGP
and, concurrently, a label is assigned to it via LDP. Thus, when
an ingress LSR is performing a forwarding lookup for a BGP
destination, it recursively resolves the associated next-hop to a
Loopback IP address and associated LDP label of the egress LSR.
</t>
<t>
Thus, in the context of unlabeled IPv4 traffic, the LDP Entropy
Label Capability TLV will typically be applied only to the FEC
for the Loopback IP address of the egress LSR, and the egress LSR
need not announce an entropy label capability for the eBGP-learned
route.
</t>
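<t>
The recursive resolution described in this appendix can be sketched
as a two-step lookup. The tables, addresses (taken from the
documentation ranges), label value and names below are hypothetical;
the point is only that the entropy label capability is found on the
LDP binding for the Loopback, not on the BGP route.
<figure>
<artwork><![CDATA[
```python
from typing import Dict, NamedTuple, Optional


class LdpBinding(NamedTuple):
    label: int
    elc: bool  # Entropy Label Capability TLV received with this FEC


# Hypothetical tables on the ingress LSR.
bgp_next_hop: Dict[str, str] = {"203.0.113.0/24": "192.0.2.1"}
ldp_fec: Dict[str, LdpBinding] = {
    "192.0.2.1/32": LdpBinding(label=3001, elc=True),
}


def resolve(prefix: str) -> Optional[LdpBinding]:
    """Resolve a BGP route to the LDP binding for its next-hop Loopback."""
    next_hop = bgp_next_hop.get(prefix)
    if next_hop is None:
        return None
    return ldp_fec.get(next_hop + "/32")


binding = resolve("203.0.113.0/24")
print(binding.label, binding.elc)  # 3001 True
```
]]></artwork>
</figure>
</t>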
</section>
</back>
</rfc>