One document matched: draft-ietf-pim-ecmp-00.xml
<?xml version="1.0" encoding="utf-8"?>
<!-- edited with XMLSPY v5 rel. 3 U (http://www.xmlspy.com)
by Daniel M Kohn (private) -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd"[
<!ENTITY rfc2119 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc4601 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4601.xml'>
<!ENTITY rfc3973 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3973.xml'>
<!ENTITY rfc5015 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5015.xml'>
]>
<rfc category="exp" ipr="trust200902" docName="draft-ietf-pim-ecmp-00.txt" obsoletes="" updates="" submissionType="IETF" xml:lang="en">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="no"?>
<?rfc iprnotified="no" ?>
<?rfc strict="yes" ?>
<?rfc compact="yes" ?>
<front>
<title abbrev="PIMv2 ECMP Assert">Protocol Independent Multicast
ECMP Assert</title>
<author surname="Yiqun Cai" fullname="Yiqun Cai">
<organization>Cisco Systems, Inc.</organization>
<address>
<postal>
<street>Tasman
Drive</street>
<city>San Jose</city>
<region>CA</region>
<code>95134</code>
<country>USA</country>
</postal>
<email>ycai@cisco.com</email>
</address>
</author>
<author surname="Liming Wei" fullname="Liming Wei">
<organization>Cisco Systems, Inc.</organization>
<address>
<postal>
<street>Tasman Drive</street>
<city>San Jose</city>
<region>CA</region>
<code>95134</code>
<country>USA</country>
</postal>
<email>lwei@cisco.com</email>
</address>
</author>
<author surname="Heidi Ou" fullname="Heidi Ou">
<organization>Cisco Systems, Inc.</organization>
<address>
<postal>
<street>Tasman Drive</street>
<city>San Jose</city>
<region>CA</region>
<code>95134</code>
<country>USA</country>
</postal>
<email>hou@cisco.com</email>
</address>
</author>
<author surname="Vishal Arya" fullname="Vishal Arya">
<organization>DIRECTV Inc.</organization>
<address>
<postal>
<street>2230 E Imperial Hwy</street>
<city>El Segundo</city>
<region>CA</region>
<code>90245</code>
<country>USA</country>
</postal>
<email>varya@directv.com</email>
</address>
</author>
<author surname="Sunil Jethwani" fullname="Sunil Jethwani">
<organization>DIRECTV Inc.</organization>
<address>
<postal>
<street>2230 E Imperial Hwy</street>
<city>El Segundo</city>
<region>CA</region>
<code>90245</code>
<country>USA</country>
</postal>
<email>sjethwani@directv.com</email>
</address>
</author>
<date />
<abstract>
<t>
A PIM router uses RPF procedure to select an upstream interface
and router to build forwarding state. When there are equal
cost multiple paths (ECMP), existing implementations
often use hash algorithms to select a path. Such algorithms
do not allow the spread of traffic among the ECMPs according
to administrative metrics. This usually leads to
inefficient or ineffective use of network resources.
This document introduces the ECMP Assert, a mechanism to
improve the RPF procedure over ECMPs. It allows ECMP path
selection to be based on administratively selected metrics,
such as data transmission delays, path preferences and routing
metrics.
</t>
</abstract>
</front>
<middle>
<section title="Requirements Notation" toc="default">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
"SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
and
"OPTIONAL" in this document are to be interpreted as
described
in <xref target="RFC2119" pageno="false" format="default" />.</t>
</section>
<section title="Introduction" toc="default">
<t>A PIM [RFC4601] router uses RPF procedure to select an upstream
interface and a PIM neighbor on that interface to build forwarding
state. When there are equal cost multiple paths (ECMP) upstream,
existing implementations often use hash algorithms to select a
path. Such algorithms do not allow the spread of traffic among the
ECMP according to administrative metrics. This usually leads to
inefficient or ineffective use of network resources. This document
introduces the ECMP Assert, a mechanism to improve the RPF
procedure over ECMP. It allows ECMP path selection to be based on
administratively selected metrics, such as data transmission delays,
path preferences and routing metrics, or a combination of metrics.</t>
<t>ECMPs are frequently used in networks to provide
redundancy and to increase available bandwidth. A PIM router
selects a path in the ECMP based on its own implementation
specific choice. The selection is a local decision. One
way is to choose the PIM neighbor with the highest IP
address, another is to pick the PIM neighbor with the
best hash value over the destination and source addresses. </t>
<t>While implementations supporting ECMP have been deployed widely,
the existing RPF selection methods have weaknesses. The lack of
administratively effective ways to allocate traffic over alternative
paths is a major issue. For example, there is no straightforward way
to tell two downstream routers to select either the same or
different RPF neighbor routers for the same traffic flows.</t>
<t>With the ECMP Assert mechanism introduced here, the upstream
routers use a new PIM ECMP Assert message to instruct the
downstream routers on how to tie-break among the upstream
neighbors. The PIM ECMP Assert message conveys the tie-break
information based on metrics selected administratively.</t>
<section title="Overview" toc="default">
<t>The existing PIM Assert mechanism allows the upstream router
to detect the existence of multiple forwarders for the same
multicast flow onto the same downstream interface. The upstream
router sends a PIM Assert message containing a routing metric
for the downstream routers to use for tie-breaking among the
multiple upstream forwarders on the same RPF interface.</t>
<t>With ECMP interfaces between the downstream and upstream routers,
the PIM ECMP Assert mechanism works in a similar way, but extends
the ability to resolve the selection of forwarders among different
interfaces in the ECMP.</t>
<t>When a PIM router downstream of the ECMP interfaces creates a new
(*,G) or (S,G) entry, it will populate the RPF interface and RPF
neighbor information according to the rules specified by [RFC4601].
This router will send its initial joins to that RPF neighbor.</t>
<t>When the RPF neighbor router receives the join message and finds that
the receiving interface is one of the ECMP interfaces, it will check
if the same flow is already being forwarded out of another ECMP
interface. If so, this RPF neighbor router will send a PIM ECMP
Assert message onto the interface the join was received on. The PIM
ECMP Assert message contains the address of the desired RPF neighbor, an
interface ID <xref target="INTID" pageno="false" format="default" />, along with other parameters used as
tie breakers. In essence, a PIM ECMP Assert message is sent by an upstream
router to notify downstream routers to redirect PIM Joins to the new RPF neighbor
via a different interface. When the downstream routers receive this message,
they should trigger PIM Joins toward the new RPF neighbor specified in the
packet.</t>
<t>This new message is named PIM ECMP Assert for the following reasons,</t>
<t>
<list style="numbers">
<t>It is sent by an upstream router; </t>
<t>It is used to influence the
RPF selection by downstream routers; And </t>
<t>A tie breaker metric is used. </t>
</list>
</t>
<t>
This new message functions in similar ways to the existing PIM Assert
message, with the exception that the existing Assert
message is used to select an upstream router within the same multi-access
network (such as a LAN) while the new message is used to select both
a network and an upstream router.
</t>
<t>
One advantage of this design is that the control messages
are only sent when there is need to "re-balance" the traffic.
This reduces the amount of control traffic.
</t>
</section>
<section title="Applicability" toc="default">
<t>The use of ECMP Assert applies to shared trees or
source trees built with procedures described in
<xref target="RFC4601" pageno="false" format="default" />.
The use of ECMP Assert in "Protocol Independent
Multicast - Dense Mode" <xref target="RFC3973" pageno="false" format="default" /> or
in "Bidirectional Protocol
Independent Multicast" <xref target="RFC5015" pageno="false" format="default" /> is not
considered.</t>
<t> The enhancement described in this document can be applicable to a number
of scenarios. For example, it allows a network operator to
use ECMP paths and have the ability to perform load splitting
based on bandwidth. To do this, the downstream
routers perform RPF selection with bandwidth instead of IP
addresses as a tie breaker. The ECMP Assert mechanism assures
that all downstream routers select the desired network link
and upstream router whenever possible.
Another example is for a network operator to impose a
transmission delay limit on certain links. The ECMP Assert mechanism
provides a mean for an upstream router to instruct a downstream
router to choose a different RPF path. </t>
<t>This specification does not dictate the scope of applications
of this mechanism.</t>
</section>
</section>
<section title="Protocol Specification" toc="default">
<section title="ECMP Bundle" toc="default">
<t> An ECMP bundle is a set of PIM enabled interfaces
on a router, where
all interfaces belonging to the same bundle share the same routing
metric. The ECMP paths reside between the upstream and downstream
routers over the ECMP bundle.</t>
<t> There can be one or more ECMP bundles on any router,
while one individual interface can only belong to a single bundle.</t>
<t>ECMP bundles are created on a router via configuration.</t>
</section>
<section title="Sending ECMP Assert" toc="default">
<t>ECMP Asserts are sent by an upstream router in a rate limited
fashion, under the following conditions,</t>
<t>
<list style="symbols">
<t>It detects a PIM Join on a non-desired
outgoing interface; or</t>
<t>It detects multicast traffic on a non-desired
outgoing interface.</t>
</list>
</t>
<t>In both cases, an ECMP Assert is sent to the
non-desired interface. An outgoing interface is
considered "non-desired" when, </t>
<t>
<list style="symbols">
<t>The upstream router is already forwarding
the same flow out of another interface belonging
to the same ECMP bundle; </t>
<t>The upstream router is not forwarding the flow
yet out any interfaces of the ECMP bundle,
but there is another interface with more desired attributes.</t>
</list>
</t>
<t>An upstream router may choose not to send
ECMP Asserts if it becomes aware that some of the
downstream routers do not support the new message,
or unreachable via some links in ECMP bundle. </t>
</section>
<section title="Receiving ECMP Assert" toc="default">
<t>
When a downstream router receives an ECMP Assert, and detects the
desired RPF path from its upstream router's point of view is
different from its current one, it should choose to prune from the
current path and join to the new path. The exact order of such actions
is implementation specific.</t>
<t>If a downstream router receives multiple ECMP Asserts sent by
different upstream routers, it SHOULD use the Preference, Metric, or
other fields as specified below, as the tie breakers to choose the
most preferred RPF interface and neighbor.</t>
<t>If an upstream router receives an ECMP Assert from another upstream
router, it SHOULD NOT change its forwarding behavior even if the
ECMP Assert makes it a less preferred RPF neighbor on the receiving
interface.</t>
</section>
<section title="Transient State" toc="default">
<t>
During a transient network outage with a single link cut in an
ECMP bundle, a downstream router may lose connection to its RPF neighbor
and the normal ECMP Assert operation may be interrupted temporarily. In
such an event, the following actions are recommended.</t>
<t>The down stream router may re-select a new RPF neighbor. Among all ECMP
upstream routers, the one on the same LAN as the previous RPF neighbor
is preferred. </t>
<t>If there is no upstream router reachable on the same LAN, the down stream
router will select a RPF neighbor on a different LAN.
Among all ECMP upstream routers, the one served as RPF neighbor before the
link failure is preferred. Such a router can be identified by the Router ID
which is part of the Interface ID in the PIM ECMP Assert Hello option.</t>
<t>
During normal ECMP Assert operations, when PIM Joins for the same (*,G) or
(S,G) are received on a different LAN, an upstream router will send ECMP
Assert to prune the non-preferred LAN. Such ECMP Asserts during partial
network outage can be supressed if the upstream router decides that the
non-preferred PIM Join is from a router that is not reachable via the
preferred LAN. This check can be performed by retrieving the downstream's
Router ID, using the source address in the PIM join, and searching
neighbors on the preferred LAN for one with the same router ID.
</t>
</section>
<section title="Interoperability" toc="default">
<t>If a PIM router supports this draft, it MUST send the new Hello
option ECMP-Assert-Supported TLV in its PIM Hello messages. A PIM
router sends ECMP Asserts on an interface only when it detects that
all neighbors have sent this Hello option. If a PIM
router detects that any of its neighbor does not support this
Hello option, it MUST not send ECMP Asserts, however, it SHOULD still
process any ECMP Asserts received.</t>
</section>
<section title="Packet Format" toc="default">
<section title="PIM ECMP Assert Hello Option" toc="default">
<figure anchor="ECMP Assert Hello" title="ECMP Assert Hello Option" suppress-title="false" align="left" alt="" width="" height="">
<preamble></preamble>
<artwork xml:space="preserve" name="" type="" align="left" alt="" width="" height=""><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type = TBD | Length = 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
<postamble />
</figure>
<t>
<list style="hanging">
<t hangText="Type: ">
TBD. </t>
<t hangText="Length: "> 0 </t>
</list>
</t>
</section>
<section title="PIM ECMP Assert Format" toc="default">
<figure anchor="ECMP Assert" title="ECMP Assert Message Format" suppress-title="false" align="left" alt="" width="" height="">
<preamble></preamble>
<artwork xml:space="preserve" name="" type="" align="left" alt="" width="" height=""><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Group Address (Encoded-Group format) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address (Encoded-Unicast format) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Neighbor Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+-+-+-+-+-+- ............ Interface ID ........... -+-+-+-+-+-+-+
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Preference | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-- ... Metric ... -+-+-+-+-+-+-+-+-+
| |
+- .. Metric .. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+-+-+-+-+-+-+-+-+
]]></artwork>
<postamble />
</figure>
<t>
<list style="hanging">
<t hangText="Type: "> TBD </t>
<t hangText="Neighbor Address (32/128 bits): ">
Address of desired upstream neighbor where the
downstream receiver should redirect PIM Joins to.
This address MUST be associated with an interface
in the same ECMP bundle as the ECMP Assert message's
outgoing interface. If the "Interface ID" field (see
below) is ignored, this "Neighbor Address" field
uniquely identifies a LAN and an upstream router to
which a downstream router should redirect its Join
messages to, and an ECMP Assert message MUST be
discarded if the "Neighbor Address" field in the
message does not match cached neighbor address. </t>
<t hangText="Interface ID (64 bits): ">
This field is used in IPv4 when one or more RPF
neighbors in the ECMP bundle are unnumbered, or
in IPv6 where link local addresses are in use.
For other IPv4 usage, this field is zero'ed when
sent, and ignored when received. If the "Router ID"
part of the "Interface ID" is zero, the field must be
ignored. See <xref target="INTID" pageno="false" format="default" /> for details of
its assignment and usage in PIM Hellos. If the
"Interface ID" is not ignored, the receiving router
of this message MUST use the "Interface ID", instead
of "Neighbor Address", to identify the new RPF
neighbor, and an ECMP Assert message MUST be
discarded if the "Interface ID" field in the message
does not match cached interface ID. </t>
<t hangText="Preference (8 bits): ">
The first tie breaker when ECMP
Asserts from multiple upstream routers are compared
against each other. Numerically smaller value is
preferred. A reserved (15) value is used to indicate
the metric value following the "Preference" field is a
timestamp, taken at the moment the sending router
started to forward out of this interface.</t>
<t hangText="Metric (64 bits): ">
The second tie breaker if the the
"Preference" values are the same. Numerically smaller
metric is preferred. This "Metric" can contain path
parameters defined by users. When both "Preference"
and "Metric" values are the same, "Neighbor Address" or
"Interface ID" field is used as the third
tie-breaker, depends on which field is used to identify
the RPF neighbor, and the bigger value wins. </t>
</list>
</t>
</section>
</section>
</section>
<section title="IANA Considerations" toc="default">
<t>A new PIM Type is required to be assigned to the ECMP Assert
messages. According to <xref target="PIMREG" pageno="false" format="default" />, this document
recommends 11 (0xB) as the new "PIM ECMP Assert Type".</t>
</section>
<section title="Security Considerations" toc="default">
<t>Security of the ECMP Assert is only guaranteed by the
security of the PIM packet, so the security considerations
for PIM Assert packets as described in [RFC4601] apply here.
Spoofed ECMP Assert packets may cause the downstream routers
to send PIM Joins to an undesired upstream router, and trigger
more ECMP Assert messages.
</t>
</section>
<section title="Acknowledgement" toc="default">
<t>The authors would like to thank Apoorva Karan for helping with the
original idea, Eric Rosen, Isidor Kouvelas, Toerless Eckert and
Stig Venaas for their review comments. </t>
</section>
</middle>
<back>
<references title="Normative Reference">
<reference anchor="RFC2119">
<front>
<title abbrev="RFC Key Words">Key words for use in RFCs to Indicate Requirement Levels</title>
<author initials="S." surname="Bradner" fullname="Scott Bradner">
<organization>Harvard University</organization>
<address>
<postal>
<street>1350 Mass. Ave.</street>
<street>Cambridge</street>
<street>MA 02138</street>
</postal>
<phone>- +1 617 495 3864</phone>
<email>sob@harvard.edu</email>
</address>
</author>
<date year="1997" month="March" />
<area>General</area>
<keyword>keyword</keyword>
<abstract>
<t>
In many standards track documents several words are used to signify
the requirements in the specification. These words are often
capitalized. This document defines these words as they should be
interpreted in IETF documents. Authors who follow these guidelines
should incorporate this phrase near the beginning of their document:
<list><t>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
RFC 2119.
</t></list></t>
<t>
Note that the force of these words is modified by the requirement
level of the document in which they are used.
</t>
</abstract>
</front>
<seriesInfo name="BCP" value="14" />
<seriesInfo name="RFC" value="2119" />
<format type="TXT" octets="4723" target="http://www.rfc-editor.org/rfc/rfc2119.txt" />
<format type="HTML" octets="17491" target="http://xml.resource.org/public/rfc/html/rfc2119.html" />
<format type="XML" octets="5777" target="http://xml.resource.org/public/rfc/xml/rfc2119.xml" />
</reference>
<reference anchor="RFC4601">
<front>
<title>Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised)</title>
<author initials="B." surname="Fenner" fullname="B. Fenner">
<organization />
</author>
<author initials="M." surname="Handley" fullname="M. Handley">
<organization />
</author>
<author initials="H." surname="Holbrook" fullname="H. Holbrook">
<organization />
</author>
<author initials="I." surname="Kouvelas" fullname="I. Kouvelas">
<organization />
</author>
<date year="2006" month="August" />
<abstract>
<t>This document specifies Protocol Independent Multicast - Sparse Mode (PIM-SM). PIM-SM is a multicast routing protocol that can use the underlying unicast routing information base or a separate multicast-capable routing information base. It builds unidirectional shared trees rooted at a Rendezvous Point (RP) per group, and optionally creates shortest-path trees per source.</t><t> This document obsoletes RFC 2362, an Experimental version of PIM-SM. [STANDARDS-TRACK]</t>
</abstract>
</front>
<seriesInfo name="RFC" value="4601" />
<format type="TXT" octets="340632" target="http://www.rfc-editor.org/rfc/rfc4601.txt" />
<format type="PDF" octets="304538" target="http://www.rfc-editor.org/rfc/rfc4601.pdf" />
</reference>
</references>
<references title="Informative References">
<reference anchor="RFC3973">
<front>
<title>Protocol Independent Multicast - Dense Mode (PIM-DM): Protocol Specification (Revised)</title>
<author initials="A." surname="Adams" fullname="A. Adams">
<organization />
</author>
<author initials="J." surname="Nicholas" fullname="J. Nicholas">
<organization />
</author>
<author initials="W." surname="Siadak" fullname="W. Siadak">
<organization />
</author>
<date year="2005" month="January" />
<abstract>
<t>This document specifies Protocol Independent Multicast - Dense Mode (PIM-DM). PIM-DM is a multicast routing protocol that uses the underlying unicast routing information base to flood multicast datagrams to all multicast routers. Prune messages are used to prevent future messages from propagating to routers without group membership information. This memo defines an Experimental Protocol for the Internet community.</t>
</abstract>
</front>
<seriesInfo name="RFC" value="3973" />
<format type="TXT" octets="136708" target="http://www.rfc-editor.org/rfc/rfc3973.txt" />
</reference>
<reference anchor="RFC5015">
<front>
<title>Bidirectional Protocol Independent Multicast (BIDIR-PIM)</title>
<author initials="M." surname="Handley" fullname="M. Handley">
<organization />
</author>
<author initials="I." surname="Kouvelas" fullname="I. Kouvelas">
<organization />
</author>
<author initials="T." surname="Speakman" fullname="T. Speakman">
<organization />
</author>
<author initials="L." surname="Vicisano" fullname="L. Vicisano">
<organization />
</author>
<date year="2007" month="October" />
<abstract>
<t>This document discusses Bidirectional PIM (BIDIR-PIM), a variant of PIM Sparse-Mode that builds bidirectional shared trees connecting multicast sources and receivers. Bidirectional trees are built using a fail-safe Designated Forwarder (DF) election mechanism operating on each link of a multicast topology. With the assistance of the DF, multicast data is natively forwarded from sources to the Rendezvous-Point (RP) and hence along the shared tree to receivers without requiring source-specific state. The DF election takes place at RP discovery time and provides the route to the RP, thus eliminating the requirement for data-driven protocol events. [STANDARDS-TRACK]</t>
</abstract>
</front>
<seriesInfo name="RFC" value="5015" />
<format type="TXT" octets="96431" target="http://www.rfc-editor.org/rfc/rfc5015.txt" />
</reference>
<reference anchor="INTID">
<front>
<title>An Interface ID Hello Option for PIM</title>
<author initials="S." surname="Gulrajani"></author>
<author initials="S." surname="Venaas"></author>
</front>
<seriesInfo name="Internet-Draft" value="draft-gulrajani-pim-hello-intid-01.txt" />
</reference>
<reference anchor="PIMREG">
<front>
<title>A Registry for PIM Message Types</title>
<author initials="S." surname="Venaas"></author>
</front>
<seriesInfo name="Internet-Draft" value="draft-ietf-pim-registry-04.txt" />
</reference>
</references>
</back>
</rfc>
| PAFTECH AB 2003-2026 | 2026-04-22 05:21:09 |