Network Working Group Rahul Aggarwal
Internet Draft Juniper Networks
Expiration Date: August 2005 Yuji Kamite
NTT Communications
Luyuan Fang
AT&T
Multicast in VPLS
draft-raggarwa-l2vpn-vpls-mcast-00.txt
Status of this Memo
By submitting this Internet-Draft, we certify that any applicable
patent or IPR claims of which we are aware have been disclosed, and
any of which we become aware will be disclosed, in accordance with
RFC 3668.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.''
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
draft-raggarwa-l2vpn-vpls-mcast-00.txt [Page 1]
Internet Draft draft-raggarwa-l2vpn-vpls-mcast-00.txt February 2005
Abstract
This document describes a solution for overcoming the limitations of
existing VPLS multicast solutions. It describes procedures for VPLS
multicast that utilize multicast trees in the service provider (SP)
network. One such multicast tree can be shared between multiple VPLS
instances. Procedures for propagating multicast control information,
learned from local VPLS sites, to remote VPLS sites, are described.
These procedures do not require IGMP-PIM snooping on the SP backbone
links.
Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [KEYWORDS].
1. Contributors
Rahul Aggarwal
Yakov Rekhter
Juniper Networks
Yuji Kamite
NTT Communications
Luyuan Fang
AT&T
Chaitanya Kodeboniya
Juniper Networks
2. Terminology
This document uses terminology described in [VPLS-BGP] and [VPLS-
LDP].
3. Introduction
[VPLS-BGP] and [VPLS-LDP] describe a solution for VPLS multicast that
relies on ingress replication. This solution has certain limitations
for certain VPLS multicast traffic profiles. This document describes
procedures for overcoming the limitations of existing VPLS multicast
solutions.
It describes procedures for VPLS multicast that utilize multicast
trees in the service provider (SP) network. One such multicast tree
can be shared between multiple VPLS instances. Procedures for
propagating multicast control information, learned from local VPLS
sites, to remote VPLS sites are described. These procedures do not
require IGMP-PIM snooping on the SP backbone links.
4. Limitations of Existing VPLS Multicast Solutions
VPLS multicast solutions described in [VPLS-BGP] and [VPLS-LDP] rely
on ingress replication. Thus the ingress PE replicates the multicast
packet for each egress PE and sends it to the egress PE using a
unicast tunnel.
This is a reasonable model when the bandwidth of the multicast
traffic is low and/or the average number of replications performed on
each outgoing interface for a particular customer VPLS multicast
packet is small. If this is not the case it is desirable to
utilize multicast trees in the SP core to transmit VPLS multicast
packets. Note that unicast packets that are flooded to each of the
egress PEs, before the ingress PE performs learning for those unicast
packets, will still use ingress replication.
By appropriate IGMP or PIM snooping it is possible for the ingress PE
to send the packet only to the egress PEs that have receivers for
that traffic, rather than to all the PEs in the VPLS instance. While
PIM/IGMP snooping makes it possible to avoid sending an IP multicast
packet to PEs with no receivers, this optimization has a cost.
Namely, a PE has to maintain (S,G) state for all the (S,G)s of all
the VPLSs present on the PE. Furthermore, PIM snooping has to be done
not only on the CE-PE interfaces, but on Pseudo-Wire (PW) interfaces
as well, which in turn introduces non-negligible overhead on the PE.
It is desirable to reduce this overhead when IGMP/PIM snooping is
used.
5. Overview
This document describes procedures for using multicast trees in the
SP network to transport VPLS multicast data packets. RSVP-TE P2MP
LSPs described in [RSVP-P2MP] are an example of such multicast trees.
The use of multicast trees in the SP network can be beneficial when
the bandwidth of the multicast traffic is high or when it is
desirable to optimize the number of copies of a multicast packet
transmitted by the ingress. This comes at a cost of state in the SP
core to build multicast trees and overhead to maintain this state.
This document places no restrictions on the protocols used to build
SP multicast trees.
Multicast trees used for VPLS can be of two types:
1. Default Trees. A single multicast distribution tree in the SP
backbone is used to carry all the multicast traffic from a specified
set of one or more VPLSs. These multicast distribution trees can be
set up to carry the traffic of a single VPLS, or to carry the traffic
of multiple VPLSs. The ability to carry the traffic of more than one
VPLS on the same tree is termed 'Aggregation'. The tree will include
every PE that is a member of any of the VPLSs that are using the
tree. This enables the SP to place a bound on the amount of multicast
routing state which the P routers must have. This implies that a PE
may receive multicast traffic for a multicast stream even if it is
not on the path to any receiver of that stream.
2. Data Trees. A Data Tree is used by a PE to send multicast
traffic for one or more multicast streams, that belong to the same or
different VPLSs, to a subset of the PEs that belong to those VPLSs.
Each of the PEs in the subset are on the path to a receiver of one or
more multicast streams that are mapped onto the tree. The ability to
use the same tree for multicast streams that belong to different
VPLSs is termed 'Aggregation'. The reason for having Data Trees is
to give a PE the ability to create separate SP multicast trees for
high bandwidth multicast groups. This allows traffic for
these multicast groups to reach only those PE routers that have
receivers in these groups. This avoids flooding other PE routers in
the VPLS.
An SP can use both Default Trees and Data Trees, or either of them, for
a given VPLS on a PE, based on local configuration. Default Trees can
be used for both IP and non-IP data multicast traffic, while Data
Trees can be used only for IP multicast data traffic.
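The Default Tree leaf rule above (the tree includes every PE that is a member of any aggregated VPLS) amounts to a set union; a minimal sketch with hypothetical PE and VPLS names, not taken from this document:

```python
# Illustrative sketch: the leaf set of an Aggregate Default Tree is the
# union of the PE memberships of every VPLS mapped onto it.
def default_tree_leaves(vpls_membership, mapped_vpls):
    """vpls_membership: dict of VPLS-id -> set of member PE addresses."""
    leaves = set()
    for vpls in mapped_vpls:
        leaves |= vpls_membership[vpls]
    return leaves

membership = {"vpls-a": {"PE1", "PE2"}, "vpls-b": {"PE2", "PE3"}}
# The tree must include every PE that belongs to any mapped VPLS:
assert default_tree_leaves(membership, ["vpls-a", "vpls-b"]) == {"PE1", "PE2", "PE3"}
```

Note that PE3 is a leaf even though it does not belong to vpls-a, which is exactly why a PE may receive traffic for streams it has no receivers for.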
In order to establish Default and Data multicast trees the root of
the tree must be able to discover the VPLS membership of all the PEs
and/or the multicast groups that each PE has receivers in. This
document describes procedures for doing this. For discovering the
multicast group membership this document describes procedures that do
not rely on IGMP-PIM snooping in the SP backbone. These procedures
can also be used with ingress replication to send traffic for a
multicast stream to only those PEs that are on the path to receivers
for that stream. Aggregation also requires a mechanism for the
egresses of the tree to demultiplex the multicast traffic received
over the tree. This document describes how upstream label allocation
by the root of the tree can be used to perform this demultiplexing.
This document also describes procedures based on BGP that are used by
the root of an Aggregate Tree to advertise the Default or Data Tree
binding and the demultiplexing information to the leaves of the tree.
This document uses the prefix 'C' to refer to the customer control or
data packets and 'P' to refer to the provider control or data
packets.
6. VPLS Multicast / Broadcast / Unknown Unicast Data Packet Treatment
If the destination MAC address of a VPLS packet received by a PE from
a VPLS site is a multicast address, a multicast tree SHOULD be used
to transport the packet, if possible. Such a tree can be a Default
Tree for the VPLS. It can also be a Data Tree if the VPLS multicast
packet is an IP packet.
If the destination MAC address of a VPLS packet is a broadcast
address, it is flooded. If a Default Tree is already established, the
PE floods over it. If the Default Tree cannot be used for some
reason, the PE MUST flood over multiple unicast PWs, based on
[VPLS-BGP] [VPLS-LDP].
If the destination MAC address of the packet has not been learned,
the packet is also flooded. Unlike the broadcast case, it should be
noted that once a PE learns the MAC address it might immediately
switch to transport over one particular PW. This implies that
flooding unknown unicast traffic over a Default Tree might lead to
packet reordering. Therefore, unknown unicast traffic SHOULD be
flooded over multiple unicast PWs based on [VPLS-BGP] [VPLS-LDP], not
over multicast trees.
P-multicast trees are intended to be used only for VPLS C-multicast
data packets, not for control packets being used by a customer's
layer-2 and layer-3 control protocols. For instance, Bridge Protocol
Data Units (BPDUs) use an IEEE-assigned all-bridges multicast MAC
address, and OSPF uses the OSPF routers multicast MAC address.
P-multicast trees SHOULD NOT be used for transporting these control
packets.
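The forwarding rules in this section can be sketched as a small decision function. This is an illustration only; the function name, return values, and simplifications (e.g. a single BPDU address standing in for all customer control frames) are ours, not part of this specification:

```python
# Hypothetical sketch of Section 6 transport selection at an ingress PE.
BPDU_MAC = "01:80:c2:00:00:00"  # IEEE all-bridges group address (control)

def select_transport(dst_mac, mac_learned, is_ip, default_tree_up):
    if dst_mac == "ff:ff:ff:ff:ff:ff":            # broadcast
        return "default-tree" if default_tree_up else "flood-unicast-pws"
    if int(dst_mac.split(":")[0], 16) & 1:        # group (multicast) MAC
        if dst_mac == BPDU_MAC:                   # customer control frame:
            return "flood-unicast-pws"            # P-trees SHOULD NOT carry it
        if default_tree_up:
            return "default-tree"                 # any multicast payload
        return "data-tree" if is_ip else "flood-unicast-pws"
    # Unknown unicast SHOULD stay on unicast PWs to avoid reordering
    # when the MAC is learned and traffic moves to one particular PW.
    return "known-pw" if mac_learned else "flood-unicast-pws"
```

A Data Tree is offered only for IP payloads, mirroring the rule that Data Trees carry only IP multicast while Default Trees carry any multicast.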
7. Propagating Multicast Control Information
PEs participating in VPLS need to learn the <C-S, C-G> information
for two reasons:
1. With ingress replication, this allows a PE to send the IP
multicast packet for a <C-S, C-G> only to the other PEs in the VPLS
instance that have receivers interested in that particular
<C-S, C-G>. This eliminates flooding.
2. It allows the construction of Aggregate Data Trees.
There are two components for a PE to learn the <C-S, C-G> information
in a VPLS:
1. Learning the <C-S, C-G> information from the locally homed
VSIs.
2. Learning the <C-S, C-G> information from the remote VSIs.
7.1. IGMP/PIM Snooping
In order to learn the <C-S, C-G> information from the locally homed
VSIs a PE needs to implement IGMP/PIM snooping. This is because there
is no PIM adjacency between the locally homed CEs and the PE.
IGMP/PIM snooping has to be used to build the database of C-Joins
that are being sent by the customer for a particular VSI. This also
requires a PE to create an IGMP/PIM instance per VSI for which
IGMP/PIM snooping is used. This instance is analogous to the
multicast VRF PIM instance that is created for MVPNs.
It is conceivable that IGMP/PIM snooping can be used to learn <C-S,
C-G> information from remote VSIs by snooping VPLS traffic received
over the SP backbone. However, IGMP/PIM snooping is computationally
expensive. Furthermore, the periodic nature of PIM Join/Prune
messages implies that snooping PIM messages places an even greater
processing burden on a PE. Hence, to learn <C-S, C-G> information
from remote VSIs, this document proposes the use of reliable
protocol machinery to transport <C-S, C-G> information over the SP
infrastructure. This is described in the next section.
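The two learning components above can be modeled as a small per-VSI database of <C-S, C-G> entries; this is an illustrative sketch with invented names, not an implementation mandated by this document:

```python
# Hypothetical per-VSI multicast state: entries snooped from locally
# homed CEs, and entries learned from remote VSIs via the reliable
# protocol machinery (PIM or BGP) described in Section 7.2.
class VsiMulticastState:
    def __init__(self):
        self.local = set()    # <C-S, C-G> snooped via local IGMP/PIM snooping
        self.remote = set()   # <C-S, C-G> learned from remote PEs

    def snoop_join(self, c_s, c_g):
        self.local.add((c_s, c_g))

    def snoop_prune(self, c_s, c_g):
        self.local.discard((c_s, c_g))

    def interested(self, c_s, c_g):
        # True if any local or remote receiver exists for this stream.
        return (c_s, c_g) in self.local or (c_s, c_g) in self.remote
```

With ingress replication, an ingress PE would consult the remote half of such a database to send a stream only to interested PEs.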
7.2. C-Multicast Control Information Propagation in the SP
C-Join/Prune messages for <C-S, C-G> coming from a customer, that
are snooped by a PE, have to be propagated to the remote PE that can
reach C-S. One way to do this is to forward the C-Join/Prune as a
multicast data packet and let the egress PEs perform IGMP/PIM
snooping over the pseudo-wire. However, PIM is a soft-state protocol
and periodically re-transmits C-Join/Prune messages. Snooping these
PIM messages places a big burden on a PE. It is not possible to
eliminate this overhead for snooping messages received over the
customer facing interfaces. However it is possible to alleviate this
overhead over SP facing interfaces. This is done by converting
snooped PIM C-Join/Prune messages to reliable protocol messages over
the SP network.
Each PE maintains the database of IGMP/PIM <C-S, C-G> entries that
are snooped and that are learnt from remote PEs for each VSI.
Unlike MVPNs there is an additional challenge while propagating
snooped PIM C-Join/Prune messages over the SP network for VPLS. Even
if the ingress PE wishes to propagate the C-Join/Prune only to the
upstream PE which has reachability to C-S, this upstream PE is not
known. This is because the local PE doesn't have a route to reach C-
S. This is unlike MVPNs where the route to reach C-S is known from
the unicast VPN routing table. This implies that the C-Join/Prune
message has to be sent to all the PEs in the VPLS. This document
proposes two possible solutions for achieving this and one of these
will be eventually picked after discussion in the WG.
1. Using PIM
This is similar to the propagation of PIM C-Join/Prune messages for
MVPN that has been described earlier in the document. The PIM
Neighbor discovery and maintenance is based on the VPLS membership
information learnt as part of VPLS auto-discovery. VPLS auto-
discovery allows a particular PE to learn which of the other PEs
belong to a particular VPLS instance. Each of these PEs can be
treated as a neighbor for PIM procedures while sending PIM C-
Join/Prune messages to other PEs. The neighbor is considered up as
long as the VPLS auto-discovery mechanism does not withdraw the
neighbor membership in the VPLS instance.
The C-Join/Prune messages are sent to all the PEs in the VPLS using
unicast PIM messages. The use of unicast PIM implies that there is no
Join suppression; the PIM refresh reduction mechanisms that are
currently being worked on in the PIM WG MUST be used. To send the C-
Join/Prune message to a particular remote PE, the message is
encapsulated in the PW used to reach that PE, for the VPLS that the
C-Join/Prune message belongs to.
2. Using BGP
The use of PIM for propagating VPLS C-Join/Prune information may have
scalability limitations. This is because even after building PIM
refresh reduction mechanisms PIM will not have optimized transport
when there is one sender and multiple receivers. BGP provides such
transport as it has route-reflector machinery. One option to
propagate the C-Join/Prune information is to use BGP. This is done by
using the BGP mechanisms described in section 13.
8. Multicast Tree Leaf Discovery
8.1. Default Tree Leaf Discovery
VPLS auto-discovery as described in [VPLS-BGP, BGP-AUTO] or another
VPLS auto-discovery mechanism enables a PE to learn the VPLS
membership of other PEs. This is used by the root of the Tree to
learn the egresses of the tree.
8.2. Data Tree Leaf Discovery
This is done using the C-Multicast control information propagation
described in the previous section.
9. Demultiplexing Multicast Tree Traffic
Demultiplexing received VPLS traffic requires the receiving PE to
determine the VPLS instance the packet belongs to. The egress PE can
then perform a VPLS lookup to further forward the packet.
9.1. One Multicast Tree - One VPLS Mapping
When a multicast tree is mapped to only one VPLS, determining the
tree on which the packet is received is sufficient to determine the
VPLS instance on which the packet is received. The tree is determined
based on the tree encapsulation. If MPLS encapsulation is used,
e.g. RSVP-TE P2MP LSPs, the outer MPLS label is used to determine the
tree. Penultimate-hop-popping must be disabled on the RSVP-TE P2MP
LSP.
9.2. One Multicast Tree - Many VPLS Mapping
As traffic belonging to multiple VPLSs can be carried over the same
tree, there is a need to identify the VPLS the packet belongs to.
This is done by using an inner label that corresponds to the VPLS for
which the packet is intended. The ingress PE uses this label as the
inner label while encapsulating a customer multicast data packet.
Each of the egress PEs must be able to associate this inner label
with the same VPLS and use it to demultiplex the traffic received
over the Aggregate Default Tree or the Aggregate Data Tree. If
downstream label assignment were used this would require all the
egress PEs in the VPLS to agree on a common label for the VPLS.
We propose a solution that uses upstream label assignment by the
ingress PE. Hence the inner label is allocated by the ingress PE.
Each egress PE has a separate label space for every Aggregate Tree
for which the egress PE is a leaf node. The egress PEs create a
forwarding entry for the inner VPN label, allocated by the ingress
PE, in this label space. Hence when the egress PE receives a packet
over an Aggregate Tree, the Tree identifier specifies the label
space to perform the inner label lookup. An implementation
may create a logical interface corresponding to an Aggregate Tree.
In that case the label space to lookup the inner label is an
interface based label space where the interface corresponds to the
tree.
When PIM-based IP/GRE trees are used, the root PE source address and
the tree P-group address identify the tree interface. The label
space corresponding to the tree interface is the label space to
perform the inner label lookup in. A lookup in this label space
identifies the VPLS in which the customer multicast lookup needs to
be done.
If the tree uses MPLS encapsulation, the outer MPLS label and the
incoming interface provide the label space of the label beneath it.
This assumes that penultimate-hop-popping is disabled. An example of
this is RSVP-TE P2MP LSPs. The outer label and incoming interface
effectively identify the Tree interface.
The ingress PE informs the egress PEs about the inner label as part
of the tree binding procedures described in section 11.
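One way to picture the upstream-assigned label machinery described above is a per-tree label space at the egress PE: the tree identifier selects the label space, and the inner label lookup in that space yields the VPLS. This is an illustrative sketch; the class and key names are invented:

```python
# Sketch of upstream-assigned label demultiplexing at an egress PE.
# The tree identifier is whatever identifies the tree interface: for
# MPLS trees, outer label + incoming interface; for IP/GRE trees, the
# <root P-address, P-group address> pair.
class EgressPe:
    def __init__(self):
        self.label_spaces = {}   # tree-id -> {inner label -> VPLS instance}

    def bind(self, tree_id, inner_label, vpls):
        # Installed from the ingress PE's tree binding advertisement.
        self.label_spaces.setdefault(tree_id, {})[inner_label] = vpls

    def demux(self, tree_id, inner_label):
        # Tree identifier selects the label space; inner lookup gives
        # the VPLS in which the customer multicast lookup is done.
        return self.label_spaces[tree_id][inner_label]

pe = EgressPe()
pe.bind(("PE1", "239.1.1.1"), 100, "vpls-a")   # the same inner label value
pe.bind(("PE2", "239.1.1.1"), 100, "vpls-b")   # may be reused by another root
assert pe.demux(("PE1", "239.1.1.1"), 100) == "vpls-a"
```

The example shows why per-tree label spaces matter: two roots can independently allocate the same inner label value without collision.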
10. Establishing Multicast Trees
This document does not place any restrictions on the multicast
technology used to set up P-multicast trees. However, specific
procedures are specified only for RSVP-TE P2MP LSPs, PIM-SM and PIM-
SSM based trees.
A P-multicast tree can be either a source tree or a shared tree. A
source tree is used to carry traffic only for the VPLSs that exist
locally on the root of the tree i.e. for which the root has local
CEs. A shared tree on the other hand can be used to carry traffic
belonging to VPLSs that exist on other PEs as well. For example, an
RP-based PIM-SM Aggregate Tree would be a shared tree.
10.1. RSVP-TE P2MP LSPs
This section describes procedures that are specific to the usage of
RSVP-TE P2MP LSPs for instantiating a tree. The RSVP-TE P2MP LSP can
be either a source tree or a shared tree. Procedures in [RSVP-TE-
P2MP] are used to signal the LSP. The LSP is signaled after the root
of the LSP discovers the leaves. The egress PEs are discovered using
the procedures described in section 8. Aggregation as described in
this document is supported.
10.1.1. P2MP TE LSP - VPLS Mapping
P2MP TE LSP to VPLS mapping can be learned at the egress PEs using
BGP-based advertisements of the P2MP TE LSP - VPLS mapping. This
requires that the root of the tree include the P2MP TE LSP identifier
as the tunnel identifier in the BGP advertisements. This identifier
contains the following information elements:
- The type of the tunnel is set to RSVP-TE P2MP LSP
- RSVP-TE P2MP LSP's SESSION Object
- RSVP-TE P2MP LSP's SENDER_TEMPLATE Object
10.1.2. Demultiplexing C-Multicast Data Packets
Demultiplexing the C-multicast data packets at the egress PE requires
that the PE be able to determine the P2MP TE LSP that the packets are
received on. The egress PE needs to determine the P2MP LSP to
determine the VPLS that the packet belongs to, as described in
section 9. To achieve this the LSP must be signaled with penultimate-
hop-popping (PHP) off. This is because the egress PE needs to rely on
the MPLS label, that it advertises to its upstream neighbor, to
determine the P2MP LSP that a C-multicast data packet is received on.
Signaling the P2MP TE LSP with PHP off requires an extension to RSVP-
TE which will be described in a future version of this document.
10.2. Receiver Initiated MPLS Trees
Receiver initiated MPLS trees can also be used. Details of the usage
of these trees will be specified in a later revision.
10.3. PIM Based Trees
When PIM is used to set up multicast trees in the SP core, an
Aggregate Default Tree is termed the "Aggregate MDT" and an
Aggregate Data Tree is termed an "Aggregate Data MDT". The
Aggregate MDT may be a shared tree, rooted at the RP, or a shortest
path tree. An Aggregate Data MDT is rooted at the PE that is
connected to the multicast traffic source. The root of the Aggregate
MDT or the Aggregate Data MDT has to advertise the P-Group address
chosen by it
for the MDT to the PEs that are leaves of the MDT. These other PEs
can then Join this MDT. The announcement of this address is done as
part of the tree binding procedures described in section 11.
10.4. Encapsulation of the Aggregate Default Tree and Aggregate Data
Tree
An Aggregate Default Tree or an Aggregate Data Tree may use an
IP/GRE encapsulation or an MPLS encapsulation. The protocol type in
the IP/GRE header in the former case, and the protocol type in the
data link header in the latter case, need further specification. This
will be done in a later revision.
11. Tree to VPLS / C-Multicast Stream Binding Distribution
Once a PE sets up an Aggregate Default Tree or an Aggregate Data Tree
it needs to announce the customer multicast groups being mapped to
this tree to other PEs in the network. This procedure is referred to
as Default Tree or Data Tree binding distribution and is performed
using BGP. For an Default Tree this discovery implies announcing the
mapping of all VPLSs mapped to the Default Tree. The inner label
allocated by the ingress PE for each VPLS is included. The Default
Tree Identifier is also included. For an Data Tree this discovery
implies announcing all the specific <C-Source, C-Group> entries
mapped to this tree along with the Data Tree Identifier. The inner
label allocated for each <C-Source, C-Group> is included. The Data
Tree Identifier is also included.
The egress PE creates a logical interface corresponding to the
Default Tree or the Data Tree identifier. A Default Tree by
definition maps to all the <C-Source, C-Group> entries belonging to
all the VPLSs associated with the Default Tree. A Data Tree maps to
the specific <C-Source, C-Group> associated with it.
When PIM is used to setup SP multicast trees, the egress PE also
Joins the P-Group Address corresponding to the MDT or the Data MDT.
This results in setup of the PIM SP tree.
12. Switching to Aggregate Data Trees
Data Trees provide a PE the ability to create separate SP multicast
trees for certain <C-S, C-G> entries. The source PE that originates
the Data Tree and the egress PEs have to switch to using the Data
Tree for the <C-S, C-G> entries that are mapped to it.
Once a source PE decides to set up a Data Tree, it announces the
mapping of the <C-S, C-G> entries that are mapped on the tree to the
other PEs using BGP. Depending on the SP multicast technology used,
this announcement may be done before or after setting up the Data
Tree. After the egress PEs receive the announcement they setup their
forwarding path to receive traffic on the Data Tree if they have one
or more receivers interested in the <C-S, C-G> entries mapped to the
tree. This implies setting up the demultiplexing forwarding entries
based on the inner label as described earlier. The egress PEs may
perform this switch to the Data Tree once the advertisement from the
ingress PE is received or wait for a preconfigured timer to do so.
A source PE may use one of two approaches to decide when to start
transmitting data on the Data tree. In the first approach once the
source PE sets up the Data Tree, it starts sending multicast packets
for <C-S, C-G> entries mapped to the tree on both that tree as well
as on the Default Tree. After a preconfigured timer expires, the PE
stops sending multicast packets for <C-S, C-G> entries mapped to the
Data Tree on the Default Tree. In the second approach, after a
preconfigured delay following the advertisement of the <C-S, C-G>
entries mapped to a Data Tree, the source PE begins to send traffic
on the Data Tree. At this point it stops sending traffic for the
<C-S, C-G> entries that are mapped to the Data Tree on the Default
Tree; this traffic is instead transmitted on the Data Tree.
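The first switchover approach above can be sketched as a simple make-before-break timer check. This is illustrative only; the function name and time units are assumptions, not part of this document:

```python
# Sketch of the first approach: after the Data Tree comes up, send on
# both trees for a preconfigured interval, then use only the Data Tree.
def trees_to_send_on(t_now, t_data_tree_up, dual_send_interval):
    if t_data_tree_up is None:
        return ["default"]                  # Data Tree not yet built
    if t_now < t_data_tree_up + dual_send_interval:
        return ["default", "data"]          # make-before-break window
    return ["data"]                         # switchover complete

assert trees_to_send_on(5, None, 10) == ["default"]
assert trees_to_send_on(12, 5, 10) == ["default", "data"]
assert trees_to_send_on(20, 5, 10) == ["data"]
```

The dual-send window gives egress PEs time to install their demultiplexing forwarding entries before the Default Tree stops carrying the streams.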
13. BGP Advertisements
The procedures required in this document use BGP for Tree - VPLS
binding advertisements, Tree - Multicast stream binding
advertisements, and for C-Multicast control propagation. This section
first describes the information that needs to be propagated in BGP
for achieving the functional requirements. It then describes a
suggested encoding.
13.1. Information Elements
13.1.1. Default Tree - VPLS Binding Advertisement
The root of an Aggregate Default Tree maps one or more VPLS instances
to the Default Tree and announces this mapping in BGP. Along with
the VPLS instances that are mapped to the Default Tree, the Default
Tree identifier is also advertised in BGP.
The following information is required in BGP to advertise the VPLS
instance that is mapped to the Default Tree:
1. The address of the router that is the root of the Default Tree.
2. The inner label allocated by the Default Tree root for the VPLS
instance. The usage of this label is described in section 9.
When a PE distributes this information via BGP, it must include the
following:
1. An identifier of the Default Tree.
2. Route Target Extended Communities attribute. This RT must be an
"Import RT" of each VSI in the VPLS. The BGP distribution procedures
used by [VPLS-BGP] or [BGP-AUTO] will then ensure that the advertised
information gets associated with the right VSIs.
13.1.2. Data Tree - C-Multicast Stream Binding Advertisement
The root of an Aggregate Data Tree maps one or more <C-Source, C-
Group> entries to the tree. These entries are advertised in BGP along
with the Data Tree identifier to which these entries are mapped.
The following information is required in BGP to advertise the <C-
Source, C-Group> entries that are mapped to the Data Tree:
1. The RD configured for the VPLS instance. This is required to
uniquely identify the <C-Source, C-Group> as the addresses could
overlap between different VPLS instances.
2. The inner label allocated by the Data Tree root for the <C-
Source, C-Group>. The usage of this label is described in section 9.
3. The C-Source address. This address can be a prefix in order to
allow a range of C-Source addresses to be mapped to the Data Tree.
4. The C-Group address. This address can be a range in order to
allow a range of C-Group addresses to be mapped to the Data Tree.
When a PE distributes this information via BGP, it must include the
following:
1. An identifier of the Data Tree.
2. Route Target Extended Communities attribute. This is used as
described in section 13.1.1.
13.1.3. Using BGP for Propagating VPLS C-Joins/Prunes
Section 7.2 describes PIM and BGP as possible options for propagating
VPLS C-Join/Prune information. This section describes the information
elements needed if BGP were to be used to propagate the VPLS C-
Join/Prune information in the SP network.
The following information is required to be advertised by BGP for a
VPLS <C-Source, C-Group> for VPLS C-Join propagation, and withdrawn
by BGP for VPLS C-Prune propagation:
1. The RD configured for the VPLS instance. This is required to
uniquely identify the <C-Source, C-Group> as the addresses could
overlap between different VPLS instances.
2. The C-Source address. This can be a prefix.
3. The C-Group address. This can be a prefix.
When a PE distributes this information via BGP, it must include the
Route Target Extended Communities attribute. This is used as
described in section 13.1.1.
13.1.4. Default Tree/Data Tree Identifier
Default Tree and Data Tree advertisements carry the Tree identifier.
The following information elements are needed in this identifier.
1. Whether this is a shared Default Tree or not.
2. The type of the tree. For example the tree may use PIM-SM or
PIM-SSM.
3. The identifier of the tree. For trees setup using PIM the
identifier is a (S, G) value.
13.2. Suggested Encoding
This section describes a suggested BGP encoding for carrying the
information elements described above. This encoding needs further
discussion.
A new Subsequent Address Family Identifier (SAFI) called the VPLS
MCAST SAFI is proposed. The following is the format of the NLRI
associated with this SAFI:
+---------------------------------+
|       Length (2 octets)         |
+---------------------------------+
|     MPLS Labels (variable)      |
+---------------------------------+
|         RD (8 octets)           |
+---------------------------------+
|     Multicast Source Length     |
+---------------------------------+
|   Multicast Source (Variable)   |
+---------------------------------+
|   Multicast Group (Variable)    |
+---------------------------------+
For Default Tree discovery, the information elements for the VPLS
instances that are mapped to the Default Tree are encoded as an NLRI.
The RD is set to the configured RD for the VPLS. The Multicast Group
is set to 0. The source address is set to the PE's P-address. This
advertisement also carries a new attribute to identify the Default
Tree.
The BGP next-hop address in the NEXT_HOP attribute or the
MP_REACH_ATTRIBUTE is set to the PE's P-address. This P-address is
the address of the root of the tree.
For Data Tree discovery, the information elements for the <C-S, C-G>
entries that are mapped to the tree are encoded in an NLRI and are set
using the information elements described in section 13.1.2. The
address of the Data Tree root router is carried in the BGP next-hop
address of the MP_REACH_ATTRIBUTE.
For VPLS C-Join/Prune propagation, the information elements are
encoded in an NLRI. The address of the router originating the C-
Join/Prunes is carried in the BGP next-hop address of the
MP_REACH_ATTRIBUTE.
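Assuming plausible field widths that this document does not pin down (a single 3-octet MPLS label entry with the label in the high 20 bits, a 1-octet source length in bits, and IPv4 addresses), the NLRI above could be packed as in this sketch; it is illustrative, not a normative encoding:

```python
import socket
import struct

# Rough packing sketch of the proposed VPLS MCAST NLRI. Field widths
# beyond those stated in the diagram are our assumptions.
def pack_vpls_mcast_nlri(label, rd, c_source, c_group):
    # 3-octet label entry: 20-bit label, EXP=0, S-bit set (assumed layout).
    lbl = struct.pack("!I", (label << 4) | 1)[1:]
    src = socket.inet_aton(c_source)           # assumed IPv4
    grp = socket.inet_aton(c_group)
    body = lbl + rd + bytes([32]) + src + grp  # 32 = source length in bits
    return struct.pack("!H", len(body)) + body # 2-octet Length prefix

rd = bytes(8)                                  # placeholder RD for the VPLS
nlri = pack_vpls_mcast_nlri(100, rd, "10.0.0.1", "239.1.1.1")
assert len(nlri) == 2 + 3 + 8 + 1 + 4 + 4      # Length + body fields
```

For a Default Tree advertisement the group field would be zero and the source the PE's P-address, per the text above.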
A new optional transitive attribute called the
Multicast_Tree_Attribute is defined to signal the Default Tree or the
Data Tree. The following is the format of this attribute:
+---------------------------------+
|S| Reserved | Tree Type |
+---------------------------------+
| Tree Identifier |
| . |
| . |
+---------------------------------+
The S bit is set if the tree is a shared Default Tree. Tree type
identifies the SP multicast technology used to establish the tree.
This determines the semantics of the tree identifier. Currently three
Tree Types are defined:
1. PIM-SSM Tree
2. PIM-SM Tree
3. RSVP-TE P2MP LSP
When the type is set to PIM-SM or PIM-SSM, the tree identifier
contains a PIM <P-Source, P-Multicast Group> address. When the type
is set to RSVP-TE P2MP LSP, the tree identifier contains an RSVP-TE
<Session, Sender Template> tuple.
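A minimal decoder for this attribute could look like the following
sketch. The draft does not pin down octet widths here, so treating the
S bit as the high-order bit of a one-octet flags field and the Tree
Type as a second octet is an assumption made for illustration:

```python
# Tree Type code points taken from the list above.
TREE_TYPES = {1: "PIM-SSM Tree", 2: "PIM-SM Tree", 3: "RSVP-TE P2MP LSP"}

def parse_multicast_tree_attribute(value: bytes):
    # Assumed layout: octet 0 = S bit (MSB) plus reserved bits,
    # octet 1 = Tree Type, remainder = variable Tree Identifier.
    s_bit = bool(value[0] & 0x80)            # shared Default Tree if set
    tree_type = TREE_TYPES.get(value[1], "unknown")
    tree_identifier = value[2:]              # semantics depend on Tree Type
    return s_bit, tree_type, tree_identifier
```

For a PIM tree the Tree Identifier would carry the <P-Source,
P-Multicast Group> pair; for an RSVP-TE P2MP LSP it would carry the
<Session, Sender Template> tuple.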
Hence the MP_REACH attribute identifies the set of VPLS customers'
multicast trees, the Multicast_Tree_Attribute identifies a particular
SP tree (the Default Tree or a Data Tree), and the advertisement of
both in a single BGP Update creates a binding between the SP tree and
the set of VPLS customers' trees.
14. Aggregation Methodology
In general the heuristics used to decide which VPLS instances or <C-S,
C-G> entries to aggregate are implementation dependent. It is also
conceivable that offline tools can be used for this purpose. This
section discusses some tradeoffs with respect to aggregation.
The "congruency" of aggregation is defined by the amount of overlap
in the leaves of the client trees that are aggregated on an SP tree.
For Aggregate Default Trees the congruency depends on the overlap in
the membership of the VPLSs that are aggregated on the Aggregate
Default Tree. If there is complete overlap, aggregation is perfectly
congruent. As the overlap between the aggregated VPLSs decreases, so
does the congruency.
If aggregation is not perfectly congruent, a PE may receive traffic
for VPLSs to which it doesn't belong. As the amount of multicast
traffic in these unwanted VPLSs increases, aggregation becomes less
optimal with respect to delivered traffic. Hence there is a tradeoff
between reducing state and delivering unwanted traffic.
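One way to quantify this tradeoff is the fraction of per-leaf
deliveries on the shared tree that are actually wanted. The metric
below is purely illustrative; the draft leaves the aggregation
heuristics implementation dependent:

```python
def congruency(vpls_leaves):
    """vpls_leaves maps each VPLS aggregated on one SP tree to the set
    of PEs (leaves) belonging to that VPLS.  Returns the fraction of
    (VPLS, leaf) deliveries that are wanted: 1.0 = perfectly congruent."""
    tree_leaves = set().union(*vpls_leaves.values())  # PEs the SP tree reaches
    wanted = sum(len(pes) for pes in vpls_leaves.values())
    delivered = len(tree_leaves) * len(vpls_leaves)   # every leaf gets every VPLS
    return wanted / delivered
```

With VPLS A on {PE1, PE2} and VPLS B on {PE1, PE3}, the metric is 4/6:
PE2 receives unwanted B traffic and PE3 receives unwanted A traffic.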
An implementation should provide knobs to control the congruency of
aggregation. This will allow an SP to deploy aggregation depending on
the VPLS membership and traffic profiles in its network. If
different PEs or RPs are setting up Aggregate Default Trees, this will
also allow an SP to engineer the maximum number of unwanted VPLSs for
which a particular PE may receive traffic.
The state/bandwidth optimality trade-off can be further improved by
having a versatile many-to-many association between client trees and
provider trees. Thus a VPLS can be mapped to multiple Aggregate
Trees. The mechanisms for achieving this are for further study. Also
it may be possible to use both ingress replication and an Aggregate
Tree for a particular VPLS. Mechanisms for achieving this are also
for further study.
15. Data Forwarding
15.1. MPLS Tree Encapsulation
The following diagram shows the progression of the VPLS IP multicast
packet as it enters and leaves the SP network when MPLS trees are
being used for multiple VPLS instances. RSVP-TE P2MP LSPs are
examples of such trees.
Packets received Packets in transit Packets forwarded
at ingress PE in the service by egress PEs
provider network
+---------------+
|MPLS Tree Label|
+---------------+
| VPN Label |
++=============++ ++=============++ ++=============++
|| C-IP Header || || C-IP Header || || C-IP Header ||
++=============++ >>>>> ++=============++ >>>>> ++=============++
|| C-Payload || || C-Payload || || C-Payload ||
++=============++ ++=============++ ++=============++
The receiver PE does a lookup on the outer MPLS tree label and
determines the MPLS forwarding table in which to lookup the inner
MPLS label. This table is specific to the tree label space. The inner
label is unique within the context of the root of the tree (as it is
assigned by the root of the tree, without any coordination with any
other nodes). Thus it is not unique across multiple roots. So, to
unambiguously identify a particular VPLS one has to know the label,
and the context within which that label is unique. The context is
provided by the outer MPLS label.
The outer MPLS label is stripped. The lookup of the resulting MPLS
label determines the VSI in which the receiver PE needs to do the C-
multicast data packet lookup. It then strips the inner MPLS label and
sends the packet to the VSI for multicast data forwarding.
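The two-stage lookup described above can be sketched as nested tables.
The label values and VSI names below are invented for illustration:

```python
# Outer tree label -> per-tree table of inner (VPN) labels -> VSI.
# The outer label supplies the context (the root of the tree) within
# which the inner label is unique; note that inner label 16 is reused
# by both roots without ambiguity.
tree_label_tables = {
    100: {16: "vsi-blue", 17: "vsi-red"},   # tree rooted at one PE
    200: {16: "vsi-green"},                 # tree rooted at another PE
}

def lookup_vsi(outer_label, inner_label):
    per_tree = tree_label_tables[outer_label]  # table for this tree label space
    return per_tree[inner_label]               # strip labels, forward in this VSI
```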
15.2. IP Tree Encapsulation
The following diagram shows the progression of the packet as it
enters and leaves the SP network when the Aggregate MDT or Aggregate
Data MDTs are being used for multiple VPLS instances. MPLS-in-GRE
[MPLS-IP] encapsulation is used to encapsulate the customer multicast
packets.
Packets received Packets in transit Packets forwarded
at ingress PE in the service by egress PEs
provider network
+---------------+
| P-IP Header |
+---------------+
| GRE |
+---------------+
| VPN Label |
++=============++ ++=============++ ++=============++
|| C-IP Header || || C-IP Header || || C-IP Header ||
++=============++ >>>>> ++=============++ >>>>> ++=============++
|| C-Payload || || C-Payload || || C-Payload ||
++=============++ ++=============++ ++=============++
The P-IP header contains the Aggregate MDT (or Aggregate Data MDT) P-
group address as the destination address and the root PE address as
the source address. The receiver PE does a lookup on the P-IP header
and determines the MPLS forwarding table in which to lookup the inner
MPLS label. This table is specific to the Aggregate MDT (or Aggregate
Data MDT) label space. The inner label is unique within the context
of the root of the MDT (as it is assigned by the root of the MDT,
without any coordination with any other nodes). Thus it is not unique
across multiple roots. So, to unambiguously identify a particular
VPLS one has to know the label, and the context within which that
label is unique. The context is provided by the P-IP header.
The P-IP header and the GRE header are stripped. The lookup of the
resulting MPLS label determines the VSI in which the receiver PE
needs to do the C-multicast data packet lookup. It then strips the
inner MPLS label and sends the packet to the VSI for multicast data
forwarding.
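The ingress-side encapsulation shown above might be sketched as
follows. The addresses, TTLs, and the zeroed IPv4 checksum are
placeholder values; a real implementation would fill in these fields
and the GRE details per [MPLS-IP]:

```python
import socket
import struct

def encapsulate(c_packet, root_pe, p_group, vpn_label):
    # P-IP header: source = root PE address, destination = Aggregate
    # MDT (or Aggregate Data MDT) P-group address.
    total_len = 20 + 4 + 4 + len(c_packet)
    p_ip = struct.pack("!BBHHHBBH4s4s",
                       0x45, 0, total_len,      # IPv4, IHL=5, TOS, total length
                       0, 0, 64, 47, 0,         # id, frag, TTL, proto 47 = GRE, csum
                       socket.inet_aton(root_pe),
                       socket.inet_aton(p_group))
    gre = struct.pack("!HH", 0, 0x8847)         # no GRE flags; payload = MPLS
    vpn = struct.pack("!I", (vpn_label << 12) | (1 << 8) | 64)  # label, S=1, TTL
    return p_ip + gre + vpn + c_packet
```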
16. Security Considerations
Security considerations discussed in [VPLS-BGP] and [VPLS-LDP] apply
to this document.
17. Acknowledgments
Many thanks to Thomas Morin for his support of this work.
18. Normative References
[RFC2119] S. Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3107] Y. Rekhter, E. Rosen, "Carrying Label Information in
BGP-4", RFC 3107, May 2001.
[VPLS-BGP] K. Kompella, Y. Rekhter, "Virtual Private LAN Service",
draft-ietf-l2vpn-vpls-bgp-02.txt
[VPLS-LDP] M. Lasserre, V. Kompella, "Virtual Private LAN Services
over MPLS", draft-ietf-l2vpn-vpls-ldp-03.txt
[MPLS-IP] T. Worster, Y. Rekhter, E. Rosen, "Encapsulating MPLS in IP
or Generic Routing Encapsulation (GRE)", draft-ietf-mpls-in-ip-or-
gre-08.txt
[BGP-AUTO] H. Ould-Brahim et al., "Using BGP as an Auto-Discovery
Mechanism for Layer-3 and Layer-2 VPNs", draft-ietf-l3vpn-bgpvpn-
auto-04.txt
[RSVP-P2MP] R. Aggarwal et al., "Extensions to RSVP-TE for Point to
Multipoint TE LSPs", draft-ietf-mpls-rsvp-te-p2mp-01.txt
19. Informative References
20. Author Information
20.1. Editor Information
Rahul Aggarwal
Juniper Networks
1194 North Mathilda Ave.
Sunnyvale, CA 94089
Email: rahul@juniper.net
20.2. Contributor Information
Yakov Rekhter
Juniper Networks
1194 North Mathilda Ave.
Sunnyvale, CA 94089
Email: yakov@juniper.net
Yuji Kamite
NTT Communications Corporation
Tokyo Opera City Tower
3-20-2 Nishi Shinjuku, Shinjuku-ku,
Tokyo 163-1421,
Japan
Email: y.kamite@ntt.com
Luyuan Fang
AT&T
200 Laurel Avenue, Room C2-3B35
Middletown, NJ 07748
Phone: 732-420-1921
Email: luyuanfang@att.com
Chaitanya Kodeboniya
Juniper Networks
1194 North Mathilda Ave.
Sunnyvale, CA 94089
Email: ck@juniper.net
21. Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at ietf-
ipr@ietf.org.
22. Full Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78 and
except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
23. Acknowledgement
Funding for the RFC Editor function is currently provided by the
Internet Society.