Network Working Group Dino Farinacci
INTERNET DRAFT Yakov Rekhter
cisco Systems
Peter Lothberg
Sprint
Hank Kilmer
Digex
Jeremy Hall
UUnet
June 25, 1998
Multicast Source Discovery Protocol (MSDP)
<draft-farinacci-msdp-00.txt>
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Abstract
This proposal describes a mechanism to connect multiple PIM-SM
domains together. Each PIM-SM domain uses its own independent RP(s)
and does not have to depend on RPs in other domains.
This proposal is being submitted as a method for the initial phase of
Inter-Domain Multicast deployment in the Internet and may be upward
compatible with the IDMR protocols being proposed for subsequent
phases.
Farinacci, Rekhter, Lothberg, Kilmer, Hall [Page 1]
RFC DRAFT June 1998
1.0 Introduction
This proposal describes a mechanism to connect multiple PIM-SM
domains together. Each PIM-SM domain uses its own independent RP(s)
and does not have to depend on RPs in other domains.
Some advantages of this proposal:
o PIM-SM domains can rely on their own RPs only.
o Domains with only receivers get data without globally advertising
group membership.
o Global source state is not required.
2.0 Overview
An RP in a PIM-SM domain will have an MSDP peering relationship with
an RP in another domain. The peering relationship will be made up of
a TCP connection over which primarily control information is
exchanged. Each domain will have a connection to this virtual
topology.
The purpose of this topology is to have domains discover multicast
sources from other domains. If the multicast sources are of interest
to a domain which has receivers, the normal source-tree building
mechanism in PIM-SM will be used to deliver multicast data over an
inter-domain distribution tree.
We envision this virtual topology will essentially be congruent to
the existing BGP topology used in the unicast-based Internet today.
That is, the TCP connections between RPs can be realized by the
underlying BGP routing system.
3.0 Procedure
A source in a PIM-SM domain originates traffic to a multicast group.
The PIM DR which is directly connected to the source sends the data
encapsulated in a PIM Register message to the RP in the domain.
The RP will construct a "Source-Active" (SA) message and send it to
its MSDP peers. The SA message contains the following fields:
o Source address of the data source.
o Group address the data source sends to.
o IP address of the RP.
Each MSDP peer receives and forwards the message away from the RP
address in a "peer-RPF flooding" fashion. The notion of peer-RPF
flooding is with respect to forwarding SA messages. The BGP routing
table is examined to determine which peer is the next hop towards the
originating RP of the SA message. Such a peer is called an "RPF
peer".
If the MSDP peer receives the SA from a peer that is not the RPF peer
towards the originating RP, it will drop the message. Otherwise, it
forwards the message to all its MSDP peers.
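The peer-RPF check described above can be sketched as follows. This is
a minimal illustration, not part of the protocol definition: the BGP
table is modelled as a plain dictionary from prefix to next-hop peer,
and the names rpf_peer and handle_sa are hypothetical. Excluding the
sending peer from the forwarding list is an assumption of this sketch;
the text above says only "all its MSDP peers".

```python
import ipaddress


def rpf_peer(bgp_table, rp_address):
    """Return the peer that is the next hop towards rp_address.

    bgp_table maps prefix strings to peer names (a stand-in for the
    real BGP routing table). Longest-prefix match is used.
    """
    rp = ipaddress.ip_address(rp_address)
    best = None
    for prefix, peer in bgp_table.items():
        net = ipaddress.ip_network(prefix)
        if rp in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, peer)
    return best[1] if best else None


def handle_sa(bgp_table, peers, from_peer, rp_address):
    """Return the peers an SA is forwarded to; an empty list means drop."""
    if rpf_peer(bgp_table, rp_address) != from_peer:
        return []  # received from a non-RPF peer: drop
    # Assumption: do not send the SA back to the peer it came from.
    return [p for p in peers if p != from_peer]
```

With a table holding 10.0.0.0/8 via one peer and 10.1.0.0/16 via
another, an SA whose RP address is 10.1.2.3 is accepted only from the
second peer, since that is the longest match.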
The flooding can be further constrained to children of the peer by
interrogating BGP reachability information. That is, if a peer
advertises a route (back to you) and you are the next to last AS in
the AS-path, the peer is using you as the next-hop. In this case, you
*should* forward an SA message (which was originated from the RP
address covered by that route) to the peer. This is known in other
circles as Split-Horizon with Poison Reverse.
When an MSDP peer (which is also an RP for its own domain) receives
an SA message, it determines whether it has any group members
interested in the group the SA message describes. If the (*,G) entry
exists with a non-empty outgoing interface list, the domain is
interested in the group, and the RP triggers an (S,G) join towards
the data source. This sets up a branch of the source-tree to this
domain. Subsequent data packets arrive at the RP and are forwarded
down the shared-tree inside the domain. If leaf routers choose to
join the source-tree they have the option to do so according to
existing PIM-SM conventions.
This procedure has been affectionately named flood-and-join because
an RP that is not interested in the group can ignore the SA message.
Otherwise, it joins a distribution tree.
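The flood-and-join decision at a receiving RP can be sketched as
below. The multicast routing table is modelled, as an assumption, by a
dictionary keyed on ("*", group) whose value is the outgoing interface
list; the function name on_sa_received and the returned tuple are
hypothetical.

```python
def on_sa_received(mroute_table, source, group):
    """Decide what an RP does with a received SA for (source, group).

    Returns a ("join", source, group) tuple when an (S,G) join should
    be triggered towards the source, or None when the SA is ignored.
    """
    oif_list = mroute_table.get(("*", group))
    if oif_list:
        # (*,G) exists and its outgoing interface list is non-empty.
        return ("join", source, group)
    # No (*,G) entry, or nobody downstream: not interested.
    return None
```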
4.0 Controlling State
RPs which receive SA messages are not required to keep MSDP (S,G)
state. However, if they do, newly formed MSDP peers can get MSDP
(S,G) state sooner and therefore reduce join latency for new joiners.
RPs which originate SA messages do so periodically as long as there
is data being sent by the source. RPs will not send more than one SA
message for a given (S,G) within a one-minute interval. Originating
periodic SA messages is important so that new receivers that join
after a source has become active can get data quickly via the
receiver's own RP when it is not caching SA state.
Intermediate RPs do not send periodic SA messages on behalf of
sources in other domains. They do so only for their own sources.
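The one-minute limit on SA origination can be sketched as a small
per-(S,G) rate limiter. The class name SAOriginator and the caller
supplied clock value are assumptions made for illustration.

```python
SA_MIN_INTERVAL = 60.0  # seconds; at most one SA per (S,G) per minute


class SAOriginator:
    """Tracks when an SA was last originated for each (S,G)."""

    def __init__(self):
        self._last_sent = {}

    def should_send(self, source, group, now):
        """Return True if an SA for (source, group) may be sent at time now."""
        key = (source, group)
        last = self._last_sent.get(key)
        if last is not None and now - last < SA_MIN_INTERVAL:
            return False  # an SA for this (S,G) went out too recently
        self._last_sent[key] = now
        return True
```

The RP would call should_send only while the source is still sending
data; detecting that condition is outside this sketch.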
As the number of (source,group) pairs increases in the Internet, an
RP may want to filter what sources it describes in SA messages. Also,
filtering may be used as a matter of policy which at the same time
can reduce state. Only the RP colocated in the same domain as the
source can restrict SA messages. Other RPs should not filter;
otherwise the flood-and-join model breaks.
If an MSDP peer decides to cache SA state, it may accept SA-Requests
from other MSDP peers. When an MSDP peer receives an SA-Request for a
group range, it will respond to the peer with a set of SA entries, in
an SA-Response message, for all active sources sending to the group
range requested in the SA-Request message. The peer that sends the
request will not flood the resulting SA-Response message to other
peers.
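A caching peer's handling of an SA-Request can be sketched as a lookup
of cached entries by group range. The class SACache and its method
names are hypothetical, and entry expiry is deliberately left out.

```python
import ipaddress


class SACache:
    """Holds (source, group, rp) entries learned from SA messages."""

    def __init__(self):
        self._entries = set()

    def learn(self, source, group, rp):
        """Record one active source announced by an SA message."""
        self._entries.add((source, group, rp))

    def respond(self, group_prefix):
        """Return the cached entries whose group lies in group_prefix.

        The result is what would be carried in the SA-Response.
        """
        net = ipaddress.ip_network(group_prefix)
        return sorted(
            e for e in self._entries if ipaddress.ip_address(e[1]) in net
        )
```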
5.0 SA Encapsulated Data Packets
For bursty sources, the SA message may contain multicast data from
the source. Interested RPs can decapsulate the SA message and forward
the original data packet down the shared-tree inside of a domain. We
recommend this not be the default setting.
6.0 Auto-configuration versus Manual-configuration of MSDP Peers
MSDP peers can be configured manually or can be learned
automatically. Automatic learning can be achieved by either of two
mechanisms:
o PIM Query/Hello messages
o BGP capability parameter negotiation
In either case, each side of the peering relationship will indicate
its desire to participate in the MSDP protocol. If so, the TCP peer
relationship is set up.
7.0 Other Scenarios
MSDP is not limited to deployment across different routing domains.
It can be used within a routing domain when it is desired to deploy
multiple RPs for different group ranges. As long as all RPs have an
interconnected MSDP topology, each can learn about active sources as
well as RPs in other domains.
MSDP can be used in domains that operate a dense-mode multicast
routing protocol. However, in some cases SA messages with
encapsulated source data may be required.
8.0 Packet Formats
MSDP messages will be carried over a TCP connection using well-known
port 639. One side of the MSDP peering relationship will listen on
the well-known port and the other side will do an active connect to
the well-known port. The side with the higher IP address will do the
listen. This connection establishment algorithm avoids call
collision. Therefore, there is no need for a call collision
procedure.
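The listen/connect rule can be sketched as a comparison of the two
peering addresses. The function name connection_role is hypothetical;
only the port number and the "higher address listens" rule come from
the text above.

```python
import ipaddress

MSDP_PORT = 639  # well-known TCP port for MSDP


def connection_role(local_ip, peer_ip):
    """Return "listen" or "connect" for the local side of a peering."""
    local = ipaddress.IPv4Address(local_ip)
    peer = ipaddress.IPv4Address(peer_ip)
    if local == peer:
        raise ValueError("local and peer addresses must differ")
    # The side with the higher IP address listens; the other connects.
    return "listen" if local > peer else "connect"
```

Because both sides evaluate the same comparison, exactly one of them
listens and the other connects, which is why no collision handling is
needed.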
MSDP messages will be encoded in TLV format:
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |            Length             |  Value ....   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type (8 bits)
Describes the format of the Value field.
Length (16 bits)
Length of Type, Length, and Value fields in octets. Minimum length
required is 3 octets.
Value (variable length)
Format is based on the Type value. See below. The length of the
value field is Length field minus 3.
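The TLV framing can be sketched with the struct module. Network
(big-endian) byte order for the 16-bit Length field is an assumption
of this sketch, as are the function names encode_tlv and decode_tlvs.

```python
import struct

TLV_HEADER_LEN = 3  # 1 octet Type + 2 octets Length


def encode_tlv(tlv_type, value):
    """Build one TLV; Length covers the Type, Length and Value fields."""
    length = TLV_HEADER_LEN + len(value)
    if length > 0xFFFF:
        raise ValueError("value too long for a 16-bit Length field")
    return struct.pack("!BH", tlv_type, length) + value


def decode_tlvs(data):
    """Split a byte string into a list of (type, value) pairs."""
    tlvs = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < TLV_HEADER_LEN:
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!BH", data, offset)
        if length < TLV_HEADER_LEN or offset + length > len(data):
            raise ValueError("invalid TLV length")
        # The Value field is Length minus 3 octets long.
        tlvs.append((tlv_type, data[offset + TLV_HEADER_LEN:offset + length]))
        offset += length
    return tlvs
```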
Documented Types:
IPv4 Source-Active TLV
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       1       |             x + y             |  Entry Count  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          RP Address                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Reserved            |  Gprefix Len  |  Sprefix Len  | \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  \
|                     Group Address Prefix                      |   ) z
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  /
|                     Source Address Prefix                     | /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type
IPv4 Source-Active TLV is type 1.
Length x
Is the length of the control information in the message. x is 8
octets (for the first two 32-bit quantities) plus 12 times Entry
Count octets.
Length y
If 0, then there is no data encapsulated. Otherwise an IPv4 packet
follows and y is the value of the Total Length field of the
encapsulated IPv4 header. If there are multiple SA TLVs in a message,
and data is also included, y must be 0 in all SA TLVs except the
last one. The last SA TLV must reflect the source and destination
addresses in the IP header of the encapsulated data.
Entry Count
Is the count of z entries (noted above) which follow the RP address
field. This is so multiple (S,G)s from the same domain can be
encoded efficiently for the same RP address.
RP Address
The address of the RP in the domain the source has become active in.
Gprefix Len and Sprefix Len
The route prefix length associated with the group address prefix
and source address prefix, respectively.
Group Address Prefix
The group address the active source has sent data to.
Source Address Prefix
The route prefix associated with the active source.
Multiple SA TLVs can appear in the same message and can be batched for
efficiency at the expense of data latency. This would typically occur
on intermediate forwarding of SA messages.
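Encoding an IPv4 Source-Active TLV can be sketched as below. Several
details are assumptions of this sketch rather than statements of the
specification: network byte order, a 16-bit Reserved field followed by
two 8-bit prefix lengths (inferred from the 12-octet entry size), and
the function name encode_sa.

```python
import socket
import struct

SA_TYPE = 1


def encode_sa(rp, entries, data=b""):
    """Encode one IPv4 Source-Active TLV.

    entries is a list of (group, gprefix_len, source, sprefix_len).
    data is an optional encapsulated IPv4 packet; its size is y.
    """
    if len(entries) > 255:
        raise ValueError("Entry Count is an 8-bit field")
    x = 8 + 12 * len(entries)  # control information, in octets
    y = len(data)              # 0 means no encapsulated data
    out = struct.pack("!BHB4s", SA_TYPE, x + y, len(entries),
                      socket.inet_aton(rp))
    for group, glen, source, slen in entries:
        # Reserved (16 bits, zero), Gprefix Len, Sprefix Len, then addresses.
        out += struct.pack("!HBB4s4s", 0, glen, slen,
                           socket.inet_aton(group), socket.inet_aton(source))
    return out + data
```

A TLV with a single entry and no encapsulated data is 20 octets long:
8 octets for the first two 32-bit words plus one 12-octet entry.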
IPv4 Source-Active Request TLV
Used to request SA-state from a caching MSDP peer. If an RP in a domain
receives a PIM Join message for a group, creates (*,G) state and wants to
know all active sources for group G, and it has been configured to peer
with an SA-state caching peer, it may send an SA-Request message
for the group.
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       2       |               8               |  Gprefix Len  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Group Address Prefix                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type
IPv4 Source-Active Request TLV is type 2.
Gprefix Len
The route prefix length associated with the group address prefix.
Group Address Prefix
The group address prefix the MSDP peer is requesting.
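The fixed 8-octet SA-Request TLV can be sketched as a pair of
encode/decode helpers. Network byte order and the function names are
assumptions of this sketch.

```python
import ipaddress
import struct

SA_REQUEST_TYPE = 2
SA_REQUEST_LEN = 8  # Type + Length + Gprefix Len + Group Address Prefix


def encode_sa_request(group_prefix):
    """Encode an SA-Request for a group prefix such as "224.1.0.0/16"."""
    net = ipaddress.IPv4Network(group_prefix)
    return struct.pack("!BHB4s", SA_REQUEST_TYPE, SA_REQUEST_LEN,
                       net.prefixlen, net.network_address.packed)


def decode_sa_request(data):
    """Return the requested group prefix as an IPv4Network."""
    tlv_type, length, plen, addr = struct.unpack("!BHB4s", data)
    if tlv_type != SA_REQUEST_TYPE or length != SA_REQUEST_LEN:
        raise ValueError("not an SA-Request TLV")
    return ipaddress.IPv4Network((addr, plen))
```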
IPv4 Source-Active Response TLV
Sent in response to a Source-Active Request message. The Source-Active
Response message has the same format as a Source-Active message but
does not allow encapsulation of multicast data.
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       3       |               x               |     ....      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type
IPv4 Source-Active Response TLV is type 3.
Length x
Is the length of the control information in the message. x is 8
octets (for the first two 32-bit quantities) plus 12 times Entry
Count octets.
9.0 Acknowledgements
The authors would like to thank David Meyer, John Meylor, Liming Wei,
Manoj Leelanivas, Mark Turner, and John Zwiebel for their design
feedback and comments.
10.0 Authors' Addresses
Dino Farinacci
Cisco Systems, Inc.
170 Tasman Drive
San Jose, CA, 95134
Email: dino@cisco.com
Yakov Rekhter
Cisco Systems, Inc.
170 Tasman Drive
San Jose, CA, 95134
Email: yakov@cisco.com
Peter Lothberg
Sprint
VARESA0104
12502 Sunrise Valley Drive
Reston VA, 20196
Email: roll@sprint.net
Hank Kilmer
Digex Inc.
One DIGEX Plaza
Beltsville, Maryland 20705
Email: hank@rem.com
Jeremy Hall
UUnet Technologies
3060 Williams Drive
Fairfax, VA 22031
Email: jhall@uu.net
11.0 References
[1] Estrin D., Farinacci, D., Helmy, A., Thaler, D., Deering, S.,
Handley M., Jacobson, V., Liu C., Sharma, P., Wei, L., "Protocol
Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification",
draft-ietf-idmr-pim-sm-specv2-00.txt, September 9, 1997.
[2] Thaler, D., Estrin, D., Meyer, D., "Border Gateway Multicast Protocol
(BGMP): Protocol Specification", draft-ietf-idmr-gum-01.txt, October 30,
1997.
[3] Rekhter, Y., and T. Li, "A Border Gateway Protocol 4 (BGP-4)", RFC 1771,
March 1995.
[4] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, "Multiprotocol
Extensions for BGP-4", RFC 2283, February 1998.
[5] Deering, S., "Multicast Routing in a Datagram Internetwork", PhD thesis,
Electrical Engineering Dept., Stanford University, December 1991.
[6] Pusateri, T., "Distance Vector Multicast Routing Protocol",
draft-ietf-idmr-dvmrp-v3-05.txt, October 1997.