Internet-Draft Grenville Armitage
Bellcore
July 12th, 1996
Issues affecting MARS Cluster Size
<draft-armitage-ion-cluster-size-00.txt>
Status of this Memo
This document was submitted to the IETF Internetworking over NBMA
(ION) WG. Publication of this document does not imply acceptance by
the ION WG of any ideas expressed within. Comments should be
submitted to the ion@nexen.com mailing list.
Distribution of this memo is unlimited.
This memo is an internet draft. Internet Drafts are working documents
of the Internet Engineering Task Force (IETF), its Areas, and its
Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.
Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use Internet
Drafts as reference material or to cite them other than as a "working
draft" or "work in progress".
Please check the 1id-abstracts.txt listing contained in the
internet-drafts shadow directories on ds.internic.net (US East
Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or
munnari.oz.au (Pacific Rim) to learn the current status of any
Internet Draft.
Abstract
IP multicast over ATM currently uses the MARS model [1] to
manage the use of ATM pt-mpt SVCs for IP multicast packet
forwarding. The scope of any given MARS's services is the MARS Cluster
- typically the same as an IPv4 Logical IP Subnet (LIS). Current
IP/ATM networks are usually architected with unicast routing and
forwarding issues dictating the sizes of individual LISes. However,
as IP multicast is deployed as a service, the sizes of LISes will
only be as big as a MARS Cluster can be. This document looks at the
issues that will constrain MARS Cluster size, and why large scale IP
over ATM networks might preferably be built with many small Clusters
rather than few large Clusters.
Armitage Expires January 12th, 1997 [Page 1]
Internet Draft <draft-armitage-ion-cluster-size-00.txt> July 12th, 1996
1. Introduction
A MARS Cluster is the set of IP/ATM interfaces that are willing to
engage in direct, ATM level pt-mpt SVCs to perform IP multicast
packet forwarding [1]. Each IP/ATM interface (a MARS Client) must
keep state information regarding the ATM addresses of each leaf node
(recipient) of each pt-mpt SVC it has open. In addition, each
MARS Client receives MARS_JOIN and MARS_LEAVE messages from the
MARS whenever there is a requirement that Clients around the
Cluster need to update their pt-mpt SVCs for a given IP multicast
group.
The definition of Cluster 'size' can mean two things - the number of
MARS Clients using a given MARS, and the geographic distribution of
MARS Clients. The number of MARS Clients in a Cluster impacts on
the amount of state information any given client may need to store
while managing outgoing pt-mpt SVCs. It also impacts on the
average rate of JOIN/LEAVE traffic that is propagated by the MARS on
ClusterControlVC, and the number of pt-mpt VCs that may need
modification each time a MARS_JOIN or MARS_LEAVE appears on
ClusterControlVC.
The geographic distribution of clients impacts on the latency between
a client issuing a MARS_JOIN, and it finally being added onto the
pt-mpt VCs of the other MARS Clients transmitting to the
specified multicast group. (This latency is made up of both the time
to propagate the MARS_JOIN, and the delay in the underlying ATM
cloud's reaction to the subsequent ADD_PARTY messages.)
2. Limitations on state storage
A Cluster should not contain more MARS Clients than the maximum
number of leaf nodes supportable by the most limited member of the
cluster.
Two items are affected by this limitation:
ClusterControlVC from the MARS. It has a leaf node per cluster
member (MARS Client). This limitation applies only to the node
supporting the MARS itself.
Packet forwarding SVCs out of each MARS Client for each IP
multicast group being sent to. The number of MARS Clients that
may choose to be members of a given group may encompass every MARS
Client in the cluster.
Under UNI 3.0/3.1 the most obvious limit on the size of a cluster
is the 2^15 leaf nodes that can be added to a pt-mpt SVC. However, in
practice most ATM NICs (and probably switches) are going to
impose a limit much lower than this - a function of how much per-leaf
node state information they need to store (and are capable of
storing) for pt-mpt SVCs.
A MARS Client may impose its own state storage limitations, such
that the combined memory consumption of a MARS Client and the ATM
NIC in a given host limits the client to fewer leaf nodes than
the ATM NIC alone might have been able to support.
Limitations of the switch to which a MARS or MARS Client is directly
attached may also impose a lower limit on leaf nodes than that of
the MARS, MARS Client, or ATM NIC. Cluster size is limited by the
most constraining of these limits.
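As a rough illustration of this rule, the following Python sketch simply takes the tightest of the applicable leaf node limits. The function name and the example figures are hypothetical; only the 2^15 ceiling under UNI 3.0/3.1 comes from the text above.

```python
UNI_LEAF_LIMIT = 2 ** 15  # UNI 3.0/3.1 ceiling on leaf nodes per pt-mpt SVC

def max_cluster_size(nic_limit, client_limit, switch_limit):
    """Return the largest Cluster the most constrained component permits."""
    return min(UNI_LEAF_LIMIT, nic_limit, client_limit, switch_limit)

# Hypothetical per-component limits, chosen only for illustration.
print(max_cluster_size(nic_limit=1024, client_limit=4096, switch_limit=2048))
```

In a real deployment the calculation would be repeated for the MARS and for every MARS Client, and the smallest result would bound the Cluster.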
It may be possible to work around leaf node limits by distributing
the leaf nodes across multiple pt-mpt SVCs operating in parallel.
However, such an approach requires further study, and is unlikely
to be a useful workaround for Client or NIC based limitations.
A related observation can also be made that the number of MARS
Clients in a Cluster may be limited by the memory constraints of the
MARS itself. It is required to keep state on all the groups that
every one of its MARS Clients has joined. For a given memory
limit, the maximum number of MARS Clients must drop if the average
number of groups joined per Client rises. Depending on the level of
group memberships, this limitation may be more severe than pt-mpt
leaf node limits.
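A back-of-the-envelope sketch of this tradeoff follows. It assumes, hypothetically, that the MARS spends a fixed number of bytes on each (Client, group) membership entry; real implementations will carry other overheads, and the memory budget and entry size below are illustrative only.

```python
def max_clients(memory_bytes, bytes_per_membership, avg_groups_per_client):
    """Clients a MARS can track if each membership costs a fixed amount."""
    return memory_bytes // (bytes_per_membership * avg_groups_per_client)

budget = 4 * 1024 * 1024  # hypothetical 4 MiB membership database
for groups in (1, 4, 16):
    # As the average number of groups per Client rises, capacity falls.
    print(groups, max_clients(budget, 64, groups))
```

Under these assumptions the supportable number of Clients falls in inverse proportion to the average number of groups joined per Client.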
3. Signaling load
In any given cluster there will be an 'ambient' level of
MARS_JOIN/LEAVE activity. What that level will actually be depends
on the types of multicast applications running on the majority
of the hosts in the cluster. It is reasonable to assume that as the
number of MARS Clients in a given cluster rises, so does the
ambient level of MARS_JOIN/LEAVE activity that the MARS receives
and propagates out on ClusterControlVC.
The existence of MARS_JOIN/LEAVE traffic also has a consequential
impact on signaling activity at the ATM level (across the UNI
and {P}NNI boundaries). For groups that are VC Mesh supported,
each MARS_JOIN or MARS_LEAVE propagated on ClusterControlVC will
result in an ADD_PARTY or DROP_PARTY message sent across the
UNIs of all MARS Clients that are transmitting to a given group.
As a cluster's membership increases, so does the average number
of MARS Clients that trigger ATM signaling activity in response to
MARS_JOIN/LEAVEs.
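The scaling described above can be sketched with a simple model. It assumes (and these are assumptions, not measurements) that every Client generates membership changes at the same rate, that all groups are VC Mesh supported, and that a fixed fraction of Clients transmits to any given group. The parameter names are hypothetical.

```python
def signaling_rates(clients, changes_per_client_per_s, sender_fraction):
    """Return (MARS message rate, UNI message rate) in messages per second.

    Assumes every membership change concerns a VC Mesh supported group and
    that a fixed fraction of Clients transmits to any given group.
    """
    mars_rate = clients * changes_per_client_per_s
    senders_per_group = clients * sender_fraction
    # Each propagated JOIN/LEAVE triggers one ADD_PARTY or DROP_PARTY
    # at every Client currently sending to the affected group.
    uni_rate = mars_rate * senders_per_group
    return mars_rate, uni_rate

for n in (100, 200, 400):
    print(n, signaling_rates(n, 0.01, 0.05))
```

In this model the MARS message rate grows linearly with the number of Clients, while the aggregate UNI signaling rate grows with its square.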
The size of a cluster needs to be chosen to provide some level
of containment to this ambient level of MARS and UNI/NNI
signaling.
Some refinements to the MARS Client behaviour may also be explored
to smooth out UNI signaling transients. The MARS spec currently
requires that revalidation of group memberships only occurs when
the Client starts sending new packets to an invalidated group SVC. A
Client could apply a similar algorithm to decide when it should
issue ADD_PARTYs after seeing a MARS_JOIN - wait until it actually
has a packet to send, send the packet, then initiate the
ADD_PARTY. As a result actively transmitting Clients would update
their SVCs sooner than intermittently transmitting Clients. This
requires careful implementation of the Client state machine.
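A minimal sketch of such a deferred update follows. The class, its method names, and the in-memory log standing in for UNI signaling are all hypothetical; the MARS specification does not define this behaviour, and a real Client state machine would also need to handle MARS_LEAVE, revalidation, and signaling failures.

```python
class LazyClient:
    """Hypothetical MARS Client that defers ADD_PARTY until it next transmits."""

    def __init__(self):
        self.leaves = {}    # group -> set of leaf ATM addresses on the SVC
        self.pending = {}   # group -> leaves seen in MARS_JOINs, not yet added
        self.signaled = []  # log of (message, group, leaf) sent across the UNI

    def on_mars_join(self, group, atm_addr):
        # Only groups with an open SVC matter to this sender.
        if group in self.leaves and atm_addr not in self.leaves[group]:
            self.pending.setdefault(group, set()).add(atm_addr)

    def send(self, group, packet):
        # The packet goes out on the existing SVC first...
        delivered_to = sorted(self.leaves.get(group, ()))
        # ...then the deferred ADD_PARTYs are issued.
        for addr in sorted(self.pending.pop(group, ())):
            self.signaled.append(("ADD_PARTY", group, addr))
            self.leaves[group].add(addr)
        return delivered_to
```

A Client that transmits often drains its pending set almost immediately, while an idle Client generates no UNI signaling at all until it next has something to send.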
4. Group change latencies
The group change latency can be defined as the time it takes for all
the senders to a group to have correctly updated their forwarding
SVCs after a MARS_JOIN or MARS_LEAVE is received from the MARS. This
is affected by both the number of Cluster members and the
geographical distribution of Cluster members.
The number of Cluster members affects the ATM level signaling load
offered as soon as a MARS_JOIN or MARS_LEAVE is seen. If the load
is high, the ATM Cloud itself may suffer slow processing of the
various SVC modifications that are being requested.
Wide geographic distribution of Cluster members delays the
propagation of MARS_JOIN/LEAVE and ATM UNI/NNI messages. The further
apart various members are, the longer it takes for them to receive
MARS_JOIN/LEAVE traffic on ClusterControlVC, and the longer it takes
for the ATM network to react to ADD_PARTY and DROP_PARTY
requests. If the long distance paths are populated by many ATM
switches, propagation delays due to per-switch processing will add
substantially to delays due to the speed of light.
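The two delay components can be combined in a crude lower-bound estimate. The sketch below assumes a signal speed of roughly 200,000 km/s in fibre and a fixed, hypothetical processing time per switch. It counts only the one-way trips of the MARS_JOIN and the ADD_PARTY, ignoring acknowledgements and queueing.

```python
FIBRE_KM_PER_S = 200_000  # approximate signal speed in optical fibre

def one_way_delay_s(distance_km, switches, per_switch_s):
    """Propagation delay plus per-switch processing along one path."""
    return distance_km / FIBRE_KM_PER_S + switches * per_switch_s

def group_change_latency_s(mars_to_sender, sender_to_member):
    """Time for one sender to add a new member after the MARS emits a JOIN.

    Each argument is a (distance_km, switches, per_switch_s) tuple:
    first the ClusterControlVC path, then the ADD_PARTY signaling path.
    """
    return one_way_delay_s(*mars_to_sender) + one_way_delay_s(*sender_to_member)

# Hypothetical 4000 km paths through 10 switches at 1 ms per switch.
print(group_change_latency_s((4000, 10, 0.001), (4000, 10, 0.001)))
```

With these illustrative figures the per-switch processing contributes a third of each path's delay, and its share grows with every additional switch.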
Unfortunately, some mechanisms for smoothing out the transient ATM
signaling load described in section 3 have a consequence of
increasing the group change latency (since the goal is for some
of the senders to deliberately delay updating their forwarding SVCs).
A related effect will also be felt by the MARS itself. The larger
the MARS database, the longer it may take to process MARS_JOIN/LEAVE
messages (which involve locating and updating individual group
entries). Whilst this issue may not be important for conferencing
applications (with group membership changes on the human time
frame), high speed simulation environments may find such
considerations important.
5. Large IP/ATM networks using Mrouters
Building a large scale, multicast capable IP over ATM network is a
tradeoff between Cluster sizes and numbers of Mrouters. For a given
number of hosts across the entire IP/ATM network, as cluster
sizes drop you need more of them. Clusters must be interconnected by
Mrouters, so the number of Mrouters rises. (The actual rise
in the number of Mrouters depends largely on the logical IP
topology you choose to implement, since a single physical Mrouter
may interconnect more than two Clusters at once.) It is a local
deployment question as to what the optimal mix of Clusters and
Mrouters will be.
A constructive way to view conventional Mrouters is that they
are aggregation points for signaling and data plane loads. An
Mrouter hides group membership changes in one cluster from
senders within other Clusters, and protects local group members from
being swamped by SVCs from senders in other Clusters.
MARS_JOIN/LEAVE traffic in one Cluster is hidden from the members of
all other Clusters. (The consequential UNI signaling load is
localized to the source Cluster too.) Group members in a cluster are
fed packets from an SVC originating on the MARS Client residing in
their local Mrouter, rather than terminating multiple SVCs
originating on the actual senders in remote Clusters.
As a side effect of the Mrouter's role in aggregating data path
flows, it reduces the impact of SVC leaf-node limits. A hypothetical
10000 node Cluster could be broken into two 5000 node Clusters,
or four 2500 node Clusters. In each case the individual Cluster
members need only source pt-mpt SVCs with maximums of 5000 or 2500
leaf nodes respectively.
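The same arithmetic can be run in the other direction: given a leaf node limit, how many Clusters are needed? The sketch below is illustrative only. The function name is hypothetical, and the count ignores the extra Cluster members contributed by the Mrouters' own interfaces.

```python
def clusters_needed(total_hosts, leaf_limit):
    """Smallest number of equal-sized Clusters that respects the leaf limit."""
    # Integer ceiling division, avoiding floating point.
    return -(-total_hosts // leaf_limit)

for limit in (5000, 2500, 3000):
    print(limit, clusters_needed(10000, limit))
```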
6. Large IP/ATM networks using Cell Switch Routers (CSRs)
A Cell Switch Router may act as a conventional Mrouter, and provide
all the benefits described in the previous section. However, one of
the useful characteristics of the CSR is the ability to internally
'short-cut' the cells from an incoming VCC to an outgoing VCC. Once
the CSR has identified a flow of IP traffic, and associated it
with an inbound and outbound VCC, it begins to function as an ATM
cell level device rather than a packet level device. Even when
operating in a 'short-cut' mode the CSR is still able to protect
Clusters from the MARS_JOIN/LEAVE activities of surrounding Clusters.
From the perspective of Clusters to which the CSR is directly
attached, the CSR terminates and originates pt-mpt SVCs. It acts as
the path out of a source Cluster, and the entry point into a
target Cluster. It remains unnecessary for senders in one Cluster
to issue ADD_PARTY or DROP_PARTY messages in response to
group membership changes in other Clusters - the CSR tracks these
changes, and updates the pt-mpt trees rooted on its own ATM ports as
needed.
However, there is one significant point of difference to a
conventional Mrouter - a simple CSR cannot aggregate the packet
flows from multiple senders in one Cluster onto a single SVC
into an adjacent Cluster. Within a Cluster with multiple sources, the
CSR is a leaf node on an individual SVC per source (just like a
conventional Mrouter). But if it chooses to 'short-cut' traffic at
the cell level to group members in another Cluster, it must
construct a separate forwarding SVC into the target cluster to
match each VCC from each sender in the source Cluster. This
requirement stems from the need to maintain AAL_SDU boundaries at
the ultimate recipients - the group members in the target cluster.
If the cells from individual senders in the source Cluster were
FIFO merged onto a single outgoing SVC into the target Cluster,
recipients in the target Cluster would have a hard time
reconstructing individual AAL_SDUs from the interleaved cells. (This
is mostly due to our use of AAL5. AAL3/4 could provide a
solution using the MID field, although we would be limited to
2^10 senders per Cluster and introduce a MID management problem.)
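The reassembly problem can be demonstrated with a toy model. In the sketch below a cell carries only a payload and an end-of-SDU flag, as in AAL5, with no indication of which sender it came from. The 4-byte cell payload and the function names are simplifications chosen for illustration; real AAL5 cells carry 48 bytes and the SDU has a trailer with length and CRC.

```python
def segment(sdu, cell_payload=4):
    """Split an SDU into (payload, is_last) cells, AAL5 style (no sender id)."""
    chunks = [sdu[i:i + cell_payload] for i in range(0, len(sdu), cell_payload)]
    return [(c, i == len(chunks) - 1) for i, c in enumerate(chunks)]

def reassemble(cells):
    """Concatenate payloads until an end-of-SDU cell, as an AAL5 receiver would."""
    sdus, current = [], b""
    for payload, is_last in cells:
        current += payload
        if is_last:
            sdus.append(current)
            current = b""
    return sdus

def fifo_merge(a, b):
    """Interleave two cell streams cell by cell onto one VC."""
    merged = []
    for i in range(max(len(a), len(b))):
        merged.extend(s[i] for s in (a, b) if i < len(s))
    return merged

a, b = segment(b"AAAAAAAA"), segment(b"BBBBBBBB")
print(reassemble(a + b))             # SDUs sent back to back
print(reassemble(fifo_merge(a, b)))  # cells interleaved on a single SVC
```

Sent back to back, both SDUs are recovered intact. Once the cells are interleaved, the receiver has no means of telling the two senders apart and delivers corrupted SDUs (which a real AAL5 receiver would discard on CRC failure).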
Interestingly, this problem can magnify the UNI signaling load
offered within the target Cluster whenever a new group member
arrives. If there are N senders in the source Cluster, the CSR will
have built N identical pt-mpt SVCs out to the group members within
the target Cluster. If a new MARS_JOIN is issued within the
target Cluster, the CSR must issue N ADD_PARTYs to update its SVCs
into the target Cluster. (Under similar circumstances a
conventional Mrouter would have issued only one ADD_PARTY for its
single SVC into the target Cluster.)
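The difference in signaling cost reduces to a one-line comparison, sketched below. The function name and its boolean parameter are hypothetical, and the model assumes at least one active sender in the source Cluster.

```python
def add_parties_for_join(senders_in_source_cluster, short_cut):
    """ADD_PARTYs the border device issues into the target Cluster
    when one new member joins a group there (at least one sender assumed)."""
    if short_cut:
        # One outbound SVC per source VCC, so each needs the new leaf.
        return senders_in_source_cluster
    # A conventional Mrouter forwards everything on a single outbound SVC.
    return 1

print(add_parties_for_join(20, short_cut=True))   # simple CSR
print(add_parties_for_join(20, short_cut=False))  # conventional Mrouter
```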
A possible solution is for the CSR's underlying cell switching fabric
to provide AAL_SDU-aware cell forwarding. If segmented AAL_SDUs
arriving from the source Cluster could be buffered and forwarded
in groups of cells representing entire AAL_SDUs, the CSR would need
only a single SVC into the target Cluster. Its impact on the
Clusters it was attached to would then be the same as that of a
conventional Mrouter. (This does not necessarily imply full
re-assembly followed by segmentation. It would be sufficient for the
incoming cells to be buffered in sequence, and then fed onto the
outbound SVC. The CSR's switch fabric would not be performing any
AAL level checks other than detecting AAL_SDU boundaries.)
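A minimal sketch of such AAL_SDU-aware forwarding follows. It assumes cells arrive tagged with their incoming VCC and an end-of-SDU flag; the function name and the tuple layout are hypothetical. Cells are buffered per incoming VCC and released onto the single outbound SVC only when the final cell of an SDU arrives, so no payload is inspected or re-segmented.

```python
def sdu_aware_merge(arrivals):
    """Merge cells from several incoming VCCs onto one outbound SVC.

    arrivals: iterable of (vcc, payload, is_last) in arrival order.
    Returns (payload, is_last) cells with each SDU's cells kept contiguous.
    """
    buffers = {}  # incoming VCC -> cells of the SDU currently in progress
    out = []
    for vcc, payload, is_last in arrivals:
        buffers.setdefault(vcc, []).append((payload, is_last))
        if is_last:
            # Whole SDU buffered: release its cells as one uninterrupted run.
            out.extend(buffers.pop(vcc))
    return out

interleaved = [(1, b"A1", False), (2, b"B1", False),
               (1, b"A2", True), (2, b"B2", True)]
print(sdu_aware_merge(interleaved))
```

The cost of this approach is buffer space for one partial SDU per incoming VCC, plus the added delay of holding each SDU until its last cell arrives.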
7. The impact of Multicast Servers (MCSs)
The MCS has an intra-Cluster effect somewhat analogous to the
inter-Cluster effect of the Mrouter. It aggregates AAL_SDU flows
around the Cluster into a single pt-mpt SVC. This single pt-mpt
SVC is the only one that needs to be updated when an intra-cluster
group membership change occurs.
It also reduces the amount of MARS_JOIN/LEAVE traffic on
ClusterControlVC - such messages for MCS supported groups are
propagated out on ServerControlVC, thus interrupting only the
(presumably smaller) set of MCSs attached to the MARS. One way to
look at an MCS is as a stripped-down Mrouter, operating intra-Cluster
and performing minimal (if any) forwarding decisions based on IP
level information. Whether the use of MCSs allows you to deploy
larger Clusters depends on the mix of MCS supported groups and VC
Mesh supported groups within your Cluster.
8. Conclusion
This short document has provided a high level overview of the
parameters affecting the size of MARS Clusters within multicast
capable IP/ATM networks. Limitations on the number of leaf nodes a
pt-mpt SVC may support, sizes of the MARS database, propagation
delays of MARS and UNI messages, and the frequency of MARS and
UNI control messages are all identified as issues that will
constrain Clusters. Mrouters (either conventional or in Cell
Switch Router form) were identified as useful aggregators of IP
multicast traffic and signaling information. Large scale IP
multicasting over ATM requires a combination of Mrouters and
appropriately sized MARS Clusters.
Security Considerations
Security considerations are not addressed in this document.
Acknowledgments
Author's Address
Grenville Armitage
Bellcore, 445 South Street
Morristown, NJ, 07960
USA
Email: gja@thumper.bellcore.com
Ph. +1 201 829 2635
References
[1] G. Armitage, "Support for Multicast over UNI 3.0/3.1 based ATM
Networks.", Bellcore, INTERNET DRAFT, draft-ietf-ipatm-ipmc-12.txt,
February 1996.