One document matched: draft-ietf-rmt-bb-tree-config-01.txt
Differences from draft-ietf-rmt-bb-tree-config-00.txt
RMT Working Group Miriam Kadansky/Sun Microsystems
Internet Engineering Task Force Brian Neil Levine/UMass
INTERNET-DRAFT Dah Ming Chiu/Sun Microsystems
draft-ietf-rmt-bb-tree-config-01.txt Brian Whetten/Talarian
22 November, 2000 Gursel Taskale/Talarian
Expires 22 May 2001 Brad Cain/Mirror Image
Dave Thaler/Microsoft
Seok Joo Koh/ETRI/Korea
Reliable Multicast Transport Building Block: Tree Auto-Configuration
<draft-ietf-rmt-bb-tree-config-01.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are valid for a maximum of six months and may be
updated, replaced, or obsoleted by other documents at any time. It
is inappropriate to use Internet-Drafts as reference material or to
cite them other than as a "work in progress".
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
The Reliable Multicast Transport Working Group has been chartered to
standardize multicast transport services. This working group expects
to initially standardize three protocol instantiations. This draft is
concerned with the requirements of the tree-based ACK protocol. In
particular, it is concerned with defining the building block for
auto-configuration of the logical ACK-tree. According to the charter,
a building block is "a coarse-grained modular component that is
common to multiple protocols along with abstract APIs that define a
building block's access methods and their arguments." For more
information, see the Reliable Multicast Transport Building Blocks and
Reliable Multicast Design Space documents [WLKHFL00][HWKFV00].
draft-ietf-rmt-bb-tree-config-01 [Page 1]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
Table of Contents
1. Introduction
1.1 Terminology
1.2 Assumptions
1.3 Requirements
1.4 Applicability Statement
2. Overview
3. Distance Metrics
3.1 Multicast Beacon
3.2 Mesh
3.3 Address
3.4 Static
3.5 GRA
4. Session Announcement
5. Service Node Discovery
5.1 Service Node Discovery Algorithms
5.1.1 Static
5.1.2 ERS
5.1.3 POC
5.1.4 Mesh
5.2 Service Node Selection
6. Tree Formation
7. Tree Maintenance
7.1 Monitoring Parents and Children
7.2 Optimizing the Tree
8. Messages
9. External Interfaces
9.1 Interfaces to the BB
9.1.1 findSNs
9.1.2 messageFromSN
9.1.3 lostSN
9.1.4 activateSN
9.1.5 setOptimization
9.1.6 setSN
9.1.7 acceptMember
9.2 Interfaces from the BB
9.2.1 SNlist
9.2.2 SNactive
10. References
Authors' Addresses
draft-ietf-rmt-bb-tree-config-01 [Page 2]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
1. Introduction
The Reliable Multicast Transport (RMT) working group has been
chartered to standardize IP multicast transport services. This draft
is concerned with the requirements of the tree-based ACK
protocol[WCP00]. In particular, this draft defines a building block
for auto-configuration of a logical ACK-tree comprised of a single
Sender, Service Nodes, and Receivers into a tree (called a Session
Tree in this document). The design goals of this draft are motivated
by the needs of TRACK-based protocols, however the trees it
constructs are useful for other services.
The process of session tree construction is difficult for IP
multicast. The best session trees match the underlying (i.e.
forwarding tree) multicast routing tree topology[LPG98], however the
multicast service model[DEE89] does not provide explicit support for
discovering routing tree topology. Furthermore, deployed multicast
architectures can vary; for example, hosts may be restricted to
multicast reception and not transmission with Source-Specific
routing[HC00]; and routers may provide special extended routing
services with Generic Router Assist [CST00]. The RMT charter does
not restrict the use of any particular network service in
constructing the tree, it only suggests preferred scenarios.
Accordingly, there are several viable solutions for constructing a
tree.
The optimality of a tree may also depend on other factors, such as
the need for load balancing, and the need to minimize the depth when
used for collecting feedback information. The goal of this building
block is to specify a distributed procedure for automatically
constructing a tree that is loop-free, but only as good as possible
depending on the information available.
This draft describes a unified solution for tree construction in the
presence of different multicast service models and routing protocols.
In particular, it specifies a single procedure which may be used with
various techniques for service node discovery and distance
measurements, several of which are specified within this document.
The difference in these techniques primarily affects the optimality
of the tree. The unified algorithm ensures that different
implementations can interoperate and construct a loop-free tree.
In order to accommodate various multicast deployments, this document
divides the tree building process into the following major
components:
1. Several techniques for discovering neighboring service nodes.
In particular: Static, Expanding Ring Search, Point-of-Contact, and
Mesh. Discovering neighboring service nodes is a necessary
condition for getting connected, so each node in the session
must implement at least one of the above techniques.
2. Several algorithms for selecting neighboring service nodes.
In particular, the measurement and use of distance-to-neighbor
and distance-to-sender are described. These service node
selection algorithms help produce a good tree.
3. A single distributed procedure for construction and maintenance
of loop-free Session Trees.
This draft is required to conform to the guidelines specified in
draft-ietf-rmt-bb-tree-config-01 [Page 3]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
draft-ietf-rmt-author-guidelines-00, Author Guidelines for RMT
Building Blocks and Protocol Instantiation Documents [KV00]. A
future version of this draft will more explicitly describe this
conformance.
1.1 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.
Session
A session is used to distribute data over a multicast address. A
tree is used to provide reliability service for a session.
Sender
The single sender of data on a multicast session. The Sender is the
root of the Session Tree.
Receiver
A receiver receives data from the sender via the multicast session.
Session Identifier
A fixed-size number, chosen either by the application that creates
the session or by the transport. Senders and Receivers use the
Session Identifier to distinguish sessions. The size of this number
is specified by the PI.
Service Node (SN)
A node within the tree which receives and retransmits data, and
aggregates and forwards control information toward the Sender. The
Sender operates as the root Service Node in any session tree.
Service Nodes MAY be dedicated servers within a network designed to
participate in multiple Sessions and support multiple trees, or they
MAY be Receivers participating in an individual session. SNs MAY
limit the number of Children they choose to service, and MAY also
make other restrictions on the characteristics of each Child
(distance, location, etc.). An SN that has accepted Children for a
session is called a Parent.
Session Tree (ST)
The Session Tree is a spanning tree per Session, rooted at the Sender
and consisting of a number of Service Nodes. An ST is constructed
for the forwarding of control information back to the Sender as well
as for the resending of missed data to the Receivers. The ST for a
particular session may change over the course of the session.
Parent
A Parent is an SN or Receiver's predecessor in the ST on the path
toward the Sender. Every SN or Receiver on the tree except the
Sender itself has a parent. Each Parent communicates with its
children using either an assigned multicast address or through
unicast. If a multicast address is used, this may be the same
draft-ietf-rmt-bb-tree-config-01 [Page 4]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
address used by the session, or one specifically assigned to the
Parent.
Children
The set of Receivers and SNs for which an SN is providing repair and
feedback services.
Tree Level
A number indicating the number of "generations" away from the root
The sender is at TL=0. Those that use the sender as service node are
at TL=1 and so on. When a receiver is not connected to the tree yet,
it has a tree level value greater or equal to 128. The reason for
reserving part of the space (of tree levels) for indicating "off-
tree" is so that special measures can be used to prevent forming
loops. The largest value is 256, so the range of off-tree levels are
in the range 128 - 256. Initially, all receivers have a TL value of
128. Once a Node joins the tree, its Tree Level is updated to be one
more than its Parent's level.
Distance Metric
There are several techniques to quantify distances between nodes
(Receivers, SNs, and the Sender) in a session. Each type of
quantification is called a distance metric. Several Distance Metrics
are described in this draft.
Sender Distance
The distance from a node to the Sender.
Neighbor Distance
The distance from a node to a Neighbor.
Neighbor
A node's (Receiver or SN) Neighbors are SNs that are close to the
node, according to the Distance Metric(s) used by the node.
1.2 Assumptions
This document describes how to build trees under the following
conditions:
a. The multicast group has only a single sender.
b. A single SN can serve multiple sessions.
c. Sessions can take advantage of a pre-deployed infrastructure of
SNs (ones that are not necessarily aware of a session before
the receivers), or recruit Receivers to be SNs.
d. Generic Router Assist[CST00] and Expanding Ring Search[YGS95]
are not required of the network infrastructure, but if available
they should be able to be utilized.
1.3 Requirements
The following are specifically required:
draft-ietf-rmt-bb-tree-config-01 [Page 5]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
a. While tree-building may take advantage of information from the
routing layer, the mechanisms described are independent of the
routing protocol(s) used by the underlying multicast tree.
b. All trees constructed must be loop-free
c. These mechanisms must support late joiners and tree optimization
1.4 Applicability Statement
The authors recognize that automatic tree construction is a very
difficult problem. Nonetheless, complete reliance on manual
configuration is very user unfriendly and error prone as well.
This building block describes a procedure for constructing loop-free
trees when there is minimal manual configured information available.
This is analogous to providing a system with default configurations
that allow the system to work correctly, but not necessarily
optimally.
There are many possible criteria for tree optimality. This BB does
not attempt to define a single optimality criterion, nor does it
tries to produce an optimal tree. It is, however, the goal of the BB
to construct better trees as more configuration and measurement data
are introduced to the procedure.
This BB describes only a subset of the relevant parameters for
constructing a better tree, in particular sender distance and
neighbor distance. There are many techniques for measuring these
distances. Some of the techniques may not be applicable globally.
Expanding ring search (ERS) is an effective technique in a local
subnet or intranet (specially when the IP multicast routing protocol
is dense-mode based). On the other hand, it is not practical in a
multi-domain network; it is not effective when the routing protocol
is sparse-mode based; and it can add a significant control traffic
overhead.
Generic Router Assist (GRA) can provide measurement hooks to
determine SNs that are located along the path for multicast data
distribution. But such facilities many not be available in all
networks.
The tree construction procedure does allow manual configuration and
various distance measurement techniques to be selectively and
independently applied for different subgroup of receivers and SNs, to
achieve incremental improvement to the quality of the tree.
2. Overview
The tree building process described within this document builds
logical trees which consist of:
1. A root node which is the Sender
2. Intermediate nodes (Service Nodes or SNs) which may be either
Receivers or nodes specifically allocated to the task of repair
and aggregation
3. Leaf nodes which are Receivers only
ACK trees are spanning trees rooted at the Sender. ACK trees can be
used for the forwarding of control information (i.e. ACKs) towards
draft-ietf-rmt-bb-tree-config-01 [Page 6]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
the root, or for forwarding of data (i.e. repairs) towards the leafs.
ACK trees are constructed per Sender; each node wishing to join the
tree discovers its neighboring SNs and then selects its best parent
based on locally available information, such as the relative sender
distances and neighbor distances. This document specifies several
techniques for measuring distances.
It is important to note that SNs may be actual Receivers (e.g.
Receivers willing and able to also function as SNs) or pre-deployed
"specialized" servers that are prodded into service by Receivers. We
use the term Service Node to refer to either a Receiver or "server"
which is participating as part of the logical tree formation.
Tree construction, regardless of SN discovery and selection
algorithm, proceeds generically as follows.
1. Session Announcement
Receivers of a session use standard out-of-band mechanisms for
discovery of a session's existence (e.g. Session Advertisement
[HPW00], URL, etc). In this way, a Receiver discovers the multicast
group address, the Sender's address, and other information necessary
for logical tree construction. Sessions may be announced in two
parts, the first part containing generic information about the
session, such as the multicast address, and the second part,
announced on the multicast address, containing additional information.
2. Measurements (optional)
All SNs and Receivers that know about the session optionally determine
their distance to the Sender.
3. Neighbor Discovery
Meanwhile, each Receiver discovers nearby SNs for the Session using
the neighbor discovery algorithm(s).
4. Service Node Selection
Once a Receiver (or SN needing to join the tree) discovers a nearby SN,
it obtains the SN's distance to the Sender as well as its distance to
the Receiver, tree level, and other suitability values, if available.
After discovery is complete, the best SN is selected.
(Note, in the case when the POC algorithm is used for neighbor
discovery, neighbor discovery and selection may become a combined
step, performed at the POC).
5. Binding to Service Node
The Receiver or SN then binds to the chosen SN. If a bind is
unsuccessful, the Receiver or SN retries with another nearby SN,
or starts the discovery process all over again. Once an SN receives
a bind from a child, that SN must then also join the tree if it has
not already, discovering an SN of its own, possibly using a different
method than leaf Receivers.
6. Optimization and Fault Recovery (optional)
During a session, a Receiver or SN may change to a different SN for a
number of reasons described below, including fault tolerance. The Session
Tree is maintained and optimized over time.
draft-ietf-rmt-bb-tree-config-01 [Page 7]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
The building block provides mechanisms for maintaining and optimizing
the tree, as well as tearing it down when the Sender is done with it.
In the rest of this document, the term 'Node' denotes a Receiver or
SN.
+--------------------+
| |
| 1. Session |
| Advertisement |
| |
+--------------------+
|
| Node receives tree-building parameters
|
V
+--------------------+
| |
| 2. Measurements |<-------------------------|
| | |
| (optional) | |
+--------------------+ |
| |
| |
| |
V |
+--------------------+ |
| | |
| 3. Neighbor | |
| Discovery | |
| | |
+--------------------+ |
| |
| |
| |
V |
+--------------------+ |
| | |
| 4. Service Node | |
| Selection | |
| | |
+--------------------+ |
| |
| |
| Node picks best Neighbor |
| |
V |
+--------------------+ |
| | |
| 5. Binding to |---------------------------
| Service Node | 6. Optimization and
| | Fault Tolerence
+--------------------+
3. Distance Measurement
Different techniques can be used to determine distances between nodes
in a session. These distances can be used in selecting a Service
Node if several are discovered.
draft-ietf-rmt-bb-tree-config-01 [Page 8]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
These techniques quantify distance differently. Each specific way of
quantifying distance is called a metric. Different Metrics are not
necessarily comparable. For example, if distance between A and B is
X using metric m1, and distance between A and C is Y using metric m2,
then X > Y does not necessarily imply B is farther from A than C.
Only distances of the same metric should be compared and ranked.
Metrics quantify how close two nodes are from each other. Though not
all of them measure specific distances, we will refer to these
measures as distances in this document.
It is not necessary for all nodes to use the same metric to select
their neighboring SN to connect to.
3.1 Multicast Beacon
Periodically, a node sends a Beacon message on the Session's
multicast address. The Beacon message includes the original time-
to-live value set by the node. The distance to the node is then
calculated as
Beacon's original TTL - Beacon's current TTL
One node is closer than another if its distance is a lower number.
3.2 Mesh
This metric is just like the Multicast Beacon Metric, except that
nodes query their local router to determine their distance to the
sender.
3.3 Address
For IPV6 addresses[HOD98], distance can be approximately determined
by the number of aggregation levels one address has in common with
another. For this metric, one node is closer than another if its
address has more aggregation levels in common with the querying
node's address.
3.4 Static
The node's distance to other nodes is made available in some well-
known location. One node's is closer than another if its distance is
a lower number.
3.5 GRA
The node's distance to the sender is determined by a GRA message that
lists the set of GRA routers on the path from the source. Nodes (and
POCs) can use these path-strings to determine relative locations of
SNs.
4. Session Announcement
The first step in the tree-building process is for a node to be
informed of the session's existence. This can be done using some
out-of-band method, such as Session Advertisement [HPW00], URL, e-
mail, etc.
draft-ietf-rmt-bb-tree-config-01 [Page 9]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
SNs do not necessarily receive these advertisements. If an SN is not
a Receiver, it obtains the advertisement information once it is
contacted by a Receiver.
The advertisement includes the multicast address being used, the
Sender's address, the Session Identifier, any specific port numbers
to be used, and any global information useful for tree construction
(such as a sender POC), to be described later.
5. Service Node Discovery
Discovery is the process by which a node determines a suitable
Service Node. During the discovery process, SNs are found, Sender
distances are exchanged, and the best SN is selected.
5.1 Service Node Discovery Algorithms
This draft describes four algorithms for discovering SNs: Static,
Expanding Ring Search (ERS), Point-of-Contact (POC), and Mesh.
Multiple algorithms may be used within a single session. In
particular, one algorithm may be used by statically configured SNs in
a session, while a different algorithm is used by Receivers of the
session. For instance, SNs for a session may use the Mesh algorithm,
while the receivers use static configuration to discover the SNs.
The transport protocols request the following information from this
BB by providing a unique identifier for the data session to join.
This identifier, called a Session Identifier, is obtained from the
Session Advertisement algorithm in the PI.
Service Nodes:
1. ParentAddress: the address and port of the parent node to which
the node should connect
2. UDPListenPort: the number of the port on which the node will listen
for its children's control messages
3. RepairAddr: the multicast address, UDP port, and TTL on which this
node sends control messages to its children.
Receivers:
1. ParentAddress.
Senders:
1. UDPListenPort
2. RepairAddr
After the above information is obtained from auto-tree-config, the
transport protocol may perform the necessary Bind operations for
participating in the Session Tree.
5.1.1 Static
A list of Neighbors is made available to a Node in some well-known
location. The location of the Neighbor list, which could be a file
name, a URL, or a DNS name, MAY be included in the session
advertisement. A node MAY query these neighbors for their Sender
distances and pick the best one, but it may simply pick one and try
to bind to it. It is important to carefully configure static trees
draft-ietf-rmt-bb-tree-config-01 [Page 10]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
without loops. If the tree-building process detects that binding two
nodes would form a loop, the bind will not take place and the node
will not be able to join the tree unless there are alternative static
neighbors.
5.1.2 ERS
Nodes look for Neighbors using a multicast QUERY message. The initial
TTL value in the QUERY message is specified in the session
announcement and may be as large as the session TTL. An SN that is
able to handle additional Children SHOULD respond to a QUERY message
by multicasting a ADVERTISE message. If the SN is not yet a Parent,
the TTL used in this response is the same TTL used in the QUERY
message. If the SN is a Parent, the TTL used is the greater of the
QUERY TTL and the Parent's current ADVERTISE TTL.
The Node listens for ADVERTISE messages after sending the QUERY
message. If one or more ADVERTISE messages are received during a
solicit period, the best SN among them is selected as described in
section 5.2. If no ADVERTISE messages are received, the Node sends
another multicast QUERY message with an incremented TTL. The process
of sending the multicast QUERY message with an increasing TTL value
continues until a response is received. The TTL value is limited by
a value specified in the advertisement.
If the TTL value required to reach the soliciting Node is greater
than the TTL used to reach the SN, a ADVERTISE message may not reach
the Node. However, if future QUERY messages have increased TTL
values, the TTL may eventually be large enough for the ADVERTISE
message to reach the Node. However, it is possible that the Node
will not locate any SNs using Expanding Ring Search. It is advisable
that a backup method, such as static, be available.
SNs SHOULD suppress sending ADVERTISE messages in response to QUERY
messages if one was sent with at least the QUERY's TTL within the
last SolicitInterval (a multicast session parameter).
Nodes MAY suppress sending QUERY messages for one interval when they
first start in order to collect information from ADVERTISE messages
already solicited by other nodes.
5.1.3 POC
Nodes query using unicast a designated node, the POC, for Neighbors.
The POC returns one or more choices of SN, from which the node
chooses. The POC scheme is suited for scenarios where receivers lack
the ability to use multicast for ERS, no mesh is deployed, and
dynamic construction of the session tree is required.
POCs may be well-known or discovered. POCs do not join the multicast
tree of a session. Receivers and SNs inform the POC they wish to
join the session tree and the POC recommends an SN to use. Local
POCs contact the Sender's POC for interdomain connectivity.
POCs are not necessarily specialized infrastructure. Any receiver,
SN, or the Sender may be assigned the role at the start of the
session, as long as all hosts have a unified view of who the local
POC is. We expect an existing local SN may typically be appointed as
the POC.
draft-ietf-rmt-bb-tree-config-01 [Page 11]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
Alternatively, POCs could be specially deployed redirection servers
similar to DNS redirect servers implementing the messages and
interfaces described in this BB.
The address of the Sender's POC MUST be included with the session
advertisement. Local POCs SHOULD be used to configure trees within a
single domain. Receivers and SNs may discover local POCs via a
static configuration or other more dynamic mechanisms such as an
anycast service [PTM93] or directory services. These local POCs MUST
be told of the Sender's POC in order to connect all trees across
domains. POCs also serve as an interface between interdomain and
intradomain tree construction.
SN discovery with POCs proceeds as follows:
1. The node joins the multicast session and learns of the Sender's
POC or a local POC from the session announcement or directory services.
2. The node optionally discovers its distance from the Sender. Any
metric described in Section 3 may be used.
3. The node sends a QUERY message to the POC for a parent. The request
includes (optionally) the node's distance to the sender.
4. The POC chooses a parent for the node and sends the IP address (and
port) of the parent in an ADVERTISE packet.
5. A SN wishing to accept children sends (unicasts) an ADVERTISE message
to a POC. The POC used these ADVERTISE messages as replies to
QUERY messages.
5.1.4 Mesh
The job of an IP Multicast routing protocol is very similar to that
of the auto tree configuration building block-to dynamically
construct and maintain a set of distribution trees, where each router
is a node in the tree. To do so, the multicast routing protocol
makes use of a unicast routing protocol, which is responsible for
discovering and maintaining a set of unicast routes between
neighbors, such that any node in the interconnected mesh can be
reached by any other. The candidates for those neighbors are
statically configured as part of the router configuration, while host
entities (senders and receivers) dynamically find their nearest SN
using neighbor discovery algorithms described in this BB.
The mesh approach to neighbor discovery and distance metrics is
directly analogous to this scenario, providing functionality similar
to what OSPF offers. The mesh approach requires that all of the SN's
that will participate in any given tree construction process exist as
persistent infrastructure entities, with statically configured routes
among their neighbors. Each of these neighbor routes among the SNs
in a mesh exchange unicast routing updates, using a standard unicast
routing protocol such as OSPF or IS-IS, or even a very simple link-
state routing protocol. The link for each route SHOULD be
implemented as a TCP connection.
The result of this unicast routing protocol is a state table at each
SN, with a next-hop neighbor SN forwarding entry for each SN that is
presently connected into the mesh. This next-hop neighbor is used as
the selection for the best neighbor service node to bind to, for SN
draft-ietf-rmt-bb-tree-config-01 [Page 12]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
to SN bindings. Receivers and senders locate a neighboring SN
through any of the other mechanisms described in section 5.
The following algorithm is used.
1. At the time they start running, each SN contacts its statically
configured neighbors via TCP, and joins in the unicast routing protocol
being run over all of the SNs in this mesh.
2. At the time a sender wishes to join a tree, it finds a SN that is part
of the mesh, using any of the other neighbor discovery algorithms in section
5. The sender binds to this SN, and then advertises this SN's address in the
session advertisement it sends to all receivers.
3. When a receiver wishes to join the tree, it finds a neighboring SN using
any of the other neighbor discovery methods in section 5.
4. For neighbor selection, each SN looks up the sender's SN in its own
forwarding table, and uses the primary next-hop neighbor in this table.
5. If that neighbor SN does not wish to join the tree, for example due to
the load at that SN, in the Reject message of the bind it can pass back its
next-hop neighbor to the sender's SN, which can be used to attempt to bind to
instead.
6. Loop freedom is guaranteed by the algorithms in section 6.
5.2 Service Node Selection
Once Neighbors have been discovered, a node selects the best one
using whatever distance information is available.
If there is no sender distance information to compare, the best SN is
simply the one that is closest to the node, without a loop being
formed by binding the node to the SN.
If sender distances are available, there are two cases:
Leaf nodes: For leaf nodes, the goal is to use the closest SN
possible.
SNs: For SNs joining the tree, it is important to pick an SN that is
closer to the sender; neighbor distance is a secondary factor.
Once an SN has been selected, the node tries to bind to it as
described in Section 6. Loop prevention is done during the bind
process using only Tree Level information.
This algorithm is recommended because it assigns each node the
closest SN, and does not require all nodes to measure their sender
distance at the start of the session. Depending on the selected
metric, multiple nodes measuring sender distance could cause message
implosion, and delay tree construction. On the other hand, the SNs
selected may actually be further from the Sender than their children
are. However, it may be necessary to assign nodes to non-optimal SNs
in order to get them on the tree, since it is possible that no SN
closer to the Sender can accept any more children.
More optimal trees can be constructed by using stricter rules for SN
selection. For instance, a session MAY ensure loop-freedom by
building the tree down from the Sender. In this case, nodes only
draft-ietf-rmt-bb-tree-config-01 [Page 13]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
need to select SNs whose level is less than 128 to ensure the tree is
loop-free. However, this makes tree-building take more time.
Alternatively, nodes may be required to measure sender distance
before selecting an SN in order to ensure that each parent is closer
to the Sender than its children. Presumably, this results in a tree
in which parents detect message loss before their children,
minimizing repair requests.
There are other criteria for SN selection based on load balancing and
minimizing the number of levels.
6. Tree Formation
The following is a detailed description of the tree formation
process. All tree construction follows this pattern. The ONLY
differences between instantiations of this building block lie in how
nodes discover neighbors.
Once an SN has been selected, the Node sends a BIND message to the
SN. If the SN has an outstanding request to bind to another SN, it
must refuse the incoming bind request in case it forms a loop.
Otherwise, it MAY accept the Node as a child as long as selecting it
would not cause a loop in the tree. Loop freedom is guaranteed by
these two rules:
1. If the requesting node does not have children, the SN can accept it as
a child as long as the SN has no outstanding bind requests. If it does
have an outstanding bind request, the SN can accept the node as a child
if it's address is less than the child's address.
2. If the requesting node has children, the SN can accept it as a child if
a. the SN's level is 128, i.e., it is the top of a sub-tree not yet
connected to the Sender, or
b. the SN's level is less than 128, i.e., it is connected to the Sender.
The second rule prevents a node from selecting one of its own
children as its parent. Two nodes at level 128 are prevented from
selecting each other using tie-breaking criteria described step 1
above.
If the SN accepts the Node as a Child, it returns an ACCEPT message.
If it does not accept the Node, it sends a REJECT message. If the
Node does not receive a response after a reasonable number of tries
and timeout period, it MAY select the next best Neighbor from its
cached list, or else run the Service Node Discovery process again to
determine an alternate SN to try.
The REJECT message contains a reason code. If the code indicates
that the node was rejected because the SN was not yet on the tree,
the node MAY choose to retry that SN after a suitable timeout.
The REJECT message may also include a list of alternative SNs for the
node to try.
The ACCEPT message MUST include the Parent's current Tree Level. The
Node MUST set its Tree Level to one more than the Parent's level. If
draft-ietf-rmt-bb-tree-config-01 [Page 14]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
the Node itself has Children, it MUST send a HEARTBEAT message
containing the new tree level. The ACCEPT message MAY also indicate
the starting sequence number of the message from which data
reliability is assured.
Service Nodes MAY limit the number of children they support depending
on their capacity. Once an SN has accepted its maximum number of
children, it stops accepting new children until a change in
membership causes its count of children to go below this limit.
If an SN limits the number of children it supports, it SHOULD reserve
at least one child slot for other SNs. This guarantees the growth of
the repair tree.
A note about late joiners: a late joiner is a node seeking membership
in the tree after the Sender has started to send data. In this case,
it is possible that some of the data is no longer available within
the tree. Nodes may need to have specific information about repair
availability before selecting a Parent.
In order to accommodate this requirement, the QUERY, BIND, ACCEPT,
and HEARTBEAT messages may contain information about both a Node's
requirements for repair data, and an SN's ability to provide that
data.
7. Tree Maintenance
Tree maintenance is an ongoing process active in every Node. Because
the tree is based on the operation of SNs, as well as the various
underlying metrics that may change over time, it is important that
these dependencies be monitored for changes. Nodes MUST monitor
Parents for liveliness and changes in tree level, and SHOULD continue
to run the Neighbor Discovery and Selection process in order to
optimize their choice of SN. Parents must also monitor Children for
liveliness.
7.1 Monitoring Parents and Children
Parents periodically send out HEARTBEAT messages unicast or on their
assigned multicast addresses as a keep-alive to their Children. The
HEARTBEAT message MUST contain the Parent's current Tree Level.
Children MUST set their Tree Level to one more that their Parent's
level. Children respond to their Parent's HEARTBEAT message with a
HEARTBEATCONFIRM message. Both messages MAY be suppressed if other
messages in the protocol instantiation have recently provided a
keep-alive function.
If a Child does not receive a HEARTBEAT message from its Parent for a
period of time, it MUST attempt to bind with an alternate SN.
A Child that is leaving a session MUST send a unicast LEAVE message
to its Parent. The Parent SHOULD respond with a LEAVECONFIRM
message.
A Parent that is leaving the session MUST send a multicast LEAVE
message to its Children indicating that they need to bind with an
alternate SN.
If a Parent does not hear a HEARTBEATCONFIRM message from a Child for
a period of time, or it receives a LEAVE message from a Child, it
draft-ietf-rmt-bb-tree-config-01 [Page 15]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
removes that Child from its list of Children.
7.2 Optimizing the Tree
Implementations of this building block SHOULD continue to run the
Neighbor Discovery and Selection process in order to optimize the
choice of SN. This continuous process also keeps the distance
information for the current Parent up- to-date. Whenever the process
returns a better SN than the current one, the Node MAY bind to the
new SN. Once the new SN is bound to, the Node MUST send a LEAVE
message to the original Parent. A Parent with no Children MAY leave
the session.
draft-ietf-rmt-bb-tree-config-01 [Page 16]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
8. Messages
These messages are required for implementations of this building
block. The list below indicates which message contents are required
by implementations. Implementations may also include other
protocol-specific information in these messages.
+------------------+---------------------------------+--------------------------+
| Message Name | Description | |
| m-cast or u-cast | | Contents |
|------------------+---------------------------------+--------------------------+
| QUERY | A message used to discover | Sender distance, TTL |
| both | neighbors. The unicast version | (optional) |
| | is sent to POCs; the multicast | |
| | version is used in ERS. | |
|------------------+---------------------------------+--------------------------+
| ADVERTISE | A message used to advertise a | IP address, port |
| both | SN. The unicast version is sent | |
| | from the POC; the multicast | |
| | version is sent by SN themselves| |
|------------------+---------------------------------+--------------------------+
| BIND | request to SN to join tree | |
| unicast | | current Tree Level |
|------------------+---------------------------------+--------------------------+
| ACCEPT | acceptance of BIND from SN | current Tree Level |
| unicast | | |
|------------------+---------------------------------+--------------------------+
| REJECT | rejection of BIND from SN | reject reason |
| unicast | | alternate SN list |
|------------------+---------------------------------+--------------------------+
| LEAVE | Child leaving Parent (u-cast), | |
| both | or Parent leaving tree (m-cast) | |
|------------------+---------------------------------+--------------------------+
| LEAVECONFIRM | Acknowledgement of LEAVE | |
| unicast | message | |
|------------------+---------------------------------+--------------------------+
| HEARTBEAT | Periodic keep-alive from Parent | current Tree Level |
| both | to Children | Sender distance |
| | | Advertisement (optional) |
|------------------+---------------------------------+--------------------------+
| HEARTBEATCONFIRM | Acknowledgement of HEARTBEAT | |
| unicast | from Child | |
|------------------+---------------------------------+--------------------------+
draft-ietf-rmt-bb-tree-config-01 [Page 17]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
9. External Interfaces This section describes external interfaces for
the building block.
9.1 Interfaces to this BB. These may be used by a PI, or by a higher-
level BB.
9.1.1 findSNs findSNs instructs the BB to start the process of finding
SN candidates for this node. findSNs may return immediately with a
list of candidate SN, or may use the SNlist interface (see section
9.2.1) to return the list at a later time.
9.1.2 messageFromSN(message) messageFromSN passes along a message
received from the SN. The PI or BB should pass messages down to this
BB in case they contain information necessary for the BB.
9.1.3 lostSN lostSN notifies the BB that the connection to the SN was
lost.
9.1.4 activateSN activateSN notifies the BB to begin SN operation if
necessary.
9.1.5 setOptimization(boolean) setOptimization tells the BB to start or
stop the tree optimization process. The upper BB or PI may want to
control when tree optimization takes place.
9.1.6 setSN(SN) setSN informs the BB that the PI or upper BB has
successfully bound to an SN.
9.1.7 acceptMember(node) acceptMember asks the BB to accept or reject
the node as a member. The BB returns a boolean value in response.
9.2 Interfaces from this BB to the PI or a higher-level BB.
9.2.1 SNlist(list) SNlist returns to the upper PI or BB a list of SN
candidates.
9.2.2 SNactive(boolean) SNactive informs the upper PI or BB that this
node has become an SN or has ceased SN operation.
draft-ietf-rmt-bb-tree-config-01 [Page 18]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
10. References
[CST00] B. Cain, T. Speakman, D. Towsley, "Generic Router Assist
(GRA) Building Block - Motivation and Architecture", Internet Draft,
Internet Engineering Task Force, March 2000.
[DEE89] Deering, S., "Host Extensions for IP Multicasting", RFC 1112.
[ECTP00] ITU-T SG7/Q.13 and ISO/IEC JTC1/SC6/WG7, "Enhanced
Communication Transport Protocol," ITU-T Draft Recommendation X.ectp
and JTC1/SC6 Committee Draft 14476, June, 2000.
[HC00] H. Holbrook, B. Cain, "Source-Specific Multicast for IP,"
Internet Draft, Internet Engineering Task Force, March 2000.
[HOD98} R. Hinden, M. O'Dell, S. Deering, "An IPv6 Aggregatable
Global Unicast Address Format," Request For Comments 2374, July 1998.
[HPW00] M. Handley, C. Perkins, E. Whelan, "Session Announcement
Protocol", Internet Draft, Internet Engineering Task Force, March
2000.
[HWKFV00] M. Handley, B. Whetten, R. Kermode, S. Floyd, L. Vicisano,
and M. Luby, "The Reliable Multicast Design Space for Bulk Data
Transfer," Internet Draft, Internet Engineering Task Force, March
2000.
[KV00] R. Kermode, L. Vicisano, "Author Guidelines for RMT Building
Blocks and Protocol Instantiation Documents," Internet Draft,
Internet Engineering Task Force, July, 2000.
[LPG98] B.N. Levine, S. Paul, and J.J. Garcia-Luna-Aceves,
"Organizing Multicast Receivers Deterministically According to
Packet-Loss Correlation," Proc. Sixth ACM International Multimedia
Conference (ACM Multimedia 98), Bristol, UK, September 1998.
[PTM93] C. Partridge, T. Mendez, and W. Milliken. "Host anycasting
service." Request For Comments 1546, November 1993.
[WCP00] B. Whetten, D. Chiu, S. Paul, "TRACK Architecture, A Scalable
Real-Time Reliable Multicast Protocol," Internet Draft, Internet
Engineering Task Force, July 2000.
[WLKHFL00] B. Whetten, L. Vicisano, R. Kermode, M. Handley, S.
Floyd, and M. Luby, "Reliable Multicast Transport Building Blocks for
One-to-Many Bulk-Data Transfer," Internet Draft, Internet Engineering
Task Force, March 2000.
[YGS95] R. Yavatkar, J. Griffioen and M. Sudan, "A Reliable
Dissemination Protocol for Interactive Collaborative Applications",
University of Kentucky, 1995.
draft-ietf-rmt-bb-tree-config-01 [Page 19]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
Authors' Addresses
Brad Cain
brad.cain@mirror-image.com
Mirror Image
49 Dragon Court
Woburn, MA 01801
Dah Ming Chiu
dahming.chiu@sun.com
Miriam Kadansky
miriam.kadansky@sun.com
Sun Microsystems Laboratories
1 Network Drive
Burlington, MA 01803
Seok Joo Koh
sjkoh@pec.etri.re.kr
ETRI/Korea
161 Kajong-dong
Yusong-Gu, TAEJON, 305-350, KOREA
Brian Neil Levine
brian@cs.umass.edu
Computer Science Department
University of Massachusetts
Amherst, MA 01003
Dave Thaler
dthaler@microsoft.com
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052
Gursel Taskale
gursel@talarian.com
Brian Whetten
whetten@talarian.com
Talarian
333 Distel Circle
Los Altos, CA 94022-1404
Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet languages other than English.
The limited permissions granted above are perpetual and will not be
draft-ietf-rmt-bb-tree-config-01 [Page 20]
INTERNET DRAFT draft-ietf-rmt-bb-tree-config-01.txt 20 November 2000
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."
draft-ietf-rmt-bb-tree-config-01 [Page 21]
| PAFTECH AB 2003-2026 | 2026-04-22 23:18:58 |