One document matched: draft-kadansky-tram-00.txt
INTERNET DRAFT M. Kadansky, D. Chiu, J. Wesley
draft-kadansky-tram-00.txt Sun Microsystems Laboratories
November 9, 1998
Expires: May 9, 1999
Tree-based Reliable Multicast (TRAM)
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
To view the entire list of current Internet-Drafts, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).
Abstract
This paper describes TRAM, a scalable reliable multicast transport
protocol. TRAM is designed to support bulk data transfer from a
single sender to many receivers. A dynamically formed repair tree
provides local error recovery allowing the multicast group to support
a large number of receivers. TRAM provides flow control, congestion
control, and other adaptive techniques necessary to operate
efficiently with other protocols. Several bulk data applications
have been implemented with TRAM. TRAM has been tested and simulated
in a number of network environments.
Table of Contents
1 Introduction
1.1 Terminology
2 TRAM Components
2.1 Sender
2.2 Receivers
2.3 Repair Heads
2.3.1 Eager Repair Heads
2.3.2 Reluctant Repair Heads
3 Key Protocol Elements
TRAM [Page 1]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
3.1 Transport Parameters
3.2 Session Id
3.3 Data Message
3.4 Sequence Number
3.5 Acknowledgment
3.6 Beacon
3.7 TTL
4 TRAM Operation
4.1 Starting a TRAM Session
4.2 Tree Formation
4.2.1 Selecting the Best Repair Head
4.2.2 Repair Head Capacity
4.2.3 Repair Head Discovery
4.2.3.1 Bi-directional Multicast Networks
4.2.3.2 Uni-directional Multicast Networks
4.2.3.3 Discovery Mechanism Configuration
4.2.4 Binding
4.2.5 LAN Tree Formation
4.3 Tree Maintenance
4.3.1 Tracking Repair Heads
4.3.2 Tracking Children
4.3.3 Removing a Child
4.3.4 Leaving the Repair Group
4.3.5 Switching Repair Heads
4.3.6 Pruning
4.4 Packet Loss Recovery
4.5 Rate-based Transmission
4.6 Flow and Congestion Control
4.6.1 Slow Start
4.6.2 Congestion Detection and Feedback
4.6.3 Rate Decrease
4.6.4 Rate Increase After Slow Start
4.6.5 Retransmission Data Rate
4.7 Session Keep-alive
4.8 Late Join
4.9 End of Transmission
5 Security
6 Packet Formats
7 Discussion Regarding RFC2357
7.1 Performance Analysis and Discussion
7.2 Security Discussion
8 Limitations and Future Work
9 References
Acknowledgments
Appendix: A Table of Transport Parameters
Authors' Addresses
TRAM [Page 2]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
1 Introduction
Distributing significant amounts of identical data from a single
sender to multiple receivers can take considerable time and bandwidth
if the sender must send a separate copy to each receiver. IP
multicasting allows a sender to distribute data to all interested
parties while minimizing the use of network resources. Many
applications, however, require reliable data delivery which can be
supported by a reliable multicast transport protocol.
TRAM is designed to provide multicast reliability that scales to a
large receiver population. TRAM ensures reliability by using a
selective acknowledgment mechanism, and scalability by adopting a
hierarchical tree-based repair mechanism. The hierarchical tree
avoids acknowledgement implosion and inefficient global repairs by
localized repairs.
The receivers and the sender of a multicast session dynamically form
repair groups. These repair groups are linked together
hierarchically to form a tree with the sender at the root of the
tree. The use of a hierarchical tree has been shown to be the most
scalable way of supporting reliable multicast transmissions [SURVEY],
and is adopted by many other reliable multicast protocols, for
example RMTP-II [RMTP].
Every repair group has a receiver that functions as a group head; the
rest function as group members. These members are said to be
affiliated with their head. Except for the sender, every repair
group head in the system is a member of some other repair group. All
members receive data multicast by the sender. The group members
report lost and successfully received messages to the group head
using a selective acknowledgment mechanism similar to TCP's [SACK].
The repair heads cache every data message received and retransmit
them at a child's request. A group member may re-affiliate with a
different head to improve repair effectiveness and efficiency. This
dynamic nature of the tree allows it to react to changes in the
underlying network infrastructure without sacrificing reliability.
TRAM has intentionally been kept as lightweight as possible. TRAM has
been developed as part of a larger project, the Java(tm) Reliable
Multicast(tm) Service [JRMS]. The JRM Service includes support for a
wide range of features desirable for reliable multicast: group
management, security, receiver customization of data, session
advertisement, address allocation, etc. The JRM Service also
includes a protocol-independent API, designed to support multiple
transport protocols.
TRAM [Page 3]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
1.1 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.
2. TRAM Components
2.1 Sender
The Sender is the root of the multicast repair tree in TRAM. It
transmits the data on the multicast address, initiates and controls
the formation of the multicast repair tree, and receives and
processes congestion reports from its immediate members.
2.2 Receivers
All members other than the sender are receivers in TRAM. Some of the
receivers will retransmit lost packets for other receivers - they are
called repair heads.
2.3 Repair Heads
Each repair head has a set of members for which it provides
retransmission service. These members are referred to as the children
of the repair head. The repair head keeps track of the packets its
children have received and those that they missed. The repair head
caches a packet until all of its children have acknowledged it. If a
child reports that a packet is missing, the repair head retransmits
the packet to all of its children by multicasting with appropriate
TTL scope.
2.3.1 Eager Repair Heads
Eager heads are members that have been specifically configured to be
repair heads. An eager head is expected to have sufficient system
resources to cache data packets and service retransmission requests
effectively.
The Sender is always an eager head.
2.3.2 Reluctant Repair Heads
Reluctant heads are repair heads that only accept members and perform
repairs if an eager head is not available in the area. Reluctant
heads solicit members to join their repair group just like eager
heads. However, members select reluctant heads only if they do not
hear from any nearby eager heads.
TRAM [Page 4]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
The default member role (memberRole) is RELUCTANT_HEAD. Members must
be explicitly configured to be EAGER_HEAD or RECEIVER_ONLY members.
3. Key Protocol Elements
3.1 Transport Parameters
TRAM is started at each member with a number of transport parameters.
A complete list of these parameters and their default values is
included in the Appendix. The following descriptions will refer to
these parameters by name.
Some transport parameters are common to all group members. For
example, the multicast address, port number, ackWindow, minDataRate,
and maxDataRate. These group-wide parameters are typically created
once and distributed to all members of the group, for example using
SAP (Session Announcement Protocol [SAP])
Some transport parameters are local, and their values can vary from
member to member. An example is transportMode. TransportMode can be
set to SEND_ONLY, RECEIVE_ONLY or SEND_RECEIVE, depending on whether
the member is a sender or receiver.
3.2 Session Id
The sender generates a sessionId to uniquely identify each session.
This id is used to detect multicast address collisions, as well as
sender restarts.
3.3 Data Message
A Data Message contains a payload and a TRAM protocol header. The
protocol header contains information such as sessionId.
The sender transmits Data Messages using a rate between a minimum and
maximum rate (minDataRate and maxDataRate) as specified in the
transport parameters.
3.4 Sequence Number
Each data packet sent contains a sequence number. The first data
packet sent contains sequence number 1. This is incremented for each
subsequent data packet. Members detect missing packets based on the
packet sequence numbers received. Sequence numbers allow the
receivers to pass the data packets up to the application in the same
order they were sent. Setting the transport parameter ordered to
TRUE selects ordered delivery of data packets to the application.
TRAM [Page 5]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
3.5 Acknowledgment
Receivers send unicast Acknowledgment Messages to their repair head.
The Acknowledgment Message contains a sequence number that indicates
all data packets up to this number have been received. The
Acknowledgment Message can optionally contain a bit mask to indicate
missing packets.
The ackWindow is the number of packets a member must receive before
sending an Acknowledgment Message to its repair head. The default
value for ackWindow is 32.
3.6 Beacon
The sender uses Beacon Messages to signal the start and end of a
multicast session. The sender also transmits Beacon Messages after
data transmission has started if the application stops sending data
for a period of time. These Beacon Messages act as filler to notify
members that the session is still active. Flag bits are used to
indicate the purpose of the Beacon Message.
Like Data Messages, Beacon Messages are always multicast to the
entire group.
3.7 TTL
All multicast packets, including Beacon Messages, Data Messages and
their retransmissions, and other control packets, are transmitted
with specifically chosen Time To Live (TTL) values. TTL determines
the distance into the network a packet will travel.
Beacon and Data Messages have a TTL large enough to reach all
members. This TTL is referred to as the sessionTTL. Only those
receivers that receive the Beacon Messages should join the repair
tree.
Repair heads set the TTL small enough to only reach their children.
This TTL is referred to as the repair TTL.
4.0 TRAM Operation
4.1 Starting a TRAM Session
The Sender transmits Beacon Messages to initiate the session. The
Beacon Message is sent to the entire multicast group at regular
intervals (beaconInterval). Members begin the tree formation process
when they receive a Beacon or Data Message.
TRAM [Page 6]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
After data transmission begins, the sender transmits Beacon Messages
only when there is a gap in the application's data stream (see
description in Section 4.7).
4.2 Tree Formation
The repair tree in TRAM provides the structure for local repair
groups. The repair groups localize repair and control messages, and
provide a feedback path from members to the sender. A repair head
manages its repair group. Repair group management includes accepting
new members, keeping track of children, retransmission of requested
data packets, and aggregation of feedback messages from members.
4.2.1 Selecting the Best Repair Head
Each member selects the best repair head it can find. The best
repair head is the closest available head with the most children
already attached. The multicast TTL value required to reach the
member from the repair head defines the distance between them. A
closer head requires a smaller TTL value. Eager heads are selected
over reluctant heads if everything else is equal. Selecting a close
repair head limits the distance multicast repair packets will travel
into nearby networks. It also localizes control traffic between
members and their repair heads.
Selecting a repair head with the most children minimizes the number
of repair heads. Reducing the number of repair heads minimizes the
number of control messages.
Other criteria used to break ties are: greatest maxChildren, and
lowest IP address.
4.2.2 Repair Head Capacity (maxChildren)
Repair heads limit the number of children they support to
maxChildren. The default is 32 children per repair head. Once the
repair head has accepted its maximum number of children, it stops
accepting new members until a change in membership causes the member
count to go below this limit.
Since the repair tree is critical to the operation of TRAM, each
repair head MUST reserve several slots for other repair heads. This
guarantees the growth of the repair tree.
4.2.3 Repair Head Discovery
Receivers discover repair heads by using multicast solicitation and
advertisement control messages. Some networks such as satellite based
TRAM [Page 7]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
networks support multicast capability only in one direction. Such
networks typically have slow back-channels that may not support
multicast. This is referred to as a uni-directional multicast
network, as opposed to a bi-directional multicast network.
There are two basic mechanisms for repair head discovery:
o member-solicited head advertisement
o unsolicited head advertisement
The member-solicited approach is used for bi-directional multicast
networks. The unsolicited approach is more suitable for uni-
directional multicast networks.
4.2.3.1 Bi-directional Multicast Networks
All members in bi-directional multicast networks can communicate with
every other member via multicast. For such environments, TRAM
supports a member-solicited repair head discovery algorithm to
dynamically build the repair tree.
Receivers join the multicast group and remain idle until the
multicast session is detected to be active. Reception of a Beacon
Message or a Data Message from the sender signifies an active
session. When the session becomes active, the members look for repair
heads using a multicast Member Solicit Message. A repair head that is
already attached to the repair tree and is able to handle additional
members SHOULD respond to a Member Solicit Message by multicasting a
Head Advertisement Message. The TTL used in this response is the same
used in the Member Solicit Message. If the TTL value required to
reach the member is greater than the TTL used to reach the repair
head, the Head Advertisements with the TTL from the first Member
Solicit Message will not reach the member. Future Member Solicit
Messages will have increased TTL values. Eventually the TTL will be
large enough for the Head Advertisement Message to reach the member.
Repair heads that have not joined the repair tree MUST ignore Member
Solicit Messages.
The receiver listens for Head Advertisements after sending the Member
Solicit Message. If one or more Head Advertisements are received
during a solicitInterval, the best repair head among them is
selected. If no Head Advertisements are received, the receiver sends
another Member Solicit Message with a larger TTL (incremented by the
transport parameter solicitTTLInc). The process of sending the
message with an increasing TTL value continues until a response is
received. This process is known as Expanding Ring Search [TMTP].
4.2.3.2 Uni-directional Multicast Networks
TRAM [Page 8]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
Uni-directional multicast networks have links that support multicast
in one direction. For such networks, TRAM uses an unsolicited head
advertisement algorithm for head discovery. This method only requires
multicast capability from the repair head to the children.
In a uni-directional multicast network, repair heads multicast Head
Advertisement Messages announcing their existence. These messages are
sent at regular intervals with an increasing TTL value
(advertiseTTLInc). This is repeated until the value of TTL reaches
advertiseLimit.
The sender computes this interval, known as the Head Advertisement
Interval, as follows:
HAI = max( .5 second, (Heads * HASize) / maxAdvertiseBW1 )
Heads: Number of currently advertising heads - this
information is aggregated and propagated to the
sender by every repair head (via Acknowledgment
Messages).
HASize: Head Advertisement packet size
maxAdvertiseBW1: Head Advertisement bandwidth (bytes/second)
- configured
The computed HAI is included in every Beacon and Data Message. This
gives the sender control over the bandwidth used for head discovery.
This is critical because there is no congestion control for tree
formation messages. The sender reduces the rate at which each head
advertises itself as the number of advertising heads increase.
This formula limits the amount of Head Advertisement traffic to a
sender-specified bandwidth based on the number of advertising heads.
Another transport parameter, maxAdvertiseBW2, is used to compute the
HAI suitable for the time after data transmission has started.
Receivers join the multicast group and remain idle until the
multicast session becomes active. Then each receiver listens for Head
Advertisements for an Advertisement Listen Interval, computed as 60
seconds or 3 times the Head Advertisement Interval (HAI), whichever
is smaller.
If any Head Advertisements are received during this interval, the
best repair head is selected. If no head advertisements are received,
the receiver continues listening.
4.2.3.3 Discovery Mechanism Configuration
First, each member is configured with the transport parameter
TRAM [Page 9]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
memberRole. If the memberRole is not RECEIVER_ONLY, then this member
is a potential head.
Each member is configured with the following transport parameters for
the purpose of controlling the tree forming process:
o treeScheme - This parameter specifies the sender's suggested
tree-forming algorithm for the whole multicast group to use. The
values are: HEAD_ADVERTISE, MEMBER_SOLICIT and COMBINED. The
first option means all heads voluntarily advertise as described
in 4.2.3.2. The second option means heads only advertise upon
receiving Member Solicitation Messages, as described in
4.2.3.1. The last option means using option one before data
transmission and option two after data transmission has started.
MEMBER_SOLICIT is the default.
o advertise - This parameter can take the values: NO, YES,
YES_BEFORE_DATA, or SENDER_CHOICE (default). The first three
settings are used when the head is configured to do something
different than what is suggested in the treeScheme parameter.
o solicit - This parameter can take the values: NO, YES,
YES_AFTER_DATA, or SENDER_CHOICE (default), as above.
o parent - This parameter specifies a list of IP addresses and
port numbers of configured heads to use. The default is an empty
list. If the list is non-empty, then the member skips the head
discovery phase of tree building and proceeds to bind to one of
the heads on the list in the order specified.
These parameters allow the whole multicast group to use one of
several common tree-forming algorithms, and/or selected heads to be
locally configured to manually optimize the tree-forming algorithm.
The last parameter can be used to configure a static tree.
4.2.4 Binding
After selecting the best repair head using one of the above head
discovery schemes, the receiver proposes to be a child of the
selected repair head with a unicast Head Bind Message.
If the repair head has not reached its capacity, it responds to the
Head Bind Message with a unicast Accept Member Message; otherwise, it
responds with a unicast Reject Member Message. Accepting a child
requires the repair head to cache the received Data Messages until
the child acknowledges them. Depending on the lateJoinPref transport
parameter (detailed in Section 4.8), the Accept Member Message sent
by the repair head MUST indicate the starting sequence number of the
message from which data reliability is assured. The Accept Member
Message also contains an optional Bit Mask field for the head to
guarantee repair of additional non-contiguous packets.
TRAM [Page 10]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
4.2.5 LAN Tree Formation
When several members reside on the same LAN, TRAM attempts to create
a repair group on the LAN. This confines the control traffic to the
LAN and minimizes the number of heads on the LAN. Members elect a
single repair head called the root LAN head. The root LAN head joins
the rest of the repair tree as described above.
The root LAN head is elected as follows: potential heads on the LAN
send out Head Advertisement Messages with a TTL of 1 and LANState set
to Volunteering. An eager advertising head with the greatest
capacity (maxChildren) is elected root LAN head. If there are two or
more advertising heads with the same capacity, the one with the
lowest IP address is elected. If there are no eager heads
advertising, a reluctant head is elected. This method is compatible
with the method for selecting the best head described in Section
4.2.1. After a period of one Head Advertisement Interval (HAI), the
elected root LAN head changes its LANState to LAN_HEAD.
Potential root LAN heads listen for half of the HAI before sending
out an advertisement. If a better volunteer or an elected root LAN
head is heard from, the potential root LAN head suppresses its
advertisement.
If the number of members on the LAN equals or exceeds the capacity of
the root LAN head, additional heads, called LAN heads, are elected
from the members affiliated with the root LAN head. The root LAN
head or current LAN head announces the election using the Elect LAN
Head flag in the Head Advertisement Message. This ensures that new
members on the LAN will be able to affiliate with a LAN head if one
is available.
Like all heads, LAN heads reserve slots for children that are also
potential heads. In addition, LAN heads must reserve slots for
potential heads that are also LAN members, in order to be able to
grow the LAN tree.
Once LAN heads are elected, only the single LAN head that has room
for more children continues to send Head Advertisement Messages. Two
types of these messages are sent.
The first type has a TTL of 1 and LAN State set to LAN HEAD. These
are intended to inform LAN members about the availability of a LAN
head.
The other type are Head Advertisements sent to inform off-LAN members
of the availability of this head. As described in the above
sections, depending on the value of treeScheme, these Head
TRAM [Page 11]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
Advertisement Messages may be triggered by the receipt of Member
Solicitation Messages, or may be unsolicited. These allow off-LAN
members to affiliate with the LAN head while suppressing excess Head
Advertisement Messages from other LAN members.
This LAN Tree formation method is used when the allowLANTrees
transport parameter is set to TRUE. The default value is FALSE.
4.3 Tree Maintenance
TRAM continuously adapts the repair tree to accommodate members
joining and leaving. TRAM also adjusts the tree to changing
conditions within the network. Repair heads and their children must
continuously monitor each other's performance. A repair head SHOULD
remove a child that is unresponsive or cannot keep up with the
sender's minDataRate. A child can select a new repair head if its
current repair head is not responding, or a better one is available.
This continuous maintenance allows the tree to dynamically adapt to
changing membership and network conditions.
4.3.1 Tracking Repair Heads
Each head multicasts a Hello Message to its repair group once per
helloInterval, as a form of keep-alive. After data transmission
starts, a repair head multicasts a Hello Message before the
expiration of a helloInterval when it has received an ackWindow of
new data packets. Whenever a repair head performs a retransmission,
however, it is counted as if it has sent a Hello Message, since the
retransmission serves to assure its children their head is still
active.
The Hello Message is sent to the same multicast address and port as
the multicast session. The TTL of the Hello Messages, however, is set
to the TTL of the repair group, which is the TTL needed to reach the
farthest child in the group.
If a child does not receive a retransmission or Hello Message from
its repair head during an ackInterval, it sets the Hello Not Received
flag in the next Acknowledgment Message it sends. If no Hello or
Retransmission Message is received in (maxHelloMisses * ackInterval),
the child attempts to locate a new repair head.
When the repair head receives an Acknowledgment Message with the
Hello Not Received flag set, it MUST immediately respond to the child
with a Unicast Hello Message.
Changes in network conditions can cause the members to lose Hello or
TRAM [Page 12]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
Retransmission messages. This can happen when the changes in the
network require the repair head to use a TTL that is larger than the
previously used value. To adapt to such changes, the repair head
increases its repair TTL by repairTTLInc in response to an
Acknowledgment Message with the Hello Not Received flag set.
The repair TTL can also become larger than necessary. To fine tune
the repair TTL, every child computes its actual TTL distance from the
head. To enable this computation, the repair head includes the
current repair TTL value in every multicast control message sent to
the group. While the repair TTL value assigned in the IP header gets
decremented on a hop by hop basis, the TTL in the TRAM header remains
unchanged. The difference between the TRAM header value and the IP
header value gives the actual TTL distance. Each child then reports
the actual TTL distance via the Actual TTL field in the
Acknowledgment Message. The repair heads update each child's TTL
distance based on this value. When necessary, the repair heads MUST
update the repair TTL in addition to updating a child's TTL distance.
4.3.2 Tracking Children
Repair heads must identify children that become inactive. A repair
head knows that a child is alive and well if it receives
Acknowledgment Messages from it for every ackWindow of packets. If a
child's last acknowledged sequence number is more than two ackWindows
behind the sequence number of the latest packet received at the head,
it includes that child in the Member Address List of its next
Multicast Hello Message. This indicates to those in the Member
Address List that their head has not heard from them recently. The
children listed MUST respond immediately with an Acknowledgment
Message. The repair head repeats this process two more times. If it
has still not heard from the child, it SHOULD remove this child from
the repair group.
4.3.3 Removing a Child
To remove a child from a repair group, the repair head sends the
child a Unicast Hello Message with the Member Disowned flag set. The
child must rejoin the repair tree in order to get retransmissions.
4.3.4 Leaving the Repair Group
Any member that is not a repair head can leave the group at any time.
The member sends an Acknowledgment Message to its repair head with
the Terminate Membership flag set. The repair head removes this child
from its member list.
If the member trying to leave the group is a repair head, it SHOULD
TRAM [Page 13]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
first send its children a Hello Message with the HState field set to
Resigning. This signals the members to locate a new repair head.
Members find new repair heads with the methods described in the
following subsection. Once all of the repair head's children have
terminated their membership, the repair head can leave the group.
4.3.5 Switching Repair Heads
A member can switch to a new repair head if a better repair head is
found. If the current repair head is unresponsive, a new repair head
is chosen as quickly as possible. A member SHOULD switch to a new
repair head if a closer one is found or if the current head is
resigning. In this case, care must be taken to switch to the new one
only after all outstanding repairs are received from the old repair
head. The new repair head may not be able to provide repairs for
packets received prior to the member affiliating.
Hello and Head Advertisement Messages aid in the detection of
alternative repair heads in a region. Members SHOULD listen to Hello
Messages of other heads in the region not only to learn about better
heads but also to maintain a backup repair head list. This backup
repair head list enables quicker switching when the current repair
head becomes unresponsive. The HState reported in the Hello Message
enables members to cache only those repairs heads that are currently
accepting members.
A repair head who has lost its own head MUST not accept new members
until it has re-affiliated to a new head.
Switching repair heads without checking their level in the tree can
result in forming loops that are detached from the rest of the repair
tree. To prevent loops from occurring, TRAM specifies a RxLevel
parameter that indicates the tree level at which a member or a repair
head is operating. The sender is at RxLevel 1, its members at RxLevel
2 and so on. When a repair head attempts to switch its own repair
head, it MUST choose a repair head whose RxLevel is lower than or
equal to its own. If the reason for the switch is loss of its
parent, then the repair head tries to locate a new head for 30
seconds before transitioning to the resigning state.
A head reports its RxLevel periodically via the Hello Message. A
member always tracks the head's RxLevel and assigns its RxLevel to be
one more than the RxLevel reported by its head. When re-affiliating,
if a head sees the RxLevel in the Accept Member Message is higher
than its own RxLevel, it MUST proceed to terminate the membership. A
member with no children does not need to perform the RxLevel checks
when re-affiliating, as it is a leaf node in the tree hierarchy.
TRAM [Page 14]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
The process for affiliating with a new repair head is the same as the
initial bind procedure with the following exception. If the member's
current repair head is unresponsive and it has one or more missing
packets, the member MAY send a Head Bind request to all of the repair
heads that it knows about. The member checks each Accept Member
Message it receives for a repair head that still has the missing
packets available. The member can request retransmission of the
missing packets from this repair node. It MUST then select the best
repair head from those that accepted it and send an Acknowledgment
Message with the Terminate Membership flag set to all of the others.
4.3.6 Pruning
The repair heads must keep track of all members they serve. If one of
its children goes off-line, a repair head MUST detect this in time to
prune the child from the repair tree before its repair cache fills
up. The repair heads MUST also detect members that cannot keep up
with the sender's minDataRate.
The sender adjusts its data transmission rate in reaction to
receivers' feedback on congestion (described in Section 4.6). When
the sender starts to operate at minDataRate, the members are told of
this condition via the Prune Members flag in the Beacon or Data
Messages. Upon seeing the Prune Members signal, a repair head
proceeds to prune only if its cache occupancy reaches the configured
maxCache. Children with the most unacknowledged packets outstanding
SHOULD be considered first. Upon pruning a child, the messages that
are exclusively cached for the child are reclaimed.
4.4 Packet Loss Recovery
The job of packet loss recovery is distributed among the repair
heads. Each repair head receives Acknowledgment Messages from all
its children. The repair heads use this information to retransmit
lost packets to their children, and flush their caches.
Members send Acknowledgment Messages to their repair heads on
ackWindow boundaries. The first Acknowledgment Message is sent on a
random packet within the window. This distributes the Acknowledgment
Messages sent from all children of a repair head across the entire
window. For example:
If ackWindow is 32 packets, a receiver chooses a random initial
packet between 1 and 32 to start sending Acknowledgment Messages to
its repair head. If the first Acknowledgment Message is sent when
packet 3 arrives, the next Acknowledgment Message is sent when packet
35 arrives, when packet 67 arrives, etc.
TRAM [Page 15]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
4.5 Rate Based Transmission
The flow control in TRAM is rate-based and is similar to NETBLT
[NETBLT], a rate-based unicast protocol.
TRAM's packet scheduler computes the amount of time to delay each
packet in order to achieve the current data rate. The delay is
computed with the formula:
packet size / current rate
The overhead in processing the packet is subtracted from this delay.
TRAM then sleeps for the calculated period, sends the packet, and the
cycle continues. This is similar to the widely known token bucket
algorithm.
4.6 Flow and Congestion Control
The sender's data rate is adjusted based on feedback from the
receivers. Adjustments can be made every two ackWindows. This gives
most receivers time to respond to problems in their area.
4.6.1 Slow Start
The sender initially starts sending data at 10% of the maxDataRate or
the minDataRate if it is greater. This data rate is increased by 10%
of the maxDataRate if there are no congestion reports from the
receivers over two ackWindows.
This process is continued until maxDataRate is reached or congestion
is detected. This is called the slow start phase.
Every time the data rate reaches a new high value, the new rate is
saved - as the "Historical High Data Rate". This value is used later
in the rate increase computation after the slow start phase.
4.6.2 Congestion Detection and Feedback
In TRAM, receivers detect and signal congestion when the number of
outstanding missing packets increases from one ackWindow to the next.
When this occurs, a Congestion Message for the most recent ackWindow
is sent to the member's repair head. The repair head MUST immediately
forward a new Congestion Message up the repair tree unless a
congestion report for that ackWindow or a later ackWindow has already
been forwarded. This reduces the number of Congestion Messages
arriving at the sender.
Repair heads can also generate congestion reports based on their
TRAM [Page 16]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
cache occupancy. The cache has a Threshold, an implementation-
specific parameter typically set to half of maxCache; the cache also
has a High Water Mark which is initially set to equal to the
Threshold. When the cache occupancy reaches High Water Mark, a
Congestion Message for the current ackWindow is generated, and the
"High Water Mark" is incremented by ackWindow. When the cache
occupancy falls below Threshold again, the value of High Water Mark
is adjusted back down to Threshold. The Congestion Messages
triggered by high cache occupancy are treated the same way as those
generated by missing packets.
4.6.3 Rate Decrease
When the sender receives the first Congestion Message for an
ackWindow, it immediately cuts its transmission rate in half, or to
the minDataRate, whichever is greater. Future Congestion Messages for
this ackWindow or previous ackWindows are ignored.
At this point, the sender also resets the Rate Increase Increment to
be 1/4 of the difference between the Historical High Data Rate and
the current rate.
4.6.4 Rate Increases After Slow Start
In the absence of congestion reports, the sender increases its rate
by the "Rate Increase Increment" (after the slow start phase). This
allows the sender to quickly increase its rate back up to where it
had operated prior to the congestion. The new rate is capped by the
maxDataRate value. The Historical High Data Rate is updated if the
current rate exceeds the old value.
The interval between successive rate increases MAY be variable
depending on the sender's estimation of the round-trip time for a
rate increase and feedback cycle. The minimal interval between
successive rate increases is two ackWindows without any congestion
reports.
4.6.5 Retransmission Data Rate
The sender retransmits packets to its repair group at the current
data rate. Retransmissions are sent before new data.
Repair heads send retransmissions at the average rate at which they
are receiving data packets.
4.7 Session Keep-alive
In some sessions, application data may arrive in bursts, rather than
TRAM [Page 17]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
all be available at once. In this case, the sender sends Beacon
Messages as a form of session keep-alive.
The sender uses an implementation-specific way to determine the
beginning of an idle period. For example, one way is to wait for 3
times the the inter-packet-departure time (as described in Section
4.5) before sending the first filler Beacon Message. The wait,
however, should not exceed a beaconInterval.
Once the sender determines an idle period has begun, it sends a
Beacon Message with the F flag set. The sequence number included in
this message is the sequence number of the latest Data Message sent.
Additional filler Beacon Messages are sent every beaconInterval.
When a member receives a filler Beacon Message, it SHOULD check to
see if it has any missing packets up to the sequence number in the
Beacon Message. If so, it should send an Acknowledgment Message
requesting repair.
A random delay SHOULD be observed before sending this Acknowledgment
Message so as not to congest the repair heads.
When a receiver does not receive any Data Messages or filler Beacon
Messages for more than 5 beaconIntervals, it MAY consider that the
sender has aborted.
4.8 Late Join
A member joining after the sender has started transmitting data may
select the following options for recovering data previously sent:
o NO_RECOVERY - Don't recover anything sent before the receiver
joined the repair tree. The start of the data stream for this
receiver is the first Data Message that the receiver received
after joining the repair tree.
o LIMITED_RECOVERY - Recover as much data as possible. This
option allows the receiver to request retransmission of all the
data packets that the repair head has cached.
o FULL_RECOVERY - Recover all data sent so far. This is normally
not supported. If a member must receiver all or nothing, this
option should be selected.
The option is selected using the transport parameter lateJoinPref.
The default is NO_RECOVERY. All these options require that the
receiver join the multicast repair tree before any data is forwarded
to the application. This insures that all subsequent data can be
received reliably.
TRAM [Page 18]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
4.9 End Of Transmission
Receivers must be able to determine when the session has completed to
ensure they have received all the data before exiting.
When the sender application completes, end of transmission is
signaled throughout the multicast group. The sender notifies all
members of session completion with a Beacon Message that has the
Transmission Done flag set. This packet also includes the sequence
number of the last data packet sent. The sender transmits this packet
once per beaconInterval until all of its children acknowledge the
receipt of all packets sent.
A member sends an Acknowledgment Message immediately after receiving
a Beacon Message with the Transmission Done flag set. If there are no
missing packets, the member sends an Acknowledgment Message with the
Terminate Membership flag set; otherwise, retransmissions of missing
packets are requested.
If the member is a repair head, it MUST wait for all of its children
to acknowledge and terminate their membership. During the time a head
waits for its children to acknowledge, every Hello Message sent MUST
contain the final sequence number and the Transmission Done flag set.
Those children that failed to receive the Beacon Message react to the
Hello Message in the same manner as the Beacon Message. The Hello
Messages are sent once every helloInterval after Transmission Done
has been signaled by the sender or head. Continuation of Transmission
Done signaling via the Hello Message eliminates the need for the
sender to send Beacon Messages until every receiver of the session
completes.
The members that require retransmissions of data MUST send
retransmission requests in response to every Beacon Message or Hello
Message with a Transmission Done flag set.
If a repair head does not receive an Acknowledgment Message from a
child within a helloInterval, it includes that child's address in its
next Multicast Hello Message. Children that find their addresses
listed in the Hello Message MUST respond with an Acknowledgment
Message. A repair head SHOULD disown a child that has not responded
for 3 helloIntervals.
A repair head completes its head responsibilities when each child has
either acknowledged all the packets or been disowned.
5. Security
TRAM [Page 19]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
The main security concerns for reliable multicast are:
o data confidentiality
o data integrity (including authenticity - i.e. from the right
source)
o ability to deal with denial-of-service attacks
Well-known methods of data encryption and authentication can be
applied to deal with these security concerns. As a group
communication protocol, however, reliable multicast faces more
difficult challenges with key management and many denial-of-service
possibilities.
In terms of key management, TRAM needs a way to efficiently
distribute new key information to eligible receivers. This needs to
be done before the session starts, as well as during a session for
long-lived sessions.
There are many kinds of denial-of-service attacks. Poorly behaved
members can inadvertently affect a session by incorrectly indicating
congestion, requesting excessive repairs, or just overloading one or
more members with excess messages. These behaviors can also be used
to intentionally cause denial-of-service attacks on the session.
Ignoring control messages from unauthorized group members can reduce
exposure to such attacks. The existing capability to refuse repair
service to poorly functioning members also provides some protection
against children requesting excessive repairs.
Work is underway to add encryption, authentication and key management
capabilities to TRAM. Details will be published in a new version of
this Internet Draft.
TRAM [Page 20]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6. Packet Formats
6.1 Beacon Message (multicast)
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type |P|D|F| 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Head Advertisement Interval |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source IP Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 1
Message SubType: 1
Flags:
P: Set when slow members are to be pruned.
D: Set when transmission is done.
F: Set when used as a filler message.
Session Id: The identifier for the current session.
Length: The packet's length.
Head Advertisement Interval: The number of seconds between
transmission of Head Advertisement Messages.
A value of zero disables unsolicited head
advertisements.
Source IP Address: IP address of the multicast source.
Sequence Number: The packet sequence number of the last packet
sent. If the Transmission Done flag is set, this
field indicates the last sequence number; if data
transmission has not started, this field is zero.
TRAM [Page 21]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.2 Head Advertisement Message (multicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type |L| 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | TTL | HState| MRole |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RxLevel | LAN State | Unicast Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Direct Member Count | Capacity |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 1
Message SubType: 3
Flags:
L: Elect Lan Head
Session Id: The identifier for the current session.
Length: The packet's length.
TTL: TTL value this packet was sent with. A receiver subtracts
the TTL value in the IP header from this TTL to determine
the distance to its repair head.
HState:
Accepting Members: 1
Accepting Potential Heads: 2
Not Accepting Members: 3
Member Role:
Receiver Only: 1
Eager Head: 2
Reluctant Head: 3
Source Address: IP address of the multicast source.
TRAM [Page 22]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
RxLevel: The level of this member in the repair tree
hierarchy.
LAN State:
Disabled: 1
Volunteering: 2
LAN Head: 3
LAN Member: 4
Unicast Port: Unicast port number to communicate with this
member.
Direct Member Count: Total number of children for this repair
head.
Capacity: The maximum number of children this head
is configured to serve.
TRAM [Page 23]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.3 Member Solicit Message (multicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type | Flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | TTL | MRole | Rsvd |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RxLevel | Reserved | Unicast Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 1
Message SubType: 4
Flags: None.
Session Id: The identifier for the current session.
Length: The packet's length.
TTL: The original Time to Live used to send this
message.
Member Role:
Receiver Only: 1
Eager Head: 2
Reluctant Head: 3
Source Address: IP address of the multicast source.
RxLevel: The level of this member in the repair tree
hierarchy.
Unicast Port: Port number that this member is using for
unicast communications.
TRAM [Page 24]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.4 Head Bind Message (unicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type | Flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | TTL | MRole | Rsvd |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Direct Members | Indirect Members |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 3
Message SubType: 6
Flags: None.
Session Id: The identifier for the current session.
Length: The packet's length.
TTL: Member computed TTL distance from the head.
Member Role:
Receiver Only: 1
Eager Head: 2
Reluctant Head: 3
Source Address: Address of the Sender
Direct Members: Number of children directly reporting to this
member.
Indirect Members: Number of members indirectly reporting to
this member. This includes all members below
this point in the tree.
TRAM [Page 25]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.5 Accept Member Message (unicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type | Flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | BitMask Length| RxLevel |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Multicast Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Starting Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| BitMask |
| ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 3
Message SubType: 1
Flags: None.
Session Id: The identifier for the current session.
Length: The packet's length.
BitMask Length: Number of valid bits in the BitMask field.
RxLevel: The level of this member in the repair
tree hierarchy.
Multicast Address: The multicast address this repair head is
supporting.
Source Address: IP address of the multicast source.
Starting Sequence The base sequence number from which this repair
Number: head provides retransmission if requested.
BitMask: A bit mask indicating selected data packets
earlier than the Starting Sequence Number
available for repair. The first bit corresponds
to (Starting Sequence Number - BitMask Length).
TRAM [Page 26]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.6 Reject Member Message (unicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type | Flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Reason Code | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 3
Message SubType: 2
Flags: None.
Session Id: The identifier for the current session.
Length: The packet's length.
Reason Code:
Accepting Potential Heads: 1
Membership Full: 2
TTL Out Of Limit: 3
Resigning: 4
Source Address: IP address of the multicast source.
TRAM [Page 27]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.7 Multicast Hello Message (multicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type |A|D| 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | TTL | HState| Rsvd |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved | Ack Member Cnt| RxLevel |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Unicast Port Number | Member Count | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Member Address List |
| ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 1
Message SubType: 2
Flags:
A: Immediate Acknowledgment requested from members listed
in Member Address List field
D: Set to indicate Transmission Done, in which case the
Sequence Number field contains the sequence number for
the last packet of the session.
Session Id: The identifier for the current session.
Length: The packet's length.
TTL: The repair TTL used by this head.
HState: (Head State)
Accepting Members: 1
Accepting Potential Heads: 2
Not Accepting Members: 3
Resigning: 4
TRAM [Page 28]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
Ack Member Count: Number of members listed in the Member
Address Field. This field is valid if the
Acknowledgment flag is set.
RxLevel: The level of this member in the repair tree
hierarchy.
Source Address: IP address of the multicast source.
Unicast Port: Unicast port number used in communicating
with this member.
Member Count: Total number of members under this member in
the tree.
Sequence Number: Highest sequence number in cache, or the
sequence number of the last packet if the
D Flag is set.
Member Address List: List of IP Addresses of members that must
respond to the head. This field is set if
the Acknowledgment flag is set.
TRAM [Page 29]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.8 Unicast Hello Message (unicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type |D| 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | RxLevel | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 3
Message SubType: 3
Flags:
D: Member Disowned
Session Id: The identifier for the current session.
Length: The packet's length.
RxLevel: The level of this member in the repair tree
hierarchy.
Source Address: IP address of the multicast source.
TRAM [Page 30]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.9 Data Message (multicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type |P|D| 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Head Advertisement Interval |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Payload |
| ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 2
Message SubType:
Original 1
Retransmission 2
Flags:
P: Prune slow members
D: Transmission Done
Session Id: The identifier for the current session.
Length: The packet's length, including Payload.
Head Advertisement Interval: The number of seconds between
transmissions of Head Advertisement Messages.
A value of zero disables unsolicited Head
Advertisements.
Source Address: IP address of the multicast source.
Sequence Number: Packet sequence number, starting from 1.
Payload: Application data.
TRAM [Page 31]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.10 Acknowledgment Message (unicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type |H|T| 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | BitMask Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Actual TTL | Reserved | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Starting Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Direct Member Count | Indirect Member Count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Direct Heads Advertising | Indirect Heads Advertising +
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| BitMask ... |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 3
Message SubType: 4
Flags:
H: Hello Message not received
T: Terminate Membership
Session Id: The identifier for the current
session.
Length: The packet's length.
BitMask Length: Length in bits of valid bits in the
BitMask field.
Actual TTL: The TTL distance from this member to
its head, computed as the difference
between the original TTL and the
residue TTL of the head's Multicast
Hello Message.
TRAM [Page 32]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
Source Address: IP address of the multicast source.
Starting Sequence Number: Base sequence number for the BitMask.
Direct Member Count: Number of children
Indirect Member Count: Number of indirect members.
Direct Heads Advertising: Number of children that are currently
advertising that they are a head.
Indirect Heads Advertising: Number of indirect members currently
advertising that they are a head.
BitMask: Bitmask representing missing and
received packets.
TRAM [Page 33]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
6.11 Congestion Message (unicast)
1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version Number| Message Type | Sub Type | Flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Session Id |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Type: 3
Message SubType: 5
Flags: None.
Session Id: The identifier for the current session.
Length: The packet's length.
Sequence Number: Last received sequence number. This
identifies the ackWindow that
congestion is being reported for.
TRAM [Page 34]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
7. Discussion Regarding RFC2357
RFC2357 suggests a number of technical criteria for evaluating a
reliable multicast transport protocol. In this section, we discuss
some of these issues relating to TRAM.
7.1 Performance Analysis and Discussion
The design of TRAM was supported by simulation studies. A description
of these simulation studies can be found in [TRAM1].
We developed a simple simulation and visualization model for tree
building in Java which can directly interface to the tree building
part of the implementation.
We also developed a separate model for studying flow and congestion
control algorithms using the Network Simulator (NS) - a public domain
tool. Initial simulation results show that TRAM shares network
resources with TCP in a fair way [TRAM2]. We intend to participate
in developing a suite of reference simulation scenarios for reliable
multicast and demonstrate how well TRAM behaves in those contexts.
We believe, however, simulations only characterize protocol behaviors
for specific network topologies and dynamics. While it is very
difficult to conclusively describe a protocol's scalability,
stability and fairness properties, below are some additional
observations:
a) scalability
TRAM can scale to potentially very large numbers of receivers if all
the receivers have adequate bandwidth (greater than the minimum data
rate) between themselves and the sender at all times during the
session. Our simulation studies showed that in a 200 node network
with some network dynamics, TRAM behaved robustly. Simulation for
larger networks and other reference scenarios [SCENARIOS] are
underway.
b) fairness with TCP
TRAM can operate in two modes
o fixed rate, by setting maxDataRate and minDataRate to the
same value
o adaptive rate, by setting maxDataRate to be greater than
minDataRate
In the fixed rate case, TRAM is configured to transmit data at a
fixed rate. In such scenarios, network resource allocation is done
TRAM [Page 35]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
out-of-band rather than achieved by the congestion algorithms of the
different flows. This represents one legitimate way to allocate and
use network resources for some networks.
It is in the adaptive rate case that TRAM should compete fairly with
TCP, to the extent allowed by the maxDataRate and minDataRate
specifications. As discussed earlier, TRAM uses algorithms similar
to TCP, and simulation results thus far are encouraging. We expect
to continue to work on this and publish a full report later.
c) limiting factors
The following factors have been observed to affect TRAM's scalability
o Different and varying bandwidth
TRAM adapts the transmission rate to satisfy the slowest link
(or minDataRate). When the capacity of receivers and other
network resources vary wildly and/or have wide fluctuations
over time, TRAM could be obliged to operate at minDataRate and
potentially prune many members.
o Sub-optimal repair tree
When the repair tree is sub-optimal, the efficiency of the
repair mechanism and the feedback mechanism diminishes. The
whole system could have very low efficiency when losses occur.
o Long feedback delay
The congestion control algorithm depends on feedback from all
receivers. When the number of receivers grows and spreads
sparsely in the network, the feedback latency increases quickly.
This slows down the sender's ability to quickly react to
congestion, or increases the likelihood of rate oscillations.
7.2 Security Discussion
The authors are actively working on this problem, as well as
participating in the IRTF Secure Multicast Working Group (SMuG). An
updated version of this Draft will include the security
specifications.
8 Limitations and Future Work
The design of TRAM was based on a number of choices that make it more
suitable for certain applications and not others. Some limitations
include:
o Single Sender - Many data distribution applications (e.g.
Pay-per-view and stock information distribution) require only
a single sender, whereas many collaborative applications (e.g.
shared whiteboard) would require multiple senders. Going from
single-sender to multiple-sender increases the complexity of the
TRAM [Page 36]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
design and the overhead of the protocol. While currently
limited to single sender, TRAM is part of a framework [JRMS]
that supports multiple-protocol selection and a common API.
o Reliance on TTL - To minimize the need for manual configuration,
TRAM comes with automatic repair-tree formation and maintenance.
Many of the automated algorithms are based on using TTL as a
measure of distance. In networks where TTL is not a good
measure of distance, some of TRAM's algorithm may operate in
non-optimal conditions. In such scenarios, it would be
necessary to fall back to using manual configurations to define
the repair tree.
o Security - Secure multicast is still very much a research
problem. While parts of the security mechanisms are intertwined
with transport (e.g. authentication), other aspects can be
decoupled and shared by different transports (e.g. key
management). As noted before, TRAM will be integrated with open
security mechanisms as standards emerge.
Finally, multicast congestion control is also expected to be updated
as more research is done on this hard problem.
9 References
[JRMS] Kadansky M., S. Hanna, and P. Rosenzweig, "The Java Reliable
Multicast Service: A Reliable Multicast Library", Sun Microsystems
Laboratories Technical Report SMLI TR-98-68, September 1998.
[NETBLT] Clark, Lambert, and Zhang, "NETBLT: A High Throughput
Transport Protocol", Proceedings of ACM SIGCOMM 1987, pp 353-359,
August 1987.
[RMTP] Whetten B., M. Basavaiah, S. Paul, T.Montgomery, N.Rastogi,
J.Conlan and T. Yeh, "The RMTP-II Protocol", draft-whetten-rmtp-ii-
00.txt, Internet Draft, IETF, April 1998.
[SACK] Mathis M., J. Mahdavi, S. Floyd, and A. Romanow, "TCP
Selective Acknowledgment Options", RFC 2018, October 1996.
[SAP] Handley M., "SAP - Session Announcement Protocol", work in
progress.
[SCENARIOS] Handley M., "Reference Simulations for Reliable Multicast
Congestion Control Schemes", talk at Reliable Multicast IRTF meeting
in London, July 1998.
[SURVEY] Levine B. and J. Garcia-Lune-Aceves, "A Comparison of Known
Classes of Reliable Multicast Protocols", University of California,
Santa Cruz, 1996.
TRAM [Page 37]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
[TMTP] Yavatkar R., J.Griffioen and M. Sudan, "A Reliable
Dissemination Protocol for Interactive Collaborative Applications",
University of Kentucky, 1995.
[TRAM1] Chiu D. M., S. Hurst, M. Kadansky and J. Wesley, "TRAM: A
Tree-based Reliable Multicast Protocol", Sun Microsystems
Laboratories Technical Report SMLI TR-98-66, September 1998.
[TRAM2] Chiu D. M., "Flow and Congestion Control in Reliable
Multicast", talk at the Reliable Multicast IRTF meeting in London,
July 1998.
Acknowledgments
The authors gratefully acknowledge the many contributions of Steve
Hanna, Phil Rosenzweig, and Radia Perlman.
TRAM [Page 38]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
Appendix: A Table of Transport Parameters
+---------------+----------------------------------+----------------+
|Parameter Name | Description |Value Range & |
| | | Default setting|
|---------------+----------------------------------+----------------+
|ackWindow | The number of packets received | [1, 2^16] |
| | before sending an Acknowledgment | |
| | Message, during normal operation.| Default: 32 |
+---------------+----------------------------------+----------------+
|advertise | Selection of how a head should do| NO |
| | advertisements in the absence of | YES |
| | Member Solicitation. The value | YES_BEFORE_DATA|
| | YES_BEFORE_DATA means advertise | SENDER_CHOICE |
| | without solicitation only before | |
| | data transmission has started; | Default: |
| | SENDER_CHOICE means derive the | SENDER_CHOICE |
| | actual selection from treeScheme.| |
+---------------+----------------------------------+----------------+
|advertiseTTLInc| An increment to the TTL value | [1, 255] |
| | when using expanding ring to send| Default: 2 |
| | Head Advertisement Messages. | |
+---------------+----------------------------------+----------------+
|advertiseLimit | The maximum value to be used in | [1, sessionTTL]|
| | the TTL field for sending Head | Default: |
| | Advertisement Messages. | sessionTTL |
+---------------+----------------------------------+----------------+
|allowLANTrees | A switch to enable or disable LAN| TRUE, FALSE |
| | tree formation. | Default: FALSE |
+---------------+----------------------------------+----------------+
|beaconInterval | The interval between successive | [1, 2^32] msec |
| | Beacon Messages. | Default: 1000 |
+---------------+----------------------------------+----------------+
|helloTTLInc | An increment to the TTL value for| [1, 255] |
| | sending Multicast Hello Messages,| Default: 2 |
| | when re-adjusting the repair TTL.| |
+---------------+----------------------------------+----------------+
|helloInterval | The interval between successive | [1, 2^32] msec |
| | Hello Messages from a head to its| Default: 10000 |
| | children. | |
+---------------+----------------------------------+----------------+
|lateJoinPref | The preference for data recovery |LIMITED_RECOVERY|
| | when a receiver joins after data |NO_RECOVERY |
| | transmission has started. |FULL_RECOVERY |
| | | Default: |
| | | NO_RECOVERY |
+---------------+----------------------------------+----------------+
|maxAdvertiseBW1| The maximum bandwidth to be used |[1, maxDataRate]|
TRAM [Page 39]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
| | for tree forming before data | bytes/sec |
| | transmission begins. | Default: |
| | | maxDataRate |
+---------------+----------------------------------+----------------+
|maxAdvertiseBW2| The maximum bandwidth to be used | [1,maxDataRate]|
| | for tree forming after data | bytes/sec |
| | transmission begins. | Default: |
| | | maxDataRate/20|
+---------------+----------------------------------+----------------+
|maxCache | The maximum cache size. A value |[0, 2^32]packets|
| | of 0 means no explicit limit. | Default: 300 |
+---------------+----------------------------------+----------------+
|maxChildren | The maximum number of children | [1, 2^32] |
| | supported by this head. | Default: 32 |
+---------------+----------------------------------+----------------+
|maxDataRate | The maximum data rate that the |[1, 2^32] |
| | sender can transmit, and heads | bytes/sec |
| | can retransmit repairs. |Default: 64000 |
+---------------+----------------------------------+----------------+
|maxHelloMisses | The threshold of Hello Messages | [1, 2^32] |
| | missed by a child before it | Default: 5 |
| | considers a parent unreachable | |
| | (or inoperable). | |
+---------------+----------------------------------+----------------+
|memberRole | A member's role in tree forming: |RECEIVER_ONLY |
| | an EAGER_HEAD SHOULD actively |RELUCTANT_HEAD |
| | seek members; a RELUCTANT_HEAD |EAGER_HEAD |
| | SHOULD act as a head when no | |
| | other suitable head is available;|Default: |
| | a RECEIVER_ONLY member MUST never| RELUCTANT_HEAD |
| | act as a head. | |
+---------------+----------------------------------+----------------+
|minDataRate | The minimum data rate for sender |[1, 2^32] |
| | to transmit data. | bytes/sec |
| | | Default: 1000 |
+---------------+----------------------------------+----------------+
|multicastAddr | The multicast address used for | 224.*.*.* |
| | the session. | |
+---------------+----------------------------------+----------------+
|ordered | A switch to select ordered | TRUE, FALSE |
| | delivery or not. | Default: TRUE |
+---------------+----------------------------------+----------------+
|parent | A list of IP address/port of | IP address/port|
| | heads. The Null list means using | list |
| | an automatic tree forming scheme | Default: Null |
+---------------+----------------------------------+----------------+
|port | The multicast port number. | |
+---------------+----------------------------------+----------------+
TRAM [Page 40]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
|sessionId | An Id used to uniquely identify a| [0, 2^32] |
| | multicast session. | Default: 0 |
+---------------+----------------------------------+----------------+
|sessionTTL | The TTL used by sender to send | [1, 255] |
| | Data and Beacon Packets. | Default: 1 |
+---------------+----------------------------------+----------------+
|solicit | Selection for the optional use of| NO |
| | Member Solicit Messages to | YES |
| | trigger Head Advertisements. The | YES_AFTER_DATA |
| | value YES_AFTER_DATA means | SENDER_CHOICE |
| | solicit after data transmission | |
| | started. SENDER_CHOICE means | Default: |
| | derive the selection from | SENDER_CHOICE |
| | treeScheme. | |
+---------------+----------------------------------+----------------+
|solicitInterval| The interval between Member | [1, 2^32]msec |
| | Solicit Messages. | Default: 500 |
+---------------+----------------------------------+----------------+
|solicitTTLInc | An increment to the TTL value in | [1, 255] |
| | Member Solicit Messages. | Default: 2 |
+---------------+----------------------------------+----------------+
|sourceAddr | The sender's IP address. | |
+---------------+----------------------------------+----------------+
|transportMode | The role of the local transport | SEND_ONLY |
| | agent. | RECEIVE_ONLY |
| | | SEND_RECEIVE |
| | | Default: |
| | | RECEIVE_ONLY |
+---------------+----------------------------------+----------------+
|treeScheme | Selection of the method used to | HEAD_ADVERTISE |
| | form tree; HEAD_ADVERTISE is | MEMBER_SOLICIT |
| | suitable for asymmetric networks;| COMBINED |
| | MEMBER_SOLICIT lets member | |
| | trigger head advertisements; the | Default: |
| | COMBINED method starts with | MEMBER_SOLICIT|
| | HEAD_ADVERTISE and switches to | |
| | MEMBER_SOLICIT after data | |
| | transmission begins. | |
+---------------+----------------------------------+----------------+
Authors Address
Miriam Kadansky
miriam.kadansky@east.sun.com
Dah Ming Chiu
dahming.chiu@east.sun.com
TRAM [Page 41]
INTERNET DRAFT draft-kadansky-tram-00.txt September 1998
Joe Wesley
joseph.wesley@east.sun.com
Sun Microsystems Laboratories
2 Elizabeth Drive
Chelmsford, MA 01824
TRAM [Page 42]
| PAFTECH AB 2003-2026 | 2026-04-23 10:34:08 |