<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes" ?>
<?rfc strict="yes" ?>
<?rfc compact="yes" ?>
<?rfc sortrefs="yes" ?>
<?rfc symrefs="yes" ?>
<rfc category="std" docName="draft-ietf-avtcore-rtp-multi-stream-03"
ipr="trust200902" updates="3550, 4585">
<front>
<title abbrev="Multiple Media Streams in an RTP Session">Sending Multiple
Media Streams in a Single RTP Session</title>
<author fullname="Jonathan Lennox" initials="J." surname="Lennox">
<organization abbrev="Vidyo">Vidyo, Inc.</organization>
<address>
<postal>
<street>433 Hackensack Avenue</street>
<street>Seventh Floor</street>
<city>Hackensack</city>
<region>NJ</region>
<code>07601</code>
<country>US</country>
</postal>
<email>jonathan@vidyo.com</email>
</address>
</author>
<author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
<organization>Ericsson</organization>
<address>
<postal>
<street>Farogatan 6</street>
<city>SE-164 80 Kista</city>
<country>Sweden</country>
</postal>
<phone>+46 10 714 82 87</phone>
<email>magnus.westerlund@ericsson.com</email>
</address>
</author>
<author fullname="Qin Wu" initials="Q." surname="Wu">
<organization>Huawei</organization>
<address>
<postal>
<street>101 Software Avenue, Yuhua District</street>
<city>Nanjing, Jiangsu 210012</city>
<country>China</country>
</postal>
<email>sunseawq@huawei.com</email>
</address>
</author>
<author fullname="Colin Perkins" initials="C. " surname="Perkins">
<organization>University of Glasgow</organization>
<address>
<postal>
<street>School of Computing Science</street>
<city>Glasgow</city>
<code>G12 8QQ</code>
<country>United Kingdom</country>
</postal>
<email>csp@csperkins.org</email>
</address>
</author>
<date/>
<area>RAI</area>
<workgroup>AVTCORE</workgroup>
<keyword>I-D</keyword>
<keyword>Internet-Draft</keyword>
<!-- TODO: more keywords -->
<abstract>
<t>This document expands and clarifies the behavior of the Real-Time
Transport Protocol (RTP) endpoints when they are using multiple
synchronization sources (SSRCs), e.g. for sending multiple media
streams, in a single RTP session. In particular, issues involving RTP
Control Protocol (RTCP) messages are described.</t>
<t>This document updates RFC 3550 with regard to the handling of multiple
SSRCs per endpoint in RTP sessions. It also updates RFC 4585 to clarify
the calculation of the timeout of SSRCs and the inclusion of feedback
messages.</t>
</abstract>
</front>
<middle>
<section anchor="introduction" title="Introduction">
<t>At the time the <xref target="RFC3550">Real-Time Transport Protocol
(RTP)</xref> was originally written, and for quite some time after,
endpoints in RTP sessions typically only transmitted a single media
stream, and thus used a single synchronization source (SSRC) per RTP
session, where separate RTP sessions were typically used for each
distinct media type.</t>
<t>Recently, however, a number of scenarios have emerged (discussed
further in <xref target="usecases"/>) in which endpoints wish to send
multiple RTP media streams, distinguished by distinct RTP
synchronization source (SSRC) identifiers, in a single RTP session.
Although RTP's initial design did consider such scenarios, the
specification was not consistently written with such use cases in mind.
The specifications are thus somewhat unclear.</t>
<t>The purpose of this document is to expand and clarify <xref
target="RFC3550"/>'s language for these use cases. The authors believe
this does not result in any major normative changes to the RTP
specification; however, this document defines how the RTP specification
is to be interpreted. In these cases, this document updates RFC 3550. The
document also updates RFC 4585 with regard to the timeout of inactive
SSRCs, as specified in <xref target="sec-timeout-ssrc"/>, and
clarifies the inclusion of feedback messages.</t>
<!--MW: The above about "does not result in any major normative changes to the RTP specification"
is not correct given the Report Aggregation.-->
<t>The document starts with terminology and some use cases where
multiple sources will occur. This is followed by RTP and RTCP
recommendations to resolve issues. Next are security considerations and
remaining open issues.</t>
<!--MW: Above "and remaining open issues" will need to be removed when no open issues exists.-->
</section>
<section title="Terminology">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in <xref
target="RFC2119">RFC 2119</xref> and indicate requirement levels for
compliant implementations.</t>
</section>
<section anchor="usecases" title="Use Cases For Multi-Stream Endpoints">
<t>This section discusses several use cases that have motivated the
development of endpoints that send RTP data using multiple SSRCs in a
single RTP session.</t>
<section anchor="telepresence" title="Multiple-Capturer Endpoints">
<t>The most straightforward motivation for an endpoint to send
multiple RTP streams in a session is the scenario where an endpoint
has multiple capture devices, and thus media sources, of the same
media type and characteristics. For example, telepresence endpoints,
of the type described by the <xref
target="I-D.ietf-clue-framework">CLUE Telepresence Framework</xref>,
often have multiple cameras or microphones covering various areas of a
room.</t>
</section>
<section title="Multi-Media Sessions">
<t>Recent work has been done in <xref
target="I-D.ietf-avtcore-multi-media-rtp-session">RTP</xref> and <xref
target="I-D.ietf-mmusic-sdp-bundle-negotiation">SDP</xref> to update
RTP's historical assumption that media sources of different media
types would always be sent on different RTP sessions. In this work, a
single endpoint's audio and video RTP media streams (for example) are
instead sent in a single RTP session.</t>
</section>
<section title="Multi-Stream Mixers">
<t>There are several RTP topologies which can involve a central device
that itself generates multiple RTP media streams in a session.</t>
<t>One example is a mixer providing centralized compositing for a
multi-capture scenario like that described in <xref
target="telepresence"/>. In this case, the centralized node is
behaving much like a multi-capturer endpoint, generating several
similar and related sources.</t>
<t>More complicated is the Selective Forwarding Middlebox, see Section
3.7 of <xref target="I-D.ietf-avtcore-rtp-topologies-update"/>. This
is a middlebox that receives media streams from several endpoints, and
then selectively forwards modified versions of some of the streams
toward the other endpoints it is connected to. Toward one destination,
a separate media source appears in the session for every other source
connected to the middlebox, "projected" from the original streams, but
at any given time many of them can appear to be inactive (and thus are
receivers, not senders, in RTP). This sort of device is closer to
being an RTP mixer than an RTP translator, in that it terminates RTCP
reporting about the mixed streams, and it can re-write SSRCs,
timestamps, and sequence numbers, as well as the contents of the RTP
payloads, and can turn sources on and off at will without appearing to
be generating packet loss. Each projected stream will typically
preserve its original RTCP source description (SDES) information.</t>
</section>
<section title="Multiple SSRCs for a Single Media Source">
<t>There are also several cases where a single media source results in
the usage of multiple SSRCs within the same RTP session. Transport
robustification tools like <xref target="RFC4588">RTP
Retransmission</xref> result in multiple SSRCs, one with source data,
and another with the repair data. Scalable encoders and their RTP
payload formats, like H.264's extension for Scalable Video Coding <xref
target="RFC6190">(SVC)</xref>, can be transmitted in a configuration
where the scalable layers are distributed over multiple SSRCs within
the same session, to enable RTP packet stream level (SSRC) selection
and routing in conferencing middleboxes.</t>
</section>
</section>
<section title="Multi-Stream Endpoint RTP Media Recommendations">
<t>While an endpoint MUST (of course) stay within its share of the
available session bandwidth, as determined by signalling and congestion
control, this need not be applied independently or uniformly to each
media stream and its SSRCs. In particular, session bandwidth MAY be
reallocated among an endpoint's SSRCs, for example by varying the
bandwidth use of a variable-rate codec, or changing the codec used by
the media stream, up to the constraints of the session's negotiated (or
declared) codecs. This includes enabling or disabling media streams and
their redundancy streams as more or less bandwidth becomes
available.</t>
</section>
<section title="Multi-Stream Endpoint RTCP Recommendations">
<t>This section contains a number of different RTCP clarifications or
recommendations that enable more efficient and simpler behavior without
loss of functionality.</t>
<t>The RTP Control Protocol (RTCP) is defined in Section 6 of <xref
target="RFC3550"/>, but it is largely documented in terms of
"participants". In many cases, the specification's recommendations for
"participants" are to be interpreted as applying to individual SSRCs,
rather than to endpoints. This section describes several concrete cases
where this applies.</t>
<section title="RTCP Reporting Requirement">
<t>For each of an endpoint's SSRCs, whether or not they are currently
sending media, SR/RR and SDES packets MUST be sent at least once per
RTCP report interval. (For discussion of the content of SR or RR
packets' reception statistic reports, see <xref
target="I-D.ietf-avtcore-rtp-multi-stream-optimisation"/>.)</t>
</section>
<section title="Initial Reporting Interval">
<t>When a new SSRC is added to a unicast session, the sentence in
<xref target="RFC3550"/>'s Section 6.2 applies: "For unicast sessions
... the delay before sending the initial compound RTCP packet MAY be
zero." This applies to individual SSRCs as well. Thus, endpoints MAY
send an initial RTCP packet for an SSRC immediately upon adding it to
a unicast session.</t>
<t>This allowance also applies, as written, when initially joining a
unicast session. However, in this case some caution needs to be
exercised if the endpoint or mixer has a large number of sources
(SSRCs), as this can create a significant burst. How big an issue this
is depends on the number of sources for which the initial SR or RR
packets and Source Description (SDES) CNAME items are to be sent, in
relation to the RTCP bandwidth.</t>
<t>(tbd: Maybe some recommendation here? The aim in restricting this
to unicast sessions was to avoid this burst of traffic, which the
usual RTCP timing and reconsideration rules will prevent.)</t>
</section>
<section anchor="compound" title="Compound RTCP Packets">
<t>Section 6.1 in <xref target="RFC3550"/> gives the following advice
to RTP translators and mixers:</t>
<t><list style="empty">
<t>"It is RECOMMENDED that translators and mixers combine
individual RTCP packets from the multiple sources they are
forwarding into one compound packet whenever feasible in order to
amortize the packet overhead (see Section 7). An example RTCP
compound packet as might be produced by a mixer is shown in Fig.
1. If the overall length of a compound packet would exceed the MTU
of the network path, it SHOULD be segmented into multiple shorter
compound packets to be transmitted in separate packets of the
underlying protocol. This does not impair the RTCP bandwidth
estimation because each compound packet represents at least one
distinct participant. Note that each of the compound packets MUST
begin with an SR or RR packet."</t>
</list></t>
<t><list style="empty">
<t>Note: To avoid confusion, an RTCP packet is an individual item,
such as a Sender Report (SR), Receiver Report (RR), Source
Description (SDES), Goodbye (BYE), Application Defined (APP),
<xref target="RFC4585">Feedback</xref> or <xref
target="RFC3611">Extended Report (XR)</xref> packet. A compound
packet is the combination of two or more such RTCP packets where
the first packet has to be an SR or an RR packet, and which
contains an SDES packet containing a CNAME item.</t>
</list>The above results in compound RTCP packets that contain
multiple SR or RR packets from different sources (SSRCs) as well as
any of the other packet types. There are no restrictions on the order
in which the packets can occur within the compound packet, except the
regular compound rule, i.e., starting with an SR or RR.</t>
<t>This advice applies to multi-media-stream endpoints as well, with
the same restrictions and considerations. (Note, however, that the
last sentence does not apply to <xref target="RFC4585">AVPF</xref> or
<xref target="RFC5124">SAVPF</xref> feedback packets if <xref
target="RFC5506">Reduced-Size RTCP</xref> is in use.)</t>
<section title="Maintaining AVG_RTCP_SIZE">
<t>When multiple local SSRCs send their RTCP packets in the
same compound packet, the RTCP compound packets become larger.
This affects the average RTCP packet size (avg_rtcp_size) that is
tracked for the purpose of RTCP transmission scheduling. This section
discusses the impact and provides recommendations on how to deal with
it.</t>
<t>In this section, the term 'RTCP Compound Packet' covers more than
proper RTCP compound packets, i.e. ones that start with an SR or RR
RTCP packet and include at least one SDES CNAME item. For the purpose
of the calculation below, any other valid lower-layer datagram unit
that an RTCP implementation can send or receive is also considered,
regardless of whether it is an aggregate of RTCP packets. This
especially includes <xref
target="RFC5506">Reduced-Size RTCP packets</xref>.</t>
<t>The RTCP packet scheduling algorithm that is defined in <xref
target="RFC3550">RTP</xref> deals with individual SSRCs. These SSRCs
transmit their set of RTCP packets at each scheduled interval. Thus,
to maintain this per-SSRC property of the scheduling, the
avg_rtcp_size needs to be updated with per-SSRC average RTCP
compound packet sizes. The avg_rtcp_size value SHALL be updated for
each received or sent RTCP compound packet with the total size
(including packet overhead such as IP/UDP) divided by the number of
reporting SSRCs. The number of reporting SSRCs SHALL be determined
by counting the number of different SSRCs that are the source of
Sender Report (SR) or Receiver Report (RR) RTCP packets within the
compound. For a non-compound RTCP packet, i.e. one that contains no SR
or RR RTCP packets at all -- as can happen with <xref
target="RFC5506">Reduced-Size RTCP packets</xref> -- the SSRC count
SHALL be considered to be 1.</t>
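<t>As a non-normative illustration, the update rule above could be
sketched as follows. The function and parameter names are assumptions
made for this example only. The 1/16 weighting is the running-average
rule from the RTCP timer rules in <xref target="RFC3550"/>.</t>
<figure><artwork><![CDATA[
```python
def reporting_ssrc_count(reporter_ssrcs):
    """Number of distinct SSRCs that sourced an SR or RR in one datagram.

    A datagram with no SR or RR at all (possible with Reduced-Size RTCP)
    counts as a single reporting SSRC.
    """
    return max(1, len(set(reporter_ssrcs)))


def update_avg_rtcp_size(avg_rtcp_size, datagram_size, reporter_ssrcs):
    """Fold one sent or received datagram into the running average.

    datagram_size includes lower-layer overhead such as IP/UDP headers.
    The size is divided by the number of reporting SSRCs before it is
    fed into the usual 1/16 weighted average.
    """
    per_ssrc_size = datagram_size / reporting_ssrc_count(reporter_ssrcs)
    return (1.0 / 16.0) * per_ssrc_size + (15.0 / 16.0) * avg_rtcp_size
```
]]></artwork></figure>
<t>In this sketch, a 1440-byte datagram carrying reports from six
SSRCs contributes 240 bytes to the average, the same as a 240-byte
datagram from a single SSRC.</t>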
<t><list style="empty">
<t>Note: The above makes it possible to amortize the packet
overhead between the number of SSRCs sharing a RTCP compound
packet.</t>
</list>For an RTCP endpoint that doesn't follow the above rule,
and instead uses the full RTCP compound packet size as input, the
average RTCP reporting interval will be scaled up (i.e. become
longer) by a factor that is proportional to the number of SSRCs
sourcing RTCP packets in an RTCP compound packet, as well as to the
share of aggregated SSRCs among the total number of
participants. This factor can quite easily become larger than 5.
For example, with a 1500-byte MTU and an average per-SSRC sum of RTCP
packets of 240 bytes, the MTU will fit 6 such sets of packets. If the
receiving endpoint has a single SSRC and all other endpoints fill their
MTU fully, the factor will be close to 6. If the RTCP configuration is
such that the transmission interval is bandwidth limited, rather
than bounded by a minimum interval (Tmin or T_RR_INT),
then the other endpoints will likely time out this SSRC, because its
regular RTCP interval is more than 5 times that of the rest of the
endpoints.</t>
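<t>The arithmetic behind this example can be checked with a rough
model. The sketch below assumes that every endpoint sends equally often
and ignores lower-layer header overhead; both are simplifications, and
all names are hypothetical.</t>
<figure><artwork><![CDATA[
```python
def fill_count(mtu, per_ssrc_bytes):
    """How many per-SSRC sets of RTCP packets fit in one datagram."""
    return mtu // per_ssrc_bytes


def interval_scale_factor(mtu, per_ssrc_bytes, other_endpoints):
    """Ratio of the naive average size to the per-SSRC average size.

    Models one single-SSRC endpoint observing `other_endpoints` peers
    that each fill the MTU, with every endpoint sending equally often.
    """
    full = fill_count(mtu, per_ssrc_bytes) * per_ssrc_bytes
    # Naive input: whole datagram sizes, i.e. the endpoint's own small
    # packet plus the full-size datagrams of its peers.
    naive_avg = (per_ssrc_bytes + other_endpoints * full) / (1 + other_endpoints)
    # Per-SSRC input: every datagram contributes per_ssrc_bytes.
    return naive_avg / per_ssrc_bytes
```
]]></artwork></figure>
<t>Under these assumptions, the factor approaches 6 as the number of
MTU-filling peers grows, and equals 1 when there are no such peers.</t>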
</section>
<section title="Scheduling RTCP with Multiple Reporting SSRCs">
<t>When implementing RTCP packet scheduling for cases where multiple
reporting SSRCs are aggregating their RTCP packets in the same
compound packet there are a number of challenges. First of all, we
have the goal of not changing the general properties of the RTCP
packet transmissions, which include the general inter-packet
distribution, and the behavior for dealing with flash joins as well
as other dynamic events.</t>
<t>The mechanism specified below deals with the following constraints:<list style="symbols">
<t>One can't have a priori knowledge about which RTCP
packets are to be sent, or their size, prior to generating the packets.
Consequently, the time from generation to transmission ought to be as
short as possible to minimize the information that becomes stale.</t>
<t>There is an MTU limit that one ought to avoid exceeding,
as that requires lower-layer fragmentation (e.g., IP
fragmentation), which reduces the packets' probability of
reaching the receiver(s).</t>
</list></t>
<t>Schedule all the endpoint's local SSRCs individually for
transmission using the regular calculation of Tn for the
profile being used. Each time an SSRC's Tn timer expires, do the regular
reconsideration. If the reconsideration indicates that an RTCP
packet is to be sent:<list style="numbers">
<t>Consider if an additional SSRC can be added. That
consideration is done by picking the SSRC which has the Tn value
closest in time to now (Tc).</t>
<t>Calculate how much space for RTCP packets would be
needed to add that SSRC.</t>
<t>If the considered SSRC's RTCP Packets fit within the lower
layer datagram's Maximum Transmission Unit, taking the necessary
protocol headers into account and the consumed space by prior
SSRCs, then add that SSRC's RTCP packets to the compound packet
and go again to Step 1.</t>
<t>If the considered SSRC's RTCP Packets will not fit within the
compound packet, then transmit the generated compound
packet.</t>
<t>Update the RTCP parameters for each SSRC that has been
included in the sent RTCP packet. The tp value for each SSRC
MUST be updated as follows:<list style="hanging">
<t hangText="For the first SSRC:">As this SSRC was the one
that was reconsidered, the tp value is set to tc, as
defined in <xref target="RFC3550">RTP</xref>.</t>
<t hangText="For any additional SSRC:">The tp value SHALL be
set to the transmission time this SSRC would have had, had it not
been aggregated, given the current session
context. This value is derived by taking this SSRC's tn
value and performing reconsideration and updating tn until
tp + T &lt;= tn. Then set tp to this tn value.</t>
</list></t>
<t>For the sent SSRCs calculate new tn values based on the
updated parameters and reschedule the timers.</t>
</list></t>
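<t>As a non-normative illustration, one possible reading of the
selection steps and of the tp update for additional SSRCs is sketched
below. The LocalSsrc type, the function names, and the caller-supplied
interval function (which is assumed to return a freshly randomized T)
are hypothetical. RTCP packet generation and reverse reconsideration
are left out.</t>
<figure><artwork><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class LocalSsrc:
    ssrc: int
    tn: float          # next scheduled RTCP transmission time
    tp: float          # last (possibly virtual) transmission time
    rtcp_bytes: int    # size of the RTCP packets this SSRC wants to send


def select_for_aggregation(trigger, others, tc, mtu, header_bytes):
    """Steps 1-4: start with the SSRC whose timer fired, then keep adding
    the SSRC whose tn is closest to tc while the datagram still fits."""
    chosen = [trigger]
    used = header_bytes + trigger.rtcp_bytes
    for cand in sorted(others, key=lambda s: abs(s.tn - tc)):
        if used + cand.rtcp_bytes > mtu:
            break  # step 4: candidate does not fit, transmit what we have
        chosen.append(cand)
        used += cand.rtcp_bytes
    return chosen


def virtual_tp(tp, tn, interval):
    """Step 5 for an additional SSRC: repeat reconsideration against its
    own tn until tp + T <= tn, then use that tn as the new tp.

    `interval` is a caller-supplied function returning a fresh T.
    """
    while True:
        t = interval()
        if tp + t <= tn:
            return tn
        tn = tp + t
```
]]></artwork></figure>
<t>With 240 bytes of RTCP per SSRC, a 1500-byte MTU, and an assumed
28 bytes of IPv4/UDP headers, this selection aggregates six SSRCs and
leaves any remaining SSRCs on their own timers.</t>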
<t>Reverse reconsideration needs to be performed as specified in
<xref target="RFC3550">RTP</xref>. It is important to note that
under the above algorithm when performing reconsideration, the value
of tp can actually be larger than tc. However, that still has the
desired effect of pulling the tp value (as well as tn) towards tc
in direct proportion to the reduction in group size.</t>
<t>The above algorithm has been shown in simulations to maintain the
inter-RTCP-packet transmission distribution for the SSRCs and
consume the same amount of bandwidth as non-aggregated packets in
RTP sessions with static sets of participants. With this algorithm
the actual transmission interval for any SSRC triggering an RTCP
compound packet transmission is following the regular transmission
rules. It also handles the cases where the number of SSRCs that can
be included in an aggregated packet varies. An SSRC that previously
was aggregated and fails to fit in a packet still has its own
transmission scheduled according to normal rules. Thus, it will
trigger a transmission in due time, or the SSRC will be included in
another aggregate.</t>
<!--MW: A Possible error case in the above would be if one SSRC accumulate intervals and
move its tp and tn more than 5*Td beyond tc while all SSRCs fits in the aggregate. Then
the SSRCs in the aggregate changes, thus causing this SSRC to not fit. This can a cause
a timeout if no other SSRC aggregates this one. -->
<t>The algorithm's behavior under SSRC group size changes is under
investigation. However, it is expected to be well behaved based on
the following analyses. <list style="hanging">
<t
hangText="RTP sessions where the number of SSRCs is growing:">When
the group size is growing, the Td values grow in proportion to
the number of new SSRCs in the group. When the tn timer
expires for an SSRC, that SSRC will reconsider the
transmission and with a certain probability reschedule the tn
timer. This part of the reconsideration algorithm is only
impacted by the above algorithm in that tp values can lie in
the future instead of being set to the time of the actual last
transmission at the time of updating tp. Thus, in the worst case, the
scheduling causes a plateau effect for that SSRC. That effect
depends on how far into the future tp can advance.</t>
<t
hangText="RTP sessions where the number of SSRCs is shrinking:">When
the group shrinks, reverse reconsideration moves the tp and tn
values towards tc proportionally to the number of SSRCs that
leave the session compared to the total number of participants
when they left. Thus, group size reductions also need to be
handled.</t>
</list></t>
<t>In general the potential issue that might exist depends on how
far into the future the tp value can drift compared to the actual
packet transmissions that occur. That drift can only occur for an
SSRC that never is the trigger for RTCP packet transmission and
always gets aggregated, and where the calculated packet transmission
interval randomly occurs so that tn - tp for this SSRC is on average
larger than for the ones that get transmitted.</t>
</section>
</section>
<section title="RTP/AVPF Feedback Packets">
<t>This section discusses the transmission of RTP/AVPF feedback
packets when the transmitting endpoint has multiple SSRCs.</t>
<section title="The SSRC Used">
<t>When an RTP endpoint has multiple SSRCs, it can make certain
choices on which SSRC to use as the source of an RTCP Feedback
Packet. This sub-section discusses some considerations of this.<list
style="symbols">
<t>The media type of the media the SSRC transmits is actually
not a relevant factor when considering if an SSRC can transmit a
particular Feedback message.</t>
<t>Feedback messages which are Notifications or Indications
regarding the endpoint's own RTP packet stream need to be sent
using the SSRC transmitting the media they relate to. This also
includes notifications that are related to a received request or
command.</t>
<t>The SSRC used to send feedback messages has a role as either
a media sender or a receiver. The bandwidth pools can be
different for SSRCs that are senders and receivers. Thus
feedback messages that are expected to be more frequent can be sent
from an SSRC that has the better possibility of sending frequent
RTCP compound packets or reduced size packets. This also affects
the consideration if the SSRC can be used in immediate mode or
not.</t>
<t>Some Feedback Types require consistency in the sender. For
example, with TMMBR, if one SSRC sets a limitation, the same SSRC needs to
be the one that increases it. Others can simply benefit from
having this property.</t>
</list></t>
<t>Note that the source of the feedback RTCP packet does not need to
be any of the sources (SSRC) including SR/RR packets in a compound
packet. For <xref target="RFC5506">Reduced-Size RTCP</xref>, the
aggregation of feedback messages from multiple sources is not
limited beyond the consideration in Section 4.2.2 of <xref
target="RFC5506"/>.</t>
</section>
<section title="Scheduling a Feedback Packet">
<t>When an SSRC has a need to transmit a feedback packet in early
mode it follows the scheduling rules defined in Section 3.5 in <xref
target="RFC4585">RTP/AVPF</xref>. When following these rules the
following clarifications need to be taken into account:<list
style="symbols">
<t>A session is considered to be point-to-point or
multiparty based not on the number of SSRCs, but on the number of
endpoints directly seen in the RTP session by the endpoint. (tbd:
Clarify what is considered to "see" an endpoint.)</t>
<t>Note that when checking if there is already a scheduled
compound RTCP packet containing feedback messages (Step 2 in
Section 3.5.2), that check is done considering all local
SSRCs.</t>
</list></t>
<t>TBD: The above does not allow an SSRC that is unable to send
either an early or regular RTCP packet with the feedback message
within the T_max_fb_delay to trigger another SSRC to send an early
packet on which it could piggyback. Nor does it allow feedback to
piggyback on regular RTCP packet transmissions that occur within
T_max_fb_delay. A question is whether either of these behaviours ought
to be allowed.</t>
<t>The latter appears simple and straightforward. Instead of
discarding an FB message in step 4a: alternative 2, one could place
such messages in a cache with a discard time equal to
T_max_fb_delay, and if any of the SSRCs schedules an RTCP packet
for transmission within that time, that packet includes the message.</t>
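<t>Purely as an illustration of this open proposal, such a cache could
look like the sketch below. The FeedbackCache class and its method
names are hypothetical, and nothing here is specified behaviour.</t>
<figure><artwork><![CDATA[
```python
class FeedbackCache:
    """Holds feedback messages that could not be sent early."""

    def __init__(self, t_max_fb_delay):
        self.t_max_fb_delay = t_max_fb_delay
        self._pending = []  # list of (discard_time, message)

    def put(self, message, now):
        """Cache a message; it is discarded after T_max_fb_delay."""
        self._pending.append((now + self.t_max_fb_delay, message))

    def take(self, now):
        """Return messages still fresh at `now` and empty the cache.

        Intended to be called when any local SSRC is about to transmit
        an RTCP packet, so the fresh messages can be included in it.
        """
        fresh = [msg for deadline, msg in self._pending if now <= deadline]
        self._pending = []
        return fresh
```
]]></artwork></figure>
<t>Messages whose discard time has passed are dropped silently when the
cache is next emptied, which matches the discard they would otherwise
have received immediately.</t>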
<t>The former case can have more widespread impact on the
application, and possibly also on the RTCP bandwidth consumption as
it allows for more massive bursts of RTCP packets. Still, on a time
scale of a regular reporting interval, it ought to have no effect on
the RTCP bandwidth as the extra feedback messages increase the
avg_rtcp_size.</t>
</section>
</section>
</section>
<section title="RTCP Considerations for Streams with Disparate Rates">
<t>It is possible for a single RTP session to carry streams of greatly
differing bandwidth. There are two scenarios where this can occur. The
first is when a single RTP session carries multiple flows of the same
media type, but with very different quality; for example a video
switching multi-point conference unit might send a full rate
high-definition video stream of the active speaker but only thumbnails
for the other participants, all sent in a single RTP session. The second
scenario occurs when audio and video flows are sent in a single RTP
session, as discussed in <xref
target="I-D.ietf-avtcore-multi-media-rtp-session"/>.</t>
<t>An RTP session has a single set of parameters that configure the
session bandwidth, the RTCP sender and receiver fractions (e.g., via the
SDP "b=RR:" and "b=RS:" lines), and the parameters of the <xref
target="RFC4585">RTP/AVPF profile</xref> (e.g., trr-int) if that profile
(or its <xref target="RFC5124"> secure extension, RTP/SAVPF</xref>) is
used. As a consequence, the RTCP reporting interval will be the same for
every SSRC in an RTP session. This uniform RTCP reporting interval can
result in RTCP reports being sent more often than is considered
desirable for a particular media type. For example, if an audio flow is
multiplexed with a high quality video flow where the session bandwidth
is configured to match the video bandwidth, this can result in the RTCP
packets having a greater bandwidth allocation than the audio data rate.
If the reduced minimum RTCP interval described in Section 6.2 of <xref
target="RFC3550"/> is used in the session, which might be appropriate
for video where rapid feedback is wanted, the audio sources could be
expected to send RTCP packets more often than they send audio data
packets. This is most likely undesirable, and while the mismatch can be
reduced through careful tuning of the RTCP parameters, particularly
trr-int in RTP/AVPF sessions, it is inherent in the design of the RTCP
timing rules, and affects all RTP sessions containing flows with
mismatched bandwidth.</t>
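<t>A small worked example makes the first mismatch concrete. The
session bandwidth and audio rate below are assumed figures chosen for
illustration. The 5% share is the RTCP fraction recommended in <xref
target="RFC3550"/>, and the function name is hypothetical.</t>
<figure><artwork><![CDATA[
```python
def rtcp_allocation_kbps(session_kbps, rtcp_percent=5.0):
    """Total RTCP bandwidth for the session, shared by all SSRCs."""
    return session_kbps * rtcp_percent / 100.0


session_kbps = 4000.0   # assumed: session sized for high quality video
audio_kbps = 64.0       # assumed: data rate of one audio flow

rtcp_kbps = rtcp_allocation_kbps(session_kbps)
exceeds_audio = rtcp_kbps > audio_kbps
```
]]></artwork></figure>
<t>With these assumed figures, the session's RTCP allocation is 200
kbps, roughly three times the audio data rate.</t>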
<t>Having multiple media types in one RTP session also results in more
SSRCs being present in this RTP session. This increases the amount of
cross reporting between the SSRCs. From an RTCP perspective, two RTP
sessions with half the number of SSRCs in each will be slightly more
efficient. If an application needs the higher efficiency that comes with
fewer SSRCs, or needs to tailor RTCP usage per
media type, it needs to use independent RTP sessions.</t>
<t>When configuring RTCP, the need for regular periodic
reporting needs to be weighed against any feedback or control messages
being sent. Applications using RTP/AVPF or RTP/SAVPF are RECOMMENDED to
consider setting the trr-int parameter to a value suitable for the
application's needs, thus potentially reducing the need for regular
reporting and thus releasing more bandwidth for use for feedback or
control.</t>
<t>Another aspect of an RTP session with multiple media types is that
the RTCP packets, RTCP Feedback Messages, or RTCP XR metrics used might
not be applicable to all media types. Instead, all RTP/RTCP endpoints
need to correlate the media type of the SSRC being referenced in a
message or packet and only use those that apply to that particular SSRC
and its media type. Signalling solutions might have shortcomings when it
comes to indicating that a particular set of RTCP reports or feedback
messages only apply to a particular media type within an RTP
session.</t>
<section anchor="sec-timeout-ssrc" title="Timing out SSRCs">
<t>All SSRCs used in an RTP session MUST use the same timeout
behaviour to avoid premature timeouts. This will depend on the RTP
profile and its configuration. The RTP specification provides several
options that can influence the values used when calculating the time
interval. To avoid interoperability issues when using this
specification, this document makes several clarifications to the
calculations.</t>
<t>For RTP/AVP, RTP/SAVP, RTP/AVPF, and RTP/SAVPF with T_rr_interval =
0, the timeout interval SHALL be calculated using a multiplier of 5,
i.e. the timeout interval becomes 5*Td. The Td calculation SHALL be
done using a Tmin value of 5 seconds, not the reduced minimum interval,
even if the latter is used to calculate RTCP packet transmission
intervals. If using
either the RTP/AVPF or RTP/SAVPF profiles with T_rr_interval != 0 then
the calculation as specified in Section 3.5.4 of RFC 4585 SHALL be
used with a multiplier of 5, i.e. Tmin in the Td calculation is the
T_rr_interval.</t>
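<t>As a non-normative illustration, the timeout calculation described
above could be sketched as follows. The function names are assumptions,
and the deterministic interval is simplified: the sender/receiver
bandwidth split of <xref target="RFC3550"/> is left out.</t>
<figure><artwork><![CDATA[
```python
def deterministic_interval(members, avg_rtcp_size, rtcp_bw, tmin):
    """Td as the larger of Tmin and members * avg_rtcp_size / rtcp_bw.

    rtcp_bw is in octets per second. The sender/receiver bandwidth
    split is deliberately omitted from this sketch.
    """
    return max(tmin, members * avg_rtcp_size / rtcp_bw)


def timeout_interval(members, avg_rtcp_size, rtcp_bw, t_rr_interval=0.0):
    """Timeout of 5 * Td, with Tmin chosen as described above.

    t_rr_interval == 0 also covers RTP/AVP and RTP/SAVP, which have no
    such parameter. The fixed 5 second Tmin is used even if a reduced
    minimum is in use for scheduling transmissions.
    """
    tmin = 5.0 if t_rr_interval == 0 else t_rr_interval
    return 5 * deterministic_interval(members, avg_rtcp_size, rtcp_bw, tmin)
```
]]></artwork></figure>
<t>For a small session where bandwidth is not the limiting factor, this
gives a 25 second timeout without T_rr_interval and a 20 second timeout
with a T_rr_interval of 4 seconds.</t>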
<t>If endpoints implementing the RTP/AVP and RTP/AVPF profiles (or
their secure variants) are combined in a single RTP session, and the
RTP/AVPF endpoints use a non-zero T_rr_interval that is significantly
lower than 5 seconds, then there is a risk that the RTP/AVPF endpoints
will prematurely time out the RTP/AVP SSRCs due to their different RTCP
timeout intervals. Conversely, if the RTP/AVPF endpoints use a
T_rr_interval that is significantly larger than 5 seconds, there is a
risk that the RTP/AVP endpoints will time out the RTP/AVPF SSRCs. If
such mixed RTP profiles are used (though this is NOT RECOMMENDED),
the RTP/AVPF session SHOULD use a T_rr_interval of 4
seconds.</t>
<t><list style="empty">
<t>Note: It might appear strange to use a T_rr_interval of 4
seconds. It might be intuitive that this value ought to be 5
seconds, as then both the RTP/AVP and RTP/AVPF would use the same
timeout period. However, for RTP/AVPF with a non-zero
T_rr_interval, the mean interval between regular RTCP transmissions
will be larger than T_rr_interval due to the
scheduling algorithm. Thus, to enable an equal number of regular
RTCP transmissions in each direction between RTP/AVP and RTP/AVPF
endpoints, taking the altered timeout intervals into account, the
optimal value is around four (4) seconds, where almost four transmissions
will on average occur in each direction between the different
profile types, given an otherwise good configuration of parameters
with regard to T_rr_interval. If the RTCP bandwidth parameters are
selected so that Td based on bandwidth is close to 4 seconds, i.e. close
to T_rr_interval, the risk increases that RTP/AVPF SSRCs will be
timed out by RTP/AVP endpoints, as the RTP/AVPF SSRC might only
manage two transmissions in the timeout period.</t>
</list></t>
</section>
<section anchor="sec-rtcp-tuning" title="Tuning RTCP transmissions">
<t>This sub-section discusses what tuning can be done to reduce the
downsides of the shared RTCP packet intervals. It first considers
what possibilities exist for the <xref target="RFC3551">RTP/AVP</xref>
profile, then what additional tools are provided by <xref
target="RFC4585">RTP/AVPF</xref>.</t>
<section title="RTP/AVP and RTP/SAVP">
<t>When using the RTP/AVP or RTP/SAVP profiles the tuning one can do
is very limited. The controls one has are limited to the RTCP
bandwidth values and whether the minimum RTCP interval is scaled
according to the bandwidth. As the scheduling algorithm includes
both random factors and reconsideration, one cannot simply calculate
the expected average transmission interval using the formula for Td.
The formula does, however, indicate the important factors affecting the
transmission interval, namely the RTCP bandwidth available for the
role (Active Sender or Participant), the average RTCP packet size,
and the number of SSRCs classified in the relevant role. Note that
if the ratio of senders to total number of session participants is
larger than the ratio of RTCP bandwidth for senders in relation to
the total RTCP bandwidth, then senders and receivers are treated
together.</t>
<t>Let's start with some basic observations:<list style="letters">
<t>Unless the scaled minimum RTCP interval is used, Td
prior to randomization and reconsideration can never be less
than 5 seconds (assuming the default Tmin of 5 seconds).</t>
<t>If the scaled minimum RTCP interval is used, Td can become as
low as 360 divided by the RTP session bandwidth in kilobits per
second. In SDP the RTP session bandwidth is signalled using b=AS.
An RTP session bandwidth of 72 kbps results in Tmin being 5
seconds. An RTP session bandwidth of 360 kbps gives a Tmin of 1
second, and achieving a Tmin equal to one frame interval of a 25
Hz video stream (40 ms) requires an RTP session bandwidth of 9 Mbps!
(The use of the RTP/AVPF or RTP/SAVPF profile allows a smaller
Tmin, and hence more frequent RTCP reports, as discussed
below.)</t>
<t>Let's calculate the number (n) of SSRCs in the RTP session
that 5% of the session bandwidth can support to yield a Td value
equal to Tmin with minimal scaling. For this calculation we have
to make two assumptions. The first is that most or all SSRCs are
senders, resulting in everyone sharing the available bandwidth.
The second is the average RTCP packet size. We assume a packet
consisting of an SR containing (n-1) report blocks, up to a
maximum of 31 report blocks, and an SDES item with at
least a CNAME (17 bytes in size) in it. Such a basic packet will
be 800 bytes for n&gt;=32. With these parameters, both Tmin and the
time needed to send one RTCP packet decrease in proportion as the
bandwidth goes up (due to minimal scaling), so the example
bandwidths of 72 kbps, 360 kbps and 9 Mbps all support 9 SSRCs.</t>
<t>The actual transmission interval for a Td value is
[0.5*Td/1.21828,1.5*Td/1.21828], which means that for Td = 5
seconds, the interval is actually [2.052,6.156] seconds and the
distribution is not uniform, but rather
exponentially-increasing. The probability for sending at time X,
given it is within the interval, is the probability of picking X in
the interval times the probability of randomly picking a number
that is &lt;=X within the interval with a uniform probability
distribution. As a result, the majority of the
probability mass is above the Td value.</t>
</list></t>
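<t>The arithmetic behind these observations can be reproduced with the
following Python sketch, which is illustrative only. All names are
hypothetical. The per-field packet sizes are assumptions: the SR
header, sender info and report block sizes follow RFC 3550, while the
28-byte SDES size is chosen so that the packet reaches the 800 bytes
stated above for 32 or more SSRCs.</t>
<figure><artwork><![CDATA[
```python
SR_HEADER_BYTES = 8       # RTCP header plus sender SSRC
SENDER_INFO_BYTES = 20    # sender info block of an SR
REPORT_BLOCK_BYTES = 24   # one reception report block
SDES_BYTES = 28           # assumed SDES packet with a CNAME
MAX_REPORT_BLOCKS = 31    # limit per SR packet
COMPENSATION = 1.21828    # e - 1.5, timer reconsideration factor


def scaled_tmin(session_bw_kbps):
    """Scaled minimum RTCP interval in seconds."""
    return 360.0 / session_bw_kbps


def sr_packet_size(n):
    """Assumed compound packet size in bytes for n SSRCs."""
    blocks = min(max(n - 1, 0), MAX_REPORT_BLOCKS)
    return (SR_HEADER_BYTES + SENDER_INFO_BYTES
            + blocks * REPORT_BLOCK_BYTES + SDES_BYTES)


def max_ssrcs_at_tmin(session_bw_kbps, rtcp_fraction=0.05):
    """Largest n for which n * size / rtcp_bw stays within Tmin.

    Assumes all SSRCs are senders sharing the whole RTCP bandwidth.
    """
    rtcp_bw = rtcp_fraction * session_bw_kbps * 1000 / 8  # bytes/s
    tmin = scaled_tmin(session_bw_kbps)
    n = 1
    while (n + 1) * sr_packet_size(n + 1) / rtcp_bw <= tmin:
        n += 1
    return n


def transmission_bounds(td):
    """Lower and upper bound of the randomized interval for Td."""
    return 0.5 * td / COMPENSATION, 1.5 * td / COMPENSATION
```
]]></artwork></figure>
<t>Under these assumptions, max_ssrcs_at_tmin() returns 9 for session
bandwidths of 72, 360 and 9000 kbps alike, because the RTCP bandwidth
and the scaled Tmin vary inversely with each other.</t>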
<t>To conclude, with RTP/AVP and RTP/SAVP the key limitation for
small unicast sessions is going to be the Tmin value. Thus the RTP
session bandwidth configured in RTCP has to be sufficiently high,
following the rules for the scaled minimum RTCP interval, to reach
the reporting goals of the application.</t>
</section>
<section title="RTP/AVPF and RTP/SAVPF">
<t>When using RTP/AVPF or RTP/SAVPF we get a quite powerful
additional tool: the setting of the T_rr_interval, which has several
effects on the RTCP reporting. First of all, as Tmin is set to 0
after the initial transmission, the regular reporting interval is
instead determined by the regular bandwidth-based calculation and
the T_rr_interval. This has the effect that we are no longer
restricted by the minimum interval or even the scaling rule for the
minimum interval. Instead the RTCP bandwidth and the T_rr_interval are
the governing factors.</t>
<t>Now it also becomes important to distinguish between the
application's need for regular reports and its need for RTCP
feedback packet types. In Regular RTCP mode, as in Early RTCP mode,
the usage of the T_rr_interval prevents regular RTCP packets, i.e.
packets without any Feedback packets, from being sent more often than
T_rr_interval. This value is applied to prevent any regular RTCP
packet from being sent less than T_rr_interval times a uniformly
distributed random value from the interval [0.5,1.5] after the
previous regular packet. The random value is recalculated after
each regular RTCP packet transmission.</t>
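<t>A minimal Python sketch of this suppression rule is shown below for
illustration. The class and method names are hypothetical, and the
sketch covers only the gating of regular RTCP packets described above,
not the full scheduling algorithm of RFC 4585. Packets carrying
feedback messages are not subject to this gate.</t>
<figure><artwork><![CDATA[
```python
import random


class RegularReportGate:
    """Gates regular RTCP packets using T_rr_interval (sketch)."""

    def __init__(self, t_rr_interval, rng=None):
        self.t_rr_interval = t_rr_interval   # seconds
        self.rng = rng or random.Random()
        self.earliest_next = 0.0             # no packet sent yet

    def allows(self, now):
        """True if a regular RTCP packet may be sent at time now."""
        return now >= self.earliest_next

    def on_regular_sent(self, now):
        """Record a regular transmission and redraw the random
        factor from [0.5, 1.5] for the next suppression period."""
        factor = self.rng.uniform(0.5, 1.5)
        self.earliest_next = now + self.t_rr_interval * factor
```
]]></artwork></figure>
<t>A caller would check allows() whenever the regular RTCP timer
expires, skip the transmission if it returns False, and call
on_regular_sent() after each regular packet actually sent.</t>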
<t>So applications that have a use for feedback packets for some
media streams, for example video streams, but don't want frequent
regular reporting for audio, could configure the T_rr_interval to a
value so that the regular reporting for both audio and video is at a
level that is considered acceptable for the audio. They could then
use feedback packets, which will include RTCP SR/RR packets, unless
<xref target="RFC5506">reduced-size RTCP feedback packets</xref> are
used, and can include other report information in addition to the
feedback packet that needs to be sent. That way the available RTCP
bandwidth can be focused on the use that provides the most utility
for the application.</t>
<t>Using T_rr_interval still requires one to determine suitable
values for the RTCP bandwidth. In fact, this might become even
more important, as the RTCP bandwidth is more likely to affect the
RTCP behaviour and performance than when using RTP/AVP, since there
are fewer limitations affecting the RTCP transmission.</t>
<t>When using T_rr_interval, i.e. having it be non-zero, there are
configurations that have to be avoided. If the resulting Td value is
smaller than but close to T_rr_interval, then the interval within
which the actual regular RTCP packet transmission falls becomes very
large, from 0.5 times T_rr_interval up to 2.73 times the
T_rr_interval. Therefore, for configurations where one intends to have
Td smaller than T_rr_interval, Td is RECOMMENDED to be targeted
at values less than 1/4 of T_rr_interval, which results in the
range becoming [0.5*T_rr_interval, 1.81*T_rr_interval].</t>
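<t>The two ranges quoted above can be reproduced with the short Python
sketch below. The upper bound formula is an assumption of this sketch,
not something stated in this memo: it supposes that a regular packet
scheduled just before the longest suppression period (1.5 times
T_rr_interval) ends is suppressed, and that the next opportunity comes
one maximal randomized Td interval later. The function name is
hypothetical.</t>
<figure><artwork><![CDATA[
```python
COMPENSATION = 1.21828   # e - 1.5, timer reconsideration factor


def regular_interval_range(t_rr_interval, td):
    """Assumed (lower, upper) spacing of regular RTCP packets.

    Valid only as a sketch for td <= t_rr_interval.
    """
    lower = 0.5 * t_rr_interval
    upper = 1.5 * t_rr_interval + 1.5 * td / COMPENSATION
    return lower, upper
```
]]></artwork></figure>
<t>With Td equal to T_rr_interval the upper bound evaluates to about
2.73 times T_rr_interval, and with Td equal to a quarter of
T_rr_interval to about 1.81 times T_rr_interval, which matches the
figures given above.</t>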
<t>With RTP/AVPF, using a T_rr_interval of 0, or another
value significantly lower than Td, still has utility and gives different
behaviour compared to RTP/AVP. This avoids the Tmin limitations of
RTP/AVP, thus allowing more frequent regular RTCP reporting. In fact,
this will result in the RTCP traffic becoming as high as the
configured values.</t>
<t>(tbd: a future version of this memo will include examples of how
to choose RTCP parameters for common scenarios)</t>
<t>There exists no method within the specification for using
different regular RTCP reporting intervals depending on the media
type or individual media stream.</t>
</section>
</section>
</section>
<section anchor="security" title="Security Considerations">
<t>In the <xref target="RFC3711">secure RTP protocol (SRTP)</xref>, the
cryptographic context of a compound SRTCP packet is the SSRC of the
sender of the first RTCP (sub-)packet. This could matter in some cases,
especially for keying mechanisms such as <xref
target="RFC3830">MIKEY</xref> which allow use of per-SSRC keying.</t>
<t>Other than that, the standard security considerations of RTP apply;
sending multiple media streams from a single endpoint does not appear to
have different security consequences than sending the same number of
streams from separate endpoints.</t>
</section>
<section title="Open Issues">
<t>At this stage this document contains a number of open issues. The
list below summarizes them:<list style="numbers">
<t>Do we need to recommend, from a bit-rate burst perspective, that
unicast session joiners with many sources not use an initial minimum
interval of 0?</t>
<t>RTCP parameters for common scenarios in <xref
target="sec-rtcp-tuning"/>?</t>
<t>Does the scheduling algorithm work well with dynamic changes?</t>
<t>Do the scheduling algorithm changes impact previous
implementations in such a way that the report aggregation has to be
agreed on, and thus needs to be considered as an optimization?</t>
<t>An open question is if any improvements or clarifications ought to
be allowed regarding FB message scheduling in multi-SSRC
endpoints.</t>
</list></t>
<t/>
</section>
<section anchor="iana" title="IANA Considerations">
<t>No IANA actions needed.</t>
</section>
</middle>
<back>
<references title="Normative References">
<?rfc include='reference.RFC.2119'?>
<?rfc include='reference.RFC.3550'?>
<?rfc include='reference.RFC.3711'?>
<?rfc include='reference.RFC.4585'?>
<?rfc include='reference.RFC.5124'?>
<?rfc include='reference.RFC.5506'?>
</references>
<references title="Informative References">
<?rfc include='reference.RFC.3551'?>
<?rfc include='reference.RFC.3611'?>
<?rfc include='reference.RFC.3830'?>
<?rfc include='reference.RFC.4588'?>
<?rfc include='reference.RFC.6190'?>
<?rfc include='reference.I-D.ietf-avtcore-rtp-topologies-update'?>
<?rfc include='reference.I-D.ietf-mmusic-sdp-bundle-negotiation'?>
<?rfc include='reference.I-D.ietf-clue-framework'?>
<?rfc include='reference.I-D.ietf-avtcore-multi-media-rtp-session'?>
<?rfc include='reference.I-D.ietf-avtcore-rtp-multi-stream-optimisation'?>
</references>
<section title="Changes From Earlier Versions">
<t>Note to the RFC-Editor: please remove this section prior to
publication as an RFC.</t>
<section title="Changes From WG Draft -02">
<t><list style="symbols">
<t>Changed usage of Media Stream</t>
<t>Added Updates RFC 4585</t>
<t>Added rules for how to deal with RTCP when aggregating reports
from multiple SSRCs in the same compound packet:<list style="symbols">
<t>avg_rtcp_size calculation</t>
<t>Scheduling rules to maintain timing</t>
</list></t>
<t>Started a section clarifying and discussing RTP/AVPF Feedback
Packets and their scheduling.</t>
</list></t>
</section>
<section title="Changes From WG Draft -01">
<t><list style="symbols">
<t>None, a keep-alive version</t>
</list></t>
</section>
<section title="Changes From WG Draft -00">
<t><list style="symbols">
<t>Split the Reporting Group Extension from this draft into
draft-ietf-avtcore-rtp-multi-stream-optimization-00.</t>
<t>Added RTCP tuning considerations from
draft-ietf-avtcore-multi-media-rtp-session-02.</t>
</list></t>
</section>
<section title="Changes From Individual Draft -02">
<t><list style="symbols">
<t>Resubmitted as working group draft.</t>
<t>Updated references.</t>
</list></t>
</section>
<section title="Changes From Individual Draft -01">
<t><list style="symbols">
<t>Merged with draft-wu-avtcore-multisrc-endpoint-adver.</t>
<t>Changed how Reporting Groups are indicated in RTCP, to make it
clear which source(s) are the group's reporting source(s).</t>
<t>Clarified the rules for when sources can be placed in the same
reporting group.</t>
<t>Clarified that mixers and translators need to pass reporting
group SDES information if they are forwarding RR and SR traffic
from members of a reporting group.</t>
</list></t>
</section>
<section title="Changes From Individual Draft -00">
<t><list style="symbols">
<t>Added the Reporting Group semantic to explicitly indicate which
sources come from a single endpoint, rather than leaving it
implicit.</t>
<t>Specified that Reporting Group semantics (as they now are)
apply to AVPF and XR, as well as to RR/SR report blocks.</t>
<t>Added a description of the cascaded source-projecting mixer,
along with a calculation of its RTCP overhead if reporting groups
are not in use.</t>
<t>Gave some guidance on how the flexibility of RTCP randomization
allows some freedom in RTCP multiplexing.</t>
<t>Clarified the language of several of the recommendations.</t>
<t>Added an open issue discussing how avg_rtcp_size ought to be
calculated for multiplexed RTCP.</t>
<t>Added an open issue discussing how RTCP bandwidths are to be
chosen for sessions where source bandwidths greatly differ.</t>
</list></t>
</section>
</section>
</back>
</rfc>