Network Working Group M. Westerlund
Internet-Draft B. Burman
Intended status: Standards Track M. Lindqvist
Expires: January 17, 2013 F. Jansson
Ericsson
July 16, 2012
Using Simulcast in RTP sessions
draft-westerlund-avtcore-rtp-simulcast-01
Abstract
In some applications it may be necessary to send multiple media
streams derived from the same media source. This is called
Simulcast. This document discusses the best way of accomplishing
this in RTP. It concludes that a session based solution provides
the best support for simulcast, and defines such a solution.
Two extensions are necessary. The first is how to group RTP
sessions belonging to the same simulcast source using the
grouping framework, and the second is how to identify which SSRCs
belong to the same media source by using a new RTCP SDES item,
SRCNAME.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 17, 2013.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
Westerlund, et al. Expires January 17, 2013 [Page 1]
Internet-Draft RTP Simulcast July 2012
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Definitions
   2.1. Terminology
   2.2. Requirements Language
3. Simulcast and Applicability
   3.1. Simulcasting to RTP Mixer
      3.1.1. Simulcast Combined with Scalable Encoding
   3.2. Multicast Transported Simulcasted Media
      3.2.1. Diversity in Receiver Population
      3.2.2. Bit-rate Adaptation
   3.3. Simulcasting to a Consuming End-Point
   3.4. Same Encoding to Multiple Destinations
   3.5. Different Encoding to Independent Destinations
4. Simulcast Alternatives
   4.1. Using the Payload Type
   4.2. Using Single RTP session
   4.3. Using Multiple RTP sessions
5. Analysis
   5.1. RTP/RTCP Aspects
   5.2. Signalling Aspects
   5.3. Network Aspects
   5.4. Security Aspects
   5.5. Summary
6. Signaling Support for Multiple RTP session based Simulcast
   6.1. Grouping Simulcast RTP Sessions
      6.1.1. Declarative Use
      6.1.2. Offer/Answer Use
   6.2. Media Stream Requirements
   6.3. Relating Alternative Encodings
   6.4. Multiple Stream handling
7. Simulcast Signalling Examples
   7.1. Alice: Desktop Client
   7.2. Bob: Telepresence Room
   7.3. Fred: Dial-out to Legacy Client
   7.4. Joe: Dial-out to Desktop Client
8. IANA Considerations
9. Security Considerations
10. Acknowledgements
11. References
   11.1. Normative References
   11.2. Informative References
Authors' Addresses
1. Introduction
Simulcast is the act of simultaneously sending multiple different
versions of the same media content, e.g. the same video source
encoded with different video encoders. This can be done in several
ways and for different purposes. This document focuses on the case
where one wants to provide multiple streams with different encodings
over RTP [RFC3550] towards an intermediary so that the intermediary
can select which encoding to forward to other participants in the
session, and more specifically how the grouping of the streams is
defined.
The different encodings of a media content considered in this
document can differ in:
Bit-rate: The difference is the amount of bits spent to encode the
media thus giving different quality.
Codec: Different media codecs are used to ensure that different
receivers that do not have a common set of decoders can decode at
least one of the versions. This can include codec configuration
options that are not compatible, like video encoder profiles, or
the capability of receiving the transport packetization.
Sampling: Different sampling of media, in spatial as well as in
temporal domain, may be used to suit different rendering
capabilities or needs at the receiving endpoints, as well as a
method to achieve different bit-rates. For video streams, spatial
sampling affects image resolution and temporal sampling affects
video frame rate. For audio, spatial sampling relates to the
number of audio channels and temporal sampling affects audio
bandwidth. Obviously, a difference in sampling may result in a
difference in bit-rate.
There are different reasons for an application to provide a single
media source in different encodings. As soon as an application has
the need to send multiple encodings, there is a potential need for
simulcast. This need can arise even when using media codecs that
have scalability features built in. The purpose of this document is
to find the most suitable solution for the non-trivial variants of
simulcast and in order to do this, different ways of multiplexing the
different encodings are discussed. Following the presentation of the
alternatives, an analysis is performed on how different aspects like
RTP mechanisms, signaling possibilities, and network features are
affected by the alternatives. This is a specific application of the
aspects discussed in RTP Multiplexing Architecture
[I-D.westerlund-avtcore-multiplex-architecture]. The discussion
results in a conclusion, a solution, and a proposal for the
standardization work required to support simulcast.
2. Definitions
2.1. Terminology
The following terms and abbreviations are used in this document:
Encoding: A particular encoding is the choice of the media encoder
(codec) that has been used to compress the media and the fidelity
of that encoding through the choice of sampling, bit-rate and
other codec configuration parameters.
Different encodings: An encoding is different when some parameter
that characterizes the encoding of a particular media source is
changed. Such changes can affect one or more of the following
parameters: codec, codec configuration, bit-rate, sampling.
Simulcast versions: Media streams used for simulcast that use
different encodings and thus constitute different versions of the
same media source.
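The terminology above can be made concrete with a small data model. The following Python sketch is illustrative only: the class name Encoding, its field names, and the helper functions are assumptions introduced here and are not defined by this document or by RTP.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Encoding:
    """One encoding of a media source (hypothetical model)."""
    codec: str         # media codec, e.g. "H264"
    codec_config: str  # codec configuration, e.g. a profile name
    bitrate_kbps: int  # target bit-rate
    sampling: str      # spatial/temporal sampling, e.g. "1280x720@30"


def differing_parameters(a: Encoding, b: Encoding) -> List[str]:
    """Return the names of the parameters in which two encodings differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


def are_different_encodings(a: Encoding, b: Encoding) -> bool:
    """Two encodings are different when at least one parameter differs."""
    return bool(differing_parameters(a, b))


# Two simulcast versions of the same (hypothetical) camera source.
high = Encoding("H264", "baseline", 1500, "1280x720@30")
low = Encoding("H264", "baseline", 300, "320x180@15")
print(differing_parameters(high, low))
```

In this model, two streams from the same media source are simulcast versions of each other exactly when are_different_encodings returns True for their encodings.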
2.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. Simulcast and Applicability
This section discusses different usage scenarios for the term
simulcast and clarifies which of those this document focuses on. It
also reviews why simulcast and scalable codecs can be a useful
combination.
3.1. Simulcasting to RTP Mixer
This scenario relates to a multi-party session where one or more
central nodes are used to facilitate the media transport between the
session participants. Thus, this targets the RTP Mixer Topology
defined in [RFC5117] (Section 3.4: Topo-Mixer). This scenario is
targeted for further discussion in this document.
Simulcasting different media encodings of video that differ both in
resolution and in bit-rate is highly applicable to video conferencing
scenarios. For example, an RTP mixer selects the video of the most
active speaker and sends that participant's video stream as a high
resolution stream to the other participants, and in addition also
sends a number of low resolution video streams of the other
participants, enabling the receiving user to both display the current
speaker in high quality and monitor the other participants in lower
quality/resolution/size. As the participants should not receive the
stream showing themselves, the set of streams will be unique to each
participant.
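The selection described in this example can be sketched as follows. This is a minimal illustration, assuming each participant sends exactly one "high" and one "low" version. The function name forwarding_plan and the version labels are hypothetical, and what the active speaker itself should see is an application policy that this sketch does not address.

```python
from typing import Dict, List


def forwarding_plan(participants: List[str], active_speaker: str) -> Dict[str, Dict[str, str]]:
    """For each receiver, map every other participant to the version to forward."""
    plan: Dict[str, Dict[str, str]] = {}
    for receiver in participants:
        plan[receiver] = {
            # The active speaker is forwarded in high resolution, all others in low.
            sender: ("high" if sender == active_speaker else "low")
            for sender in participants
            if sender != receiver  # a participant never receives its own stream
        }
    return plan


plan = forwarding_plan(["A", "B", "C"], active_speaker="A")
for receiver, streams in plan.items():
    print(receiver, streams)
```

Each receiver ends up with a different set of streams, which is why the mixer needs to select among simulcast versions per receiver rather than forward one common set.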
A number of alternatives exist to provide both high and low
resolutions from an RTP Mixer:
Simulcast: The clients send one stream for the low resolution and
another for the high resolution.
Scalable Video Coding: The clients are using a video encoder that
can provide one stream that is both providing the high resolution
and also enables the mixer to extract a low resolution
representation from that single stream.
Transcoding in the Mixer: The clients send a high resolution stream
to the RTP Mixer which performs a transcoding to a lower
resolution stream.
The Transcoding alternative requires that the RTP mixer has a
sufficient amount of transcoding resources to produce the number of
low resolution streams required. In the worst case, all participants'
streams may need to be transcoded. If the resources are not
available, a different solution is needed. There will also normally
be a quality loss and an increase in latency associated with the
transcoding operation.
Scalable video encoding requires a more complex encoder compared to
non-scalable encoding. Also, if the resolution difference between
the streams is large, a scalable codec may in fact be only marginally
more bandwidth efficient than the simulcast case where the different
resolutions are sent as separate streams from the clients to the
mixer. At the same time, with scalable video encoding, the
transmission of all but the lowest resolution will consume more
bandwidth from the mixer to the other participants than with a non-
scalable encoding.
Simulcasting has the benefit that it is conceptually simple. It
enables the use of any media codec that the participants agree on,
allowing the RTP mixer to be codec-agnostic. With the currently
available video encoders, simulcasting may be less bit-rate efficient
in the path from the sending client to the mixer but more efficient
in the mixer to receiver path compared to Scalable Video Coding.
              +------------+      +---+
   +---+      |            |----->| B |
   |   |=====>|            |      +---+
   | A |      |   Mixer    |
   |   |----->|            |      +---+
   +---+      |            |=====>| C |
              +------------+      +---+
Figure 1: RTP Mixer selecting from simulcast versions
The sender A provides the mixer with both a high resolution version
"===>" and a low resolution version "--->". The mixer selects which
members of its receiver population should get a particular version.
3.1.1. Simulcast Combined with Scalable Encoding
As explained in the previous section, a scalable codec is not always
more bandwidth efficient than simulcast, especially in the path from
the mixer to the receiver.
There are however cases where a combination of simulcast and scalable
encoding can be beneficial. By using simulcast in cases where the
scalable codec is less efficient, one can optimize the efficiency of
the complete system. A good example of this usage would be where the
video is encoded using SVC transported in RTP [RFC6190], where each
simulcast stream has a different resolution, and each SVC media
stream uses temporal scalability and signal to noise ratio (SNR)
scalability within that single media stream. If only resolution and
temporal variations are needed, this can be implemented using the
non-scalable part of H.264, as each simulcast version provides the
different resolution, and each media stream within a simulcast
encoding has temporal scalability through the use of non-reference
frames.
3.2. Multicast Transported Simulcasted Media
Using multicast, particularly Source-Specific Multicast (SSM)
[RFC3569], to distribute RTP/RTCP packets to a large receiver
population raises some issues. Simulcast can potentially be useful
for at least two of them.
3.2.1. Diversity in Receiver Population
If there is any diversity in the receivers regarding e.g. capability,
codec support or code base, there are potentially restrictions in
what streams can be delivered to the receivers. If using the lowest
common denominator over a diverse receiver population isn't
acceptable, simulcast can be one possible solution. By offering
different stream alternatives, it is possible to let the receivers
choose the simulcast version that matches their capabilities. By
using explicit signalling for simulcast, it is not necessary for the
stream distributor to handle multiple receiver configurations
individually for a multi-media session, nor to ensure that each
receiver gets an encoding that matches their capabilities.
The granularity at which receivers can select a simulcast version is
the multicast group. Thus, this use case puts a strict requirement
on supporting RTP session multiplexing, because having a single RTP
session straddle several multicast groups makes any reporting on the
received sources very difficult to interpret. Using one RTP session
per simulcast version instead provides consistency.
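As an illustration of receiver-driven selection at multicast group level, the following Python sketch picks the group to join based on codec support. It assumes a hypothetical announcement that lists one multicast group (and thus one RTP session) per simulcast version. The addresses, codec names, and the function select_group are illustrative assumptions, not taken from this document.

```python
from typing import Dict, List, Optional, Set

# Hypothetical announcement: one multicast group (and one RTP session)
# per simulcast version. Addresses and codec names are illustrative only.
ANNOUNCED_VERSIONS: List[Dict[str, str]] = [
    {"codec": "H264", "group": "232.0.0.1"},
    {"codec": "VP8", "group": "232.0.0.2"},
]


def select_group(versions: List[Dict[str, str]], supported: Set[str]) -> Optional[str]:
    """Return the multicast group of the first version the receiver can decode."""
    for version in versions:
        if version["codec"] in supported:
            return version["group"]
    return None  # no announced version matches this receiver


print(select_group(ANNOUNCED_VERSIONS, {"VP8"}))
```

Because the receiver joins only the selected group, the sender does not need to track individual receiver configurations.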
3.2.2. Bit-rate Adaptation
If the network paths from the media sender to the receivers can
support different bit-rates, there is a need to support media streams
encoded to different bit-rates. If these path differences are of a
more static nature, for example depending primarily on the underlying
link layers, using simulcast has an advantage over scalable encoding.
The reason is that the efficiency of scalable coding will never be
better than encoding to a single target rate. When the receiver can
determine its current network interface connectivity, it can choose
a simulcast version with certainty. That choice will also remain
correct until another network interface becomes the active one.
This assumes that the multicast transmission uses dedicated resources
and will thus not be congested due to other network traffic. To
support this behavior, the signalling must indicate which media
streams are alternatives to each other, and it is also necessary
to be able to compare the aggregate bit-rate of the selected
multicast group(s) with the available network properties.
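The static selection described above can be sketched as a simple budget check. The following Python example is a minimal illustration under stated assumptions: bit-rates are known from signalling, the available rate is known from the active network interface, and the function name select_version and its parameters are hypothetical.

```python
from typing import Dict, Optional


def select_version(alternatives: Dict[str, int],
                   available_kbps: int,
                   other_groups_kbps: int = 0) -> Optional[str]:
    """Pick the highest bit-rate alternative whose aggregate rate fits.

    alternatives maps a version name to its bit-rate in kbit/s.
    other_groups_kbps is the rate of already selected multicast groups,
    for example an audio stream received alongside the video.
    """
    budget = available_kbps - other_groups_kbps
    fitting = {name: rate for name, rate in alternatives.items() if rate <= budget}
    if not fitting:
        return None  # not even the lowest version fits
    return max(fitting, key=fitting.get)


video = {"low": 300, "mid": 800, "high": 2000}
print(select_version(video, available_kbps=1000, other_groups_kbps=64))
```

The sketch deliberately ignores congestion, in line with the assumption in the text that the multicast transmission uses dedicated resources.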
Simulcast is possible to use also in more dynamic situations where
each receiver continuously gathers reception statistics to detect
path congestion and based on that may change which version to
receive. The main issue with such usage is how to achieve a switch
from one version to another with minimal playback interruption and
without putting extra load on the network during the actual
switch. Here, scalable encoding in general has better
characteristics since scalability layers are typically synchronized.
When comparing simulcast and scalable encoding, the trade-offs are
different and the down-sides occur at different places. Simulcast
will have a higher bit-rate load at a media sender and that will also
be the case for any network path shared between receivers of multiple
simulcast versions. However, for parts of the network path where
there is only a single simulcast version, the achievable quality at a
given bit-rate will be slightly higher for simulcast. It will also
be more difficult to seamlessly switch between simulcast versions
than between different scalable encodings, as simulcast actually
switches from one media stream version to another instead of adding
or removing some enhancement layers.
3.3. Simulcasting to a Consuming End-Point
This scenario is based on an RTP Transport Translator (Section 3.3:
Topo-Trn-Translator) [RFC5117]. The transport translator functions
as a relay and transmits all streams received from one participant to
all other participants. For example, when simulcasting a low
resolution and a high resolution video stream, the RTP Translator
would send all the streams to all clients. This clearly increases
the bit-rate transmitted on the paths to the clients compared to the
mixer case in the previous section. The only simulcast benefit for
the receiving client over a single stream scenario would be reduced
decoding complexity for the low resolution streams. A single stream
scenario which only transmits the high resolution stream would allow
the receiver to decode it and scale it down to the desired
resolution.
The usage of transport translator and simulcast becomes efficient if
each receiving client is allowed to control or configure the relay
with respect to which version it wants to receive. However, such
usage of RTP has some potential issues with RTCP. One example is
when a receiver has indicated to the transport translator that it
does not want to receive a particular stream, but at the same time it
is receiving and reporting on other streams from the same sender. In
this case, the sender will receive no RTCP messages about the non-
forwarded stream and therefore get the impression that the stream
somehow is lost. Thus some consideration and mechanism are needed to
support such a use case in order not to break RTCP reception
reporting.
This scenario is considered in the continuation of the document but
with less emphasis than on the RTP mixer case.
3.4. Same Encoding to Multiple Destinations
One interpretation of simulcast is when one encoding is sent to
multiple receivers. This is well supported in RTP by simply copying
all outgoing RTP and RTCP traffic to several transport destinations,
if the intention is to create a common RTP session. As long as all
participants do the same, a full mesh is constructed and everyone in
the multi-party session has a similar view of the joint RTP session.
This is analogous to an Any Source Multicast (ASM) session but without
the traffic optimization, as multiple copies of the same content are
likely to have to pass over the same link.
   +---+      +---+
   | A |<---->| B |
   +---+      +---+
     ^         ^
      \       /
       \     /
        v   v
        +---+
        | C |
        +---+
Figure 2: Full Mesh / Multi-unicast
As this type of simulcast is analogous to ASM usage and RTP has good
support for ASM sessions, no further consideration for this scenario
is made in this document.
3.5. Different Encoding to Independent Destinations
Another alternative interpretation of simulcast is multiple
destinations, where each destination gets a specifically tailored
version, but where the destinations are independent. A typical
example for this would be a streaming server distributing the same
live session to a number of receivers, adapting the quality and
resolution of the multi-media session to each receiver's capability
and available bit-rate. This case can be solved in RTP by having
independent RTP sessions between the sender and the receivers. Thus
this case is not considered further.
4. Simulcast Alternatives
Simulcast is defined in this document as the act of sending multiple
alternative encodings of the same underlying media source. When
transmitting multiple independent streams that originate from the
same source, it could potentially be done in several different ways
using RTP. The below sub-sections describe potential ways of
achieving stream multiplexing and identification of which streams are
alternative encodings of the same source. The descriptions also
cover how each approach interacts with multiple sources (SSRCs)
present in the same RTP session for reasons other than
simulcast. Multiple SSRCs may occur for various reasons such as
multiple participants in multipoint topologies such as multicast,
transport relays or full mesh transport simulcasting, multiple source
devices, such as multiple cameras or microphones at one end-point, or
other RTP mechanisms such as RTP Retransmission [RFC4588].
4.1. Using the Payload Type
This alternative uses only the RTP payload type to identify the
different simulcast streams. Thus all simulcast streams would be
sent in the same RTP session using only a single SSRC per actual
media source. However, as discussed in Guidelines for using the
Multiplexing Features of RTP
[I-D.westerlund-avtcore-multiplex-architecture], using Payload Type
Multiplexing does not work and is hereby dismissed as a potential
solution.
4.2. Using Single RTP session
This idea is based on using a unique SSRC for each alternative
encoding of an actual media source within a single RTP session.
Identifying which streams are alternatives of each other needs
an additional mechanism, for example SSRC grouping [RFC5576]
and a new SDES item such as SRCNAME proposed in
[I-D.westerlund-avtext-rtcp-sdes-srcname], with semantics that
indicate them as alternatives of a particular media source. When
there are multiple actual media sources in a session, each media
source will have to use a number of SSRCs to represent the different
alternatives it produces. For example, if all actual media sources
are similar and produce the same number of simulcast versions, there
will be n*m SSRCs in use in the RTP session, where n is the number of
actual media sources and m the number of simulcast versions they can
produce. Each SSRC can use any of the configured payload types for
this RTP session. All session level attributes and parameters that
are not source specific will apply and must function with all the
alternative encodings intended to be used.
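The n*m relationship can be illustrated with a short Python sketch that allocates one SSRC per (media source, simulcast version) pair in a single RTP session. The function name allocate_ssrcs and the source and version labels are hypothetical. Drawing 32-bit random values and re-drawing on a local clash is a simplification of RTP SSRC selection, not the full collision handling of [RFC3550].

```python
import random
from typing import Dict, List, Tuple


def allocate_ssrcs(sources: List[str], versions: List[str],
                   rng: random.Random) -> Dict[Tuple[str, str], int]:
    """Assign a unique random 32-bit SSRC to every (source, version) pair."""
    used = set()
    table: Dict[Tuple[str, str], int] = {}
    for source in sources:
        for version in versions:
            ssrc = rng.getrandbits(32)
            while ssrc in used:  # re-draw if the value is already taken locally
                ssrc = rng.getrandbits(32)
            used.add(ssrc)
            table[(source, version)] = ssrc
    return table


# n = 3 media sources, m = 2 simulcast versions: n * m = 6 SSRCs in the session.
table = allocate_ssrcs(["camera1", "camera2", "camera3"], ["high", "low"],
                       random.Random(1))
print(len(table))
```

With multiple RTP sessions instead, the same end-point would use n SSRCs in each of m sessions, so the number of SSRCs per session does not grow with the number of simulcast versions.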
4.3. Using Multiple RTP sessions
Using multiple RTP sessions means that each different simulcast
version of an actual media source is transmitted in a separate RTP
session, using whatever session identifier is available to distinguish
the different versions. This solution needs explicit session grouping
[RFC5888] with semantics that indicate them as alternatives. It is
also important to identify the SSRCs in the different sessions that
are alternative encodings of the same media source. This could be
accomplished using the same SSRC across the sessions, but that is not
robust against SSRC collisions and could potentially force cascading
SSRC changes between sessions. A better choice would be to use the
same value for a new SDES item, such as SRCNAME proposed in
[I-D.westerlund-avtext-rtcp-sdes-srcname]. Each RTP session will
have its own set of configured RTP payload types available for use
with any SSRC in that session. In addition, all other attributes for
sessions or sources can be used as normal to indicate the
configuration of that particular alternative.
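The binding of SSRCs across sessions through a common SDES value can be sketched as follows. This Python example assumes that a receiver has already extracted (session, SSRC, SRCNAME) triples from received RTCP SDES packets. The wire format of SRCNAME is defined elsewhere and is not modelled here, and the session labels, SSRC values, and SRCNAME strings are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def bind_by_srcname(sdes_reports: Iterable[Tuple[str, int, str]]) -> Dict[str, Dict[str, int]]:
    """Group SSRCs from different RTP sessions by their shared SRCNAME value.

    Each report is a (session_id, ssrc, srcname) triple.
    The result maps srcname -> {session_id: ssrc}.
    """
    bindings: Dict[str, Dict[str, int]] = defaultdict(dict)
    for session_id, ssrc, srcname in sdes_reports:
        bindings[srcname][session_id] = ssrc
    return dict(bindings)


reports = [
    ("high", 0x1111AAAA, "cam-1"),
    ("low", 0x2222BBBB, "cam-1"),
    ("high", 0x3333CCCC, "cam-2"),
    ("low", 0x4444DDDD, "cam-2"),
]
bindings = bind_by_srcname(reports)
print(bindings["cam-1"])
```

Since the binding does not depend on SSRC values, an SSRC collision in one session only changes one entry in the table and does not force changes in the other sessions.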
5. Analysis
This section provides an analysis of simulcast as a specific case of
the aspects discussed in Guidelines for using the Multiplexing
Features of RTP [I-D.westerlund-avtcore-multiplex-architecture] to
determine what is the most suitable solution. The subsections below
discuss the relevant points for simulcast and contrast using only
SSRCs with using both RTP sessions and SSRCs.
5.1. RTP/RTCP Aspects
The RTP/RTCP aspects of relevance are:
RTP Specification: From a base RTP specification point of view,
there is no real difference between using a single RTP session and
using multiple RTP sessions.
Multiple SSRC Legacy Considerations: Dealing with legacy handling of
multiple SSRCs in one RTP session for simulcast is a minor issue
as end-points supporting simulcast will implement the necessary
support. They should also determine if there is necessary support
based on signalling. However, for cases where usage of simulcast
is combined with legacy in the same scenario, multiple RTP
sessions will have an advantage as the number of SSRCs in each
session does not increase due to simulcast, only the number of
sessions.
Cross Session RTCP Requests: In the case of simulcast, the findings
in the architecture document stand and might be relevant when
switching between simulcast versions to configure the current codec
control state.
Binding Related Sources: Simulcast will require a clear binding
between the SSRCs carrying the different simulcast versions. This
issue will be independent of using one or multiple RTP sessions.
Transport Translators: Transport translators and simulcast are not
the best match, because the core functionality desired in
simulcast is usually the ability to switch between alternatives,
which is not really possible with transport translators as they do
not manipulate the media streams. With multiple RTP sessions, a
session participant can still control the simulcast version it
receives in a very coarse grained fashion by joining the right
RTP session, but it is not capable of switching individual
sources within the sessions.
Regarding RTP/RTCP aspects, a solution based on multiple RTP sessions
can handle legacy better, while a single RTP session solution has some
advantage if there is need for synchronized requests across multiple
stream versions, but there are no major differences.
5.2. Signalling Aspects
Signalling is one of the major issues for simulcast. In
the currently used signalling system based on SDP [RFC4566] and
Offer/Answer [RFC3264], the properties of media streams are
negotiated on RTP session level. This is discussed in Section 7.3.1
of the Guidelines for using the Multiplexing Features of RTP
[I-D.westerlund-avtcore-multiplex-architecture].
As simulcast is all about being able to signal and negotiate what the
different simulcast versions should be, it becomes important that the
signalling supports such usage. An SSRC-only solution does not
prevent such signalling from being developed, but SSRC centric
signalling is currently almost non-existent. If a session and SSRC
based solution is used instead, it is already possible to signal and
negotiate the version properties on a session level. Negotiated media
properties will apply to all media sources sent in the same RTP
session, which is likely not an issue in most cases. For example,
using a common simulcast version definition across all media sources
at one end-point will allow an RTP mixer to choose both which media
sources and which simulcast versions of them to forward towards the
other end-points.
From a signalling perspective, the only rapid way forward is a
solution based on multiple RTP sessions.
5.3. Network Aspects
The network aspects that have any relevance for simulcast are:
Quality of Service: When using simulcast it might be of interest to
prioritize a particular simulcast version, rather than applying
equal treatment of all versions. For example, lower bit-rate
versions may be prioritized over higher bit-rate versions to
minimize congestion or packet losses in the low bit-rate versions.
Thus, there is a benefit in using a simulcast solution that supports
QoS as well as possible. By using RTP sessions over different
transport flows, a simulcast version can be prioritized by flow
based QoS mechanisms. If the application would like to prioritize
a particular media source in one simulcast version then the two
proposals are equal.
NAT/FW Traversal: Using multiple RTP sessions will incur more cost
for NAT/FW traversal unless the solution for multiplexing multiple
RTP sessions on a single lower layer transport
[I-D.westerlund-avtcore-transport-multiplexing] is used, in which
case the two alternatives are basically equal, both from a NAT/FW
traversal perspective and in their QoS possibilities. If flow based
QoS with any differentiation is desirable, the cost for additional
transport flows is likely necessary.
Multicast: To enable simulcast to be combined with multicast, it
will be required to use multiple RTP sessions. Multicast groups
need to be separate for the different versions to allow a multicast
receiver to pick the version it wants, rather than receive all of
them. In this case, the only reasonable implementation is to use
different RTP sessions for each multicast group so that reporting
and other RTCP functions operate as intended.
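As an illustration of the Quality of Service item above, the following Python sketch orders simulcast RTP sessions so that lower bit-rate versions are prioritized, which is the example policy mentioned in the text. The function name priority_order and the session labels are hypothetical, and how a priority rank is mapped onto an actual flow based QoS mechanism is deliberately left out.

```python
from typing import Dict, List


def priority_order(session_kbps: Dict[str, int]) -> List[str]:
    """Return session identifiers, highest priority first.

    Policy assumed here: the lower the bit-rate of a simulcast version,
    the higher its priority, so that low bit-rate versions suffer least
    from congestion or packet loss.
    """
    return sorted(session_kbps, key=session_kbps.get)


sessions = {"high": 2000, "low": 300, "mid": 800}
print(priority_order(sessions))
```

Such per-version ranking only maps directly onto flow based QoS when each simulcast version is sent in its own RTP session over its own transport flow.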
Using multiple RTP sessions is clearly the better choice when taking
network aspects into account. Multiple RTP Sessions are required to
support any multicast usage. In addition, it can provide support for
differentiated flow based QoS. The extra NAT/FW traversal costs can
be mitigated completely by multiplexing all RTP sessions over a
single transport.
5.4. Security Aspects
The discussed security aspects have the following applicability or
considerations when it comes to simulcast:
Security Context Scope: Both issues may be applicable to simulcast
usage. If differentiation enforcement is based on encryption and
keying then multiple RTP session based simulcast has a slight
benefit.
Key-Management: There is no significant difference in the solution
except that multiple RTP sessions may require keying more
contexts. Having more contexts is also what brings additional
freedom to make differentiation.
There is a small difference in security aspects where multiple RTP
sessions provide more freedom, but also a higher cost in the number
of contexts needing to be keyed.
5.5. Summary
Defining multiple RTP sessions based simulcast appears to be the best
choice. It supports the most use cases including the multicast based
one, it has better support for flow based QoS, and the NAT/FW costs
can be mitigated. When it comes to signalling, multiple RTP sessions
based simulcast appears to require a modest set of extensions to
work, while a single RTP session seems to require a large number of
extensions to enable sets of SSRCs to negotiate different parameters
that differentiate the simulcast versions. Multiple RTP sessions
also provide greater flexibility when it comes to key-management
choices for the applications.
A single RTP session solution, as a complement to the multiple RTP
sessions, is not considered due to the large amount of extensions
required for signalling. The needed extensions to support single RTP
session simulcast may be defined in the future.
6. Signaling Support for Multiple RTP session based Simulcast
To enable the usage of multiple RTP sessions based simulcast, some
minimal additional signaling support is required. That support is
discussed in this section. First of all, there is a need for a
mechanism to identify the RTP sessions carrying simulcast versions
from the same media source. Secondly, a receiver needs to be able to
identify the SSRCs in the different sessions belonging to the same
media source. Beyond the necessary signaling support for simulcast,
some very useful optimizations regarding transmission of media
streams are described that will also help RTP mixers to select which
stream alternatives to deliver to a specific client, or request a
client to encode in a particular way.
6.1. Grouping Simulcast RTP Sessions
The proposal is to define a new grouping semantics for the session
groupings framework [RFC5888]. There is a need to separate the
semantics of intent to send simulcast streams from the capability to
recognize and receive simulcast streams. For that reason two new
simulcast grouping semantics are defined, "SimulCast Receive" (SCR)
and "SimulCast Send" (SCS). They both act as an indicator that
session level simulcast is desired and provide one set of RTP
sessions that carry simulcast versions of media sources. There may
be multiple sets of RTP Sessions that carry simulcast versions.
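As an illustration of how an implementation might locate these groups,
the following minimal Python sketch extracts SCS and SCR group lines
from an SDP body. The function name parse_simulcast_groups is
hypothetical, and the "a=group:<semantics> <mid> ..." layout is taken
from the grouping framework [RFC5888] and the examples in Section 7.

```python
def parse_simulcast_groups(sdp_text):
    """Return a list of (semantics, [mid, ...]) for SCS/SCR group lines.

    Hypothetical helper for illustration; other grouping semantics
    (for example LS) are ignored.
    """
    groups = []
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if not line.startswith("a=group:"):
            continue
        tokens = line[len("a=group:"):].split()
        if tokens and tokens[0] in ("SCS", "SCR"):
            groups.append((tokens[0], tokens[1:]))
    return groups


# Session-level excerpt modelled on the Alice offer in Section 7.1.
EXAMPLE_SDP = """v=0
t=0 0
a=group:SCS 2 3
m=video 49300 RTP/AVP 96
a=mid:2
m=video 49400 RTP/AVP 96
a=mid:3
"""

print(parse_simulcast_groups(EXAMPLE_SDP))
```

Because there may be several sets of simulcast sessions, the helper
returns a list rather than a single group.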
6.1.1. Declarative Use
When used as a declarative media description, SCR indicates the
configured end-point's required capability to recognize and receive a
specified set of RTP streams as simulcast streams. In the same
fashion, SCS requests the end-point to send a specified set of RTP
streams as simulcast streams. SCR and SCS MAY be used independently
and at the same time and they need not specify the same or even the
same number of RTP sessions in the group.
6.1.2. Offer/Answer Use
When used in an offer, SCS indicates the SDP providing agent's intent
of sending simulcast and the particular set of RTP sessions, and SCR
indicates the agent's capability of receiving simulcast streams
within the configured set of RTP Sessions. SCS and SCR MAY be used
independently and at the same time and they need not specify the same
or even the same number of RTP sessions in the group. The answerer
MUST change SCS to SCR and SCR to SCS in the answer, given that it
has and wants to use the corresponding (reverse) capability. An
answerer not supporting the SCS or SCR direction, or not supporting
SCS or SCR grouping semantics at all, will remove that grouping
attribute altogether, according to the grouping framework [RFC5888].
An offerer that receives an answer indicating lack of simulcast
support in one or both directions, where SCR and/or SCS grouping are
removed, MUST NOT use simulcast in the non-supported direction(s).
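The answerer behavior described above can be sketched as follows. This
is a simplified Python illustration, not a complete offer/answer
implementation: the function name answer_groups and the two boolean
capability flags are assumptions made for the example, and the sketch
keeps the offered media IDs unchanged even though an answerer may
reject individual media lines.

```python
def answer_groups(offered_groups, can_send, can_receive):
    """Compute the simulcast group lines to place in an answer.

    offered_groups: list of (semantics, [mid, ...]) from the offer.
    can_send / can_receive: whether the answerer has and wants to use
    simulcast in that direction (hypothetical capability flags).
    """
    answer = []
    for semantics, mids in offered_groups:
        if semantics == "SCS" and can_receive:
            # Offerer intends to send, so the answerer receives.
            answer.append(("SCR", list(mids)))
        elif semantics == "SCR" and can_send:
            # Offerer can receive, so the answerer sends.
            answer.append(("SCS", list(mids)))
        # Otherwise the group attribute is removed from the answer.
    return answer


offer = [("SCS", ["2", "3", "4"])]
print(answer_groups(offer, can_send=False, can_receive=True))
print(answer_groups(offer, can_send=False, can_receive=False))
```

An offerer that finds no SCS or SCR group in the answer for a given
direction would then refrain from using simulcast in that direction.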
6.2. Media Stream Requirements
When doing simulcast, the media streams that are alternatives need
certain considerations to ensure that switching between alternative
streams is as issue-free as possible. The following consideration
applies:
Same Clock Base: To enable correct alignment of media packets on the
source time-line, all alternative streams (SSRCs) MUST use the
same underlying clock to relate their RTP timestamp values with
the network time protocol (NTP) formatted sender time in the RTCP
Sender Reports.
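To illustrate why a shared clock matters, the following Python sketch
maps an RTP timestamp to sender wall-clock time using the (NTP time,
RTP timestamp) pair from the most recent RTCP Sender Report of that
SSRC. The function name and the numeric values are invented for the
example; NTP time is simplified to seconds as a floating point number.

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_ntp_seconds, clock_rate):
    """Map an RTP timestamp to sender wall-clock time in seconds.

    sr_rtp_ts and sr_ntp_seconds come from one Sender Report of the
    same SSRC. The difference is taken modulo 2**32 and interpreted
    as signed, so timestamp wrap-around is handled.
    """
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return sr_ntp_seconds + diff / clock_rate


# Two simulcast versions of one camera, 90 kHz video clock. Each SSRC
# has its own random RTP timestamp offset, but both Sender Reports
# refer to the same underlying clock.
high = rtp_to_wallclock(180000, sr_rtp_ts=90000,
                        sr_ntp_seconds=1000.0, clock_rate=90000)
low = rtp_to_wallclock(37704, sr_rtp_ts=4294960000,
                       sr_ntp_seconds=1000.5, clock_rate=90000)
print(high, low)  # both frames map to 1001.0
```

When both versions use the same clock base, frames captured at the same
instant map to the same wall-clock value, so a receiver or mixer can
switch between versions without a jump on the source time-line.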
6.3. Relating Alternative Encodings
To ensure that simulcast streams can be related correctly, the usage
of the SDES SRCNAME item [I-D.westerlund-avtext-rtcp-sdes-srcname] is
REQUIRED, with the same value used across all simulcast versions
belonging to the same media source.
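A receiver could relate the versions along the following lines. This
Python sketch groups SSRCs by SRCNAME across the media descriptions of
an SDP body; the function name sources_by_srcname is hypothetical, and
the "a=ssrc:<ssrc> srcname:<value>" form is the one used in the
examples of Section 7. In a running session the same information would
be learned from RTCP SDES packets rather than from SDP.

```python
from collections import defaultdict


def sources_by_srcname(sdp_text):
    """Return {srcname: [(mid, ssrc), ...]} for all media descriptions."""
    sections = []
    current = None
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            current = {"mid": None, "srcname": {}}
            sections.append(current)
        elif current is None:
            continue  # session-level line
        elif line.startswith("a=mid:"):
            current["mid"] = line[len("a=mid:"):]
        elif line.startswith("a=ssrc:"):
            ssrc, _, attribute = line[len("a=ssrc:"):].partition(" ")
            name, _, value = attribute.partition(":")
            if name == "srcname":
                current["srcname"][int(ssrc)] = value
    related = defaultdict(list)
    for section in sections:
        for ssrc, srcname in section["srcname"].items():
            related[srcname].append((section["mid"], ssrc))
    return dict(related)


# Excerpt of the Alice offer from Section 7.1.
ALICE_EXCERPT = """m=audio 49200 RTP/AVP 96
a=ssrc:521923924 srcname:a
a=mid:1
m=video 49300 RTP/AVP 96
a=ssrc:192392452 srcname:v
a=mid:2
m=video 49400 RTP/AVP 96
a=ssrc:239245219 srcname:v
a=mid:3
"""

related_sources = sources_by_srcname(ALICE_EXCERPT)
print(related_sources)
```

SSRCs listed under the same SRCNAME in different RTP sessions are then
treated as alternative encodings of one media source.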
6.4. Multiple Stream handling
The grouping semantics SCR and SCS SHOULD be combined with the SDP
attributes "a=max-send-ssrc" and "a=max-recv-ssrc"
[I-D.westerlund-avtcore-max-ssrc] to indicate the number of
simultaneous streams of each encoding that may be sent or that can be
handled in the receive direction.
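As a rough illustration, the two attributes could be collected per
direction as in the Python sketch below. The function name
parse_max_ssrc is hypothetical, and the assumed attribute layout,
"a=max-send-ssrc:<payload type or *> <count>", is inferred from the
examples in Section 7 rather than from the formal syntax in
[I-D.westerlund-avtcore-max-ssrc].

```python
def parse_max_ssrc(lines):
    """Collect max-send-ssrc / max-recv-ssrc limits from SDP lines.

    Returns {"send": {pt: count}, "recv": {pt: count}}, where pt is a
    payload type string or "*" for the wildcard.
    """
    limits = {"send": {}, "recv": {}}
    for raw in lines:
        line = raw.strip()
        for direction in ("send", "recv"):
            prefix = "a=max-%s-ssrc:" % direction
            if line.startswith(prefix):
                payload_type, count = line[len(prefix):].split()
                limits[direction][payload_type] = int(count)
    return limits


# Attribute lines taken from the first video media block of the Bob
# offer in Section 7.2.
bob_main_video = ["a=max-send-ssrc:96 3", "a=max-recv-ssrc:96 2"]
print(parse_max_ssrc(bob_main_video))
```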
7. Simulcast Signalling Examples
This example is for a case of client to video conference service
using a centralized media topology with an RTP mixer. Alice and Bob
call into a conference server for a conference call with audio and
video sent to the RTP mixer, these clients being capable of sending a
few video simulcast versions. The conference server also dials out
to Fred, who has a legacy client, resulting in fallback behavior.
When dialing out to Joe, more functionality is enabled as Joe has a
client similar to Alice's.
+---+ +-----------+ +---+
| A |<---->| |<---->| B |
+---+ | | +---+
| Mixer |
+---+ | | +---+
| F |<---->| |<---->| J |
+---+ +-----------+ +---+
Figure 3: Four-party Mixer-based Conference
Example of Media plane for RTP mixer based multi-party conference
with 4 participants.
7.1. Alice: Desktop Client
Alice is calling in to the mixer with an audiovisual single stream
desktop client that, compared to a legacy client, only adds the
capability to send simulcast and announce SRCNAME. The offer from
Alice looks like:
v=0
o=alice 2362969037 2362969040 IN IP4 192.0.2.156
s=Simulcast enabled Desktop Client
t=0 0
c=IN IP4 192.0.2.156
b=AS:825
a=group:SCS 2 3
m=audio 49200 RTP/AVP 96 97 9 8
b=AS:145
a=rtpmap:96 G719/48000/2
a=rtpmap:97 G719/48000
a=rtpmap:9 G722/8000
a=rtpmap:8 PCMA/8000
a=ssrc:521923924 cname:alice@foo.example.com
a=ssrc:521923924 srcname:a
a=mid:1
m=video 49300 RTP/AVP 96
b=AS:520
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01e
a=imageattr:* send [x=640,y=360] recv [x=640,y=360] [x=320,y=180]
a=ssrc:192392452 cname:alice@foo.example.com
a=ssrc:192392452 srcname:v
a=mid:2
a=content:main
m=video 49400 RTP/AVP 96
b=AS:160
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=imageattr:96 send [x=320,y=180]
a=ssrc:239245219 cname:alice@foo.example.com
a=ssrc:239245219 srcname:v
a=mid:3
a=sendonly
Figure 4: Alice Offer for a Simulcast Conference
As can be seen from the SDP, Alice has a simulcast-enabled client and
offers two different simulcast versions sent from her single camera,
indicated by the SCS grouping tag and the two media IDs (2 and 3).
The first video version with media ID 2 prefers 360p resolution
(signaled via imageattr) and the second video version with media ID 3
prefers 180p resolution. The first video media line also acts as the
single receive video (making media line sendrecv), while the second
video media line is only related to simulcast transmission and is
thus offered sendonly. The two simulcast encoding streams and their
related audio stream are bound together using the SRCNAME SDES item
with the identifier "v"; a single level is required in this case. We also
declare the end-point CNAME as all sources belong to the same
synchronization context.
7.2. Bob: Telepresence Room
Bob is calling in to the mixer with a telepresence client that is
capable of sending multi-stream, of receiving and locally rendering
multiple streams, and of sending simulcast versions to the mixer.
More specifically, in this example the client has three cameras,
each being sent in three different simulcast versions. In the
receive direction, up to two main screens can show video from a
(multi-stream) conference participant being active speaker, and
remaining screen real estate can be used to show videos from up to
16 other conference listeners. Each camera has a corresponding
(stereo) microphone that can also be negotiated down to mono by
removing the stereo payload type from the answer. The capability to
send and receive multiple SSRCs in the same RTP session is explicitly
announced through use of RTP multi-stream signalling
[I-D.westerlund-avtcore-max-ssrc].
v=0
o=bob 129384719 9834727 IN IP4 192.0.2.35
s=Simulcast Enabled Multi Stream Telepresence Client
t=0 0
c=IN IP4 192.0.2.35
b=AS:6035
a=group:SCS 2 3 4
m=audio 49200 RTP/AVP 96 97 9 8
b=AS:435
a=rtpmap:96 G719/48000/2
a=rtpmap:97 G719/48000
a=rtpmap:9 G722/8000
a=rtpmap:8 PCMA/8000
a=max-send-ssrc:* 3
a=max-recv-ssrc:* 3
a=ssrc:724847850 cname:bob@foo.example.com
a=ssrc:724847850 srcname:a1
a=ssrc:2847529901 cname:bob@foo.example.com
a=ssrc:2847529901 srcname:a2
a=ssrc:57289389 cname:bob@foo.example.com
a=ssrc:57289389 srcname:a3
a=mid:1
m=video 49300 RTP/AVP 96
b=AS:4500
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01f
a=imageattr:* send [x=1280,y=720] recv [x=1280,y=720]
[x=640,y=360] [x=320,y=180]
a=max-send-ssrc:96 3
a=max-recv-ssrc:96 2
a=ssrc:75384768 cname:bob@foo.example.com
a=ssrc:75384768 srcname:v1
a=ssrc:2934825991 cname:bob@foo.example.com
a=ssrc:2934825991 srcname:v2
a=ssrc:3582594238 cname:bob@foo.example.com
a=ssrc:3582594238 srcname:v3
a=mid:2
a=content:main
m=video 49400 RTP/AVP 96
b=AS:1560
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01e
a=imageattr:* send [x=640,y=360]
a=max-send-ssrc:96 3
a=ssrc:1371234978 cname:bob@foo.example.com
a=ssrc:1371234978 srcname:v1
a=ssrc:897234694 cname:bob@foo.example.com
a=ssrc:897234694 srcname:v2
a=ssrc:239263879 cname:bob@foo.example.com
a=ssrc:239263879 srcname:v3
a=mid:3
a=sendonly
m=video 49500 RTP/AVP 96
b=AS:420
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=imageattr:96 send [x=320,y=180]
a=max-send-ssrc:96 3
a=ssrc:485723998 cname:bob@foo.example.com
a=ssrc:485723998 srcname:v1
a=ssrc:2345798212 cname:bob@foo.example.com
a=ssrc:2345798212 srcname:v2
a=ssrc:1295729848 cname:bob@foo.example.com
a=ssrc:1295729848 srcname:v3
a=mid:4
a=sendonly
m=video 49600 RTP/AVP 96 97 98
b=AS:2600
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01f
a=imageattr:96 recv [x=1280,y=720]
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42c01e
a=imageattr:97 recv [x=640,y=360]
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=42c00d
a=imageattr:98 recv [x=320,y=180]
a=max-recv-ssrc:96 1
a=max-recv-ssrc:97 4
a=max-recv-ssrc:98 16
a=max-recv-ssrc:* 16
a=mid:5
a=recvonly
a=content:alt
Figure 5: Bob Offer for a Multi-stream and Simulcast Telepresence
Conference
Bob has a three-camera, three-screen, simulcast-enabled client with
even higher performance than Alice's and can additionally support
720p video, as well as multiple receive streams of various
resolutions. The client implementor has thus decided to offer three
simulcast streams for each camera, indicated by the SCS grouping tag
and the three media IDs (2, 3, and 4) in the SDP.
The first video media line with media ID 2 indicates the ability to
send video from three simultaneous video sources (cameras) through
the max-send-ssrc attribute with value 3. This media line is also
marked as the main video by using the content attribute from
[RFC4796]. The receive direction also declares the ability to handle
multiple video sources, two in this example. The
interpretation of content:main for those two streams in the receive
direction is that the client expects and can present (in prime
position) at most two main (active speaker) video streams from
another multi-camera client.
The second and third video media lines with media ID 3 and 4 are the
sendonly simulcast streams. Through the grouping, they can
implicitly be interpreted as also being content:main for the send
direction, but are not marked as such since multiple media blocks with
content:main could be confusing for a legacy client.
The fourth video media line with media ID 5 is recvonly and is marked
with content:alt. That media line should, as was intended for that
content attribute value, receive alternative content to the main
speaker, such as "audience". In a multi-party conference, that could
for example be the next-to-most-active and/or non-active speakers.
The SDP describes that those streams can be presented in a set of
different resolutions, indicated through the different payload types.
The maximum number of streams per payload type is indicated through
the max-recv-ssrc attribute. In this example, at most one stream can
have payload type 96, preferably 720p, as indicated by the related
imageattr line. Similarly, at most 4 streams can have payload type
97, preferably using 360p resolution, and at most 16 streams can have
payload type 98, preferably of 180p resolution. In any case, there
must never be more than 16 simultaneous streams of any payload type,
but combinations of payload types may occur, such as for example two
streams using payload type 97 and 8 streams using payload type 98.
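The limits described above can be checked with a few lines of Python.
This is a sketch under the interpretation given in the preceding
paragraph: each payload type has its own maximum and the wildcard entry
caps the total. The function name combination_allowed is hypothetical.

```python
def combination_allowed(stream_counts, limits):
    """Check a set of simultaneous streams against max-recv-ssrc limits.

    stream_counts: {payload_type: number of simultaneous streams}
    limits: {payload_type or "*": maximum}, where "*" is read as the
    cap on the total across all payload types.
    """
    total = sum(stream_counts.values())
    if "*" in limits and total > limits["*"]:
        return False
    for payload_type, count in stream_counts.items():
        if payload_type in limits and count > limits[payload_type]:
            return False
    return True


# Limits from the content:alt media block of the Bob offer.
bob_alt_limits = {"96": 1, "97": 4, "98": 16, "*": 16}

print(combination_allowed({"97": 2, "98": 8}, bob_alt_limits))   # True
print(combination_allowed({"96": 2}, bob_alt_limits))            # False
print(combination_allowed({"97": 4, "98": 13}, bob_alt_limits))  # False
```

The last combination fails only because of the wildcard: each payload
type is within its own limit, but the total of 17 streams exceeds 16.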
The answer from a simulcast-enabled RTP mixer to this last SDP could
look like:
v=0
o=server 238947290 239573929 IN IP4 192.0.2.2
s=Multi stream and Simulcast Telepresence Bob Answer
c=IN IP4 192.0.2.43
b=AS:7065
a=group:SCR 2 3 4
m=audio 49200 RTP/AVP 96
b=AS:435
a=rtpmap:96 G719/48000/2
a=max-send-ssrc:96 3
a=max-recv-ssrc:96 3
a=ssrc:4111848278 cname:server@conf1.example.com
a=ssrc:4111848278 srcname:r1
a=ssrc:835978294 cname:server@conf1.example.com
a=ssrc:835978294 srcname:r2
a=ssrc:2938491278 cname:server@conf1.example.com
a=ssrc:2938491278 srcname:r3
a=mid:1
m=video 49300 RTP/AVP 96
b=AS:4650
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01f
a=imageattr:* send [x=1280,y=720] [x=640,y=360] [x=320,y=180]
recv [x=1280,y=720]
a=max-recv-ssrc:96 3
a=max-send-ssrc:96 2
a=ssrc:2938746293 cname:server@conf1.example.com
a=ssrc:2938746293 srcname:t1
a=ssrc:1207102398 cname:server@conf1.example.com
a=ssrc:1207102398 srcname:t2
a=mid:2
a=content:main
m=video 49400 RTP/AVP 96
b=AS:1560
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01e
a=imageattr:* recv [x=640,y=360]
a=max-recv-ssrc:96 3
a=mid:3
a=recvonly
m=video 49500 RTP/AVP 96
b=AS:420
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=imageattr:96 recv [x=320,y=180]
a=max-recv-ssrc:96 3
a=mid:4
a=recvonly
m=video 49600 RTP/AVP 96 97 98
b=AS:2600
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01f
a=imageattr:96 send [x=1280,y=720]
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42c01e
a=imageattr:97 send [x=640,y=360]
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=42c00d
a=imageattr:98 send [x=320,y=180]
a=max-send-ssrc:96 1
a=max-send-ssrc:97 4
a=max-send-ssrc:98 8
a=max-send-ssrc:* 8
a=ssrc:2981523948 cname:server@conf1.example.com
a=ssrc:2938237 cname:server@conf1.example.com
a=ssrc:1230495879 cname:server@conf1.example.com
a=ssrc:74835983 cname:server@conf1.example.com
a=ssrc:3928594835 cname:server@conf1.example.com
a=ssrc:948753 cname:server@conf1.example.com
a=ssrc:1293456934 cname:server@conf1.example.com
a=ssrc:4134923746 cname:server@conf1.example.com
a=mid:5
a=sendonly
a=content:alt
Figure 6: Server Answer for Bob Multi-stream and Simulcast
Telepresence Conference
In this SDP answer, the grouping tag is changed to SCR, confirming
that the sent simulcast streams will be received. The directionality
of the streams themselves as well as the directionality of multi-
stream and bandwidth attributes are changed. The number of allowed
streams in the content:alt video session has been reduced from 16 to
8 in the answer.
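One way to arrive at the values in this answer is to take, per payload
type, the smaller of what the offerer can receive and what the answerer
is able to send. The Python sketch below shows that rule; it is an
assumption that happens to reproduce the numbers in Figure 6, not a
procedure defined by this document, and the function name and the
mixer capability values are invented for the example.

```python
def answer_send_limits(offered_recv, own_send):
    """Derive max-send-ssrc values for an answer.

    offered_recv: max-recv-ssrc limits from the offer.
    own_send: the answerer's own sending capability (hypothetical).
    Payload types the answerer cannot send are left out.
    """
    return {
        payload_type: min(count, own_send[payload_type])
        for payload_type, count in offered_recv.items()
        if payload_type in own_send
    }


bob_offer_recv = {"96": 1, "97": 4, "98": 16, "*": 16}
mixer_send_capability = {"96": 1, "97": 4, "98": 8, "*": 8}

mixer_answer = answer_send_limits(bob_offer_recv, mixer_send_capability)
print(mixer_answer)
```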
7.3. Fred: Dial-out to Legacy Client
Fred has a simple legacy client that knows nothing of the new
signaling means discussed in this document. In this example, the
multi-stream and simulcast aware RTP mixer is calling out to Fred.
Even though it is never actually sent, this would be Fred's offer
SDP, should he have called in. It is included here to improve the
reader's understanding of Fred's response to the conference SDP.
v=0
o=fred 82342187 237429834 IN IP4 192.0.2.213
s=Legacy Client
t=0 0
c=IN IP4 192.0.2.213
m=audio 50132 RTP/AVP 9 8
a=rtpmap:9 G722/8000
a=rtpmap:8 PCMA/8000
m=video 50134 RTP/AVP 96 97
b=AS:405
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00c
a=rtpmap:97 H263-2000/90000
a=fmtp:97 profile=0;level=30
Figure 7: Legacy Client Hypothetical Offer
Fred would offer a single mono audio and a single video, each with a
couple of different codec alternatives.
The same conference server as in the previous example is calling out
to Fred, offering the full set of multi-stream and simulcast features
based on what the server itself can support.
v=0
o=server 323439283 2384192332 IN IP4 192.0.2.2
s=Multi stream and Simulcast Dial-out Offer
c=IN IP4 192.0.2.43
b=AS:7065
a=group:SCR 2 3 4
m=audio 49200 RTP/AVP 96 97 9 8
b=AS:435
a=rtpmap:96 G719/48000/2
a=rtpmap:97 G719/48000
a=rtpmap:9 G722/8000
a=rtpmap:8 PCMA/8000
a=max-send-ssrc:* 4
a=max-recv-ssrc:* 3
a=ssrc:3293472833 cname:server@conf1.example.com
a=ssrc:3293472833 srcname:q9
a=ssrc:1734728348 cname:server@conf1.example.com
a=ssrc:1734728348 srcname:Gr
a=ssrc:1054453769 cname:server@conf1.example.com
a=ssrc:1054453769 srcname:SO
a=ssrc:3923447729 cname:server@conf1.example.com
a=ssrc:3923447729 srcname:AJ
a=mid:1
m=video 49300 RTP/AVP 96
b=AS:4650
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01f
a=imageattr:* send [x=1280,y=720] [x=640,y=360] [x=320,y=180]
recv [x=1280,y=720]
a=max-recv-ssrc:96 3
a=max-send-ssrc:96 3
a=ssrc:78456398 cname:server@conf1.example.com
a=ssrc:78456398 srcname:bj
a=ssrc:3284726348 cname:server@conf1.example.com
a=ssrc:3284726348 srcname:ON
a=ssrc:2394871293 cname:server@conf1.example.com
a=ssrc:2394871293 srcname:ya
a=mid:2
a=content:main
m=video 49400 RTP/AVP 96
b=AS:1560
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01e
a=imageattr:* recv [x=640,y=360]
a=max-recv-ssrc:96 3
a=mid:3
a=recvonly
m=video 49500 RTP/AVP 96
b=AS:420
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=imageattr:96 recv [x=320,y=180]
a=max-recv-ssrc:96 3
a=mid:4
a=recvonly
m=video 49600 RTP/AVP 96 97 98
b=AS:2600
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01f
a=imageattr:96 send [x=1280,y=720]
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42c01e
a=imageattr:97 send [x=640,y=360]
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=42c00d
a=imageattr:98 send [x=320,y=180]
a=max-send-ssrc:96 1
a=max-send-ssrc:97 4
a=max-send-ssrc:98 8
a=max-send-ssrc:* 8
a=ssrc:2342872394 cname:server@conf1.example.com
a=ssrc:1283741823 cname:server@conf1.example.com
a=ssrc:3294823947 cname:server@conf1.example.com
a=ssrc:1020408838 cname:server@conf1.example.com
a=ssrc:1999343791 cname:server@conf1.example.com
a=ssrc:2934192349 cname:server@conf1.example.com
a=ssrc:2234347728 cname:server@conf1.example.com
a=ssrc:3224283479 cname:server@conf1.example.com
a=mid:5
a=sendonly
a=content:alt
Figure 8: Server Dial-out Offer with Multi-stream and Simulcast
The answer from Fred to this offer would look like:
v=0
o=fred 9842793823 239482793 IN IP4 192.0.2.213
s=Legacy Client Answer to Server Dial-out
t=0 0
c=IN IP4 192.0.2.213
m=audio 50132 RTP/AVP 9
b=AS:80
a=rtpmap:9 G722/8000
m=video 50134 RTP/AVP 96
b=AS:405
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00c
m=video 0 RTP/AVP 96
m=video 0 RTP/AVP 96
m=video 0 RTP/AVP 96
Figure 9: Legacy Client Answer to Server Dial-out
As can be seen from the hypothetical offer, Fred does not understand
any of the multistream or simulcast attributes, nor does he
understand the grouping framework. Thus, all those lines are removed
from the answer SDP and any surplus video media blocks except for the
first are rejected. The media bandwidth is adjusted down to what
Fred is actually willing to receive.
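The mixer can detect this fallback from the answer alone. The Python
sketch below, with the hypothetical function name summarize_answer,
reports whether an SCS or SCR group survived in the answer and which
media lines were accepted; a port of zero marks a rejected media line
[RFC3264].

```python
def summarize_answer(sdp_text):
    """Return (has_simulcast_group, [(media_type, accepted), ...])."""
    has_simulcast_group = False
    media = []
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("a=group:"):
            tokens = line[len("a=group:"):].split()
            if tokens and tokens[0] in ("SCS", "SCR"):
                has_simulcast_group = True
        elif line.startswith("m="):
            fields = line[2:].split()
            # The port field may carry a "/<number of ports>" suffix.
            port = int(fields[1].split("/")[0])
            media.append((fields[0], port != 0))
    return has_simulcast_group, media


# Media lines of the legacy client answer in Figure 9.
FRED_ANSWER = """v=0
t=0 0
m=audio 50132 RTP/AVP 9
m=video 50134 RTP/AVP 96
m=video 0 RTP/AVP 96
m=video 0 RTP/AVP 96
m=video 0 RTP/AVP 96
"""

fred_summary = summarize_answer(FRED_ANSWER)
print(fred_summary)
```

With no simulcast group in the answer and only one video media line
accepted, the mixer would treat Fred as a single-stream client.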
7.4. Joe: Dial-out to Desktop Client
This example is almost identical to the one above, with the
difference that the answering end-point has some limited simulcast
and multi-stream capability. As above, this is the offer SDP that
Joe would have used, should he have called in.
v=0
o=joe 82342187 237429834 IN IP4 192.0.2.117
s=Simulcast and Multistream enabled Desktop Client
t=0 0
c=IN IP4 192.0.2.117
b=AS:985
a=group:SCS 2 3
m=audio 49200 RTP/AVP 96 97 9 8
b=AS:145
a=rtpmap:96 G719/48000/2
a=rtpmap:97 G719/48000
a=rtpmap:9 G722/8000
a=rtpmap:8 PCMA/8000
a=ssrc:1223883729 cname:joe@foo.example.com
a=ssrc:1223883729 srcname:jV
a=mid:1
m=video 49300 RTP/AVP 96
b=AS:520
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01e
a=imageattr:96 send [x=640,y=360] recv [x=640,y=360] [x=320,y=180]
a=ssrc:3842394823 cname:joe@foo.example.com
a=ssrc:3842394823 srcname:BD
a=mid:2
a=content:main
m=video 49400 RTP/AVP 96
b=AS:160
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=imageattr:96 send [x=320,y=180]
a=ssrc:1214232284 cname:joe@foo.example.com
a=ssrc:1214232284 srcname:BD
a=mid:3
a=sendonly
m=video 49300 RTP/AVP 96
b=AS:320
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00c
a=imageattr:96 recv [x=320,y=180]
a=max-recv-ssrc:* 2
a=mid:4
a=recvonly
a=content:alt
Figure 10: Desktop Client Hypothetical Offer
Joe would send two versions of simulcast, 360p and 180p, from a
single camera and can receive three sources of multi-stream, one 360p
and two 180p streams.
Again, the same conference server is calling out to Joe and the offer
SDP from the server would be almost identical to the one in the
previous example. It is therefore not included here. The response
from Joe would look like:
v=0
o=joe 239482639 4702341992 IN IP4 192.0.2.117
s=Answer from Desktop Client to Server Dial-out
t=0 0
c=IN IP4 192.0.2.117
b=AS:985
a=group:SCS 2 3
m=audio 49200 RTP/AVP 96
b=AS:145
a=rtpmap:96 G719/48000/2
a=ssrc:1223883729 cname:joe@foo.example.com
a=ssrc:1223883729 srcname:iJ
a=mid:1
m=video 49300 RTP/AVP 96
b=AS:520
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01e
a=imageattr:96 send [x=640,y=360] recv [x=640,y=360] [x=320,y=180]
a=ssrc:3842394823 cname:joe@foo.example.com
a=ssrc:3842394823 srcname:YD
a=mid:2
a=content:main
m=video 0 RTP/AVP 96
a=mid:3
m=video 49400 RTP/AVP 96
b=AS:160
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=imageattr:96 send [x=320,y=180]
a=ssrc:1214232284 cname:joe@foo.example.com
a=ssrc:1214232284 srcname:YD
a=mid:4
a=sendonly
m=video 49300 RTP/AVP 96
b=AS:320
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00c
a=imageattr:96 recv [x=320,y=180]
a=max-recv-ssrc:* 2
a=mid:5
a=recvonly
a=content:alt
Figure 11: Desktop Client Answer to Server Dial-out
Since the RTP mixer supports all of the features that Joe does and
more, the SDP does not differ much from what it would have been in
an offer. It can be noted that, as stated in [RFC5888], all media
lines need mid attributes, even the rejected ones, which is why mid:3
is present even though the mid quality simulcast version offered by
the mixer is rejected by Joe.
8. IANA Considerations
This document requests that two new SDP grouping semantics, SCS and
SCR, be registered.
Formal registrations to be written.
9. Security Considerations
The Simulcast grouping semantics are vulnerable to attacks in the
signalling.
A false grouping of non-simulcast streams as simulcast would risk
that some streams are incorrectly ignored by receivers that know
simulcast and that are uninterested in the assumed simulcast streams.
A hostile removal of simulcast grouping will prevent streams from
being interpreted as simulcast, which obviously prevents use of the
simulcast functionality. It will also risk that intended simulcast
streams are instead presented as separate, independent streams to a
receiver.
Neither of the above is likely to have any major consequences, and
both can be mitigated by signaling that is at least integrity
protected and source authenticated, preventing an attacker from
changing it.
10. Acknowledgements
11. References
11.1. Normative References
[I-D.westerlund-avtcore-max-ssrc]
Westerlund, M., Burman, B., and F. Jansson, "Multiple
Synchronization sources (SSRC) in RTP Session Signaling",
draft-westerlund-avtcore-max-ssrc-02 (work in progress),
July 2012.
[I-D.westerlund-avtext-rtcp-sdes-srcname]
Westerlund, M., Burman, B., and P. Sandgren, "RTCP SDES
Item SRCNAME to Label Individual Sources",
draft-westerlund-avtext-rtcp-sdes-srcname-01 (work in
progress), July 2012.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.
[RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific
Media Attributes in the Session Description Protocol
(SDP)", RFC 5576, June 2009.
[RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description
Protocol (SDP) Grouping Framework", RFC 5888, June 2010.
11.2. Informative References
[I-D.westerlund-avtcore-multiplex-architecture]
Westerlund, M., Burman, B., Perkins, C., and H.
Alvestrand, "Guidelines for using the Multiplexing
Features of RTP",
draft-westerlund-avtcore-multiplex-architecture-02 (work
in progress), July 2012.
[I-D.westerlund-avtcore-transport-multiplexing]
Westerlund, M. and C. Perkins, "Multiple RTP Sessions on a
Single Lower-Layer Transport",
draft-westerlund-avtcore-transport-multiplexing-03 (work
in progress), July 2012.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264,
June 2002.
[RFC3569] Bhattacharyya, S., "An Overview of Source-Specific
Multicast (SSM)", RFC 3569, July 2003.
[RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
July 2006.
[RFC4796] Hautakorpi, J. and G. Camarillo, "The Session Description
Protocol (SDP) Content Attribute", RFC 4796,
February 2007.
[RFC5117] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
January 2008.
[RFC6190] Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
"RTP Payload Format for Scalable Video Coding", RFC 6190,
May 2011.
Authors' Addresses
Magnus Westerlund
Ericsson
Farogatan 6
SE-164 80 Kista
Sweden
Phone: +46 10 714 82 87
Email: magnus.westerlund@ericsson.com
Bo Burman
Ericsson
Farogatan 6
SE-164 80 Kista
Sweden
Phone: +46 10 714 13 11
Email: bo.burman@ericsson.com
Morgan Lindqvist
Ericsson
Farogatan 6
Kista, SE-164 80
Sweden
Phone: +46 10 719 00 00
Fax:
Email: morgan.lindqvist@ericsson.com
URI:
Fredrik Jansson
Ericsson
Farogatan 6
Kista, SE-164 80
Sweden
Phone: +46 10 719 00 00
Fax:
Email: fredrik.k.jansson@ericsson.com
URI: