One document matched: draft-lennox-raiarea-rtp-grouping-taxonomy-00.txt
Network Working Group J. Lennox
Internet-Draft Vidyo
Intended status: Informational K. Gross
Expires: August 22, 2013 AVA
February 18, 2013
A Taxonomy of Grouping Semantics and Mechanisms for Real-Time Transport
Protocol (RTP) Sources
draft-lennox-raiarea-rtp-grouping-taxonomy-00
Abstract
The terminology about, and associations among, Real-Time Transport
Protocol (RTP) sources can be complex and somewhat opaque. This
document describes a number of existing and proposed relationships
among RTP sources, and attempts to define common terminology for
discussing protocol entities and their relationships.
This document is still very rough, but is submitted in the hopes of
making future discussion productive.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 22, 2013.
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Lennox & Gross Expires August 22, 2013 [Page 1]
Internet-Draft RTP Grouping Taxonomy February 2013
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Lennox & Gross Expires August 22, 2013 [Page 2]
Internet-Draft RTP Grouping Taxonomy February 2013
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. RTP Terminology . . . . . . . . . . . . . . . . . . . . . . . 4
2.1. Synchronization Source . . . . . . . . . . . . . . . . . . 4
2.2. RTP Session . . . . . . . . . . . . . . . . . . . . . . . 4
2.3. Contributing Source . . . . . . . . . . . . . . . . . . . 5
2.4. Multimedia Session . . . . . . . . . . . . . . . . . . . . 5
2.5. Media Source . . . . . . . . . . . . . . . . . . . . . . . 5
3. Terminology used other protocols and groups . . . . . . . . . 5
3.1. SDP . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1.1. SDP Multimedia Session . . . . . . . . . . . . . . . . 6
3.1.2. CLUE . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.1.3. WebRTC . . . . . . . . . . . . . . . . . . . . . . . . 7
4. Grouping of RTP sources . . . . . . . . . . . . . . . . . . . 8
4.1. Overview of hierarchical relationships among groups . . . 8
4.2. Existing and proposed source group semantics . . . . . . . 8
4.2.1. Relationships among media sources . . . . . . . . . . 8
4.2.2. Alternative representations of media sources . . . . . 9
4.2.3. Robustness and Repair . . . . . . . . . . . . . . . . 9
4.2.4. Reporting Associations . . . . . . . . . . . . . . . . 9
5. Existing and proposed mechanisms for describing groups . . . . 9
6. Security Considerations . . . . . . . . . . . . . . . . . . . 10
7. Open Issues . . . . . . . . . . . . . . . . . . . . . . . . . 10
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10
9. Informative References . . . . . . . . . . . . . . . . . . . . 10
Appendix A. RTP terminology . . . . . . . . . . . . . . . . . . . 11
A.1. Capture . . . . . . . . . . . . . . . . . . . . . . . . . 11
A.2. CNAME . . . . . . . . . . . . . . . . . . . . . . . . . . 11
A.3. End system . . . . . . . . . . . . . . . . . . . . . . . . 11
A.4. Group . . . . . . . . . . . . . . . . . . . . . . . . . . 11
A.5. Multimedia session . . . . . . . . . . . . . . . . . . . . 12
A.6. Multi-stream transmission . . . . . . . . . . . . . . . . 12
A.7. Receiver . . . . . . . . . . . . . . . . . . . . . . . . . 12
A.8. Repair stream . . . . . . . . . . . . . . . . . . . . . . 12
A.9. RTP endpoint . . . . . . . . . . . . . . . . . . . . . . . 12
A.10. RTP session . . . . . . . . . . . . . . . . . . . . . . . 12
A.11. Scene . . . . . . . . . . . . . . . . . . . . . . . . . . 12
A.12. Sender . . . . . . . . . . . . . . . . . . . . . . . . . . 12
A.13. Simulcast . . . . . . . . . . . . . . . . . . . . . . . . 12
A.14. Source . . . . . . . . . . . . . . . . . . . . . . . . . . 12
A.15. Media stream . . . . . . . . . . . . . . . . . . . . . . . 13
A.16. Switched capture . . . . . . . . . . . . . . . . . . . . . 13
A.17. Track . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 13
Lennox & Gross Expires August 22, 2013 [Page 3]
Internet-Draft RTP Grouping Taxonomy February 2013
1. Introduction
Terminology and understanding about sources in RTP, and how they
relate, is very confused. Different protocols use the same terms
differently, and complexities addressed at one layer are often
glossed over or ignored at another.
This document attempts to provide some clarity by reviewing the
semantics of various aspects of sources in RTP. As an organizing
mechanism, it approaches this by describing various ways that RTP
sources can be grouped and associated together.
2. RTP Terminology
2.1. Synchronization Source
In RTP, the smallest unit of media transport is the "synchronization
source" (SSRC). Defined in [RFC3550], a synchronization source is
the source of a stream of RTP packets, identified by a 32-bit numeric
SSRC identifier carried in the RTP header of every RTP packet. All
packets from a synchronization source form part of the same timing
and sequence number space, so a receiver groups packets by
synchronization source for playback. In the most common cases, a
synchronization source is a single flow of encoded media that can be
fed to a single decoding process.
In some more complex cases, a synchronization source is also used to
encode repair flows and sub-flows of a layered encoding. See
Section 4.2.2 and Section 4.2.3 below.
(Note: as the definition above mentions, "synchronization source" in
[RFC3550] technically refers to the *source* of the stream of RTP
packets, not the packets themselves. A better term for the stream
could be helpful, and the authors are open to one; however, finding a
good term that doesn't collide with terms used in other standards is
very difficult. In particular, both "stream" and "flow" are
overloaded enough that their use would probably confuse more than it
clarifies.)
2.2. RTP Session
The RTP protocol transmits sources in an "RTP session". An RTP
session is an association among a group of participants communicating
with RTP. It is a group communications channel which carries a
number of synchronization sources. Within an RTP session, every
participant finds out meta-data and control information (over RTCP)
about all the synchronization sources in the RTP session, all
Lennox & Gross Expires August 22, 2013 [Page 4]
Internet-Draft RTP Grouping Taxonomy February 2013
synchronization source identifiers are unique in the RTP session, and
the bandwidth of the RTCP control channel is shared by all the
synchronization sources in the session.
2.3. Contributing Source
In an RTP session, in some cases, the media flows of some
synchronization sources are not forwarded by RTP middleboxes (known
as mixers and translators). In the case of mixers, each of the
synchronization sources that are used to create a mixed stream of
media becomes visible instead as a "contributing source" (CSRC),
whose media is no longer directly visibile in some parts of the RTP
session, but whose meta-data is still known by all participants
through RTCP.
2.4. Multimedia Session
A group of RTP sessions can be grouped into a "multimedia session", a
group of concurrent, associated RTP sessions among a common group of
participants. A multimedia session defines logical relationships
among sources that appear in multiple RTP sessions.
2.5. Media Source
At a higher level than this is the "media source", a single piece of
media that can be indepentently rendered. While in the simplest
cases a synchronization source uniquely encodes a media source, many
more complicated cases, discussed in Section 4.2.2 and Section 4.2.3
below, associate multiple synchronization sources together to provide
various methods and options for encoding a media source. (In some of
these cases, these multiple synchronization sources are even sent in
separate RTP sessions, though always in the same multimedia session.)
Unlike the other terminology defined in this section, "media source"
is not a well-established term; existing protocols use a number of
different terms to define similar concepts, described below. ("Media
source" may not be the best term in for this concept; the authors of
this document are open to better ones.)
3. Terminology used other protocols and groups
3.1. SDP
The Session Description Protocol [RFC4566] provides a mechanism for
describing -- and, with SDP Offer/Answer [RFC3264], negotiation --
multimedia sessions.
Lennox & Gross Expires August 22, 2013 [Page 5]
Internet-Draft RTP Grouping Taxonomy February 2013
3.1.1. SDP Multimedia Session
SDP uses the term "multimedia session" in essentially the same manner
as RTP.
3.1.1.1. Media Stream
"Media stream" is perhaps one of the most confusing terms in SDP. It
is never explicitly defined in [RFC4566]; its meaning can only be
inferred from descriptions of its properties, the SDP syntax used to
describe it, and its usage in other protocols.
In SDP, a media stream is described my a "media description", a block
of SDP starting with an "m=" line and followed by other SDP fields
which are scoped to that media description.
When SDP is used to describe a media stream that uses RTP, the
attributes used (in the core SDP protocol) all define attributes of
an RTP session. Furthermore, [RFC4566] does note that "Media streams
can be many-to-many." Thus, when SDP describes a media stream that
uses RTP, the RTP-level protocol element that is described is an RTP
session, which can carry any number of synchronization sources (and
thus media sources).
However, for a number of reasons, many implementations of SDP,
notably SIP devices, instead interpret SDP media streams as
containing only a single media source in each direction (assuming
unicast flows). There are a number of reasons this implementation
decision was chosen, including the lack of definition of "media
stream" in the SDP specification; the poorly chosen name (the term
"media stream" seems like something that should refer to what this
document calls a "media source"); the fact that core SDP, and offer/
answer, provide no description or negotiation mechanisms for anything
more fine-grained than a media source; and the fact that most
existing implementations have no need for more than one media source
and received of each media type.
Other than in discussions of other existing terminology, this
document attempts to avoid the use of the word "stream".
3.1.2. CLUE
The CLUE working group is ongoing work in the IETF to define
protocols and mechanism for interoperable telepresence systems. Its
architecture and terms are defined in [I-D.ietf-clue-framework].
Lennox & Gross Expires August 22, 2013 [Page 6]
Internet-Draft RTP Grouping Taxonomy February 2013
3.1.2.1. CLUE Capture Scene
In CLUE, a "Capture Scene" is a structure representing a spatial
region containing one or more Capture Devices, each capturing media
representing a portion of the region. The spatial region represented
by a scene may or may not correspond to a real region in physical
space, such as a room. A capture scene includes attributes and one
or more capture scene entries, with each entry including one or more
media captures.
3.1.2.2. CLUE Capture Scene Entry
In CLUE, a "Capture Scene Entry" is a list of Media Captures of the
same media type that together form one way to represent the entire
Capture scene.
3.1.2.3. CLUE Capture
In CLUE, a "Media Capture" is a source of Media, such as from one or
more Capture Devices or constructed from other Media streams.
The "Capture" is CLUE's term that is most closely equivalent to what
this document calls a "media source"; however, some usages (such as
"dynamic sources" of "switched captures") may be more complex, and
are still actively being defined.
3.1.2.4. CLUE Capture Encoding
In CLUE, a "Capture Encoding" is a specific encoding of a Media
Capture, to be sent by a Media Provider to a Media Consumer via RTP.
In simulcast scenearios (see Section 4.2.2), multiple capture
encodings can be used simultaneously to transmit a single capture.
3.1.3. WebRTC
WebRTC is ongoing work in the IETF and W3C to define mechanisms and
APIs to allow Javascript applications in web browsers to participate
in multimedia sessions. An overview can be found in
[I-D.ietf-rtcweb-overview].
3.1.3.1. WebRTC RtcMediaStreamTrack
RtcMediaStreamTrack
An "RtcMediaStreamTrack" is an object in WebRTC that maps most
closely to what this document calls a "media source", though some of
the details remain to be defined. However, as WebRTC is defining an
Lennox & Gross Expires August 22, 2013 [Page 7]
Internet-Draft RTP Grouping Taxonomy February 2013
API, the RtcMediaStreamTrack object also includes a good deal of
meta-data about the media source, such as an ID, enabled/disabled
state, and so forth.
3.1.3.2. WebRTC RtcMediaStream
An "RtcMediaStream" in WebRTC is an object that groups a set of
RtcMediaStreamTracks that are synchronized with each other.
4. Grouping of RTP sources
Synchronization sources can have a large variety of relationships
among them. These relationships can apply both between sources
within a single session, and between sources that occur in multiple
sessons.
Ways of relating synchronization sources typically involve groups: a
set of synchronizaiton sources has some relationship that applies to
all those in the group, and no others. (Relationships that involve
arbitrary non-grouping associations among synchronization sources,
such that e.g., A relates to B and B to C, but A and C are unrelated,
are uncommon if not nonexistant.)
In many cases, the semantics of groups are not simply that the
sources form an undifferentiated group, but rather that members of
the group have certain roles.
4.1. Overview of hierarchical relationships among groups
Many (though not all) associations among groups can be described
hierarchically.
These hierarchichal groups can be divided into three large
categories: associations among independent media sources; alternative
representations of media sources; and mechanisms for robustness and
repair of a particular representation of a media source.
4.2. Existing and proposed source group semantics
We can divide group semantics into several broad categories.
(TODO: each of the items below needs to be described in detail.)
4.2.1. Relationships among media sources
Synchronization Contexts -- things identified by a CNAME.
Lennox & Gross Expires August 22, 2013 [Page 8]
Internet-Draft RTP Grouping Taxonomy February 2013
CLUE Scenes (and their substructures, down to Captures.).
WebRTC MediaStreams.
Clock source signaling.
4.2.2. Alternative representations of media sources
Simulcast.
Multi-Stream Transmission of Layered Codecs.
The original FID description in RFC 5888 (grouping) Section 8.
4.2.3. Robustness and Repair
Forward Error Correction (FEC). (But in some cases this can also be
across multiple media sources!)
Retransmission.
4.2.4. Reporting Associations
RTP also uses SSRC values to identify receivers of media, the source
of feedback about synchronization sources (and thus, also media
sources). In some cases these receivers are also grouped. However,
these associations are not associations of senders of media, and thus
are not the same type of relationship as the ones described above.
Inter-Domain Media Synchronization
Reporting groups.
5. Existing and proposed mechanisms for describing groups
SDP groups
SDP source groups
SRCNAME
MSID
CLUE
Application-specific SDP attributes
Lennox & Gross Expires August 22, 2013 [Page 9]
Internet-Draft RTP Grouping Taxonomy February 2013
SDP "number of ports" field to create multiple sessions with SSRC
alignment.
6. Security Considerations
Different for each way of grouping. Is there anything we can
generalize?
Hopefully having a well-defined common terminology and understanding
of the complexities of the RTP architecture will help lead us to
better standards, avoiding security problems.
7. Open Issues
Much of the terminology is still a matter of dispute.
It might be useful to distinguish between a single endpoint's view of
a source, or RTP session, or multimedia session, versus the full set
of sessions and every endpoint that's communicating in them, with the
signaling that established them.
(Sure to be many more...)
8. IANA Considerations
This document makes no request of IANA.
9. Informative References
[I-D.ietf-clue-framework]
Duckworth, M., Pepperell, A., and S. Wenger, "Framework
for Telepresence Multi-Streams",
draft-ietf-clue-framework-08 (work in progress),
December 2012.
[I-D.ietf-rtcweb-overview]
Alvestrand, H., "Overview: Real Time Protocols for Brower-
based Applications", draft-ietf-rtcweb-overview-05 (work
in progress), December 2012.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264,
June 2002.
Lennox & Gross Expires August 22, 2013 [Page 10]
Internet-Draft RTP Grouping Taxonomy February 2013
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.
[RFC6222] Begen, A., Perkins, C., and D. Wing, "Guidelines for
Choosing RTP Control Protocol (RTCP) Canonical Names
(CNAMEs)", RFC 6222, April 2011.
Appendix A. RTP terminology
(TODO: This list of definitions needs to be cleaned up and expanded,
if it's useful. Possibly its content should simply be merged into
the appropriate sections of the main document, instead.)
A.1. Capture
"Capture" is the term used in the IETF CLUE Telepresence Framework to
refer to what this document calls a "media source".
A.2. CNAME
An RTP canonical name (RTP CNAME or CNAME) is a persistent transport-
level identifier for an RTP endpoint (Appendix A.9). CNAMEs must be
unique within an RTP session (Section 2.2). The RTCP CNAME can be
either persistent across different RTP sessions for an RTP endpoint
or unique per session, meaning that an RTP endpoint chooses a
different CNAME for each session.[RFC6222] receivers (Appendix A.7)
require the CNAME to keep track of each participant. Receivers may
also require the CNAME to associate multiple data streams from a
given participant in a set of related RTP sessions, for example to
synchronize audio and video.
A.3. End system
An application that generates the content to be sent in RTP packets
and/or consumes the content of received RTP packets. An end system
can produce one or more synchronization sources (Section 2.1) in a
particular RTP session (Section 2.2).[RFC3550]
A.4. Group
Lennox & Gross Expires August 22, 2013 [Page 11]
Internet-Draft RTP Grouping Taxonomy February 2013
A.5. Multimedia session
A set of concurrent, associated RTP sessions (Section 2.2) among a
common group of participants.[RFC3550]
A.6. Multi-stream transmission
Multi-stream transmission (MST) is a mechanism by which different
portions of a layered encoding of a media stream are sent using
separate synchronization sources (sometimes in separate RTP sessons).
MSTs are useful for receiver control of layered media.
A.7. Receiver
A.8. Repair stream
A repair stream is a secondary stream used for error recovery. The
repair mecahnisms available include
o duplication of the original stream,
o duplication of the original stream with a time offset,
o forward error correction (FEC) techniques, and.
o retransmission of lost packets (either globally or selectively).
A.9. RTP endpoint
A.10. RTP session
An RTP session or simply, session is an association among a set of
participants communicating with RTP.[RFC3550]
A.11. Scene
A scene is associated with IETF CLUE and describes a collection of
captures (Appendix A.1) and the metadata required to make
associations (e.g. relative spacial orintation of cameras) between
the captures.
A.12. Sender
A.13. Simulcast
A.14. Source
A synchronization source (SSRC) is the source of a stream of RTP
packets, identified by a 32-bit numeric SSRC identifier carried in
the RTP header so as not to be dependent upon the network address.
All packets from a synchronization source form part of the same
timing and sequence number space, so a receiver groups packets by
Lennox & Gross Expires August 22, 2013 [Page 12]
Internet-Draft RTP Grouping Taxonomy February 2013
synchronization source for playback.[RFC3550]
A.15. Media stream
A.16. Switched capture
A switched capture is used in teleconference appliations and is a
stream (Appendix A.15) whose content is selected from one of a
selection of sources. The source is selected based on an algorithm
in the equipment producing the stream for example, a closeup of the
person currently speaking in a conference room.
A.17. Track
A track is associated with W3C WebRTC and IETF RTCWeb projects. A
track is defined by SSRC and CNAME. A track is the smallest media
entity that may be independently rendered. A single audio channel is
a track.
Authors' Addresses
Jonathan Lennox
Vidyo, Inc.
433 Hackensack Avenue
Seventh Floor
Hackensack, NJ 07601
US
Email: jonathan@vidyo.com
Kevin Gross
AVA Networks, LLC
Boulder, CO
US
Email: kevin.gross@avanw.com
Lennox & Gross Expires August 22, 2013 [Page 13]
| PAFTECH AB 2003-2026 | 2026-04-23 21:35:37 |