<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-alvestrand-rtcweb-msid-02"
ipr="trust200902">
<front>
<title abbrev="MSID in SDP">Cross Session Stream Identification in the
Session Description Protocol</title>
<author fullname="Harald Alvestrand" initials="H. T." surname="Alvestrand">
<organization>Google</organization>
<address>
<postal>
<street>Kungsbron 2</street>
<city>Stockholm</city>
<region></region>
<code>11122</code>
<country>Sweden</country>
</postal>
<email>harald@alvestrand.no</email>
</address>
</author>
<date day="29" month="May" year="2012" />
<abstract>
<t>This document specifies a grouping mechanism for RTP media streams
that can be used to specify relations between media streams within
different RTP sessions.</t>
<t>This mechanism is used to signal the association between the RTP
concept of SSRC and the WebRTC concept of "media stream" / "media stream
track" using SDP signalling.</t>
<t>This document is an input document for discussion. It should be
discussed in the RTCWEB WG list, rtcweb@ietf.org.</t>
</abstract>
<note title="Requirements Language">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in <xref
target="RFC2119">RFC 2119</xref>.</t>
</note>
</front>
<middle>
<section title="Introduction">
<t></t>
<section title="Why A New Mechanism Is Needed">
<t>There exist cases where an application using RTP and SDP needs to
signal some relationship between RTP media streams (packets carried
using a single SSRC) that may be carried in either the same RTP
session or different RTP sessions.</t>
<t>When all SSRCs are carried in a single RTP session, the
"a=ssrc-group" mechanism <xref target="RFC5576"></xref> can be
used.</t>
<t>When each RTP session carries one and only one SSRC, the SDP
grouping framework <xref target="RFC5888"></xref> can be used.</t>
<t>However, there are use cases (some of which are discussed in <xref
target="I-D.westerlund-avtcore-multiplex-architecture"></xref> ) where
neither of these approaches is appropriate; for instance, there may be
a need to signal a relationship between a video track in one RTP
session and an audio track in another RTP session. In those cases, a
new mechanism is needed.</t>
<t> (Note: When the bundle mechanism, <xref
target="I-D.ietf-mmusic-sdp-bundle-negotiation"></xref>, is used, the
extension is still needed to link SSRCs under different m= lines, even
when they are in the same RTP session). </t>
<t>In addition, there is sometimes the need for an application to
specify some application-level information about the association
between the SSRC and the group. This is not possible using either of
the frameworks above.</t>
</section>
<section title="Application to the WebRTC MediaStream">
<t>The W3C WebRTC API specification <xref
target="W3C.WD-webrtc-20120209"></xref> specifies that communication
between WebRTC entities is done via MediaStreams, which contain
MediaStreamTracks. A MediaStreamTrack is generally carried using a
single SSRC in an RTP session, forming an RTP media stream. (The
collision of terminology is unfortunate.) There might possibly be
additional SSRCs, possibly within additional RTP sessions, in order to
support functionality like forward error correction or simulcast. This
complication is ignored below.</t>
<t>In the RTP specification, media streams are identified using the
SSRC field. Streams are grouped into RTP Sessions, and also carry a
CNAME. Neither CNAME nor RTP session corresponds to a MediaStream.
Therefore, the association of an RTP media stream to MediaStreams needs
to be explicitly signalled.</t>
<t>The marking needs to be on a per-SSRC basis, since one RTP session
can carry media from multiple MediaStreams, and one MediaStream can
have media in multiple RTP sessions. This means that the <xref
target="RFC4574"></xref> "label" attribute, which is used to label RTP
sessions, is not usable for this purpose.</t>
<t>The marking also needs to carry the unique identifier of the RTP
media stream as a MediaStreamTrack within the media stream; this is
done using a single letter to identify whether it belongs in the video
or audio track list, followed by the MediaStreamTrack's position within
that list.</t>
<t>This usage is described in <xref
target="sect-media-stream"></xref>.</t>
</section>
</section>
<section title="The Msid Mechanism">
<t>Grouping of SSRCs is done via an "msid" attribute attached to the
SSRC in the SDP description, using the "Source Specific Media Attribute"
mechanism <xref target="RFC5576"></xref>:</t>
<figure>
<artwork><![CDATA[ a=ssrc:1234 msid:examplefoo v1
]]></artwork>
</figure>
<t>The ID is a randomly-generated string of ASCII characters chosen from
0-9, a-z, A-Z and - (hyphen), consisting of between 1 and 64 characters.
It MUST be unique among the ID values used in the same SDP session.</t>
<t>The value "default" (all lower case) has special meaning, and MUST
NOT be generated. Values starting with "example" (all lower case) are
reserved for documentation, and MUST NOT be generated by an
implementation.</t>
<t>Application data is carried on the same line as the ID, separated
from the ID by a space.</t>
<t>ABNF<xref target="RFC5234"></xref> grammar:</t>
<t></t>
<figure>
<artwork><![CDATA[
msidattribute = "msid:" identifier [ " " appdata ]
identifier = 1*64( ALPHA / DIGIT / "-" )
appdata = 1*64( DIGIT / %x61-7A / "-" )
]]></artwork>
</figure>
<t>(Note: one possible generation algorithm is to generate 6 random
bytes, base64 encode them (giving 8 characters), and prefix the result
with a letter that is neither "d" nor "e". Another possibility is using
some form of UUID.)</t>
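<t>(Non-normative illustration: the following Python sketch follows the
generation note above under stated assumptions. The names generate_msid
and may_generate are hypothetical, and mapping the base64 characters
"+" and "/" to "-" is an assumption made here to stay within the
character set given for the ID; it is not part of this
specification.)</t>
<figure>
<artwork><![CDATA[
```python
import base64
import re
import secrets
import string

# Prefix letters: any ASCII letter except lower-case "d" and "e", so
# the result can never be "default" or start with "example".
_PREFIXES = [c for c in string.ascii_letters if c not in "de"]
_ID_RE = re.compile(r"[0-9A-Za-z-]{1,64}")


def generate_msid():
    """Return a 9-character ID: one prefix letter plus 8 characters."""
    # 6 random bytes base64-encode to exactly 8 characters, unpadded.
    body = base64.b64encode(secrets.token_bytes(6)).decode("ascii")
    # Assumption: "+" and "/" are outside the ID alphabet; use "-".
    body = body.replace("+", "-").replace("/", "-")
    return secrets.choice(_PREFIXES) + body


def may_generate(value):
    """Check the character set, length, and reserved-value rules."""
    if value == "default" or value.startswith("example"):
        return False
    return _ID_RE.fullmatch(value) is not None
```
]]></artwork>
</figure>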
<t>The ID uniquely identifies a group within the scope of an SDP
description.</t>
<t>There may be multiple msid attributes on a single SSRC.</t>
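<t>(Non-normative illustration: a minimal Python sketch of how a
recipient might collect msid attributes per SSRC. The function name
collect_msids is hypothetical, and the sketch assumes that each msid
appears on its own "a=ssrc:" line in the form shown in the example
above; all other lines are ignored.)</t>
<figure>
<artwork><![CDATA[
```python
import re
from collections import defaultdict

# Assumption: "a=ssrc:<ssrc-id> msid:<identifier> [appdata]", one per
# line, with the ID alphabet taken from the prose of this section.
_MSID_LINE = re.compile(
    r"a=ssrc:(\d+) msid:([0-9A-Za-z-]{1,64})"
    r"(?: ([0-9a-z-]{1,64}))?"
)


def collect_msids(sdp_text):
    """Map each SSRC to a list of (identifier, appdata-or-None)."""
    result = defaultdict(list)
    for line in sdp_text.splitlines():
        match = _MSID_LINE.fullmatch(line.strip())
        if match:
            ssrc, identifier, appdata = match.groups()
            # A single SSRC may carry several msid attributes.
            result[int(ssrc)].append((identifier, appdata))
    return dict(result)
```
]]></artwork>
</figure>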
<t></t>
</section>
<section title="The Msid-Semantic Attribute">
<t>In order to fully reproduce the semantics of the SDP and SSRC
grouping frameworks, a session-level attribute is defined for signalling
the semantics associated with an msid grouping.</t>
<t>This OPTIONAL attribute gives the msid and its group semantic.</t>
<figure>
<artwork><![CDATA[ a=msid-semantic: examplefoo LS
]]></artwork>
</figure>
<t>The ABNF of msid-semantic is:</t>
<figure>
<artwork><![CDATA[
msid-semantic-attr = "msid-semantic:" " " identifier " " token
token = <as defined in RFC 4566>
]]></artwork>
</figure>
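<t>(Non-normative illustration: a minimal Python sketch that splits an
msid-semantic line into its two fields, following the example line
above. The function name parse_msid_semantic is hypothetical, and the
sketch does not check the token against any registry.)</t>
<figure>
<artwork><![CDATA[
```python
def parse_msid_semantic(line):
    """Return (msid, semantic), or None if the line does not match."""
    prefix = "a=msid-semantic: "
    line = line.strip()
    if not line.startswith(prefix):
        return None
    # Exactly two non-empty fields are expected: the msid and a token.
    fields = line[len(prefix):].split(" ")
    if len(fields) != 2 or not all(fields):
        return None
    return fields[0], fields[1]
```
]]></artwork>
</figure>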
<t>The semantic field may hold values from the IANA registries
"Semantics for the "ssrc-group" SDP Attribute" and "Semantics for the
"group" SDP Attribute".</t>
</section>
<section anchor="sect-media-stream"
title="Applying Msid to WebRTC Media Streams">
<t>The semantic for WebRTC Media Streams is "WMS".</t>
<t>The value of the msid corresponds to the "id" attribute of a
MediaStream. (note: as of Jan 11, 2012, this is called "label". The word
"label" means many other things, so the same word should not be
used.)</t>
<t>In a WebRTC-compatible SDP description, all SSRCs that a peer
intends to send are identified in the SDP generated by that
peer.</t>
<t>The appdata for a WebRTC MediaStreamTrack consists of the track type
and the track number; the track type is encoded as the single letter "a"
(audio) or "v" (video), and the track number is encoded as a decimal
integer with no leading zeroes. The first track is track zero, and is
identified as "a0" for audio, and "v0" for video.</t>
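<t>(Non-normative illustration: a minimal Python sketch of the appdata
encoding described above. The function names encode_track and
decode_track are hypothetical.)</t>
<figure>
<artwork><![CDATA[
```python
import re

# One type letter, then a decimal number with no leading zeroes.
_APPDATA_RE = re.compile(r"([av])(0|[1-9][0-9]*)")
_KINDS = {"a": "audio", "v": "video"}


def encode_track(kind, index):
    """Encode ("audio", 0) as "a0" and ("video", 0) as "v0"."""
    letter = {"audio": "a", "video": "v"}[kind]
    if index < 0:
        raise ValueError("track number must be non-negative")
    return f"{letter}{index}"


def decode_track(appdata):
    """Decode "v1" into ("video", 1); reject anything else."""
    match = _APPDATA_RE.fullmatch(appdata)
    if match is None:
        raise ValueError(f"not a track appdata: {appdata!r}")
    return _KINDS[match.group(1)], int(match.group(2))
```
]]></artwork>
</figure>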
<t>When an SDP description is updated, a specific msid continues to
refer to the same media stream; an msid value MUST NOT be reused for
another media stream within a PeerConnection's lifetime.</t>
<t>The following are the rules for handling updates of the list of SSRCs
and their msid values.</t>
<t><list style="symbols">
<t>When a new msid value occurs in the description, the recipient
can signal to its application that a new media stream has been
added.</t>
<t>When a description is updated to have more SSRCs with the same
msid value, the recipient can signal to its application that new
media stream tracks have been added to the media stream.</t>
<t>When a description is updated to no longer list the msid value on
a specific ssrc, the recipient can signal to its application that
the corresponding media stream track has been closed.</t>
<t>When a description is updated to no longer list the msid value on
any ssrc, the recipient can signal to its application that the media
stream has been closed.</t>
</list>OPEN ISSUE: Exactly when should the recipient signal that the
track is closed? When the msid value disappears from the description,
when the SSRC disappears by the rules of RFC 3550 section 6.3.4 (BYE
packet received) and 6.3.5 (timeout), any of the above, or some
combination of the above?</t>
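<t>(Non-normative illustration: the update rules above can be sketched
in Python as a comparison of two descriptions. The function name
diff_descriptions, the event names, and the input shape (a mapping from
msid value to the set of SSRCs that list it) are assumptions of this
sketch. It reacts only to changes in the description and therefore does
not resolve the open issue about RTCP BYE or timeout.)</t>
<figure>
<artwork><![CDATA[
```python
def diff_descriptions(old, new):
    """Compare two {msid: set-of-SSRCs} maps; return event tuples.

    Each event is (name, msid, ssrc), with ssrc None for stream events.
    """
    # An msid with no SSRCs is treated the same as an absent msid.
    old = {m: s for m, s in old.items() if s}
    new = {m: s for m, s in new.items() if s}
    events = []
    for msid in sorted(new.keys() - old.keys()):
        events.append(("stream-added", msid, None))
    for msid in sorted(new):
        for ssrc in sorted(new[msid] - old.get(msid, set())):
            events.append(("track-added", msid, ssrc))
    for msid in sorted(old):
        for ssrc in sorted(old[msid] - new.get(msid, set())):
            events.append(("track-closed", msid, ssrc))
    for msid in sorted(old.keys() - new.keys()):
        events.append(("stream-closed", msid, None))
    return events
```
]]></artwork>
</figure>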
<t></t>
<section title="Handling of non-signalled tracks">
<t>Pre-WebRTC entities will not send msid. This means that there will
be some incoming RTP packets with SSRCs where the recipient does not
know about a corresponding MediaStream id.</t>
<t>Handling will depend on whether or not any SSRCs are signalled in
the relevant RTP session. There are two cases:</t>
<t><list style="symbols">
<t>No SSRC is signalled with an msid attribute. The SDP session is
assumed to be a backwards-compatible session. All incoming SSRCs,
on all RTP sessions that are part of the SDP session, are assumed
to belong to a single media stream. The ID of this media stream is
"default".</t>
<t>Some SSRCs are signalled with an msid attribute. In this case,
the session is WebRTC compatible, and the newly arrived SSRCs are
either caused by a bug or by timing skew between the arrival of
the media packets and the SDP description. These packets MAY be
discarded, or they MAY be buffered for a while in order to allow
immediate startup of the media stream when the SDP description is
updated. The arrival of media packets MUST NOT cause a new
MediaStreamTrack to be created.</t>
</list>Note: This means that it is wise to include at least one
a=ssrc: line with an msid attribute, even when no media streams are
yet attached to the session. (Alternative: Mark the RTP session
explicitly as "I will signal the media stream tracks explicitly").</t>
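<t>(Non-normative illustration: the two cases above can be sketched in
Python as follows. The function name classify_incoming_ssrc, the
return values "deliver" and "hold", and the input shape (a mapping from
SSRC to msid covering the whole SDP session) are assumptions of this
sketch. Whether held packets are discarded or buffered is left to the
caller.)</t>
<figure>
<artwork><![CDATA[
```python
DEFAULT_MSID = "default"


def classify_incoming_ssrc(ssrc, signalled):
    """Decide what to do with a packet from the given SSRC.

    signalled maps SSRC to msid for every SSRC that the SDP session
    lists with an msid attribute. Returns ("deliver", msid) or
    ("hold", None).
    """
    if not signalled:
        # Case 1: no msid anywhere, so the session is treated as
        # backwards compatible with a single "default" media stream.
        return ("deliver", DEFAULT_MSID)
    if ssrc in signalled:
        return ("deliver", signalled[ssrc])
    # Case 2: unknown SSRC in a WebRTC-compatible session. The
    # packets may be discarded or buffered; no track is created.
    return ("hold", None)
```
]]></artwork>
</figure>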
<t>It follows from the above that media stream tracks in the "default"
media stream cannot be closed by signalling; the application must
instead signal these as closed when either an RTCP BYE packet or the
absence of media for a defined interval &lt;what interval?&gt;
indicates that the stream is gone.</t>
</section>
</section>
<section anchor="IANA" title="IANA Considerations">
<t>This document requests IANA to register the "msid" attribute in the
"att-field (source level)" registry within the SDP parameters registry,
according to the procedures of <xref target="RFC5576"></xref>.</t>
<t>The required information is:</t>
<t><list style="symbols">
<t>Contact name, email: IETF, contacted via rtcweb@ietf.org, or a
successor address designated by IESG</t>
<t>Attribute name: msid</t>
<t>Long-form attribute name: Media stream group Identifier</t>
<t>The attribute value contains only ASCII characters, and is
therefore not subject to the charset attribute.</t>
<t>The attribute gives an association over a set of SSRCs,
potentially in different RTP sessions. It can be used to signal the
relationship between a WebRTC MediaStream and a set of SSRCs.</t>
<t>The details of appropriate values are given in RFC XXXX.</t>
</list></t>
<t>This document requests IANA to register the "WMS" semantic within the
"Semantics for the "ssrc-group" SDP Attribute" registry within the SDP
parameters registry.</t>
<t>The required information is:</t>
<t><list style="symbols">
<t>Description: WebRTC Media Stream, as given in RFC XXXX.</t>
<t>Token: WMS</t>
<t>Standards track reference: RFC XXXX</t>
</list>IANA is requested to replace "RFC XXXX" with the RFC number of
this document upon publication.</t>
</section>
<section anchor="Security" title="Security Considerations">
<t>An adversary with the ability to modify SDP descriptions has the
ability to switch around tracks between media streams. This is a special
case of the general security consideration that modification of SDP
descriptions needs to be confined to entities trusted by the
application.</t>
<t>No attacks that are relevant to the browser's security have been
identified that depend on this mechanism.</t>
</section>
<section anchor="Acknowledgements" title="Acknowledgements">
<t>This note is based on sketches from, among others, Justin Uberti and
Cullen Jennings.</t>
</section>
</middle>
<back>
<references title="Normative References">
<?rfc include="reference.RFC.2119"?>
<?rfc include='reference.RFC.5576'?>
<?rfc include='reference.RFC.5234'?>
<?rfc include='reference.W3C.WD-webrtc-20120209'?>
</references>
<references title="Informative References">
<?rfc include='reference.RFC.4574'?>
<?rfc include='reference.RFC.5888'?>
<?rfc include='reference.I-D.westerlund-avtcore-multiplex-architecture'?>
<?rfc include='reference.I-D.ietf-mmusic-sdp-bundle-negotiation'?>
</references>
<section title="Design considerations, open questions and alternatives">
<t>This appendix should be deleted before publication as an RFC.</t>
<t>One suggested mechanism has been to use CNAME instead of a new
attribute. This was abandoned because CNAME identifies a synchronization
context; one can imagine both wanting to have tracks from the same
synchronization context in multiple media streams and wanting to have
tracks from multiple synchronization contexts within one media
stream.</t>
<t>Another suggestion has been to put the msid value within an attribute
of RTCP SR (sender report) packets. This doesn't offer the ability to
know that you have seen all the tracks currently configured for a media
stream.</t>
<t>There has been a suggestion that this mechanism could be used to mute
tracks too. This is not done at the moment.</t>
<t>The special value "default" and the reservation of "example*" seem
bothersome; apart from that, it's a random string. It's uncertain
whether "example" has any benefit.</t>
<t>An alternative to the "default" media stream is to let each new media
stream track without a msid attribute create its own media stream. Input
on this question is sought.</t>
<t>Discarding of incoming data when the SDP description isn't updated
yet (Section 4.1) may cause clipping. However, the same issue exists
when crypto keys aren't available. Input sought.</t>
<t>There's been a suggestion that acceptable SSRCs should be signalled
in a response, giving a recipient the ability to say "no" to certain
SSRCs. This is not supported in the current version of this
document.</t>
<t>This specification reuses the ssrc-group semantics registry for this
semantic, on the argument that the WMS purpose is more similar to an
SSRC grouping than a session-level grouping, and allows values from both
registries, on the argument that some semantics (like LS) are well
defined for MSID. Input sought.</t>
</section>
<section title="Change log">
<t>This appendix should be deleted before publication as an RFC.</t>
<section title="Changes from -00 to -01">
<t>Added track identifier.</t>
<t>Added inclusion-by-reference of
draft-lennox-mmusic-source-selection for track muting.</t>
<t>Some rewording.</t>
</section>
<section title="Changes from -01 to -02">
<t>Split document into sections describing a generic grouping
mechanism and sections describing the application of this grouping
mechanism to the WebRTC MediaStream concept.</t>
<t>Removed the mechanism for muting tracks, since this is not central
to the MSID mechanism.</t>
</section>
</section>
</back>
</rfc>