<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-ietf-mmusic-sdp-simulcast-04"
     ipr="trust200902" submissionType="IETF">
  <front>
    <title abbrev="Simulcast">Using Simulcast in SDP and RTP Sessions</title>

    <author fullname="Bo Burman" initials="B." surname="Burman">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>Kistavagen 25</street>

          <city>SE-164 80 Stockholm</city>

          <region/>

          <code/>

          <country>Sweden</country>
        </postal>

        <phone/>

        <facsimile/>

        <email>bo.burman@ericsson.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>Farogatan 2</street>

          <city>SE-164 80 Stockholm</city>

          <country>Sweden</country>
        </postal>

        <phone>+46 10 714 82 87</phone>

        <email>magnus.westerlund@ericsson.com</email>
      </address>
    </author>

    <author fullname="Suhas Nandakumar" initials="S." surname="Nandakumar">
      <organization>Cisco</organization>

      <address>
        <postal>
          <street>170 West Tasman Drive</street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>USA</country>
        </postal>

        <phone/>

        <facsimile/>

        <email>snandaku@cisco.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Mo Zanaty" initials="M." surname="Zanaty">
      <organization>Cisco</organization>

      <address>
        <postal>
          <street>170 West Tasman Drive</street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>USA</country>
        </postal>

        <phone/>

        <facsimile/>

        <email>mzanaty@cisco.com</email>

        <uri/>
      </address>
    </author>

    <date day="3" month="February" year="2016"/>

    <abstract>
      <t>In some application scenarios it may be desirable to send multiple
      differently encoded versions of the same media source in different RTP
      streams. This is called simulcast. This document describes how to
      accomplish simulcast in RTP and how to signal it in SDP. The described
      solution uses an RTP/RTCP identification method to identify RTP streams
      belonging to the same media source, and makes an extension to SDP to
      relate those RTP streams as being different simulcast formats of that
      media source. The SDP extension consists of a new media level SDP
      attribute that expresses capability to send and/or receive simulcast RTP
      streams.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="sec-intro" title="Introduction">
      <t>Most of today's multiparty video conference solutions make use of
      centralized servers to reduce the bandwidth and CPU consumption in the
      endpoints. Those servers receive RTP streams from each participant and
      send some suitable set of possibly modified RTP streams to the rest of
      the participants, which usually have heterogeneous capabilities (screen
      size, CPU, bandwidth, codec, etc). One of the biggest issues is how to
      perform RTP stream adaptation to different participants' constraints
      with the minimum possible impact on both video quality and server
      performance.</t>

      <t>Simulcast is defined in this memo as the act of simultaneously
      sending multiple different encoded streams of the same media source,
      e.g. the same video source encoded with different video encoder types or
      image resolutions. This can be done in several ways and for different
      purposes. This document focuses on the case where it is desirable to
      provide a media source as multiple encoded streams over <xref
      target="RFC3550">RTP</xref> towards an intermediary so that the
      intermediary can provide the wanted functionality by selecting which RTP
      stream(s) to forward to other participants in the session, and more
      specifically how the identification and grouping of the involved RTP
      streams are done.</t>

      <t>This document describes a few scenarios where the use of simulcast
      is motivated, and also defines the needed RTP/RTCP and SDP signaling for
      it.</t>
    </section>

    <section anchor="sec-definitions" title="Definitions">
      <t/>

      <section title="Terminology">
        <t>This document makes use of the terminology defined in <xref
        target="RFC7656">RTP Taxonomy</xref>, and <xref target="RFC7667">RTP
        Topologies</xref>. In addition, the following terms are used:<list
            style="hanging">
            <t hangText="RTP Mixer:">An RTP middle node, defined in <xref
            target="RFC7667"/> (Sections 3.6 to 3.9).</t>

            <t hangText="RTP Switch:">A common short term for the terms
            "switching RTP mixer", "source projecting middlebox", and "video
            switching MCU" as discussed in <xref target="RFC7667"/>.</t>

            <t hangText="Simulcast Stream:">One Encoded Stream or Dependent
            Stream from a set of concurrently transmitted Encoded Streams and
            optional Dependent Streams, all sharing a common Media Source, as
            defined in <xref target="RFC7656"/>. Decoding a Dependent Stream
            also requires the related (Dependent and) Encoded Stream(s), but
            in the context of simulcast that is considered a property of the
            Dependent Stream constituting the simulcast stream. For example,
            HD and thumbnail video simulcast versions of a single Media Source
            can be sent concurrently as separate RTP Streams.</t>

            <t hangText="Simulcast Format:">Different formats of a simulcast
            stream serve the same purpose as alternative RTP payload types in
            non-simulcast SDP, to allow multiple alternative media formats for
            a given RTP Stream. As for multiple RTP payload types on the
            m-line, any one of the alternative formats can be used at a given
            point in time, but not more than one (based on RTP timestamp), and
            what format is used can change dynamically from one RTP packet to
            another. For example, if all participants in a group video call
            can decode H.264 and H.265 video, but only some can encode H.265,
            both H.264 and H.265 can be kept as alternative formats, and the
            format may dynamically switch between H.264 and H.265 as different
            participants become the active speaker.</t>
          </list></t>
      </section>

      <section title="Requirements Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.</t>
      </section>
    </section>

    <section anchor="sec-use-cases" title="Use Cases">
      <t>Many use cases of simulcast as described in this document relate to a
      multi-party communication session where one or more central nodes are
      used to adapt the view of the communication session towards individual
      participants, and facilitate the media transport between participants.
      Thus, these cases target the RTP Mixer type of topology.</t>

      <t>There are two principal approaches for an RTP Mixer to provide this
      adapted view of the communication session to each receiving
      participant:<list style="symbols">
          <t>Transcoding (decoding and re-encoding) received RTP streams with
          characteristics adapted to each receiving participant. This often
          includes mixing or composition of media sources from multiple
          participants into a mixed media source originated by the RTP Mixer.
          The main advantage of this approach is that it achieves close to
          optimal adaptation to individual receiving participants. The main
          disadvantages are that it can be very computationally expensive for
          the RTP Mixer and typically also degrades media Quality of
          Experience (QoE), for example by increasing end-to-end delay, for
          the receiving participants.</t>

          <t>Switching a subset of all received RTP streams or sub-streams to
          each receiving participant, where the used subset is typically
          specific to each receiving participant. The main advantages of this
          approach are that it is computationally cheap to the RTP Mixer and
          it has very limited impact on media QoE. The main disadvantage is
          that it can be difficult to combine a subset of received RTP streams
          into a perfect fit to the resource situation of a receiving
          participant.</t>
        </list></t>

      <t>The use of simulcast relates to the latter approach, where it is more
      important to reduce the load on the RTP Mixer and/or minimize QoE impact
      than to achieve an optimal adaptation of resource usage.</t>

      <section anchor="sec-diverse-receivers"
               title="Reaching a Diverse Set of Receivers">
        <t>The media sources provided by a sending participant potentially
        need to reach several receiving participants that differ in terms of
        available resources. The receiver resources that typically differ
        include, but are not limited to:<list style="hanging">
            <t hangText="Codec:">This includes codec type (such as SDP MIME
            type) and can include codec configuration options (e.g. SDP fmtp
            parameters). Two codec resources that differ only in codec
            configuration are considered "different" if they are not
            "compatible", for example if they differ in video codec profile
            or in transport packetization configuration.</t>

            <t hangText="Sampling:">This relates to how the media source is
            sampled, in spatial as well as in temporal domain. For video
            streams, spatial sampling affects image resolution and temporal
            sampling affects video frame rate. For audio, spatial sampling
            relates to the number of audio channels and temporal sampling
            affects audio bandwidth. This may be used to suit different
            rendering capabilities or needs at the receiving endpoints, as
            well as a method to achieve different transport capabilities,
            bitrates and eventually QoE by controlling the amount of source
            data.</t>

            <t hangText="Bitrate:">This relates to the number of bits spent
            per second to transmit the media source as an RTP stream, which
            typically also affects the Quality of Experience (QoE) for the
            receiving user.</t>
          </list>Letting the sending participant create a simulcast of a few
        differently configured RTP streams per media source can be a good
        tradeoff when using an RTP switch as middlebox, instead of sending a
        single RTP stream and using an RTP mixer to create individual
        transcodings to each receiving participant.</t>

        <t>This requires that the receiving participants can be categorized in
        terms of available resources and that the sending participant can
        choose a matching configuration for a single RTP stream per category
        and media source.</t>

        <t>For example, assume for simplicity a set of receiving participants
        that differ only in that some have support to receive Codec A, and the
        others have support to receive Codec B. Further assume that the
        sending participant can send both Codec A and B. It can then reach all
        receivers by creating two simulcasted RTP streams from each media
        source; one for Codec A and one for Codec B.</t>

        <t>In another simple example, a set of receiving participants differ
        only in screen resolution; some are able to display video with at most
        360p resolution and some support 720p resolution. A sending
        participant can then reach all receivers by creating a simulcast of
        RTP streams with 360p and 720p resolution for each sent video media
        source.</t>

        <t>In more elaborate cases, the receiving participants differ both in
        available sampling and bitrate, and maybe also codec, and it is up to
        the RTP switch to find a good trade-off in which simulcasted stream to
        choose for each intended receiver. It is also the responsibility of
        the RTP switch to negotiate a good fit of simulcast streams with the
        sending participant.</t>

        <t>The maximum number of simulcasted RTP streams that can be sent is
        mainly limited by the amount of processing and uplink network
        resources available to the sending participant.</t>
      </section>

      <section anchor="sec-application-specific"
               title="Application Specific Media Source Handling">
        <t>The application logic that controls the communication session may
        include special handling of some media sources. It is for example
        commonly the case that the media from a sending participant is not
        sent back to itself.</t>

        <t>It is also common that a currently active speaker participant is
        shown in larger size or higher quality than other participants (the
        sampling or bitrate aspects of <xref
        target="sec-diverse-receivers"/>). Not sending the active speaker
        media back to itself means there is some other participant's media
        that instead has to receive special handling towards the active
        speaker; typically the previous active speaker. This way, the
        previously active speaker is needed both in larger size (to current
        active speaker) and in small size (to the rest of the participants),
        which can be solved with a simulcast from the previously active
        speaker to the RTP switch.</t>
      </section>

      <section anchor="sec-receiver-preferences"
               title="Receiver Media Source Preferences">
        <t>The application logic that controls the communication session may
        allow receiving participants to apply preferences to the
        characteristics of the RTP stream they receive, for example in terms
        of the aspects listed in <xref target="sec-diverse-receivers"/>.
        Sending a simulcast of RTP streams is one way of accommodating
        receivers with conflicting or otherwise incompatible preferences.</t>
      </section>
    </section>

    <section anchor="sec-requirements" title="Requirements">
      <t>The following requirements need to be met to support the use cases in
      previous sections:<list style="empty">
          <t>Editor's note: Consider adding an explicit requirement that the
          solution supports use of simulcast even when using multiple codecs
          and multiple redundant RTP streams per defined codec (or something
          similar), since this is really an existing requirement and should
          also fully motivate the use of RID as identification mechanism.</t>
        </list><list style="hanging">
          <t anchor="req-1" hangText="REQ-1:">Identification. It must be
          possible to identify a set of simulcasted RTP streams as originating
          from the same media source:<list style="hanging">
              <t anchor="req-1.1" hangText="REQ-1.1:">In SDP signaling.</t>

              <t anchor="req-1.2" hangText="REQ-1.2:">On RTP/RTCP level.</t>
            </list></t>

          <t anchor="req-2" hangText="REQ-2:">Transport usage. The solution
          must work when using:<list style="hanging">
              <t anchor="req-2.1" hangText="REQ-2.1:">Legacy SDP with separate
              media transports per SDP media description.</t>

              <t anchor="req-2.2" hangText="REQ-2.2:"><xref
              target="I-D.ietf-mmusic-sdp-bundle-negotiation">Bundled</xref>
              SDP media descriptions.</t>
            </list></t>

          <t anchor="req-3" hangText="REQ-3:">Capability negotiation. It must
          be possible that:<list style="hanging">
              <t anchor="req-3.1" hangText="REQ-3.1:">Sender can express
              capability of sending simulcast.</t>

              <t anchor="req-3.2" hangText="REQ-3.2:">Receiver can express
              capability of receiving simulcast.</t>

              <t anchor="req-3.3" hangText="REQ-3.3:">Sender can express
              maximum number of simulcast streams that can be provided.</t>

              <t anchor="req-3.4" hangText="REQ-3.4:">Receiver can express
              maximum number of simulcast streams that can be received.</t>

              <t anchor="req-3.5" hangText="REQ-3.5:">Sender can detail the
              characteristics of the simulcast streams that can be
              provided.</t>

              <t anchor="req-3.6" hangText="REQ-3.6:">Receiver can detail the
              characteristics of the simulcast streams that it prefers to
              receive.</t>
            </list></t>

          <t anchor="req-4" hangText="REQ-4:">Distinguishing features. It must
          be possible to have different simulcast streams use different codec
          parameters, as can be expressed by SDP format values and RTP payload
          types.</t>

          <t anchor="req-5" hangText="REQ-5:">Compatibility. It must be
          possible to use simulcast in combination with other RTP mechanisms
          that generate additional RTP streams:<list style="hanging">
              <t anchor="req-5.1" hangText="REQ-5.1:"><xref
              target="RFC4588">RTP Retransmission</xref>.</t>

              <t anchor="req-5.2" hangText="REQ-5.2:"><xref
              target="RFC5109">RTP Forward Error Correction</xref>.</t>

              <t anchor="req-5.3" hangText="REQ-5.3:">Related payload types
              such as audio Comfort Noise and/or DTMF.</t>
            </list></t>

          <t anchor="req-6" hangText="REQ-6:">Interoperability. The solution
          must be possible to use in:<list style="hanging">
              <t anchor="req-6.1" hangText="REQ-6.1:">Interworking with
              non-simulcast legacy clients using a single media source per
              media type.</t>

              <t anchor="req-6.2" hangText="REQ-6.2:">WebRTC environment with
              a single media source per SDP media description.</t>
            </list></t>
        </list></t>
    </section>

    <section anchor="sec-solution-overview" title="Overview">
      <t>As an overview, the above requirements are met by signaling simulcast
      capability and configurations in <xref target="RFC4566">SDP</xref>:<list
          style="symbols">
          <t>An offer or answer can contain a number of simulcast streams,
          separate for send and receive directions.</t>

          <t>An offer or answer can contain multiple, alternative simulcast
          stream formats in the same fashion as multiple, alternative codecs
          can be offered in a media description.</t>

          <t>A single media source per SDP media description is assumed, which
          is aligned with the concepts defined in <xref target="RFC7656"/> and
          will specifically work in a WebRTC context, both with and without
          <xref target="I-D.ietf-mmusic-sdp-bundle-negotiation">BUNDLE</xref>
          grouping.</t>

          <t>The codec configuration for a simulcast stream is expressed
          through use of a separately specified <xref
          target="I-D.ietf-mmusic-rid">RTP-level identification
          mechanism</xref> <xref target="I-D.roach-avtext-rid"/>, which
          complements and effectively extends the available simulcast stream
          identification and configuration possibilities that could be
          provided by using only SDP formats.</t>

          <t>It is possible, but not required, to use <xref
          target="RFC5576">source-specific signaling</xref> with the proposed
          solution.</t>
        </list></t>
    </section>

    <section anchor="sec-solution" title="Detailed Description">
      <t>This section further details the overview <xref
      target="sec-solution-overview">above</xref>. First, formal syntax is
      <xref target="sec-attr">provided</xref>, followed by the rest of the SDP
      attribute definition in <xref target="sec-cap"/>. <xref
      target="sec-relating">Relating Simulcast Streams </xref> provides the
      definition of the RTP/RTCP mechanisms used. The section is concluded
      with a number of examples.</t>

      <section anchor="sec-attr" title="Simulcast Attribute">
        <t><list style="hanging">
            <t hangText="Name:">simulcast</t>

            <t hangText="Value:">sc-value</t>

            <t hangText="Usage Level:">media</t>

            <t hangText="Charset Dependent:">no</t>

            <t hangText="Multiplex Category:">NORMAL</t>
          </list>Syntax <xref target="RFC5234"/>:</t>

        <figure align="center" anchor="fig-abnf" title="ABNF for Simulcast">
          <artwork align="left"><![CDATA[
sc-attr      = "a=simulcast:" sc-value
sc-value     = sc-str-list [SP sc-str-list]
sc-str-list  = sc-dir SP sc-alt-list *( ";" sc-alt-list )
sc-dir       = "send" / "recv"
sc-alt-list  = sc-id *( "," sc-id )
sc-id-paused = "~"
sc-id        = [sc-id-paused] rid-identifier / token
; SP defined in [RFC5234]
; token defined in [RFC4566]
; rid-identifier defined in [I-D.ietf-mmusic-rid]

]]></artwork>
        </figure>

        <t>The "a=simulcast" attribute has a parameter in the form of one or
        two simulcast stream descriptions, each consisting of a direction
        ("send" or "recv"), followed by a list of one or more simulcast
        streams. Each simulcast stream in that list is separated by a
        semicolon (";"). Each simulcast stream can in turn be offered in one
        or more alternative formats, where each simulcast stream alternative
        is separated by a comma (","). The simulcast stream alternative MUST
        be described in the form of a RID, as described by <xref
        target="I-D.ietf-mmusic-rid"/>. Each simulcast stream can be initially
        <xref target="I-D.ietf-avtext-rtp-stream-pause">paused</xref>,
        indicated by prepending a "~" to the simulcast stream. In case there
        are simulcast stream alternatives, pause can be specified individually
        for each alternative. The reason to allow separate initial pause
        states for each simulcast stream alternative is that pause capability
        can be specified individually for each RTP payload type referenced by
        a RID, which makes it infeasible to pause a RID where any of the
        related RTP payload type(s) do not have pause capability.</t>

        <t>Examples:</t>

        <figure align="center" anchor="fig-abnf-examples"
                title="Simulcast Examples">
          <artwork align="left"><![CDATA[a=simulcast:send 1,2,3;~4,~5 recv 1;~2,~5
a=simulcast:recv 1;4,5 send 1;2

]]></artwork>
        </figure>

        <t>Above are two examples of different "a=simulcast" lines.</t>

        <t>The first line is an example offer to send two simulcast streams
        and to receive two simulcast streams. The first simulcast stream in
        send direction can be sent as three different alternatives (1, 2, 3),
        and the second simulcast stream in send direction can be sent as two
        different alternatives (4, 5). All second stream send alternatives are
        offered as initially paused. The first simulcast stream in receive
        direction has no alternatives (only 1). The second simulcast stream in
        receive direction has two alternatives (2, 5) that are both offered as
        initially paused.</t>

        <t>The second line is an example answer to the first line, accepting
        to send and receive the two offered simulcast streams. The send and
        receive directions are specified in opposite order compared to the
        first line, which lets the answer keep the same order of simulcast
        streams in the SDP as in the offer, even though directionality is
        reversed. This example answer has removed all offered alternatives for
        the first simulcast stream (keeping only 1), but kept alternative
        formats for the second simulcast stream in receive direction (4, 5).
        The answer accepts to send two simulcast streams, without
        alternatives. The answer does not accept initial pause of any
        simulcast streams, in either direction. More examples can be found in
        <xref target="sec-ex"/>.</t>
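
        <t>As a non-normative illustration, the syntax above can be parsed
        with a few string splits. The Python sketch below is an assumption of
        this text and not part of the specification: the function name
        parse_simulcast and the returned structure are hypothetical, and RID
        syntax is not validated beyond rejecting empty identifications.</t>

        <figure align="center" anchor="fig-sketch-parse"
                title="Informative Sketch: Parsing a Simulcast Line">
          <artwork align="left"><![CDATA[
```python
def parse_simulcast(line):
    """Parse one a=simulcast line.

    Returns {direction: streams}; each stream is a list of
    (rid, initially_paused) alternatives, most preferred first.
    """
    prefix = "a=simulcast:"
    if not line.startswith(prefix):
        raise ValueError("not an a=simulcast line")
    fields = line[len(prefix):].split(" ")
    # sc-value: one or two "direction SP list" pairs.
    if len(fields) not in (2, 4):
        raise ValueError("expected one or two directions")
    result = {}
    for direction, str_list in zip(fields[0::2], fields[1::2]):
        if direction not in ("send", "recv"):
            raise ValueError("unknown direction")
        if direction in result:
            raise ValueError("direction listed twice")
        streams = []
        for alt_list in str_list.split(";"):    # simulcast streams
            alternatives = []
            for sc_id in alt_list.split(","):   # alternatives
                paused = sc_id.startswith("~")
                rid = sc_id[1:] if paused else sc_id
                if not rid:
                    raise ValueError("empty identification")
                alternatives.append((rid, paused))
            streams.append(alternatives)
        result[direction] = streams
    return result
```
]]></artwork>
        </figure>

        <t>Applied to the first example line above, this sketch returns two
        "send" streams with three and two alternatives respectively, where
        both alternatives of the second stream are marked as initially
        paused.</t>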
      </section>

      <section anchor="sec-cap" title="Simulcast Capability">
        <t>Simulcast capability is expressed as a new media level <xref
        target="sec-attr">SDP attribute, "a=simulcast"</xref>, with <xref
        target="I-D.ietf-mmusic-sdp-mux-attributes">multiplex category</xref>
        NORMAL.</t>

        <!--Open issue: this section should be re-structured in the recommended steps, with sub-sections:
generating the initial SDP offer, answerer processing the SDP offer, creating the SDP answer,
offerer processing of the SDP answer, and modifying the session.

-->

        <t>For each desired direction (send/recv), the simulcast attribute
        defines a list of simulcast streams (separated by semicolons), each of
        which is a list of simulcast formats (separated by commas). The
        meaning of the attribute on SDP session level is undefined, and the
        attribute MUST NOT be used on that level.</t>

        <t>The meaning of including multiple "a=simulcast" lines in a single
        SDP media description is undefined, and a media description MUST NOT
        contain more than one such line. There are separate and independent
        sets of parameters for simulcast in send and receive directions. When
        listing multiple directions, each direction MUST NOT occur more than
        once on the same line.</t>
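
        <t>The placement rules in the two preceding paragraphs can be checked
        mechanically. The following Python sketch is informative only and
        rests on assumptions of this text: the function name
        check_simulcast_placement and the wording of the reported problems
        are hypothetical, and the input is assumed to be a complete SDP with
        one line per text line.</t>

        <figure align="center" anchor="fig-sketch-placement"
                title="Informative Sketch: Checking Attribute Placement">
          <artwork align="left"><![CDATA[
```python
def check_simulcast_placement(sdp_text):
    """Return a list of placement problems found in an SDP."""
    problems = []
    section = "session"   # lines before the first "m=" line
    count = 0             # a=simulcast lines in this section
    for line in sdp_text.splitlines():
        if line.startswith("m="):
            section, count = line, 0
        elif line.startswith("a=simulcast:"):
            if section == "session":
                problems.append("a=simulcast on session level")
                continue
            count += 1
            if count == 2:
                problems.append(
                    "multiple a=simulcast lines under " + section)
            value = line[len("a=simulcast:"):]
            directions = value.split(" ")[0::2]
            if len(directions) != len(set(directions)):
                problems.append("direction repeated in: " + line)
    return problems
```
]]></artwork>
        </figure>

        <t>An empty list from this sketch only means that none of the three
        placement rules was violated; it does not validate the rest of the
        attribute value.</t>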

        <t>The different simulcast streams MUST be identified through the
        <xref target="I-D.ietf-mmusic-rid">RTP-level "RID" identification
        mechanism</xref>.</t>

        <t>Attribute parameters are grouped by direction and consist of a
        listing of simulcast stream identifications to be used. The number of
        (non-alternative, see below) identifications in the list sets a limit
        to the number of supported simulcast streams in that direction. The
        order of the listed simulcast streams in the "send" direction
        suggests an order of preference, in decreasing order: the
        stream listed first is the most preferred <xref
        target="sec-diverse-receivers"/>, and subsequent streams have
        progressively lower preference. The order of the listed simulcast
        streams in the "recv" direction expresses which simulcast streams
        are preferred, with the leftmost being most preferred. This can be of
        importance if the number of actually sent simulcast streams has to be
        reduced for some reason.</t>
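
        <t>This memo does not prescribe how a sender reduces the number of
        simulcast streams. One possible policy, shown in the informative
        Python sketch below, is to keep the leftmost streams of a direction
        and drop the rest. The function name keep_most_preferred and its
        max_streams parameter are hypothetical and are used only for
        illustration.</t>

        <figure align="center" anchor="fig-sketch-reduce"
                title="Informative Sketch: Keeping the Most Preferred Streams">
          <artwork align="left"><![CDATA[
```python
def keep_most_preferred(stream_list, max_streams):
    """Keep the leftmost simulcast streams of one direction.

    stream_list is the text after "send" or "recv", for
    example "1,2,3;~4,~5". Alternatives are left untouched.
    """
    if max_streams < 1:
        raise ValueError("at least one stream must remain")
    streams = stream_list.split(";")
    return ";".join(streams[:max_streams])
```
]]></artwork>
        </figure>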

        <t>Formats that have explicit <xref
        target="RFC5583">dependencies</xref> <xref
        target="I-D.ietf-mmusic-rid"/> to other formats (even in the same
        media description) MAY be listed as different simulcast streams.</t>

        <t>Alternative simulcast formats MAY be specified as part of the
        attribute parameters by expressing each simulcast stream as a
        comma-separated list of alternative format identifiers. In this case,
        it is not possible to align which alternative formats are used
        across different simulcast streams, for example by requiring all
        simulcast streams to use alternatives with the same codec format. The
        order of
        the format alternatives within a simulcast stream is significant; the
        alternatives are listed from (left) most preferred to (right) least
        preferred. For the use of simulcast, this overrides the normal codec
        preference as expressed by format type ordering on the "m="-line,
        using regular SDP rules. This is to enable a separation of general
        codec preferences and simulcast stream configuration preferences.</t>
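
        <t>As an informative illustration of the left-to-right preference
        among alternatives, the Python sketch below picks the first
        alternative of one simulcast stream that an endpoint is able to use.
        The function name choose_alternative and the usable_rids parameter
        are hypothetical; how an endpoint decides which RIDs it can use is
        outside the scope of the sketch.</t>

        <figure align="center" anchor="fig-sketch-alternative"
                title="Informative Sketch: Choosing an Alternative Format">
          <artwork align="left"><![CDATA[
```python
def choose_alternative(stream, usable_rids):
    """Return the most preferred usable alternative, or None.

    stream is one comma-separated alternative list, for
    example "1,2,3"; usable_rids is a set of RID strings.
    """
    for sc_id in stream.split(","):
        rid = sc_id.lstrip("~")   # ignore initial-pause marker
        if rid in usable_rids:
            return rid
    return None
```
]]></artwork>
        </figure>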

        <t>A simulcast stream can use a codec defined such that the same RTP
        SSRC can change RTP payload type multiple times during a session,
        possibly even on a per-packet basis. A typical example can be a speech
        codec that makes use of <xref target="RFC3389">Comfort Noise</xref>
        and/or <xref target="RFC4733">DTMF</xref> formats. In those cases,
        such "related" formats MUST NOT be listed explicitly in the attribute
        parameters, since they are not strictly simulcast streams of the media
        source, but rather a specific way of generating the RTP stream of a
        single simulcast stream with varying RTP payload type. Instead, only a
        single simulcast stream identification MUST be used per simulcast
        stream or alternative simulcast format (if there are such) in the
        SDP.</t>

        <t>If <xref target="I-D.ietf-avtext-rtp-stream-pause">RTP stream
        pause/resume</xref> is supported, any simulcast stream identification
        MAY be prefixed by a "~" character to indicate that the corresponding
        simulcast stream is initially paused from the start of the RTP
        session. In this case, support for RTP stream pause/resume MUST also
        be included under the same "m="-line listing "a=simulcast". If the
        simulcast stream is specified as a list of alternative formats, the
        indication is prepended to the first format of the list and applies to
        whichever alternative is eventually chosen. All RTP payload types
        related to such an initially paused simulcast stream MUST be listed in
        the SDP as pause/resume capable as specified by <xref
        target="I-D.ietf-avtext-rtp-stream-pause"/>.</t>

        <t>An initially paused simulcast stream in "send" direction MUST be
        considered equivalent to an unsolicited locally paused stream, and be
        handled accordingly. Initially paused simulcast streams are resumed as
        described by the RTP pause/resume specification. An RTP stream
        receiver that wishes to resume an unsolicited locally paused stream
        needs to know the SSRC of that stream. The SSRC of an initially paused
        simulcast stream can be obtained from an RTP stream sender RTCP Sender
        Report (SR) including both the desired SSRC as "SSRC of sender", and
        the stream RID identification as an RID RTCP SDES item.</t>

        <t>Including an initially paused simulcast stream in "recv" direction
        in an SDP towards an RTP sender SHOULD cause the remote RTP sender to
        treat the stream as unsolicited locally paused, unless there are other
        RTP stream receivers that do not mark the simulcast stream as
        initially paused. The reason to require an initially paused "recv"
        stream to be considered locally paused by the remote RTP sender,
        instead of making it equivalent to implicitly sending a pause request,
        is that the pausing RTP sender cannot know which SSRC owns the
        restriction when TMMBR/TMMBN are used for pause/resume signaling,
        since the RTP receiver's SSRC in send direction is not known yet.</t>

        <t>Use of the <xref target="RFC2198">redundant audio data</xref>
        format could be seen as a form of simulcast for loss protection
        purposes, but is not considered conflicting with the mechanisms
        described in this memo and MAY therefore be used as any other format.
        In this case, the "red" format, rather than the carried formats,
        SHOULD be the one listed as a simulcast stream on the "a=simulcast"
        line.</t>

        <t>The media formats and corresponding characteristics of simulcast
        streams SHOULD be chosen such that they are different. If this
        difference is not required, <xref target="RFC7104">RTP
        duplication</xref> procedures SHOULD be considered instead of
        simulcast.</t>

        <section title="Declarative Use">
          <t>When used as a declarative media description, "a=simulcast" line
          "recv" direction formats indicate the configured end point's
          required capability to recognize and receive a specified set of RTP
          streams as simulcast streams. In the same fashion, "a=simulcast"
          line "send" direction requests the end point to send a specified set
          of RTP streams as simulcast streams.</t>

          <t>If multiple simulcast formats are listed, it means that the
          configured end point MUST be prepared to receive any of the "recv"
          formats, and MAY send any of the "send" formats for that simulcast
          stream.</t>

          <t><list style="empty">
              <t>Editor's note: It may not be beneficial for declarative use
              to be limited to a single media source per "m=" line, as
              elaborated further in <xref target="sec-limitation"/>.</t>
            </list></t>
        </section>

        <section anchor="sec-offer-answer" title="Offer/Answer Use">
          <t>An offerer wanting to use simulcast SHALL include the
          "a=simulcast" attribute in the offer. An offerer that receives an
          answer without "a=simulcast" MUST NOT use simulcast towards the
          answerer. An offerer that receives an answer with "a=simulcast"
          without any simulcast stream identifications in a specified
          direction MUST NOT use simulcast in that direction.</t>

          <t>An answerer that does not understand the concept of simulcast
          will also not know the attribute and will remove it in the SDP
          answer, as defined in existing <xref target="RFC3264">SDP
          Offer/Answer</xref> procedures.</t>

          <t>An answerer that does understand the attribute and that wants to
          support simulcast in an indicated direction SHALL include the
          attribute in the answer with the directionality of the direction
          parameters reversed; "send" becomes "recv" and vice versa.</t>

          <t>An offerer listing a set of receive simulcast streams and/or
          alternative formats in the offer MUST be prepared to receive RTP
          streams for any of those simulcast streams and/or alternative
          formats from the answerer.</t>

          <t>An answerer that receives an offer containing an "a=simulcast"
          attribute that lists alternative formats for simulcast streams MAY
          keep all the alternatives in the answer, but it MAY also choose to
          remove any undesired alternatives per simulcast stream in the
          answer. The answerer MUST NOT add any alternatives that were not
          present in the offer.</t>

          <t>An answerer that receives an offer with simulcast that lists a
          number of simulcast streams MAY reduce the number of simulcast
          streams in the answer, but MUST NOT add simulcast streams.</t>
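
          <t>The two preceding rules can be illustrated by the following
          hypothetical offer/answer fragments, given as a sketch only. The
          offer lists three simulcast streams, the third of which has two
          alternative formats. The answer reverses the direction, removes the
          simulcast stream with RID 2, and keeps only the first alternative
          of the third simulcast stream. The RID values are arbitrary, and
          the corresponding "a=rid" lines and all other lines of the media
          description are omitted for brevity.</t>

          <figure>
            <artwork align="left"><![CDATA[
Offer (fragment):
a=simulcast:send 1;2;3,4

Answer (fragment):
a=simulcast:recv 1;3
]]></artwork>
          </figure>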

          <t>An offerer that receives an answer where some simulcast formats
          are kept MUST be prepared to receive any of the kept send direction
          alternatives, and MAY send any of the kept receive direction
          alternatives from the answer. Similarly, the answerer MUST be
          prepared to receive any of the kept receive direction alternatives,
          and MAY send any of the kept send direction alternatives in the
          answer.</t>

          <t>The offerer and answerer MUST NOT send more than a single
          alternative format at a time (based on RTP timestamps) per simulcast
          stream, but MAY change format on a per-RTP packet basis. This
          corresponds to the existing (non-simulcast) SDP offer/answer case
          when multiple formats are included on the "m=" line in the SDP
          answer.</t>

          <t>An offerer that receives an answer where some of the simulcast
          streams are removed MAY release the corresponding resources (codec,
          transport, etc) in its receive direction and MUST NOT send any RTP
          packets corresponding to the removed simulcast streams.</t>

          <t>Simulcast streams or formats using undefined simulcast stream
          identifications MUST NOT be used as valid simulcast streams by an
          RTP stream receiver.</t>

          <t>An answerer that receives an offer without RTP stream
          pause/resume capability MUST NOT mark any simulcast streams as
          initially paused in the answer.</t>

          <t>An answerer that receives an offer with RTP stream pause/resume
          capability MAY mark any simulcast streams as initially paused in the
          answer.</t>

          <t>An answerer that receives an indication in an offer that a
          simulcast stream is initially paused SHOULD mark that simulcast
          stream as initially paused also in the answer, regardless of
          direction, unless it has good reason for the stream not being
          initially paused.</t>

          <t>An offerer that offered some of its simulcast streams as
          initially paused and that receives an answer that does not indicate
          RTP stream pause/resume capability MUST NOT initially pause any
          simulcast streams.</t>

          <t>An offerer with RTP stream pause/resume capability that receives
          an answer where some simulcast streams are marked as initially
          paused SHOULD initially pause them, regardless of whether they were
          marked as initially paused also in the offer, unless it has good
          reason for those streams not being initially paused.</t>

          <t><list style="empty">
              <t>Note: The inclusion of "a=simulcast" or the use of simulcast
              does not change any of the interpretation or Offer/Answer
              procedures for other SDP attributes, like "a=fmtp" or
              "a=rid".</t>
            </list></t>
        </section>
      </section>

      <section anchor="sec-relating" title="Relating Simulcast Streams">
        <t>Simulcast RTP streams MUST be related on RTP level through <xref
        target="I-D.roach-avtext-rid">RID</xref>, as specified in the SDP
        <xref target="sec-cap">"a=simulcast" attribute </xref> parameters.
        This is sufficient as long as there is only a single media source per
        SDP media description. When using <xref
        target="I-D.ietf-mmusic-sdp-bundle-negotiation">BUNDLE</xref>, where
        multiple SDP media descriptions jointly specify a single RTP session,
        the SDES MID identification mechanism in BUNDLE allows relating RTP
        streams back to individual media descriptions, after which the above
        described RID relations can be used. Use of the <xref
        target="RFC5285">RTP header extension</xref> for both MID and RID
        identifications can be important to ensure rapid initial reception,
        required to correctly interpret and process the RTP streams.
        Implementers of this specification MUST support the RTCP source
        description (SDES) item method and SHOULD support the RTP header
        extension method to signal RID on RTP level.</t>
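
        <t>As an illustration, an endpoint that supports the RTP header
        extension method for both MID and RID could include lines such as
        the following in each bundled media description. The URNs are copied
        from <xref target="fig-ms-offer"/>, and the extension identifier
        values 1 and 2 are arbitrary illustration values.</t>

        <figure>
          <artwork align="left"><![CDATA[
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rid
]]></artwork>
        </figure>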
      </section>

      <section anchor="sec-ex" title="Signaling Examples">
        <t>These examples describe a client connecting to a video conference
        service that uses a centralized media topology with an RTP mixer.</t>

        <figure align="center" anchor="fig-mixer-four-party"
                title="Four-party Mixer-based Conference">
          <artwork align="center"><![CDATA[
+---+      +-----------+      +---+
| A |<---->|           |<---->| B |
+---+      |           |      +---+
           |   Mixer   |
+---+      |           |      +---+
| F |<---->|           |<---->| J |
+---+      +-----------+      +---+]]></artwork>
        </figure>

        <section anchor="sec-ex-single-source" title="Single-Source Client">
          <t>Alice is calling in to the mixer with a simulcast-enabled client
          capable of a single media source per media type. The client can send
          a simulcast of 2 video resolutions and frame rates: HD 1280x720p
          30fps and thumbnail 320x180p 15fps. This is defined below using the
          <xref target="RFC6236">"imageattr"</xref> attribute. In this
          example, only the
          "pt" RID parameter is used, effectively achieving a 1:1 mapping
          between RID and media formats (RTP payload types), to describe
          simulcast stream formats. Alice's Offer:</t>

          <figure align="center" anchor="fig-up-offer"
                  title="Single-Source Simulcast Offer">
            <artwork align="left"><![CDATA[
v=0
o=alice 2362969037 2362969040 IN IP4 192.0.2.156
s=Simulcast Enabled Client
t=0 0
c=IN IP4 192.0.2.156
m=audio 49200 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 49300 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
a=fmtp:98 profile-level-id=42c00b; max-fs=240; max-mbps=3600
a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
a=rid:1 pt=97
a=rid:2 pt=98
a=simulcast:send 1;2 recv 1

]]></artwork>
          </figure>

          <t>The only thing in the SDP that indicates simulcast capability is
          the line in the video media description containing the "simulcast"
          attribute. The included format parameters indicate that sent
          simulcast streams can differ in video resolution.</t>

          <t>The Answer from the server indicates that it too is simulcast
          capable. Should it not have been simulcast capable, the
          "a=simulcast" line would not have been present and communication
          would have started with the media negotiated in the SDP.</t>

          <figure align="center" anchor="fig-up-answer"
                  title="Single-Source Simulcast Answer">
            <artwork align="left"><![CDATA[
v=0
o=server 823479283 1209384938 IN IP4 192.0.2.2
s=Answer to Simulcast Enabled Client
t=0 0
c=IN IP4 192.0.2.43
m=audio 49672 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 49674 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
a=fmtp:98 profile-level-id=42c00b; max-fs=240; max-mbps=3600
a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
a=rid:1 pt=97
a=rid:2 pt=98
a=simulcast:recv 1;2 send 1

]]></artwork>
          </figure>

          <t>Since the server is the simulcast media receiver, it reverses the
          direction of the "simulcast" attribute parameters.</t>
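
          <t>For comparison, the following is a sketch of what the video
          media description of the answer might look like from a server that
          is not simulcast capable. It assumes, purely for illustration, a
          server that supports H.264 and "imageattr" but knows neither
          "a=rid" nor "a=simulcast", so both are absent from the answer. All
          other values are copied from <xref target="fig-up-answer"/>.</t>

          <figure>
            <artwork align="left"><![CDATA[
m=video 49674 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
a=fmtp:98 profile-level-id=42c00b; max-fs=240; max-mbps=3600
a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
]]></artwork>
          </figure>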
        </section>

        <section anchor="sec-ex-multi-source" title="Multi-Source Client">
          <t>Fred is calling in to the same conference as in the example above
          with a two-camera, two-display system, thus capable of handling two
          separate media sources in each direction, where each media source is
          simulcast-enabled in the send direction. Fred's client is restricted
          to a single media source per media description.</t>

          <t>The first two simulcast streams for the first media source use
          different codecs, <xref target="RFC6190">H264-SVC</xref> and <xref
          target="RFC6184">H264</xref>. These two simulcast streams also have
          a temporal dependency. Two different video codecs, <xref
          target="I-D.ietf-payload-vp8">VP8</xref> and H264, are offered as
          alternatives for the third simulcast stream for the first media
          source. Only the highest fidelity simulcast stream is sent from
          start, the lower fidelity streams being initially paused.</t>

          <t>The second media source is offered with three different simulcast
          streams. All video streams of this second media source are loss
          protected by <xref target="RFC4588">RTP retransmission</xref>. Also
          here, all but the highest fidelity simulcast stream are initially
          paused.</t>

          <t>Fred's client is also using BUNDLE to send all RTP streams from
          all media descriptions in the same RTP session on a single media
          transport. Although this example uses many different simulcast
          streams, the use of RID as simulcast stream identification keeps
          the number of RTP payload types low. Note that when both BUNDLE and
          RID are used, it is recommended to carry these identifications in
          the <xref target="RFC5285">RTP header extension</xref>, which is
          consequently also included in the SDP.</t>

          <figure anchor="fig-ms-offer"
                  title="Fred's Multi-Source Simulcast Offer">
            <artwork><![CDATA[
v=0
o=fred 238947129 823479223 IN IP4 192.0.2.125
s=Offer from Simulcast Enabled Multi-Source Client
t=0 0
c=IN IP4 192.0.2.125
a=group:BUNDLE foo bar zen

m=audio 49200 RTP/AVP 99
a=mid:foo
a=rtpmap:99 G722/8000

m=video 49600 RTP/AVPF 100 101 103
a=mid:bar
a=rtpmap:100 H264-SVC/90000
a=rtpmap:101 H264/90000
a=rtpmap:103 VP8/90000
a=fmtp:100 profile-level-id=42400d; max-fs=3600; max-mbps=108000; \
    mst-mode=NI-TC
a=fmtp:101 profile-level-id=42c00d; max-fs=3600; max-mbps=54000
a=fmtp:103 max-fs=900; max-fr=30
a=rid:1 send pt=100;max-width=1280;max-height=720;max-fr=60;depend=2
a=rid:2 send pt=101;max-width=1280;max-height=720;max-fr=30
a=rid:3 send pt=101;max-width=640;max-height=360
a=rid:4 send pt=103;max-width=640;max-height=360
a=depend:100 lay bar:101
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rid
a=rtcp-fb:* ccm pause nowait
a=simulcast:send 1;2;~4,3

m=video 49602 RTP/AVPF 96 104
a=mid:zen
a=rtpmap:96 VP8/90000
a=fmtp:96 max-fs=3600; max-fr=30
a=rtpmap:104 rtx/90000
a=fmtp:104 apt=96;rtx-time=200
a=rid:5 send pt=96;max-fs=921600;max-fr=30
a=rid:6 send pt=96;max-fs=614400;max-fr=15
a=rid:7 send pt=96;max-fs=230400;max-fr=30
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rid
a=rtcp-fb:* ccm pause nowait
a=simulcast:send 5;~6;~7

]]></artwork>
          </figure>

          <t><list style="empty">
              <t>Note: Empty lines in the SDP above are added only for
              readability and would not be present in an actual SDP.</t>
            </list></t>
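
          <t>This memo does not show an answer to Fred's offer. As a hedged
          sketch only, a mixer that accepts all offered simulcast streams,
          keeps only the first alternative for the third simulcast stream of
          the first media source, and keeps the initially paused markings
          from the offer could include the following "a=simulcast" lines in
          its answer. The choices made by the mixer here are assumptions for
          illustration, and all other lines of the answer are omitted.</t>

          <figure>
            <artwork align="left"><![CDATA[
In the media description with a=mid:bar:
a=simulcast:recv 1;2;~4

In the media description with a=mid:zen:
a=simulcast:recv 5;~6;~7
]]></artwork>
          </figure>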
        </section>
      </section>
    </section>

    <section anchor="sec-network-aspects" title="Network Aspects">
      <t>In this memo, simulcast is defined as the act of sending multiple
      alternative encoded streams of the same underlying media source. When
      transmitting multiple independent streams that originate from the same
      source, it could potentially be done in several different ways using
      RTP. A general discussion on considerations for use of the different RTP
      multiplexing alternatives can be found in <xref
      target="I-D.ietf-avtcore-multiplex-guidelines">Guidelines for
      Multiplexing in RTP</xref>. Discussion and clarification on how to
      handle multiple streams in an RTP session can be found in <xref
      target="I-D.ietf-avtcore-rtp-multi-stream"/>.</t>

      <t>The network aspects that are relevant for simulcast are:<list
          style="hanging">
          <t hangText="Quality of Service:">When using simulcast it might be
          of interest to prioritize a particular simulcast stream, rather than
          applying equal treatment to all streams. For example, lower bit-rate
          streams may be prioritized over higher bit-rate streams to minimize
          congestion or packet losses in the low bit-rate streams. Thus, there
          is a benefit in using a simulcast solution with good QoS
          support.</t>

          <t hangText="NAT/FW Traversal:">Using multiple RTP sessions incurs
          more cost for NAT/FW traversal unless they can re-use the same
          transport flow, which can be achieved by <xref
          target="I-D.ietf-mmusic-sdp-bundle-negotiation">Multiplexing
          Negotiation Using SDP Port Numbers</xref>.</t>
        </list></t>

      <section title="Bitrate Adaptation">
        <t>Use of multiple simulcast streams can require a significant amount
        of network resources. If the amount of available network resources
        varies during an RTP session such that it does not match what is
        negotiated in SDP, the bitrate used by the different simulcast streams
        may have to be reduced dynamically. Which simulcast streams to
        prioritize when allocating available bitrate among the simulcast
        streams in such adaptation SHOULD be taken from the simulcast stream
        order on the "a=simulcast" line. Simulcast streams that have
        pause/resume capability and that would be given such a low bitrate by
        the adaptation process that they are no longer considered useful can
        be temporarily paused until the limiting condition clears.</t>
      </section>
    </section>

    <section anchor="sec-limitation" title="Limitations">
      <t>The chosen approach has a few limitations that are described in this
      section. The only one currently described relates to the use of a single
      RTP session for all simulcast formats of a media source.</t>

      <section title="Single RTP Session">
        <t>The limitations in this section come from sending all simulcast
        streams related to a media source under the same SDP media
        description, which also means they are sent in the same RTP
        session.</t>

        <t>It is not possible to use different simulcast streams on different
        media transports, limiting the possibilities to apply different QoS to
        different simulcast streams. When using unicast, QoS mechanisms based
        on individual packet marking are feasible, since they do not require
        separation of simulcast streams into different RTP sessions to apply
        different QoS.</t>

        <t>It is not possible to separate different simulcast streams into
        different multicast groups to allow a multicast receiver to pick the
        stream it wants, rather than receive all of them. In this case, the
        only reasonable implementation is to use different RTP sessions for
        each multicast group so that reporting and other RTCP functions
        operate as intended.</t>
      </section>
    </section>

    <section anchor="sec-iana" title="IANA Considerations">
      <t>This document requests to register a new SDP attribute, simulcast, as
      defined in <xref target="sec-attr"/>.</t>
    </section>

    <section anchor="sec-security" title="Security Considerations">
      <t>The simulcast capability, configuration attributes, and parameters
      are vulnerable to attacks in signaling.</t>

      <t>A false inclusion of the "a=simulcast" attribute may result in
      simultaneous transmission of multiple RTP streams that would otherwise
      not be generated. The impact is limited by the media description joint
      bandwidth, shared by all simulcast streams irrespective of their number.
      There may however be a large number of unwanted RTP streams that will
      impact the share of bandwidth allocated for the originally wanted RTP
      stream.</t>

      <t>A hostile removal of the "a=simulcast" attribute will result in
      simulcast not being used.</t>

      <t>Neither of the above is likely to have any major consequences, and
      both can be mitigated by signaling that is at least integrity protected
      and source authenticated, which prevents an attacker from changing
      it.</t>

      <t>Security considerations related to the use of RID are covered in <xref
      target="I-D.ietf-mmusic-rid"/> and <xref
      target="I-D.roach-avtext-rid"/>. There are no additional security
      concerns related to its use in this specification.</t>

      <!--Open issue: Review this! Are there security issues that arise specifically from this draft's use of RID?-->
    </section>

    <section title="Contributors">
      <t>Morgan Lindqvist and Fredrik Jansson, both from Ericsson, have
      contributed with important material to the first versions of this
      document. Robert Hansen and Cullen Jennings, from Cisco, Peter Thatcher,
      from Google, and Adam Roach, from Mozilla, contributed significantly to
      subsequent versions.</t>
    </section>

    <section anchor="sec-ack" title="Acknowledgements">
      <t/>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include='reference.RFC.3550'?>

      <?rfc include='reference.RFC.4566'?>

      <?rfc include='reference.RFC.5109'?>

      <?rfc include='reference.RFC.5234'?>

      <?rfc include='reference.RFC.7104'?>

      <?rfc include='reference.I-D.ietf-mmusic-rid'?>

      <?rfc include='reference.I-D.roach-avtext-rid'?>

      <?rfc include='reference.I-D.ietf-avtext-rtp-stream-pause'?>

      <?rfc include='reference.I-D.ietf-mmusic-sdp-mux-attributes'?>
    </references>

    <references title="Informative References">
      <?rfc include='reference.RFC.2198'?>

      <?rfc include='reference.RFC.3264'?>

      <?rfc include='reference.RFC.3389'?>

      <?rfc include='reference.RFC.4588'?>

      <?rfc include='reference.RFC.4733'?>

      <?rfc include='reference.RFC.5285'?>

      <?rfc include='reference.RFC.5576'?>

      <?rfc include='reference.RFC.5583'?>

      <?rfc include='reference.RFC.6184'?>

      <?rfc include='reference.RFC.6190'?>

      <?rfc include='reference.RFC.6236'?>

      <?rfc include='reference.RFC.7656'?>

      <?rfc include='reference.RFC.7667'?>

      <?rfc include='reference.I-D.ietf-avtcore-multiplex-guidelines'?>

      <?rfc include='reference.I-D.ietf-avtcore-rtp-multi-stream'?>

      <?rfc include='reference.I-D.ietf-mmusic-sdp-bundle-negotiation'?>

      <?rfc include='reference.I-D.ietf-payload-vp8'?>

      <?rfc include='reference.I-D.ietf-payload-flexible-fec-scheme'?>
    </references>

    <section title="Changes From Earlier Versions">
      <t>NOTE TO RFC EDITOR: Please remove this section prior to
      publication.</t>

      <section title="Modifications Between WG Version -03 and -04">
        <t><list style="symbols">
            <t>Changed to only use RID identification, as was consensus during
            IETF 94.</t>

            <t>ABNF improvements.</t>

            <t>Clarified offer-answer rules for initially paused streams.</t>

            <t>Changed references for RTP topologies and RTP taxonomy
            documents that are now published as RFC.</t>

            <t>Added reference to the new RID draft in AVTEXT.</t>

            <t>Re-structured section 6 to provide an easy reference by the
            updated IANA section.</t>

            <t>Added a sub-section 7.1 with a discussion of bitrate
            adaptation.</t>

            <t>Editorial improvements.</t>
          </list></t>
      </section>

      <section title="Modifications Between WG Version -02 and -03">
        <t><list style="symbols">
            <t>Removed text on multicast / broadcast from use cases, since it
            is not supported by the solution.</t>

            <t>Removed explicit references to unified plan draft.</t>

            <t>Added possibility to initiate simulcast streams in paused
            mode.</t>

            <t>Enabled an offerer to offer multiple stream identification (pt
            or rid) methods and have the answerer choose which to use.</t>

            <t>Added a preference indication also in send direction
            offers.</t>

            <t>Added a section on limitations of the current proposal,
            including identification method specific limitations.</t>
          </list></t>
      </section>

      <section title="Modifications Between WG Version -01 and -02">
        <t><list style="symbols">
            <t>Relying on the new RID solution for codec constraints and
            configuration identification. This has resulted in changes in
            syntax to identify if pt or RID is used to describe the simulcast
            stream.</t>

            <t>Renamed simulcast version and simulcast version alternative to
            simulcast stream and simulcast format respectively, and improved
            definitions for them.</t>

            <t>Clarification that it is possible to switch between simulcast
            version alternatives, but that only a single one can be used at
            any point in time.</t>

            <t>Changed the definition so that the ordering of simulcast
            formats for a specific simulcast stream does have a preference
            order.</t>
          </list></t>
      </section>

      <section title="Modifications Between WG Version -00 and -01">
        <t><list style="symbols">
            <t>No changes. Only preventing expiry.</t>
          </list></t>
      </section>

      <section title="Modifications Between Individual Version -00 and WG Version -00">
        <t><list style="symbols">
            <t>Added this appendix.</t>
          </list></t>
      </section>
    </section>
  </back>
</rfc>
