<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc4585 SYSTEM "reference.RFC.4585.xml">
<!ENTITY rfc5104 SYSTEM "reference.RFC.5104.xml">
<!ENTITY rfc3550 SYSTEM "reference.RFC.3550.xml">
<!ENTITY rfc2119 SYSTEM "reference.RFC.2119.xml">
<!ENTITY rfc5234 SYSTEM "reference.RFC.5234.xml">
<!ENTITY rfc3261 SYSTEM "reference.RFC.3261.xml">
<!ENTITY rfc3264 SYSTEM "reference.RFC.3264.xml">
<!ENTITY rfc4566 SYSTEM "reference.RFC.4566.xml">
<!ENTITY rfc5117 SYSTEM "reference.RFC.5117.xml">
<!ENTITY rfc5888 SYSTEM "reference.RFC.5888.xml">
]>
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-westerlund-avtext-codec-operation-point-01"
     ipr="trust200902">
  <front>
    <title abbrev="COP">Codec Operation Point RTCP Extension</title>

    <author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>Farogatan 6</street>

          <city>SE-164 80 Kista</city>

          <country>Sweden</country>
        </postal>

        <phone>+46 10 714 82 87</phone>

        <email>magnus.westerlund@ericsson.com</email>
      </address>
    </author>

    <author fullname="Bo Burman" initials="B." surname="Burman">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>Farogatan 6</street>

          <city>SE-164 80 Kista</city>

          <country>Sweden</country>
        </postal>

        <phone>+46 10 714 13 11</phone>

        <email>bo.burman@ericsson.com</email>
      </address>
    </author>

    <author fullname="Laurits Hamm" initials="L." surname="Hamm">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>Ericsson Allee 1</street>

          <city>DE-52134 Herzogenrath</city>

          <country>Germany</country>
        </postal>

        <phone>+49 2407 575 6779</phone>

        <email>laurits.hamm@ericsson.com</email>
      </address>
    </author>

    <date day="22" month="October" year="2012"/>

    <abstract>
      <t>The Audio-visual Profile with Feedback (AVPF) specification defines a
      framework and messages for fast feedback and media control over RTCP.
      The Codec Control Messages (CCM) specification defines an extension to
      AVPF, by specifying additional messages for codec control and feedback.
      This specification extends CCM, by specifying messages that let
      participants dynamically communicate a set of codec configuration
      parameters, which enables better optimization of resource efficiency and
      quality of media transmission.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>Multimedia real-time communication services, such as video telephony
      and videoconferencing, use the <xref target="RFC3550">real-time
      transport (RTP/RTCP)</xref> protocol to transmit media streams, such as
      audio and video. A session establishment protocol, such as <xref
      target="RFC3261">SIP</xref>, in combination with a capability
      negotiation protocol, such as <xref target="RFC3264">SDP
      offer/answer</xref>, is normally used to establish the session and
      negotiate media capabilities. In some cases, a set of codec parameters
      is negotiated that does not express any specific limit or capability,
      but just describes a certain codec configuration.</t>

      <t>During session establishment, the participating endpoints normally
      have limited knowledge about the session environment, e.g. whether the
      session will be point-to-point or contain some multiparty scenario, how
      users will interact with the application, how network conditions will
      vary during the session, etc. To take those variations into account, the
      participants can renegotiate session parameters to better suit the
      communication environment. When such variations are frequent, the
      required reaction time is short, which can make repeated session
      renegotiation inefficient, too slow, or both. In addition, variations
      that occur within the negotiated boundaries may not affect the
      negotiated session parameters at all.</t>

      <t>The above scenario can become critical especially in cases where a
      given media stream is transmitted towards, and received by, multiple
      receivers. In multiparty environments, scalable encoding or simulcast
      can be used to make the system more efficient and provide better quality
      to participants that are capable of receiving and utilizing the higher
      quality. In these use cases, a sending party is requested to deliver
      multiple encoder operation points.</t>

      <t>The <xref target="RFC4585">Audio-Visual Profile with Feedback (AVPF)
      specification</xref> defines a framework and messages for fast feedback
      and media control over RTCP. The <xref target="RFC5104">Codec Control
      Messages (CCM) specification</xref> defines an extension to AVPF, by
      specifying additional messages for codec control and feedback. This
      specification extends CCM, by specifying messages that let participants
      dynamically communicate a set of codec configuration parameters, which
      enables better optimization of resource usage and quality of media
      transmission.</t>

      <t>The codec configuration parameters specified in this document focus
      on some basic audio and video properties, such as video resolution,
      video frame rate, media stream bit-rate, audio sampling rate, number of
      audio channels, and maximum RTP packet size and rate. Additional parameters
      can be standardized in the future.</t>

      <t>The codec control messages are not meant to replace the configuration
      performed using e.g. SDP. Instead, the messages can be used to
      communicate dynamic and frequent changes that take place within
      boundaries that have been negotiated as part of the session
      establishment.</t>
    </section>

    <section title="Definitions">
      <t/>

      <section title="Terminology">
        <t>The following terms and abbreviations are used in this
        document:</t>

        <t><list style="hanging">
            <t hangText="Bandwidth:">The network resource needed to transport
            a certain bitrate and any transport overhead, measured in bits per
            second. There will be spare network bandwidth when the (media)
            data bitrate plus overhead is less than the available bandwidth.
            Similarly, data will have to be buffered when the available
            bandwidth excluding transport overhead is less than the bitrate
            used by the sender, or the excess data will be lost. The available
            bandwidth typically varies dynamically over time.</t>

            <t hangText="Bitrate:">The amount of (media) data transmitted per
            time unit, measured in bits per second, utilizing some amount of
            the available network bandwidth resource. In the context of this
            specification and unless otherwise specified, it excludes
            IP/UDP/RTP overhead. Depending on the (media) data source, the
            bitrate can either be constant or vary dynamically over time.</t>

            <t hangText="Codec Configuration Parameter:">The configurable
            value describing a certain codec property, which may impact
            user-perceived media fidelity, encoded media stream
            characteristics, or both. The parameter has a type (codec
            parameter type, see below) and a value, where the type describes
            what kind of codec property is controlled, and the value describes
            the property setting as well as how the value should be used in
            comparison operations. A single parameter value can express one
            specific value or an open-ended range. A pair of parameter values
            with different comparison types can describe a value range. Such a
            value range can also be combined with a third, target value within
            that range.</t>

            <t hangText="Codec Operation Point:">Also denoted just operation
            point. A set of codec configuration parameter values, describing
            the characteristics of one single encoding. For scalable encoding,
            it describes the resulting characteristics from combining a set of
            dependent sub-streams.</t>

            <t hangText="Codec Parameter Type:">The specific type of a codec
            configuration parameter. Each parameter type defines what unit the
            value has. This specification defines a number of generally useful
            parameter types in <xref target="sec-parameters"/> that can be
            used to control codec operation.</t>

            <t hangText="Encoding:">A particular encoding is the resulting
            media stream from applying a certain choice of codec configuration
            parameters to the encoder. The media stream will have a certain
            fidelity (quality) from that encoding through the choice of
            sampling, bit-rate and other configuration parameters.</t>

            <t hangText="Endpoint:">A host or node that has a presence in the
            RTP session with one or more Synchronization Sources (SSRCs).</t>

            <t hangText="Mixer:">An RTP session centralized node that
            generates media streams based on incoming media streams from other
            endpoints. See Topo-Mixer in <xref target="RFC5117">RTP
            Topologies</xref>.</t>

            <t hangText="RTP Session:">An association among a set of
            participants communicating with RTP. The distinguishing feature of
            an RTP session (defined in <xref target="RFC3550"/>) is that each
            RTP session maintains a full, separate space of SSRC identifiers.
            Each participant in the RTP session can see SSRC or CSRC
            identifiers from the other participants, either by RTP, RTCP, or
            both.</t>

            <t hangText="Sub-Stream:">An individually decodable part of a
            scalable media stream, including all dependent sub-streams. The
            characteristics of a certain sub-stream can be described by a
            codec operation point.</t>

            <t hangText="Translator:">An RTP session centralized node that
            forwards all media streams from other endpoints, modified to some
            extent, e.g. addressing, encoding, fidelity. See Topo-Translator
            in <xref target="RFC5117">RTP Topologies</xref>.</t>
          </list></t>
      </section>

      <section title="Abbreviations">
        <t><list style="hanging">
            <t hangText="AVC:">Advanced Video Coding</t>

            <t hangText="AVPF:">Extended RTP Profile for RTCP-Based
            Feedback</t>

            <t hangText="CCP:">Codec Configuration Parameter</t>

            <t hangText="COP:">Codec Operation Point</t>

            <t hangText="COPN:">Codec Operation Point Notification</t>

            <t hangText="COPR:">Codec Operation Point Request</t>

            <t hangText="COPS:">Codec Operation Point Status</t>

            <t hangText="CPT:">Codec Parameter Type</t>

            <t hangText="FCI:">Feedback Control Information</t>

            <t hangText="FMT:">Feedback Message Type</t>

            <t hangText="GUI:">Graphical User Interface</t>

            <t hangText="MST:">Multi-Session Transmission</t>

            <t hangText="MVC:">Multiview Video Coding</t>

            <t hangText="OP:">Operation Point</t>

            <t hangText="OPID:">Operation Point Identification number</t>

            <t hangText="PPS:">Picture Parameter Set</t>

            <t hangText="SPS:">Sequence Parameter Set</t>

            <t hangText="SST:">Single-Session Transmission</t>

            <t hangText="SVC:">Scalable Video Coding</t>

            <t hangText="TLV:">Type-Length-Value</t>
          </list></t>
      </section>

      <section title="Requirements Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119"/>.</t>
      </section>
    </section>

    <section title="Motivation">
      <section anchor="sec-problem" title="Problem Description">
        <t>Networks can contain endpoints with different capabilities,
        including CPU power, capture and render device fidelity (e.g. image
        resolution), and codecs. In addition, the characteristics and
        properties of networks can vary, which endpoints have to cope with.
        For example, in videoconferencing and telepresence services, a large
        number of endpoints may participate, and there may be a large number
        of media streams associated with the session. Such multiparty
        scenarios typically use entities for media mixing, switching and
        transcoding. The aim is to provide the best possible quality to each
        endpoint, taking endpoint and network capabilities into
        consideration.</t>

        <t>Many communication services today use codecs that can be configured
        in a number of different ways. Often, the codecs have multiple
        properties that can be configured and those properties may also be
        inter-related, often in complex ways. One example is the <xref
        target="H264">H.264 (AVC)</xref> video codec and its scalable (SVC)
        and multiview (MVC) versions. Most other video codecs, and codecs for
        many other types of media, also have multiple configurable properties.
        Such configurable properties will be referred to as "codec
        configuration parameters" in this specification.</t>

        <t>There can be several reasons to change the media rate or other
        encoding or packetization properties during an ongoing communication
        session. Reasons can be that the available network bandwidth varies,
        or that other network properties change, such as effective MTU or
        packet rate limitations. Other reasons can be that the quality or
        representation of the media rendered to the end user changes, maybe as
        a direct result of the user manipulating the GUI (e.g. changing window
        position or size), or the relative importance of the received media
        stream changes (e.g. active or non-active speaker in a conferencing
        scenario), or the user selects to show some other content source that
        is available among the advertised media streams.</t>

        <t>The codec changes above can be made directly between endpoints in a
        point-to-point scenario, or they may involve, and be acted upon by,
        media-aware intermediaries (e.g. RTP mixers). An RTP mixer can do
        transcoding to provide each receiver with media streams of adapted
        quality, but transcoding has drawbacks as it always consumes
        processing power, typically impacts media quality in a negative way,
        and often introduces additional delays.</t>

        <t>In order to avoid separate transcoding towards each endpoint, an
        RTP mixer can, by taking the capabilities of the endpoints into
        account, decide to request specific codec configurations from sending
        endpoints, which will minimize the need for transcoding. Also, in
        scenarios where no RTP mixers are used and transmitted media reaches
        multiple endpoints, the sender will have to take into account that
        each endpoint may have different capabilities. The <xref
        target="sec-usecases">use cases section</xref> shows different use
        cases, with and without RTP mixers.</t>

        <t>Resource optimization involving bandwidth is expected to be one of
        the major reasons for changing encoding properties, since it is
        desirable to avoid using more bandwidth than absolutely necessary,
        especially considering that <list style="symbols">
            <t>the expectation for high media quality will continue to
            increase;</t>

            <t>the bitrate required to transmit the media can, as a
            consequence, also be expected to increase, despite increasingly
            efficient media coding;</t>

            <t>the available bandwidth is commonly a scarce and/or costly
            resource and will continue to be in the future;</t>

            <t>the relation between media bitrate and media codec
            configuration, the used set of media codec property values, is
            typically complex and the mapping between each individual codec
            property and bitrate is not linear;</t>

            <t>the used media bitrate does not uniquely identify the media
            codec configuration, but there are multiple codec configurations
            that can generate the same media bitrate;</t>

            <t>the media receiver's preferences for how the codec property values
            should be set for a certain media bitrate will vary with the
            specific end-user service requirements (for example, but not
            limited to, users with special needs) and the current media stream
            role in the application;</t>

            <t>the communication scenarios will not be limited to
            point-to-point, potentially involving multiple and at least partly
            conflicting constraints from different receivers.</t>
          </list>Other resources that may be desirable to optimize include,
        but are not limited to, endpoint and middle node processing (CPU)
        utilization, and transport quality (QoS).</t>

        <t>A media receiver cannot be assumed to know exactly what codec
        configuration will be best for the media sender to use, given that the
        sender needs to take multiple aspects into account, including
        implementation limitations in the actual encoder. A value acceptable
        to both sender and receiver is more likely to be found if the
        receiver can indicate an acceptable range instead of just a single
        value.</t>

        <t>When an RTP mixer distributes streams to multiple receivers with
        different media quality requirements, it is sometimes possible to
        avoid targeted transcoding for every single receiver. That can be
        accomplished if the media sender has the ability to produce multiple
        media versions, for example through scalable encoding or simulcast.
        Thus, there is a need to both address specific media versions and
        describe the fact that multiple media versions with different
        configurations should be used.</t>
      </section>

      <section title="Legacy Methods">
        <section title="Relation to SDP">
          <t>The Session Description Protocol (SDP) <xref target="RFC4566"/>
          is commonly used to negotiate and configure codecs, as well as to
          establish RTP/RTCP session parameters during session establishment
          and ongoing sessions, e.g. by using it in conjunction with <xref
          target="RFC3261">SIP</xref> and <xref target="RFC3264">SDP
          Offer/Answer</xref>.</t>

          <t>As described in <xref target="sec-problem"/>, many of the
          underlying reasons which make media receivers desire certain codec
          encoding properties are highly dynamic in nature and using SIP/SDP
          to renegotiate the session will in many cases be too slow to be
          useful. SIP messages containing an SDP may become quite large for
          sessions containing many media types, and since there is no defined
          way to send a partial SDP, even very small changes require sending
          the entire SDP. Most of the properties currently defined in SDP
          apply to all media streams in the same RTP session, or at least to
          those sharing the same RTP Payload Type, rather than to one
          specific media stream (e.g. "a=fmtp:98
          profile-level-id=42C00C").</t>

          <t>The mechanism in this specification does not replace SDP, or the
          SDP Offer/Answer mechanism. It is expected that SDP is used in order
          to negotiate and configure boundary values for codec properties, and
          COP can then be used to communicate specific values within those
          boundaries, as long as there is no impact on the values negotiated
          using SDP. It is possible to establish communication sessions even
          if one or more endpoints do not support COP.</t>
        </section>

        <section title="Relation to RTCP">
          <t>As discussed in CCM, regular RTCP reporting or <xref
          target="RFC3611">extended reports</xref> can to some extent be used
          to reconfigure an encoder, but the reported measures seldom map
          directly back to encoding properties and they typically cannot
          express an unwanted situation in terms of encoding properties and
          what the receiver would like to receive instead. Communicating codec
          properties indirectly as a set of network properties will require
          interpretation by both sender and receiver and will thus risk
          misinterpretations and ambiguity. Since it is likely that a decoder
          is able to identify unwanted characteristics of the media stream in
          terms of encoding properties, the most straightforward approach is
          to convey those properties directly to the encoder.</t>

          <t>Responsive techniques to control encoding are already available,
          e.g. <xref target="RFC5104">Codec Control Messages (CCM)</xref>.
          Although CCM is highly applicable, the encoding control it offers
          is not explicit enough, both in terms of the number of parameters
          that can be controlled and in terms of expressing that those
          parameters may be inter-related, alternative, or both.</t>

          <t>Some codecs define codec-specific methods to enable receiver
          control of some encoding aspects, but it should be beneficial for
          interoperability to use codec-agnostic signaling instead.</t>
        </section>
      </section>
    </section>

    <section anchor="sec-usecases" title="Use Cases for COP">
      <t>This section discusses a number of use cases for codec operation
      points.</t>

      <!--Editor's note:  Consider if each Topology scenario below should list 1. 
Media sender changing an existing operation point, 2. Media receiver requesting to change 
an existing operation point, 3. Media sender adding an operation point (making the 
stream more scalable), 4. Media receiver requesting to add an operation point, 5. 
Media sender removing an operation point (reducing scalability), and 6. Media receiver 
requesting to remove an operation point. Make sure to describe that any combination of 
those six may happen in a single step, and why that is the case. The other, maybe 
better, alternative is to use the six points above as top level and have Topologies as 
sub-sections. -->

      <t/>

      <section title="Point to Point">
        <t>This set of use cases focuses on communication that is directly
        point to point between a media sender and a receiver. There is no need
        for further forwarding of the media streams. Thus, the goal should be
        to produce a media stream and transport it to the media receiver,
        where it is consumed in the way that is best for the application.
        Thanks to this one-to-one mapping between encoder and decoder, great
        flexibility exists to produce a media stream tailored to the
        receiver's needs, given the constraints imposed by the media sender,
        the transport network and the receiver.</t>

        <t>Some constraints are static (and thus suitable for session
        configuration signalling), but others are highly dynamic, and it is
        desirable to adapt to them during the session:<list style="hanging">
            <t hangText="Video Resolution in GUI:">In a video communication
            application, including WebRTC-based ones, the window where the
            media sender's media stream is presented may change, for example
            due to the user modifying the size of the window. It might also be
            due to other application related actions, like selecting to show a
            collaborative work space and thus reducing the area used to show
            the remote video. In both of these cases it is the receiver side
            that knows how big the actual screen area is and what the most
            suitable resolution would be. It appears suitable to let the
            receiver request the media sender to send a media stream
            conforming to the displayed video size.</t>

            <t hangText="Network Bit-rate Limitations:">If the receiver
            discovers a network bandwidth limitation, it can choose to meet it
            by requesting media stream bit-rate limitations. Especially in
            cases where a media sender provides multiple media streams, the
            relative distribution of available bit-rate can help the
            application to provide the most suitable experience in a
            constrained situation.</t>

            <t hangText="CPU Constraint:">A media receiver may become
            constrained in the amount of available processing resources. This
            may occur in the middle of a session for example due to the user
            selecting a power saving mode, or starting additional applications
            requiring resources. When this occurs, the receiving application
            can select which codec parameters to constrain, and by how much,
            to best suit its needs, for example by deciding that a lower
            frame rate is preferable to a lower resolution.</t>
          </list></t>
      </section>

      <section anchor="sec-recv-to-mixer" title="Media Receiver to RTP Mixer">
        <t>This section considers a multiparty session with a centralized
        media intermediary, like an RTP mixer, where the media receiver uses
        COP to affect the delivered media.</t>

        <figure align="center" anchor="fig-receiver-cop"
                title="Receiver (B) using COP to adapt a media stream">
          <artwork><![CDATA[            +------------+        +---+
            |            |--RTP-->| B |
            |            |<--COP--|   |
            |            |        +---+
            |            | 
 +---+      |            |        +---+
 | A |-RTP->|   Mixer    |--RTP-->| C |
 +---+      |            |        +---+
            |            |
            |            |        +---+
            |            |--RTP-->| D |
            +------------+        +---+]]></artwork>
        </figure>

        <t>In <xref target="fig-receiver-cop"/>, we focus on the possible
        usages of COP by a media receiver, like B. Here the functional role
        of the intermediary becomes important <xref
        target="RFC5117">(Topo-Mixer)</xref>. An RTP mixer uses its own
        SSRC(s) to channel selected media streams to B from other participants
        like A. If the intermediary is instead a translator, receiver B sees
        A's SSRC(s) directly, instead of A possibly showing up as a CSRC. This
        section focuses on the mixer case. The RTP translator case is further
        discussed in <xref target="sec-multicast-usage"/>.</t>

        <t>The RTP mixer's usage of its own SSRC allows mixer-to-receiver
        media flows to be associated with a role or purpose in the application
        rather than with a given media source. Assuming that the set of
        available stream roles is connected to the specific use case or
        application, it is likely that the set of stream roles (for example
        most active speaker) provided by a mixer will change less often than
        the original media source that represents a given role. It is
        further assumed that the desirable media characteristics related to a
        specific role will be fairly constant. To minimize the amount of
        signaling needed to modify stream characteristics, it could thus be
        appropriate to let a stream represent a role rather than limiting it
        to represent the original source. When there are multiple RTP
        streams from the mixer to a receiver, the receiver can use COP to
        request an operation point that better suits the receiver's needs for
        each particular stream (role). COP also allows the receiver to select
        its desired trade-off in properties and quality between multiple
        delivered media streams.</t>

        <t>There exist different reasons why B would need to indicate changes
        in its capabilities to receive a particular media stream:<list
            style="hanging">
            <t hangText="Network Path:">The receiver detects changes in the
            network that will, in the medium to long term, change the maximum
            bit-rate that can be supported.</t>

            <t hangText="Bandwidth Trade-off:">An application receiving
            multiple media streams wants to change the relative bit-rate
            trade-off between the streams.</t>

            <t hangText="Presentation or GUI Changes:">A change in the
            presentation or graphical user interface (GUI) on the receiving
            side results in other requirements or needs on the media streams.
            For example, if the application window is resized by the user, the
            amount of screen real estate available to present the different
            video elements changes. To optimize the video quality in relation
            to bit-rate, the receiver indicates the new preferred video
            resolution.</t>
          </list></t>

        <t>In all the above cases the receiver sends a COP request to the
        mixer for new codec operation points on mixer-controlled media
        stream(s). It then becomes the mixer's responsibility to determine if
        and how the requested COPs can be supported, for example by requesting
        new operation points from the media source as discussed in <xref
        target="sec-mixer-to-sender"/>. When the mixer selects another media
        source to deliver in a media stream, it may have to update the
        receiver on the properties of the operation point.</t>
      </section>

      <section anchor="sec-mixer-to-sender" title="RTP Mixer to Media Sender">
        <t>This section looks at the usage of COP in multiparty sessions where
        a centralized media intermediary, like an RTP mixer, selects and
        requests the tailored media stream or streams that a media sender
        delivers to the intermediary for further forwarding or manipulation.
        This usage can be simplified to the media streams from one media
        sender (A), which are currently being delivered to multiple receivers
        (B-D) as depicted in <xref target="fig-mixer-cop"/>.</t>

        <figure align="center" anchor="fig-mixer-cop"
                title="Mixer using COP to adapt media streams to multiple receivers">
          <artwork><![CDATA[            +------------+        +---+
            |            |--RTP-->| B |
            |            |        +---+
 +---+      |            |
 | A |<-COP-|            |        +---+
 |   |-RTP->|   Mixer    |--RTP-->| C |
 +---+      |            |        +---+
            |            |
            |            |        +---+
            |            |--RTP-->| D |
            +------------+        +---+]]></artwork>
        </figure>

        <t>The media paths from the mixer to B, C and D are different and thus
        the available resources may vary between them. In addition B, C and D
        may have different capabilities when it comes to handling media
        streams. These limitations can be learned by the mixer through session
        configuration signalling, media transmission feedback (e.g. RTCP), or
        usage of COP by the receivers (see <xref
        target="sec-recv-to-mixer"/>). Limitations are also expected to be
        updated during the session lifetime.</t>

        <t>The media sender (A) has certain capabilities and what is possible
        to do will depend on A's capabilities and what has been configured
        between A and the mixer. Let's consider different capabilities of A
        and how they influence the usage of COP to affect the media stream(s)
        delivered to the mixer. <list style="hanging">
            <t hangText="Single Media Encoding:">If A can only provide a
            single media encoding of a particular media source, the mixer has
            to make a choice on what property it would like to request for
            that media stream. The most basic choice is to request the lowest
            common denominator across the receiver population. If the mixer
            has media transcoding capabilities, it could instead request a
            higher quality operation point for the media encoding and then
            transcode for a few receivers. That enables a higher quality for
            several receivers while still being able to serve the endpoints
            with the least capabilities. In these cases the
            mixer has to send COP requests that indicate only a single
            operation point with parameters matching the restrictions in the
            best possible way.</t>

            <t hangText="Scalable Media Encoding:">If A is capable of
            producing a scalable media stream encoding, the mixer can request
            multiple operation points for the same media stream. For example,
            if A is capable of producing three different operation points, the
            mixer in the above <xref target="fig-mixer-cop"/> would be able to
            request scalability layers that match the capabilities of all
            three receivers B, C and D. If several receivers have similar
            capabilities, the mixer may choose to request fewer operation
            points. In this case, unlike in the single media encoding case,
            the mixer must determine which packets or parts of packets to send to
            each receiver based on their capabilities. This requires that the
            mixer is capable of identifying in the media stream which
            scalability layer matches a requested operation point. Thus, it is
            desirable that the media sender can indicate to the mixer which
            layer matches a given operation point.</t>

            <t hangText="Simulcast Media:">If A and the mixer have negotiated
            the usage of simulcasted media encoding of the media source, then
            the mixer can adopt several operation points to best suit the
            receivers, just like for scalable encoding. When simulcasting, the
            mixer will however have to send one COP request per media stream
            it actually wants to affect. It is necessary to ensure that
            configuration changes over multiple media streams from the same
            media source take place. Compared to scalable media, the mixer
            does not need to strip away layers to match a particular
            operation point but can forward entirely self-contained media
            streams.</t>
          </list></t>
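
        <t>As a rough illustration of the cases above, the following Python
        sketch shows one way a mixer might derive the operation points to
        request from A. It is not part of the COP mechanism itself. It assumes
        that each receiver's limitation can be reduced to a single maximum
        bit-rate in kbit/s, and all function names and the selection policy
        (always keep the lowest limit, then fill the remaining slots with the
        highest limits) are hypothetical choices made only for this
        example.</t>

        <figure align="left">
          <artwork><![CDATA[
```python
from typing import Dict, List


def lowest_common_denominator(limits: Dict[str, int]) -> int:
    """Single encoding: one operation point every receiver can handle."""
    return min(limits.values())


def scalable_operation_points(limits: Dict[str, int],
                              max_points: int) -> List[int]:
    """Scalable or simulcast: at most max_points operation points.

    The lowest limit is always kept so that every receiver can be
    served; remaining slots go to the highest distinct limits.
    (Hypothetical policy, chosen only for illustration.)
    """
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    distinct = sorted(set(limits.values()))
    if len(distinct) <= max_points:
        return distinct
    if max_points == 1:
        return distinct[:1]
    return distinct[:1] + distinct[-(max_points - 1):]


def assign_receivers(limits: Dict[str, int],
                     points: List[int]) -> Dict[str, int]:
    """Map each receiver to the highest point within its limit."""
    return {
        receiver: max(p for p in points if p <= limit)
        for receiver, limit in limits.items()
    }


# Example receiver population from the figure (limits are made up).
receiver_limits = {"B": 2000, "C": 500, "D": 1200}
requested = scalable_operation_points(receiver_limits, max_points=2)
forwarding = assign_receivers(receiver_limits, requested)
```
]]></artwork>
        </figure>

        <t>With only two operation points available, receiver D in this
        example is served from the lowest point, which illustrates why the
        mixer may prefer to request as many operation points as A can
        produce.</t>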

        <t>The use of COP as described above can be triggered by a multitude
        of reasons. We will here discuss some of them. We already mentioned
        that bit-rate adaptation (congestion control) on the mixer to receiver
        path can indicate a need to change an operation point. Another reason
        is that a new session participant joins with its own receiver
        capabilities (related to decoding or other hardware as well as to the
        network path), thus potentially changing the optimal set of operation
        points. There also exist a number of different cases where the desired
        application behavior results in changes in desired operation points,
        like change of active speakers, reconfiguration of the display layout,
        etc.</t>

        <t>It is important to remember that <xref target="fig-mixer-cop"/>
        only presents the view of a single media sender. In most communication
        sessions there are multiple media senders, and the mixer will need to
        take the combination of media streams from multiple media senders into
        account when choosing what is to be sent to a given receiver. Thus
        changes at one media sender can result in related changes of the
        operation points at the other media senders.</t>
      </section>

      <section anchor="sec-multicast-usage"
               title="Media Receiver in Multicast or with RTP Transport Translator">
        <t>This section covers the usage of COP in multicast transported RTP
        sessions, as well as when <xref target="RFC5117">transport translators
        (Topo-Translator)</xref> are used. Transport translators can be used
        to emulate Any-Source Multicast (ASM) over unicast. Multicast usages
        also include <xref target="RFC4607">Source Specific Multicast
        (SSM)</xref>, which according to <xref target="RFC5760">"RTP Control
        Protocol (RTCP) Extensions for Single-Source Multicast Sessions with
        Unicast Feedback"</xref> has two main modes: simple mode, and summary
        feedback mode. SSM modes affect the usage of COP functionalities.</t>

        <figure align="center" anchor="fig-translator"
                title="RTP translator topology">
          <artwork><![CDATA[+---+      +------------+      +---+
| A |<---->|            |<---->| B |
+---+      |            |      +---+
           | Translator |
+---+      |            |      +---+
| C |<---->|            |<---->| D |
+---+      +------------+      +---+]]></artwork>
        </figure>

        <t>A <xref target="RFC5117">transport translator</xref>, whose main
        purpose is to forward any incoming packets to all the other session
        participants, emulates an ASM session (see <xref
        target="fig-translator"/>). As anyone can send to all others in both
        cases, large scale sessions with many participants have some
        properties that require extra consideration.</t>

        <figure align="center" anchor="fig-ssm-session"
                title="SSM based RTP session">
          <artwork><![CDATA[ +-----+  +-----+          +-----+
 | MS1 |  | MS2 |   ....   | MSm |
 +-----+  +-----+          +-----+
    ^        ^                ^
    |        |                |
    V        V                V
+---------------------------------+
|       Distribution Source       |
+--------+                        |
| FT Agg |                        |
+--------+------------------------+
  ^ ^           |
  :  .          |
  :   +...................+
  :             |          .
  :            / \          .
+------+      /   \       +-----+
| FT1  |<----+     +----->| FT2 |
+------+    /       \     +-----+
  ^  ^     /         \     ^  ^
  :  :    /           \    :  :
  :  :   /             \   :  :
  :  :  /               \  :  :
  :   ./\               /\.   :
  :   /. \             / .\   :
  :  V  . V           V .  V  :
 +----+ +----+     +----+ +----+
 | R1 | | R2 | ... |Rn-1| | Rn |
 +----+ +----+     +----+ +----+]]></artwork>
        </figure>

        <t>In the above <xref target="fig-ssm-session"/>, the media senders
        (MS1 ... MSm) send their media streams and RTCP traffic to the
        distribution source (DS). The DS forwards the RTP and RTCP traffic
        from the media senders to the SSM group. Using the <xref
        target="RFC5760">RTCP extension for unicast RTCP feedback</xref>, the
        receivers (R1...Rn) send their RTCP traffic to their configured
        feedback target. This sample session has two feedback targets to scale
        with the number of receivers. RTCP messages that need to go to a media
        sender are forwarded to the FT aggregator part of the distribution
        source for further forwarding over the unicast paths between the
        distribution source and the media senders. The feedback target and the
        feedback aggregator also forward all RTCP messages from receivers in
        simple mode, and aggregate them in summary mode. Some RTCP messages from
        a receiver may still have to be forwarded over the SSM group.</t>

        <t>COP needs to support some reasonable functionality over the
        different multiparty topologies described above and it is important
        that COP does not cause significant issues in any of the
        environments.</t>

        <t>In the basic case, where only a single multicast group exists,
        there is a well-known problem associated with adapting content and
        bit-rate to the receiver population. The more receivers, the larger
        the potential for non-matching requirements in requests from the
        different receivers. One strategy for meeting this is to use the
        lowest common denominator among the requests from the receiver
        population. This normally results in sub-optimal quality for a
        significant part of the session participants, the main benefit being
        that all participants will be able to receive some content.</t>

        <t>Because of the above limitations of operation within a single
        group, the usage of COP in larger groups becomes difficult unless the
        parameters that can be adopted and affected by COP requests are such
        that only a limited set of participants is expected to request them,
        and the impact on the others is limited or acceptable. The authors
        therefore expect the usage of COP in large groups to be limited and
        this specification focuses on operation in smaller groups. However, as
        it is not possible to define, in the generic case, the threshold at
        which a group changes from being small to being too large to work well
        with COP, it is important that COP can operate safely in a large
        group, although the possibilities to satisfy the requests may be
        severely limited.</t>

        <t>There also exist use cases for COP where the media application uses
        multiple multicast groups to enable multiple operation points and
        allows each receiver to join the multicast groups that suit the
        participant's capabilities. An example of such usage would be Scalable
        Video Coding (SVC) using the Multi-Session Transport (MST) mode of the
        <xref target="RFC6190">SVC RTP payload format</xref>. The SVC MST RTP
        streams that are sent in each group can still contain multiple
        scalability layers. One could combine coarse-grained control on the
        operation points by having the receiver join a particular session with
        a more fine-grained control using COP to adjust the included
        scalability layers to the receiver's needs, such as lower CPU
        load.</t>
      </section>
    </section>

    <section title="Requirements">
      <t>The solution outlined in this specification should fulfill the
      following requirements:<list style="hanging">
          <t hangText="REQ-1:">Enable dynamic control of possibly
          inter-related codec properties during an ongoing media session.</t>

          <t hangText="REQ-2:">Be media type agnostic, to the furthest extent
          possible, and at least cover audio and video media.</t>

          <t hangText="REQ-3:">Be codec agnostic (within the same media type),
          to the furthest extent possible.</t>

          <t hangText="REQ-4:">Work with different media transmission types,
          i.e. single-stream, simulcast, single-stream scalable, and
          multi-stream scalable transmission.</t>

          <t hangText="REQ-5:">Work with un-encrypted as well as encrypted
          media.</t>

          <t hangText="REQ-6:">Be extensible, making it simple to add control
          and description of new codec properties.</t>

          <t hangText="REQ-7:">Complement rather than conflict with other
          codec configuration methods such as other RTCP based techniques and
          SDP.</t>

          <t hangText="REQ-8:">Support configurable parameters that are
          directly visible in the media stream as well as those that are not
          visible in the media stream.</t>
        </list>In addition, <xref target="RFC5968">Guidelines for Extending
      RTCP</xref> should be followed.</t>
    </section>

    <section title="Solution Overview">
      <t>The mechanism described in this specification especially targets
      heterogeneous multiparty scenarios where different endpoints require
      differently encoded media from the same source, but its use in other
      situations is not precluded. In fact, point-to-point scenarios are
      considered to be of equal importance but not more demanding than the
      multiparty case. In the targeted scenario, the media stream from one
      encoder is sent to multiple decoders. Hence, the encoder may have to
      provide an encoding with multiple operation points, suitable for the
      receivers. This is only possible with so-called scalable codecs, but
      some codecs may have inherent scalability features without being
      generally considered as scalable (e.g. H.264/AVC temporal scalability
      through non-reference frames). Multiparty services often involve a media
      mixer <xref target="RFC5117">(Topo-Mixer)</xref> as a central network
      node.</t>

      <figure align="center" anchor="fig-target-topology"
              title="RTP mixer topology">
        <artwork><![CDATA[       +---+
       | S |
       +---+
         |
         v
     +-------+
     | Mixer |
     +-------+
   /     |     \
  v      v      v
+---+  +---+  +---+
| A |  | B |  | C |
+---+  +---+  +---+]]></artwork>
      </figure>

      <t>The solution defined in this specification is targeted for automatic
      control of codec parameters, not as a direct result of user interaction,
      although the automatic control can in turn be triggered by user
      interaction. It can be used during an active session to quickly adapt to
      changes in media receiver available bandwidth and/or preferences for one
      or more codec properties, while still conforming to the session
      configuration, like SDP offer/answer negotiated minimum or maximum
      limits (depending on individual SDP property semantics). Some codec
      property changes will also motivate renegotiating the SDP, but this
      specification is intended to cover only changes that lie within the
      SDP negotiated set and thus do not impact the SDP.</t>

      <t>Three message types are defined to support the solution: a request, a
      notification, and a status report:<list style="hanging">
          <t hangText="Request (COPR):">A media receiver requesting a media
          sender to adjust one or more of its media encoding parameters for a
          media stream. The request is normally based on a specific set of
          media encoding parameters that the media sender has explicitly
          notified the media receiver about in a notification.</t>

          <t hangText="Notification (COPN):">A media sender notifying a media
          receiver of the currently used media encoding parameters for a media
          stream. The notification is initiated by the media sender, typically
          whenever the media encoding parameters change significantly from
          what was previously used. The reason for the change can either be
          local to the media sender (user, endpoint or network), or it can be
          the result of one or more requests from remote endpoints.</t>

          <t hangText="Status Report (COPS):">A media sender reporting to a
          request sender (media receiver) on request reception status,
          indicating which specific request from the media receiver was
          received and considered in setting the current media encoding
          parameters, and identifying the media stream that is considered to
          fulfill the request. The status report can also indicate various
          error conditions, such as reception of invalid or failing
          requests.</t>
        </list></t>

      <t>More details about the individual messages are found in the following
      sub-sections.</t>

      <section title="Message Structure">
        <t>A COP message is sent from an RTP session participant in its role
        either as media receiver or media sender. Each message can contain one
        or more message items of one or more message types, all originating
        from a single media source.</t>

        <t>The individual message items each relate only to a single operation
        point, describing part of an atomic notification or request.</t>

        <t>The general structure is outlined below:</t>

        <figure align="center" anchor="fig-message-structure"
                title="COP message structure">
          <artwork><![CDATA[+--------------------------------------+
| AVPF PSFB FMT="COP"                  |
| SSRC of Packet Sender                |
| SSRC of Media Source                 |
| +----------------------------------+ |
| | COP Message Item 0               | |
| +----------------------------------+ |
| | (Codec Configuration Parameters) | |
| +----------------------------------+ |
| +----------------------------------+ |
| | COP Message Item 1               | |
| +----------------------------------+ |
| | (Codec Configuration Parameters) | |
| +----------------------------------+ |
| ...                                  |
+--------------------------------------+]]></artwork>
        </figure>

        <t>Note that the request is the only COP message item defined in this
        specification that is sent in the media receiver role and makes use of
        "SSRC of media source" as the targeted media stream for the request.
        Both the notification and the status report message items are sent in
        the media sender role, reporting on the message sender's own
        configuration and thus relate only to the "SSRC of packet sender",
        being agnostic to the "SSRC of media source" field.</t>

        <t>It is for example possible to co-locate COPS and COPN messages for
        the same media source in the same COP FCI. It is also possible to
        co-locate one or more COPR referring to a single "SSRC of media
        source" with one or more COPN and/or COPS relating to a single "SSRC
        of packet sender" within a single COP message.</t>

        <t>Multiple message items of the same type in the same COP message are
        used to describe a notification, status or request for a media stream
        containing multiple operation points (see <xref
        target="sec-overview-operation-point"/>).</t>

        <t>Multiple COP messages are needed to be able to refer to multiple
        different "SSRC of packet sender" and/or "SSRC of media source".</t>
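
        <t>The containment relations described in this section can be
        summarized with a small data model. The Python sketch below is only
        an illustration of the structure and does not describe the wire
        format. The class names, the field names and the parameter key used
        in the example are hypothetical. The helper method captures the rule
        that a request relates to "SSRC of media source" while notifications
        and status reports relate to "SSRC of packet sender".</t>

        <figure align="left">
          <artwork><![CDATA[
```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ItemType(Enum):
    COPR = "request"
    COPN = "notification"
    COPS = "status report"


@dataclass
class CopItem:
    """One message item, relating to a single operation point."""
    item_type: ItemType
    opid: int
    # Codec configuration parameters; the keys are placeholders.
    parameters: Dict[str, int] = field(default_factory=dict)


@dataclass
class CopMessage:
    """One COP feedback message carrying one or more items."""
    ssrc_packet_sender: int
    ssrc_media_source: int
    items: List[CopItem] = field(default_factory=list)

    def relevant_ssrc(self, item: CopItem) -> int:
        """Return the SSRC that the given item relates to."""
        if item.item_type is ItemType.COPR:
            return self.ssrc_media_source
        return self.ssrc_packet_sender


# A message co-locating a request with a notification.
message = CopMessage(
    ssrc_packet_sender=0x1111,
    ssrc_media_source=0x2222,
    items=[
        CopItem(ItemType.COPR, opid=1, parameters={"max_kbps": 500}),
        CopItem(ItemType.COPN, opid=3),
    ],
)
```
]]></artwork>
        </figure>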
      </section>

      <section title="Codec Configuration Parameter Use">
        <t>The codec configuration parameters that are applicable to a certain
        codec may be specific to the media type (audio, video, ...), and may
        also be codec specific. Some codec properties (described by codec
        configuration parameters) have to be explicitly enabled by (non-RTCP
        based) capability signaling before their use is possible or
        permitted.</t>

        <t>An endpoint implementing this specification does not need to
        support all available codec configuration parameters defined herein or
        in extensions to this specification. A certain parameter could be
        unnecessary for a certain codec or media stream, even if it is
        generally supported by the endpoint. This specification therefore
        defines capability signaling that allows a COP receiver to declare
        explicit support per parameter type on a per codec level. The set of
        codec configuration parameters that can be used for a certain media
        stream by a COP sender is thus restricted by the combination of
        applicability, capability signaling and explicit receiver parameter
        support signaling.</t>

        <t>Any codec configuration parameter that is applicable and feasible
        to use, but is not included as part of an operation point, has a
        default value. This default value is defined for each parameter type,
        but should, whenever possible, be taken from capability
        signaling. It is not necessary to use all defined parameter types in a
        media stream description. Some parameter types can, depending on media
        type or codec, either be unnecessary, or not possible to describe or
        control in detail, in which case they can be left out. This means that
        the effective value is "undefined" within the limits set by capability
        signaling (outside the scope of this specification).</t>
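
        <t>One possible reading of the default handling described above is a
        simple lookup order: a value given in the operation point wins,
        otherwise a value from capability signaling is used, and otherwise
        the default from the parameter type definition applies. The Python
        sketch below illustrates that order under the assumption that all
        three sources can be represented as dictionaries. The function name
        and the parameter names are hypothetical, and a result of None is
        used here to model an "undefined" effective value.</t>

        <figure align="left">
          <artwork><![CDATA[
```python
from typing import Dict, Optional


def effective_value(name: str,
                    operation_point: Dict[str, int],
                    capability_defaults: Dict[str, int],
                    type_defaults: Dict[str, int]) -> Optional[int]:
    """Resolve the effective value of one codec parameter.

    Lookup order (an assumption made for this sketch):
    1. explicit value in the operation point,
    2. value taken from capability signaling,
    3. default from the parameter type definition,
    4. None, modelling an "undefined" value.
    """
    if name in operation_point:
        return operation_point[name]
    if name in capability_defaults:
        return capability_defaults[name]
    return type_defaults.get(name)


# Placeholder parameter names and values, for illustration only.
op_point = {"frame_rate": 15}
from_capabilities = {"frame_rate": 30, "max_kbps": 2000}
from_type_definition = {"max_kbps": 64, "channels": 1}
```
]]></artwork>
        </figure>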
      </section>

      <section anchor="sec-overview-operation-point" title="Operation Point">
        <t>The codec configuration parameters contained in a single message
        item jointly constitute a description of an operation point for a
        specific media stream from a media sender.</t>

        <t>For the purpose of COP signaling, each operation point is
        identified with an identity number (OPID), which is scoped by the
        media sender's RTP SSRC identification, and can be chosen freely by
        the media sender. The need for this media sub-stream identification
        only appears with scalable coding or other media encoding methods that
        introduce separable and configurable sub-streams within the same SSRC.
        An OPID thus refers to such a configurable sub-stream, described by a
        set of related codec configuration parameters.</t>

        <figure align="center" anchor="fig-opid"
                title="Relation of OPID to media source, RTP session and SSRC">
          <artwork><![CDATA[                  +--RTP Session 1 ---------------------+
Media Source 1----+-+-> SSRC1 --> Sub-Stream 1 -> OPID1 |
(MIC, Camera)     |           \-> Sub-Stream 2 -> OPID2 |
                  |                                     |
Media Source 2-+--+---> SSRC2 --> Sub-Stream 1 -> OPID3 |
               |  |           \-> Sub-Stream 2 -> OPID4 |
               |  |           \-> Sub-Stream 3 -> OPID5 |
               |  +-------------------------------------+
               | 
               |  +--RTP Session 2 ---------------------+
               +--+---> SSRC3 --> Sub-Stream 1 -> OPID6 |
                  |           \-> Sub-Stream 2 -> OPID7 |
                  +-------------------------------------+]]></artwork>
        </figure>

        <t><xref target="fig-opid"/> depicts the possible relations between
        media sources, RTP sessions, RTP streams (SSRCs), RTP sub-streams, and
        the OPID.</t>

        <t>For example, a single video camera may be encoded using SVC for a
        combined SST and MST transmission configuration. In that case a subset
        of scalability layers is sent as SST in the first RTP session using
        SSRC2. Another set of scalability layers is transported in the second
        RTP session as another SST using SSRC3. The RTP packet stream from
        each SSRC can thus contain several sub-streams, each identified with
        its own OPID. As a result, a single media source is present in two RTP
        sessions, using two different SSRCs (2 and 3) containing a total of
        five sub-streams (OPID 3 to 7).</t>

        <t>Since an operation point is expected to change over time, as a
        result of media receiver <xref
        target="sec-overview-request">requests</xref>, local <xref
        target="sec-overview-notification">media sender
        considerations</xref>, or both, the operation point (OPID) is
        versioned. The version is scoped by SSRC and OPID.</t>

        <t>It is expected that all encoders dividing a media stream into
        sub-streams will include some means to identify those sub-streams in
        the media stream. However, it is also expected that such
        identification is in general codec specific. There is thus a need to
        map the codec agnostic COP OPID identification to codec specific
        identification, and this specification therefore includes a method for
        such <xref target="sec-codec-sub-stream-id">mapping</xref>.</t>
      </section>

      <section anchor="sec-overview-request" title="Request">
        <t>The request is sent by a media receiver, which can be either an
        endpoint or a middle node such as an RTP mixer. The receiver of the
        request may similarly be either the original media sender or an RTP
        mixer. Included in the request is a description of the desired codec
        configuration for a specific media (sub-)stream. The parameter values
        communicated in a <xref
        target="sec-overview-notification">notification</xref> of that
        (sub-)stream are taken as a starting point when deciding what
        parameters and parameter values to choose for the request, and only
        parameters with changed values need to be included in the request. The
        media receiver can of course use other sources of information when
        choosing parameters and values, for example observation of the
        received media stream and capability signaling.</t>

        <t>It is not required to receive a notification beforehand to be able
        to create a meaningful request. The request can include a set of
        changed properties for existing streams, but it can also request the
        addition or removal of one or more media sub-streams having certain
        properties, in which case there will be no notification to base the
        request on. A media receiver may also want to send a request prior to
        having received any notifications for existing streams, and can then
        base the request on other information, for example observation of
        the media stream or information from the capability signaling. In
        case there is no existing stream and OPID to refer to in the request,
        a "provisional" OPID MUST be chosen in the request, which will have to
        be mapped back to an existing (sub-)stream and "real" OPID through
        <xref target="sec-codec-sub-stream-id">methods defined in this
        specification</xref>.</t>

        <t>The media sender receiving a specific request is not required to
        reconfigure the encoder accordingly, even if it should try to do so.
        The media sender is allowed to take other (previous or concurrent)
        requests and any local considerations into account, possibly modifying
        some of the parameter values, or even to reject the request completely
        if it is not seen as feasible. It is thus not possible for a media
        receiver to unambiguously determine from the media stream or even from a
        notification whether the media sender received the request or whether the
        request was lost and needs to be resent.</t>

        <t>A request should be based on a notification, but there may be
        situations where a request is sent approximately simultaneously with a
        new notification for the same stream. In that case, there is a risk
        that the request is based on the wrong set of codec properties
        compared to the new notification. It is therefore necessary to have
        the set of codec properties version controlled, identified by an OPID.
        If a notification announces a specific version of the operation point,
        where the version is updated every time it is changed, the request can
        refer to that specific version and any mis-reference can be clearly
        identified and resolved. In addition, it allows for easy
        identification of repeated notifications and requests by checking the
        operation point identification and the version, without the need to
        parse through all codec properties for changes.</t>
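
        <t>The version handling described above could be tracked at the media
        sender with a table keyed by SSRC and OPID. The following Python
        sketch is a minimal illustration under stated assumptions: versions
        are plain integers starting at 0, wrap-around of the version field is
        ignored, and the class name, method names and result labels are
        hypothetical. How a detected mismatch is resolved is not shown.</t>

        <figure align="left">
          <artwork><![CDATA[
```python
from typing import Dict, Tuple


class OperationPointVersions:
    """Tracks the current version per (SSRC, OPID) at a media sender."""

    def __init__(self) -> None:
        self._versions: Dict[Tuple[int, int], int] = {}

    def changed(self, ssrc: int, opid: int) -> int:
        """Record a change of the operation point.

        Returns the new version, which would be announced in the
        next notification. The first version is 0 (an assumption).
        """
        key = (ssrc, opid)
        self._versions[key] = self._versions.get(key, -1) + 1
        return self._versions[key]

    def classify_request(self, ssrc: int, opid: int,
                         version: int) -> str:
        """Compare the version a request refers to with the current one."""
        current = self._versions.get((ssrc, opid))
        if current is None:
            return "unknown"      # no such operation point announced
        if version == current:
            return "current"      # based on the latest notification
        return "mismatch"         # based on another version


table = OperationPointVersions()
first = table.changed(ssrc=0x2222, opid=3)    # initial configuration
second = table.changed(ssrc=0x2222, opid=3)   # configuration changed
```
]]></artwork>
        </figure>

        <t>A request that crossed the second notification on the wire would
        still refer to the first version and can be recognized as such
        without comparing any codec properties.</t>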
      </section>

      <section anchor="sec-overview-notification" title="Notification">
        <t>The notification is sent by a media sender and describes a media
        stream or sub-stream in terms of a defined, finite set of codec
        properties. That same set of codec properties can also be used in a
        <xref target="sec-overview-request">request</xref>. It is important
        that the media receiver knows the notification and the set of defined
        properties, since it is rarely possible to see from the media
        stream itself what controllable properties were used to generate the
        stream. The set of codec properties and their values used to describe
        a certain media stream at a certain point in time are henceforth
        called a codec configuration. Each operation point in this codec
        configuration is implemented using an RTP payload type, defined by
        capability signaling outside the scope of this specification.</t>

        <t>It must be possible for a media sender to change the codec
        configuration not only based on requests from media receivers, but
        also based on local limitations, considerations, or user actions. This
        implies that the notification can be sent standalone and not only as a
        response to a request (compare <xref target="RFC5104">TMMBR and
        TMMBN</xref>). So that media receivers do not have to guess what codec
        configuration is used, a media sender should always send a
        notification when the codec configuration for a stream changes. Loss
        of a notification message should not be critical since a media
        receiver could either fall back to infer the approximate codec
        configuration from the media stream itself, or simply wait with a
        request until the next notification is sent.</t>

        <t>A notification can potentially contain a large number of codec
        properties. However, parameters that are not enabled by codec and COP
        capability signaling, or inherently are not part of the used codec
        will not be included. The notification only describes the currently
        used codec configuration, and each parameter of an operation point
        will be described by a single value. To further limit the number of
        properties to be sent, it is possible to rely on parameter defaults
        (listed by individual parameter type definitions) whenever those
        values are acceptable.</t>

        <t>The media receiver may want to take local action at the time when
        the codec configuration in the media stream changes. Using the same
        reasoning as above, this may not be possible to see from the media
        stream itself. This functionality is explicitly enabled by including
        the RTP time stamp in the notification, where the time stamp describes
        a time (possibly in the future) when the codec configuration is
        (estimated to be) effective.</t>

        <t>It is not required that a media sender sends notifications for all
        media streams or sub-streams. However, the non-announced streams or
        sub-streams will then not be accessible to media receiver <xref
        target="sec-overview-request">control</xref>. Any media or transport
        resources occupied by those non-announced streams (in COP terms) must
        be excluded from the total amount of available resources when deciding
        feasible parameter value ranges for the announced streams.</t>
      </section>

      <section anchor="sec-overview-status" title="Status Report">
        <t>The status report is sent by a media sender and is needed to
        confirm reception of a request OPID to avoid unnecessary
        retransmission of requests. Loss of a status report will likely
        trigger a request retransmission, except when the request sender can
        infer from the media stream or a notification that the stream is now
        acceptable.</t>

        <t>The status report is not a required acknowledgement of every
        request, but instead reports on the last received request, identified
        by a request sequence number in addition to the OPID. This decoupling
        of requests and status reports reduces the number of status reports
        needed when requests are updated frequently and/or when resources to
        send status reports are scarce.</t>

        <t>If a request is somehow not acceptable to a media sender, the
        status report can also indicate failure and a reason for failure.</t>

        <t>In case the OPID in the request is a <xref
        target="sec-overview-request">"provisional" OPID</xref>, the status
        report responds with that exact OPID, but also includes a reference to
        a "real" media (sub-)stream identification or OPID that the media
        sender considers appropriate for the request.</t>

        <t>No description of any codec configuration is included in a status
        report, even if the corresponding request was successful. The codec
        configuration is only carried in the <xref
        target="sec-overview-notification">notification</xref> message.
        Multiple status reports targeted for multiple request senders can
        through media (sub-)stream identification and OPID point to the same
        notification message, reducing the need to repeat applicable codec
        configuration parameters with every accepted request.</t>
      </section>

      <section title="Adding and Removing Operation Points">
        <t>A media sender can unilaterally create a new operation point by
        simply selecting a free OPID and using COPN to announce it.</t>

        <t>To remove an operation point, the media sender simply stops
        announcing it in COPN. This procedure can be used both for entire
        media streams containing a single operation point and to add/remove
        sub-streams in media streams containing multiple operation points.</t>

        <t>The media receiver can request a new operation point to be created
        by using a COPR with an unused identifier and by setting a flag to
        indicate that this requests a new OPID. The media sender then decides
        if it honors the request or not, and announces the new OPID as
        described above.</t>

        <t>The media receiver can indicate that it is no longer interested in
        receiving an operation point corresponding to a media sub-stream by
        not including any COPR message item for it in a single COP message.
        The media receiver can indicate a wish to continue to receive an
        unmodified operation point using a COPR without any codec properties
        (no change).</t>
      </section>
    </section>

    <section title="Codec Control Message Extension">
      <t>This specification specifies a new feedback message, COP, for codec
      control of real-time media, as an extension to the <xref
      target="RFC4585">AVPF</xref> and <xref target="RFC5104">CCM</xref>
      specifications. The AVPF specification outlines a mechanism for fast
      feedback messages over RTCP, which is applicable for IP based real-time
      media transport and communication services. It defines both transport
      layer and payload-specific feedback messages. This specification targets
      the payload-specific type, since a certain codec is typically described
      by a payload type.</t>

      <t>AVPF defines three and CCM defines four payload-specific feedback
      messages (PSFB). All AVPF and CCM messages are identified by means of
      the feedback message type (FMT) parameter. This specification specifies
      one additional payload-specific feedback message.</t>

      <t>One new PSFB FMT value is assigned in this specification: <list
          style="hanging">
          <t hangText="TBA1: ">Codec Operation Point (COP)</t>
        </list>This section defines the feedback message structure, message
      items and their semantics with the exception of the actual codec
      configuration parameters which are defined in the <xref
      target="sec-parameters">next section</xref>.</t>

      <section anchor="sec-cop-message" title="COP Message">
        <t>The COP message is a payload-specific AVPF CCM message identified
        by the PSFB FMT value listed above. It carries one or more COP message
        items, each containing either a request for or a description of a
        certain "operation point" (a set of codec parameters), or a request
        status indication.</t>

        <t>Not all message items make use of the "SSRC of media source" in
        the common packet header. "SSRC of media source" SHALL be set to 0 if
        no message item that makes use of it is included in the FCI.</t>
      </section>

      <section anchor="sec-fci-format" title="FCI Format">
        <t>The COP FCI MUST contain one or more codec operation point message
        items. The maximum number of COP message items in a COP message is
        limited by the <xref target="RFC4585"/> Common Packet Format 'length'
        field.</t>

        <t>The definition of the AVPF feedback message format mandates that
        the FCI part is a multiple of 32-bit words. The message items defined
        below are not necessarily 32-bit word aligned. Therefore it is
        sometimes necessary to insert one to three padding bytes at the end of
        the FCI. The number of padding bytes is determined by a receiver by
        comparing the summed lengths of the message items with the feedback
        message length field. Padding bytes MUST be set to zero (0) and
        ignored on reception.</t>

        <figure anchor="fig-rtcp-message-structure"
                title="COP RTCP Message Structure">
          <artwork><![CDATA[ 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|FMT=TBA1 |     PT=206    |          length               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  SSRC of packet sender                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  SSRC of media source                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 COP message item header #1                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 COP message item payload #1                   :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:               |          COP message item header #2           :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:               |          COP message item payload #2          :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                              ...                              :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: COP message item payload #N   |         Padding (0)           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
        </figure>
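
        <t>As a non-normative illustration of the padding rule, the following
        Python sketch computes the number of padding bytes a sender appends
        and the number a receiver finds. The function names and the
        representation of encoded message items as byte strings are
        assumptions made for this example only.</t>

        <figure>
          <artwork><![CDATA[
```python
def padding_needed(items_length):
    """Bytes of zero padding (0-3) a sender appends after the items."""
    return (-items_length) % 4


def pad_fci(items):
    """Concatenate already encoded message items and pad to 32 bits."""
    fci = b"".join(items)
    return fci + bytes(padding_needed(len(fci)))


def padding_received(fci_length, items_length):
    """Padding seen by a receiver: FCI size minus summed item sizes."""
    padding = fci_length - items_length
    if not 0 <= padding <= 3:
        raise ValueError("inconsistent COP FCI length")
    return padding
```
]]></artwork>
        </figure>

        <t>In this sketch, two items of 5 and 6 bytes give an 11-byte FCI
        body, so one zero byte is appended to reach 12 bytes.</t>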

        <section anchor="sec-item-format" title="Message Item Format">
          <t>All codec operation point message items share a common header
          format:</t>

          <figure align="center" anchor="fig-message-header"
                  title="COP message item header format">
            <artwork><![CDATA[ 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Type |      Payload Length     |     OPID      |N|   Version   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:                    (Message Item Payload)                     :]]></artwork>
          </figure>

          <t>The message header fields are:<list style="hanging">
              <t hangText="Type (3 bits):">Message item type. Three item types
              are defined in this specification, COPR, COPN and COPS, with
              values as listed in <xref target="tab-itemtype"/> below. More
              item types MAY be defined in extensions to this specification.
              Message items with a type field that has an unknown value SHALL
              be ignored by the receiver.</t>

              <t hangText="Payload Length (13 bits):">The total length in
              bytes of all data belonging to this message item that follows
              the message item header, i.e. anything following the Version
              field.</t>

              <t hangText="OPID (8 bits):">Operation point ID. Some (typically
              scalable) codecs are capable of encoding into multiple
              simultaneous operation points using the same SSRC, and each
              operation point can then be referenced by OPID. MUST be unique
              within the scope of an SSRC when N flag is not set. MUST be set
              to 0 for message items not using the field. See also <xref
              target="sec-opid"/>.</t>

              <t hangText="N (1 bit):">A "New OPID" flag, indicating that the
              OPID value is chosen arbitrarily and is not meant to refer to
              any existing operation point. The message sender SHOULD NOT use
              an already known OPID in combination with the N flag. See also
              individual message item definitions.</t>

              <t hangText="Version (7 bits):">Referencing a specific version
              of the codec configuration identified by the OPID.</t>
            </list></t>
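
          <t>The header layout above can be sketched in Python as follows.
          This is a non-normative illustration; the function names and the
          dictionary returned by the decoder are assumptions of the example,
          not part of this specification.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct


def pack_item_header(item_type, payload_length, opid, n_flag, version):
    """Encode the 32-bit COP message item header, network byte order."""
    fields = ((item_type, 3), (payload_length, 13), (opid, 8),
              (n_flag, 1), (version, 7))
    word = 0
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError("value does not fit in %d bits" % bits)
        word = (word << bits) | value
    return struct.pack("!I", word)


def unpack_item_header(data):
    """Decode the first four bytes of a COP message item."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "type": word >> 29,
        "payload_length": (word >> 16) & 0x1FFF,
        "opid": (word >> 8) & 0xFF,
        "n": (word >> 7) & 0x01,
        "version": word & 0x7F,
    }
```
]]></artwork>
          </figure>

          <t>The five field widths sum to 32 bits, which is why the encoder
          can build the header by shifting each field into a single word.</t>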
        </section>

        <section title="Message Item Types">
          <t>The message types defined in this specification are:</t>

          <texttable anchor="tab-itemtype" title="Message Item Type Values">
            <ttcol>Value</ttcol>

            <ttcol>Message Item Type</ttcol>

            <c>0</c>

            <c>Codec Operation Point Notification (COPN)</c>

            <c>1</c>

            <c>Codec Operation Point Request (COPR)</c>

            <c>2</c>

            <c>Codec Operation Point Status (COPS)</c>

            <c>3-6</c>

            <c>Unassigned</c>

            <c>7</c>

            <c>Reserved for future extensions</c>
          </texttable>

          <t>Each message type defined in this specification is described in
          detail in subsequent sections.</t>
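
          <t>A receiver has to walk the FCI item by item and skip item types
          it does not know. The following non-normative Python sketch shows
          one way to do this, assuming the header layout of <xref
          target="sec-item-format"/>. The function name and the tuple layout
          of the result are hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct

ITEM_TYPES = {0: "COPN", 1: "COPR", 2: "COPS"}


def parse_fci(fci):
    """Split a COP FCI into known message items, skipping unknown types.

    Returns a list of (type name, OPID, item payload) tuples.
    """
    items = []
    offset = 0
    # Trailing padding is at most three bytes, so a complete four-byte
    # item header can no longer start inside it.
    while len(fci) - offset >= 4:
        (word,) = struct.unpack_from("!I", fci, offset)
        end = offset + 4 + ((word >> 16) & 0x1FFF)
        if end > len(fci):
            raise ValueError("truncated COP message item")
        name = ITEM_TYPES.get(word >> 29)
        if name is not None:  # unknown item types are ignored
            opid = (word >> 8) & 0xFF
            items.append((name, opid, fci[offset + 4:end]))
        offset = end
    return items
```
]]></artwork>
          </figure>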
        </section>

        <section anchor="sec-opid" title="Operation Point Identification">
          <t>All RTP media streams belonging to the same session can by
          definition be identified by the SSRC. However, identification of any
          sub-streams contained in the same RTP media stream (SSRC) needs to
          use some other identification method, scoped by the SSRC. This is
          the case for a media stream containing more than one operation
          point, like for example <xref target="RFC6190">SVC</xref> streams
          being sent using Single Stream Transport (SST) RTP
          packetization.</t>

          <t>The encoding of and restrictions for such sub-stream (operation
          point) identification will in general be codec specific. Therefore,
          the OPID used in this specification is merely an SSRC-unique
          identification number. It is however necessary to create a mapping
          between this generic number and the codec specific sub-stream
          identification that can be found in the media stream. This mapping
          is achieved by including the <xref target="sec-id">ID
          parameter</xref> in a message item carrying a certain OPID.</t>

          <t>In <xref target="sec-codec-sub-stream-id"/>, codec specific ID
          parameter formats are defined for a few of the most common codecs
          that support scalability.</t>
        </section>
      </section>

      <section anchor="sec-copn" title="Codec Operation Point Notification">
        <t/>

        <section title="Message Format">
          <t/>

          <figure align="center" anchor="fig-copn" title="COPN format">
            <artwork><![CDATA[ 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Type |      Payload Length     |     OPID      |N|   Version   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Transition Time Stamp                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R|Payload Type |    Codec Configuration Parameters             :
+-+-+-+-+-+-+-+-+                                               :
:                                                               :]]></artwork>
          </figure>

          <t>The COPN-specific message fields are (see also <xref
          target="sec-item-format">message item format</xref>):<list
              style="hanging">
              <t hangText="Type (3 bits):">Set to 0, as listed in <xref
              target="tab-itemtype"/>.</t>

              <t hangText="OPID (8 bits):">The OPID which is described by the
              codec configuration parameters.</t>

              <t hangText="N (1 bit):">Not used by COPN and SHALL be set to 0
              by senders.</t>

              <t hangText="Version (7 bits):">Referencing a specific version
              of the codec configuration identified by the OPID. SHALL be
              increased by 1 modulo 2^7 (the field is 7 bits wide) whenever
              the used codec configuration referenced by the OPID is changed.
              A repeated message SHALL NOT increase the Version. The initial
              value SHOULD be chosen randomly.</t>

              <t hangText="Transition Time Stamp (32 bits):">The RTP Time
              Stamp value when the listed codec configuration parameters will
              be effective in the media stream, using the same time line as
              RTP packets for the referenced SSRC (media sender SSRC). The
              Time Stamp value MAY express either a time in the past or in the
              future, and need not map exactly to an actual RTP Time Stamp
              present in an RTP packet for that SSRC. The same timestamp value
              SHOULD be used for subsequent transmissions of the identical set
              of codec configuration parameters for the same OPID and
              version.</t>

              <t hangText="R (1 bit):">Reserved. MUST be set to 0 by senders
              and MUST be ignored by receivers implementing this
              specification. MAY be defined differently by extensions to this
              specification.</t>

              <t hangText="Payload Type (7 bits):">SHALL be identical to the
              RTP header Payload Type valid for the (sub-)stream described by
              this OPID.</t>

              <t
              hangText="Codec Configuration Parameters (variable length):">Contains
              zero or more TLVs carrying codec configuration parameters as
              defined in <xref target="sec-parameters">parameter
              types</xref>.</t>
            </list></t>
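
          <t>The COPN layout can be sketched in Python as follows. This is a
          non-normative illustration: the function names are hypothetical,
          the parameter TLVs are passed in as already encoded bytes, and the
          Version counter wraps at 128 so that it always fits the 7-bit
          field.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct

COPN = 0  # message item type value for COPN


def next_version(version):
    """Advance the 7-bit Version, wrapping from 127 back to 0."""
    return (version + 1) % 128


def build_copn(opid, version, transition_ts, payload_type, params=b""):
    """Encode one COPN message item; N and R are always sent as 0."""
    payload = struct.pack("!IB", transition_ts % 2**32,
                          payload_type & 0x7F) + params
    header = ((COPN << 29) | (len(payload) << 16)
              | ((opid & 0xFF) << 8) | (version & 0x7F))
    return struct.pack("!I", header) + payload
```
]]></artwork>
          </figure>

          <t>Payload Length here counts the Transition Time Stamp, the octet
          holding R and Payload Type, and any parameter TLVs, i.e. everything
          after the four-byte item header.</t>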
        </section>

        <section anchor="sec-copn-semantics" title="Semantics">
          <t>This message is used to inform the media receiver(s) about used
          codec configuration parameters at the media sender. The available
          codec parameter types that can be used to describe the codec
          configuration are defined in <xref target="sec-parameters"/>.</t>

          <t>Some codecs may have clear inband indications in the encoded
          media stream of how one or more of the codec configuration
          parameters are configured. For those codecs and codec configuration
          parameters, COPN is not strictly necessary. Still, for some codecs
          and / or for some codec configuration parameters, it is not
          unambiguously possible to see individual codec configuration
          parameter values from the encoded media stream, or even possible to
          see some codec configuration parameters at all, motivating use of
          COPN.</t>

          <t>COPN SHOULD be scheduled for transmission when it becomes known
          that there are media receivers in the RTP session that did not yet
          receive any codec configuration parameters for an active operation
          point, or whenever the effective codec configuration parameters have
          changed significantly, but MAY be scheduled for transmission at any
          time. The media sender decides what amount of change is required to
          be considered significant.</t>

          <t>The reason for a codec configuration parameter change can either
          be local to the sending terminal, for example as a result of user
          interaction or some algorithmic decision, or resulting from
          reception of one or more <xref target="sec-copr">COPR
          messages</xref>.</t>

          <t>If a media sender can no longer fulfill the established codec
          configuration parameter restrictions of an operation point that was
          previously described by a COPN, it MAY change any codec
          configuration parameter or even remove the entire operation point,
          and SHOULD then signal this at the earliest opportunity by sending
          an updated COPN to the media receiver(s).</t>

          <t>An OPID can implicitly be indicated as no longer being used by
          omitting that OPID from the set of COPN message items in the COP
          PSFB message. All OPIDs that the media sender intends to use at the
          latest time indicated by any transition timestamp value in the set
          of COPN present in the COP PSFB message, MUST be included in that
          COP message.</t>
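
          <t>A media receiver can detect implicit removal by comparing the
          OPIDs announced in consecutive COP messages from the same SSRC. A
          minimal, non-normative Python sketch, with a hypothetical function
          name:</t>

          <figure>
            <artwork><![CDATA[
```python
def diff_announcements(previous_opids, current_opids):
    """Return (added, removed) OPID sets between two COP messages."""
    previous, current = set(previous_opids), set(current_opids)
    return current - previous, previous - current
```
]]></artwork>
          </figure>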

          <t>All operation points referred to by a <xref
          target="sec-cops">COPS</xref> SHOULD also be detailed by a COPN
          message contained in the same or in a subsequent COP feedback
          message, even if the operation point did not change significantly
          from the previous COPN.</t>

          <t>Note that the OPID Version of that COPN, subsequent to COPS, will
          be equal to or larger than the Version indicated in the COPS. The
          Version difference may be larger than one (taking field wraparound
          into account) depending on the number of updated COPN sent since the
          COPR that triggered the COPS. See also description of those messages
          below.</t>
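
          <t>Comparing Versions across field wraparound can be sketched as
          follows. This non-normative Python example assumes a 7-bit Version
          and treats a Version as newer when it is fewer than 64 increments
          ahead of the reference; that half-range rule is an assumption of
          the example and is not defined by this specification.</t>

          <figure>
            <artwork><![CDATA[
```python
VERSION_MODULUS = 128  # the Version field is 7 bits wide


def version_distance(older, newer):
    """Number of Version increments from older to newer, with wrap."""
    return (newer - older) % VERSION_MODULUS


def is_same_or_newer(candidate, reference):
    """True if candidate equals reference or is up to 63 steps ahead."""
    return version_distance(reference, candidate) < VERSION_MODULUS // 2
```
]]></artwork>
          </figure>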

          <t>Note: COPN may be seen as a more explicit and elaborate version
          of the TSTN message of <xref target="RFC5104"/> and most of the
          considerations detailed there for TSTN also apply to COPN.</t>

          <section title="Parameters">
            <t>The media sender decides what codec configuration parameters to
            use in the COPN to describe an operation point. It is RECOMMENDED
            that all codec configuration parameters that were accepted as
            restrictions based on received COPR messages are included. All
            codec configuration parameters significantly more restrictive than
            implicit or explicit restrictions set by capability signaling
            (outside the scope of this specification) SHOULD also be included.
            Any codec configuration parameters that are either not applicable
            to the Payload Type or not enabled by capability signaling MUST
            NOT be included. All codec configuration parameters not covered by
            the above restrictions MAY be included.</t>

            <t>When the operation point depends on other operation
            points (such as in scalable coding), the values to use for codec
            configuration parameters MUST describe the result when all
            dependencies are utilized. For example, assume an operation point
            describing a base layer with 15 Hz framerate, and a dependent
            operation point describing an enhancement layer adding another 15
            Hz to the base layer, resulting in 30 Hz framerate when both
            layers are combined. The correct parameter value to use for that
            latter, dependent "enhancement" operation point is 30 Hz, not the
            15 Hz difference.</t>
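
            <t>The frame rate example can be written as a small calculation.
            In this non-normative Python sketch the function name and the two
            mappings are hypothetical, and the dependency chain is assumed to
            be free of cycles.</t>

            <figure>
              <artwork><![CDATA[
```python
def described_framerate(opid, own_rate, depends_on):
    """Frame rate to announce for opid, all dependencies included.

    own_rate maps OPID -> Hz contributed by that layer alone.
    depends_on maps OPID -> the OPID it depends on; base layers are
    simply absent from this mapping.
    """
    total = 0
    while opid is not None:
        total += own_rate[opid]
        opid = depends_on.get(opid)
    return total
```
]]></artwork>
            </figure>

            <t>With a base layer (OPID 0) and an enhancement layer (OPID 1)
            that each contribute 15 Hz, the value to announce for OPID 1 is
            the combined 30 Hz.</t>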

            <t>The value of a codec configuration parameter that was not
            included in a COPN message SHOULD either be inferred from other
            signaling, e.g. session setup or capability negotiation, outside
            the scope of this specification, or if such signaling is not
            available or not applicable, use the default value as defined per
            <xref target="sec-parameters">parameter type</xref>.</t>
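
            <t>The resulting lookup order can be sketched as follows. This
            non-normative Python example uses dictionaries keyed by
            hypothetical parameter names; the actual parameter types and
            their defaults are those of <xref target="sec-parameters"/>.</t>

            <figure>
              <artwork><![CDATA[
```python
def effective_value(name, copn_params, signaled_params, defaults):
    """Resolve one codec configuration parameter of an operation point.

    Lookup order: the value carried in the COPN, then a value known
    from other signaling, then the default of the parameter type.
    """
    for source in (copn_params, signaled_params, defaults):
        if name in source:
            return source[name]
    raise KeyError(name)
```
]]></artwork>
            </figure>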

            <t>An operation point describes one specific setting of codec
            parameters, and a COPN message therefore MUST NOT include the
            <xref target="sec-alt">ALT parameter type</xref> in the codec
            parameters describing the operation point.</t>
          </section>

          <section title="Relation to COPR">
            <t>To limit RTCP bandwidth and avoid bandwidth expansion, COPN is
            not mandated as response to every received <xref
            target="sec-copr">COPR</xref>.</t>

            <t>A media sender implementing this specification SHOULD take
            requested operation points from COPR messages into account for
            future encoding, but MAY decide to use other codec configuration
            parameter values than those requested, e.g. as a result of
            multiple (possibly contradicting) COPR messages from different
            media receivers, or any media sender policies, rules or
            limitations. Thus, a COPN message operation point MAY use other
            codec configuration parameters and other values than those
            requested in a COPR.</t>

            <t>The media sender SHOULD try to maintain OPIDs between COPR and
            COPN when the COPR sender suggests a new OPID value (N flag is
            set) in the COPR, but MAY use another OPID in COPN. Another OPID
            value has to be chosen, for example, when the suggested OPID
            conflicts with an already existing OPID, or when the media sender
            decides that the suggested new OPID can be fulfilled by an
            already existing OPID.</t>

            <t>Even if a COPR references an existing OPID (N flag cleared),
            the media sender may have to take other aspects than a specific
            COPR into account when choosing how many operation points to use,
            and the exact contents of those operation points. See the
            description on <xref target="sec-cops">COPS</xref> on how to
            achieve mapping between a suggested new OPID and what OPID will
            actually be used.</t>

            <t>When OPID cannot be kept the same between COPN and COPR, the
            mapping SHALL be done using identical <xref target="sec-id">ID
            parameters</xref> in the COPS and COPN resulting from the COPR.
            Further details are described in the <xref
            target="sec-cops">section on COPS</xref>.</t>

            <t>Since a COPR references a certain COPN OPID and Version, and
            COPN is sent unreliably and may be lost, COPN senders MUST keep at
            least the two last COPN Versions for each (SSRC, OPID) tuple and
            SHOULD keep at least four.</t>
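
            <t>One possible way for a COPN sender to retain recent Versions
            is a small bounded history per (SSRC, OPID) pair. The following
            Python sketch is non-normative; the class and method names are
            hypothetical, and the default depth of four follows the SHOULD
            above.</t>

            <figure>
              <artwork><![CDATA[
```python
from collections import OrderedDict, defaultdict


class CopnHistory:
    """Keeps the most recent COPN Versions per (SSRC, OPID)."""

    def __init__(self, depth=4):
        if depth < 2:
            raise ValueError("at least two Versions must be kept")
        self.depth = depth
        self._store = defaultdict(OrderedDict)

    def remember(self, ssrc, opid, version, params):
        """Record the parameters announced with this Version."""
        versions = self._store[(ssrc, opid)]
        versions[version] = params
        versions.move_to_end(version)
        while len(versions) > self.depth:
            versions.popitem(last=False)  # drop the oldest entry

    def lookup(self, ssrc, opid, version):
        """Return the stored parameters, or None if no longer kept."""
        return self._store.get((ssrc, opid), {}).get(version)
```
]]></artwork>
            </figure>

            <t>Entries are ordered by insertion rather than by numeric
            Version, so the history stays correct when the Version field
            wraps.</t>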
          </section>
        </section>

        <section title="Timing Rules">
          <t>The timing follows the rules outlined in section 3 of <xref
          target="RFC4585">AVPF</xref>. This notification message may be time
          critical and SHOULD be sent using early or immediate feedback RTCP
          timing, but MAY be sent using regular RTCP timing.</t>

          <t>A typical example when regular RTCP timing can be appropriate is
          when the sent media stream is further restricted from what was
          described by the most recent COPN, which should not cause any
          problems in the media receivers. Similarly, it is likely appropriate
          to use early or immediate timing when effective media stream
          restrictions urgently need to be removed, which may require media
          receivers to increase their resource usage.</t>
        </section>
      </section>

      <section anchor="sec-copr" title="Codec Operation Point Request">
        <t/>

        <section title="Message Format">
          <t/>

          <figure align="center" anchor="fig-copr" title="COPR format">
            <artwork><![CDATA[ 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Type |      Payload Length     |     OPID      |N|   Version   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence No   | Codec Configuration Parameters                :
+-+-+-+-+-+-+-+-+                                               :
:                                                               :]]></artwork>
          </figure>

          <t>The COPR-specific message fields are:<list style="hanging">
              <t hangText="Type (3 bits):">Set to 1, as listed in <xref
              target="tab-itemtype"/>.</t>

              <t hangText="OPID (8 bits):">The OPID this request refers to for
              an existing OPID, and an arbitrarily chosen but unique value in
              requests for new operation points, i.e. with the N flag
              set.</t>

              <t hangText="N (1 bit):">MUST be set to 0 when OPID references
              an existing OPID announced in a COPN received from the targeted
              media sender, and MUST be set to 1 otherwise.</t>

              <t hangText="Version (7 bits):">When N flag is not set (0),
              referencing a specific version of the codec configuration
              identified by the OPID in a COPN received from the targeted
              media sender. Not used and MUST be set to 0 when N flag is set
              (1).</t>

              <t hangText="Sequence No (8 bits):">Sequence Number. SHALL be
              incremented by 1 modulo 2^8 for every COPR that includes an
              updated set of requested codec configuration parameters
              described by the same OPID and Version as was used with the
              previous Sequence Number. Sequence Number SHALL be kept
              unchanged in repetitions of this message. Initial value SHOULD
              be chosen randomly.</t>

              <t
              hangText="Codec Configuration Parameters (variable length):">Contains
              zero or more TLVs carrying codec configuration parameters as
              defined in <xref target="sec-parameters">parameter
              types</xref>.</t>
            </list></t>
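
          <t>The COPR layout, including the coupling between the N flag and
          the Version field, can be sketched in Python as follows. The example
          is non-normative: the function name is hypothetical, the parameter
          TLVs are passed in as already encoded bytes, and passing no Version
          is used here to mean "request a new operation point".</t>

          <figure>
            <artwork><![CDATA[
```python
import struct

COPR = 1  # message item type value for COPR


def build_copr(opid, seq_no, params=b"", version=None):
    """Encode one COPR message item.

    version=None requests a new operation point: N is set to 1 and
    the Version field is sent as 0. Otherwise N is 0 and Version
    references the COPN being modified.
    """
    n_flag = 1 if version is None else 0
    version_field = 0 if version is None else version & 0x7F
    payload = bytes([seq_no % 256]) + params
    header = ((COPR << 29) | (len(payload) << 16)
              | ((opid & 0xFF) << 8) | (n_flag << 7) | version_field)
    return struct.pack("!I", header) + payload
```
]]></artwork>
          </figure>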
        </section>

        <section anchor="sec-copr-semantics" title="Semantics">
          <t>This message item is sent by a media receiver wanting to control
          one or more codec configuration parameters of the targeted media
          sender. The requested values MUST stay within the media capability
          negotiated by other means than this specification. The available
          codec configuration parameters that can be controlled are listed in
          <xref target="sec-parameters"/>.</t>

          <t>Note: COPR may be seen as a more explicit and elaborate version
          of the TSTR message of <xref target="RFC5104"/> and most of the
          considerations detailed there for TSTR also apply to COPR.</t>

          <section anchor="sec-copr-sender" title="Sender Behavior">
            <t>If at least one <xref target="sec-copn">COPN</xref> is received
            for the targeted stream, the codec configuration parameters for
            that stream (SSRC) with defined OPID and Version are known to the
            COPR sender. The COPR MUST refer to the OPID and Version of the
            most recently received COPN (if any) for the targeted stream.
            Since it references a defined set of codec configuration
            parameters from a COPN, the COPR SHOULD only include the codec
            configuration parameters it wishes to change in the message, but
            it MAY include also unchanged codec configuration parameters.</t>

            <t>If no COPN is received for the targeted stream, the COPR sender
            MUST choose an arbitrary OPID and set the N flag to indicate that
            the OPID does not refer to any existing operation point. In this
            case the Version field is not used and MUST be set to 0. The OPID
            value SHALL NOT be identical to any OPID from the same media
            source that the media receiver is aware of and has received COPN
            for. Since in this case no COPN reference exists, the COPR sender
            SHOULD include all codec configuration parameters that it wishes
            to include a specific restriction for (other than the default).
            Note that for some codecs, some codec configuration parameters may
            be possible to infer from the media stream. If the wanted
            restriction also covers those parameters and no describing COPN
            is available, they SHOULD nevertheless be included explicitly in
            the COPR.</t>

            <t>Any codec configuration parameters that are not enabled by
            capability signaling MUST NOT be included.</t>

            <t>A COPR sender MUST increment the SN (Sequence No) field modulo
            2^8 with every new COPR that includes any update to the codec
            configuration parameters (referring to a specific Version of an
            OPID) compared to the previously sent SN, as long as it does not
            receive any <xref target="sec-cops">COPS</xref> with the same
            OPID, Version, and SN as was used in the most recently sent COPR.
            A COPR having a later SN MUST be interpreted as replacing any COPR
            with identical OPID and Version but with lower SN, taking field
            wrap into account.</t>
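
            <t>The replacement rule can be sketched as follows. This Python
            example is non-normative. It assumes that an SN counts as later
            when it is between 1 and 127 increments ahead of the stored one,
            and that the media sender tracks requests per requesting SSRC;
            both are assumptions of the example, as are all names used.</t>

            <figure>
              <artwork><![CDATA[
```python
def sn_is_later(candidate, reference):
    """True if candidate follows reference in 8-bit wrapping order."""
    return 0 < (candidate - reference) % 256 < 128


class RequestTable:
    """Latest COPR per (requesting SSRC, OPID, Version)."""

    def __init__(self):
        self._latest = {}

    def accept(self, requester_ssrc, opid, version, sn, params):
        """Store the request unless it repeats or predates the stored one."""
        key = (requester_ssrc, opid, version)
        current = self._latest.get(key)
        if current is not None and not sn_is_later(sn, current[0]):
            return False
        self._latest[key] = (sn, params)
        return True

    def current(self, requester_ssrc, opid, version):
        """Return the stored (SN, parameters) pair, or None."""
        return self._latest.get((requester_ssrc, opid, version))
```
]]></artwork>
            </figure>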

            <t>A COPR sender that did not receive any corresponding COPS, but
            did receive a COPN with the same OPID and with a higher Version
            than was used in the last COPR SHALL reconsider the COPR and MAY
            send an updated COPR referencing the new Version.</t>

            <t>If the capability negotiation has established that a codec
            supporting scalable operation is used, and if the media receiver
            wishes to request that scalability is used, it MAY do so by
            sending multiple COPR with different OPID to the same media
            sender. The OPID and Version used in such a request MAY be based on
            an existing operation point, but the request MAY also indicate a
            desire to introduce scalability into a previously non-scalable
            stream by choosing a new OPID (indicated by setting the N flag). In any
            case, the resulting OPIDs and sub-streams are identified through
            use of the <xref target="sec-id">ID parameter</xref> in subsequent
            COPS and COPN. See also the <xref target="sec-cops">description of
            COPS</xref>.</t>

            <t>An operation point without any codec configuration parameters
            MAY be used and MUST be interpreted as a request to keep the
            operation point unchanged. This is especially useful when
            modifying some but not all of the sub-streams in a set.</t>

            <t>When a COPR sender is receiving multiple operation points and
            wants to continue to do so, it MUST include all operation points
            it still wishes to receive in the COPR, also those that can be
            left unchanged.</t>

            <t>A COPR MAY also describe alternative operation points that the
            media sender can choose from, through use of one or more <xref
            target="sec-alt">ALT parameters</xref>.</t>

            <t>Since COPR references a specific COPN using SSRC, OPID and
            Version, a COPR sender typically needs to keep the latest Version
            of received COPN for each SSRC and OPID, also including the codec
            configuration parameters.</t>
          </section>

          <section title="Media Sender Behavior">
            <t>A media sender receiving a COPR SHOULD take the request into
            account for future encoding, but MAY also take COPR from other
            media receivers and other information available to the media
            sender into account when deciding how to change encoding
            properties.</t>

            <t>A media receiver sending COPR thus cannot always expect that
            all parameter values of the request are fully honored, or even
            honored at all. It can only know that the COPR was taken into
            account when receiving a <xref target="sec-cops">COPS</xref> from
            the media sender with a matching OPID, Version and SN.</t>

            <t>The extent to which a COPR is honored is described by the chosen
            codec configuration parameter values contained in a subsequent
            <xref target="sec-copn">COPN message</xref> with a later (taking
            wraparound into account) Version than the one referred to by the
            COPR.</t>
          </section>
        </section>

        <section title="Timing Rules">
          <t>The timing follows the rules outlined in section 3 of <xref
          target="RFC4585"/>. This request message MAY be sent using
          Immediate, Early or Regular timing depending on the application's
          needs.</t>

          <t>A COPR sender that did not receive a corresponding COPS MAY
          choose to retransmit the COPR, without increasing the SN.</t>

          <t>When an RTP media receiver (SSRC) times out or leaves the RTP
          session (BYE received), all COPR restrictions put in place by that
          media receiver SHALL be considered removed.</t>
        </section>
      </section>

      <section anchor="sec-cops" title="Codec Operation Point Status">
        <t/>

        <section title="Message Format">
          <t/>

          <figure align="center" anchor="fig-cops" title="COPS format">
            <artwork><![CDATA[ 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Type |      Payload Length     |     OPID      |N|   Version   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      SSRC of COPR sender                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence No   | RC  | Reason  |Codec Configuration Parameters :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               :
:                                                               :]]></artwork>
          </figure>

          <t>The COPS-specific message fields are:<list style="hanging">
              <t hangText="Type (3 bits):">Set to 2, as listed in <xref
              target="tab-itemtype"/>.</t>

              <t hangText="OPID (8 bits):">MUST be set identical to the same
              field in the COPR being reported on.</t>

              <t hangText="N (1 bit):">MUST be set identical to the same field
              in the COPR being reported on.</t>

              <t hangText="Version (7 bits):">MUST be set identical to the
              same field in the COPR being reported on.</t>

              <t hangText="SSRC of COPR sender (32 bits):">MUST be set
              identical to the SSRC of packet sender field in the common AVPF
              header part of the COPR being reported on.</t>

              <t hangText="Sequence No (8 bits):">MUST be set identical to the
              same field in the COPR being reported on.</t>

              <t hangText="RC (3 bits):">Return Code. Indicates degree of
              success or failure of the COPR being reported on, as described
              in <xref target="tab-retcode"/>.</t>

              <t hangText="Reason (5 bits):">Contains more detailed
              information on the reason for success or failure, as described
              in <xref target="tab-reason"/> or extensions to this
              specification.</t>

              <t hangText="Codec Configuration Parameters (variable):">MAY
              contain an ID codec configuration parameter providing codec
              specific media identification of the OPID, subject to conditions
              outlined in the text below, or MAY be empty.</t>
            </list></t>
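
          <t>As an illustration of the field layout above, the following
          sketch packs and parses the fixed 10-byte part of a COPS item. The
          names CopsItem, pack_cops and parse_cops are hypothetical. The unit
          of the Payload Length field is not restated in this section, so the
          sketch passes that value through without interpreting it.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct
from typing import NamedTuple

COPS_TYPE = 2  # item Type value for COPS


class CopsItem(NamedTuple):
    item_type: int
    payload_length: int
    opid: int
    n_flag: bool
    version: int
    copr_sender_ssrc: int
    sequence_no: int
    return_code: int
    reason: int
    params: bytes  # raw codec configuration parameters, may be empty


def pack_cops(payload_length, opid, n_flag, version, ssrc,
              sequence_no, return_code, reason, params=b""):
    """Build the COPS item bytes from individual field values."""
    word0 = ((COPS_TYPE << 29)
             | ((payload_length & 0x1FFF) << 16)  # 13 bits
             | ((opid & 0xFF) << 8)               # 8 bits
             | ((1 if n_flag else 0) << 7)        # 1 bit
             | (version & 0x7F))                  # 7 bits
    tail = (((sequence_no & 0xFF) << 8)           # 8 bits
            | ((return_code & 0x07) << 5)         # 3 bits
            | (reason & 0x1F))                    # 5 bits
    return struct.pack("!IIH", word0, ssrc, tail) + params


def parse_cops(data: bytes) -> CopsItem:
    """Split a COPS item into its fields (network byte order)."""
    if len(data) < 10:
        raise ValueError("COPS item shorter than its fixed part")
    word0, ssrc, tail = struct.unpack_from("!IIH", data)
    return CopsItem(
        item_type=word0 >> 29,
        payload_length=(word0 >> 16) & 0x1FFF,
        opid=(word0 >> 8) & 0xFF,
        n_flag=bool((word0 >> 7) & 0x01),
        version=word0 & 0x7F,
        copr_sender_ssrc=ssrc,
        sequence_no=tail >> 8,
        return_code=(tail >> 5) & 0x07,
        reason=tail & 0x1F,
        params=data[10:],
    )
```
]]></artwork>
          </figure>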
        </section>

        <section title="Semantics">
          <t>The COPS message item indicates the request status related to a
          certain SSRC OPID tuple by listing the latest received <xref
          target="sec-copr">COPR</xref> SN. It effectively informs the COPR
          sender that it no longer needs to resend that COPR SN (or any
          previous SN).</t>

          <t>COPS indicates that the specified COPR was successfully received
          by the media sender targeted in the request. If the COPR suggested
          codec configuration parameters could be <xref
          target="tab-retcode">understood</xref>, they may be taken into
          account, possibly together with COPR messages from other receivers
          and other aspects applicable to the specific media sender. The
          Return Code indicates the extent to which the COPR could be
          honored.</t>

          <texttable anchor="tab-retcode" title="Return Code Values">
            <ttcol>Value</ttcol>

            <ttcol>Meaning</ttcol>

            <c>0</c>

            <c>Success</c>

            <c>1</c>

            <c>Partial success</c>

            <c>2</c>

            <c>Failure</c>

            <c>3-6</c>

            <c>Unassigned</c>

            <c>7</c>

            <c>Reserved for future extension</c>
          </texttable>

          <t>A Success Return Code indicates that the resulting media
          configuration is fully in line with the COPR.</t>

          <t>A Partial Success Return Code indicates that the resulting media
          configuration is not fully in line with the COPR, but that the media
          sender regards the COPR to be sufficiently well represented by one
          or more of the existing operation points.</t>

          <t>A Failure Return Code indicates that the media sender failed to
          take the COPR into account, either due to some error condition or
          because no media stream could be created or changed to comply.</t>

          <t>The Reason Values defined below are independent of Return Code,
          but not every reason is meaningful with every Return Code. More
          reasons MAY be defined in extensions to this specification.</t>

          <texttable anchor="tab-reason" title="Reason Values">
            <ttcol>Value</ttcol>

            <ttcol>Meaning</ttcol>

            <c>0</c>

            <c>Success</c>

            <c>1</c>

            <c>Unknown OPID</c>

            <c>2</c>

            <c>Too many operation points</c>

            <c>3</c>

            <c>Request violates capability limits</c>

            <c>4</c>

            <c>Too old operation point version</c>

            <c>5</c>

            <c>Unknown parameter type</c>

            <c>6</c>

            <c>Parameter value too long</c>

            <c>7</c>

            <c>Invalid comparison type</c>

            <c>8</c>

            <c>One or more parameter values in the request were changed</c>

            <c>9-31</c>

            <c>Unassigned</c>
          </texttable>

          <t>COPS is typically sent without any codec configuration
          parameters. When the N flag was set in the related COPR, a
          non-failing COPS MUST include an <xref target="sec-id">ID
          parameter</xref> identifying the actual sub-stream that the media
          sender considers applicable to the COPR. The OPID used by that
          sub-stream can be found through examining ID parameters of
          subsequent COPN from the same media source for ID values matching
          the one in COPS.</t>

          <t>Senders implementing this specification MUST NOT use any other
          codec configuration parameter types than ID in a COPS message. The
          contained ID parameter points to the specific media (sub-)stream
          that the media sender regards as applicable to the COPR.<!--MW: Future consideration: do we want a parameter carrying the ID of not understood TLVs in the request?--></t>

          <t>When a COPR receiver has received multiple COPR messages from a
          single COPR source with the same OPID but with several different
          values of Version and/or SN, and for which it has not yet sent a
          COPS, it SHALL only send COPS for the COPR with the highest SN,
          taking wrap of the Version and SN fields into account.</t>
        </section>

        <section title="Timing Rules">
          <t>COPS SHALL be sent at the earliest opportunity after having
          received a COPR, with the following exception: <list style="hanging">
              <t>A media sender that receives a COPR with a previously
              received OPID, Version, and SN closely after sending a COPS for
              that same OPID, Version, and SN (within 2 times the longest
              observed round trip time, plus any AVPF-induced packet sending
              delays), SHOULD await a repeated COPR before scheduling another
              COPS transmission for that OPID, Version, and SN.</t>
            </list>The exception is introduced to avoid unnecessary COPS
          transmission when there is a chance that already sent COPS or COPN
          may satisfy or invalidate the COPR.</t>
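
          <t>One possible reading of this exception is sketched below. The
          class name, the time source, and the strict "less than" test at
          the window boundary are assumptions of the sketch, as is the choice
          to answer a repeated COPR once the window has elapsed.</t>

          <figure>
            <artwork><![CDATA[
```python
from typing import Dict, Tuple

# (SSRC of COPR sender, OPID, Version, SN)
Key = Tuple[int, int, int, int]


class CopsScheduler:
    """Decides whether a received COPR should trigger a COPS."""

    def __init__(self) -> None:
        self._last_sent: Dict[Key, float] = {}

    def on_copr(self, key: Key, now: float, longest_rtt: float,
                avpf_delay: float = 0.0) -> bool:
        """Return True if a COPS should be scheduled for this COPR.

        A COPR repeating an OPID, Version and SN for which COPS was
        sent less than 2 * longest_rtt + avpf_delay seconds ago is
        not answered; the sender waits for a further repetition.
        """
        sent_at = self._last_sent.get(key)
        window = 2.0 * longest_rtt + avpf_delay
        if sent_at is not None and now - sent_at < window:
            return False
        self._last_sent[key] = now
        return True
```
]]></artwork>
          </figure>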
        </section>
      </section>

      <section title="Handling in Mixers and Translators">
        <t/>

        <section title="COPN">
          <t>Any media sender, including mixers and translators, that sends
          RTP media marked with its own SSRC and that implements this
          specification SHALL also be prepared to send COPN, even if it is not
          the originating media source. As a result, such a media sender
          may have to send updated COPN whenever the included media sources
          (CSRC) change, subject to the rules laid out <xref
          target="sec-copn-semantics">above</xref>. Note that this can be
          achieved in different ways, for example by forwarding (possibly
          cached) COPN from the included CSRC when the mixer is not performing
          transcoding.</t>

          <t>In cases where a mixer or translator needs to forward a COPR from
          one side (A) to the other (B) (as described in <xref
          target="sec-copr-mix"/>), the COPN sent to the A side MAY need to be
          delayed until the mixer or translator has received a corresponding
          COPN from the B side, as indicated in <xref
          target="fig-copn-delay"/> below.</t>

          <figure align="center" anchor="fig-copn-delay"
                  title="Mixer delay of COPN">
            <artwork><![CDATA[+-------+ 1. COPR +-------+ 2. COPR +-------+
|       |-------->|       |-------->|       |
|   A   | 4. COPN | Mixer | 3. COPN |   B   |
|       |<--------|       |<--------|       |
+-------+         +-------+         +-------+]]></artwork>
          </figure>

          <t>If a mixer or translator has decided to act partially (modify the
          media stream with respect to some parameter types, but not all) on a
          received COPR from the A side, and a COPN is received from the B
          side indicating that the current media modifications are no longer
          necessary, the mixer or translator SHOULD cease its own actions
          that are no longer needed. It SHOULD then also issue a COPN
          describing the new situation to the A side, as indicated in <xref
          target="fig-copn-update"/> below.</t>

          <figure align="center" anchor="fig-copn-update"
                  title="Mixer update of COPN">
            <artwork><![CDATA[+-------+ 1. COPR +-------+         +-------+
|       |-------->|       | 2. COPR |       |
|       | 3. COPN |       |-------->|       |
|   A   |<--------| Mixer | 4. COPN |   B   |
|       | 5. COPN |       |<--------|       |
|       |<--------|       |         |       |
+-------+         +-------+         +-------+]]></artwork>
          </figure>
        </section>

        <section anchor="sec-copr-mix" title="COPR">
          <t>A mixer or media translator that implements this specification
          and encodes content sent to the media receiver issuing the COPR
          SHALL consider the request to determine if it can fulfill it by
          changing its own encoding parameters. A mixer encoding for multiple
          session participants will need to consider the joint needs of all
          participants when generating a COPR on its own behalf towards the
          media sender.</t>

          <t>A mixer or translator able to fulfill the COPR partially MAY act
          on the parts it can fulfill (and SHALL then send COPS and COPN
          accordingly), but SHOULD anyway forward the unaltered COPR towards
          the media sender, since it is likely most efficient to make the
          necessary codec configuration parameter changes directly at the
          original media source.</t>

          <t>A media translator that does not act on COP messages will forward
          them unaltered, according to normal translator rules.</t>
        </section>

        <section anchor="sec-cops-mix" title="COPS">
          <t>A mixer or media translator that implements this specification,
          encodes content sent to media receivers, and acts on COPR SHALL
          also report using COPS, just like any other media sender. An RTP
          translator not knowing or acting on COPR will forward all COP
          messages unaltered, according to normal RTP translator rules.</t>
        </section>
      </section>
    </section>

    <section anchor="sec-parameters" title="Parameter Types">
      <t>This section defines the general codec configuration parameter (CCP)
      TLV format. Then a number of different parameter formats are defined. It
      is expected that a number of additional CCPs will be defined in the
      future as the needs of different codecs are explored or developed.</t>

      <section title="Parameter Format">
        <t>COP message items MAY contain one or more codec configuration
        parameters, encoded in TLV (Type-Length-Value) format, which SHOULD
        then be interpreted as simultaneously applicable to the defined
        operation point. Parameter values MUST be byte-aligned.</t>

        <figure align="center" anchor="fig-param-format"
                title="Codec parameter format">
          <artwork><![CDATA[ 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ParamType     | C |  Length   |                               |
+---------------+---+-----------+                               |
|                                                               |
/                        Parameter Value                        /
/                                                +--------------+
|                                                |
+------------------------------------------------+]]></artwork>
        </figure>

        <t><list style="hanging">
            <t hangText="ParamType (8 bits):">The codec configuration
            parameter type, encoded as defined in <xref
            target="tab-paramtype"/> and possible extensions to this
            specification. A parameter with an unknown ParamType SHALL be
            ignored on reception in a COPN and SHALL either be reported as
            unknown in COPS or be ignored when received in COPR.</t>

            <t hangText="C (2 bits):">Comparison Type, encoded as defined in
            <xref target="tab-comparetype"/>, unless specified otherwise by
            individual ParamType definitions. The Comparison Type specifies
            what type of restriction the codec configuration parameter value
            expresses and how it should be compared to other codec
            configuration parameter values of the same ParamType.<list
                style="hanging">
                <t hangText="Exact:">The parameter value is an exact value,
                and no other values are acceptable. MUST NOT be used together
                with any other Comparison Types for the same ParamType.</t>

                <t hangText="Minimum:">The parameter value is an inclusive
                minimum restriction. MAY be used together with Maximum and/or
                Target Comparison Types for the same ParamType. If no minimum
                restriction is specified, no specific minimum restriction
                exists.</t>

                <t hangText="Maximum:">The parameter value is an inclusive
                maximum restriction. MAY be used together with Minimum and/or
                Target Comparison Types for the same ParamType. If no maximum
                restriction is specified, no specific maximum restriction
                exists.</t>

                <t hangText="Target:">The parameter value is a preferred
                target value, but other values within a specified range are
                acceptable. This type MUST be used together with at least one
                of Minimum and Maximum Comparison Types for the same
                ParamType. If no target is specified, no specific preference
                exists.</t>
              </list></t>

            <t hangText="Length (6 bits):">The parameter value Length in
            bytes, excluding the ParamType, C, and Length fields. A
            Length of 0 indicates that the parameter has no value, effectively
            constituting a wild-carded parameter that can take on any value
            (expresses no specific restriction). This is also the RECOMMENDED
            way to explicitly remove a previously effective restriction.</t>

            <t hangText="Parameter Value (variable length):">The actual
            parameter value, encoded in a format defined by the specific
            ParamType definition.</t>
          </list></t>
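
        <t>The TLV layout above can be illustrated with the following
        sketch. The function names are hypothetical, and any padding that a
        COP message item may carry after the last parameter is outside the
        scope of the sketch.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import List, Tuple

# (ParamType, Comparison Type, Parameter Value)
Param = Tuple[int, int, bytes]


def encode_param(param_type: int, comparison: int,
                 value: bytes = b"") -> bytes:
    """Encode one parameter: 8-bit type, 2-bit C, 6-bit Length."""
    if not 0 <= param_type <= 0xFF:
        raise ValueError("ParamType must fit in 8 bits")
    if not 0 <= comparison <= 3:
        raise ValueError("Comparison Type must fit in 2 bits")
    if len(value) > 0x3F:
        raise ValueError("value longer than the 6-bit Length allows")
    return bytes([param_type, (comparison << 6) | len(value)]) + value


def decode_params(data: bytes) -> List[Param]:
    """Decode a back-to-back sequence of parameters."""
    params: List[Param] = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated parameter header")
        param_type = data[offset]
        comparison = data[offset + 1] >> 6
        length = data[offset + 1] & 0x3F
        end = offset + 2 + length
        if end > len(data):
            raise ValueError("truncated parameter value")
        params.append((param_type, comparison, data[offset + 2:end]))
        offset = end
    return params
```
]]></artwork>
        </figure>

        <t>A Length of 0 simply yields an empty value, which a caller can
        treat as the wild-carded parameter described above.</t>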

        <t>The meaning of multiple codec configuration parameters with the
        same ParamType and the same Comparison Type included as part of the
        same operation point is undefined; such combinations SHALL NOT be
        used.</t>
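
        <t>The combination rules for Comparison Types can be checked along
        the following lines. This is a sketch only: it assumes the input
        describes a single operation point with any ALT separators already
        removed, and the function name and message texts are illustrative.</t>

        <figure>
          <artwork><![CDATA[
```python
from collections import defaultdict
from typing import Iterable, List, Tuple

EXACT, MINIMUM, MAXIMUM, TARGET = 0, 1, 2, 3


def check_comparison_types(
        params: Iterable[Tuple[int, int]]) -> List[str]:
    """Return a list of rule violations for (ParamType, C) pairs."""
    seen = defaultdict(list)
    for param_type, comparison in params:
        seen[param_type].append(comparison)

    problems: List[str] = []
    for param_type, comparisons in sorted(seen.items()):
        kinds = set(comparisons)
        if len(kinds) != len(comparisons):
            problems.append(
                f"ParamType {param_type}: repeated Comparison Type")
        if EXACT in kinds and len(kinds) > 1:
            problems.append(
                f"ParamType {param_type}: Exact mixed with other types")
        if TARGET in kinds and not kinds & {MINIMUM, MAXIMUM}:
            problems.append(
                f"ParamType {param_type}: Target without Minimum "
                "or Maximum")
    return problems
```
]]></artwork>
        </figure>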

        <t>A codec configuration parameter that is encoded in a way (including
        incorrectly) that cannot be interpreted by the receiver SHALL be
        ignored.</t>

        <t>The parameters below that are encoded as signed or unsigned
        integers use a
        variable size representation in the value field. It is RECOMMENDED to
        only include the minimal number of bytes necessary to represent the
        value that is to be included in the parameter TLV. The length field in
        the parameter TLV will explicitly indicate how many bytes are present
        in the value field. All parameters using a variable size
        representation of their value MUST define the maximum number of bytes
        possible to include in the value field.</t>
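
        <t>A minimal sketch of the recommended variable size representation
        for unsigned integers is given below. The helper names are
        hypothetical. The sketch encodes the value 0 as a single zero byte,
        since a Length of 0 is reserved for a parameter without a value;
        signed integers are not covered.</t>

        <figure>
          <artwork><![CDATA[
```python
def encode_uint(value: int, max_bytes: int) -> bytes:
    """Encode value in the fewest bytes, most significant byte first."""
    if value < 0:
        raise ValueError("value must be unsigned")
    # At least one byte, so that 0 is not confused with "no value".
    length = max(1, (value.bit_length() + 7) // 8)
    if length > max_bytes:
        raise ValueError("value does not fit in the allowed size")
    return value.to_bytes(length, "big")


def decode_uint(data: bytes, max_bytes: int) -> int:
    """Decode a value field of 1..max_bytes bytes."""
    if not 1 <= len(data) <= max_bytes:
        raise ValueError("unexpected value length")
    return int.from_bytes(data, "big")
```
]]></artwork>
        </figure>

        <t>For example, a Bitrate parameter would use a max_bytes of 8 and a
        Token Bucket Size parameter a max_bytes of 4, matching the 64-bit and
        32-bit limits given for those parameters.</t>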

        <t>The ParamType values and the SDP tags (see <xref
        target="sec-sdp"/>) for the codec configuration parameter types
        defined in this specification are listed below.</t>

        <texttable anchor="tab-paramtype" title="Parameter Type Values">
          <ttcol>Value</ttcol>

          <ttcol>Meaning</ttcol>

          <ttcol>Tag</ttcol>

          <c>0</c>

          <c>ALT</c>

          <c>alt</c>

          <c>1</c>

          <c>ID</c>

          <c>id</c>

          <c>2</c>

          <c>Payload Type</c>

          <c>pt</c>

          <c>3</c>

          <c>Bitrate</c>

          <c>bitrate</c>

          <c>4</c>

          <c>Token Bucket Size</c>

          <c>token-bucket</c>

          <c>5</c>

          <c>Framerate</c>

          <c>framerate</c>

          <c>6</c>

          <c>Horizontal Pixels</c>

          <c>hor-size</c>

          <c>7</c>

          <c>Vertical Pixels</c>

          <c>ver-size</c>

          <c>8</c>

          <c>Sample Aspect Ratio</c>

          <c>sar</c>

          <c>9</c>

          <c>Picture Aspect Ratio</c>

          <c>par</c>

          <c>10</c>

          <c>Channels</c>

          <c>channels</c>

          <c>11</c>

          <c>Sampling Rate</c>

          <c>sampling</c>

          <c>12</c>

          <c>Maximum RTP Packet Size</c>

          <c>max-rtp-size</c>

          <c>13</c>

          <c>Maximum RTP Packet Rate</c>

          <c>max-rtp-rate</c>

          <c>14</c>

          <c>Frame Aggregation</c>

          <c>aggregate</c>

          <c>15-254</c>

          <c>Undefined</c>

          <c/>

          <c>255</c>

          <c>Reserved for future extension</c>

          <c/>
        </texttable>

        <t>The values of the defined parameter value comparison type are
        listed below.</t>

        <texttable anchor="tab-comparetype" title="Comparison Type Values">
          <ttcol>Value</ttcol>

          <ttcol>Meaning</ttcol>

          <c>0</c>

          <c>Exact</c>

          <c>1</c>

          <c>Minimum</c>

          <c>2</c>

          <c>Maximum</c>

          <c>3</c>

          <c>Target</c>
        </texttable>

        <t>The following sub-sections describe the syntax and semantics of the
        different codec configuration parameter types defined in this
        specification.</t>

        <t>Unless explicitly specified in the sub-sections below, or in
        extensions to this specification, all parameter type values are binary
        encoded unsigned integers, most significant byte first (for multi-byte
        values).</t>
      </section>

      <section anchor="sec-alt" title="ALT">
        <t>This codec parameter type is a special parameter, separating the
        codec configuration parameters preceding it from the ones that follow
        into two separate, alternative operation points.<list style="hanging">
            <t hangText="Type Value:">0</t>

            <t hangText="Tag:">alt</t>

            <t hangText="Unit:">Not applicable.</t>

            <t hangText="Semantics:">A special parameter expressing an
            "alternative" relation between the parameters preceding it and the
            parameters following it. This SHOULD be interpreted as describing
            two alternate operation points where one and only one SHALL be
            chosen, with the operation point preceding ALT in the parameter
            list being preferred. Multiple ALT parameters MAY be used in the
            same parameter list, in which case each set of parameters to
            evaluate can be either before the first ALT parameter, between two
            ALT parameters, or after the last ALT parameter. Evaluating from
            the top of the list and obeying the above preference rule, the
            first acceptable set of parameters (not containing any ALT
            parameter) is the one to choose.</t>

            <t hangText="Encoding:">Not applicable.</t>

            <t hangText="Media Types:">All.</t>

            <t hangText="Value Restrictions:">MUST be used with the Length
            field set to 0. Two ALT parameters MUST be separated by at least
            one parameter other than ALT.</t>

            <t hangText="Default Value:">Not applicable.</t>

            <t hangText="Comparison Types:">MUST be set to 0.</t>

            <t hangText="Note:"/>
          </list></t>
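
        <t>The evaluation order described above can be sketched as follows.
        The parameter tuples, the function names, and the caller-supplied
        "acceptable" predicate are assumptions made for illustration; how a
        media sender decides that a set of parameters is acceptable is not
        defined by this sketch.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Callable, List, Optional, Sequence, Tuple

ALT = 0  # ParamType value of the ALT parameter

# (ParamType, Comparison Type, Parameter Value)
Param = Tuple[int, int, bytes]


def split_alternatives(params: Sequence[Param]) -> List[List[Param]]:
    """Split a parameter list into alternatives at each ALT."""
    groups: List[List[Param]] = [[]]
    previous_was_alt = False
    for param in params:
        if param[0] == ALT:
            if previous_was_alt:
                raise ValueError("two ALT parameters must be separated")
            groups.append([])
            previous_was_alt = True
        else:
            groups[-1].append(param)
            previous_was_alt = False
    return groups


def choose_alternative(
        params: Sequence[Param],
        acceptable: Callable[[List[Param]], bool],
) -> Optional[List[Param]]:
    """Return the first acceptable alternative, in list order."""
    for group in split_alternatives(params):
        if acceptable(group):
            return group
    return None
```
]]></artwork>
        </figure>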
      </section>

      <section anchor="sec-id" title="ID">
        <t>This codec parameter type is a special parameter that enables codec
        specific identification of sub-streams, for example when there are
        multiple sub-streams in a single SSRC. It can also be used to
        reference OPID, when the used codec does not support or use
        sub-streams. When used, it SHALL be listed first among the codec
        parameters used to describe the sub-stream.<list style="hanging">
            <t hangText="Type Value:">1</t>

            <t hangText="Tag:">id</t>

            <t hangText="Unit:">Not applicable.</t>

            <t hangText="Semantics:">A special parameter describing the,
            possibly codec specific, media identification for the OPID.</t>

            <t hangText="Encoding:">If used with non-scalable encoding, it
            MUST contain an <xref target="sec-item-format">OPID</xref>. If
            used with scalable encoding, this codec specific encoding MUST be
            defined by <xref target="sec-codec-sub-stream-id"/>. It MUST be
            defined to occupy an integer number of bytes, where all bits in
            the bytes are defined as part of the format.</t>

            <t hangText="Media Types:">All.</t>

            <t hangText="Value Restrictions:">If used with non-scalable
            encoding, any OPID restrictions apply. If used with scalable
            encoding, any restrictions MUST be defined by the <xref
            target="sec-codec-sub-stream-id">codec specific sub-stream
            identification definition</xref>.</t>

            <t hangText="Default Value:">Not set.</t>

            <t hangText="Comparison Types:">MUST be set to 0.</t>

            <t hangText="Note:">MAY be used whenever there is a need to
            identify an operation point in codec native format, or when there
            is a need to map that against an OPID.</t>
          </list></t>
      </section>

      <section anchor="sec-payload-type" title="Payload Type">
        <t><list style="hanging">
            <t hangText="Type Value:">2</t>

            <t hangText="Tag:">pt</t>

            <t hangText="Unit:">Not applicable.</t>

            <t hangText="Semantics:">Referencing the RTP Payload Type to use
            for the OPID.</t>

            <t hangText="Encoding:">The least significant 7 bits MUST use the
            same encoding as the RTP Payload Type field in the RTP header. The
            most significant bit MUST be set to 0.</t>

            <t hangText="Media Types:">All.</t>

            <t hangText="Value Restrictions:">The same restrictions valid for
            RTP Payload Type apply, i.e. 7-bit values 0-127. MUST be
            represented by a single byte in the value field.</t>

            <t hangText="Default Value:">Not set.</t>

            <t hangText="Comparison Types:">MUST be set to 0.</t>

            <t hangText="Note:">MAY be used whenever there is a need to
            specify codec configuration parameters valid only for a certain
            RTP Payload Type. What media type, codec and possible parameters
            that are described by the RTP Payload Type is outside the scope of
            this specification, but is typically defined in capability or call
            setup signaling, for example SDP.</t>
          </list></t>
      </section>

      <section anchor="sec-bitrate" title="Bitrate">
        <t><list style="hanging">
            <t hangText="Type Value:">3</t>

            <t hangText="Tag:">bitrate</t>

            <t hangText="Unit:">Bits per second.</t>

            <t hangText="Semantics:">Media level per second average media
            bitrate, excluding IP/UDP/RTP overhead, but including RTP payload
            headers (similar to <xref target="RFC3890">b=TIAS from SDP
            signaling</xref>), rounded up to the closest integer.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">All.</t>

            <t hangText="Value Restrictions:">A value of 0 MAY be used. The
            largest value allowed is what is possible to represent in a 64-bit
            unsigned integer value, i.e. a value between 0 and
            18,446,744,073,709,551,615.</t>

            <t hangText="Default Value:">Maximum value computed from
            capability or call setup signaling, e.g. b= parameter from SDP.
            Note that it is often not possible to achieve more than a rough
            estimation from such computation.</t>

            <t hangText="Comparison Types:">All. The Exact comparison type is
            meaningful only for streams that are able to produce a set of
            predictable (e.g. constant) packet sizes, sent at predictable
            (e.g. constant) inter-packet intervals.</t>

            <t hangText="Note:">This parameter used with a maximum comparison
            type parameter is significantly similar to CCM Temporary Maximum
            Media Bit Rate (TMMBR). When being used with a maximum or exact
            comparison type value of 0, it is also significantly similar to
            <xref
            target="I-D.westerlund-avtext-rtp-stream-pause">PAUSE</xref>.
            Compared to those, this parameter conveys significant extra
            information through the relation to other parameters applied to
            the same operation point, as well as the possibility to express
            other restrictions than a maximum limit. When CCM TMMBR is
            supported in addition to this specification, the Bitrate
            parameters from all operation points within each SSRC should be
            considered and CCM TMMBR messages SHOULD be sent for those SSRC
            that are found to be in the bounding set (see <xref
            target="RFC5104">CCM</xref>, section 3.5.4.2). When PAUSE is
            supported in addition to this specification, the Bitrate
            parameters from all operation points within each SSRC should be
            considered and CCM PAUSE messages SHOULD be sent for those SSRC
            that contain only operation points that are limited by a Bitrate
            maximum value of 0. The only difference between setting the
            bitrate to 0 and removing the OPID entirely is that increasing the
            bitrate from 0 just requires the bitrate parameter to be sent
            again, while re-activating a removed OPID requires it to be fully
            re-defined including all other parameters that are included in the
            OPID.</t>
          </list></t>
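        <t>The PAUSE rule in the Note above can be sketched as follows. This
        is a non-normative illustration only: the function name and the
        mapping from SSRC to a list of per-operation-point Bitrate maximum
        values are assumptions made for this example and are not structures
        defined by this specification.</t>

        <figure>
          <artwork><![CDATA[
```python
def ssrcs_to_pause(max_bitrates):
    """Return the SSRCs whose operation points are all limited to 0.

    max_bitrates maps each SSRC to a list holding one entry per
    operation point: the Bitrate maximum value, or None when no
    maximum is set. This data model is hypothetical.
    """
    return sorted(
        ssrc
        for ssrc, limits in max_bitrates.items()
        # An SSRC qualifies only if it has operation points and every
        # one of them is limited by a Bitrate maximum of 0.
        if limits and all(limit == 0 for limit in limits)
    )
```
]]></artwork>
        </figure>

        <t>An SSRC with at least one operation point that has a non-zero or
        unset maximum is not selected in this sketch, since media may still
        be sent for that operation point.</t>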
      </section>

      <section anchor="sec-token-bucket-size" title="Token Bucket Size">
        <t><list style="hanging">
            <t hangText="Type Value:">4</t>

            <t hangText="Tag:">token-bucket</t>

            <t hangText="Unit:">Bytes.</t>

            <t hangText="Semantics:">Media level <xref target="RFC2212">token
            bucket</xref> size excluding IP/UDP/RTP overhead, but including
            RTP payload headers, describing the bitrate variability over time
            as described in <xref
            target="I-D.westerlund-mmusic-sdp-bw-attribute"/>. This parameter
            can be combined with the parameter <xref
            target="sec-bitrate">bitrate</xref> (above) to provide token
            bucket fill rate plus bucket size for a complete token bucket
            model.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">All.</t>

            <t hangText="Value Restrictions:">A value of 0 is generally not
            meaningful and SHOULD NOT be used. Values that can be represented
            using a 32-bit unsigned integer, i.e. 0 to 4,294,967,295.</t>

            <t hangText="Default Value:">4096 bytes.</t>

            <t hangText="Comparison Types:">Maximum, Target.</t>

            <t hangText="Note:">Changing the token bucket size does not imply
            changing the average bitrate; it only changes the acceptable
            variation of the bitrate over time.</t>
          </list></t>
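        <t>As a non-normative illustration of how a fill rate and a bucket
        size together form a token bucket model, the sketch below checks
        whether a sequence of packets conforms to such a bucket. It assumes
        that the fill rate is given in bits per second and that packet sizes
        are counted in bytes in the same way as the bucket size. The function
        name and the packet representation are hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
def conforms(packets, rate_bps, bucket_bytes):
    """Check packets against a token bucket.

    packets is a list of (time_in_seconds, size_in_bytes) tuples in
    increasing time order. The bucket starts full (an assumption).
    """
    tokens = bucket_bytes
    last_time = packets[0][0] if packets else 0.0
    for time, size in packets:
        # Refill at the fill rate (converted to bytes), capped at
        # the bucket size.
        refill = (time - last_time) * rate_bps / 8
        tokens = min(bucket_bytes, tokens + refill)
        last_time = time
        if size > tokens:
            return False  # burst exceeds what the bucket allows
        tokens -= size
    return True
```
]]></artwork>
        </figure>

        <t>With a fixed fill rate, a larger bucket size in this sketch
        accepts larger bursts without changing the long-term average
        rate.</t>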
      </section>

      <section anchor="sec-framerate" title="Framerate">
        <t><list style="hanging">
            <t hangText="Type Value:">5</t>

            <t hangText="Tag:">framerate</t>

            <t hangText="Unit:">100th of a Hz. This definition allows e.g.
            distinguishing between video encoded at 30 Hz (two-byte value
            3000) and 29.97 Hz (two-byte value 2997). It also allows for high
            speed video cameras, like 1000 Hz (three-byte value 100000), and
            slow-scan down to one frame every 100 seconds (one-byte value
            1).</t>

            <t hangText="Semantics:">The number of media frames to render per
            second.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">Mainly intended for video and timed
            image media types, but MAY be used also for other media types.<!--Could be used for audio, like selecting between iLBC 20 and 30 ms frames.
An audio-specific frame duration parameter is likely semantically clearer
for that case. --></t>

            <t hangText="Value Restrictions:">A value of 0 MAY be used,
            meaning single-frame, request based encoding (request procedure is
            out of scope for this specification). Values that can be
            represented using a 32-bit unsigned integer, i.e. 0 to
            42,949,672.95 Hz.</t>

            <t hangText="Default Value:">Maximum allowed by call setup and/or
            capability signaling, e.g. a=framerate parameter from <xref
            target="RFC4566">SDP</xref>, or codec-specific configuration.</t>

            <t hangText="Comparison Types:">All.</t>

            <t hangText="Note:">A media frame is typically a set of
            semantically grouped samples, e.g. the relation that a video image
            has to its individual pixels, or the relation that an audio frame
            has to individual audio samples. The value applies to encoded
            media framerate, not the <xref target="sec-max-rtp-rate">packet
            rate</xref> that may also be changed as a result of different
            <xref target="sec-frame-agg">Frame Aggregation</xref>. When the
            COP end-point also makes use of <xref target="RFC5104">CCM</xref>
            TSTR/TSTN, COPN with this parameter MAY be used in combination
            with TSTN to explicitly indicate what framerate setting the TSTR
            resulted in. This makes it possible for the TSTR sender to adjust
            the relative TSTR scale it uses to more closely match the
            framerate that was actually received.</t>
          </list></t>
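        <t>The unit and encoding above can be illustrated with the following
        non-normative sketch. It assumes that the value is carried in the
        fewest bytes that can hold it, as the byte counts in the Unit text
        suggest; the exact length rules are given by the parameter format
        elsewhere in this specification. The function names are
        hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
def encode_framerate(hz):
    """Encode a framerate in Hz as 100ths of a Hz, big-endian."""
    value = round(hz * 100)
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("framerate out of range")
    # Assumption: use the shortest byte string, at least one byte.
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, "big")


def decode_framerate(data):
    """Decode a big-endian value in 100ths of a Hz back to Hz."""
    return int.from_bytes(data, "big") / 100
```
]]></artwork>
        </figure>

        <t>In this sketch, 30 Hz and 29.97 Hz each encode to two bytes, 1000
        Hz encodes to three bytes, and one frame every 100 seconds encodes to
        the single byte 1.</t>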
      </section>

      <section anchor="sec-hor-pixels" title="Horizontal Pixels">
        <t><list style="hanging">
            <t hangText="Type Value:">6</t>

            <t hangText="Tag:">hor-size</t>

            <t hangText="Unit:">Pixels.</t>

            <t hangText="Semantics:">Horizontal image size.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">Video and image.</t>

            <t hangText="Value Restrictions:">The meaning of the value 0 is
            not defined and SHALL NOT be used. Values that can be represented
            using a 32-bit unsigned integer, i.e. 1 to 4,294,967,295.</t>

            <t hangText="Default Value:">Maximum allowed by call setup and/or
            capability signaling.</t>

            <t hangText="Comparison Types:">All.</t>

            <t hangText="Note:">The pixel and picture aspect ratios cannot be
            changed with this parameter. Video encoders can typically describe
            both pixel and picture aspect ratios as part of the encoded media
            stream. If the COP end-point supports <xref
            target="RFC6236">imageattr signaling</xref>, values for this
            parameter SHOULD be chosen only among the set negotiated in the
            SDP. This applies both to the media receiver sending COPR and to
            the media sender sending COPN, according to the imageattr values
            for the affected media stream direction.</t>
          </list></t>
      </section>

      <section anchor="sec-ver-pixels" title="Vertical Pixels">
        <t><list style="hanging">
            <t hangText="Type Value:">7</t>

            <t hangText="Tag:">ver-size</t>

            <t hangText="Unit:">Pixels.</t>

            <t hangText="Semantics:">Vertical image size.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">Video and image.</t>

            <t hangText="Value Restrictions:">The meaning of the value 0 is
            not defined and SHALL NOT be used. Values that can be represented
            using a 32-bit unsigned integer, i.e. 1 to 4,294,967,295.</t>

            <t hangText="Default Value:">Maximum allowed by call setup and/or
            capability signaling.</t>

            <t hangText="Comparison Types:">All.</t>

            <t hangText="Note:">See Note in <xref
            target="sec-hor-pixels"/>.</t>
          </list></t>
      </section>

      <section anchor="sec-sar" title="Sample Aspect Ratio">
        <t><list style="hanging">
            <t hangText="Type Value:">8</t>

            <t hangText="Tag:">sar</t>

            <t hangText="Unit:">Unit-less value pair.</t>

            <t hangText="Semantics:">The ratio between the intended horizontal
            distance between the columns and the intended vertical distance
            between the rows of the luma sample array in a frame, similar to
            what is defined in <xref target="H241"/>.</t>

            <t hangText="Encoding:">Two binary encoded, unsigned 8-bit
            integers in order horizontal, vertical.</t>

            <t hangText="Media Types:">Video and image.</t>

            <t hangText="Value Restrictions:">The meaning of the value 0 is
            not defined and SHALL NOT be used as value in either the
            horizontal or vertical component. Component values that can be
            represented using an 8-bit unsigned integer, i.e. 1 to 255.</t>

            <t hangText="Default Value:">The same as defined in <xref
            target="H241"/> when there is no explicit indication, based on
            image size.</t>

            <t hangText="Comparison Types:">All.</t>

            <t hangText="Note:">If the COP end-point supports <xref
            target="RFC6236">imageattr signaling</xref>, values for this
            parameter SHOULD be chosen only among the set negotiated in the
            SDP. This applies both to the media receiver sending COPR and to
            the media sender sending COPN, according to the imageattr values
            for the affected media stream direction.</t>
          </list></t>
      </section>

      <section anchor="sec-par" title="Picture Aspect Ratio">
        <t><list style="hanging">
            <t hangText="Type Value:">9</t>

            <t hangText="Tag:">par</t>

            <t hangText="Unit:">Unit-less value pair.</t>

            <t hangText="Semantics:">The ratio between the intended horizontal
            width and the intended vertical height of a displayed picture,
            similar to what is defined in <xref target="H241"/>.</t>

            <t hangText="Encoding:">Two binary encoded, unsigned 8-bit
            integers in order horizontal, vertical.</t>

            <t hangText="Media Types:">Video and image.</t>

            <t hangText="Value Restrictions:">The meaning of the value 0 is
            not defined and SHALL NOT be used as value in either the
            horizontal or vertical component. Component values that can be
            represented using an 8-bit unsigned integer, i.e. 1 to 255.</t>

            <t hangText="Default Value:">The same as defined in <xref
            target="H241"/> when there is no explicit indication, based on
            image size.</t>

            <t hangText="Comparison Types:">All.</t>

            <t hangText="Note:">If the COP end-point supports <xref
            target="RFC6236">imageattr signaling</xref>, values for this
            parameter SHOULD be chosen only among the set negotiated in the
            SDP. This applies both to the media receiver sending COPR and to
            the media sender sending COPN, according to the imageattr values
            for the affected media stream direction.</t>
          </list></t>
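        <t>The two-byte value pair used by both the Sample Aspect Ratio and
        the Picture Aspect Ratio parameters can be sketched as below. This is
        a non-normative illustration. Reducing the ratio to lowest terms
        before encoding is an assumption made here so that common ratios fit
        the 1 to 255 component range; it is not required by this
        specification. The function name is hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
from math import gcd


def encode_aspect_ratio(horizontal, vertical):
    """Encode a ratio as two unsigned 8-bit integers (hor, ver)."""
    if horizontal <= 0 or vertical <= 0:
        raise ValueError("the value 0 SHALL NOT be used")
    # Assumption: reduce to lowest terms so that, for example,
    # 1920:1080 becomes 16:9 and fits in 8 bits per component.
    divisor = gcd(horizontal, vertical)
    hor, ver = horizontal // divisor, vertical // divisor
    if hor > 255 or ver > 255:
        raise ValueError("component does not fit in 8 bits")
    return bytes([hor, ver])
```
]]></artwork>
        </figure>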
      </section>

      <section anchor="sec-channels" title="Channels">
        <t><list style="hanging">
            <t hangText="Type Value:">10</t>

            <t hangText="Tag:">channels</t>

            <t hangText="Unit:">Unit-less.</t>

            <t hangText="Semantics:">The number of media channels.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">All.</t>

            <t hangText="Value Restrictions:">The meaning of the value 0 is
            not defined and SHALL NOT be used. Values that can be represented
            using a 16-bit unsigned integer, i.e. 1 to 65,535.</t>

            <t hangText="Default Value:">Taken from call setup or capability
            signaling, or 1 if no other value is available.</t>

            <t hangText="Comparison Types:">All.</t>

            <t hangText="Note:">This codec configuration parameter SHOULD NOT
            be used if the capability negotiation did not establish that
            suitable multi-channel coding is supported by both ends. For
            audio, the interpretation and spatial mapping SHALL follow the one
            for the indicated payload format. If no such channel mapping is
            defined in the payload format, and if not specifically signalled
            by other means, e.g. SDP, the channel configurations defined in
            <xref target="RFC3551"/> SHALL be used. For video, it SHALL be
            interpreted as the number of views in multiview coding, where the
            number 2 SHOULD represent stereo (3D) coding, unless negotiated
            otherwise by means outside of this specification, e.g. SDP. If
            multiple payload formats are defined and if those do not share
            channel configurations, the <xref
            target="sec-payload-type">Payload Type parameter</xref> MUST be
            included as one of the parameters for the OPID.</t>
          </list></t>
      </section>

      <section title="Sampling Rate">
        <t><list style="hanging">
            <t hangText="Type Value:">11</t>

            <t hangText="Tag:">sampling</t>

            <t hangText="Unit:">Hz.</t>

            <t hangText="Semantics:">Frequency of the media sampling clock in
            Hz, as input to the codec, per <xref
            target="sec-channels">channel</xref>.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">Mainly intended for audio media, but
            MAY be used for other media types.</t>

            <t hangText="Value Restrictions:">The meaning of the value 0 is
            not defined and SHALL NOT be used. Values that can be represented
            using a 32-bit unsigned integer, i.e. 1 to 4,294,967,295.</t>

            <t hangText="Default Value:">Taken from call setup or capability
            signaling, e.g. RTP TS rate from SDP m-line.</t>

            <t hangText="Comparison Types:">All.</t>

            <t hangText="Note:">The value refers to the media sample clock,
            not the media <xref target="sec-framerate">Framerate</xref>. It
            does not specify any codec-internal up- or down-sampling that may
            take place as part of the encoding process. If multiple <xref
            target="sec-channels">channels</xref> are used and different
            channels use different sampling rates, then this parameter MUST
            NOT be used unless there is a known sampling rate relationship and
            an ordering between the channels, in which case the specified
            sampling rate value SHALL be taken as applicable to the first
            channel of the ordered set. The relationship may e.g. be known
            implicitly by each party through some specification, or be
            negotiated using other means than this specification. Typically
            only a limited subset of sampling frequencies makes sense to the
            media encoder, and sometimes it is not possible to change at all.
            For video, the sampling rate is very closely connected to the
            image <xref target="sec-hor-pixels">horizontal</xref>, <xref
            target="sec-ver-pixels">vertical</xref> resolution, and <xref
            target="sec-framerate">framerate</xref>, which are more explicit
            and meaningful and SHOULD therefore be used instead. For audio,
            changing sampling rate may require changing codec and thus
            changing RTP payload type. The actual media sampling rate may not
            be identical to the sampling rate specified for RTP Time Stamps
            for that RTP Payload Type. E.g. almost all video codecs use only
            a 90,000 Hz sampling clock for RTP Time Stamps, while the actual
            pixel sampling clock is typically in the range from a few to
            several hundred MHz. Also, some recent audio codecs use an RTP
            Time Stamp rate that differs from the actual media sampling rate.
            Aspects related to mid-stream changes of RTP Time Stamp rate are
            described in <xref
            target="I-D.ietf-avtext-multiple-clock-rates"/>.</t>
          </list></t>
      </section>

      <section anchor="sec-max-rtp-size" title="Maximum RTP Packet Size">
        <t><list style="hanging">
            <t hangText="Type Value:">12</t>

            <t hangText="Tag:">max-rtp-size</t>

            <t hangText="Unit:">Bytes.</t>

            <t hangText="Semantics:">The maximum size of an RTP packet,
            including the RTP header but excluding lower layers.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">All.</t>

            <t hangText="Value Restrictions:">The meaning of a value less than
            the size of the RTP header (12 bytes for <xref
            target="RFC3550">current RTP specification</xref>) is not defined
            and SHOULD NOT be used. Values that can be represented using a
            32-bit unsigned integer, i.e. 0 to 4,294,967,295.</t>

            <t hangText="Default Value:">1400 bytes for IPv4, 1280 bytes for
            IPv6 or if IP version cannot be determined.</t>

            <t hangText="Comparison Types:">Maximum.</t>

            <t hangText="Note:">The parameter should typically be used to
            adapt encoding to a known or assumed MTU limitation, and MAY be
            used to assist MTU path discovery in point-to-point as well as in
            RTP mixer or translator topologies.</t>
          </list></t>
      </section>

      <section anchor="sec-max-rtp-rate" title="Maximum RTP Packet Rate">
        <t><list style="hanging">
            <t hangText="Type Value:">13</t>

            <t hangText="Tag:">max-rtp-rate</t>

            <t hangText="Unit:">RTP packets per second.</t>

            <t hangText="Semantics:">Maximum number of RTP packets per second,
            calculated or estimated as the largest value appearing during a
            one-second sliding window, similar to the definition of <xref
            target="RFC3890">"maxprate"</xref>.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">All.</t>

            <t hangText="Value Restrictions:">The meaning of the value 0 is
            not defined and SHALL NOT be used. Values that can be represented
            using a 32-bit unsigned integer, i.e. 1 to 4,294,967,295.</t>

            <t hangText="Default Value:">Not set.</t>

            <t hangText="Comparison Types:">Maximum.</t>

            <t hangText="Note:">The parameter should typically be used to
            adapt encoding on a network that is packet rate rather than
            bitrate limited, if such property is known. This codec
            configuration parameter MUST NOT exceed any negotiated <xref
            target="RFC3890">"maxprate"</xref> value, if present.</t>
          </list></t>
      </section>

      <section anchor="sec-frame-agg"
               title="Application Data Unit Aggregation">
        <t><list style="hanging">
            <t hangText="Type Value:">14</t>

            <t hangText="Tag:">aggregate</t>

            <t hangText="Unit:">Milliseconds.</t>

            <t hangText="Semantics:">The amount of non-redundant application
            data units (ADUs), representing different RTP Time Stamps, that
            should be included in the RTP payload, henceforth in this
            specification called an "ADU aggregate". An ADU aggregation value
            of 1 is equivalent to no aggregation.</t>

            <t hangText="Encoding:">Binary encoded unsigned integer, most
            significant byte first.</t>

            <t hangText="Media Types:">Mainly intended for audio, but MAY be
            used also for other media, e.g. <xref target="RFC4103">Real-Time
            Text</xref>.</t>

            <t hangText="Value Restrictions:">The meaning of the value 0 is
            not defined and SHALL NOT be used. Values that can be represented
            using a 16-bit unsigned integer, i.e. 1 to 65,535.</t>

            <t hangText="Default Value:">1.</t>

            <t hangText="Comparison Types:">All.</t>

            <t hangText="Note:">To use this parameter, there MUST exist a
            defined way of including multiple ADUs into the same RTP payload
            for the used RTP Payload Type. There MUST also exist a known
            internal timing relationship between individual ADUs within the
            RTP payload for the used RTP Payload Type. Some payload formats
            (typically video) do not allow multiple ADUs (representing
            different sampling times) in the RTP payload. This codec
            configuration parameter SHOULD NOT be used unless the <xref
            target="RFC3890">"maxprate"</xref> and/or "ptime" parameters are
            included in the SDP. The requested ADU aggregation level MUST NOT
            cause the negotiated "maxprate" value, if present, to be exceeded,
            and SHOULD NOT exceed the negotiated "ptime" value, if present.
            The requested ADU aggregation level MUST NOT be in conflict with
            any <xref target="sec-max-rtp-size">Maximum RTP Packet Size</xref>
            or <xref target="sec-max-rtp-rate">Maximum RTP Packet Rate</xref>
            parameters. The packet rate that may result from different ADU
            aggregation values is related to, but semantically not the same
            as, media <xref target="sec-framerate">Framerate</xref>.</t>
          </list></t>
      </section>
    </section>

    <section anchor="sec-sdp" title="SDP Extensions">
      <t>As described in <xref target="RFC4585"/> and <xref
      target="RFC5104"/>, the rtcp-fb attribute may be used to negotiate
      capability to handle specific AVPF commands and indications, and
      specifically the "ccm" feedback value is used for codec control. All
      rules defined there related to use of "rtcp-fb" and "ccm" also apply to
      the new feedback message defined in this specification.</t>

      <section anchor="sec-rtcp-fb" title="Extension of the rtcp-fb Attribute">
        <t>In this document, a new "ccm" rtcp-fb-ccm-param is defined,
        according to the method of extension described in <xref
        target="RFC5104"/>:<list style="symbols">
            <t>"cop" indicates support for all COP message items defined in
            this specification, and one or more of the codec configuration
            parameters defined in this specification.</t>
          </list>The <xref target="RFC5234">ABNF</xref> for the new
        rtcp-fb-ccm-param is:</t>

        <figure anchor="fig-abnf-cop" title="ABNF for cop">
          <artwork><![CDATA[rtcp-fb-ccm-param =/ SP "cop" 1*rtcp-fb-ccm-cop-param
; rtcp-fb-ccm-param defined in [RFC5104]

rtcp-fb-ccm-cop-param = SP "alt"
                      / SP "id"
                      / SP "pt"
                      / SP "bitrate"
                      / SP "token-bucket"
                      / SP "framerate"
                      / SP "hor-size"
                      / SP "ver-size"
                      / SP "sar"
                      / SP "par"
                      / SP "channels"
                      / SP "sampling"
                      / SP "max-rtp-size"
                      / SP "max-rtp-rate"
                      / SP "aggregate"
                      / SP token ; for future extensions
; token defined in [RFC4566]]]></artwork>
        </figure>

        <t>Token values for rtcp-fb-ccm-cop-param are defined in <xref
        target="tab-paramtype"/>. Their semantics are described in <xref
        target="sec-parameters"/>.</t>

        <t>Supported parameter types are indicated by including one or more
        rtcp-fb-ccm-cop-param.</t>
      </section>

      <section title="Offer/Answer Usage">
        <t>The usage of <xref target="RFC3264">Offer/Answer</xref> in this
        specification inherits all applicable usage defined in <xref
        target="RFC5104"/>.</t>

        <t>In order to announce support, and willingness to use, the CCM "cop"
        feedback message, an offerer or answerer SHALL indicate that
        capability through the extended SDP rtcp-fb attribute, defined in
        <xref target="sec-rtcp-fb"/>. The offerer or answerer MUST include a
        list of the parameter types that it is willing to receive.</t>

        <t>If an SDP offer does not indicate support of the CCM "cop" feedback
        message, the answerer MUST NOT indicate support in the associated SDP
        answer.</t>

        <t>The answerer MAY remove parameter types that were present in the
        associated SDP offer, and MAY add parameter types that were not
        present in it. If the answerer adds parameter types to the SDP
        answer, it MUST be able to receive such messages, but the answerer
        MUST NOT send such messages towards the offerer.</t>

        <t>If an SDP answer does not indicate support of the CCM "cop"
        feedback message, the offerer MUST NOT send such messages towards the
        answerer.</t>

        <t>The offerer and the answerer SHOULD NOT send any parameter types
        that the remote party did not indicate receive support for. As
        described in <xref target="sec-parameters"/>, a parameter with an
        unknown ParamType SHALL be ignored on reception in a COPN and SHALL
        either be reported as unknown in COPS or be ignored when received in
        COPR.</t>

        <t>Entities MUST list all supported parameter types in every
        subsequent SDP offer or answer associated with the session. If a
        parameter type is not listed, it is an indication that the offerer or
        answerer is no longer willing to receive such messages within the
        session.</t>
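        <t>Because each side lists the parameter types it is willing to
        receive, the set of types one party should restrict itself to when
        sending is the set listed by the remote party. The following
        non-normative sketch extracts that set from SDP text. The function
        names are hypothetical, and the parser is deliberately minimal: it
        assumes one complete "a=rtcp-fb" attribute per line and does not
        handle folded lines or wildcard payload types.</t>

        <figure>
          <artwork><![CDATA[
```python
def cop_params(sdp):
    """Map payload type to the set of listed COP parameter tags."""
    result = {}
    prefix = "a=rtcp-fb:"
    for line in sdp.splitlines():
        parts = line.strip().split()
        if (len(parts) >= 3 and parts[0].startswith(prefix)
                and parts[1:3] == ["ccm", "cop"]):
            payload_type = parts[0][len(prefix):]
            result.setdefault(payload_type, set()).update(parts[3:])
    return result


def may_send(remote_sdp, payload_type):
    """Parameter types the remote party indicated it can receive."""
    return cop_params(remote_sdp).get(payload_type, set())
```
]]></artwork>
        </figure>

        <t>An empty result for a payload type means, in this sketch, that
        the remote party did not indicate support for CCM "cop" on that
        payload type, so no such messages should be sent for it.</t>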
      </section>

      <section title="Declarative Usage">
        <t>Declarative use of the CCM "cop" does not differ from the
        Offer/Answer usage.</t>
      </section>
    </section>

    <section anchor="sec-codec-sub-stream-id"
             title="Codec Sub-Stream Identification">
      <t>The defined mechanism is not bound to a specific codec. It uses the
      main characteristics of a chosen set of media types, including audio and
      video. To what extent this mechanism can be applied depends on which
      specific codec is used.</t>

      <t>When using a codec that can produce separate sub-streams within a
      single SSRC, those sub-streams can only be referred to by a COP OPID if
      there is a defined relation to the codec-specific sub-stream
      identification. This is accomplished in this specification by defining
      an ID parameter format using codec-specific sub-stream identification
      for each such codec.</t>

      <t>If such sub-streams have dependencies, the OPID describes the
      characteristics of the sub-stream including all its dependencies, but
      excluding any sub-streams that are dependent on this sub-stream. The
      sub-stream identification describes a single, payload-specific node in a
      dependency tree, and does not in general include any identification of
      the sub-streams it depends on, or the dependency structure between
      sub-streams. Any dependency structure must thus be described by the
      media stream payload format and is out of scope for this
      specification.</t>

      <t>This section contains ID parameter format definitions for a few
      selected codecs. The format definitions MUST use an integer number of
      bytes and MUST define all bits in those bytes. Note, the ID parameter is
      interpreted in the context of a given SSRC and a specific RTP payload
      type.</t>

      <t>Extensions to this specification MAY add more codec-specific
      definitions than the ones described in the sub-sections below. Such
      definitions made in extensions to this specification SHOULD be
      considered as an integrated part of this section, with respect to usage
      with other mechanisms defined in this specification.</t>

      <section title="H.264 AVC">
        <t>Some non-scalable video codecs such as <xref target="H264">H.264
        AVC</xref> and corresponding <xref target="RFC6184">RTP payload
        format</xref> can accomplish simultaneous encoding of multiple
        operation points. H.264 AVC can encode a video stream using
        limited-reference and non-reference frames such that it enables
        limited temporal scalability, by use of the nal_ref_idc syntax
        element.</t>

        <t>The ID parameter type is defined below:</t>

        <figure align="center" anchor="fig-id-avc"
                title="ID definition for AVC">
          <artwork><![CDATA[ 0
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|  Reserved | N |
+-+-+-+-+-+-+-+-+]]></artwork>
        </figure>

        <t><list style="hanging">
            <t hangText="Reserved (6 bits):">Reserved. SHALL be set to 0 by
            senders and SHALL be ignored by receivers implementing this
            specification. MAY be defined differently by extensions to this
            specification.</t>

            <t hangText="N (2 bits):">SHALL be identical to the highest value
            of the nal_ref_idc H.264 NAL header syntax element valid for the
            sub-bitstream described by this OPID, with the exception of
            nal_ref_idc value 3 that is valid for and is part of all
            sub-bitstreams.</t>
          </list></t>
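        <t>The one-byte layout above can be sketched as follows. This is a
        non-normative illustration with hypothetical function names. It
        follows the usual convention that bit 0 in the figure is the most
        significant bit, so N occupies the two least significant bits of the
        byte.</t>

        <figure>
          <artwork><![CDATA[
```python
def pack_avc_id(n):
    """Build the one-byte AVC ID: 6 reserved zero bits, then N."""
    if not 0 <= n <= 3:
        raise ValueError("N is a 2-bit field")
    return bytes([n])  # reserved bits are set to 0 by senders


def unpack_avc_id(data):
    """Return N; the reserved bits are ignored on reception."""
    return data[0] & 0x03
```
]]></artwork>
        </figure>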
      </section>

      <section title="H.264 SVC">
        <t>This document specifies the usage of multiple, simultaneous codec
        operation points and therefore maps well to scalable video coding.
        Scalable video coding such as <xref target="H264">H.264 SVC (Annex
        G)</xref> uses three scalability dimensions: temporal, spatial, and
        quality. It also includes the possibility to use redundant encodings
        and priority among sub-streams.</t>

        <t>The ID SHALL be considered to describe an SVC sub-bitstream, which
        is defined in <xref target="H264">G.3.59 of H.264</xref> and
        corresponding <xref target="RFC6190">RTP payload format</xref>. For
        use with H.264 SVC, ID SHALL be constructed as defined below:</t>

        <figure align="center" anchor="fig-id-svc"
                title="ID definition for SVC">
          <artwork><![CDATA[ 0                   1                   2
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R|    PID    |     RPC     | DID |  QID  | TID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+]]></artwork>
        </figure>

        <t><list style="hanging">
            <t hangText="R (1 bit):">Reserved. SHALL be set to 0 by senders
            and SHALL be ignored by receivers implementing this specification.
            MAY be defined differently by extensions to this
            specification.</t>

            <t hangText="PID (6 bits):">SHALL be identical to an unsigned
            binary integer representation of the priority_id H.264 syntax
            element valid for the sub-bitstream described by this OPID. SHALL
            be set to 0 if no priority_id is available.</t>

            <t hangText="RPC (7 bits):">SHALL be identical to an unsigned
            binary integer representation of the redundant_pic_cnt H.264
            syntax element valid for the sub-bitstream described by this OPID.
            SHALL be set to 0 if no redundant_pic_cnt is available.</t>

            <t hangText="DID (3 bits):">SHALL be identical to the
            dependency_id H.264 syntax element valid for the sub-bitstream
            described by this OPID.</t>

            <t hangText="QID (4 bits):">SHALL be identical to the quality_id
            H.264 syntax element valid for the sub-bitstream described by this
            OPID.</t>

            <t hangText="TID (3 bits):">SHALL be identical to the temporal_id
            H.264 syntax element valid for the sub-bitstream described by this
            OPID.</t>
          </list></t>
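        <t>The three-byte layout above can be sketched as follows. This is a
        non-normative illustration with hypothetical function names. It
        packs the fields in the order shown in the figure, most significant
        bit first, with the R bit left at 0.</t>

        <figure>
          <artwork><![CDATA[
```python
def pack_svc_id(pid, rpc, did, qid, tid):
    """Build the 24-bit SVC ID: R, PID, RPC, DID, QID, TID."""
    fields = ((pid, 6), (rpc, 7), (did, 3), (qid, 4), (tid, 3))
    value = 0  # the leading R bit stays 0
    for field, width in fields:
        if not 0 <= field < (1 << width):
            raise ValueError("field does not fit its bit width")
        value = (value << width) | field
    return value.to_bytes(3, "big")


def unpack_svc_id(data):
    """Return (pid, rpc, did, qid, tid); the R bit is ignored."""
    value = int.from_bytes(data[:3], "big")
    tid = value & 0x07
    qid = (value >> 3) & 0x0F
    did = (value >> 7) & 0x07
    rpc = (value >> 10) & 0x7F
    pid = (value >> 17) & 0x3F
    return pid, rpc, did, qid, tid
```
]]></artwork>
        </figure>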
      </section>
    </section>

    <section title="Examples">
      <t>COP messages are binary encoded. However, in the following examples,
      all COP messages are for clarity listed in symbolic, pseudo-code form,
      where only COP message fields of interest to the example are included,
      along with the COP parameters.</t>

      <section anchor="sec-ex-offer-answer" title="SDP Offer/Answer">
        <t>The SDP capabilities for COP are defined as receiver capabilities,
        meaning that there is no explicit indication of which COP messages an
        endpoint will use in the send direction. It is however reasonable to
        expect that an endpoint can also send the same messages that it can
        understand and act on when received. This is assumed in all the SDP
        examples below, but note that symmetric COP capabilities are not a
        requirement.</t>

        <t>The example below shows an SDP Offer, where support for the CCM
        "cop" message is announced for the video codecs.</t>

        <figure anchor="fig-ex-sdp-offer"
                title="SDP offer (COP support indicated)">
          <artwork><![CDATA[v=0
o=alice 2890844526 2890844526 IN IP4 host.atlanta.example
s=-
c=IN IP4 host.atlanta.example
t=0 0    
m=audio 50000 RTP/AVP 0 8 97    
b=AS:80
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
m=video 50010 RTP/AVPF 31 32
b=AS:600
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000
a=rtcp-fb:31 ccm cop framerate bitrate token-rate
a=rtcp-fb:32 ccm cop hor-size ver-size framerate bitrate \
             token-rate]]></artwork>
        </figure>

        <t>Note that the offer contains two different video payload types, and
        that the COP parameters differ between them, meaning that the
        possibilities for codec configuration also differ. In this case, the
        MPEG-1 codec can control both framerate and image size, but for H.261
        only the framerate can be controlled.</t>
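
        <t>As a non-normative illustration, the per-payload-type COP
        parameters can be extracted from such an SDP body roughly as in the
        Python sketch below. The function name cop_params_by_payload_type is
        hypothetical, the parsing is deliberately minimal, and the trailing
        backslash is treated as the line wrapping used in the figures of
        this section, which is an assumption about presentation only.</t>

        <figure>
          <artwork><![CDATA[
```python
def cop_params_by_payload_type(sdp):
    """Map payload type to the COP parameters listed for it."""
    # Undo the backslash line wrapping used in the figures.
    joined = sdp.replace("\\\n", " ")
    result = {}
    for line in joined.splitlines():
        line = line.strip()
        if not line.startswith("a=rtcp-fb:"):
            continue
        tokens = line[len("a=rtcp-fb:"):].split()
        # Expected shape: <payload type> ccm cop <parameters...>
        if len(tokens) >= 3 and tokens[1:3] == ["ccm", "cop"]:
            result[tokens[0]] = tokens[3:]
    return result


offer = (
    "m=video 50010 RTP/AVPF 31 32\n"
    "a=rtpmap:31 H261/90000\n"
    "a=rtpmap:32 MPV/90000\n"
    "a=rtcp-fb:31 ccm cop framerate bitrate token-rate\n"
    "a=rtcp-fb:32 ccm cop hor-size ver-size framerate bitrate \\\n"
    "             token-rate\n"
)
for pt, params in cop_params_by_payload_type(offer).items():
    print(pt, params)
```
]]></artwork>
        </figure>

        <t>For the offer above, payload type 31 yields only framerate,
        bitrate and token-rate, while payload type 32 additionally yields
        hor-size and ver-size.</t>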

        <t>In the SDP Answer below, responding to the above offer, the
        answerer supports CCM "cop" messages.</t>

        <figure anchor="fig-ex-sdp-answer"
                title="SDP answer (COP support indicated)">
          <artwork><![CDATA[v=0
o=bob 2808844564 2808844564 IN IP4 host.biloxi.example
s=-
c=IN IP4 host.biloxi.example
t=0 0    
m=audio 52000 RTP/AVP 0    
b=AS:80
a=rtpmap:0 PCMU/8000
m=video 52100 RTP/AVPF 32    
b=AS:600
a=rtpmap:32 MPV/90000
a=rtcp-fb:32 ccm cop hor-size ver-size framerate bitrate \
             token-rate packet-size]]></artwork>
        </figure>

        <t>Note that the answerer indicates support for more parameter types
        than the offerer.</t>

        <t>Below is another SDP Answer, also responding to the same offer
        above, where the answerer does not support "cop".</t>

        <figure anchor="fig-ex-sdp-answer-no"
                title="SDP answer (COP support not indicated)">
          <artwork><![CDATA[v=0
o=bob 2808844564 2808844564 IN IP4 host.biloxi.example
s=-
c=IN IP4 host.biloxi.example
t=0 0    
m=audio 52000 RTP/AVP 0    
b=AS:80
a=rtpmap:0 PCMU/8000
m=video 52100 RTP/AVPF 32    
b=AS:600
a=rtpmap:32 MPV/90000]]></artwork>
        </figure>
      </section>

      <section title="Dynamic Video Re-sizing">
        <t>In this example, two COP-enabled endpoints communicate in an
        audio/video session. The receiving endpoint has a graphical user
        interface that can be dynamically changed by the user. This user
        interaction includes the ability to change the size of the receiving
        video window, which is also indicated in the <xref
        target="sec-ex-offer-answer">previous SDP example</xref>.</t>

        <t>At some point during the established communication, a notification
        about the current video stream codec operation point is sent to the
        resizable window endpoint that receives the video stream.</t>

        <figure align="center" anchor="fig-ex-resize-copn"
                title="COPN for QVGA 15 Hz">
          <artwork><![CDATA[COPN {SSRC:123456, OPID:123, Version:5,
      bitrate(max):325000,
      token-bucket(exact):1000,
      framerate(exact):15,
      hor-size(exact):320,
      ver-size(exact):240}]]></artwork>
        </figure>

        <t>Some time later the user of the resizable window endpoint reduces
        the size of the video window. As a result of the resize operation, the
        video window can no longer make full use of the received video
        resolution, wasting bandwidth and decoder processing resources. The
        resizable window endpoint thus decides to notify the video stream
        sender about the changed conditions by sending a request for a video
        stream of smaller size:</t>

        <figure align="center" anchor="fig-ex-resize-copr"
                title="COPR for 243x185">
          <artwork><![CDATA[COPR {SSRC:123456, OPID:123, Version:5,
      hor-size(target):243,
      ver-size(target):185}]]></artwork>
        </figure>

        <t>The COPR refers to the previously received COPN with the same OPID
        and Version, and thus needs to list only the parameters that are to be
        changed. The request could arguably also contain other parameters that
        are potentially affected by the spatial resolution, such as the
        bitrate, but these can be omitted since the media sender is not slaved
        to the request but is allowed to make its own decisions based on the
        request.</t>

        <t>The request sender has chosen to use target type values instead of
        exact values for the horizontal and vertical sizes, which can be
        interpreted as "anything sufficiently similar is acceptable". The
        target values are in this example chosen to correspond exactly to the
        resized video display area. Many video coding algorithms operate most
        efficiently when the image size is an integer multiple of some block
        size, and this way of expressing the request explicitly leaves room
        for the media sender to take such aspects into account.</t>

        <t>The media sender (COPR receiver) responds with the following:</t>

        <figure anchor="fig-ex-resize-cops-copn"
                title="COPS and COPN for partial success">
          <artwork align="center"><![CDATA[COPS {SSRC:123456, OPID:123, Version:5,
      Partial Success,
      One or more parameter values in the request were changed}

COPN {SSRC:123456, OPID:123, Version:6,
      bitrate(max):240000,
      token-bucket(exact):1000,
      framerate(exact):15,
      hor-size(exact):240,
      ver-size(exact):176}]]></artwork>
        </figure>

        <t>It can be noted that the updated COPN (version 6) indicates that
        the media sender has, in addition to reducing the video horizontal and
        vertical size, chosen to also reduce the bitrate. This bitrate
        reduction was not in the request, but is a reasonable decision taken
        by the media sender. It can also be seen that the horizontal and
        vertical sizes are not identical to the requested values, but are in
        fact adjusted to be integer multiples of 16, which is a local
        restriction of the fictitious video encoder in this example. To handle
        the mismatch between the request and the resulting video stream, the
        video receiver can perform some local action, for example automatic
        readjustment of the resized window, image scaling (possibly combined
        with cropping), or padding.</t>
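
        <t>The size adjustment in this example is consistent with rounding
        each target value down to the nearest multiple of 16. The Python
        sketch below illustrates that arithmetic only; the helper name
        fit_to_block_grid and the choice to round down (rather than to the
        nearest multiple) are assumptions, and a real encoder is free to
        apply a different policy.</t>

        <figure>
          <artwork><![CDATA[
```python
def fit_to_block_grid(target, block=16):
    """Round target down to a multiple of block (at least one)."""
    if target < block:
        return block  # never go below a single block
    return (target // block) * block


requested = {"hor-size": 243, "ver-size": 185}
granted = {name: fit_to_block_grid(value)
           for name, value in requested.items()}
print(granted)  # {'hor-size': 240, 'ver-size': 176}
```
]]></artwork>
        </figure>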
      </section>

      <section title="Illegal Request">
        <t>In this example, the request asks the media sender to go beyond
        what is negotiated in the SDP. The SDP Offer below indicates the use
        of video with H.264 Constrained Baseline Profile at level 1.1.</t>

        <figure anchor="fig-ex-illegal-sdp"
                title="SDP offer with H.264 level 1.1">
          <artwork><![CDATA[v=0
o=alice 2893746526 2893746526 IN IP4 host.atlanta.example
s=-
c=IN IP4 host.atlanta.example
t=0 0    
m=audio 49160 RTP/AVP 96    
b=AS:80
a=rtpmap:96 G722/16000
m=video 51920 RTP/AVPF 97
b=AS:200
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42e00b
a=rtcp-fb:97 ccm cop framerate bitrate token-rate]]></artwork>
        </figure>

        <t>Assuming this offer is accepted and that the answerer also supports
        COP, further assume that this COP message exchange occurs at some time
        during the established communication:</t>

        <figure align="center" anchor="fig-ex-illegal-cop"
                title="COP message exchange indicating failure">
          <artwork><![CDATA[Media Sender                      Media Receiver
------------                      --------------

COPN {SSRC:9876, OPID:67,      ->
      Version:2,
      bitrate(exact):190000,
      token-bucket(exact):500,
      framerate(exact):10,
      hor-size(exact):320,
      ver-size(exact):240}

                               <-  COPR {SSRC:9876, OPID:67,
                                         Version:2,
                                         framerate(exact):10,
                                         hor-size(exact):352,
                                         ver-size(exact):288}

COPS {SSRC:9876, OPID:67,      ->
      Version:2,
      Failure,
      Request violates capability limits}]]></artwork>
        </figure>

        <t>The failure above is due to a combination of frame size and frame
        rate that exceeds H.264 level 1.1, which would thus exceed the limits
        established by SDP Offer/Answer. The maximum permitted framerate for
        352x288 pixels (CIF) is 7.6 Hz for H.264 level 1.1, as defined in
        Annex A of <xref target="H264"/>.</t>
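
        <t>The 7.6 Hz figure follows from the level limits: a 352x288 frame
        occupies 22 x 18 = 396 macroblocks, and level 1.1 permits 3000
        macroblocks per second, giving 3000 / 396, or about 7.6 frames per
        second. The Python sketch below reproduces this check. The function
        names are hypothetical, the limits table is a small excerpt recalled
        from Table A-1 of H.264 that should be verified against the
        Recommendation, and the special signalling of level 1b is not
        handled.</t>

        <figure>
          <artwork><![CDATA[
```python
import math

# (MaxMBPS, MaxFS) per level_idc, as recalled from Table A-1 of
# H.264; verify against the Recommendation before relying on it.
LEVEL_LIMITS = {
    10: (1485, 99),    # level 1
    11: (3000, 396),   # level 1.1
    12: (6000, 396),   # level 1.2
    13: (11880, 396),  # level 1.3
}


def parse_profile_level_id(value):
    """Split the 3-byte hex string into its three fields."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be 6 hex digits")
    profile_idc, profile_iop, level_idc = raw
    return profile_idc, profile_iop, level_idc


def frame_macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover the frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def max_framerate(level_idc, width, height):
    """Highest frame rate the level allows; 0.0 if frame too big."""
    max_mbps, max_fs = LEVEL_LIMITS[level_idc]
    mbs = frame_macroblocks(width, height)
    return max_mbps / mbs if mbs <= max_fs else 0.0


_, _, level = parse_profile_level_id("42e00b")
print(round(max_framerate(level, 352, 288), 1))  # 7.6
print(max_framerate(level, 320, 240) >= 10)      # True
print(max_framerate(level, 352, 288) >= 10)      # False
```
]]></artwork>
        </figure>

        <t>Under these assumptions the operation point in the COPN (320x240
        at 10 Hz) sits exactly at the level 1.1 limit, while the requested
        352x288 at 10 Hz exceeds it.</t>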
      </section>

      <section title="Reference Response to Modification of Scalable Layer">
        <t>When scalable coding is used, each layer corresponds to a codec
        operation point. A media receiver can thus target a request towards a
        single layer. Assume a video encoding with three framerate layers,
        announced in a (multiple operation point) notification as:</t>

        <figure align="center" anchor="fig-ex-mod-copn"
                title="COPN indicating three framerate layers">
          <artwork><![CDATA[COPN {SSRC:9876, OPID:67, Version:2, ID:2,
      bitrate(exact):190000,
      token-bucket(exact):500,
      framerate(exact):10,
      hor-size(exact):320,
      ver-size(exact):240}

COPN {SSRC:9876, OPID:73, Version:1, ID:1,
      bitrate(exact):350000,
      token-bucket(exact):600,
      framerate(exact):30,
      hor-size(exact):320,
      ver-size(exact):240}

COPN {SSRC:9876, OPID:95, Version:5, ID:0,
      bitrate(exact):400000,
      token-bucket(exact):800,
      framerate(exact):60,
      hor-size(exact):320,
      ver-size(exact):240}]]></artwork>
        </figure>

        <t>Assume further that the media receiver is not pleased with the low
        framerate of OPID 67, wanting to increase it from 10 Hz to 25-30 Hz.
        Note that the media receiver still wants to receive the other layers
        unchanged, not remove them, and thus has to explicitly indicate this
        by including them without parameters.</t>

        <figure align="center" anchor="fig-ex-mod-copr"
                title="COPR requesting to change one layer">
          <artwork><![CDATA[COPR {SSRC:9876, OPID:67, Version:2,
      framerate(greater):25,
      framerate(less):30}

COPR {SSRC:9876, OPID:73, Version:1}

COPR {SSRC:9876, OPID:95, Version:5}]]></artwork>
        </figure>
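
        <t>The rule that unchanged layers are listed without parameters can
        be sketched as follows in Python. The function name build_copr_set
        and the tuple representation of a request are hypothetical and serve
        only to illustrate the symbolic form used in these examples.</t>

        <figure>
          <artwork><![CDATA[
```python
def build_copr_set(active, changes):
    """Return one (opid, version, params) entry per active layer.

    active:  list of (opid, version) from the current COPN set
    changes: dict mapping opid to a list of
             (parameter, comparison type, value) tuples
    """
    unknown = set(changes) - {opid for opid, _ in active}
    if unknown:
        raise KeyError(f"unknown OPID(s): {sorted(unknown)}")
    # Layers without changes get an empty parameter list, which
    # keeps them in the request instead of removing them.
    return [(opid, version, list(changes.get(opid, [])))
            for opid, version in active]


active = [(67, 2), (73, 1), (95, 5)]
changes = {67: [("framerate", "greater", 25),
                ("framerate", "less", 30)]}
for entry in build_copr_set(active, changes):
    print(entry)
```
]]></artwork>
        </figure>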

        <t>The media sender decides it cannot meet the request for OPID 67,
        but instead considers (an unmodified) OPID 73 (with ID 1) to be a
        sufficiently good match:</t>

        <figure align="center" anchor="fig-ex-mod-cops-copn"
                title="COPS and COPN with layer modification partial success">
          <artwork><![CDATA[COPS {SSRC:9876, OPID:67, Version:2,
      Partial Success,
      One or more parameter values in the request were changed,
      ID:1}

(COPN for the other two OPIDs omitted here for brevity)

COPN {SSRC:9876, OPID:73, Version:1, ID:1,
      bitrate(exact):350000,
      token-bucket(exact):600,
      framerate(exact):30,
      hor-size(exact):320,
      ver-size(exact):240}]]></artwork>
        </figure>

        <t>The COPS indicates partial success and uses the ID number to refer
        to another OPID, describing the best compromise that can currently be
        used to meet the request. The COPS does not contain the referred OPID,
        but the ID should be defined in a codec-specific way that makes it
        possible to identify the layer directly in the media stream. If the
        corresponding OPID is needed, for example to attempt another request
        targeting it, it can be found by searching the active set of COPN for
        matching ID values.</t>
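
        <t>A minimal Python sketch of that search is shown below. The Copn
        class and the find_opid function are hypothetical names for a
        receiver's local bookkeeping, not structures defined by this
        specification.</t>

        <figure>
          <artwork><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copn:
    """Local record of one active notification (hypothetical)."""
    ssrc: int
    opid: int
    version: int
    layer_id: int  # the ID value announced in the COPN


def find_opid(active_copns, ssrc, layer_id):
    """Return the OPID whose COPN carries layer_id, else None."""
    for copn in active_copns:
        if copn.ssrc == ssrc and copn.layer_id == layer_id:
            return copn.opid
    return None


active = [Copn(9876, 67, 2, 2),
          Copn(9876, 73, 1, 1),
          Copn(9876, 95, 5, 0)]
print(find_opid(active, 9876, 1))  # 73
```
]]></artwork>
        </figure>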
      </section>

      <section title="Successful Request to Add Codec Operation Point">
        <t>In this example, the media receiver is receiving a non-scalable
        stream from a codec that can support scalability, and wishes to add a
        scalability layer. Assume the existing OPID from the media sender is
        announced as:</t>

        <figure align="center" anchor="fig-ex-add-copn"
                title="COPN with single operation point">
          <artwork><![CDATA[COPN {SSRC:3492, OPID:4, Version:2,
      bitrate(exact):350000,
      token-bucket(exact):600,
      framerate(exact):30,
      hor-size(exact):320,
      ver-size(exact):240}]]></artwork>
        </figure>

        <t>The media receiver constructs a request for multiple streams by
        including multiple requests for different OPIDs. Since the new stream
        does not exist, it has no OPID from the media sender, so the receiver
        chooses a random value as reference and indicates that it is a new,
        temporary OPID. The request for the new stream includes all parameters
        that the media receiver has an opinion on, and leaves the other
        parameters to be chosen by the media sender. In this case it is a
        request for identical frame size and doubled framerate.</t>

        <figure align="center" anchor="fig-ex-add-copr"
                title="COPR requesting to add operation point">
          <artwork><![CDATA[COPR {SSRC:3492, OPID:4, Version:2}

COPR {SSRC:3492, OPID:237, New, Version:0,
      framerate(exact):60,
      hor-size(exact):320,
      ver-size(exact):240}]]></artwork>
        </figure>

        <t>The media sender decides it can start layered encoding with the
        requested parameters. The status response to the new OPID contains a
        reference to an ID that is included as part of the matching,
        subsequent COPN. Note that since both the original and the new streams
        are now part of a scalable set, they must both be identified with ID
        parameters to be able to distinguish between them. The media sender
        has chosen an OPID for the new stream in the COPN, which need not be
        identical to the temporary one in the request, but the new stream can
        anyway be uniquely identified through the ID that is announced in both
        the COPS and COPN.</t>

        <t>Note that since the ID has a defined relation to the media
        sub-stream identification, decoding of that new sub-stream can start
        immediately after receiving the COPS. It may however not be possible
        to describe the new stream in COP parameter terms until the COPN is
        received (depending on COP parameter visibility directly in the media
        stream).</t>

        <figure align="center" anchor="fig-ex-add-cops-copn"
                title="COPS and COPN indicating operation point added">
          <artwork><![CDATA[COPS {SSRC:3492, OPID:4, Version:2,
      Success, Success,
      ID:1}

COPS {SSRC:3492, OPID:237, New, Version:0,
      Success, Success,
      ID:0}

COPN {SSRC:3492, OPID:4, Version:2, ID:1,
      bitrate(exact):350000,
      token-bucket(exact):600,
      framerate(exact):30,
      hor-size(exact):320,
      ver-size(exact):240}

COPN {SSRC:3492, OPID:9, Version:0, ID:0,
      bitrate(exact):390000,
      token-bucket(exact):600,
      framerate(exact):60,
      hor-size(exact):320,
      ver-size(exact):240}]]></artwork>
        </figure>
      </section>
    </section>

    <section title="IANA Considerations">
      <t>Following the guidelines in <xref target="RFC4566"/>, in <xref
      target="RFC4585"/>, and in <xref target="RFC3550"/>, the IANA is
      requested to register:</t>

      <t><list style="numbers">
          <t hangText="Grouping Tag">The 'cop' tag to be used with ccm under
          rtcp-fb AVPF attribute in SDP.</t>

          <t hangText="New SDES">The FMT number TBA1 to be allocated to the
          COP feedback message from this specification.</t>

          <t hangText="New SDES">A registry listing registered values for
          'cop' message item type, with initial values from <xref
          target="tab-itemtype"/>.</t>

          <t hangText="New SDES">A registry listing registered values and tag
          names for 'cop' parameter type, with initial values from <xref
          target="tab-paramtype"/>.</t>
        </list></t>
    </section>

    <section title="Security Considerations">
      <t>This document extends the <xref target="RFC5104">CCM</xref> and
      defines new messages, i.e. COPR, COPN and COPS. The exchange of these
      new messages may have security implications, which need to be
      addressed by the user. Some important implications are:</t>

      <t><list style="numbers">
          <t>Identity spoofing - An attacker can impersonate an
          authenticated user and falsely control or indicate the codec
          parameters of any source transmission. In order to prevent this type
          of attack, a strong authentication and integrity protection
          mechanism is needed.</t>

          <t>Denial of Service (DoS) - An attacker can falsely set codec
          parameters for all the source streams, which may result in denial
          of service. An authentication protocol can protect against this
          attack.</t>

          <t>Man-in-the-Middle (MITM) attack - The codec configuration and
          notification of changes of the RTP source are prone to a
          man-in-the-middle attack. Public key authentication can be used to
          prevent MITM attacks.</t>
        </list></t>
    </section>

    <section title="Open Issues">
      <t>There is currently no defined way for a media receiver to indicate
      that it wants to release the restrictions it previously had on an
      operation point, if the media stream contains only a single operation
      point.</t>
    </section>

    <section title="Acknowledgements">
      <t>The authors would like to thank Prof. Dr.-Ing. Markus Kampmann at
      Fachhochschule Koblenz University of Applied Sciences and Prof. Dr.-Ing.
      Frank Hartung at Multimediatechnik, Audio- und Videotechnik at
      Fachhochschule Aachen for fruitful contributions and discussions during
      the initial stages of writing this specification. The authors would also
      like to thank Christer Holmberg for feedback on the specification.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include='reference.RFC.2119'?>

      <?rfc include='reference.RFC.3264'?>

      <?rfc include='reference.RFC.3550'?>

      <?rfc include='reference.RFC.3551'?>

      <?rfc include='reference.RFC.3890'?>

      <?rfc include='reference.RFC.4566'?>

      <?rfc include='reference.RFC.4585'?>

      <?rfc include='reference.RFC.5104'?>

      <?rfc include='reference.RFC.5234'?>

      <?rfc include='reference.RFC.6184'?>

      <?rfc include='reference.RFC.6190'?>

      <?rfc include='reference.RFC.6236'?>

      <reference anchor="H264">
        <front>
          <title>Advanced video coding for generic audiovisual
          services</title>

          <author>
            <organization>ITU-T Recommendation H.264</organization>
          </author>

          <date month="March" year="2010"/>
        </front>
      </reference>

      <reference anchor="H241">
        <front>
          <title>Extended video procedures and control signals for H.300
          series terminals</title>

          <author>
            <organization>ITU-T Recommendation H.241</organization>
          </author>

          <date month="May" year="2006"/>
        </front>
      </reference>
    </references>

    <references title="Informative References">
      <?rfc include='reference.RFC.2212'?>

      <?rfc include='reference.RFC.3261'?>

      <?rfc include='reference.RFC.3611'?>

      <?rfc include='reference.RFC.4103'?>

      <?rfc include='reference.RFC.4607'?>

      <?rfc include='reference.RFC.5117'?>

      <?rfc include='reference.RFC.5760'?>

      <?rfc include='reference.RFC.5968'?>

      <?rfc include='reference.I-D.westerlund-mmusic-sdp-bw-attribute'?>

      <?rfc include='reference.I-D.westerlund-avtext-rtp-stream-pause'?>

      <?rfc include='reference.I-D.ietf-avtext-multiple-clock-rates'?>
    </references>
  </back>
</rfc>