<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-westerlund-rtcweb-codec-control-00"
ipr="trust200902">
<front>
<title abbrev="Codec Control for WebRTC">Codec Control for WebRTC</title>
<author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
<organization>Ericsson</organization>
<address>
<postal>
<street>Farogatan 6</street>
<city>SE-164 80 Kista</city>
<country>Sweden</country>
</postal>
<phone>+46 10 714 82 87</phone>
<email>magnus.westerlund@ericsson.com</email>
</address>
</author>
<author fullname="Bo Burman" initials="B." surname="Burman">
<organization>Ericsson</organization>
<address>
<postal>
<street>Farogatan 6</street>
<city>SE-164 80 Kista</city>
<country>Sweden</country>
</postal>
<phone>+46 10 714 13 11</phone>
<email>bo.burman@ericsson.com</email>
</address>
</author>
<date day="16" month="May" year="2012"/>
<abstract>
<t>This document proposes how WebRTC should handle media codec control
between peers. By media codec control we mean parameters such as video
resolution and frame-rate. This includes both the initial establishment
of capabilities using SDP based JSEP signalling and changes during
ongoing real-time interactive sessions in response to user and
application events. The solution uses SDP to establish initial
boundaries that are rarely, if ever, changed. During the session, the
RTCP based Codec Operation Point (COP) signalling solution is used for
dynamic control of parameters, enabling timely and responsive
control.</t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>In WebRTC there exists a need for codec control to improve the
efficiency and user experience of its real-time interactive media
transported over a PeerConnection. The fundamental idea of codec
control is that the media receiver expresses preferences for how it
would like the media to be encoded to best suit the receiver's
consumption of the media stream. This includes parameters such as
video resolution and frame-rate, and for audio the number of channels
and the audio bandwidth. It also includes non media specific
properties such as how to provision the available transmission
bit-rate between different media streams.</t>
<t>This document proposes a specific solution for how to accomplish
codec control that meets the goals and requirements. It is based on
establishing the outer boundaries, when it comes to codec support and
capabilities, at PeerConnection establishment using <xref
target="I-D.ietf-rtcweb-jsep">JSEP</xref> and <xref
target="RFC4566">SDP</xref>. During an ongoing session the preferred
parameters are signalled using the <xref
target="I-D.westerlund-avtext-codec-operation-point">Codec Operation
Point RTCP Extension (COP)</xref>. The JavaScript application will
primarily make its preferences clear through its usage of the media
elements, for example by selecting the size of the rendering area for
video. But it can also use the constraints concept in the API to
indicate preferences that the browser can weigh into its decision to
request particular preferred parameters.</t>
<t>This document first provides a more detailed overview of the
solution. It then discusses the use cases and requirements that
motivate the solution, followed by an analysis of the benefits and
downsides of the proposed solution. Finally, it provides a proposed
specification of how WebRTC should use SDP and COP.</t>
</section>
<section title="Definitions">
<t/>
<section title="Abbreviations">
<t>The following abbreviations are used in this document.<list
style="hanging">
<t hangText="COP:">Codec Operation Point RTCP Extension, the
solution for codec control defined in <xref
target="I-D.westerlund-avtext-codec-operation-point"/>.</t>
<t hangText="JSEP:"><xref target="I-D.ietf-rtcweb-jsep">JavaScript
Session Establishment Protocol</xref>.</t>
<t hangText="RTP:"><xref target="RFC3550">Real-time Transport
Protocol</xref>.</t>
<t hangText="SDP:"><xref target="RFC4566">Session Description
Protocol</xref>.</t>
</list></t>
<t/>
</section>
<section title="Requirements Language">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in <xref
target="RFC2119"/>.</t>
</section>
</section>
<section title="Overview">
<t>The basic idea in this proposal is to use JSEP to establish the outer
limits for behavior and then use the <xref
target="I-D.westerlund-avtext-codec-operation-point">Codec Operation
Point (COP)</xref> proposal to handle dynamic changes during the
session.</t>
<t>Boundary conditions are typically media type specific and in some
cases also codec specific. Relevant for video are the highest
resolution, frame-rate, and maximum complexity. For H.264 these can be
expressed in JSEP SDP using the profile and level concept specified by
the <xref target="RFC6184">H.264 RTP payload format</xref>. The
authors expect something similar for the <xref
target="I-D.ietf-payload-vp8">VP8 payload format</xref>.</t>
<t>During the session the browser implementation detects when there is
a need to use COP to do any of the following things.<list style="letters">
<t>Request new target values for codec operation, for example based
on that the GUI element displaying a video has changed due to window
resize or purpose change. This includes parameters such as
resolution, frame-rate, and picture aspect ratio.</t>
<t>Change parameters due to changing display screen attached to the
device. Affected parameters include resolution, picture aspect ratio
and sample aspect ratio.</t>
<t>Indicate when the end-point changes encoding parameters in its
role as sender.</t>
<t>Change important parameters affecting the transport of media
streams, such as the maximum media bit-rate, token bucket size (to
control the burstiness of the sender), the RTP payload type used,
the maximum RTP packet size, and application data unit aggregation
(to control the number of audio frames in the same RTP packet).</t>
<t>Affect the relative prioritization of media streams.</t>
</list></t>
<t>The receiving client may send a COP request in RTCP to request that
some set of parameters be changed according to the receiving client's
preferences. The application's preferences are primarily indicated
through its usage of the media elements. But there exist cases and
properties where the application will have to provide additional
preference information, for example using the constraints. The browser
implementation takes all this information into account when expressing
preferences using a set of parameters.</t>
<t>The media sender evaluates the request and weighs it against other
potential receivers' requests, and may update one or more (if
scalability is supported) codec operation points to better suit the
receivers. Any new operation point(s) are announced using a COP
Notification. Independently of whether the codec operation point(s)
are changed or not, the COP request is acknowledged using a COP Status
message.</t>
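<t>The exchange described above can be sketched as the following
message sequence. This is an illustrative sketch based on this
document's description only; the exact message formats and semantics
are defined by the COP specification.</t>
<figure>
<artwork><![CDATA[
   Media Receiver                            Media Sender
        |                                        |
        |---- COP Request (e.g. new target ----->|
        |      resolution, frame-rate)           | evaluate, merge with
        |                                        | other receivers'
        |<--------- COP Status (ack) ------------| requests
        |                                        |
        |<------ COP Notification (new ----------|
        |         operation point(s))            |
]]></artwork>
</figure>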
<t>Using RTCP and the <xref target="RFC5124">"Extended Secure RTP
Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback
(RTP/SAVPF)"</xref> profile, a COP message can in most cases be sent
immediately or with a very small delay. As the message travels in the
media plane it will reach the peer, or the next middlebox that is part
of the media path, directly.</t>
</section>
<section title="Requirements and Motivations">
<t>This section discusses the use cases and the requirements for codec
control, including both those explicitly discussed in the use-case
document and derived ones. This is followed by a discussion of why the
proposed mechanism is considered the most suitable for WebRTC.</t>
<section title="Use Cases and Requirements">
<t>There are use cases and derived requirements in <xref
target="I-D.ietf-rtcweb-use-cases-and-requirements">"Web Real-Time
Communication Use-cases and Requirements"</xref>.</t>
<t>A selection of interesting use cases and the description parts that
are most applicable to codec control are:<list style="hanging">
<t hangText="4.2.1 - Simple Video Communication Service:">Two or
more users have loaded a video communication web application into
their browsers, provided by the same service provider, and logged
into the service it provides. The web service publishes
information about user login status by pushing updates to the web
application in the browsers. When one online user selects a peer
online user, a 1-1 video communication session between the
browsers of the two peers is initiated. The invited user might
accept or reject the session.</t>
<t hangText="">During session establishment a self-view is
displayed, and once the session has been established the video
sent from the remote peer is displayed in addition to the
self-view. During the session, each user can select to remove and
re-insert the self-view as often as desired. Each user can also
change the sizes of his/her two video displays during the session.
Each user can also pause sending of media (audio, video, or both)
and mute incoming media.</t>
<t hangText="">The two users may be using communication devices of
different makes, with different operating systems and browsers
from different vendors.</t>
<t hangText="4.2.10 - Multiparty video communication:">In this
use-case the Simple Video Communication Service use-case
(Section 4.2.1) is extended by allowing multiparty sessions. No
central server is involved - the browser of each participant sends
and receives streams to and from all other session participants.
The web application in the browser of each user is responsible for
setting up streams to all receivers.</t>
<t hangText="">In order to enhance intelligibility, the web
application pans the audio from different participants differently
when rendering the audio. This is done automatically, but users
can change how the different participants are placed in the
(virtual) room. In addition the levels in the audio signals are
adjusted before mixing.</t>
<t hangText="">Another feature intended to enhance the user
experience is that the video window that displays the video of the
currently speaking peer is highlighted.</t>
<t hangText="">Each video stream received is by default displayed
in a thumbnail frame within the browser, but users can change the
display size.</t>
<t hangText="">Note: What this use-case adds in terms of
requirements is capabilities to send streams to and receive
streams from several peers concurrently, as well as the
capabilities to render the video from all received streams and be
able to spatialize, level adjust and mix the audio from all
received streams locally in the browser. It also adds the
capability to measure the audio level/activity.</t>
<t
hangText="4.3.3 - Video conferencing system with central server:">An
organization uses a video communication system that supports the
establishment of multiparty video sessions using a central
conference server.</t>
<t hangText="">The browser of each participant sends an audio
stream (type in terms of mono, stereo, 5.1, ... depending on the
equipment of the participant) to the central server. The central
server mixes the audio streams (and can in the mixing process
naturally add effects such as spatialization) and sends towards
each participant a mixed audio stream which is played to the
user.</t>
<t hangText="">The browser of each participant sends video towards
the server. For each participant, one high resolution video is
displayed in a large window, while a number of low resolution
videos are displayed in smaller windows. The server selects which
video streams to forward as main and thumbnail videos
respectively, based on speech activity. As the video streams to
display can change quite frequently (as the conversation flows),
it is important that the delay from when a video stream is
selected for display until it can be displayed is short.</t>
<t hangText="">Note: This use-case adds a requirement on support
for fast stream switches (F7). There exist several solutions that
enable the server to forward one high resolution and several low
resolution video streams: a) each browser could send a high
resolution but scalable stream, and the server could send just
the base layer for the low resolution streams; b) each browser
could, in a simulcast fashion, send one high resolution and one
low resolution stream, and the server just selects; or c) each
browser sends just a high resolution stream, and the server
transcodes into low resolution streams as required.</t>
</list></t>
<t>The derived requirements that apply to codec control are:<list
style="hanging">
<t hangText="F3:">Transmitted streams MUST be rate controlled.</t>
<t hangText="F6:">The browser MUST be able to handle high loss and
jitter levels in a graceful way.</t>
<t hangText="F7:">The browser MUST support fast stream
switches.</t>
<t hangText="F24:">The browser MUST be able to take advantage of
capabilities to prioritize voice and video appropriately.</t>
<t hangText="F25:">The browser SHOULD use encoding of streams
suitable for the current rendering (e.g. video display size) and
SHOULD change parameters if the rendering changes during the
session.</t>
</list></t>
<t>It might not be obvious how some of the above requirements actually
impact the question of controlling the media encoder in a transmitter,
so let's go through what the document authors consider to be their
applicability. But let's start by reviewing the topologies that
exist.<list style="hanging">
<t hangText="Peer to Peer:">This is the basic topology used in use
case "Simple Video Communication Service". Two end-points
communicating directly with each other. A PeerConnection directly
connects the source and the sink of the media stream. Thus in this
case it is simple and straightforward to feed preferences from the
sink into the source's media encoder to produce the best possible
match that the source is capable of, given the preferences.</t>
<t hangText="Peer to Multiple Peers:">A given source has multiple
PeerConnections going from the source to a number of receivers,
i.e. sinks, as described by the use case "Multiparty video
communication". In some implementations this will be realized as a
Peer to Peer topology where only the source of the raw media is
common between the different PeerConnections. On more resource
constrained devices that can't afford an individual media encoding
for each PeerConnection the media stream is to be delivered over,
there exists a need to merge the different preferences from the
different receivers into a single configuration, or a set of fewer
configurations, that can be produced. For codecs that have
scalability features, it might be possible to produce multiple
actual operation points in a single encoding and media stream. For
example, multiple frame rates can be produced with H.264 by
encoding using a frame structure where some frames can be removed
to produce a lower bit-rate and lower frame-rate version of the
stream, thus possibly allowing multiple concurrent operation
points to be produced to meet the demands of an even larger number
of preferred operation points.</t>
<t hangText="Centralized Conferencing:">This topology consists of
a central server to which each conference participant connects his
or her PeerConnection(s). Over that PeerConnection the participant
will receive all the media streams the conference service thinks
should be sent and delivered. The central node can work in several
different modes for media streams. It can be a very simple relay
node (an <xref target="RFC5117">RTP transport translator</xref>),
where it forwards all media streams arriving at it to the other
participants, forming a common RTP session among all participants
with full visibility. Another mode of operation would be an RTP
mixer that forwards selected media streams using a set of SSRCs
the RTP mixer owns. The third mode is to perform actual media
mixing, where audio is mixed and video is composited into a new
video image and encoded again.</t>
<t hangText="">This results in two different behaviors regarding
who needs to merge multiple expressed preferences. For a simple
relay central node, the merging of preferences may be placed on
the end-point, similar to the resource constrained peer to
multiple peer case above. The second alternative is to let the
central node merge the preferences into a single set, which is
then signalled to the media source end-point.</t>
<t hangText="">Note: In the above it might be possible to
establish multiple PeerConnections between an end-point and the
central node. The different PeerConnections would then be used to
express different preferences for a given media stream. This
enables simulcast delivery to the central node so that it can use
more than a single operation point to meet the preferences
expressed by the multiple receiving participants. That approach
can improve the media quality for end-points capable of receiving
and using a higher media quality, since they can avoid being
constrained by the lowest common denominator of a single operation
point.</t>
<t hangText="Peer Relayed:">This is not based on an explicit use
case in the use case document. It is based on a usage that appears
possible to support, and for which there has been interest. The
topology is that Peer A sources a media stream and sends it over a
PeerConnection to B. B in its turn has a PeerConnection to Peer C.
B chooses to relay the incoming media stream from A to C. To
maintain quality, it is important that B does not decode and
re-encode the media stream (transcoding). Thus a case arises where
B will have to merge the preferences from itself and C into the
preferences it signals to A.</t>
</list></t>
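<t>The Peer Relayed topology described above can be depicted as
follows (an illustrative sketch of this document's description),
where B merges its own preferences and C's into the preferences it
signals to A:</t>
<figure>
<artwork><![CDATA[
   +---+  PeerConnection  +---+  PeerConnection  +---+
   | A |----------------->| B |----------------->| C |
   +---+  media from A    +---+  relayed media   +---+
         <-- merged prefs       <-- C's prefs
]]></artwork>
</figure>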
<t>Comments on the applicability of the requirement on the codec
control:<list style="hanging">
<t hangText="F3:">This requirement requires rate control on the
media streams. There will also be multiple media streams being
sent to or from a given end-point. Combined, this creates a
potential issue when it comes to prioritization between the
different media streams and what policy to use to increase and
decrease the bit-rate provided for each media stream. The
application's preferences, combined with other properties such as
the current resolution and frame-rate, affect which parameter is
optimal to change when the bit-rate needs to be changed. Another
aspect is whether one media stream is less relevant, so that
reducing that stream's quality, or even terminating its
transmission while keeping others unchanged, is the best choice
for the application. In other cases, applying the cheese-cutter
principle and reducing all streams in equal proportion is the most
desirable. Yet another aspect is the potential for requesting
aggregation of multiple audio frames in the same RTP packet to
reduce the overhead and thus lower the bit-rate, at the cost of
some increased delay and packet loss sensitivity.</t>
<t hangText="F6:">The browser MUST be able to handle high loss and
jitter levels in a graceful way. When such conditions are
encountered, it will be highly beneficial for the receiver to be
able to indicate that the sender should try to combat this by
changing the encoding and media packetization. For example for
audio it might be beneficial to aggregate several frames together
and apply additional levels of FEC on those fewer packets that are
produced to reduce the residual audio frame loss.</t>
<t hangText="F7:">The browser MUST support fast stream switches.
Fast stream switches occur in several ways in WebRTC. One is in
centralized conferencing, when relay based central nodes turn on
and off individual media streams depending on the application's
current needs. Another is RTP mixers that switch input sources for
a given outgoing SSRC owned by the mixer; this should have minimal
impact on a receiver as there is no SSRC change. Along the same
lines, the application can cause media stream changes by changing
their usage in the application. When the usage of a media stream
changes from being the main video to becoming a thumbnail of one
participant in the session, there exists a need to rapidly switch
the video resolution to enable high efficiency and avoid increased
bit-rate usage.</t>
<t hangText="F24:">The browser MUST be able to take advantage of
capabilities to prioritize voice and video appropriately. This
requirement comes from the QoS discussion present in use case
4.2.6 (Simple Video Communication Service, QoS). This requirement
assumes that the application has a preference for one media type
over another. Given this assumption, the same prioritization can
actually occur for a number of codec parameters when there exist
multiple media streams and one can individually control these
media streams. This is another aspect of the discussion for
requirement F3.</t>
<t hangText="F25:">The browser SHOULD use encoding of streams
suitable for the current rendering (e.g. video display size) and
SHOULD change parameters if the rendering changes during the
session. This requirement comes from a very central case where a
receiving application changes the display layout and where it
places a given media stream. Changing the amount of screen real
estate on which the media stream is viewed also changes which
resolution would be optimal for the media sender to use. However,
this discussion should not apply only to video resolution.
Additional application preferences should result in preferences
being expressed to the media sender for other properties as well,
such as video frame-rate. For audio, the number of audio channels
and the audio bandwidth are relevant properties.</t>
</list></t>
<t>The authors hope this section has provided a sufficiently clear
picture that there exist both multiple topologies with different
behaviors, and different points where preferences might need to be
merged. The discussion of the requirements also shows that there are
multiple parameters that need to be expressed, not only video
resolution.</t>
</section>
<section title="Motivations">
<t>This section discusses different types of motivations for this
solution. It includes comparison to the solution described in <xref
target="I-D.alvestrand-rtcweb-resolution">"RTCWEB Resolution
Negotiation"</xref>.</t>
<section title="Performance">
<t>The proposed solution has the following performance
characteristics. The initial phase, establishing the boundaries, is
done in parallel with the media codec negotiation and establishment
of the PeerConnection. Thus using the signalling plane is optimal as
this process does not create additional delay or significant
overhead.</t>
<t>During an ongoing communication session, using COP messages in
RTCP has the following properties:<list style="hanging">
<t hangText="Path Delay:">The COP messages are contained in the
RTCP packets sent over the PeerConnection, i.e. the most optimal
peer-to-peer path that ICE has managed to get working. Thus one
can expect this path to be equal to or shorter in delay than the
signalling path used between the PeerConnection end-points. If
the signalling message is instead sent over the PeerConnection's
data channel, it will use the same path. In almost all cases, the
direct path between two peers will also be shorter than a path
going via the web server.</t>
<t hangText="Media Plane:">The COP messages will always go to the
next potential RTP/RTCP processing point, i.e. the one on the
other side of the PeerConnection. Even for multiparty sessions
using centralized servers, the COP message may need to be
processed in the middle to perform the merging of the different
participants' preferences.</t>
<t hangText="Overhead:">An RTCP COP message can be sent as a <xref
target="RFC5506">reduced size RTCP message</xref>, thus carrying
minimal unnecessary baggage. For example, a COP Request message
requesting a new target resolution from a single SSRC will be 29
bytes. Using reduced size RTCP keeps the average RTCP size down,
enables rapid recovery of the early allowed flag in early mode,
and in more cases enables the immediate mode.</t>
<t hangText="Minimal Blocking:">Using RTCP lets the transmission
of COP messages be governed by RTCP's transmission rules. As
WebRTC will be using the SAVPF profile it is possible to use the
early mode, allowing an RTCP packet carrying a feedback event,
like a COP request, to be sent with little delay. It might even
be possible to determine that the immediate mode of operation can
be enabled, thus allowing RTCP feedback events to be sent
immediately in all cases while using that mode. The small
overhead and the usage of reduced size RTCP help ensure that a
blocked transmission of a COP message is a rare event and that
any such blocking is quickly released.</t>
<t hangText="">The next aspect of RTCP blocking is that we expect
the application to need to rapidly switch back and forth between
codec parameters. This requires a protocol that allows both quick
setting of parameters and the possibility to revert to previous
preferences while a request is outstanding. COP has support for
such updated requests, even while the first request is in
flight.</t>
</list></t>
<t>If the above is compared to the properties of <xref
target="I-D.alvestrand-rtcweb-resolution">Harald Alvestrand's
proposal</xref>, the following differences are present. When it
comes to signalling path delay, a signalling plane based solution
will in almost all cases at best have the same path delay as a media
plane solution, achieved by using the data channel to carry the
signalling; then the only difference will be the message size, which
incurs only a minor difference in transfer times. But in cases where
the application has not implemented use of the data channel, the
signalling path will be worse, possibly significantly so.</t>
<t>Using the signalling plane for solutions based on centralized
conference mixers can easily result in the request message needing
to be processed in the web server before being forwarded to the
mixer node that actually processes the media, followed by the
central mixer triggering additional signalling messages to other
end-points that also need to react. This can be avoided assuming
that the data channel is used for signalling transport. Using the
media plane for such signalling will be equal or better in almost
all cases.</t>
<t>When it comes to blocking, there exists a significant issue with
using JSEP to carry this type of message. Someone that has sent an
SDP offer in an offer/answer exchange is blocked from sending a new
update until it has received a final or provisional answer to that
offer. Here COP has a great advantage, as the design has taken rapid
change of parameters into consideration and allows multiple
outstanding requests.</t>
</section>
<section title="Ease of Use">
<t>We see a great benefit in that COP can be mainly driven by the
browser implementation and its knowledge of how media elements are
currently being used by the application. For example, the video
resolution of the display area can be determined by the browser,
which can further determine that resource consumption would be
reduced, and the image quality improved or at least maintained, by
requesting another target resolution better suiting the current
size. There are also other metrics and controls that exist in the
browser space, like the congestion control, that can directly use
the COP signalling to request more suitable parameters given the
situation.</t>
<t>Certain application preferences can't be determined based solely
on the usage. For most such properties, using the constraints
mechanism to indicate preferences is a very suitable solution. For
example, the relative priority of media streams, or a desire for a
lower frame rate to avoid reductions in resolution or image quality
(SNR), are likely to need constraints.</t>
<t>This type of operation results in better performance for simple
applications whose implementors are less knowledgeable about the
need to initiate signalling to trigger a change of video resolution.
It thus provides good performance in more cases while requiring less
code in the applications.</t>
<t>Still, more advanced applications should be able to influence the
behavior. This can be realized in several ways. One is to use the
constraints to inform the browser about the application's
preferences for how to treat the different media streams, thus
affecting how COP is used. If required, it is possible to provide
additional API functionality for the desired controls.</t>
<t>The authors are convinced that providing ease of use for the
simple application is important. Providing more advanced controls
for the advanced applications is desirable.</t>
</section>
</section>
</section>
<section title="SDP Usage">
<t>SDP SHALL be used to establish boundaries and capabilities for the
media codec control in WebRTC. This includes the following set of
capabilities that is possible to express in SDP:<list style="hanging">
<t hangText="Codec Capabilities:">For all media codecs it is
necessary to determine which capabilities are available when there
exist optional functionalities. This concerns both the media
encoding and the RTP payload format, as codec control can affect
both. For codecs where the span of complexity is large, there might
exist a need to express the level of complexity supported. For video
codecs like H.264 this can be expressed by the profile-level-id.
These capabilities are expected to be defined by the RTP payload
format, or in SDP attributes defined in the RTP payload formats to
be used.</t>
<t hangText="COP Parameters Supported:">SDP SHALL be used to
negotiate the set of COP parameters that the peers can express
preferences for and for which they will send notification on their
sets of parameter values used.</t>
</list></t>
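<t>As an illustrative sketch of such boundary establishment (the
payload type number and port are hypothetical; the profile-level-id
semantics are defined by the <xref target="RFC6184">H.264 RTP payload
format</xref>), an SDP media description could look like:</t>
<figure>
<artwork><![CDATA[
   m=video 49170 RTP/SAVPF 97
   a=rtpmap:97 H264/90000
   a=fmtp:97 profile-level-id=42e01f
]]></artwork>
</figure>
<t>Here the profile-level-id value 42e01f indicates Constrained
Baseline profile at level 3.1, bounding the maximum resolution,
frame-rate, and bit-rate the receiver is prepared to handle.</t>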
<t/>
</section>
<section title="COP Usage">
<t>A WebRTC end-point SHALL implement the <xref
target="I-D.westerlund-avtext-codec-operation-point">Codec Operation
Point RTCP Extension</xref>. The following COP parameters SHALL be
supported:<list style="symbols">
<t>Payload Type</t>
<t>Bitrate</t>
<t>Token Bucket Size</t>
<t>Framerate</t>
<t>Horizontal Pixels</t>
<t>Vertical Pixels</t>
<t>Maximum RTP Packet Size</t>
<t>Maximum RTP Packet Rate</t>
<t>Application Data Unit Aggregation</t>
</list></t>
<t>Please note that the ALT and ID parameters must also be implemented
for COP to function correctly.</t>
<t>To make COP usage efficient the end-point SHALL implement <xref
target="RFC5506">Reduced size RTCP packets</xref>.</t>
<t>In addition to requesting specific frame-rates, the RTCP codec
control messages <xref target="RFC5104">"Temporal-Spatial Trade-off
Request and Notification"</xref> are also to be supported. These
enable a receiver to make a relative indication of its preferred
trade-off between spatial and temporal quality, providing a highly
useful indication to the media sender of what the receiver prefers in
a relative sense. The COP frame-rate or resolution parameters can be
used to provide target, maximum, or minimum values that further
indicate within which set of parameters the sender should find this
relative trade-off.</t>
<t>To enable a receiver to temporarily halt or pause delivery of a
given media stream, a WebRTC end-point SHALL also implement <xref
target="I-D.westerlund-avtext-rtp-stream-pause">"RTP Media Stream
Pause and Resume"</xref>. This is an important COP-related feature, as
described by the use cases and motivations, enabling the receiver to
indicate that it prefers to have a given media stream halted if the
aggregate media bit-rate is reduced. It can also be used to recover
aggregate media bit-rate when the application has no current use for a
given media stream, but may rapidly need it again due to interactions
in the application or with other instances.</t>
</section>
<section anchor="IANA" title="IANA Considerations">
<t>This document makes no request of IANA.</t>
<t>Note to RFC Editor: this section may be removed on publication as an
RFC.</t>
</section>
<section anchor="Security" title="Security Considerations">
<t>The usage of COP and its security issues are described in <xref
target="I-D.westerlund-avtext-codec-operation-point"/>. The main threats
to this usage of COP are the following things:<list style="letters">
<t>The SDP based codec boundary signalling and COP parameter
negotiation could be intercepted and modified, enabling denial of
service attacks on the end-points by reducing the scope of the COP
usage and the media codec parameters to provide sub-optimal quality
or to block certain features. To prevent this, the SDP needs to be
authenticated and integrity protected.</t>
<t>The COP messages themselves could be modified to affect the
negotiated codec parameters. This could have severe impact on the
media quality, as media streams can be completely throttled, or
configured to a very reduced frame-rate or resolution. To prevent
this, source authentication and integrity protection must be applied
to the RTCP compound packets.</t>
<t>In multi-party applications of COP, an entity may need to combine
multiple sets of requested parameters. In these multi-party cases a
particular participant may target the other participants and
actively try to degrade their experience. Any COP entity merging
sets will need to consider whether a particular participant is
actively harmful to the others and can choose to ignore that
entity's requests.</t>
</list></t>
<t/>
</section>
<section anchor="Acknowledgements" title="Acknowledgements">
<t/>
</section>
</middle>
<back>
<references title="Normative References">
<?rfc include="reference.RFC.2119"?>
<?rfc include='reference.RFC.3550'?>
<?rfc include='reference.RFC.4566'?>
<?rfc include='reference.RFC.5104'?>
<?rfc include='reference.RFC.5506'?>
<?rfc include='reference.I-D.ietf-rtcweb-jsep'?>
<?rfc include='reference.I-D.westerlund-avtext-codec-operation-point'?>
<?rfc include='reference.I-D.westerlund-avtext-rtp-stream-pause'?>
</references>
<references title="Informative References">
<?rfc include='reference.I-D.alvestrand-rtcweb-resolution'?>
<?rfc include='reference.RFC.5117'?>
<?rfc include='reference.RFC.5124'?>
<?rfc include='reference.RFC.6184'?>
<?rfc include='reference.I-D.ietf-rtcweb-use-cases-and-requirements'?>
<?rfc include='reference.I-D.ietf-payload-vp8'?>
</references>
</back>
</rfc>