<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-westerlund-mmusic-sdp-bw-attribute-01"
ipr="trust200902">
<front>
<title abbrev="Extensible Bandwidth Attribute">Extensible Bandwidth
Attribute for SDP</title>
<author fullname="Tomas Frankkila" initials="T." surname="Frankkila">
<organization>Ericsson</organization>
<address>
<postal>
<street>Laboratoriegrand 11</street>
<city>SE-971 28 Lulea</city>
<country>Sweden</country>
</postal>
<phone>+46 10 714 30 20</phone>
<email>tomas.frankkila@ericsson.com</email>
</address>
</author>
<author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
<organization>Ericsson</organization>
<address>
<postal>
<street>Farogatan 6</street>
<city>SE-164 80 Kista</city>
<country>Sweden</country>
</postal>
<phone>+46 10 714 82 87</phone>
<email>magnus.westerlund@ericsson.com</email>
</address>
</author>
<author fullname="Bo Burman" initials="B." surname="Burman">
<organization>Ericsson</organization>
<address>
<postal>
<street>Farogatan 6</street>
<city>SE-164 80 Kista</city>
<country>Sweden</country>
</postal>
<phone>+46 10 714 13 11</phone>
<email>bo.burman@ericsson.com</email>
</address>
</author>
<date day="24" month="April" year="2012"/>
<workgroup>MMUSIC Working Group</workgroup>
<abstract>
<t>Knowledge of what bandwidths the end-points intend to use is
important both for the other end-point and for resource allocation in
various types of networks. This is especially important for wireless
access networks which typically have quite limited resources. The
bandwidth attribute in Session Description Protocol (SDP),
‘b=AS’, is today quite widely used to define the bandwidth
that the end-points intend to use in various types of sessions. This
document shows that the existing bandwidth attributes, such as
‘b=AS’, although widely used in today's scenarios, have
limitations that make it hard or even impossible for the end-points to
express their intentions accurately when it comes to bandwidth usage. To
solve the identified problems, this document defines a new extensible
SDP bandwidth attribute ‘a=bw’ which enables more detailed
control over the bandwidth declarations, requests, and allocations. With
the new bandwidth attribute it is possible to define different scopes in
the session setup and then negotiate the bandwidth individually for each
scope.</t>
</abstract>
</front>
<middle>
<section anchor="Introduction" title="Introduction">
<t>This document looks at the issues of non-basic usage of <xref
target="RFC3550">RTP</xref> and analyzes how well the existing <xref
target="RFC4566">SDP</xref> attribute ‘b=AS’ for bandwidth
negotiation performs in different scenarios.</t>
<t>This analysis is done by defining a number of use cases, containing
sessions with:<list style="symbols">
<t>single and multiple media types;</t>
<t>symmetric and asymmetric media streams;</t>
<t>single and multiple media sources, including multiple sources
from the same end-point;</t>
<t>multiple end-points each having one or more media sources,
including applications that use multiple encodings of a particular
media.</t>
</list></t>
<t>It is shown that the existing bandwidth attributes <xref
target="RFC4566">‘b=AS’</xref> and <xref
target="RFC3890">‘b=TIAS’</xref> have limitations which make it unclear
or even impossible for end-points and for resource allocation functions
in the network to determine how much bandwidth the service will use. The
analysis also provides the design rationale for the new bandwidth
attribute.</t>
<t>This document then proposes a general and extensible mechanism for
bandwidth negotiation that can be used for any type of session.
Interoperability with the existing mechanisms for bandwidth negotiation
is especially important since the existing bandwidth attribute is in
widespread use.</t>
<t>This document also presents several examples for how the new
bandwidth attribute can be used in the session setup phase for various
types of sessions. The examples are derived for IP/UDP/RTP transport
although nothing should prevent using the new bandwidth attribute also
for other transport protocols.</t>
</section>
<section anchor="Definitions" title="Definitions">
<t/>
<section anchor="RequirementLanguage" title="Requirements Language">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in <xref
target="RFC2119">RFC 2119</xref>.</t>
</section>
<section title="Terminology">
<t>The following terms and abbreviations are used in this
document:<list style="hanging">
<t hangText="Bandwidth:">In this document, the bandwidth is
defined as the IP level bandwidth, i.e. including the network
protocol (IPv4 or IPv6) and transport protocol (TCP, UDP, RTP,
etc) overhead. When RTP is used then the RTCP bandwidth is handled
separately from the bandwidth used for RTP packets. Bandwidth in
this context is in the unit bits per second, not Hz.</t>
<t hangText="Encoding:">A particular encoding is the choice of the
media encoder (codec) that has been used to compress the media.
Encodings differ in fidelity
through the choice of sampling, bit-rate and other configuration
parameters.</t>
<t hangText="End-point:">A single entity sending and/or receiving
RTP packets. It may be decomposed into several functional blocks,
but as long as it behaves as a single RTP stack entity it is
classified as a single end-point.</t>
<t hangText="Media stream:">A sequence of RTP packets using a
single SSRC that together carry part or all of the content
of a specific Media Type from a specific sender source within a
given RTP session.</t>
<t hangText="RTP session:">An RTP session consists of one or more
media streams that have the same purpose. The typical example is
to have one RTP session per media type, i.e. that voice and video
use different RTP sessions (different ports) since they have
different purposes. It is however possible to have multiple streams
in an RTP session, for example when having both a stream for
non-redundant audio and another stream for re-transmissions of
audio packets. The fundamental definition of an RTP session is a
single SSRC space.</t>
</list></t>
<t/>
</section>
</section>
<section anchor="UseCasesDesignRationale"
title="Use Cases and Design Rationale">
<t>This section describes a number of use cases where the existing
bandwidth attribute ‘b=AS’ is used for bandwidth definition.
It also discusses why the limitations of the existing bandwidth
attribute make it hard for other end-points and resource allocation
functions to know or estimate how much bandwidth will be used in
the ongoing session.</t>
<t>The analysis is made by defining a set of use cases. The first use
cases include fairly simple session types, i.e. point-to-point sessions
with or without asymmetry. A few more complex use cases are then
analyzed. The last set of use cases reflects fairly advanced session
types, e.g. various variants of multiplexing and usage of multiple media
streams.</t>
<t>The discussion is then summarized and the design rationales for the
new bandwidth attribute are outlined.</t>
<section anchor="ExistingBandwidthAttribute"
title="Existing Bandwidth Attribute">
<t>The existing bandwidth modifier ‘b=’ defined in <xref
target="RFC4566">RFC 4566</xref> is reviewed in this section.</t>
<section title="Attribute Definition">
<t>The existing bandwidth attribute ‘b=’ is defined in
Section 5.8 of <xref target="RFC4566">RFC 4566</xref>. The syntax
is:</t>
<t><list style="empty">
<t>b=&lt;bwtype&gt;:&lt;bandwidth&gt;</t>
</list></t>
<t>where:<list style="empty">
<t>&lt;bwtype&gt; is either:<list style="empty">
<t>‘AS’ (“Application Specific”),
which is the maximum bandwidth as estimated by the
application; or:</t>
<t>‘CT’ (“Conference Total”), which
is the total bandwidth for all media at all sites.</t>
</list></t>
<t>&lt;bandwidth&gt; is the bandwidth value in kilobits per
second.</t>
</list></t>
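<t>As a minimal sketch of the syntax above, the following Python
fragment splits a ‘b=’ line into its bandwidth type and its numerical
value. The function name parse_bandwidth_line is hypothetical and is not
defined by any specification; the sketch only assumes the single-value
form shown above.</t>
<figure title="Illustrative sketch: parsing an existing bandwidth line">
<artwork><![CDATA[
```python
def parse_bandwidth_line(line):
    """Split an SDP 'b=' line into (bwtype, value)."""
    line = line.strip()
    if not line.startswith("b="):
        raise ValueError("not a bandwidth line: %r" % line)
    # Everything before the first ':' is the bandwidth type.
    bwtype, sep, value = line[2:].partition(":")
    if not sep or not bwtype or not value.isdigit():
        raise ValueError("malformed bandwidth line: %r" % line)
    return bwtype, int(value)


# Example: the value is in kilobits per second for 'AS'.
print(parse_bandwidth_line("b=AS:80"))
```
]]></artwork>
</figure>
<t>The sketch returns exactly one number per bandwidth type, which is
all that the existing syntax can carry.</t>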
<t>Bandwidth types have been defined for the negotiation of the RTCP
bandwidth using ‘b=RS’ and ‘b=RR’, <xref
target="RFC3556">RFC 3556</xref>.</t>
<t>There is also a bandwidth type for negotiating the transport
independent application specific maximum bandwidth,
‘b=TIAS’, <xref target="RFC3890">RFC 3890</xref>. This
bandwidth type is similar to the ‘b=AS’ bandwidth type,
except that the overhead caused by the transport protocol headers is
not included.</t>
<t>One issue with the existing bandwidth attribute is that the
syntax is very limited since it only allows for defining new
bandwidth types (&lt;bwtype&gt;) and their respective single
numerical value. This limitation needs to be considered in the
discussion below.</t>
</section>
<section anchor="IssuesOfferAnswerForExistingBandwidthAttribute"
title="Offer/answer Procedure for the Existing Bandwidth Attribute">
<t><xref target="RFC3264">"An Offer/Answer Model with the Session
Description Protocol (SDP)"</xref> describes the offer/answer
procedures for the existing bandwidth attribute. For the SDP offer,
it describes that the bandwidth attribute indicates the desired
bandwidth that the offerer would like to receive. For the SDP
answer, it describes that the bandwidth attribute indicates the
bandwidth that the answerer would like the offerer to use when
sending media. Thus, for offer/answer negotiations, the bandwidth
attribute indicates the bandwidth for the receive direction of each
end-point.</t>
<t>The solution presented in this document focuses primarily on
clarifying and assisting the Application Specific (AS)
bandwidth.</t>
<t>[It is an open question to decide if and how to handle the RTCP
bandwidth negotiation, e.g. corresponding to b=RS and b=RR.]</t>
<t>[It is an open question to develop semantics for the transport
independent bandwidth negotiation, e.g. corresponding to
b=TIAS.]</t>
</section>
<section anchor="IssuesEndpointBehaviorWhenGeneratingTraffic"
title="End-point Behavior when Generating Traffic">
<t>When an end-point is sending media then this can be done in many
different ways, depending on the choices the implementers have
made.</t>
<t>Some end-points may send their data in a fairly “nice
and smooth” media stream, which means that both the packet
sizes and the packet rates are more or less constant all the time.
An example of a smooth stream is when the end-point is encoding
speech and is sending one packet every 20 ms and when the packets
are of equal size.</t>
<t>Other end-points may generate bursty streams, which have a large
peak-to-average ratio. An example of a bursty stream is when an
end-point is encoding video. Most of the time, the end-point is
sending packets with almost the same size and with constant packet
rate. However, it happens occasionally that the encoder generates
much more data for a frame, which may give a very large packet size.
It may even happen that the sender has to segment the data into
several packets, which may be transmitted in a burst, thereby
causing a very high peak rate.</t>
<t>Whether the stream is smooth or bursty makes a big difference for
the network and the policy control that usually applies in QoS
controlled networks. If the stream is too bursty, then a policy
control function may decide to drop packets that exceed the granted
rate. This will lead to degraded quality and reduced user
satisfaction.</t>
<t>The existing bandwidth attribute offers no mechanism to negotiate
which temporal variations can be allowed for a stream. The only
available mechanism is to negotiate the maximum bandwidth, but there
is nothing that defines any kind of averaging window (or something
similar) that can be used to control the bandwidth variations from
the transmitted stream.</t>
<t>It is therefore proposed to use a Token Bucket model to describe
the bandwidth with two parameters, the token bucket rate and the
bucket size, see <xref target="RFC2212">RFC 2212</xref>.</t>
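<t>To illustrate how the two token bucket parameters separate smooth
streams from bursty ones, the following Python sketch checks whether
packets conform to a given rate and bucket size. The class name, the
method name, the use of bytes and seconds as units, and the numerical
values in the usage lines are assumptions made for this illustration
only.</t>
<figure title="Illustrative sketch: token bucket conformance check">
<artwork><![CDATA[
```python
class TokenBucket:
    """Token bucket with a fill rate (bytes/s) and a size (bytes)."""

    def __init__(self, rate, depth):
        self.rate = rate      # token fill rate, bytes per second
        self.depth = depth    # bucket size, bytes
        self.tokens = depth   # the bucket starts full
        self.last = 0.0       # time of the previous packet, seconds

    def conforms(self, now, size):
        """Return True if 'size' bytes may be sent at time 'now'."""
        elapsed = now - self.last
        self.last = now
        # Refill, but never beyond the bucket size.
        self.tokens = min(self.depth,
                          self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


# Smooth stream: one 200-byte packet every 20 ms (80 kbps).
smooth = TokenBucket(rate=10000, depth=400)
print(all(smooth.conforms(i * 0.02, 200) for i in range(50)))

# Bursty stream: three 200-byte packets sent at the same instant.
bursty = TokenBucket(rate=10000, depth=400)
print([bursty.conforms(0.0, 200) for _ in range(3)])
```
]]></artwork>
</figure>
<t>In this sketch the smooth stream always conforms, while the third
packet of the burst exceeds the bucket size and would be a candidate
for dropping by a policy control function.</t>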
</section>
</section>
<section anchor="IssuesPointToPointSessionsUsingSDPOfferAnswer"
title="Point-to-point Sessions using SDP offer/answer">
<t>The existing modifier for the application specific bandwidth
‘b=AS’ is frequently used in the SDP offer/answer
negotiation <xref target="RFC3264">RFC 3264</xref> for setting up
point-to-point sessions, for example for bi-directional point-to-point
VoIP or video telephony sessions. In this section, the use of the
legacy bandwidth modifier is reviewed for the use in point-to-point
sessions using SDP offer/answer.</t>
<section anchor="IssuesSymmetricPointToPointSessionsFixedrateCodecs"
title="Symmetric Point-to-point Sessions, Fixed-rate Codecs">
<t>The example below shows the SDP offer from end-point A for
several fixed-rate codecs, mu-law and A-law PCM/<xref
target="G.711">G.711</xref>, AD-PCM/<xref
target="G.726">G.726</xref> and CS-ACELP/<xref
target="G.729">G.729</xref>. The codecs have different bit rates.
PCM encodes speech at 64 kbps. G.726 can encode speech at four
different rates, 64, 32, 24 and 16 kbps, but in this case it is
assumed that the 32 kbps variant is used. G.729 encodes speech at 8
kbps. The IP/UDP/RTP overhead with 20 ms packetization and IPv4
becomes 16 kbps in all cases giving 80, 48 and 24 kbps,
respectively.</t>
<figure title="SDP offer for mu-law PCM, A-law PCM, G.726 and G.729 with IPv4">
<artwork><![CDATA[m=audio 49200 RTP/AVP 8 0 96 18
b=AS:80
a=rtpmap:96 G726-32/8000/1
a=ptime:20
a=maxptime:80
]]></artwork>
</figure>
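<t>The bandwidth figures used in this example can be reproduced with
the following Python sketch. It assumes IPv4 without options (20
bytes), UDP (8 bytes) and a fixed RTP header (12 bytes); the function
and constant names are hypothetical.</t>
<figure title="Illustrative sketch: IP-level bandwidth for fixed-rate codecs">
<artwork><![CDATA[
```python
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, assumed sizes


def as_bandwidth_kbps(codec_kbps, ptime_ms=20):
    """IP-level bandwidth in kbps for a fixed-rate codec."""
    packets_per_second = 1000 / ptime_ms
    overhead_kbps = HEADER_BYTES * 8 * packets_per_second / 1000
    return codec_kbps + overhead_kbps


for name, rate in [("PCM", 64), ("G.726", 32), ("G.729", 8)]:
    print(name, as_bandwidth_kbps(rate))
```
]]></artwork>
</figure>
<t>With 20 ms packetization the overhead term evaluates to 16 kbps,
giving 80, 48 and 24 kbps for the three codecs. Only the largest of
these values can be carried in the single ‘b=AS’ line of the
offer.</t>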
<t>If end-point B accepts all of these codecs then a likely SDP answer
would be:</t>
<figure title="SDP answer for mu-law PCM, A-law PCM, G.726 and G.729 with IPv4">
<artwork><![CDATA[m=audio 49400 RTP/AVP 8 0 96 18
b=AS:80
a=rtpmap:96 G726-32/8000/1
a=ptime:20
a=maxptime:80
]]></artwork>
</figure>
<t>In this case, both end-points offer to receive 80 kbps. A
resource allocation function would thereby allocate 80 kbps in each
direction.</t>
<t>However, if end-point B accepts only one of the lower-rate
codecs, for example G.729, but not the PCM codecs, then a likely SDP
answer would be:</t>
<figure title="SDP answer for G.729 with IPv4">
<artwork><![CDATA[m=audio 49400 RTP/AVP 18
b=AS:24
a=ptime:20
a=maxptime:80
]]></artwork>
</figure>
<t>This means that the offerer has offered to receive 80 kbps while
the answerer has offered to receive 24 kbps. In the direction A to B
it is clear that a resource allocation function should allocate 24
kbps. However, in the direction B to A it is a little more unclear.
On one hand, end-point A has offered to receive 80 kbps. But, on the
other hand, end-point B has only indicated support for the G.729
codec and it is unknown whether B can send with something in addition to
G.729 from A's offered set.</t>
<t>A resource allocation function may also (incorrectly) conclude that
end-point B will also send at most 24 kbps, since b=AS indicates 24
kbps. But, since maxptime is 80 ms, this means that end-point B
could very well use application layer redundancy and encapsulate
redundant frames together with non-redundant frames, which would
result in a bandwidth exceeding 24 kbps. Even if maxptime would be
20 ms, end-point B could still use application layer redundancy, if
the non-redundant and redundant frames are transmitted in different
packets. This is possible since end-point A has indicated that it is
capable of receiving 80 kbps. Hence, if the resource allocation
function uses the codec information and assumes that end-point B
will send with only 24 kbps, then this may cause packet losses
and/or long delays.</t>
<t>It should be clear with this example that the current bandwidth
attribute, b=AS, can create ambiguities related to which bandwidth
will be used in each direction. If the end-points and the
resource allocation functions make different interpretations then
there is a risk for either poor quality or wasted resources.</t>
<t>To solve this, a new bandwidth negotiation method should enable
negotiating different bandwidths for different codecs. If a codec
can be configured in several different ways, e.g. G.726 offers the
possibility to use four different static bit rates, then this would
typically be negotiated using different RTP Payload Types. This
means that the solution needs to be capable of negotiating different
bandwidths for different Payload Types.</t>
</section>
<section anchor="IssuesSymmetricPointToPointSessionsRateAdaptiveCodec"
title="Symmetric Point-to-Point Sessions with Rate-Adaptive Codec">
<t>This use case describes what might happen when using
rate-adaptive codecs in a session, for example <xref
target="AMR">AMR</xref>. The rate adaptation should adapt to a high
bitrate when the operating conditions are good, but should adapt to
a low bitrate when the operating conditions are degraded, e.g. due
to congestion or bad coverage.</t>
<t>One example of the SDP offer-answer negotiation for rate-adaptive
codec is shown below.</t>
<figure title="SDP offer from end-point A for AMR and IPv4">
<artwork><![CDATA[m=audio 49200 RTP/AVP 97
b=AS:29
a=rtpmap:97 AMR/8000/1
a=fmtp:97 mode-change-capability=2; max-red=80
a=ptime:20
a=maxptime:100
]]></artwork>
</figure>
<t>The bandwidth attribute in the SDP indicates the bandwidth that
the offerer would like to receive, <xref target="RFC3264">RFC
3264</xref>.</t>
<figure title="SDP answer from end-point B also for AMR and IPv4">
<artwork><![CDATA[m=audio 49100 RTP/AVP 97
b=AS:29
a=rtpmap:97 AMR/8000/1
a=fmtp:97 mode-change-capability=2; max-red=80
a=ptime:20
a=maxptime:100
]]></artwork>
</figure>
<t>The bandwidth attribute in the SDP answer indicates the maximum
bandwidth that the answerer would like the offerer to use when
sending media, <xref target="RFC3264">RFC 3264</xref>.</t>
<t>In this case, it is clear that both end-points are prepared to
receive up to 29 kbps of media. Since AMR can adapt the rate for the
encoding, this means that the bandwidth can be reduced, e.g. to the
5.9 kbps mode, if congestion is detected. The existing bandwidth
attribute ‘b=AS’ is however only used to negotiate the
maximum rate. This means that there is nothing in the SDPs that
describes how the rate will be adapted. In some cases, usually for
speech codecs, it might be possible to derive the lowest rate from
the codec information. However, there is no guarantee that the
end-points will adapt to this rate or whether they will stay at some
higher rate. For video codecs, there is usually no codec information
at all that could be used to determine how low a rate the end-points
will use. The lowest usable rate for a video codec is generally not
a video codec limitation, but rather some end-user or service
consideration on what is the lowest video quality that is still
useful or acceptable in the actual scenario.</t>
<t>This means that a resource allocation function has no information
which could be used to determine how the end-points will adapt
during periods of congestion. Hence the network does not know what
to assume from the end-points.</t>
<t>To solve this, a new bandwidth negotiation method should allow
for negotiating not only the highest rate but also the minimum rate
that is still useful.</t>
</section>
<section anchor="IssuesSymmetricPointToPointSessionsSeveralRateAdaptiveCodecs"
title="Symmetric Point-to-Point Sessions with Several Rate-Adaptive Codecs">
<t>Another example is when the originating end-point offers several
rate-adaptive codecs, with different bandwidths, and when the
answerer only supports one or several of the lower-rate
configurations but not the configuration that uses the highest
bandwidth. With the legacy bandwidth modifier ‘b=AS’ it
is only possible to indicate one bandwidth for the whole RTP
session, which means that the end-point needs to indicate the
highest bandwidth since this is the worst-case scenario. An
offer/answer for this case is shown below. The offerer supports both
AMR and AMR-WB <xref target="AMR-WB">AMR-WB</xref> and therefore
indicates the bandwidth needed for the AMR-WB configuration since it
is higher than for AMR. If the answerer does not support the AMR-WB
codec then it will have to remove this configuration from the SDP
when creating the SDP answer. This means that the answerer
calculates the bandwidth required for AMR instead of AMR-WB.</t>
<figure title="SDP offer from end-point A for AMR-WB, AMR and IPv4">
<artwork><![CDATA[m=audio 49200 RTP/AVP 96 97
b=AS:41
a=rtpmap:96 AMR-WB/16000/1
a=fmtp:96 mode-change-capability=2; max-red=80
a=rtpmap:97 AMR/8000/1
a=fmtp:97 mode-change-capability=2; max-red=80
a=ptime:20
a=maxptime:100
]]></artwork>
</figure>
<figure title="SDP answer from end-point B for AMR and IPv4 (AMR-WB is removed)">
<artwork><![CDATA[m=audio 49100 RTP/AVP 97
b=AS:29
a=rtpmap:97 AMR/8000/1
a=fmtp:97 mode-change-capability=2; max-red=80
a=ptime:20
a=maxptime:100
]]></artwork>
</figure>
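<t>This document does not state how the values 41 and 29 were derived.
The following Python sketch shows one estimate that is consistent with
them, under assumptions that are not taken from the examples above: the
highest codec modes (12.2 kbps for AMR and 23.85 kbps for AMR-WB), one
20 ms frame per packet, IPv4 without options, and 10 bits of RTP payload
header per packet. All names in the sketch are hypothetical.</t>
<figure title="Illustrative sketch: estimating b=AS for AMR and AMR-WB">
<artwork><![CDATA[
```python
import math

HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, assumed sizes


def amr_as_kbps(mode_kbps, frame_ms=20, payload_header_bits=10):
    """Estimate b=AS (kbps) with one speech frame per packet."""
    # kbps multiplied by ms gives the number of bits per frame.
    frame_bits = round(mode_kbps * frame_ms)
    payload_bytes = math.ceil((frame_bits + payload_header_bits) / 8)
    packet_bits = (payload_bytes + HEADER_BYTES) * 8
    # Bits per millisecond equal kbps; round up to an integer.
    return math.ceil(packet_bits / frame_ms)


print(amr_as_kbps(12.2), amr_as_kbps(23.85))
```
]]></artwork>
</figure>
<t>Under these assumptions the estimate is 29 kbps for AMR and 41 kbps
for AMR-WB. Lower codec modes would give lower values, which the single
‘b=AS’ value cannot express.</t>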
<t>Since the indicated bandwidth is for the receiving direction in
this example this means that:<list style="symbols">
<t>A must send media with a bandwidth not exceeding 29 kbps;
and:</t>
<t>B must send media with a bandwidth not exceeding 41 kbps.</t>
</list></t>
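<t>The mapping from the two ‘b=AS’ values to the per-direction
sending limits listed above can be written as a short Python sketch.
The function name and the dictionary keys are hypothetical; the sketch
only encodes the receive-direction semantics of the offer/answer
model.</t>
<figure title="Illustrative sketch: sending limits implied by b=AS">
<artwork><![CDATA[
```python
def send_limits(offer_as_kbps, answer_as_kbps):
    """Map b=AS values to per-direction sending limits (kbps)."""
    return {
        # The answer states what the answerer wants to receive.
        "offerer_to_answerer": answer_as_kbps,
        # The offer states what the offerer wants to receive.
        "answerer_to_offerer": offer_as_kbps,
    }


print(send_limits(offer_as_kbps=41, answer_as_kbps=29))
```
]]></artwork>
</figure>
<t>The sketch yields only upper limits. It cannot tell whether
end-point B, which supports AMR only, will ever send more than 29
kbps towards A.</t>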
<t>This gives the same problem with ambiguous maximum rate as shown
in <xref
target="IssuesSymmetricPointToPointSessionsFixedrateCodecs"/>. In
addition, since both AMR and AMR-WB are rate-adaptive codecs, with
different bit rates, they also have different minimum rates. This
means that a resource allocation function would be unaware of both the
maximum bandwidth and the minimum (required) bandwidth.</t>
<t>To solve this, a new bandwidth attribute should allow for
negotiating both maximum and minimum bitrates individually for each
payload type.</t>
<t>For speech codecs, it is usually possible to derive the minimum
rate from the codec information. However, this is typically not
possible for video codecs since they only indicate the maximum
encoding level. For example, if end-point A offers to use H.264
level 3.0 <xref target="H.264">H.264</xref> but end-point B is only
capable of using level 1.2, then this only limits the maximum
bandwidth in the direction from A to B. In the other direction,
end-point A is still capable of receiving level 3.0.</t>
</section>
<section anchor="IssuesAsymmetricPointToPointSessions"
title="Asymmetric Point-to-Point Sessions">
<t>The session setup for asymmetric streams is not always
straightforward. Let's say that one wants to set up a session with 600 kbps in
the sending direction and 200 kbps in the receiving direction.</t>
<figure title="SDP offer to receive 200 kbps video">
<artwork><![CDATA[m=video 49200 RTP/AVP 96
b=AS:200
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00c
a=sendrecv
]]></artwork>
</figure>
<t>From this SDP, it can be determined that the end-point wants to
receive 200 kbps. There is some implicit information in the level
part of the profile-level-id for the H.264 example above, indicating
that the end-point can send using a higher bandwidth (up to 768
kbps), but it requires codec-specific knowledge to be able to
extract that implicit information. In this example, let's assume that
the sender does not even want to utilize the maximum allowed
bandwidth for the signaled level, but a slightly lower one, say 600
kbps. So how is the answerer supposed to know that the offerer
really wants to send up to 600 kbps, especially since not even the
implicit level-related information can be used? There could be many
reasons to use a lower video bandwidth than the one defined as level
maximum: limited terminal performance in the send direction, a known network
bandwidth limitation, a bandwidth charging model that makes the user
prefer a lower bandwidth, etc.</t>
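<t>As an illustration of the codec-specific knowledge mentioned above,
the following Python sketch splits a profile-level-id value into its
three bytes: the profile indicator, the constraint flags and the level
indicator. The function name is hypothetical. Mapping a level to a
maximum bit rate additionally requires the level tables of the H.264
specification, which are not reproduced here.</t>
<figure title="Illustrative sketch: splitting an H.264 profile-level-id">
<artwork><![CDATA[
```python
def parse_profile_level_id(value):
    """Split a six-hex-digit profile-level-id into three bytes."""
    if len(value) != 6:
        raise ValueError("expected six hexadecimal digits")
    raw = bytes.fromhex(value)
    profile_idc, constraint_flags, level_idc = raw
    return profile_idc, constraint_flags, level_idc


profile_idc, flags, level_idc = parse_profile_level_id("42c00c")
# level_idc is ten times the level number (level 1b is ignored
# in this sketch), so 12 corresponds to level 1.2.
print(profile_idc, flags, level_idc / 10)
```
]]></artwork>
</figure>
<t>A resource allocation function that does not implement this kind of
parsing for every codec in use cannot recover even the implicit
bandwidth information.</t>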
<t>One way to express the asymmetry is to set up different RTP
sessions for sending and receiving directions. An SDP offer for this
might be:</t>
<figure title="SDP offer with separate sessions for send and receive">
<artwork><![CDATA[m=video 49200 RTP/AVP 96
b=AS:600
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=sendonly
m=video 49202 RTP/AVP 97
b=AS:200
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42c00c
a=recvonly
]]></artwork>
</figure>
<t>If the answerer decides to accept this then the SDP answer might
be:</t>
<figure title="SDP answer with separate sessions for send and receive">
<artwork><![CDATA[m=video 49200 RTP/AVP 96
b=AS:600
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=recvonly
m=video 49202 RTP/AVP 97
b=AS:200
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42c00c
a=sendonly
]]></artwork>
</figure>
<t>In this example, it is clear that the offerer can send video with
600 kbps and receive video with up to 200 kbps. However, if the
offer is for different codecs, using different bandwidths, then one
has the same problem as described in <xref
target="IssuesSymmetricPointToPointSessionsSeveralRateAdaptiveCodecs"/>.</t>
<t>Specifically for video, but possibly also for other media, it may
happen that different implementations send the media in different
ways. Some implementations may try to provide a fairly
“smooth” stream in terms of bandwidth variation over
time, while other implementations may give a very
“bursty” stream.</t>
<t>There also exist cases where opening additional RTP sessions just
for expressing asymmetric transmission bandwidths is not
desirable.</t>
</section>
</section>
<section anchor="IssuesSessionsWithMultipleStreams"
title="Sessions with Multiple Streams">
<t>In this part of the analysis, it is assumed that an RTP session is
set up for multiple streams. This can be done in several ways and for
several reasons, as discussed in <xref
target="I-D.westerlund-avtcore-multiplex-architecture">RTP
Multiplexing Architecture</xref>.</t>
<section anchor="IssuesMultipleStreams" title="Multiple Streams">
<t>The assumed usage here is a multi-party session, for example a
video conference using an RTP mixer. Some of the attendees are
active and their audio and video is distributed to the other users.
Some attendees are inactive and thus only receive media. In this
example, each end-point sends one video stream, but can receive up
to four simultaneous video streams, multiplexed as different SSRCs in
the same RTP session. One or more central nodes (RTP mixers) are used
to help facilitate the media transport between the participants, and
are involved in choosing the streams to be forwarded. Assume that
there is an aggregate bandwidth limit of 3 Mbps in the receive
direction, and that each received video stream should be limited to
max 1 Mbps.</t>
<t>An SDP offer for setting up a session with one video stream
for the sending direction and four video streams for the receiving
direction is shown below when using <xref
target="I-D.westerlund-avtcore-max-ssrc"/> to explicitly declare
capability to handle multiple streams. In this case, only the legacy
‘b=AS’ bandwidth attribute is used, valid only for the
aggregate.</t>
<figure title="SDP offer to receive multiple video streams">
<artwork><![CDATA[m=video 49300 RTP/AVP 96
b=AS:3000
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c016
a=max-recv-ssrc:* 4
]]></artwork>
</figure>
<t>This example again highlights the asymmetry problem with the
existing bandwidth attribute, but it also highlights the lack of
per-stream bandwidth specification. This means that it is not
possible to declare the 1 Mbps bandwidth limit that should be used
for each one of the four streams in the receiving direction, which
thus is a desirable property of the new bandwidth attribute. Note
also that in this example, the 1 Mbps limit per stream cannot be
fully utilized if all four streams are used simultaneously.</t>
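<t>The interaction between the aggregate limit and the per-stream limit
in this example can be sketched in Python as follows. The function name
is hypothetical, and the equal split of the aggregate between the
active streams is an assumption of the sketch, not something the
existing attribute can signal.</t>
<figure title="Illustrative sketch: per-stream share of an aggregate limit">
<artwork><![CDATA[
```python
def per_stream_share_kbps(aggregate_kbps, per_stream_kbps, streams):
    """Largest equal per-stream rate that respects both limits."""
    return min(per_stream_kbps, aggregate_kbps / streams)


for active in (1, 2, 3, 4):
    print(active, per_stream_share_kbps(3000, 1000, active))
```
]]></artwork>
</figure>
<t>With up to three active streams each stream can use its full 1 Mbps,
but with four active streams the 3 Mbps aggregate leaves only 750 kbps
per stream.</t>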
<!--MW: On some mailing list discussion the issue of per SSRC bandwidth was raised. Should consider if we need to define
a scope parameter<SSRC:XYZ> to allow for that? Here would be a good location to introduce that.-->
</section>
</section>
<section anchor="IssuesUserExperienceAndBandwidthNegotiation"
title="User Experience and Bandwidth Negotiation">
<t>Resource allocation is typically a compromise between perceived
quality and network utilization. From an end-user perspective, the
bandwidth for a service should be as high as possible, since this
should give the best user experience. However, from a network
perspective, one would like to minimize the rate, since this should
maximize the number of sessions that can be supported.</t>
<t>For some services, like conversational voice- and/or
video-telephony, one needs to ensure that the network is capable of
delivering a certain "least required" rate, even when the network
load is high. This is needed to ensure user satisfaction, both in
terms of quality and end-to-end delay. This means that the end-points
and the network need to agree on which maximum bandwidth can be
used for the session as well as on some lowest useful "least required"
bandwidth.</t>
<t>The current bandwidth modifier, ‘b=AS’, is used to
negotiate the maximum bandwidth. However, since it only allows
negotiating a single bandwidth, it cannot be used to also negotiate a lower
bandwidth limit.</t>
<t>To solve this, a new bandwidth negotiation method should allow for
negotiating not only the highest rate but also the “at least
required” rate. To enable a negotiation between the end-point
and the network, a reasonable approach is that the end-point requests
a lower bandwidth limit and then the network indicates which
“least required” rate was granted.</t>
</section>
<section anchor="SummaryOfFindings" title="Summary of Findings">
<t>It should be clear from the above discussion that the current
bandwidth attribute is too limited to be used for all use cases and
that some extensions are needed.</t>
<t>The current bandwidth attribute, ‘b=AS’, is sufficient
for simple sessions but gives ambiguities when negotiating more
advanced session types. One of the drawbacks is that
‘b=AS’ only indicates the desired bandwidth for the
receiving direction but, if the answering end-point wants to use a
lower rate than what is offered, then there is often no way for the
resource allocation function to know which bandwidth will be used
in the offerer's sending direction.</t>
<t>Implementers of end-points and resource allocation functions may
try to resolve this ambiguity by using other information available in
the SDP, e.g. codec-specific information. However, such information is
not always easily available, e.g. for video codecs.</t>
<t>End-points may have to perform a second offer/answer negotiation to
resolve the ambiguity. This, obviously, has the drawbacks that the SIP
traffic is increased and that this takes some extra time. It is also
not guaranteed that the end-points will actually initiate a second
offer/answer negotiation.</t>
<t>The analysis above has also shown that the current bandwidth
attribute is insufficient to properly describe the session for
multi-stream scenarios.</t>
<t>The analysis above has also shown that the current bandwidth
modifier can be used to negotiate the maximum bit rate in bearers
allocated in some wireless networks, but it is insufficient for also
negotiating a lower, "least required", bandwidth limit.</t>
<t>Another problem with the existing bandwidth attribute is that the
syntax is very limited and does not allow for introducing extensions,
only additional identifiers with a single value each.</t>
<t>It is therefore proposed to define a new bandwidth attribute,
including a new syntax. The new bandwidth attribute should
support:<list style="hanging">
<t hangText="Directionality:">One needs to be able to have
different sets of attribute values depending on direction.</t>
<t hangText="Payload specific:">With the new bandwidth attribute
it should be possible to specify different bandwidth values for
different RTP Payload types. This is because some codecs have
different characteristics and one may want to limit a specific
codec and payload configuration to a particular bandwidth.
Especially combined with codec negotiation there is a need to
express intentions and limitations on usage for that particular
codec. In addition, payload agnostic information is also
needed.</t>
<t hangText="Multiple streams:">The new bandwidth attribute should
support bandwidth negotiation both for single streams and for
multiple streams. When multiple streams are used, the new
bandwidth attribute should allow for declaring both the bandwidth
per stream and the aggregated bandwidth.</t>
<t hangText="Bandwidth specification method:">To have a clear
specification of what any bit-rate values mean we propose that
Token bucket parameters should be used, i.e. bucket depth and
bucket fill rate, where appropriate for the semantics. If single
values are to be specified, a clear definition of how to derive
such a value must be given, including averaging intervals
etc.</t>
<t hangText="Bandwidth semantics:">It should be possible to
negotiate different types of bandwidths for each scope, including
several bandwidth properties in the same negotiation. It should,
at least, be possible to negotiate the highest bandwidth and a
lower bandwidth limit that indicates the lowest useful bandwidth
to use the related media. The least required bandwidth limit
should ideally, but need not necessarily, be guaranteed by the
network and the remote end-point(s).</t>
<t hangText="Extensibility:">The semantics need to be extensible,
so that new semantics can be defined in the future.</t>
</list></t>
<t>The existing bandwidth modifier, ‘b=AS’, is widely used
today. The existing SDP attributes for directionality,
‘a=sendrecv’, ‘a=recvonly’,
‘a=sendonly’ and ‘a=inactive’, are also widely
used. It is therefore important to ensure interworking between the new
bandwidth attribute and the mechanisms already existing in SDP.</t>
</section>
</section>
<section anchor="AttributeSpecification" title="Attribute Specification">
<t>This section proposes a new bandwidth attribute, ‘a=bw’,
that can be used either as an extension to the existing
bandwidth modifier ‘b=AS’ or as a replacement for it.
The new bandwidth attribute includes semantics that
allow it to replace the existing bandwidth modifier.</t>
<t>The syntax for the new bandwidth attribute is:</t>
<figure>
<artwork><![CDATA[a=bw:<direction> <scope> <semantic>:<value>
]]></artwork>
</figure>
<t>where:<list style="empty">
<t>&lt;direction&gt; is the direction in which the scope and
semantics apply,</t>
<t>&lt;scope&gt; describes the scope to which the definitions
apply,</t>
<t>&lt;semantic&gt; is the actual bandwidth specification,</t>
<t>&lt;value&gt; is the value, in the form defined by the semantic used.</t>
</list></t>
<t>The new attribute is designed to allow for future extensibility.</t>
<section anchor="SDPGrammar" title="SDP Grammar">
<t>The ABNF <xref target="RFC5234">RFC 5234</xref> for this attribute
is the following:</t>
<figure title="ABNF for 'bw' attribute">
<artwork><![CDATA[bw-attrib = "a=bw:" direction SP [req] scope SP
[req] semantics ":" values
direction = "send" / "recv" / "sendrecv" / direction-ext
scope = payloadType / scope-ext
payloadType = "pt=" ("*" / (PT-spec) *("," PT-spec))
PT-spec = PT-value / PT-value-range
PT-value = 1*3DIGIT
PT-value-range = PT-value "-" PT-value
req = "!"
semantics = "SMT" / "AMT" / "SLT" / "SLTR" / "ALT" / "ALTR" /
semantics-ext
values = token-bucket / value-ext
token-bucket = "tb=" br-value ":" bs-value
br-value = "*" / 1*15DIGIT ; Bucket Rate [bps]
bs-value = "*" / 1*15DIGIT ; Bucket Size [bytes]
direction-ext = token ; As defined in RFC 4566
scope-ext = 1*VCHAR ; As defined in RFC 5234
semantics-ext = token ; As defined in RFC 4566
value-ext = 0*(WSP / VCHAR) ; As defined in RFC 5234
]]></artwork>
</figure>
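<t>As a non-normative illustration, the grammar above can be
approximated by a small parser. The following Python sketch is only an
assumption of how an implementation might look: the names BwAttribute
and parse_bw are hypothetical, extension tokens are matched loosely, and
only the token-bucket value form is decoded (any other value is kept as
a raw string).</t>
<figure>
<artwork><![CDATA[
```python
import re
from dataclasses import dataclass
from typing import Optional

# Loose approximation of the bw-attrib ABNF (hypothetical helper).
_BW_RE = re.compile(
    r"^a=bw:(?P<direction>[^ ]+) "
    r"(?P<scope_req>!?)(?P<scope>[^ ]+) "
    r"(?P<sem_req>!?)(?P<semantics>[^ :]+):(?P<values>.*)$"
)
_TB_RE = re.compile(r"^tb=(\*|\d{1,15}):(\*|\d{1,15})$")


@dataclass
class BwAttribute:
    direction: str
    scope: str
    scope_required: bool
    semantics: str
    semantics_required: bool
    rate_bps: Optional[int]    # None for wildcard "*" or non-tb values
    size_bytes: Optional[int]  # None for wildcard "*" or non-tb values
    raw_values: str


def parse_bw(line: str) -> BwAttribute:
    """Parse one 'a=bw:' line into its components."""
    m = _BW_RE.match(line.strip())
    if m is None:
        raise ValueError("not an a=bw line: %r" % line)
    rate = size = None
    tb = _TB_RE.match(m.group("values"))
    if tb is not None:
        rate = None if tb.group(1) == "*" else int(tb.group(1))
        size = None if tb.group(2) == "*" else int(tb.group(2))
    return BwAttribute(
        direction=m.group("direction"),
        scope=m.group("scope"),
        scope_required=m.group("scope_req") == "!",
        semantics=m.group("semantics"),
        semantics_required=m.group("sem_req") == "!",
        rate_bps=rate,
        size_bytes=size,
        raw_values=m.group("values"),
    )
```
]]></artwork>
</figure>
<t>For example, parse_bw("a=bw:send pt=96 SMT:tb=1000000:8192") yields
direction "send", scope "pt=96", semantics "SMT", a bucket rate of
1000000 bps and a bucket size of 8192 bytes.</t>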
<t>The ‘a=bw’ attribute defines three possible
directionalities for the bandwidth:<list style="hanging">
<t hangText="send:">In the sending direction for the SDP
Offer/Answer agent providing the SDP or, in case of declarative use,
in relation to the device that is being configured by the SDP.</t>
<t hangText="recv:">In the receiving direction for the SDP
Offer/Answer agent providing the SDP or, in case of declarative use,
in relation to the device that is being configured by the SDP.</t>
<t hangText="sendrecv:">The provided bandwidth values apply
equally in send and receive directions, i.e. the values configure
the directions symmetrically.</t>
</list></t>
<t>The directionality must be specified when the ‘a=bw’
attribute is used. Only one directionality can be specified on each
‘a=bw’ line. Special care must be taken to avoid
conflicting definitions. For example, if ‘sendrecv’ has
been specified on one ‘a=bw’ line for a scope, e.g.
payload number 96, then the direction cannot be set to
‘send’ or ‘recv’ on another ‘a=bw’
line for the same scope. However, it is allowed to specify
directionality ‘send’ on one ‘a=bw’ line for a
scope and directionality ‘recv’ on another
‘a=bw’ line. This is useful when the bandwidth is
different in different directions. Using ‘sendrecv’ as
directionality on an ‘a=bw’ line is a shortcut in the
sense that it is equivalent to using two separate ‘a=bw’
lines where one uses ‘send’ and the other
‘recv’ but that otherwise are semantically identical.</t>
<t>The scope indicates what is being configured by the bandwidth
semantics on this attribute line. Two different scopes are defined
based on payload type:<list style="hanging">
<t hangText="Payload Type:">The bandwidth configuration applies to
the specific payload type value(s).</t>
<t hangText="pt=*:">Applies to all payload types being used.</t>
</list></t>
<t><list style="empty">
<t>Using pt=* indicates that the definitions apply to all payload
types being used. The scope may be a single payload type value,
e.g. pt=96. A list of payload type values can be created by using
a comma-separated list, e.g. pt=96,98,105. It is also possible to
specify a range of payload type values, e.g. pt=96-102, which
means that the definitions apply to all the payload type numbers
from 96 to 102. It is also possible to combine payload type
values, payload type lists and payload type ranges, e.g.
pt=96,98-102,104,105,110-113.</t>
</list></t>
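<t>As a non-normative illustration of the payload type scope syntax, the
following Python sketch expands a scope string into the payload type
numbers it covers. The function name expand_pt_scope is hypothetical,
and returning None for the wild card is an assumption of this sketch,
not something this specification defines.</t>
<figure>
<artwork><![CDATA[
```python
def expand_pt_scope(scope):
    """Expand 'pt=...' into a sorted list of payload type numbers.

    Returns None for 'pt=*', meaning all payload types being used.
    """
    if not scope.startswith("pt="):
        raise ValueError("not a payload type scope: %r" % scope)
    spec = scope[3:]
    if spec == "*":
        return None
    pts = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            if first > last:
                raise ValueError("bad range: %r" % part)
            pts.update(range(first, last + 1))
        else:
            pts.add(int(part))
    # RTP payload types are 7-bit values
    if any(pt > 127 for pt in pts):
        raise ValueError("payload type out of range")
    return sorted(pts)
```
]]></artwork>
</figure>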
<t>The scope parameter is extensible to allow for adding other scope
definitions in the future.</t>
<t>This specification defines six related semantics. All semantics
represent either the bandwidth consumption of a single stream or the
aggregate of streams as a token bucket defining a transmission profile
which the media sender must stay within. The token bucket values are
the token rate in bits per second and the bucket size in bytes both
provided as integers, see <xref target="RFC2212">RFC 2212</xref>. The
below semantics include the whole IP packet, for example the IP, UDP and
RTP headers and the RTP payload, in what shall be metered when determining if
the send pattern is within the profile. The token bucket definition
allows for wild cards (“*”), which make it possible to indicate
that a token bucket value is wanted without proposing a value.</t>
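<t>As a non-normative illustration of how a sender or policer could
meter traffic against such a profile, the following Python sketch
implements a simple token bucket. It assumes that the bucket starts
full, that the caller supplies send times in seconds, and that packet
sizes are whole IP packets in bytes; the class name TokenBucket is
hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
class TokenBucket:
    """Meter packets against a (rate in bps, size in bytes) profile."""

    def __init__(self, rate_bps, size_bytes):
        self.rate_bps = rate_bps
        self.size_bytes = size_bytes
        self.tokens = float(size_bytes)  # assumption: bucket starts full
        self.last = None                 # time of the previous packet

    def conforms(self, t, packet_bytes):
        """Return True if a packet of packet_bytes sent at time t
        (seconds) is within the profile, consuming tokens if so."""
        if self.last is not None:
            refill = (t - self.last) * self.rate_bps / 8.0
            self.tokens = min(self.size_bytes, self.tokens + refill)
        self.last = t
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False
```
]]></artwork>
</figure>
<t>With a profile of 1 Mbps and 1500 bytes, a second 1500 byte packet
sent 1 ms after the first one is out of profile, since only 125 bytes of
tokens have been refilled by then.</t>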
<t>The definitions of the semantics in more detail are:<list
style="hanging">
<t hangText="SMT (Stream Maximum Token bucket):">The maximum
intended or allowed bandwidth usage, including protocol overhead,
for each individual source (each SSRC) in an RTP session at the
sender side specified by a token bucket. The token bucket wild
cards (“*”) should not be used for the SMT semantics
since it should always be possible to estimate the maximum
bandwidth. This semantics can be used with the scope for
any payload type (pt=*), where it applies independent of encoding
and packetization, or with a specific payload type or a set of
payload types.</t>
<t hangText="AMT (Aggregate Maximum Token bucket):">The maximum
intended or allowed bandwidth usage for the sum of all sources
(SSRCs) in an RTP session according to the specified
directionality at the media sender specified by a token bucket.
The ‘sendrecv’ directionality parameter indicates
equal token buckets in both directions, i.e. the aggregate of
streams sent to an end-point shall be within the token bucket
defined transmission profile, and the aggregate of streams sent
from that end-point shall also be within the same token bucket
profile at the sender. It can be used either to express the
maximum for one particular payload type, for a set of payload
types or for any payload type (pt=*). The token bucket wild card
(“*”) should not be used for the AMT semantics since
it should always be possible to estimate the maximum
bandwidth.</t>
<t hangText="SLT (Stream Least required Token bucket):">The least
required bandwidth, including IP protocol overhead, needed for the
stream for each individual source (each SSRC) in an RTP session as
specified by a token bucket at the sender. When using the SLT
semantic, the SMT semantic SHOULD also be specified for the same
direction and scope. If the SLT semantics is not defined then this
means that the least required bandwidth limit is zero. The least
required bandwidth is the minimum bandwidth that is necessary for
the service to work with usable quality.</t>
<t
hangText="SLTR (Stream Least required Token bucket Request):">The
request for establishing the least required bandwidth, including
protocol overhead, needed for the stream for each individual
source (each SSRC) in an RTP session, as specified by a token
bucket at the stream sender. An end-point may use the SLTR
semantics to request to establish a least required bandwidth. An
end-point using the SLTR semantics may set the token bucket rate
and/or the token bucket size to “*” to indicate that
the end-point has no preference, but that it expects some network
node or the answering end-point to define the value(s). A network
node answering to the SLTR SHALL replace this with the SLT
semantics to indicate the least required bandwidth it sees
necessary and which it has attempted to guarantee. If the request
is for certain specified payload types, a network node that cannot
grant bandwidth based on payload types MAY replace those requested
payload types with “*” in the SLT response to indicate
a payload type agnostic grant. An end-point receiving an SDP with
SLTR, i.e. where the network has not replaced the SLTR semantics
with any SLT semantics, SHOULD NOT assume that the requested
bandwidth is guaranteed.</t>
<t hangText="ALT (Aggregated Least required Token bucket):">The
least required bandwidth, including protocol overhead, needed for
the sum of all sources (all SSRCs) in an RTP session as specified
by a token bucket at the stream sender. When using the ALT
semantic the AMT semantic SHOULD also be specified for the same
direction and scope. The directionality and payload type
considerations for ALT are the same as for AMT. If the ALT
semantics is not defined then this means that the least required
bandwidth is zero.</t>
<t
hangText="ALTR (Aggregated Least required Token bucket Request):">The
request for establishing a least required bandwidth, including
protocol overhead, needed for the sum of all sources (all SSRCs)
in an RTP session as specified by a token bucket at the media
sender side. The directionality and payload type considerations
for ALTR are the same as for SLTR. The ALTR semantics MUST only be
used together with AMT.</t>
</list></t>
<t><list style="empty">
<t>The SMT and AMT semantics, with or without SLT and ALT
respectively, may be used both symmetrically and in a particular
direction. They can be used either to express the maximum (and
minimum) for one particular payload type, for a set of payload
types or for any payload type (pt=*).</t>
</list></t>
<t>The required prefix (“!”) is used when the direction,
scope and semantics are required to be supported and understood by the
end-point consuming the SDP.</t>
</section>
<section anchor="DeclarativeUse" title="Declarative Use">
<t>In declarative usage the SDP attribute is interpreted from the
perspective of the end-point being configured by the particular SDP.
An interpreter MAY ignore ‘a=bw’ attribute lines that
contain an unknown scope or semantics without the
required ("!") prefix. If a "required" prefix is present on an unknown
scope or semantics, the interpreter SHALL NOT use this SDP to
configure the end-point.</t>
</section>
<section anchor="UsageInOfferAnswer" title="Usage in Offer/Answer">
<t>The offer/answer negotiation is performed for each
‘a=bw’ attribute line individually with the scope and
semantics immutable.</t>
<t>An offerer may use the ‘a=bw’ attribute(s) for some or
all of the offered media types. An answerer may remove the
‘a=bw’ attribute(s) for the media types where it was used
in the SDP offer.</t>
<t>The SDP may include an offer for an Aggregated Maximum Token bucket
(AMT) without specifying any Stream Token Buckets (SMTs) for any
individual streams.</t>
<t>When using the ‘a=bw’ attribute to define the token
bucket for a certain scope then the offerer should define token
buckets for all scopes of the same type. For example, if the SDP offer
includes three payload types, e.g. 96, 97 and 98, and if a token
bucket is defined for payload type 96, then the offerer should also
define token buckets for the other payload types. This can be done
either by defining one token bucket each for payload type 97 and 98 or
by defining a common token bucket for payload type 97 and 98.</t>
<t>When the token bucket rate and size are declared in an offer for
directionality ‘sendrecv’ then this indicates the token
bucket rate and the token bucket sizes are the same in both
directions. For example, if the offered bandwidth is 1 Mbps, then the
end-point declares that it is capable of sending with a bandwidth up
to 1 Mbps and that it is capable of receiving with a bandwidth up to 1
Mbps.</t>
<t>If either the token bucket rate(s) or the token bucket sizes are
different in sending and receiving direction then
‘sendrecv’ cannot be used. One should instead include two
or more ‘a=bw’ lines with the respective directionality,
bandwidths and sizes.</t>
<t>When the token bucket parameters are declared in an SDP offer for
directionality ‘send’ then this indicates the token bucket
parameters the sender intends to use. The answerer may change this
value, both to increase it and to reduce it, see below.</t>
<t>When the token bucket parameters are declared in an SDP offer for
directionality ‘recv’ then this indicates the largest
envelope for the token bucket parameters that the offerer thinks the
media sender shall use.</t>
<t>An agent understanding the ‘a=bw’ attribute and
answering to an offer including the ‘a=bw’ attribute
SHOULD include the attribute in the answer for all media types for
which it was offered.</t>
<t>An answerer SHOULD ignore ‘a=bw’ attribute lines that
contain an unknown scope or semantics without the required
("!") prefix. If a "required" prefix is present on an unknown scope or
semantics, then the answerer SHALL reject the media description by
setting the port to 0 and copying the 'a=bw' attributes not understood into
the answer. In this case, 'a=bw' attributes that are understood SHALL
NOT be included in the answer.</t>
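<t>As a non-normative illustration of the required ("!") prefix
handling, the following Python sketch classifies one parsed
‘a=bw’ line and decides whether the media description has to
be rejected. The function names are hypothetical, and the sketch assumes
that an implementation understands exactly the payload type scope and
the six semantics defined in this document.</t>
<figure>
<artwork><![CDATA[
```python
KNOWN_SEMANTICS = {"SMT", "AMT", "SLT", "SLTR", "ALT", "ALTR"}


def classify_bw_line(scope, scope_required,
                     semantics, semantics_required):
    """Return 'accept', 'ignore' or 'reject' for one a=bw line."""
    scope_known = scope.startswith("pt=")  # only scope defined so far
    semantics_known = semantics in KNOWN_SEMANTICS
    if scope_known and semantics_known:
        return "accept"
    if (not scope_known and scope_required) or (
        not semantics_known and semantics_required
    ):
        return "reject"   # unknown item carrying the "!" prefix
    return "ignore"       # unknown item without the "!" prefix


def must_reject_media(lines):
    """lines: iterable of (scope, scope_required,
    semantics, semantics_required) tuples for one media description."""
    return any(classify_bw_line(*line) == "reject" for line in lines)
```
]]></artwork>
</figure>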
<t>If an answerer would like to add additional bandwidth
configurations using other directionality, scope, and semantics
combination, then it MAY do so by adding such definitions in the SDP
answer.</t>
<t>An agent may also divide an ‘a=bw’ offer into several
‘a=bw’ offers. One example is when the SDP offer included
an ‘a=bw’ offer with directionality
‘sendrecv’, which indicates that the token bucket
parameters are the same in sending and receiving direction. If the
answerer would like to change the parameters for one or both
directions, so that the parameters are no longer the same for both
directions, then the answerer can include two ‘a=bw’ lines
in the SDP answer, one for sending direction and another for receiving
direction. In case an offered sendrecv media becomes a single
direction media then the sendrecv can be modified to that single
direction.</t>
<t>An agent responding to an offer will need to consider the
directionality and reverse it in the answer when responding to media
streams using unicast.</t>
<t>For media stream offers over unicast with directionality send, the
answerer SHALL reverse the directionality and indicate its reception
bandwidth capability, which may be lower or higher than what the
sender has indicated as its intended maximum.</t>
<t>For media stream offers over unicast with directionality receive,
the token bucket parameters indicate the upper limits. The answerer
SHALL reverse the directionality and may reduce the bandwidth when
producing the answer, indicating the answerer's intended maximum
transmission rate.</t>
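<t>As a non-normative illustration of the direction reversal for unicast,
the following Python sketch maps an offered directionality to the one
used in the answer and caps the answerer's sending rate at the limit
offered for the receiving direction. The function names are
hypothetical, and only the bucket rate is shown; the bucket size would
be handled in the same way.</t>
<figure>
<artwork><![CDATA[
```python
_REVERSED = {"send": "recv", "recv": "send", "sendrecv": "sendrecv"}


def answer_direction(offer_direction):
    """Directionality to use in the answer for a unicast stream."""
    return _REVERSED[offer_direction]


def answer_to_recv_offer(offered_rate_bps, intended_rate_bps):
    """An offered 'recv' rate is an upper limit: the answerer replies
    with 'send' and may only keep or reduce the rate."""
    return ("send", min(offered_rate_bps, intended_rate_bps))


def answer_to_send_offer(own_receive_capability_bps):
    """An offered 'send' rate is the offerer's intention: the answerer
    replies with 'recv' and its own reception capability, which may be
    lower or higher than the offered value."""
    return ("recv", own_receive_capability_bps)
```
]]></artwork>
</figure>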
<t>If the answerer removes one or several RTP Payload Types from the
SDP when creating the SDP answer then the corresponding
‘a=bw’ lines SHOULD be removed as well. The answerer MAY
however keep an ‘a=bw’ line when the removed RTP Payload
Type number is included within an identified range or list of Payload
Type numbers.</t>
</section>
<section anchor="BucketSizeEstimation" title="Bucket Size Estimation">
<t>In SDP bandwidth terms, the bucket size is a new parameter and what
value to use for it may be hard to understand for implementers of this
specification. This section therefore gives some guidelines on how to
set bucket size values.</t>
<t>A token bucket specifies an envelope for a transmission profile
where individual measurements have some impact if the media stream or
aggregate should be considered within the specified profile. The
semantics defined in this document only require that the media stream
is within the token bucket specification at the point emitting it into
the network. The network may add jitter causing the media
stream/aggregate to no longer be within the specified token bucket
profile.</t>
<section anchor="SenderBucket" title="Sender Specified Token Bucket">
<t>A sender SHOULD base the choice of token bucket size on how it
plans to send data. That can in turn be decided from e.g. codec
configuration, intended number of encoded frames per packet (ptime),
network interface, maximum transmission unit (MTU), etc. In
practice, for the simplified case where the sender is designed to
send all packets with precisely even time spacing, the token bucket
size can be set to the maximum packet size and the bit-rate to the
long term highest bit-rate intended to be used.</t>
<t>However, for media streams that are more variable the bucket
parameters should be chosen so that the emitted traffic is not too
bursty measured over a shorter interval. Until the bucket is
drained, the media sender will be able to emit packets at or close
to the interface's maximum bit-rate. Long bursts of packets at
interface speed are more sensitive to loss due to cross-traffic
in switching fabrics with small buffers. Due to this, a sender can
consider transmission scheduling to a rate lower than the interface
rate but higher than the token bucket average rate.</t>
<t>Consider the example of a large video intra frame
consisting of 10 full MTU packets (assume 1500 bytes each), which is
5 times the median frame size of two full MTU packets.
The average bit-rate may be 1 Mbps. If the token bucket were
configured as (1 Mbps, 1500), then a new full
MTU packet could be emitted no more often than once every 12
ms. Transmitting the intra frame would then require 120 ms, which for
25 frames per second video is 3 frame intervals, potentially
inducing significant playout jitter at a receiver. A token bucket
specification of (1 Mbps, 15000) would allow all 10 packets to be sent
at up to line speed. If there is no competing traffic, this could
result in the whole burst being emitted in about 1.2 ms (one packet
every 0.12 ms) over a 100 Mbps interface. To
ensure that a 10 packet burst can be transmitted within
one frame interval of 40 ms, the bucket depth needed is the burst
size in bits, minus the time interval times the bucket fill rate, with the
resulting value converted back into bytes: (15000*8-0.04*1M) / 8 =
10000 bytes. The average bit-rate for this intra frame over a single
frame period becomes 3 Mbps (15000*8 bits in 0.04 s). So the question
is whether bursts up to 3 Mbps should be allowed now and then as long
as the average is within 1 Mbps, or whether the sender has to transmit
the intra frame using several
frame intervals, skipping the next frame(s) and hoping that the
receiver doesn't drop the intra frame as being too late. The sender
could also consider reducing the quality of the intra frame,
resulting in a reduced number of MTU-sized packets required to transmit it.</t>
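<t>The arithmetic in this example can be reproduced with the following
non-normative Python sketch. The function names are hypothetical. Note
that, computed this way, 15000 bytes sent within one 40 ms frame
interval corresponds to an average of 3 Mbps over that interval, and a
1500 byte packet takes 0.12 ms to serialize at 100 Mbps.</t>
<figure>
<artwork><![CDATA[
```python
def serialization_time_s(packet_bytes, rate_bps):
    """Time needed to emit one packet at the given rate."""
    return packet_bytes * 8 / rate_bps


def bucket_depth_for_burst(burst_bytes, fill_rate_bps, interval_s):
    """Bucket depth in bytes so that burst_bytes can be sent within
    interval_s, given tokens refilled at fill_rate_bps meanwhile."""
    bits = burst_bytes * 8 - interval_s * fill_rate_bps
    return max(0.0, bits / 8)


def average_rate_bps(burst_bytes, interval_s):
    """Average bit-rate of a burst spread over interval_s."""
    return burst_bytes * 8 / interval_s


MTU = 1500                # bytes, as assumed in the example
BURST = 10 * MTU          # intra frame of 10 full MTU packets
per_packet = serialization_time_s(MTU, 1_000_000)       # 12 ms
whole_frame = 10 * per_packet                            # 120 ms
depth = bucket_depth_for_burst(BURST, 1_000_000, 0.04)   # 10000 bytes
peak = average_rate_bps(BURST, 0.04)                     # 3 Mbps
```
]]></artwork>
</figure>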
<t>A sender SHOULD avoid adding excessive safety margins to the
sending bucket size. A sender MAY add bucket size margins if it has
knowledge of internal transmission timing variations, or if it knows
about packet handling outside the sender itself that will affect the
effective bucket size (as seen from a receiver) that is otherwise
not reflected in the conveyed bucket size figure.</t>
</section>
<section anchor="ReceiverBucket"
title="Receiver Specified Token Bucket">
<t>With the semantics specified in this document, the intended media
receiver gets to provide token bucket parameters that specify how
the sender should behave. The traffic received by the receiver (or
intermediate nodes) may no longer conform to the token bucket due to
jitter introduced by the network path between the sender and the
receiver. This document assumes that the receiver will have receiver
buffers for de-jittering that are significantly larger than the
token bucket parameters. This is because a media unit like a video
frame may contain more data than the bucket
depth allows to be sent at once, so the sender instead spreads it in
time, transmitting each
fragment when the bucket is refilled enough for the next fragment to
be sent.</t>
<t>A receiver's input to the sender's bit-rate limitation should be
based on known limitations such as the networks, decoding
capabilities etc. The bucket depth will control how bursty the
traffic can be beyond the long term average specified by the bucket
refill rate.</t>
</section>
<section anchor="BucketAdjustmentInMiddleNodes"
title="Bucket Adjustment in Middle Nodes">
<t>When there are media aware middle nodes on the media path between
the sender and receiver, those middle nodes may have to or want to
apply similar considerations as the original media sender and
receiver. If those middle nodes are aware of SDP and the new
bandwidth attribute from this specification, and have in-path SDP
adjustment capabilities, they could benefit from modifying the
values to better fit the actually available end-to-end media path
capabilities. For example, an RTP Media Translator can express what
it actually is going to deliver of the far end-point's media to an
end-point instead of that far end-point's provided values.</t>
</section>
<section title="Network Policing">
<t>As the token bucket specified for the semantics in this document
is based on what the sender emits into the network, a policer should
have some margin allowing for network-introduced jitter. The amount
will of course be dependent on the policer's location in relation to
the media sender.</t>
</section>
<section anchor="DynamicUpdateOfSenderTokenBucket"
title="Utilizing Network Feedback">
<t>If the media uses RTP and has been transmitted for
some time, the sender should have received a fair number of RTCP
receiver reports from the receiver. From RTCP the sender can
estimate the network jitter observed at the receiver and may be able
to dynamically adjust its behavior such that the aggregate of
the sender behavior and the reported network jitter fulfills
the sender's token bucket profile.</t>
</section>
</section>
<section anchor="SDPExamplesPointToPointSessionsUsingSDPOfferAnswer"
title="SDP Examples for Point-to-point Sessions">
<t>These SDP examples show how the new bandwidth attribute can be
used. The benefits, compared to the legacy bandwidth attribute, are
also highlighted.</t>
<t>The SDP examples included below are intentionally not complete.
Only the parts that are relevant for this description are
included.</t>
<section anchor="SymmetricPointToPointSessionsFixedRateCodecs"
title="Symmetric Fixed-rate Codecs">
<t>This example shows the SDP offer for several fixed-rate codecs,
mu-law and A-law PCM, G.726 and G.729.</t>
<figure title="SDP offer for mu-law and A-law PCM and IPv4">
<artwork><![CDATA[m=audio 49200 RTP/AVP 8 0 96 18
b=AS:80
a=rtpmap:96 G726-32/8000/1
a=bw:sendrecv pt=0,8 SMT:tb=80000:1000
a=bw:sendrecv pt=96 SMT:tb=48000:1000
a=bw:sendrecv pt=18 SMT:tb=24000:1000
a=ptime:20
a=maxptime:20
]]></artwork>
</figure>
<t>The new bandwidth attribute offers the possibility to negotiate
the bandwidth individually for each codec. If the answerer removes a
codec when creating the answer then it is still known how much
bandwidth the other codecs will use. This means that the ambiguities
listed in <xref
target="IssuesSymmetricPointToPointSessionsFixedrateCodecs"/> can be
avoided.</t>
</section>
<section anchor="SymmetricPointToPointSessionsRateAdaptiveCodec"
title="Symmetric Rate-Adaptive Codec">
<t>This example shows the SDP offer/answer negotiation for the AMR
codec, <xref target="AMR">AMR</xref>.</t>
<figure title="SDP offer from end-point A for AMR and IPv4">
<artwork><![CDATA[m=audio 49200 RTP/AVP 97
b=AS:29
a=rtpmap:97 AMR/8000/1
a=fmtp:97 mode-change-capability=2; max-red=80
a=bw:sendrecv pt=97 SMT:tb=28800:200
a=bw:sendrecv pt=97 SLTR:tb=22400:200
a=ptime:20
a=maxptime:100
]]></artwork>
</figure>
<figure title="SDP answer from end-point B also for AMR and IPv4">
<artwork><![CDATA[m=audio 49100 RTP/AVP 97
b=AS:29
a=rtpmap:97 AMR/8000/1
a=fmtp:97 mode-change-capability=2; max-red=80
a=bw:sendrecv pt=97 SMT:tb=28800:200
a=bw:sendrecv pt=97 SLT:tb=22400:200
a=ptime:20
a=maxptime:100
]]></artwork>
</figure>
<t>Since the new bandwidth attribute offers a possibility to
negotiate both the maximum and the least required bandwidth, it
is possible for both the other end-point and any resource allocation
function to know how the end-points will adapt when congestion is
detected.</t>
</section>
<section anchor="SymmetricPointToPointSessionsSeveralRateAdaptiveCodecs"
title="Symmetric Several Rate-Adaptive Codecs">
<t>This example shows how the new bandwidth attribute,
‘a=bw’, can be used to negotiate the maximum and the
least required bandwidths for multiple rate-adaptive codecs, in this
case for AMR and AMR-WB, <xref target="AMR-WB">AMR-WB</xref>. For
AMR, the highest codec mode is 12.2 kbps, giving a maximum bandwidth
of 28.8 kbps, and the least required mode is selected to be 5.9
kbps, giving a least required bandwidth of 22.4 kbps. For AMR-WB,
the highest codec mode is 23.85 kbps, giving a maximum bandwidth of
40.4 kbps, and the least required mode is 8.85 kbps, giving a least
required bandwidth of 25.6 kbps.</t>
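<t>The bandwidth figures above appear consistent with the following
non-normative Python sketch. It rests on assumptions that are not stated
in this section: the bandwidth-efficient AMR payload format with one
frame per packet (a 4-bit CMR plus a 6-bit ToC entry, i.e. 10 payload
header bits), 20 ms packetization, 40 bytes of IPv4, UDP and RTP
headers, and speech frame sizes in bits taken from the codec
specifications. The function name rtp_bandwidth_bps is
hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
import math


def rtp_bandwidth_bps(frame_bits, payload_header_bits=10,
                      ip_udp_rtp_bytes=40, ptime_ms=20):
    """Bandwidth including IP/UDP/RTP overhead, one frame per packet."""
    payload_bytes = math.ceil((frame_bits + payload_header_bits) / 8)
    packets_per_second = 1000 / ptime_ms
    return int((payload_bytes + ip_udp_rtp_bytes) * 8
               * packets_per_second)


# Speech bits per 20 ms frame (assumed from the codec specifications)
AMR_12_2, AMR_5_9 = 244, 118
AMR_WB_23_85, AMR_WB_8_85 = 477, 177

amr_max = rtp_bandwidth_bps(AMR_12_2)         # 28800 bps
amr_least = rtp_bandwidth_bps(AMR_5_9)        # 22400 bps
wb_max = rtp_bandwidth_bps(AMR_WB_23_85)      # 40400 bps
wb_least = rtp_bandwidth_bps(AMR_WB_8_85)     # 25600 bps
```
]]></artwork>
</figure>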
<figure title="SDP offer from end-point A for AMR-WB, AMR and IPv4">
<artwork><![CDATA[m=audio 49200 RTP/AVP 96 97
b=AS:41
a=rtpmap:96 AMR-WB/16000/1
a=fmtp:96 mode-change-capability=2; max-red=80
a=rtpmap:97 AMR/8000/1
a=fmtp:97 mode-change-capability=2; max-red=80
a=bw:sendrecv pt=96 SMT:tb=40400:350
a=bw:sendrecv pt=96 SLTR:tb=25600:350
a=bw:sendrecv pt=97 SMT:tb=28800:200
a=bw:sendrecv pt=97 SLTR:tb=22400:200
a=ptime:20
a=maxptime:100
]]></artwork>
</figure>
<figure title="SDP answer from end-point B for AMR and IPv4 (AMR-WB is removed)">
<artwork><![CDATA[m=audio 49100 RTP/AVP 97
b=AS:29
a=rtpmap:97 AMR/8000/1
a=fmtp:97 mode-change-capability=2; max-red=80
a=bw:sendrecv pt=97 SMT:tb=28800:200
a=bw:sendrecv pt=97 SLT:tb=22400:200
a=ptime:20
a=maxptime:100
]]></artwork>
</figure>
<t>In this case, it is clear when the answer is received that the
bandwidth needed for AMR applies to both directions. There is no
need for a second offer/answer negotiation to clarify that the
bandwidth applies also to end-point A’s receiving direction.
Thereby, the issues listed in <xref
target="IssuesSymmetricPointToPointSessionsSeveralRateAdaptiveCodecs"/>
are resolved.</t>
</section>
<section anchor="AsymmetricPointToPointSessions"
title="Asymmetric Session">
<t>The following SDP example shows how to use the new bandwidth
attribute to offer asymmetric streams. In this case, the end-point
offers to send H.264 video at 1 Mbps while it is capable of
receiving H.264 at up to 3 Mbps. Note that this example does not
make use of the codec-specific H.264 level asymmetry signaling as
defined in <xref target="RFC6184">RFC 6184</xref>.</t>
<figure title="SDP offer with asymmetric video bandwidth">
<artwork><![CDATA[m=video 50324 RTP/AVP 96
b=AS:3000
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c016
a=bw:send pt=96 SMT:tb=1000000:8192
a=bw:recv pt=96 SMT:tb=3000000:16384
]]></artwork>
</figure>
<t>It should be clear from this example that the new bandwidth
attribute is useful when negotiating asymmetric sessions since it
offers the possibility to define the token bucket parameters for
both sending and receiving directions separately.</t>
</section>
<section anchor="PointToPointSessionsWithRetransmissions"
title="Session with Retransmission">
<t>This SDP example shows how the new bandwidth attribute,
‘a=bw’, can be used for negotiating the bandwidth when
the RTP Retransmission Payload Format <xref target="RFC4588">RFC
4588</xref> is used.</t>
<figure title="SDP offer with aggregate bandwidth and RTP retransmission">
<artwork><![CDATA[m=video 49170 RTP/AVPF 96 97
b=AS:500
a=rtpmap:96 MP4V-ES/90000
a=rtcp-fb:96 nack
a=fmtp:96 profile-level-id=8; config=01010000012000884006682C2090A21F
a=rtpmap:97 rtx/90000
a=fmtp:97 apt=96;rtx-time=3000
a=bw:send pt=* AMT:tb=500000:4096
a=bw:recv pt=* AMT:tb=500000:8192
]]></artwork>
</figure>
<t>In this case, it is beneficial to use the Aggregate Maximum Token
bucket semantics to allow the end-points to adapt the bandwidths
used for the original stream and for the retransmission stream
during the session. The end-point can send more original packets
when the packet loss rate is low. When the packet loss rate is high
then the end-point can use less bandwidth for the original packets
and instead allow for more retransmissions. It would also be
possible to specify separate limits for the original stream and the
retransmission stream by using a separate set of
‘a=bw’-lines for pt=96 and pt=97.</t>
</section>
</section>
<section anchor="SDPExamplesSessionsWithMultipleStreams"
title="SDP Examples with Sessions with Multiple Streams">
<t/>
<section anchor="MultipleStreams" title="Multiple Streams">
<t>The example below is based on the use case described in <xref
target="IssuesMultipleStreams"/>. Only the negotiation for video is
shown here.</t>
<figure title="SDP offer with both per-stream and aggregate bandwidth">
<artwork><![CDATA[m=video 49300 RTP/AVP 96
b=AS:3000
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c01f
a=bw:send pt=* SMT:tb=1000000:1000
a=bw:recv pt=* SMT:tb=1000000:2000
a=bw:send pt=* AMT:tb=1000000:1000
a=bw:recv pt=* AMT:tb=3000000:6000
a=max-recv-ssrc:* 4
]]></artwork>
</figure>
<t>With the new bandwidth attribute, it is possible to define the
bandwidth for each received stream independently of the others. In
this case, the SDP shows that the end-point is prepared to send
at most 1 Mbps, and that the end-point is prepared to receive
at most 1 Mbps per stream. The SDP also shows that the end-point is
prepared to receive at most 3 Mbps, aggregated over the up to four
streams in the receiving direction. Note that this implies that to
receive more than three streams, each stream’s bandwidth must
be reduced to comply with the maximum aggregate.</t>
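<t>The interaction between the per-stream and the aggregate limits can be
illustrated with the following non-normative Python sketch. It assumes
that the aggregate is shared equally between the streams, which is only
one possible policy; the function name max_rate_per_stream is
hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
def max_rate_per_stream(smt_bps, amt_bps, n_streams):
    """Highest equal per-stream rate that respects both the
    per-stream (SMT) and the aggregate (AMT) bucket rates."""
    if n_streams < 1:
        raise ValueError("need at least one stream")
    return min(smt_bps, amt_bps // n_streams)


three = max_rate_per_stream(1_000_000, 3_000_000, 3)  # 1000000 bps
four = max_rate_per_stream(1_000_000, 3_000_000, 4)   # 750000 bps
```
]]></artwork>
</figure>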
</section>
<section anchor="DeclarativeExampleWithStreamAsymmetry"
title="Declarative Example with Stream Asymmetry">
<t>This example shows a declarative usage of the new bandwidth
attribute.</t>
<figure title="SDP offer with payload-specific per-stream bandwidth">
<artwork><![CDATA[m=video 50324 RTP/AVP 96 97 98
a=rtpmap:96 H264/90000
a=rtpmap:97 H263-2000/90000
a=rtpmap:98 MP4V-ES/90000
a=max-recv-ssrc:96 2
a=max-recv-ssrc:* 5
a=bw:send pt=* SMT:tb=1200000:16384
a=bw:recv pt=96 SMT:tb=1500000:16384
a=bw:recv pt=97,98 SMT:tb=2500000:16384
a=bw:recv pt=* AMT:tb=8000000:65535
]]></artwork>
</figure>
<t>In the above example, the single outgoing stream is limited to a
bucket rate of 1.2 Mbps and a bucket size of 16384 bytes. The up to
5 incoming streams can in total use at most an 8 Mbps bucket rate
with a bucket size of 65535 bytes. However, the maximum rate of each
individual stream depends on its payload type. Payload type 96
(H.264) is limited to 1.5 Mbps with a bucket size of 16384 bytes,
while payload types 97 (H.263) and 98 (MPEG-4) may use up to 2.5
Mbps with a bucket size of 16384 bytes.</t>
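<t>As an illustration, a receiver of this SDP could look up the
per-stream limit for a given payload type as sketched below. The
regular expression is inferred only from the example lines in this
section and is not a substitute for the formal syntax of the
attribute. All names (BwLine, parse_bw, recv_smt_limit) are
hypothetical, and the rule that a line naming a payload type takes
precedence over a ‘pt=*’ line is an assumption made for
this sketch.</t>
<figure>
<artwork><![CDATA[
```python
import re
from typing import NamedTuple, Optional

class BwLine(NamedTuple):
    direction: str      # "send", "recv" or "sendrecv"
    pts: frozenset      # payload types as strings; {"*"} is the wildcard
    semantic: str       # e.g. "SMT" or "AMT"
    rate_bps: int       # token bucket rate
    bucket_bytes: int   # token bucket size

# Pattern inferred from the examples only (assumption).
_BW_RE = re.compile(
    r"a=bw:(sendrecv|send|recv) pt=(\*|\d+(?:,\d+)*) "
    r"([A-Za-z]+):tb=(\d+):(\d+)"
)

def parse_bw(line: str) -> BwLine:
    m = _BW_RE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"not a recognised a=bw line: {line!r}")
    direction, scope, semantic, rate, size = m.groups()
    return BwLine(direction, frozenset(scope.split(",")), semantic,
                  int(rate), int(size))

def recv_smt_limit(lines, pt: int) -> Optional[BwLine]:
    """Return the SMT line governing reception of payload type pt.

    Assumption: a line naming the payload type wins over pt=*.
    """
    wildcard = None
    for bw in map(parse_bw, lines):
        if bw.semantic != "SMT" or bw.direction == "send":
            continue
        if str(pt) in bw.pts:
            return bw
        if "*" in bw.pts:
            wildcard = bw
    return wildcard

OFFER = [
    "a=bw:send pt=* SMT:tb=1200000:16384",
    "a=bw:recv pt=96 SMT:tb=1500000:16384",
    "a=bw:recv pt=97,98 SMT:tb=2500000:16384",
    "a=bw:recv pt=* AMT:tb=8000000:65535",
]

for pt in (96, 97, 98):
    bw = recv_smt_limit(OFFER, pt)
    print(pt, bw.rate_bps, bw.bucket_bytes)
```
]]></artwork>
</figure>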
</section>
</section>
<section anchor="InteroperabilityIssues" title="Interoperability Issues">
<t>The proposed new bandwidth attribute is closely related to the
bandwidth modifier ‘b=AS’ and to the directionality
attributes (‘a=sendrecv’, ‘a=sendonly’,
‘a=recvonly’ and ‘a=inactive’) defined in
<xref target="RFC4566">RFC 4566</xref>. It is therefore important to
properly analyze these relationships so that any interoperability
issues can be avoided.</t>
<section anchor="InteroperabilityWithExistingBandwidthAttribute"
title="Interoperability with Existing Bandwidth Attribute">
<t>If the SDP includes both the ‘b=AS’ bandwidth
modifier and the ‘a=bw’ bandwidth attribute, the two may
need to be aligned to avoid confusion. This section gives some
guidelines for such alignment. Some usages may need alignments other
than those discussed below; these need to be considered on a
case-by-case basis. The discussion below should therefore not be
seen as an exhaustive list.</t>
<t>In general, the bandwidths offered with ‘b=AS’ and
‘a=bw’ should be aligned for the direction to which the
‘b=AS’ bandwidth modifier applies. For
‘sendrecv’ and ‘recvonly’ sessions,
‘b=AS’ indicates the bandwidth for the receiving
direction. The interpretation of ‘b=AS’ is closest to
the AMT semantic. If the stream maximum semantic (SMT) is used, the
sum of the SMT bandwidths in the receive direction may exceed the
‘b=AS’ bandwidth, but the AMT should not exceed the
‘b=AS’ value.</t>
<t>If the session includes multiple streams but not all of them will
be active simultaneously, then ‘b=AS’ should indicate
the maximum bandwidth that will be used by any combination of
streams that are active simultaneously, in the same way as AMT could
be used in such a session. This also means that the bandwidths
offered with ‘a=bw’ are accumulated over the combination
of streams that are active, and this aggregated bandwidth should not
exceed the bandwidth defined with ‘b=AS’. Note however
that it is possible and feasible to specify an aggregate that is
less than the sum of the maximum bandwidths for the maximum number
of available streams. In that case the end-point can either use the
maximum number of active streams, each at a lower bandwidth than its
maximum, or reduce the number of active streams to stay within the
bandwidth limit.</t>
<t>The SDP below gives an example of how this is done. In this
example, the intention is to use either the payload type pair (96,
97) or the payload type pair (98, 99), but not to combine payload
types across pairs, for example 96 with 98.</t>
<figure title="SDP offer with complex bandwidth relations">
<artwork><![CDATA[m=video 50000 RTP/AVP 96 97 98 99 100
b=AS:1000
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42c00c
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=42c00d
a=rtpmap:99 H264/90000
a=fmtp:99 profile-level-id=42c00c
a=rtpmap:100 H264/90000
a=fmtp:100 profile-level-id=42c00c
a=bw:sendrecv pt=96 SMT:tb=700000:4000
a=bw:recv pt=97 SMT:tb=300000:3000
a=bw:sendrecv pt=98 SMT:tb=500000:3000
a=bw:recv pt=99 SMT:tb=200000:2000
a=bw:send pt=100 SMT:tb=300000:1400
a=sendrecv
]]></artwork>
</figure>
<t>This session is bi-directional, as shown with the
‘a=sendrecv’ attribute. The bandwidth offered with
‘b=AS’ therefore applies to the receive direction. The
‘b=AS’ value is set based on the combination of streams
that gives the highest bandwidth, i.e. the payload type pair (96,
97), which adds up to 700 + 300 = 1000 kbps. The pair (98, 99) adds
up to 500 + 200 = 700 kbps and therefore also fits.</t>
<t>This means that the bandwidths offered with ‘a=bw’
are aligned with the bandwidth offered with ‘b=AS’.</t>
<t>If, on the other hand, the intention were to use another
combination of payload types, for example (96, 98), then this would
add up to 700 + 500 = 1200 kbps, which would mean that the stream
bandwidths would not be aligned with the ‘b=AS’
bandwidth.</t>
<t>This shows that the bandwidths for the ‘sendrecv’ and
‘recv’ directions are added together when determining
the receive bandwidth for the combined streams.</t>
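<t>The alignment check described above can be sketched as follows.
This is an illustration only: the function name is hypothetical, and
the sketch compares the token bucket rates numerically against
‘b=AS’ (taken in kbps), ignoring any difference in which
protocol overhead the two values account for.</t>
<figure>
<artwork><![CDATA[
```python
B_AS_KBPS = 1000

# Receive-direction SMT rates (bps) from the offer above, i.e. lines
# with direction "recv" or "sendrecv". PT 100 is send-only and is
# therefore not listed.
RECV_SMT_BPS = {96: 700_000, 97: 300_000, 98: 500_000, 99: 200_000}

def check_alignment(pts, b_as_kbps=B_AS_KBPS):
    """Return (sum of receive rates in bps, whether it fits b=AS)."""
    total = sum(RECV_SMT_BPS[pt] for pt in pts)
    return total, total <= b_as_kbps * 1000

for combo in [(96, 97), (98, 99), (96, 98)]:
    print(combo, check_alignment(combo))
```
]]></artwork>
</figure>
<t>The intended pairs (96, 97) and (98, 99) fit within the
‘b=AS’ value, while the unintended combination (96, 98)
does not.</t>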
<t>If the offer is “complex”, for example offering
multiple streams for both speech and video, possibly with many
different codecs (and therefore uses ‘a=bw’ together
with the ‘b=AS’ bandwidth modifier), and the answerer
wants to change this into a “simple” session (e.g. plain
VoIP with only one RTP payload type for codec X), then the answerer
may remove the ‘a=bw’ lines when creating the answer.
The SDP answer may therefore include only the ‘b=AS’
bandwidth modifier. This means that the offerer cannot rely on the
existence of ‘a=bw’ in the answer. However, if the offer
does not include any ‘b=AS’ line, then it is recommended
to maintain the ‘a=bw’ lines in the answer as well, even
for “simple” sessions.</t>
</section>
<section anchor="InteroperabilityWithExistingDirectionalAttribute"
title="Interoperability with Existing Directional Attribute">
<t>Since the ‘a=bw’ attribute includes a parameter for
directionality it is important to clarify the relationship to the
already existing directional attributes in SDP
(‘sendrecv’, ‘sendonly’,
‘recvonly’ and ‘inactive’). In general, one
can say that:<list style="symbols">
<t>The SDP directionality attribute indicates the directionality
for the session.</t>
<t>The ‘a=bw’ attribute defines the directionality
for the bandwidth for streams within the session.</t>
<t>The SDP attribute for directionality has precedence over the
‘a=bw’ parameter for directionality when it comes to
the media that is actually being transmitted.</t>
</list></t>
<t>At session setup time, it is therefore acceptable to define
streams with other directionality than what is shown with the SDP
attribute for directionality. However, when media is transmitted,
then the SDP attribute for directionality has to be followed. An
example of this is shown below.</t>
<figure title="SDP offer specifying bandwidth in 'inactive' direction">
<artwork><![CDATA[m=video 5000 RTP/AVP 96 97
b=AS:1000
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42c00d
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42c00c
a=bw:sendrecv pt=96 SMT:tb=700000:4000
a=bw:recv pt=97 SMT:tb=200000:3000
a=bw:send pt=97 SMT:tb=300000:1400
a=recvonly
]]></artwork>
</figure>
<t>This means that three bandwidths are defined at session
setup:<list style="symbols">
<t>one stream (PT=96) for 700 kbps bi-directional video;</t>
<t>one stream (PT=97) for 200 kbps receive-only video; and</t>
<t>one stream (PT=97) for 300 kbps send-only video.</t>
</list></t>
<t>However, since ‘a=recvonly’ is specified, the
end-point is, at session setup time, only willing to receive media,
even though the SDP also contains bandwidth declarations for the
sending direction. This allows for setting up streams that are
effectively inactive in one or both directions at the beginning of
the session and then enabling them later in the session.</t>
<t>This can be compared with the case where one defines one or more
codecs even though the session starts up as
‘inactive’.</t>
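<t>The precedence rule can be viewed as an intersection of the
directions declared in the ‘a=bw’ line with the
directions allowed by the SDP directionality attribute. The
following sketch illustrates this reading; the function and table
names are hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
# Directions in which media may flow for each SDP direction attribute.
SESSION_DIRS = {
    "sendrecv": {"send", "recv"},
    "sendonly": {"send"},
    "recvonly": {"recv"},
    "inactive": set(),
}

# Directions covered by the directionality parameter of a=bw.
BW_DIRS = {
    "sendrecv": {"send", "recv"},
    "send": {"send"},
    "recv": {"recv"},
}

def active_directions(bw_direction, session_attribute):
    """Directions in which an a=bw limit currently governs media."""
    return BW_DIRS[bw_direction] & SESSION_DIRS[session_attribute]

# The three a=bw lines of the offer above, under a=recvonly.
for pt, direction in [(96, "sendrecv"), (97, "recv"), (97, "send")]:
    print(pt, direction, sorted(active_directions(direction, "recvonly")))
```
]]></artwork>
</figure>
<t>Under ‘a=recvonly’, only the receive part of the
‘sendrecv’ declaration and the ‘recv’
declaration are in effect; the ‘send’ declaration
remains dormant until the session direction is changed.</t>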
</section>
</section>
</section>
<section anchor="FutureExtensions"
title="Rules and Recommendations for Extensions">
<t>The a=bw attribute is defined to be extensible and this section
discusses the extension points that are available.</t>
<t/>
<section title="Directionality">
<t>The current specification defines send, recv and sendrecv. If
some new directionality behavior is needed that does not match the
existing ones, a new one could be defined. This should be avoided
unless a clear need for a new directionality is found.</t>
</section>
<section title="Scope">
<t>It is expected that there will be a need to extend the bandwidth
scope. This document only defines two scope types, session and payload
type, and there are very likely other desirable scopes that will be
defined in the future. Possible examples of scopes are those applying
to a specific SSRC, a particular end-point, or a class of
end-points.</t>
</section>
<section title="Semantics">
<t>This is the extension point that is expected to be used most
frequently in the future. A major proliferation of semantics is not
good for interoperability, but it is likely that bandwidth
shortcomings or missing functionality will be discovered in the
future. Defining a new semantic gives maximum flexibility to define
the meaning of the provided value(s), the format of the values, and
how to interpret the directionality and scope values.</t>
</section>
<section title="Values">
<t>This document only defines token buckets as values. In case fewer
or more parameters are needed to express a particular semantics, new
value formats can be defined. Defining new value formats should be
done with some consideration of generality and reuse so that future
semantics can also use the new value format, with the target to try to
minimize the number of different formats.</t>
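<t>For reference, the token bucket value format used throughout the
examples (‘tb=’ followed by rate and size) can be
checked against a packet sequence as sketched below. This is a
generic token bucket, not a normative algorithm: the class name is
hypothetical, and the sketch assumes the rate is in bits per second
and the bucket size in bytes, as in the examples, without deciding
which protocol headers count towards the packet size.</t>
<figure>
<artwork><![CDATA[
```python
class TokenBucket:
    """Generic token bucket conformance check (illustrative only)."""

    def __init__(self, rate_bps, size_bytes):
        self.rate_bytes_per_s = rate_bps / 8   # refill rate in bytes/s
        self.size = size_bytes                 # bucket depth in bytes
        self.tokens = float(size_bytes)        # bucket starts full
        self.last = 0.0                        # time of last update (s)

    def conforms(self, now, packet_bytes):
        """Return True and consume tokens if the packet fits."""
        elapsed = now - self.last
        self.tokens = min(self.size,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False

# tb=1200000:16384 as in the declarative example.
tb = TokenBucket(1_200_000, 16384)
print(tb.conforms(0.0, 16384))   # a burst of the full bucket size
print(tb.conforms(0.0, 1))       # bucket is now empty
print(tb.conforms(0.1, 14000))   # 0.1 s refills about 15000 bytes
```
]]></artwork>
</figure>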
</section>
</section>
<section anchor="OpenQuestions" title="Open Issues">
<t>This document contains a few open issues:<list style="numbers">
<t>Multicast behavior needs to be specified.</t>
<t>It is an open question to decide if and how to handle the RTCP
bandwidth negotiation, e.g. corresponding to b=RS and b=RR.</t>
<t>It is an open question to develop semantics for the transport
independent bandwidth negotiation, e.g. corresponding to b=TIAS.</t>
<t>It is an open question what rules and recommendations there
should be for extensions to this memo.</t>
</list></t>
<t/>
</section>
<section anchor="IANAConsiderations" title="IANA Considerations">
<t>Following the guidelines in <xref target="RFC4566">RFC 4566</xref>
and in <xref target="RFC3550">RFC 3550</xref>, the IANA is requested to
register:<list style="numbers">
<t>The bw attribute as defined in Section 4.1.</t>
<t>The bw attribute directionality registry rules.</t>
<t>The bw attribute scope registry rules.</t>
<t>The bw attribute semantics registry rules.</t>
<t>The bw attribute values registry rules.</t>
</list></t>
<t>This section will be filled out in future versions of this
document.</t>
</section>
<section anchor="SecurityConsiderations" title="Security Considerations">
<t>Excessive bandwidth allocation can consume all the resources, much
more than what the end-point(s) intend to use. So, if a session
allocates an unnecessarily high bandwidth then this will likely mean
that some other users cannot be admitted, or that they cannot get QoS
guaranteed resources that they requested and have to use best effort. It
can also happen that the session itself is rejected, if the end-points
try to allocate resources that are not available. Allocating too little
bandwidth is likely to negatively impact the perceived media quality or
entirely prevent reception of requested media.</t>
<t>The above shows that the bandwidth attribute is a potential attack
vector, both for malicious end-points and for third-party attackers that
attempt to modify the attribute in order to make the system allocate
unnecessary resources, deny end-points service, reduce quality for
end-points, or incur cost for users.</t>
<t>To prevent third-party attacks, the signalling should be source
authenticated and integrity protected so that no on-path or off-path
attacker can inject or modify the SDP. Malicious end-points cannot be
handled as easily with cryptographic means; instead, behavior analysis
and mechanisms that prevent a malicious end-point from seriously
impacting other end-points are needed.</t>
</section>
<section anchor="Acknowledgements" title="Acknowledgements">
<t/>
</section>
</middle>
<back>
<references title="Normative References">
<?rfc include="reference.RFC.2119"?>
<?rfc include='reference.RFC.2212'?>
<?rfc include='reference.RFC.3264'?>
<?rfc include="reference.RFC.3550"?>
<?rfc include='reference.RFC.3556'?>
<?rfc include='reference.RFC.3890'?>
<?rfc include='reference.RFC.4566'?>
<?rfc include="reference.RFC.5234"?>
</references>
<references title="Informative References">
<?rfc include="reference.RFC.4588"?>
<?rfc include="reference.RFC.6184"?>
<?rfc include='reference.I-D.westerlund-avtcore-max-ssrc'?>
<?rfc include='reference.I-D.westerlund-avtcore-multiplex-architecture'?>
<reference anchor="G.711">
<front>
<title>ITU-T Recommendation G.711, "Pulse Code Modulation (PCM) of
Voice Frequencies".</title>
<author>
<organization/>
</author>
<date month="November" year="1988"/>
</front>
</reference>
<reference anchor="G.726">
<front>
<title>ITU-T Recommendation G.726, "40, 32, 24, 16 kbit/s Adaptive
Differential Pulse Code Modulation (ADPCM)".</title>
<author>
<organization/>
</author>
<date month="December" year="1990"/>
</front>
</reference>
<reference anchor="G.729">
<front>
<title>ITU-T Recommendation G.729, "Coding of speech at 8 kbit/s
using conjugate-structure algebraic-code-excited linear prediction
(CS-ACELP)".</title>
<author>
<organization/>
</author>
<date month="March" year="1996"/>
</front>
</reference>
<reference anchor="AMR">
<front>
<title>3GPP TS 26.090, "Adaptive Multi-Rate (AMR) speech codec;
Transcoding functions".</title>
<author>
<organization/>
</author>
<date day="23" month="June" year="1999"/>
</front>
</reference>
<reference anchor="AMR-WB">
<front>
<title>3GPP TS 26.190, "Adaptive Multi-Rate - Wideband (AMR-WB)
speech codec; Transcoding functions".</title>
<author>
<organization/>
</author>
<date day="10" month="April" year="2001"/>
</front>
</reference>
<reference anchor="H.264">
<front>
<title>ITU-T Recommendation H.264, "Advanced video coding for
generic audiovisual services".</title>
<author>
<organization/>
</author>
<date month="May" year="2003"/>
</front>
</reference>
</references>
</back>
</rfc>