<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc3016 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3016.xml">
<!ENTITY rfc3264 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3264.xml">
<!ENTITY rfc3984 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3984.xml">
<!ENTITY rfc4566 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4566.xml">
<!ENTITY rfc4587 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4587.xml">
<!ENTITY rfc4629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4629.xml">
]>
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-ietf-mmusic-image-attributes-02"
ipr="trust200811">
<front>
<title abbrev="Image Attributes in SDP">Negotiation of Generic Image
Attributes in SDP</title>
<author fullname="Ingemar Johansson" initials="I." surname="Johansson">
<organization>Ericsson AB</organization>
<address>
<postal>
<street>Laboratoriegrand 11</street>
<city>SE-971 28 Lulea</city>
<country>SWEDEN</country>
</postal>
<phone>+46 73 0783289</phone>
<email>ingemar.s.johansson@ericsson.com</email>
</address>
</author>
<author fullname="Kyunghun Jung" initials="K." surname="Jung">
<organization>Samsung Electronics Co., Ltd.</organization>
<address>
<postal>
<street>Dong Suwon P.O. Box 105</street>
<street>416, Maetan-3Dong, Yeongtong-gu</street>
<city>Suwon-city, Gyeonggi-do</city>
<country>Korea 442-600</country>
</postal>
<phone>+82 10 9909 4743</phone>
<email>kyunghun.jung@samsung.com</email>
</address>
</author>
<date day="16" month="Apr" year="2009" />
<abstract>
<t>This document proposes a new generic session setup attribute that
makes it possible to negotiate image attributes such as image size.
One use case is to let, for example, a low-end hand-held terminal
display video without having to rescale the image, an operation that may
consume large amounts of memory and processing power. The attribute also
helps to maintain an optimal bitrate for video, as only the image size
desired by the receiver is transmitted.</t>
</abstract>
<note title="Requirements Language">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in <xref
target="RFC2119">RFC 2119</xref>.</t>
</note>
</front>
<middle>
<section anchor="sec-intro" title="Introduction">
<t>This document proposes a new attribute to make it possible to
negotiate different image attributes such as image size. The term image
size is defined here as it may differ from the physical screen size of,
e.g., a hand-held terminal. For instance, it may be beneficial to display a
video image on a part of the physical screen and leave space on the
screen for, e.g., menus and other information.</t>
<t>Being able to negotiate the image size has a number of
benefits:<list style="symbols">
<t>Less image distortion: Rescaling of images introduces additional
distortion, something that can be avoided (at least on the receiver
side) if the image size can be negotiated.</t>
<t>Reduced complexity: Image rescaling can be quite computationally
intensive. For low-end devices this can be a problem.</t>
<t>Optimal quality for the given bitrate: The sender does not need
to encode an entire CIF (352x288) image if only an image size of
288x256 is displayed on the receiver screen. Alternatively, this
gives a saving in bitrate.</t>
<t>Memory requirement: The receiver device will know the size of the
image and can then allocate memory accordingly.</t>
<t>Optimal aspect ratio: If rescaling of the image is possible on
the receiver side, one can imagine that the offer contains three
resolutions, 100x200, 200x100 and 100x100, with sar=1.0 (1:1). If the
receiver screen has the resolution 200x200 with sar=1.0, then the
obvious choice is to select 100x100 and scale the image by a factor of 2.
If, on the other hand, the screen has the resolution 100x200 with sar=2.0
(2:1), then the obvious choice is again to select 100x100 and scale the
image by a factor of 2 along the x-axis.</t>
</list></t>
<t>The cautious reader may, however, object that the rescaling issue has
been moved to the sender and also that codecs such as H.264 are not
mandated to support the rescaling of the video image size. This
potentially reduces the number of valid arguments to only one (optimal use
of bandwidth).</t>
<t>However, what must then be considered is that:<list style="symbols">
<t>Rescaling on the sender/encoder side is likely to be easier to do
as the camera related software/hardware already contains the
necessary functionality for zooming/cropping/trimming/sharpening the
video signal. Moreover, rescaling is generally done in RGB or YUV
domain and should not depend on the codecs used.</t>
<t>The encoder may be able to encode in a number of formats but may
not know which format to choose because, without the image attribute,
it does not know the receiver's performance or preference.</t>
<t>The quality drop due to digital domain rescaling using
interpolation is likely to be lower if it is done before the video
encoding rather than after the decoding, especially when low bitrate
video coding is used.</t>
<t>If only low-complexity rescaling operations such as cropping need
to be performed, the benefit of having this functionality
on the sender side is that it is then possible to present a
miniature "what you send" image on the display to help the user
aim the camera correctly.</t>
</list></t>
<t>Several of the existing standards ([H.263], [H.264] and [MPEG-4])
have support for different resolutions at different framerates. The
purpose of this document is to provide a generic mechanism. It is
targeted mainly at the negotiation of the image size, but to make it more
general the attribute is named "imageattr". A problem statement and
discussion that motivate this document can be found in
<xref target="S4-080144"></xref>.</t>
<t>The draft is limited to unicast scenarios in general and, more
specifically, to peer-to-peer situations. The attribute may be used in
centralized conferencing scenarios as well, but due to the abundance of
configuration options it may then be difficult to come up with a
configuration that fits all parties.</t>
</section>
<section title="Conventions, Definitions and Acronyms">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.</t>
</section>
<section anchor="sec-definition" title="Definition of Attribute">
<t>A new image attribute is defined with the name "imageattr". The new
SDP attribute contains a set of image attribute options that the offerer
can provide. The receiver can then select the desired image attribute
(e.g., image size in pixels) and may then be able to avoid costly
transformations (e.g., rescaling) of the images. In this approach only the
image resolution and, optionally, the sample aspect ratio, the allowed
range of picture aspect ratios and the preference are covered, but the
framework makes it possible to add other image-related attributes that
make sense.</t>
<section title="Requirements">
<t>The new image attribute should meet the following
requirements:<list style="hanging">
<t hangText="REQ-1:">Support the offer of a specific image size on
the receiver display or in other words, reduce or avoid the need
for rescaling images in the receiver to fit a given portion of the
screen.</t>
<t hangText="REQ-2:">Support asymmetric setups, i.e., the very likely
scenario where Alice prefers an image size of 320x240 for her
display while Bob prefers an image size of 176x144.</t>
<t hangText="REQ-3:">Interoperate with codec specific parameters
such as sprop-parameter-sets in H.264 or config in MPEG4.</t>
<t hangText="REQ-4:">Make the attribute generic with as few
codec specific details/tricks as possible. Ideally the attribute
should not care about the codec specific features.</t>
<t hangText="OPT-1:">Make it possible to use the attribute for
purposes other than video. One possible use case may be distributed
white-board presentations which are based on transmission of
compressed bitmap images, where rescaling often produces very poor
results.</t>
</list></t>
</section>
<section title="Attribute syntax">
<t>In this section the syntax of the image attribute is described. The
section is split up in two parts, the first gives an overall view of
the syntax while the second describes how the syntax is used.</t>
<section anchor="sec-syntax-overall" title="Overall view of syntax">
<t>The syntax for the image attribute is in ABNF:</t>
<figure>
<artwork> ----
image-attr = "imageattr:" PT 1*2( 1*WSP ( "send" / "recv" )
1*WSP attr_list )
PT = 1*DIGIT / "*"
attr_list = ( set *(1*WSP set) ) / "*"
see below for a definition of set.
----</artwork>
</figure>
<t><list style="symbols">
<t>Maximum one occurrence of the "send" keyword and
corresponding attr_list is allowed per image attribute.</t>
<t>Maximum one occurrence of the "recv" keyword and
corresponding attr_list is allowed per image attribute.</t>
<t>PT is the payload type number. It can be set to * to indicate
that the attribute applies to all payload types in the media
description.</t>
<t>For sendonly or recvonly streams one of the directions MAY be
omitted; see <xref target="sec-sendonly-recvonly"></xref>.
Moreover, the order of the send and recv directions is not
important.</t>
</list></t>
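<t>As an illustration only, the overall syntax above could be split
into its payload type and per-direction attribute lists roughly as
follows. This is a minimal Python sketch and not a normative parser: the
function name split_imageattr and the returned dictionary layout are
assumptions made for this example, and the sketch relies on the sets
containing no white space, as in the examples of this document.</t>
<figure>
<artwork><![CDATA[
```python
import re

def split_imageattr(value):
    """Split 'imageattr:PT send ... recv ...' into PT and direction lists.

    Hypothetical helper: returns (pt, directions) where directions maps
    "send" and/or "recv" to a list of raw set strings, or to ["*"].
    """
    prefix = "imageattr:"
    if not value.startswith(prefix):
        raise ValueError("not an imageattr attribute")
    tokens = value[len(prefix):].split()
    if not tokens or not re.fullmatch(r"\d+|\*", tokens[0]):
        raise ValueError("bad payload type")
    pt, directions, current = tokens[0], {}, None
    for tok in tokens[1:]:
        if tok in ("send", "recv"):
            # At most one occurrence of each direction keyword is allowed.
            if tok in directions:
                raise ValueError("direction given twice: " + tok)
            current = directions.setdefault(tok, [])
        elif current is None:
            raise ValueError("attr_list without direction")
        else:
            current.append(tok)
    if not directions:
        raise ValueError("at least one direction is required")
    for sets in directions.values():
        # An attr_list is either one or more sets, or a single "*".
        if not sets or ("*" in sets and len(sets) > 1):
            raise ValueError("malformed attr_list")
    return pt, directions
```
]]></artwork>
</figure>
<t>The sketch accepts the directions in either order and rejects a
repeated "send" or "recv" keyword, mirroring the rules listed above.</t>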
<t>The syntax for the set is given by:<figure>
<artwork> ----
set= "[" "x=" range "," "y=" range [",sar="range]
[",par=" range] [",q=" value] "]"
x is the horizontal image size range
y is the vertical image size range
sar (sample aspect ratio) is the sample aspect ratio associated
with the set (optional and MAY be ignored)
par (picture aspect ratio) is the allowed ratios between the
displays x and y physical size (optional)
q (optional with range [0.0..1.0], default value 0.5)
is the preference for the given set, a higher value means higher
preference from the sender point of view
range is expressed in a few different formats
1) range= value
a single value
2) range= "[" value1 ":" [ step ":" ] value2 "]"
values between value1 and value2 inclusive,
if step is omitted a stepsize of 1 is implied
3) range= "[" value 1*( "," value ) "]"
any value from the list of values
4) range= "[" value1 "-" value2 "]"
any real value between value1 and value2 inclusive
value is a positive integer or real value
step is a positive integer or real value
If step is left out in the syntax a stepsize of 1 is implied
Real values are only applicable for the
sar, par and q parameters
Note the use of brackets [..] if more than one value
is specified.
----
</artwork>
</figure></t>
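<t>The four range formats can be illustrated with a small membership
test. The following Python sketch is an assumption-laden example rather
than part of the attribute definition: the function name range_contains
and the tolerance parameter eps are invented here, and no attempt is made
to restrict real values to the sar, par and q parameters.</t>
<figure>
<artwork><![CDATA[
```python
def range_contains(expr, v, eps=1e-9):
    """Return True if value v is allowed by the range expression expr."""
    if not expr.startswith("["):
        return abs(float(expr) - v) <= eps              # 1) single value
    if not expr.endswith("]"):
        raise ValueError("unbalanced brackets in " + expr)
    body = expr[1:-1]
    if ":" in body:                                      # 2) stepped range
        parts = [float(p) for p in body.split(":")]
        if len(parts) == 2:
            parts.insert(1, 1.0)                         # implied step of 1
        lo, step, hi = parts
        if step <= 0:
            raise ValueError("step must be positive")
        if not (lo - eps <= v <= hi + eps):
            return False
        k = round((v - lo) / step)                       # nearest step index
        return abs(lo + k * step - v) <= eps
    if "," in body:                                      # 3) list of values
        return any(abs(float(p) - v) <= eps for p in body.split(","))
    lo, hi = (float(p) for p in body.split("-"))         # 4) real interval
    return lo - eps <= v <= hi + eps
```
]]></artwork>
</figure>
<t>For instance, under these assumptions 720 is accepted by
"[480:16:800]" while 724 is not, because 724 does not fall on a 16-pixel
step counted from 480.</t>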
<t>Some further guidelines for the use of the attribute are given
below:<list style="symbols">
<t>The image attribute is bound to a specific media by means of
the payload type number. A wild card (*) can be specified for
the payload type number to indicate that it applies to all
payload types in the media description. Several image attributes
can be defined, e.g., for different video codec alternatives,
provided that the payload type numbers differ.</t>
<t>The preference for each set is 0.5 by default. Setting the
optional q parameter to another value makes it possible to set
different preferences for the sets. A higher value gives a
higher preference for the given set.</t>
<t>The sar parameter specifies the sample aspect ratio
associated with the given range of x and y values. The sar
parameter is defined as dx/dy, where dx and dy are the horizontal and
vertical sizes of a pixel. Square pixels give sar=1.0. The parameter
sar MAY be expressed as a range. <vspace />If this parameter is not
present, a default sar value of 1.0 is assumed. <vspace />The
interpretation of sar differs between the send and the receive
directions. <list style="symbols">
<t>In the send direction it defines a specific sample aspect
ratio associated with a given x and y image size (range).</t>
<t>In the recv direction sar expresses that the receiver of
the given media prefers to receive a given x and y
resolution with a given sample aspect ratio.</t>
</list>See <xref target="sec-sar-considerations"></xref> for a
more detailed discussion. <vspace />The sar parameter will
likely not solve all the issues that are related to different
sample aspect ratios, but it can help to solve them and reduce
aspect ratio distortion.</t>
<t>The par (width/height = x/y ratio) parameter indicates a
range of allowed ratios between x and y physical size (picture
aspect ratio). This is used to limit the number of x and y image
size combinations, par is given as <figure>
<artwork> ----
par=[ratio_min-ratio_max]
----</artwork>
</figure>where ratio_min and ratio_max are the minimum and maximum
allowed picture aspect ratios.<vspace />If sar and the display
sample aspect ratio are the same (or close), the relation between
the x and y pixel resolution and the physical size of the image
is straightforward. If, however, sar differs from the sample
aspect ratio of the receiver display, this must be taken into
consideration when the x and y pixel resolution alternatives are
sorted out.</t>
<t>The offerer MUST be able to support the image attributes that
it offers.</t>
<t>The answerer MAY choose to keep imageattr but is not required
to do so. If the attribute is kept in the SDP answer:<list
style="symbols">
<t>The answerer MUST for its receive direction only include
one or more valid entries taken from the offer. In other
words, the answerer MUST for its receive direction only pick
one or more valid entries from the multidimensional solution
space spanned by the offer.</t>
<t>The answerer MAY for its send direction modify the
attribute in the sense that new entries other than those
presented in the offer are added. Note, however,
that this may lead to an extra offer/answer exchange if the
added parameters are not supported by the offerer.</t>
</list></t>
</list></t>
</section>
<section anchor="sec-syntax-descr" title="Syntax description">
<t>In the description of the syntax we assume that Alice wishes
to set up a session with Bob and that Alice takes the
initiative. The syntactical white-space delimiters (1*WSP) and
double-quotes are removed to make reading easier.</t>
<t>In the offer Alice provides information for both the send
and receive (recv) directions using syntax version 1. For the send
direction Alice provides a list that the answerer can select
from. For the receive direction Alice may either specify a desired
image size range right away or give a * to instruct Bob to fill in a
list of image sizes that Bob is able to send. Using the overall
high-level syntax, the image attribute may then look like <figure>
<artwork> ----
a=imageattr:PT send attr_list recv attr_list
----</artwork>
</figure>or<figure>
<artwork> ----
a=imageattr:PT send attr_list recv *
----</artwork>
</figure>In the first alternative the recv direction may be a full
list of desired image size formats. Most likely, however, it
is just a list with one alternative for the preferred x and y
resolution.</t>
<t>If Bob supports an x and y resolution in the given x and y range
the answer from Bob will look like: <figure>
<artwork> ----
a=imageattr:PT send attr_list recv attr_list
----</artwork>
</figure>and the offer/answer negotiation is done. Note
that the attr_list will likely be pruned in the answer.
While it may contain many different alternatives in the offer, it may
in the end contain just one or two alternatives.</t>
<t>If Bob does not support any x and y resolution in the given x and
y range in attr_list or a * was given for the recv direction then he
MUST either: <list style="symbols">
<t>Provide another list of options (attr_list). The answer
from Bob may then look like:<figure>
<artwork> ----
a=imageattr:PT recv attr_list send attr_list
----</artwork>
</figure>In this case the offer/answer negotiation is not
quite done. To complete the offer/answer Alice sends another
offer that looks like:<figure>
<artwork> ----
a=imageattr:PT send attr_list recv attr_list
----</artwork>
</figure>Bob MAY send back an answer to complete the 2nd
offer/answer but this is not necessary.</t>
<t>Remove the corresponding part completely, in which case the
answer from Bob would look like:<figure>
<artwork> ----
a=imageattr:PT recv attr_list
----</artwork>
</figure>Again, note that the attr_list for each
direction is likely pruned depending on preferred and supported
options.</t>
</list></t>
<t>If the 1st offer (from Alice) already defines a desired image
size for the recv direction the answerer can do one of the
following:<list style="numbers">
<t>Accept the image size and return it in the answer.</t>
<t>Replace with a list of options in the answer.</t>
<t>Remove the corresponding part completely. This may happen if
it is deemed that it is unlikely that the list of options is
supported. The answer will then lack a description for the send
direction and will look like:<figure>
<artwork> ----
a=imageattr:PT recv attr_list
----</artwork>
</figure></t>
</list></t>
</section>
</section>
<section title="Considerations">
<section title="No imageattr in 1st offer">
<t>A high-end device (Alice) may not see any need for the image
attribute, as it most likely has the processing capacity to rescale
incoming video, and may therefore not include the attribute in the
offer. The answerer
(Bob) MAY include imageattr in the answer. This has two
implications:<list style="symbols">
<t>Longer session setup time due to extra offer/answer
exchanges</t>
<t>There is a risk that Alice does not recognize or support
imageattr and will thus anyway ignore the attribute.</t>
</list></t>
</section>
<section title="Asymmetry">
<t>While the image attribute supports asymmetry there are some
limitations to this. One important limitation is that the codec
being used can only support up to a given maximum resolution for a
given profile level.</t>
<t>As an example, H.264 with profile level 1.2 does not support
higher resolution than 352x288 (CIF). The offer/answer rules
essentially require that the same profile level be used in both
directions. This means that for an asymmetric scenario where Alice
wants an image size of 580x360 and Bob wants 150x120, profile level
2.2 is needed in both directions even though profile level 1 would
have been enough in one direction.</t>
<t>Currently, the only solution to this problem is to specify two
unidirectional media descriptions. Note however that the asymmetry
issue for the H.264 codec is solved in <xref
target="RFC3984bis"></xref>.</t>
</section>
<section anchor="sec-sendonly-recvonly" title="sendonly and recvonly">
<t>If the directional attributes a=sendonly or a=recvonly are given
for a media, there is no need to specify the image
attribute for both directions. Therefore one of the directions in the
attribute MAY be omitted. However, it may be good to do the image
attribute negotiation in both directions in case the session is
updated for media in both directions at a later stage.</t>
</section>
<section anchor="sec-sar-considerations" title="Sample aspect ratio">
<t>The sar parameter in relation to the x and y pixel resolution
deserves some extra discussion. Consider the offer from Alice to Bob
(we set the recv direction aside for the moment): <figure>
<artwork> ----
a=imageattr:97 send [x=720,y=576,sar=1.1]
----</artwork>
</figure>If the receiver display has square pixels, the 720x576
image would need to be rescaled to, for example, 792x576 or 720x524 to
ensure a correct image aspect ratio. In practice this means that
rescaling would need to be performed on the receiver side, something
that is contrary to the spirit of this draft. <vspace />To avoid
this problem Alice MAY specify a range of values for the sar
parameter, for example:<figure>
<artwork> ----
a=imageattr:97 send [x=720,y=576,sar=[0.91,1.0,1.09,1.45]]
----</artwork>
</figure>This means that Alice can encode with any of the listed
sample aspect ratios, leaving it to Bob to decide which one he
prefers.</t>
<t>The response MUST NOT include the sar parameter if there is no
acceptable value given.</t>
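<t>The rescaling arithmetic for the 720x576, sar=1.1 case discussed
above can be reproduced with the following Python sketch. It is only an
illustration: the function name square_pixel_sizes is hypothetical, and
rounding to the nearest integer pixel is an assumption of this
example.</t>
<figure>
<artwork><![CDATA[
```python
def square_pixel_sizes(x, y, sar):
    """Return two (width, height) options for showing an x-by-y image
    with sample aspect ratio sar (dx/dy) on a square-pixel display."""
    widened = (round(x * sar), y)      # keep the height, stretch the width
    shortened = (x, round(y / sar))    # keep the width, shrink the height
    return widened, shortened

# The 720x576 image with sar=1.1 from the offer above.
print(square_pixel_sizes(720, 576, 1.1))
```
]]></artwork>
</figure>
<t>Either option preserves the intended picture aspect ratio; which one
a receiver would pick is outside the scope of this sketch.</t>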
</section>
<section title="SDPCapNeg support">
<t>The image attribute can be used within the SDP Capability
Negotiation <xref target="SDPCapNeg"></xref> framework and its use
is then specified using the "a=acap" parameter. An example is</t>
<figure>
<artwork> ----
a=acap:1 imageattr:97 send [x=720,y=576,sar=[0.91,1.0,1.09,1.45]]
----</artwork>
</figure>
<t>For use with the SDP Media Capability Negotiation extension <xref
target="SDPMedCapNeg"></xref>, where it is no longer possible to
specify payload type numbers, the parameter
substitution rule can be used, for example:</t>
<figure>
<artwork> ----
...
a=mcap:1 video H264/90000
a=acap:1 imageattr:%1% send [x=720,y=576,sar=[0.91,1.0,1.09,1.45]]
...
----</artwork>
</figure>
<t>Where %1% maps to media capability number 1.</t>
</section>
<section title="Interaction with codec parameters">
<t>As most codecs specify some kind of indication of, e.g., the
image size already at session setup, some measures must be taken to
prevent the image attribute from conflicting with this existing
information.</t>
<t>The following subsections describe the most well-known codecs
and how they define image-size related information.</t>
<section title="H.263">
<t>The payload format for H.263 is described in <xref
target="RFC4629"></xref>.</t>
<t>H.263 defines (on the fmtp line) a list of image sizes and
their maximum frame rates (profiles) that the offerer can receive.
The answerer is not allowed to modify this list and must reject a
payload type that contains an unsupported profile. The CUSTOM
profile may be used for image size negotiation but support for
asymmetry requires the specification of two unidirectional media
descriptions using the sendonly/recvonly attributes.</t>
</section>
<section title="H.264">
<t>The payload format for H.264 is described in <xref
target="RFC3984"></xref> and updated in <xref
target="RFC3984bis"></xref>.</t>
<t>H.264 defines image size related information in the fmtp line
by means of sprop-parameter-sets. According to the specification
several sprop-parameter-sets may be defined for one payload type.
The sprop-parameter-sets describe the image size (+ more) that the
offerer sends in the stream and need not be complete. This means
that this does not represent any negotiation. Moreover an answer
is not allowed to change the sprop-parameter-sets.</t>
<t>This configuration may be changed later inband if for instance
image sizes need to be changed or added.</t>
</section>
<section title="MPEG-4">
<t>The payload format for MPEG-4 is described in <xref
target="RFC3016"></xref>.</t>
<t>MPEG-4 defines a config parameter on the fmtp line which is a
hexadecimal representation of the MPEG-4 visual configuration
information. This configuration does not represent any negotiation
and the answer is not allowed to change the parameter.</t>
<t>Currently it is not possible to change the configuration using
inband signaling.</t>
</section>
<section title="Possible solutions">
<t>The subsections above clearly indicate that this kind of
information must be aligned well with the image attribute to avoid
conflicts. There are a number of possible solutions:<list
style="symbols">
<t>Ignore payload format parameters: This may not work well,
e.g., in the presence of bad channel conditions, especially in the
beginning of a session. Moreover, this is not a good option for
MPEG-4.</t>
<t>2nd session-wide offer/answer round: In the 2nd
offer/answer the codec payload format specific parameters are
defined based on the outcome of the imageattr negotiation. The
drawback with this is that setup of the entire session
(including audio) may be delayed considerably, especially as
the imageattr negotiation can already itself cost up to two
offer/answer rounds. Also, the conflict between the imageattr
negotiation and the payload format specific parameters is
still present after the first offer/answer round, and a
buggy implementation may start media before the second
offer/answer is completed, with unwanted results.</t>
<t>2nd session-wide offer/answer round only for video: This is
similar to the alternative above with the exception that setup
time for audio is not increased. Moreover, the port number for
video is set to 0 during the 1st offer/answer round to prevent
media from flowing. <vspace />This has the effect that video
will appear some time after the audio is started (up to 2
seconds delay). This alternative is likely the most clean-cut
and failsafe alternative. The drawback is that, as the port number
in the first offer is always zero, the media startup will
always be delayed even though it would in fact have been
possible to start media already after the first offer/answer
round.</t>
</list></t>
</section>
</section>
<section title="Change of display in middle of session">
<t>A very likely scenario is that a user switches to another phone
during, e.g., a video telephony call or plugs the cellphone into an
external monitor. In both cases it is very likely that a
renegotiation is initiated using, e.g., the SIP REFER or SIP UPDATE
methods. It is RECOMMENDED to negotiate the image size during this
renegotiation.</t>
</section>
<section title="Addition of parameters">
<t>The image attribute allows for the addition of parameters in
the future. To make backwards compatibility possible, an entity that
processes the attribute MUST remove parameters that are not recognized
before returning the attribute in the SDP answer. Addition of future
parameters that are not understood by the receiving endpoint may
lead to ambiguities if mutual dependencies between parameters exist;
therefore addition of parameters must be done with great care.</t>
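<t>The removal of unrecognized parameters could be implemented along
the lines of the following Python sketch. It is an illustration under
stated assumptions: the helper name strip_unknown_params and the constant
KNOWN_PARAMS are invented for this example, and the parameter "foo" used
in the note below is purely hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
# Parameter names defined in this document.
KNOWN_PARAMS = ("x", "y", "sar", "par", "q")

def strip_unknown_params(set_str):
    """Drop parameters with unrecognized names from one set string."""
    if not (set_str.startswith("[") and set_str.endswith("]")):
        raise ValueError("a set must be enclosed in brackets")
    body = set_str[1:-1]
    params, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == "," and depth == 0:
            # Only commas outside nested ranges separate parameters.
            params.append(body[start:i])
            start = i + 1
    params.append(body[start:])
    kept = [p for p in params if p.split("=", 1)[0] in KNOWN_PARAMS]
    return "[" + ",".join(kept) + "]"
```
]]></artwork>
</figure>
<t>With this sketch a set carrying an unknown parameter, say "foo" with
a bracketed list as value, is returned with only its x, y, sar, par and
q parameters left, in their original order.</t>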
</section>
</section>
</section>
<section title="Examples">
<t>A few examples to highlight the syntax follow. Where needed, it is
assumed that Alice initiates a session with Bob.</t>
<section title="Example 1">
<figure>
<artwork> ----
a=imageattr:97 send [x=800,y=640,sar=1.1,q=0.6] [x=480,y=320] \
recv [x=330,y=250]
----</artwork>
</figure>
<t>Two image resolution alternatives are offered with 800x640 with
sar=1.1 having the highest preference</t>
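<t>The preference ordering in the offer above can be sketched in Python
as follows. The representation of a set as a dictionary of strings and
the function name preferred_order are assumptions made for this
illustration only.</t>
<figure>
<artwork><![CDATA[
```python
def preferred_order(sets):
    """Sort sets by descending preference; q defaults to 0.5."""
    return sorted(sets, key=lambda s: float(s.get("q", 0.5)), reverse=True)

# The two send alternatives of the offer above, as dictionaries.
offer_send = [
    {"x": "480", "y": "320"},                             # implicit q=0.5
    {"x": "800", "y": "640", "sar": "1.1", "q": "0.6"},
]
best = preferred_order(offer_send)[0]
print(best["x"], best["y"])
```
]]></artwork>
</figure>
<t>Because the sort is stable, sets with equal q keep the order in which
they were listed.</t>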
<t>The example also indicates that Alice wishes to display video with a
resolution of 330x250 on her display.</t>
<t>In case Bob accepts the "recv [x=330,y=250]", the answer may look
like:</t>
<figure>
<artwork> ----
a=imageattr:97 recv [x=800,y=640,sar=1.1] \
send [x=330,y=250]
----</artwork>
</figure>
<t>This indicates that the receiver (Bob) wishes the encoder (on Alice's
side) to compensate for a sample aspect ratio of 1.1 (11:10) and
desires an image size on its screen of 800x640.</t>
<t>There is however a possibility that "recv [x=330,y=250]" is not
supported. If that is the case, Bob may completely remove this part or
replace it with a list of supported image sizes.</t>
<figure>
<artwork> ----
a=imageattr:97 recv [x=800,y=640,sar=1.1] \
send [x=[320:16:640],y=[240:16:480],par=[1.2-1.3]]
----</artwork>
</figure>
<t>Alice can then select a valid image size which is closest to the
one that was originally desired (336x256) and perform a second
offer/answer:</t>
<figure>
<artwork> ----
a=imageattr:97 send [x=800,y=640,sar=1.1] \
recv [x=336,y=256]
----</artwork>
</figure>
<t>Bob replies with (actually not necessary):</t>
<figure>
<artwork> ----
a=imageattr:97 recv [x=800,y=640,sar=1.1] \
send [x=336,y=256]
----</artwork>
</figure>
</section>
<section title="Example 2">
<figure>
<artwork> ----
a=imageattr:97 \
send [x=[480:16:800],y=[320:16:640],par=[1.2-1.3],q=0.6] \
[x=[176:8:208],y=[144:8:176],par=[1.2-1.3]] \
recv *
----</artwork>
</figure>
<t>Two image resolution sets are offered, with the first having a
higher preference (q=0.6). The x-axis resolution can take the values
480 to 800 in 16-pixel steps and 176 to 208 in 8-pixel steps. The
par parameter limits the set of possible x and y screen resolution
combinations such that 800x640 (ratio=1.25) is a valid combination
while 720x608 (ratio=1.18) or 800x608 (ratio=1.32) are invalid
combinations.</t>
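<t>The effect of par in this example can be checked with a few lines of
Python. The sketch below hard-codes the first set of the offer and
assumes sar=1.0, so that the pixel ratio x/y equals the picture aspect
ratio; the function names are invented for this illustration.</t>
<figure>
<artwork><![CDATA[
```python
def in_stepped_range(v, lo, step, hi):
    """True if v lies on the grid lo, lo+step, ..., up to hi."""
    return lo <= v <= hi and (v - lo) % step == 0

def valid_size(x, y):
    """Check (x, y) against x=[480:16:800], y=[320:16:640],
    par=[1.2-1.3], assuming square pixels (sar=1.0)."""
    return (in_stepped_range(x, 480, 16, 800)
            and in_stepped_range(y, 320, 16, 640)
            and 1.2 <= x / y <= 1.3)

for size in [(800, 640), (720, 608), (800, 608)]:
    print(size, valid_size(*size))
```
]]></artwork>
</figure>
<t>All three sizes lie on the offered x and y grids, so it is the par
limit alone that rules out 720x608 and 800x608.</t>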
<t>For the recv direction (Bob->Alice) Bob is requested to provide
a list of supported image sizes.</t>
</section>
<section title="Example 3">
<t>This example defines a complete SDP offer for the video media
part:</t>
<figure>
<artwork> ----
m=video 49154 RTP/AVP 99
a=rtpmap:99 H264/90000
a=fmtp:99 packetization-mode=0;profile-level-id=42e011; \
sprop-parameter-sets=Z0LgC5ZUCg/I,aM4BrFSAa
a=imageattr:99 \
send [x=176,y=144] [x=224,y=176] [x=272,y=224] [x=320,y=240] \
recv [x=176,y=144] [x=224,y=176] [x=272,y=224,q=0.6] [x=320,y=240]
----</artwork>
</figure>
<t>In the send direction, sprop-parameter-sets is defined for a
resolution of 320x240 which is the largest image size offered in the
send direction. This means that if 320x240 is selected, no additional
offer/answer is necessary. In the receive direction four alternative
image sizes are offered with 272x224 being the preferred choice.</t>
<t>The answer may look like:</t>
<figure>
<artwork> ----
m=video 49154 RTP/AVP 99
a=rtpmap:99 H264/90000
a=fmtp:99 packetization-mode=0;profile-level-id=42e011; \
sprop-parameter-sets=Z0LgC5ZUCg/I,aM4BrFSAa
a=imageattr:99 send [x=320,y=240] recv [x=320,y=240]
----</artwork>
</figure>
<t>Indicating (in this example) that the image size is 320x240 in both
directions. Although the offerer preferred 272x224 for the receive
direction, the answerer might not be able to offer 272x224 or not
allow encoding and decoding of video of different image sizes
simultaneously. The answerer sets new sprop-parameter-sets,
constructed for both send and receive directions at the restricted
conditions and image size of 320x240.</t>
</section>
<section title="Example 4">
<t>This example illustrates in more detail how compensation for
different sample aspect ratios can be negotiated with the image
attribute.</t>
<t>We set up a session between Alice and Bob, with Alice as the offerer
of the session. The offer (from Alice) contains the image attribute
below:<figure>
<artwork> ----
a=imageattr:97 \
send [sar=[1.0-1.3],x=[400:16:800],y=[320:16:640],par=[1.2-1.3]] \
recv [sar=1.1,x=800,y=600]
----</artwork>
</figure></t>
<t>First we consider the recv direction: The offerer (Alice)
explicitly states that she wishes to receive the screen resolution
800x600. However, she also indicates that her display
does not use square pixels; the sar value of 1.1 means that Bob must
(preferably) compensate for this. If Bob's video camera produces
square pixels and Bob wishes to satisfy Alice's sar requirement, the image
processing algorithm must rescale an 880x600 pixel image (880=800*1.1)
to 800x600 pixels (this could be done in other ways).</t>
<t>Now the send direction: Alice indicates that she can (in
the image processing algorithms) rescale the image for sample aspect
ratios in the range 1.0 to 1.3. She can also provide a number of
different image sizes (in pixels) ranging from 400x320 to 800x640. Bob
inspects the offered sar and image sizes and responds with the
modified image attribute <figure>
<artwork> ----
a=imageattr:97 \
recv [sar=1.15,x=464,y=384] \
send [sar=1.1,x=800,y=600]
----</artwork>
</figure></t>
<t>Alice will, in order to satisfy Bob's request, need to rescale the
image from her video camera from 534x384 (534=464*1.15) to
464x384.</t>
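<t>The source widths quoted for both directions of this example follow
from one multiplication each, as the Python sketch below shows. It
assumes that the sender's camera produces square pixels and that only the
width is adjusted; the function name source_width is hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
def source_width(x, sar):
    """Width in square camera pixels that is rescaled down to x samples
    of sample aspect ratio sar (rounded to the nearest pixel)."""
    return round(x * sar)

print(source_width(800, 1.1))    # Bob sending 800x600 to Alice
print(source_width(464, 1.15))   # Alice sending 464x384 to Bob
```
]]></artwork>
</figure>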
<t>Neither party is required to rescale like this (sar MAY be ignored);
the consequence will of course be a distorted image.</t>
</section>
</section>
<section title="IANA Considerations">
<t>Following the guidelines in <xref target="RFC4566"></xref>, the IANA
is requested to register one new SDP attribute:<list style="symbols">
<t>Contact name, email address and telephone number: Authors of
RFC XXXX</t>
<t>Attribute-name: imageattr</t>
<t>Type of attribute: media-level</t>
<t>Subject to charset: no</t>
</list></t>
<t>This attribute defines the ability to negotiate various image
attributes such as image sizes. The attribute contains a number of
parameters which can be modified in an offer/answer exchange.</t>
<t>Note to RFC Editor: please replace "RFC XXXX" above with the RFC
number of this memo, and remove this note.</t>
</section>
<section title="Security Considerations">
<t>This draft does not add any additional security issues other than
those already existing with currently specified offer/answer
procedures.</t>
</section>
<section anchor="Acknowledgements" title="Acknowledgements">
<t>The authors would like to thank the people who have contributed
objections and suggestions to this draft and provided valuable
guidance in the amazing video-coding world. Special thanks go to Clinton
Priddle, Roni Even, Randell Jesup, and Dan Wing.</t>
</section>
<section title="Changes">
<t>The main changes are:<list hangIndent="" style="hanging">
<t hangText="From WG -01 to WG -02"><list style="symbols">
<t>Added extra example that highlights the negotiation of
sar</t>
</list></t>
<t hangText="From WG -00 to WG -01"><list style="symbols">
<t>Added info about future addition of parameters and backwards
compatibility</t>
<t>Added IANA considerations</t>
</list></t>
<t hangText="From individual -02 to WG -00"><list style="symbols">
<t>Cleanup of syntax, ABNF form</t>
<t>Additional example</t>
</list></t>
<t hangText="From -01 to -02"><list style="symbols">
<t>Cleanup of the sar and par parameters to make them match the
established conventions</t>
<t>Requirement specification added</t>
<t>New bidirectional syntax</t>
<t>Interoperability considerations with well known video codecs
discussed</t>
</list></t>
</list></t>
</section>
</middle>
<back>
<references title="Informative References">
&rfc3016;
&rfc3264;
&rfc3984;
&rfc4566;
&rfc4587;
&rfc4629;
<reference anchor="H.264">
<front>
<title>ITU-T Recommendation H.264,
http://www.itu.int/rec/T-REC-H.264-200711-I/en</title>
<author fullname="" initials="" surname="">
<organization>ITU-T</organization>
</author>
<date />
</front>
</reference>
<reference anchor="S4-080144">
<front>
<title>Signaling of Image Size: Combining Flexibility and Low Cost,
http://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_48/Docs/S4-080144.zip</title>
<author fullname="" initials="" surname="">
<organization>3GPP</organization>
</author>
<date />
</front>
</reference>
<reference anchor="SDPCapNeg">
<front>
<title>SDP Capability Negotiation,
http://tools.ietf.org/wg/mmusic/draft-ietf-mmusic-sdp-capability-negotiation</title>
<author fullname="" initials="" surname="">
<organization>IETF</organization>
</author>
<date />
</front>
</reference>
<reference anchor="RFC3984bis">
<front>
<title>RTP Payload Format for H.264 Video,
http://tools.ietf.org/wg/avt/draft-ietf-avt-rtp-rfc3984bis/</title>
<author fullname="" initials="" surname="">
<organization>IETF</organization>
</author>
<date />
</front>
</reference>
<reference anchor="SDPMedCapNeg">
<front>
<title>SDP media capabilities Negotiation,
http://tools.ietf.org/wg/mmusic/draft-ietf-mmusic-sdp-media-capabilities</title>
<author fullname="" initials="" surname="">
<organization>IETF</organization>
</author>
<date />
</front>
</reference>
</references>
<references title="Normative References">
<?rfc include="reference.RFC.2119"?>
</references>
</back>
</rfc>