<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='http://xml.resource.org/authoring/rfc2629.xslt' ?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc3264 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3264.xml">
<!ENTITY rfc3550 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml">
<!ENTITY rfc3551 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3551.xml">
<!ENTITY rfc3711 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3711.xml">
<!ENTITY rfc4288 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4288.xml">
<!ENTITY rfc4566 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4566.xml">
<!ENTITY rfc4585 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4585.xml">
<!ENTITY rfc4855 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4855.xml">
<!ENTITY rfc4867 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4867.xml">
<!ENTITY rfc2326 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2326.xml">
<!ENTITY rfc2974 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2974.xml">
<!ENTITY rfc3828 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3828.xml">
<!ENTITY rfc4340 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4340.xml">
<!ENTITY rfc5117 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5117.xml">
<!ENTITY rfc5583 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5583.xml">
<!ENTITY rfc6051 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6051.xml">
]>

<?rfc strict="yes"?>
<?rfc comments="no"?>
<?rfc inline="yes"?>
<?rfc editing="no"?>
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="yes"?>
<?rfc rfcedstyle="yes"?>
<?rfc rfcprocack="no"?>
<?rfc tocindent="yes"?>

<rfc category="std" docName="draft-ietf-payload-rtp-g718-01.txt" ipr="pre5378Trust200902">
	<front>
		<title abbrev="RTP Payload for G.718 Speech/Audio">RTP Payload Format for G.718 Speech/Audio</title>

		<author role="editor" fullname="Glen Zorn" initials="G" surname="Zorn">
			<organization>Network Zen</organization>
			<address>
				<postal>
					<street>227/358 Thanon Sanphawut</street>
					<city>Bang Na</city>
					<region>Bangkok</region>
					<code>10260</code>
					<country>Thailand</country>
				</postal>
				<phone>+66 (0) 87-040-4617</phone>
				<email>glenzorn@gmail.com</email>
			</address>
		</author>

		<author fullname="Ye-Kui Wang" initials="Y" surname="Wang">
			<organization>Huawei Technologies</organization>
			<address>
				<postal>
					<street>400 Somerset Corp Blvd.</street>
					<street>Suite 402</street>
					<city>Bridgewater</city>
					<region>NJ</region>
					<code>08807</code>
					<country>USA</country>
				</postal>
				<phone>+1 (908) 541-3518</phone>
				<email>yekuiwang@huawei.com</email>
			</address>
		</author>
		
		<author fullname="Ari Lakaniemi" initials="A" surname="Lakaniemi">
			<organization>Nokia</organization>
			<address>
				<postal>
					<street>P.O.Box 407</street>
					<city>FIN-00045 Nokia Group</city>
					<country>Finland</country>
				</postal>
				<phone>+358-71-8008000</phone>
				<email>ari.lakaniemi@nokia.com</email>
			</address>
		</author>

		
		<date year="2011"/>

		<abstract>
			<t>
				This document specifies the Real-Time Transport Protocol (RTP)
				payload format for the Embedded Variable Bit-Rate (EV-VBR)
				speech/audio codec, specified in ITU-T G.718. A media type 
				registration for this RTP payload format is also included.
			</t>
		</abstract>
	</front>

	<middle>
		<section title="Introduction">
			<t>
				The International Telecommunication Union (ITU-T) Recommendation
				G.718 <xref target="ITU.G718.2008"/>
				specifies the Embedded Variable Bit Rate (EV-VBR)
				speech/audio codec. This document specifies the Real-time Transport
				Protocol (RTP)  <xref target="RFC3550"/>
				payload format for this codec.
			</t>
		</section>
		
		<section title="Requirements Language">
			<t>
				The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
				"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
				document are to be interpreted as described in <xref target="RFC2119"></xref>.
			</t>
		</section>

		<section title="Background">
			<section title="The G.718 Codec">
				<t>
					G.718 is an embedded variable rate speech codec having a layered design. 
					The bitstream of the G.718 core codec consists of a core
					layer, denoted as L1, and four enhancement layers, denoted as L2-L5.
					The bit-rates of the G.718 core codec range from 8 kbit/s (core layer
					only) to 32 kbit/s (with all layers up to L5). Furthermore, the G.718
					codec also supports discontinuous transmission (DTX) and comfort
					noise generation (CNG) by sending Silence Descriptor (SID) frames
					during periods of non-active input signal, resulting in a reduced
					bit-rate. The sampling frequency of the core codec is 16 kHz and the
					codec operates on 20 ms frames. The G.718 codec is also capable of
					narrowband operation with audio input and/or output at 8 kHz sampling frequency.
				</t>
				<t>
					While transmitting/receiving the core layer L1 is enough for
					successful decoding of the audio content, each of the enhancement
					layers Ln (n being 2 to 5, inclusive) provides an improvement to
					reconstructed audio quality. Thus, the core layer ensures the basic
					communication while the enhancement layers can be used to improve the
					perceptual quality. Furthermore, enhancement layers are dependent on
					all the lower layers in the sense that successful decoding of layer Ln
					also requires all the layers Lm with m&lt;n to be available.
				</t>
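				<figure align="left" suppress-title="true">
					<preamble>
						The layer dependency described above can be illustrated with a minimal
						sketch. The function name and the representation of received layers as a
						set of integers are assumptions made for this illustration only.
					</preamble>
					<artwork><![CDATA[
```python
def highest_decodable_layer(received_layers):
    """Return the highest n such that L1..Ln are all present (0 if L1 is missing)."""
    n = 0
    while (n + 1) in received_layers:
        n += 1
    return n


# L4 arrived but L3 did not, so only L1 and L2 can be decoded.
print(highest_decodable_layer({1, 2, 4}))
```
]]></artwork>
					<postamble>
						A receiver that lacks L1 cannot decode anything from that frame,
						which the sketch reports as 0.
					</postamble>
				</figure>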
				<t>
					The sizes, sampling rates and possible outputs of the G.718 core
					codec layers L1-L5 are summarized in Table 1
					below, where the "Bytes"
					column indicates the number of bytes per encoded data unit for a
					layer. NB and WB denote narrowband and wideband, respectively. The
					"Bytes" column in other tables has the same meaning. Note that for
					layers L1 and L2, the corresponding output may either be NB or WB,
					depending on the rendering device and the application requirement,
					regardless of the sampling rate of the encoded data.
					
<figure align="center" anchor="table_1" suppress-title="true"><artwork><![CDATA[
                     Table 1: G.718 Layers

  Layer   Bytes   Cumulative bit-rate   Sampling rate   Output
----------------------------------------------------------------
   L1       20        8.0 kbit/s         8/16 kHz       NB/WB
   L2       10       12.0 kbit/s         8/16 kHz       NB/WB
   L3       10       16.0 kbit/s           16 kHz         WB
   L4       20       24.0 kbit/s           16 kHz         WB
   L5       20       32.0 kbit/s           16 kHz         WB
]]></artwork></figure>

					<vspace blankLines="1"/>
					The G.718 codec also includes an operating mode that is compatible
					with the Adaptive Multi-Rate Wideband (AMR-WB) codec 
					<xref target="AMR-WB"/>, for
					which the RTP payload format is specified in <xref target="RFC4867"/>.
					In this AMR-WB interoperable mode, layers L1 and L2 are replaced by L1',
					consisting of AMR-WB encoded data, and L3'
					is used instead of L3. The usage of layers L4 and L5 is not affected
					by transmitting AMR-WB data in the lower layers. If layer L3' is
					present in the encoded bit-stream, the base layer L1' must use
					AMR-WB mode 2 with a bit-rate of 12.65 kbit/s. Otherwise (the
					encoded bit-stream contains only the L1' layer), any of the 9 AMR-WB
					coding modes 0, 1, 2, 3, 4, 5, 6, 7, and 8, corresponding to the
					bit-rates of 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, and
					23.85 kbit/s, respectively, may be in use. Table 2 summarizes the
					AMR-WB interoperable mode when more than one layer may be present.
				</t>

<figure align="center" anchor="table_2" suppress-title="true"><artwork><![CDATA[
     Table 2: G.718 layers in the AMR-WB interoperable mode

  Layer   Bytes   Cumulative bit-rate   Sampling rate   Output
----------------------------------------------------------------
   L1'      32       12.8 kbit/s           16 kHz          WB
   L3'       9       16.4 kbit/s           16 kHz          WB
   L4       20       24.4 kbit/s           16 kHz          WB
   L5       20       32.4 kbit/s           16 kHz          WB
]]></artwork></figure>	

				<t>
					Note that the bit-rate for the raw bit-stream of AMR-WB mode 2 is
					12.65 kbit/s. However, after counting the padding bits that make each
					encoded data unit byte-aligned, as in the octet-aligned mode
					specified in <xref target="RFC4867"/>,
					the resulting bit-rate is 12.8 kbit/s.
				</t>
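				<figure align="left" suppress-title="true">
					<preamble>
						The arithmetic behind the 12.8 kbit/s figure and the cumulative bit-rates
						of Table 2 can be checked with a short sketch. It assumes only the 20 ms
						frame duration and the byte counts listed in Table 2; the helper names are
						illustrative.
					</preamble>
					<artwork><![CDATA[
```python
import math

FRAME_MS = 20  # frame duration stated for the G.718 codec


def octet_aligned_size(raw_kbit_s):
    """Bytes per frame after padding the raw frame to a whole number of octets."""
    bits = raw_kbit_s * FRAME_MS          # kbit/s * ms = bits per frame
    return math.ceil(round(bits, 6) / 8)  # round() guards against float noise


def bit_rate_kbit_s(total_bytes):
    """Bit-rate in kbit/s for a given number of bytes per 20 ms frame."""
    return total_bytes * 8 / FRAME_MS


l1_bytes = octet_aligned_size(12.65)  # 253 raw bits are padded to 32 bytes
layer_bytes = [("L1'", l1_bytes), ("L3'", 9), ("L4", 20), ("L5", 20)]

total = 0
for name, size in layer_bytes:
    total += size
    print(name, size, bit_rate_kbit_s(total))
```
]]></artwork>
				</figure>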
				<t>
					In the AMR-WB interoperable mode, when the base layer L1' is
					transported in its own RTP packet stream, the packetisation specified in <xref target="RFC4867"/>
					MUST be used, to enable legacy RFC 4867 receivers to
					receive the base layer L1'.
					<vspace blankLines="1"/>
					ITU-T SG16 is currently working on a set of extension layers in order
					to provide so-called super-wideband (SWB) audio and stereophonic
					encoding extensions on top of the G.718 core codec. Further details
					and the usage of these layers are undetermined at this time.
					<vspace blankLines="1"/>
					The main application of the G.718 codec is telephony. Other expected
					applications include audio/video conferencing and streaming.
				</t>
			</section>
			<section title="Benefits of Layered Design">
				<t>
					Layered design enables simple scalability of the transmitted
					stream by conveying a suitable number of layers. The number of
					layers used in a session may be selected, for example, based on the
					capacity of the transmission channel, current transmission conditions,
					characteristics of the source signal, or available processing capacity.
					<vspace blankLines="1"/>
					Another obvious benefit of the layered codec design is the
					possibility to exploit the scalability to support congestion control
					by transmitting/dropping some of the (higher) enhancement layers in
					order to alleviate congestion in the network. See more detailed
					discussion on the congestion control in <xref target="C-C"/>.
					<vspace blankLines="1"/>
					Furthermore, the layered design also implicitly provides the possibility
					of unequal error detection/protection by employing different levels
					of protection on the core layer and the enhancement layers.
				</t>
			</section>
			<section title="Transmitting Layered Data">
				<t>
					In principle there are two basic approaches to carry the data from a layered encoder:
					<vspace blankLines="1"/>
					<list style="numbers">
						<t>
							All the layers are carried within a single RTP session
							<vspace blankLines="1"/>
						</t>
						<t>
							The encoded data is divided over multiple RTP sessions, each  session carrying a subset of layers. 
							This is also referred to as Multi-Session Transmission (MST)
							<vspace blankLines="1"/>
						</t>
					</list>
					The first choice is the most efficient in terms of exploitation of
					transmission bandwidth. Furthermore, using only one packet to carry
					all encoded data layers of a frame also requires fewer resources from
					the end-systems (and intermediate systems), since the number of
					packets is kept at a minimum and only a single RTP packet stream needs to
					be handled. However, this option requires any intermediate network
					element performing the scaling operation to be fully media-aware,
					since removing encoded layers requires modification of the payload.
					Furthermore, the intermediate network element needs to be within the
					security context to enable meaningful manipulation of the payload
					in case secure transport is employed. This might not be feasible in
					all systems/scenarios, but some special-purpose devices, such as
					media gateways in cellular telephone systems, may be able to implement
					this kind of media-aware functionality.
					<vspace blankLines="1"/>
					The second alternative, transmitting selected subsets of layers in
					separate RTP sessions, facilitates simple scalability in intermediate
					network elements without the requirement of being fully media-aware.
					One use case of this alternative is layered multicast <xref target="McCanne"/>.
					On the other hand, this approach introduces separate packet header
					overhead for each subset of layers in low-delay application
					scenarios wherein aggregation of data from multiple frames is not
					ideal. In this case, when the size of the encoded data block per
					single layer is in the range of 10 to 20 bytes, the packetisation may
					result in a relatively high amount of protocol overhead, which might be
					expensive on bandwidth-limited links. Another drawback of
					this approach is the somewhat more complex session setup and the
					additional complexity associated with handling several concurrent
					RTP sessions. However, this is a trade-off that enables simple
					scalability also by intermediate network elements that are not aware
					of the details of the transmitted media.
				</t>
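				<figure align="left" suppress-title="true">
					<preamble>
						The overhead trade-off between the two approaches can be made concrete
						with a rough calculation. The sketch below is illustrative only: it assumes
						IPv4, UDP and RTP headers without options or CSRC entries (20 + 8 + 12 bytes),
						one frame per packet, one octet of payload header plus one octet of transport
						block header from this payload format, and hypothetical per-layer EDU sizes in
						the 10 to 20 byte range mentioned above.
					</preamble>
					<artwork><![CDATA[
```python
IP_UDP_RTP = 20 + 8 + 12           # assumed IPv4 + UDP + RTP header bytes
FORMAT_HDR = 1 + 1                 # payload header (CRC) + one transport block header
EDU_BYTES = [20, 10, 10, 20, 20]   # assumed EDU sizes for L1..L5


def packet_bytes(edu_sizes):
    """Size of one packet carrying the given EDUs in a single transport block."""
    return IP_UDP_RTP + FORMAT_HDR + sum(edu_sizes)


def overhead_share(total, media):
    """Fraction of the transmitted bytes that is not encoded audio."""
    return (total - media) / total


media = sum(EDU_BYTES)
single_session = packet_bytes(EDU_BYTES)
one_session_per_layer = sum(packet_bytes([size]) for size in EDU_BYTES)

print(single_session, round(overhead_share(single_session, media), 2))
print(one_session_per_layer, round(overhead_share(one_session_per_layer, media), 2))
```
]]></artwork>
					<postamble>
						Under these assumptions, splitting every layer into its own RTP session
						more than doubles the bytes sent per frame, which is the cost paid for
						scalability in network elements that are not media-aware.
					</postamble>
				</figure>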
			</section>
			<section title="Scaling Scenarios and Rate Control">
				<t>
					In principle there are three different ways to make use of the 
					layered design to control the bandwidth usage:
					<vspace blankLines="1"/>
					<list style="numbers">
						<t>
							A sender decides to change the number of layers it is transmitting
							(for example due to congestion control constraints)
							<vspace blankLines="1"/>
						</t>
						<t>
							A receiver or an intermediate network element instructs a sender
							to change the number of layers it is transmitting
							<vspace blankLines="1"/>
						</t>
						<t>
							An intermediate network element passes through only a subset of layers it receives
							<vspace blankLines="1"/>
						</t>
					</list>
					The most appropriate mechanism depends on the application and the
					employed network topology. For example, a point-to-point conversational
					audio connection can easily introduce rate control by changing the
					number of transmitted layers, while in a centralized audio/video
					conferencing scenario the conference server is a more appropriate
					point to implement the rate control than the transmitting end-point.
					Please refer to <xref target="RFC5117"/> for an extensive discussion of the different
					topologies and their implications for the transmission.
					However, the fundamental difference between these choices is that
					method 1 does not necessarily need any feedback from the receiver(s),
					while methods 2 and 3 require a signaling mechanism to support rate control.
				</t>
			</section>
		</section>
		
		<section title="G.718 RTP Payload Format" anchor="G-R-P-F">
			<t>
				The basic G.718 source data unit is one layer of an encoded frame.
				Since the term layer generally refers to a time series of data
				representing a certain encoding layer, in this specification we use the
				term Encoded Data Unit (EDU) to refer to a single layer of data from a
				single encoded frame. Thus, each EDU has a (conceptual) frame number
				indicating its location in encoding/decoding order and a layer number
				indicating the encoding layer the EDU represents.
			</t>
			<section title="Payload Structure" anchor="P-S">
				<figure align="left" suppress-title="true">
					<preamble>
						The G.718 payload format consists of a payload header, followed by
						one or more transport blocks (TB) forming the actual payload data.
					</preamble>
					<artwork>
<![CDATA[+-----------------+----------+----------+- /// -+----------+
| Payload header  |  TB(1)   |  TB(2)   |          TB(n)   |
+-----------------+----------+----------+- /// -+----------+]]>
					</artwork>
				</figure>
				<section title="Payload Header" anchor="P-H">
					<t>
						The payload header consists of an 8-bit payload CRC checksum:
<figure align="left" suppress-title="true"><artwork><![CDATA[
+-+-+-+-+-+-+-+-+
|     CRC       |
+-+-+-+-+-+-+-+-+
]]></artwork></figure>
						On the transmitting end the payload checksum is computed over the
						primary transport block (specified in <xref target="G-T-B"/>) of the payload
						using the generator polynomial
<figure align="left" suppress-title="true"><artwork><![CDATA[
C(z) = z^8 + z^4 + z^3 + z^2 + 1
]]></artwork></figure>
						Subsequent transport blocks are prepared in such a way that the
						payload checksum is valid for any integer number of contiguous
						transport blocks within one RTP packet starting from the beginning of
						the primary transport block.
						<vspace blankLines="1"/>
						On the receiving end the payload CRC checksum can be used to verify
						the correct reception of any contiguous subset of transport blocks
						within one RTP packet starting from the beginning of the primary
						transport block (see <xref target="CRC-V"/> for a detailed description).
					</t>
				</section>
				<section title="G.718 Transport Blocks" anchor="G-T-B">
					<t>
						The basic building block of the G.718 RTP payload data is a G.718
						transport block (TB). There are two types of transport blocks:
						primary and secondary.
						<vspace blankLines="1"/>
						
						<figure align="left" suppress-title="true">
							<preamble>The structure of the primary transport block is depicted below.</preamble>
							<artwork>
<![CDATA[ 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+----------------------------+
|   L-ID    |NF | Encoded data               |
+-+-+-+-+-+-+-+-+----------------------------+
]]>
							</artwork>
						</figure>
						
						<figure align="left" suppress-title="true">
							<preamble>The structure of the secondary transport block is depicted below.</preamble>
							<artwork>
<![CDATA[ 0 1 2 3 4 5 6 7                              0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+----------------------------+-+-+-+-+-+-+-+-+
|   L-ID    |NF | Encoded Data               |     Tail      |
+-+-+-+-+-+-+-+-+----------------------------+-+-+-+-+-+-+-+-+
]]>
							</artwork>
						</figure>
						The layer ID (L-ID) and the NF fields form the transport block header.
						The L-ID field is used to identify the layer structure of the encoded
						data carried in this G.718 transport block, and the NF field
						indicates the number of encoded frames with this layer structure
						carried in the Encoded data part following the transport block header.
						The Tail field of the secondary transport block carries a modified 8-bit
						CRC checksum computed over the transport block, as specified below.
						<vspace blankLines="1"/>
						A G.718 RTP packet payload SHALL include exactly one primary
						transport block, which MAY be followed by one or more secondary
						transport blocks. The data fields of both transport block types are described below.
						<vspace blankLines="1"/>
						<list style="hanging">
							<t hangText="L-ID (6 bits)">
								<vspace blankLines="0"/>
								Identification of the encoded data carried in this transport block. 
								Table 3 below specifies the mapping between L-ID and the encoded data.
								Note that L-ID is treated as an unsigned integer.
								<figure align="center" suppress-title="true">
									<preamble>Table 3: Layer Identification (L-ID) Values
									</preamble>
									<artwork>
<![CDATA[      L-ID    Encoded data
    --------------------------
        0     Empty frame
        1     L1
        2     L1-L2
        3     L1-L3
        4     L1-L4
        5     L1-L5
        6     L2
        7     L2-L3
        8     L2-L4
        9     L2-L5
       10     L3
       11     L3-L4
       12     L3-L5
       13     L4
       14     L4-L5
       15     L5
       16     L1'
       17     L1', L3'
       18     L1', L3', L4
       19     L1', L3', L4-L5
       20     G.718 SID
       21     AMR-WB SID
       22-63  Reserved]]>
									</artwork>
								</figure>
							</t>
							<t hangText="NF (2 bits)">
								<vspace blankLines="0"/>
								Number of frames in this transport block, decreased by
								one; that is, the number of frames is equal to the value of NF
								incremented by one. For example, value NF=0 indicates that the
								transport block carries one frame, and value NF=3 indicates that
								the transport block carries four frames. If the sender wants to
								encapsulate more than four frames per payload, several
								transport blocks need to be used.
								<vspace blankLines="1"/>
							</t>
							<t hangText="Encoded Data (variable length)">
								<vspace blankLines="0"/>
								Encoded data consists of EDUs as specified by the values of the L-ID
								and NF fields, arranged according to the rules given in <xref target="H-T-E-D"/>.
								When L-ID is equal to 0 (empty frame), the encoded data field is not present.
								<vspace blankLines="1"/>
							</t>
							<t hangText="Tail (8 bits)">
								<vspace blankLines="0"/>
								The Tail field of the secondary transport block carries a
								bit field that is needed to modify the partial CRC checksum
								over the payload data up to the end of this TB to match the
								payload CRC field value carried in the payload header.
								<vspace blankLines="0"/>
								In the transmitter the Tail bits for a secondary TB(n) are
								computed by first computing the CRC checksum CRC(n) over the
								payload data from the beginning of the primary TB up to the end
								of TB(n) using the generator polynomial C(z) given above. The
								bits of the Tail field of TB(n) are set to zero value for the
								CRC computation. The transmitted value of the Tail field in
								TB(n) is obtained by bitwise XOR operation between the payload
								CRC field value carried in the payload header and the CRC(n)
								computed for TB(n).
							</t>
						</list>
					</t>
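					<figure align="left" suppress-title="true">
						<preamble>
							The transport block header layout described above (L-ID in bits 0 to 5,
							NF in bits 6 and 7) can be sketched as follows. The sketch assumes the usual
							convention that bit 0 is the most significant bit of the octet; the function
							names are illustrative and not part of this specification.
						</preamble>
						<artwork><![CDATA[
```python
def pack_tb_header(l_id, frames):
    """Build the one-octet transport block header from L-ID and a frame count."""
    if not 0 <= l_id <= 63:
        raise ValueError("L-ID is a 6-bit unsigned integer")
    if not 1 <= frames <= 4:
        raise ValueError("a transport block carries one to four frames")
    return (l_id << 2) | (frames - 1)   # NF is the frame count minus one


def unpack_tb_header(octet):
    """Return (L-ID, number of frames) from a transport block header octet."""
    return octet >> 2, (octet & 0x03) + 1


header = pack_tb_header(3, 2)           # layers L1-L3, two frames (NF=1)
print(f"{header:08b}", unpack_tb_header(header))
```
]]></artwork>
					</figure>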
				</section>
			
			</section>
			<section title="Handling The Encoded Data" anchor="H-T-E-D">
				<t>
					In order to provide unique mapping of EDUs to encoded frames, the
					following rules on sequence of frames and sequence of layers need to
					be followed when creating a payload:
					<vspace blankLines="1"/>
					<list style="symbols">
						<t>
							The frames within a payload MUST form a set of contiguous frames
							in decoding order, i.e. if a payload carries frames n and n+N, all
							frames between n and n+N in decoding order MUST also be present in
							the payload.
							<vspace blankLines="1"/>
						</t>
						<t>
							The layers within a frame MUST form a contiguous set of layers,
							i.e. if layers Lx and Ly of a frame are included in the payload,
							all layers between Lx and Ly layers MUST also be present.
							<vspace blankLines="1"/>
						</t>
					</list>
					The EDUs within a transport block are arranged according to the following rules:
					<vspace blankLines="1"/>
					<list style="symbols">
						<t>
							The EDUs within a transport block MUST be arranged in increasing order of layer number
							<vspace blankLines="1"/>
						</t>
						<t>
							The EDUs with the same layer number within a transport block MUST be arranged in decoding order
							<vspace blankLines="1"/>
						</t>
					</list>
					Explicit timing information for the transport blocks is not needed,
					since the ordering of EDUs in the payload and their mapping to
					transport blocks can be used to implicitly carry this information.
					The following rules apply:
					<vspace blankLines="1"/>
					<list style="symbols">
						<t>
							If the highest layer carried in transport block k is n, and the
							lowest layer carried by transport block k+1 is n+1, then the EDUs
							of transport block k and k+1 belong to the same encoded frame.
							Furthermore, if transport blocks k and k+1 carry EDUs belonging to
							the same encoded frame(s), these transport blocks MUST include the
							same number of EDUs
							<vspace blankLines="1"/>
						</t>
						<t>
							If the highest layer carried in transport block k is n, and the
							lowest layer carried by transport block k+1 is smaller than or
							equal to n, the EDUs of transport block k and k+1 belong to
							two separate encoded frames, which are contiguous in decoding order
							<vspace blankLines="1"/>
						</t>
						<t>
							Multiple copies of an EDU MUST NOT be included in the payload
							<vspace blankLines="1"/>
						</t>
					</list>
					A set of EDUs can be allocated to transport blocks in several ways.
					For example, each EDU can be encapsulated in its own transport block,
					all EDUs can be carried in a single transport block, EDUs belonging to
					the same encoded frame can be encapsulated in a dedicated transport
					block, or EDUs representing the same layer can be carried in their
					own transport blocks. Three examples of this, each with two frames with
					layers L1-L3, are given below. The first example illustrates the case
					using a single transport block for the whole payload, while the
					second example introduces a separate transport block for each
					of the EDUs. The third example shows an approach where the EDUs of each layer are
					carried in a dedicated transport block. The notation Fx-Ly is used to
					denote layer y of frame x.
					<figure align="left" suppress-title="true">
						<preamble>Example 1: All EDUs in a single transport block</preamble>
						<artwork>
<![CDATA[+---------+-----+-------+-------+-------+-------+-------+--------+
| L-ID=3  |NF=1 | F1-L1 | F2-L1 | F1-L2 | F2-L2 | F1-L3 | F2-L3  |
+---------+-----+-------+-------+-------+-------+-------+--------+]]>
						</artwork>
					</figure>
					<figure align="left" suppress-title="true">
						<preamble>Example 2: All EDUs in separate transport blocks</preamble>
						<artwork>
<![CDATA[+---------+-----+-------+---------+-----+-------+
| L-ID=1  |NF=0 | F1-L1 | L-ID=1  |NF=0 | F2-L1 |
+---------+-----+-------+---------+-----+-------+
| L-ID=6  |NF=0 | F1-L2 | L-ID=6  |NF=0 | F2-L2 |
+---------+-----+-------+---------+-----+-------+
| L-ID=10 |NF=0 | F1-L3 | L-ID=10 |NF=0 | F2-L3 |
+---------+-----+-------+---------+-----+-------+]]>
						</artwork>
					</figure>
					<figure align="left" suppress-title="true">
						<preamble>Example 3: Dedicated transport block for EDUs of each layer</preamble>
						<artwork>
<![CDATA[+---------+-----+-------+-------+---------+-----+-------+-------+
| L-ID=1  |NF=1 | F1-L1 | F2-L1 | L-ID=6  |NF=1 | F1-L2 | F2-L2 |
+---------+-----+-------+-------+---------+-----+-------+-------+
| L-ID=10 |NF=1 | F1-L3 | F2-L3 |
+---------+-----+-------+-------+]]>
						</artwork>
					</figure>
					While the first example, carrying data from all layers in the same
					transport block, obviously consumes less bandwidth, the second example,
					using a separate transport block for each EDU, and the third example,
					using a dedicated transport block for each layer, make scaling
					simple. In the first case, the removal of, for example,
					layer L3 (from each frame in the payload) would require changing the
					value of the L-ID in addition to removing the corresponding EDU(s).
					In the second and third options, it is enough to remove all
					transport blocks carrying L3 data, and the remaining part of the
					payload can be left untouched (however, the packet size information in
					high-layer protocol headers needs to change).
				</t>
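				<figure align="left" suppress-title="true">
					<preamble>
						The ordering rules for EDUs within one transport block can be sketched as
						follows. The sketch derives the layer range for L-ID values 1 to 15 from the
						pattern of Table 3 and lists EDU labels in the Fx-Ly notation; the function
						names and the first_frame parameter are assumptions made for illustration.
					</preamble>
					<artwork><![CDATA[
```python
def layer_ranges():
    """Map L-ID 1..15 to an inclusive (lowest, highest) layer pair, as in Table 3."""
    table, l_id = {}, 1
    for low in range(1, 6):
        for high in range(low, 6):
            table[l_id] = (low, high)
            l_id += 1
    return table


def edu_order(l_id, nf, first_frame=1):
    """List EDU labels in transmission order for one transport block."""
    low, high = layer_ranges()[l_id]
    frames = range(first_frame, first_frame + nf + 1)   # NF is frames minus one
    # Increasing layer number first, then decoding order within each layer.
    return [f"F{f}-L{layer}" for layer in range(low, high + 1) for f in frames]


print(edu_order(3, 1))   # the single transport block of Example 1
print(edu_order(6, 1))   # the transport block carrying L2 in Example 3
```
]]></artwork>
				</figure>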
			</section>
			
			<section title="G.718 Scaling">
				<t>
					Some Media-Aware Network Elements (MANEs) MAY modify the G.718
					bitstream by dropping some of the layers in case congestion control
					or, for example, access link bandwidth requires such scaling to take place.
					Such MANEs are RTP translators (with the topology Topo-Translator as
					described in <xref target="RFC5117"/>),
					for which the rules for RTP translators specified in <xref target="RFC3550"/> apply.
					<vspace blankLines="1"/>
					A payload can be either completely dropped or some of the transport
					blocks it carries can be discarded. In case full payloads are dropped
					to implement scaling, a packet containing the core layer L1 SHOULD
					NOT be discarded, since the decoding of higher layers of the same
					encoded frame is not possible without the core layer data being
					available. This means that payloads with L-ID values equal to 1 to 5,
					inclusive, and 16 to 19, inclusive, SHOULD NOT be completely discarded.
					<vspace blankLines="1"/>
					<list>
						<t>
							Author's note: To be checked whether the case of dropping a subset of the transport blocks in 
							one packet also strictly follows the topology Topo-Translator.
						</t>
					</list>
					<vspace blankLines="1"/>
					In case the payload is forwarded with modified content, at least the
					primary transport block MUST be preserved in the payload, while some
					of the secondary transport blocks at the end of the payload MAY be
					discarded.
				</t>
			</section>
			
			<section title="CRC Verification" anchor="CRC-V">
				<t>
					Both UDP-Lite <xref target="RFC3828"/>
					and DCCP <xref target="RFC4340"/>
					provide partial checksum
					options, in which partially damaged payloads can be delivered to the
					application layer.  In cases wherein such a transport layer operation
					is in use, and the partial checksum service by the transport layer
					protects up to the RTP header and the payload header, the CRC
					checksum provided in the payload header can be used to verify whether
					an RTP packet payload contains corrupt transport blocks.
					<vspace blankLines="1"/>
					On the receiving end the CRC verification is made in such a way that
					the CRC computation is started from the beginning of the primary TB,
					i.e. from the MSB of the first octet of the TB(1), and the
					computation is continued until the end of the payload data or until
					an erroneous TB is encountered. At the end of each TB a check MAY be
					performed: if the CRC value at the end of TB(n) matches the payload
					CRC value received in the payload header, the verification is
					successful and the data up to TB(n) is valid. If the CRC value at the
					end of TB(n) does not match the payload CRC value received in the
					payload header, there is an error in TB(n) and it MUST be
					discarded as corrupted. Furthermore, if the verification indicates
					a corrupted TB(n), all subsequent transport blocks TB(m) with m>n MUST
					also be discarded.
				</t>		
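				<figure align="left" suppress-title="true">
					<preamble>
						The interaction between the payload CRC, the Tail field and the receiver
						check can be sketched as follows. This is a non-normative illustration under
						stated assumptions: the checksum is taken to be the plain remainder of the
						bit string modulo C(z), processed MSB first with a zero initial register and
						no final XOR, which is the interpretation under which the XOR-based Tail
						construction makes the running CRC equal the header value. The transport
						block contents and lengths are dummy values, and all function names are
						hypothetical.
					</preamble>
					<artwork><![CDATA[
```python
POLY = 0x1D  # z^4 + z^3 + z^2 + 1; the z^8 term is implicit in the 8-bit register


def crc_update(reg, data):
    """Feed bytes MSB first; reg is the remainder of the bits seen so far mod C(z)."""
    for byte in data:
        for shift in range(7, -1, -1):
            carry = reg & 0x80
            reg = ((reg << 1) & 0xFF) | ((byte >> shift) & 1)
            if carry:
                reg ^= POLY
    return reg


def build_payload(primary, secondaries):
    """Return payload header + primary TB + secondary TBs with Tail octets filled in."""
    header_crc = crc_update(0, primary)
    out = bytearray([header_crc]) + primary
    for tb in secondaries:                      # tb = TB header + encoded data, no Tail
        out += tb
        crc_n = crc_update(0, bytes(out[1:]) + b"\x00")   # Tail bits set to zero
        out.append(header_crc ^ crc_n)
    return bytes(out)


def valid_blocks(payload, tb_lengths):
    """Count leading transport blocks whose running CRC matches the header."""
    header_crc, reg, pos, good = payload[0], 0, 1, 0
    for length in tb_lengths:                   # lengths include the Tail where present
        reg = crc_update(reg, payload[pos:pos + length])
        if reg != header_crc:
            break
        pos, good = pos + length, good + 1
    return good


primary = bytes([0x04]) + bytes(range(20))      # L-ID=1, NF=0, dummy 20-byte EDU
secondary = bytes([0x18]) + bytes(range(10))    # L-ID=6, NF=0, dummy 10-byte EDU
payload = build_payload(primary, [secondary])
lengths = [len(primary), len(secondary) + 1]
print(valid_blocks(payload, lengths))           # both blocks verify

damaged = payload[:-3] + bytes([payload[-3] ^ 0x01]) + payload[-2:]
print(valid_blocks(damaged, lengths))           # only the primary block verifies
```
]]></artwork>
					<postamble>
						Because the check succeeds at every transport block boundary, a payload
						truncated after any complete transport block still verifies against the
						unchanged payload header.
					</postamble>
				</figure>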
			</section>
			
			<section title="G.718 Session">
				<t>
					A G.718 session consists of one or several RTP sessions carrying G.718 data encoded according to the 
					payload format specified in <xref target="P-S"/>.
				</t>
			</section>
			
			<section title="Cross-stream/Cross-layer Timing Synchronization">
				<t>
					In the case where a G.718 session consists of multiple RTP sessions, the RTP
					packets transmitted on separate RTP sessions need to be synchronized
					in order to enable reconstruction of the frames in the receiving end <xref target="RFC6051"/>.
					Since each of the RTP sessions uses its own random initial value for
					the RTP timestamp, there is also a random offset between the RTP
					timestamp values of the packets carrying the EDUs belonging to the same encoded
					frame in different RTP sessions.
					<vspace blankLines="1"/>
					The receiver MUST use the traditional RTCP-based mechanism to
					synchronize streams by using the RTP and NTP timestamps of the RTCP
					Sender Reports (SR) it receives <xref target="RFC3550"/>.
				</t>
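				<figure align="left" suppress-title="true">
					<preamble>
						As an illustration only, the following Python sketch shows one way a receiver
						might map RTP timestamps from different RTP sessions onto the sender's wallclock
						using the RTP/NTP timestamp pair of an RTCP SR, so that EDUs of the same frame
						can be matched. The function names, the 1 ms tolerance and the representation of
						the NTP timestamp as seconds are assumptions made for this sketch and are not
						defined by this specification.
					</preamble>
					<artwork>
<![CDATA[
```python
RTP_CLOCK_HZ = 32000      # timestamp clock rate used by this payload format
TS_MODULUS = 1 << 32      # RTP timestamps are 32 bits wide and wrap around


def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_ntp_seconds,
                     clock_hz=RTP_CLOCK_HZ):
    """Map an RTP timestamp to sender wallclock time (in seconds).

    sr_rtp_ts and sr_ntp_seconds are the RTP timestamp and the NTP
    timestamp (converted to seconds) from one RTCP SR of the same
    RTP session as rtp_ts.
    """
    # Signed 32-bit difference, so that packets slightly before or
    # after the SR are handled correctly across timestamp wrap-around.
    diff = (rtp_ts - sr_rtp_ts) % TS_MODULUS
    if diff >= TS_MODULUS // 2:
        diff -= TS_MODULUS
    return sr_ntp_seconds + diff / clock_hz


def same_frame(wallclock_a, wallclock_b, tolerance=0.001):
    """Hypothetical helper: two EDUs are taken to belong to the same
    frame if their mapped wallclock times agree within the tolerance."""
    return abs(wallclock_a - wallclock_b) < tolerance
```
]]>
					</artwork>
					<postamble>
						With this mapping, the random per-session timestamp offset cancels out, because each
						session is referenced to its own SR.
					</postamble>
				</figure>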
			</section>
			
			<section title="RTP Header Usage">
				<t>
					   This section specifies the usage of some fields of the RTP header
						(<xref target="RFC3550">specified in Section 5 of </xref>)
						with the G.718 RTP payload format.  The settings for other RTP header fields are as specified in
						<xref target="RFC3550"/>.
						<vspace blankLines="1"/>
						The RTP timestamp corresponds to the sampling instant of the first
						encoded sample of the earliest frame in the payload. The timestamp
						clock frequency is 32 kHz.
						<vspace blankLines="1"/>
						The marker bit (M) of each of the RTP streams of the session SHALL be
						set to value 1 if the payload carries an EDU belonging to the first
						frame after an inactive period, i.e. an EDU from the first frame of a
						talkspurt. For all other packets the marker bit is set to value 0.
				</t>
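				<figure align="left" suppress-title="true">
					<preamble>
						The timestamp and marker bit rules above can be sketched as follows. This is a
						minimal illustration and not a normative procedure: the 20 ms frame duration is
						an assumption taken from the G.718 codec rather than from this section, and the
						function and parameter names are hypothetical.
					</preamble>
					<artwork>
<![CDATA[
```python
RTP_CLOCK_HZ = 32000     # timestamp clock frequency stated above
FRAME_MS = 20            # assumption: G.718 frame duration in ms
TICKS_PER_FRAME = RTP_CLOCK_HZ * FRAME_MS // 1000


def packet_header_fields(initial_ts, earliest_frame_index,
                         first_after_inactive):
    """Return (timestamp, marker) for one RTP packet.

    initial_ts            random initial RTP timestamp of the session
    earliest_frame_index  index of the earliest frame in the payload,
                          counted in frame periods since stream start
                          (inactive periods included)
    first_after_inactive  True if the payload carries an EDU of the
                          first frame of a talkspurt
    """
    ticks = earliest_frame_index * TICKS_PER_FRAME
    timestamp = (initial_ts + ticks) % (1 << 32)   # 32-bit wrap
    marker = 1 if first_after_inactive else 0
    return timestamp, marker
```
]]>
					</artwork>
					<postamble>
						Under the stated assumption, consecutive frames are 640 timestamp units apart.
					</postamble>
				</figure>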
			</section>
		</section>
		
		<section title="Payload Format Parameters">
			<t>
				This section defines the parameters that may be used to configure
				optional features in the G.718 RTP transmission.
				<vspace blankLines="1"/>
				The parameters are defined here as part of the media subtype
				registration for the G.718 codec.  Mapping of the parameters into the
				Session Description Protocol (SDP) <xref target="RFC4566"/>
				is also provided for those applications that use SDP.  In control protocols that do not
				use MIME or SDP, the media type parameters MUST be mapped to the
				format used with that control protocol.
			</t>
			<section title="Media Type Registration" anchor="M-T-R">
				<t>
					This registration is done using the template defined in RFC 4288 <xref target="RFC4288"/>
					and following RFC 4855 <xref target="RFC4855"/>.
					<vspace blankLines="1"/>
					<list style="hanging">
						<t hangText="Type name:">
							audio
							<vspace blankLines="1"/>
						</t>
						<t hangText="Subtype name:">
							G718
							<vspace blankLines="1"/>
						</t>
						<t hangText="Required parameters:">
							none
							<vspace blankLines="1"/>
						</t>
						<t hangText="Optional parameters:">
							<list style="empty">
								<t>
									<list style="hanging" hangIndent="13">
										<t hangText="mode:">
											This parameter MAY be used to indicate whether the mode with layer L1 being present 
											or the AMR-WB
											compatible mode (with layer L1' being present) is in
											use. If this parameter is not present or the value of
											this parameter is equal to 0, the mode with layer L1
											being present is in use. Otherwise, the AMR-WB
											compatible mode is in use. When this parameter is
											present, the value MUST be either 0 or 1.
											<list hangIndent="6">
												<t hangText="NOTE:">
													When the upcoming stereo and SWB options are present, 
													the semantics of this parameter may change.
												</t>
											</list>
											<vspace blankLines="1"/>
										</t>
										<t hangText="layers:">
											The numbers of the layers (in the range from 1 to 5,
											denoting layers L1 to L5, respectively) transmitted in this session,
											expressed as a comma-separated list of layer numbers. If the parameter is
											present, at least layer L1 or L1' MUST be included in
											the list of layers in one of the RTP sessions included
											in the G.718 session. If the parameter is not present,
											all layers up to layer L5 MAY be used in the session.
											<list hangIndent="6">
												<t hangText="NOTE:">Why not use semantics similar to those of L-ID?</t>
											</list>
											<vspace blankLines="1"/>
										</t>
										<t hangText="ptime:">
											The recommended length of time (in milliseconds) represented by the media in a packet.
											<xref target="RFC4566">See Section 6 of</xref>.
											<vspace blankLines="1"/>
										</t>
										<t hangText="maxptime:">
											The maximum length of time (in milliseconds) that can be encapsulated in a packet.  
											<xref target="RFC4566">See Section 6 of</xref>.
											<vspace blankLines="1"/>
										</t>
									</list>
								</t>
							</list>
							<list hangIndent="6">
								<t hangText="NOTE:">
									Some further study is needed to see if separate
									parameters for sending and receiving capabilities/preferences are
									needed -- especially for upcoming stereo and SWB options.
									<vspace blankLines="1"/>
								</t>
								<t hangText="NOTE:">
									Support for upcoming SWB and stereo options
									needs to be taken into account. Basically we can either 1) extend
									the parameter "layers" to cover also this aspect, or 2) define
									separate parameter(s) for these new options when more details on
									the stereo/SWB support are available.
									<vspace blankLines="1"/>
								</t>
							</list>
						</t>
						<t hangText="Encoding considerations:">
							<vspace blankLines="0"/>
							This media type is framed and contains binary data; see Section 4.8 of <xref target="RFC4288"/>.
							<vspace blankLines="1"/>
						</t>
						<t hangText="Security considerations:">
							See <xref target="S-C"/> of RFC XXXX.
							<vspace blankLines="0"/>
							[RFC Editor: Upon publication as an RFC, please replace "XXXX" with the number assigned to this document 
							and remove this note.]
							<vspace blankLines="1"/>
						</t>
						<t hangText="Interoperability considerations:">
							None.
							<vspace blankLines="1"/>
						</t>
						<t hangText="Published specification:">
							RFC XXXX.
							<vspace blankLines="0"/>
							[RFC Editor: Upon publication as an RFC, please replace "XXXX" with the number assigned to this document 
							and remove this note.]
							<vspace blankLines="1"/>
						</t>
						<t hangText="Applications which use this media type:">
							<vspace blankLines="0"/>
							For example: Voice over IP, audio and video conferencing, audio streaming and voice messaging.
							<vspace blankLines="1"/>
						</t>
						<t hangText="Additional information:">
							None.
							<vspace blankLines="1"/>
						</t>
						<t hangText="Person &amp; email address to contact for further information:">
							<vspace blankLines="0"/>
							Ari Lakaniemi, ari.lakaniemi@nokia.com
							<vspace blankLines="1"/>
						</t>
						<t hangText="Intended usage:">
							COMMON
							<vspace blankLines="1"/>
						</t>
						<t hangText="Restrictions on usage:">
							<vspace blankLines="0"/>
							This media type depends on RTP framing, and hence is only defined for transfer via RTP <xref target="RFC3550"/>.
							<vspace blankLines="1"/>
						</t>
						<t hangText="Author:">
							Ari Lakaniemi, ari.lakaniemi@nokia.com
							<vspace blankLines="1"/>
						</t>
						<t hangText="Change controller:">
							<vspace blankLines="0"/>
							IETF Audio/Video Transport Working Group delegated from the IESG.
						</t>
					</list>
				</t>
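				<figure align="left" suppress-title="true">
					<preamble>
						The constraints on the optional parameters "mode" and "layers" can be sketched as a
						small validating parser. This is an illustration under stated assumptions: the
						function names are hypothetical, the input is assumed to be a semicolon-separated
						list of parameter=value pairs, unknown parameters are ignored, and layer number 1
						is taken to denote L1 (or L1' in the AMR-WB compatible mode).
					</preamble>
					<artwork>
<![CDATA[
```python
def parse_g718_params(param_string):
    """Parse e.g. 'mode=0; layers=1,2' into a validated dict."""
    # Defaults that apply when a parameter is not present.
    params = {"mode": 0, "layers": [1, 2, 3, 4, 5]}
    for item in param_string.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        name = name.strip().lower()
        value = value.strip()
        if name == "mode":
            if value not in ("0", "1"):
                raise ValueError("mode MUST be either 0 or 1")
            params["mode"] = int(value)
        elif name == "layers":
            layers = [int(v) for v in value.split(",")]
            if any(n < 1 or n > 5 for n in layers):
                raise ValueError("layer numbers must be in 1..5")
            params["layers"] = layers
    return params


def session_has_base_layer(per_rtp_session_layers):
    """At least one RTP session of the G.718 session must carry
    layer number 1 (L1 or L1')."""
    return any(1 in layers for layers in per_rtp_session_layers)
```
]]>
					</artwork>
					<postamble>
						The base layer check is applied across all RTP sessions of the G.718 session, since
						a single RTP session may legitimately carry only enhancement layers.
					</postamble>
				</figure>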
			</section>
			
			<section title="Mapping to SDP Parameters">
				<t>
					The information carried in the media type specification has a
					specific mapping to fields of the SDP <xref target="RFC4566"/>,
					which is commonly used to describe RTP sessions.  When SDP is used to specify sessions
					employing the G.718 codec, the mapping is as follows:
					<vspace blankLines="1"/>
					<list style="symbols">
						<t>
							The media type ("audio") goes in SDP "m=" as the media name.
							<vspace blankLines="1"/>
						</t>
						<t>
							The media subtype ("G718") goes in SDP "a=rtpmap" as the encoding
							name.  The RTP clock rate in "a=rtpmap" MUST be 32000 for G.718.
							<list style="hanging">
								<t hangText="NOTE:">
									The current choice for the RTP clock rate is a 'placeholder'. 
									The clock rate needs to be set according to the SWB
									sampling rate, which is still T.B.D. Since the core codec employs
									a 16000 Hz sampling rate, an integer multiple of 16000 Hz seems to
									be a preferable choice.
									<vspace blankLines="1"/>
								</t>
							</list>
						</t>
						<t>
							The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and "a=maxptime" attributes, 
							respectively.
							<vspace blankLines="1"/>
						</t>
						<t>
							Any remaining parameters go in the SDP "a=fmtp" attribute by
							copying them directly from the media type string as a
							semicolon-separated list of parameter=value pairs.
						</t>
					</list>
				</t>
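				<figure align="left" suppress-title="true">
					<preamble>
						The mapping listed above can be sketched as a small helper that emits the SDP
						lines for one RTP session. The helper name, its arguments, the fixed AVPF profile
						and the "; " separator are assumptions made for this illustration only.
					</preamble>
					<artwork>
<![CDATA[
```python
def g718_sdp_lines(port, payload_type, ptime=None, maxptime=None,
                   **fmtp):
    """Build the SDP lines describing one RTP session with G.718."""
    lines = [
        # Media type "audio" goes in the m= line.
        "m=audio {} RTP/AVPF {}".format(port, payload_type),
        # Subtype "G718" and the 32000 Hz clock rate go in a=rtpmap.
        "a=rtpmap:{} G718/32000/1".format(payload_type),
    ]
    if fmtp:
        # Remaining parameters are copied as parameter=value pairs.
        pairs = "; ".join("{}={}".format(k, v)
                          for k, v in fmtp.items())
        lines.append("a=fmtp:{} {}".format(payload_type, pairs))
    if ptime is not None:
        lines.append("a=ptime:{}".format(ptime))
    if maxptime is not None:
        lines.append("a=maxptime:{}".format(maxptime))
    return lines
```
]]>
					</artwork>
					<postamble>
						For example, g718_sdp_lines(49120, 97, layers="1,2") yields the m=, a=rtpmap and
						a=fmtp lines for payload type 97 restricted to layers L1 and L2.
					</postamble>
				</figure>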
			</section>
			
			<section title="Offer/Answer Considerations">
				<t>
					The following considerations apply when using the SDP offer/answer <xref target="RFC3264"/>
					mechanism to negotiate the G.718 transport. The parameter
					"layers" MAY be used to indicate, for each RTP session belonging to
					the current G.718 session, the layer configuration that the end-point
					making the offer is ready to transmit and wishes to receive.
					<vspace blankLines="1"/>
					<list style="symbols">
						<t>
							In case the G.718 session consists of a single RTP session, it is
							RECOMMENDED not to impose any layer restrictions on the session
							but to use the rate control functionality to set possible
							restrictions on usage of the higher or highest layers. If the
							offer includes a layer configuration parameter, the answer MAY use
							a different configuration, but the highest layer in the answer MUST
							NOT be higher than the highest layer of the offered configuration.
							<list style="hanging">
								<t hangText="NOTE:">
									Support for an answer modifying the layer configuration is for further study (FFS).
								</t>
							</list>
							<vspace blankLines="1"/>
						</t>
						<t>
							In case the G.718 session consists of multiple RTP sessions, the
							answer MUST use the layer configurations provided in the offer for
							the sessions it accepts.
						</t>
					</list>
				</t>
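				<figure align="left" suppress-title="true">
					<preamble>
						A literal reading of the two rules above can be sketched as the following check.
						The data model is hypothetical: the offer and the answer are each represented as a
						list with one entry per RTP session, every entry being the non-empty list of layer
						numbers from the "layers" parameter, or None in the answer for a session that is
						not accepted.
					</preamble>
					<artwork>
<![CDATA[
```python
def answer_is_acceptable(offer, answer):
    """Check an answer's layer configuration against the offer."""
    if len(answer) != len(offer):
        return False
    if len(offer) == 1:
        # Single RTP session: the answer may differ, but its highest
        # layer must not exceed the highest offered layer.
        ans = answer[0]
        return ans is None or max(ans) <= max(offer[0])
    # Multiple RTP sessions: every accepted session must reuse the
    # layer configuration given in the offer.
    return all(ans is None or sorted(ans) == sorted(off)
               for off, ans in zip(offer, answer))
```
]]>
					</artwork>
				</figure>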
			</section>
			<section title="Declarative Usage of SDP">
				<t>
					In declarative usage, such as SDP in RTSP <xref target="RFC2326"/>
					or SAP <xref target="RFC2974"/>,
					the parameter "layers" SHALL be interpreted to provide a set of
					layers that the sender MAY use in the session.
				</t>
			</section>
			
			<section title="SDP Examples">
				<t>
					Some example SDP session descriptions utilizing G.718 encodings are provided below.
				</t>
				<section title="Example 1">
					<t>
						The first example illustrates the simple case in which a G.718 session employing a single RTP session and 
						the AVPF profile is offered, and the answer accepts the offer without any changes.
						<vspace blankLines="1"/>
						<list style="hanging" hangIndent="2">
							<t hangText="Offer:">
								<vspace blankLines="1"/>
								m=audio 49120 RTP/AVPF 97
								<vspace blankLines="0"/>
								a=rtpmap:97 G718/32000/1
								<vspace blankLines="1"/>
							</t>
							<t hangText="Answer:">
								<vspace blankLines="1"/>
								m=audio 49120 RTP/AVPF 97
								<vspace blankLines="0"/>
								a=rtpmap:97 G718/32000/1
							</t>
						</list>
					</t>
				</section>
				<section title="Example 2">
					<t>
						This example shows a slightly more complex case in which a G.718 session using a single RTP session and
						the AVPF profile is offered with the restriction to send/receive only layers L1 and L2. The
						answer indicates that the other end-point is willing to receive (and send) layers up to L5.
						<vspace blankLines="1"/>
						<list style="hanging" hangIndent="2">
							<t hangText="Offer:">
								<vspace blankLines="1"/>
								m=audio 49120 RTP/AVPF 97
								<vspace blankLines="0"/>
								a=rtpmap:97 G718/32000/1
								<vspace blankLines="0"/>
								a=fmtp:97 layers=1,2
								<vspace blankLines="1"/>
							</t>
							<t hangText="Answer:">
								<vspace blankLines="1"/>
								m=audio 49120 RTP/AVPF 97
								<vspace blankLines="0"/>
								a=rtpmap:97 G718/32000/1
								<vspace blankLines="0"/>
								a=fmtp:97 layers=1,2,3,4,5
							</t>
						</list>
					</t>
				</section>
				<section title="Example 3">
					<t>
						The third example shows a G.718 session using multiple RTP sessions with the AVPF profile. The 
						answerer wishes to use only layers up to L3.
						<vspace blankLines="1"/>
						<list style="hanging" hangIndent="2">
							<t hangText="Offer:">
								<vspace blankLines="1"/>
								m=audio 49120 RTP/AVPF 97
								<vspace blankLines="0"/>
								a=rtpmap:97 G718/32000/1
								<vspace blankLines="0"/>
								a=fmtp:97 layers=1,2
								<vspace blankLines="0"/>
								a=mid:1
								<vspace blankLines="1"/>
								m=audio 49122 RTP/AVPF 98
								<vspace blankLines="0"/>
								a=rtpmap:98 G718/32000/1
								<vspace blankLines="0"/>
								a=fmtp:98 layers=3
								<vspace blankLines="0"/>
								a=mid:2
								<vspace blankLines="0"/>
								a=depend:98 lay 1:97
								<vspace blankLines="1"/>
								m=audio 49124 RTP/AVPF 99
								<vspace blankLines="0"/>
								a=rtpmap:99 G718/32000/1
								<vspace blankLines="0"/>
								a=fmtp:99 layers=4,5
								<vspace blankLines="0"/>
								a=mid:3
								<vspace blankLines="0"/>
								a=depend:99 lay 1:97 2:98
								<vspace blankLines="1"/>
							</t>
							<t hangText="Answer:">
								<vspace blankLines="1"/>
								m=audio 49120 RTP/AVPF 97
								<vspace blankLines="0"/>
								a=rtpmap:97 G718/32000/1
								<vspace blankLines="0"/>
								a=fmtp:97 layers=1,2
								<vspace blankLines="0"/>
								a=mid:1
								<vspace blankLines="1"/>
								m=audio 49122 RTP/AVPF 98
								<vspace blankLines="0"/>
								a=rtpmap:98 G718/32000/1
								<vspace blankLines="0"/>
								a=fmtp:98 layers=3
								<vspace blankLines="0"/>
								a=mid:2
								<vspace blankLines="0"/>
								a=depend:98 lay 1:97
								<vspace blankLines="1"/>
							</t>
						</list>
						Note that the dependency signaling described in <xref target="RFC5583"/>
						is used in the third example above to indicate the relationship between the
						layers distributed into separate RTP sessions.
					</t>
				</section>
			</section>
		</section>
		
		<section anchor="C-C" title="Congestion Control">
			<t>
				As a scalable codec, G.718 implicitly provides means for congestion
				control by providing a possibility for 'thinning' the bitstream. The
				RTP payload format according to this specification provides several
				different means for reducing the G.718 session bandwidth. The most
				appropriate mechanism (in terms of impact on the user experience)
				depends on the employed payload structure and also on the employed
				session configuration (single RTP session or multiple RTP sessions).
				The following means (in no particular order) can be used to assist
				congestion control procedures, either by the sender or by an
				intermediate node.
				<vspace blankLines="1"/>
				<list style="symbols">
					<t>
						The payloads carrying the EDUs representing the highest layers in a G.718 session can be dropped, 
						along with all associated transport blocks
						<vspace blankLines="1"/>
					</t>
					<t>
						The transport blocks carrying the EDUs representing the highest layers within the payload can be dropped
						<vspace blankLines="1"/>
					</t>
					<t>
						Transport blocks or payloads carrying EDUs belonging to
						redundant frames included in the payload can be dropped
						<vspace blankLines="1"/>
					</t>
				</list>
			</t>
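			<figure align="left" suppress-title="true">
				<preamble>
					As an illustration of the 'thinning' options listed above, the following sketch
					drops whole transport blocks from a payload. The representation of a transport
					block as a dictionary with the keys "layers" and "redundant" is hypothetical, and
					the sketch does not rewrite payload headers or split a transport block that mixes
					kept and dropped layers.
				</preamble>
				<artwork>
<![CDATA[
```python
def thin_payload(transport_blocks, max_layer, drop_redundant=False):
    """Return the transport blocks that survive thinning.

    transport_blocks  list of dicts such as
                      {"layers": [2, 3], "redundant": False}
    max_layer         highest layer number still to be forwarded
    drop_redundant    also drop blocks carrying redundant frames
    """
    kept = []
    for tb in transport_blocks:
        if drop_redundant and tb["redundant"]:
            continue
        # Drop a block only if every layer in it lies above the limit.
        if min(tb["layers"]) > max_layer:
            continue
        kept.append(tb)
    return kept
```
]]>
				</artwork>
				<postamble>
					A sender or an intermediate node could lower max_layer step by step until the
					session fits the available bandwidth.
				</postamble>
			</figure>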
		</section>

		<section anchor="S-C" title="Security Considerations">
			<t>
				RTP packets using the payload format defined in this specification
				are subject to the security considerations discussed in the RTP
				specification <xref target="RFC3550"/>, 
				and in any appropriate RTP profile (for
				example <xref target="RFC3551"/>
				or <xref target="RFC4585"/>).
				This implies that confidentiality
				of the media streams is achieved by encryption; for example, through
				the application of SRTP <xref target="RFC3711"/>.
				Because the data compression used
				with this payload format is applied end-to-end, any encryption needs
				to be performed after compression.
				<vspace blankLines="1"/>
				A potential denial-of-service threat exists for data encodings using
				compression techniques that have non-uniform receiver-end
				computational load.  The attacker can inject pathological datagrams
				into the stream that will increase the processing load of the decoder
				and may cause the receiver to be overloaded. For example, inserting
				additional EDUs representing the higher enhancement layers on top of
				the ones actually transmitted may increase the decoder load. However,
				the G.718 codec is not particularly vulnerable to such an attack,
				since the majority of the computational load in a G.718 session is
				associated with the encoder.  Another possible form of attack is the
				forging of codec bit-rate control messages, which may result in the
				encoder employing a higher number of enhancement layers than
				originally intended and thereby requiring a larger amount of
				computational resources. Therefore, the usage of data origin
				authentication and data integrity protection of at least the RTP
				packet is RECOMMENDED; for example, with SRTP <xref target="RFC3711"/>.
				<vspace blankLines="1"/>
				Note that the appropriate mechanism to ensure confidentiality and
				integrity of RTP packets and their payloads is very dependent on the
				application and on the transport and signaling protocols employed.
				Thus, although SRTP is given as an example above, other possible
				choices exist.
				<vspace blankLines="1"/>
				Note that end-to-end security with either authentication, integrity
				or confidentiality protection will prevent a network element not
				within the security context from performing media-aware operations
				other than discarding complete packets.  To allow any (media-aware)
				intermediate network element to perform its operations, it is
				required to be a trusted entity which is included in the security
				context establishment.
			</t>
		</section>

		<section title="IANA Considerations">
			<t>
				IANA is kindly requested to register a media type for the G.718 codec for RTP transport, as specified in 
				<xref target="M-T-R"/>
				of this document.
			</t>
		</section>
		
		<section title="Acknowledgements">
			<t>
				Thanks to Qin Wu for useful review and commentary.
			</t>
		</section>
	</middle>
	
	<back>
		<references title="Normative References">
			<reference anchor="AMR-WB">
				<front>
					<title>
						Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; General description
					</title> 
					<author>
						<organization>3GPP</organization> 
					</author>
					<date day="10" month="April" year="2001" /> 
				</front>
				<seriesInfo name="3GPP TS" value="26.171 5.0.0" /> 
				<format type="HTML" target="http://www.3gpp.org/ftp/Specs/html-info/26171.htm" /> 
			</reference>
			
			<reference anchor="ITU.G718.2008">
				<front>
					<title>Frame Error Robust Narrowband and Wideband Embedded Variable Bit-Rate Coding of Speech and Audio from 8-32 Kbit/s</title>
					<author>
						<organization>International Telecommunications Union</organization>
					</author>
					<date month="May" year="2008" />
				</front>
				<seriesInfo name="ITU-T" value="Recommendation G.718" />
			</reference>

			&rfc2119;
			&rfc3264;
			&rfc3550;
			&rfc3551;
			&rfc3711;
			&rfc4288;
			&rfc4566;
			&rfc4585;
			&rfc4855;
			&rfc4867;
			&rfc5583;
		</references>

		<references title="Informative References">
			
			<reference anchor="McCanne">
				<front>
					<title>Receiver-driven layered multicast</title> 
					<author fullname="Steven McCanne" initials="S.M." surname="McCanne">
						<organization>
							University of California, Berkeley and Lawrence Berkeley National Laboratory
						</organization> 
					</author>
					<author fullname="Van Jacobson" initials="V.J." surname="Jacobson">
						<organization>Network Research Group, Lawrence Berkeley National Laboratory</organization>
					</author>
					<author fullname="Martin Vetterli" initials="M.V." surname="Vetterli">
						<organization>University of California, Berkeley</organization>
					</author>
					<date month="October" year="1996"/> 
				</front>
				<seriesInfo name="ACM SIGCOMM Computer Communication Review" value="Volume 26 Issue 4"/> 
			</reference>

			&rfc2326;
			&rfc2974;
			&rfc3828;
			&rfc4340;
			&rfc5117;
			&rfc6051;
		</references>
		<section title="Payload Examples">
			<t>
				The G.718 payload structure enables flexible transport either by
				carrying all layers in the same payload or separating the layers into
				separate payloads. The following subsections illustrate different
				possibilities for transport by simple examples. Note that the examples do
				not show the full payload structure, in order to keep the illustrations simple.
			</t>
			<section title="Simple Payload Examples">
				<section title="All The Layers in The Same Payload">
				<figure align="left" suppress-title="true">
					<preamble>
						The illustration below shows layers L1-L3 from two encoded frames
						encapsulated into separate payloads, each using a single transport block.
					</preamble>
					<artwork>
<![CDATA[ +-------+--------+-----+------+------+------+
 | RTP1  | L-ID=3 |NF=0 |F1-L1 |F1-L2 |F1-L3 |
 +-------+--------+-----+------+------+------+

 +-------+--------+-----+------+------+------+
 | RTP2  | L-ID=3 |NF=0 |F2-L1 |F2-L2 |F2-L3 |
 +-------+--------+-----+------+------+------+]]>					
					</artwork>
				</figure>
				<figure align="left" suppress-title="true">
					<preamble>
						In the case where the same layers from two input frames are encapsulated into
						one payload using a single transport block, the structure is as shown below.
					</preamble>
					<artwork>
<![CDATA[ +-------+--------+-----+------+------+------+------+------+------+
 | RTP1  | L-ID=3 |NF=1 |F1-L1 |F2-L1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
 +-------+--------+-----+------+------+------+------+------+------+]]>					
					</artwork>
				</figure>
				<figure align="left" suppress-title="true">
					<preamble>
						The third example illustrates the case where the layers L1-L3 from
						two input frames are encapsulated into one payload using two separate
						transport blocks, the first one carrying L1 and the other one
						containing L2 and L3.
					</preamble>
					<artwork>
<![CDATA[ +-------+--------+-----+------+------+
 | RTP1  | L-ID=1 |NF=1 |F1-L1 |F2-L1 |
 +-------+--------+-----+------+------+------+------+
         | L-ID=7 |NF=1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
         +--------+-----+------+------+------+------+]]>					
					</artwork>
				</figure>
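				<figure align="left" suppress-title="true">
					<preamble>
						In the multi-frame examples of this section the EDUs within a transport block are
						grouped by layer, and within each layer they are ordered by frame. The following
						sketch generates that ordering; the function name is hypothetical, and the relation
						that the NF field equals the number of frames minus one is inferred from the
						figures rather than quoted from the payload header definition.
					</preamble>
					<artwork>
<![CDATA[
```python
def edu_order(layers, num_frames):
    """List EDU labels in the order used in the figures.

    layers      layer numbers carried in the transport block
    num_frames  number of frames in the block (NF + 1, as inferred
                from the examples)
    """
    return ["F{}-L{}".format(frame, layer)
            for layer in layers                      # layer by layer
            for frame in range(1, num_frames + 1)]   # then by frame
```
]]>
					</artwork>
					<postamble>
						For instance, edu_order([2, 3], 2) corresponds to the second transport block of the
						third example: F1-L2, F2-L2, F1-L3, F2-L3.
					</postamble>
				</figure>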
				</section>
				<section title="Layers in Separate RTP Streams">
					<t>
						In this case the data for each layer is transmitted in its own payload.
						<figure align="left" suppress-title="true">
							<preamble>
								In the first example each transport block including a single EDU is carried in its own RTP payload.
							</preamble>
							<artwork>
<![CDATA[ +-------+--------+-----+-----+    +-------+--------+-----+-----+
 | RTP1a | L-ID=1 |NF=0 |F1-L1|    | RTP1b | L-ID=6 |NF=0 |F1-L2|
 +-------+--------+-----+-----+    +-------+--------+-----+-----+

 +-------+--------+-----+-----+    +-------+--------+-----+-----+
 | RTP1c |L-ID=10 |NF=0 |F1-L3|    | RTP2a | L-ID=1 |NF=0 |F2-L1|
 +-------+--------+-----+-----+    +-------+--------+-----+-----+

 +-------+--------+-----+-----+    +-------+--------+-----+-----+
 | RTP2b | L-ID=6 |NF=0 |F2-L2|    | RTP2c |L-ID=10 |NF=0 |F2-L3|
 +-------+--------+-----+-----+    +-------+--------+-----+-----+]]>							
							</artwork>
						</figure>
						<figure align="left" suppress-title="true">
							<preamble>
								If the payloads carry data from two consecutive input frames, the
								same encoded data as in the previous example is arranged as follows.
							</preamble>
							<artwork>
<![CDATA[ +-------+--------+-----+-----+-----+
 | RTP1a | L-ID=1 |NF=1 |F1-L1|F2-L1|
 +-------+--------+-----+-----+-----+

 +-------+--------+-----+-----+-----+
 | RTP1b | L-ID=6 |NF=1 |F1-L2|F2-L2|
 +-------+--------+-----+-----+-----+

 +-------+--------+-----+-----+-----+
 | RTP1c |L-ID=10 |NF=1 |F1-L3|F2-L3|
 +-------+--------+-----+-----+-----+]]>							
							</artwork>
						</figure>
					</t>
				</section>
			</section>
			<section title="Advanced Examples">
				<section title="Different Update Rate for Subset of Layers">
					<figure align="left" suppress-title="true">
						<preamble>
							An example employing different update rates (i.e. different numbers of
							frames per packet) for selected subsets of layers. In this example
							all core codec layers L1-L5 are shown.
						</preamble>
						<artwork>
<![CDATA[ +-------+--------+-----+-----+-----+-----+-----+
 | RTP1  | L-ID=1 |NF=3 |F1-L1|F2-L1|F3-L1|F4-L1|
 +-------+--------+-----+-----+-----+-----+-----+

 +-------+--------+-----+-----+-----+-----+-----+
 | RTP2a | L-ID=7 |NF=1 |F1-L2|F2-L2|F1-L3|F2-L3|
 +-------+--------+-----+-----+-----+-----+-----+

 +-------+--------+-----+-----+-----+
 | RTP3a |L-ID=14 |NF=0 |F1-L4|F1-L5|
 +-------+--------+-----+-----+-----+

 +-------+--------+-----+-----+-----+
 | RTP3b |L-ID=14 |NF=0 |F2-L4|F2-L5|
 +-------+--------+-----+-----+-----+

 +-------+--------+-----+-----+-----+-----+-----+
 | RTP2b | L-ID=7 |NF=1 |F3-L2|F4-L2|F3-L3|F4-L3|
 +-------+--------+-----+-----+-----+-----+-----+

 +-------+--------+-----+-----+-----+
 | RTP3c |L-ID=14 |NF=0 |F3-L4|F3-L5|
 +-------+--------+-----+-----+-----+

 +-------+--------+-----+-----+-----+
 | RTP3d |L-ID=14 |NF=0 |F4-L4|F4-L5|
 +-------+--------+-----+-----+-----+]]>
						</artwork>
					</figure>
				</section>
				<section title="Redundant Frames With Limited Set of Layers">
					<figure align="left" suppress-title="true">
						<preamble>
							An example transmitting layers L1-L3 as primary data and L1 (of the
							previous frame) as redundant data is shown below. Each payload
							carries one primary (i.e. new) frame in one transport block and one
							redundant frame, which in this example is the frame preceding the
							primary frame, in another transport block.
						</preamble>
						<artwork>
<![CDATA[ +-------+--------+-----+-----+--------+-----+-----+-----+-----+
 | RTP1  | L-ID=1 |NF=0 |F0-L1| L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3|
 +-------+--------+-----+-----+--------+-----+-----+-----+-----+

 +-------+--------+-----+-----+--------+-----+-----+-----+-----+
 | RTP2  | L-ID=1 |NF=0 |F1-L1| L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3|
 +-------+--------+-----+-----+--------+-----+-----+-----+-----+

 +-------+--------+-----+-----+--------+-----+-----+-----+-----+
 | RTP3  | L-ID=1 |NF=0 |F2-L1| L-ID=3 |NF=0 |F3-L1|F3-L2|F3-L3|
 +-------+--------+-----+-----+--------+-----+-----+-----+-----+]]>					
						</artwork>
					</figure>
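					<figure align="left" suppress-title="true">
						<preamble>
							The benefit of this arrangement under packet loss can be sketched as follows. The
							sketch assumes, as in the figure above, that packet n carries layers L1-L3 of
							frame n as primary data and layer L1 of frame n-1 as redundant data; the function
							name and the set-based loss model are hypothetical.
						</preamble>
						<artwork>
<![CDATA[
```python
def decodable_layers(received_packets, num_frames):
    """Return {frame: layers available at the receiver}.

    received_packets  set of packet numbers that arrived
    num_frames        number of frames (and packets) sent
    """
    result = {}
    for frame in range(1, num_frames + 1):
        if frame in received_packets:
            result[frame] = [1, 2, 3]    # primary data arrived
        elif frame + 1 in received_packets:
            result[frame] = [1]          # redundant L1 in next packet
        else:
            result[frame] = []           # frame is lost
    return result
```
]]>
						</artwork>
						<postamble>
							A single lost packet therefore degrades the affected frame to layer L1 instead of
							losing it completely, at the cost of one frame of additional delay.
						</postamble>
					</figure>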
					<figure align="left" suppress-title="true">
						<preamble>
							Alternatively, a payload that also carries redundant data for a subset
							of layers can be arranged differently, as shown in the example below.
						</preamble>
						<artwork>
<![CDATA[ +-------+--------+-----+-----+-----+-----+--------+-----+-----+
 | RTP1  | L-ID=3 |NF=0 |F0-L1|F0-L2|F0-L3| L-ID=1 |NF=0 |F1-L1|
 +-------+--------+-----+-----+-----+-----+--------+-----+-----+

 +-------+--------+-----+-----+-----+-----+--------+-----+-----+
 | RTP2  | L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3| L-ID=1 |NF=0 |F2-L1|
 +-------+--------+-----+-----+-----+-----+--------+-----+-----+

 +-------+--------+-----+-----+-----+-----+--------+-----+-----+
 | RTP3  | L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3| L-ID=1 |NF=0 |F3-L1|
 +-------+--------+-----+-----+-----+-----+--------+-----+-----+]]>						
						</artwork>
						<postamble>
							Now the first transport block carries the primary data and the second
							transport block carries the redundant data, which in this case covers
							the frame following the primary frame. The benefit of this approach
							is that the redundant data is included in the last (secondary)
							transport block of the payload, which might be beneficial for
							possible payload scaling operation within the network.
						</postamble>
					</figure>
				</section>
			</section>
		</section>
	</back>
</rfc>
