One document matched: draft-ietf-avtext-multiple-clock-rates-05.xml


<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='http://xml.resource.org/authoring/rfc2629.xslt' ?>
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc3550 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml">
<!ENTITY rfc2629 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY rfc3551 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3551.xml">
<!ENTITY rfc3556 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3556.xml">
<!ENTITY rfc3611 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3611.xml">
<!ENTITY rfc5450 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5450.xml">
<!ENTITY rfc5484 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5484.xml">
<!ENTITY rfc5760 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5760.xml">
<!ENTITY I-D.ietf-avt-variable-rate-audio PUBLIC "" "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-avt-variable-rate-audio.xml">
<!ENTITY rfc6051 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6051.xml">
]>

<rfc category="std" docName="draft-ietf-avtext-multiple-clock-rates-05" ipr="trust200902" updates="3550">
	<front>
		<title abbrev="Multiple Clock Rates">Support for Multiple Clock Rates in an RTP Session</title>

		<author fullname="Marc Petit-Huguenin" initials="M.P.H" surname="Petit-Huguenin">
			<organization>Unaffiliated</organization>

			<address>
				<email>petithug@acm.org</email>
			</address>
		</author>
		
		<author fullname="Glen Zorn" initials="G." role="editor" surname="Zorn">
			<organization abbrev="Network Zen">Network Zen</organization>

			<address>
				<postal>
					<street>227/358 Thanon Sanphawut</street>

					<city>Bang Na</city>

					<region>Bangkok</region>

					<code>10260</code>

					<country>Thailand</country>
				</postal>

				<phone>+66 (0) 87-0404617</phone>

				<email>glenzorn@gmail.com</email>
			</address>
		</author>

		<date year="2012"/>
		
		<abstract>
			<t>
				This document clarifies the RTP specification when different clock rates are used in an RTP session.
				It also provides guidance on how to interoperate with legacy RTP implementations that use multiple clock rates.
			</t>
		</abstract>
	</front>

	<middle>
		<section title="Introduction">
			<t>
				The clock rate is a parameter of the payload format.
				It is often defined as been the same as the sampling rate but it is not always the case
				(see e.g. the G722 and MPA audio codecs <xref target="RFC3551"/>).
			</t>

			<t>
				An RTP sender can switch between different payloads during the lifetime of an RTP session and because clock rates are defined by payload types,
				it is possible that the clock rate also varies during an RTP session.
				Schulzrinne, et al. <xref target="RFC3550"></xref> lists using multiple clock rates as one of the reasons to not use different payloads on the same
				SSRC but unfortunately this advice was not always followed and some RTP implementations
				change the payload in the same SSRC even if the different payloads use different clock rates.
			</t>

			<t>
				This creates three problems:
				<list style="symbols">
					<t>
						The method used to calculate the RTP timestamp field in an RTP packet is underspecified.
					</t>
					<t>
						When the same SSRC is used for different clock rates,
						it is difficult to know what clock rate was used for
						the RTP timestamp field in an RTCP SR packet.
					</t>
					<t>
						When the same SSRC is used for different clock rates,
						it is difficult to know what clock rate was used for the interarrival
						jitter field in an RTCP RR packet.
					</t>
				</list>
			</t>

			<t>
				<xref target="fields"/> contains a non-exhaustive list of fields in RTCP packets that uses a clock rate as unit:
			</t>

			<texttable anchor="fields">
				<ttcol>Field name</ttcol>
				<ttcol>RTCP packet type</ttcol>
				<ttcol>Reference</ttcol>
				<c>RTP timestamp</c> <c>SR</c> <c>
<xref target="RFC3550"/>
</c>
				<c>Interarrival jitter</c> <c>RR</c> <c>
<xref target="RFC3550"/>
</c>
				<c>min_jitter</c> <c>XR Summary Block</c> <c>
<xref target="RFC3611"/>
</c>
				<c>max_jitter</c> <c>XR Summary Block</c> <c>
<xref target="RFC3611"/>
</c>
				<c>mean_jitter</c> <c>XR Summary Block</c> <c>
<xref target="RFC3611"/>
</c>
				<c>dev_jitter</c> <c>XR Summary Block</c> <c>
<xref target="RFC3611"/>
</c>
				<c>Interarrival jitter</c> <c>IJ</c> <c>
<xref target="RFC5450"/>
</c>
				<c>RTP timestamp</c> <c>SMPTETC</c> <c>
<xref target="RFC5484"/>
</c>
				<c>Jitter</c> <c>RSI Jitter Block</c> <c>
<xref target="RFC5760"/>
</c>
				<c>Median jitter</c> <c>RSI Stats Block</c> <c>
<xref target="RFC5760"/>
</c>
			</texttable>

			<t>
				This document first tries to list in <xref target="legacy"/> and subsections all
				of the algorithms known to be used in existing RTP implementations at the time of writing.
				These sections are not normative.
			</t>

			<t>
				<xref target="recommendations"/> and subsections then recommend a unique algorithm that modifies RFC 3550.
				These sections are normative.
			</t>

		</section>

		<section title="Terminology">
			<t>
				The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
				"RECOMMENDED",  "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 <xref target="RFC2119"/>.
				In addition, this document uses the following terms:
			</t>

			<t>
			<list hangIndent="15" style="hanging" >
				<t hangText="Clock rate">
						The multiplier used to convert from a wallclock value in seconds to an equivalent RTP timestamp
						value (without the fixed random offset).
						Note that RFC 3550 uses various terms like "clock frequency",
						"media clock rate", "timestamp unit", "timestamp frequency", and "RTP timestamp clock rate"
						as synonymous to clock rate.						
				</t>

				<t hangText="RTP Sender">
						A logical network element that sends RTP packets, sends RTCP SR packets, and receives RTCP RR packets.
				</t>

				<t hangText="RTP Receiver">
						A logical network element that receives RTP packets, receives RTCP SR packets, and sends RTCP RR packets.
					</t>
				</list>
			</t>
		</section>


		<section anchor="legacy" title="Legacy RTP">
			<t>
				The following sections describe the various ways legacy RTP implementations behave when multiple clock rates are used.
				Legacy RTP refers to RFC 3550 without the modifications introduced by this document.
			</t>

			<section anchor="method1" title="Different SSRC">
				<t>
					One way of managing multiple clock rates is to use a different SSRC for each different clock rate, as in this case there is no ambiguity on the clock rate used by fields in the RTCP packets.
					This method also seems to be the original intent of RTP as can be deduced from points 2 and 3 of section 5.2 of RFC 3550.
				</t>

				<t>
					On the other hand changing the SSRC can be a problem for some implementations designed to work only with unicast IP addresses, where having multiple SSRCs is considered a corner case.
					Lip synchronization can also be a problem in the interval between the beginning of the new stream and the first RTCP SR packet.
					This is not different than what happen at the beginning of the RTP session but it can be more annoying for the end-user.
				</t>
			</section>

			<section title="Same SSRC">
				<t>
					The simplest way of managing multiple clock rates is to use the same SSRC for all the payload types regardless of the clock rates.
				</t>

				<t>
					Unfortunately there is no clear definition on how the RTP timestamp should be calculated in this case.
					The following subsections present the algorithms used in the field.
				</t>

				<section anchor="method2" title="Monotonic timestamps">
					<t>
						This method of calculating the RTP timestamp ensures that the value increases monotonically.
						The formula used by this method is as follows:

<figure align="center"> <artwork> <![CDATA[
timestamp = previous_timestamp 
            + (current_capture_time - previous_capture_time)
            * current_clock_rate
]]></artwork></figure>
					</t>
					<t>
						The problem with this method is that the jitter calculation on the receiving
						side gives an invalid result during the transition between two clock rates,
						as shown in <xref target="monotonic"/>.
						The capture and arrival time are in seconds, starting at the beginning of the capture of the first packet;
						clock rate is in Hz; the RTP timestamp does not include the random offset;
						the transit, jitter, and average jitter use the clock rate as unit.
					</t>

					<texttable anchor="monotonic">
						<ttcol>Capt. time</ttcol>
						<ttcol>Clock rate</ttcol>
						<ttcol>RTP timestamp</ttcol>
						<ttcol>Arrival time</ttcol>
						<ttcol>Transit</ttcol>
						<ttcol>Jitter</ttcol>
						<ttcol>Average jitter</ttcol>
						<c>0</c>
<c>8000</c>
<c>0</c>
<c>0.1</c>
<c>800</c>
<c/>
<c/>
						<c>0.02</c>
<c>8000</c>
<c>160</c>
<c>0.12</c>
<c>800</c>
<c>0</c>
<c>0</c>
						<c>0.04</c>
<c>8000</c>
<c>320</c>
<c>0.14</c>
<c>800</c>
<c>0</c>
<c>0</c>
						<c>0.06</c>
<c>8000</c>
<c>480</c>
<c>0.16</c>
<c>800</c>
<c>0</c>
<c>0</c>
						<c>0.08</c>
<c>16000</c>
<c>800</c>
<c>0.18</c>
<c>2080</c>
<c>480</c>
<c>30</c>
						<c>0.1</c>
<c>16000</c>
<c>1120</c>
<c>0.2</c>
<c>2080</c>
<c>0</c>
<c>28</c>
						<c>0.12</c>
<c>16000</c>
<c>1440</c>
<c>0.22</c>
<c>2080</c>
<c>0</c>
<c>26</c>
						<c>0.14</c>
<c>8000</c>
<c>1600</c>
<c>0.24</c>
<c>320</c>
<c>720</c>
<c>70</c>
						<c>0.16</c>
<c>8000</c>
<c>1760</c>
<c>0.26</c>
<c>320</c>
<c>0</c>
<c>65</c>
					</texttable>

					<t>
						Calculating the correct transit time on the receiving side can be done by using the following formulas:
					</t>

					<t>
						<list style="numbers">
								<t>
										current_time_capture = current_timestamp - previous_timestamp) / current_clock_rate + previous_time_capture
								</t>
								<t>
										transit = current_clock_rate * (time_arrival - current_time_capture)
								</t>
								<t>
										previous_time_capture = current_time_capture
								</t>
						</list>
					</t>

					<t>The main problem with this method, in addition to the fact that the jitter calculation described in RFC 3550 cannot be used, is that is it dependent on the previous RTP packets, packets that can be reordered or lost in the network.</t>
				</section>

				<section anchor="method3" title="Non-monotonic timestamps">
					<t>An alternate way of generating the RTP timestamps is to use the following formula:</t>

<figure><artwork align="center"><![CDATA[
timestamp = capture_time * clock_rate
]]></artwork></figure>
						
					
					<t>
						With this formula,
						the jitter calculation is correct but the RTP timestamp values are no longer increasing monotonically as shown in
						<xref target="non-monotonic"/>.
						RFC 3550 states that "[t]he sampling instant MUST be derived from a clock that increments monotonically[...]"
						but nowhere says that the RTP timestamp must increment monotonically.
					</t>

					<texttable anchor="non-monotonic">
						<ttcol>Capt. time</ttcol>
						<ttcol>Clock rate</ttcol>
						<ttcol>RTP timestamp</ttcol>
						<ttcol>Arrival time</ttcol>
						<ttcol>Transit</ttcol>
						<ttcol>Jitter</ttcol>
						<ttcol>Average jitter</ttcol>
						<c>0</c>
<c>8000</c>
<c>0</c>
<c>0.1</c>
<c>800</c>
<c/>
<c/>
						<c>0.02</c>
<c>8000</c>
<c>160</c>
<c>0.12</c>
<c>800</c>
<c>0</c>
<c>0</c>
						<c>0.04</c>
<c>8000</c>
<c>320</c>
<c>0.14</c>
<c>800</c>
<c>0</c>
<c>0</c>
						<c>0.06</c>
<c>8000</c>
<c>480</c>
<c>0.16</c>
<c>800</c>
<c>0</c>
<c>0</c>
						<c>0.08</c>
<c>16000</c>
<c>1280</c>
<c>0.18</c>
<c>1600</c>
<c>0</c>
<c>0</c>
						<c>0.1</c>
<c>16000</c>
<c>1600</c>
<c>0.2</c>
<c>1600</c>
<c>0</c>
<c>0</c>
						<c>0.12</c>
<c>16000</c>
<c>1920</c>
<c>0.22</c>
<c>1600</c>
<c>0</c>
<c>0</c>
						<c>0.14</c>
<c>8000</c>
<c>1120</c>
<c>0.24</c>
<c>800</c>
<c>0</c>
<c>0</c>
						<c>0.16</c>
<c>8000</c>
<c>1280</c>
<c>0.26</c>
<c>800</c>
<c>0</c>
<c>0</c>
					</texttable>

					<t>
						The advantage with this method is that it works with the jitter calculation described in RFC 3550, as long as the correct clock rates are used.
						It seems that this is what most implementations are using.
					</t>
				</section>
			</section>
		</section>

<section anchor="recommendations" title="Recommendations">
	<t>
		The following subsections describe behavioral recommendations for
		RTP senders (with and without RTCP) and RTP recievers
	</t>
	<section anchor="SendRTCP" title="RTP Sender (with RTCP)">
			<t>
				An RTP Sender with RTCP turned on MUST use a different SSRC for each different clock rate.
				An RTCP BYE MUST be sent and a new SSRC MUST be used if the clock rate switches back to a value already seen in the RTP stream.
			</t>
		
			<t>
				To accelerate lip synchronization, the next compound RTCP packet sent by the RTP sender MUST contain multiple SR packets,
				the first one containing the mapping for the current clock rate and the next SR packets
				containing the mapping for the other clock rates seen during the last period.
			</t>
		
			<t>
				The RTP extension defined in Perkins & Schierl <xref target="RFC6051"/> MAY be used to accelerate the synchronization.
			</t>
	</section>
	<section anchor="SendNoRTCP" title="RTP Sender (without RTCP)">
		<t>
			An RTP Sender with RTCP turned off (i.e. by setting the RS and RR bandwidth modifiers
			<xref target="RFC3556"/> to 0) SHOULD use a different SSRC for each different clock rate
			but MAY use different clock rates on the same SSRC as long as the RTP timestamp without the random offset is calculated as explained below:
			<vspace blankLines="1"/>
			Each time the clock rate changes, the start_offset and capture_start values are calculated with the following formulas:
			
			<figure align="left">
				<artwork align="center"><![CDATA[
start_offset = (capture_time - capture_start)
               * previous_clock_rate
capture_start = capture_time
				]]></artwork>
			</figure>
		
			For the first RTP packet, the values are initialized with the following values:
			
			<figure align="left">
				<artwork align="center"><![CDATA[
start_offset = 0
capture_start = capture_time
				]]></artwork>
			</figure>
		
			After eventually updating these values, the RTP timestamp is calculated with the following formula:
			<figure align="left">
				<artwork align="center"><![CDATA[
timestamp = (capture_time - capture_start) * clock_rate
            + start_offset
				]]></artwork>
			</figure>
		</t>
		
		<t>
			Note that in all the formulas, capture_time is the first instant the new timestamp rate is used.
		</t>
	</section>
	

			<section title="RTP Receiver">
				<t>
					An RTP Receiver MUST calculate the jitter using the following formula:
				</t>

				<t>
				<figure align="left">
					<artwork align="center"><![CDATA[
D(i,j) = (arrival_time_j * clock_rate_i - timestamp_j)
         - (arrival_time_i * clock_rate_i - timestamp_i)
				    ]]></artwork>
				</figure>
				</t>

				<t>
					An RTP Receiver MUST be able to handle a compound RTCP packet with multiple SR packets.
				</t>

				<t>
					For interoperability with legacy RTP implementations,
					an RTP receiver MAY use the information in two consecutive SR packets to calculate the clock rate used,
					i.e. if Ni is the NTP timestamp for the SR packet i,
					Ri the RTP timestamp for the SR packet i and Nj and Rj the NTP timestamp and RTP timestamp for the previous SR packet j,
					then the clock rate can be guessed as the closest to (Ri - Rj) / (Ni - Nj).
				</t>
			</section>
		</section>

		<section title="Security Considerations">
			<t>
				This document is not believed to effect
				the security of the RTP sessions described here in any way.
			</t>
		</section>

		<section title="IANA Considerations">
			<t>This document requires no IANA actions.</t>
		</section>

		<section title="Acknowledgements">
			<t>Thanks to Colin Perkins, Ali C. Begen,
			Harald Alvestrand and Magnus Westerlund for their comments, suggestions and questions that helped to improve this document.</t>
			<t>Thanks to Robert Sparks and the attendees of SIPit 26 for the survey on multiple clock rates interoperability.</t>
			<t>This document was written with the xml2rfc tool described in Rose <xref target="RFC2629"/>.</t>
		</section>
	</middle>

	<back>
		<references title="Normative References">
				&rfc2119;
				&rfc3550;
		</references>

		<references title="Informative References">
			&rfc2629;
			&rfc3551;
			&rfc3556;
			&rfc3611;
			&rfc5450;
			&rfc5484;
			&rfc5760;
			&rfc6051;
			&I-D.ietf-avt-variable-rate-audio;
		</references>

		<section title="Using a Fixed Clock Rate">
			<t>
				An alternate way of fixing the multiple clock rates issue was proposed in <xref target="I-D.ietf-avt-variable-rate-audio"/>.
				This document proposed to define a unified clock rate, but the proposal was rejected at IETF 61.
			</t>
		</section>

		<section title="Behavior of Legacy Implementations">
			<section title="libccrtp 2.0.2">
				<t>This library uses the formula described in <xref target="method3"/>.</t>

				<t>Note that this library uses gettimeofday(2) which is not guaranteed to increment monotonically, like when the clock is adjusted by NTP.</t>
			</section>

			<section title="libmediastreamer0 2.6.0">
				<t>This library (which uses the oRTP library) uses the formula described in <xref target="method3"/>.</t>

				<t>Note that in some environments this library uses gettimeofday(2) which is not guaranteed to increment monotonically.</t>
			</section>

			<section title="libpjmedia 1.0">
				<t>This library uses the formula described in <xref target="method3"/>.</t>
			</section>

			<section title="Android RTP stack 4.0.3">
				<t>This library changes the SSRC each time the format changes, as described in <xref target="method1"/>.</t>
			</section>
		</section>
	</back>
</rfc>

PAFTECH AB 2003-20262026-04-23 14:21:11