One document matched: draft-marjou-rtcweb-audio-codecs-for-interop-01.xml


<?xml version='1.0' ?>
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [      
  <!ENTITY RFC2119 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
  <!ENTITY RFC2616 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2616.xml'>
  <!ENTITY audio-codecs PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-rtcweb-audio.xml'>
<!ENTITY overview PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-rtcweb-overview.xml'>
<!ENTITY use-cases-and-requirements PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-rtcweb-use-cases-and-requirements.xml'>
]>


<?xml-stylesheet type="text/xsl" href="rfc2629.xslt"?> 
<rfc ipr='trust200811' category='info'>
<?rfc toc='yes'?>
<?rfc compact='yes'?>
<?rfc sortrefs='yes'?>

<front>
	<title abbrev='WebRTC audio codecs for interop'>
     WebRTC audio codecs for interoperability with legacy networks. 
	</title>
	
	<author initials='X.' surname='Marjou' fullname='Xavier Marjou'>
	<organization>France Telecom Orange</organization>
	<address>
		<postal>
			<street>2, avenue Pierre Marzin</street>
			<city>Lannion</city> 
			<code>22307</code>
			<country>France</country>
		</postal> 
      <email>xavier.marjou@orange.com</email>
	</address>
	</author>
	
	<author initials='S.' surname='Proust' fullname='Stephane Proust'>
	<organization>France Telecom Orange</organization>
	<address>
		<postal>
			<street>2, avenue Pierre Marzin</street>
			<city>Lannion</city> 
			<code>22307</code>
			<country>France</country>
		</postal> 
      <email>stephane.proust@orange.com</email>
	</address>
	</author>

	<author initials='K.' surname='Bogineni' fullname='Kalyani Bogineni'>
	<organization>Verizon Wireless</organization>
	<address>
      <email>Kalyani.Bogineni@VerizonWireless.com</email>
	</address>
	</author>

	<author initials='R.' surname='Jesske' fullname='Roland Jesske'>
	<organization>Deutsche Telekom AG</organization>
	<address>
      <email>R.Jesske@telekom.de</email>
	</address>
	</author>

	<author initials='B.' surname='Feiten' fullname='Bernhard Feiten'>
	<organization>Deutsche Telekom AG</organization>
	<address>
      <email>R.Jesske@telekom.de</email>
	</address>
	</author>




	<author initials='L.' surname='Miao' fullname='Lei Miao'>
	<organization>Huawei</organization>
	<address>
      <email>lei.miao@huawei.com</email>
	</address>
	</author>
	
	<author initials='E.' surname='Enrico' fullname='Marocco'>
	<organization>Telecom Italia</organization>
	<address>
      <email>enrico.marocco@telecomitalia.it</email>
	</address>
	</author>
	

	<author initials='E.' surname='Berger' fullname='Espen Berger'>
	<organization>Cisco</organization>
	<address>
      <email>espeberg@cisco.com</email>
	</address>
	</author>
	


	
	<date month='February' year='2013' />
	
	<area>Real-time Applications and Infrastructure Area</area>
	<keyword>WebRTC</keyword>
	<keyword>audio</keyword>
	<keyword>codec</keyword>
	<keyword>G.722</keyword>
	<keyword>AMR</keyword>
	<keyword>AMR-WB</keyword>

	<abstract>
	<t>This document presents use-cases underlining why WebRTC needs AMR-WB, AMR and G.722 as additional relevant voice codecs to satisfactorily ensure interoperability with existing systems. It also presents a way forward that takes into consideration the concerns expressed against the addition of codecs besides Opus and G.711.</t>

<t>It is especially recognized that unjustified additional costs on browsers must be avoided. Therefore, the proposed solution intends to fully rely on the codecs already supported on the devices implementing the browsers and for which license and implementation costs have been already paid.</t>

<t>It is expected that this way forward will significantly limit the costs and technical impacts on browsers while greatly improving interoperability with legacy systems and overall quality. It intents to be considered as a good compromise beneficial to all parties and to the whole industry: the user quality experience will be optimized as a whole at limited additional costs without incurring high costs for both networks to support transcoding and browsers to support additional codecs. </t>

	</abstract>
</front>

<middle>

  <section title="Introduction">

<t>As indicated in [I-D.ietf-rtcweb-overview], it has been anticipated that WebRTC will not remain an isolated island and that some WebRTC endpoints will need to communicate with devices used in other existing networks with the help of a gateway.</t>

<t>In order to reach the agreement to select OPUS and G.711 as mandatory to implement codecs it was also agreed to consider possible additional codecs in order to take into account the concerns expressed on interoperability issues. The discussion is consequently currently taking place regarding the additional voice/audio codecs that need to be supported. It is mainly questioned whether Opus and G.711 are sufficient to properly address the interoperability issues with legacy systems or if additional codecs need to be supported.</t>

<t>This document presents some use cases highlighting that Opus and G.711 are not sufficient to properly cover all interoperability requirements. In Section 4, important interoperability use-cases are presented describing the interoperability issues that would be encountered if only Opus and G.711 were supported. It therefore advocates for the addition of other codecs while addressing concerns raised against such support.</t>
 
<t>In section 5, a way forward is proposed that intends to be a real compromise taking into consideration the concerns expressed against additional codecs.  It is especially recognized that unjustified additional costs on browsers must be avoided. Therefore the proposed solution intends to strongly limit the cost and technical impact on browsers for the support of additional codecs (including license costs) while improving interoperability with legacy systems.</t>

<t>Regarding audio codecs, it is a common misconception that other existing voice networks only support G.711. Actually existing networks use circuit switched networks as well as voice-over-IP networks like H.323 and SIP-based networks, which means that audio codecs are not limited to G.711. For instance, from use cases described in [I-D.ietf-rtcweb-se-ucases-and-requirements], it can be foreseen that interoperability with mobile telephony systems will often happen. In such mobile systems, the UE must support the Adaptive Multi-Rate (AMR) speech codec, and if wideband speech communication is offered, the UE must support AMR wideband (AMR-WB) codec. An increasing number of customers are now experiencing high quality voice with HD Voice services over mobile, fixed networks or over the internet. For those customers, any fall back to the G.711 narrow band quality for interoperability purpose could be perceived as a strong and unacceptable quality degradation. Support of G.711 as the only codec for legacy interoperability purposes is currently not sufficient.</t>


</section>
 
	<section title="Terminology">
      
	<t> In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
   and "OPTIONAL" are to be interpreted as described in RFC 2119 <xref target="RFC2119" />.</t>
  
	</section>


        <section title="Definitions">
      
  
	<t>Legacy networks: In this draft, legacy networks encompass the conversational networks that are already deployed like the PSTN, the PLMN, the IMS, H.323 networks.</t>

	</section>


	<section title="Use cases">


	  <section title="AMR-WB">

		<section title="Use case">
			<t>The market of voice personal communication is driven by mobile terminals and WebRTC technology is expected to be increasingly used on smartphones. Furthermore "HD voice" is gaining momentum and more and more personal communication devices will support wideband communications. Customers are now getting used to the high quality offered by HD Voice mobile devices, CAT-iq fixed HD devices and eventually, HD Voice via WebRTC and OPUS over the internet.</t>
<t>Hence, many communications are expected to be held between a user of a WebRTC endpoint on a device integrating an AMR-WB module who wants to communicate with another user that can only be reached on a mobile device that only supports AMR-WB (and AMR); both endpoints support HD voice quality.</t>
		</section>

		<section title="Problem">
			<t>For this use case, the best situation will be to have the AMR-WB supported by both sides of the connection. Indeed, as mentioned in the introduction, AMR-WB is specified by 3GPP as the mandatory codec to be supported by wideband mobile terminals for a wide range of communication services as described in <xref target="AMR-WB" />. This includes the massively deployed circuit switched mobile telephony services and new multimedia telephony services over IP/IMS and 4G/VoLTE as specified by GSMA as voice IMS profile for VoLTE in IR92. Hence, AMR-WB is strongly increasing with deployment in more than 60 networks from 45 countries and more than 130 types of terminals (c.f. <xref target="Information-Papers" />).</t>
<t>In that use case, if OPUS and G.711 remain the only codecs supported by the WebRTC endpoints, a gateway must then transcode these codecs into AMR-WB, and vice-versa, in order to implement the use-case.  As a consequence, a high number of calls are likely to be affected by transcoding operations producing a degradation of the user quality experience for many customers. This will have a very significant business impact for all service providers on both sides, not only with respect to the transcoding costs but mainly with respect to user experience degradation.
</t>

<t>The drawbacks of transcoding are recalled below: 
<vspace blankLines="1" />
<list style="symbols">

<t>Cost issues: transcoding places important additional costs on network gateways for example codec implementation and license costs, deployments costs, testing/validation costs etc... However these costs can be seen as just transferred from the terminal side to the network side. The real issue is rather the degradation of the quality of service affecting the end user perceived quality which will be harmful to all concerned service providers.<vspace blankLines="1" /></t>

<t>Intrinsic quality degradation: Subjective test results show that intrinsic voice quality is significantly degraded by transcoding. The degradation is around 0.2 to 0.3 MOS for most of transcoding use cases with AMR-WB at 12.65 kbit/s. It should be stressed that if transcoding is performed between AMR-WB and G.711, wideband voice quality will be lost. Such bandwidth reduction effect clearly degrades the user perceived quality of service leading to shorter and less frequent calls (see ref_gsma). Such a switch to G.711 will not be accepted anymore by customers. If transcoding is performed between AMR-WB and OPUS, wideband communication could be maintained. However, as the WB codecs complexity is higher than NB codecs complexity, such WB transcoding is also more costly and degrades the quality: MOS scores of transcoding between AMR-WB 12.65kbit/s and OPUS at 16 kbit/s in both directions are significantly lower than those of AMR-WB at 12.65kbit/s or OPUS at 16 kbit/s. Furthermore, in degraded conditions, the addition of defects, like audio artifacts due to packet losses, and the audio effects resulting from the cascading of different packet loss recovery algorithms may result in a quality below the acceptable limit for the customers.<vspace blankLines="1" /></t>

<t>Degraded interactivity due to increased latency: Transcoding means full de-packetization for decoding of the media stream (including mechanisms of de-jitter buffering and packet loss recovery) then re-encoding, re-packetization and re-sending. The delays produced by all these operations are additive and may increase the end to end delay beyond acceptable limits like with more than 1s end to end latency.<vspace blankLines="1" /></t>
</list>
</t>

<t>In addition to these drawbacks related to transcoding, the following issue must be considered:<vspace blankLines="1" />
<list style="symbols" >
<t>Efficiency over mobile radio access: AMR-WB has been designed and extensively tested for optimized and robust usage over mobile radio access which results in enhanced capacity and efficiency. The mobile radio bearer is optimized for such a codec with channel coding protecting its most sensitive bits. Furthermore, AMR-WB is more efficient than OPUS at regular bit rates used for mobile communication of 12.65 kbit/s with fall back modes down to 6.6 kbit/s. Finally, hardware optimized implementation may allow for less battery consumption</t>
</list>
</t>
<t>As a consequence, re-using AMR-WB would be beneficial for the specific usage of WebRTC technology over mobile networks. With the strong increase of the smartphone market the capability to use such a mobile codec could strongly enforce and extend the market penetration of the Web RTC technology.</t>
		</section>	

	<section title="Concerns from the browser manufacturers">
			<t>The browser manufacturers are concerned by the additional costs that the implementation of AMR-WB would put on browsers which include integration and test costs and codec license costs. The proposed way forward in Section 5 takes carefully into account this concern.</t>
		</section>
	
	   </section>

	
<section title="AMR">


		<section title="Use case">
			<t>A user of a WebRTC endpoint on a device integrating an AMR module wants to communicate with another user that can only be reached on a mobile device that only supports AMR. Although more and more terminal devices are now "HD voice" and support AMR-WB a high number of legacy terminals supporting only AMR (terminals with no wideband / HD Voice capabilities) are still used.</t>
		</section>

		<section title="Problem">
			<t>For this use case, the best solution will be to have the AMR supported by both sides of the connection. Indeed, AMR is specified by 3GPP as the mandatory codec to be supported by any mobile terminal for a wide range of communication services. This includes the massively deployed circuit switched mobile telephony services and new multimedia telephony services over IP/IMS and 4G/VoLTE as specified by GSMA as voice IMS profile for VoLTE in IR92. Hundreds of millions of terminals are consequently currently supporting AMR and are not supporting OPUS nor G.711.</t>
<t>In that use case, if OPUS and G.711 remain the only codecs supported by the WebRTC endpoints the same problem as described in 4.1.1 will be experienced because of transcoding impacts (costs, quality degradation and increased latency) and lower efficiency over mobile radio access.</t> 
<t>As a consequence, re-using AMR would be beneficial for the specific usage of WebRTC technology over mobile networks. With the strong increase of the smartphone market the capability to use such a mobile codec could strongly enforce and extend the market penetration of the Web RTC technology.</t>
		
	   </section>


	<section title="Concerns from the browser manufacturers">

	<t>Same as in Section 4.1.3.</t>

	</section>
</section>


	  <section title="G.722">


		<section title="Use case">
			<t>As mentioned in Section 4.1.1, HD Voice is gaining momentum and more and more personal communication devices support wideband communications. Customers get used to high quality voice and WebRTC aims at providing high voice quality over internet.  In this use case, a user of a WebRTC endpoint on a device integrating G.722 module wants to communicate with another user that can only be reached on a device that only supports G.722 as a wideband codec, G.722 being specified by ETSI as the mandatory wideband codec for New Generation DECT (e.g. CAT-iq compliant).</t>
		</section>

		<section title="Problem">
			<t>For this use case, the best solution will be to have the G.722 supported by both sides of the connection.</t>
<t>Indeed, G.722 has been chosen by ETSI DECT to greatly increase the voice quality by extending the bandwidth from narrow band to wideband. Besides providing high wideband quality, it has low complexity and very low delay. It is widely used in HD fixed services in both hard and soft endpoints.</t>
<t>In this use case, if OPUS and G.711 remain the only codecs supported by the WebRTC endpoints, a gateway must then transcode them from and into G.722 in order to implement the use-case. As in Section 4.1.2, it should be stressed that if transcoding is performed between G.722 and G.711, wideband quality will be lost with fall back to narrow band. This will be perceived as a strong and unacceptable quality degradation by customers experiencing more and more wideband voice calls. It is also important to recall that wideband audio can help persons with hearing impairments to use voice communication over distance and drafts regulations dealing with this requires wide band audio wherever there is voice communication pointing at G.722 as the common codec at least to assure interoperability with wide-band audio between providers.</t>
<t>On the other hand, transcoding with OPUS will greatly increase the complexity, now, as mentioned above, G.722 low complexity was a key factor in many applications mandating G.722.</t>
		</section>		

<section title="Concerns from the browser manufacturers">

	<t>Unlike AMR and AMR-WB, G.722, as G.711, is royalty free. Some concerns about the availability of G.722 PLC were raised.  Indeed G.722 and G.711 were initially designed without Packet Loss Concealment (PLC); nevertheless, ITU-T did standardize such functionality as appendices to these recommendations to extend the capabilities of current systems to support new applications and to follow the market demand (for instance when these standards have been widely used in VoIP applications). It has been argued that, unlike the main recommendations, there are non-RF IPR declarations for these PLC appendices (G.711 Appendix I, G.722 Appendices III and IV). It should be recalled that these appendices are only examples and implementers are free to use any PLC solution. For instance in the G.722 case, there are publicly available PLC in the ITU-T Software Tool Library.</t>

	</section>

	   </section>

	</section>

	
 
	<section title="The proposed way-forward">
  	
	<t>It is proposed that the browser manufacturers re-use AMR, AMR-WB and G.722 codecs whenever they are already supported on the device on which the browsers are implemented.</t> 
<t>AMR and AMR-WB are already supported by millions of devices with license costs and technical costs (implementation, tests...) already paid for these codecs optimized for mobile usage.</t>
<t>Android now provides the APIs needed to give access to all the voice and audio features implemented on the hardware that are required to develop a voice service. This especially includes AMR and AMR-WB encoding and decoding, adaptive gain control, echo cancellation and noise reduction.</t>
<t>It is therefore technically feasible that browsers offer AMR and AMR-WB as additional RTC Web codecs without re-implementing these codecs and so with very limited additional costs:<vspace blankLines="1" />
<list style="symbols">
<t>For implementation, no codec software code has to be developed, tested or integrated directly in the browser (neither for the encoding/decoding nor for related audio functions).<vspace blankLines="1" /></t>
<t>For licensing, fees are already paid for the hardware implementation. Since no additional codec is implemented on the device with only one active voice call using the codec at a time it cannot be argued that additional license fees have to be paid in that case. However, in order to fully guarantee this, it is made explicit in the proposed text that the support of AMR and AMR-WB is conditional on the fact that no additional license fee is required.</t>
</list>
</t>
	
<t>This proposed way forward is expected to be a good compromise beneficial to all parties. It will optimize the user experience with limited additional costs: excluding high costs for networks to support  transcoding and high additional costs on browsers to support additional codecs.</t>

<t>It is consequently proposed the following wording to be added to Section 3 (codec requirements)  of  [I-D.ietf-rtcweb-audio-codecs]:
<vspace blankLines="1" />
<list style="empty">
<t>3.  Audio Codecs<vspace blankLines="1" /></t>

<t>3.1.  Required Codecs<vspace blankLines="1" /></t>

<t>To ensure a baseline level of interoperability between WebRTC
clients, a minimum set of required codecs are specified below.<vspace blankLines="1" /></t>

<t>WebRTC clients are REQUIRED to implement the following audio codecs.
<list style="symbols">
<t>Opus [RFC6716], with any ptime value up to 120 ms</t>

<t>G.711 PCMA and PCMU with one channel, a rate of 8000 Hz and a
ptime of 20 - see section 4.5.14 of [RFC3551]</t>
<t>Telephone Event - [RFC4733]<vspace blankLines="1" /></t>
</list></t>

<t>3.2.  Additional Codecs<vspace blankLines="1" /></t>

<t>To ensure an enhanced level of interoperability between WebRTC
clients, AMR-WB, AMR and G.722 codecs SHOULD
be implemented by WebRTC end-points to avoid transcoding costs and
quality degradations towards legacy fixed and mobile devices and allow interworking with enhanced voice quality (rather than fall back to G.711 narrow band voice).<vspace blankLines="1" /></t>

<t>WebRTC browsers on devices for which the implementation of AMR is mandatory for voice services MUST allow AMR to be negotiated and used at WebRTC level provided it is ensured that no additional license fees are required.<vspace blankLines="1" /></t>

<t>WebRTC browsers on wide-band devices for which the implementation of AMR-WB is mandatory for voice services MUST allow AMR-WB to be negotiated and used at WebRTC level provided it is ensured that no additional license fees are required.<vspace blankLines="1" /></t>

<t>WebRTC browsers devices for which the implementation of G.722 is mandatory for voice services MUST allow G.722 to be negotiated
and used at WebRTC level.<vspace blankLines="1" /></t>
</list>
</t>	

<t>Note: the wording of section "3.2.  Additional Codecs" is a first proposal and example to try to capture the general principle explained in this document to improve interoperability while limiting the cost impact on browsers. It is subject to further modifications to reach the best possible compromise.
</t>

	</section>

    
    
	<section title="Security Considerations">
  
		<t>
			
	  </t>

	</section>


	<section title="IANA Considerations">

	
		<t>None.</t>


	</section> 


	<section title="Acknowledgements">

		<t>Thanks to Milan Patel for his review.</t>

	</section> 


  
</middle>

<back>

<references title="Normative references">

	&RFC2119;
  	&RFC2616;  	
 
</references>

<references title="Informative references">

	&audio-codecs;
	&overview;
	&use-cases-and-requirements;

<reference anchor="AMR-WB">
	<front>
		<title>AMR-WB</title>
		<author>
			<organization>GSMA</organization>
		<address>
<uri>http://www.gsma.com/newsroom/wp-content/uploads/2012/03/ir3610.pdf</uri>
		</address>
		</author>
		<date year="2011"></date>
		
	</front>
</reference>	

<reference anchor="Information-Papers">
	<front>
		<title>Information Papers</title>
		<author>
			<organization>GSMA</organization>
		<address>
		<uri>http://www.gsacom.com/gsm_3g/info_papers</uri>
		</address>
		</author>
		<date year="2013"></date>
		
	</front>
</reference>	

</references>

</back>

</rfc>


 


PAFTECH AB 2003-20262026-04-24 02:37:52