<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
    which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
    There has to be one entity for each item to be referenced. 
    An alternate method (rfc include) is described in the references. -->

<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">

]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
    please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
    (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="no" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
    (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="std" docName="draft-isomaki-rtcweb-mobile-00" ipr="trust200902">
 <!-- category values: std, bcp, info, exp, and historic
    ipr values: trust200902, noModificationTrust200902, noDerivativesTrust200902,
       or pre5378Trust200902
    you can add the attributes updates="NNNN" and obsoletes="NNNN" 
    they will automatically be output with "(if approved)" -->

 <!-- ***** FRONT MATTER ***** -->

 <front>
   <!-- The abbreviated title is used in the page header - it is only necessary if the 
        full title is longer than 39 characters -->

  <title abbrev="RTCWeb for Mobile">RTCweb Considerations for Mobile Devices</title>

   <author initials='M.I.' surname="Isomaki" fullname='Markus Isomaki'>
    <organization abbrev="Nokia">Nokia</organization>
    <address>
      <postal>
	<street>Keilalahdentie 2-4</street>
      <code>FI-02150 Espoo</code>
	<country>Finland</country>
      </postal>
      <email>markus.isomaki@nokia.com</email>
    </address>
  </author>


   <date year="2012" />

   <!-- Meta-data Declarations -->

   <area>RAI</area>

   <workgroup>RTCWeb</workgroup>

   <keyword>RTCWeb</keyword>
   <keyword>mobile</keyword>


   <!-- Keywords will be incorporated into HTML output
        files in a meta tag but they have no effect on text or nroff
        output. If you submit your draft to the RFC Editor, the
        keywords will be used for the search engine. -->

   <abstract>
     <t>
       Web Real-time Communications (WebRTC) aims to provide web-based applications 
       with real-time and peer-to-peer communication capabilities. In many cases those
       applications run on mobile devices connected to different types of mobile
       networks. This document gives an overview of the issues and challenges in
       implementing and deploying WebRTC in mobile environments. It also gives
       guidance on how to overcome those challenges. 
     </t>
   </abstract>
 </front>

 <middle>
   <section title="Introduction">
     <t>
       Web Real-time Communications (WebRTC) provides web-based applications with real-time
       and peer-to-peer communication capabilities. The applications can set up communication
       sessions that can carry audio, video, or any application-specific data. To be
       reachable for incoming session setups or other messages, the applications must
       keep persistent connectivity with their "calling site". 
    </t>

     <t>
       In the last few years, mobile devices, such as smartphones or tablets, have become
       relatively powerful in terms of processing and memory. Their browsers are becoming
       close to their desktop counterparts. So, from that perspective, it is feasible
       to run WebRTC applications on them. However, power consumption and the highly diverse
       nature of the connectivity remain specific challenges. A lot of work is being done
       to address these challenges in e.g. radio technologies and hardware components, but
       by far the most important factor is still how the applications, protocols and
       application programming interfaces are designed.       
    </t>

     <t>
       Section 2 of this document gives an overview of the characteristics of different 
       mobile networks as background for further discussion. Section 3 introduces the specific
       issues that WebRTC protocols and applications should take into consideration to be
       mobile-friendly.     
    </t>

   <t>
     The current version of the document lacks references and many details. It may contain
     some errors. Its purpose is to draw attention to the topics it raises and to start discussion about them.
   </t>



   
   </section>

<section title="Common mobile networks and their properties">
  <t>
    The most relevant mobile networks for WebRTC at the moment are Wi-Fi and the different
    variants of cellular technologies. 
  </t>
   <t>
     Many characteristics of the cellular networks are covered in Section 3 in the context of the 
     particular issue under discussion. The following is a very brief description of the power
     consumption related properties of WCDMA/HSPA networks. The details vary, but similar principles
     apply to other cellular networks, at least GPRS/EDGE and LTE. 
   </t>

   <t>
     In simplified terms, the WCDMA/HSPA radio can be in three different types of states: the power-save
     state (IDLE, Cell_PCH, URA_PCH), a shared channel state (Cell_FACH) or a dedicated channel state
     (Cell_DCH). The power-save states consume about two orders of magnitude less power than the dedicated channel
     state, while the shared channel state is somewhere in the middle. The state machine works so that if
     a device has only small packets (up to ~200-500 bytes) to send or receive, it will be allocated a 
     shared channel, which operates at a low data rate. If there is more traffic (even a single full-size
     IP packet), a dedicated channel is allocated. Starting from the power-save state, the channel allocation
     typically takes somewhere between 0.5 and 2 seconds, depending on the network and the exact power-save state. 
     Only after that is the first packet actually sent. If two cellular devices were to exchange packets with each 
     other starting from the power-save state, the initial IP-level RTT could easily be 3-4 seconds. 
     </t>

     <t>
     The channel
     is kept for some time after the last packet has been sent or received. The dedicated channel drops to
     power-save via the shared channel. The timers from dedicated to shared and from shared to power-save are 
     network dependent, but typically somewhere between 5 and 30 seconds. So, in some networks sending a single
     ping every 30 seconds is enough to keep the power consumption constantly at the maximum level, while in
     others the power-save state is entered much faster. The total radio power consumption does not actually
     depend so much on the overall volume of traffic as on how long a dedicated or shared channel is active. So,
     for instance, a 1 kB keep-alive sent every 30 seconds for an hour (total ~100 kB of traffic) consumes much 
     more power (even an order of magnitude more!) than a single 10 MB download, assuming the download finishes in a minute or two.
   </t>
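   <figure>
     <preamble>
       The keep-alive versus download comparison above can be reproduced with a rough
       back-of-the-envelope model. The sketch below is illustrative only: it assumes that every
       burst of traffic holds the radio channel for the transfer time plus one combined inactivity
       timer, and it counts channel-held seconds rather than energy. The function name, the
       0.1 second keep-alive transfer time and the 90 second download time are assumptions made
       for this example, not measured values.
     </preamble>
     <artwork><![CDATA[
```python
def radio_active_seconds(bursts, tail_s):
    """Return the total seconds a radio channel is held for the given traffic.

    bursts: iterable of (start_s, duration_s) pairs, one per traffic burst.
    tail_s: assumed inactivity timer before the radio returns to power-save.
    """
    # Each burst keeps the channel up from its start until the tail timer expires.
    intervals = sorted((start, start + duration + tail_s)
                       for start, duration in bursts)

    total = 0.0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            # The radio reached power-save before this burst arrived:
            # account for the previous held period and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # The burst arrived while the channel was still held: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# 1 kB keep-alive every 30 s for an hour, each assumed to take 0.1 s on air.
keepalives = [(t, 0.1) for t in range(0, 3600, 30)]
# One 10 MB download, assumed to take 90 s.
download = [(0, 90.0)]

for tail in (10.0, 30.0):
    print(tail,
          radio_active_seconds(keepalives, tail),
          radio_active_seconds(download, tail))
```
]]></artwork>
     <postamble>
       Under these assumptions, with a 30 second tail the keep-alive pattern holds the channel for
       essentially the whole hour, while the download holds it for about two minutes. With a
       10 second tail the gap narrows but remains roughly an order of magnitude.
     </postamble>
   </figure>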

   <t>
     The applications have no control over the radio states, but the Operating System and the Radio Modem
     software have some influence over them. In the newer specifications (and devices and networks) it is possible
     for the device to explicitly ask for the radio channel to be released immediately after the last packet.
     For instance, if the device somehow knew that no new packets were to be sent for some time, it
     could send such a request and save power. 
   </t>

   <t>
     The bottom line is that applications and protocols should keep the intervals between traffic bursts as long as possible,
     giving the radio as much low-power time as possible. Intervals of more than a few seconds may help, depending on
     the network timers, and intervals longer than 30 seconds will definitely help. On the other hand, the initial
     RTT after an interval will be long. This issue is covered in Sections 3.1 and 3.2. 
   </t>

   <t>
     The other key characteristic of cellular networks is that they have long buffers and run the link layer in
     "acknowledged" mode, meaning all lost packets are retransmitted. This means TCP will easily create long delays
     and ruin real-time traffic. This is covered in Section 3.4. 
   </t>

  <t>
     The third characteristic is that mobile devices often change networks on the fly, typically between cellular and
     Wi-Fi. Most devices only run a single interface at a time. From a networking perspective this means that the 
     device's IP address changes, and e.g. all its TCP connections are lost. This is covered in Section 3.3.
  </t>
</section>

 <section title="Specific issues and how to deal with them">

   <section title="Persistent connectivity to the Calling Site">
   <t>
     Many WebRTC apps want to be reachable for incoming sessions (JSEP Offers) or other types of
     asynchronous messages. For this purpose they need some kind of a persistent communication
     channel with their "Calling Site". Two standard approaches for this are WebSockets and HTTP
     long-polling. In both of these cases a TCP connection is used as the underlying transport. 
   </t>
   <t>
     Most cellular networks have a firewall preventing incoming TCP connections, even when they
     allocate public IPv4 or IPv6 addresses. NATs are also becoming more common with the exhaustion
     of the IPv4 address space. The firewall and NAT timers for TCP can range between 1 and 60 minutes,
     depending on the network. To keep the TCP connection alive, the application needs to send
     keep-alive packets frequently enough to avoid the timeout.  
   </t>
   <t>
     If the WebRTC app intends to run for long periods of time (even when the user is not actively
     interacting with it), it is of utmost importance to keep this keep-alive traffic as infrequent
     as possible. Every wake-up of the radio consumes a significant amount of power, even if it is
     needed just for sending and receiving a couple of IP packets. It makes a huge difference whether there
     are, for instance, 6 or 60 of these wake-ups every hour. A naive application may want to make
     sure it sends keep-alives frequently enough for all possible networks. That leads to unacceptable
     power consumption. A smarter application will try to figure out a suitable
     keep-alive interval for the network it is currently using, and can save a lot of power in networks with longer
     timers.  
   </t>
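   <figure>
     <preamble>
       One way a "smarter" application could search for a suitable interval is sketched below.
       This is a minimal illustration, not a recommended algorithm: the class name, the 60 second
       starting point (assumed to be safe), the 3600 second ceiling and the doubling-then-bisecting
       strategy are all assumptions made for this example. The simulate() helper stands in for a
       real network with a fixed idle timeout.
     </preamble>
     <artwork><![CDATA[
```python
class KeepAliveTuner:
    """Search for the longest keep-alive interval that a network tolerates."""

    def __init__(self, start_s=60.0, ceiling_s=3600.0):
        self.known_good = start_s  # longest interval seen to keep the connection up
        self.known_bad = None      # shortest interval seen to lose the connection
        self.ceiling_s = ceiling_s

    def next_interval(self):
        """Interval to try next: double until a failure is seen, then bisect."""
        if self.known_bad is None:
            return min(self.known_good * 2, self.ceiling_s)
        return (self.known_good + self.known_bad) / 2

    def report(self, interval_s, survived):
        """Record whether the connection survived an idle period of interval_s."""
        if survived:
            self.known_good = max(self.known_good, interval_s)
        elif self.known_bad is None or interval_s < self.known_bad:
            self.known_bad = interval_s

    def converged(self, tolerance_s=30.0):
        """True once further probing would gain less than tolerance_s."""
        if self.known_bad is None:
            return self.known_good >= self.ceiling_s
        return self.known_bad - self.known_good <= tolerance_s


def simulate(network_timeout_s):
    """Drive the tuner against a pretend network with a fixed idle timeout."""
    tuner = KeepAliveTuner()
    while not tuner.converged():
        interval = tuner.next_interval()
        tuner.report(interval, survived=interval < network_timeout_s)
    return tuner.known_good


print(simulate(900))    # pretend network with a 15 minute timer
print(simulate(10000))  # pretend network with a timer above the ceiling
```
]]></artwork>
     <postamble>
       Every failed probe costs a reconnect, and a connection can be lost for reasons unrelated to
       the timer, so a real implementation would probe cautiously and cache the result per network.
     </postamble>
   </figure>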
   <t>
     There are further strategies to manage the keep-alives so that they consume the least amount of power.
     It is best to send keep-alive messages that are as small as possible. HSPA/WCDMA networks have a special
     shared radio channel (FACH) that can carry small amounts of traffic. Its power consumption is typically
     less than half that of the dedicated channel. Depending on the network, a packet of a couple of 
     hundred bytes will usually only require FACH, while a thousand-byte packet will require the dedicated channel
     to be activated. So, a WebSocket PING-PONG is better than an HTTP POST or GET with all the Cookies
     and other headers attached. If there are multiple applications or connections to be kept alive, the Browser
     or the underlying platform should offer some kind of synchronization for them, so that the radio is
     woken only once per cycle. 
   </t>
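   <figure>
     <preamble>
       The synchronization idea can be illustrated by rounding each keep-alive down to a shared
       wake-up grid, so that several connections send during the same radio wake-up. The sketch
       below only counts distinct wake-up times; the function names, the three example intervals
       and the 600 second grid are assumptions chosen for illustration.
     </preamble>
     <artwork><![CDATA[
```python
import math


def align_to_grid(due_s, grid_s):
    """Round a keep-alive due time down to the shared grid so it never fires late."""
    return math.floor(due_s / grid_s) * grid_s


def count_wakeups(intervals_s, horizon_s, grid_s=None):
    """Count distinct radio wake-ups within horizon_s seconds.

    intervals_s: maximum keep-alive interval of each connection.
    grid_s: shared wake-up period, or None for unsynchronized timers.
    """
    if grid_s is not None and grid_s > min(intervals_s):
        raise ValueError("grid must not exceed the shortest keep-alive interval")
    wake_times = set()
    for interval in intervals_s:
        due = interval
        while due <= horizon_s:
            fire = due if grid_s is None else align_to_grid(due, grid_s)
            wake_times.add(fire)
            # The next keep-alive is due one interval after the actual send.
            due = fire + interval
    return len(wake_times)


timers = [600, 700, 900]  # three connections with different timeouts
print(count_wakeups(timers, 3600))              # independent timers
print(count_wakeups(timers, 3600, grid_s=600))  # aligned to a shared grid
```
]]></artwork>
     <postamble>
       In this example the radio is woken 13 times per hour with independent timers and 6 times
       with the shared grid. The aligned connections send slightly more often than strictly
       necessary, which costs little compared to the saved wake-ups.
     </postamble>
   </figure>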

   <t>
     The most efficient approach would be to multiplex the initial incoming messages for all applications
     over the same TCP connection. This would require the use of some kind of gateway service in the network.
     Such "notification" services are available on many platforms, but at the moment they are not typically
     available for browsers or web applications. It would be useful to standardize or develop JavaScript APIs for this 
     purpose. There is W3C work on Server-sent events. Also, the Open Mobile Alliance (OMA) has started work on
     standardized "notification" services. Whether the services are standards-based or proprietary, the most important
     step would be to give WebRTC and other Web applications access to them. Such services are 
     always subject to privacy concerns, so at minimum the messages passed over them should be end-to-end
     encrypted. (Traffic analysis threats would still remain.)
   </t>


   </section>     

   <section title="Media and Data channels">   

   <t>
     Real-time media (audio, video) is typically sent and/or received constantly as long as the media channel
     is established. This means the radio needs to be on constantly, and there is little the application
     can do to preserve power. (Choosing a hardware-accelerated video codec over a non-HW-supported one
     is one thing the application may be able to influence.) At least in LTE there are techniques called
     Discontinuous Transmission/Reception (DTX, DRX), which operate even on a timescale of tens of 
     milliseconds and can affect power consumption e.g. for VoIP. It is an open issue whether WebRTC stacks can 
     be somehow optimized for them. 
   </t>

   <t>
     The Data Channel, however, may often be low-volume or even idle for long periods of time. For instance
     an IM connection may be idle for minutes or even hours. There can be many apps that want to keep
     such a connection available just in case there is some traffic to be sent or received infrequently.
     The WebRTC Data Channel is based on SCTP over DTLS over UDP. This means it needs keep-alives roughly
     every 30 seconds in cellular networks, meaning the radio will be active most of the time even if
     no user traffic is sent. Keeping such a channel open for a long time is not feasible
     due to power consumption. 
   </t>

   <t>
     Applications can choose different strategies to deal with this problem. One approach is to avoid Data
     Channels completely for low-volume or infrequent traffic and send it via the Web servers over HTTP or
     WebSockets. This is probably the best approach. The second approach is to tear down the Data Channel
     after some timeout and re-establish it
     only when new traffic needs to be sent. This may create some lag in sending the first message after
     the interval. The third option is to transport the Data Channel over TCP, e.g. using a yet undefined
     "HTTP tunneling fallback" mechanism. This would be almost identical to the first approach, except
     that logically the application would still be using a WebRTC Data Channel. It is not yet clear if this
     will be feasible due to ICE consent refreshes that may need to occur frequently as well (every 30 seconds?). 
     They are sent end-to-end, so one side of the Data Channel cannot by itself even affect their rate. 
   </t>


   </section>     

   <section title="Recovery from interface switching">
   <t>
     Most mobile platforms support Internet connectivity over only one interface at a time. In
     practice this is either a cellular or a Wi-Fi interface. From a radio hardware perspective there
     would be no need for such a limitation, but it is driven by simplicity and power preservation.
     The devices typically have a hard-coded or configurable priority order for different networks.
     The most common policy is that any known Wi-Fi network is always preferred over any cellular
     network, but more complex policies are also possible. 
   </t>

   <t>
     When the device detects a higher priority network than the one currently in use, it will by default
     attach to that network automatically. After a successful attachment to the new network, the device 
     turns the old network (and interface) off. On most platforms applications have no control over this. 
     In a typical situation the switch-over leads to a change of IP address. As a result, for instance, all
     TCP connections are disconnected, and any state tied to them needs to be recreated.
   </t>

   <t>
     It is important that WebRTC applications are made robust enough to survive this behavior. Many native
     applications deal with it by listening to "disconnect" and "reconnect" events through the APIs they
     are using. For a WebRTC app the first priority is to re-establish its "signaling" connectivity to the 
     "Calling Site". If that connectivity is based on a 
     WebSocket, the application needs to react to the "onerror" event through the WebSocket API, establish
     a new connection and set up all state related to it. (Say, if the application was using SIP over
     WebSockets, it might have to re-REGISTER on the SIP level.) If the disconnect was caused by interface
     switching and the switch-over succeeded cleanly, it would be possible to set up the new connection
     immediately. In some cases the disconnect could last longer, and the application would have to retry
     the connection until connectivity is regained. 
   </t>
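   <figure>
     <preamble>
       The retry behavior described above, an immediate attempt followed by repeated retries while
       connectivity is missing, might be structured as in the following sketch. It is an
       illustration under stated assumptions: connect is a hypothetical callable that opens the
       signaling connection and raises OSError while the network is unavailable, and the one
       second base delay, 60 second cap and random jitter are example choices rather than values
       taken from any specification.
     </preamble>
     <artwork><![CDATA[
```python
import itertools
import random
import time


def reconnect_delays(base_s=1.0, cap_s=60.0, rng=random.random):
    """Yield the wait before each reconnect attempt: none first, then capped backoff."""
    # A clean interface switch usually allows an immediate reconnect.
    yield 0.0
    delay = base_s
    while True:
        # Random jitter keeps many clients from retrying in lock-step.
        yield rng() * delay
        delay = min(delay * 2, cap_s)


def connect_with_retry(connect, max_attempts=10, sleep=time.sleep,
                       rng=random.random):
    """Call connect() until it succeeds or max_attempts is reached.

    connect is a hypothetical callable supplied by the application.
    """
    for attempt, delay in enumerate(reconnect_delays(rng=rng), start=1):
        sleep(delay)
        try:
            return connect()
        except OSError:
            if attempt >= max_attempts:
                raise


# Upper bounds of the first few waits (jitter fixed at its maximum).
print(list(itertools.islice(reconnect_delays(rng=lambda: 1.0), 6)))
```
]]></artwork>
     <postamble>
       Long waits between attempts also keep the radio idle while no network is available, which
       fits the power considerations discussed earlier in this document.
     </postamble>
   </figure>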

    <t>
     It would be advisable to make the
     reconnect step as lightweight as possible in terms of RTTs required. For the browser and the web
     application platform, it is important that the "disconnect" event gets propagated to the applications
     as fast as possible. 
    </t>

   <t>
     For HTTP long-polling, it would similarly be important to notice that the underlying
     TCP connection has become stale, and a new poll needs to be sent as quickly as possible. 
   </t>

   <t>
     The application may also attempt to update any peer-to-peer sessions it has at the time of the
     switch-over. At this point of RTCWeb standardization it is not yet clear how much control over this
     the protocols and APIs will offer. There are many layers on which the recovery can be done. It is 
     possible to try to deal with it using ICE. This would require knowing when the currently used ICE
     candidate becomes unusable because it is bound to a removed interface. The failure of ICE connectivity
     checks provides that information, but possibly after some delay. (Frequent connectivity checks are not
     an issue as long as media is actively sent or received, but would be costly over an idle or low-volume media 
     channel, such as a Data Channel. If media traffic is infrequent, the speed of detection may not be that
     critical for user experience anyway.) If an interface really became unusable, it would be better to have
     an explicit event to signal that all ICE candidates bound to it are likely unusable as well, so the
     application could act immediately. If a new interface became available, the application could restart
     ICE and start using the new candidates gathered. 
    </t>

    <t>
     The PeerConnection API offers a few events for these 
     purposes, at least "icechange" and "renegotiationneeded". With these the application can learn about
     problems with the currently used candidates. There is also a method "updateIce" by which the application
     can restart the ICE candidate gathering process. It is however not yet entirely clear how these event
     handlers and methods are best used to deal with an interface change, or whether they are even a 
     feasible tool for dealing with it. It is also important to note that no new offers or answers can be
     sent or received until the "signaling channel" (e.g. the WebSocket connection) has been re-established.
   </t>

   <t>
     If the lower-level instruments fail, the application could create a new PeerConnection, and recreate the
     media channels. This would be a heavier operation, but in some cases it might still be better than
     leaving the recovery entirely to the user, i.e. explicitly making a new call from the UI. 
   </t>

   <t>
     There are certain things that the underlying platform (Operating System, Connection Manager etc.) can 
     also implement to make interface switching smoother for the applications. One possibility would be to
     keep the old interface available for a short duration even after a new higher priority interface becomes
     available. This would allow applications to deal with the change in a more proactive fashion. There are
     also protocols such as Multipath TCP that could be used to switch e.g. WebSocket connections to a new
     interface without always relying on application support. 
   </t>

   </section>     

   <section title="Congestion avoidance">
   <t>
     Cellular mobile networks have notoriously large buffers. Their link layers also typically operate in an
     "acknowledged" mode, meaning that lost frames (or packets) are retransmitted. Retransmission creates
     head-of-line blocking in the queue. This means packets are seldom lost, but delays grow large. The individual
     users or endpoints are often isolated from each other so that the network capacity is divided among them
     more or less evenly. However, all traffic to and from the same endpoint ends up in the same queue. In the WebRTC
     context this means that plain TCP traffic will easily ruin real-time traffic due to the buffering. 
   </t>
   <t>
     WebRTC protocols should be designed to avoid this. If Data Channels transfer a lot of data in parallel with the
     real-time streams, they should not use the loss-driven (TCP) congestion control algorithms but something that
     reacts to queue growth much faster. The IETF LEDBAT WG may have something to offer for this case. If the browser
     wants to protect its real-time streams in general against all TCP (HTTP, WebSocket) traffic, it might be best for it to also
     restrict the number of simultaneous TCP connections in use, for instance to retrieve a website. The HTTP 2.0 work
     done in the IETF HTTPBIS WG should prove helpful in this case. 
   </t>
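   <figure>
     <preamble>
       To illustrate what reacting to queue growth rather than to loss can look like, the sketch
       below shows a simplified delay-based window update loosely modeled on the approach taken in
       LEDBAT. It is not a conforming implementation: delay filtering, loss handling and limits on
       window growth are omitted, and the function name, the 100 ms target, the 1200 byte segment
       size and the two-segment floor are assumptions made for this example.
     </preamble>
     <artwork><![CDATA[
```python
def delay_based_update(cwnd, bytes_acked, one_way_delay_s, base_delay_s,
                       target_s=0.1, gain=1.0, mss=1200):
    """Return a new congestion window (bytes) from one delay measurement.

    one_way_delay_s: latest measured one-way delay.
    base_delay_s: lowest delay seen so far, taken as the empty-queue delay.
    """
    # Estimated time packets currently spend waiting in network queues.
    queuing_delay = one_way_delay_s - base_delay_s
    # Positive below the target (grow), negative above it (shrink).
    off_target = (target_s - queuing_delay) / target_s
    cwnd += gain * off_target * bytes_acked * mss / cwnd
    # Never shrink below two segments so the flow keeps probing.
    return max(cwnd, 2 * mss)


print(delay_based_update(12000.0, 1200, 0.05, 0.05))  # empty queue: window grows
print(delay_based_update(12000.0, 1200, 0.25, 0.05))  # 200 ms queue: window shrinks
```
]]></artwork>
     <postamble>
       The sender backs off as soon as the measured delay rises above the target, well before
       the deep cellular buffers would start dropping packets.
     </postamble>
   </figure>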
   <t>
     Cellular networks also have built-in Quality of Service mechanisms that can be used to differentiate
     service for different packet flows. These are not widely used in HSPA/WCDMA, but LTE may change the situation
     to some extent. The QoS policy is enforced by the network, and requires a contract with the operator. It is thus
     likely only available for services with some relation to the access operator. How the WebRTC application or the 
     browser deals with that is TBD. Technically, DiffServ marking is probably the only dynamic approach to indicate
     the priority of a particular flow. 
   </t>
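   <figure>
     <preamble>
       For reference, DiffServ marking at the socket level can look like the sketch below, which
       is roughly what a browser or native media stack (not a web application) would do. The
       helper names are assumptions made for this example; the Expedited Forwarding code point is
       used purely as an illustration, and whether an access network honors or rewrites the
       marking is outside the sender's control.
     </preamble>
     <artwork><![CDATA[
```python
import socket

EF_DSCP = 46  # Expedited Forwarding code point (RFC 3246)


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper six bits of the IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def mark_socket(sock, dscp):
    """Ask the OS to mark outgoing IPv4 packets of this socket with the DSCP.

    Support varies by platform; some systems ignore or restrict this option.
    """
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


print(hex(dscp_to_tos(EF_DSCP)))
```
]]></artwork>
     <postamble>
       A media stack would call mark_socket() on the UDP socket carrying a real-time flow before
       sending, leaving bulk-data sockets unmarked.
     </postamble>
   </figure>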


   </section>     
    

</section>     
   
   <section anchor="Security" title="Security Considerations">
     <t>
       Not explicitly covered in this version. 
     </t>
     
     
   </section>

<!-- <section title="Additional contributors">  -->

<!-- </section> -->

<section anchor="Acknowledgements" title="Acknowledgements">

  <t>
   Bernard Aboba and G&#246;ran Eriksson provided useful comments on the document. Dan Druta has worked on Web notifications in
   the context of WebRTC. 
   </t>

  
</section> 


   <!-- Possibly a 'Contributors' section ... -->

   
 </middle>

 <!--  *****BACK MATTER ***** -->

 <back>
   <!-- References split into informative and normative -->

   <!-- There are 2 ways to insert reference entries from the citation libraries:
    1. define an ENTITY at the top, and use "ampersand character"RFC2629; here (as shown)
    2. simply use a PI "less than character"?rfc include="reference.RFC.2119.xml"?> here
       (for I-Ds: include="reference.I-D.narten-iana-considerations-rfc2434bis.xml")

    Both are cited textually in the same manner: by using xref elements.
    If you use the PI option, xml2rfc will, by default, try to find included files in the same
    directory as the including file. You can also define the XML_LIBRARY environment variable
    with a value containing a set of directories to search.  These can be either in the local
    filing system or remote ones accessed by http (http://domain/dir/... ).-->

   <references title="References">
     
   </references>


   <!-- Change Log

     -->
 </back>
</rfc>
