<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc iprnotified="yes" ?>
<?rfc strict="yes" ?>
<?rfc compact="yes" ?>
<?rfc sortrefs="no" ?>
<?rfc colonspace="yes" ?>
<?rfc rfcedstyle="no" ?>
<?rfc tocdepth="4"?>
<rfc category="std" docName="draft-jennings-rtcweb-signaling-00"
     ipr="pre5378Trust200902">
  <front>
    <title abbrev="ROAP">RTCWeb Offer/Answer Protocol (ROAP)</title>

    <author fullname="Cullen Jennings" initials="C." surname="Jennings">
      <organization>Cisco</organization>

      <address>
        <postal>
          <street>170 West Tasman Drive</street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>USA</country>
        </postal>

        <phone>+1 408 421-9990</phone>

        <email>fluffy@cisco.com</email>
      </address>
    </author>

    <author fullname="Jonathan Rosenberg" initials="J.R." surname="Rosenberg">
      <organization>jdrosen.net</organization>

      <address>
        <email>jdrosen@jdrosen.net</email>

        <uri>http://www.jdrosen.net</uri>
      </address>
    </author>

    <date day="14" month="October" year="2011" />

    <area>RAI</area>

    <abstract>
      <t>This document describes a protocol used to negotiate media between
      browsers or other compatible devices. This protocol provides the state
      machinery needed to implement the offer/answer model (RFC 3264), and
      defines the semantics and necessary attributes of messages that must be
      exchanged. The protocol uses an abstract transport in that it does not
      actually define how these messages are exchanged. Rather, such exchanges
      are handled through web-based transports like HTTP or WebSockets. The
      protocol focuses solely on media negotiation and does not handle call
      control, call processing, or other functions.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>This specification defines a protocol that allows an RTCWeb browser
      to exchange information to control the set up of media to another
      browser or device. The scope of this protocol is limited to
      functionality required for the setup and negotiation of media and the
      associated transports, referred to as media control. The protocol
      defines the minimum set of messages and state machinery necessary to
      implement the offer/answer model as defined in <xref
      target="RFC3264"></xref>. The offer/answer model specifies rules for the
      bilateral exchange of Session Description Protocol (SDP) messages <xref
      target="RFC4566"></xref> for creation of media streams.</t>

      <t>The protocol specified here defines the state machines, semantic
      behaviors, and messages that are exchanged between instances of the
      state machines. However, it does not specify the actual on-the-wire
      transport of these messages. Rather, it assumes that the implementation
      of this protocol would occur within the browser itself, and then browser
      APIs would allow the application's JavaScript to request creation of
      messages and insert messages into the state machine. The actual transfer
      of these messages would be the responsibility of the web application,
      and would utilize protocols such as HTTP and WebSockets. To facilitate
      implementation within a browser, JSON notation is used to describe the
      messages <xref target="RFC4627"></xref>.</t>

      <t>The protocol defined here covers media control, but does not provide
      any call control. Concepts like ringing of phones, user search, call
      forwarding, redirection, transfer, hold, and so on, are all the domain
      of call processing and are out of scope for this specification. It is
      assumed that the application running within the browser provides any
      call control based on the needs of the application, the scope of which
      is not a matter of standardization.</t>

      <t>Despite the fact that it has an abstract transport, ROAP is still a
      protocol. This means it has state machines, and it has rules governing
      the behavior of those state machines which guarantee that the system
      operates properly based on any set of inputs. It is assumed that this
      state machinery is implemented in the browser and thus immutable by the
      application, which can then guarantee proper behavior regardless of the
      operation of the resident JavaScript. This provides an important layer
      of protection.</t>

      <t>The protocol is designed to operate between two entities (browsers
      for example), which exchange messages "directly" - meaning that a
      message output by one entity is meant to be directly processed by the
      other entity without further modification. In practice, this means that
      a web server can treat ROAP messages as opaque and just shuffle them
      between browser instances. This allows for simple implementations.
      However, more powerful applications can be built in which the web server
      or JavaScript can modify the messages in order to provide more complex
      features. As long as those modifications produce messages compliant to
      this specification, <xref target="RFC3264">SDP Offer/Answer</xref>,
      <xref target="RFC4566">SDP</xref>, <xref target="RFC5245">ICE</xref> and
      any other dependencies, the modifications are permissible.</t>

      <t>This protocol is designed for two major use cases:</t>

      <t><list style="symbols">
          <t>Browser to browser</t>

          <t>Browser to SIP device via a SIP gateway</t>
        </list></t>

      <t>In the browser to SIP use case, the gateway obviously needs to be
      somewhat more sophisticated. However, because this design is a small
      subset of the design space covered by SIP <xref
      target="RFC3261"></xref>, it is intended to be simple to translate to
      and from SIP via a signalling gateway. Moreover, many of the elements in
      messages have clear mappings to elements in SIP messages, thus allowing
      simple, stateless translation.</t>
    </section>

    <section title="Requirements and Design Goals">
      <t>There has been extensive debate about the best architecture for
      RTCWeb signaling. To a great extent this decision is dictated by the
      requirements that the signaling mechanism is intended to fit. The
      protocol in this document was designed to minimize the amount of
      implementation effort required outside the browser and RTC-Web signaling
      gateways. This implies the following requirements:</t>

      <t><list style="symbols">
          <t>It should be possible to develop a simple browser to browser
          voice and video service in a small amount of code. In particular, it
          MUST be possible to implement a functional service such that:</t>

          <t><list style="symbols">
              <t>The web service maintains only transaction state, not call
              state;</t>

              <t>In the browser to browser case, the web server can simply
              pass protocol messages between the browser agents without
              examining or modifying them;</t>

              <t>The service operates without needing to examine the details
              of the browser capabilities (e.g., new codecs should be
              automatically accommodated without modifying either the service
              or the associated JS).</t>
            </list></t>

          <t></t>

          <t>It should be possible to implement a simple RTC-Web gateway
          that:</t>

          <t><list style="symbols">
              <t>Connects to legacy SIP devices ranging from multiscreen video
              phones to PSTN gateways;</t>

              <t>Has a deterministic mapping between RTC-Web messages and SIP
              messages;</t>

              <t>Permits the mechanical translation of messages without
              knowledge of the details of all the browser capabilities;</t>

              <t>Maintains only transaction state, not call state; and</t>

              <t>Does not need to send or receive the media (unless also
              acting as a relay or a translator for codecs which are not
              jointly supported).</t>
            </list></t>
        </list></t>

      <t>Finally it seems clear that SDP is too complicated to reinvent, so
      despite its manifest deficiencies we opt to take it as-is rather than
      trying to reinvent it.</t>
    </section>

    <section title="Terminology">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT",
      "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
      interpreted as described in <xref target="RFC2119"></xref>.</t>

      <t>This draft uses the API and terminology described in <xref
      target="webrtc-api"></xref>.</t>
    </section>

    <section title="Protocol Overview">
      <t>We start with a simple example. Consider the case where browser A
      wishes to set up a media session with browser B. At a high level, A
      needs to communicate the following information:</t>

      <t><list style="symbols">
          <t>This is a new media session and not an update to a different
          session.</t>

          <t>Here is my SDP offer, including media parameters and ICE
          candidates.</t>
        </list></t>

      <t>The OFFER message is used to carry this information. For example, A
      might send B:</t>

      <figure>
        <artwork><![CDATA[
{
  "messageType":"OFFER",
  "offererSessionId":"13456789ABCDEF",
  "seq": 1,
  "sdp":"
v=0\n
o=- 2890844526 2890842807 IN IP4 192.0.2.1\n
s= \n
c=IN IP4 192.0.2.1\n
t=2873397496 2873404696\n
m=audio 49170 RTP/AVP 0"
}]]></artwork>
      </figure>

      <t>The messageType field indicates that this is an OFFER and the
      offererSessionId indicates the media session that this OFFER is
      associated with. B can tell that this is for a new media session because
      it contains an offererSessionId that B has not seen before. The sdp
      field contains the offer itself, which is just an ordinary SDP offer
      rendered as a string.</t>

      <t>If B elects to start a media session, B responds with an ANSWER
      message containing SDP, as shown below.</t>

      <figure>
        <artwork><![CDATA[
{
  "messageType":"ANSWER",
  "offererSessionId":"13456789ABCDEF",
  "answererSessionId":"abc1234356",
  "seq": 1,
  "sdp":"
v=0\n
o=- 2890844526 2890842807 IN IP4 192.0.2.3\n
s= \n
c=IN IP4 192.0.2.3\n
t=2873397496 2873404696\n
m=audio 49175 RTP/AVP 0"
}
]]></artwork>
      </figure>

      <t>The contents of this message are more or less the same as those in
      the OFFER, except that B also includes an answererSessionId to uniquely
      identify the session from B's perspective. The combination of
      offererSessionId and answererSessionId uniquely identifies this
      session.</t>

      <t>Finally, in order to confirm that A has seen B's ANSWER, A responds
      with an OK message.</t>

      <figure>
        <artwork><![CDATA[
{
  "messageType":"OK",
  "offererSessionId":"13456789ABCDEF",
  "answererSessionId":"abc1234356",
  "seq": 1
}
]]></artwork>
      </figure>

      <t>Note that all of these messages contain a seq field which contains a
      transaction sequence number. The seq field makes it possible to
      correlate messages which belong to the same transaction, as well as to
      detect duplicates, which is described later in section <xref
      target="sec-reliability"></xref>.</t>

      <t>An object with a messageType value of "OFFER" will always contain an
      SDP offer, and an object with a messageType value of "ANSWER" will
      always contain an SDP answer. The complete list of message types is
      defined in <xref
      target="sec-syntax"></xref>. Only a small number of messages are
      permitted and much of the message set is devoted to error handling.</t>
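
      <t>As an illustration only, the three messages of the exchange above
      could be assembled as follows. This is a minimal sketch, not part of
      the protocol: the helper names (make_offer, make_answer, make_ok) are
      hypothetical, and the SDP strings are shortened placeholders rather
      than complete session descriptions.</t>

      <figure>
        <artwork><![CDATA[
```python
import json


def make_offer(offerer_session_id, seq, sdp):
    """Build an OFFER. A new session has no answererSessionId yet."""
    return {
        "messageType": "OFFER",
        "offererSessionId": offerer_session_id,
        "seq": seq,
        "sdp": sdp,
    }


def make_answer(offer, answerer_session_id, sdp):
    """Build the ANSWER for an OFFER, echoing its session ID and seq."""
    return {
        "messageType": "ANSWER",
        "offererSessionId": offer["offererSessionId"],
        "answererSessionId": answerer_session_id,
        "seq": offer["seq"],
        "sdp": sdp,
    }


def make_ok(answer):
    """Build the OK that acknowledges an ANSWER; it carries no SDP."""
    return {
        "messageType": "OK",
        "offererSessionId": answer["offererSessionId"],
        "answererSessionId": answer["answererSessionId"],
        "seq": answer["seq"],
    }


# Placeholder SDP for illustration; a real offer comes from the browser.
offer = make_offer("13456789ABCDEF", 1, "v=0\nm=audio 49170 RTP/AVP 0")
answer = make_answer(offer, "abc1234356", "v=0\nm=audio 49175 RTP/AVP 0")
ok = make_ok(answer)

# json.dumps escapes the newlines inside the SDP string.
wire = json.dumps(offer)
```
]]></artwork>
      </figure>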

      <!--  FLUFFY
     
     <t>In building web systems it is often useful for a request to contain
      some state that is passed back in future messages. This system includes
      two types of state: session state and request state. If a browser
      receives a message that contains state in a setSessionState attribute,
      any future messages it sends that have the same offererSessionId MUST
      include this state in a sessionState attribute. Similarly if a request
      contains an setResponseState attribute, that state MUST be included in
      any response to that request in a responseState attribute. </t>-->

      <t>Once a session has been set up, additional rounds of offer/answer can
      be sent using the OFFER/ANSWER/OK sequence. Note that the seq attribute
      makes it easy to differentiate these additional rounds from the initial
      exchange and from each other.</t>
    </section>

    <section anchor="sec-syntax" title="Semantics &amp; Syntax">
      <section anchor="sec-reliability" title="Reliability Model">
        <t>ROAP messages are typically carried over a reliable transport
        (likely HTTP via XMLHttpRequest or WebSockets), so the chance of
        message loss is low (though non-zero), provided that the signaling
        service is up. However, the common web reliability and scalability
        model is based on the principle that transactions are idempotent and
        that requests can just be discarded and will be retried. A retry of a
        transaction might happen if a given host was down and the DNS round
        robin approach wanted to move to the next server, or if a server was
        overloaded, or if there was a hiccup in the network. Web applications
        that want to work well need to deal with these issues to get the
        advantages of the general web design pattern for scalability and
        reliability.</t>

        <t>To support this web model in this protocol, OFFER and ANSWER
        messages are retried by the client until they are acknowledged end to
        end with an ANSWER or OK. The combination of the session IDs and seq
        allows the browser to detect and discard duplicate requests.</t>
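
        <t>One possible way to make retried messages idempotent is sketched
        below. It is an assumption of this sketch, not a requirement of the
        protocol, that the receiver keys each transaction on the message
        type, both session IDs and the seq, and that it replays the stored
        reply for a duplicate. The class and function names are
        hypothetical, and a real implementation would also need to expire
        old entries.</t>

        <figure>
          <artwork><![CDATA[
```python
class DuplicateFilter:
    """Remembers the reply to each transaction so retries get it again."""

    def __init__(self):
        self._replies = {}

    @staticmethod
    def _key(msg):
        # answererSessionId is absent on the first OFFER of a session.
        return (
            msg["messageType"],
            msg["offererSessionId"],
            msg.get("answererSessionId"),
            msg["seq"],
        )

    def handle(self, msg, process):
        """Process a message once; a duplicate gets the stored reply."""
        key = self._key(msg)
        if key not in self._replies:
            self._replies[key] = process(msg)
        return self._replies[key]


processed = []


def answer_offer(offer):
    """Hypothetical handler that records how often it really runs."""
    processed.append(offer["seq"])
    return {
        "messageType": "ANSWER",
        "offererSessionId": offer["offererSessionId"],
        "answererSessionId": "b-1",
        "seq": offer["seq"],
        "sdp": "v=0",
    }


dedup = DuplicateFilter()
first_offer = {
    "messageType": "OFFER",
    "offererSessionId": "a-1",
    "seq": 1,
    "sdp": "v=0",
}
reply1 = dedup.handle(first_offer, answer_offer)
reply2 = dedup.handle(dict(first_offer), answer_offer)  # a retry
```
]]></artwork>
        </figure>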
      </section>

      <section title="Common Fields">
        <section anchor="sec.session-id" title="Session IDs">
          <t>Each call is identified by a pair of session identifiers:</t>

          <t><list style="hanging">
              <t hangText="offererSessionId">The offerer's half of the session
              ID (supplied in the OFFER)</t>

              <t></t>

              <t hangText="answererSessionId">The answerer's half of the
              session ID (supplied in the response to an OFFER)</t>
            </list></t>

          <t>The session ID values MUST be generated so that they are globally
          unique. Thus, the combination of both sessionIds is itself globally
          unique. Session IDs never change during a media session.</t>

          <t>All messages MUST contain the "offererSessionId", and all
          messages other than OFFER or an error in response to an OFFER MUST
          contain both "offererSessionId" and "answererSessionId".</t>
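
          <t>A minimal sketch of these two rules follows. It assumes that a
          128-bit random value is an acceptable way to obtain identifiers
          that are unique in practice; the protocol does not prescribe a
          generation method. The function names and the replies_to_offer
          parameter are hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
import secrets


def new_session_id():
    """Return a random 128-bit identifier as 32 hex characters.

    Assumption: random IDs of this size are treated as globally unique.
    """
    return secrets.token_hex(16)


def has_required_session_ids(msg, replies_to_offer=False):
    """Check the presence rules for the two session ID fields.

    replies_to_offer marks an ERROR sent in response to an OFFER,
    which is exempt from carrying answererSessionId.
    """
    if not msg.get("offererSessionId"):
        return False
    exempt = msg["messageType"] == "OFFER" or (
        msg["messageType"] == "ERROR" and replies_to_offer
    )
    return exempt or bool(msg.get("answererSessionId"))
```
]]></artwork>
          </figure>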
        </section>

        <section title="Seq">
          <t>This is a sequence counter for the key requests that helps
          correlate responses to the correct request.</t>

          <t>This is a 32-bit unsigned integer. On each new OFFER (from either
          browser) it is incremented by one. The Seq of an OK or ANSWER is set
          to the same Seq that was used in the OFFER which caused it. When a
          PeerConnection object originates a new session by sending an OFFER
          type message, it starts the Seq at 1. Note: If browser A starts an
          OFFER/ANSWER/OK transaction with a seq of 1 to browser B, then later
          B initiates a second OFFER/ANSWER/OK transaction, it will have a
          seq of 2.</t>
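
          <t>The counter behavior can be sketched as follows. This sketch
          assumes that each side tracks the highest Seq it has sent or
          received for the session and uses the next value for its own
          OFFER. The class name is hypothetical, and raising an error at the
          32-bit limit is a choice of this sketch, since wraparound is not
          specified here.</t>

          <figure>
            <artwork><![CDATA[
```python
class SeqCounter:
    """Per-session sequence counter shared by both directions."""

    MAX_SEQ = 2**32 - 1  # Seq is a 32-bit unsigned integer

    def __init__(self):
        self.last = 0  # the first OFFER of a session uses Seq 1

    def next_offer_seq(self):
        """Return the Seq to use for a new OFFER sent by this side."""
        if self.last >= self.MAX_SEQ:
            # Assumption: wraparound is unspecified, so refuse.
            raise OverflowError("Seq space exhausted")
        self.last += 1
        return self.last

    def observe_remote_offer(self, seq):
        """Record the Seq of an OFFER received from the other side."""
        self.last = max(self.last, seq)


# Browser A starts the session; browser B later sends its own OFFER.
a_side, b_side = SeqCounter(), SeqCounter()
first = a_side.next_offer_seq()
b_side.observe_remote_offer(first)
second = b_side.next_offer_seq()
```
]]></artwork>
          </figure>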
        </section>

        <section title="More-coming">
          <t>This is a boolean flag that can only appear in an ANSWER and, if
          set to true, indicates that this answer is not the final answer
          that will be sent for the associated OFFER. If this flag is not
          present, it is assumed to be false.</t>

          <t>A common situation where the flag may be set to true could be in
          a case where an Agent had received an OFFER and wished to
          immediately respond with an ANSWER that allowed ICE checking to
          start from both sides; but the Agent could not respond with a final
          ANSWER because the agent was still waiting for user authorization to
          determine which media should be sent. In this case, the Agent could
          send an ANSWER that had "more answers coming" but that allowed ICE
          to start. Then later, when the user had authorized the media, the
          Agent could send an ANSWER with the more-coming flag set to false
          that indicated this was the final media selection.</t>

          <t>This is a bit different from sending a final ANSWER with just the
          ICE right away then later sending an OFFER to update the media.
          Consider the case where browser A requests video with B. When the A
          side that sent the initial OFFER gets an ANSWER that rejects the
          video, it may very well present a user interface that indicates
          that there is no media. Five seconds later when browser B sends an
          OFFER requesting video, browser A may present an interface that asks
          if it is OK to do the video that was just rejected. This results in
          a poor user experience and in the extreme can result in both sides
          always rejecting the other side's OFFER of video, then waiting for
          the user to authorize video that results in a new OFFER that is
          always rejected.</t>

          <t>It is easier to be able to indicate that an OFFER resulted in one
          valid ANSWER, but that the OFFER needs to be held open as other
          valid ANSWERs may replace the current one. This stops the other side
          from generating a new OFFER while this is taking place. This is also
          needed to support a SIP gateway doing early media.</t>
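
          <t>On the offerer side, holding an OFFER open could look like the
          sketch below. The JSON key "moreComing" is a hypothetical name
          chosen for this illustration, since this section does not spell
          out the field name; the class and function names are likewise
          assumptions.</t>

          <figure>
            <artwork><![CDATA[
```python
def is_final_answer(answer):
    """An absent flag is treated as false, so the answer is final."""
    return not answer.get("moreComing", False)


class OfferTransaction:
    """Tracks one outstanding OFFER until its final ANSWER arrives."""

    def __init__(self, seq):
        self.seq = seq
        self.current_sdp = None
        self.open = True

    def on_answer(self, answer):
        """Apply an ANSWER; later answers replace earlier ones."""
        if answer["seq"] != self.seq or not self.open:
            return  # not for this transaction, or already finished
        self.current_sdp = answer["sdp"]
        self.open = not is_final_answer(answer)


txn = OfferTransaction(seq=1)
# Provisional answer: enough to start ICE, media not yet authorized.
txn.on_answer({"messageType": "ANSWER", "seq": 1,
               "sdp": "provisional", "moreComing": True})
still_open = txn.open
# Final answer once the user has authorized the media.
txn.on_answer({"messageType": "ANSWER", "seq": 1, "sdp": "final"})
```
]]></artwork>
          </figure>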
        </section>

        <!-- FLUFFY -->

        <!-- 
        <section title="Session Tokens">
          <t>While session IDs serve to uniquely identify a session, it may be
          useful to allow one or another sides to offload state onto the other
          side (for instance to enable a stateless gateway). The
          "setSessionToken" and "sessionToken" fields are used for this
          purpose. When an implementation receives a message with a
          "setSessionToken" field, it MUST associate the field value with the
          session. For all future messages in the session MUST send the
          associated value in the "sessionToken" field (unless the session
          token is reset by another "setSessionToken" value). If no session
          token has yet been received, the "sessionToken" field MUST be
          omitted.</t>
        </section>

        <section title="Response Tokens">
          <t>In addition to tokens which persist for the life of a session, it
          is also possible to have tokens which are only valid for the
          lifetime of a given request/response pair. The "setResponseToken"
          and "responseToken" fields are used for this purpose.</t>

          <t>When an implementation responds to a message from the other side
          (e.g., supplies an answer to an offer, or replies to an answer with
          an OK), it MUST copy into the "responseToken" field any value found
          in a "setResponseToken" field in the message being responded to. If
          no "setResponseToken" field is present, then the "responseToken"
          field MUST be omitted.</t>
        </section>
-->
      </section>

      <section title="Media Setup">
        <t>In order to initiate sending media between the browsers, the
        offerer sends an OFFER message. In order to accept the media, the
        answerer responds with an ANSWER message. A sample message flow for
        this is shown below:</t>

        <figure>
          <artwork alt="Go Read the TXT version of this draft"
                   src="rtcweb-signaling/simple-b2b.png"><![CDATA[
participant OffererUA
participant OffererJS
participant AnswererJS
participant AnswererUA
OffererJS->OffererUA: peer=new PeerConnection();

OffererJS->OffererUA: peer->addStream();
OffererUA->OffererJS: sendSignalingChannel();
OffererJS->AnswererJS: {"type":"OFFER", "sdp":"..."}
AnswererJS->AnswererUA: peer=new PeerConnection();
AnswererJS->AnswererUA: peer->processSignalingMessage();
AnswererUA->AnswererJS: onconnecting();

AnswererUA->OffererUA: ICE starts checking 

note right of AnswererUA: User decides it is OK to send video 
AnswererJS->AnswererUA: peer->addStream();
AnswererUA->OffererUA: Media

AnswererUA->AnswererJS: sendSignalingChannel();
AnswererJS->OffererJS: {"type":"ANSWER","sdp":"..."}
OffererJS->OffererUA: peer->processSignalingMessage();
OffererUA->OffererJS: onaddstream();
OffererUA->AnswererUA: Media

AnswererUA->OffererUA: ICE Completes
AnswererUA->AnswererJS: onopen();
OffererUA->OffererJS: onopen();

OffererUA->OffererJS: sendSignalingChannel();
OffererJS->AnswererJS: {"type":"OK" }
AnswererJS->AnswererUA: peer->processSignalingMessage();
AnswererUA->AnswererJS: onaddstream();
            ]]></artwork>
        </figure>

        <t>The above figure shows a simple message flow for negotiating media:</t>

        <t><list style="symbols">
            <t>The offerer sends an OFFER to initiate the call;</t>

            <t>At this point, ICE negotiation starts;</t>

            <t>Once the browser authorizes sending media to the far side, the
            answerer sends an ANSWER containing the media parameters; and
            finally,</t>

            <t>Once ICE is completed and an OK to the ANSWER is received, both
            sides know that media can flow.</t>
          </list></t>

        <t>The contents of each of these messages are detailed below.</t>

        <section title="OFFER Message">
          <t>The first OFFER message with a given offererSessionId is used to
          indicate the desire to start a media session.</t>

          <section title="Offerer Behavior">
            <t>In order to start a new media session, an offerer constructs a
            new OFFER message with a fresh offererSessionId. The
            answererSessionId field MUST be empty. Like all SDP offers, the
            message MUST contain an "sdp" field with the offerer's offer.</t>
          </section>

          <section title="Answerer Behavior">
            <t>An answerer can receive an OFFER in three cases:</t>

            <t><list style="symbols">
                <t>A new session (this is detected by seeing a new
                offererSessionId value);</t>

                <t>A retransmit of a new OFFER (known offererSessionId, empty
                answererSessionId); or</t>

                <t>A request to change media parameters (known
                offererSessionId, known answererSessionId, new seq value).</t>
              </list></t>

            <t>The first two situations are described in this section. The
            third case is described in <xref
            target="sec.change-media"></xref>. Any other condition represents
            an alien packet and SHOULD be rejected with ERROR: NOMATCH.</t>

            <t>If no media session exists with the given "offererSessionId"
            value, then this is a new media session. The answerer has three
            primary options:</t>

            <t><list style="symbols">
                <t>Reject the request, either silently with no response or
                with an ERROR: REFUSED message;</t>

                <t>Reply to the OFFER message with a final ANSWER message
                (<xref target="sec.ANSWER"></xref>); or</t>

                <t>Send back a non-final ANSWER message and then later respond
                with a final ANSWER.</t>
              </list></t>

            <t>In either of the latter two cases, the answerer performs the
            following steps:</t>

            <t><list style="numbers">
                <t>Generate an "answererSessionId" value;</t>

                <t>Create some local call state (i.e., a PeerConnection
                object) and bind it to the
                "offererSessionId"/"answererSessionId" pair. All future
                messages on this session MUST then be delivered to that
                PeerConnection object;</t>

                <t>Start ICE handshaking with the offerer; and finally,</t>

                <t>Respond with a message containing an SDP answer in the
                "sdp" field. This will contain the answerer's (potentially
                provisional) media information and the ICE parameters.</t>
              </list></t>

            <t>If an OFFER is received that has already been received and
            responded to and the media session still exists, then the answerer
            MUST respond with the same message as before. If the session has
            been terminated in the meantime, then an ERROR:NOMATCH message
            SHOULD be sent.</t>
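
            <t>The case analysis above can be sketched as a small
            classifier. This is one possible reading, not a normative
            algorithm: the shape of the sessions table and the returned
            labels are hypothetical, and treating an OFFER that repeats the
            last seen seq as a retransmission is an assumption drawn from
            the duplicate handling rule.</t>

            <figure>
              <artwork><![CDATA[
```python
def classify_offer(offer, sessions):
    """Classify an incoming OFFER.

    sessions maps offererSessionId to a dict holding the session's
    "answererSessionId" and "lastSeq" (hypothetical bookkeeping).
    """
    session = sessions.get(offer["offererSessionId"])
    answerer_id = offer.get("answererSessionId")
    if session is None:
        # A new offererSessionId with no answerer half: new session.
        return "new-session" if not answerer_id else "alien"
    if not answerer_id:
        return "retransmit"  # known offerer ID, empty answerer ID
    if answerer_id != session["answererSessionId"]:
        return "alien"
    if offer["seq"] > session["lastSeq"]:
        return "change-media"  # known pair, new seq value
    if offer["seq"] == session["lastSeq"]:
        return "retransmit"  # assumption: repeat of the last OFFER
    return "alien"


sessions = {"a-1": {"answererSessionId": "b-1", "lastSeq": 1}}
```
]]></artwork>
            </figure>

            <t>In this sketch, an "alien" result corresponds to the
            condition that SHOULD be rejected with a NOMATCH error.</t>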
          </section>
        </section>

        <section anchor="sec.ANSWER" title="ANSWER">
          <t>The ANSWER message is used by the receiver of an OFFER message to
          indicate that the offer has been accepted. The ANSWER message MUST
          contain the answererSessionId for this media session and an sdp
          parameter containing ICE candidates and the final media parameters
          for the session (although of course these can be adjusted by a new
          OFFER/ANSWER exchange; see <xref
          target="sec.change-media"></xref>).</t>
        </section>

        <section anchor="sec.OK" title="OK">
          <t>The OK message is used by the receiver of an ANSWER message to
          indicate that it has received the ANSWER message. It has no contents
          itself and is merely used to stop the retransmissions of the
          ANSWER.</t>
        </section>

        <section anchor="sec.ERROR" title="ERROR">
          <t>The ERROR message is used to indicate that there has been an
          error. The contents and semantics of this message are defined in
          <xref target="sec.err"></xref>.</t>
        </section>
      </section>

      <section anchor="sec.change-media" title="Changing Media Parameters">
        <t>Once a call has been set up, it is common to want to adjust the
        media parameters, e.g., to add video to an audio-only call. This is
        also done with the OFFER/ANSWER/OK sequence of messages, though the
        details are slightly different.</t>

        <t>Either side may initiate a new OFFER/ANSWER exchange by sending an
        OFFER message. However, implementations MUST NOT attempt this for
        sessions which are still in active negotiation. Specifically, the
        offerer MUST NOT send a new OFFER until it has received the ANSWER,
        and the answerer MUST NOT send a new OFFER until it has received the
        OK indicating receipt of the ANSWER.</t>
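
        <t>The restriction on when a new OFFER may be sent can be sketched
        as a small guard. The class and method names are hypothetical, and
        the sketch ignores non-final answers and error handling for
        brevity.</t>

        <figure>
          <artwork><![CDATA[
```python
class NegotiationGuard:
    """Tracks which message this side is still waiting for."""

    def __init__(self):
        self.waiting_for = None  # None, "ANSWER" or "OK"

    def can_send_offer(self):
        """A new OFFER is allowed only when nothing is outstanding."""
        return self.waiting_for is None

    def on_offer_sent(self):
        if not self.can_send_offer():
            raise RuntimeError("negotiation still in progress")
        self.waiting_for = "ANSWER"

    def on_answer_received(self):
        if self.waiting_for == "ANSWER":
            self.waiting_for = None

    def on_answer_sent(self):
        self.waiting_for = "OK"

    def on_ok_received(self):
        if self.waiting_for == "OK":
            self.waiting_for = None


offerer, answerer = NegotiationGuard(), NegotiationGuard()
offerer.on_offer_sent()
blocked_offerer = not offerer.can_send_offer()
answerer.on_answer_sent()
blocked_answerer = not answerer.can_send_offer()
offerer.on_answer_received()
answerer.on_ok_received()
```
]]></artwork>
        </figure>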

        <t>A new OFFER MUST contain a complete set of media parameters
        describing the proposed new media configuration as well as a full set
        of ICE parameters. The recipient of a new OFFER on a valid connection
        MUST respond with an appropriate ANSWER message. However that message
        MAY refuse to accept the proposed new configuration. If the session
        has been terminated in the meantime, then an ERROR:NOMATCH message
        SHOULD be sent.</t>

        <section anchor="sec.offer-glare" title="Conflicting OFFERS (glare)">
          <t><list style="hanging">
              <t hangText="Note:">The algorithm described here models what is
              used in SIP today. There is a backwards compatible proposal that
              may turn out to work better. If that evolves, it will probably
              be used to replace the algorithm described here.</t>
            </list></t>

          <t>Because a change of media parameters may be initiated by either
          side, there is a potential for the change requests to occur
          simultaneously (i.e., "glare"). When an agent which has sent an
          OFFER and not yet received an ANSWER receives an OFFER from the
          other side, it MUST respond with an ERROR: CONFLICT message.</t>

          <t>An offerer which receives an ERROR: CONFLICT message MUST either
          abandon the attempted capability change or generate a timer of T
          seconds, with T chosen as follows:</t>

          <t><list style="numbers">
              <t>If the agent retrying the OFFER was the offerer of the
              original session, T has a randomly chosen value between 2.1 and
              4 seconds in units of 10 ms.</t>

              <t>If the agent retrying the OFFER was the answerer of the
              original session, T has a randomly chosen value of between 0
              and 2 seconds in units of 10 ms.</t>
            </list></t>

          <t>When the timer fires, the offerer SHOULD increment the Seq and
          attempt a new OFFER once more, if it still desires that session
          modification to take place. The new OFFER might be the same as the
          original offer (other than the seq) or it might be different.</t>
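
          <t>The choice of T can be sketched as below. The sketch reads the
          two cases as depending on whether the retrying agent was the
          offerer or the answerer of the original session, which is an
          interpretation modeled on the SIP behavior mentioned in the note
          above. The function name and its parameters are hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
import random


def glare_backoff_seconds(was_session_offerer, rng=random):
    """Pick the retry delay T after an ERROR: CONFLICT.

    T is drawn in 10 ms ticks: 210..400 ticks (2.1 to 4 s) for the
    original offerer, 0..200 ticks (0 to 2 s) for the original
    answerer. rng only needs a randint(a, b) method.
    """
    if was_session_offerer:
        ticks = rng.randint(210, 400)  # both ends inclusive
    else:
        ticks = rng.randint(0, 200)
    return ticks / 100


offerer_delay = glare_backoff_seconds(True)
answerer_delay = glare_backoff_seconds(False)
```
]]></artwork>
          </figure>

          <t>Because the two ranges do not overlap, the two sides cannot
          pick the same delay, which makes a second collision less
          likely.</t>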

          <t>[FIGURE: Glare]</t>

          <t>The following figure assumes the previous message flow has
          happened and media is flowing.</t>

          <figure>
            <artwork alt="Go Read the TXT version of this draft"
                     src="rtcweb-signaling/glare.png"><![CDATA[
participant OffererUA
participant OffererJS
participant AnswererJS
participant AnswererUA

note left of OffererJS: "Hi, Let's do video"
note right of AnswererJS: "Sounds great"
OffererJS->OffererUA: peer->addStream( new MediaStream() );
OffererUA->OffererJS: sendSignalingChannel();
AnswererJS->AnswererUA: peer->addStream( new MediaStream() );
AnswererUA->AnswererJS: sendSignalingChannel();
OffererJS->AnswererJS: {"type":"OFFER", "sdp":"..."}
AnswererJS->OffererJS: {"type":"OFFER", "sdp":"..."}
AnswererJS->AnswererUA: peer->processSignalingMessage();
OffererJS->OffererUA: peer->processSignalingMessage();

OffererUA->OffererJS: sendSignalingChannel();
AnswererUA->AnswererJS: sendSignalingChannel();
OffererJS->AnswererJS: {"type":"ERROR", error = "conflict", "sdp":"..."}
AnswererJS->OffererJS: {"type":"ERROR", error = "conflict", "sdp":"..."}
AnswererJS->AnswererUA: peer->processSignalingMessage();
OffererJS->OffererUA: peer->processSignalingMessage();

OffererUA->OffererUA: wait 1.1 seconds
OffererUA->OffererJS: sendSignalingChannel();
OffererJS->AnswererJS: {"type":"OFFER", "sdp":"..."}
AnswererJS->AnswererUA: peer->processSignalingMessage();
AnswererUA->AnswererJS: sendSignalingChannel();
AnswererJS->OffererJS: {"type":"ANSWER", "sdp":"..."}
OffererJS->OffererUA: peer->processSignalingMessage();
OffererUA->AnswererUA: One way Video
OffererUA->OffererJS: sendSignalingChannel();
OffererJS->AnswererJS: {"type":"OK"}
AnswererJS->AnswererUA: peer->processSignalingMessage();
AnswererUA->AnswererJS: onaddstream();

AnswererUA->AnswererUA: wait 2.7 seconds
AnswererUA->AnswererJS: sendSignalingChannel();
AnswererJS->OffererJS: {"type":"OFFER", "sdp":"..."}
OffererJS->OffererUA: peer->processSignalingMessage();
OffererUA->OffererJS: sendSignalingChannel();
OffererJS->AnswererJS: {"type":"ANSWER", "sdp":"..."}
AnswererJS->AnswererUA: peer->processSignalingMessage();
AnswererUA->OffererUA: Both way Video    
AnswererUA->AnswererJS: sendSignalingChannel();
AnswererJS->OffererJS: {"type":"OK"}
OffererJS->OffererUA: peer->processSignalingMessage();
OffererUA->OffererJS: onaddstream();

            ]]></artwork>
          </figure>
        </section>

        <section anchor="sec.premature-offer" title="Premature OFFER">
          <t>It is an error, though technically possible, for an agent to
          generate a second OFFER while it already has an unanswered OFFER
          pending. An agent that receives such an OFFER MUST respond with an
          ERROR message of type FAILED containing a "retryAfter" field whose
          value is chosen at random between 0 and 10 seconds.</t>
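
          <t>The following is a minimal sketch, in Python, of how an agent
          might build such a response. The helper name
          build_premature_offer_error and the choice of a fractional number
          of seconds are assumptions made for illustration only; the field
          names are the ones used by the messages in this document.</t>

          <figure>
            <artwork><![CDATA[
```python
import random


def build_premature_offer_error(offer, rng=random):
    """Build the FAILED error sent in reply to a premature OFFER.

    `offer` is the offending OFFER as a dict. This helper is hypothetical
    and is not defined by ROAP itself.
    """
    error = {
        "messageType": "ERROR",
        "errorType": "FAILED",
        # Random back-off chosen between 0 and 10 seconds.
        "retryAfter": rng.uniform(0, 10),
    }
    # Echo the identifying fields of the offending message, if present.
    for key in ("offererSessionId", "answererSessionId", "seq"):
        if key in offer:
            error[key] = offer[key]
    return error
```
            ]]></artwork>
          </figure>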
        </section>
      </section>

      <section anchor="sec.err" title="Errors">
        <t>Errors are indicated by the messageType "ERROR". Every error MUST
        contain an "errorType" field indicating the type of error that
        occurred, and MUST echo the "seq" value (if any) and the session ID
        values of the message that triggered the error. The following sections
        describe each error type.</t>
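
        <t>As an illustration, the requirements in the previous paragraph can
        be checked mechanically. The Python sketch below assumes messages are
        held as dictionaries; the function name error_problems and the wording
        of the returned strings are hypothetical, and the set of error types is
        simply the list of subsections that follow.</t>

        <figure>
          <artwork><![CDATA[
```python
ERROR_TYPES = {"NOMATCH", "TIMEOUT", "REFUSED", "CONFLICT", "FAILED"}
ECHOED_FIELDS = ("seq", "offererSessionId", "answererSessionId")


def error_problems(error, original):
    """List the ways `error` breaks the rules for ERROR messages.

    `original` is the message that triggered the error. An empty list
    means the error conforms. Hypothetical helper, for illustration.
    """
    problems = []
    if error.get("messageType") != "ERROR":
        problems.append("messageType is not ERROR")
    if error.get("errorType") not in ERROR_TYPES:
        problems.append("missing or unknown errorType")
    for field in ECHOED_FIELDS:
        # Only fields present in the original have to be echoed.
        if field in original and error.get(field) != original[field]:
            problems.append("does not echo " + field)
    return problems
```
          ]]></artwork>
        </figure>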

        <section anchor="err.nomatch" title="NOMATCH">
          <t>An implementation which receives a message with either an unknown
          offererSessionId (for an OFFER) or an unknown
          offererSessionId/answererSessionId pair SHOULD respond with a
          NOMATCH error.</t>
        </section>

        <section anchor="err.timeout" title="TIMEOUT">
          <t>The TIMEOUT error is used to indicate that the corresponding
          message required some processing which timed out. For instance, an
          agent which is a SIP gateway translates ROAP signaling messages into
          SIP messages. If those SIP messages time out, the gateway would
          generate a TIMEOUT error.</t>
        </section>

        <section anchor="err.refused" title="REFUSED">
          <t>An agent which has received an initial OFFER MAY indicate its
          refusal of the media session by sending a REFUSED error. Note that
          this error is not required; an agent MAY simply drop the OFFER with
          no acknowledgement at all. However, agents which do not wish to
          accept subsequent OFFERs SHOULD [OPEN ISSUE: MUST?] send a REFUSED
          error in order to avoid timeouts and confusion on the offerer side.</t>
        </section>

        <section anchor="err.conflict" title="CONFLICT">
          <t>The CONFLICT error is used to indicate that an agent has received
          an OFFER while it has its own OFFER outstanding. The offerer's
          behavior in response to this error is defined in <xref
          target="sec.offer-glare"></xref>.</t>
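
          <t>A minimal Python sketch of the detection step only is shown
          here. The class name OfferTracker and its methods are assumptions
          made for illustration. What the offerer does after receiving the
          error, including when its outstanding OFFER is considered
          abandoned, is deliberately left out of the sketch.</t>

          <figure>
            <artwork><![CDATA[
```python
class OfferTracker:
    """Tracks whether this agent has an OFFER of its own outstanding.

    Hypothetical helper; ROAP does not define this class.
    """

    def __init__(self):
        self.outstanding = None  # our last OFFER that is still unanswered

    def send_offer(self, offer):
        """Record `offer` as outstanding and return it for sending."""
        self.outstanding = offer
        return offer

    def receive(self, message):
        """Return a CONFLICT error if `message` is an OFFER that collides
        with our own outstanding OFFER; otherwise return None."""
        kind = message.get("messageType")
        if kind == "OFFER" and self.outstanding is not None:
            error = {"messageType": "ERROR", "errorType": "CONFLICT"}
            # Echo the identifying fields of the colliding OFFER.
            for key in ("offererSessionId", "answererSessionId", "seq"):
                if key in message:
                    error[key] = message[key]
            return error
        if kind == "ANSWER":
            self.outstanding = None  # our OFFER has now been answered
        return None
```
            ]]></artwork>
          </figure>

          <t>In this sketch, two agents that each call send_offer() and then
          receive the other's OFFER both obtain a CONFLICT error, while an
          agent with no OFFER outstanding obtains None and can process the
          OFFER normally.</t>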
        </section>

        <section anchor="err.failed" title="FAILED">
          <t>FAILED is a catch-all error indicating that something went wrong
          while processing a message. A FAILED error MAY contain a
          "retryAfter" field, which indicates the time (in seconds) after
          which the message MAY be retried (though retries are OPTIONAL).</t>
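
          <t>On the receiving side, an agent might extract the retry hint as
          in the following Python sketch. The function name retry_delay and
          the decision to ignore negative or non-numeric values are
          assumptions made for illustration; this document does not say how
          malformed values are to be handled.</t>

          <figure>
            <artwork><![CDATA[
```python
def retry_delay(error):
    """Return the number of seconds to wait before retrying, or None.

    None is returned when `error` is not a FAILED error or when it
    carries no usable "retryAfter" value. Hypothetical helper.
    """
    if error.get("messageType") != "ERROR":
        return None
    if error.get("errorType") != "FAILED":
        return None
    delay = error.get("retryAfter")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(delay, bool) or not isinstance(delay, (int, float)):
        return None
    if delay < 0:
        return None
    return float(delay)
```
            ]]></artwork>
          </figure>

          <t>Because retries are OPTIONAL, a caller is free to discard the
          returned value and give up instead.</t>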
        </section>
      </section>
    </section>

    <section anchor="sec-sec" title="Security Considerations">
      <t>TBD</t>
    </section>

    <section title="Companion APIs">
      <t><list style="hanging">
          <t hangText="Note:">This section may need to move to the
          requirements draft <xref
          target="I-D.ietf-rtcweb-use-cases-and-requirements"></xref>, but for
          now it is convenient to put it here just to help see how all the
          pieces fit together.</t>
        </list></t>

      <t>The offer/answer concepts in this draft are not enough to meet all
      the use cases of RTCWeb. They need to be combined with some additional
      functionality that the browser exposes to JavaScript applications.
      This additional functionality loosely falls into three categories:
      capabilities, hints, and stats. Capabilities allow the JS
      application to find out which video codecs and other capabilities a
      given browser supports before initiating a media session. Hints provide
      a way for the JS application to give the browser useful information
      about how the media will be used so that the browser can negotiate
      appropriate codecs and modes. Stats provide statistics about the
      current media sessions. The capabilities, hints, and stats do not need
      to be communicated between the two browsers, so they are not specified
      in this draft. However, this draft assumes the existence of APIs so
      that these three can be used to build complete systems. Some of the
      assumptions about these APIs are described in the following
      sections.</t>

      <section title="Capabilities">
        <t>The APIs need to provide a way to find out the capabilities as
        defined in Section 9 of RFC 3264. This allows the JS application to
        find out the codecs that the browser supports.</t>
      </section>

      <section title="Hints">
        <t>When creating a new PeerConnection in a browser, the application
        needs to be able to provide optional hints to the browser about
        preferences for the media to be negotiated. These include:</t>

        <t><list style="numbers">
            <t>Whether the session has audio, video, or both;</t>

            <t>Whether the audio is spoken voice or music;</t>

            <t>Preferred video resolution and frame rate (perhaps these just
            come from the MediaTrack objects);</t>

            <t>Whether the video should prefer temporal or spatial
            fidelity;</t>

            <t>&lt;add more here&gt;</t>
          </list>The JS applications should also be able to update and change
        these hints mid-session. Some types of hint changes may simply affect
        the parameters of various codecs and require no signaling to the other
        end of the media stream. Other types of hint changes may cause a new
        offer/answer exchange.</t>
      </section>

      <section title="Stats">
        <t>Several parts of the media session create statistics that are
        important to some applications. APIs should provide the JS
        applications with information on the following statistics:</t>

        <t><list style="numbers">
            <t>Total IP data rate for the session;</t>

            <t>ICE statistics including current candidates, active pairs,
            RTT;</t>

            <t>RTP statistics including codecs selected, parameters, and bit
            rates;</t>

            <t>RTCP statistics including packet loss rate; and</t>

            <t>SRTP statistics.</t>
          </list></t>
      </section>
    </section>

    <section title="Relationship with SIP &amp; Jingle">
      <t>SIP <xref target="RFC3261"></xref> specifies an application
      protocol that provides a complete solution for setting up and managing
      communications on the Internet. It combines both "call processing"
      functions - identity and name spaces, call routing, user search, call
      features, authentication, and so on - as well as media processing
      through its transport of SDP and support for the offer/answer model.</t>

      <t>In a web context, application processing can be done through
      proprietary logic implemented in JavaScript/HTML, along with proprietary
      logic implemented in the web server, and proprietary messaging
      transported through HTTP and WebSockets. One of the advantages of the
      web is that it allows a rich set of applications to be built without
      changing the browser. Although application processing can be done in
      JavaScript and the web servers, we do require raw media control in the
      browser. ROAP extracts the offer/answer media control processing used
      in SIP, and puts it into a protocol that can operate independently of
      SIP itself.</t>

      <t>The information contained in ROAP messages corresponds closely to the
      offer/answer information carried by complete solutions such as SIP and
      Jingle, so it is straightforward to build gateways to and from ROAP.
      These gateways need only translate the signaling, while allowing
      end-to-end media without the need for media relays (except, of course,
      for NAT traversal). In the case of SIP, which uses SDP directly, such
      gateways would translate between SIP and ROAP, while transporting SDP
      end-to-end. In the case of Jingle <xref target="XEP-0166"></xref>, it
      would also be necessary to translate between SDP and the Jingle
      offer/answer format; <xref target="XEP-0167"></xref> describes such a
      mapping.</t>
    </section>

    <section title="IANA Considerations">
      <t>This document requires no actions from IANA.</t>
    </section>

    <section title="Acknowledgments">
      <t>Many thanks for comments, ideas, and text from Eric Rescorla, Harald
      Alvestrand, Magnus Westerlund, Ted Hardie, and Stefan Hakansson.</t>
    </section>

    <section title="Open Issues">
      <t>How to negotiate support for enhancements to this JSON message
      (consider supported/required).</t>

      <t>Common way to indicate destination in offer going to a signaling
      gateway.</t>

      <t>Need to generate proper ASCII art version of message flows.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <reference anchor="RFC4627">
        <front>
          <title>The application/json Media Type for JavaScript Object
          Notation (JSON)</title>

          <author fullname="D. Crockford" initials="D." surname="Crockford">
            <organization></organization>
          </author>

          <date month="July" year="2006" />
        </front>

        <seriesInfo name="RFC" value="4627" />

        <format octets="16319"
                target="http://www.rfc-editor.org/rfc/rfc4627.txt" type="TXT" />
      </reference>

      <reference anchor="RFC3264">
        <front>
          <title>An Offer/Answer Model with Session Description Protocol
          (SDP)</title>

          <author fullname="J. Rosenberg" initials="J." surname="Rosenberg">
            <organization></organization>
          </author>

          <author fullname="H. Schulzrinne" initials="H."
                  surname="Schulzrinne">
            <organization></organization>
          </author>

          <date month="June" year="2002" />
        </front>

        <seriesInfo name="RFC" value="3264" />

        <format octets="60854"
                target="http://www.rfc-editor.org/rfc/rfc3264.txt" type="TXT" />
      </reference>

      <reference anchor="RFC2119">
        <front>
          <title abbrev="RFC Key Words">Key words for use in RFCs to Indicate
          Requirement Levels</title>

          <author fullname="Scott Bradner" initials="S." surname="Bradner">
            <organization>Harvard University</organization>

            <address>
              <postal>
                <street>1350 Mass. Ave.</street>

                <street>Cambridge</street>

                <street>MA 02138</street>
              </postal>

              <phone>+1 617 495 3864</phone>

              <email>sob@harvard.edu</email>
            </address>
          </author>

          <date month="March" year="1997" />

          <area>General</area>

          <keyword>keyword</keyword>
        </front>

        <seriesInfo name="BCP" value="14" />

        <seriesInfo name="RFC" value="2119" />

        <format octets="4723"
                target="http://www.rfc-editor.org/rfc/rfc2119.txt" type="TXT" />

        <format octets="17491"
                target="http://xml.resource.org/public/rfc/html/rfc2119.html"
                type="HTML" />

        <format octets="5777"
                target="http://xml.resource.org/public/rfc/xml/rfc2119.xml"
                type="XML" />
      </reference>

      <reference anchor="RFC4566">
        <front>
          <title>SDP: Session Description Protocol</title>

          <author fullname="M. Handley" initials="M." surname="Handley">
            <organization></organization>
          </author>

          <author fullname="V. Jacobson" initials="V." surname="Jacobson">
            <organization></organization>
          </author>

          <author fullname="C. Perkins" initials="C." surname="Perkins">
            <organization></organization>
          </author>

          <date month="July" year="2006" />
        </front>

        <seriesInfo name="RFC" value="4566" />

        <format octets="108820"
                target="http://www.rfc-editor.org/rfc/rfc4566.txt" type="TXT" />
      </reference>
    </references>

    <references title="Informative References">
      <reference anchor="RFC3261">
        <front>
          <title>SIP: Session Initiation Protocol</title>

          <author fullname="J. Rosenberg" initials="J." surname="Rosenberg">
            <organization></organization>
          </author>

          <author fullname="H. Schulzrinne" initials="H."
                  surname="Schulzrinne">
            <organization></organization>
          </author>

          <author fullname="G. Camarillo" initials="G." surname="Camarillo">
            <organization></organization>
          </author>

          <author fullname="A. Johnston" initials="A." surname="Johnston">
            <organization></organization>
          </author>

          <author fullname="J. Peterson" initials="J." surname="Peterson">
            <organization></organization>
          </author>

          <author fullname="R. Sparks" initials="R." surname="Sparks">
            <organization></organization>
          </author>

          <author fullname="M. Handley" initials="M." surname="Handley">
            <organization></organization>
          </author>

          <author fullname="E. Schooler" initials="E." surname="Schooler">
            <organization></organization>
          </author>

          <date month="June" year="2002" />
        </front>

        <seriesInfo name="RFC" value="3261" />

        <format octets="647976"
                target="http://www.rfc-editor.org/rfc/rfc3261.txt" type="TXT" />
      </reference>

      <reference anchor="XEP-0166">
        <front>
          <title>Jingle</title>

          <author fullname="Scott Ludwig" initials="S." surname="Ludwig">
            <organization></organization>

            <address>
              <email>scottlu@google.com</email>
            </address>
          </author>

          <author fullname="Joe Beda" initials="J." surname="Beda">
            <organization></organization>

            <address>
              <email>jbeda@google.com</email>
            </address>
          </author>

          <author fullname="Peter Saint-Andre" initials="P."
                  surname="Saint-Andre">
            <organization></organization>

            <address>
              <email>stpeter@jabber.org</email>
            </address>
          </author>

          <author fullname="Robert McQueen" initials="R." surname="McQueen">
            <organization></organization>

            <address>
              <email>robert.mcqueen@collabora.co.uk</email>
            </address>
          </author>

          <author fullname="Sean Egan" initials="S." surname="Egan">
            <organization></organization>

            <address>
              <email>seanegan@google.com</email>
            </address>
          </author>

          <author fullname="Joe Hildebrand" initials="J." surname="Hildebrand">
            <organization></organization>

            <address>
              <email>jhildebr@cisco.com</email>
            </address>
          </author>

          <date day="23" month="December" year="2009" />
        </front>

        <seriesInfo name="XSF XEP" value="0166" />

        <format target="http://xmpp.org/extensions/xep-0166.html" type="HTML" />
      </reference>

      <reference anchor="XEP-0167">
        <front>
          <title>Jingle RTP Sessions</title>

          <author fullname="Scott Ludwig" initials="S." surname="Ludwig">
            <organization></organization>

            <address>
              <email>scottlu@google.com</email>
            </address>
          </author>

          <author fullname="Peter Saint-Andre" initials="P."
                  surname="Saint-Andre">
            <organization></organization>

            <address>
              <email></email>
            </address>
          </author>

          <author fullname="Sean Egan" initials="S." surname="Egan">
            <organization></organization>

            <address>
              <email>seanegan@google.com</email>
            </address>
          </author>

          <author fullname="Robert McQueen" initials="R." surname="McQueen">
            <organization></organization>

            <address>
              <email>robert.mcqueen@collabora.co.uk</email>
            </address>
          </author>

          <author fullname="Diana Cionoiu" initials="D." surname="Cionoiu">
            <organization></organization>

            <address>
              <email>diana@null.ro</email>
            </address>
          </author>

          <date day="19" month="December" year="2008" />
        </front>

        <seriesInfo name="XSF XEP" value="0167" />

        <format target="http://www.xmpp.org/extensions/xep-0167.html"
                type="HTML" />
      </reference>

      <reference anchor="RFC5245">
        <front>
          <title>Interactive Connectivity Establishment (ICE): A Protocol for
          Network Address Translator (NAT) Traversal for Offer/Answer
          Protocols</title>

          <author fullname="J. Rosenberg" initials="J." surname="Rosenberg">
            <organization></organization>
          </author>

          <date month="April" year="2010" />

          <abstract>
            <t>This document describes a protocol for Network Address
            Translator (NAT) traversal for UDP-based multimedia sessions
            established with the offer/answer model. This protocol is called
            Interactive Connectivity Establishment (ICE). ICE makes use of the
            Session Traversal Utilities for NAT (STUN) protocol and its
            extension, Traversal Using Relay NAT (TURN). ICE can be used by
            any protocol utilizing the offer/answer model, such as the Session
            Initiation Protocol (SIP). [STANDARDS-TRACK]</t>
          </abstract>
        </front>

        <seriesInfo name="RFC" value="5245" />

        <format octets="285120"
                target="http://www.rfc-editor.org/rfc/rfc5245.txt" type="TXT" />
      </reference>

      <reference anchor="webrtc-api">
        <front>
          <title>WebRTC 1.0: Real-time Communication Between Browsers</title>

          <author fullname="W3C editors"
                  surname="Bergkvist, Burnett, Jennings, Narayanan">
            <organization>W3C</organization>
          </author>

          <date day="4" month="October" year="2011" />
        </front>

        <annotation>Available at
        http://dev.w3.org/2011/webrtc/editor/webrtc.html</annotation>
      </reference>

      <reference anchor="I-D.ietf-rtcweb-use-cases-and-requirements">
        <front>
          <title>Web Real-Time Communication Use-cases and
          Requirements</title>

          <author fullname="Christer Holmberg" initials="C" surname="Holmberg">
            <organization></organization>
          </author>

          <author fullname="Stefan Hakansson" initials="S" surname="Hakansson">
            <organization></organization>
          </author>

          <author fullname="Goran Eriksson" initials="G" surname="Eriksson">
            <organization></organization>
          </author>

          <date day="4" month="October" year="2011" />

          <abstract>
            <t>This document describes web based real-time communication
            use-cases. Based on the use-cases, the document also derives
            requirements related to the browser, and the API used by web
            applications to request and control media stream services provided
            by the browser.</t>
          </abstract>
        </front>

        <seriesInfo name="Internet-Draft"
                    value="draft-ietf-rtcweb-use-cases-and-requirements-06" />

        <format target="http://www.ietf.org/internet-drafts/draft-ietf-rtcweb-use-cases-and-requirements-06.txt"
                type="TXT" />
      </reference>
    </references>
  </back>
</rfc>
