<?xml version="1.0" encoding="iso-8859-1"?>
<!-- comment -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd"[]>
<?rfc toc="no" ?>
<?rfc compact="yes" ?>
<?rfc sortrefs="no" ?>
<?rfc symrefs="yes" ?> 

<rfc ipr="trust200902" category="info" docName="draft-nandakumar-rtcweb-glare-handling-00.txt" submissionType="IETF" xml:lang="en">
<front>
    <title abbrev="Glareless Media">
		Glareless addition of media to existing RTCWeb Sessions
	</title>
  <author fullname="Suhas Nandakumar" initials="S." surname="Nandakumar">
    <organization>Cisco</organization>
    
    <address>
        <postal>
          <street>170 West Tasman Drive</street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>USA</country>
        </postal>

        <email>snandaku@cisco.com</email>
      </address>
  </author>

  <author fullname="Cullen Jennings" initials="C." surname="Jennings">
    <organization>Cisco</organization>    
    <address>
        <postal>
          <street>400 3rd Avenue SW, Suite 350</street>

          <city>Calgary</city>

          <region>AB</region>

          <code>T2P 4H2</code>

          <country>Canada</country>
        </postal>

        <email>fluffy@iii.ca</email>
      </address>
  </author>
  
  <date day="18" month="February" year="2013" />
  
  <area>RAI</area>
  

    <abstract>
		<t>
		  The RFC3264 Offer/Answer model specifies rules for the bilateral exchange of 
		  Session Description Protocol (SDP) [RFC4566] messages for setting up, updating, and
		  tearing down multimedia streams. In rare situations, both 
		  communicating parties might end up acting as the offerer at the same time while updating an ongoing session.
		  This scenario is commonly known as a "glare" condition, and it needs to be handled.
		  This specification describes procedures that let the parties in an ongoing RTCWeb session
		  add new media in a glareless fashion.
		</t> 
     <t>
      This problem might be solved in various ways; this
      draft sketches out one possible solution. 
     </t>
    </abstract>
</front>

<middle>
    <section title="Introduction">
      <t>  
        In its most basic form, the RFC3264 <xref target="RFC3264"></xref> protocol operation 
        begins with one of the participants sending an initial SDP offer describing its intent to 
        start a multimedia communication session. The participant receiving the offer MAY generate
        an SDP answer accepting the offer, or it MAY reject the offer. Once the session is set up,
        either agent MAY generate a new offer at any time to update the session. However, an agent
        MUST NOT generate a new offer if it has received an offer that it has not yet answered or
        rejected. If an agent receives an offer after having sent one, but before receiving an 
        answer to it, the situation is considered a "glare" condition.
       </t> 	
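       <t>
        The offer/answer rules above can be pictured as a small state machine. The following
        TypeScript sketch is illustrative only: the class OfferAnswerTracker and its method
        names are hypothetical and are not defined by RFC3264 or by any RTCWeb API, and the
        state names are chosen for readability. The sketch reports glare when an offer
        arrives while a locally generated offer is still outstanding.
       </t>
       <figure>
        <artwork><![CDATA[
```typescript
// Hypothetical tracker for one agent's offer/answer state (illustration only).
type SignalingState = "stable" | "have-local-offer" | "have-remote-offer";

class OfferAnswerTracker {
  state: SignalingState = "stable";

  // The agent sends an offer; only allowed when nothing is outstanding.
  sendOffer(): void {
    if (this.state !== "stable") {
      throw new Error("cannot send an offer in state " + this.state);
    }
    this.state = "have-local-offer";
  }

  // An offer arrives from the peer. Returns true if this is glare, that is,
  // the agent has already sent an offer that is still unanswered.
  receiveOffer(): boolean {
    if (this.state === "have-local-offer") {
      return true; // glare: the state is left unchanged
    }
    this.state = "have-remote-offer";
    return false;
  }

  // The agent answers (or rejects) the peer's outstanding offer.
  sendAnswer(): void {
    if (this.state !== "have-remote-offer") {
      throw new Error("no remote offer to answer");
    }
    this.state = "stable";
  }

  // The peer's answer (or rejection) to the agent's own offer arrives.
  receiveAnswer(): void {
    if (this.state !== "have-local-offer") {
      throw new Error("no local offer outstanding");
    }
    this.state = "stable";
  }
}
```
]]></artwork>
       </figure>
       <t>
        If two such trackers each call sendOffer() and then receiveOffer(), both calls to
        receiveOffer() return true, which corresponds to the situation this document aims to avoid.
       </t>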
		   <t>
		    This specification defines a set of procedures for RTCWeb end-points to add new media to an
		    ongoing session in a glareless fashion. The rest of this document is organized as follows.
		    Section 3 provides motivation for dealing with the glare condition. Section 4 explains 
		    the detailed call flows, with examples, for the solution proposed in this specification.
		    Finally, Section 5 concludes with a note on applicability.
		   </t>   
    </section>
    <section title="Terminology">

      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT",
      "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
      interpreted as described in <xref target="RFC2119"></xref>.</t>


      <t>m-line: An <xref target="RFC4566">RFC4566</xref> media description
        identifier that starts with the "m=" field and conveys the following values: media
        type, transport port, transport protocol, and media format descriptions.</t>
      
      <t>m-block: An <xref target="RFC4566">RFC4566</xref> media description
        that starts with an m-line and is terminated by either the next m-line or
        by the end of the session description.</t>
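      <t>
        As an illustration of the two definitions above, the following TypeScript sketch
        splits an SDP message into its session-level lines and its m-blocks. The function
        name splitMBlocks and the ParsedSdp type are hypothetical helpers introduced here
        for illustration; they are not part of any RTCWeb API.
      </t>
      <figure>
        <artwork><![CDATA[
```typescript
// Hypothetical helper types and function (illustration only).
interface ParsedSdp {
  sessionLines: string[]; // lines that precede the first m-line
  mBlocks: string[][];    // one array of lines per m-block
}

function splitMBlocks(sdp: string): ParsedSdp {
  // Accept both CRLF and bare LF line endings; drop empty lines.
  const lines = sdp.split(/\r?\n/).filter((line) => line.length > 0);
  const parsed: ParsedSdp = { sessionLines: [], mBlocks: [] };
  for (const line of lines) {
    if (line.startsWith("m=")) {
      // An m-line terminates the previous m-block and opens a new one.
      parsed.mBlocks.push([line]);
    } else if (parsed.mBlocks.length > 0) {
      // Any other line belongs to the m-block that is currently open.
      parsed.mBlocks[parsed.mBlocks.length - 1].push(line);
    } else {
      // No m-line seen yet: this is a session-level line.
      parsed.sessionLines.push(line);
    }
  }
  return parsed;
}
```
]]></artwork>
      </figure>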
      
      <t>Offer: An <xref target="RFC3264"></xref> SDP message generated by the
        participant who wishes to initiate a multimedia communication session.
        An Offer describes the participant's capabilities for engaging in a multimedia
        session.</t>
      
      <t>Answer: An <xref target="RFC3264"></xref> SDP message generated by the
        participant in response to an Offer. An Answer describes the participant's
        capabilities for continuing with the multimedia session within the
        constraints of the Offer.</t>
 
    </section>
  
    <section  anchor="sec-motivation" title="Motivation">
     <t>
      The following example serves as the base case for the "glare" condition and is discussed throughout
      this document in the context of the proposed solution.
      <list>
        <t>
           Alice and Bob are in a two-way audio-only RTCWeb session.
        </t>
        <t>
          They decide to escalate to a video session.
        </t>
        <t>
         Each initiates the addition of their respective camera stream to the current 
         session at almost the same time. This is done by each side sending an Offer
         with the appropriate media descriptions.
        </t>
        <t>
         Alice receives an Offer from Bob before receiving an Answer to her Offer.
        </t>
        <t>
         Bob receives Alice's Offer when he is expecting an Answer for his Offer.
        </t>
        <t>
         Both Alice and Bob have outstanding Offers and they end up stuck in a "glare"
         situation.
        </t>
      </list>
     </t>
     <t>
      End-points stuck in the "glare" condition stay there forever unless the condition
      is resolved, either by a higher-layer protocol mechanism or by application logic.
      An example of the former is SIP <xref target="RFC3261"></xref>, which provides means for ordering
      messages in each direction to resolve the "glare" condition. This document 
      proposes an approach of the latter kind.
     </t>
    </section>

    <section title="Solution">
       <t>
         The basic outline of the solution is as follows.     
         <list style="numbers">
           <t>
            Participants set up an initial RTCWeb session using normal Offer/Answer procedures.
            This can be any combination of audio, video, or data-channel sessions.
           </t>
           <t>
            Whenever a participant decides to add new media, an m-block describing the
            capabilities of the media stream is generated and exchanged with the peer.
           </t>
           <t>
             Each sender "opportunistically" acknowledges its own m-block with the
             Answer that it expects to receive for the m-blocks sent. This completes
             the Offer/Answer state machine at the sender, so that
             no Offer remains outstanding.
             In this specification, the Answer used for this purpose is called an
             "ExpectedAnswer", since it is not the real Answer from the Answerer.
           </t>
           <t>
             When the m-blocks reach the Answerer's Javascript application, the 
             application needs to re-order the m-blocks to match the order used
             in the local SDP.
           </t>
           <t>
            Once the m-blocks are re-ordered to match the order of m-blocks in the local SDP, the Offer is
            installed with the Peer Connection at the Answerer as the remote description, in order to 
            generate an Answer.
           </t>
           <t>
             On obtaining the Answer, the Answerer's Javascript application extracts the m-block
             and sends it to the Offerer as the response.
           </t>
           <t>
             On receiving the response m-blocks from the Answerer, the Offerer needs to re-order
             the m-blocks to be in the right order for the Offerer's local SDP.
           </t>
           <t>
            The Offerer's Javascript application then generates an updated Offer from the
            received and re-ordered m-blocks and applies it as the remote description in order to generate
            an Answer. This is done to apply the real Answer from the Answerer and to overwrite 
            any "ExpectedAnswers" applied in the interim. The term "Overlapping Zone"
            denotes the time between the initial m-block Offer and the final m-block Answer.
            During this time, the end-points at either end MAY be executing one or more
            iterations of m-block exchanges. 
           </t>
           <t style="hangText">
             A few limitations might cause a failure while 
             converting the m-block Answer to an Offer at the Offerer, as described in
             <xref target="sec-applicability"></xref>.
           </t>
           <t>
            The Offerer's Javascript application extracts the m-blocks from the generated Answer
            and matches them against its local SDP to verify that the m-block Offer/Answer exchange
            succeeded. If it did not, the ongoing session is terminated.
           </t>
         </list>
       </t>
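       <t>
         The construction of an "ExpectedAnswer" in step 3 of the outline above can be sketched as follows.
         This is a simplified TypeScript model, not a definitive implementation: an m-block is
         reduced to its mid, media kind, and direction, whereas a real Answer carries many
         more attributes. The MBlock type and the function buildExpectedAnswer are hypothetical
         names. The sketch assumes that m-blocks negotiated earlier keep the values the peer
         last agreed to, and that each newly added m-block is marked inactive until the real
         Answer arrives.
       </t>
       <figure>
        <artwork><![CDATA[
```typescript
// Simplified, hypothetical model of an m-block (illustration only).
type Direction = "sendrecv" | "sendonly" | "recvonly" | "inactive";

interface MBlock {
  mid: string;
  kind: "audio" | "video" | "application";
  direction: Direction;
}

// offer:      the m-blocks of the locally generated Offer
// lastRemote: the peer's m-blocks from the most recent successful
//             negotiation, keyed by mid
function buildExpectedAnswer(
  offer: MBlock[],
  lastRemote: Map<string, MBlock>,
): MBlock[] {
  return offer.map((block): MBlock => {
    const known = lastRemote.get(block.mid);
    // Already negotiated: repeat what the peer last agreed to.
    // Newly added: placeholder that is inactive until the real Answer.
    return known ? { ...known } : { ...block, direction: "inactive" };
  });
}
```
]]></artwork>
       </figure>
       <t>
         For an Offer that contains a previously negotiated audio m-block and a new sendonly
         video m-block, the sketch yields the unchanged audio m-block and an inactive
         video m-block.
       </t>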
      
      <section title="Requirements">
      <t>
       The following are the basic ideas and requirements behind the proposed solution for the
       glareless addition of new media.
       <list style="symbols">
         <t>
          There is an ongoing RTCWeb session between the parties, say Alice and Bob. This
          implies an RTP transport association has been successfully setup between the peers.
         </t>
         <t>
          The Javascript applications have their own copy of the most recent successfully negotiated
          SDP with the local ordering of m-blocks preserved.
         </t>
         <t>
           The Javascript application is responsible for maintaining the appropriate ordering of the
           m-blocks for that User Agent.
         </t>
         <t>
           Subsequent Offer/Answer exchanges after the initial exchange might not use complete
           SDP messages to indicate updates to the session, such as adding a new video stream. 
           One approach, proposed in this document, involves the exchange of "m-blocks" that 
           describe the capabilities of the newly added media.
         </t>
         <t>
           The Javascript application MUST be capable of generating a complete SDP Offer or Answer
           from the session updates exchanged, so that it can supply the SDP to the User Agent or
           negotiate it with the peer, if needed.
         </t>
         <t>
           The end-points MUST support at least one way to identify the m-blocks. In this specification,
           mid <xref target="RFC5888"></xref> is used for this purpose; other 
           identification approaches, such as msid <xref target="I-D.alvestrand-mmusic-msid"></xref>, could also be used.
         </t>
       </list>
      </t>
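      <t>
       The bookkeeping implied by these requirements can be sketched as follows. In this
       TypeScript sketch the local set is an ordered array of m-blocks held as SDP text,
       each m-block is identified by its "a=mid" attribute, and a complete SDP message is
       rebuilt from the session-level part and the ordered m-blocks. The function names
       midOf, mergeMBlock, and buildSdp are hypothetical, and the policy of appending an
       m-block with an unknown mid at the end of the local set is an assumption made for
       illustration.
      </t>
      <figure>
        <artwork><![CDATA[
```typescript
// Returns the mid of an m-block given as SDP text, or undefined if the
// m-block carries no a=mid attribute.
function midOf(mBlock: string): string | undefined {
  const match = /^a=mid:(\S+)/m.exec(mBlock);
  return match ? match[1] : undefined;
}

// Merges an incoming m-block into the local set without disturbing the
// local ordering: a known mid is replaced in place, an unknown mid is
// appended (assumed policy).
function mergeMBlock(localSet: string[], incoming: string): string[] {
  const mid = midOf(incoming);
  if (mid === undefined) {
    throw new Error("m-block carries no a=mid attribute");
  }
  const index = localSet.findIndex((block) => midOf(block) === mid);
  if (index === -1) {
    return [...localSet, incoming];
  }
  const merged = [...localSet];
  merged[index] = incoming;
  return merged;
}

// Rebuilds a complete SDP message from the session-level part and the
// m-blocks in local order.
function buildSdp(sessionPart: string, localSet: string[]): string {
  return [sessionPart, ...localSet]
    .map((part) => part.trim())
    .join("\r\n") + "\r\n";
}
```
]]></artwork>
      </figure>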
      </section>
      <section anchor="sec-callflow" title="Call Flow - Glareless One-Way Video Addition">
      <t>
       Below is the high-level call flow for adding one-way video
       to an ongoing RTCWeb session in a glareless fashion. Note that the call flow
       shows the proposed solution from Alice's perspective alone, because Bob's side 
       is symmetric.
      </t>
      <figure>
        <artwork alt="Glareless One-Way Video Add"
          src="Glareless.png"  width="100%" ><![CDATA[
title Glareless One-Way Video Add
participant Alice_PC
participant Alice_JS
participant BOB_JS
participant BOB_PC

note over Alice_PC, BOB_PC
On-Going RTP Session with any combination of
audio and/or video and/or data-channel.
end note

note over Alice_JS, BOB_JS
Alice and Bob's JS application have local copies of the most recent 
successfully negotiated SDP
end note

Alice_JS->Alice_PC: AddStream(Video)
Alice_JS->+Alice_PC: CreateOffer()
Alice_PC-->-Alice_JS:offer with video sendonly.
Alice_JS->Alice_PC: ExpectedAnswer w/video inactive
Alice_JS->BOB_JS: m-block offer for added video source
BOB_JS->BOB_JS: Merge m-block to local set
BOB_JS->BOB_JS: Generate Offer from the local set 
BOB_JS->BOB_PC: SetRemoteDesc(Offer)
BOB_JS->+BOB_PC: CreateAnswer()
BOB_PC-->-BOB_JS: answer with video recvonly
BOB_JS->BOB_JS: ReMap answer to local set
BOB_JS->Alice_JS:m-block answer accepting the video stream
Alice_JS->Alice_JS: Merge Bob's m-block to generate updated offer
Alice_JS->Alice_PC: SetRemoteDesc() w/updated offer
Alice_JS->+Alice_PC: CreateAnswer()
Alice_PC-->-Alice_JS: answer with video recv-only
Alice_JS->Alice_JS: Check for Success/Failure
note left of Alice_JS
 For success, the m-block values from the generated answer MUST
 match with the corresponding values in the local set matched
 by their MIDs
end note   
]]></artwork>
      </figure>
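      <t>
        The final "Check for Success/Failure" step of the call flow can be sketched as follows.
        The TypeScript below is a simplified model under stated assumptions: each m-block is
        reduced to its mid, media kind, and direction, and these are the only values compared,
        whereas a real check would likely cover more attributes. The names MBlockSummary and
        exchangeSucceeded are hypothetical.
      </t>
      <figure>
        <artwork><![CDATA[
```typescript
// Simplified, hypothetical summary of an m-block (illustration only).
type Direction = "sendrecv" | "sendonly" | "recvonly" | "inactive";

interface MBlockSummary {
  mid: string;
  kind: string;
  direction: Direction;
}

// Returns true when every m-block of the local set has a counterpart
// with the same mid in the generated answer and the compared values
// agree. The order of the m-blocks is deliberately ignored, because
// the m-blocks are matched by their mids.
function exchangeSucceeded(
  localSet: MBlockSummary[],
  answer: MBlockSummary[],
): boolean {
  if (localSet.length !== answer.length) {
    return false;
  }
  const answerByMid = new Map<string, MBlockSummary>();
  for (const block of answer) {
    answerByMid.set(block.mid, block);
  }
  return localSet.every((local) => {
    const generated = answerByMid.get(local.mid);
    return generated !== undefined
      && generated.kind === local.kind
      && generated.direction === local.direction;
  });
}
```
]]></artwork>
      </figure>
      <t>
        When the function returns false, the application would proceed with the failure
        handling described in the solution outline.
      </t>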
    </section>
    <section title="Call Flow Details">  
      <t>
      The table below provides a step-by-step analysis of the states 
      at Alice's and Bob's end-points when the solution is applied to the example in
      <xref target="sec-motivation"></xref>.
     </t>    
      <t hangText="Note">
        Only the relevant aspects of the SDP media descriptions are captured, for the sake of
        clarity.
      </t>
        
      <texttable title="Timeline for Glareless Video Addition">
        <ttcol align='left'>Timeline </ttcol>
        <ttcol align='left'>Alice's JS View </ttcol>
        <ttcol align='left'>Bob's JS View </ttcol>
        
        <c>t+0</c><c>Local Set: m=audio (sendrecv)</c><c>Local Set: m=audio (sendrecv) </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+1</c><c>AddStream(Video)-sendonly </c><c>AddStream(Video)-sendonly </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+2</c><c>CreateOffer with m=audio,m=video[mid:1,sendonly] </c>
        <c>CreateOffer with m=audio,m=video[mid:2,sendonly] </c>
        <c> </c><c> </c><c> </c>
       
        <c>t+3</c><c>Generate ExpectedAnswer with m=audio,m=video[mid:1,inactive] </c> 
        <c>Generate ExpectedAnswer with m=audio,m=video[mid:2,inactive] </c>
        <c> </c><c> </c><c> </c>
         
        <c>t+4</c><c>Install ExpectedAnswer </c> <c>Install ExpectedAnswer </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+5</c><c>Send video m-block [mid:1,sendonly] </c> 
        <c>Send video m-block [mid:2,sendonly] </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+6</c><c>Video m-block [mid:2,sendonly] Arrives </c>
        <c>Video m-block [mid:1,sendonly] Arrives </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+7</c><c>Merge/Update Local Set </c><c>Merge/Update Local Set </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+8</c><c>Local Set:m=audio,m=video[mid:1,sendonly],m=video[mid:2,sendonly] </c>
        <c>Local Set:m=audio,m=video[mid:2,sendonly],m=video[mid:1,sendonly] </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+9</c><c>Install RemoteDescription  </c>
        <c>Install RemoteDescription </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+10</c><c>CreateAnswer with m=audio,m=video[mid:1,sendonly],m=video[mid:2,recvonly]  </c>
        <c>CreateAnswer with m=audio,m=video[mid:1,recvonly],m=video[mid:2,sendonly]  </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+11</c><c>Merge/Update Local Set </c> <c>Merge/Update Local Set </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+12</c><c>Local Set:m=audio,m=video[mid:1,sendonly],m=video[mid:2,recvonly] </c>
        <c>Local Set:m=audio,m=video[mid:2,sendonly],m=video[mid:1,recvonly] </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+13</c><c>Send video m-block [mid:2,recvonly] </c> 
        <c>Send video m-block [mid:1,recvonly] </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+14</c><c>Video m-block [mid:1,recvonly] Arrives </c>
        <c>Video m-block [mid:2,recvonly] Arrives </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+15</c><c>Install updated Offer m=audio,m=video[mid:1,recvonly],m=video[mid:2,recvonly]  </c>
        <c>Generate updated Offer m=audio,m=video[mid:1,recvonly],m=video[mid:2,recvonly] </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+16</c><c>Create Answer with m=audio,m=video[mid:1,sendonly],m=video[mid:2,recvonly]  </c>
        <c>Create Answer with m=audio,m=video[mid:1,recvonly],m=video[mid:2,sendonly] </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+17</c><c>Compare Local Set and generated Answer </c>
        <c>Compare Local Set and generated Answer </c>
        <c> </c><c> </c><c> </c>
        
        <c>t+18</c><c>Local Set at t+12 matches with Answer m-blocks at t+16 </c>
        <c>Local Set at t+12 matches with Answer m-blocks at t+16 </c>
        <c> </c><c> </c><c> </c>

        
      </texttable>   
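      <t>
        The direction of each m-block sent at t+13 follows from the m-block received at t+6:
        a stream offered as sendonly is accepted as recvonly. A minimal TypeScript sketch of this
        mapping is given below. It assumes that the receiving side accepts the stream (an
        answerer could instead respond with inactive), and the names mirrorDirection,
        VideoMBlock, and answerMBlock are hypothetical.
      </t>
      <figure>
        <artwork><![CDATA[
```typescript
type Direction = "sendrecv" | "sendonly" | "recvonly" | "inactive";

// Direction an accepting answerer returns for an offered direction.
function mirrorDirection(offered: Direction): Direction {
  switch (offered) {
    case "sendonly":
      return "recvonly";
    case "recvonly":
      return "sendonly";
    default:
      // sendrecv and inactive are returned unchanged in this sketch.
      return offered;
  }
}

// Hypothetical reduced view of a video m-block (illustration only).
interface VideoMBlock {
  mid: string;
  direction: Direction;
}

// Builds the m-block answer for an m-block received from the peer.
// The mid is preserved so that the peer can match the response.
function answerMBlock(received: VideoMBlock): VideoMBlock {
  return {
    mid: received.mid,
    direction: mirrorDirection(received.direction),
  };
}
```
]]></artwork>
      </figure>
      <t>
        Applied to the m-block [mid:2,sendonly] that Alice receives at t+6, the sketch yields
        [mid:2,recvonly], which is the m-block she sends at t+13.
      </t>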
    
      </section>
  </section> 
     
  <section anchor="sec-applicability" title="Applicability Statement">
		<t>
		 As mentioned earlier, generating an Offer from the m-block Answer
		 received from the Answerer might fail because of mismatches in configuration 
		 between the offered m-block, the applied ExpectedAnswer, and the actual Answer received 
		 from the Answerer.
		</t>
    <t>
     On the other hand, this solution works best in cases where the first side can
     easily predict what the far side's answer will be. Because an Offerer rarely offers
     a capability that is meant to fail, and because Offers are likely to be 
     accepted in practice, we believe this solution should
     enable successful glareless media addition most of the time.		
    </t>    
  </section>

   <section anchor="sec-sec" title="Security Considerations">
      <t>TBD</t>
    </section>

    <section title="IANA Considerations">
      <t>This document requires no actions from IANA.</t>
    </section>
	
</middle>

<back>
    <references title="Normative References">
		<?rfc include="reference.RFC.2119"?>		
    </references>

    <references title="Informative References">
      <?rfc include="reference.RFC.3264"?>
      <?rfc include="reference.RFC.4566"?>
    <?rfc include="reference.RFC.3261"?>
      <reference anchor='RFC5888'>
        
        <front>
          <title>The Session Description Protocol (SDP) Grouping Framework</title>
          <author initials='G.' surname='Camarillo' fullname='G. Camarillo'>
            <organization /></author>
          <author initials='H.' surname='Schulzrinne' fullname='H. Schulzrinne'>
            <organization /></author>
          <date year='2010' month='June' />
          <abstract>
            <t>In this specification, we define a framework to group "m" lines in the Session Description Protocol (SDP) for different purposes.  This framework uses the "group" and "mid" SDP attributes, both of which are defined in this specification.  Additionally, we specify how to use the framework for two different purposes: for lip synchronization and for receiving a media flow consisting of several media streams on different transport addresses.  This document obsoletes RFC 3388. [STANDARDS-TRACK]</t></abstract></front>
        
        <seriesInfo name='RFC' value='5888' />
        <format type='TXT' octets='43924' target='http://www.rfc-editor.org/rfc/rfc5888.txt' />
      </reference>
      
      <reference anchor='I-D.alvestrand-mmusic-msid'>
        <front>
          <title>Cross Session Stream Identification in the Session Description Protocol</title>
          
          <author initials='H' surname='Alvestrand' fullname='Harald Alvestrand'>
            <organization />
          </author>
          
          <date month='December' day='13' year='2012' />
          
          <abstract><t>This document specifies a grouping mechanism for RTP media streams that can be used to specify relations between media streams within different RTP sessions as well as within a single RTP session.  This mechanism is used to signal the association between the RTP concept of SSRC and the WebRTC concept of "media stream" / "media stream track" using SDP signaling.  This document is an input document for discussion.  It should be discussed in the MMUSIC WG list, mmusic@ietf.org.</t></abstract>
          
        </front>
        
        <seriesInfo name='Internet-Draft' value='draft-alvestrand-mmusic-msid-02' />
        <format type='TXT'
          target='http://www.ietf.org/internet-drafts/draft-alvestrand-mmusic-msid-02.txt' />
      </reference>
    </references>
  </back>
</rfc>