<?xml version="1.0" encoding="iso-8859-1"?>
<!-- comment -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd"[]>
<?rfc toc="no" ?>
<?rfc compact="yes" ?>
<?rfc sortrefs="no" ?>
<?rfc symrefs="yes" ?>
<rfc ipr="trust200902" category="info" docName="draft-nandakumar-rtcweb-glare-handling-00.txt" submissionType="IETF" xml:lang="en">
<front>
<title abbrev="Glareless Media">
Glareless addition of media to existing RTCWeb Sessions
</title>
<author fullname="Suhas Nandakumar" initials="S." surname="Nandakumar">
<organization>Cisco</organization>
<address>
<postal>
<street>170 West Tasman Drive</street>
<city>San Jose</city>
<region>CA</region>
<code>95134</code>
<country>USA</country>
</postal>
<email>snandaku@cisco.com</email>
</address>
</author>
<author fullname="Cullen Jennings" initials="C." surname="Jennings">
<organization>Cisco</organization>
<address>
<postal>
<street>400 3rd Avenue SW, Suite 350</street>
<city>Calgary</city>
<region>AB</region>
<code>T2P 4H2</code>
<country>Canada</country>
</postal>
<email>fluffy@iii.ca</email>
</address>
</author>
<date day="18" month="February" year="2013" />
<area>RAI</area>
<abstract>
<t>
The RFC3264 Offer/Answer model specifies rules for the bilateral exchange of
Session Description Protocol (SDP) [RFC4566] messages for setting up, updating, and
tearing down multimedia streams. Occasionally, both communicating parties try to
update an ongoing session at the same time, so that each of them ends up being an offerer.
This scenario is commonly known as a "glare" condition and, although rare, it needs to be handled.
This specification describes procedures for parties involved in an ongoing RTCWeb session to
add new media in a glareless fashion.
</t>
<t>
There are various ways this problem might be solved; this
draft sketches out one possible solution.
</t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>
In its most basic form, the RFC3264 <xref target="RFC3264"></xref> protocol operation
begins with one of the participants sending an initial SDP offer describing its intent to
start a multimedia communication session. The participant receiving the offer MAY generate
an SDP answer accepting the offer or it MAY reject the offer. Once the session is set up,
at any time, either agent MAY generate a new offer that updates the session. However, it
MUST NOT generate a new offer if it has received an offer which it has not yet answered or
rejected. If an agent receives an offer after having sent one, but before receiving an
answer to it, the situation is considered a "glare" condition.
</t>
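<t>
The rule above can be illustrated with a minimal sketch of the offer/answer
state tracking, written in Python for brevity. The class, method, and state names
are hypothetical and chosen only for illustration; they are not defined by
RFC3264 or by this document, and no SDP is exchanged.
</t>
<figure>
<artwork><![CDATA[
```python
from enum import Enum


class State(Enum):
    STABLE = "stable"
    HAVE_LOCAL_OFFER = "have-local-offer"
    HAVE_REMOTE_OFFER = "have-remote-offer"


class Glare(Exception):
    """An offer arrived while our own offer was still unanswered."""


class Agent:
    """Hypothetical offer/answer agent; tracks state only, no SDP."""

    def __init__(self, name):
        self.name = name
        self.state = State.STABLE

    def send_offer(self):
        # No new offer while an offer is pending in either direction.
        if self.state is not State.STABLE:
            raise RuntimeError(self.name + ": offer already pending")
        self.state = State.HAVE_LOCAL_OFFER

    def receive_offer(self):
        if self.state is State.HAVE_LOCAL_OFFER:
            # Our own offer is still outstanding: this is glare.
            raise Glare(self.name + ": offer while awaiting answer")
        if self.state is not State.STABLE:
            raise RuntimeError(self.name + ": offer already pending")
        self.state = State.HAVE_REMOTE_OFFER

    def receive_answer(self):
        if self.state is not State.HAVE_LOCAL_OFFER:
            raise RuntimeError(self.name + ": no outstanding offer")
        self.state = State.STABLE


def in_glare(agent):
    """Deliver a remote offer and report whether it caused glare."""
    try:
        agent.receive_offer()
    except Glare:
        return True
    return False
```
]]></artwork>
</figure>
<t>
If two such agents both call send_offer() before either offer is delivered,
in_glare() returns True for each of them, which is the situation this
document sets out to avoid.
</t>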
<t>
This specification defines a set of procedures for RTCWeb end-points to add new media to an
ongoing session in a glareless fashion. The rest of this document is organized as follows.
Section 3 provides motivation for dealing with the glare condition. Section 4 explains
the detailed call flows with examples describing the solution proposed in this specification.
Finally, Section 5 concludes with a note on applicability.
</t>
</section>
<section title="Terminology">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in <xref target="RFC2119"></xref>.</t>
<t>m-line: An <xref target="RFC4566">RFC4566</xref> media description
line that starts with the "m=" field and conveys the following values: media
type, transport port, transport protocol, and media format descriptions.</t>
<t>m-block: An <xref target="RFC4566">RFC4566</xref> media description
that starts with an m-line and is terminated by either the next m-line or
by the end of the session description.</t>
<t>Offer: An <xref target="RFC3264"></xref> SDP message generated by the
participant who wishes to initiate a multimedia communication session.
An Offer describes the participant's capabilities for engaging in a multimedia
session.</t>
<t>Answer: An <xref target="RFC3264"></xref> SDP message generated by the
participant in response to an Offer. An Answer describes the participant's
capabilities for continuing with the multimedia session within the
constraints of the Offer.</t>
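<t>
To make the m-line and m-block terms concrete, the following sketch, written in
Python for brevity, splits an SDP message into its session-level lines and its
m-blocks, and breaks an m-line into its four values. The function names and the
sample SDP (including its addresses, ports, and payload types) are made up for
illustration.
</t>
<figure>
<artwork><![CDATA[
```python
EXAMPLE_SDP = """\
v=0
o=- 20518 0 IN IP4 203.0.113.1
s=-
c=IN IP4 203.0.113.1
t=0 0
m=audio 54400 RTP/SAVPF 0 96
a=mid:0
a=sendrecv
m=video 55400 RTP/SAVPF 97
a=mid:1
a=sendonly
"""


def split_sdp(sdp):
    """Return (session_lines, m_blocks); an m-block is a list of lines."""
    session, blocks = [], []
    for line in sdp.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("m="):
            blocks.append([line])      # an m-line opens a new m-block
        elif blocks:
            blocks[-1].append(line)    # part of the current m-block
        else:
            session.append(line)       # session level, before any m-line
    return session, blocks


def parse_m_line(m_line):
    """Split 'm=<media> <port> <proto> <fmt> ...' into its values."""
    media, port, proto, *formats = m_line[2:].split()
    return {"media": media, "port": port,
            "proto": proto, "formats": formats}
```
]]></artwork>
</figure>
<t>
Applied to the sample above, split_sdp() yields five session-level lines and
two m-blocks; the second m-block starts with the video m-line and ends at the
end of the session description.
</t>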
</section>
<section anchor="sec-motivation" title="Motivation">
<t>
The following example serves as the base case for the "glare" condition that is discussed throughout
this document in the context of the proposed solution.
<list>
<t>
Alice and Bob are in a two-way audio-only RTCWeb session.
</t>
<t>
They decide to escalate to a video session.
</t>
<t>
Each initiates the addition of their respective camera stream to the current
session at almost the same time. This is done by each side sending an Offer
with the appropriate media descriptions.
</t>
<t>
Alice receives an Offer from Bob before receiving an Answer to her Offer.
</t>
<t>
Bob receives Alice's Offer when he is expecting an Answer for his Offer.
</t>
<t>
Both Alice and Bob have outstanding Offers and they end up stuck in a "glare"
situation.
</t>
</list>
</t>
<t>
End-points stuck in the "glare" condition stay there forever unless the condition
is resolved either by some higher-layer protocol mechanism or by some application logic.
An example of the former is SIP <xref target="RFC3261"></xref>, which provides means for ordering
messages in each direction to resolve the "glare" condition. This document
proposes a solution of the latter kind.
</t>
</section>
<section title="Solution">
<t>
The basic outline of the solution is as follows.
<list style="numbers">
<t>
Participants set up an initial RTCWeb session using normal Offer/Answer procedures.
This can be any combination of audio, video, or data-channel sessions.
</t>
<t>
Whenever participants decide to add new media, an m-block describing the
capabilities of the media stream is generated and exchanged with the peer.
</t>
<t>
Each sender "opportunistically" completes its own Offer by installing, at its
local Peer Connection, the Answer that it expects to receive for the m-blocks
sent. This completes the Offer/Answer state machine at the sender, so that
no Offer is left outstanding.
In this specification, the Answer used for this purpose is called an
"ExpectedAnswer", since it is not the real Answer from the Answerer.
</t>
<t>
When the m-blocks reach the Answerer's Javascript application, the
application needs to re-order the m-blocks so that they are in the right
order for the local SDP.
</t>
<t>
Once re-ordered to match the order of m-blocks in the local SDP, the Offer is
installed with the Peer Connection at the Answerer, as the remote description, to
generate an Answer.
</t>
<t>
On obtaining the Answer, the Answerer's Javascript application extracts the m-block
and sends it to the Offerer as the response.
</t>
<t>
On receiving the response m-blocks from the Answerer, the Offerer needs to re-order
the m-blocks to be in the right order for the Offerer's local SDP.
</t>
<t>
The Offerer's Javascript application then generates an updated Offer from the
m-blocks received and re-ordered, and applies it as the remote description to generate
an Answer. This is done to apply the real Answer from the Answerer and overwrite
any number of "ExpectedAnswers" applied in the interim. The term "Overlapping Zone"
is used to indicate the time between the initial m-block Offer and the final m-block Answer.
During this time, the end-points at either end MAY be executing one or more
iterations of m-block exchanges.
</t>
<t style="hangText">
A few limitations might cause the conversion of the m-block Answer into an
Offer at the Offerer to fail, as described in
<xref target="sec-applicability"></xref>.
</t>
<t>
The Offerer's Javascript application will extract m-blocks from the generated Answer
to match it with its local SDP to ensure that a successful m-block Offer/Answer exchange
did happen. If not, the on-going session is terminated.
</t>
</list>
</t>
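<t>
The "ExpectedAnswer" step in the outline above can be sketched as follows, in
Python for brevity. The sketch assumes that an m-block is held as a list of SDP
lines and that the ExpectedAnswer marks the newly offered media as inactive, as
in the call flow later in this document. The function name is hypothetical, and
a real Answer would also carry the peer's own transport parameters, which are
not modelled here.
</t>
<figure>
<artwork><![CDATA[
```python
DIRECTION_LINES = {"a=sendrecv", "a=sendonly",
                   "a=recvonly", "a=inactive"}


def expected_answer_block(offer_block):
    """Copy an offered m-block and mark it inactive.

    Illustration only: every non-direction line is copied from the
    offer unchanged, which a real Answer would not do.
    """
    answer = [line for line in offer_block
              if line not in DIRECTION_LINES]
    answer.append("a=inactive")
    return answer


offered_video = [
    "m=video 55400 RTP/SAVPF 97",
    "a=mid:1",
    "a=sendonly",
]
expected = expected_answer_block(offered_video)
```
]]></artwork>
</figure>
<t>
Installing such an ExpectedAnswer leaves no Offer outstanding at the sender, and
keeping the media inactive means nothing is sent on the new stream until the
real Answer from the peer has been applied.
</t>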
<section title="Requirements">
<t>
The following are the basic ideas and requirements for the proposed solution to perform
glareless addition of new media.
<list style="symbols">
<t>
There is an ongoing RTCWeb session between the parties, say Alice and Bob. This
implies that an RTP transport association has been successfully set up between the peers.
</t>
<t>
The Javascript applications have their own copy of the most recent successfully negotiated
SDP with the local ordering of m-blocks preserved.
</t>
<t>
The Javascript application is responsible for maintaining the appropriate ordering of the
m-blocks for that User Agent.
</t>
<t>
Subsequent Offer/Answer exchanges after the initial exchange might not use complete
SDP messages to indicate updates to the session, say adding a new video stream.
One approach proposed in this document involves exchange of "m-blocks" that
describe the capabilities of the newly added media.
</t>
<t>
The Javascript application MUST be capable of generating complete SDP Offer or Answer
from the session updates exchanged so as to supply the same to the User Agent or to
negotiate the same with the peer, if needed.
</t>
<t>
The end-points MUST support at least one way to identify the m-blocks. In this specification,
mid <xref target="RFC5888"></xref> is used for this purpose; other identification
approaches, such as msid <xref target="I-D.alvestrand-mmusic-msid"></xref>, could be used as well.
</t>
</list>
</t>
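<t>
The requirements on ordering, identification by mid, and generation of a complete
SDP can be sketched as follows, in Python for brevity. The sketch assumes that an
m-block is held as a list of SDP lines carrying an "a=mid:" attribute; the
function names are hypothetical and the merge policy (replace in place by mid,
append unknown mids at the end) is one possible choice, not one mandated by
this document.
</t>
<figure>
<artwork><![CDATA[
```python
def mid_of(block):
    """Return the value of the a=mid attribute, or None if absent."""
    for line in block:
        if line.startswith("a=mid:"):
            return line[len("a=mid:"):]
    return None


def merge_blocks(local_blocks, incoming_blocks):
    """Merge incoming m-blocks into the local set.

    A block whose mid is already known replaces the local copy in
    place, so the local ordering is preserved; a block with a new
    mid is appended at the end.
    """
    incoming = {mid_of(b): b for b in incoming_blocks}
    merged = [incoming.pop(mid_of(b), b) for b in local_blocks]
    merged.extend(incoming.values())
    return merged


def build_sdp(session_lines, blocks):
    """Assemble a complete SDP from session lines and m-blocks."""
    lines = list(session_lines)
    for block in blocks:
        lines.extend(block)
    return "\r\n".join(lines) + "\r\n"
```
]]></artwork>
</figure>
<t>
With this policy, two peers that each add a video m-block end up with different
local orderings of the same mids, which is why m-blocks received from the peer
have to be re-ordered before being handed to the local User Agent.
</t>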
</section>
<section anchor="sec-callflow" title="Call Flow - Glareless One-Way Video Addition">
<t>
Below is the high-level call flow that captures the procedure for adding one-way video
to an ongoing RTCWeb session in a glareless fashion. Note that the call flow
captures the proposed solution from Alice's perspective alone, since Bob's side
is symmetric.
</t>
<figure>
<artwork alt="Glareless One-Way Video Add"
src="Glareless.png" width="100%" ><![CDATA[
title Glareless One-Way Video Add
participant Alice_PC
participant Alice_JS
participant BOB_JS
participant BOB_PC
note over Alice_PC, BOB_PC
On-Going RTP Session with any combination of
audio and/or video and/or data-channel.
end note
note over Alice_JS, BOB_JS
Alice and Bob's JS application have local copies of the most recent
successfully negotiated SDP
end note
Alice_JS->Alice_PC: AddStream(Video)
Alice_JS->+Alice_PC: CreateOffer()
Alice_PC-->-Alice_JS:offer with video sendonly.
Alice_JS->Alice_PC: ExpectedAnswer w/video inactive
Alice_JS->BOB_JS: m-block offer for added video source
BOB_JS->BOB_JS: Merge m-block to local set
BOB_JS->BOB_JS: Generate Offer from the local set
BOB_JS->BOB_PC: SetRemoteDesc(Offer)
BOB_JS->+BOB_PC: CreateAnswer()
BOB_PC-->-BOB_JS: answer with video recvonly
BOB_JS->BOB_JS: ReMap answer to local set
BOB_JS->Alice_JS:m-block answer accepting the video stream
Alice_JS->Alice_JS: Merge Bob's m-block to generate updated offer
Alice_JS->Alice_PC: SetRemoteDesc() w/updated offer
Alice_JS->+Alice_PC: CreateAnswer()
Alice_PC-->-Alice_JS: answer with video sendonly
Alice_JS->Alice_JS: Check for Success/Failure
note left of Alice_JS
For success, the m-block values from the generated answer MUST
match with the corresponding values in the local set matched
by their MIDs
end note
]]></artwork>
</figure>
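<t>
The final success check in the call flow can be sketched as follows, in Python
for brevity. The sketch assumes that each m-block has been reduced to a small
record of the values being compared (here only the media type and direction);
the record layout and the function name are hypothetical.
</t>
<figure>
<artwork><![CDATA[
```python
def mismatched_mids(local_set, generated_answer):
    """Return the mids whose answer m-block differs from the local set.

    An empty result means the m-block Offer/Answer exchange succeeded.
    A mid that is missing from the local set counts as a mismatch.
    """
    local_by_mid = {block["mid"]: block for block in local_set}
    bad = []
    for block in generated_answer:
        if local_by_mid.get(block["mid"]) != block:
            bad.append(block["mid"])
    return bad


local_set = [
    {"mid": "1", "media": "video", "direction": "sendonly"},
]
good_answer = [
    {"mid": "1", "media": "video", "direction": "sendonly"},
]
bad_answer = [
    {"mid": "1", "media": "video", "direction": "inactive"},
]
```
]]></artwork>
</figure>
<t>
Here mismatched_mids(local_set, good_answer) is empty, while the second answer
reports mid 1 as mismatched, in which case the application treats the exchange
as failed.
</t>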
</section>
<section title="Call Flow Details">
<t>
The table below provides a step-by-step analysis of the various states as reflected
at Alice's and Bob's end-points when applied to the example described in
<xref target="sec-motivation"></xref>.
</t>
<t hangText="Note">
Only the relevant aspects of the SDP media descriptions are captured, for the sake of
clarity.
</t>
<texttable title="Timeline for Glareless Video Addition">
<ttcol align='left'>Timeline </ttcol>
<ttcol align='left'>Alice's JS View </ttcol>
<ttcol align='left'>Bob's JS View </ttcol>
<c>t+0</c><c>Local Set: m=audio (sendrecv)</c><c>Local Set: m=audio (sendrecv) </c>
<c> </c><c> </c><c> </c>
<c>t+1</c><c>AddStream(Video)-sendonly </c><c>AddStream(Video)-sendonly </c>
<c> </c><c> </c><c> </c>
<c>t+2</c><c>CreateOffer with m=audio,m=video[mid:1,sendonly] </c>
<c>CreateOffer with m=audio,m=video[mid:2,sendonly] </c>
<c> </c><c> </c><c> </c>
<c>t+3</c><c>Generate ExpectedAnswer with m=audio,m=video[mid:1,inactive] </c>
<c>Generate ExpectedAnswer with m=audio,m=video[mid:2,inactive] </c>
<c> </c><c> </c><c> </c>
<c>t+4</c><c>Install ExpectedAnswer </c> <c>Install ExpectedAnswer </c>
<c> </c><c> </c><c> </c>
<c>t+5</c><c>Send video m-block [mid:1,sendonly] </c>
<c>Send video m-block [mid:2,sendonly] </c>
<c> </c><c> </c><c> </c>
<c>t+6</c><c>Video m-block [mid:2,sendonly] Arrives </c>
<c>Video m-block [mid:1,sendonly] Arrives </c>
<c> </c><c> </c><c> </c>
<c>t+7</c><c>Merge/Update Local Set </c><c>Merge/Update Local Set </c>
<c> </c><c> </c><c> </c>
<c>t+8</c><c>Local Set:m=audio,m=video[mid:1,sendonly],m=video[mid:2,sendonly] </c>
<c>Local Set:m=audio,m=video[mid:2,sendonly],m=video[mid:1,sendonly] </c>
<c> </c><c> </c><c> </c>
<c>t+9</c><c>Install RemoteDescription </c>
<c>Install RemoteDescription </c>
<c> </c><c> </c><c> </c>
<c>t+10</c><c>CreateAnswer with m=audio,m=video[mid:1,sendonly],m=video[mid:2,recvonly] </c>
<c>CreateAnswer with m=audio,m=video[mid:1,recvonly],m=video[mid:2,sendonly] </c>
<c> </c><c> </c><c> </c>
<c>t+11</c><c>Merge/Update Local Set </c> <c>Merge/Update Local Set </c>
<c> </c><c> </c><c> </c>
<c>t+12</c><c>Local Set:m=audio,m=video[mid:1,sendonly],m=video[mid:2,recvonly] </c>
<c>Local Set:m=audio,m=video[mid:2,sendonly],m=video[mid:1,recvonly] </c>
<c> </c><c> </c><c> </c>
<c>t+13</c><c>Send video m-block [mid:2,recvonly] </c>
<c>Send video m-block [mid:1,recvonly] </c>
<c> </c><c> </c><c> </c>
<c>t+14</c><c>Video m-block [mid:1,recvonly] Arrives </c>
<c>Video m-block [mid:2,recvonly] Arrives </c>
<c> </c><c> </c><c> </c>
<c>t+15</c><c>Install updated Offer m=audio,m=video[mid:1,recvonly],m=video[mid:2,sendonly] </c>
<c>Install updated Offer m=audio,m=video[mid:1,sendonly],m=video[mid:2,recvonly] </c>
<c> </c><c> </c><c> </c>
<c>t+16</c><c>Create Answer with m=audio,m=video[mid:1,sendonly],m=video[mid:2,recvonly] </c>
<c>Create Answer with m=audio,m=video[mid:1,recvonly],m=video[mid:2,sendonly] </c>
<c> </c><c> </c><c> </c>
<c>t+17</c><c>Compare Local Set and generated Answer </c>
<c>Compare Local Set and generated Answer </c>
<c> </c><c> </c><c> </c>
<c>t+18</c><c>Local Set at t+12 matches with Answer m-blocks at t+16 </c>
<c>Local Set at t+12 matches with Answer m-blocks at t+16 </c>
<c> </c><c> </c><c> </c>
</texttable>
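<t>
The direction values that appear in the CreateAnswer rows of the timeline follow
from the RFC3264 direction rules, which can be sketched as follows, in Python
for brevity. The function and parameter names are hypothetical; want_send and
want_recv stand for what the answering side is itself willing to do on that
m-block.
</t>
<figure>
<artwork><![CDATA[
```python
def answer_direction(offered, want_send, want_recv):
    """Pick the answer direction for one offered m-block.

    The answerer may send only if the offerer is willing to receive,
    and may receive only if the offerer is willing to send.
    """
    peer_sends = offered in ("sendonly", "sendrecv")
    peer_recvs = offered in ("recvonly", "sendrecv")
    send = want_send and peer_recvs
    recv = want_recv and peer_sends
    if send and recv:
        return "sendrecv"
    if send:
        return "sendonly"
    if recv:
        return "recvonly"
    return "inactive"
```
]]></artwork>
</figure>
<t>
For instance, at t+10 Alice answers Bob's sendonly video (mid:2) with recvonly,
and at t+16 she answers the recvonly m-block for her own video (mid:1) with
sendonly, which is what her local set already contained at t+12.
</t>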
</section>
</section>
<section anchor="sec-applicability" title="Applicability Statement">
<t>
As mentioned earlier, generating an Offer from the m-block Answer received from the
Answerer might fail because of mismatches in configuration between the offered
m-block, the applied ExpectedAnswer, and the actual Answer received
from the Answerer.
</t>
<t>
On the other hand, this solution works best in cases where the first side can
easily predict what the far side's answer will be. Given that an Offerer rarely offers
a capability that is expected to be rejected, and given that Offers are usually
accepted in practice, we believe this solution should enable successful glareless
media addition in most cases.
</t>
</section>
<section anchor="sec-sec" title="Security Considerations">
<t>TBD</t>
</section>
<section title="IANA Considerations">
<t>This document requires no actions from IANA.</t>
</section>
</middle>
<back>
<references title="Normative References">
<?rfc include="reference.RFC.2119"?>
</references>
<references title="Informative References">
<?rfc include="reference.RFC.3264"?>
<?rfc include="reference.RFC.4566"?>
<?rfc include="reference.RFC.3261"?>
<reference anchor='RFC5888'>
<front>
<title>The Session Description Protocol (SDP) Grouping Framework</title>
<author initials='G.' surname='Camarillo' fullname='G. Camarillo'>
<organization /></author>
<author initials='H.' surname='Schulzrinne' fullname='H. Schulzrinne'>
<organization /></author>
<date year='2010' month='June' />
<abstract>
<t>In this specification, we define a framework to group "m" lines in the Session Description Protocol (SDP) for different purposes. This framework uses the "group" and "mid" SDP attributes, both of which are defined in this specification. Additionally, we specify how to use the framework for two different purposes: for lip synchronization and for receiving a media flow consisting of several media streams on different transport addresses. This document obsoletes RFC 3388. [STANDARDS-TRACK]</t></abstract></front>
<seriesInfo name='RFC' value='5888' />
<format type='TXT' octets='43924' target='http://www.rfc-editor.org/rfc/rfc5888.txt' />
</reference>
<reference anchor='I-D.alvestrand-mmusic-msid'>
<front>
<title>Cross Session Stream Identification in the Session Description Protocol</title>
<author initials='H' surname='Alvestrand' fullname='Harald Alvestrand'>
<organization />
</author>
<date month='December' day='13' year='2012' />
<abstract><t>This document specifies a grouping mechanism for RTP media streams that can be used to specify relations between media streams within different RTP sessions as well as within a single RTP session. This mechanism is used to signal the association between the RTP concept of SSRC and the WebRTC concept of "media stream" / "media stream track" using SDP signaling. This document is an input document for discussion. It should be discussed in the MMUSIC WG list, mmusic@ietf.org.</t></abstract>
</front>
<seriesInfo name='Internet-Draft' value='draft-alvestrand-mmusic-msid-02' />
<format type='TXT'
target='http://www.ietf.org/internet-drafts/draft-alvestrand-mmusic-msid-02.txt' />
</reference>
</references>
</back>
</rfc>