http://stupid.domain.name/ietf/

One document matched: draft-wenger-clue-definitions-02.xml
<?xml version="1.0" encoding="US-ASCII"?>

<!DOCTYPE rfc SYSTEM "http://xml.resource.org/authoring/rfc2629.dtd"  [

<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC3261 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3261.xml">
<!ENTITY RFC3550 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml">
<!ENTITY RFC4353 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4353.xml">
<!ENTITY RFC5117 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5117.xml">

]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc iprnotified="no" ?>
<?rfc strict="no" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no"?>
<?rfc sortrefs="no" ?>

<rfc category="info" docName="draft-wenger-clue-definitions-02.txt" ipr="trust200902">

  <front>
     <title abbrev="CLUE Definitions"> CLUE Definitions</title>

    <author fullname="Stephan Wenger" initials="S." surname="Wenger">
      <organization>Vidyo, Inc.</organization>

      <address>
        <postal>
          <street>433 Hackensack Ave., 7th Floor</street>

          <city>Hackensack</city>

          <region>NJ</region>

          <code>07601</code>

          <country>USA</country>
        </postal>

        <email>stewe@stewe.org</email>
      </address>
    </author> 
    <date month="October" year="2011" />


    <workgroup>CLUE WG</workgroup>


    <abstract>
      <t>This document collects terminology and definitions to be used consistently 
      among documents produced in the CLUE working group.  
        </t>
    </abstract>
  </front>

  <middle>
    <section anchor="Introduction" title="Introduction">
<t>
This document collects terminology and definitions to be used consistently
among documents produced in the CLUE working group.  The 
    draft is intended to be a
living document, to be updated as the working group discussions progress.  It is 
not intended for publication 
    as an RFC; instead, it is expected that definitions
and terminology defined herein are going to be replicated in other CLUE documents.
</t>

<t>
All defined terms herein are captitalized.  It is recommended that authors of 
other CLUE document follow this conventions when using 
    definitions from this
document.
</t>

<t>
It should be understood that the definitions herein are not to be interpreted as
antedating design decisions to be made later.  It's entierly 
    reasonable that
design choices conflicting with definitions in this draft may become necessary.
In such cases, CLUE document authors can 
    either request the definition be updated,
or can simply avoid the use of a definition of this document.  However, an
author should never 
    use a capitzalied definition as set forth herein with a different
meaning in a different CLUE document.
</t>


</section>

<section anchor="terms" title="Terminology">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref target="RFC2119">RFC 2119</xref>.</t>
</section>


<section anchor="Definitions" title="Definitions">
<list style='hanging' hangIndent='5'>


<t>  Audio Capture: Media Capture for audio.  Denoted as ACn. </t>
<t> 
Audio Mixing: refers to the accumulation of scaled audio signals to produce a single
audio stream. See RTP Topologies, <xref target="RFC5117"/>.  
</t>


<t>  Camera-Left and Right: For media captures, camera-left and camera-right are from the point of view of a person observing the rendered
     media.  They are the opposite of stage-left and stage-right. </t>
<t> 
Capture Device: A device that converts audio and video input
into an electrical signal, in most cases to be fed into a media
encoder.  
     Cameras, microphones, and keyboards are examples for capture devices.    
</t>


<t>  Capture Scene: the scene that is captured by a collection of Capture Devices.  A Capture Scene may be represented by more than one type of
     Media.  A Capture Scene may include more than one Media Capture of the same type.  An example of a Capture Scene is the video image of a
     group of people seated next to each other, along with the sound of their voices, which could be represented by some number of VCs and
     ACs.  A middle box may also express Capture Scenes that it constructs from Media streams it receives. </t>
<t>  Capture Set: includes Media Captures that all represent some aspect of the same Capture Scene.  The items (rows) in a Capture Set
     represent different alternatives for representing the same Capture Scene. </t>
<t> 
Conference: used as defined in <xref target="RFC4353"/>, A Framework
for Conferencing within the Session Initiation Protocol (SIP).
</t>


<t>  Encoding Group: Encoding group: A set of encoding parameters representing a device's complete encoding capabilities or a
     subdivision of them.  Media stream providers formed of multiple physical units, in each of which resides some encoding capability,
     would typically advertise themselves to the remote media stream consumer as being formed multiple encoding groups.  Within each
     encoding group, multiple potential actual encodings are possible, with the sum of those encodings' characteristics constrained to being
     less than or equal to the group-wide constraints.</t>
<t>  Endpoint: The logical point of final termination through
receiving, decoding and rendering, and/or initiation through
capturing, encoding, 
     and sending of media streams.  An endpoint
consists of one or more physical devices which source and sink
media streams, and exactly 
     one <xref target="RFC4353"/> Participant
(which, in turn, includes exactly one SIP User Agent).
In contrast to an endpoint, an MCU may 
     also send and receive media streams, but it is
not the initiator nor the final terminator in the sense that Media
is Captured or Rendered.  
     Endpoints can be
anything from multiscreen/multicamera rooms to handheld devices.
</t>


<t>
 Endpoint Characteristics: include placement of Capture and Rendering 
Devices, capture/render angle, resolution of cameras and screens, 
     spatial 
location and mixing parameters of microphones.  Endpoint characteristics 
are not specific to individual media streams sent by the endpoint.
</t>


<t>  Front: the portion of the room closest to the cameras.  In going towards back you move away from the cameras. </t>
<t>  Individual Encode: A variable with a set of attributes that describes the maximum values of a single audio or video capture
     encoding.  The attributes include: maximum bandwidth- and for video maximum macroblocks, maximum width, maximum height, maximum frame 
     rate.  [Edt. These are based on H.264.] </t>
<t>  Layout: How rendered media streams are spatially arranged with respect
to each other on a single screen/mono audio telepresence
endpoint, and how rendered 
     media streams are arranged with respect to
each other on a multiple screen/loudspeaker telepresence endpoint.
Note that audio as well as video is encompassed 
     by the term
layout--in other words, included is the placement of audio
streams on loudspeakers as well as video streams on video screens.
Edit. note: 
     this is on the RENDERER side.
</t>


<t>
 Left: to be interpreted in the context of the description where the word occurs."
Edt. note: there has been a lot of mailing list discussions about "left" 
     and "right", 
and we seem not to be able to arrive at a common understanding.  
</t>


<t>
 Local: Sender and/or receiver physically co-located ("local", "in the same room") in the context of the discussion.
</t>

<t>
 MCU: Multipoint Control Unit (MCU) - a device that
connects two or more endpoints together into one single
multimedia conference <xref target="RFC5117"/>.  
     An MCU includes
an <xref target="RFC4353"/> Mixer.
Edt. Note: RFC4353 is tardy in requireing that media from the mixer
be sent to EACH participant.  
     I think we have practical use cases
where this is not the case.  But the bug (if it is one) is in 4353
and not herein.
</t>


<t> 
Media: Any data that, after suitable encoding, can be conveyed
over RTP, including audio, video or timed text.
Edt. note: does Media include far end 
     camera control (which can be
conveyed over RTP)?
</t>


<t>  Media Capture: a source of Media, such as from one or more Capture Devices.  A Media Capture (MC) may be the source of one or more Media
     streams.  A Media Capture may also be constructed from other Media streams.  A middle box can express Media Captures that it constructs
     from Media streams it receives. </t>
<t>  Media Consumer: an Endpoint or middle box that receives Media streams </t>
<t>  Media Provider: an Endpoint or middle box that sends Media streams</t>
<t>
Model: a set of assumptions a telepresence system of a given vendor 
adheres to and expects 
     the remote telepresence system(s) also to adhere to. 
</t>


<t>
Participant: to be interpreted as defined in RFC 4353
</t>


<t> 
Remote: Sender and/or receiver on the other side of the communication
channel (depending on context); not Local. A remote can be an Endpoint or an MCU.          
</t>


<t> 
Render: the process of generating a representation from a media, such 
as displayed motion video or sound emitted from loudspeakers.
</t>


<t>
 Rendering Device: A device that converts an electrical signal - in most cases stemming 
from a media decoder - to audible and visual signals.  
     Screens and loudspeakers are 
examples for rendering devices.
</t>


<t>
 Right: to be interpreted in the context of the description where the word occurs."/>
</t>


<t>  Simultaneous Transmission Set: a set of media captures that can be transmitted simultaneously from a Media Provider. </t>
<t> 
Source selection policies: rules for determining which media
source(s) to play or show.
</t>


<t> 
Spatial Relation: The arrangement in space of two objects, in
contrast to relation in time or other relationships.  See also Left and
Right.  
</t>


<t>  RTP stream as in [RFC3550]. </t>
<t>
 Stream Characteristics: include media stream attributes commonly used 
in non-CLUE SIP/SDP environments (such as: media codec, bit rate, resolution, 

     profile/level etc.) as well as CLUE specific attributes (which could 
include for example and depending on the solution found: the I-D or spatial 

     location of a capture device a stream originates from).
</t>

<t>
Telepresence: an environment that gives non co-located users or 
user groups a 
     feeling of (co-located) presence - the feeling that a 
Local user is in the same room with other Local users and the Remote 
parties.  The 
     inclusion of Remote parties is achieved through 
multimedia communication including at least audio and video signals of 
high fidelity.
</t>

<t>
 "Telepresence Extensions": The protocol extensions beyond
SIP adn its extensions defined as of July 2011, specified by RFC XXXX. (XXX to be 
     replaced by the
RFC editor with any documents developed by the CLUE WG that contain
normative content, such as architecture, protocol specification, 
     and
whatnot.)
</t>

<t>  Video Capture: Media Capture for video.  Denoted as VCn. </t>

<t> 
Video composite: A single image that is formed from combining
visual elements from separate sources.  
</t> 


</list>


</section>


<section anchor="Acknowledgements" title="Acknowledgements">
      
<t> Most of this stuff is copied from the CLUE requirements and framework draft.  Go there to look at dignitaries.
</t>
    </section>


    <section anchor="IANA" title="IANA Considerations">
      <t>None</t>

    </section>

    <section anchor="Security" title="Security Considerations">
      <t> None </t>
    </section>
    
  </middle>



  <back>
    <references title="Informative References">
     &RFC2119;
     &RFC5117;
     &RFC3550;
     &RFC4353;
   
     
     
     
    <reference anchor="StageDirection(Wikipadia)" target="http://en.wikipedia.org/wiki/Stage_direction#Stage_directions">
    <front>
    <title>Blocking (stage), available from http://en.wikipedia.org/wiki/Stage_direction#Stage_directions</title>
    <author> <organization>Wikipedia</organization>
    </author>
<date year='2011' month='May' />
</front>
</reference>   
    </references>    
    
<section anchor="Appendix_A" title="Draft History">
<t>
01: removed "stage direction" in "left" "right".  Replaced with "to be interpreted in context".
</t>
<t>
clarified "local" to mean "in the same room" per J. Polk email 6/14/11
</t>
<t>
Consistently refer to "loudspeaker" per Christer's email 6/13/11
</t>
<t>
Added "keyboard" to listed caputure devices per Christer's email 6/13/11
</t>
<t>
Rendering device: changed "optical" to "visual".  Hope Christer finds this OK.  Per Christer's email 6/13/11
</t>
<t>
Note: the other changes proposed by Christer in his 6/13 email appear to change
more than just words but semantics.  Not included here as such semantic changes 
are best handled in the requirement and other docs.
</t>
<t>
Edt. Note: there have been lengthily discussions about the term "Participant".
One camp wanted the word to refer to human users of telepresence equipment 
so to use intuitive language, the other (as this document) suggests to
use the word in the RFC 
</t>
<t>
Participant defined as per RFC 4353.  I think this is what we arrived at on the mailing list, right?
</t>
<t>
Added "Telepresence Extensions" per long "solutions" thread.
</t>
<t>
00: Initial version; mostly copy-past from requirements-02.
</t>
<t>Removed "session".  Editors are advised to be specific about SIP, RTP, or whatever else sessions.</t>
<t>Made clear that Layout refers to rendering, not capturing.</t>
<t>added Left, Right, in stage direction, w/ Wikipedia reference </t>
</section>
  </back>
  
 
</rfc>
PAFTECH AB 2003-2026
2026-04-24 06:47:40