One document matched: draft-lennox-avtext-lrr-00.xml


<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc3550 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml">
<!ENTITY rfc3551 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3551.xml">
<!ENTITY rfc3711 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3711.xml">
<!ENTITY rfc3984 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3984.xml">
<!ENTITY rfc4855 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4855.xml">
<!ENTITY rfc4566 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4566.xml">
<!ENTITY rfc4585 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4585.xml">
<!ENTITY rfc5104 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5104.xml">
<!ENTITY rfc6190 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6190.xml">
<!ENTITY rfc6838 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6838.xml">
<!ENTITY vp8rtp SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-payload-vp8.xml">
<!ENTITY vp9rtp SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.uberti-payload-vp9.xml">
<!ENTITY h265rtp SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-payload-rtp-h265.xml">
<!ENTITY taxonomy SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-avtext-rtp-grouping-taxonomy.xml">

]>
<rfc category="std" docName="draft-lennox-avtext-lrr-00" ipr="trust200902">
  <?rfc symrefs="yes" ?>

  <?rfc sortrefs="yes" ?>

  <!-- alphabetize the references -->

  <?rfc comments="no"?>

  <!-- show comments -->

  <?rfc inline="yes" ?>

  <!-- comments are inline -->

  <?rfc toc="yes" ?>

  <!-- generate table of contents -->

  <front>
    <title abbrev="Layer Refresh Request RTCP Feedback">The Layer Refresh Request (LRR) RTCP Feedback Message</title>

    <author fullname="Jonathan Lennox" initials="J." surname="Lennox">
      <organization abbrev="Vidyo">Vidyo, Inc.</organization>

      <address>
        <postal>
          <street>433 Hackensack Avenue</street>

          <street>Seventh Floor</street>

          <city>Hackensack</city>

          <region>NJ</region>

          <code>07601</code>

          <country>US</country>
        </postal>

        <email>jonathan@vidyo.com</email>
      </address>
    </author>

     <author fullname="Danny Hong" initials="D." surname="Hong">
      <organization abbrev="Vidyo">Vidyo, Inc.</organization>

      <address>
        <postal>
          <street>433 Hackensack Avenue</street>

          <street>Seventh Floor</street>

          <city>Hackensack</city>

          <region>NJ</region>

          <code>07601</code>

          <country>US</country>
        </postal>

        <email>danny@vidyo.com</email>
      </address>
    </author>

    <author fullname="Justin Uberti" initials="J." surname="Uberti">
      <organization abbrev="Google">Google, Inc.</organization>

      <address>
        <postal>
          <street>747 6th Street South</street>

          <city>Kirkland</city>

          <region>WA</region>

          <code>98033</code>

          <country>USA</country>
        </postal>
        <email>justin@uberti.name</email>
      </address>
    </author>

    <author fullname="Stefan Holmer" initials="S." surname="Holmer">
      <organization abbrev="Google">Google, Inc.</organization>
       <address>
        <postal>
          <street>Kungsbron 2</street>

          <code>111 22</code>

          <city>Stockholm</city>

          <country>Sweden</country>
        </postal>
      </address>
    </author>

    <author fullname="Magnus Flodman" initials="M." surname="Flodman">
      <organization abbrev="Google">Google, Inc.</organization>
      <address>
        <postal>
          <street>Kungsbron 2</street>

          <code>111 22</code>

          <city>Stockholm</city>

          <country>Sweden</country>
        </postal>
      </address>
    </author>

    <date/>

    <area>RAI</area>

    <workgroup>Payload Working Group</workgroup>

    <keyword>RFC</keyword>

    <keyword>Request for Comments</keyword>

    <keyword>RTP</keyword>

    <abstract>
      <t>This memo describes the RTP Payload-Specific Feedback Message
		"Layer Refresh Request" (LRR), which can be used to request a
		state refresh of one or more substreams of a layered media
		stream.  It also defines its use with several scalable media
		formats.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="intro" title="Introduction">
      <t>This memo describes an RTP <xref target="RFC4585">Payload-Specific Feedback Message</xref>
		"Layer Refresh Request" (LRR).  It is designed to allow a
		receiver of a layered media stream to request that one or more
		of its substreams be refreshed, such that it can then be
		decoded by an endpoint which previously was not receiving those
		layers, without requiring that the
		entire stream be refreshed (as it would be if the receiver
		sent a <xref target='RFC5104'>Full Intra Request (FIR)</xref>.</t>

	  <t>The message is designed to be applicable both to temporally
	  and spatially scaled streams, and to both single-stream and
	  multi-stream scalability modes.</t>

    </section>

    <section anchor="conventions"
             title="Conventions, Definitions and Acronyms">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"/>.</t>


	<section anchor="terminology"
			 title="Terminology">
	  <t>A "Layer Refresh Point" is a point in a scalable stream after
	  which a decoder, which previously had been able to decode only some
	  (possibly none) of the available layers of stream, is able to
	  decode a greater number of the layers.</t>

	  <t>For spatial (or quality) layers, layer refresh typically
	  requires that a spatial layer be encoded in a way that
	  references only lower-layer subpictures of the current picture,
	  not any earlier pictures of that spatial layer.  Additionally,
	  the encoder must promise that no earlier pictures of that
	  spatial layer will be used as reference in the future.</t>

      <figure anchor="figureSpatialRefresh">
        <preamble>An illustration of spatial layer refresh is shown
        below.</preamble>
          <artwork><![CDATA[
     ... <--  S1  <--  S1       S1  <--  S1  <-- ...
               |        |        |        |
              \/       \/       \/       \/
     ... <--  S0  <--  S0  <--  S0  <--  S0  <-- ...

               1        2        3        4
          ]]></artwork>

          <postamble>In this illustration, frame 3 is a layer refresh
			point for spatial layer S1; a decoder which had previously
			only been decoding spatial layer S0 would be able to
			decode layer S1 starting at frame 3.</postamble>
      </figure>

	  <t>For temporal layers, layer refresh requires that the layer be
	  "temporally nested", i.e. use as reference only
	  earlier frames of a lower temporal layer, not any earlier frames of this
	  temporal layer, and also promise that no future frames
	  of this temporal layer will reference frames of this temporal
	  layer before the refresh point.  In many cases, the temporal
	  structure of the stream will mean that all frames are
	  temporally nested, in which case decoders will have no need to
	  send LRR messages for the stream.</t>

      <figure anchor="figureTemporalRefresh">
        <preamble>An illustration of temporal layer refresh is shown
        below.</preamble>
          <artwork><![CDATA[
        ...  <----- T1  <------ T1          T1  <------ ...
                   /           /           /
                 |_          |_          |_
     ... <--  T0  <------ T0  <------ T0  <------ T0  <--- ...

               1     2     3     4     5     6     7
          ]]></artwork>

          <postamble>In this illustration, frame 6 is a layer refresh
			point for temporal layer T1; a decoder which had previously
			only been decoding temporal layer T0 would be able to
			decode layer T1 starting at frame 6.</postamble>
      </figure>

      <figure anchor="figureTemporalNesting">
        <preamble>An illustration of an inherently temporally nested
        stream is shown below.</preamble>
          <artwork><![CDATA[
                    T1          T1          T1
                   /           /           /
                 |_          |_          |_
     ... <--  T0  <------ T0  <------ T0  <------ T0  <--- ...

               1     2     3     4     5     6     7
          ]]></artwork>

          <postamble>In this illustration, the stream is temporally
          nested in its ordinary structure; a decoder receiving layer
          T0 can begin decoding layer T1 at any point.</postamble>
      </figure>

	</section>

    </section>

	<section anchor="layerRefreshRequest" title="Layer Refresh Request">
	  <t>A layer refresh frame can be requested by sending a Layer Refresh Request (LRR),
	  which is an <xref target="RFC4585">RTCP payload-specific feedback message</xref> asking the encoder to encode a frame
	  which makes it possible to upgrade to a higher layer. The LRR
	  contains one or two tuples, indicating the layer the decoder
	  wants to upgrade to, and (optionally) the currently highest
	  layer the decoder can decode.</t>

	  <t>The specific format of the tuples, and the mechanism by which
	  a receiver recognizes a refresh frame, is
	  codec-dependent.  Usage for several codecs is discussed in
	  <xref target="codec-details"/>.</t>

	  <t>LRR follows the model of the <xref target="RFC5104">Full
	  Intra Request (FIR)</xref>(Section 3.5.1) for its
	  retransmission, reliability, and use in multipoint conferences.
	  TODO: expand these here.</t>

	  <t>The LRR message is identified by RTCP packet type value
	  PT=PSFB and FMT=TBD.  The FCI field MUST contain one or more FIR entries.  Each entry
   applies to a different media sender, identified by its SSRC.</t>

	  <section anchor="MessageFormat" title="Message Format">

		<t>The Feedback Control Information (FCI) for the Layer Refresh Request
		  consists of one or more FCI entries, the content of which is
   depicted in <xref target="figureFciFormat"/>.  The length of
		  the LRR feedback message MUST be set to
   2+3*N, where N is the number of FCI entries.</t>

      <figure anchor="figureFciFormat">
          <artwork><![CDATA[
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq nr.       |C| Payload Type| Reserved                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Target Layer Index            | Current Layer Index (opt)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          ]]></artwork>
	  </figure>

	  <t>
		<list style="hanging">
		  <t hangText="SSRC (32 bits)"> The SSRC value of the media sender that is
              requested to send a layer refresh point.</t>

		  <t hangText="Seq nr. (8 bits)"> Command sequence number.  The sequence number
              space is unique for each pairing of the SSRC of command
              source and the SSRC of the command target.  The sequence
              number SHALL be increased by 1 modulo 256 for each new
              command.  A repetition SHALL NOT increase the sequence
              number.  The initial value is arbitrary.</t>

		  <t hangText="C (1 bit)">A flag bit indicating whether the
			"Current Layer Index" field is present in the FCI.  If
			this bit is false, the sender of the LRR message is
			requesting refresh of all layers up to and including the
			target layer.</t>

		  <t hangText="Payload Type (7 bits)">The RTP payload type for
		  which the LRR is being requested.  This gives the context in
		  which the target layer index is to be interpreted.</t>

		  <t hangText="Reserved (16 bits)"> All bits SHALL be set to 0
			by the sender and SHALL be ignored on reception.</t>

		  <t hangText="Target Layer Index (16 bits)">The target layer
			for which the receiver wishes a refresh point.  Its format
			is dependent on the payload type field.</t>

		  <t hangText="Current Layer Index (16 bits)">If C is 1, the
		  current layer being decoded by the receiver.  This message
		  is not requesting refresh of layers at or below this layer.
		  If C is 0, this field SHALL be set to 0 by the sender and
		  SHALL be ignored on reception.</t>
		</list>
	  </t>

	  </section>
	</section>

	<section anchor="codec-details" title="Usage with specific codecs">

	  <section title="H264 SVC">

		<t><xref target="RFC6190">H.264 SVC</xref> defines temporal,
		dependency (spatial), and quality scalability modes.</t>

		<figure anchor="figureH264SvcIndexFormat">
          <artwork><![CDATA[
            +---------------+---------------+
            |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |R| DID |  QID  | TID |RES      |
            +---------------+---------------+
          ]]></artwork>
		</figure>

		<t><xref target="figureH264SvcIndexFormat"/> shows the format
		of the layer index field for H.264 SVC streams.  This is
		designed to follow the same layout as the third and fourth
		bytes of the H.264 SVC NAL unit extension, which carry the
		stream's layer information.  The "R" and "RES"
		fields MUST be set to 0 on transmission and ignored on
		reception.  See <xref target='RFC6190'/> Section 1.1.3 for
		details on the DID, QID, and TID fields.</t>

		<t>TODO: identifying layer refresh frames in an H.264 bitstream.</t>
	  </section>

	  <section title="VP8">

		<t><xref target="I-D.ietf-payload-vp8">The VP8 RTP payload
		format</xref> defines temporal scalability modes.  It does not
		support spatial scalability.</t>

		<figure anchor="figureVP8IndexFormat">
          <artwork><![CDATA[
            +---------------+---------------+
            |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |TID| RES                       |
            +---------------+---------------+
          ]]></artwork>
		</figure>

		<t><xref target="figureVP8IndexFormat"/> shows the format
		of the layer index field for VP8 streams.  The "RES"
		fields MUST be set to 0 on transmission and ingnored on
		reception.  See <xref target='I-D.ietf-payload-vp8'/> Section 4.2 for
		details on the TID field.</t>

		<t>TODO: identifying layer refresh frames in an VP8 bitstream.</t>
	  </section>

	  <section title="H265">

		<t>The initial version
		of <xref target="I-D.ietf-payload-rtp-h265">the H.265 payload
		format</xref> defines temporal scalability, with protocol
		elements reserved for spatial or other scalability modes
		(which are expected to be defined in a future version of the
		specification.</t>

		<figure anchor="figureH265IndexFormat">
          <artwork><![CDATA[
            +---------------+---------------+
            |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            | RES         |  LayerId  | TID |
            +-------------+-----------------+
          ]]></artwork>
		</figure>

		<t><xref target="figureH265IndexFormat"/> shows the format
		of the layer index field for H.265 streams.  This is
		designed to follow the same layout as the first and second
		bytes of the H.265 NAL unit header, which carry the
		stream's layer information.  The "RES"
		field MUST be set to 0 on transmission and ingnored on
		reception.  See <xref target='I-D.ietf-payload-rtp-h265'/> Section 1.1.3 for
		details on the LayerId and TID fields.</t>

		<t>TODO: identifying layer refresh frames in an H.265 bitstream.</t>
	  </section>

	  <section title="VP9">
		<t><xref target="I-D.uberti-payload-vp9">The RTP payload format
		for VP9</xref> defines how it can be used for spatial and
		temporal scalability.</t>

		<figure anchor="figureVP9IndexFormat">
          <artwork><![CDATA[
            +---------------+---------------+
            |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
            +-------------+-----------------+
            |  T  |R|  S  | RES             |
            +-------------+-----------------+
          ]]></artwork>
		</figure>

		<t><xref target="figureVP9IndexFormat"/> shows the format
		of the layer index field for VP9 streams.  This is
		designed to follow the same layout as the "L" byte
		of the VP9 payload header, which carries the
		stream's layer information.  The "R" and "RES"
		fields MUST be set to 0 on transmission and ingnored on
		reception.  See <xref target='I-D.uberti-payload-vp9'/> for
		details on the T and S fields.</t>

	  <t>Identification of a layer refresh frame can be derived from the
	  reference IDs of each frame by backtracking the dependency chain
	  until reaching a point where only decodable frames are being
	  referenced. Therefore it's recommended for both the
	  flexible and the non-flexible mode that, when upgrade frames are
	  being encoded in response to a LRR, those packets should contain
	  layer indices and the reference fields so that the decoder or an
	  MCU can make this derivation.</t>

	  <t>Example:</t>
	  <t>LRR {1,0}, {2,1} is sent by an MCU when it is currently
	  relaying {1,0} to a receiver and which wants to upgrade to
	  {2,1}. In response the encoder should encode the next frames
	  in layers {1,1} and {2,1} by only referring to frames in
	  {1,0}, or {0,0}.</t>

	  <t>In the non-flexible mode, periodic upgrade frames can be
	  defined by the layer structure of the SS, thus periodic upgrade
	  frames can be automatically identified by the picture ID.</t>
	</section>

	</section>

	<section title="Usage with different scalability transmission mechanisms">

	  <t>Several different mechanisms are defined for how scalable
	  streams can be transmitted in RTP.
	  The <xref target="I-D.ietf-avtext-rtp-grouping-taxonomy">RTP
	  Taxonomy</xref> Section 3.7 defines three mechanisms: Single RTP
	  Stream on a Single Media Transport (SRST), Multiple RTP Streams
	  on a Single Media Transport (MRST), and Multiple RTP Streams on
	  Multiple Media Transports (MRMT).</t>

	  <t>The LRR message is applicable to all these mechanisms.  For
	  MRST and MRMT mechanisms, the "media source" field of the LRR
	  FCI is set to the SSRC of the RTP stream containing the layer
	  indicated by the Current Layer Index (if "C" is 1), or the
	  stream containing the base encoded stream (if "C" is 0).  For
	  MRMT, it is sent on the RTP session on which this stream is
	  sent.  On receipt, the sender MUST refresh all the layers
	  requested in the stream, simultaneously in decode order.</t>

	  <t>Note: arguably, for the MRST and MRMT mechanisms, FIR
		feedback messages could instead be used to refresh specific individual
		layers.  However, the usage of FIR for MRSR/MRMT is not
		explicitly specified anywhere, and if FIR is interpreted as refreshing
		layers, there is no way to request an actual full, synchronized refresh of
		all the layers of an MRST/MRMT layered source.  Thus, the authors feel that
		interpreting FIR as refreshing the entire source, and using
		LRR for the individual layers, would be more useful.</t>

	</section>

    <section anchor="securityConsiderations" title="Security Considerations">
	  <t>All the security considerations of <xref target="RFC5104">FIR
	  feedback packets</xref> apply to LRR feedback packets as well.
	  Additionally, media senders receiving LRR feedback packets MUST
	  validate that the payload types and layer indices they are
	  receiving are valid for the stream they are currently sending,
	  and discard the requests if not.</t>
    </section>

    <section anchor="IANAConsiderations" title="IANA Considerations">
      <t>The IANA is requested to register the following values:<vspace
      blankLines="0"/> - TODO: PSFB value for LRR</t>
    </section>
  </middle>

  <back>
    <references>
	  &vp8rtp;

      &vp9rtp;

	  &h265rtp;

	  &taxonomy;

      &rfc2119;

      &rfc4585;

	  &rfc5104;

	  &rfc6190;
    </references>
  </back>
</rfc>
<!--  LocalWords:  PictureID DCT Hadamard WHT SSRC CSRC pyld hdr FI VER RPSI
 -->
<!--  LocalWords:  stPartitionSize SLI SDP AVPF SRTP IANA PID PICIDX TID
 -->

PAFTECH AB 2003-20262026-04-23 19:50:46