Audio/Video Transport                                            P. Yang
Internet-Draft                                                Ye-K. Wang
Intended status: Standards Track           Huawei Technologies Co., Ltd.
Expires: August 31, 2010                               February 27, 2010


    Synchronized Playback in Rapid Acquisition of Multicast Sessions
               draft-yang-avt-rtp-synced-playback-03.txt

Abstract

   When watching the same IPTV channel, different TV sets may not render
   the same picture and the associated audio at the same moment.  The
   variation in end-to-end delay that results in such asynchronous
   playback between users is referred to as inter-user playback delay.
   Unicast-based rapid acquisition of multicast RTP sessions (RAMS), as
   specified in [I-D.ietf-avt-rapid-acquisition-for-rtp], is an
   important technique for achieving fast channel switching in IPTV as
   well as in other multicast applications.  In addition, RAMS
   significantly relaxes the requirement for a relatively short random
   access point period in the encoding of video streams in multicast
   applications, thus allowing significantly improved compression
   efficiency.  However, the use of RAMS increases inter-user playback
   delay, which makes users receiving the same multicast session play
   back the same content asynchronously.  This document specifies a
   mechanism to help reduce the inter-user playback delay caused by the
   use of RAMS.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.



Yang & Wang              Expires August 31, 2010                [Page 1]

Internet-Draft        Synchronized Playback in RAMS        February 2010


   This Internet-Draft will expire on August 31, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.


Table of Contents

   1.  Introduction
   2.  Conventions
   3.  Definitions
   4.  Reducing Inter-User Playback Delay due to RAMS
   5.  Message Extensions
     5.1.  Extension to RAMS-R
     5.2.  Extension to RAMS-I
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  Normative References
   Authors' Addresses


1.  Introduction

   The Internet-Draft "Unicast-Based Rapid Acquisition of Multicast RTP
   Sessions" [I-D.ietf-avt-rapid-acquisition-for-rtp] presents a method
   based on a unicast burst stream for rapid acquisition of multicast
   RTP sessions (RAMS), which reduces the waiting time for the so-called
   Reference Information (RI).  This method is effective in reducing
   channel switching and tune-in delay in multicast applications such as
   IPTV.  The RI typically starts at an access unit that is a random
   access point.  In RAMS, RTP receivers start playback from the random
   access point from which the RI starts when they switch from one
   multicast session to another.  Because the unicast burst stream
   starts from the RI and is transmitted as fast as possible and faster
   than the media rate, the receiver can on average start processing
   media sooner: it does not need to wait for the next random access
   point, as it would if it joined the multicast group directly.

   Another important benefit of RAMS is that it makes significantly
   improved coding efficiency for video streams possible.  In
   conventional multicast applications, video streams must be encoded
   with frequent random access points, e.g., every 0.5 to 1 second, to
   allow new receivers to tune in or existing users to switch from
   another multicast session.  Random access points typically contain
   intra-coded pictures, for which the compression efficiency is
   significantly lower, e.g., by a factor of several to ten, than for
   inter-coded pictures.  Therefore, the fewer the random access points,
   the higher the coding efficiency.  When RAMS is in use, the random
   access period length is no longer the determining factor for the
   tune-in or channel switching delay.  This means that the video random
   access point frequency can be significantly reduced, leading to
   significantly improved compression efficiency.

   In a video communication system, some end-to-end delay is
   unavoidable.  Typical end-to-end delay components are transmission
   delay, receiver buffering delay (which also handles transmission
   jitter), decoding buffering delay, output buffering delay, and other
   processing delays.  The use of RAMS adds a further, much longer
   end-to-end delay, equal to the time difference between the latest RTP
   timestamp of the primary multicast packets buffered in the
   Retransmission Server (RS) when the unicast burst starts and the RTP
   timestamp of the starting point of the unicast burst.

   In multicast applications, receivers receiving the same multicast
   session can have different end-to-end delays and thus the playback of
   the same content among the receivers is not synchronized.  In this
   document, a delay in playback time of the same content between any
   two users is referred to as inter-user playback delay (IUPD).  This
   document further refers to the typical end-to-end delay (when RAMS is
   not in use) as Common End-to-end Delay (CED).  Among the delay
   components of CED, jitter buffer delay is the main part.  Typical
   set-top box de-jitter buffers can store 100-500 ms of (SDTV) video,
   so network jitter must be within these limits; delay variation
   beyond these limits will manifest itself as loss [TR-126].  Compared
   to a jitter buffer delay of several hundred milliseconds, the
   acquisition of Reference Information by RAMS may result in a much
   longer delay, which depends on the period of random access points.
   This period is sometimes loosely referred to as the Group of Pictures
   (GOP) length (the distance between key frames or I-frames).  This
   document refers to this additional, longer end-to-end delay caused by
   the use of RAMS as Rapid Acquisition Playback Delay (RAPD), and it
   focuses on eliminating the RAPD in order to reduce the IUPD.  Methods
   for addressing CED are out of scope for this document.

   Due to IUPD, different users may see different pictures on different
   TV sets when watching the same IPTV content.  In some application
   scenarios, e.g., remote education or a discussion room for an ongoing
   TV program in a social network, different users may be discussing the
   same content received through multicast.  In this case, an obvious
   loss of playback synchronization due to excessive inter-user playback
   delay can degrade the user experience.

   As mentioned above, a disadvantage of RAMS is that its use causes
   RAPD for each user.  Receivers are not synchronized in sending RAMS
   requests.  Regardless of when the receiver sends the RAMS request
   (i.e., joins the program), playback will start from a previous random
   access point.  For a given starting point, the closer to the next
   random access point the receiver joins the program, the longer the
   RAPD will be.  Because RAPD thus differs from receiver to receiver,
   the use of RAMS increases IUPD.
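
   As a rough illustration of the argument above, the following sketch
   models random access points as occurring at a fixed period and
   computes the RAPD seen by a receiver as the time elapsed since the
   most recent random access point.  The fixed period, the function
   names, and the sample join times are assumptions made for
   illustration only; they are not defined by RAMS or by this document.

```python
def rapd_for_join_time(join_time, rap_period):
    """Seconds elapsed since the most recent random access point.

    Assumes (hypothetically) that random access points occur at
    t = 0, rap_period, 2 * rap_period, ... and that the unicast burst
    starts from the most recent one.
    """
    if rap_period <= 0:
        raise ValueError("rap_period must be positive")
    return join_time % rap_period


def iupd_from_rapd(join_a, join_b, rap_period):
    """Inter-user playback delay caused by RAPD alone for two receivers."""
    return abs(rapd_for_join_time(join_a, rap_period)
               - rapd_for_join_time(join_b, rap_period))


if __name__ == "__main__":
    period = 5.0  # assumed seconds between random access points
    for t in (10.5, 12.0, 14.5):
        print(f"join at {t} s: RAPD {rapd_for_join_time(t, period)} s")
    print("IUPD, first vs. last:", iupd_from_rapd(10.5, 14.5, period), "s")
```

   In this model, a receiver joining 0.5 s after a random access point
   and one joining 4.5 s after it end up 4 s apart in playback, even
   though both receive the same multicast session.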

   When an obvious IUPD affects the user experience in the above
   application scenarios, some action must be taken.  One way is to
   avoid long random access periods, which significantly hurts video
   compression efficiency.  This document instead describes mechanisms
   that eliminate the RAPD, reducing the IUPD due to the use of RAMS
   while still allowing long random access periods for improved
   compression efficiency.


2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].


3.  Definitions

   This document uses the following acronyms and definitions frequently:

   Inter-User Playback Delay (IUPD): The playback delay between
   different users, for the same content transmitted on the same
   multicast session, due to variations in end-to-end delays.

   Common End-to-end Delay (CED): The end-to-end delay when RAMS is not
   in use, which consists of transmission delay, receiver buffering
   delay (which also handles transmission jitter), decoding buffering
   delay, output buffering delay, and other processing delays.

   Rapid Acquisition Playback Delay (RAPD): The additional end-to-end
   delay introduced by the use of RAMS.  The value is equal to the time
   difference between the latest RTP timestamp of primary multicast
   packets buffered in the RS when the unicast burst starts and the RTP
   timestamp of the starting point of the unicast burst.
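
   The RAPD definition above amounts to a subtraction of two RTP
   timestamps.  A minimal sketch follows, assuming a 90 kHz RTP clock
   (common for video, but payload-format dependent) and handling the
   wraparound of the 32-bit RTP timestamp.  The function and parameter
   names are hypothetical.

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32 bits wide and wrap


def rapd_ticks(latest_buffered_ts, burst_start_ts):
    """RAPD in RTP timestamp units, tolerant of 32-bit wraparound."""
    return (latest_buffered_ts - burst_start_ts) % RTP_TS_MODULUS


def rapd_seconds(latest_buffered_ts, burst_start_ts, clock_rate=90000):
    """RAPD in seconds; the 90 kHz default clock rate is an assumption."""
    if clock_rate <= 0:
        raise ValueError("clock_rate must be positive")
    return rapd_ticks(latest_buffered_ts, burst_start_ts) / clock_rate


if __name__ == "__main__":
    # Burst starts at timestamp 0; newest buffered packet is 270000.
    print(rapd_seconds(270000, 0))  # 3.0
```

   The modulo arithmetic gives the correct result only while the true
   difference is smaller than one full wrap of the timestamp, which is
   far longer than any plausible random access period.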


4.  Reducing Inter-User Playback Delay due to RAMS

   Inter-User Playback Delay (IUPD) is caused by the variation in
   end-to-end delay, which includes Common End-to-end Delay (CED) and
   Rapid Acquisition Playback Delay (RAPD).  CED is primarily determined
   by jitter buffering delay, which is in the range from 100 ms to
   500 ms [TR-126].  In practice, IUPD caused by CED, and primarily by
   jitter buffer delay variations, should be on the order of tens of
   milliseconds.  In contrast to this small range of IUPD values caused
   by CED, the IUPD caused by RAPD can be much higher, up to a few
   seconds or even ten seconds, because a long random access point
   period can be used for significantly improved video compression
   efficiency when RAMS is in use.

   To lower IUPD, the receiver can speed up video playback until it
   catches up with the primary multicast stream.  The procedures are
   outlined below.

   The playback synchronization of receivers uses the Primary Multicast
   Stream as a reference point to address RAPD due to the use of RAMS.
   Each receiver keeps the synchronization with the primary multicast
   stream so that all receivers can keep an approximate synchronization
   in playback.  The advantage is that it does not need every receiver
   to get playback synchronization information of other receivers, and
   thus scalability is not an issue as the procedure does not depend on
   the number of receivers.

   The mechanism for speeding up playback works by skipping video
   frames.  One frame is skipped per interval of a given number of video
   frames until the required number of extra video frames has been
   skipped.  This way, the additional end-to-end delay, RAPD, can be
   compensated while smooth playback is maintained.

   The mechanism involves the following changes to the RAMS method:

     1) When the RTP Receiver (RR) sends a rapid acquisition request for
        the new multicast RTP session, the request MAY contain
        additional information indicating whether RR supports inter-user
        playback delay reduction.

     2) When the Retransmission Server (RS) receives the RAMS-R message
        and decides to accept it, RS MAY include the following
        additional information in the RAMS-I message to RR:

        a) N, the playback delay reduction target in number of frame
           durations; and

        b) V, the recommended interval, in frames, between two
           consecutive frame-skipping events.

        When the RAMS-R message indicates that RR supports inter-user
        playback delay reduction, RS SHOULD include the above
        information in the RAMS-I message.

        The RS can determine the value of N precisely from the arrival
        time of the RAMS-R message, because the amount of additional
        end-to-end delay introduced by the use of RAMS, i.e., RAPD, is
        known to RS.  The value is equal to the time difference between
        the latest RTP timestamp of primary multicast packets buffered
        in the RS when the unicast burst starts and the RTP timestamp of
        the starting point of the unicast burst.  The value V is a
        recommended skip frame interval and must be chosen such that
        there is no noticeable audio distortion.  For a video frame rate
        of 30 frames per second, there is typically no noticeable audio
        distortion when V is greater than 15.

     3) When RR receives a RAMS-I message containing the above
        information, it SHOULD speed up media rendering during playback,
        taking the information into account as follows.  During the
        speedup playback, after every V frames, one frame is skipped as
        if it were not present, and the presentation time of each
        remaining frame is shifted earlier by one frame duration, until
        a total of N frames have been skipped.  The receiver plays back
        the media content at its original speed once a total of N frames
        have been skipped.  Note that decoding remains the same as if
        speedup playback were not in use.
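
   The receiver behaviour in step 3 can be sketched as follows.  This is
   one possible reading of the rule, not a normative implementation:
   frames are identified by their decode-order index, a "presentation
   slot" stands for a presentation time in units of one frame duration,
   and all names are hypothetical.

```python
def speedup_playback(frame_indices, n, v):
    """Yield (frame_index, presentation_slot) for each rendered frame.

    After every v rendered frames, one frame is dropped, until n frames
    have been dropped in total.  Each rendered frame is presented
    earlier by one slot per frame dropped so far.
    """
    if n < 0 or v < 1:
        raise ValueError("n must be >= 0 and v must be >= 1")
    skipped = 0
    shown_since_skip = 0
    for idx in frame_indices:
        if skipped < n and shown_since_skip == v:
            skipped += 1          # drop this frame from rendering
            shown_since_skip = 0
            continue
        yield idx, idx - skipped  # shifted earlier by `skipped` slots
        shown_since_skip += 1


def catch_up_seconds(n, v, fps):
    """Wall-clock time spent in speedup playback (n * v rendered frames)."""
    return n * v / fps


if __name__ == "__main__":
    print(list(speedup_playback(range(10), n=2, v=3)))
    print(catch_up_seconds(n=90, v=15, fps=30))  # 45.0
```

   Under these assumptions, a RAPD of 3 seconds at 30 frames per second
   (N = 90) with V = 15 is compensated after 45 seconds of playback,
   which shows the trade-off between a larger V (smoother playback) and
   a longer catch-up period.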

   Besides the above mechanism, RS can use selective transmission of
   packets in the beginning of the unicast burst, by taking advantage of
   the temporal scalability of video bitstreams.

   Video bitstreams are typically temporally scalable, in the sense that
   a portion of the bitstream can be extracted and decoded at a lower
   frame rate.  Conventional video coding standards like MPEG-2 can
   realize temporal scalability using different types of frames, i.e.,
   I frames, P frames, and B frames, wherein the lowest temporal layer
   (corresponding to the lowest frame rate) consists only of I frames,
   the middle temporal layer consists only of P frames, and the highest
   temporal layer consists only of B frames.  Decoding of one temporal
   layer requires the presence of that temporal layer and all the
   lower temporal layers, but not the presence of any higher layer.
   H.264/AVC and its scalable extension SVC (Scalable Video Coding)
   support advanced temporal scalability thanks to their flexible inter
   prediction and reference frame management schemes.  It should be
   noted that in H.264/AVC and its extensions, B frames may themselves
   be reference frames used by other frames for inter prediction;
   discarding such frames may therefore lead to problems in decoding
   the rest of the bitstream.

   The RS MAY transmit the unicast burst as follows.  At the beginning
   of the unicast burst, the RS discards one or more of the highest
   temporal layers and transmits only the remaining lower temporal
   layers.  After a certain point, all temporal layers are transmitted.
   This speeds up the acquisition of the multicast session at the same
   unicast burst rate, at the cost of a lower initial frame rate.  At
   the same time, if the playback of the initial lower-frame-rate stream
   is sped up, inter-user playback delay can be reduced.
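
   The selective transmission described above can be sketched as a
   filter over the frames of the burst.  The Frame type, its
   temporal_layer field, and the switch-over index are hypothetical; how
   an RS learns the temporal layer of each packet depends on the codec
   and payload format and is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    index: int           # position of the frame within the burst
    temporal_layer: int  # 0 is the lowest (base) temporal layer


def thin_burst(frames, max_layer, thin_until):
    """Select the frames to send in the unicast burst.

    Frames with index < thin_until are sent only if their temporal
    layer is <= max_layer; from thin_until on, every frame is sent.
    """
    return [f for f in frames
            if f.index >= thin_until or f.temporal_layer <= max_layer]


def dyadic_layer(index):
    """Assumed three-layer dyadic hierarchy with a period of 4 frames."""
    if index % 4 == 0:
        return 0
    return 1 if index % 2 == 0 else 2


if __name__ == "__main__":
    burst = [Frame(i, dyadic_layer(i)) for i in range(12)]
    sent = thin_burst(burst, max_layer=1, thin_until=8)
    print([f.index for f in sent])  # [0, 2, 4, 6, 8, 9, 10, 11]
```

   In this example, the first eight frame periods are covered by four
   transmitted frames, so the early part of the burst needs about half
   the packets at the cost of half the initial frame rate.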


5.  Message Extensions

   This section defines the extensions to RAMS-R and RAMS-I messages for
   inter-user playback delay reduction.

5.1.  Extension to RAMS-R

   The RAMS Request message is identified by SFMT=1.  This message is
   used by RTP_Rx to request rapid acquisition for a primary multicast
   RTP session, or one or more primary multicast streams belonging to
   the same primary multicast RTP session.  The FCI field MUST contain
   only one RAMS Request.  The FCI field has the structure depicted in
   Figure 1.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    SFMT=1     |                    Reserved                   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     :      Optional TLV-encoded Fields (and Padding, if needed)     :
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 1: FCI field syntax for the RAMS Request message

o  Request for Inter-user Playback Delay Reduction (0 bits): TLV element
   that indicates that RTP_Rx is requesting inter-user playback delay
   reduction for the desired primary multicast stream(s).  If this TLV
   element is present in the RAMS-R message, RS SHALL send the value N
   (the playback delay reduction target in number of frame durations)
   and the value V (the recommended interval, in frames, between two
   consecutive frame-skipping events).  Note that this TLV element does
   not carry a Value field.  Thus, the Length field MUST be set to zero.

   Type:  6
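
   As an illustration, the FCI of a RAMS-R message carrying this TLV
   element might be serialized as shown below.  The sketch assumes the
   TLV layout of the RAMS base specification (8-bit Type, 16-bit Length,
   then Value) and zero-valued padding up to a 32-bit boundary; both
   should be checked against
   [I-D.ietf-avt-rapid-acquisition-for-rtp].  The function names are
   hypothetical.

```python
import struct

IUPD_REQUEST_TYPE = 6  # Type value given in this section


def encode_tlv(tlv_type, value=b""):
    """Encode one TLV as Type (8 bits), Length (16 bits), Value.

    The 8/16-bit split is an assumption taken from the RAMS base format.
    """
    return struct.pack("!BH", tlv_type, len(value)) + value


def pad_to_32_bits(data):
    """Append zero octets (assumed padding) up to a 4-octet boundary."""
    return data + b"\x00" * (-len(data) % 4)


def build_rams_r_fci(tlvs=b""):
    """Build the FCI of Figure 1: SFMT=1, 24 reserved bits, then TLVs."""
    header = struct.pack("!I", 1 << 24)  # SFMT=1 in the top octet
    return header + pad_to_32_bits(tlvs)


if __name__ == "__main__":
    fci = build_rams_r_fci(encode_tlv(IUPD_REQUEST_TYPE))
    print(fci.hex())  # 0100000006000000
```

   Because the element has no Value field, it occupies three octets on
   the wire and a single padding octet completes the 32-bit word.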

5.2.  Extension to RAMS-I

   The RAMS Information message is identified by SFMT=2.  This message
   is used to describe the unicast burst that will be sent for rapid
   acquisition.  It also includes other useful information for RTP_Rx as
   described below.  The FCI field MUST contain only one RAMS
   Information.  The FCI field has the structure depicted in Figure 2.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    SFMT=2     |      MSN      |          Response             |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     :      Optional TLV-encoded Fields (and Padding, if needed)     :
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 2: FCI field syntax for the RAMS Information message

o  The value N (16 bits) - the playback delay reduction target in number
   of frame durations:  TLV element that denotes the delay, in frames,
   corresponding to the difference between the latest RTP timestamp of
   primary multicast packets buffered in the RS when the unicast burst
   starts and the RTP timestamp of the starting point of the unicast
   burst.

   If the request for inter-user playback delay reduction has been
   accepted, RS MUST send this field at least once, so that RTP_Rx
   knows to speed up media rendering during playback.  Otherwise, RS
   ignores the request for inter-user playback delay reduction.

   Type:  36

o  The value V (8 bits) - recommended interval, in frames, between two
   consecutive frame-skipping events:  Optional TLV element that denotes
   the recommended interval for skipping one frame.  During the speedup
   playback of RTP_Rx, after every V frames, one frame is skipped as if
   it were not present, and the presentation time of each remaining
   frame is shifted earlier by one frame duration, until a total of N
   frames have been skipped.  The receiver plays back the media content
   at its original speed once a total of N frames have been skipped.

   Type:  37
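
   A receiver could extract N and V from the TLV portion of a RAMS-I
   message along the following lines.  As before, the 8-bit Type and
   16-bit Length layout and the treatment of a zero Type octet as
   padding are assumptions borrowed from the RAMS base specification,
   and the function names are hypothetical.

```python
import struct

TYPE_N = 36  # playback delay reduction target, 16-bit value
TYPE_V = 37  # recommended skip interval, 8-bit value


def parse_tlvs(data):
    """Return {type: value_bytes} for Type(8)/Length(16)/Value TLVs."""
    out = {}
    offset = 0
    while offset + 3 <= len(data):
        tlv_type, length = struct.unpack_from("!BH", data, offset)
        if tlv_type == 0:  # assumed to mark trailing padding
            break
        offset += 3
        if offset + length > len(data):
            raise ValueError("truncated TLV")
        out[tlv_type] = data[offset:offset + length]
        offset += length
    return out


def playback_params(data):
    """Return (N, V); either is None if absent or of unexpected size."""
    tlvs = parse_tlvs(data)
    n_raw = tlvs.get(TYPE_N, b"")
    v_raw = tlvs.get(TYPE_V, b"")
    n = struct.unpack("!H", n_raw)[0] if len(n_raw) == 2 else None
    v = v_raw[0] if len(v_raw) == 1 else None
    return n, v


if __name__ == "__main__":
    tlv_bytes = (struct.pack("!BHH", TYPE_N, 2, 90)
                 + struct.pack("!BHB", TYPE_V, 1, 15)
                 + b"\x00\x00\x00")
    print(playback_params(tlv_bytes))  # (90, 15)
```

   Returning None for a missing V lets the caller fall back to a locally
   chosen skip interval, since V is described as optional above.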


6.  Security Considerations

   TBD.


7.  IANA Considerations

   TBD.


8.  Acknowledgements

   The authors thank the following individuals for their contributions,
   comments and suggestions to this document: Roni Even, Colin Perkins,
   Ali C. Begen (abegen) and Bill Ver Steeg (versteb).


9.  Normative References

   [I-D.ietf-avt-rapid-acquisition-for-rtp]
              Steeg, B., Begen, A., Caenegem, T., and Z. Vax, "Unicast-
              Based Rapid Acquisition of Multicast RTP Sessions",
              draft-ietf-avt-rapid-acquisition-for-rtp-07 (work in
              progress), February 2010.

   [I-D.ietf-avt-rtcp-guidelines]
              Ott, J. and C. Perkins, "Guidelines for Extending the RTP
              Control Protocol (RTCP)",
              draft-ietf-avt-rtcp-guidelines-03 (work in progress),
              February 2010.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              July 2006.

   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
              Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
              July 2006.

   [TR-126]   DSL Forum TR-126, "Triple-play Services Quality of
              Experience (QoE) Requirements", December 2006.


Authors' Addresses

   Peilin Yang
   Huawei Technologies Co., Ltd.
   No.91, Baixia Road, Nanjing 210001
   P.R.China

   Phone: +86-25-84565881
   Email: yangpeilin@huawei.com


   Ye-Kui Wang
   Huawei Technologies Co., Ltd.
   400 Somerset Corporate Blvd, Suite 602
   Bridgewater, NJ 08807
   USA

   Phone: +1-908-541-3518
   Email: yekuiwang@huawei.com