Network Working Group                                            N. Zong
Internet-Draft                                       Huawei Technologies
Intended status: Informational                          October 24, 2010
Expires: April 27, 2011


Survey and Gap Analysis for HTTP Streaming Standards and Implementations
                draft-zong-httpstreaming-gap-analysis-01

Abstract

   With the explosive growth of Internet usage and the increasing demand
   for multimedia content on the web, media delivery over the Internet
   has attracted substantial attention from the media industry.  To meet
   this demand, HTTP streaming technology has been developed and has
   played an increasingly important role in recent years.  Several
   leading Standards Development Organizations (SDOs) have produced
   technical specifications that define streaming over HTTP.  Moreover,
   several companies have developed proprietary HTTP-based media
   delivery platforms to provide a high-quality, adaptive viewing
   experience to customers.  Following a brief survey of existing HTTP
   streaming standards and implementations, this document summarizes
   the related work, analyzes the potential challenges, especially from
   the network point of view, and lists the gaps between existing work
   and the possible scope of work on HTTP streaming in the IETF.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 27, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.



Zong                     Expires April 27, 2011                 [Page 1]

Internet-Draft           Survey and Gap Analysis            October 2010


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  5
   3.  HTTP Streaming Standards . . . . . . . . . . . . . . . . . . .  6
     3.1.  3GPP . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
       3.1.1.  Media Presentation Components  . . . . . . . . . . . .  6
       3.1.2.  Media Presentation Description . . . . . . . . . . . .  8
       3.1.3.  Streaming Procedure  . . . . . . . . . . . . . . . . .  9
         3.1.3.1.  Overview . . . . . . . . . . . . . . . . . . . . .  9
         3.1.3.2.  Segment list generation  . . . . . . . . . . . . . 10
         3.1.3.3.  Seeking, trick mode and adaptation support . . . . 10
     3.2.  OIPF . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
       3.2.1.  MPD  . . . . . . . . . . . . . . . . . . . . . . . . . 10
       3.2.2.  Segmentation . . . . . . . . . . . . . . . . . . . . . 11
       3.2.3.  Media formats for MPEG2-TS . . . . . . . . . . . . . . 11
       3.2.4.  Use cases  . . . . . . . . . . . . . . . . . . . . . . 12
         3.2.4.1.  Live streaming . . . . . . . . . . . . . . . . . . 12
         3.2.4.2.  Trick mode and seeking . . . . . . . . . . . . . . 12
     3.3.  MPEG . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
       3.3.1.  Objectives . . . . . . . . . . . . . . . . . . . . . . 13
       3.3.2.  Requirements for proposal  . . . . . . . . . . . . . . 13
   4.  HTTP Streaming Implementations . . . . . . . . . . . . . . . . 14
     4.1.  Microsoft Smooth Streaming . . . . . . . . . . . . . . . . 14
       4.1.1.  On-disk MP4 file format  . . . . . . . . . . . . . . . 15
       4.1.2.  On-wire segments transmission  . . . . . . . . . . . . 15
       4.1.3.  Adaptive support . . . . . . . . . . . . . . . . . . . 16
     4.2.  Adobe  . . . . . . . . . . . . . . . . . . . . . . . . . . 16
       4.2.1.  Components . . . . . . . . . . . . . . . . . . . . . . 16
       4.2.2.  Workflow . . . . . . . . . . . . . . . . . . . . . . . 16
       4.2.3.  Top features . . . . . . . . . . . . . . . . . . . . . 17
     4.3.  Apple  . . . . . . . . . . . . . . . . . . . . . . . . . . 17
       4.3.1.  Basic process  . . . . . . . . . . . . . . . . . . . . 18
   5.  Gap Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 18
     5.1.  Brief Summary of Existing Work . . . . . . . . . . . . . . 19
     5.2.  Challenges . . . . . . . . . . . . . . . . . . . . . . . . 20
     5.3.  Gap List and Potential Working Scope in IETF . . . . . . . 21
   6.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 22
   7.  Security Considerations  . . . . . . . . . . . . . . . . . . . 22
   8.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 22
   9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 22
     9.1.  Normative References . . . . . . . . . . . . . . . . . . . 22
     9.2.  Informative References . . . . . . . . . . . . . . . . . . 22
   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 23







1.  Introduction

   Media streaming plays an increasingly important role in Internet
   content delivery and is becoming indispensable in many applications
   (e.g., distance learning, digital libraries, home shopping, and
   video-on-demand).  Several protocols are currently in common use for
   delivering media content over the Internet, such as HTTP, RTSP/RTP,
   RTMP, and MMS.

   HTTP streaming is rapidly becoming one of the most commonly used
   approaches for media content distribution on the Internet.  In HTTP
   streaming, a media file is divided into several chunks (fragments),
   which are supplied to the user in order over HTTP, typically through
   port 80 or 8080.  HTTP streaming covers various media formats and
   codecs, including MP4, MPEG2-TS, and H.264/AAC, as well as streaming
   services over HTTP, such as Windows Media/Silverlight Streaming,
   Flash Video, QuickTime Streaming Server, Real Media Streaming, and
   others.

   HTTP streaming offers two main advantages:

   1) Media protocols often have difficulty traversing firewalls and
   routers because they are commonly based on UDP sockets over unusual
   port numbers.  HTTP-based media delivery avoids this problem because
   firewalls and routers are usually configured to pass HTTP traffic on
   port 80.

   2) HTTP media delivery can use standard HTTP servers and standard
   HTTP caches (or cheap servers in general) to deliver the content, so
   it does not require special proxies or caches.  Additionally, most
   Content Delivery Networks (CDNs) use HTTP to redirect requests,
   retrieve cached multimedia objects, and communicate with policy
   servers.

   Several leading Standards Development Organizations (SDOs) have been
   producing technical specifications that define streaming over HTTP.
   3GPP introduces adaptive HTTP streaming in Technical Specification
   (TS) 26.234 [3GPP], which describes HTTP streaming in detail,
   including the Media Presentation Description (MPD), the media
   segmentation format, and HTTP server and client behavior, as an
   alternative to RTSP/RTP-based media delivery.  The Open IPTV Forum
   (OIPF) introduces HTTP adaptive streaming in its technical
   specification [OIPF], which defines the usage of, and extensions to,
   3GPP HTTP streaming to enable HTTP-based adaptive streaming for
   OIPF-compliant services and devices.  Recently, ISO/IEC
   JTC1/SC29/WG11 (MPEG) started work on a new standard for HTTP
   streaming.  A number of documents [MPEG-1][MPEG-2][MPEG-3][MPEG-4]
   address the background, objectives, use cases, and requirements for
   the transport of MPEG media over HTTP.



   Several companies have developed proprietary HTTP-based media
   delivery platforms to provide a high-quality, adaptive viewing
   experience to customers.  Microsoft has implemented its Smooth
   Streaming technology, a web-based, adaptive media content delivery
   approach that uses standard HTTP [MS-IIS].  Instead of delivering
   media as a full-file download, Smooth Streaming delivers the content
   to the client as a series of small file chunks that can be easily
   cached at edge servers, closer to the client.  Adobe HTTP Dynamic
   Streaming is an Adobe-defined delivery method that enables on-demand
   and live adaptive-bitrate video streaming over regular HTTP
   connections [Adobe].  It packages media files into fragments that
   Flash Player clients can access instantly without downloading the
   entire file.  Apple HTTP Live Streaming [Apple] sends live or
   prerecorded audio and video to the iPhone or other devices, such as
   desktop computers, using an ordinary web server, with support for
   adaptive bitrate.

   Following a brief survey of the HTTP streaming standards and
   implementations mentioned above, this document summarizes the
   related work, analyzes the potential challenges, especially from the
   network point of view, and lists the gaps between existing work and
   the possible scope of work on HTTP streaming in the IETF.


2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119] and
   indicate requirement levels for compliant implementations.

   Live Streaming: Live events can be streamed over the Internet with
   the help of broadcast software, which encodes the live source (a
   microphone, video camera, or other recording device) and delivers
   the resulting stream to the server.  The server then transfers the
   stream to users, who experience the event as it happens.

   On-Demand Streaming: To provide "anytime" access to media content,
   the client is allowed to select content and play it back on demand.

   Progressive Download: A mode that allows the client to play back a
   media file while the file is still downloading, after waiting only a
   few seconds for buffering (the process of collecting the first part
   of a media file before playing).

   Adaptive Streaming: A process that adjusts the quality of the video
   delivered to a client according to changing network conditions, to
   ensure the best possible viewer experience.


3.  HTTP Streaming Standards

3.1.  3GPP

   3GPP introduces adaptive HTTP streaming in Technical Specification
   (TS) 26.234 [3GPP].  TS 26.234 specifies the protocols and codecs for
   the Packet-Switched Streaming Service (PSS) within the 3GPP system.
   Protocols for control signalling, capability exchange, media
   transport, rate adaptation and protection are specified.  Codecs for
   speech, natural and synthetic audio, video, still images, bitmap
   graphics, vector graphics, timed text and text are specified.

   The delivery of media over HTTP provides an alternative delivery
   mechanism to the RTSP/RTP based media delivery.  It is assumed that
   the HTTP-Streaming Client has access to a Media Presentation
   Description (MPD).  An MPD provides sufficient information for the
   HTTP-Streaming Client to provide a streaming service to the user by
   sequentially downloading media data from an HTTP server and rendering
   the included media appropriately.

3.1.1.  Media Presentation Components

   A media presentation is a structured collection of data that is
   accessible to the HTTP-Streaming Client and is described in an MPD.
   The media presentation structure is shown in the following figure.

   ^ resolution / bit-rate / language / etc
   |
   |                    representation
   |   +------------------------------------------------+
   |   |             segment              segment       |
   |   | +----------------------------+  +-----+        |
   |   | | +----------+  +----------+ |  |     |        |
   |   | | |   meta   |  |   media  | |  |     |        |
   |   | | |   data   |  |   data   | |  |     | ... ...|
   |   | | +----------+  +----------+ |  |     |        |
   |   | +----------------------------+  +-----+        |
   |   +------------------------------------------------+
   |
   |   +------------------------------------------------+
   |   |                representation                  |
   |   +------------------------------------------------+
   |                     ... ...
   |   +------------------------------------------------+
   |   |                representation                  |
   |   +------------------------------------------------+
   |                       period 1                          period2 ...
   +------------------------------------------------------------------->
                                                                   time

   A media presentation consists of:

      1) A sequence of Periods.

      2) Each Period contains one or more Representations from the same
      media content.  Different Representations usually have different
      attributes on media resolution, bit-rate, language, etc.

      3) Each Representation consists of one or more segments.

      4) A segment contains media data and/or the metadata needed to
      decode and present the included media data, and is defined as a
      unit that can be uniquely referenced by an http-URL element in the
      MPD.  The Initialisation Segment contains initialisation
      information (no media data) for accessing the Representation.  A
      Media Segment contains media data that are described either within
      that Media Segment or by the Initialisation Segment.  Each segment
      has a start time relative to the start time of the representation
      (period), so that the client can download a specific segment.  A
      segment also provides random access information, namely whether
      and how the media within the segment can be randomly accessed.  A
      segment is not required to start with a random access point (RAP),
      although it is possible for all segments to start with a RAP.
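
   The containment hierarchy above can be sketched as a small data
   model.  This is only an illustration in Python: the class and field
   names and the example URLs are hypothetical and are not taken from
   TS 26.234, and only properties mentioned in this section are
   modelled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    url: str                          # http-URL referencing this segment
    start_time: float = 0.0           # seconds, relative to the period
    is_initialisation: bool = False   # True: initialisation info only
    starts_with_rap: bool = False     # allowed, but not required


@dataclass
class Representation:
    bandwidth: int                    # bits per second
    language: str = ""
    segments: List[Segment] = field(default_factory=list)

    def media_segments(self) -> List[Segment]:
        """Return the segments that carry media data."""
        return [s for s in self.segments if not s.is_initialisation]


@dataclass
class Period:
    start: float                      # seconds from presentation start
    representations: List[Representation] = field(default_factory=list)


@dataclass
class MediaPresentation:
    periods: List[Period] = field(default_factory=list)


# One period with a single representation of three segments.
presentation = MediaPresentation(periods=[
    Period(start=0.0, representations=[
        Representation(bandwidth=500_000, language="en", segments=[
            Segment("http://example.com/low/init", is_initialisation=True),
            Segment("http://example.com/low/seg-1", start_time=0.0),
            Segment("http://example.com/low/seg-2", start_time=10.0),
        ]),
    ]),
])
```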




3.1.2.  Media Presentation Description

   The logical structure of a media presentation is described by a
   data structure (e.g., an XML schema) in the MPD file.  That is, the
   MPD contains the metadata required by the client to construct
   appropriate URIs to access segments and to provide the streaming
   service to the user.  Several important attributes and elements
   contained in an MPD are listed below:

      1) "type" attribute: type of the media presentation, i.e., VoD or
      live.

      2) "availabilityStartTime" attribute: media presentation start
      time if "type"=live.  If "type"=VoD, media presentation start time
      is 0.

      3) "duration" attribute: duration/length of the media
      presentation.  For a live presentation, the sum of "duration" and
      "availabilityStartTime" specifies the end time of the media
      presentation.  If "duration" is not provided, then the MPD does
      not describe an entire media presentation and the MPD may be
      updated during the live presentation.

      4) "minimumUpdatePeriodMPD" attribute: minimum MPD update period.

      5) "timeShiftBufferDepth" attribute: duration of the time-shifting
      buffer maintained at the server for a live presentation.  This
      attribute is used in the case of trick mode.

      6) "minBufferTime" attribute: minimum buffer time for the stream.

      7) One or more "Period" elements: each describes a period.  A
      "Period" element contains the following important attributes and
      elements:

         7.1) "start" attribute: start time of this period.

         7.2) One or more "Representation" elements: each describes a
         representation with a particular bit-rate, resolution,
         language, etc.  A "Representation" element contains the
         following important attributes and elements:

            7.2.1) "bandwidth" attribute: maximum bit-rate of the
            representation averaged over any interval of "minBufferTime"
            duration.

            7.2.2) "startWithRAP" attribute: When True, indicates that
            all segments in the representation start with a random
            access point (RAP).



            7.2.3) "qualityRanking" attribute: quality ranking of the
            representation.

            7.2.4) "TrickMode" element: provides the information for
            trick mode.  In this element, "AlternatePlayoutRate"
            attribute denotes the playout speed as a multiple of the
            regular playout speed.

            7.2.5) "SegmentInfo" element: describes all segments in a
            representation.  Each "SegmentInfo" element permits
            generating a list of Media Segment URLs (possibly with a
            byte range) and Media Segment start times relative to the
            start time of the Representation.  A "SegmentInfo" element
            contains the following important attributes and elements:

               7.2.5.1) "duration" attribute: gives the constant
               approximate segment duration.

               7.2.5.2) at most one "InitialisationSegmentURL" element.
               If not present, then each media segment within this
               representation shall be self-initialising.

               7.2.5.3) either a "URLtemplate" element that specifies a
               default segment URL template for all segments, or one or
               more "Url" elements that provide a set of explicit
               URL(s) for segments.
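
   Two timing rules in this section are simple sums: the end time of a
   live presentation (item 3) and the time at which a client requests
   an MPD update (last request plus "minimumUpdatePeriodMPD").  A
   minimal Python sketch, assuming the attribute values have already
   been parsed into datetime and timedelta objects; the function and
   parameter names are hypothetical:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def live_end_time(availability_start_time: datetime,
                  duration: Optional[timedelta]) -> Optional[datetime]:
    """End of a live presentation, or None when "duration" is absent."""
    if duration is None:
        # The MPD does not describe the entire presentation.
        return None
    return availability_start_time + duration


def next_mpd_update(last_request: datetime,
                    minimum_update_period: timedelta) -> datetime:
    """Time at which the client requests the next MPD update."""
    return last_request + minimum_update_period


# Illustrative values only.
start = datetime(2010, 10, 24, 12, 0, 0, tzinfo=timezone.utc)
end = live_end_time(start, timedelta(hours=2))
update = next_mpd_update(start, timedelta(seconds=30))
```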

   Note that a client derives the request-for-MPD-update time as the sum
   of the time of its last requested update of the MPD and the
   "minimumUpdatePeriodMPD" attribute.

3.1.3.  Streaming Procedure

3.1.3.1.  Overview

   Initially, the client parses the MPD and creates a segment list for
   each representation.  The client then selects one representation
   based on the representation attributes and other information, e.g.,
   available bandwidth and client capabilities.  The client acquires
   the initialisation segment and the media segments of the selected
   representation by using the generated segment list.  The client
   continues consuming the media content by continuously requesting
   media segments, taking MPD updates into account.  The client may
   change representations based on updated MPD information and/or
   updated information from its environment, e.g., changes in access
   bit-rate.
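
   This document does not say how a client picks a representation.
   One common heuristic, shown here purely as an assumption, is to take
   the highest "bandwidth" value that does not exceed the estimated
   throughput and to fall back to the lowest one otherwise.  The
   function name and the throughput estimate are hypothetical.

```python
from typing import Sequence


def select_representation(bandwidths: Sequence[int],
                          available_bps: float) -> int:
    """Return the index of the representation to request.

    bandwidths holds the "bandwidth" attribute (bits per second) of
    each representation in the current period.
    """
    if not bandwidths:
        raise ValueError("the period offers no representation")
    fitting = [i for i, b in enumerate(bandwidths) if b <= available_bps]
    if fitting:
        # Highest bit-rate that still fits the estimated throughput.
        return max(fitting, key=lambda i: bandwidths[i])
    # Nothing fits: fall back to the lowest bit-rate.
    return min(range(len(bandwidths)), key=lambda i: bandwidths[i])


rates = [250_000, 500_000, 1_000_000]
choice = select_representation(rates, available_bps=600_000)
```

   Calling the function again whenever the throughput estimate changes
   gives a simple form of the representation switching described
   above.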





3.1.3.2.  Segment list generation

   A segment list contains: 1) the URL of the initialisation segment;
   2) the URLs of the media segments; 3) the start times of the media
   segments in the period.  There are two approaches to generating a
   segment list.  One is template-based generation, which uses the
   "URLtemplate" element and "duration" attribute of the "SegmentInfo"
   element in the MPD.  The other is playlist-based generation, which
   uses the "Url" elements and "duration" attribute of the
   "SegmentInfo" element in the MPD.

3.1.3.3.  Seeking, trick mode and adaptation support

   Suppose that the client wants to seek to time "tp".  The
   corresponding segment can be located through: Target_segment_index =
   max { i | MediaSegment[i].StartTime <= tp - Period.start }.  For
   accurate seeking to time "tp", the client needs to access a RAP.
   The client may use the information in the 'sidx' box to locate the
   RAP and the corresponding presentation time in the media
   presentation.  For fast start-up, the client may initially request
   the 'sidx' box from the beginning of the media segment using byte
   range requests.
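
   Template-based segment list generation (Section 3.1.3.2) and the
   seek rule above can be sketched together in Python.  The "{index}"
   placeholder is Python's own format syntax, used here as a stand-in
   for the MPD template syntax; the 1-based numbering, the example URL
   and the function names are likewise assumptions.  Start times are
   approximate because "duration" is only an approximate segment
   duration.

```python
import math
from bisect import bisect_right
from typing import List, Tuple


def template_segment_list(url_template: str, segment_duration: float,
                          period_duration: float
                          ) -> List[Tuple[str, float]]:
    """Return (URL, start time relative to the period) per segment."""
    count = math.ceil(period_duration / segment_duration)
    return [(url_template.format(index=i + 1), i * segment_duration)
            for i in range(count)]


def target_segment_index(start_times: List[float], tp: float,
                         period_start: float) -> int:
    """max { i | start_times[i] <= tp - period_start }.

    start_times must be sorted in ascending order.
    """
    i = bisect_right(start_times, tp - period_start) - 1
    if i < 0:
        raise ValueError("tp lies before the first media segment")
    return i


# A 25-second period cut into segments of about 10 seconds.
segments = template_segment_list(
    "http://example.com/rep1/seg-{index}.3gp", 10, 25)
starts = [start for _, start in segments]
index = target_segment_index(starts, tp=112, period_start=100)
```

   The lookup only identifies the segment that covers "tp".  Finding
   the RAP inside that segment still requires the random access
   information discussed above.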

   Trick mode can be implemented by utilizing the "AlternatePlayoutRate"
   attribute in "TrickMode" element in MPD.

   Switching to a new representation is equivalent to seeking to the new
   representation.  The client should seek to a RAP in the new
   representation at a desired presentation time "tp" later than the
   current presentation time.

3.2.  OIPF

   The Open IPTV Forum (OIPF) introduces HTTP adaptive streaming in its
   technical specification [OIPF].  This specification defines the
   usage of and, where necessary, extensions to the technologies
   defined in 3GPP TS 26.234 to enable HTTP-based adaptive streaming
   for Release 2 OIPF-compliant services and devices.  Most details of
   HTTP adaptive streaming in this specification are based on 3GPP TS
   26.234.  The extensions and designs specific to OIPF are introduced
   in this section.

3.2.1.  MPD

   A Representation may be made up of multiple components, for example
   audio and video.  A partial Representation may contain only some of
   these components, and a terminal may need to download (and play)
   multiple partial Representations to build up a complete
   Representation, with the appropriate components according to the
   preferences and wishes of the user.  Accordingly, in the MPD, the
   "Representation" element may consist of one or more Components,
   which may be downloaded and provided to the terminal in addition to
   content being downloaded from other "Representation" elements.  In
   this case, the "Representation" element in the MPD SHALL contain one
   or more "Component" elements.

   The "Representation" element in the MPD may carry a "group"
   attribute.  The value of the "group" attribute SHALL be the same for
   Representations that have at least one Component in common.  Two
   Representations with completely different Components (e.g., audio
   in two different languages) SHALL have different values for the
   "group" attribute.

   To provide network Personal Video Recorder (nPVR) functionality,
   when the Segments of the live Content are stored on the nPVR server,
   the URLs indicating the Segments on the nPVR server SHOULD be
   provided to the OIPF to enable it to access these Segments by the
   MPD update mechanism defined in 3GPP TS 26.234.

3.2.2.  Segmentation

   Each Segment SHALL start with a random access point (RAP).  Moreover,
   to enable seamless switching:

      1) Different Component Streams of the same Component SHALL be
      encoded in the same media format but MAY differ in the profile of
      that format (e.g., if a Representation contains a Component
      Stream of a certain video Component that is encoded with
      H.264/AVC in the HD profile, then all other Representations that
      have a Component Stream of that Component must use H.264/AVC but
      may use different configurations within the HD profile).

      2) Segments of Representations with the same value for the "group"
      attribute SHALL be time aligned.

3.2.3.  Media formats for MPEG2-TS

   Component Streams of the same Component (e.g. "video angle 1 in H.264
   at 720x576" and "video angle 1 in H.264 at 320x288") SHALL be carried
   in transport stream packets that have the same PID.  When the
   Segments of a Representation contain MPEG-2 TS packets, the value of
   the "id" attribute in each Component element, if present, SHALL be
   the PID of the Transport Stream packets which carry the Component.

   For all Representations, the PAT and PMT are contained either in the
   initialisation Segments or in the media Segments.  Representations
   with a zero "group" attribute will have the same PAT/PMT as
   Representations with a non-zero "group" attribute.



   A media Segment SHALL contain the concatenation of one or several
   contiguous (and complete) PES packets, which are split and
   encapsulated into TS packets.  When packetizing video elementary
   streams, at most one frame SHALL be included in each PES packet.
   The PES packet in which a frame starts SHALL always contain PTS/DTS
   header fields in the PES header.

3.2.4.  Use cases

3.2.4.1.  Live streaming

   If the "timeShiftBufferDepth" attribute is present in the MPD, it may
   be used by the terminal to know at any moment which Segments are
   effectively available for downloading with the current MPD.  If this
   timeshift information is not present in the MPD, the terminal may
   assume that all Segments described in the MPD which are already in
   the past are available for downloading.  Periods may be used in the
   live streaming scenario to appropriately describe successive live
   events with different encoding or adaptive streaming properties.
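
   The availability rule described above can be sketched as follows.
   This is a simplification under stated assumptions: times are seconds
   on a common clock, a Segment is identified only by its start time,
   and the exact boundary conditions (for example, whether a Segment
   must be complete before it becomes available) are left to the
   specification.  The function name is hypothetical.

```python
from typing import List, Optional


def available_segments(start_times: List[float], now: float,
                       time_shift_buffer_depth: Optional[float] = None
                       ) -> List[float]:
    """Return start times of Segments assumed downloadable at 'now'."""
    # Only Segments that are already in the past are candidates.
    past = [t for t in start_times if t <= now]
    if time_shift_buffer_depth is None:
        # No timeshift information: assume all past Segments remain.
        return past
    oldest = now - time_shift_buffer_depth
    return [t for t in past if t >= oldest]


starts = [0, 10, 20, 30, 40]
windowed = available_segments(starts, now=35, time_shift_buffer_depth=20)
unbounded = available_segments(starts, now=35)
```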

3.2.4.2.  Trick mode and seeking

   Basic implementation of trick modes is based on the processing of
   Segments by the terminal software: downloaded Segments may be
   provided to the decoder at a speed lower or higher than normal.  The
   playback of Segments in fast forward and fast rewind has an immediate
   effect on the bitrate, because the Segments also need to be
   downloaded at a faster rate than normal.  Dedicated streams may be
   used to implement efficient trick modes: it is recommended to produce
   the streams with a lower frame rate, longer Segments or a lower
   resolution to ensure that the bitrate is kept at a reasonable level
   even when the Segment is downloaded at a faster rate.  The dedicated
   stream is described as Representation with a "TrickMode" element in
   the MPD.  It is also recommended that if there are dedicated fast
   forward Representations, the normal Representations do not contain
   the "TrickMode" element in the MPD.
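
   The effect of the playout speed on the download rate is simple
   arithmetic: assuming every Segment is downloaded in full, the
   required download rate scales with the playout rate.  A Python
   sketch with illustrative numbers (the function name and the values
   are assumptions):

```python
def required_download_rate(bandwidth_bps: float,
                           playout_rate: float) -> float:
    """Download rate needed to keep up with playback at playout_rate.

    playout_rate is a multiple of the regular playout speed; negative
    values stand for rewind.
    """
    return bandwidth_bps * abs(playout_rate)


# A 2 Mbit/s normal stream played at 4x needs 8 Mbit/s, whereas a
# dedicated 0.5 Mbit/s trick-mode stream at 4x needs only 2 Mbit/s.
normal = required_download_rate(2_000_000, 4)
dedicated = required_download_rate(500_000, 4)
```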

   To determine the random access point in a media Segment, the client
   should download and search the RAPs one by one until the required
   RAP is found.

3.3.  MPEG

   Recently, ISO/IEC JTC1/SC29/WG11 (MPEG) started work on a new
   standard for HTTP streaming.  A series of documents
   [MPEG-1][MPEG-2][MPEG-3][MPEG-4] address the background, objectives,
   use cases, and requirements for the transport of MPEG media over
   HTTP, and include a call for proposals on this topic.



3.3.1.  Objectives

   The main objectives of this new standard are:

      1) Efficient delivery of MPEG media over HTTP in an adaptive,
      progressive, download/streaming fashion.

      2) Support of live streaming of multimedia content.

      3) Efficient and easy use of existing content distribution
      infrastructure components such as CDNs, proxies, caches, NATs and
      firewalls.

      4) Support of integrated services with multiple components.

      5) Support for signaling, delivery, utilization of multiple
      content protection and rights management schemes, and support for
      efficient content forwarding and relay.

3.3.2.  Requirements for proposal

   MPEG sets out a list of requirements on HTTP streaming.  Only those
   related to media delivery are summarized below.

      1) This standard shall support streaming of content and content
      components over HTTP 1.1.

      2) The media files prepared for this standard should be
      deliverable using progressive download with minimal changes.

      3) This standard shall support streaming of live content of
      possibly indefinite length, including PVR functionalities such as
      pause and time-shifted play.

      4) The standard shall support random access (seeking).

      5) The standard shall support trick modes at least to the extent
      that the underlying formats support them in local playback.

      6) The standard shall not require any extension to HTTP 1.1.  It
      shall support the efficient use of HTTP optimized infrastructures
      such as Content Delivery Networks (CDNs), caches and proxies.

      7) The standard shall allow segmentation of the content.  The
      standard shall not require fixed size or fixed duration segments
      during delivery of content.





      8) The standard should introduce minimal transport overhead and
      should incur minimal presentation startup delay.

      9) The standard shall support description of media components for
      delivery and presentation.

      10) The standard shall support interactive selection of media
      components for delivery and presentation, for example view
      selection in multi-view content.

      11) This standard shall support prioritization of content and
      content components.

      12) This standard shall support signaling the relationship among
      content components.

      13) The standard should support network transition during delivery
      of the content.

      14) The standard shall enable adaptation of content along axes
      such as bitrate, temporal resolution, spatial resolution, quality/
      fidelity or view perspective.

      15) The standard shall support initial selection, and dynamic
      adaptation of the content without presentation interruption during
      delivery.


4.  HTTP Streaming Implementations

4.1.  Microsoft Smooth Streaming

   Smooth Streaming is Microsoft's implementation of adaptive streaming,
   a web-based media delivery method that uses standard HTTP [MS-IIS].
   Instead of delivering media as a full-file download or as a
   progressive download, the content is delivered to the client as a
   series of small file chunks that can be easily and cheaply cached at
   edge servers close to the client.  Smooth Streaming defines each
   chunk/GOP as an MPEG-4 Movie Fragment and stores it within a
   contiguous MP4 file for easy random access.  One MP4 file is expected
   for each bit rate.  Because the media is only "virtually" split into
   fragment files, the server must translate each URL request into
   exact byte range offsets within the MP4 file.  The server then
   extracts the fragment box and sends it over the wire to the client as
   a standalone file.

4.1.1.  On-disk MP4 file format

   +-------------------------------------------------------------------+
   | +----+ +---------------------+ +--------------+ +------+ +------+ |
   | |    | | Movie Metadata(moov)| |Movie Fragment| |Media | |Movie | |
   | |file| |+-----++-----++-----+| |    (moof)    | |Data  | |Frag  | |
   | |type| ||Movie||Track||Movie|| |+----+ +-----+| |(mdat)| |Random| |
   | |    | ||hdr  ||     ||Ext. || ||Frag| |Track|| |      | |Access| |
   | |    | ||     ||     ||     || ||hdr | |Frag || |      | |(mfra)| |
   | |    | ||     ||     ||     || ||    | |     || |      | |      | |
   | |    | |+-----++-----++-----+| |+----+ +-----+| |      | |      | |
   | +----+ +---------------------+ +--------------+ +------+ +------+ |
   +-------------------------------------------------------------------+

   In a nutshell, the MP4 file starts with file-level metadata ('moov')
   that generically describes the file, but the bulk of the payload is
   actually contained in the fragment boxes that also carry more
   accurate fragment-level metadata ('moof') and media data ('mdat').
   Closing the file is an 'mfra' index box that allows easy and accurate
   seeking within the file.
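
   The box layout above can be explored with a short script.  The
   following Python sketch is illustrative only and is not taken from
   [MS-IIS].  It assumes the usual MP4 box header: a 32-bit big-endian
   size followed by a four-character type, where size 1 signals a
   64-bit size field and size 0 means "to the end of the file".  The
   names iter_boxes and make_box are hypothetical helpers introduced
   for this example.

```python
import struct


def iter_boxes(data):
    """Yield (box_type, offset, size) for each top-level box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break  # truncated or malformed box: stop rather than guess
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size


def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size field (demo helper only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic data that mimics the layout in the figure; the payloads are
# placeholders, not valid media.
sample = b"".join([
    make_box(b"ftyp", b"isml"),
    make_box(b"moov"),
    make_box(b"moof", b"\x00" * 4),
    make_box(b"mdat", b"\x00" * 16),
    make_box(b"mfra"),
])
for box_type, offset, size in iter_boxes(sample):
    print(box_type, offset, size)
```

   Applied to a real *.ismv file, such a walk would be expected to list
   'ftyp' and 'moov' first, then repeated 'moof'/'mdat' pairs, and
   finally 'mfra'.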

   In Smooth Streaming, the MP4 media files are of two kinds: *.ismv
   files contain video and audio, and *.isma files contain audio only.
   Besides the media files, there are manifest files.  The server
   manifest file (*.ism) describes the relationships between the media
   tracks, bit rates and files on disk.  The client manifest file
   (*.ismc) describes the available streams to the client: the codecs
   used, bit rates encoded, video resolutions, markers, captions, etc.

4.1.2.  On-wire segments transmission

   Initially, the client requests the *.ismc client manifest from the
   server.  The client then requests fragments using URLs of the
   following form:

      http://video.foo.com/NBA.ism/QualityLevels(400000)/Fragments(video=610275114)

   The server looks up the quality level (bit rate) in the
   corresponding *.ism server manifest and maps it to a physical *.ismv
   or *.isma file on disk.  The server reads the appropriate MP4 file
   and, based on its 'tfra' index box, determines which fragment box
   ('moof' + 'mdat') corresponds to the requested start time offset.
   The server extracts the fragment box and sends it over the wire to
   the client as a standalone file.  The sent fragment can then be
   cached further down the network, potentially saving the origin
   server from sending the same fragment again to another client that
   requests the same URL.
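
   The request mapping described above can be sketched as follows.
   This is a minimal Python illustration, not Microsoft's
   implementation: the URL pattern is inferred from the single example
   URL in this section rather than from a formal grammar, and the names
   build_fragment_url, parse_fragment_url and locate_fragment, as well
   as the two lookup tables, are hypothetical.

```python
import re

# Pattern inferred from the example URL in the text; not a formal grammar.
FRAGMENT_URL = re.compile(
    r"^(?P<base>.+\.ism)/QualityLevels\((?P<bitrate>\d+)\)"
    r"/Fragments\((?P<track>\w+)=(?P<start>\d+)\)$"
)


def build_fragment_url(base, bitrate, track, start):
    """Client side: form the URL for one fragment of one quality level."""
    return f"{base}/QualityLevels({bitrate})/Fragments({track}={start})"


def parse_fragment_url(url):
    """Server side: recover (base, bitrate, track, start) from the URL."""
    match = FRAGMENT_URL.match(url)
    if match is None:
        raise ValueError(f"not a fragment URL: {url!r}")
    return (match["base"], int(match["bitrate"]),
            match["track"], int(match["start"]))


def locate_fragment(url, files_by_bitrate, fragment_index):
    """Map a fragment URL to (file path, byte offset, byte length).

    files_by_bitrate stands in for the *.ism server manifest and
    fragment_index for the 'tfra' lookup; both are hypothetical tables.
    """
    _, bitrate, track, start = parse_fragment_url(url)
    path = files_by_bitrate[bitrate]
    offset, length = fragment_index[path][(track, start)]
    return path, offset, length
```

   Because every client that wants the same fragment at the same
   quality level forms the same URL, an ordinary HTTP cache can serve
   repeated requests without contacting the origin server.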

4.1.3.  Adaptive support

   Smooth Streaming provides multiple encoded bit rates of the same
   media source and thus allows the client to switch seamlessly between
   bit rates.  As the client plays chunks, network conditions may change
   or media processing may be impacted by other applications.  The
   client can then request that the next chunk come from a stream
   encoded at a different bit rate to accommodate the changing
   conditions.  This lets the client keep playing with little or no
   stuttering, buffering or freezing, at the best quality that current
   conditions allow.
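
   The switching rule itself is not specified in this document.  As an
   illustration only, the following Python sketch assumes a simple
   throughput-based rule: pick the highest encoded bit rate that fits
   within a safety margin of the measured throughput.  The function
   name pick_bitrate and the 0.8 safety factor are assumptions made for
   this example.

```python
def pick_bitrate(available_bps, measured_bps, safety=0.8):
    """Choose the bit rate for the next chunk.

    Returns the highest available bit rate not exceeding
    safety * measured_bps, or the lowest available bit rate when even
    that one does not fit.
    """
    budget = measured_bps * safety
    fitting = [rate for rate in sorted(available_bps) if rate <= budget]
    return fitting[-1] if fitting else min(available_bps)


# Hypothetical quality levels, in bits per second.
levels = [400_000, 800_000, 1_500_000]
print(pick_bitrate(levels, 1_200_000))
print(pick_bitrate(levels, 300_000))
```

   A real client would typically also take buffer occupancy and
   decoding load into account before switching.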

4.2.  Adobe

   Adobe HTTP Dynamic Streaming is a new Adobe-defined delivery method
   for enabling on-demand and live adaptive bitrate video streaming over
   regular HTTP connections [Adobe].  HTTP Dynamic Streaming packages
   media files into fragments that Flash Player clients can access
   instantly without downloading the entire file.  Adobe HTTP Dynamic
   Streaming contains several components that work together to package
   media and stream it over HTTP to Flash Player.

4.2.1.  Components

   The File Packagers comprise the Live Packager and the VoD Packager.
   The VoD Packager translates on-demand media files into fragments and
   writes the fragments to F4F files.  The Live Packager translates live
   streams ingested over the Real Time Messaging Protocol (RTMP) into
   F4F files in real time.

   HTTP Origin Module is an Apache HTTP Server module that serves the
   F4F files created by the File Packagers.

   The F4F file format describes how to divide media content into
   segments and fragments.  Each fragment has its own bootstrap
   information that provides cache management and fast seeking.  The F4M
   Manifest file format contains information about a package of files
   that the HTTP Origin Module can serve.  Manifest information includes
   codecs, resolutions, and the availability of files encoded at
   multiple bit rates.

4.2.2.  Workflow

   The HTTP Dynamic Streaming workflow includes content preparation,
   which writes media fragments into files, distribution of the files
   over HTTP, media consumption and protection, etc.

            +--------+        +-------+        +-------+        +------+
            |        |        |       |        |       |        |      |
     Live   |        |F4F/F4M |       |        |       |        |      |
   streaming|File    |Files   | HTTP  |HTTP    | HTTP  |HTTP    |Client|
   -------->|Packager|------->| Origin|Delivery| Cache/|Delivery|Appl. |
            |        |        | Module|------->| CDN   |------->|      |
     VoD    |        |        |       |        |       |        |      |
   content  |        |        |       |        |       |        |      |
   -------->|        |        |       |        |       |        |      |
            +--------+        +-------+        +-------+        +------+

4.2.3.  Top features

   HTTP Dynamic Streaming supports features like adaptive bitrate, DVR
   functionality, etc.

      1) Adaptive bitrate.  To stream multi-bitrate content, the server
      encodes a piece of media at multiple bitrates, creating multiple
      files.  The media files share a manifest file that lists
      information about each media file.  With this information, the
      client measures its own bandwidth, computing resources, etc. and
      requests content fragments encoded at the most appropriate bitrate
      for the best viewing experience.

      2) DVR functionality.  Live streams can be made interactive by
      enabling DVR functionality, which allows viewers to pause, rewind,
      and skip forward to real time.

      3) Support for standard HTTP caching systems.  Existing standard
      server hardware and caching infrastructures can be used to
      maximize capacity and reach.

4.3.  Apple

   Apple HTTP Live Streaming [Apple] allows live or prerecorded audio
   and video to be sent to iPhone or other devices, such as desktop
   computers, using an ordinary Web server.  Playback requires iPhone OS
   3.0 or later on iPhone or iPod touch; QuickTime X or later is
   required on the desktop.

4.3.1.  Basic process

            +-------+      +---------+         +------+        +------+
            |       |      |         |         |      |        |      |
     Live   |       |MPEG2 |         |Index/   |      |        |      |
   streaming|Media  |TS    |Stream   |.ts Files|HTTP  |HTTP    |Client|
   -------->|Encoder|----->|Segmenter|-------->|Server|Delivery|Appl. |
            |       |      |         |         |      |------->|      |
     VoD    |       |      |         |         |      |        |      |
   content  |       |      |         |         |      |        |      |
   -------->|       |      |         |         |      |        |      |
            +-------+      +---------+         +------+        +------+

   The media encoder takes audio-video input and turns it into an MPEG-2
   Transport Stream.  Currently, the supported formats are MPEG-2
   Transport Streams (with H.264 video and AAC audio) for audio-video,
   and MPEG elementary streams for audio.

   The stream segmenter reads the Transport Stream from the local
   network and divides it into a series of small media files (.ts
   files) of equal duration.  It also creates an index file containing a
   playlist of the media files, as well as metadata.  The index file is
   in .M3U8 format.  In the case of a live stream, the index file is
   updated each time the segmenter completes a new media file.  The
   index is used to track the availability and location of the media
   files.  Both the .ts and .M3U8 files are placed on an HTTP server.

   An HTTP server or a web caching system delivers the media files and
   index files to the client over HTTP.

   A client begins by fetching the index file, based on a URL
   identifying the stream.  The index file in turn specifies the
   location of the available media files, decryption keys, and any
   alternate streams available.  For the selected stream, the client
   downloads each available media file in sequence.  Each file contains
   a consecutive segment of the stream.  Once it has a sufficient amount
   of data downloaded, the client begins presenting the reassembled
   stream to the user.
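
   The client's first step, reading the index file, can be sketched in
   a few lines of Python.  This is a simplified illustration rather
   than Apple's implementation: it handles only the #EXTM3U, #EXTINF
   and #EXT-X-ENDLIST tags of the .M3U8 playlist format and ignores
   keys and alternate streams.  The function name parse_media_playlist
   and the example URIs are hypothetical.

```python
def parse_media_playlist(text):
    """Return (segments, ended) from the text of a media playlist.

    segments is a list of (uri, duration_in_seconds) in playback order.
    ended is True when the playlist carries #EXT-X-ENDLIST, meaning no
    further segments will be appended.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("not an extended M3U playlist")
    segments, duration, ended = [], None, False
    for ln in lines[1:]:
        if ln.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<optional title>"
            duration = float(ln[len("#EXTINF:"):].split(",", 1)[0])
        elif ln == "#EXT-X-ENDLIST":
            ended = True
        elif not ln.startswith("#"):
            # A non-tag line is the URI of one media file.
            segments.append((ln, duration))
            duration = None
    return segments, ended


# Hypothetical index file for a two-segment stream.
SAMPLE = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10,
http://media.example.com/segment0.ts
#EXTINF:10,
http://media.example.com/segment1.ts
#EXT-X-ENDLIST
"""
for uri, seconds in parse_media_playlist(SAMPLE)[0]:
    print(uri, seconds)
```

   For a live stream, a client would re-fetch the index file while
   ended is false and request only the URIs it has not yet downloaded.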

   In addition, HTTP Live Streaming supports adaptive bitrate: playback
   automatically switches to the most suitable bitrate for the current
   network conditions in order to keep playback smooth.


5.  Gap Analysis

5.1.  Brief Summary of Existing Work

   It can be observed that 3GPP, OIPF, MS Smooth Streaming, Adobe
   Dynamic Streaming and Apple HTTP Live Streaming all follow a similar
   design, that is:

      1) The streaming server uses a stream encoder/segmenter to write
      the media content into a series of small files, and produces a
      manifest file that describes these media files.  The table below
      summarizes the media and manifest files currently defined for HTTP
      streaming, regardless of the codec and media container type.

                      |  Media File      |  Manifest File  |
   =========================================================
   3GPP/OIPF          | .3GP file        |    .3GP file    |
   ---------------------------------------------------------
   MS Smooth HTTP     | .ismv/.isma file | .ism/.ismc file |
   ---------------------------------------------------------
   Adobe Dynamic HTTP | .F4F file        |    .F4M file    |
   ---------------------------------------------------------
   Apple Live HTTP    | .ts file         |    .M3U8 file   |

      2) The HTTP client first obtains the manifest file, then
      constructs a series of URIs pointing to the media files.  Based on
      the client's conditions (e.g. network, device type, etc.), or when
      the user operates a trick mode, the client chooses which media
      file to request and issues an HTTP request with the corresponding
      URI.

      3) Upon receiving the HTTP request, the HTTP server sends the
      media file corresponding to the URI in the request to the client.

   The above design leaves the network transport out of scope: the
   media (both live streaming and VoD content) is encapsulated into
   files and then transmitted by standard HTTP as payload.  From the
   network transport point of view, there is no difference between the
   transmission of such media data and that of a normal text file.  All
   the main features of media streaming, such as media
   meta-information, PVR functions, seeking, trick modes, adaptation
   between different viewing qualities, etc., are implemented (or can
   be implemented) between the server and the client by means of a
   flexible MPD or manifest file.  In other words, all the intelligence
   in the current HTTP streaming design resides in the server and
   client software, rather than in the network transport.

5.2.  Challenges

   However, streaming long-duration, high-quality media over the
   best-effort Internet while satisfying real-time streaming
   requirements faces several challenges when the network provides no
   specific support for HTTP streaming.

   The first challenge is that current HTTP streaming is based on a pull
   model, in which the HTTP client relies on the updated manifest file
   from the server to pull chunks one after another by issuing a
   sequence of HTTP requests to the HTTP server.  In the case of live
   streaming, the server needs to update the manifest file each time a
   new chunk of live media becomes available.  Hence, a potential
   problem is that additional round trips between the client and the
   server are needed to fetch the updated manifest file before the
   client can request each new chunk, which could put the real-time
   nature of live streaming at risk.  An HTTP server push model, on the
   other hand, enables the server to actively and continuously push
   chunks to the client as soon as they become available on the server,
   without the round trips for manifest file updates.  In this sense, a
   push model could be more efficient and a better candidate for
   time-sensitive scenarios.
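
   The cost of the extra round trips can be made concrete with a rough
   back-of-the-envelope model.  The Python sketch below is an
   assumption-laden illustration, not a measurement: it supposes that a
   pull client polls the manifest at a fixed interval (worst case: the
   chunk appears just after a poll), then spends one round trip
   fetching the manifest and one requesting the chunk, whereas a push
   server starts sending as soon as the chunk exists.  The function
   names and the example figures are hypothetical.

```python
def pull_delay_ms(rtt_ms, poll_interval_ms, transfer_ms):
    """Worst-case delay from chunk availability to full receipt (pull)."""
    manifest_fetch = rtt_ms  # request and response for the manifest
    chunk_request = rtt_ms   # request and first byte of the chunk
    return poll_interval_ms + manifest_fetch + chunk_request + transfer_ms


def push_delay_ms(rtt_ms, transfer_ms):
    """Delay from chunk availability to full receipt (push)."""
    one_way = rtt_ms / 2  # the server sends without waiting for a request
    return one_way + transfer_ms


# Illustrative figures only: 100 ms RTT, 2 s polling, 500 ms transfer.
print(pull_delay_ms(100, 2000, 500))
print(push_delay_ms(100, 500))
```

   Under this model the difference is dominated by the polling
   interval, so the benefit of push grows as the manifest is refreshed
   less often.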

   The second challenge is the lack of QoE improvement and monitoring
   mechanisms in current HTTP streaming systems.  Compared to a
   dedicated IPTV system, HTTP streaming over the best-effort Internet
   may suffer more from network transitions.  For example, when a user
   switches live channels, the current group of pictures (GoP) and the
   initialization information for decoders (a.k.a. Reference
   Information (RI)) of the media content need to be acquired by the
   client as soon as possible to start playback.  Unfortunately, there
   is so far no mechanism to prioritize the transmission of these
   important HTTP packets, which may introduce a long startup delay in
   HTTP streaming.  Additionally, some session-level QoE metrics, such
   as startup delay, are important to an HTTP streaming system for
   monitoring or diagnostic purposes.  Unfortunately, current HTTP
   streaming systems have no such quality monitoring mechanism
   (comparable to, e.g., RTCP reports).  To provide a high-quality
   service to the user, monitoring and analyzing the system's overall
   performance is extremely important, since a performance monitoring
   capability can help diagnose potential network impairments.
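
   As an illustration of what session-level monitoring might record,
   the following Python sketch collects startup delay and playback
   stalls on the client and summarizes them in a report.  It is a
   hypothetical example: no such report format is defined by the
   specifications surveyed here, and the class name SessionQoE, its
   fields and the report keys are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SessionQoE:
    """Client-side QoE record for one playback session (times in seconds)."""
    play_requested_at: float
    first_frame_at: Optional[float] = None
    stalls: List[Tuple[float, float]] = field(default_factory=list)

    def startup_delay(self) -> Optional[float]:
        """Time from the play request to the first rendered frame."""
        if self.first_frame_at is None:
            return None
        return self.first_frame_at - self.play_requested_at

    def total_stall_time(self) -> float:
        """Sum of all (start, end) stall intervals."""
        return sum(end - start for start, end in self.stalls)

    def report(self) -> dict:
        """Summary that a client could send back for diagnostics."""
        return {
            "startup_delay_s": self.startup_delay(),
            "stall_count": len(self.stalls),
            "total_stall_s": self.total_stall_time(),
        }


session = SessionQoE(play_requested_at=10.0, first_frame_at=12.5)
session.stalls.append((20.0, 21.0))
print(session.report())
```

   How and when such a report would be conveyed to the server is
   exactly the kind of feedback mechanism that current HTTP streaming
   systems lack.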

   Given these challenges, the typical user experience in existing HTTP
   streaming schemes can be limited by delayed startups, poor quality,
   buffering delays, etc.  This matters especially for "Multi-Screen"
   applications, where the service provider intends to provide a common
   user experience when the user enjoys the media content across PCs,
   TVs, and smart-phones.  Therefore, HTTP streaming over the Internet
   without some optimization of the network transport for QoE
   improvement may make it difficult for the service provider to comply
   with the service level agreements (SLAs) between the service
   provider and its users.

5.3.  Gap List and Potential Working Scope in IETF

   The following table lists the gaps in existing work on HTTP
   streaming, including 3GPP, MS, Adobe, and Apple.

                                     |     Satisfied by     |
             Characteristic          |    existing work?    |
   ==========================================================
   Adaptive bit rate                 |    Yes    |          |
   ----------------------------------------------------------
   Playback control                  |    Yes    |          |
   ----------------------------------------------------------
   Use existing cache, CDN           |    Yes    |          |
   ----------------------------------------------------------
   client pull model                 |    Yes    |          |
   ----------------------------------------------------------
   server push model                 |           |    No    |
   ----------------------------------------------------------
   Reliable transmission in network  |    Yes    |          |
   ----------------------------------------------------------
   Real-time support in network      |           |    No    |
   ----------------------------------------------------------
   QoE improvement (e.g. startup)    |           |    No    |
   ----------------------------------------------------------
   QoE monitoring                    |           |    No    |
   ----------------------------------------------------------
   Multicast support for scalability |           |    No    |

   As the leading SDO working to make the Internet work better, the
   IETF is a suitable place to address the gaps listed above by studying
   and enhancing the network to meet the real-time requirements of HTTP
   streaming systems.  A potential working scope could be to:

      1) investigate the use of a server push model in HTTP streaming,
      to find a better model for more time-sensitive applications such
      as live streaming;

      2) study QoE monitoring and feedback mechanisms (e.g. similar to
      RTCP reports) for HTTP streaming systems, including the monitoring
      architecture, feedback message coding, QoE metrics for HTTP
      streaming, etc.;

      3) define mechanisms for QoE improvement in HTTP streaming, such
      as reducing the playback startup delay when a user switches live
      channels or starts VoD;

      4) further improve real-time streaming performance from the
      perspective of network transport functions.

   Please refer to [HTTPStreamingPS] for more details on the problem
   statement and scope of work.


6.  IANA Considerations

   This document presently raises no IANA considerations.


7.  Security Considerations

   This document presently raises no security considerations.


8.  Acknowledgements

   The author would like to thank the many people who gave valuable
   comments on this draft.


9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2.  Informative References

   [3GPP]     3GPP, "Transparent end-to-end Packet-switched Streaming
              Service (PSS) - Protocols and codecs (Release 9)",
              March 2010.

   [OIPF]     OIPF, "HTTP Adaptive Streaming (Release 2)",
              September 2010.

   [MPEG-1]   ISO/IEC JTC1/SC29/WG11, "HTTP Streaming of MPEG Media
              Context and Objectives (N11337)", April 2010.

   [MPEG-2]   ISO/IEC JTC1/SC29/WG11, "Call for Proposals on HTTP
              Streaming of MPEG Media (N11338)", April 2010.

   [MPEG-3]   ISO/IEC JTC1/SC29/WG11, "Use Cases for HTTP Streaming of
              MPEG Media (N11339)", April 2010.

   [MPEG-4]   ISO/IEC JTC1/SC29/WG11, "Requirements on HTTP Streaming of
              MPEG Media (N11340)", April 2010.

   [MS-IIS]   Microsoft Corporation, "IIS Smooth Streaming Technical
              Overview", March 2009.

   [Adobe]    Adobe, "Using ADOBE HTTP DYNAMIC STREAMING", 2010.

   [Apple]    Apple, "HTTP Live Streaming Overview", November 2009.

   [HTTPStreamingPS]
              Wu, Q., "Problem Statement for HTTP Streaming",
              draft-wu-http-streaming-optimization-ps-02.txt (work in
              progress), September 2010.


Author's Address

   Ning Zong
   Huawei Technologies

   Phone: +86 25 56624760
   Email: zongning@huawei.com