<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc3550 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml">
<!ENTITY rfc3711 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3711.xml">
<!ENTITY rfc5234 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5234.xml">
<!ENTITY rfc3830 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3830.xml">
<!ENTITY rfc3264 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3264.xml">
<!ENTITY rfc4567 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4567.xml">
<!ENTITY rfc4568 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4568.xml">
<!ENTITY rfc5285 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5285.xml">
<!ENTITY rfc5506 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5506.xml">
<!ENTITY rfc5761 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5761.xml">
<!ENTITY rfc5764 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5764.xml">
<!ENTITY draft-ietf-mmusic-sdp-bundle-negotiation SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-mmusic-sdp-bundle-negotiation.xml">
<!ENTITY draft-rosenberg-rtcweb-rtpmux SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.rosenberg-rtcweb-rtpmux.xml">
]>
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std"
     docName="draft-westerlund-avtcore-transport-multiplexing-04"
     ipr="trust200902">
  <front>
    <title abbrev="Multiple RTP Sessions on Single Transport">Multiple RTP
    Sessions on a Single Lower-Layer Transport</title>

    <author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>Farogatan 6</street>

          <city>SE-164 80 Kista</city>

          <country>Sweden</country>
        </postal>

        <phone>+46 10 714 82 87</phone>

        <email>magnus.westerlund@ericsson.com</email>
      </address>
    </author>

    <author fullname="Colin Perkins" initials="C. S." surname="Perkins">
      <organization>University of Glasgow</organization>

      <address>
        <postal>
          <street>School of Computing Science</street>

          <city>Glasgow</city>

          <code>G12 8QQ</code>

          <country>United Kingdom</country>
        </postal>

        <email>csp@csperkins.org</email>
      </address>
    </author>

    <date day="22" month="October" year="2012"/>

    <abstract>
      <t>This document specifies how multiple RTP sessions are to be
      multiplexed on the same lower-layer transport, e.g. a UDP flow. It
      discusses the requirements that have been raised and their
      feasibility, which together determine the applicability of the
      solution. One solution is recommended and specified in detail,
      including signalling and examples.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>There has been renewed interest in a solution that allows
      multiple <xref target="RFC3550">RTP sessions</xref> to use a single
      lower-layer transport, such as a bi-directional UDP flow. The main
      reason is the cost of doing NAT/FW traversal for each individual flow.
      ICE and other NAT/FW traversal solutions are clearly capable of
      attempting to open multiple flows. However, creating multiple flows
      carries both an increased risk of failure and an increased cost. The
      increased cost takes the form of slightly higher delay in establishing
      the traversal and a larger amount of consumed NAT/FW resources. The
      latter might be an increasing problem in the IPv4 to IPv6 transition
      period.</t>

      <t>There is ongoing work on specifying how and when one RTP session may
      contain multiple <xref
      target="I-D.ietf-avtcore-multi-media-rtp-session">media types</xref>.
      That addresses certain use cases, while this proposal addresses a
      different set of use cases and motivations. This is further discussed in
      the section on <xref target="sec-motivations">Motivations</xref>. The
      classical method of having one RTP session over a specific transport
      flow is still motivated for a number of use cases, especially when flow
      based QoS is to be used for some media streams.</t>

      <t>This document draws up some requirements for consideration on how to
      transport multiple RTP sessions over a single lower-layer transport.
      These requirements had to be weighed against each other, because no
      known solution fulfills the combined set completely.</t>

      <t>A number of possible solutions were considered and discussed with
      respect to their properties. Based on that, the authors recommended a
      shim layer variant as the single solution, which is specified in detail
      including a signalling solution and examples. The other considered
      proposals and the comparison are available as appendices.</t>
    </section>

    <section title="Conventions">
      <t/>

      <section title="Terminology">
        <t>Some terminology used in this document.</t>

        <t><list style="hanging">
            <t hangText="Multiplexing:">Unless specifically noted, all
            mentions of multiplexing in this document refer to the
            multiplexing of multiple RTP sessions on the same lower-layer
            transport. It is important to make this distinction as RTP
            contains a number of multiplexing points for various purposes,
            such as media formats (Payload Type), media sources (SSRC), and
            RTP sessions.</t>
          </list></t>
      </section>

      <section title="Requirements Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.</t>
      </section>
    </section>

    <section anchor="sec-motivations" title="Motivations">
      <t>This section looks at why an additional solution is needed, given
      that one can already use both the classical method of having one RTP
      session per transport flow, as defined by the <xref
      target="RFC3550">RTP specification</xref>, and <xref
      target="I-D.ietf-avtcore-multi-media-rtp-session">multiple media types
      within one RTP session</xref>.</t>

      <section title="NAT and Firewalls">
        <t>The existence of NATs and Firewalls at almost all Internet access
        points has had implications for protocols like RTP that were designed
        to use multiple transport flows. First of all, the NAT/FW traversal
        solution one uses needs to ensure that all these transport flows are
        established. This has three different impacts:<list style="numbers">
            <t>Increased delay to perform the transport flow
            establishment.</t>

            <t>The more transport flows, the more state and the more resource
            consumption in the NATs and Firewalls. When resource consumption
            in a NAT/FW reaches its limits, unexpected behaviors usually
            occur, commonly resulting in service disruptions.</t>

            <t>More transport flows mean a higher risk that some transport
            flow fails to be established, thus preventing the application
            from communicating.</t>
          </list></t>

        <t>Using fewer transport flows reduces the risk of communication
        failure, improves establishment behavior, and puts less load on NATs
        and Firewalls.</t>
      </section>

      <section title="No Transport Level QoS">
        <t>Many RTP-using applications don't utilize any network-level Quality
        of Service functions. Nor do they expect or desire any separation in
        the network treatment of their media packets, independent of whether
        they are audio, video or text. When an application has no such desire,
        it doesn't need to provide a transport flow structure that simplifies
        flow-based QoS.</t>
      </section>

      <section anchor="sec-multiple-sessions" title="Multiple RTP sessions">
        <t>The use of multiple RTP sessions allows separation of media
        streams that have different usages or purposes in an RTP-based
        application, for example separating the video of a presenter or the
        most important current talker from the videos of the listeners, which
        not all end-points receive. It also allows separation for different
        processing based on media type, such as audio and video, in
        end-points and central nodes. This gives a node the knowledge that
        every SSRC within a session is supposed to be processed in the same
        or a similar way.</t>

        <t>For simpler cases, where the streams within each media type need
        the same processing, it is clearly possible to find other multiplexing
        solutions, for example based on the Payload Type and the differences
        in encoding that the payload type can describe. This may, however, be
        insufficient in more advanced usages where there are multiple sources
        of the same media type, but for different usages or as alternatives.
        For example, when one set of video sources shows session participants
        and another set of video sources shares an application or
        presentation slides, one likely wants to separate those streams for
        various reasons such as control, prioritization, QoS, methods for
        robustification, etc. In those cases, using the RTP session for
        separation of properties is a powerful tool, and its properties need
        to be preserved when providing a solution for how to use only a
        single lower-layer transport.</t>

        <t>For more discussion of the usage of RTP sessions versus other
        multiplexing we recommend <xref
        target="I-D.westerlund-avtcore-multiplex-architecture">RTP
        Multiplexing Architecture</xref>.</t>
      </section>

      <section title="Usage of RTP Extensions">
        <t>Applications use different sets of RTP extensions. The solution
        for <xref target="I-D.ietf-avtcore-multi-media-rtp-session">multiple
        media types in one RTP session</xref> is known to have limitations
        that prevent the usage of the following RTP mechanisms and
        extensions:</t>

        <t><list style="symbols">
            <t>XOR FEC (RFC5109)</t>

            <t>RTP Retransmission in session mode (RFC4588)</t>

            <t>Certain Layered Coding</t>
          </list>A developed solution should minimize the number of RTP/RTCP
        extensions and mechanisms that can't be used.</t>
      </section>

      <section title="Incremental Deployment">
        <t>In various multi-party communication scenarios deployment can
        become an issue if all session participants are required to have the
        functionality before enabling its usage. This is especially difficult
        in communication scenarios where not all possible participants and
        their capabilities are known ahead of establishing the communication
        session with some sub-set of the participants. At least for
        centralized communication sessions it is desirable to have a solution
        that can be used on a single leg without affecting any other leg and
        without requiring advanced translation functionality in any central
        node.</t>
      </section>

      <section title="Summary">
        <t>The center of the motivation is to ensure that the RTP session is
        an available and usable tool also for applications that have no need
        for network-level separation of their media streams and want to reduce
        their exposure to any NAT or Firewall inconsistencies and minimize
        the resource consumption. As a benefit, a well-designed solution will
        enable incremental deployment and place minimal limitations on which
        existing RTP mechanisms or extensions can be used by the RTP-using
        application.</t>
      </section>
    </section>

    <section title="Requirements">
      <t>This section lists and discusses a number of potential requirements.
      However, it is easy to state requirements that together make the set
      of feasible solutions empty. It is thus necessary to consider which
      requirements are essential to fulfill and which can be compromised on
      to arrive at a solution.</t>

      <section anchor="sec-req-multiple-session"
               title="Support Use of Multiple RTP Sessions">
        <t><xref target="sec-multiple-sessions"/> discusses a number of
        reasons why an application may want to have multiple RTP sessions.
        Considering the motivations for this work, this must be an absolute
        requirement. We are also of the opinion that the session provided by
        the solution must fulfill the definition in the <xref
        target="RFC3550">RTP</xref> specification:<list style="hanging">
            <t>"The distinguishing feature of an RTP session is that each
            maintains a full, separate space of SSRC identifiers (defined
            next). The set of participants included in one RTP session
            consists of those that can receive an SSRC identifier transmitted
            by any one of the participants either in RTP as the SSRC or a CSRC
            (also defined below) or in RTCP."</t>
          </list></t>

        <t/>
      </section>

      <section anchor="seqreq-multi-ssrc"
               title="Same SSRC Value in Multiple RTP Sessions">
        <t>Two different RTP sessions being multiplexed on the same
        lower-layer transport need to be able to use the same SSRC value.
        This is an absolute requirement, for two reasons:</t>

        <t><list style="numbers">
            <t>To avoid mandating SSRC assignment rules that are coordinated
            between the sessions. If the RTP sessions multiplexed together
            must have unique SSRC values, then additional code that works
            between RTP sessions is needed in the implementations, which
            raises the bar for implementing this solution. In addition, if
            a gateway sits between parts of a system using this multiplexing
            and parts that aren't multiplexing, the part that isn't
            multiplexing must also fulfill the requirements on how SSRCs are
            assigned, or the gateway is forced to translate SSRCs.
            Translating SSRCs is hard, as it requires one to understand the
            semantics of all current and future RTP and RTCP extensions.
            Otherwise a barrier for deploying new extensions is created.</t>

            <t>A few RTP extensions currently rely on being able to use the
            same SSRC in different RTP sessions: <list
                style="symbols">
                <t>XOR FEC (RFC5109)</t>

                <t>RTP Retransmission in session mode (RFC4588)</t>

                <t>Certain Layered Coding</t>
              </list></t>
          </list></t>
      </section>

      <section title="SRTP">
        <t><xref target="RFC3711">SRTP</xref> is one of the most commonly used
        security solutions for RTP. In addition, it is the only one defined by
        the IETF that is integrated into RTP. This integration has several
        aspects that need to be considered when designing a solution for
        multiplexing RTP sessions on the same lower-layer transport.</t>

        <t><list style="hanging">
            <t hangText="Determining Crypto Context:">SRTP first of all needs
            to know which session context a received or to-be-sent packet
            relates to. It normally relies on the lower-layer transport to
            identify the session. It uses the Master Key Identifier (MKI),
            if present, to determine which key set is to be used. The SSRC
            and sequence number are then used by most crypto suites,
            including the most common use of AES Counter Mode, to generate
            the correct cipher stream.</t>

            <t hangText="Unencrypted Headers:">SRTP has chosen to leave the
            RTP headers and the first two 32-bit words of the first RTCP
            header unencrypted, to allow for both header compression and
            monitoring to work also in the presence of encryption. As these
            fields are in clear text they are used in most crypto suites for
            SRTP to determine how to protect or recover the plain text.</t>
          </list>It is important here to contrast SRTP with a set of other
        possible protection mechanisms. DTLS, TLS, and IPsec all protect and
        encapsulate the entire RTP and RTCP packets. They don't perform any
        partial operations on the RTP and RTCP packets. Any change that is
        considered to be part of the RTP and RTCP packet is transparent to
        them, but possibly not to SRTP. Thus the impact on SRTP operations
        must be considered when defining a mechanism.</t>
      </section>

      <section title="Don't Redefine Used Bits">
        <t>As the core of RTP is in use in many systems and has a very large
        deployed base and numerous implementations, changing any of the
        field definitions is highly problematic. First of all, the
        implementations need to change to support the new semantics.
        Secondly, there is a large transition issue when some session
        participants support the new semantics and some don't. Combining the
        two behaviors in the same session can force the deployment of costly
        and less than perfect translation devices.</t>
      </section>

      <section title="Firewall Friendly">
        <t>It is desirable that current Firewalls accept the solution's
        packets as normal RTP packets. However, in the authors' opinion we
        can't let firewalls stifle invention and evolution of the protocol.
        It is also necessary to be aware that a change that makes most
        deep-inspecting firewalls consider the packet as not valid RTP/RTCP
        will have a more difficult deployment story.</t>
      </section>

      <section title="Monitoring and Reporting">
        <t>It is desirable that a third-party monitor can still operate on the
        multiplexed RTP sessions. It is however likely that such monitors
        will require an update to correctly monitor and report on multiplexed
        RTP sessions.</t>

        <t>Another type of function to consider is packet sniffers and their
        selector filters. These may be impacted by a change of the fields. An
        observation is that many such systems are usually quite rapidly
        updated to consider new types of standardized or simply common packet
        formats.</t>
      </section>

      <section title="Usable Also Over Multicast">
        <t>It is desirable that a solution can also be used when RTP and RTCP
        packets are sent over multicast, both Any Source Multicast (ASM) and
        Source-Specific Multicast (SSM). The reason for this requirement is
        to allow a system using RTP to use the same configuration regardless
        of whether the transport is over unicast or multicast. In addition,
        multicast can't be claimed to have an issue with using multiple
        ports, as each multicast group has a complete port space scoped by
        address.</t>
      </section>

      <section title="Incremental Deployment">
        <t>A good solution has the property that in topologies that contain
        RTP Mixers or Translators, a single session participant can enable
        multiplexing without having any impact on any other session
        participants. Thus a node should be able to take a multiplexed packet
        and then easily send it out with minimal or no modification on another
        leg of the session, where each RTP session is transported over its own
        lower-layer transport. It should also be as easy to do the reverse
        forwarding operation.</t>
      </section>
    </section>

    <section title="Design Considerations">
      <t>When defining a SHIM solution for identifying RTP sessions over a
      single transport layer, there are some special considerations, which
      are discussed in this section.</t>

      <section title="Location of SHIM">
        <t>A major question affecting the SHIM is the location of the SHIM
        header providing the identifier of the session the packet relates to.
        This section discusses in detail the impact of the different
        choices.</t>

        <t>Identified aspects to consider are:<list style="hanging">
            <t hangText="Possibility to Process:">A prefixed shim header, i.e.
            one placed between the transport protocol header and the RTP/RTCP
            packet header, has the advantage that any node on the network
            that wants to include the header in its per-packet processing can
            reach it. Reasons for per-packet processing are:<list
                style="letters">
                <t>Quality of Service classification</t>

                <t>SHIM ingress or egress</t>

                <t>Monitoring</t>
              </list></t>

            <t hangText="">Many routers or similar devices can only read and
            process the first N bytes of the whole packet, where N is commonly
            on the order of 64-128 bytes. Any other type of processing means
            putting the packet on the slow path. Thus a prefixed solution
            enables this processing, while a postfixed solution will most
            likely permanently prevent this type of device from processing
            it.</t>

            <t hangText="Legacy Processing:">Packets, or at least flows, of
            the type IP/UDP/RTP can in many cases be identified by Deep Packet
            Inspection, Firewalls or other network entities that concern
            themselves with trying to determine what traffic a particular
            packet carries. These nodes can clearly be updated, but until
            they have been, they may hinder deployment. Thus a postfix is
            likely to meet the least resistance for initial deployment.
            However, also for a postfix location, deployment can be hindered
            when multiple RTP sessions use the same SSRC values, due to
            irregular behavior of the fields in what the third party believes
            is one media stream rather than multiple ones. The prefixed
            location will however maintain the long-term capabilities of such
            devices, assuming they can be updated to include the SHIM header
            as part of the classification.</t>

            <t hangText="Header Compression:">The different header compression
            techniques that have been developed compress IP/UDP/RTP as a
            complete combination. If one instead has IP/UDP/SHIM/RTP, then
            the compression for the full set may not work, or may work
            poorly. Instead only IP/UDP header compression is likely to be
            applied. Thus a prefix will lose some compression efficiency
            until compression profiles for IP/UDP/SHIM/RTP have been
            developed, implemented and deployed. A postfix doesn't have that
            issue, but nor can it ever gain anything from header compression,
            which a prefixed solution could once an updated profile is
            deployed. A postfix will also have reduced efficiency when
            compressing sessions where the same SSRC is used in two different
            RTP sessions, as RTP header fields like the sequence number will
            not behave as expected and will need frequent explicit
            updates.</t>

        <t>The question of a prefixed or a postfixed header comes down to a
        trade-off between long-term usability and deployment issues:<list
            style="hanging">
            <t hangText="Prefixed:">In the long term, there is a good
            possibility to adapt any network function that needs to take the
            SHIM header into account. At the same time, any function that
            tries to analyze packets, and because of that may block them,
            will hinder deployment.</t>

            <t hangText="Postfixed:">In the short term, this solution will
            likely have the best chance of deploying successfully. However,
            in the long term this choice will likely prevent many network
            nodes that would like to separate the multiplexed RTP sessions
            from successfully doing so.</t>
          </list></t>

        <t>After discussion in the WG it has been determined that prefixed is
        the preferred solution.</t>
      </section>

      <section title="ICE and DTLS-SRTP Integration">
        <t>When using <xref target="RFC5245">ICE</xref> or <xref
        target="RFC5764">DTLS-SRTP</xref> or both with RTP, there is the
        issue that RTP, <xref target="RFC5389">STUN</xref> and DTLS-SRTP are
        simultaneously in use over the same lower-layer transport flow, like
        UDP. This multiplexing is based on the value of the first byte of the
        lower-layer transport payload as discussed in Section 5.1.2 of <xref
        target="RFC5764">DTLS-SRTP</xref>.</t>

        <t>The SHIM that replaces a single RTP session with multiple RTP
        sessions must not be misidentified as either STUN or DTLS-SRTP or any
        other protocol intending to take the available free code-points in
        the range 193-255 (decimal). Thus the first byte of a prefixed SHIM
        must have its two first bits set to 10 (binary). Having the SHIM
        share the identity of RTP is not an issue as one must have mutual
        agreement that the SHIM is used instead of RTP. <list style="empty">
            <t>Note: This limits a single-byte SHIM to a maximum of 64 RTP
            sessions over a single transport flow.</t>
          </list></t>
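
        <t>As a minimal sketch of the first-byte check discussed above, the
        following Python fragment classifies a received datagram using the
        ranges given in Section 5.1.2 of RFC 5764 (0-1 for STUN, 20-63 for
        DTLS, 128-191 for RTP). The function and constant names are
        hypothetical and are not defined by this document. The sketch only
        illustrates why fixing the two leading bits to 10 leaves six bits,
        i.e. 64 values, in a single-byte SHIM.</t>

        <figure>
          <artwork><![CDATA[
def classify_first_byte(first_byte: int) -> str:
    """Classify a packet by the first byte of the transport payload."""
    if not 0 <= first_byte <= 255:
        raise ValueError("first_byte must fit in one octet")
    if first_byte in (0, 1):
        return "STUN"
    if 20 <= first_byte <= 63:
        return "DTLS"
    if 128 <= first_byte <= 191:
        # Leading bits are 10b: plain RTP/RTCP, or the SHIM when both
        # peers have agreed to use it instead of RTP.
        return "RTP-or-SHIM"
    return "unassigned"


def has_shim_marker(first_byte: int) -> bool:
    """True if the two most significant bits of the octet are 10b."""
    return (first_byte & 0xC0) == 0x80


# A one-octet SHIM keeps 8 - 2 = 6 bits for the session identifier.
SINGLE_BYTE_SHIM_IDS = 2 ** (8 - 2)]]></artwork>
        </figure>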

        <t/>
      </section>

      <section title="Signalling Fallback">
        <t>There is an important aspect of how the SDP signalling
        functions, especially <xref target="RFC3264">Offer/Answer</xref>. The
        initial idea for the signalling was to build on top of <xref
        target="I-D.ietf-mmusic-sdp-bundle-negotiation">bundle</xref>, which
        in its default function negotiates <xref
        target="I-D.ietf-avtcore-multi-media-rtp-session">multiple media types
        over one RTP session</xref>. With that design, signalling whose main
        purpose is to enable multiple RTP sessions can, when the peer doesn't
        support this specification, leave the communicating peers in a
        single RTP session, if the peer supports that.</t>

        <t>We consider it important that the signalling design lets the
        application developer decide what type of fallback will occur. It is
        also important to consider that one has to be able to signal
        SHIM-based multiplexing of RTP sessions that themselves contain
        multiple media types. Thus the signalling for SHIM must be able to
        describe multiple different scenarios:<list style="numbers">
            <t>Multiple RTP sessions multiplexed together using SHIM over one
            transport</t>

            <t>Like 1, but where at least one RTP session contains multiple
            media types</t>

            <t>Like 1, but where the peer doesn't support SHIM and the
            initiator wants to fall back to independent transports</t>

            <t>Like 2, but where the peer doesn't support SHIM and the
            initiator wants to fall back to multiple BUNDLED sessions over
            independent transports</t>
          </list></t>

        <t>In addition it must be possible to have multiple different
        transports where each is a SHIM multiplex. This is to support
        decomposed end-points or cases where certain media traffic is required
        to go to a central processing node while other traffic goes directly
        to a peer.</t>

        <t>To enable all of these scenarios we propose a solution where each
        SHIM multiplex is indicated by its own grouping attribute across all
        media blocks that are included in some form in the multiplex. This
        results in these media blocks falling under a form of BUNDLE super
        set. This super set will also have some of BUNDLE's restrictions on
        the transport layer, but not on higher layers. Which Session ID pair
        a particular media block is associated with is signalled using an SDP
        attribute (a=session-mux-id) in each media block. When multiple media
        blocks are assigned the same session ID pair, they form an RTP
        session with multiple media types and have the full restrictions of
        BUNDLE between them.</t>

        <t>The method of fallback is indicated by providing explicit BUNDLE
        grouping in addition to the SHIM when the fallback from SHIM is to
        BUNDLE.</t>

        <t>Note: The signalling solution is awaiting resolution of the design
        path for bundle and will then consider that solution and the issues
        raised.</t>
      </section>
    </section>

    <section title="Specification">
      <t>This section contains the specification of the RTP session
      multiplexing SHIM, using an explicit session identifier of the
      encapsulated payload.</t>

      <section title="Shim Layer">
        <t>This solution is based on a shim layer that is inserted in the
        stack between the regular RTP and RTCP packets and the transport layer
        being used by the RTP sessions. Thus the layering looks like the
        following:</t>

        <figure title="Stack View with Session ID SHIM">
          <artwork><![CDATA[
+---------------------+
|  RTP / RTCP Packet  |
+---------------------+
|  Session ID Layer   |
+---------------------+
|  Transport layer    |
+---------------------+]]></artwork>
        </figure>

        <t>The above stack is in fact a layered one, as it allows multiple
        RTP sessions to be multiplexed on top of the Session ID shim layer.
        This enables the example presented in <xref
        target="fig-shim-example"/> where four sessions, S1-S4, are sent over
        the same Transport layer and where the Session ID layer combines and
        encapsulates them with the session ID on transmission and separates
        and decapsulates them on reception.</t>

        <figure anchor="fig-shim-example"
                title="Multiple RTP Sessions on Top of Session ID Layer">
          <artwork><![CDATA[
+-------------------+
| S1 | S2 | S3 | S4 |
+-------------------+
|  Session ID Layer |
+-------------------+
|  Transport layer  |
+-------------------+]]></artwork>
        </figure>

        <t>The Session ID layer encapsulates one RTP or RTCP packet from a
        given RTP session and prefixes the 2-byte Session ID layer header to
        the packet. The Session ID layer is depicted <xref
        target="fig-session-id-layer">below</xref> and consists of 2 fixed
        bits (10b) followed by a 14-bit unsigned integer field carrying the
        Session ID (SID) value.</t>

        <figure anchor="fig-session-id-layer" title="Session ID layer">
          <artwork><![CDATA[ 0                   1           
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 0|  Session ID (SID)         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+]]></artwork>
        </figure>
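
        <t>As a non-normative illustration, the 2-byte layout above can be
        packed and unpacked as sketched below. The function names encapsulate
        and decapsulate are hypothetical and not defined by this document, and
        the sketch assumes the field is sent in network byte order, as the bit
        numbering of the figure suggests.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct
from typing import Tuple

SID_MAX = 0x3FFF          # 14-bit Session ID: 0-16383
FIXED_BITS = 0b10 << 14   # the two fixed leading bits "10"


def encapsulate(sid: int, packet: bytes) -> bytes:
    """Prefix one RTP or RTCP packet with the Session ID field."""
    if not 0 <= sid <= SID_MAX:
        raise ValueError("SID must be in the range 0-16383")
    # Assumption: network byte order (big-endian).
    return struct.pack("!H", FIXED_BITS | sid) + packet


def decapsulate(datagram: bytes) -> Tuple[int, bytes]:
    """Return (SID, inner packet), rejecting malformed input."""
    if len(datagram) < 2:
        raise ValueError("too short for the Session ID field")
    (word,) = struct.unpack("!H", datagram[:2])
    if word >> 14 != 0b10:
        raise ValueError("fixed bits are not 10b")
    return word & SID_MAX, datagram[2:]
```
]]></artwork>
        </figure>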

        <t>Each RTP session being multiplexed on top of a given transport
        layer is assigned either a single SID or a pair of SIDs, each unique
        in the range 0-16383. A pair of SIDs is assigned to an RTP session
        that does not support <xref target="RFC5761">"Multiplexing RTP Data
        and Control Packets on a Single Port"</xref>, so that it can still
        use a single 5-tuple: one SID carries RTP and the other RTCP. The
        reason for supporting this extra functionality is that RTP and RTCP
        multiplexing based on the payload type/packet type fields enforces
        certain restrictions on the RTP sessions, and these restrictions may
        not be acceptable. As this solution does not have these restrictions,
        performing RTP and RTCP multiplexing in this way has benefits.</t>

        <t>Each Session ID value space is scoped by the underlying transport
        protocol. Common transport protocols like <xref
        target="RFC0768">UDP</xref>, <xref target="RFC4340">DCCP</xref>, <xref
        target="RFC0793">TCP</xref>, and <xref target="RFC4960">SCTP</xref>
        can all be scoped by one or more 5-tuples (transport protocol, source
        address and port, destination address and port). Multiple 5-tuples
        occur in multi-unicast topologies, also called meshed multiparty RTP
        sessions, or when an application needs more RTP sessions than fit in
        one SID value space (16384 SIDs, or 8192 sessions if each uses a pair
        of SIDs).</t>

        <figure align="center" anchor="fig-srtp-packet"
                title="SRTP Packet encapsulated by Session ID Layer">
          <artwork><![CDATA[
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                  +-------------------------------+
                                  |1 0|       Session ID          | 
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
  |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
  |                           timestamp                           | |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
  |           synchronization source (SSRC) identifier            | |
  +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
  |            contributing source (CSRC) identifiers             | |
  |                               ....                            | |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
  |                   RTP extension (OPTIONAL)                    | |
+>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |                          payload  ...                         | |
| |                               +-------------------------------+ |
| |                               | RTP padding   | RTP pad count | |
+>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
| ~                     SRTP MKI (OPTIONAL)                       ~ |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| :                 authentication tag (RECOMMENDED)              : |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
+- Encrypted Portion*                      Authenticated Portion ---+
]]></artwork>
        </figure>

        <t/>

        <figure align="center" anchor="fig-srtcp-packet"
                title="SRTCP packet encapsulated by Session ID layer">
          <artwork><![CDATA[
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                  +-------------------------------+
                                  |1 0|       Session ID          | 
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
  |V=2|P|    RC   |   PT=SR or RR   |             length          | |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
  |                         SSRC of sender                        | |
+>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
| ~                          sender info                          ~ |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| ~                         report block 1                        ~ |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| ~                         report block 2                        ~ |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| ~                              ...                              ~ |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |V=2|P|    SC   |  PT=SDES=202  |             length            | |
| +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
| |                          SSRC/CSRC_1                          | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| ~                           SDES items                          ~ |
| +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
| ~                              ...                              ~ |
+>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
| |E|                         SRTCP index                         | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
| ~                     SRTCP MKI (OPTIONAL)                      ~ |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| :                     authentication tag                        : |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
+-- Encrypted Portion                    Authenticated Portion -----+

]]></artwork>
        </figure>

        <t>When the Session ID layer is present, a receiver processes each
        packet as follows:<list style="numbers">
            <t>Pick up the packet from the lower layer transport</t>

            <t>Inspect the SID field value</t>

            <t>Strip the SID field from the packet</t>

            <t>Forward it to the (S)RTP Session context identified by the SID
            value</t>
          </list></t>
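
        <t>The four receiver steps above can be sketched as a small
        demultiplexer. This is a non-normative illustration: the class and
        method names are hypothetical, and silently dropping packets that are
        malformed or carry an unknown SID is an assumption of the sketch.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Callable, Dict

Handler = Callable[[bytes], None]


class SessionDemux:
    """Maps SID values to per-session (S)RTP packet handlers."""

    def __init__(self) -> None:
        self._contexts: Dict[int, Handler] = {}

    def register(self, sid: int, handler: Handler) -> None:
        self._contexts[sid] = handler

    def receive(self, datagram: bytes) -> bool:
        # Step 1: the caller picked up the datagram from the transport.
        if len(datagram) < 2:
            return False
        word = int.from_bytes(datagram[:2], "big")
        if word >> 14 != 0b10:
            return False          # fixed bits do not match
        # Step 2: inspect the SID field value.
        sid = word & 0x3FFF
        handler = self._contexts.get(sid)
        if handler is None:
            return False          # assumption: unknown SID is dropped
        # Steps 3 and 4: strip the SID field and forward the packet
        # to the session context identified by the SID.
        handler(datagram[2:])
        return True
```
]]></artwork>
        </figure>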

        <t/>
      </section>

      <section anchor="sec-signalling" title="Signalling">
        <t>Note: This section may need updating once the direction of the
        solution for BUNDLE has settled and the impact of the raised issues
        has been analyzed.</t>

        <t>The use of the Session ID layer needs to be explicitly agreed on
        between the communicating parties. For each RTP session the
        application uses, it must have, in addition to the regular
        configuration such as payload types and RTCP extensions, both the
        underlying 5-tuple (source address and port, destination address and
        port, and transport protocol) and the Session ID used for that
        particular RTP session. The signalling requirement is to assign
        unique Session ID values to all RTP sessions being sent over the same
        5-tuple. The same Session ID shall be used for an RTP session
        independently of the traffic direction. Note that nothing prevents a
        multimedia application from using multiple 5-tuples if desired, in
        which case each 5-tuple has its own Session ID value space.</t>

        <t>This section defines how to negotiate the use of the Session ID
        layer, using the Session Description Protocol (SDP) <xref
        target="RFC3264">Offer/Answer mechanism</xref>. A new SDP grouping
        semantics, 'SHIM', and a new media-level SDP attribute,
        'session-mux-id', are defined. The attribute identifies, for each
        media description ("m=" line) associated with a 'SHIM' group, the RTP
        session to which it belongs.</t>

        <t>The 'session-mux-id' attribute is included in a media description
        to indicate the Session ID for that particular media description.
        Every media description that shares a common attribute value is
        assumed to be part of a single RTP session. An SDP Offerer MUST
        include the 'session-mux-id' attribute for every media description
        associated with a 'SHIM' group. If the SDP Answer does not contain the
        SHIM group, the SDP Offerer MUST NOT use SHIM based layering. Whether
        the result is then separate RTP sessions or BUNDLE is determined by
        what was present in the offer and the answer, which in turn depends on
        what the offering party wants to happen. If the offerer wants a
        failure to negotiate SHIM to result instead in one or more BUNDLE
        groups, then the BUNDLE grouping is also included in the offer. If the
        SDP Answer still describes a 'BUNDLE' group, the procedures in <xref
        target="I-D.ietf-mmusic-sdp-bundle-negotiation"/> apply. If not,
        independent transports and sessions are used.</t>

        <t>An SDP Answerer MUST NOT include the 'SHIM' group and
        'session-mux-id' attribute in an SDP Answer unless they
        were included in the SDP Offer.</t>

        <t>The attribute has the following <xref target="RFC5234">ABNF</xref>
        definition.</t>

        <figure>
          <artwork><![CDATA[
Session-mux-id-attr = "a=session-mux-id:" SID *SID-prop
SID                 = SID-value / SID-pairs
SID-value           = 1*5DIGIT / "NoN"        ; 0-16383 or Not a Number
SID-pairs           = SID-value "/" SID-value ; RTP/RTCP SIDs
SID-prop            = SP (assignment-policy / prop-ext)
prop-ext            = token "=" value
assignment-policy   = "policy=" ("tentative" / "fixed") 
 ]]></artwork>
        </figure>
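
        <t>A minimal, non-normative sketch of a parser for this attribute is
        given below. The function name is hypothetical. The sketch assumes
        that SID values up to 16383 (the 14-bit range of the shim field) are
        acceptable, that every property is preceded by a single space, and
        that "NoN" is represented as None.</t>

        <figure>
          <artwork><![CDATA[
```python
import re
from typing import Dict, Optional, Tuple

SID_MAX = 16383
_ATTR = re.compile(r"^a=session-mux-id:(\S+)((?: \S+)*)$")


def _parse_sid(text: str) -> Optional[int]:
    if text == "NoN":
        return None                      # Not a Number
    if not re.fullmatch(r"[0-9]+", text) or int(text) > SID_MAX:
        raise ValueError("bad SID value: %r" % text)
    return int(text)


def parse_session_mux_id(
    line: str,
) -> Tuple[Tuple[Optional[int], ...], Dict[str, str]]:
    """Return ((sid,) or (rtp_sid, rtcp_sid), properties)."""
    match = _ATTR.match(line.strip())
    if match is None:
        raise ValueError("not a session-mux-id attribute")
    sids = tuple(_parse_sid(s) for s in match.group(1).split("/"))
    if len(sids) > 2:
        raise ValueError("at most an RTP/RTCP pair of SIDs is allowed")
    props: Dict[str, str] = {}
    for item in match.group(2).split():
        name, sep, value = item.partition("=")
        if not (name and sep and value):
            raise ValueError("bad property: %r" % item)
        props[name] = value
    if "policy" in props and props["policy"] not in ("tentative", "fixed"):
        raise ValueError("unknown policy: %r" % props["policy"])
    return sids, props
```
]]></artwork>
        </figure>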

        <t>The SHIM group SHALL contain all media descriptions that are
        intended to be sent over the same transport flow, independent of
        Session ID. For all media descriptions that are part of the same SHIM
        group, the transport parameters, i.e. ports, ICE candidates, etc.,
        MUST be the same and handled as described by BUNDLE. Note that the
        parameters related to the RTP session do not need to be the same.</t>

        <t>Media descriptions that have the same Session ID value SHALL be
        treated as if they were part of a BUNDLE group, independently of
        whether that is indicated in the SDP.</t>
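
        <t>The grouping rule above can be illustrated with the following
        non-normative sketch, which collects the "a=mid" values of all media
        descriptions that share a Session ID. The function name is
        hypothetical, and the sketch keeps the SID (or SID pair) as the
        literal string found in the SDP.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Dict, List


def rtp_sessions_by_sid(sdp: str) -> Dict[str, List[str]]:
    """Map each SID (or SID pair) to the mids of its media sections."""
    media: List[Dict[str, str]] = []     # one dict per "m=" section
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media.append({})
        elif media and line.startswith("a=mid:"):
            media[-1]["mid"] = line[len("a=mid:"):]
        elif media and line.startswith("a=session-mux-id:"):
            # Keep only the SID part, dropping properties such as policy.
            value = line[len("a=session-mux-id:"):]
            media[-1]["sid"] = value.split()[0]
    sessions: Dict[str, List[str]] = {}
    for section in media:
        if "mid" in section and "sid" in section:
            sessions.setdefault(section["sid"], []).append(section["mid"])
    return sessions
```
]]></artwork>
        </figure>

        <t>Media descriptions that end up in the same list belong to one RTP
        session and are treated as if they were in a BUNDLE group.</t>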

        <t>The SID property "policy" is used in negotiation by an end-point to
        indicate whether the Session ID values are merely a tentative
        suggestion or are required to have these values. This is used when
        negotiating SIDs for multi-party RTP sessions, to support shared
        transports such as multicast, or RTP translators that are unable to
        renumber SIDs on a per end-point basis. The normal behavior is that
        the offer suggests a tentative set of values, indicated by
        "policy=tentative". These SHOULD be accepted by the peer unless that
        peer negotiates Session IDs on behalf of a centralized policy, in
        which case it MAY change the value(s) in the answer. If the offer
        represents a policy that does not allow changing the Session ID
        values, it can indicate that to the answerer by setting the policy to
        "fixed". This enables the answering peer to either accept the value or
        indicate that there is a conflict in who is performing the assignment
        by setting the SID value to NoN (Not a Number). Offerer and answerer
        SHOULD always include the policy they are operating under. Thus, when
        there is no centralized behavior, both offerer and answerer will
        indicate the tentative policy.</t>
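
        <t>One possible reading of the answerer behavior described above is
        sketched below, purely as a non-normative illustration. The function
        and parameter names are hypothetical. The sketch assumes that an
        answerer acting for a centralized policy signals "fixed" and has its
        own required SID, and that a conflict arises only when both sides are
        fixed and their values differ.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Optional, Tuple

POLICIES = ("tentative", "fixed")


def answer_sid(
    offer_sid: int,
    offer_policy: str,
    local_policy: str = "tentative",
    local_sid: Optional[int] = None,
) -> Tuple[Optional[int], str]:
    """Return (SID for the answer, answerer policy); None means NoN."""
    if offer_policy not in POLICIES or local_policy not in POLICIES:
        raise ValueError("policy must be 'tentative' or 'fixed'")
    no_local_requirement = (
        local_policy == "tentative"
        or local_sid is None
        or local_sid == offer_sid
    )
    if no_local_requirement:
        # Nothing on the answerer side overrides the offered value.
        return offer_sid, local_policy
    if offer_policy == "tentative":
        # Answerer acts for a centralized policy and MAY change the value.
        return local_sid, local_policy
    # Both sides insist on different values: signal the conflict.
    return None, local_policy
```
]]></artwork>
        </figure>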
      </section>

      <section title="SRTP Key Management">
        <t>Key management for SRTP needs discussion, as this solution causes
        multiple SRTP sessions to exist on the same underlying transport flow.
        We therefore need to ensure that the key management mechanism is still
        properly associated with the SRTP session context it intends to key.
        To that end, we look at the three SRTP key management mechanisms that
        the IETF has specified, one after another.</t>

        <section title="Security Description">
          <t><xref target="RFC4568">Session Description Protocol (SDP)
          Security Descriptions for Media Streams</xref>, being based on SDP,
          has no issue with the RTP session multiplexing on a lower layer
          specified here. The reason is that the actual keying is done using a
          media-level SDP attribute. Thus the attribute is already associated
          with a particular media description, and that media description
          will also have an instance of the "a=session-mux-id" attribute
          carrying the SID value or pair used with these particular crypto
          parameters.</t>
        </section>

        <section title="DTLS-SRTP">
          <t><xref target="RFC5764">Datagram Transport Layer Security (DTLS)
          Extension to Establish Keys for the Secure Real-time Transport
          Protocol (SRTP)</xref> is a keying mechanism that works on the media
          plane on the same lower layer transport that SRTP/SRTCP will be
          transported over.</t>

          <t>The most direct solution would be to apply the SHIM and the SID
          context identifier to DTLS packets as well, thus using
          the same SID that is used with RTP and/or RTCP also for the DTLS
          messages intended to key that particular SRTP and/or SRTCP flow(s).
          This of course requires independent usage of DTLS-SRTP for each RTP
          session. In addition it requires changing the layering for DTLS-SRTP
          as well as RTP. Thus this approach gains nothing with regard to key
          management when using SHIM, and has some costs.</t>

          <t>Instead we propose that a DTLS-SRTP key-derivation change is
          introduced. By including the Session ID value in the derivation of
          the keying material, a single DTLS-SRTP key-management operation
          could provide keys and parameters for all the RTP sessions in the same
          transport flow. Thus the keying cost is significantly reduced,
          especially with regard to network communication, delay impact, and
          vulnerability to packet loss.</t>

          <t>Details to be written up. </t>
        </section>

        <section title="MIKEY">
          <t><xref target="RFC3830">MIKEY: Multimedia Internet KEYing</xref>
          is a key management protocol that has several transports. In some
          cases it is used directly on a transport protocol such as UDP, but
          there is also a specification for how MIKEY is used with SDP "Key
          Management Extensions for Session Description Protocol (SDP) and
          Real Time Streaming Protocol (RTSP)" <xref target="RFC4567"/>.</t>

          <t>Let us start with the latter, i.e. the SDP transport, which shares
          the property with Security Descriptions that it can be
          associated with a particular media description in an SDP. As long as
          one avoids using the session-level attribute, one can be certain to
          correctly associate the key exchange with a given SRTP/SRTCP
          context.</t>

          <t>It does appear that MIKEY directly over a lower layer transport
          protocol will have similar issues as DTLS.</t>
        </section>
      </section>

      <section title="Examples">
        <t/>

        <section title="RTP Packet with Transport Header">
          <t>The figure below shows an SRTP packet with the SID field,
          encapsulated in a UDP datagram (UDP header added).</t>

          <figure title="SRTP Packet Encapsulated by Session ID Layer">
            <artwork><![CDATA[
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  | Source Port                   | Destination Port              |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  | Length                        | Checksum                      |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |1 0|     Session ID            | 
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
  |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
  |                           timestamp                           | |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
  |           synchronization source (SSRC) identifier            | |
  +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
  |            contributing source (CSRC) identifiers             | |
  |                               ....                            | |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
  |                   RTP extension (OPTIONAL)                    | |
+>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |                          payload  ...                         | |
| |                               +-------------------------------+ |
| |                               | RTP padding   | RTP pad count | |
+>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
| ~                     SRTP MKI (OPTIONAL)                       ~ |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| :                 authentication tag (RECOMMENDED)              : |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
+- Encrypted Portion*                      Authenticated Portion ---+
]]></artwork>
          </figure>

          <t/>
        </section>

        <section title="SDP Offer/Answer example">
          <t/>

          <section title="Basic Example">
            <t>This section contains SDP offer/answer examples: first an
            example of successful SHIMing, and then one where fallback occurs.
            The fallback here is to individual transports, thus no BUNDLE
            group.</t>

            <t>In the SDP offer below, one audio and one video stream are
            being offered. The audio uses SID 0 and the video uses SID 1, to
            indicate that they are different RTP sessions despite being
            offered over the same 5-tuple.</t>

            <figure>
              <artwork><![CDATA[v=0
o=alice 2890844526 2890844526 IN IP4 atlanta.example.com
s=
c=IN IP4 atlanta.example.com
t=0 0
a=group:SHIM foo bar
m=audio 10000 RTP/AVP 0 8 97    
b=AS:200
a=mid:foo 
a=session-mux-id:0 policy=tentative     
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
m=video 10000 RTP/AVP 31 32
b=AS:1000
a=mid:bar
a=session-mux-id:1 policy=tentative
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000]]></artwork>
            </figure>

            <t/>

            <t>The SDP answer from an end-point that supports this
            SHIMing:</t>

            <figure>
              <artwork><![CDATA[v=0
o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
s=
c=IN IP4 biloxi.example.com
t=0 0
a=group:SHIM foo bar
m=audio 20000 RTP/AVP 0
b=AS:200
a=mid:foo
a=session-mux-id:0 policy=tentative
a=rtpmap:0 PCMU/8000
m=video 20000 RTP/AVP 32
b=AS:1000
a=mid:bar
a=session-mux-id:1 policy=tentative
a=rtpmap:32 MPV/90000]]></artwork>
            </figure>

            <t/>

            <t>The SDP answer from an end-point that does not support this
            SHIMing.</t>

            <figure>
              <artwork><![CDATA[v=0
o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
s=
c=IN IP4 biloxi.example.com
t=0 0
m=audio 20000 RTP/AVP 0
b=AS:200
a=rtpmap:0 PCMU/8000
m=video 30000 RTP/AVP 32
b=AS:1000
a=rtpmap:32 MPV/90000]]></artwork>
            </figure>
          </section>

          <section title="Advanced Example">
            <t>In this example we have two BUNDLEd sessions, one with audio
            and video and one with <xref target="RFC5109">XOR based FEC</xref>
            for the audio and the video. These two RTP sessions are then
            SHIMed into a single transport flow.</t>

            <t/>

            <figure>
              <artwork><![CDATA[v=0
o=alice 2890844526 2890844526 IN IP4 atlanta.example.com
s=
c=IN IP4 atlanta.example.com
t=0 0
a=group:SHIM foo bar 1 2
a=group:BUNDLE 1 2
a=group:BUNDLE foo bar
a=group:FEC foo 1
a=group:FEC bar 2
m=audio 10000 RTP/AVP 0 8 97    
b=AS:200
a=mid:foo 
a=session-mux-id:0 policy=tentative     
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
m=video 10000 RTP/AVP 31 32
b=AS:1000
a=mid:bar
a=session-mux-id:0 policy=tentative
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000
m=audio 10000 RTP/AVP 100    
b=AS:100
a=rtpmap:100 ulpfec/8000
a=mid:1
a=session-mux-id:1 policy=tentative     
m=video 10000 RTP/AVP 101
b=AS:500
a=mid:2
a=session-mux-id:1 policy=tentative
a=rtpmap:101 ulpfec/90000
]]></artwork>
            </figure>

            <t>The SDP answer of a client supporting <xref
            target="I-D.ietf-mmusic-sdp-bundle-negotiation"/> but not this
            SHIMing would look like this:</t>

            <figure>
              <artwork><![CDATA[v=0
o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
s=
c=IN IP4 biloxi.example.com
t=0 0
a=group:BUNDLE 1 2
a=group:BUNDLE foo bar
a=group:FEC foo 1
a=group:FEC bar 2
m=audio 20000 RTP/AVP 0 8 97    
b=AS:200
a=mid:foo 
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
m=video 20000 RTP/AVP 31 32
b=AS:1000
a=mid:bar
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000
m=audio 20002 RTP/AVP 100    
b=AS:100
a=rtpmap:100 ulpfec/8000
a=mid:1
m=video 20002 RTP/AVP 101
b=AS:500
a=mid:2
a=rtpmap:101 ulpfec/90000]]></artwork>
            </figure>

            <t>In the above case two different RTP sessions are established,
            both being of a BUNDLE type with multiple media types in each. The
            two established flows will be Alice:10000&lt;-&gt;Bob:20000 and
            Alice:10000&lt;-&gt;Bob:20002.</t>

            <t>If the peer supported neither the SHIM nor the BUNDLE
            extension, the answer would look like this:</t>

            <figure>
              <artwork><![CDATA[v=0
o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
s=
c=IN IP4 biloxi.example.com
t=0 0
a=group:FEC foo 1
a=group:FEC bar 2
m=audio 20000 RTP/AVP 0 8 97    
b=AS:200
a=mid:foo 
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
m=video 20002 RTP/AVP 31 32
b=AS:1000
a=mid:bar
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000
m=audio 20004 RTP/AVP 100    
b=AS:100
a=rtpmap:100 ulpfec/8000
a=mid:1
m=video 20006 RTP/AVP 101
b=AS:500
a=mid:2
a=rtpmap:101 ulpfec/90000]]></artwork>
            </figure>

            <t>In this case four different transport flows would be
            established for RTP, each carrying a different RTP session.
            The answerer also knows the binding between the FEC sessions
            and their source data thanks to the FEC grouping.</t>
          </section>
        </section>
      </section>
    </section>

    <section title="Open Issues">
      <t>This work is still in the early phase of specification. This section
      contains a list of open issues where the author desires some input.<list
          style="numbers">
          <t>In <xref target="sec-signalling"/> there is a discussion of which
          parameters must be configured. The scope of these rules, and whether
          they make sense, needs additional discussion.</t>

          <t>Can we provide better control so that applications that do not
          want to fall back to a single RTP session when the multiplexing
          shim is not supported but BUNDLE is supported end up with a better
          alternative?</t>

          <t>The details of how to do key derivation, preferably in such a
          way that it can be reused by multiple key-management solutions like
          MIKEY and DTLS-SRTP.</t>

          <t>The signalling solution will be revisited when the BUNDLE
          solution discussion has yielded some result.</t>
        </list></t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document requests the registration of one SDP attribute. Details
      of the registration are to be filled in.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>The security properties of the Session ID layer depend on what
      mechanism is used to protect the RTP and RTCP packets of a given RTP
      session. If IPsec or a transport layer security solution such as DTLS or
      TLS is being used, then both the encapsulated RTP/RTCP packets and the
      Session ID layer will be protected by that security mechanism, thus
      potentially providing confidentiality, integrity, and source
      authentication. If SRTP is used, the Session ID layer will not be
      directly protected by SRTP. However, it will be implicitly integrity
      protected (assuming the RTP/RTCP packet is integrity protected), as the
      only function of the field is to identify the session context. Any
      modification of the SID field will therefore cause an attempt to
      retrieve the wrong SRTP crypto context. If that retrieval fails, the
      packet will be discarded anyway. If it succeeds, the context will not
      lead to successful verification of the packet.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>This document is based on the input from various people, especially
      in the context of the RTCWEB discussion of how to use only a single
      lower layer transport. The RTP and RTCP packet figures are borrowed from
      RFC3711. The SDP example is extended from the one present in <xref
      target="I-D.ietf-mmusic-sdp-bundle-negotiation"/>. Eric Rescorla
      contributed the basic idea of optimizing the DTLS-SRTP key-management by
      modifying the key derivation process.</t>

      <t>The proposal in <xref target="sec-sol-padding"/> was originally
      suggested by Colin Perkins. The idea in <xref target="sec-sol-ssrc"/> is
      from an <xref target="I-D.rosenberg-rtcweb-rtpmux">Internet Draft</xref>
      written by Jonathan Rosenberg et al. The proposal in <xref
      target="sec-sol-single"/> is a result of discussion by a group of people
      at IETF meeting #81 in Quebec.</t>

      <t/>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      &rfc2119;

      &rfc3550;

      &rfc3711;

      &rfc5234;

      &draft-ietf-mmusic-sdp-bundle-negotiation;
    </references>

    <references title="Informational References">
      &rfc3264;

      &rfc3830;

      &rfc4567;

      &rfc4568;

      &rfc5285;

      &rfc5506;

      &rfc5761;

      &rfc5764;

      &draft-rosenberg-rtcweb-rtpmux;

      <?rfc include='reference.RFC.5109'?>

      <?rfc include='reference.RFC.5245'?>

      <?rfc include='reference.RFC.5389'?>

      <?rfc include='reference.RFC.0768'?>

      <?rfc include='reference.RFC.0793'?>

      <?rfc include='reference.RFC.4340'?>

      <?rfc include='reference.RFC.4960'?>

      <?rfc include='reference.I-D.westerlund-avtcore-multiplex-architecture'?>

      <?rfc include='reference.I-D.ietf-avtcore-multi-media-rtp-session'?>

      <?rfc include='reference.I-D.lennox-rtcweb-rtp-media-type-mux'?>
    </references>

    <section title="Possible Solutions">
      <t>This section documents the solutions explored when selecting a SHIM
      based one and discusses their feasibility.</t>

      <section title="Header Extension">
        <t>One proposal is to define an <xref target="RFC5285">RTP header
        extension</xref> that explicitly enumerates the session identifier in
        each packet. This proposal has some merits regarding RTP, since it
        uses an existing extension mechanism; it explicitly enumerates the
        session allowing for third parties to associate the packet to a given
        RTP session; and it works with SRTP as currently defined since a
        header extension is by default not encrypted, and is thus readable by
        the receiving stack without needing to guess which session it belongs
        to and attempt to decrypt it. This approach does, however, conflict
        with the requirement from <xref target="RFC5285"/> that "header
        extensions using this specification MUST only be used for data that
        can be safely ignored by the recipient", since correct processing of
        the received packet depends on using the header extension to
        demultiplex it to the correct RTP session.</t>

        <t>Using a header extension also results in the session ID being in
        the integrity protected part of the packet. Thus a translator between
        multiplexed and non-multiplexed domains has the following options:</t>

        <t><list style="numbers">
            <t>to be part of the security context to verify the field</t>

            <t>to be part of the security context to verify the field and
            remove it before forwarding the packet</t>

            <t>to be outside of the security context and leave the header
            extension in the packet. However, that requires successful
            negotiation of the header extension, but not of the functionality,
            with the receiving end-points.</t>
          </list></t>

        <t>The biggest existing hurdle for this solution is that there exists
        no header extension field in the RTCP packets. This requires defining
        a solution for RTCP that allows carrying the explicit indicator,
        preferably in a position that isn't encrypted by SRTCP. However, the
        current SRTCP definition does not offer such a position in the
        packet.</t>

        <t>Modifying the RR or SR packets is possible using profile specific
        extensions. However, that has issues when it comes to deployability
        and in addition any information placed there would end up in the
        encrypted part.</t>

        <t>Another alternative could be to define another RTCP packet type
        that only contains the common header, using the 5 bits in the first
        byte of the common header to carry a session ID. That would allow
        SRTCP to work correctly as long as it accepts this new packet type
        being the first in the packet. Allowing a non-SR/RR packet as the
        first packet in a compound RTCP packet is also needed if an
        implementation is to support <xref target="RFC5506">Reduced Size RTCP
        packets</xref>. The remaining downside is that all stack
        implementations supporting multiplexing would need to modify their
        RTCP compound packet rules to include this packet type first. Thus a
        translator box between supporting nodes and non-supporting nodes needs
        to be in the crypto context.</t>

        <t>This solution's per-packet overhead is expected to be 64 bits for
        RTCP. For RTP it is 64 bits if no header extension was otherwise used,
        and an additional 16 bits (short header), or 24 bits plus (if needed)
        padding to the next 32-bit boundary, if other header extensions are
        used.</t>
      </section>

      <section anchor="sec-mux-shim" title="Multiplexing Shim">
        <t>This proposal is to prefix or postfix all RTP and RTCP packets with
        a session ID field. This field would be outside of the normal RTP and
        RTCP packets, thus having no impact on the RTP and RTCP packets and
        their processing. An additional step of demultiplexing processing
        would be added prior to RTP stack processing to determine in which RTP
        session context the packet shall be included. This also has no impact
        on SRTP/SRTCP as the shim layer would be outside of its protection
        context. The shim layer's session ID is however implicitly integrity
        protected, as any error in the field will result in the packet being
        placed in the wrong or a non-existing context, thus resulting in an
        integrity failure if processed by SRTP/SRTCP.</t>

        <t>This proposal is quite simple to implement in any gateway or
        translating device that goes from a multiplexed to a non-multiplexed
        domain or vice versa, as only an additional field needs to be added to
        or removed from the packet.</t>

        <t>The main downside of this proposal is that it is very likely to
        trigger a firewall response from any deep packet inspection device. If
        the field is prefixed, the RTP header fields will not match the
        classification heuristics (unless the shim is designed to look like an
        RTP header, in which case the payload length is unlikely to match the
        expected value), which will likely prevent classification of the
        packet as an RTP packet. If it is postfixed, it is likely classified
        as an RTP packet
        but may not correctly validate if the content validation is such that
        the payload length is expected to match certain values. It is expected
        that a postfixed shim will be less problematic than a prefixed shim in
        this regard, but we are lacking hard data on this.</t>

        <t>This solution's per packet overhead is 1 byte.</t>
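
        <t>The add and strip steps described above can be sketched as
        follows. This is a minimal illustration only, assuming a one-byte
        session ID and no SRTP processing; the function names add_shim and
        strip_shim are hypothetical and are not defined by this document.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Tuple


def add_shim(packet: bytes, session_id: int, prefix: bool = False) -> bytes:
    """Attach a one-byte session ID outside the RTP/RTCP packet.

    Assumption: the session ID fits in a single byte, matching the
    1 byte of per-packet overhead stated in the text.
    """
    if not 0 <= session_id <= 255:
        raise ValueError("session ID must fit in one byte")
    shim = bytes([session_id])
    return shim + packet if prefix else packet + shim


def strip_shim(datagram: bytes, prefix: bool = False) -> Tuple[int, bytes]:
    """Return (session_id, untouched RTP/RTCP packet) for demultiplexing."""
    if len(datagram) < 2:
        raise ValueError("datagram too short to carry a shim")
    if prefix:
        return datagram[0], datagram[1:]
    return datagram[-1], datagram[:-1]


# A 12-byte RTP header (version 2, payload type 96) with no payload.
rtp = bytes([0x80, 96]) + bytes(10)
wire = add_shim(rtp, session_id=3)
assert len(wire) == len(rtp) + 1       # one byte of overhead
assert strip_shim(wire) == (3, rtp)    # packet is recovered unchanged
```
]]></artwork>
        </figure>

        <t>Because the packet bytes themselves are never modified, a gateway
        doing this needs no access to any SRTP keys.</t>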
      </section>

      <section anchor="sec-sol-single" title="Single Session">
        <t>Given the difficulty of multiplexing several RTP sessions onto a
        single lower-layer transport, it's tempting to send multiple media
        streams in a single RTP session. Doing this avoids the need to
        de-multiplex several sessions on a single transport, but at the cost
        of losing the RTP session as a separator for different types of
        streams. Lacking different RTP sessions to demultiplex incoming
        packets, a receiver will have to dig deeper into the packet before
        determining what to do with it. Care must be taken in that inspection.
        For example, you must be careful to ensure that each real media source
        uses its own SSRC in the session and that this SSRC doesn't change
        media type.</t>

        <t>The loss of the RTP session as a separator for different usages or
        purposes would be a minor issue if the only difference between the RTP
        sessions is the media type. In this case, the application could use
        the Payload Type field to identify the media type. The loss of the RTP
        session functionality is however severe if the application uses the
        RTP session for separating different treatments, contexts, etc. Then
        you would need additional signalling to bind the different sources to
        groups which can help make the necessary distinctions.</t>

        <t>However, the loss of the RTP session as separator is not the only
        issue with this approach. The <xref
        target="I-D.westerlund-avtcore-multiplex-architecture">RTP
        Multiplexing Architecture</xref> discusses a number of issues in
        Section 6.7. These include RTCP bandwidth differences, limitations in
        the number of payload types, media aware RTP mixers and interactions
        with Legacy end-points.</t>

        <t>Additional attention should be placed on the interaction with
        legacy end-points. In multi-party situations using central nodes there
        exist some difficulties in having a legacy implementation using
        multiple RTP sessions interwork with an end-point having only a single
        RTP session across the central node. The main reason is that the
        end-point using a single session with multiple media types has only
        one SSRC space, while the other end-points have multiple spaces. Thus
        translation may have to occur because there are several RTP sessions
        using the same SSRC value. This translation has limitations, adds
        processing overhead, and may become a deployment obstacle for new
        RTP/RTCP extensions.</t>

        <t>This approach has been proposed in the RTCWeb context in <xref
        target="I-D.lennox-rtcweb-rtp-media-type-mux"/> and <xref
        target="I-D.ietf-mmusic-sdp-bundle-negotiation"/>. These drafts
        describe how to signal multiple media streams multiplexed into a
        single RTP session, and address some of the issues raised here and in
        Section 6.7 of the <xref
        target="I-D.westerlund-avtcore-multiplex-architecture"> RTP
        Multiplexing Architecture</xref> draft.</t>

        <t>This method has several limitations that limit its usage as a
        solution for providing multiple RTP sessions on the same lower layer
        transport. However, we acknowledge that there are some uses for which
        this method may be sufficient and which can accept the method's
        limitations and downsides. The RTCWEB WG has a working assumption to
        support this method. For more details of this method, see the relevant
        drafts under development. We do include this method in the comparison
        to provide a more complete picture of its pros and cons.</t>

        <t>This solution has no per packet overhead. The signalling overhead
        will be a different question.</t>
      </section>

      <section title="Use the SRTP MKI field">
        <t>This proposal is to overload the MKI SRTP/SRTCP identifier to not
        only identify a particular crypto context, but also identify the
        actual RTP Session. This clearly is a misuse of the MKI field, but it
        appears to have few negative implications. SRTP already supports
        handling of multiple crypto contexts.</t>
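
        <t>As a rough sketch of the lookup this implies: in SRTP the MKI sits
        after the encrypted portion and immediately before the authentication
        tag, so a receiver that knows both lengths can read it without
        decrypting. The lengths below (a 4-byte MKI, a 10-byte tag), the
        binding table, and the name lookup_by_mki are assumptions made for
        illustration only.</t>

        <figure>
          <artwork><![CDATA[
```python
MKI_LEN = 4    # assumed MKI length in bytes (the 32-bit overhead case)
TAG_LEN = 10   # assumed authentication tag length in bytes

# Hypothetical binding table: MKI value -> (crypto context, RTP session).
MKI_BINDINGS = {
    1: ("ctx-audio", "audio-session"),
    2: ("ctx-video", "video-session"),
}


def lookup_by_mki(srtp_packet: bytes):
    """Read the MKI that precedes the trailing tag and resolve it.

    Returns the (crypto context, RTP session) pair, or None when the
    MKI is unknown and the packet should be discarded.
    """
    if len(srtp_packet) < 12 + MKI_LEN + TAG_LEN:
        raise ValueError("packet too short for RTP header, MKI and tag")
    mki_bytes = srtp_packet[-(TAG_LEN + MKI_LEN):-TAG_LEN]
    mki = int.from_bytes(mki_bytes, "big")
    return MKI_BINDINGS.get(mki)


# Fake packet: 12-byte header, opaque payload, MKI = 2, dummy tag.
packet = bytes(12) + b"ciphertext" + (2).to_bytes(4, "big") + bytes(10)
assert lookup_by_mki(packet) == ("ctx-video", "video-session")
```
]]></artwork>
        </figure>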

        <t>This proposal has two major downsides. The first is that it
        requires using SRTP/SRTCP to multiplex multiple sessions on a single
        lower layer transport. The second is that the session ID parameter
        needs to be put into the various key-management schemes, which must be
        made to understand that the reason to establish multiple crypto
        contexts is that they are connected to various RTP Sessions.
        Considering that SRTP has at least three keying mechanisms in use, <xref
        target="RFC5764">DTLS-SRTP</xref>, <xref target="RFC4568">Security
        Descriptions</xref>, and <xref target="RFC3830">MIKEY</xref>, this is
        not an insignificant amount of work.</t>

        <t>This solution has a 32-bit per-packet overhead, but only if the MKI
        was not already used.</t>
      </section>

      <section anchor="sec-sol-padding" title="Use an Octet in the Padding">
        <t>The basis of this proposal is to have the RTP packet, and the last
        RTCP packet in a compound packet (as required by RFC3550), include at
        least 2 bytes of padding: one byte for the padding count (last byte)
        and one byte just before the padding count containing the session
        ID.</t>
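
        <t>For an unencrypted RTP packet, the layout just described could be
        sketched as below. The function names are hypothetical, and the
        sketch deliberately ignores packets that already carry padding and
        does not cover the SRTP case.</t>

        <figure>
          <artwork><![CDATA[
```python
RTP_HEADER_LEN = 12   # fixed RTP header without CSRCs or extensions
PADDING_BIT = 0x20    # the P bit in the first RTP header byte


def add_session_padding(rtp_packet: bytes, session_id: int) -> bytes:
    """Append two padding bytes: the session ID, then the padding count."""
    if len(rtp_packet) < RTP_HEADER_LEN:
        raise ValueError("not a complete RTP header")
    if rtp_packet[0] & PADDING_BIT:
        raise ValueError("already padded; not handled in this sketch")
    if not 0 <= session_id <= 255:
        raise ValueError("session ID must fit in one byte")
    first = rtp_packet[0] | PADDING_BIT
    # The last byte is the padding count, which includes itself.
    return bytes([first]) + rtp_packet[1:] + bytes([session_id, 2])


def read_session_id(rtp_packet: bytes) -> int:
    """Return the byte just before the padding count."""
    if not rtp_packet[0] & PADDING_BIT:
        raise ValueError("no padding present")
    count = rtp_packet[-1]
    if count < 2 or count > len(rtp_packet) - RTP_HEADER_LEN:
        raise ValueError("padding too short to hold a session ID")
    return rtp_packet[-2]


plain = bytes([0x80, 96]) + bytes(10) + b"media"
padded = add_session_padding(plain, 5)
assert len(padded) == len(plain) + 2   # the 16-bit overhead
assert read_session_id(padded) == 5
```
]]></artwork>
        </figure>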

        <t>This proposal carries the session ID in bytes that have no defined
        value and are intended to be ignored by the receiver. From that
        perspective it only causes packet expansion that is supported and
        handled by all existing equipment. If an implementation fails to
        understand that it is required to interpret this padding byte to learn
        the session ID, it will see a mostly coherent RTP session except where
        SSRCs overlap or where the payload types overlap. However, reporting
        on the individual sources or forwarding the RTCP RR is not completely
        without merit.</t>

        <t>There is one downside of this proposal and that has to do with
        SRTP. To be able to determine the crypto context, it is necessary to
        access the encrypted payload of the packet. Thus, the only
        mechanism available for a receiver to solve this issue is to try the
        existing crypto contexts for any session on the same lower layer
        transport and then use the one where the packet decrypts and verifies
        correctly. Thus for transport flows with many crypto contexts, an
        attacker could simply generate packets that don't validate to force
        the receiver to try all crypto contexts they have rather than
        immediately discard it as not matching a context. A receiver can
        mitigate this somewhat by using heuristics based on the RTP header
        fields to determine which context applies for a received packet, but
        this is not a complete solution.</t>
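
        <t>The cost of this trial approach can be illustrated with a toy
        model. The sketch below uses a truncated HMAC-SHA1 tag from the Python
        standard library as a stand-in for SRTP authentication; it is not an
        SRTP implementation, and the names protect and find_context are
        hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
import hashlib
import hmac

TAG_LEN = 10  # assumed truncated tag length in bytes


def protect(key: bytes, packet: bytes) -> bytes:
    """Append a truncated HMAC tag (toy stand-in for SRTP protection)."""
    return packet + hmac.new(key, packet, hashlib.sha1).digest()[:TAG_LEN]


def find_context(contexts: dict, datagram: bytes):
    """Try every context in turn.

    Returns (context name or None, number of verification attempts).
    """
    body, tag = datagram[:-TAG_LEN], datagram[-TAG_LEN:]
    attempts = 0
    for name, key in contexts.items():
        attempts += 1
        expected = hmac.new(key, body, hashlib.sha1).digest()[:TAG_LEN]
        if hmac.compare_digest(expected, tag):
            return name, attempts
    return None, attempts


contexts = {"s1": b"key-one", "s2": b"key-two", "s3": b"key-three"}
genuine = protect(contexts["s2"], b"rtp packet bytes")
forged = b"rtp packet bytes" + bytes(TAG_LEN)

assert find_context(contexts, genuine) == ("s2", 2)
# A forged packet costs one verification per multiplexed context.
assert find_context(contexts, forged) == (None, 3)
```
]]></artwork>
        </figure>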

        <t>This solution has a 16-bit per packet overhead.</t>
      </section>

      <section anchor="sec-sol-ssrc" title="Redefine the SSRC field">
        <t>The Rosenberg et al. Internet draft <xref
        target="I-D.rosenberg-rtcweb-rtpmux">"Multiplexing of Real-Time
        Transport Protocol (RTP) Traffic for Browser based Real-Time
        Communications (RTC)"</xref> proposed to redefine the SSRC field. This
        has the advantage of no packet expansion. It also looks like regular
        RTP. However, it has a number of implications. First of all it
        prevents any RTP functionality that requires the same SSRC in multiple
        RTP sessions.</t>

        <t>Secondly, its interoperability with end-points using multiple RTP
        sessions is problematic. Such interoperability will require an SSRC
        translator function in the gatewaying node to ensure that the SSRCs
        fulfill the semantic rules of the different domains. That translator
        is actually far from easy to build, as it needs to understand the
        semantics of all RTP and RTCP extensions that include SSRC/CSRC. This
        is because it is necessary to know when a particular matching 32-bit
        pattern is an SSRC field and when it is just a combination of other
        fields that creates the same 32-bit pattern. Thus there is a
        possibility that such a translator becomes an obstacle in deploying
        future RTP/RTCP extensions. In addition, the translator has
        significant overhead when SRTP is in use, because verification that
        the packet is authentic, decryption, SSRC translation, encryption and
        finally generation of a new authentication tag are all required. The
        translator must also be part of the security context.</t>

        <t>This solution has no per packet overhead.</t>
      </section>
    </section>

    <section title="Comparison">
      <t>This section compares the above potential solutions with the
      requirements. Motivations are provided in addition to a high level
      rating of whether each solution successfully meets, partially meets, or
      fails to meet each requirement. At the end, a <xref
      target="fig-comp-table">summary table</xref> of the high level ratings
      is provided.</t>

      <section title="Support of Multiple RTP Sessions Over Single Transport">
        <t>This one is easy to determine. Only the single session proposal
        fails this requirement as it is not at all designed to meet it. The
        rest fully support this requirement. The main question around this
        requirement is how important it is to have as discussed in <xref
        target="sec-req-multiple-session"/>.</t>
      </section>

      <section title="Enable Same SSRC Value in Multiple RTP Sessions">
        <t>Based on the discussion in <xref target="seqreq-multi-ssrc"/> two
        sub-requirements have been derived.</t>

        <section title="Avoid SSRC Translation in Gateways/Translation">
          <t>This sub-requirement is derived from the desire to avoid having
          gateways or translators perform full SSRC translation, in order to
          minimize complexity, avoid the requirement to have gateways in the
          security context, and avoid hindering long-term evolution. Two of
          the proposals have issues with this, due to their lack of support
          for multiple 32-bit SSRC spaces and their inability to have the same
          SSRC value in multiple RTP sessions. The proposals that have these
          properties and thus are marked as failing are the Single Session and
          Redefine the SSRC field. The other proposals are all successful in
          meeting this requirement.</t>
        </section>

        <section title="Support Existing Extensions">
          <t>The second sub-requirement is how well the proposals support
          using the existing RTP mechanisms. Here both Single Session and
          Redefine the SSRC field will have clear issues as they cannot
          support the same full 32-bit SSRC value in two different RTP
          sessions. This is clearly an issue for the XOR based FEC. RTP
          retransmission and scalable encoding are minor issues as there exist
          alternatives to those mechanisms that work with the structure of
          these two proposals. Thus we give them a fail. The Header Extension
          gets a partial due to the unclear interaction between adding a
          header extension and these mechanisms.</t>
        </section>
      </section>

      <section title="Ensure SRTP Functions">
        <t>This requirement is about ensuring both secure and efficient usage
        of SRTP. The Octet in Padding field proposal gets a fail as the
        receiving end-point cannot determine the intended RTP session prior to
        decryption of the padding field. Thus a catch-22 arises which can only
        be resolved by trying all session contexts and seeing which one
        decrypts. This causes a security vulnerability as an attacker can
        inject a packet which does not match any of the session contexts. The
        receiver will then attempt decryption and authentication of it using
        all its session contexts, increasing the amount of wasted resources by
        a factor equal to the number of multiplexed sessions. Thus this
        proposal gets a fail.</t>

        <t>The proposal of Overloading the SRTP MKI field as session
        identifier gets a partial due to the fact that it cannot use SRTP's
        key-management mechanism out of the box. It forces the key-management
        mechanism and the SRTP implementations to maintain the MKI-to-RTP
        session bindings to maintain secure and correct function.</t>

        <t>The Redefine the SSRC field gets a partial due to its need to
        modify the key-management mechanisms to correctly identify the partial
        SSRC space the parameters applies to. Similarly, the SRTP
        implementation also needs to be updated to correctly support this
        security context differentiation.</t>

        <t>The header extension based solution gets a less severe partial than
        Redefine the SSRC and the MKI. It will however have an issue when
        being gatewayed to a domain that does not multiplex multiple RTP
        sessions over the same transport. The gateway will then need to be in
        the security context to be able to add or remove the header extension,
        as it is in the part of the packet that is integrity protected by
        SRTP.</t>

        <t>The remaining two proposals do not affect SRTP mechanisms and thus
        successfully meet this requirement.</t>
      </section>

      <section title="Don't Redefine Used Bits">
        <t>This requirement is that RTP and RTCP header fields that already
        have a given definition should not be redefined, as doing so can cause
        interoperability problems between modified and non-modified
        implementations. This becomes especially problematic in RTP sessions
        used for multi-party sessions.</t>

        <t>Redefine the SSRC field gets a big fail on this as it redefines the
        SSRC field, a core field in RTP. It has been identified that such a
        change will cause issues if a modified end-point gets connected to a
        non-modified end-point that randomly assigns the SSRC, as specified by
        RFC 3550: those SSRCs will be distributed over different RTP sessions
        at the modified end-point. Other functions that use the SSRC field
        without understanding its additional semantics are also likely to
        have issues.</t>
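
        <t>The scattering effect can be sketched as follows, assuming purely
        for illustration that the upper 8 bits of the SSRC carry the session
        identifier. That bit split, and the names split_ssrc and make_ssrc,
        are hypothetical and are not taken from the referenced draft.</t>

        <figure>
          <artwork><![CDATA[
```python
import random
from typing import Tuple

SESSION_BITS = 8                  # hypothetical: upper bits hold session ID
SOURCE_BITS = 32 - SESSION_BITS   # remaining bits identify the source


def split_ssrc(ssrc: int) -> Tuple[int, int]:
    """Split a 32-bit SSRC into (session ID, source ID)."""
    return ssrc >> SOURCE_BITS, ssrc & ((1 << SOURCE_BITS) - 1)


def make_ssrc(session_id: int, source_id: int) -> int:
    """Build an SSRC that obeys the redefined semantics."""
    if not 0 <= session_id < (1 << SESSION_BITS):
        raise ValueError("session ID out of range")
    if not 0 <= source_id < (1 << SOURCE_BITS):
        raise ValueError("source ID out of range")
    return (session_id << SOURCE_BITS) | source_id


assert split_ssrc(make_ssrc(1, 0xABCDEF)) == (1, 0xABCDEF)

# An unmodified end-point picks SSRCs at random, as RFC 3550 specifies.
rng = random.Random()
legacy_ssrcs = [rng.getrandbits(32) for _ in range(5)]
# A modified receiver reads an arbitrary session ID out of each of them.
apparent_sessions = [split_ssrc(s)[0] for s in legacy_ssrcs]
print(apparent_sessions)
```
]]></artwork>
        </figure>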

        <t>Using the SRTP MKI field to identify a session overloads that field
        with double semantics. This likely has minimal negative impact in RTP
        since it should be possible to have the SRTP stack use the MKI field
        to look up both the security context and the output RTP session the
        processed packet belongs to. However, this redefinition clearly
        creates issues with the key-management scheme, which will have to be
        modified to handle both this change and the interoperability issues
        when negotiating its usage. This gets a full fail because it makes the
        problem someone else's, namely the RTP implementors.</t>

        <t>Defining an Octet in the Padding field redefines a field whose
        definition is to have zero value and that is expected to be ignored by
        the receiver according to the original semantics. Thus this is one of
        the more benign modifications one can make. However, it can still
        cause issues in implementations that unnecessarily check the field
        values, or in firewalls. This is judged to be partially meeting the
        requirement.</t>

        <t>The Header Extension proposal does in fact not redefine any
        currently used bits in RTP. The header extension would be a correctly
        identified extension with its own definition. However, it does
        redefine a rule on what header extensions are for. The RTCP solution
        however would have more severe impact as it would need to redefine the
        standard meaning of an RTCP packet header in addition to the default
        compound packet rules. Due to these issues the proposal fails to meet
        this requirement.</t>

        <t>The multiplexing shim and the single session both successfully meet
        this requirement.</t>
      </section>

      <section title="Firewall Friendly ">
        <t>This requirement is clearly difficult to judge as firewalls differ
        widely in implementation, in the scope of what they inspect in
        packets, and in configured policies. A reasonable goal is to maximize
        the likelihood that rules and policies intended to let RTP media
        streams pass will also let these streams through when multiplexing RTP
        sessions over a single transport. The analysis below shows that no
        solution is truly firewall friendly, and all are judged as partially
        meeting this goal. However, the reasons why a firewall might react to
        the streams are quite different.</t>

        <t>The Single Session and Redefine the SSRC field are likely the least
        suspect solutions from a firewall perspective. However, as their
        transport flows contain multiple SSRCs with payloads that likely
        indicate multiple different media types, they are still likely to make
        a picky firewall block the transport. This is especially true for
        firewalls that take signalling messages into account and expect a
        particular media type in a given context. A non-upgraded firewall
        might in fact produce two different contexts with overlapping
        transport parameters, where both rules will receive media streams of
        the other media type that are outside of the allowed rule. However, to
        be clear, if these proposals do not get through, none of the others
        will either, as they all will have this behavior.</t>

        <t>The header extension proposal is potentially problematic for two
        reasons. The first reason, which other proposals share, is that the
        same SSRC value can exist in two RTP sessions over the same underlying
        flow. Anyone tracking the sequence number and timestamp will react
        badly as the second media stream with the same SSRC causes constant
        jumps back and forth in these fields compared to the first stream, if
        packets are transmitted simultaneously for both SSRCs. This issue can
        likely only be solved by having the firewalls that track flows also
        use the session identifier to create context. This is possible as the
        header extension will be in the clear and at the front. The second
        issue is that the header extension itself may get the firewall to
        react, especially very picky ones that expect packets with certain
        media types to have certain packet lengths. They
        are not compatible with a header extension.</t>
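
        <t>The sequence number jumps described above can be illustrated with
        a small model of a flow tracker. The packet tuples and the function
        count_jumps are hypothetical; real firewalls and monitors apply far
        more elaborate checks.</t>

        <figure>
          <artwork><![CDATA[
```python
def count_jumps(packets, use_session_id: bool) -> int:
    """Count non-consecutive sequence numbers seen by a flow tracker.

    packets: iterable of (session_id, ssrc, sequence_number) tuples in
    arrival order. With use_session_id False, the tracker keys its
    state on the SSRC alone, as a legacy device would.
    """
    last_seq = {}
    jumps = 0
    for session_id, ssrc, seq in packets:
        key = (session_id, ssrc) if use_session_id else ssrc
        if key in last_seq and seq != (last_seq[key] + 1) % 65536:
            jumps += 1
        last_seq[key] = seq
    return jumps


# Two RTP sessions use the same SSRC and send simultaneously.
packets = []
for i in range(5):
    packets.append((1, 0x1234, 100 + i))     # session 1
    packets.append((2, 0x1234, 40000 + i))   # session 2, same SSRC

assert count_jumps(packets, use_session_id=False) == 9
assert count_jumps(packets, use_session_id=True) == 0
```
]]></artwork>
        </figure>

        <t>Keying the tracker state on the session identifier as well as the
        SSRC removes the apparent jumps, which is only possible when the
        identifier is visible to the tracking device.</t>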

        <t>The Multiplexing Shim shares the issue with multiple flows for the
        same SSRC. Firewalls and deep packet inspection also put the shim
        placement in question. If it is a prefixed shim, it prevents the
        packet from looking like a regular IP/UDP/RTP packet and from being
        correctly classified in firewalls and DPI engines. However, if one
        puts it last, it is unlikely that any firewall or DPI ever will be
        able to take the session context into account as it is at the end of
        the packet. This is because many line rate processing devices only
        take a certain amount of the headers into account.</t>

        <t>The SRTP MKI field is likely the solution that has the fewest
        firewall and DPI issues, after the single RTP session. There is no
        additional suspect field. The only difference from a single RTP
        session in the transport flow is the fact that multiple MKIs are
        guaranteed to be used. However, that may also occur in a single RTP
        session usage. Thus the only issues are the ones shared with single
        session and the one that several RTP media streams may use the same
        SSRC.</t>

        <t>The octet in the padding field has, in addition to the issues the
        SRTP MKI field has, the single issue that it redefines something that
        is supposed to be zero into a value. This can potentially cause a
        deeply inspecting firewall to clamp the flow for fear of a covert
        channel or non-compliance.</t>
      </section>

      <section title="Monitoring and Reporting">
        <t>The monitoring and reporting requirement considers several aspects:
        first, how useful the monitoring is that one can get from an existing
        legacy monitor; second, any issues in upgrading such monitors to
        handle the selected solution; and third, concerns regarding packet
        selector filters and packet sniffers.</t>

        <t>In general one can expect the proposals that have only a single
        SSRC space to work better with legacy monitors. Thus with both Single
        Session and Redefine the SSRC field, a legacy monitor can most likely
        gather and report data on media flows. The only potential issue is
        that due to the different media types and clock rates, some failures
        may occur. In particular, a third party monitor may be targeted at a
        specific media type, such as VoIP. That monitor will have problems
        processing any video packets correctly and generating the VoIP
        specific metrics for any video sending SSRC. In general, no legacy
        solution for monitoring will be able to correctly create the
        sub-contexts that each RTP session has in the solutions without an
        update to handle the new semantics. Also, when it comes to packet
        filtering and selector filters, fine grained control can only be
        accomplished by implementing the new semantics. Therefore only the
        Single Session meets this requirement fully.</t>

        <t>Redefine the SSRC field is close to fully meeting the requirement.
        However, because there exists a session structure that is hidden from
        anyone that is not upgraded to understand the semantics, it only gets
        a partial.</t>

        <t>The other proposals all can have multiple RTP sessions using the
        same SSRC. This will create significant issues for any legacy third
        party monitor. Only an updated monitor, or for that matter packet
        selector, can pick out the individual media streams and their
        associated RTCP traffic. Thus all these proposals fail to meet the
        requirement.</t>
      </section>

      <section title="Usable over Multicast">
        <t>As discussed earlier the goal with having the option usable also
        over multicast is to remove the need to produce different media
        streams for transport over unicast and multicast. All of the proposals
        successfully meet the requirement.</t>
      </section>

      <section title="Incremental Deployment">
        <t>The possibility to incrementally deploy the multiplexing of
        multiple RTP sessions over a single transport, especially in the
        context of multi-party sessions, is a great benefit for any of the
        proposals. With it, not all end-point implementations need to be
        upgraded before one starts enabling it in the central node and any
        signalling.</t>

        <t>Considering a centralized multi-party application where some
        participants are using multiple transport flows and you want to enable
        one particular participant to use the single transport to the central
        node, one criterion stands out: the possibility to have one RTP
        session per transport in one leg, and in the next multiplex them
        together with minimal complexity and packet changes. Here there are
        significant differences.</t>

        <t>The Multiplexing Shim has the least overhead for this, as the
        central node or gateway between deployments only needs to either add
        or remove the shim identifier and then forward the packet over the
        corresponding transport, either a joint one on the single transport
        side, or the individual one on the multiple transport side.</t>

        <t>The SRTP MKI field proposal is almost as good, as the only main
        difference is the need to coordinate the used MKIs on the
        non-multiplexed legs so that there is no overlap between the RTP
        sessions. If there is, the MKI can be translated in the gateway as
        SRTP has no integrity protection over the MKI. Thus both multiplexing
        shim and SRTP MKI field successfully meet this requirement.</t>

        <t>The Header Extension supports multiple full 32-bit SSRC spaces and
        can thus handle all the RTP sessions without need for any SSRC
        translation, however this proposal does run into the problem that the
        gateway needs to be in the security context to be able to add or
        remove the header extension when SRTP is used. In addition to the
        security implications of that, there is a complexity overhead due to
        the need to redo the authentication tags on all RTP/RTCP packets. Thus
        it gets a partial.</t>

        <t>The Octet in the Padding field shares issues with the header
        extension but has even higher complexity. The reason is that the
        padding field is also encrypted. Thus adding or removing it (although
        removing it may be unnecessary) forces the end-point to encrypt at
        least that byte also, and for ciphers that are not stream ciphers, the
        whole packet needs to be re-encrypted. Thus this proposal gets a very
        weak partial.</t>

        <t>The Single Session and Redefine the SSRC field do not allow several
        vanilla RTP sessions to be connected to these proposals. The reason is
        the single 32-bit SSRC space they have. Single Session only has one
        session and Redefine the SSRC field uses some of the bits as session
        identifier. This forces the gateway to translate the SSRC whenever it
        does not fulfill the rules or semantics of the multiplexed side. For
        Redefine the SSRC field this becomes almost constant as the session
        identifier part of the SSRC must be the same over all SSRCs from the
        same session. For Single Session it may only be needed when there
        otherwise would be an SSRC collision between the sessions. This
        further assumes that the non-multiplexed side would never use any of
        the RTP mechanisms that require the same SSRC in multiple RTP
        sessions, as they cannot be gatewayed at all. Translating an SSRC
        first of all incurs overhead, which with SRTP includes a complete
        cycle of authentication, decryption, encryption and creation of a new
        authentication tag. In addition, the SSRC translation could
        potentially be a deployment obstacle for new RTP/RTCP extensions,
        which must be understood by the translator to be correctly translated.
        Therefore these two proposals fail to meet the requirement.</t>
      </section>

      <section title="Summary and Conclusion">
        <t>This section contains a summary table of the high level outcome
        against the different requirements.</t>

        <t>A table mapping the requirements against the ID numbers used in the
        table is the following:<list style="hanging">
            <t hangText="1:">Support multiple RTP sessions over one transport
            flow</t>

            <t hangText="2:">Enable same SSRC value in multiple RTP sessions
            <list style="hanging">
                <t hangText="2.1:">Avoid SSRC translation in
                gateways/translators</t>

                <t hangText="2.2:">Support existing extensions</t>
              </list></t>

            <t hangText="3:">Ensure SRTP functions</t>

            <t hangText="4:">Don't Redefine used bits</t>

            <t hangText="5:">Firewall Friendly</t>

            <t hangText="6:">Monitoring and Reporting should still
            function</t>

            <t hangText="7:">Usable over Multicast</t>

            <t hangText="8:">Incremental deployment</t>

            <t hangText="OH:">Overhead in Bytes. + means variable</t>
          </list></t>

        <figure align="center" anchor="fig-comp-table"
                title="Summary Table of Evaluation (Successfully (S), Partially (P) or Fails (F) to meet requirement)">
          <artwork><![CDATA[
---------------+---+---+---+---+---+---+---+---+---+----
Solution       | 1 |2.1|2.2| 3 | 4 | 5 | 6 | 7 | 8 | OH
---------------+---+---+---+---+---+---+---+---+---+----
Header Ext.    | S | S | P | P | F | P | F | S | P | 8+
Multiplex Shim | S | S | S | S | S | P | F | S | S | 1
Single Session | F | F | F | S | S | P | S | S | F | 0
SRTP MKI Field | S | S | S | P | F | P | F | S | S | 4
Padding Field  | S | S | S | F | P | P | F | S | P | 2
Redefine SSRC  | S | F | F | P | F | P | P | S | F | 0
---------------+---+---+---+---+---+---+---+---+---+----
]]></artwork>
        </figure>

        <t/>

        <t>Considering these options, the authors recommend that AVTCORE
        standardize a solution based on a postfixed or prefixed multiplexing
        field, i.e. a shim approach combined with the appropriate signalling,
        as described in <xref target="sec-mux-shim"/>.</t>
      </section>
    </section>
  </back>
</rfc>