<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info"
     docName="draft-westerlund-perc-rtp-field-considerations-00"
     ipr="trust200902">
  <front>
    <title abbrev="PERC handling of RTP fields">Handling Considerations for
    the RTP fields in PERC</title>

    <author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>Farogatan 2</street>

          <city>SE-164 80 Stockholm</city>

          <country>Sweden</country>
        </postal>

        <phone>+46 10 714 82 87</phone>

        <email>magnus.westerlund@ericsson.com</email>
      </address>
    </author>

    <date day="19" month="October" year="2015" />

    <abstract>
      <t>This draft discusses how the Privacy Enhanced RTP Conferencing
      solution will need to consider the different RTP header fields with
      regard to both hop-by-hop and end-to-end security.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>In the design of the Privacy Enhanced RTP Conferencing (PERC)
      end-to-end security solution for <xref target="RFC3550">RTP</xref> media
      streams there is a need to carefully consider what properties the
      different RTP fields have, their security and privacy implications, and
      provide recommendations and requirements for how they are handled.
      This review needs to consider both hop-by-hop properties as well as the
      end-to-end security.</t>

      <t>The fields analysed are those of a regular RTP packet. The aim is to
      consider the impact of the information that normally exists in a
      centralized multi-party conference.</t>

      <t>This document is a working document, and not intended to be published
      as an RFC.</t>
    </section>

    <section title="Definitions">
      <t>This document uses the following definitions:<list style="hanging">
          <t hangText="Endpoint:">An RTP stream sending and/or receiving
          entity that is part of the end-to-end security context.</t>

          <t hangText="MDD:">Media Delivery Device - An RTP middlebox that
          operates according to any of the three <xref
          target="I-D.ietf-avtcore-rtp-topologies-update">RTP
          topologies</xref> that are possible in the PERC system:<list
              style="empty">
              <t>Transport Translator - Relay</t>

              <t>Switching RTP Mixer</t>

              <t>Selective Forwarding Middlebox (SFM)</t>
            </list></t>

          <t hangText="Third Party:">An entity that is neither an endpoint nor
          an MDD.</t>
        </list></t>


      <section title="Connection Case">
        <t>This analysis is based on a basic connectivity use case, where a
        media stream sending endpoint (originating) sends one or more RTP
        streams to an MDD. That MDD selectively forwards media to another MDD
        (cascaded), which in turn sends the media from the originating
        endpoint (when it selects to) to the receiving endpoint. This
        connection case is depicted in <xref
        target="fig-connection-case"></xref>.</t>

        <figure anchor="fig-connection-case">
          <artwork><![CDATA[+-------------+     +------+     +------+     +-----------+
|             |     |      |     |      |     |           |
| Originating +---->+ MDD  +---->+ MDD  +---->+ Receiving |
| Endpoint    |     |      |     |      |     | Endpoint  |
|             |     |      |     |      |     |           |
+-------------+     +------+     +------+     +-----------+
]]></artwork>
        </figure>

        <t>The MDDs are not trusted with anything except to forward media
        according to the policies given to them by endpoints and to not
        forward media from third parties.</t>
      </section>

      <section title="Additional Assumptions">
        <t>This document assumes that the originating media stream is uniquely
        identified by the SSRC value used by the originating endpoint. This
        SSRC value needs to be preserved to the receiving endpoint. It is
        assumed that even if SSRC/CSRC translation is in use by an MDD, there
        will exist a one-to-one mapping between the originating SSRC value and
        the SSRC/CSRC value the receiving endpoint receives. Further, MDDs
        operating as media switching RTP mixers will indicate the originating
        SSRC as a CSRC when they switch that stream into one of the MDD's
        SSRCs. The CSRC will need to be maintained even over multiple
        MDDs.</t>

      </section>
    </section>

    <section title="RTP Packets Fields">
      <t>This section analyses each RTP packet field or part. The analysis for
      each field should answer the following questions:<list style="empty">
          <t>Can it, or does it need to, be modified on path by an MDD?</t>

          <t>Does the receiving endpoint need the originating endpoint's set
          value?</t>

          <t>Does it need end-to-end authentication?</t>

          <t>Does it need end-to-end confidentiality?</t>

          <t>Does it need hop-by-hop authentication?</t>

          <t>Does it need hop-by-hop confidentiality?</t>
        </list>As a general rule, the only reason to encrypt something without
      integrity protection is to save the overhead of the tag. As the PERC
      solution will have both a hop-by-hop (hbh) tag and an end-to-end (e2e)
      tag, no overhead is saved by omitting integrity protection, so as a
      general rule confidentiality implies authentication.</t>

      <t>Some general considerations apply. Fields that are end-to-end
      authenticated are also recommended to be hop-by-hop authenticated,
      even when only an end-to-end version of the field exists. The reason
      for this is to detect modifications at the earliest opportunity and
      avoid wasting resources further down the path.</t>
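
      <t>As a point of reference for the per-field analysis that follows,
      here is a non-normative sketch in Python that locates each field in a
      packet, assuming the header layout of RFC 3550. The names RtpHeader and
      parse_rtp_header are illustrative only and are not defined by PERC; the
      sketch applies and verifies no protection.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]
    payload_offset: int  # first octet after CSRC list and any extension


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed header, the CSRC list and the extension length."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-octet fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F
    offset = 12 + 4 * cc
    if len(packet) < offset:
        raise ValueError("CC field points past the end of the packet")
    csrcs = struct.unpack("!%dI" % cc, packet[12:offset])
    extension = bool(b0 & 0x10)
    if extension:
        if len(packet) < offset + 4:
            raise ValueError("truncated header extension")
        # 16-bit profile-defined value, then length in 32-bit words
        _profile, words = struct.unpack("!HH", packet[offset:offset + 4])
        offset += 4 + 4 * words
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=extension,
        csrc_count=cc,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
        payload_offset=offset,
    )
```
]]></artwork>
      </figure>

      <t>In this sketch payload_offset depends on both the CC field and the X
      bit, so an undetected change to either makes a receiver look for the
      payload in the wrong place.</t>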

      <section title="Version Field (V)">
        <t>As the solution is focused on RTP as defined by <xref
        target="RFC3550"></xref> this field must be 2. The field is not
        expected to be modified by an MDD. The receiving endpoint will also
        assume that the originating endpoint used RTP (v=2). Modification of
        this field should result in the packet being dropped by the receiving
        endpoint or MDD, independent of whether it is hop-by-hop
        authenticated. The version field does not require end-to-end
        authentication as the MDD has more efficient denial of service attacks
        that it can perform on the endpoints, including not forwarding a
        single media packet/stream. The field cannot be confidentiality
        protected end-to-end as the MDD must know that what it receives is RTP
        (v=2). The field may be
        hop-by-hop confidentiality protected as part of an attempt to hide
        that the packet stream is RTP, although packet analysis is likely to
        reveal that the streams are real-time media anyway.</t>

        <t>If an RTP v=3 is ever defined in the future, it is clear that one
        particular version must be used per hop. It is not possible to predict
        whether the end-to-end information could be translated between one hop
        using v=2 and one using v=3. If such translation and e2e
        authentication were performed, the receiving entity must be aware of
        it, to know that the field's value is not the original one. Thus, it
        becomes a choice whether one wants to require explicit knowledge of
        the translation, or to avoid that demand by excluding the field from
        end-to-end authentication.</t>
      </section>

      <section title="Padding Indicator bit (P)">
        <t>The padding bit is an indicator for the presence of the padding
        octets at the end of the RTP payload. As further discussed in <xref
        target="sec-padding-octets">Padding Octets</xref> the padding is
        considered part of the payload and jointly protected with the payload.
        The reason for this is that padding can help hide length variations in
        the payload that can leak information about the media content being
        carried <xref target="RFC6562"></xref>.</t>

        <t>As the Payload and padding octets are end-to-end protected, the
        padding indicator can't be modified by the MDD, due to its inability
        to remove the padding octets. For correct processing in the receiving
        endpoint the padding indicator needs to be correct. Therefore it
        should be end-to-end authenticated. It could be end-to-end
        confidentiality protected. The benefit of protecting it end-to-end
        would be that the MDD would not know if the end-to-end payload is
        padded or not. Knowing if the payload is padded or not reduces the
        uncertainty for an attacker that attempts to perform content analysis
        based on payload length. Because of that it would be beneficial to
        also protect the padding bit hop-by-hop, if not already protected
        end-to-end. The padding bit should be hop-by-hop authenticated to
        protect it if end-to-end authentication is not used.</t>
      </section>

      <section title="Extension Indicator bit (X)">
        <t>The extension indicator bit indicates if the header extension part
        is present. The MDD will be the target recipient of some RTP header
        extensions. It can also remove the ones not necessary to reach the
        receiving endpoint. As a result, a packet that starts out with
        header extensions may no longer have any on the last hop. Thus, the
        MDD must be able to modify the X bit. Currently, there is no strong
        argument for why a receiving endpoint needs to know that header
        extensions were present from the originating endpoint and have been
        removed. Such a need might arise when using end-to-end protected
        header extensions and wanting to ensure detection of their removal by
        the MDD. However, other methods for ensuring that exist, most likely
        authenticating the end-to-end header extensions themselves. The
        conclusion is that there is no need to know the original value.</t>

        <t>There is no need for end-to-end confidentiality or
        authentication. Hop-by-hop authentication shall be used to prevent
        unnecessary erroneous processing of the packet. Hop-by-hop
        confidentiality is recommended, but lack of it has very minor impact
        as the information leaked is the presence or absence of header
        extensions. Having this knowledge may simplify payload length based
        attacks on the content.</t>
      </section>

      <section anchor="sec-cc" title="CSRC Count (CC)">
        <t>The contributing sources count indicates how many CSRC values are
        part of the CSRC field, and is critical for correctly finding the
        start of the payload within the RTP packet. When using MDDs that
        follow the Media Switching RTP Mixer topology (Section 3.6.2 of <xref
        target="I-D.ietf-avtcore-rtp-topologies-update"></xref>) the MDD will
        need to insert the originating endpoint's SSRC as a CSRC value in the
        outgoing stream when that stream contains a payload from the
        originating stream identified by the CSRC. As a result, the MDD can
        modify and add CSRC fields when performing switching. And in cases
        where an MDD operating like an SFM (Section 3.7 of <xref
        target="I-D.ietf-avtcore-rtp-topologies-update"></xref>) receives a
        switched media stream, it may attempt to restore the mixed stream into
        a number of SSRC-specific streams, thus removing the CSRC field. An
        originating endpoint is unlikely to have a need to insert a CSRC, as
        in the PERC context it is expected that the media sources have a
        direct relation to the endpoint. A need for an endpoint to express
        that it generates a mixed or switched stream, where it can generate an
        "end-to-end" secured payload with such properties, appears to be in
        violation of the intended security model. The current conclusion is
        that there is no need for the original value. <list style="empty">
            <t>Note: The possibility for originating endpoints to create a
            CSRC list will need further discussion as it affects the
            possibility to rely on the SSRC/CSRC value as reference to the
            originating identity. </t>
          </list>The CC field does not appear to need end-to-end
        authentication or confidentiality protection. The CC field shall be
        hop-by-hop authenticated to prevent third party modifications, as it
        affects finding the payload boundary. Errors here can only lead to
        wasting resources for further entities in the conference, and should
        be detected as early as possible. Erroneous payload delimitation due
        to an error in the CC field will cause the receiving endpoint's
        integrity verification of the end-to-end payload to fail. Hop-by-hop
        confidentiality is recommended as the CC field allows a third party to
        better determine the RTP payload size, and is thus information with
        some privacy sensitivity.</t>

      </section>

      <section title="Marker Bit (M)">
        <t>The marker bit semantics are dependent on the RTP payload format in
        use. Two semantics dominate, although usage is not limited to these
        two. Video primarily uses it to indicate the last packet carrying part
        of an encoded video frame. Audio primarily uses it to indicate the
        start of a talk spurt, indicating where a receiver could adjust its
        jitter buffer and playout.</t>

        <t>Depending on the semantics, the MDD could potentially have an
        interest in setting the marker. One example could be an MDD that would
        like to set the marker bit for audio to indicate the start of a media
        stream when switching in/on a particular originating endpoint's
        stream. In the discussion about this for PERC the conclusion is that
        an MDD can use other methods for indicating the switch-in event. The
        main argument for this is to avoid having to understand the semantics
        of the payload currently present, especially as codec switches can
        change the semantics in the middle of an ongoing conference session.
        The marker bit is meta data about the stream that can be relevant for
        knowing where appropriate switching points are, depending on the
        semantics.</t>

        <t>The receiving endpoint's need for the original value from the
        originating endpoint is dependent on the semantics. However, for many
        semantics it is important that the originating value is known by the
        receiving endpoint. Therefore the recommendation is to require the
        originating value to be made available to the receiving endpoint.</t>

        <t>The recommendation is to use end-to-end authentication of the
        value. A need for end-to-end confidentiality exists as the marker bit
        can carry semantics directly related to the content encoded. Audio's
        common semantics, marking the start of a talk spurt, tells a passive
        monitor something about the ongoing flow of information. This needs to
        be balanced against the potential need for the MDD to have this
        information to function better, for example to know where to
        switch.</t>

        <t>The marker bit should be both hop-by-hop authenticated and
        confidentiality protected. Authentication prevents modification of
        this important piece of information, so that the MDD does not react to
        manipulated data. Confidentiality prevents third parties from learning
        the information, which is potentially privacy sensitive.</t>
      </section>

      <section title="Payload Type (PT)">
        <t>The payload type identifies the RTP payload format and thus
        normally the encoding of the media content in the payload. The
        dominant usage is to use some type of signalling protocol to agree on
        a mapping between a payload format and its parameters, following the
        payload format's MIME type, and the 7-bit field values. There exist
        some statically assigned codecs, but these values can still be
        assigned to other payload format configurations by the signalling.</t>

        <t>The MDD is expected to be required to rewrite the PT values when
        forwarding the payloads. The reason for this is that in many
        signalling contexts the binding between a payload type value and the
        payload format configuration will only have local meaning. The PT
        value identifying a particular codec configuration with one endpoint
        is quite likely to be a different PT value with another endpoint.
        Thus, the MDD will need to maintain translation tables for each
        ingress and egress pair.</t>
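
        <t>The per-pair translation tables described above can be sketched as
        follows. This is a non-normative Python illustration: the leg names,
        the PtTable type and the rewrite_payload_type function are
        hypothetical, and the PT values are arbitrary dynamic values rather
        than anything PERC mandates.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Dict, Tuple

# (ingress leg, egress leg) -> {ingress PT: egress PT}
# Leg names are hypothetical labels for signalled associations.
PtTable = Dict[Tuple[str, str], Dict[int, int]]


def rewrite_payload_type(table: PtTable, ingress: str, egress: str,
                         pt: int) -> int:
    """Return the PT value to use on the egress leg for an ingress PT."""
    if not 0 <= pt <= 127:
        raise ValueError("PT is a 7-bit field")
    try:
        return table[(ingress, egress)][pt]
    except KeyError:
        raise LookupError("no PT mapping for this leg pair") from None


# Example: the same codec configuration is PT 96 towards "alice"
# and PT 111 towards "bob".
example_table: PtTable = {("alice", "bob"): {96: 111}}
egress_pt = rewrite_payload_type(example_table, "alice", "bob", 96)
```
]]></artwork>
        </figure>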

        <t>As knowing the correct payload format and codec configuration is
        crucial to be able to correctly decode the received payload, it is in
        the interest of the receiving endpoint to know the originating payload
        format and codec configuration. This would indicate a need to know the
        original value of the PT field. Unfortunately that is not sufficient
        to securely verify that no malicious changes have occurred on the path
        by a third party or the MDDs. The receiving endpoint would also need
        to know how the originating PT values map against the payload format
        and its parameters to verify correctness.</t>

        <t>End-to-end authentication of the original value is recommended,
        given that the receiving endpoint also gets the payload format
        configuration. End-to-end confidentiality would be desirable, as
        knowing which codec is used, or at least detecting when the codec
        changes, simplifies matters for an attacker. Content analysis is
        simpler when the codec is known, as the codec's behaviour can be
        accounted for. However, this is not crucial information, and it
        appears very difficult to confidentiality protect the PT field value
        with respect to the MDD.</t>

        <t>Hop-by-hop authentication is important to prevent third-party
        modifications and avoid wasting resources by forwarding erroneous
        information. Hop-by-hop confidentiality is recommended but not crucial
        as the information leakage can be limited to knowing when the same
        codec is being used. If the signalling is kept confidential towards
        any third party, then this minimal leakage is achieved. If one uses
        payload formats that have static mappings without remapping them, then
        the codec will be known by third parties. A countering requirement
        that may need to be considered is that the payload type is usually
        needed by third party quality monitors that gather statistics about
        the RTP packet stream as it passes a measuring point.</t>
      </section>

      <section anchor="sec-sequnce-nr" title="Sequence Number">
        <t>The MDD will need to modify the originating sequence number when it
        performs any switching or on/off operations on the RTP stream. This is
        to ensure that the outgoing RTP stream has sequence numbers consistent
        with the number of packets actually sent, rather than how many are
        being received at the ingress.</t>
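
        <t>The rewriting described above can be sketched as follows. This is
        a non-normative Python illustration; the SequenceRewriter class is
        hypothetical, and how the originating sequence number would be
        conveyed to the receiving endpoint is deliberately left out.</t>

        <figure>
          <artwork><![CDATA[
```python
class SequenceRewriter:
    """Give forwarded packets contiguous 16-bit sequence numbers."""

    def __init__(self, first: int = 0) -> None:
        self._next = first & 0xFFFF

    def forward(self, original_seq: int) -> int:
        """Return the outgoing sequence number for one forwarded packet.

        original_seq is not used for numbering: gaps on the ingress
        side (packets the MDD chose not to forward) must not show up
        as loss on the egress side.
        """
        out = self._next
        self._next = (self._next + 1) & 0xFFFF  # wrap at 2**16
        return out


# The MDD forwards ingress packets 100, 101, 250 and 251 only.
rewriter = SequenceRewriter(first=65534)
outgoing = [rewriter.forward(seq) for seq in (100, 101, 250, 251)]
```
]]></artwork>
        </figure>

        <t>In this example the outgoing numbers are contiguous and wrap from
        65535 to 0, although the ingress numbers jump from 101 to 250.</t>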

        <t>The receiving endpoint likely needs the originating sequence number
        or something semantically equivalent. The reasons for this are
        decryption, replay protection, and packet reordering. If the receiving
        endpoint knows through an end-to-end authenticated way the sequence in
        which the payloads were originated, the receiver can prevent using
        payloads that are replays from previous points in the RTP stream.
        <list style="empty">
            <t>Note: Sequence number based replay protection is vulnerable in
            an environment where the MDD can perform switching operations. The
            vulnerability is to an attack that delays packets, rather than
            replaying them. Due to the switching operation the receiving
            endpoint will need to accept any sequence number that is greater
            than the previously received one, as it lacks knowledge about how
            many payloads the originating endpoint has sent in the time
            interval since the last payload was received. Thus an MDD can
            select to send any payload between the last forwarded and the
            latest received from the origin.</t>
          </list></t>

        <t>End-to-end authentication of the original payload sequence number
        is likely required. End-to-end confidentiality is not possible as the
        MDDs need to know in which sequence the payloads were sent. Being
        able to re-order the payloads could help improve the confidentiality
        of the media content, as analysis using randomly reordered packets
        would be significantly more difficult. However, due to the real-time
        properties, such actions are unlikely to be feasible. If any such
        deliberate reordering were attempted, the original sequence number
        would need to be confidentiality protected.</t>

        <t>Hop-by-hop authentication of the sequence number is recommended to
        prevent attacks on the receiver buffer, including forcing the receiver
        to discard other packets. Hop-by-hop confidentiality is recommended
        but not required. The goal would be to attempt to hide the correct
        sequence, across unintentional or intentional reordering, and enable
        detection of lost packets. Such knowledge has some use in content
        analysis. At the same time, having this information in the clear
        enables third party monitoring to gather statistics about re-ordering
        and packet loss.</t>
      </section>

      <section title="Timestamp">
        <t>The RTP timestamp expresses playout related time information. When
        an MDD is a media switching RTP mixer, it will need to provide a
        consistent timeline across switches. The timeline is also the outgoing
        SSRC's (from the mixer) internal timeline, and not specific to any of
        the originating RTP streams being switched into the stream. Thus, the
        timestamp in relation to the originating packet will need to be
        rewritten.</t>

        <t>The receiving endpoint could have use for the original value.
        First, it could be used to detect malicious rewrite attempts that
        force the receiver into flushing the receiver buffer or performing
        concealment over media that otherwise would have been played out.
        Secondly, it can be used as a protection against the delay attack
        discussed above in <xref target="sec-sequnce-nr"></xref>. However,
        protection against these types of attacks by the MDD can be fragile
        and may cause more harm than gain. For the first type of attack, it is
        clear that some modifications of the timeline between originating
        sources are necessary. This is first to align content segments so they
        have matching boundaries. Secondly, as the different endpoints don't
        have synchronized clocks there will be clock skew, thus some clock
        skew compensation at switch points is to be expected. The clock skew
        issue is also present for the delay attack protection. Both clock skew
        related issues are further complicated by the fact that the clock skew
        compensation information is in RTCP and currently under the control of
        the MDD. Thus, one would need to consider protecting this RTCP
        information end-to-end, or providing it using other protocol
        means.</t>

        <t>Whether the original timestamp needs end-to-end authentication
        depends on whether one can define a mechanism for delay attack
        protection using it. If not, it is likely not needed. End-to-end
        confidentiality will be difficult as the MDD will need to know where
        in the timeline a particular payload belongs. This is also closely
        related to the payload sequence information discussed above in <xref
        target="sec-sequnce-nr"></xref>.</t>

        <t>Hop-by-hop authentication is needed to prevent third party attacks.
        Hop-by-hop confidentiality is recommended as it prevents leaking
        information about the sequence of the media and how much media is
        packed into each payload, especially for audio. This is coupled to the
        protection provided for the sequence number. At the same time, a third
        party quality monitor likely needs the RTP timestamp to perform its
        role adequately.</t>
      </section>

      <section title="SSRC">
        <t>The SSRC identifies the source of the RTP packet. As each SSRC has
        its own RTP sequence number space as well as timestamp sequence,
        collisions shall be avoided. For the PERC usage it is also important
        that a receiving endpoint can separate two different originating
        sources and map the SSRC to a human readable name (or alias). The
        important security related issue is that unless the originating RTP
        stream can be identified, the MDD could create one outgoing stream
        that selects packets from either of two originating streams. This may
        be challenging due to replay protection, but not impossible depending
        on how the sequence numbers and timestamps align. To avoid having
        multiple identifiers for the RTP packet stream, the design team has
        proposed that the SSRC shall be unique and the original value
        preserved to the receiving endpoint. <list style="empty">
            <t>Note: There was no agreement on how the uniqueness shall be
            ensured; this is for further discussion.</t>
          </list></t>

        <t>Even if the originating endpoints have unique SSRCs, it is not
        clear if the same requirement will be extended to the MDDs, in
        particular to media switching RTP mixers that have their own SSRCs.
        Thus translation of SSRCs as a method for dealing with SSRC collisions
        may need to be dealt with.</t>

        <t>The original SSRC needs to be authenticated end-to-end to prevent
        the splicing attack described above. The SSRC can't be confidentiality
        protected end-to-end as it is required by the MDD to know which
        packets are part of the same RTP stream. Note that for a media
        switching mixer, the SSRC field will not be the original one; instead
        that value is expected to be put in the CSRC field.</t>

        <t>The SSRC shall be authenticated hop-by-hop to prevent splicing or
        redirecting packets between incoming RTP streams. There would be
        benefits in confidentiality protecting the SSRC towards third parties,
        as it would make it more difficult for such an attacker to associate
        packets to different RTP streams when the originating endpoint sends
        more than one stream in the same transport flow.</t>
      </section>

      <section title="CSRC List">
        <t>The contributing source list contains the SSRC values of the RTP
        streams that contributed to the media content of this packet. In the
        PERC case, where the payload is end-to-end and not mixed in the
        middleboxes, the field is expected to contain a single value. This is
        for the case where the originating SSRC is moved into the CSRC field
        when the MDD acts as a media switching mixer. As discussed in <xref
        target="sec-cc"></xref> there could in theory be cases where an
        endpoint is performing mixes and thus needs to include multiple CSRC
        values, but that appears to contradict the security model.</t>

        <t>The MDD needs to be able to add the CSRC field when not present. As
        it populates it with the originating SSRC value, it simply moves the
        information from one place to another. Thus, the authentication and
        confidentiality requirements will be the same as for the SSRC field.
        End-to-end authentication of the CSRC value is performed when the
        field is present instead of the SSRC field. Here CSRC fields from an
        originating endpoint will be an issue that requires special
        consideration. End-to-end confidentiality is not possible, due to the
        MDD moving the field from the SSRC place.</t>
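
        <t>The move of the originating SSRC into the CSRC list can be
        sketched as follows. This is a non-normative Python illustration that
        assumes the RFC 3550 header layout and an incoming packet without a
        CSRC list; the function name switch_into_mixer_stream is hypothetical,
        and the sequence number and timestamp rewriting that a media switching
        mixer also performs is left out.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct


def switch_into_mixer_stream(packet: bytes, mixer_ssrc: int) -> bytes:
    """Move the originating SSRC into a one-entry CSRC list.

    Assumes CC = 0 on input. Anything after the fixed header (header
    extension, payload, padding) is carried over unchanged, since the
    CSRC list sits directly after the 12-octet fixed header.
    """
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-octet fixed header")
    b0, b1, seq, ts, original_ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 & 0x0F:
        raise ValueError("sketch only handles packets without CSRCs")
    b0 |= 0x01  # CC = 1
    header = struct.pack("!BBHIII", b0, b1, seq, ts,
                         mixer_ssrc, original_ssrc)
    return header + packet[12:]
```
]]></artwork>
        </figure>

        <t>The value that identified the stream in the SSRC field on ingress
        is the value found in the CSRC list on egress, which is why the two
        fields share the same protection requirements.</t>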

        <t>Hop-by-hop, the CSRC list shall be authenticated to prevent a third
        party from corrupting the field. Hop-by-hop confidentiality is
        recommended but not required.</t>
      </section>

      <section title="Header Extensions">
        <t>This section assumes that the RTP header extension is used
        following the mechanism in <xref target="RFC5285"></xref>. Thus, the
        header extension can contain multiple different extensions as agreed
        and identified according to signalling. Each header extension format
        in use is assigned an identifier that is agreed per endpoint and RTP
        session. As a result, the MDDs are likely to need to renumber them
        between ingress and egress if they forward the extension. In
        addition, a number of header extensions in use will be intended for
        and targeted to the MDD. When MDDs are cascaded they will likely need
        to forward the extension between themselves, and only remove them on
        the last leg towards the receiving endpoint.</t>

        <t>Which security properties are needed will be highly dependent
        on the header extension and its content. Therefore a number of
        header extensions are analysed in this section to determine if they
        contain material that needs end-to-end authentication or also
        end-to-end confidentiality.</t>

        <t>The current summary of the known information is the following. The
        MDD needs to modify the IDs and add or remove some header extensions.
        There are header extensions that really should use hop-by-hop
        confidentiality (see Audio levels), and all should have hop-by-hop
        authentication to prevent modification impacting the MDD's processing
        and forwarding decisions. The SMPTE time-code mapping, the
        Coordination of Video Orientation, the Region of Interest and the SDES
        information are all information from the originating endpoint intended
        for the receiving endpoint. The SDES information is likely also needed
        by the MDD. All of this information should be authenticated end-to-end
        to ensure that the MDD can't modify it. SMPTE time-codes, Coordination
        of Video Orientation (CVO) and Region of Interest (ROI) are all
        information that the MDD does not need to see to be able to perform
        its task of forwarding media appropriately. Thus end-to-end
        confidentiality is recommended for them.</t>

        <section title="Transmission Time offsets">
          <t>The <xref target="RFC5450">Transmission Time offsets</xref>
          header extension encodes the time of transmission of the RTP
          packet in relation to the RTP timestamp. Being directly related to
          the transmission of the whole RTP packet, it is non-sensitive
          information from a privacy and confidentiality aspect. It only
          provides more detailed information about the sequence in which a
          packet was actually sent, information that both the timestamp and
          sequence number provide.</t>

          <t>The authentication of this information can be valuable. However,
          as the MDD receives the packet and then potentially forwards it, the
          information has limited end-to-end value, and it is more appropriate
          for an MDD to rewrite this header when forwarding the packet to
          provide hop-by-hop transport information. Thus, hop-by-hop
          authentication is recommended.</t>
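
          <t>As a rough illustration of the hop-by-hop rewriting discussed
          above, the following Python sketch decodes and re-encodes the
          offset, assuming the 24-bit signed, network-order value in RTP
          timestamp units defined in RFC 5450. The function names and the
          rewrite_offset helper are hypothetical and not part of any
          specification; timestamp wrap-around is deliberately ignored.</t>

          <figure>
            <artwork><![CDATA[
```python
def decode_transmission_offset(data: bytes) -> int:
    """Decode the 24-bit signed offset (RTP timestamp units)."""
    if len(data) != 3:
        raise ValueError("transmission offset is exactly 3 octets")
    return int.from_bytes(data, "big", signed=True)


def encode_transmission_offset(offset: int) -> bytes:
    """Encode an offset as a 24-bit signed, network-order integer.

    Raises OverflowError if the offset does not fit in 24 bits.
    """
    return offset.to_bytes(3, "big", signed=True)


def rewrite_offset(rtp_timestamp: int, forward_time: int) -> bytes:
    """Hypothetical MDD helper: compute the offset for the next hop.

    Assumption of this sketch: forward_time is the MDD's own send
    instant expressed in the same RTP timestamp units, and no
    timestamp wrap-around occurs between the two values.
    """
    return encode_transmission_offset(forward_time - rtp_timestamp)
```
]]></artwork>
          </figure>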
        </section>

        <section title="SMPTE time-code mapping">
          <t>The <xref target="RFC5484">SMPTE time-code mapping</xref>
          provides SMPTE time codes associated with the RTP packet. This
          information is meta data to the media content in the payload.
          End-to-end authentication is recommended to ensure that the data is
          unmodified from the originating endpoint. The meta data may be
          privacy sensitive as it reveals information about the timeline of
          the content the receiver sees, including seeking in stored content
          provided into a conferencing context. There appears to be no reason
          why the MDD should have access to this, and end-to-end
          confidentiality is recommended.</t>

          <t>Hop-by-hop authentication is recommended, and hop-by-hop
          confidentiality should be applied if end-to-end confidentiality is
          not used.</t>
        </section>

        <section title="Synchronisation metadata">
          <t><xref target="RFC6051">Synchronisation metadata</xref> is a
          header extension that provides the NTP and RTP Timestamp information
          binding, just like in the RTCP Sender Report. This is information
          that an MDD may need to perform its work efficiently, especially
          when functioning as a media switching mixer. The information could
          be end-to-end authenticated to prevent the MDD from interfering with
          it, and if included by an originating endpoint it can be assumed
          that it is intended for any current receiver of this RTP stream. The
          information does not appear to be sensitive from a confidentiality
          perspective.</t>
        </section>

        <section title="Client to Mixer Audio Level">
          <t>The <xref target="RFC6464">Client-to-Mixer Audio Level
          Indication</xref> is a very interesting and problematic header
          extension. It contains the audio level of the audio included in the
          RTP packet. If that information is provided frequently enough, it
          may give an attacker a good chance of deducing what is being said
          <xref target="RFC6562"></xref>. It is also important meta data
          needed by an MDD if it is to perform RTP stream switching based on
          who is talking.</t>

          <t>This header may require end-to-end confidentiality in cases
          where the meta data is intended for the receiving endpoints only,
          and not the MDDs. In cases of cascaded MDDs it could potentially be
          of interest to have authentication of the origin, but with a method
          that the MDDs could verify, and which would allow the final MDD
          before a receiving endpoint to remove the header extension.</t>

          <t>The header shall be hop-by-hop confidentiality protected and
          authenticated.</t>
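
          <t>To make concrete what an on-path observer could read if this
          extension were sent in the clear, the following Python sketch
          parses the single data octet of the extension, assuming the layout
          from RFC 6464 (a voice activity flag in the most significant bit
          followed by a 7-bit level expressed in -dBov). The AudioLevel type
          and the function name are hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
from typing import NamedTuple


class AudioLevel(NamedTuple):
    voice_activity: bool  # V bit: packet is believed to contain voice
    level_dbov: int       # 0 (loudest) down to -127 (quietest)


def parse_audio_level(octet: int) -> AudioLevel:
    """Split the data octet into the V flag and the audio level."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("expected a single octet")
    voice_activity = bool(octet & 0x80)
    # The 7-bit field carries the magnitude of a non-positive dBov value.
    magnitude = octet & 0x7F
    return AudioLevel(voice_activity, -magnitude)
```
]]></artwork>
          </figure>

          <t>A sequence of such values, one per packet, is what both an MDD
          doing speaker-based switching and a third party attempting to infer
          speech content would work from, which is why the hop-by-hop
          protection matters here.</t>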
        </section>

        <section title="Mixer-to-client audio level">
          <t><xref target="RFC6465">Mixer-to-Client Audio Level
          Indication</xref> provides audio levels for individual
          contributing sources within an audio mix. As the PERC system does
          not support content mixing, this header does not appear
          relevant.</t>
        </section>

        <section title="Coordination of video orientation (CVO)">
          <t>The Coordination of video orientation (CVO) [3GPP TS 26.114,
          version 12.5.0] provides a receiver with meta data about a video
          stream indicating which direction in the video is "up", thus
          enabling the receiving endpoint to display the video content
          correctly oriented.</t>

          <t>This information is meta data about the media content itself. It
          does not appear to be information required by an MDD for its task.
          Changing the video orientation may in some cases completely change
          the meaning, e.g. a hand doing sign language. Therefore, this
          information should be end-to-end confidentiality protected as well
          as authenticated. Hop-by-hop authentication is recommended and
          confidentiality as well if not applied end-to-end.</t>
        </section>

        <section title="Region-of-interest (ROI)">
          <t>Region-of-interest (ROI) [3GPP TS 26.114, version 13.1.0] is a
          header extension informing the receiving endpoint that the video
          image it receives covers a particular sub-area of what was
          originally captured. There exist other protocol mechanisms to
          select the region of interest.</t>

          <t>This information is meta data about the media content itself. It
          does not appear to be information required by an MDD for its task.
          Therefore this information should be end-to-end confidentiality
          protected as well as authenticated. Hop-by-hop authentication is
          recommended and confidentiality as well if not applied
          end-to-end.</t>
        </section>

        <section title="SDES Information">
          <t>The SDES header extension is defined in <xref
          target="I-D.ietf-avtext-sdes-hdr-ext"></xref> and provides SDES
          CNAME and <xref
          target="I-D.ietf-mmusic-sdp-bundle-negotiation">MID</xref>
          information associated with the originating SSRC.</t>

          <t>The privacy sensitive nature of the CNAME depends on how it
          is generated. If generated with privacy in <xref
          target="RFC7022">mind</xref> then it will not need to be end-to-end
          confidentiality protected. If not, it may require end-to-end
          confidentiality. The MID values are references into SDP media
          descriptions and are not expected to be sensitive. This information
          is provided by the originating endpoint, and being able to trust it
          is highly valuable, thus it should be end-to-end authenticated, and
          preferably also be possible to validate by the MDD.</t>

          <t>The extension should be hop-by-hop authenticated to avoid
          wasting resources, and hop-by-hop confidentiality reduces the
          tracking possibilities for third parties.</t>
        </section>
      </section>

      <section title="Payload">
        <t>The payload is the payload format with the media content that is to
        be confidentiality protected end-to-end. Thus, the MDD shall not be
        able to modify it. It needs to be end-to-end confidentiality protected
        and authenticated. The payload should be hop-by-hop authenticated to
        prevent wasting downstream resources by forwarding a corrupt or
        modified payload. Hop-by-hop confidentiality is not strictly needed as
        it will be protected end-to-end. However, to help prevent tracking of
        how particular payloads are forwarded, it could be confidentiality
        protected also hop-by-hop.</t>
      </section>

      <section anchor="sec-padding-octets" title="Padding Octets">
        <t>The padding octets that come after the regular payload are often
        used to hide payload length variations when that is sensitive and
        could lead to a breach of the confidentiality of the content. Thus, it
        is important that the amount of padding can't be determined by either
        the MDD or any third party, and end-to-end confidentiality and
        authentication are necessary. Hop-by-hop authentication is recommended
        to prevent wasting resources on corrupt or modified padding.
        Hop-by-hop confidentiality is not necessary due to the end-to-end one,
        but would reduce tracking possibilities.</t>
      </section>
    </section>

    <section anchor="Summary" title="Summary">
      <t>The following table summarizes the information from the previous
      section. Legend:<list style="hanging">
          <t hangText="Yes:">Something is required to be done, or in the case
          of MDD modification needs to be possible.</t>

          <t hangText="No:">Something that is not to be done, nor needs to be
          done.</t>

          <t hangText="Rec:">Recommended to be done but not required.</t>

          <t hangText="May:">It can be done, but is neither recommended (Rec)
          nor required (Yes).</t>

          <t hangText="*:">Please see description in the section for specific
          considerations.</t>

          <t hangText="?:">Classification is more uncertain and needs further
          input.</t>
        </list></t>

      <texttable anchor="summaryTable" title="Summary of Handling Required">
        <ttcol align="center">Data</ttcol>

        <ttcol align="center">MDD Mod</ttcol>

        <ttcol align="center">Orig Needed</ttcol>

        <ttcol align="center">E2E Auth</ttcol>

        <ttcol align="center">E2E Conf</ttcol>

        <ttcol align="center">HBH Auth</ttcol>

        <ttcol align="center">HBH Conf</ttcol>

        <c>V</c>

        <c>No</c>

        <!-- MDD Modify -->

        <c>No</c>

        <!-- Orig Needed-->

        <c>No</c>

        <!-- E2E Auth-->

        <c>No</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>May</c>

        <!-- HBH Conf-->

        <c>P</c>

        <c>No</c>

        <!-- MDD Modify -->

        <c>Yes</c>

        <!-- Orig Needed-->

        <c>Yes</c>

        <!-- E2E Auth-->

        <c>May?</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Rec</c>

        <!-- HBH Conf-->

        <c>X</c>

        <c>Yes</c>

        <!-- MDD Modify -->

        <c>No</c>

        <!-- Orig Needed-->

        <c>No</c>

        <!-- E2E Auth-->

        <c>No</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Rec</c>

        <!-- HBH Conf-->

        <c>CC</c>

        <c>Yes</c>

        <!-- MDD Modify -->

        <c>No</c>

        <!-- Orig Needed-->

        <c>No</c>

        <!-- E2E Auth-->

        <c>No</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Rec</c>

        <!-- HBH Conf-->

        <c>M</c>

        <c>No</c>

        <!-- MDD Modify -->

        <c>Yes</c>

        <!-- Orig Needed-->

        <c>Yes</c>

        <!-- E2E Auth-->

        <c>Rec?</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Yes</c>

        <!-- HBH Conf-->

        <c>PT</c>

        <c>Yes</c>

        <!-- MDD Modify -->

        <c>Yes?</c>

        <!-- Orig Needed-->

        <c>Rec?</c>

        <!-- E2E Auth-->

        <c>No*</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Rec</c>

        <!-- HBH Conf-->

        <c>Seq No</c>

        <c>Yes</c>

        <!-- MDD Modify -->

        <c>Yes*</c>

        <!-- Orig Needed-->

        <c>Yes</c>

        <!-- E2E Auth-->

        <c>No</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Rec</c>

        <!-- HBH Conf-->

        <c>Timestamp</c>

        <c>Yes</c>

        <!-- MDD Modify -->

        <c>Yes?</c>

        <!-- Orig Needed-->

        <c>Yes?</c>

        <!-- E2E Auth-->

        <c>No</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Rec</c>

        <!-- HBH Conf-->

        <c>SSRC</c>

        <c>May</c>

        <!-- MDD Modify -->

        <c>Yes*</c>

        <!-- Orig Needed-->

        <c>Yes*</c>

        <!-- E2E Auth-->

        <c>No</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Rec</c>

        <!-- HBH Conf-->

        <c>CSRCs</c>

        <c>Yes</c>

        <!-- MDD Modify -->

        <c>Yes*</c>

        <!-- Orig Needed-->

        <c>Yes*</c>

        <!-- E2E Auth-->

        <c>No</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Rec</c>

        <!-- HBH Conf-->

        <c>Extensions</c>

        <c>Yes</c>

        <!-- MDD Modify -->

        <c>Some?</c>

        <!-- Orig Needed-->

        <c>Some?</c>

        <!-- E2E Auth-->

        <c>Some?</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>Some</c>

        <!-- HBH Conf-->

        <c>Payload</c>

        <c>No</c>

        <!-- MDD Modify -->

        <c>Yes</c>

        <!-- Orig Needed-->

        <c>Yes</c>

        <!-- E2E Auth-->

        <c>Yes</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>May?</c>

        <!-- HBH Conf-->

        <c>Padding</c>

        <c>No</c>

        <!-- MDD Modify -->

        <c>Yes</c>

        <!-- Orig Needed-->

        <c>Yes</c>

        <!-- E2E Auth-->

        <c>Yes</c>

        <!-- E2E Conf-->

        <c>Yes</c>

        <!--HBH Auth -->

        <c>May?</c>

        <!-- HBH Conf-->
      </texttable>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document makes no request of IANA.</t>

      <t>Note to RFC Editor: this section may be removed on publication as an
      RFC.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>The purpose of this document includes discussing the security issues
      around the information in the RTP header. That is covered above in the
      document. Worth noting are the differences in recommendation for
      hop-by-hop confidentiality compared to regular SRTP. In SRTP, the RTP
      header information is in clear text and only integrity protected, in
      order to allow third party monitors as well as to enable the use of
      IP/UDP/RTP header compressors.</t>

      <t>With the increased privacy concerns <xref
      target="RFC6973"></xref><xref target="RFC7258"></xref> and known attacks
      based on payload length analysis, it has become more important to
      consider confidentiality protection of the whole RTP header, and
      specifically of the X, CC, M and PT fields, as they reveal important
      information about the payload and its length. Based on this I recommend
      that we not only consider SRTP as an outer security layer to provide
      hop-by-hop confidentiality and integrity protection, but also methods
      that protect the whole RTP packet, like DTLS.</t>
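
      <t>As a rough illustration of what remains readable when the RTP header
      is only integrity protected, the following Python sketch extracts the
      fields of the first two header octets, assuming the fixed header layout
      of RFC 3550. The function name and the dictionary keys are
      hypothetical.</t>

      <figure>
        <artwork><![CDATA[
```python
def visible_header_bits(packet: bytes) -> dict:
    """Return the fields carried in the first two RTP header octets."""
    if len(packet) < 2:
        raise ValueError("need at least two octets of RTP header")
    first, second = packet[0], packet[1]
    return {
        "version": first >> 6,              # V, 2 bits
        "padding": bool(first & 0x20),      # P, 1 bit
        "extension": bool(first & 0x10),    # X, 1 bit
        "csrc_count": first & 0x0F,         # CC, 4 bits
        "marker": bool(second & 0x80),      # M, 1 bit
        "payload_type": second & 0x7F,      # PT, 7 bits
    }
```
]]></artwork>
      </figure>

      <t>An observer who can read these bits learns, for example, whether
      padding or extensions are present and which payload type is in use,
      without needing any key material.</t>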
    </section>

    <section title="Contributors">
      <t>Cullen Jennings contributed the initial version of the summary
      table.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>The author would like to thank John Mattsson for review comments.</t>
    </section>
  </middle>

  <back>
    <references title="Informative References">
      <?rfc include='reference.RFC.3550'?>

      <?rfc include='reference.RFC.5285'?>

      <?rfc include='reference.RFC.5450'?>

      <?rfc include='reference.RFC.5484'?>

      <?rfc include='reference.RFC.6051'?>

      <?rfc include='reference.RFC.6464'?>

      <?rfc include='reference.RFC.6465'?>

      <?rfc include='reference.RFC.6562'?>

      <?rfc include='reference.RFC.6973'?>

      <?rfc include='reference.RFC.7022'?>

      <?rfc include='reference.RFC.7258'?>

      <?rfc include='reference.I-D.ietf-avtcore-rtp-topologies-update'?>

      <?rfc include='reference.I-D.ietf-avtext-sdes-hdr-ext'?>

      <?rfc include='reference.I-D.ietf-mmusic-sdp-bundle-negotiation'?>
    </references>
  </back>
</rfc>