http://stupid.domain.name/ietf/

One document matched: draft-ietf-avtcore-rtp-security-options-01.xml
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info" docName="draft-ietf-avtcore-rtp-security-options-01"
     ipr="trust200902">
  <front>
    <title abbrev="Options for Securing RTP Sessions">Options for Securing RTP
    Sessions</title>

    <author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>Farogatan 6</street>

          <city>SE-164 80 Kista</city>

          <country>Sweden</country>
        </postal>

        <phone>+46 10 714 82 87</phone>

        <email>magnus.westerlund@ericsson.com</email>
      </address>
    </author>

    <author fullname="Colin Perkins" initials="C. " surname="Perkins">
      <organization>University of Glasgow</organization>

      <address>
        <postal>
          <street>School of Computing Science</street>

          <city>Glasgow</city>

          <code>G12 8QQ</code>

          <country>United Kingdom</country>
        </postal>

        <email>csp@csperkins.org</email>
      </address>
    </author>

    <date day="22" month="October" year="2012"/>

    <abstract>
      <t>The Real-time Transport Protocol (RTP) is used in a large number of
      different application domains and environments. This hetrogeneity
      implies that different security mechanisms are needed to provide
      services such as confidentiality, integrity and source authentication of
      RTP/RTCP packets suitable for the various environments. The range of
      solutions makes it difficult for RTP-based application developers to
      pick the most suitable mechanism. This document provides an overview of
      a number of security solutions for RTP, and gives guidance for
      developers on how to choose the appropriate security mechanism.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t><xref target="RFC3550">Real-time Transport Protocol (RTP)</xref> is
      widely used in a large variety of multimedia applications, including
      Voice over IP (VoIP), centralized multimedia conferencing, sensor data
      transport, and Internet television (IPTV) services. These applications
      can range from point-to-point phone calls, through centralised group
      teleconferences, to large-scale television distribution services. The
      types of media can vary significantly, as can the signalling methods
      used to establish the RTP sessions.</t>

      <t>This multi-dimensional heterogeneity has so far prevented development
      of a single security solution that meets the needs of the different
      applications. Instead significant number of different solutions have
      been developed to meet different sets of security goals. This makes it
      difficult for application developers to know what solutions exist, and
      whether their properties are appropriate. This memo gives an overview of
      the available RTP solutions, and provides guidance on their
      applicability for different application domains. The guidance provided
      is not exhaustive, and this memo does not provide normative
      recommendations.</t>

      <t>It is important that application developers consider the security
      goals and requirements for their application. The IETF considers it
      important that protocols implement, and makes available to the user,
      secure modes of operation <xref target="RFC3365"/>. Because of the
      heterogeneity of RTP applications and use cases, however, a single
      security solution cannot be mandated. Instead, application developers
      need to select mechanisms that provide appropriate security for their
      environment. It is strongly encouraged that common mechanisms are used
      by related applications in common environments. The IETF publishes
      guidelines for specific classes of applications, so it worth searching
      for such guidelines.</t>

      <t>The remainder of this document is structured as follows. <xref
      target="sec-background"/> provides additional background. <xref
      target="sec-options"/> outlines the available security mechanisms at the
      time of this writing, and lists their key security properties and
      constraints. That is followed by guidelines and important aspects to
      consider when securing an RTP application in <xref
      target="sec-applications"/>. Finally, we give some examples of
      application domains where guidelines for security exist in <xref
      target="sec-examples"/>.</t>
    </section>

    <section anchor="sec-background" title="Background">
      <t>RTP can be used in a wide variety of topologies, and combinations of
      topologies, due to it's support for unicast, multicast groups, and
      broadcast topologies, and the existence of different types of RTP
      middleboxes. In the following we review the different topologies
      supported by RTP to understand their implications for the security
      properties and trust relations that can exist in RTP sessions.</t>

      <section title="Point to Point Sessions">
        <t>The most basic use case is two directly connected end-points, shown
        in <xref target="fig-p2p"/>, where A has established an RTP session
        with B. In this case the RTP security is primarily about ensuring that
        any third party can't compromise the confidentiality and integrity of
        the media communication. This requires confidentiality protection of
        the RTP session, integrity protection of the RTP/RTCP packets, and
        source authentication of all the packets to ensure no
        man-in-the-middle attack is taking place.</t>

        <t>The source authentication can also be tied to a user or an
        end-points verifiable identity to ensure that the peer knows who they
        are communicating with. Here the combination of the security protocol
        protecting the RTP session and its RTP and RTCP traffic and the
        key-management protocol becomes important in which security statements
        one can do.</t>

        <figure align="center" anchor="fig-p2p"
                title="Point to Point Topology">
          <artwork><![CDATA[+---+         +---+
| A |<------->| B |
+---+         +---+
]]></artwork>
        </figure>

        <t/>
      </section>

      <section title="Sessions Using an RTP Mixer">
        <t>An RTP mixer is a an RTP session level middlebox that one can build
        an multi-party RTP based conference around. The RTP mixer might
        actually perform media mixing, like mixing audio or compositing video
        images into a new media stream being sent from the mixer to a given
        participant; or it might provide a conceptual stream, for example the
        video of the current active speaker. From a security point of view,
        the important featurs of an RTP mixer is that it generates a new media
        stream, and has its own source identifier, and does not simply forward
        the original media.</t>

        <t>An RTP session using a mixer might have a topology like that in
        <xref target="fig-mixer"/>. In this examples, participants A-D each
        send unicast RTP traffic between themselves and the RTP mixer, and
        receive a RTP stream from the mixer, comprising a mixture of the
        streams from the other participants.</t>

        <figure align="center" anchor="fig-mixer"
                title="Example RTP Mixer topology">
          <artwork><![CDATA[+---+      +------------+      +---+
| A |<---->|            |<---->| B |
+---+      |            |      +---+
           |    Mixer   |
+---+      |            |      +---+
| C |<---->|            |<---->| D |
+---+      +------------+      +---+
 ]]></artwork>
        </figure>

        <t>A consequence of an RTP mixer having its own source identifier, and
        acting as an active participant towards the other end-points, is that
        the RTP mixer needs to be a trusted device that is part of the
        security context(s) established. The RTP mixer can also become a
        security enforcing entity. For example, a common approach to secure
        the topology in <xref target="fig-mixer"/> is to establish a security
        context between the mixer and each participant independently, and have
        the mixer source authenticate each peer. The mixer then ensures that
        one participant cannot impersonsate another.</t>
      </section>

      <section title="Sessions Using an RTP Translator">
        <t>RTP translators are middleboxes that provide various levels of
        in-network media translation and transcoding. Their security
        properties vary widely, depending on which type of operations they
        attempt to perform. We identify three different categories of RTP
        translator: transport translators, gateways, and media transcoders. We
        discuss each in turn.</t>

        <section title="Transport Translator (Relay)">
          <t>A transport translator <xref target="RFC5117"/> operates on a
          level below RTP and RTCP. It relays the RTP/RTCP traffic from one
          end-point to one or more other addresses. This can be done based
          only on IP addresses and transport protocol ports, with each receive
          port on the translator can have a very basic list of where to
          forward traffic. Transport translators should also implement ingress
          filtering to prevent random traffic from being forwarded that isn't
          coming from a participant in the conference.</t>

          <t><xref target="fig-relay"/> shows an example transport translator,
          where traffic from any one of the four participants will be
          forwarded to the other three participants unchanged. The resulting
          topology is very similar to Any source Multicast (ASM) session (as
          discussed in <xref target="sec-asm"/>), but implemented at the
          application layer.</t>

          <figure align="center" anchor="fig-relay"
                  title="RTP relay translator topology">
            <artwork><![CDATA[+---+      +------------+      +---+
| A |<---->|            |<---->| B |
+---+      |    Relay   |      +---+
           | Translator |
+---+      |            |      +---+
| C |<---->|            |<---->| D |
+---+      +------------+      +---+]]></artwork>
          </figure>

          <t>A transport translator can often operate without needing to be in
          the security context, as long as the security mechanism does not
          provide protection over the transport-layer information. A transport
          translator does, however, make the group communication visible, and
          so can complicate keying and source authentication mechanisms. This
          is further discussed in <xref target="sec-asm"/>.</t>
        </section>

        <section anchor="sec-gateway" title="Gateway">
          <t>Gateways are deployed when the endpoints are not fully
          compatible. <xref target="fig-gateway"/> shows an example topology.
          The functions a gateway provides can be diverse, and range from
          transport layer relaying between two domains not allowing direct
          communication, via transport or media protocol function initiation
          or termination, to protocol or media encoding translation. The
          supported security protocol might even be one of the reasons a
          gateway is needed.</t>

          <figure align="center" anchor="fig-gateway"
                  title="RTP Gateway Topology">
            <artwork><![CDATA[
+---+      +-----------+      +---+
| A |<---->|  Gateway  |<---->| B |
+---+      +-----------+      +---+
 ]]></artwork>
          </figure>

          <t>The choice of security protocol and the details of the gateway
          function will determine if the gateway needs to be a trusted part of
          the application security context or not. Many gateways need to be
          trusted by all peers to perform the translation; in other cases some
          or all peers might not be aware of the presence of the gateway. The
          security protocols have different properties depending on the degree
          of trust and visibility needed. Ensuring communication is possible
          without trusting the gateway can be strong incentive for accepting
          different security properties. Some security solutions will be able
          to detect the gateways as manipulating the media stream, unless the
          gateway is a trusted device.</t>
        </section>

        <section anchor="sec-transcode" title="Media Transcoder">
          <t>A Media transcoder is a special type of gateway device that
          changes the encoding of the media being transported by RTP. The
          discussion in <xref target="sec-gateway"/> applies. A media
          transcoder alters the media data, and so almost certainly needs to
          be trusted device that is part of the security context.</t>
        </section>
      </section>

      <section anchor="sec-asm" title="Any Source Multicast">
        <t><xref target="RFC1112">Any Source Multicast</xref> is the original
        multicast model where any multicast group participant can send to the
        multicast group, and get their packets delivered to all group members
        (see <xref target="fig-asm"/>). This form of communication has
        interesting security properties, due to the many-to-many nature of the
        group. Source authentication is important, but all participants in the
        group security context will have access to the necessary secrets to
        decrypt and verify integrity of the traffic. Thus use of any symmetric
        security functions fails if the goal is to separate individual sources
        within the security context; alternate solutions are needed.</t>

        <figure align="center" anchor="fig-asm"
                title="Any Source Multicast Group">
          <artwork><![CDATA[
           +-----+
+---+     /       \    +---+
| A |----/         \---| B |
+---+   /   Multi-  \  +---+
       +    Cast     +
+---+   \  Network  /  +---+
| C |----\         /---| D |
+---+     \       /    +---+
           +-----+]]></artwork>
        </figure>

        <t>In addition the potential large size of multicast groups creates
        some considerations for the scalability of the solution and how the
        key-management is handled.</t>
      </section>

      <section title="Source-Specific Multicast">
        <t><xref target="RFC4607">Source Specific Multicast</xref> allows only
        a specific end-point to send traffic to the multicast group. That
        end-point is labelled the Distribution Source in <xref
        target="fig-ssm-session"/>. It distributes traffic from a number of
        RTP media sources, MS1 to MSm. <xref target="fig-ssm-session"/> also
        depicts the feedback part of the SSM <xref target="RFC5760">RTP
        session using unicast feedback</xref> from a number of receivers
        R1..Rn that sends feedback to a Feedback Target (FT) and eventually
        aggregated and distributed to the group.</t>

        <t>The use of SSM makes it more difficult to inject traffic into the
        multicast group, but not impossible. Source authentication
        requirements apply for SSM sessions too, and a non-symmetric
        verification of who sent the RTP and RTCP packets is needed.</t>

        <t>The SSM communication channel needs to be securely established and
        keyed. In addition one also have the individual unicast feedback that
        also needs to be secured.</t>

        <figure align="center" anchor="fig-ssm-session"
                title="SSM-based RTP session with Unicast Feedback">
          <artwork><![CDATA[
 +-----+  +-----+          +-----+
 | MS1 |  | MS2 |   ....   | MSm |
 +-----+  +-----+          +-----+
    ^        ^                ^
    |        |                |
    V        V                V
+---------------------------------+
|       Distribution Source       |
+--------+                        |
| FT Agg |                        |
+--------+------------------------+
  ^ ^           |
  :  .          |
  :   +...................+
  :             |          .
  :            / \          .
+------+      /   \       +-----+
| FT1  |<----+     +----->| FT2 |
+------+    /       \     +-----+
  ^  ^     /         \     ^  ^
  :  :    /           \    :  :
  :  :   /             \   :  :
  :  :  /               \  :  :
  :   ./\               /\.   :
  :   /. \             / .\   :
  :  V  . V           V .  V  :
 +----+ +----+     +----+ +----+
 | R1 | | R2 | ... |Rn-1| | Rn |
 +----+ +----+     +----+ +----+]]></artwork>
        </figure>

        <t/>
      </section>
    </section>

    <section anchor="sec-options" title="Security Options">
      <t>This section provides an overview of a number of currently defined
      security mechanisms that can be used with RTP.</t>

      <section title="Secure RTP">
        <t>The <xref target="RFC3711">Secure RTP (SRTP) protocol</xref> is one
        of the most commonly used mechanisms to provide confidentiality,
        integrity protection and source authentication for RTP. SRTP was
        developed with RTP header compression and third party monitors in
        mind. Thus the RTP header is not encrypted in RTP data packets, and
        the first 8 bytes of the first RTCP packet header in each compound
        RTCP packet are not encrypted. The entirity of RTP packets and
        compound RTCP packets are integrity protected. This allows RTP header
        compression to work, and lets third party monitors determine what RTP
        traffic flows exist based on the SSRC fields, but protects the
        sensitive content.</t>

        <t>The source authentication guarantees provided by SRTP are highly
        dependent on the cryptographic transform and key-management scheme
        used. In some cases all a receiver can determine is whether the
        packets come from someone in the group security context, and not what
        group member send the packets. Thus, the source authentication
        guarantees depend also on the session topology. Some cryptographic
        transform have stronger authentication properties which can guarantee
        a given source, even over a multi-party session, e.g. those based on
        TESLA <xref target="RFC4383"/>.</t>

        <t>SRTP can easily be extended with additional cryptographic
        transforms. At the time of this writing, the following transforms are
        defined or under definition:<list style="hanging">
            <t hangText="AES CM and HMAC-SHA-1:">AES Counter Mode encryption
            with 128 bits keys combined with 128 bits keyed HMAC-SHA1 using 80
            or 32 bits authentication tags are the default cryptographic
            transform which need to be supported. Defined in <xref
            target="RFC3711">SRTP</xref>.</t>

            <t hangText="AES-f8 and HMAC-SHA-1:">AES f8 mode encryption with
            128 bits keys combined with keyed HMAC-SHA1 using 80 or 32 bits
            authentication. Defined in <xref target="RFC3711">SRTP</xref>.</t>

            <t hangText="TESLA:">As a complement to the regular symmetric
            keyed authentication transforms, like HMAC-SHA1. The TESLA based
            authentication scheme can provide per-source authentication in
            some group communication scenarios. The downside is need for
            buffering the packets for a while before authenticity can be
            verified. The TESLA transform for SRTP is defined in <xref
            target="RFC4383"/>.</t>

            <t hangText="SEED:">An Korean national standard cryptographic
            transform that is defined to be used with SRTP in <xref
            target="RFC5669"/>. It has three modes, one using SHA-1
            authentication, one using Counter with CBC-MAC, and finally one
            using Galois Counter mode.</t>

            <t hangText="ARIA:">An <xref
            target="I-D.ietf-avtcore-aria-srtp">Korean block cipher</xref>,
            that supports 128, 192 and 256 bits keys. It also has three modes,
            Counter mode where combined with HMAC-SHA1 with 80 or 32 bits
            authentication tags, Counter mode with CBC-MAC and Galois Counter
            mode. It also defines a different key derivation function than the
            AES based.</t>

            <t hangText="AES-192 and AES-256:">cryptographic transforms for
            SRTP based on AES-192 and AES-256 counter mode encryption and
            160-bit keyed HMAC-SHA1 with 80 and 32 bits authentication tags
            for authentication. Thus providing 192 and 256 bits encryption
            keys and NSA Suite B included cryptographic transforms. Defined in
            <xref target="RFC6188"/>.</t>

            <t hangText="AES-GCM:">There is also ongoing work to define
            AES-GCM (Galois Counter Mode) and AES-CCM (Counter with CBC)
            authentication for AES-128 and AES-256. This authentication is
            included in the cipher text which becomes expanded with the length
            of the authentication tag instead of using the SRTP authentication
            tag. This is defined in <xref
            target="I-D.ietf-avtcore-srtp-aes-gcm"/>.</t>
          </list></t>

        <t><xref target="RFC4771"/> defines a variant of the authentication
        tag that enables a receiver to obtain the Roll over Counter for the
        RTP sequence number that is part of the Initialization vector (IV) for
        many cryptographic transforms. This enables quicker and easier options
        for joining a long lived secure RTP group, for example a broadcast
        session.</t>

        <t>RTP header extensions are in normally carried in the clear and only
        integrity protected in SRTP. This can be problematic in some cases, so
        <xref target="I-D.ietf-avtcore-srtp-encrypted-header-ext"/> defines an
        extension to also encrypt selected header extensions.</t>

        <t>SRTP does not contain an integrated key-management solution, and
        instead relies on an external key management protocol. There are
        several protocols that can be used. The following sections outline
        some popular schemes.</t>

        <section title="Key Management for SRTP: DTLS-SRTP">
          <t>A Datagram Transport Layer Security extension exists for
          establishing SRTP keys <xref target="RFC5763"/><xref
          target="RFC5764"/>. This extension provides secure key-exchange
          between two peers, including perfect forward secrecy and enabling
          binding strong identity verification to an end-point. The default
          key generation will generate a key that contains material
          contributed by both peers. The key-exchange happens in the media
          plane directly between the peers. The common key-exchange procedures
          will take two round trips assuming no losses. TLS resumption can be
          used when establishing additional media streams with the same peer,
          used reducing the setup time to one RTT.</t>

          <t>DTLS-SRTP key management can use the signalling protocol in three
          ways. First, to agree on using DTLS-SRTP for media security.
          Secondly, to determine the network location (address and port) where
          each side is running an DTLS listener to let the parts perform the
          key-management handshakes that generate the keys used by SRTP.
          Finally, to exchange hashes of each sides certificates to enable
          each side to verify that they have connected to the by signalling
          indicated port and not a man in the middle. That way enabling some
          binding between the key-exchange and the signalling. This usage is
          well defined for SIP/SDP in <xref target="RFC5763"/>, and in most
          cases can be adopted for use with other bi-directions signalling
          solutions.</t>
        </section>

        <section title="Key Management for SRTP: MIKEY">
          <t><xref target="RFC3830">Multimedia Internet Keying (MIKEY)</xref>
          is a keying protocol that has several modes with different
          properties. MIKEY can be used in point-to-point applications using
          SIP and RTSP (e.g., VoIP calls), but is also suitable for use in
          broadcast and multicast applications, and centralized group
          communications.</t>

          <t>MIKEY can establish multiple security contexts or cryptographic
          sessions with a single message. It is possible to use in scenarios
          where one entity generates the key and needs to distribute the key
          to a number of participants. The different modes and the resulting
          properties are highly dependent on the cryptographic method used to
          establish the Traffic Generation Key (TGK) that is used to derive
          the keys actually used by the security protocol, like SRTP.</t>

          <t>MIKEY has the following modes of operation:<list style="hanging">
              <t hangText="Pre-Shared Key:">Uses a pre-shared secret for
              symmetric key crypto used to secure a keying message carrying
              the already generated TGK. This system is the most efficient
              from the perspective of having small messages and processing
              demands.</t>

              <t hangText="Public Key encryption:">Uses a public key crypto to
              secure a keying message carrying the already generated TGK. This
              is more resource consuming but enables scalable systems. It does
              require a public key infrastructure to enable verification.</t>

              <t hangText="Diffie-Hellman:">Uses Diffie-Hellman key-agreement
              to generate the TGK, thus providing perfect forward secrecy. The
              downside is increased resource consumption in bandwidth and
              processing. This method can't be used to establish group keys as
              each pair of peers performing the MIKEY exchange will establish
              different keys.</t>

              <t hangText="HMAC-Authenticated Diffie-Hellman:"><xref
              target="RFC4650"/> defines a variant of the Diffie-Hellman
              exchange that uses a pre-shared key in a keyed HMAC to verify
              authenticity of the keying material instead of a digital
              signature as in the previous method. This method is still
              restricted to point-to-point usage.</t>

              <t hangText="RSA-R:"><xref target="RFC4738">MIKEY-RSA in Reverse
              mode</xref> is a variant of the public key method which doesn't
              rely on the initiator of the key-exchange knowing the responders
              certificate. This methods lets both the initiator and the
              responder to specify the TGK material depending on use case.
              Usage of this mode requires one round trip time.</t>

              <t hangText="TICKET:"><xref target="RFC6043"/> is a MIKEY
              extension using trusted centralized key management service and
              tickets, like Kerberos.</t>

              <t hangText="IBAKE:"><xref target="RFC6267"/> uses a key
              management services (KMS) but with lower demand on the KMS. If
              provides both perfect forward and backwards secrecy.</t>

              <t hangText="SAKKE:"><xref target="RFC6509"/> provides
              Sakai-Kasahara Key Encryption in MIKEY. Based on Identity based
              Public Key Cryptography to establish a shared secret value and
              certificate less signatures to provide source authentication. It
              features include simplex transmission, scalability, low-latency
              call setup, and support for secure deferred delivery.</t>
            </list></t>

          <t>MIKEY messages has several different defined transports. <xref
          target="RFC4567"/> defines how MIKEY messages can be embedded in
          general SDP for usage with the signalling protocols SIP, SAP and
          RTSP. There also exist an 3GPP defined usage of MIKEY that sends
          MIKEY messages directly over UDP to key the receivers of <xref
          target="3GPP.33.246">Multimedia Broadcast and Multicast Service
          (MBMS)</xref>.</t>

          <t>Based on the many choices it is important to consider the
          properties needed in ones solution and based on that evaluate which
          modes that are candidates for ones usage. More information on the
          applicability of the different MIKEY modes can be found in <xref
          target="RFC5197"/>.</t>
        </section>

        <section title="Key Management for SRTP: Security Descriptions">
          <t><xref target="RFC4568"/> provides a keying solution based on
          sending plain text keys in <xref target="RFC4566">SDP</xref>. It is
          primarily used with SIP and SDP Offer/Answer, and is well-defined in
          point to point sessions where each side declares its own unique key.
          Using Security Descriptions to establish group keys is less well
          defined, and can have security issues as the SSRC uniqurness
          property can't be guaranteed.</t>

          <t>Since keys are transported in plain text in SDP, they can easily
          be intercepted unless the SDP carrying protocol provides strong
          end-to-end confidentiality and authentication guarantees. This is
          not the common use of security descriptions with SIP, where instead
          hop by hop security is provided between signalling nodes using TLS.
          This still leaves the keying material sensitive to capture by the
          traversed signalling nodes. Thus in most cases the security
          properties of security description are weak.</t>
        </section>

        <section title="Key Management for SRTP: Encrypted Key Transport">
          <t><xref target="I-D.ietf-avtcore-srtp-ekt">Encrypted Key Transport
          (EKT)</xref> is an SRTP extension that enables group keying despite
          using a keying mechanism that can't support group keys, like
          DTLS-SRTP. It is designed for centralized conferencing, but can also
          be used in sessions where an end-points connect to a conference
          bridge or a gateway, and need to be provisioned with the keys each
          participant on the bridge or gateway uses to avoid decryption
          encryption cycles on the bridge or gateway.</t>

          <t>The mechanism is based on establishing an additional EKT key
          which everyone uses to protect their actual session key. The actual
          session key is sent in a expanded authentication tag to the other
          session participants. This key are only sent occasionally or
          periodically depending on use cases depending on what requirements
          exist for timely delivery or notification on when the key is needed
          by someone.</t>
        </section>

        <section title="Key Management for SRTP: Other systems">
          <t>There exist at least one additional SRTP key-management system,
          namely <xref target="RFC6189">ZRTP</xref>. This was a candidate for
          IETF standardization that wasn't chosen, and was published for
          information instead. Its properties are somewhat similar to
          DTLS.</t>

          <t>There might exist additional non-IETF defined solutions.</t>
        </section>
      </section>

      <section title="RTP Legacy Confidentiality">
        <t>Section 9 of the RTP standard <xref target="RFC3550"/> defines a
        DES or 3DES based encryption of RTP and RTCP packets. This mechanism
        is keyed using plain text keys in <xref target="RFC4566">SDP</xref>
        using the "k=" SDP field. This method of providing confidentiality has
        extremely weak security properties and is not to be used.</t>
      </section>

      <section title="IPsec">
        <t><xref target="RFC4301">IPsec</xref> can be used independent of mode
        to protect RTP and RTCP packets in transit from one network interface
        to another. This can be sufficient when the network interfaces have a
        direct relation, or in a secured environment where it can be
        controlled who can read the packets from those interfaces.</t>

        <t>The main concern with using IPsec to protect RTP traffic is that in
        most cases using a VPN approach that terminates the security
        association at some node prior to the RTP end-point leaves the traffic
        vulnerable to attack between the VPN termination node and the
        end-point. Thus usage of IPsec requires careful thought and design of
        its usage so that it really meets the security goals. A important
        question is how one ensure the IPsec terminating peer and the ultimate
        destination is the same.</t>

        <t>IPsec with RTP is more commonly used as security solution between
        central nodes in an infrastructure that exchanges many RTP sessions
        and media streams between the peers. The establishment of a secure
        tunnel between these peers minimizes the key-management overhead
        between these two boxes.</t>
      </section>

      <section title="DTLS">
        <t><xref target="RFC6347">Datagram Transport Layer Security (DTLS)
        </xref> can provide point to point security for RTP flows. The two
        peers would establish an DTLS association between each other,
        including the possibility to do certificate-based source
        authentication when establishing the association. All RTP and RTCP
        packets flowing will be protected by this DTLS association.</t>

        <t>Note: using DTLS is different to using DTLS-SRTP key management.
        DTLS-SRTP has the core key-management steps in common with DTLS, but
        DTLS-SRTP uses SRTP for the per packet security operations, while DTLS
        uses the normal datagram TLS data protection. When using DTLS, RTP and
        RTCP packets are completely encrypted with no headers in the clear,
        while DTLS-SRTP leaves the headers in the clear.</t>

        <t>DTLS can use similar techniques to those available for DTLS-SRTP to
        bind a signalling side agreement to communicate to the certificates
        used by the end-point when doing the DTLS handshake. This enables use
        without having a certificate based trust chain to a trusted
        certificate root.</t>
      </section>

      <section title="TLS over TCP">
        <t>When RTP is sent over <xref target="RFC4571">TCP</xref> it can also
        be sent over <xref target="RFC4572">TLS over TCP</xref>, using TLS to
        provide point to point security services. The security properties TLS
        provides are confidentiality, integrity protection and possible source
        authentication if the client or server certificates are verified and
        provide a usable identity. When used in multi-party scenarios using a
        central node for media distribution, the security provide is only
        between then central node and the peers, so the security properties
        for the whole session are dependent on what trust one can place in the
        central node.</t>
      </section>

      <section title="Payload-only Security Mechanisms">
        <t>Mechanisms have been defined that encrypt only the payload of the
        RTP packets, and leave the RTP headers and RTCP in the clear. There
        are several reasons why this might be appropriate, but a common
        rationale is to ensure that the content stored in RTP hint tracks in
        RTSP streaming servers has the media content in a protected format
        that cannot be read by the streaming server (this is mostly done in
        the context of Digital Rights Management). These approaches then uses
        a key-management solution between the rights provider and the
        consuming client to deliver the key used to protect the content,
        usually after the appropriate method for charging has happened, and do
        not include the media server in the security context. Such methods
        have several security weaknesses such the fact that the same key is
        handed out to a potentially large group of receiving clients,
        increasing the risk of a leak.</t>

        <t>Use of this type of solution can be of interest in environments
        that allow middleboxes to rewrite the RTP headers and select what
        streams that are delivered to an end-point (e.g., some types of
        centralised video conference systems). The advantage of encrypting and
        possibly integrity protecting the payload but not the headers is that
        the middlebox can't eavesdrop on the media content, but can still
        provide stream switching functionality. The downside of such a system
        is that it likely needs two levels of security: the payload level
        solution to provide confidentiality and source authentication, and a
        second layer with additional transport security ensuring source
        authentication and integrity of the RTP headers associated with the
        encrypted payloads. This can also results in the need to have two
        different key-management systems as the entity protecting the packets
        and payloads are different with different set of keys.</t>

        <t>The aspect of two tiers of security are present in ISMAcryp (see
        <xref target="sec-isma"/>) and the deprecated 3GPP Packet Based
        Streaming Service Annex.K <xref target="3GPP.23.234"/> solution.</t>

        <section anchor="sec-isma" title="ISMA Encryption and Authentication">
          <t>The Internet Streaming Media Alliance (ISMA) has defined <xref
          target="ISMACrypt2">ISMA Encryption and Authentication 2.0</xref>.
          This specification defines how one encrypts and packetizes the
          encrypted application data units (ADUs) in an RTP payload using the
          <xref target="RFC3640">MPEG-4 Generic payload format</xref>. The ADU
          types that are allowed are those that can be stored as elementary
          streams in an ISO Media File format based file. ISMAcryp uses SRTP
          for packet level integrity and source authentication from a
          streaming server to the receiver.</t>

          <t>Key-management for a ISMACryp based system can be achieved
          through <xref target="OMADRMv2">Open Mobile Alliance (OMA) Digital
          Rights Management 2.0</xref>, for example.</t>
        </section>
      </section>
    </section>

    <section anchor="sec-applications" title="Securing RTP Applications">
      <t>In the following we provide guidelines for how to choose appropriate
      security mechanisms for RTP applications.</t>

      <section title="Application Requirements">
        <t>This section discusses a number of application requirements that
        need be considered. An application designer choosing security
        solutions requires a good understanding of what level of security is
        needed and what behaviour they strive to achieve.</t>

        <section title="Confidentiality">
          <t>When it comes to confidentiality of an RTP session there are
          several aspects to consider:<list style="hanging">
              <t hangText="Probability of compromise:">When using encryption
              to provide media confidentiality, it is necessary to have some
              rough understanding of the security goal and how long one expect
              the protected content remain confidential. From that, one can
              determine what encryption algorithm is to be used from the set
              of available transforms.</t>

              <t hangText="Potential for other leakage:">RTP based security in
              most of its forms simply wraps RTP and RTCP packets into
              cryptographic containers. This commonly means that the size of
              the original RTP payload, and details of the RTP and RTCP
              headers, are visible to observers of the protected packet flow.
              This can provide information to those observers. A well
              documented case is the risk with variable bit-rate speech codecs
              that produce different sized packets based on the speech input
              <xref target="RFC6562"/>. Potential threats such as these need
              to be considered and, if they are significant, then restrictions
              will be needed on mode choices in the codec, or additional
              padding will need to be added to make all packets equal size and
              remove the informational leakage.</t>

              <t hangText="">Another case is RTP header extensions. If SRTP is
              used, header extensions are normally not protected by the
              security mechanism protecting the RTP payload. If the header
              extension carries information that is considered sensitive, then
              the application needs to be modified to ensure that mechanisms
              used to protect against such information leakage are
              employed.</t>

              <t hangText="Who has access:">When considering the
              confidentiality properties of a system, it is important to
              consider where the media handled in the clear. For example, if
              the system is based on an RTP mixer that needs the keys to
              decrypt the media, process, and repacketize it, then is the
              mixer providing the security guarantees expected by the other
              parts of the system? Furthermore, it is important to consider
              who has access to the keys, and are the keys stored or kept
              somewhere? The policies for the handling of the keys, and who
              can access the keys, need to be considered along with the
              confidentiality goals.</t>
            </list></t>

          <t>As can be seen the actual confidentiality level has likely more
          to do with the application's usage of centralized nodes, and the
          details of the key-management solution chosen, than with the actual
          choice of encryption algorithm (although, of course, the encryption
          algorithm needs to be chosen appropriately for the desired security
          level).</t>
        </section>

        <section title="Integrity">
          <t>Protection against modification of content by a third party, or
          due to errors in the network, is another factor to consider. The
          first aspect that one consider is what resilience one has against
          modifications to the content. This can affect what cryptographic
          algorithm is used, and the length of the integrity tags. However
          equally, important is to consider who is providing the integrity
          assertion, what is the source of the integrity tag, and what are the
          risks of modifications happening prior to that point where
          protection is applied? RTP applications that rely on central nodes
          need to consider if hop-by-hop integrity is acceptable, or if true
          end-to-end integrity protection is needed? Is it important to be
          able to tell if a middlebox has modified the data? There are some
          uses of RTP that require trusted middleboxes that can modify the
          data in a way that doesn't break integrity protection as seen by the
          receiver, for example local advertisment insertion in IPTV systems;
          there are also uses where it is essential that such in-network
          modification be detectable. RTP can support both, with appropriate
          choices of security mechanisms.</t>

          <t>Integrity of the data is commonly closely tied to the question of
          source authentication. That is, it becomes important to know who
          makes an integrity assertion for the data.</t>
        </section>

        <section title="Source Authentication">
          <t>Source authentication is about determining who sent a particular
          RTP or RTCP packet. It is normally closely tied with integrity,
          since you also want to ensure that what you received is what the
          claimed source really sent, so source authentication without
          integrity is not particularly useful. In similar way, although not
          as definitive, is that integrity without source authentication is
          also not particular useful: you need to know who claims this packet
          wasn't changed.</t>

          <t>Source authentication can be asserted in several different ways:
          <list style="hanging">
              <t hangText="Base level:">Using cryptographic mechanisms that
              give authentication with some type of key-management provides an
              implicit method for source authentication. Assuming that the
              mechanism has sufficient strength to not be circumvented in the
              time frame when you would accept the packet as valid, it is
              possible that assert the source authenticated statement that
              this message is most probably from someone that has the
              cryptographic key to this communication.</t>

              <t hangText="">What that assertion actually means is highly
              dependent on the application, and how it handles the keys. In an
              application where the key-handling is limited to two peers, this
              can form a basis for a trust relationship to the level that you
              can state as the traffic is authenticated and matching this
              particular context, it is coming either from me or from my peer
              (and I trust that neither has shared the key with anyone else).
              However, in a multi-party scenario where security contexts are
              shared among participants, most base-level authentication
              solution can't even assert that this packet is from the same
              source as the previous packet.</t>

              <t hangText="Binding the Source:">A step up in the assertion
              that can be done in base-level systems is to tie the signalling
              to the key-exchange. Here, the goal is to be at least be able to
              assert that the sender of the packets is the same entity that I
              have established the session with. How feasible this is depends
              on the properties of the key-management system used, the ability
              to tie the signalling to a particular peer, and what trust you
              place on the different nodes involved.</t>

              <t hangText="">For example, consider a point to point
              communication system that use DTLS-SRTP using self-signed
              certificates for key-management, and SIP for signalling. In such
              a system the end-points for the DTLS-SRTP handshake have
              securely established keys that are not visible to the signalling
              nodes. However, as the certificates used by DTLS is not bound to
              any PKI they can't be verified. Instead, hashes over the
              certificate are sent over the signalling path. If the signalling
              can be trusted not to collaborate on performaning a man in the
              middle attack by modifying the hashes, then the end-points can
              verify that they have reached the peer they are doing signalling
              with.</t>

              <t hangText="Using Identities:">If the applications have access
              to a system that can provide verifiable identities, then the
              source authentication can be bound to that identity. For
              example, in a point-to-point communication even symmetric key
              crypto, where the key-management can assert that the key has
              only been exchanged with a particular identity, can provide a
              strong assertion about who is sending the traffic.</t>

              <t hangText="">Note that all levels of the system much have
              matching capability to assert identity. Having the signalling
              assert that you include a particular identity in a multi-party
              communication session where the key-management systems establish
              keys in a way that one can assert that only the given identity
              has gotten the key. Using a authentication mechanism build on a
              group key and otherwise can't provide any assertion who sent the
              traffic than anyone that got the key provides no strong
              assertability on the media level than: Someone that has gotten
              the security context (key) sent this traffic.</t>
            </list></t>

          <t/>
        </section>

        <section title="Identity">
          <t>As seen in the previous section, having an identity provider
          system can benefit the applications by enabling them to do strong
          assertion between identity and the actual media source. Therefore,
          the need for identity needs to be considered. However, having
          identity systems might not be suitable for all types of application,
          since they require trusted infrastructure.</t>
        </section>

        <section title="Privacy">
          <t>RTP applications need to consider what privacy goals they have.
          As RTP applications communicate directly between peers in many
          cases, the IP addresses of any communication peer will be available.
          The main privacy concern with IP addresses is related to
          geographical location and the possibility to track a user of an
          end-point. The main way of avoid such concerns is the introduction
          of relay or centralized media mixers or forwarders that hides the
          address of a peer from any other peer. The security and trust placed
          in these relays obviously needs to be carefully considered.</t>

          <t>RTP itself can contribute to enabling a particular user to be
          tracked between communication sessions if the CNAME is generated
          according to the RTP specification in the form of user@host. Such
          RTCP CNAMEs are likely long term stable over multiple sessions,
          allowing tracking of users. This can be desirable for long-term
          fault tracking and diagnosis, but clearly has privacy implications.
          Instead cryptographically random ones could be used as defined by
          <xref target="I-D.rescorla-avtcore-random-cname">Random algorithm
          for RTP CNAME generation</xref>.</t>

          <t>If there exist privacy goals, these need to be considered, and
          the system designed with them in mind. In addition certain RTP
          features might have to be configured to safeguard privacy, or have
          requirements on how the implementation is done.</t>
        </section>
      </section>

      <section title="Application Structure">
        <t>When it comes to RTP security, the most appropriate solution is
        often highly dependent on the topology of the communication session.
        The signalling also impacts what information can be provided, and if
        this can be instance specific, or common for a group. In the end the
        key-management system will highly affect the security properties
        achieved by the application. At the same time, the communication
        structure of the application limits what key management methods are
        applicable. As different key-management have different requirements on
        underlying infrastructure it is important to take that aspect into
        consideration early in the design.</t>
      </section>

      <section title="Interoperability">
        <t>Few RTP applications exist as independent applications that never
        interoperate with anything else. Rather, they enable communication
        with a potentially large number of other systems. To minimize the
        number of security mechanisms that need to be implemented, it is
        important to consider if one can use the same security mechanisms as
        other applications. This can also reduce the problems of determining
        what security level is actually negotiated in a particular
        session.</t>

        <t>The desire to be interoperable can in some cases be in conflict
        with the security requirements determined for an application. To meet
        the security goals, it might be necessary to sacrifice
        interoperability. Alternatively, one can implement multiple security
        mechanisms, but then end up with an issue of ensuring that the user
        understands what it means to use a particular security level. In
        addition, the application can then become vulnerable to bid-down
        attack.</t>
      </section>
    </section>

    <section anchor="sec-examples" title="Examples">
      <t>In the following we describe a number of example security solutions
      for RTP using applications, services or frameworks. These examples are
      provided to show the choices that can be made. They are not normative
      recommendations for security.</t>

      <section title="Media Security for SIP-established Sessions using DTLS-SRTP">
        <t>The IETF evaluated media security for RTP sessions established
        using point-to-point SIP sessions in 2009. A number of requirements
        were determined, and based on those, the existing solutions for media
        security and especially the keying methods were analysed, and the
        resulting requirements and analysis were published in <xref
        target="RFC5479"/>. Based on this analysis, and the working group
        discussion, DTLS-SRTP was determined to be the best solution, and the
        specifications were finalized.</t>

        <t>The security solution for SIP using DTLS-SRTP is defined in the
        <xref target="RFC5763">Framework for Establishing a Secure Real-time
        Transport Protocol (SRTP) Security Context Using Datagram Transport
        Layer Security (DTLS)</xref>. On a high level it uses SIP with SDP
        offer/answer procedures to exchange the network addresses where the
        server end-point will have a DTLS-SRTP enable server running. The SIP
        signalling is also used to exchange the fingerprints of the
        certificate each end-point will use in the DTLS establishment process.
        When the signalling is sufficiently completed the DTLS-SRTP client
        performs DTLS handshakes and establishes SRTP session keys. The
        clients also verify the fingerprints of the certificates to verify
        that no man in the middle has inserted themselves into the
        exchange.</t>

        <t>At the basic level DTLS has a number of good security properties.
        For example, to enable a man in the middle someone in the signalling
        path needs to perform an active action and modify the signalling
        message. There also exist a solution that enables the fingerprints to
        be bound to identities established by the first proxy for each user
        <xref target="RFC4916"> </xref>. That reduces the number of nodes the
        connecting user UA has to trust to the first hop proxy, rather than
        the full signalling path.</t>
      </section>

      <section title="Media Security for WebRTC Sessions">
        <t><xref target="I-D.ietf-rtcweb-overview">Web Real-Time
        Communication</xref> is solution providing web-application with
        real-time media directly between browsers. The RTP transported
        real-time media is protected using a mandatory to use application of
        SRTP. The keying of SRTP is done using DTLS-SRTP. The security
        configuration is further defined in the <xref
        target="I-D.ietf-rtcweb-security-arch">WebRTC Security
        Architecture</xref>.</t>

        <t>The peers hash of their certificates are provided to a Javascript
        application that is part of a client server system providing
        rendezvous services for the ones a given peer wants to communicate
        with. Thus the handling of the hashes between the peers is not well
        defined. It becomes a matter of trust in the application. But unless
        the application and its server is intending to compromise the
        communication security they can provide a secure and integrity
        protected exchange of the certificate hashes thus preventing any
        man-in-the-middle (MITM) to insert itself in the key-exchange.</t>

        <t>The web application still have the possibility to insert a MITM.
        That unless one uses a Identity provider and the proposed <xref
        target="I-D.rescorla-rtcweb-generic-idp">identity solution</xref>. In
        this solution the Identity Provider which is a third party to the
        web-application signs the DTLS-SRTP hash combined with a statement on
        which user identity that has been used to sign the hash. The receiver
        of such a Identity assertion then independently verifies the user
        identity to ensure that it is the identity it intended to communicate
        and that the cryptographic assertion holds. That way a user can be
        certain that the application also can't perform an MITM and that way
        acquire the keys to the media communication.</t>

        <t>In the development of WebRTC there has also been high attention on
        privacy question. The main concerns that has been raised and are at
        all related to RTP are:<list style="hanging">
            <t hangText="Location Disclosure:">As ICE negotiation provides IP
            addresses and ports for the browser, this leaks location
            information in the signalling to the peer. To prevent this one can
            block the usage of any ICE candidate that isn't a relay candidate,
            i.e. where the IP and port provided belong to the service
            providers media traffic relay.</t>

            <t hangText="Prevent tracking between sessions:">RTP CNAMEs and
            DTLS-SRTP certificates is information that could possibly be
            re-used between session instances. Thus to prevent tracking the
            same information can't be re-used between different communication
            sessions.</t>
          </list></t>

        <t>Note: The above cases are focused on providing privacy towards
        other parties than the web service.</t>
      </section>

      <section title="3GPP Packet Based Streaming Service (PSS) ">
        <t>To be written:</t>
      </section>

      <section title="IPTV">
        <t>To be written:</t>
      </section>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document makes no request of IANA.</t>

      <t>Note to RFC Editor: this section can be removed on publication as an
      RFC.</t>
    </section>

    <section anchor="sec-security" title="Security Considerations">
      <t>This entire document is about security. Please read it.</t>
    </section>

    <section anchor="sec-ack" title="Acknowledgements">
      <t>We thank the IESG for their careful review of <xref
      target="I-D.ietf-avt-srtp-not-mandatory"/> which led to the writing of
      this memo.</t>
    </section>
  </middle>

  <back>
    <references title="Informative References">
      <?rfc include='reference.RFC.1112'?>

      <?rfc include='reference.RFC.3365'?>

      <?rfc include='reference.RFC.3550'?>

      <?rfc include='reference.RFC.3640'?>

      <?rfc include='reference.RFC.3711'?>

      <?rfc include='reference.RFC.3830'?>

      <?rfc include='reference.RFC.4301'?>

      <?rfc include='reference.RFC.4383'?>

      <?rfc include='reference.RFC.4566'?>

      <?rfc include='reference.RFC.4567'?>

      <?rfc include='reference.RFC.4568'?>

      <?rfc include='reference.RFC.4571'?>

      <?rfc include='reference.RFC.4572'?>

      <?rfc include='reference.RFC.4607'?>

      <?rfc include='reference.RFC.4650'?>

      <?rfc include='reference.RFC.4738'?>

      <?rfc include='reference.RFC.4771'?>

      <?rfc include='reference.RFC.4916'?>

      <?rfc include='reference.RFC.5117'?>

      <?rfc include='reference.RFC.5197'?>

      <?rfc include='reference.RFC.5479'?>

      <?rfc include='reference.RFC.5669'?>

      <?rfc include='reference.RFC.5760'?>

      <?rfc include='reference.RFC.5763'?>

      <?rfc include='reference.RFC.5764'?>

      <?rfc include='reference.RFC.6043'?>

      <?rfc include='reference.RFC.6188'?>

      <?rfc include='reference.RFC.6189'?>

      <?rfc include='reference.RFC.6267'?>

      <?rfc include='reference.RFC.6347'?>

      <?rfc include='reference.RFC.6509'?>

      <?rfc include='reference.RFC.6562'?>

      <?rfc include='reference.3GPP.33.246'?>

      <?rfc include='reference.I-D.ietf-avt-srtp-not-mandatory'?>

      <?rfc include='reference.I-D.ietf-avtcore-aria-srtp'?>

      <?rfc include='reference.I-D.ietf-avtcore-srtp-aes-gcm'?>

      <?rfc include='reference.I-D.ietf-avtcore-srtp-ekt'?>

      <?rfc include='reference.I-D.ietf-avtcore-srtp-encrypted-header-ext'?>

      <?rfc include='reference.I-D.ietf-rtcweb-overview'?>

      <?rfc include='reference.I-D.ietf-rtcweb-security-arch'?>

      <?rfc include='reference.I-D.rescorla-rtcweb-generic-idp'?>

      <?rfc include='reference.I-D.rescorla-avtcore-random-cname'?>

      <reference anchor="ISMACrypt2">
        <front>
          <title>ISMA Encryption and Authentication, Version 2.0 release
          version</title>

          <author fullname="Internet Streaming Media Alliance (ISMA)">
            <organization/>
          </author>

          <date month="November" year="2007"/>
        </front>

        <format target="http://www.mpegif.org/m4if/bod/ISMA/ISMA_E%26Aspec2.0.pdf"
                type="PDF"/>
      </reference>

      <reference anchor="OMADRMv2">
        <front>
          <title>OMA Digital Rights Management V2.0</title>

          <author fullname="Open Mobile Alliance">
            <organization>Open Mobile Alliance</organization>
          </author>

          <date day="23" month="July" year="2008"/>
        </front>

        <format target="http://www.openmobilealliance.org/technical/release_program/drm_v2_0.aspx"
                type="HTML"/>
      </reference>

      <reference anchor="3GPP.23.234">
        <front>
          <title>Technical Specification Group Services and System Aspects;
          Transparent end-to-end Packet-switched Streaming Service (PSS);
          Protocols and codecs</title>

          <author fullname="3GPP">
            <organization>3GPP</organization>
          </author>

          <date month="September" year="2009"/>
        </front>

        <seriesInfo name="3GPP TS" value="26.234 8.4.0"/>

        <format target="http://www.3gpp.org/ftp/Specs/html-info/23234.htm"
                type="HTML"/>
      </reference>
    </references>
  </back>
</rfc>
PAFTECH AB 2003-2026
2026-04-24 00:53:38