<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info" docName="draft-ietf-avtcore-rtp-security-options-10"
     ipr="trust200902">
  <front>
    <title abbrev="Options for Securing RTP Sessions">Options for Securing RTP
    Sessions</title>

    <author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>Farogatan 6</street>

          <city>SE-164 80 Kista</city>

          <country>Sweden</country>
        </postal>

        <phone>+46 10 714 82 87</phone>

        <email>magnus.westerlund@ericsson.com</email>
      </address>
    </author>

    <author fullname="Colin Perkins" initials="C. S." surname="Perkins">
      <organization>University of Glasgow</organization>

      <address>
        <postal>
          <street>School of Computing Science</street>

          <city>Glasgow</city>

          <code>G12 8QQ</code>

          <country>United Kingdom</country>
        </postal>

        <email>csp@csperkins.org</email>

        <uri>http://csperkins.org/</uri>
      </address>
    </author>

    <date day="15" month="January" year="2014"/>

    <abstract>
      <t>The Real-time Transport Protocol (RTP) is used in a large number of
      different application domains and environments. This heterogeneity
      implies that different security mechanisms are needed to provide
      services such as confidentiality, integrity and source authentication of
      RTP/RTCP packets suitable for the various environments. The range of
      solutions makes it difficult for RTP-based application developers to
      pick the most suitable mechanism. This document provides an overview of
      a number of security solutions for RTP, and gives guidance for
      developers on how to choose the appropriate security mechanism.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t><xref target="RFC3550">Real-time Transport Protocol (RTP)</xref> is
      widely used in a large variety of multimedia applications, including
      Voice over IP (VoIP), centralized multimedia conferencing, sensor data
      transport, and Internet television (IPTV) services. These applications
      can range from point-to-point phone calls, through centralised group
      teleconferences, to large-scale television distribution services. The
      types of media can vary significantly, as can the signalling methods
      used to establish the RTP sessions.</t>

      <t>This multi-dimensional heterogeneity has so far prevented development
      of a single security solution that meets the needs of the different
      applications. Instead, a significant number of different solutions have
      been developed to meet different sets of security goals. This makes it
      difficult for application developers to know what solutions exist, and
      whether their properties are appropriate. This memo gives an overview of
      the available RTP solutions, and provides guidance on their
      applicability for different application domains. It also attempts to
      provide an indication of actual and intended usage at the time of
      writing, as additional input to help with considerations such as
      interoperability and availability of implementations. The guidance
      provided is not
      exhaustive, and this memo does not provide normative
      recommendations.</t>

      <t>It is important that application developers consider the security
      goals and requirements for their application. The IETF considers it
      important that protocols implement secure modes of operation and make
      them available to users <xref target="RFC3365"/>. Because of the
      heterogeneity of RTP applications and use cases, however, a single
      security solution cannot be mandated <xref
      target="I-D.ietf-avt-srtp-not-mandatory"/>. Instead, application
      developers need to select mechanisms that provide appropriate security
      for their environment. It is strongly encouraged that common mechanisms
      are used by related applications in common environments. The IETF
      publishes guidelines for specific classes of applications, so it is
      worth searching for such guidelines.</t>

      <t>The remainder of this document is structured as follows. <xref
      target="sec-background"/> provides additional background. <xref
      target="sec-options"/> outlines the available security mechanisms at the
      time of this writing, and lists their key security properties and
      constraints. That is followed by guidelines and important aspects to
      consider when securing an RTP application in <xref
      target="sec-applications"/>. Finally, we give some examples of
      application domains where guidelines for security exist in <xref
      target="sec-examples"/>.</t>
    </section>

    <section anchor="sec-background" title="Background">
      <t>RTP can be used in a wide variety of topologies due to its support
      for point-to-point sessions, multicast groups, and other topologies
      built around different types of RTP middleboxes. In the following we
      review the different topologies supported by RTP to understand their
      implications for the security properties and trust relations that can
      exist in RTP sessions.</t>

      <section title="Point-to-Point Sessions">
        <t>The most basic use case is two directly connected end-points, shown
        in <xref target="fig-p2p"/>, where A has established an RTP session
        with B. In this case the RTP security is primarily about ensuring that
        no third party can compromise the confidentiality and integrity of
        the media communication. This requires confidentiality protection of
        the RTP session, integrity protection of the RTP/RTCP packets, and
        source authentication of all the packets to ensure no
        man-in-the-middle attack is taking place.</t>

        <t>The source authentication can also be tied to a user or an
        end-point's verifiable identity to ensure that the peer knows who they
        are communicating with. Here the combination of the security protocol
        protecting the RTP session (and hence the RTP and RTCP traffic) and
        the key-management protocol becomes important to determine what
        security claims can be made.</t>

        <figure align="center" anchor="fig-p2p"
                title="Point-to-point topology">
          <artwork><![CDATA[+---+         +---+
| A |<------->| B |
+---+         +---+
]]></artwork>
        </figure>

        <t/>
      </section>

      <section title="Sessions Using an RTP Mixer">
        <t>An RTP mixer is an RTP session-level middlebox around which one can
        build a multi-party RTP-based conference. The RTP mixer might actually
        perform media mixing, like mixing audio or compositing video images
        into a new media stream being sent from the mixer to a given
        participant; or it might provide a conceptual stream, for example the
        video of the current active speaker. From a security point of view,
        the important features of an RTP mixer are that it generates a new
        media stream, has its own source identifier, and does not simply
        forward the original media.</t>

        <t>An RTP session using a mixer might have a topology like that in
        <xref target="fig-mixer"/>. In this example, participants A through D
        each send unicast RTP traffic to the RTP mixer, and receive an RTP
        stream from the mixer, comprising a mixture of the streams from the
        other participants.</t>

        <figure align="center" anchor="fig-mixer"
                title="Example RTP mixer Topology">
          <artwork><![CDATA[+---+      +------------+      +---+
| A |<---->|            |<---->| B |
+---+      |            |      +---+
           |    Mixer   |
+---+      |            |      +---+
| C |<---->|            |<---->| D |
+---+      +------------+      +---+
 ]]></artwork>
        </figure>

        <t>A consequence of an RTP mixer having its own source identifier, and
        acting as an active participant towards the other end-points is that
        the RTP mixer needs to be a trusted device that has access to the
        security context(s) established. The RTP mixer can also become a
        security enforcing entity. For example, a common approach to secure
        the topology in <xref target="fig-mixer"/> is to establish a security
        context between the mixer and each participant independently, and have
        the mixer source authenticate each peer. The mixer then ensures that
        one participant cannot impersonate another.</t>
      </section>

      <section title="Sessions Using an RTP Translator">
        <t>RTP translators are middleboxes that provide various levels of
        in-network media translation and transcoding. Their security
        properties vary widely, depending on which type of operations they
        attempt to perform. We identify three different categories of RTP
        translator: transport translators, gateways, and media transcoders. We
        discuss each in turn.</t>

        <section title="Transport Translator (Relay)">
          <t>A transport translator <xref target="RFC5117"/> operates on a
          level below RTP and RTCP. It relays the RTP/RTCP traffic from one
          end-point to one or more other addresses. This can be done based
          only on IP addresses and transport protocol ports, with each receive
          port on the translator having a very basic list of where to
          forward traffic. Transport translators also need to implement
          ingress filtering, so that traffic that does not come from a
          participant in the conference is not forwarded.</t>
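
          <t>As an illustration only, the per-port forwarding list and the
          ingress filtering described above can be sketched in Python. The
          class name, the method name, and the documentation addresses are
          assumptions made for this sketch; they are not taken from any
          specification, and a real translator would also have to handle
          participants joining and leaving.</t>

          <figure>
            <artwork><![CDATA[
```python
from typing import List, Tuple

Address = Tuple[str, int]  # (IP address, UDP port)


class RelayPort:
    """One receive port of a hypothetical transport translator."""

    def __init__(self, participants: List[Address]) -> None:
        # The forwarding list doubles as the ingress filter: only
        # known participants are allowed to inject traffic.
        self.participants = list(participants)

    def forward_targets(self, source: Address) -> List[Address]:
        """Return the addresses a packet from `source` is relayed to."""
        if source not in self.participants:
            return []  # ingress filtering: drop unknown senders
        # Relay unchanged to every participant except the sender.
        return [p for p in self.participants if p != source]


relay = RelayPort([("192.0.2.1", 5004), ("192.0.2.2", 5004),
                   ("192.0.2.3", 5004), ("192.0.2.4", 5004)])
targets = relay.forward_targets(("192.0.2.1", 5004))
dropped = relay.forward_targets(("198.51.100.9", 5004))
```
]]></artwork>
          </figure>

          <t>The sketch decides purely on addresses and ports and never
          looks inside the packet, which is why such a relay can operate
          without access to the security context.</t>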

          <t><xref target="fig-relay"/> shows an example transport translator,
          where traffic from any one of the four participants will be
          forwarded to the other three participants unchanged. The resulting
          topology is very similar to an Any Source Multicast (ASM) session (as
          discussed in <xref target="sec-asm"/>), but implemented at the
          application layer.</t>

          <figure align="center" anchor="fig-relay"
                  title="RTP relay translator topology">
            <artwork><![CDATA[+---+      +------------+      +---+
| A |<---->|            |<---->| B |
+---+      |    Relay   |      +---+
           | Translator |
+---+      |            |      +---+
| C |<---->|            |<---->| D |
+---+      +------------+      +---+]]></artwork>
          </figure>

          <t>A transport translator can often operate without needing access
          to the security context, as long as the security mechanism does not
          provide protection over the transport-layer information. A transport
          translator does, however, make the group communication visible, and
          so can complicate keying and source authentication mechanisms. This
          is further discussed in <xref target="sec-asm"/>.</t>
        </section>

        <section anchor="sec-gateway" title="Gateway">
          <t>Gateways are deployed when the endpoints are not fully
          compatible. <xref target="fig-gateway"/> shows an example topology.
          The functions a gateway provides can be diverse, and range from
          transport layer relaying between two domains not allowing direct
          communication, via transport or media protocol function initiation
          or termination, to protocol or media encoding translation. The
          supported security protocol might even be one of the reasons a
          gateway is needed.</t>

          <figure align="center" anchor="fig-gateway"
                  title="RTP gateway topology">
            <artwork><![CDATA[
+---+      +-----------+      +---+
| A |<---->|  Gateway  |<---->| B |
+---+      +-----------+      +---+
 ]]></artwork>
          </figure>

          <t>The choice of security protocol, and the details of the gateway
          function, will determine if the gateway needs to be trusted with
          access to the application security context. Many gateways need to be
          trusted by all peers to perform the translation; in other cases some
          or all peers might not be aware of the presence of the gateway. The
          security protocols have different properties depending on the degree
          of trust and visibility needed. Ensuring communication is possible
          without trusting the gateway can be a strong incentive for accepting
          different security properties. Some security solutions will be able
          to detect a gateway manipulating the media stream, unless the
          gateway is a trusted device.</t>
        </section>

        <section anchor="sec-transcode" title="Media Transcoder">
          <t>A media transcoder is a special type of gateway device that
          changes the encoding of the media being transported by RTP. The
          discussion in <xref target="sec-gateway"/> applies. A media
          transcoder alters the media data, and thus needs to be trusted with
          access to the security context.</t>
        </section>
      </section>

      <section anchor="sec-asm" title="Any Source Multicast">
        <t><xref target="RFC1112">Any Source Multicast</xref> is the original
        multicast model where any multicast group participant can send to the
        multicast group, and get their packets delivered to all group members
        (see <xref target="fig-asm"/>). This form of communication has
        interesting security properties, due to the many-to-many nature of the
        group. Source authentication is important, but all participants with
        access to the group security context will have the necessary secrets
        to decrypt and verify the integrity of the traffic. Thus the use of
        any group security context fails if the goal is to separate
        individual sources; alternative solutions are needed.</t>

        <figure align="center" anchor="fig-asm"
                title="Any source multicast (ASM) group">
          <artwork><![CDATA[
           +-----+
+---+     /       \    +---+
| A |----/         \---| B |
+---+   /   Multi-  \  +---+
       +    Cast     +
+---+   \  Network  /  +---+
| C |----\         /---| D |
+---+     \       /    +---+
           +-----+]]></artwork>
        </figure>

        <t>In addition, the potentially large size of multicast groups raises
        considerations for the scalability of the solution and for how
        key management is handled.</t>
      </section>

      <section title="Source-Specific Multicast">
        <t><xref target="RFC4607">Source-Specific Multicast</xref> allows only
        a specific end-point to send traffic to the multicast group,
        irrespective of the number of RTP media sources. The end-point is
        known as the media Distribution Source. For an RTP session to function
        correctly with RTCP over an SSM session, extensions have been defined
        in <xref target="RFC5760"/>. <xref target="fig-ssm-session"/> shows a
        sample SSM-based RTP session where several media sources, MS1...MSm,
        all send media to a Distribution Source, which then forwards the media
        data to the SSM group for delivery to the receivers, R1...Rn, and the
        Feedback Targets, FT1...FTn. RTCP reception quality feedback is sent
        unicast from each receiver to one of the Feedback Targets. The
        feedback targets aggregate reception quality feedback and forward it
        upstream towards the distribution source. The distribution source
        forwards (possibly aggregated and summarised) reception feedback to
        the SSM group, and back to the original media sources. The feedback
        targets are also members of the SSM group and receive the media data,
        so they can send unicast repair data to the receivers in response to
        feedback if appropriate.</t>

        <figure align="center" anchor="fig-ssm-session"
                title="Example SSM-based RTP session with two feedback targets">
          <artwork><![CDATA[
 +-----+  +-----+          +-----+
 | MS1 |  | MS2 |   ....   | MSm |
 +-----+  +-----+          +-----+
    ^        ^                ^
    |        |                |
    V        V                V
+---------------------------------+
|       Distribution Source       |
+--------+                        |
| FT Agg |                        |
+--------+------------------------+
  ^ ^           |
  :  .          |
  :   +...................+
  :             |          .
  :            / \          .
+------+      /   \       +-----+
| FT1  |<----+     +----->| FT2 |
+------+    /       \     +-----+
  ^  ^     /         \     ^  ^
  :  :    /           \    :  :
  :  :   /             \   :  :
  :  :  /               \  :  :
  :   ./\               /\.   :
  :   /. \             / .\   :
  :  V  . V           V .  V  :
 +----+ +----+     +----+ +----+
 | R1 | | R2 | ... |Rn-1| | Rn |
 +----+ +----+     +----+ +----+]]></artwork>
        </figure>

        <t/>

        <t>The use of SSM makes it more difficult to inject traffic into the
        multicast group, but not impossible. Source authentication
        requirements apply for SSM sessions too, and an individual
        verification of who sent the RTP and RTCP packets is needed. An RTP
        session using SSM will have a group security context that includes the
        media sources, distribution source, feedback targets, and the
        receivers. Each has a different role and will be trusted to perform
        different actions. For example, the distribution source will need to
        authenticate the media sources to prevent unwanted traffic being
        distributed via the SSM group. Similarly, the receivers need to
        authenticate both the distribution source and their feedback target,
        to prevent injection attacks from malicious devices claiming to be
        feedback targets. An understanding of the trust relationships and
        group security context is needed between all components of the
        system.</t>
      </section>
    </section>

    <section anchor="sec-options" title="Security Options">
      <t>This section provides an overview of security requirements, and the
      current RTP security mechanisms that implement those requirements. This
      cannot be a complete survey, since new security mechanisms are defined
      regularly. The goal is to help application designers by reviewing the
      types of solutions that are available. This section uses a number of
      security-related terms, described in the Internet Security
      Glossary, Version 2 <xref target="RFC4949"/>.</t>

      <section title="Secure RTP">
        <t>The <xref target="RFC3711">Secure RTP (SRTP) protocol</xref> is one
        of the most commonly used mechanisms to provide confidentiality,
        integrity protection, source authentication and replay protection for
        RTP. SRTP was developed with RTP header compression and third party
        monitors in mind. Thus the RTP header is not encrypted in RTP data
        packets, and the first 8 bytes of the first RTCP packet header in each
        compound RTCP packet are not encrypted. The entirety of RTP packets
        and compound RTCP packets are integrity protected. This allows RTP
        header compression to work, and lets third party monitors determine
        what RTP traffic flows exist based on the SSRC fields, but protects
        the sensitive content.</t>
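
        <t>To make the split between cleartext and encrypted parts concrete,
        the following Python sketch computes how many leading bytes of an RTP
        packet SRTP would leave unencrypted, using the RTP header layout from
        <xref target="RFC3550"/> (12 fixed bytes, 4 bytes per CSRC, and an
        optional header extension). It operates on a packet before SRTP
        protection is applied. The function names are hypothetical, and the
        sketch performs no encryption itself.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct
from typing import Tuple

# Per the text above: the first 8 bytes of the first RTCP packet
# header in a compound packet stay unencrypted.
RTCP_CLEARTEXT_BYTES = 8


def rtp_cleartext_length(packet: bytes) -> int:
    """Length of the RTP header, which SRTP leaves unencrypted."""
    if len(packet) < 12:
        raise ValueError("too short to be an RTP packet")
    csrc_count = packet[0] & 0x0F          # CC field
    has_extension = bool(packet[0] & 0x10)  # X bit
    length = 12 + 4 * csrc_count
    if has_extension:
        if len(packet) < length + 4:
            raise ValueError("truncated header extension")
        # Extension header: 16-bit profile id, then a 16-bit count
        # of 32-bit words that follow.
        _, ext_words = struct.unpack_from("!HH", packet, length)
        length += 4 + 4 * ext_words
    if len(packet) < length:
        raise ValueError("truncated RTP header")
    return length


def split_rtp(packet: bytes) -> Tuple[bytes, bytes]:
    """Return (cleartext header, payload that would be encrypted)."""
    n = rtp_cleartext_length(packet)
    return packet[:n], packet[n:]


def split_rtcp(compound: bytes) -> Tuple[bytes, bytes]:
    """Return (cleartext prefix, remainder that would be encrypted)."""
    return compound[:RTCP_CLEARTEXT_BYTES], compound[RTCP_CLEARTEXT_BYTES:]


# Example: version 2, two CSRC entries, no header extension.
example = (bytes([0x82, 0x00]) + struct.pack("!HII", 1, 0, 0x11223344)
           + struct.pack("!II", 1, 2) + b"media")
clear, protected = split_rtp(example)
```
]]></artwork>
        </figure>

        <t>In the example, the 20-byte header (including the SSRC and CSRC
        fields) remains visible to header compressors and monitors, while
        only the media payload would be encrypted.</t>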

        <t>SRTP works with transforms where different combinations of
        encryption algorithm, authentication algorithm, and pseudo-random
        function can be used, and the authentication tag length can be set to
        any value. SRTP can also be easily extended with additional
        cryptographic transforms. This gives flexibility, but requires more
        security knowledge by the application developer. To simplify things,
        SDP Security Descriptions (see <xref target="sdescription"/>) and
        DTLS-SRTP (see <xref target="dtls-srtp"/>) use pre-defined
        combinations of transforms, known as SRTP crypto suites and SRTP
        protection profiles, that bundle together transforms and other
        parameters, making them easier to use but reducing flexibility. The
        MIKEY protocol (see <xref target="mikey"/>) provides flexibility to
        negotiate the full selection of transforms. At the time of this
        writing, the following transforms, SRTP crypto suites, and SRTP
        protection profiles are defined or under definition:<list
            style="hanging">
            <t hangText="AES-CM and HMAC-SHA-1:">AES Counter Mode encryption
            with 128-bit keys combined with 160-bit keyed HMAC-SHA-1 with an
            80-bit authentication tag. This is the default cryptographic
            transform that needs to be supported. The transforms are defined
            in <xref target="RFC3711">SRTP</xref>, with the corresponding SRTP
            crypto suite in <xref target="RFC4568"/> and SRTP protection
            profile in <xref target="RFC5764"/>.</t>

            <t hangText="AES-f8 and HMAC-SHA-1:">AES f8 mode encryption using
            128-bit keys combined with keyed HMAC-SHA-1 using an 80-bit
            authentication tag. The transforms are defined in <xref
            target="RFC3711"/>, with the corresponding SRTP crypto suite in
            <xref target="RFC4568"/>. The corresponding SRTP protection
            profile is not defined.</t>

            <t hangText="SEED:">A Korean national standard cryptographic
            transform that is defined to be used with SRTP in <xref
            target="RFC5669"/>. Three options are defined, one using SHA-1
            authentication, one using Counter mode with CBC-MAC, and finally
            one using Galois Counter mode.</t>

            <t hangText="ARIA:">A <xref
            target="I-D.ietf-avtcore-aria-srtp">Korean block cipher</xref>
            that supports 128-, 192-, and 256-bit keys. Three options are
            defined: Counter mode combined with HMAC-SHA-1 with 80- or 32-bit
            authentication tags, Counter mode with CBC-MAC, and Galois
            Counter mode. It also defines a different key derivation function
            than the AES-based systems.</t>

            <t hangText="AES-192-CM and AES-256-CM:">Cryptographic transforms
            for SRTP based on AES-192 and AES-256 counter mode encryption and
            160-bit keyed HMAC-SHA-1 with 80- and 32-bit authentication tags.
            These provide 192- and 256-bit encryption keys, but otherwise
            match the default 128-bit AES-CM transform. The transforms are
            defined in <xref target="RFC3711"/> and <xref target="RFC6188"/>,
            with the SRTP crypto suites in <xref target="RFC6188"/>.</t>

            <t hangText="AES-GCM and AES-CCM:">AES Galois Counter Mode and AES
            Counter with CBC MAC for AES-128 and AES-256. With these modes,
            the authentication tag is included in the ciphertext, which is
            expanded by the length of the tag, instead of using the separate
            SRTP authentication tag. This is defined in <xref
            target="I-D.ietf-avtcore-srtp-aes-gcm"/>.</t>

            <t hangText="NULL:">SRTP <xref target="RFC3711"/> also provides a
            NULL cipher that can be used when no confidentiality for RTP/RTCP
            is requested. The corresponding SRTP protection profile is defined
            in <xref target="RFC5764"/>.</t>
          </list></t>

        <t>The source authentication guarantees provided by SRTP depend on the
        cryptographic transform and key-management used. Some transforms give
        strong source authentication even in multiparty sessions; others give
        weaker guarantees and can authenticate group membership but not
        sources. TESLA <xref target="RFC4383"/> offers a complement to the
        regular symmetric keyed authentication transforms, like HMAC-SHA-1,
        and can provide per-source authentication in some group communication
        scenarios. The downside is the need to buffer packets for a while
        before their authenticity can be verified.</t>

        <t><xref target="RFC4771"/> defines a variant of the authentication
        tag that enables a receiver to obtain the Roll-over Counter (ROC) for
        the RTP sequence number; the ROC is part of the Initialization Vector
        (IV) for many cryptographic transforms. This enables quicker and
        easier options for joining a long-lived secure RTP group, for example
        a broadcast session.</t>
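
        <t>The reason a late joiner needs the roll-over counter can be seen
        from how a receiver derives the SRTP packet index. The Python sketch
        below is modelled on the index estimation described in <xref
        target="RFC3711"/>, where the index is the 16-bit RTP sequence number
        extended by the roll-over counter. The function and parameter names
        are chosen for this sketch; s_l denotes the highest sequence number
        received so far.</t>

        <figure>
          <artwork><![CDATA[
```python
def estimate_index(seq: int, s_l: int, roc: int) -> int:
    """Estimate the packet index for a packet with sequence number `seq`.

    `s_l` is the highest sequence number seen so far and `roc` is the
    receiver's current roll-over counter.
    """
    if s_l < 32768:
        # A much larger seq means a late packet from before the wrap.
        if seq - s_l > 32768:
            v = (roc - 1) % 2**32
        else:
            v = roc
    else:
        # A much smaller seq means the sequence number has wrapped.
        if s_l - 32768 > seq:
            v = (roc + 1) % 2**32
        else:
            v = roc
    return seq + v * 65536


wrapped = estimate_index(seq=3, s_l=65530, roc=0)
late = estimate_index(seq=65530, s_l=5, roc=1)
in_order = estimate_index(seq=100, s_l=90, roc=7)
```
]]></artwork>
        </figure>

        <t>A receiver that joins a session in progress has no reliable value
        for the counter and would therefore derive the wrong index; carrying
        the counter in the authentication tag removes that guesswork.</t>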

        <t>RTP header extensions are normally carried in the clear and only
        integrity protected in SRTP. This can be problematic in some cases, so
        <xref target="RFC6904"/> defines an extension to also encrypt selected
        header extensions.</t>

        <t>SRTP is specified and deployed in a number of RTP usage contexts:
        there is significant support in SIP-established VoIP clients,
        including IMS, and in <xref
        target="I-D.ietf-mmusic-rfc2326bis">RTSP</xref> and RTP-based media
        streaming. Thus SRTP in general is widely deployed. When it comes to
        cryptographic transforms, the default (AES-CM and HMAC-SHA-1) is the
        most commonly used, but it might be expected that AES-GCM,
        AES-192-CM, and AES-256-CM will gain usage in future, especially due
        to the AES- and GCM-specific instructions in new CPUs.</t>

        <t>SRTP does not contain an integrated key-management solution, and
        instead relies on an external key management protocol. There are
        several protocols that can be used. The following sections outline
        some popular schemes.</t>

        <section anchor="dtls-srtp" title="Key Management for SRTP: DTLS-SRTP">
          <t>A Datagram Transport Layer Security extension exists for
          establishing SRTP keys <xref target="RFC5763"/><xref
          target="RFC5764"/>. This extension provides secure key-exchange
          between two peers, enabling Perfect Forward Secrecy (PFS) and
          binding strong identity verification to an end-point. Perfect
          Forward Secrecy is a property of the key-agreement protocol that
          ensures that a session key derived from a set of long-term keys will
          not be compromised if one of the long-term keys is compromised in
          the future. The default key generation will generate a key that
          contains material contributed by both peers. The key-exchange
          happens in the media plane directly between the peers. The common
          key-exchange procedures will take two round trips assuming no
          losses. TLS resumption can be used when establishing additional
          media streams with the same peer, and reduces the set-up time to one
          RTT for these streams (see <xref target="RFC5764"/> for a discussion
          of TLS resumption in this context).</t>

          <t>The actual security properties of an established SRTP session
          using DTLS will depend on the cipher suites offered and used, as
          well as the mechanism for identifying the end-points of the
          hand-shake. For example, some cipher suites provide PFS, while others
          do not. When using DTLS, the application designer needs to select
          which cipher suites DTLS-SRTP can offer and accept so that the
          desired security properties are achieved. The next choice is how to
          verify the identity of the peer end-point. One choice can be to rely
          on the certificates and use a PKI to verify them to make an identity
          assertion. However, this is not the most common approach; instead,
          self-signed certificates are commonly used, with trust established
          through signalling or other third-party solutions.</t>

          <t>DTLS-SRTP key management can use the signalling protocol in four
          ways. First, to agree on using DTLS-SRTP for media security.
          Secondly, to determine the network location (address and port) where
          each side is running a DTLS listener, to let the parties perform the
          key-management handshakes that generate the keys used by SRTP.
          Thirdly, to exchange hashes of each side's certificates to bind
          these to the signalling, and ensure there is no man-in-the-middle
          attack. This assumes that one can trust the signalling solution to
          be resistant to modification, and not be in collaboration with an
          attacker. Finally, to provide an assertable identity, e.g., <xref
          target="RFC4474"/>, that can be used to prevent modification of the
          signalling and the exchange of certificate hashes, thereby
          binding the key-exchange to the signalling.</t>
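
          <t>The third use, binding certificates to the signalling, amounts
          to comparing a hash received in the signalling with a hash of the
          certificate presented in the DTLS handshake. The following Python
          sketch illustrates that comparison. It assumes SHA-256 as the hash
          function and a textual form of colon-separated uppercase
          hexadecimal bytes prefixed by the hash name; the function names
          are hypothetical, and certificate parsing and the DTLS handshake
          itself are out of scope.</t>

          <figure>
            <artwork><![CDATA[
```python
import hashlib
import hmac


def certificate_fingerprint(cert_der: bytes) -> str:
    """Hash a DER-encoded certificate into a textual fingerprint."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    pairs = [digest[i:i + 2] for i in range(0, len(digest), 2)]
    return "sha-256 " + ":".join(pairs)


def fingerprint_matches(signalled: str, cert_der: bytes) -> bool:
    """Check the signalled fingerprint against the handshake certificate."""
    expected = certificate_fingerprint(cert_der).upper()
    received = signalled.strip().upper()
    # Constant-time comparison of the two textual fingerprints.
    return hmac.compare_digest(received.encode(), expected.encode())


# Placeholder bytes standing in for a real DER-encoded certificate.
cert = b"placeholder certificate bytes"
offered = certificate_fingerprint(cert)
accepted = fingerprint_matches(offered, cert)
rejected = fingerprint_matches(offered, cert + b"tampered")
```
]]></artwork>
          </figure>

          <t>The check is only as strong as the signalling path that carried
          the hash: an attacker able to rewrite the signalling could
          substitute the hash of its own certificate.</t>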

          <t>This usage is well defined for SIP/SDP in <xref
          target="RFC5763"/>, and in most cases can be adopted for use with
          other bi-directional signalling solutions. It is to be noted that
          there is work underway to revisit the <xref target="RFC4474">SIP
          Identity mechanism</xref> in the IETF STIR working group.</t>

          <t>The main question regarding DTLS-SRTP's security properties is
          how one verifies any peer identity or at least prevents
          man-in-the-middle attacks. This requires trust in some party
          external to DTLS-SRTP: a PKI, a signalling system, or some
          identity provider.</t>

          <t>DTLS-SRTP usage is clearly on the rise. It is mandatory to
          support in WebRTC, and it has growing support among SIP end-points.
          DTLS-SRTP was developed in the IETF primarily to meet security
          requirements for RTP-based media established using SIP. The
          requirements considered can be reviewed in <xref
          target="RFC5479">"Requirements and Analysis of Media Security
          Management Protocols"</xref>.</t>
        </section>

        <section anchor="mikey" title="Key Management for SRTP: MIKEY">
          <t><xref target="RFC3830">Multimedia Internet Keying (MIKEY)</xref>
          is a keying protocol that has several modes with different
          properties. MIKEY can be used in point-to-point applications using
          SIP and RTSP (e.g., VoIP calls), but is also suitable for use in
          broadcast and multicast applications, and centralized group
          communications.</t>

          <t>MIKEY can establish multiple security contexts or cryptographic
          sessions with a single message. It is useable in scenarios where one
          entity generates the key and needs to distribute the key to a number
          of participants. The different modes and the resulting properties
          are highly dependent on the cryptographic method used to establish
          the session keys actually used by the security protocol, like
          SRTP.</t>

          <t>MIKEY has the following modes of operation:<list style="hanging">
              <t hangText="Pre-Shared Key:">Uses a pre-shared secret with
              symmetric key cryptography to secure a keying message carrying
              the already generated session key. This mode is the most
              efficient in terms of message size and processing demands. The
              downside is scalability, since the effort of provisioning
              pre-shared keys is usually manageable only if the number of
              endpoints is small.</t>

              <t hangText="Public Key encryption:">Uses public key
              cryptography to secure a keying message carrying the
              already-generated session key. This is more resource intensive
              but enables scalable systems. It does require a public key
              infrastructure to enable verification.</t>

              <t hangText="Diffie-Hellman:">Uses Diffie-Hellman key-agreement
              to generate the session key, thus providing perfect forward
              secrecy. The downside is high resource consumption in bandwidth
              and processing during the MIKEY exchange. This method can't be
              used to establish group keys as each pair of peers performing
              the MIKEY exchange will establish different keys.</t>

              <t hangText="HMAC-Authenticated Diffie-Hellman:"><xref
              target="RFC4650"/> defines a variant of the Diffie-Hellman
              exchange that uses a pre-shared key in a keyed HMAC to verify
              authenticity of the keying material instead of a digital
              signature as in the previous method. This method is still
              restricted to point-to-point usage.</t>

              <t hangText="RSA-R:"><xref target="RFC4738">MIKEY-RSA in Reverse
              mode</xref> is a variant of the public key method which doesn't
              rely on the initiator of the key-exchange knowing the
              responder's certificate. This method lets either the initiator
              or the responder specify the session keying material, depending
              on the use case. Usage of this mode requires one round-trip
              time.</t>

              <t hangText="TICKET:"><xref target="RFC6043"/> is a MIKEY
              extension using a trusted centralized key management service
              (KMS). The Initiator and Responder do not share any credentials;
              instead, they trust a third party, the KMS, with which they both
              have or can establish shared credentials.</t>

              <t hangText="IBAKE:"><xref target="RFC6267"/> uses a key
              management service (KMS) infrastructure but with lower demand
              on the KMS. It claims to provide both perfect forward and
              backward secrecy.</t>

              <t hangText="SAKKE:"><xref target="RFC6509"/> provides
              Sakai-Kasahara Key Encryption in MIKEY. It is based on
              identity-based public key cryptography and a KMS infrastructure
              to establish a shared secret value, and on certificateless
              signatures to provide source authentication. Its features
              include simplex transmission, scalability, low-latency call
              set-up, and support for secure deferred delivery.</t>
            </list></t>
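
          <figure>
            <preamble>The point-to-point restriction of the Diffie-Hellman
            modes can be illustrated with a toy sketch. It is not MIKEY
            message processing: the modulus, the generator, and the fixed
            private values are assumptions chosen only to keep the example
            small, and they are far too weak for real use. The sketch shows
            that each pair of peers derives its own key, so no common group
            key results.</preamble>

            <artwork><![CDATA[
```python
# Toy Diffie-Hellman, for illustration only. The modulus, generator and
# fixed private values are assumptions and are not secure parameters.
P = 2**127 - 1   # a Mersenne prime used as a toy modulus
G = 3            # toy generator


def public_value(private: int) -> int:
    """Value a peer sends to the other side."""
    return pow(G, private, P)


def shared_key(own_private: int, peer_public: int) -> int:
    """Key a peer derives from its own secret and the peer's public value."""
    return pow(peer_public, own_private, P)


# Hypothetical private values for three peers.
alice, bob, carol = 5, 7, 11

key_ab = shared_key(alice, public_value(bob))
key_ba = shared_key(bob, public_value(alice))
key_ac = shared_key(alice, public_value(carol))

print(key_ab == key_ba)  # do both ends of one exchange agree?
print(key_ab == key_ac)  # does a second pair end up with the same key?
```
]]></artwork>
          </figure>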

          <t>MIKEY messages have several different transports. <xref
          target="RFC4567"/> defines how MIKEY messages can be embedded in
          general SDP for usage with the signalling protocols SIP, SAP and
          RTSP. There also exists a <xref target="T3GPP.33.246">3GPP-defined
          usage of MIKEY that sends MIKEY messages directly over UDP</xref> to
          key the receivers of <xref target="T3GPP.26.346">Multimedia
          Broadcast and Multicast Service (MBMS)</xref>. <xref
          target="RFC3830"/> defines the application/mikey media type allowing
          MIKEY to be used in, e.g., email and HTTP.</t>

          <t>Given the many choices, it is important to consider the
          properties needed in one's solution and, based on that, evaluate
          which modes are candidates for one's usage. More information on the
          applicability of the different MIKEY modes can be found in <xref
          target="RFC5197"/>.</t>

          <t>MIKEY with pre-shared keys is used by <xref
          target="T3GPP.33.246">3GPP MBMS</xref>, and IMS media security <xref
          target="T3GPP.33.328"/> specifies the use of the TICKET mode
          transported over SIP and HTTP. <xref
          target="I-D.ietf-mmusic-rfc2326bis">RTSP 2.0</xref> specifies use of
          the RSA-R mode. There are some SIP end-points that support MIKEY.
          The modes they use are unknown to the authors.</t>
        </section>

        <section anchor="sdescription"
                 title="Key Management for SRTP: Security Descriptions">
          <t><xref target="RFC4568"/> provides a keying solution based on
          sending plain text keys in <xref target="RFC4566">SDP</xref>. It is
          primarily used with SIP and the SDP Offer/Answer model, and is
          well-defined in point-to-point sessions where each side declares its
          own unique key. Using Security Descriptions to establish group keys
          is less well defined, and can have security issues since it's
          difficult to guarantee unique SSRCs (as needed to avoid a "two-time
          pad" attack - see Section 9 of <xref target="RFC3711"/>).</t>

          <t>Since keys are transported in plain text in SDP, they can easily
          be intercepted unless the SDP carrying protocol provides strong
          end-to-end confidentiality and authentication guarantees. This is
          not normally the case; instead, hop-by-hop security is provided
          between signalling nodes using TLS. This leaves the keying material
          exposed to capture by the traversed signalling nodes. Thus, in
          most cases, the security properties of security descriptions are
          weak. The usage of security descriptions usually requires additional
          security measures, e.g., that the signalling nodes be trusted and
          protected by strict access control. Usage of security descriptions
          requires careful design in order to ensure that the security goals
          can be met.</t>

          <t>Security Descriptions is the most commonly deployed keying
          solution for SIP-based end-points, where almost all end-points that
          support SRTP also support Security Descriptions. It is also used for
          access protection in IMS Media Security <xref
          target="T3GPP.33.328"/>.</t>
        </section>

        <section title="Key Management for SRTP: Encrypted Key Transport">
          <t><xref target="I-D.ietf-avtcore-srtp-ekt">Encrypted Key Transport
          (EKT)</xref> is an SRTP extension that enables group keying despite
          using a keying mechanism like DTLS-SRTP that doesn't support group
          keys. It is designed for centralized conferencing, but can also be
          used in sessions where end-points connect to a conference bridge or
          a gateway, and need to be provisioned with the keys each participant
          on the bridge or gateway uses to avoid decryption and encryption
          cycles on the bridge or gateway. This can enable interworking
          between DTLS-SRTP and other keying systems where either party can
          set the key (e.g., interworking with security descriptions).</t>

          <t>The mechanism is based on establishing an additional EKT key
          which everyone uses to protect their actual session key. The actual
          session key is sent in an expanded authentication tag to the other
          session participants. This key is only sent occasionally or
          periodically depending on use cases and depending on what
          requirements exist for timely delivery or notification.</t>

          <t>The only known deployments of EKT so far are in some Cisco video
          conferencing products.</t>
        </section>

        <section title="Key Management for SRTP: ZRTP and Other Solutions">
          <t>The <xref target="RFC6189">ZRTP</xref> key-management system for
          SRTP was proposed as an alternative to DTLS-SRTP. ZRTP provides best
          effort encryption independent of the signalling protocol and
          utilizes key continuity, Short Authentication Strings, or a PKI for
          authentication. ZRTP wasn't adopted as an IETF standards track
          protocol, but was instead published as an informational RFC.
          Commercial implementations exist.</t>

          <t>Additional proprietary solutions are also known to exist.</t>

          <!-- Dan Wing suggested mentioning Microsoft's MS-SSRTP here
        http://msdn.microsoft.com/en-us/library/cc431506%28v=office.12%29.aspx
        but I don't think that's a sufficiently stable reference. [csp] -->
        </section>
      </section>

      <section title="RTP Legacy Confidentiality">
        <t>Section 9 of the RTP standard <xref target="RFC3550"/> defines a
        DES or 3DES based encryption of RTP and RTCP packets. This mechanism
        is keyed using plain text keys in <xref target="RFC4566">SDP</xref>
        using the "k=" SDP field. This method can provide confidentiality but,
        as discussed in Section 9 of <xref target="RFC3550"/>, it has
        extremely weak security properties and is not to be used.</t>
      </section>

      <section anchor="sec-ipsec" title="IPsec">
        <t><xref target="RFC4301">IPsec</xref> can be used in either tunnel or
        transport mode to protect RTP and RTCP packets in transit from one
        network interface to another. This can be sufficient when the network
        interfaces have a direct relation, or in a secured environment where
        it can be controlled who can read the packets from those
        interfaces.</t>

        <t>The main concern with using IPsec to protect RTP traffic is that in
        most cases using a VPN approach that terminates the security
        association at some node prior to the RTP end-point leaves the traffic
        vulnerable to attack between the VPN termination node and the
        end-point. Thus, using IPsec requires careful thought and design so
        that it meets the security goals. An important question is how one
        ensures that the IPsec terminating peer and the ultimate destination
        are the same. Applications can have difficulty using existing APIs to
        determine whether IPsec is being used and, if so, who the
        authenticated peer entity is.</t>

        <t>IPsec with RTP is more commonly used as a security solution between
        infrastructure nodes that exchange many RTP sessions and media
        streams. The establishment of a secure tunnel between such nodes
        minimizes the key-management overhead.</t>
      </section>

      <section title="RTP over TLS over TCP">
        <t>Just as RTP can be sent over <xref target="RFC4571">TCP</xref>, it
        can also be sent over <xref target="RFC4572">TLS over TCP</xref>,
        using TLS to provide point-to-point security services. The security
        properties TLS provides are confidentiality, integrity protection and
        possibly source authentication if the client or server certificates
        are verified and provide a usable identity. When used in multi-party
        scenarios using a central node for media distribution, the security
        provided is only between the central node and the peers, so the
        security properties for the whole session are dependent on what trust
        one can place in the central node.</t>

        <t><xref target="RFC2326">RTSP 1.0</xref> and <xref
        target="I-D.ietf-mmusic-rfc2326bis">2.0</xref> specify the usage of
        RTP over the same TLS/TCP connection that the RTSP messages are sent
        over. It appears that RTP over TLS/TCP is also used in some
        proprietary solutions that use TLS to bypass firewalls.</t>
      </section>

      <section title="RTP over Datagram TLS (DTLS)">
        <t><xref target="RFC6347">Datagram Transport Layer Security (DTLS)
        </xref> is based on <xref target="RFC5246">TLS</xref>, but designed
        to work over an unreliable datagram-oriented transport rather than
        requiring reliable byte stream semantics from the transport protocol.
        Accordingly, DTLS can provide point-to-point security for RTP flows
        analogous to that provided by TLS, but over a datagram transport such
        as UDP. The two peers establish a DTLS association between each
        other, including the possibility to do certificate-based source
        authentication when establishing the association. All RTP and RTCP
        packets flowing will be protected by this DTLS association.</t>

        <t>Note that using DTLS for RTP flows is different to using DTLS-SRTP
        key management. DTLS-SRTP uses the same key-management steps as DTLS,
        but uses SRTP for the per packet security operations. Using DTLS for
        RTP flows uses the normal datagram TLS data protection, wrapping
        complete RTP packets. When using DTLS for RTP flows, the RTP and RTCP
        packets are completely encrypted with no headers in the clear; when
        using DTLS-SRTP, the RTP headers are in the clear and only the payload
        data is encrypted.</t>

        <t>DTLS can use similar techniques to those available for DTLS-SRTP to
        bind a signalling-side agreement to communicate to the certificates
        used by the end-point when doing the DTLS handshake. This enables use
        without having a certificate-based trust chain to a trusted
        certificate root.</t>

        <t>There does not appear to be significant usage of DTLS for RTP.</t>
      </section>

      <section title="Media Content Security/Digital Rights Management">
        <t>Mechanisms have been defined that encrypt only the media content,
        operating within the RTP payload data and leaving the RTP headers and
        RTCP unaffected. There are several reasons why this might be
        appropriate, but a common rationale is to ensure that the content
        stored by RTSP streaming servers has the media content in a protected
        format that cannot be read by the streaming server (this is mostly
        done in the context of Digital Rights Management). These approaches
        then use a key-management solution between the rights provider and the
        consuming client to deliver the key used to protect the content and do
        not give the media server access to the security context. Such methods
        have several security weaknesses such as the fact that the same key is
        handed out to a potentially large group of receiving clients,
        increasing the risk of a leak.</t>

        <t>Use of this type of solution can be of interest in environments
        that allow middleboxes to rewrite the RTP headers and select which
        streams are delivered to an end-point (e.g., some types of centralised
        video conference systems). The advantage of encrypting and possibly
        integrity protecting the payload but not the headers is that the
        middlebox can't eavesdrop on the media content, but can still provide
        stream switching functionality. The downside of such a system is that
        it likely needs two levels of security: the payload level solution to
        provide confidentiality and source authentication, and a second layer
        with additional transport security ensuring source authentication and
        integrity of the RTP headers associated with the encrypted payloads.
        This can also result in the need to have two different key-management
        systems, as the entities protecting the packets and the payloads are
        different and use different sets of keys.</t>

        <t>The aspect of two tiers of security is present in ISMACryp (see
        <xref target="sec-isma"/>) and the deprecated <xref
        target="T3GPP.26.234R8">3GPP Packet Based Streaming Service
        Annex.K</xref> solution.</t>

        <section anchor="sec-isma" title="ISMA Encryption and Authentication">
          <t>The Internet Streaming Media Alliance (ISMA) has defined <xref
          target="ISMACryp2">ISMA Encryption and Authentication 2.0</xref>.
          This specification defines how one encrypts and packetizes the
          encrypted application data units (ADUs) in an RTP payload using the
          <xref target="RFC3640">MPEG-4 Generic payload format</xref>. The ADU
          types that are allowed are those that can be stored as elementary
          streams in an ISO Media File format based file. ISMACryp uses SRTP
          for packet level integrity and source authentication from a
          streaming server to the receiver.</t>

          <t>Key-management for an ISMACryp-based system can be achieved
          through <xref target="OMADRMv2">Open Mobile Alliance (OMA) Digital
          Rights Management 2.0</xref>, for example.</t>
        </section>
      </section>
    </section>

    <section anchor="sec-applications" title="Securing RTP Applications">
      <t>In the following we provide guidelines for how to choose appropriate
      security mechanisms for RTP applications.</t>

      <section title="Application Requirements">
        <t>This section discusses a number of application requirements that
        need to be considered. An application designer choosing security
        solutions requires a good understanding of what level of security is
        needed and what behaviour they strive to achieve.</t>

        <section title="Confidentiality">
          <t>When it comes to confidentiality of an RTP session there are
          several aspects to consider:<list style="hanging">
              <t hangText="Probability of compromise:">When using encryption
              to provide media confidentiality, it is necessary to have some
              rough understanding of the security goal and how long one
              expects the protected content to remain confidential. National
              or other regulations might provide additional requirements on a
              particular usage of RTP. From that, one can determine which
              encryption algorithms are to be used from the set of available
              transforms.</t>

              <t hangText="Potential for other leakage:">RTP-based security in
              most of its forms simply wraps RTP and RTCP packets into
              cryptographic containers. This commonly means that the size of
              the original RTP payload is visible to observers of the
              protected packet flow. This can provide information to those
              observers. A well-documented case is the risk with variable
              bit-rate speech codecs that produce different sized packets
              based on the speech input <xref target="RFC6562"/>. Potential
              threats such as these need to be considered and, if they are
              significant, then restrictions will be needed on mode choices in
              the codec, or additional padding will need to be added to make
              all packets equal size and remove the informational leakage.</t>

              <t hangText="">Another case is RTP header extensions. If SRTP is
              used, header extensions are normally not protected by the
              security mechanism protecting the RTP payload. If the header
              extension carries information that is considered sensitive, then
              the application needs to be modified to ensure that mechanisms
              used to protect against such information leakage are
              employed.</t>

              <t hangText="Who has access:">When considering the
              confidentiality properties of a system, it is important to
              consider where the media is handled in the clear. For example, if
              the system is based on an RTP mixer that needs the keys to
              decrypt the media, process, and repacketize it, then is the
              mixer providing the security guarantees expected by the other
              parts of the system? Furthermore, it is important to consider
              who has access to the keys. The policies for the handling of the
              keys, and who can access the keys, need to be considered along
              with the confidentiality goals.</t>
            </list></t>
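
          <figure>
            <preamble>The padding countermeasure mentioned under "Potential
            for other leakage" can be sketched as follows. This is a minimal
            illustration, assuming the RTP padding convention of <xref
            target="RFC3550"/> in which the final padding octet carries the
            padding count and a flag (the P bit) signals that padding is
            present. The function names and the target size of 160 octets
            are hypothetical.</preamble>

            <artwork><![CDATA[
```python
from typing import Tuple


def pad_payload(payload: bytes, target: int) -> Tuple[bytes, bool]:
    """Pad payload to target octets; return (data, padding_flag)."""
    missing = target - len(payload)
    if missing < 0 or missing > 255:
        raise ValueError("payload does not fit the assumed padding range")
    if missing == 0:
        return payload, False
    # The final octet counts all padding octets, including itself.
    return payload + bytes(missing - 1) + bytes([missing]), True


def unpad_payload(data: bytes, padded: bool) -> bytes:
    """Strip the padding again, using the count in the final octet."""
    if not padded:
        return data
    count = data[-1]
    return data[:-count]


TARGET = 160  # hypothetical fixed packet payload size
frames = [b"a" * 20, b"b" * 97, b"c" * 160]  # variable-size codec output
protected = [pad_payload(frame, TARGET) for frame in frames]

# Every padded payload has the same size, so size no longer tracks input.
print({len(data) for data, _ in protected})
```
]]></artwork>

            <postamble>The cost is bandwidth: every packet is sent at the
            largest size the codec can produce.</postamble>
          </figure>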

          <t>As can be seen, the actual confidentiality level likely has more
          to do with the application's usage of centralized nodes, and the
          details of the key-management solution chosen, than with the actual
          choice of encryption algorithm (although, of course, the encryption
          algorithm needs to be chosen appropriately for the desired security
          level).</t>
        </section>

        <section title="Integrity">
          <t>Protection against modification of content by a third party, or
          due to errors in the network, is another factor to consider. The
          first aspect that one considers is what resilience one has against
          modifications to the content. Some media types are extremely
          sensitive to network bit errors, whereas others might be able to
          tolerate some degree of data corruption. Equally important is to
          consider the sensitivity of the content, who is providing the
          integrity assertion, what is the source of the integrity tag, and
          what are the risks of modifications happening prior to that point
          where protection is applied? These issues affect what cryptographic
          algorithm is used, and the length of the integrity tags, and whether
          the entire payload is protected.</t>

          <t>RTP applications that rely on central nodes need to consider if
          hop-by-hop integrity is acceptable, or if true end-to-end integrity
          protection is needed. Is it important to be able to tell if a
          middlebox has modified the data? There are some uses of RTP that
          require trusted middleboxes that can modify the data in a way that
          doesn't break integrity protection as seen by the receiver, for
          example local advertisement insertion in IPTV systems; there are
          also uses where it is essential that such in-network modification be
          detectable. RTP can support both, with appropriate choices of
          security mechanisms.</t>

          <t>Integrity of the data is commonly closely tied to the question of
          source authentication. That is, it becomes important to know who
          makes an integrity assertion for the data.</t>
        </section>

        <section anchor="source-auth" title="Source Authentication">
          <t>Source authentication is about determining who sent a particular
          RTP or RTCP packet. It is normally closely tied with integrity,
          since a receiver generally also wants to ensure that the data
          received is what the source really sent, so source authentication
          without integrity is not particularly useful. Similarly, integrity
          protection without source authentication is also not particularly
          useful; a claim that a packet is unchanged that cannot itself be
          validated as from the source (or from some other known and trusted
          party) is meaningless.</t>

          <t>Source authentication can be asserted in several different ways:
          <list style="hanging">
              <t hangText="Base level:">Using cryptographic mechanisms that
              give authentication with some type of key-management provides
              an implicit method for source authentication. Assuming that the
              mechanism has sufficient strength to not be circumvented in the
              time frame when you would accept the packet as valid, it is
              possible to assert a source-authenticated statement: this
              message is likely from a source that has the cryptographic
              key(s) to this communication.</t>

              <t hangText="">What that assertion actually means is highly
              dependent on the application and how it handles the keys. If
              only the two peers have access to the keys, this can form a
              basis for a strong trust relationship that traffic is
              authenticated coming from one of the peers. However, in a
              multi-party scenario where security contexts are shared among
              participants, most base-level authentication solutions can't
              even assert that this packet is from the same source as the
              previous packet.</t>

              <t hangText="Binding the source and the signalling:">A step up
              in the assertion that can be done in base-level systems is to
              tie the signalling to the key-exchange. Here, the goal is to at
              least be able to assert that the source of the packets is the
              same entity that the receiver established the session with. How
              feasible this is depends on the properties of the key-management
              system, the ability to tie the signalling to a particular
              source, and the degree of trust the receiver places on the
              different nodes involved.</t>

              <t hangText="">For example, systems where the key-exchange is
              done using the signalling systems, such as <xref
              target="RFC4568">Security Descriptions</xref>, enable a direct
              binding between signalling and key-exchange. In such systems,
              the actual security depends on the trust one can place in the
              signalling system to correctly associate the peer's identifier
              with the key-exchange.</t>

              <t hangText="Using Identifiers:">If the applications have access
              to a system that can provide verifiable identifiers, then the
              source authentication can be bound to that identifier. For
              example, in a point-to-point communication even symmetric key
              crypto, where the key-management can assert that the key has
              only been exchanged with a particular identifier, can provide a
              strong assertion about the source of the traffic. SIP identity
              <xref target="RFC4474"/> provides one example of how this can be
              done, and could be used to bind DTLS-SRTP certificates used by
              an end-point to the identity provider's public key to
              authenticate the source of a DTLS-SRTP flow.</t>

              <t hangText="">Note that all levels of the system need to have
              matching capability to assert identifiers. If the signalling can
              assert that only a given entity in a multiparty session has a
              key, then the media layer might be able to provide guarantees
              about the identifier used by the media sender. However, using a
              signalling authentication mechanism built on a group key can
              limit the media layer to asserting only group membership.</t>
            </list></t>
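
          <figure>
            <preamble>The limit of base-level authentication with a shared
            group key can be sketched as follows. This is not the SRTP packet
            format: the 4-octet SSRC prefix, the HMAC-SHA-256 tag, the key
            values, and the function names are assumptions made for
            illustration. The sketch shows that any holder of the group key
            can produce a packet that verifies while claiming another
            member's SSRC.</preamble>

            <artwork><![CDATA[
```python
import hashlib
import hmac

GROUP_KEY = b"hypothetical-shared-group-key"


def protect(ssrc: int, payload: bytes, key: bytes) -> bytes:
    """Prefix a 4-octet SSRC and append an HMAC-SHA-256 tag."""
    message = ssrc.to_bytes(4, "big") + payload
    return message + hmac.new(key, message, hashlib.sha256).digest()


def verify(packet: bytes, key: bytes) -> bool:
    """Check the trailing 32-octet tag against the rest of the packet."""
    message, tag = packet[:-32], packet[-32:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


ALICE_SSRC = 0x1111AAAA
# Mallory is a legitimate group member, so she holds GROUP_KEY too and
# can claim Alice's SSRC.
forged = protect(ALICE_SSRC, b"not really from Alice", GROUP_KEY)
outsider = protect(ALICE_SSRC, b"injected", b"some-other-key")

print(verify(forged, GROUP_KEY))    # accepted: shows group membership only
print(verify(outsider, GROUP_KEY))  # rejected: sender lacks the group key
```
]]></artwork>
          </figure>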

        </section>

        <section title="Identifiers and Identity">
          <t>There exist many different types of systems providing identifiers
          with different properties (e.g., SIP identity <xref
          target="RFC4474"/>). In the context of RTP applications, the most
          important property is the possibility to perform source
          authentication and verify such assertions in relation to any claimed
          identifiers. What an identifier really represents can also vary but,
          in the context of communication, one of the most obvious is an
          identifier representing the identity of the human user one
          communicates with. However, the human user can also have additional
          identifiers in a particular role. For example, the human Alice can
          also be a police officer, and in some cases an identifier for her
          role as police officer will be more relevant than one that asserts
          that she is Alice. This is common in contact with organizations,
          where it is important to prove the person's right to represent the
          organization. Some examples of identifier/identity mechanisms that
          can be used: <list style="hanging">
              <t hangText="Certificate based:">A certificate is used to assert
              the identifiers used to claim an identity. By having access to
              the private key associated with the certificate, one can perform
              signing to assert one's identity. Any entity interested in
              verifying the assertion then needs the public part of the
              certificate. By having the certificate, one can verify the
              signature against the certificate. The next step is to determine
              if one trusts the certificate's trust chain. This is commonly
              done by provisioning the verifier with the public part of a root
              certificate, which enables the verifier to verify a trust chain
              from the root certificate down to the identifier in the
              certificate. However, the trust is
              based on all steps in the certificate chain being verifiable and
              trusted. Thus provisioning of root certificates and the ability
              to revoke compromised certificates are aspects that will require
              infrastructure.</t>

              <t hangText="Online Identity Providers:">An online identity
              provider (IdP) can authenticate a user's right to use an
              identifier, then perform assertions on their behalf or provision
              the requester with short-term credentials to assert the
              identifiers. The verifier can then contact the IdP to request
              verification of a particular identifier. Here the trust is
              highly dependent on how much one trusts the IdP. The system also
              becomes dependent on having access to the relevant IdP.</t>
            </list></t>

          <t>In all of the above examples, an important part of the security
          properties is related to the method for authenticating the access
          to the identity.</t>
        </section>

        <section title="Privacy">
          <t>RTP applications need to consider what privacy goals they have.
          As RTP applications communicate directly between peers in many
          cases, the IP addresses of any communication peer will be available.
          The main privacy concern with IP addresses is related to
          geographical location and the possibility to track a user of an
          end-point. The main way of avoiding such concerns is the
          introduction of a relay (e.g., a TURN server <xref
          target="RFC5766"/>) or centralized media mixers or forwarders that
          hide the address of a peer from any other peer. The security and
          trust placed in these relays obviously need to be carefully
          considered.</t>

          <t>RTP itself can contribute to enabling a particular user to be
          tracked between communication sessions if the CNAME is generated
          according to the RTP specification in the form of user@host. Such
          RTCP CNAMEs are likely long term stable over multiple sessions,
          allowing tracking of users. This can be desirable for long-term
          fault tracking and diagnosis, but clearly has privacy implications.
          Instead, cryptographically random CNAMEs could be used, as defined by
          <xref target="RFC7022">Guidelines for Choosing RTP Control Protocol
          (RTCP) Canonical Names (CNAMEs)</xref>.</t>
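
          <figure>
            <preamble>A randomly generated CNAME of this kind can be sketched
            as follows. The function name is hypothetical, and the choice of
            96 random bits with base64 encoding reflects our reading of <xref
            target="RFC7022"/>; it should be checked against that document
            rather than taken as authoritative.</preamble>

            <artwork><![CDATA[
```python
import base64
import secrets


def random_cname(num_bytes: int = 12) -> str:
    """Return a fresh CNAME built from num_bytes of random data."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


# A new value per session gives observers nothing stable to correlate.
first = random_cname()
second = random_cname()
print(first, second)
```
]]></artwork>
          </figure>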

          <t>If privacy goals exist, they need to be considered, and
          the system designed with them in mind. In addition, certain RTP
          features might have to be configured to safeguard privacy, or have
          requirements on how the implementation is done.</t>
        </section>
      </section>

      <section title="Application Structure">
        <t>When it comes to RTP security, the most appropriate solution is
        often highly dependent on the topology of the communication session.
        The signalling also affects what information can be provided, and
        whether this can be instance specific or common for a group. In the
        end, the key-management system will strongly affect the security
        properties achieved by the application. At the same time, the
        communication structure of the application limits what key-management
        methods are applicable. As different key-management systems have
        different requirements on the underlying infrastructure, it is
        important to take that aspect into consideration early in the
        design.</t>
      </section>

      <section title="Automatic Key Management">
        <t>The <xref target="RFC4107">Guidelines for Cryptographic Key
        Management</xref> provide an overview of why automatic key management
        is important. They also provide a strong recommendation on using
        automatic key management. Most of the security solutions reviewed in
        this document provide or support automatic key management, at least to
        establish session keys. In some longer-term use cases, credentials
        might need to be manually deployed.</t>

        <t>For SRTP, an important aspect of automatic key management is to
        ensure that two-time pads do not occur, in particular by preventing
        multiple end-points from using the same session key and SSRC. In these
        cases, automatic key management methods can have strong dependencies on
        signalling features to function correctly. If those dependencies can't
        be fulfilled, additional constraints on usage, e.g., per-end-point
        session keys, might be needed to avoid the issue.</t>
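
        <t>The two-time pad problem can be illustrated with a toy stream
        cipher. The sketch below is not SRTP: the keystream function is a
        hypothetical stand-in built from SHA-256, and it omits the packet
        index, which corresponds to both end-points encrypting a packet with
        the same index. It only shows that when two end-points share a
        session key and SSRC, the keystream cancels out of the XOR of their
        ciphertexts.</t>

        <figure>
          <artwork><![CDATA[
```python
import hashlib


def keystream(key: bytes, ssrc: int, length: int) -> bytes:
    """Toy keystream from (key, SSRC); NOT the real SRTP derivation."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = key + ssrc.to_bytes(4, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block_input).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, ssrc: int, payload: bytes) -> bytes:
    return xor_bytes(payload, keystream(key, ssrc, len(payload)))


key = b"shared-session-key"
p1 = b"payload from endpoint A"
p2 = b"payload from endpoint B"

# Both end-points use the same key and the same SSRC.
c1 = encrypt(key, 0x1234, p1)
c2 = encrypt(key, 0x1234, p2)

# The identical keystream cancels: an eavesdropper learns p1 XOR p2.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(p1, p2)
```
]]></artwork>
        </figure>

        <t>Giving each end-point its own session key, or guaranteeing
        distinct SSRCs, changes the keystream and removes this leak.</t>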

        <t>When selecting security mechanisms for an RTP application it is
        important to consider the properties of the key management. Using key
        management that is both automatic and integrated will provide minimal
        interruption for the user, and is important to ensure that security
        can be, and will remain, on by default.</t>
      </section>

      <section title="End-to-End Security vs Tunnels">
        <t>If the security mechanism only provides a secured tunnel, for
        example like some common uses of <xref
        target="sec-ipsec">IPsec</xref>, it is important to consider the full
        end-to-end properties of the system. How does one ensure that the path
        from the endpoint to the local tunnel ingress/egress is secure and can
        be trusted (and similarly for the other end of the tunnel)? How does
        one handle the source authentication of the peer, as the security
        protocol only identifies the other end of the tunnel? These are some of
        the issues that arise when one considers a tunnel-based security
        protocol rather than an end-to-end one. Even with clear requirements
        and knowledge that one can still achieve the security properties using
        a tunnel-based solution, one ought to prefer end-to-end mechanisms, as
        they are much less likely to violate any assumptions made about
        deployment. These assumptions can also be difficult to verify
        automatically.</t>
      </section>

      <section title="Plain Text Keys">
        <t>Key management solutions that use plain text keys, like <xref
        target="sdescription">SDP Security Descriptions</xref>, require care
        to ensure a secure transport of the signalling messages that contain
        the plain text keys. For plain text keys the security properties of
        the system depend on how securely the plain text keys are protected
        end-to-end between the sender and receiver(s). Not only does one need
        to consider what transport protection is provided for the signalling
        message including the keys, but also the degree to which any
        intermediaries in the signalling are trusted. Untrusted intermediaries
        can perform man-in-the-middle attacks on the communication, or can log
        the keys, with the result that the encryption can be compromised long
        after the actual communication occurred.</t>
      </section>

      <section title="Interoperability">
        <t>Few RTP applications exist as independent applications that never
        interoperate with anything else. Rather, they enable communication
        with a potentially large number of other systems. To minimize the
        number of security mechanisms that need to be implemented, it is
        important to consider if one can use the same security mechanisms as
        other applications. This can also reduce problems of determining what
        security level is actually negotiated in a particular session.</t>

        <t>The desire to be interoperable can, in some cases, be in conflict
        with the security requirements of an application. To meet the security
        goals, it might be necessary to sacrifice interoperability.
        Alternatively, one can implement multiple security mechanisms; this,
        however, introduces the complication of ensuring that the user
        understands what it means to use a particular security system. In
        addition, the application can then become vulnerable to bid-down
        attacks.</t>
      </section>
    </section>

    <section anchor="sec-examples" title="Examples">
      <t>In the following we describe a number of example security solutions
      for applications using RTP services or frameworks. These examples are
      provided to illustrate the choices available. They are not normative
      recommendations for security.</t>

      <section title="Media Security for SIP-established Sessions using DTLS-SRTP">
        <t>The IETF evaluated media security for RTP sessions established
        using point-to-point SIP sessions in 2009. A number of requirements
        were determined, and based on those, the existing solutions for media
        security and especially the keying methods were analysed. The
        resulting requirements and analysis were published in <xref
        target="RFC5479"/>. Based on this analysis and working group
        discussion, DTLS-SRTP was determined to be the best solution.</t>

        <t>The security solution for SIP using DTLS-SRTP is defined in the
        <xref target="RFC5763">Framework for Establishing a Secure Real-time
        Transport Protocol (SRTP) Security Context Using Datagram Transport
        Layer Security (DTLS)</xref>. At a high level, the framework uses SIP
        with SDP offer/answer procedures to exchange the network addresses
        where the server end-point will have a DTLS-SRTP-enabled server
        running. The SIP signalling is also used to exchange the fingerprints
        of the certificate each end-point will use in the DTLS establishment
        process. When the signalling is sufficiently complete, the DTLS-SRTP
        client performs DTLS handshakes and establishes SRTP session keys. The
        clients also verify the fingerprints of the certificates to ensure
        that no man in the middle has inserted themselves into the
        exchange.</t>
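
        <t>The fingerprint check described above can be sketched as follows,
        assuming the SDP fingerprint attribute syntax of <xref
        target="RFC4572"/> and handling only SHA-256 for brevity. The function
        names are hypothetical, and the certificate is taken to be the
        DER-encoded bytes presented by the peer in the DTLS handshake.</t>

        <figure>
          <artwork><![CDATA[
```python
import hashlib
import hmac


def certificate_fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint as colon-separated upper-case hex."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_matches(sdp_attribute: str, cert_der: bytes) -> bool:
    """Compare a signalled 'a=fingerprint:' line with the DTLS certificate.

    Only 'sha-256' is accepted in this sketch; a real implementation
    would support every hash function it is willing to negotiate.
    """
    prefix = "a=fingerprint:"
    if not sdp_attribute.startswith(prefix):
        return False
    parts = sdp_attribute[len(prefix):].split()
    if len(parts) != 2 or parts[0].lower() != "sha-256":
        return False
    signalled = parts[1].upper().encode("utf-8")
    computed = certificate_fingerprint(cert_der).encode("utf-8")
    return hmac.compare_digest(signalled, computed)
```
]]></artwork>
        </figure>

        <t>If the comparison fails, the handshake peer is not the party
        named in the signalling, and the session ought to be torn down rather
        than used.</t>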

        <t>DTLS has a number of good security properties. For example, to
        mount a man-in-the-middle attack, someone in the signalling path needs
        to take active steps and modify both the signalling message and
        the DTLS handshake. There also exist solutions that enable the
        fingerprints to be bound to identities. SIP Identity provides an
        identity established by the first proxy for each user <xref
        target="RFC4474"/>. This reduces the number of nodes the
        connecting User Agent has to trust to just the first-hop
        proxy, rather than the full signalling path. The biggest security
        weakness of this system is its dependency on the signalling. SIP
        signalling passes through multiple nodes and there is usually no
        message security deployed, only hop-by-hop transport security, if any,
        between the nodes.</t>
      </section>

      <section title="Media Security for WebRTC Sessions">
        <t>Web Real-Time Communication (WebRTC) <xref
        target="I-D.ietf-rtcweb-overview"/> is a solution providing JavaScript
        web applications with real-time media directly between browsers. Media
        is transported using RTP protected using a mandatory application of
        SRTP <xref target="RFC3711"/>, with keying done using DTLS-SRTP <xref
        target="RFC5764"/>. The security configuration is further defined in
        the WebRTC Security Architecture <xref
        target="I-D.ietf-rtcweb-security-arch"/>.</t>

        <t>A hash of the peer's certificate is provided to the JavaScript web
        application, allowing that web application to verify the identity of
        the peer. There are several ways in which the certificate hashes can be
        verified. An approach identified in the WebRTC security architecture
        <xref target="I-D.ietf-rtcweb-security-arch"/> is to use an identity
        provider. In this solution the Identity Provider, which is a third
        party to the web application, signs the DTLS-SRTP hash combined with a
        statement on the validity of the user identity that has been used to
        sign the hash. The receiver of such an identity assertion can then
        independently verify the user identity to ensure that it is the
        identity that the receiver intended to communicate with, and that the
        cryptographic assertion holds; this way a user can be certain that the
        application also can't perform a MITM attack and acquire the keys to
        the media communication. Other ways of verifying the certificate hashes
        exist; for example, they could be verified against a hash carried in
        some out-of-band channel (e.g., compared with a hash printed on a
        business card), by using a verbal short authentication string (e.g.,
        as in ZRTP <xref target="RFC6189"/>), or by using hash continuity.</t>

        <t>In the development of WebRTC there has also been attention given to
        privacy considerations. The main RTP-related concerns that have been
        raised are:<list style="hanging">
            <t hangText="Location Disclosure:">As ICE negotiation <xref
            target="RFC5245"/> provides IP addresses and ports for the
            browser, this leaks location information in the signalling to the
            peer. To prevent this, one can block the usage of any ICE candidate
            that isn't a relay candidate, i.e., one where the IP address and
            port provided belong to the service provider's media traffic
            relay.</t>

            <t hangText="Prevent tracking between sessions:">Static RTP CNAMEs
            and DTLS-SRTP certificates provide information that is re-used
            between session instances. Thus, to prevent tracking, such
            information ought not to be re-used between sessions, or
            ought not to be sent in the clear. Note that generating new
            certificates each time prevents continuity in authentication.
            However, as WebRTC users are expected to use multiple devices to
            access the same communication service, such continuity can't be
            expected anyway; instead, the identity mechanism described above
            has to be relied on.</t>
          </list></t>

        <t>Note: The above cases are focused on providing privacy from other
        parties, not on providing privacy from the web server that provides
        the WebRTC JavaScript application.</t>
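
        <t>The relay-only restriction mentioned under Location Disclosure
        can be sketched as a filter over the candidate attributes of a session
        description, assuming the "a=candidate" syntax of <xref
        target="RFC5245"/>, in which the candidate type follows the token
        "typ". The function name is hypothetical, and a deployed browser
        would apply such a policy inside its ICE agent rather than by editing
        SDP text.</t>

        <figure>
          <artwork><![CDATA[
```python
def relay_only(sdp: str) -> str:
    """Drop every ICE candidate line whose type is not 'relay'.

    Hypothetical helper: all non-candidate lines are kept unchanged.
    """
    kept = []
    for line in sdp.splitlines():
        if line.startswith("a=candidate:"):
            fields = line.split()
            # The candidate type is the field right after the "typ" token.
            if "typ" not in fields:
                continue
            idx = fields.index("typ")
            if idx + 1 >= len(fields) or fields[idx + 1] != "relay":
                continue
        kept.append(line)
    return "\r\n".join(kept) + "\r\n"
```
]]></artwork>
        </figure>

        <t>Only candidates allocated on the relay remain visible to the peer,
        at the cost of routing all media through that relay.</t>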
      </section>

      <section anchor="sec-ims-example"
               title="IP Multimedia Subsystem (IMS) Media Security">
        <t>In IMS, the core network is controlled by a single operator, or by
        several operators with high trust in each other. Except for some types
        of accesses, the operator is in full control, and no packets are
        routed over the Internet. Nodes in the core network offer services
        such as voice mail, interworking with legacy systems (PSTN, GSM, and
        3G), and transcoding. End-points are authenticated during the SIP
        registration using either IMS-AKA (using SIM credentials) or SIP
        Digest (using a password).</t>

        <t>In <xref target="T3GPP.33.328">IMS media security</xref>,
        end-to-end encryption is therefore not seen as needed or desired, as it
        would hinder, for example, interworking and transcoding, making calls
        between incompatible terminals impossible. Because of this, IMS media
        security mostly uses end-to-access-edge security, where SRTP is
        terminated in the first node in the core network. As the SIP signalling
        is trusted and encrypted (with TLS or IPsec), <xref
        target="RFC4568">security descriptions</xref> are considered to give
        good protection against eavesdropping over the accesses that are not
        already encrypted (GSM, 3G, LTE). Media source authentication is based
        on knowledge of the SRTP session key and trust that the IMS network
        will only forward media from the correct end-point.</t>

        <t>For enterprises and government agencies, which might have weaker
        trust in the IMS core network and can be assumed to have compatible
        terminals, end-to-end security can be achieved by deploying their own
        key management server.</t>

        <t>Work on interworking with WebRTC is currently ongoing; the security
        will still be end-to-access-edge, but using <xref
        target="RFC5763">DTLS-SRTP</xref> instead of security
        descriptions.</t>
      </section>

      <section anchor="sec-examples-pss"
               title="3GPP Packet-switched Streaming Service (PSS)">
        <t>The 3GPP Release 11 specification of the Packet-switched Streaming
        Service <xref target="T3GPP.26.234R11">(PSS)</xref> defines, in Annex
        R, a set of security mechanisms. These security mechanisms are
        concerned with protecting the content from being copied, i.e., Digital
        Rights Management (DRM). To meet these goals with the specified
        solution, the client implementation and the application platform are
        trusted to protect against access and modification by an attacker.</t>

        <t>PSS is <xref target="RFC2326">RTSP 1.0</xref> controlled media
        streaming over RTP. Thus, an RTSP client whose user wants to access
        protected content will request a session description (<xref
        target="RFC4566">SDP</xref>) for that content. This SDP will
        indicate that the media is protected using <xref
        target="ISMACryp2">ISMACryp 2.0</xref>, which protects the encoded
        media application units (AUs). The
        key(s) used to protect the media are provided in one of two ways.
        If a single key is used, then the client uses some DRM system to
        retrieve the key as indicated in the SDP. Commonly, <xref
        target="OMADRMv2">OMA DRM v2</xref> will be used to retrieve the key.
        If multiple keys are to be used, then an additional RTSP stream for
        key updates is established in parallel with the media streams, where
        key updates are sent to the client using Short Term Key Messages
        defined in the "Service and Content Protection for Mobile Broadcast
        Services" section of the <xref target="OMABCAST">OMA Mobile Broadcast
        Services</xref>.</t>

        <t>It is worth noting that this solution doesn't provide any integrity
        verification method for the RTP header and payload header information;
        only the encoded media AU is protected. 3GPP has not defined any
        requirement for supporting any solution that could provide that
        service. Thus, replay or insertion attacks are possible. Another
        property is that the media content can be protected by those
        providing the media, so that the operators of the RTSP server have no
        access to unprotected content. Instead, all who want to access the
        media are expected to contact the DRM keying server and, if the device
        is acceptable, they will be given the key to decrypt the media.</t>

        <t>To protect the signalling, RTSP 1.0 supports the usage of TLS. This
        is, however, not explicitly discussed in the PSS specification. Usage
        of TLS can both prevent modification of the session description
        information and help maintain some privacy about what content the user
        is watching, as all URLs would then be confidentiality protected.</t>
      </section>

      <section title="RTSP 2.0">
        <t><xref target="I-D.ietf-mmusic-rfc2326bis">Real-time Streaming
        Protocol 2.0</xref> offers an interesting comparison to the <xref
        target="sec-examples-pss">PSS service</xref> that is based on RTSP 1.0
        and service requirements perceived by mobile operators. A major
        difference between RTSP 1.0 and RTSP 2.0 is that 2.0 is fully defined
        under the requirement to have a mandatory-to-implement security
        mechanism. As it specifies how one transports media over RTP, it also
        defines security mechanisms for the RTP-transported media
        streams.</t>

        <t>The security goals for RTP in RTSP 2.0 are to ensure that there is
        confidentiality, integrity and source authentication between the RTSP
        server and the client. This is to prevent eavesdropping on what the
        user is watching, for privacy reasons, and to prevent replay or
        injection attacks on the media stream. To reach these goals, the
        signalling also has to be protected, requiring the use of TLS between
        the client and server.</t>

        <t>Using TLS-protected signalling, the client and server agree on the
        media transport method when doing the SETUP request and response. The
        secured media transport is SRTP (RTP/SAVP), normally over UDP. The key
        management for SRTP is MIKEY using RSA-R mode. The RSA-R mode is
        selected as it allows the RTSP Server to select the key despite having
        the RTSP Client initiate the MIKEY exchange. It also enables the reuse
        of the RTSP server's TLS certificate when creating the MIKEY messages,
        thus ensuring a binding between the RTSP server and the key exchange.
        Assuming the SETUP process works, this will establish an SRTP crypto
        context to be used between the RTSP Server and the Client for the
        RTP-transported media streams.</t>
      </section>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document makes no request of IANA.</t>

      <t>Note to RFC Editor: this section can be removed on publication as an
      RFC.</t>
    </section>

    <section anchor="sec-security" title="Security Considerations">
      <t>This entire document is about security. Please read it.</t>
    </section>

    <section anchor="sec-ack" title="Acknowledgements">
      <t>We thank the IESG for their careful review of <xref
      target="I-D.ietf-avt-srtp-not-mandatory"/> which led to the writing of
      this memo. John Mattsson has contributed the <xref
      target="sec-ims-example">IMS Media Security example</xref>.</t>

      <t>The authors wish to thank Christian Correll, Dan Wing, Kevin Gross,
      Alan Johnston, Michael Peck, Ole Jacobsen, Spencer Dawkins, Stephen
      Farrell, John Mattsson, and Suresh Krishnan for review and proposals for
      improvements of the text.</t>
    </section>
  </middle>

  <back>
    <references title="Informative References">
      <?rfc include='reference.RFC.1112'?>

      <?rfc include='reference.RFC.2326'?>

      <?rfc include='reference.RFC.3365'?>

      <?rfc include='reference.RFC.3550'?>

      <?rfc include='reference.RFC.3640'?>

      <?rfc include='reference.RFC.3711'?>

      <?rfc include='reference.RFC.3830'?>

      <?rfc include='reference.RFC.4107'?>

      <?rfc include='reference.RFC.4301'?>

      <?rfc include='reference.RFC.4383'?>

      <?rfc include='reference.RFC.4474'?>

      <?rfc include='reference.RFC.4566'?>

      <?rfc include='reference.RFC.4567'?>

      <?rfc include='reference.RFC.4568'?>

      <?rfc include='reference.RFC.4571'?>

      <?rfc include='reference.RFC.4572'?>

      <?rfc include='reference.RFC.4607'?>

      <?rfc include='reference.RFC.4650'?>

      <?rfc include='reference.RFC.4738'?>

      <?rfc include='reference.RFC.4771'?>

      <?rfc include='reference.RFC.4949'?>

      <?rfc include='reference.RFC.5117'?>

      <?rfc include='reference.RFC.5197'?>

      <?rfc include='reference.RFC.5245'?>

      <?rfc include='reference.RFC.5246'?>

      <?rfc include='reference.RFC.5479'?>

      <?rfc include='reference.RFC.5669'?>

      <?rfc include='reference.RFC.5760'?>

      <?rfc include='reference.RFC.5766'?>

      <?rfc include='reference.RFC.5763'?>

      <?rfc include='reference.RFC.5764'?>

      <?rfc include='reference.RFC.6043'?>

      <?rfc include='reference.RFC.6188'?>

      <?rfc include='reference.RFC.6189'?>

      <?rfc include='reference.RFC.6267'?>

      <?rfc include='reference.RFC.6347'?>

      <?rfc include='reference.RFC.6509'?>

      <?rfc include='reference.RFC.6562'?>

      <?rfc include='reference.RFC.6904'?>

      <?rfc include='reference.RFC.7022'?>

      <?rfc include='reference.I-D.ietf-avt-srtp-not-mandatory'?>

      <?rfc include='reference.I-D.ietf-avtcore-aria-srtp'?>

      <?rfc include='reference.I-D.ietf-avtcore-srtp-aes-gcm'?>

      <?rfc include='reference.I-D.ietf-avtcore-srtp-ekt'?>

      <?rfc include='reference.I-D.ietf-mmusic-rfc2326bis'?>

      <?rfc include='reference.I-D.ietf-rtcweb-overview'?>

      <?rfc include='reference.I-D.ietf-rtcweb-security-arch'?>

      <reference anchor="ISMACryp2">
        <front>
          <title>ISMA Encryption and Authentication, Version 2.0 release
          version</title>

          <author fullname="Internet Streaming Media Alliance (ISMA)">
            <organization>Internet Streaming Media Alliance
            (ISMA)</organization>
          </author>

          <date month="November" year="2007"/>
        </front>

        <format target="http://www.mpegif.org/m4if/bod/ISMA/ISMA_E%26Aspec2.0.pdf"
                type="PDF"/>
      </reference>

      <reference anchor="OMADRMv2">
        <front>
          <title>OMA Digital Rights Management V2.0</title>

          <author fullname="Open Mobile Alliance">
            <organization>Open Mobile Alliance</organization>
          </author>

          <date day="23" month="July" year="2008"/>
        </front>

        <format target="http://www.openmobilealliance.org/technical/release_program/drm_v2_0.aspx"
                type="HTML"/>
      </reference>

      <reference anchor="OMABCAST">
        <front>
          <title>OMA Mobile Broadcast Services V1.0</title>

          <author fullname="Open Mobile Alliance">
            <organization>Open Mobile Alliance</organization>
          </author>

          <date day="1" month="February" year="2009"/>
        </front>

        <format target="http://technical.openmobilealliance.org/Technical/release_program/bcast_v1_0.aspx"
                type="HTML"/>
      </reference>

      <reference anchor="T3GPP.26.234R8">
        <front>
          <title>Technical Specification Group Services and System Aspects;
          Transparent end-to-end Packet-switched Streaming Service (PSS);
          Protocols and codecs</title>

          <author fullname="3GPP">
            <organization>3GPP</organization>
          </author>

          <date month="September" year="2009"/>
        </front>

        <seriesInfo name="3GPP TS" value="26.234 8.4.0"/>

        <format target="http://www.3gpp.org/ftp/Specs/html-info/26234.htm"
                type="HTML"/>
      </reference>

      <reference anchor="T3GPP.26.234R11">
        <front>
          <title>Technical Specification Group Services and System Aspects;
          Transparent end-to-end Packet-switched Streaming Service (PSS);
          Protocols and codecs</title>

          <author fullname="3GPP">
            <organization>3GPP</organization>
          </author>

          <date month="September" year="2012"/>
        </front>

        <seriesInfo name="3GPP TS" value="26.234 11.1.0"/>

        <format target="http://www.3gpp.org/ftp/Specs/html-info/26234.htm"
                type="HTML"/>
      </reference>

      <reference anchor="T3GPP.26.346">
        <front>
          <title>Multimedia Broadcast/Multicast Service (MBMS); Protocols and
          codecs</title>

          <author>
            <organization>3GPP</organization>
          </author>

          <date day="20" month="March" year="2013"/>
        </front>

        <seriesInfo name="3GPP TS" value="26.346 10.7.0"/>

        <format target="http://www.3gpp.org/ftp/Specs/html-info/26346.htm"
                type="HTML"/>
      </reference>

      <reference anchor="T3GPP.33.246">
        <front>
          <title>3G Security; Security of Multimedia Broadcast/Multicast
          Service (MBMS)</title>

          <author>
            <organization>3GPP</organization>
          </author>

          <date day="21" month="December" year="2012"/>
        </front>

        <seriesInfo name="3GPP TS" value="33.246 12.1.0"/>

        <format target="http://www.3gpp.org/ftp/Specs/html-info/33246.htm"
                type="HTML"/>
      </reference>

      <reference anchor="T3GPP.33.328">
        <front>
          <title>IP Multimedia Subsystem (IMS) media plane security</title>

          <author>
            <organization>3GPP</organization>
          </author>

          <date day="21" month="December" year="2012"/>
        </front>

        <seriesInfo name="3GPP TS" value="33.328 12.1.0"/>

        <format target="http://www.3gpp.org/ftp/Specs/html-info/33328.htm"
                type="HTML"/>
      </reference>
    </references>
  </back>
</rfc>