http://stupid.domain.name/ietf/

One document matched: draft-wing-avt-dtls-srtp-key-transport-00.xml
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY I-D.ietf-avt-dtls-srtp SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-avt-dtls-srtp.xml">
<!ENTITY rfc2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc3550 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml">
<!ENTITY rfc4346 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4346.xml">
<!ENTITY rfc4086 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4086.xml">
<!ENTITY rfc2827 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2827.xml">
<!ENTITY I-D.ietf-mmusic-sdp-capability-negotiation SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-mmusic-sdp-capability-negotiation.xml">
<!ENTITY I-D.ietf-avt-topologies SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-avt-topologies.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc iprnotified="no" ?>
<?rfc strict="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<?rfc sortrefs="no" ?>
<?rfc colonspace='yes' ?>
<?rfc tocindent='yes' ?>
<rfc category="std" docName="draft-wing-avt-dtls-srtp-key-transport-00"
     ipr="full3978">
  <front>
    <title abbrev="DTLS-SRTP Key Transport">Datagram TLS Secure RTP
    (DTLS-SRTP) Key Transport</title>

    <author fullname="Dan Wing" initials="D." surname="Wing">
      <organization abbrev="Cisco Systems">Cisco Systems, Inc.</organization>

      <address>
        <postal>
          <street>170 West Tasman Drive</street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>USA</country>
        </postal>

        <email>dwing@cisco.com</email>
      </address>
    </author>

    <date year="2007" />

    <workgroup>AVT Working Group</workgroup>

    <abstract>
      <t>The existing DTLS-SRTP specification allows SRTP keys to be
      established between a pair of SRTP endpoints. However, when there are
      more than two participants in an RTP session, DTLS-SRTP is unable to
      provide a single key for all of the participants.</t>

      <t>This document describes an extension to DTLS-SRTP which transports
      SRTP keying material from one DTLS-SRTP peer to another, so the same
      SRTP keying material can be used by multiple DTLS-SRTP peers. This
      reduces or eliminates the need to key each SRTP session
      individually.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>Unique SRTP master keys can be established, in each direction, using
      <xref target="I-D.ietf-avt-dtls-srtp">DTLS-SRTP</xref>. Using DTLS, each
      peer contributes part of the keying material to generate the eventual
      SRTP master key.</t>

      <t>In some scenarios after the DTLS handshake has completed, it is
      desirable for one peer to change its SRTP key and to transmit SRTP
      packets using an SRTP key that was not derived from the DTLS key
      exchange. These scenarios are described in <xref
      target="sec_scenarios"></xref>.</t>

      <t>The extensions described in this document allow transporting an SRTP
      master key from one DTLS peer to the other. Thus, DTLS Key Transport
      differs from from normal DTLS-SRTP in that the SRTP master key is not
      derived from the TLS handshake.</t>
    </section>

    <section title="Notational Conventions">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"></xref>.</t>
    </section>

    <section anchor="sec_scenarios" title="Scenarios">
      <t>DTLS Key Transport allows mixers and video switchers to avoid
having to encrypt each packet multiple times under multiple SRTP keys.
DTLS Key Transport causes sharing of
      SRTP keys. When sharing SRTP keys with several listeners, it is
      imperative that the SRTP is changed when a new listener is added or a
      listener is removed. This is because a legitimate listener should only
      be able to decrypt the SRTP stream while he is listening; he should not
      be able to decrypt the SRTP stream prior to joining the conference or
      after leaving the conference. The needs for audio mixing and video
      switching are different, and are described below.</t>

      <section anchor="section_point_multipoint"
               title="Point to Multipoint using the RFC 3550 mixer model">
        <t>In <xref target="I-D.ietf-avt-topologies"></xref>, this RTP
        topology is called 'Topo-Mixer'. A common application of an RTP mixer
        is to receive one (or more) streams from speakers, each individually
        keyed. The mixer will then do something intelligent with those streams
        (e.g., select the loudest one), and sends that stream to all of the
        listeners. Using DTLS-SRTP, without key transport, it is
        computationally expensive for the mixer to encrypt the payload
        uniquely for each SRTP listener. The architecture of a typical mixer
        requires each listener's SRTP to be encrypted serially, incurring
        additional delay for each successive listener. This is depicted below
        in <xref target="multipoint-without-kt"></xref>.</t>

        <figure anchor="multipoint-without-kt"
                title="Point to Multipoint Mixer, without DTLS Key Transport">
          <preamble></preamble>

          <artwork align="center"><![CDATA[                     +-------+         +------------+
+---------+          |       +--key=C->+ listener 1 |
| speaker +--key=A-->+       |         +------------+
+---------+          |       |         +------------+
                     | mixer +--key=D->+ listener 2 |
+---------+          |       |         +------------+
| speaker +--key=B-->+       |         +------------+
+---------+          |       +--key=E->+ listener 3 |
                     +-------+         +------------+]]></artwork>

          <postamble></postamble>
        </figure>

        <t>With DTLS Key Transport, the mixer can maintain one outbound SRTP
        cryptographic context, and encrypt the SRTP once for all listeners.
        This is depicted below in <xref
        target="multipoint-with-kt"></xref>.</t>

        <figure anchor="multipoint-with-kt"
                title="Point to Multipoint Mixer, with DTLS Key Transport">
          <preamble></preamble>

          <artwork align="center"><![CDATA[                     +-------+         +------------+
+---------+          |       +--key=C->+ listener 1 |
| speaker +--key=A-->+       |         +------------+
+---------+          |       |         +------------+
                     | mixer +--key=C->+ listener 2 |
+---------+          |       |         +------------+
| speaker +--key=B-->+       |         +------------+
+---------+          |       +--key=C->+ listener 3 |
                     +-------+         +------------+]]></artwork>

          <postamble></postamble>
        </figure>

        <t>The mixer is aware of listeners leaving or joining, and can rekey
        the remaining active listeners by the mixer sending them a DTLS
        "new_srtp_key" message <xref
        target="sec_protocol_description"></xref>.</t>
      </section>

      <section title="Point to Multipoint using video switching MCU">
        <t>In <xref target="I-D.ietf-avt-topologies"></xref>, this RTP
        topology is called 'Topo-Video-switch-MCU'. Video switching works
        differently than audio mixing. In video switching a simple decision is
        made: which video stream to send to the listeners. As with audio
        mixing shown above, DTLS-SRTP without key transport is computationally
        expensive for the video switcher to encrypt the payload uniquely for
        each SRTP listener. The architecture of a typical video switcher
        requires each listener's SRTP to be encrypted serially, incurring
        additional delay for each successive listener. This is depicted below
        in <xref target="multipoint-video-without-kt"></xref>.</t>

        <figure anchor="multipoint-video-without-kt"
                title="Point to Multipoint Video Switching, without DTLS Key Transport">
          <preamble></preamble>

          <artwork align="center"><![CDATA[                     +--------- +         +------------+
+---------+          |          +--key=C->+ listener 1 |
| speaker +--key=A-->+selected  |         +------------+
+---------+          |          |         +------------+
                     | switcher +--key=D->+ listener 2 |
+---------+          |          |         +------------+
| speaker +--key=B-->+dropped   |         +------------+
+---------+          |          +--key=E->+ listener 3 |
                     +----------+         +------------+]]></artwork>

          <postamble></postamble>
        </figure>

        <t>With DTLS key transport, this becomes easier; in fact, the video
        switcher doesn't need to decrypt the SRTP at all, but just make its
        decision (select the stream or drop the stream) and transmit the SRTP
        packets to the listeners. This is depicted below in <xref
        target="multipoint-video-with-kt"></xref>.</t>

        <figure anchor="multipoint-video-with-kt"
                title="Point to Multipoint Video Switching, with DTLS Key Transport">
          <preamble></preamble>

          <artwork align="center"><![CDATA[                     +----------+         +------------+
+---------+          |          +==key=A=>+ listener 1 |
| speaker +==key=A==>+selected  |         +------------+
+---------+          |          |         +------------+
                     | switcher +==key=A=>+ listener 2 |
+---------+          |          |         +------------+
| speaker +==key=B==>+dropped   |         +------------+
+---------+          |          +==key=A=>+ listener 3 |
                     +----------+         +------------+]]></artwork>

          <postamble></postamble>
        </figure>

        <t>The video switcher is aware of listeners leaving or joining.
        When such a change occurs, the video switcher SHOULD request
the speaker(s) change their respective SRTP keys.  
This is done by
        the video switcher sending a DTLS "new_srtp_key_request" message. The
        speaker will respond with a DTLS "new_srtp_key" message, which will be
        relayed, by the switcher, to each of the active listeners.  
Failing to change the key when a listener joins (or leaves) allows a
listener to decrypt SRTP traffic prior to (or after) they are authorized
participants in the conference.
</t>
<t><list style="empty"><t>Discussion point: additionally, do we want to
recommend the video switcher also SRTP traffic until the key is 
changed?  Some security policies would reasonably require that,
but other security policies would not require that.
</t>
</list>
</t>
        <t>When there are multiple speakers, as shown in <xref
        target="multipoint-video-with-kt"></xref> above, each speaker
        transmits with his own SRTP key. That SRTP key is derived from the
        DTLS handshake with the switcher. Each speaker uses DTLS-SRTP Key
        Transport to signal the SSRC that it will use.</t>
      </section>
    </section>

    <section anchor="sec_protocol_description" title="Protocol Description">
      <t>This section describes the extension to the DTLS protocol for
      DTLS-SRTP key transport, which allows securely communicating the SRTP
      key to the DTLS peer.</t>

      <section title="key_transport extension">
        <t>This document adds a new negotiated extension called
        "key_transport", which MUST only be requested in conjunction with the
        "use_srtp" extension (Section 3.2 of <xref
        target="I-D.ietf-avt-dtls-srtp"></xref>). The DTLS server indicates
        its support for key_transport by including key_transport in its
        ServerHello message. If a DTLS client includes key_transport in its
        ClientHello, but does not receive key_transport in the ServerHello,
        the DTLS client MUST NOT send DTLS packets with the srtp_key_transport
        content-type.</t>

        <t>Support for the DTLS Key Transport extension is indicated in SDP
        with the "srtp-kt" attribute. Advertising support for the extension is
        necessary in SDP because in some cases it is required to establish an
        SRTP call. For example, a mixer may be able to only support SRTP
        listeners if those listeners implement DTLS Key Transport (because it
        lacks the CPU cycles necessary to encrypt SRTP uniquely for each
        listener).</t>

        <figure anchor="handshake_message_flow" title="Hangshake Message Flow">
          <preamble>A message flow showing a DTLS client and DTLS server using
          the key_transport extension</preamble>

          <artwork align="center"><![CDATA[
Client                                               Server

ClientHello + use_srtp + key_transport
                             -------->
                     ServerHello + use_srtp + key_transport
                                               Certificate*
                                         ServerKeyExchange*
                                        CertificateRequest*
                             <--------      ServerHelloDone
Certificate*
ClientKeyExchange
CertificateVerify*
[ChangeCipherSpec]
Finished                     -------->
                                         [ChangeCipherSpec]
                             <--------             Finished
SRTP packets                 <------->      SRTP packets
]]></artwork>

          <postamble></postamble>
        </figure>

        <t>After successful negotiation of the key_transport extension, the
        DTLS client and server MAY exchange SRTP packets, encrypted using the
        KDF described in <xref target="I-D.ietf-avt-dtls-srtp"></xref>. This
        is normal and expected, even if Key Transport was negotiated by both
        sides, as neither side may (yet) have a need to alter the SRTP key.
        However, it is also possible that one (or both) peers will immediately
        send a new_srtp_key message before sending any SRTP.</t>
      </section>

      <section title="SRTP Key Transport Protocol">
        <t>A new protocol is defined, called the srtp_key_transport protocol
        which uses srtp_key_transport content-type which consists of the
        following message types:</t>

        <t><list style="hanging">
            <t hangText="new_srtp_key_request:">request that the DTLS peer
            choose a new key. Valid responses are new_srtp_key and
            new_srtp_key_error.</t>

            <t hangText="new_srtp_key:">contains the new SRTP keying material.
            This message is sent by a DTLS endpoint that wants to change its
            SRTP key. This does not change any cryptographic parameters (a new
            DTLS handshake is necessary for that), but only the DTLS key for
            the associated SRTP session. This message includes the SSRC that
            will be used for this key, which allows listeners to establish one
            SRTP crypto-context per speaker (necessary for the video switching
            scenario).  The key chosen MUST be cryptographically random
<xref target="RFC4086"></xref>.</t>

            <t hangText="new_key_activate:">indicates receiver is prepared to
            receive SRTP packets encrypted with the new key.</t>

            <t hangText="new_srtp_key_failure:">indicates a failure.</t>
          </list></t>

        <t>At any time, the DTLS client or DTLS server MAY send a
        key_transport message, as shown in <xref
        target="new_key_message_flow"></xref>. The sender of the new_srtp_key
        message MAY immediately start transmitting SRTP packets with this new
        key. However, to account for loss of the new_srtp_key message it is
        RECOMMENDED that the sender wait before changing to the new SRTP key
        until it receives the new_key_activate message or it times out waiting
        for the new_key_activate_message. The duration of this timeout may
        vary depending on the sensitivity of the content (e.g., 1 second or 10
        seconds). In any case, the new_srtp_key message is retransmitted until
        acknowledged by receipt of a new_key_activate message.</t>

        <figure anchor="new_key_message_flow" title="New Key Message Flow">
          <preamble></preamble>

          <artwork align="center"><![CDATA[
Client / Server                             Server / Client

[new_srtp_key_request]        -------->       
                             <--------         new_srtp_key
new_srtp_key_activiate        -------->]]></artwork>

          <postamble></postamble>
        </figure>

        <t>The following figure shows the state machine for the protocol.</t>

        <figure anchor="state_machine"
                title="Key Transport protocol state machine">
          <preamble></preamble>

          <artwork align="center"><![CDATA[      receive new_srtp_key_request from peer
         or decide to choose new SRTP key
                     |
                     |
send                 V
new_srtp_key  +---------------+    timeout
    +---------| Communicate   |--------+
    |         |     Key       |        |
    +-------->|               |        |
              +---------------+        |
                |           ^          |
     receive    |           |   +----------------+
new_key_activate|           +---| send SRTP using|
                |               |  new SRTP key  |
        +----------------+      +----------------+
        | send SRTP using|
        |  new SRTP key  |
        +----------------+
                |
                V
               done]]></artwork>

          <postamble></postamble>
        </figure>

        <figure>
          <preamble>Using the syntax described in <xref
          target="RFC4346">TLS</xref>, the following structures are
          used:</preamble>

          <artwork align="center"><![CDATA[
enum {
   new_srtp_key_request(0),
   new_srtp_key(1),
   new_srtp_key_activate(2),
   new_srtp_key_failure(128),
   (255)
} SRTPKeyTransportType;

struct {
   SRTPKeyTransportType keytrans_type;
   uint24 length;
   uint16 message_seq;                               
   uint24 fragment_offset;                           
   uint24 fragment_length;                           
   select (SRTPKeyTransportType) {
      case new_srtp_key_request:         NewSRTPKeyRequest;
      case new_srtp_key:                 NewSRTPKey;
      case new_srtp_key_activate:        NewSRTPKeyActivate;
      case new_srtp_key_failure:         NewSRTPKeyFailure;
    };
} KeyTransport;

struct {
    uint  random<64>;           // additional entropy for peer
} NewSRTPKeyRequest;

struct {
    uint32 ssrc;                // SSRC used for this keys
    uint   key<16..32>;         // The last change_cipher_spec 
    uint   auth_tag<4..10>      // decides the length of key 
                                // and auth_tag
    uint  random<64>;           // additional entropy for peer
} NewSRTPKey;

struct { 
    uint  random<64>;           // additional entropy for peer
} NewSRTPKeyActivate;

struct { } NewSRTPKeyFailure;]]></artwork>

          <postamble></postamble>
        </figure>
      </section>

      <section title="Key Transport SSRC and RTP SSRC Collisions">
        <t>Per <xref target="RFC3550"></xref>, if an RTP source notices an RTP SSRC collision, it is required to
        change its SSRC. If it has negotiated support for DTLS-SRTP Key
        Transport, it then MUST also send a new_srtp_key message indicating the new
        SSRC. As part of this procedure, it does not need to also change its
        SRTP key.</t>

        <t>The communication of the SSRC used for each key is necessary if
        there are multiple speakers in the video switching scenario. However,
        because a speaker is not able to determine if their audio or their
        video is being switched, a speaker MUST always adhere indicate a
        change in SSRC by following the procedure in this section for any SRTP
        stream (audio, video, or other).</t>
      </section>

      <section title="Fragmentation, Reassembly, and Retransmission">
        <t>Much like the DTLS handshake itself, this extension also needs to
        handle fragmentation and reassembly (to send a large key) and
        retransmission (to account for packet loss). Using the same technique
        as DTLS's handshake, the message_seq, fragment_offset, fragment_length
        are used.</t>

        <t>When transmitting the key transport message, the sender divides the
        message into a series of N contiguous data ranges. These ranges MUST
        NOT be larger than the maximum handshake fragment size and MUST
        jointly contain the entire key transport message. The ranges SHOULD
        NOT overlap. The sender then creates N key transport messages, all
        with the same message_seq value as the original key transport message.
        Each new message is labelled with the fragment_offset (the number of
        bytes contained in previous fragments) and the fragment_length (the
        length of this fragment). The length field in all messages is the same
        as the length field of the original message. An unfragmented message
        is a degenerate case with fragment_offset=0 and
        fragment_length=length.</t>

        <t>When a DTLS implementation receives a key transport message
        fragment, it MUST buffer it until it has the entire key transport
        message. DTLS implementations MUST be able to handle overlapping
        fragment ranges. This allows senders to retransmit key transport
        messages with smaller fragment sizes during path MTU discovery.</t>
      </section>
    </section>

    <section title="Examples">
      <t>In some scenarios, a device cannot successfully establish an SRTP
      session without DTLS-SRTP Key Transport. The following two examples
      show</t>

      <figure anchor="fig_example_offer_kt_required"
              title="Example SDP offer showing Key Transport is required">
        <preamble>Using the SDP syntax described in <xref
        target="I-D.ietf-mmusic-sdp-capability-negotiation"></xref>, the
        following figure shows an offerer that requires DTLS Key Transport in
        order to set up this call as an SRTP call, otherwise it can set up
        this call as an RTP call. This is indicated by the ",2" on the
        "a=pcfg" line. If the answerer does not understand "a=dtls-srtp" but
        does understand DTLS-SRTP and <xref
        target="I-D.ietf-mmusic-sdp-capability-negotiation"></xref>, this can
        cannot be established using DTLS-SRTP; however, it can be established
        using RTP.</preamble>

        <artwork><![CDATA[      v=0 
      o=- 25678 753849 IN IP4 192.0.2.1 
      s=  
      c=IN IP4 192.0.2.1 
      t=0 0 
      m=audio 53456 RTP/AVP 0 18  
      a=tcap:1 RTP/SAVP  
      a=acap:1 SHA-1 \
        4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
      a=acap:2 dtls-srtp  
      a=pcfg:1 t=1 a=1,2 ]]></artwork>

        <postamble></postamble>
      </figure>

      <figure anchor="fig_example_offer_kt_optional"
              title="Example SDP offer showing Key Transport is optional">
        <preamble>Using the SDP syntax described in <xref
        target="I-D.ietf-mmusic-sdp-capability-negotiation"></xref>, the
        following figure shows an offerer that indicates support for DTLS Key
        Transport but does not require DTLS Key Transport in order to set up
        this call as an SRTP call. This is indicated by the ",[2]" on the
        "a=pcfg" line. If the answerer does not understand "a=dtls-srtp" but
        does understand DTLS-SRTP and <xref
        target="I-D.ietf-mmusic-sdp-capability-negotiation"></xref>, this call
        can still be established using DTLS-SRTP.</preamble>

        <artwork><![CDATA[      v=0 
      o=- 25678 753849 IN IP4 192.0.2.1 
      s=  
      c=IN IP4 192.0.2.1 
      t=0 0 
      m=audio 53456 RTP/AVP 0 18  
      a=tcap:1 RTP/SAVP  
      a=acap:1 SHA-1 \
          4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
      a=acap:2 dtls-srtp  
      a=pcfg:1 t=1 a=1,[2] ]]></artwork>

        <postamble></postamble>
      </figure>

      <t></t>
    </section>

    <section anchor="security_considerations" title="Security Considerations">

      <t>In the point-to-multipoint scenario, <xref
      target="section_point_multipoint"></xref>, all of the listeners know the
      key being used by the mixer. Any of those listeners could create SRTP
      packets that are encrypted with this same key, and send those SRTP
      packets to other listeners. In order to reduce the vulnerability to this
      threat, it is RECOMMENDED that the source transport address of received
      SRTP packets be discarded if they do not match the source transport
      address of the associated DTLS-SRTP session. Additionally, the network
      SHOULD prevent IP address spoofing <xref target="RFC2827"></xref>.</t>

      <t>In the</t>
    </section>

    <section title="Acknowledgements">
      <t>Thanks to David McGrew for his improvements to this document and to
      the underlying protocol.</t>
    </section>

    <section title="IANA Considerations">
      <t>Register new SDP attribute "dtls-kt", new TLS content-type
      "key_transport" (26?).</t>

      <t></t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      &I-D.ietf-avt-dtls-srtp;

      &rfc2119;

      &rfc4346;

      &rfc2827;

      &I-D.ietf-mmusic-sdp-capability-negotiation;

    </references>

    <references title="Informational References">
      &I-D.ietf-avt-topologies;
&rfc3550;
&rfc4086;
    </references>
  </back>
</rfc>
PAFTECH AB 2003-2026
2026-04-24 06:35:45