<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-martinsen-tram-stuntrace-00"
     ipr="trust200902">
  <front>
    <title abbrev="stun trace">STUN Traceroute</title>
    <author fullname="Paal-Erik Martinsen" initials="P.E" surname="Martinsen">
      <organization abbrev="Cisco">Cisco Systems, Inc.</organization>
      <address>
        <postal>
          <street>Philip Pedersens Vei 22</street>
          <code>1325</code>
          <city>Lysaker</city>
          <region>Akershus</region>
          <country>Norway</country>
        </postal>
        <email>palmarti@cisco.com</email>
      </address>
    </author>

    <author fullname="Dan Wing" initials="D." surname="Wing">
      <organization abbrev="Cisco">Cisco Systems, Inc.</organization>
      <address>
        <postal>
          <street>170 West Tasman Drive</street>
          <city>San Jose</city>
          <region>California</region>
          <code>95134</code>
          <country>USA</country>
        </postal>
        <email>dwing@cisco.com</email>
      </address>
    </author>


    <date />

    <workgroup>TRAM</workgroup>

    <abstract>
      <t>
        After a UDP-based protocol such as RTP determines that a
        network path is experiencing problems, a traceroute is often
        useful to determine which router or link is contributing to
        the problem.  However, operating system traceroute commands
        can follow a different path than the actual UDP flow, which
        complicates troubleshooting.  This document describes a method
        that is path-congruent with the UDP flow itself, works on IPv4
        and IPv6, and does not require administrative privileges on
        most operating systems.
      </t>
    </abstract>
  </front>

  <middle>
    <section anchor="introduction" title="Introduction">
      <t>
        Traceroute <xref target="RFC1393"/> is a simple tool available
        on most operating systems and is popular for debugging the
        network by reporting the round-trip time to each hop on the
        way to a remote IP address.  More advanced tools, such as MTR,
        provide additional metrics such as packet loss and round-trip
        time to each hop, measured over several seconds or minutes.
      </t>
      <t>
        To simplify network debugging when dealing with bi-directional
        real-time media, it is often useful to get as much information
        as possible about the network path. In this specification,
        probe packets are sent on the same 5-tuple as the (S)RTP
        media. This provides the most accurate results, because probe
        packets sent on a different 5-tuple may take another path due
        to Equal-Cost Multipath (ECMP, <xref target="RFC2992"/>),
        policy-based routing, and similar techniques.
      </t>
      <t>
        To avoid those problems, the probe packets need to be sent
        from the same socket and with the same DiffServ code point as
        the normal (S)RTP media packets.  As shown in the examples below,
        most operating systems can pass the ICMP "Time to Live
        Exceeded" error to the application, so the application can
        perform the diagnostics over that network path.
      </t>
      <t>
        This specification uses STUN <xref target="RFC5389"/> packets
        as probes. STUN packets are designed to be multiplexed
        together with RTP <xref target="RFC3550"/> (and <xref
        target="RFC3711">SRTP</xref>) and are unlikely to cause any
        "problems" for the (S)RTP receiver.  To differentiate each hop
        count, classic traceroute uses different UDP port numbers
        (e.g., TTL=1 uses UDP port 55001, TTL=2 uses UDP port 55002,
        etc.).  The mechanism described here uses the same UDP port
        number (so that the trace is path-congruent with the (S)RTP
        packets), and uses different length UDP packets to
        differentiate each hop count (e.g., TTL=1 uses length 501,
        TTL=2 uses length 502, etc.).
      </t>
      <t>
        To receive richer information from each hop, we also describe
        how to use ICMP extensions <xref target="RFC4884"/>.
      </t>
      <t>
        Using a technique based on ICMP replies avoids a forklift
        upgrade of the network to provide host applications with
        useful information. ICMP is already supported in most network
        and application stacks. It also lets routers and applications
        share a richer set of information, both through the STUN
        probes themselves and through STUN responses that carry
        additional information for the application and are embedded
        in the ICMP reply.
      </t>
    </section>

    <section anchor="notation" title="Notational Conventions">
      <t>
        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
        NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
        "OPTIONAL" in this document are to be interpreted as described
        in <xref target="RFC2119"/>.
      </t>
    </section>

    <section title="Overview of Operation">
      <t>
        An application using (S)RTP to send and receive media like
        audio and video following the guidelines in <xref
        target="RFC4961"/> uses symmetric send and receive ports. The
        application opens one socket that it uses to both send and
        receive media on.
      </t>
      <t>
        Multiplexing media streams as described in Bundle <xref
        target="I-D.ietf-mmusic-sdp-bundle-negotiation" /> would limit
        the number of media paths that need tracing for
        troubleshooting purposes. The method described here can be
        used to trace numerous paths simultaneously and does not rely
        on any multiplexing to work.
      </t>
      <t>
        It is important to note that the functionality described here
        can be done on most OSes without any administrative
        privileges.
      </t>
      <t>
        <xref target="fig_overview"/> depicts the various components
        needed for this to work. The application opens its media
        socket as it would in normal cases where media is to be sent
        and received. It also opens an ICMP socket or installs an
        error listener on the media socket.
      </t>
      <t>
      <figure anchor="fig_overview"><artwork><![CDATA[

                  POLL/      Network Node      Network Node
                  SELECT
 +-----+            |             |                  |
 |  A  |* ICMP +++ /+\++++++++++++|++++ <ICMP Reply> |
 |  L  |  SOCKET  |   |           |    ++++++++++++  |
 |  I  |          |   |           |                + |
 |  C  |* MEDIA ==|===|===========|=================+|====<(S)RTP>
 |  E  |  SOCKET --\-/------------|------------------X <STUN Probe>
 +-----+            |             |                  |(TTL expired)

                        ====== Media Path
                        ------ STUN Probes (on same 5 tuple as Media)
                        ++++++ ICMP reply

      ]]></artwork></figure>
      </t>
      <t>
        The application also needs to listen on the sockets for any
        incoming ICMP packets or socket error messages. This is
        usually done with the socket calls select() or poll(). How to
        actually receive the ICMP messages varies from OS to
        OS. See <xref target="sec_implementation_details"/> for
        implementation details on various OSes.
      </t>
      <t>
        Once the application has media running and is listening for
        ICMP replies, it can start sending probes to detect network
        nodes in the media path. This is done by sending STUN messages
        with the TTL/HOP_LIMIT field set in the IPv4/IPv6
        header. <xref target="sec_set_ttl"/> explains how to set this
        on various platforms.
      </t>
      <t>
        The STUN packet is sent on the same socket that the media
        packets are sent and received on.  Mixing (S)RTP and STUN is
        well-known behavior and should not cause any problems.
      </t>
      <t>
        Along the path, every layer 3 network node (a.k.a. router)
        decreases the IPv4 TTL or IPv6 HOP_LIMIT field.  If the field
        reaches 0, the network node responds with an ICMP Time
        Exceeded message: "Time to Live Exceeded" for IPv4 or "Hop
        Limit Exceeded in Transit" for IPv6.
      </t>
      <t>
        The application will receive an ICMP error in response to the
        offending probe packet. The source IP address of the ICMP
        packet is that of the network node that sent it. This enables
        the application to trace the path towards the destination. The
        ICMP reply contains the IP header and at least the first 8
        bytes of the payload of the offending packet. This quoted
        portion, called the IP fragment of the offending packet in
        this document, can be used to determine whether the ICMP reply
        really is a reply to a packet the application sent.
      </t>
      <!--
      <t>
        TODO: need to describe PADDING and using different-length
        packets in detail
      </t>
      -->

    </section>

    <section title="New STUN Attributes" anchor="new_attributes" >
      <t>
        This STUN extension defines the following new attribute:
      </t>
      <figure>
        <artwork align="left">
      <![CDATA[
      0xXXX0: PATH-NODE-PROBE


      ]]></artwork>
      </figure>

      <section title="PATH-NODE-PROBE">
        <t>
          This attribute has a length of 8. Padding is needed to reach
          the 32-bit boundary required for STUN attributes.
        </t>
        <t>
          <figure anchor="hop_attr" title="PATH-NODE-PROBE Attribute">
            <artwork align="left"><![CDATA[
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | HOP           | PAD_8         |           PAD_16              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            ]]></artwork></figure></t>
          <t>
            The HOP field indicates what hop in the network path
            (relative to the application) the application is trying to
            learn the IP address of. This field should be set to the
            same value as the TTL/HOP_LIMIT field in the IPv4/IPv6
            header of the probe packet leaving the application. Note
            that the TTL/HOP_LIMIT field in the IPv4/IPv6 header will
            decrease as the packet traverses the path. The HOP field
            in the attribute will remain unchanged.
          </t>
          <t>
            This attribute is useful for clients when receiving the
            whole offending IP packet in the ICMP reply or when the
            rich ICMP feedback mechanism described in <xref
            target="sec_rich_feedback"/> is in use. The attribute will
            also be reflected back in a STUN response if the remote
            application supports it. This makes it easier to correlate
            sent probe packets and ICMP responses.
          </t>
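          <t>
            The attribute layout above can be sketched in C as
            follows. This is only an illustration, not a normative
            encoding: the function name is hypothetical, the attribute
            type is passed in by the caller because no code point has
            been assigned yet, the "length of 8" is read here as the
            4-byte STUN attribute header plus the 4-byte value shown
            in the figure, and the padding bytes are assumed to be
            zero.
          </t>
          <t><figure><artwork><![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

/* Assumed total size: 4-byte attribute header + 4-byte value. */
#define PATH_NODE_PROBE_SIZE 8

/*
 * Writes a PATH-NODE-PROBE attribute into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 * attr_type is caller supplied (code point not yet assigned).
 */
size_t encode_path_node_probe(uint8_t *buf, size_t buflen,
                              uint16_t attr_type, uint8_t hop)
{
    if (buf == NULL || buflen < PATH_NODE_PROBE_SIZE)
        return 0;
    buf[0] = (uint8_t)(attr_type >> 8);   /* type, network order */
    buf[1] = (uint8_t)(attr_type & 0xff);
    buf[2] = 0;                           /* value length = 4    */
    buf[3] = 4;
    buf[4] = hop;                         /* HOP                 */
    buf[5] = 0;                           /* PAD_8  (assumed 0)  */
    buf[6] = 0;                           /* PAD_16 (assumed 0)  */
    buf[7] = 0;
    return PATH_NODE_PROBE_SIZE;
}
```
          ]]></artwork></figure></t>
          <t>
            The caller would pass the same value for hop as it sets in
            the TTL/HOP_LIMIT field of the outgoing probe.
          </t>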
      </section>
    </section>

    <section title="Base Protocol Procedures">
      <t>
        The procedures are simple: send a probe packet that may or may
        not trigger a reply from one of the nodes in the network path,
        then listen for and parse any incoming replies.  The reply
        might be an ICMP Time To Live Exceeded (from an intermediate
        hop), a STUN response (from the (S)RTP peer), or any other
        ICMP error message.
      </t>

      <section title="Forming STUN Packet Probes">
        <t>
          To reduce the chances of a STUN traceroute probe being
          stopped by various middleboxes, it is RECOMMENDED to use a
          STUN Binding request as described in ICE <xref
          target="RFC5245"/>.
        </t>
        <t>
          Since the STUN packet can traverse the whole media path and
          reach the remote peer, it is RECOMMENDED that the agent
          follow the guidelines for sending connectivity checks
          defined in ICE <xref target="RFC5245"/>. Adding a USERNAME
          attribute and integrity protecting the STUN message enables
          the remote peer to authenticate the STUN message and create
          an appropriate response. If the remote peer is unable to
          authenticate the STUN request it will not send any
          response. Getting a response from the remote peer is useful
          as it indicates that the probe has traveled the whole
          network path.
        </t>
        <t>
          When forming the STUN packet probe the agent SHOULD add the
          PATH-NODE-PROBE attribute and MAY add a PADDING attribute as
          described in <xref target="RFC5780"/> Section 7.6. The
          PATH-NODE-PROBE attribute is useful for STUN servers
          receiving the STUN probe, and it can be used to correlate
          ICMP replies if the reply contains the complete offending
          packet. Adding the PADDING attribute is useful for clients
          that need to have several outstanding probe packets on the
          same 5-tuple: when each probe has a different length, the
          length of the offending packet reported back in an ICMP
          reply makes it possible to correlate the reply with the
          correct probe.
        </t>
        <t>
          The agent sending the STUN packet probe MUST store the
          length of the UDP packet (as reported in the IP header)
          containing the STUN probe.
        </t>
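        <t>
          One possible way to keep track of the stored lengths is a
          small table indexed by TTL, as sketched below. The type and
          function names, and the limit of 32 hops, are assumptions
          made for this illustration only; the sketch also assumes
          that each outstanding probe has a distinct length.
        </t>
        <t><figure><artwork><![CDATA[
```c
#include <stdint.h>

#define MAX_PROBE_HOPS 32   /* illustrative upper bound on TTL */

/* Hypothetical bookkeeping: one outstanding probe per TTL. */
struct probe_table {
    uint16_t ip_len[MAX_PROBE_HOPS + 1]; /* index = TTL, 0 = unused */
};

/* Remember the IP length of the probe sent with this TTL. */
int probe_store(struct probe_table *t, int ttl, uint16_t ip_len)
{
    if (ttl < 1 || ttl > MAX_PROBE_HOPS || ip_len == 0)
        return -1;
    t->ip_len[ttl] = ip_len;
    return 0;
}

/* Return the TTL whose probe had this length, or -1 if none. */
int probe_lookup(const struct probe_table *t, uint16_t ip_len)
{
    int ttl;

    if (ip_len == 0)
        return -1;
    for (ttl = 1; ttl <= MAX_PROBE_HOPS; ttl++)
        if (t->ip_len[ttl] == ip_len)
            return ttl;
    return -1;
}
```
        ]]></artwork></figure></t>
        <t>
          With such a table, the length found in an ICMP reply maps
          directly back to the hop that the reply describes.
        </t>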
        <t>
          Before sending the probe on the wire, it is important to set
          the appropriate TTL or HOP_LIMIT field in the IPv4 or IPv6
          header. How to do this on various OSes is described in
          <xref target="sec_set_ttl"/>.
        </t>
        <t>
          The probe MUST also be sent with the same DSCP value as the
          (S)RTP packets. This is normally not a problem as the STUN
          probes and (S)RTP packets are sent on the same socket.
        </t>
      </section>

      <section title="Receiving a STUN Packet Probe" anchor="sec_rx_probe">
        <t>
          An agent that listens for STUN requests (a.k.a. a STUN
          server) and receives a STUN request with a PATH-NODE-PROBE
          attribute MUST include a PATH-NODE-PROBE attribute with the
          same value in the generated response.
        </t>
        <t>
          Any PADDING attributes as defined in <xref target="RFC5780"/>
          SHOULD be ignored by the STUN server.
        </t>
      </section>

      <section title="Receiving ICMP Messages">
        <t>
          After an agent sends a STUN probe, it must be ready to
          receive an ICMP reply or a STUN reply.  Details on how to do
          this on various OSes are described in <xref
          target="sec_rx_icmp"/>.
        </t>
        <!--
        <t>
          TODO: Terminology. Is "IP fragment of the offending probe
          packet in the ICMP reply" clear and avoids confusion?

          TODO: earlier we need to use the term "IP Fragment" to
          explain that is the 64 bits (or whatever length) returned by
          routers in their ICMP payloads.  I think the paragraph below
          is first time we use the term "IP Fragment".  "IP fragment"
          is confusing when used with ICMP because it isn't a
          *fragmented IP packet*, which was my first interpretation of
          the term "IP fragment"!
        </t>
        -->
        <t>
          To prevent ICMP spoofing attacks <xref target="RFC5927"/>,
          the received ICMP packet MUST be validated by port number
          and length, using the IP fragment of the offending packet
          contained in the ICMP payload.  Port number validation
          checks that the port numbers in the IP fragment of the
          offending packet correspond to the (S)RTP media (and STUN
          probe) 5-tuple.  Length validation checks the IP packet
          length field in the IP fragment of the offending packet.
          This value MUST match one of the lengths stored when the
          agent sent its STUN probes.  If the agent uses the PADDING
          attribute (defined in <xref target="RFC5780"/>) to give each
          STUN probe a different length, it is possible to have
          several outstanding probes, thus speeding up the trace.
        </t>
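        <t>
          The port and length validation described above might look
          roughly like the following for IPv4. This is a minimal
          sketch under several assumptions: the function name is
          hypothetical, the buffer starts at the ICMP header (any
          outer IP header already stripped), and address comparison
          for the rest of the 5-tuple is left out for brevity.
        </t>
        <t><figure><artwork><![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian (network order) value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * icmp points at the ICMP header of an IPv4 ICMP message.
 * Returns 1 if it is a Time Exceeded (type 11, code 0) that
 * quotes a UDP datagram with the given ports and IP total
 * length, otherwise 0.
 */
int icmp_matches_probe(const uint8_t *icmp, size_t len,
                       uint16_t src_port, uint16_t dst_port,
                       uint16_t probe_ip_len)
{
    const uint8_t *ip, *udp;
    size_t ihl;

    /* 8 bytes ICMP + 20 bytes minimum IP + 8 bytes of payload */
    if (icmp == NULL || len < 8 + 20 + 8)
        return 0;
    if (icmp[0] != 11 || icmp[1] != 0)
        return 0;
    ip = icmp + 8;                        /* quoted IP header    */
    if ((ip[0] >> 4) != 4)
        return 0;
    ihl = (size_t)(ip[0] & 0x0f) * 4;     /* header length bytes */
    if (ihl < 20 || len < 8 + ihl + 8)
        return 0;
    if (ip[9] != 17)                      /* protocol must be UDP */
        return 0;
    udp = ip + ihl;                       /* quoted UDP header   */
    return rd16(udp) == src_port &&
           rd16(udp + 2) == dst_port &&
           rd16(ip + 2) == probe_ip_len;
}
```
        ]]></artwork></figure></t>
        <t>
          An agent with several outstanding probes would call such a
          check once per stored length, or look the quoted length up
          in its own bookkeeping first.
        </t>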
      </section>
    </section>

    <section title="Rich Network Feedback" anchor="sec_rich_feedback">
      <t>
        ICMP can be extended to include multi-part messages as
        described in <xref target="RFC4884"/>. In the Discuss document
        <xref target="I-D.martinsen-tram-discuss"/>, we explored the
        possibility of using STUN as a transport protocol to get
        network feedback back to the client.  If the network element
        understands STUN it is possible to create a STUN response and
        add that to the ICMP reply, in addition to the mandatory 8
        bytes.  This STUN response can carry additional information
        from the network to the client.
      </t>

      <section title="Forming and Sending Request">
        <t>
          When forming the STUN packet probe the agent can add any of
          the attributes described in the discuss draft <xref
          target="I-D.martinsen-tram-discuss"/>.
        </t>
      </section>
      <section title="Receiving and Responding to Request">
        <t>
          If the network element understands the STUN message probe
          containing discuss attributes, it creates a STUN response as
          described in the discuss draft and puts it into the ICMP
          response.
        </t>
      </section>
      <section title="Receiving a Response">
        <t>
          If the agent receives an ICMP reply that contains a STUN
          response, the response MUST be processed as an ordinary STUN
          response. This will end the STUN transaction.
          </t>
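        <t>
          Before handing bytes from an ICMP reply to the normal STUN
          processing, an agent may want a cheap sanity check that
          they look like a STUN response at all. The sketch below
          applies the generic STUN header rules from <xref
          target="RFC5389"/>. The function name is hypothetical, and
          where in the ICMP message the STUN response is located is
          not covered here; the caller is assumed to already know
          the offset.
        </t>
        <t><figure><artwork><![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

#define STUN_HEADER_LEN 20

/* Returns 1 if p[0..len) starts with a plausible STUN response. */
int looks_like_stun_response(const uint8_t *p, size_t len)
{
    uint16_t type, mlen;

    if (p == NULL || len < STUN_HEADER_LEN)
        return 0;
    type = (uint16_t)((p[0] << 8) | p[1]);
    mlen = (uint16_t)((p[2] << 8) | p[3]);
    if (type & 0xC000)          /* top two bits must be zero    */
        return 0;
    if (!(type & 0x0100))       /* class bit C1 set: a response */
        return 0;
    if (p[4] != 0x21 || p[5] != 0x12 ||
        p[6] != 0xA4 || p[7] != 0x42)
        return 0;               /* magic cookie 0x2112A442      */
    if (mlen % 4 != 0 || (size_t)mlen + STUN_HEADER_LEN > len)
        return 0;               /* length must be sane          */
    return 1;
}
```
        ]]></artwork></figure></t>
        <t>
          Passing this check says nothing about authenticity; the
          usual STUN transaction matching and integrity checks still
          apply.
        </t>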
      </section>
    </section>

    <section title="IPv4 and IPv6 Differences">
      <t>
        The core functionality is the same for IPv4 and IPv6; some
        code points differ. (More to be added later.)
      </t>
    </section>
        
    <section title="IANA Considerations">
      <t>
        The codepoint for the new STUN attribute defined in this
        specification is described in <xref target="new_attributes"/>.
      </t>
    </section>

    <section anchor="security" title="Security Considerations">
      <t>
        ICMP messages do leak network topology. This is a well-known
        threat to networks, and mitigations have long existed in
        routers and firewalls so that networks can be configured not
        to leak this topology information beyond their borders.
      </t>
      <t>
        ICMP spoofing and DoS attack prevention mechanisms exist in
        routers deployed on the Internet today.
      </t>
      <t>
        No new threats have been added in this specification.
      </t>
    </section>

    <section anchor="ack" title="Acknowledgements">
      <t>
        The authors thank Trond Andersen for actually implementing
        this and Wilson Chen for helping out with testing of
        different OS behavior.
      </t>
    </section>
    
</middle>

<back>
  <references title="Normative References">
    
    <?rfc include="reference.RFC.2119"?>
    <?rfc include="reference.RFC.3550"?>
    <?rfc include="reference.RFC.3711"?>
    <?rfc include="reference.RFC.4884"?>
    <?rfc include="reference.RFC.4961"?>
    
    <?rfc include="reference.RFC.5245"?>
    <?rfc include="reference.RFC.5389"?>
    <?rfc include="reference.RFC.5780"?>
    
    <?rfc include='reference.I-D.martinsen-tram-discuss' ?>
  </references>
  
  <references title="Informative References">
    <?rfc include="reference.RFC.1393"?>
      <?rfc include="reference.RFC.2992"?>
      <?rfc include="reference.RFC.5927"?>
      <?rfc include='reference.I-D.ietf-mmusic-sdp-bundle-negotiation'?>
      
      <reference anchor="ICMPTest" target="https://github.com/palerikm/ICMPTest/">
        <front>
          <title>ICMP test github repo</title>
          <author/>
          <date/>
        </front>
      </reference>
  </references>
  
  
  <section title="Platform Implementation Details" anchor="sec_implementation_details">
    <t>
      This section provides examples and hints on how probe packets
      can be sent and ICMP messages received on various OSes. For a
      complete example please refer to <xref target="ICMPTest"/>.
    </t>
    <section title="Setting TTL or HOP_LIMIT on Probes" anchor="sec_set_ttl">
      <t>
        Setting the appropriate value in the IPv4 or IPv6 header is
        the same for most platforms. Use
      </t>
      <t>
        <figure><artwork><![CDATA[
          setsockopt(sockHandle, IPPROTO_IP, IP_TTL, &sock_ttl,
          sizeof(sock_ttl)); 
          ]]></artwork></figure>
      </t>
      <t>
        for IPv4 or
      </t>
      <t>
        <figure><artwork><![CDATA[
          setsockopt(sockHandle, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
          &sock_ttl, sizeof(sock_ttl));
          ]]></artwork></figure>
      </t>
      <t>
        for IPv6.
      </t>
      <t>
        Because the probes are sent on the same socket as the media,
        implementations must set this value only when sending a probe
        packet and set it back to its initial value before sending
        media again. Most OSes seem to handle the setsockopt call
        correctly and do not change the value in the IP header of any
        buffered packets.
      </t>
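      <t>
        The set, send, and restore sequence described above can be
        wrapped in a small helper. The following IPv4 sketch is one
        possible shape; the function name is hypothetical, and the
        IPv6 variant would use IPPROTO_IPV6 and IPV6_UNICAST_HOPS
        instead.
      </t>
      <t><figure><artwork><![CDATA[
```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Sends one probe with the given TTL on an IPv4 UDP socket and
 * then restores the TTL that was in effect before the call.
 * Returns the sendto() result, or -1 if a socket option call
 * fails.
 */
ssize_t send_probe_ipv4(int fd, const void *buf, size_t len,
                        const struct sockaddr *to, socklen_t tolen,
                        int probe_ttl)
{
    int saved_ttl;
    socklen_t optlen = sizeof(saved_ttl);
    ssize_t sent;

    if (getsockopt(fd, IPPROTO_IP, IP_TTL,
                   &saved_ttl, &optlen) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IP, IP_TTL,
                   &probe_ttl, sizeof(probe_ttl)) < 0)
        return -1;
    sent = sendto(fd, buf, len, 0, to, tolen);
    /* Restore even if sendto() failed so media keeps its TTL. */
    if (setsockopt(fd, IPPROTO_IP, IP_TTL,
                   &saved_ttl, sizeof(saved_ttl)) < 0)
        return -1;
    return sent;
}
```
      ]]></artwork></figure></t>
      <t>
        In a multi-threaded application the media sender would need
        to be kept from using the socket between the two setsockopt
        calls; that synchronization is not shown here.
      </t>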
    </section>
        
    <section title="Receiving ICMP Messages" anchor="sec_rx_icmp">
      <section title="OS-X and iOS">
        <t>
          Creating a socket to listen for incoming ICMP messages can
          be done as:
        </t>
        <t><figure><artwork><![CDATA[
    icmpSocket=socket(config.remoteAddr.ss_family, SOCK_DGRAM,
                      IPPROTO_ICMP);
         ]]></artwork></figure></t>
        <t>
          This is done in addition to the normal socket used to send
          media (RTP) and probes. Note that even if the probes are
          sent on the media socket, the ICMP replies will arrive on
          the ICMP socket.
        </t>
        <t>
          Code in the while(1) loop of poll would look something like:
        </t>
        <t><figure><artwork><![CDATA[
    for(i=0;i<numSockets;i++){
        if (ufds[i].revents & POLLIN) {
            if(i == rtpSock){
                //Handle "normal" data here.
            }
            if(i == icmpSock){//This is the ICMP socket
               //Handle ICMP packets here.
            }
        }
    }
        ]]></artwork></figure></t>
          
        </section>

        <section title="Linux and Android">
          <t>
            For an unprivileged recipient of ICMP messages, an error
            handler must be installed. This can be done like:
          </t>
          <t><figure><artwork><![CDATA[
   int val = 1; /* 1 = queue ICMP errors for this socket */
   if (setsockopt(config.sockfd, SOL_IP, IP_RECVERR,
                  &val, sizeof(val)) < 0)
       perror("setsockopt");
               ]]></artwork></figure></t>
          <t>
            In the poll() section of the code something like this
            needs to be there:
          </t>
          <t>
            <figure><artwork><![CDATA[
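    char ctrl[512], data[512];
    struct iovec iov = { data, sizeof(data) };
    struct msghdr msg = { 0 };

    //recvmsg needs buffers for the payload and the control data
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = sizeof(ctrl);

    if (ufds[dataSock].revents & POLLERR) {
        if (recvmsg(sockfd, &msg, MSG_ERRQUEUE) == -1) {
            //Ignore for now. Will get it later..
            continue;
        }
        //possible ICMP message
        //use cmsg to read the structures in msg
    }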
            ]]></artwork></figure>
          </t>
          <t>
            Failing to call recvmsg() seems to let the message fall
            through to the kernel, which then appears to close the
            socket because of the received error. So be careful!
          </t>
          <t>
            For an application with the right administrative
            privileges, it is possible to create a separate ICMP
            listen socket as described in the previous section. The
            socket() call would then look like:
          </t>
          <t>
            <figure><artwork><![CDATA[
            icmpSocket=socket(config.remoteAddr.ss_family, SOCK_RAW,
            IPPROTO_ICMP); ]]></artwork></figure> 
          </t>
          <t>
            The poll() loop will be as described for OS-X and iOS.
            There is no need for an error handler.
          </t>
        </section>
        
        <section title="Windows">
          <t>
            You might be able to get the IP address of the router
            returning the ICMP error by using an unconnected UDP
            socket, and looking at the 'from' sockaddr returned by
            recvfrom() or WSARecvMsg() when it returns an error.
          </t>
          <t>
            That is, we think that when an ICMP error is received and
            the embedded header matches what the socket uses,
            recvfrom/WSARecvMsg will complete with an error, and put
            the ICMP sender's sockaddr in the 'from' field.
          </t>
        </section>
    </section>
  </section>
</back>
</rfc>
  
