<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info" docName="draft-ietf-intarea-nat-reveal-analysis-03"
     ipr="trust200902">
  <front>
    <title abbrev="Revealing HOST_ID">Analysis of Solution Candidates to
    Reveal a Host Identifier (HOST_ID) in Shared Address Deployments</title>

    <author fullname="Mohamed Boucadair" initials="M." surname="Boucadair">
      <organization>France Telecom</organization>

      <address>
        <postal>
          <street></street>

          <city>Rennes</city>

          <region></region>

          <code>35000</code>

          <country>France</country>
        </postal>

        <email>mohamed.boucadair@orange.com</email>
      </address>
    </author>

    <author fullname="Joe Touch" initials="J." surname="Touch">
      <organization>USC/ISI</organization>

      <address>
        <postal>
          <street></street>

          <city></city>

          <region></region>

          <code></code>

          <country></country>
        </postal>

        <email>touch@isi.edu</email>
      </address>
    </author>

    <author fullname="Pierre Levis" initials="P." surname="Levis">
      <organization>France Telecom</organization>

      <address>
        <postal>
          <street></street>

          <city>Caen</city>

          <region></region>

          <code>14000</code>

          <country>France</country>
        </postal>

        <email>pierre.levis@orange.com</email>
      </address>
    </author>

    <author fullname="Reinaldo Penno" initials="R." surname="Penno">
      <organization>Cisco</organization>

      <address>
        <postal>
          <street></street>

          <code></code>

          <country>USA</country>
        </postal>

        <email>repenno@cisco.com</email>
      </address>
    </author>

    <date day="7" month="August" year="2012" />

    <workgroup>INTAREA WG</workgroup>

    <keyword>NAT, Host Identifier</keyword>

    <abstract>
      <t>This document analyzes a set of solution candidates to mitigate some
      of the issues encountered when address sharing is used. In particular,
      this document focuses on means to reveal a host identifier (HOST_ID)
      when a Carrier Grade NAT (CGN) or application proxies are involved in
      the path. This host identifier must be unique to each host under the
      same shared IP address.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t></t>

      <section title="Context">
        <t>As reported in <xref target="RFC6269"></xref>, several issues are
        encountered when an IP address is shared among several subscribers.
        These issues are encountered in various deployment contexts: e.g.,
        Carrier Grade NAT (CGN), application proxies or A+P <xref
        target="RFC6346"></xref>. Examples of such issues are listed
        below:</t>

        <t><?rfc subcompact="yes" ?><list style="symbols">
            <t>Implicit identification (Section 13.2 of <xref
            target="RFC6269"></xref>)</t>

            <t>SPAM (Section 13.3 of <xref target="RFC6269"></xref>)</t>

            <t>Blacklisting a mis-behaving host (Section 13.1 of <xref
            target="RFC6269"></xref>)</t>

            <t>Redirect users with infected machines to a dedicated portal
            (Section 5.1 of <xref target="RFC6269"></xref>)</t>
          </list></t>

        <t>The IPv4 address alone is not sufficient to uniquely
        distinguish a host. As a mitigation, it is tempting to investigate
        means of disclosing information that the remote server can use to
        disambiguate packets from hosts sharing the same IPv4 address.</t>

        <t>The risks of not mitigating these issues are: an OPEX (Operational
        Expenditure) increase for IP connectivity service providers (costs
        induced by calls to a hotline), revenue loss for content providers
        (loss of audience), and customer dissatisfaction (low quality of
        experience, service segregation, etc.).</t>
      </section>

      <section title="Purpose and Scope">
        <t>The purpose of this document is to analyze a set of alternative
        channels to convey a host identifier and to assess to what extent they
        solve the problem described in <xref target="problem_space"></xref>.
        Below are listed the candidates analyzed in the document:</t>

        <t><?rfc subcompact="yes" ?><list style="symbols">
            <t>Use the Identification field of IP header (denoted as IP-ID,
            <xref target="ip_id"></xref>).</t>

            <t>Define a new IP option (<xref target="ip_option"></xref>).</t>

            <t>Assign port sets (<xref target="server"></xref>).</t>

            <t>Use ICMP (<xref target="icmp"></xref>).</t>

            <t>Define a new TCP Option (<xref target="tcp"></xref>).</t>

            <t>Enable Proxy Protocol (<xref target="PROXY"></xref>).</t>

            <t>Activate HIP (<xref target="hip"></xref>).</t>

            <t>Inject application headers (<xref target="xff"></xref>).</t>

            <t>Use an out-of-band mechanism (<xref
            target="ident"></xref>).</t>
          </list></t>

        <t><?rfc subcompact="no" ?>A synthesis is provided in <xref
        target="synthesis"></xref> while the detailed analysis is elaborated
        in <xref target="Analysis"></xref>.</t>

        <t><xref target="privacy"></xref> discusses privacy issues common to
        all HOST_ID solutions. It is out of scope of this document to
        elaborate on privacy issues specific to each solution.</t>
      </section>
    </section>

    <section anchor="problem_space" title="Problem to Be Solved">
      <t><list hangIndent="10" style="hanging">
          <t hangText="Observation:">Some servers use the source IPv4 address
          as an identifier to treat some incoming connections differently. Due
          to the deployment of CGNs (e.g., NAT44 <xref
          target="RFC3022"></xref>, NAT64 <xref target="RFC6146"></xref>),
          that address will be shared. In particular, when a server receives
          packets from the same source address, because this address is
          shared, the server does not know which host is the sending host
          <xref target="RFC6269"></xref>.</t>

          <t hangText="Objective:">The server should be able to sort out the
          packets by sending host.</t>

          <t hangText="Requirement:">The server must have information beyond
          the source IP address to differentiate the sending host. We
          call this information HOST_ID.</t>
        </list></t>

      <t>For all solutions analyzed, we provide answers to the following
      questions:</t>

      <t><?rfc subcompact="no" ?><list hangIndent="5" style="hanging">
          <t hangText="What is the HOST_ID?">It must be unique to each host
          under the same IP address. It does not need to be globally unique.
          The combination of the (public) source IP address and the
          identifier (i.e., HOST_ID) ends up being relatively unique: as
          unique as today's 32-bit IPv4 addresses, which can change
          when a host reconnects.</t>

          <t
          hangText="Where is the HOST_ID? (which protocol, which field):">If
          the HOST_ID is put at the IP level, all packets will have to bear
          the identifier. If it is put at a higher connection-oriented level,
          the identifier is only needed once in the session establishment
          phase (for instance TCP three-way-handshake), then, all packets
          received in this session will be attributed to the HOST_ID
          designated during the session opening.</t>

          <t hangText="Who puts the HOST_ID?">For almost all the analyzed
          solutions, the address sharing function injects the HOST_ID. When
          there are several address sharing functions in the data path, we
          describe to what extent the proposed solution is efficient. Another
          option, which avoids potential performance degradation, is to let
          the host inject its HOST_ID while the address sharing function
          checks its content (just like an IP anti-spoofing function). For some
          proposals, the HOST_ID is retrieved using an out-of-band mechanism
          or signaled in a dedicated notification channel.</t>

          <t hangText="What are the security considerations?">Security
          considerations are common to all analyzed solutions (see <xref
          target="Security"></xref>). Privacy-related aspects are discussed in
          <xref target="privacy"></xref>.</t>
        </list></t>
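
      <t>The objective above can be illustrated with a minimal sketch,
      written here in Python. It assumes that the server has already
      extracted a HOST_ID from whichever channel conveys it; the tuple
      layout and the function name sort_by_sending_host are hypothetical
      and are not defined by any of the analyzed proposals. Because the
      HOST_ID is only locally unique, the sketch keys on the pair (source
      IP address, HOST_ID).</t>

      <figure>
        <artwork><![CDATA[
```python
from collections import defaultdict


def sort_by_sending_host(packets):
    """Group packets by (source IP, HOST_ID).

    Each packet is a hypothetical (src_ip, host_id, payload) tuple;
    how host_id is obtained depends on the channel that conveys it.
    """
    per_host = defaultdict(list)
    for src_ip, host_id, payload in packets:
        # HOST_ID is only unique under one shared address, so it is
        # always paired with that address to form the lookup key.
        per_host[(src_ip, host_id)].append(payload)
    return dict(per_host)


# Three packets from the same shared address, sent by two hosts.
packets = [
    ("192.0.2.1", 0x0001, b"GET /a"),
    ("192.0.2.1", 0x0002, b"GET /b"),
    ("192.0.2.1", 0x0001, b"GET /c"),
]
flows = sort_by_sending_host(packets)
```
]]></artwork>
      </figure>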

      <section title="IPv6 May Also Be Concerned">
        <t>Some of the issues mentioned in <xref
        target="problem_space"></xref> are independent of IPv4 vs. IPv6. Even
        in IPv6, address sharing can be used for a variety of reasons (e.g.,
        to hide network topology, to prevent hosts from offering network
        services directly, etc.).</t>

        <t>A solution to reveal HOST_ID is also needed in IPv6 deployment.</t>
      </section>
    </section>

    <section anchor="privacy" title="HOST_ID and Privacy">
      <t>IP address sharing is motivated by a number of different factors. For
      years, many network operators have conserved the use of public IPv4
      addresses by making use of Customer Premises Equipment (CPE) that
      assigns a single public IPv4 address to all hosts within the customer's
      local area network and uses NAT <xref target="RFC3022"></xref> to
      translate between locally unique private IPv4 addresses and the CPE's
      public address. With the exhaustion of IPv4 address space, address
      sharing between customers on a much larger scale is likely to become
      much more prevalent. While many individual users are unaware of and
      uninvolved in decisions about whether their unique IPv4 addresses get
      revealed when they send data via IP, some users realize privacy benefits
      associated with IP address sharing, and some may even take steps to
      ensure that NAT functionality sits between them and the public Internet.
      IP address sharing makes the actions of all users behind the NAT
      function unattributable to any single host, creating room for abuse but
      also providing some identity protection for non-abusive users who wish
      to transmit data with reduced risk of being uniquely identified.</t>

      <t>The proposals considered in this document add a measure of uniqueness
      back to hosts that share a public IP address. The extent of that
      uniqueness depends on which information is included in the HOST_ID.</t>

      <t>The volatility of the HOST_ID information is similar to the source IP
      address: a distinct HOST_ID may be used by the address sharing function
      when the host reboots or gets a new internal IP address. As with
      persistent IP addresses, persistent HOST_IDs facilitate user tracking
      over time.</t>

      <t>As a general matter, the HOST_ID proposals do not seek to make hosts
      any more identifiable than they would be if they were using a public,
      non-shared IP address. However, depending on the solution proposal, the
      addition of HOST_ID information may allow a device to be fingerprinted
      more easily than it otherwise would be. Should multiple solutions be
      combined (e.g., TCP Option and XFF) that include different pieces of
      information in the HOST_ID, fingerprinting may become even easier.</t>

      <t>A HOST_ID can be spoofed, just as an IP address can.
      Furthermore, users of network-based anonymity services (like
      Tor) may be capable of stripping HOST_ID information before it reaches
      its destination.</t>

      <t>HOST_ID specification document(s) SHOULD explain the privacy impact
      of the solutions they specify, including the extent of HOST_ID
      uniqueness and persistence, assumptions made about the lifetime of the
      HOST_ID, whether and how the HOST_ID can be obfuscated or recycled, and
      the impact of the use of the HOST_ID on device or implementation
      fingerprinting. <xref target="I-D.iab-privacy-considerations"></xref>
      provides further guidance.</t>

      <t>For more discussion about privacy, refer to <xref
      target="RFC6462"></xref>.</t>
    </section>

    <section anchor="synthesis" title="Solutions Analysis">
      <t></t>

      <section title="Requirements">
        <t>Whatever the channel used to convey the HOST_ID, the following
        requirements are to be met:</t>

        <t><list style="hanging">
            <t hangText="Uniqueness of identifiers in HOST_ID:">It is
            RECOMMENDED that HOST_IDs be limited to providing local uniqueness
            rather than global uniqueness.</t>

            <t hangText="Refresh rate of HOST_ID:">The address sharing function
            SHOULD NOT use permanent HOST_ID values.</t>

            <t hangText="Manipulate HOST_IDs:">The address sharing function SHOULD
            be able to strip, re-write and add HOST_ID fields.</t>

            <t hangText="Interference between HOST_IDs:">An address sharing
            function, able to inject HOST_IDs in several layers, SHOULD reveal
            subsets of the same information (e.g., full IP address, lower 16
            bits of IP address, etc.).</t>
          </list></t>
      </section>

      <section title="Synthesis">
        <t>Table 1 summarizes the approaches analyzed in this
        document.</t>

        <t><?rfc subcompact="yes" ?></t>

        <t><list style="symbols">
            <t>"Success ratio" indicates the ratio of successful
            communications when the option is used. The figures provided are
            inspired by the results documented in <xref
            target="Options"></xref>.</t>

            <t>"Deployable today" indicates if the solution can be generalized
            without any constraint on current architectures and practices.</t>

            <t>"Possible Perf Impact" indicates the level of expected
            performance degradation. The rationale behind the indicated
            potential performance degradation is whether the injection
            requires some treatment at the IP level or not.</t>

            <t>"OS TCP/IP Modif" indicates whether a modification of the OS
            TCP/IP stack is required at the server side.</t>
          </list></t>

        <t><?rfc subcompact="no" ?></t>

        <figure align="center" anchor="table"
                title="Table 1: Summary of analyzed solutions.">
          <artwork><![CDATA[          +------+------+-----+------+-----+-----+-----+-----+-----+
          | IP   | TCP  |IP-ID|HTTP  |PROXY|Port | HIP |ICMP |IDENT|
          |Option|Option|     |Header|     | Set |     |     |     |
          |      |      |     |(XFF) |     |     |     |     |     |
----------+------+------+-----+------+-----+-----+-----+-----+-----+
UDP       | Yes  | No   | Yes | No   | No  | Yes |     | Yes | No  |
----------+------+------+-----+------+-----+-----+-----+-----+-----+
TCP       | Yes  | Yes  | Yes | No   | Yes | Yes |     | Yes | Yes |
----------+------+------+-----+------+-----+-----+-----+-----+-----+
HTTP      | Yes  | Yes  | Yes | Yes  | Yes | Yes |     | Yes | Yes |
----------+------+------+-----+------+-----+-----+-----+-----+-----+
Encrypted | Yes  | Yes  | Yes | No   | Yes | Yes |     | Yes | Yes |
Traffic   |      |      |     |      |     |     |     |     |     |
----------+------+------+-----+------+-----+-----+-----+-----+-----+
Success   | 30%  | 99%  | 100%| 100% | Low | 100%|Low  |~100%|~100%|
Ratio     |      |      |     |      |     |     |     | (6) | (6) |
----------+------+------+-----+------+-----+-----+-----+-----+-----+
Possible  | High | Low  | Low |  Med | High| No  | N/A | High|High |
Perf      |      |  to  |  to |   to |     |     |     |     |     |
Impact    |      | Med  | Med | High |     |     |     |     |     |
----------+------+------+-----+------+-----+-----+-----+-----+-----+
OS TCP/IP | Yes  | Yes  | Yes | No   | No  | No  |     | Yes | Yes |
Modif     |      |      |     |      |     |     |     |     |     |
----------+------+------+-----+------+-----+-----+-----+-----+-----+
Deployable| Yes  | Yes  | Yes | Yes  | No  | Yes | No  | Yes | Yes |
Today     |      |      |     |      |     |     |     |     |     |
----------+------+------+-----+------+-----+-----+-----+-----+-----+
Notes     |      |      | (1) |  (2) |     | (1) | (4) | (7) | (1) |
          |      |      |     |      |     | (3) | (5) |     | (7) |
----------+------+------+-----+------+-----+-----+-----+-----+-----+

 Notes:

 (1)  Requires mechanism to advertise NAT is participating in this
      scheme (e.g., DNS PTR record).
 (2)  This solution is widely deployed.
 (3)  When the port set is not advertised, the solution is less
      efficient for third-party services.
 (4)  Requires the client and the server to be HIP-compliant and HIP
      infrastructure to be deployed.
 (5)  If the client and the server are HIP-enabled, the address
      sharing function does not need to insert an identifier. If the
      client is not HIP-enabled, designing the device that performs
      address sharing to act as a UDP/TCP-HIP relay is not viable.
 (6)  Implementation-specific.
 (7)  The solution is inefficient in various scenarios as discussed
      in Section 5. 

]]></artwork>
        </figure>

        <t><?rfc subcompact="no" ?>According to the above table and the
        analysis elaborated in <xref target="Analysis"></xref>:<list
            style="symbols">
            <t>IP Option, IP-ID and Proxy Protocol proposals are broken;</t>

            <t>HIP (Host Identity Protocol) is not widely deployed;</t>

            <t>The use of Port Set may contradict the port randomization <xref
            target="RFC6056"></xref> requirement identified in <xref
            target="RFC6269"></xref>. This solution can be used by a service
            provider for the delivery of its own service offerings relying on
            implicit identification.</t>

            <t>X-Forwarded-For (XFF) is a de facto standard, deployed and
            supported in operational networks (e.g., HTTP servers,
            load-balancers, etc.).</t>

            <t>From an application standpoint, the TCP Option is superior to
            XFF/Forwarded-For since it is not restricted to HTTP. Nevertheless
            XFF/Forwarded-For is compatible with the presence of address
            sharing and load-balancers in the communication path. To provide a
            similar functionality, the TCP Option may be extended to allow
            conveying a list of IP addresses and port numbers to not lose the
            source IP address in the presence of load-balancers. Another
            alternative is to combine the usage of both the HOST_ID TCP Option
            and XFF/Forwarded-For. Extending TCP is still possible as analyzed
            in <xref target="ExtendTCP"></xref>.</t>
          </list></t>

        <t></t>
      </section>
    </section>

    <section anchor="Analysis" title="Detailed Solutions Analysis">
      <t></t>

      <section anchor="ip_id"
               title="Use the Identification Field of IP Header (IP-ID)">
        <t></t>

        <section title="Description">
          <t>IP-ID (Identification field of IP header) can be used to carry
          information which uniquely distinguishes a host among those
          sharing the same IPv4 address. An address sharing function can
          re-write the IP-ID field to insert a value unique to the host (16
          bits are sufficient to uniquely disambiguate hosts sharing the same
          IP address). Note that this field is not altered by some NATs, which
          leads to side effects such as the ability to count hosts behind a
          NAT, as reported in <xref target="Count"></xref>.</t>

          <t>A variant of this approach relies upon the format of certain
          packets, such as TCP SYN, where the IP-ID can be modified to contain
          a 16-bit HOST_ID. Address sharing devices performing this function
          would need to indicate out of band that they are doing so,
          possibly using a special DNS record.</t>
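
          <t>For illustration only, the re-write step could look like the
          following Python sketch. It assumes a raw IPv4 header is available
          as bytes; the function names rewrite_ip_id and ipv4_checksum and
          the sample addresses are hypothetical. The Identification field
          sits at byte offset 4 of the header and the header checksum at
          offset 10 <xref target="RFC0791"></xref>, so the checksum has to be
          recomputed after the re-write.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the 16-bit words of the header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_ip_id(header: bytes, host_id: int) -> bytes:
    """Return a copy of an IPv4 header whose IP-ID carries host_id."""
    if not 0 <= host_id <= 0xFFFF:
        raise ValueError("HOST_ID must fit in 16 bits")
    hdr = bytearray(header)
    struct.pack_into("!H", hdr, 4, host_id)  # Identification field
    struct.pack_into("!H", hdr, 10, 0)       # clear the old checksum
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)


# Hypothetical 20-byte header: 10.0.0.5 -> 198.51.100.7, TCP, IP-ID 0x1234.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
                     bytes([10, 0, 0, 5]), bytes([198, 51, 100, 7]))
rewritten = rewrite_ip_id(sample, 0x0042)
```
]]></artwork>
          </figure>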
        </section>

        <section title="Analysis">
          <t>This usage is not compliant with what is recommended in <xref
          target="I-D.ietf-intarea-ipv4-id-update"></xref>.</t>
        </section>
      </section>

      <section anchor="ip_option" title="Define an IP Option">
        <t></t>

        <section title="Description">
          <t>An alternative way to convey the HOST_ID is to define an IP
          option <xref target="RFC0791"></xref>. A HOST_ID IP option can be
          inserted by the address sharing function to uniquely distinguish a
          host among those sharing the same IP address. An example of such an
          option is documented in <xref
          target="I-D.chen-intarea-v4-uid-header-option"></xref>. This IP
          option can convey an IPv4 address, an IPv6 prefix, a GRE key, an
          IPv6 Flow Label, etc.</t>

          <t>Another way of using an IP option is described in Section 4.6
          of <xref target="RFC3022"></xref>.</t>
        </section>

        <section title="Analysis">
          <t>Unlike the solution presented in <xref target="tcp"></xref>, this
          proposal applies to any transport protocol. Nevertheless, it is
          widely known that routers (and other middleboxes) filter IP options,
          and IP packets with IP options can be dropped by some IP nodes. Previous
          studies demonstrated that "IP Options are not an option" (refer to
          <xref target="Not_An_Option"></xref>, <xref
          target="Options"></xref>).</t>

          <t>In conclusion, using an IP option to convey a HOST_ID is not
          viable.</t>
        </section>
      </section>

      <section anchor="server" title="Assign Port Sets">
        <t></t>

        <section title="Description">
          <t>This solution does not require any action from the address
          sharing function to disclose a host identifier. Instead of assuming
          all transport ports are associated with one single host, each host
          under the same external IP address is assigned a restricted port
          set. These port sets are then advertised to remote servers using
          off-line means. This announcement is not required for the delivery
          of internal services (i.e., offered by the service provider
          deploying the address sharing function) relying on implicit
          identification.</t>

          <t>Port sets assigned to hosts may be static or dynamic.</t>

          <t>Port set announcements to remote servers do not need to reveal
          the identity of individual hosts but only to advertise the enforced
          policy to generate non-overlapping port sets (e.g., the transport
          space associated with an IP address is fragmented into contiguous
          blocks of 2048 port numbers).</t>
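
          <t>As a minimal sketch of how a remote server might apply such an
          advertised policy, the following Python fragment maps a source port
          to the contiguous block it falls in. The block size of 2048 is taken
          from the example above; the function names port_set_index and
          port_set_range are hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
PORT_SET_SIZE = 2048  # block size from the example policy


def port_set_index(source_port: int, set_size: int = PORT_SET_SIZE) -> int:
    """Return the index of the contiguous block containing source_port."""
    if not 0 <= source_port <= 65535:
        raise ValueError("port out of range")
    return source_port // set_size


def port_set_range(index: int, set_size: int = PORT_SET_SIZE) -> range:
    """Return the ports that belong to block number index."""
    return range(index * set_size, (index + 1) * set_size)


# Two connections from the same shared address fall in the same block
# only if they originate from the same host under this policy.
block = port_set_index(56324)
ports = port_set_range(block)
```
]]></artwork>
          </figure>

          <t>With 2048 ports per block, one shared address is split into 32
          non-overlapping port sets.</t>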
        </section>

        <section title="Analysis">
          <t>The solution does not require defining new fields or options; it
          is policy-based.</t>

          <t>The solution may conflict with the port randomization requirement
          identified in <xref target="RFC6269"></xref>. A mitigation would be to avoid
          assigning static port sets to individual hosts.</t>

          <t>The method is convenient for the delivery of services offered by
          a service provider that also offers the IP connectivity service.</t>
        </section>
      </section>

      <section anchor="icmp" title="Use ICMP">
        <t></t>

        <section title="Description">
          <t>Another alternative is to convey the HOST_ID using a notification
          channel separate from the packets issued to invoke the
          service.</t>

          <t>An implementation example is defined in <xref
          target="I-D.yourtchenko-nat-reveal-ping"></xref>. This solution
          relies on a mechanism where the address sharing function
          encapsulates the necessary differentiating information into an ICMP
          Echo Request packet that it sends in parallel with the initial
          session creation (e.g., SYN). The information included in the data
          portion of the ICMP Echo Request describes the five-tuples as seen
          on both sides of the address sharing function.</t>
        </section>

        <section title="Analysis">
          <t><?rfc subcompact="yes" ?><list style="symbols">
              <t>This ICMP proposal is valid for both UDP and TCP. The address
              sharing function may be configured with the transport protocols
              that are allowed to trigger these ICMP messages.</t>

              <t>A hint should be provided to the ultimate server (or
              intermediate nodes) that an ICMP Echo Request conveys a HOST_ID.
              This may be implemented using magic numbers.</t>

              <t>Even if ICMP packets are blocked in the communication path,
              the user connection does not have to be impacted.</t>

              <t>Implementations that delay the establishment of
              a session until the companion ICMP Echo Request is received may
              degrade the user experience.</t>

              <t>Because of the presence of load-balancers in the path, the
              ultimate server receiving the SYN packet may not be the one
              which may receive the ICMP message conveying the HOST_ID.</t>

              <t>Because of the presence of load-balancers in the path, the
              port number assigned by address sharing may be lost. Therefore
              the mapping information conveyed in the ICMP may not be
              sufficient to associate a SYN packet with a received ICMP.</t>

              <t>The proposal is not compatible with the presence of cascaded
              NAT.</t>

              <t>The ICMP proposal will add a traffic overhead for both the
              server and the address sharing device.</t>

              <t>The ICMP proposal is similar to other mechanisms (e.g.,
              syslog, netflow) for reporting dynamic mappings to a mediation
              platform (mainly for legal traceability purposes). Performance
              degradation is likely to be experienced by address sharing
              functions because ICMP messages are to be sent for
              each newly instantiated mapping (and even if the mapping
              already exists).</t>

              <t>In some scenarios (e.g., Fixed-Mobile Convergence, Open WiFi,
              etc.), HOST_ID should be interpreted by intermediate devices
              which embed Policy Enforcement Points (PEP, <xref
              target="RFC2753"></xref>) responsible for granting access to
              some services. These PEPs need to inspect all received packets
              in order to find the companion (traffic) messages to be
              correlated with ICMP messages conveying HOST_IDs. This induces
              more complexity to these intermediate devices.</t>
            </list></t>

          <t><?rfc subcompact="no" ?></t>
        </section>
      </section>

      <section anchor="tcp" title="Define a TCP Option">
        <t></t>

        <section title="Description">
          <t>HOST_ID may be conveyed in a dedicated TCP Option. An example is
          specified in <xref target="I-D.wing-nat-reveal-option"></xref> which
          defines a new TCP Option called USER_HINT. This option encloses the
          TCP client's identifier (e.g., the lower 16 bits of their IPv4
          address, their VLAN ID, VRF ID, subscriber ID). The address sharing
          device inserts this TCP Option into the TCP SYN packet.</t>
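
          <t>The following Python sketch shows one way a 16-bit HOST_ID
          could be derived and encoded as a kind/length/value TCP option. The
          layout is an assumption made for illustration and is not the wire
          format of any cited draft. The option kind 253 is one of the
          experimental TCP option kinds and is used here only as a
          placeholder; all function names are hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
import ipaddress
import struct

EXPERIMENTAL_KIND = 253  # placeholder: experimental TCP option kind


def host_id_from_ipv4(internal_addr: str) -> int:
    """Derive a HOST_ID from the lower 16 bits of an IPv4 address."""
    return int(ipaddress.IPv4Address(internal_addr)) & 0xFFFF


def encode_host_id_option(host_id: int,
                          kind: int = EXPERIMENTAL_KIND) -> bytes:
    """Encode kind (1 byte), length (1 byte) and a 16-bit HOST_ID."""
    if not 0 <= host_id <= 0xFFFF:
        raise ValueError("HOST_ID must fit in 16 bits")
    return struct.pack("!BBH", kind, 4, host_id)


def decode_host_id_option(option: bytes,
                          kind: int = EXPERIMENTAL_KIND) -> int:
    """Return the HOST_ID carried in an option built as above."""
    opt_kind, length, host_id = struct.unpack("!BBH", option)
    if opt_kind != kind or length != 4:
        raise ValueError("not a HOST_ID option")
    return host_id


# The address sharing device would add this to the TCP SYN it forwards.
host_id = host_id_from_ipv4("10.0.1.5")
option = encode_host_id_option(host_id)
```
]]></artwork>
          </figure>

          <t>In this sketch the option occupies four bytes of the TCP option
          space in the SYN.</t>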
        </section>

        <section title="Analysis">
          <t>Using a new TCP Option to convey the HOST_ID does not require any
          modification to the applications but it is applicable only for
          TCP-based applications. Applications relying on other transport
          protocols are therefore left unsolved.</t>

          <t><xref target="I-D.wing-nat-reveal-option"></xref> discusses the
          interference with other TCP Options.</t>

          <t>The risk related to handling a new TCP Option is low as measured
          in <xref target="Options"></xref>. <xref
          target="I-D.abdo-hostid-tcpopt-implementation"></xref> provides a
          detailed implementation and experimentation report of the HOST_ID TCP
          Option. It investigates in depth the impact of activating HOST_ID in
          the host and in the address sharing function, as well as the
          enforcement of policies at the server side, and reports a
          failure ratio of 0.103% among the top 100,000 websites.</t>

          <t>Some downsides have been raised against defining a TCP Option to
          reveal a host identity:</t>

          <t><list style="symbols">
              <t>Conveying an IP address in a TCP Option may be seen as a
              layering violation, but since IP addresses are already used
              for the checksum computation, this is not seen as a blocking
              point. Moreover, the updated version of <xref
              target="I-D.wing-nat-reveal-option"></xref> no longer allows
              conveying an IP address (the HOST_ID is encoded in
              16 bits).</t>

              <t>TCP Option space is limited, and might be consumed by the TCP
              client. <xref
              target="I-D.abdo-hostid-tcpopt-implementation"></xref> discusses
              two approaches to sending the HOST_ID: sending the HOST_ID in
              the TCP SYN (which consumes more bytes in the TCP header of the
              TCP SYN) and sending the HOST_ID in a TCP ACK (which consumes
              only two bytes in the TCP SYN). Content providers may find it
              more desirable to receive the HOST_ID in the TCP SYN, as that
              more closely preserves the HOST_ID received in the source IP
              address as per current practices. Sending the HOST_ID in a TCP
              ACK is more complicated to implement, as it can introduce
              MTU issues if the ACK packet also contains TCP data, or if a TCP
              segment is lost. Note that <xref
              target="I-D.wing-nat-reveal-option"></xref> only allows
              enclosing the HOST_ID in the TCP SYN packet.</t>

              <t>When there are several NATs in the path, the original HOST_ID
              may be lost.</t>

              <t>Interactions with existing practices such as X-Forwarded-For
              (see <xref target="xff"></xref>) need to be specified, in
              particular the behavior of servers when both mechanisms are
              used and which information takes precedence: the content of the
              TCP Option or what is conveyed in the application headers.</t>

              <t>When load-balancers or proxies are in the path, this option
              does not preserve the original source IP address and source
              port. Preserving such information is required, for instance,
              for logging purposes (e.g., <xref target="RFC6302"></xref>).
              <xref target="I-D.abdo-hostid-tcpopt-implementation"></xref>
              defines a TCP Option that can reveal various combinations of
              source information (e.g., source port, source port and source
              IP address, source IPv6 prefix).</t>
            </list>More discussion about issues raised when extending TCP can
          be found at <xref target="ExtendTCP"></xref>.</t>
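
          <t>As a rough illustration of the option-space constraint discussed
          above, the following Python sketch packs a 16-bit HOST_ID into a
          kind/length/value TCP option and checks whether it still fits in
          the 40 bytes available for TCP options. The option kind (253, an
          experimental value), the 4-byte layout, and the function names are
          assumptions made for illustration only; they are not taken from any
          of the cited proposals.</t>

          <t><figure align="center">
              <artwork><![CDATA[
```python
import struct

# A TCP header is at most 60 bytes, 20 of which are fixed.
MAX_TCP_OPTION_BYTES = 40
# Assumption: an experimental option kind used as a placeholder.
ASSUMED_HOST_ID_KIND = 253


def encode_host_id_option(host_id: int) -> bytes:
    """Pack kind (1 byte), length (1 byte), and a 16-bit HOST_ID."""
    if not 0 <= host_id <= 0xFFFF:
        raise ValueError("HOST_ID must fit in 16 bits")
    return struct.pack("!BBH", ASSUMED_HOST_ID_KIND, 4, host_id)


def fits_in_option_space(existing_options: bytes,
                         new_option: bytes) -> bool:
    """Return True if new_option fits in the remaining option space."""
    total = len(existing_options) + len(new_option)
    return total <= MAX_TCP_OPTION_BYTES
```
]]></artwork>
            </figure></t>

          <t>A device inserting the option would call fits_in_option_space()
          with the options already present in the SYN and skip the insertion
          when the result is false.</t>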
        </section>
      </section>

      <section anchor="PROXY" title="PROXY Protocol">
        <t></t>

        <section title="Description">
          <t>The solution, referred to as Proxy Protocol <xref
          target="Proxy"></xref>, does not require any application-specific
          knowledge. The rationale behind this solution is to prepend each
          connection with a line reporting the characteristics of the other
          side's connection as shown in the example depicted in <xref
          target="proxy_eg"></xref>:</t>

          <t><figure align="center" anchor="proxy_eg"
              title="Example of PROXY connection report">
              <artwork><![CDATA[    PROXY TCP4 192.0.2.1 192.0.2.15 56324 443\r\n]]></artwork>
            </figure></t>

          <t>Upon receipt of a message conveying this line, the server removes
          the line and parses it to retrieve the transport protocol and the
          addresses and ports it reports. The content of this line is recorded
          in logs and used to enforce policies.</t>
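
          <t>The server-side handling can be sketched as follows in Python.
          This is a minimal sketch that only covers the single-line form
          shown in the example; the field names, the ProxyInfo type, and the
          function name are illustrative assumptions, and other forms that a
          real implementation may need to accept are not handled.</t>

          <t><figure align="center">
              <artwork><![CDATA[
```python
from typing import NamedTuple, Tuple


class ProxyInfo(NamedTuple):
    protocol: str
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int


def split_proxy_line(data: bytes) -> Tuple[ProxyInfo, bytes]:
    """Remove the leading PROXY line and return it with the payload."""
    line, sep, rest = data.partition(b"\r\n")
    if not sep or not line.startswith(b"PROXY "):
        raise ValueError("no PROXY line at start of connection")
    fields = line.decode("ascii").split(" ")
    if len(fields) != 6:
        raise ValueError("unexpected number of fields in PROXY line")
    _, proto, src, dst, sport, dport = fields
    info = ProxyInfo(proto, src, dst, int(sport), int(dport))
    return info, rest
```
]]></artwork>
            </figure></t>

          <t>The returned ProxyInfo can then be logged or matched against
          policies, while the remaining bytes are handed to the application
          unchanged.</t>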
        </section>

        <section title="Analysis">
          <t>This solution can be deployed in a controlled environment, but it
          cannot be deployed to all access services available in the
          Internet. If the remote server does not support the Proxy Protocol,
          the session will fail. Other complications will arise, for
          instance due to the presence of firewalls.</t>

          <t>As a consequence, this solution is broken and cannot be
          recommended.</t>
        </section>
      </section>

      <section anchor="hip" title="Host Identity Protocol (HIP)">
        <t></t>

        <section title="Description">
          <t><xref target="RFC5201"></xref> specifies an architecture which
          introduces a new namespace to convey identity information.</t>
        </section>

        <section title="Analysis">
          <t>This solution requires both the client and the server to support
          HIP <xref target="RFC5201"></xref>. Additional architectural
          considerations, such as key exchanges, need to be taken into
          account.</t>

          <t>If the address sharing function is required to act as a
          UDP/TCP-HIP relay, this is not a viable option.</t>
        </section>
      </section>

      <section anchor="xff"
               title="Inject Application Protocol Message Headers">
        <t></t>

        <section title="Description">
          <t>Another option is to require no change at the transport or IP
          level, and instead to convey in the application payload the
          information required to disambiguate hosts. The format and the
          related semantics depend on the application (e.g., HTTP, SIP,
          SMTP).</t>

          <t>For HTTP, the X-Forwarded-For (XFF) or Forwarded-For (<xref
          target="I-D.ietf-appsawg-http-forwarded"></xref>) headers can be
          used to convey the original IP address when an address sharing
          device is involved. Service Providers operating address sharing
          devices can enable the injection of the XFF/Forwarded-For header,
          which encloses the original IPv4 address or the IPv6 prefix (see
          the example shown in <xref target="xff_eg"></xref>). The address
          sharing device has to strip all included XFF/Forwarded-For headers
          before injecting its own. Servers may rely on the contents of this
          field to enforce policies such as blacklisting misbehaving users.
          Note that XFF can also be logged by some servers (this is, for
          instance, supported by Apache).</t>

          <t><figure align="center" anchor="xff_eg"
              title="Example of Forwarded-For">
              <artwork><![CDATA[Forwarded: for=192.0.2.1,for="[2001:db8::1]"
Forwarded: proto=https;by=192.0.2.15]]></artwork>
            </figure></t>
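
          <t>A server that wants to use this information first has to extract
          the "for" parameters from the received header values. The Python
          sketch below shows one simplified way to do so; the function name is
          an illustrative assumption, and quoted strings that themselves
          contain commas or semicolons are deliberately not handled.</t>

          <t><figure align="center">
              <artwork><![CDATA[
```python
from typing import Iterable, List


def forwarded_for_values(header_values: Iterable[str]) -> List[str]:
    """Collect 'for' parameters, in order, from Forwarded values."""
    result = []
    for value in header_values:
        # Each header value is a comma-separated list of elements.
        for element in value.split(","):
            # Each element is a list of name=value pairs.
            for pair in element.split(";"):
                name, sep, param = pair.strip().partition("=")
                if sep and name.lower() == "for":
                    # Drop optional surrounding double quotes.
                    result.append(param.strip().strip('"'))
    return result
```
]]></artwork>
            </figure></t>

          <t>Applied to the two header lines of the example, the function
          returns the IPv4 address followed by the bracketed IPv6 address;
          whether the server trusts these values is a separate policy
          decision.</t>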
        </section>

        <section title="Analysis">
          <t>Not all applications impacted by the address sharing can support
          the ability to disclose the original IP address. Only a subset of
          protocols (e.g., HTTP) can rely on this solution.</t>

          <t>For the HTTP case, to prevent users from injecting invalid
          HOST_IDs, an initiative has been launched to maintain a list of
          trusted ISPs using XFF; see, for example, the list of trusted ISPs
          maintained by Wikipedia <xref target="Trusted_ISPs"></xref>. If an
          address sharing device is on the trusted XFF ISPs list, users
          editing Wikipedia from behind the address sharing device will
          appear to be editing from their "original" IP address and not from
          the NATed IP address. If an offending activity is detected,
          individual hosts can be blacklisted instead of all hosts sharing
          the same IP address.</t>

          <t>XFF header injection is a common practice of load balancers. When
          a load balancer is in the path, the original content of any included
          XFF header should not be stripped. Otherwise the information about
          the "origin" IP address will be lost.</t>

          <t>When several address sharing devices are crossed, the
          XFF/Forwarded-For header can convey the list of IP addresses (e.g.,
          <xref target="xff_eg"></xref>). The origin HOST_ID can thus be
          exposed to the target server.</t>

          <t>XFF also introduces some implementation complexity if the HTTP
          packet is at or close to the MTU size.</t>

          <t>It has been reported that some "poor" implementations may
          encounter parsing issues when an XFF header is injected.</t>

          <t>For encrypted HTTP traffic, injecting an XFF header may not be
          possible.</t>
        </section>
      </section>

      <section anchor="ident" title="Use Out-of-Band Mechanisms (e.g., IDENT)">
        <t></t>

        <section title="Description">
          <t>Another alternative is to retrieve the HOST_ID using a dedicated
          query channel.</t>

          <t>An implementation example may rely on the Identification Protocol
          (IDENT, <xref target="RFC1413"></xref>). This solution assumes that
          the address sharing function implements the server part of IDENT
          while remote servers implement the client part of the protocol.
          IDENT needs to be updated (see <xref target="IDENT_NAT"></xref>) to
          be able to return a host identifier instead of the user-id as
          defined in <xref target="RFC1413"></xref>. The IDENT response syntax
          uses the same USERID field described in <xref
          target="RFC1413"></xref> but, rather than returning a username, a
          host identifier (e.g., a 16-bit value) is returned <xref
          target="IDENT_NAT"></xref>. For any new
          incoming connection, the server contacts the IDENT server to
          retrieve the associated identifier. During that phase, the
          connection may be delayed.</t>
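
          <t>On the remote server side, the reply received over the query
          channel has to be parsed to recover the identifier. The Python
          sketch below parses a reply line in the RFC 1413 response syntax.
          The function name, the returned dictionary keys, and the choice to
          strip whitespace around the identifier (convenient for a numeric
          host identifier, although RFC 1413 treats such whitespace as part
          of the user-id) are assumptions made for illustration; the actual
          encoding of the host identifier is not defined here.</t>

          <t><figure align="center">
              <artwork><![CDATA[
```python
def parse_ident_reply(line: str) -> dict:
    """Parse '<port>, <port> : USERID : <opsys> : <id>' or
    '<port>, <port> : ERROR : <error-type>'."""
    parts = [part.strip() for part in line.strip().split(":", 2)]
    if len(parts) != 3:
        raise ValueError("malformed IDENT reply")
    ports, resp_type, rest = parts
    port_a, port_b = (int(p) for p in ports.split(","))
    if resp_type == "ERROR":
        return {"ports": (port_a, port_b), "error": rest}
    if resp_type != "USERID":
        raise ValueError("unexpected response type")
    opsys, _, ident = rest.partition(":")
    # Assumption: surrounding whitespace is not significant here.
    return {
        "ports": (port_a, port_b),
        "opsys": opsys.strip(),
        "host_id": ident.strip(),
    }
```
]]></artwork>
            </figure></t>

          <t>The server would issue the query for the port pair of the new
          incoming connection and, as noted above, may have to hold the
          connection until this reply has been parsed.</t>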
        </section>

        <section title="Analysis">
          <t><?rfc subcompact="yes" ?><list style="symbols">
              <t>The solution is specific to TCP.</t>

              <t>This solution requires the address sharing function to embed
              an IDENT server.</t>

              <t>A hint should be provided to the ultimate server (or
              intermediate nodes) that the address sharing function implements
              the IDENT protocol. This can be achieved by publishing this
              capability using DNS.</t>

              <t>Implementations that delay the establishment of a session
              until the companion IDENT response is received may degrade the
              user experience.</t>

              <t>The IDENT proposal adds traffic overhead for both the
              server and the address sharing device.</t>

              <t>Performance degradation is likely to be experienced by
              address sharing functions embedding the IDENT server. This is
              exacerbated if the address sharing function has to handle an
              IDENT query for each newly instantiated mapping (and even when
              the mapping already exists).</t>

              <t>In some scenarios (e.g., Fixed-Mobile Convergence, Open WiFi,
              etc.), HOST_ID should be interpreted by intermediate devices
              which embed Policy Enforcement Points (PEP, <xref
              target="RFC2753"></xref>) responsible for granting access to
              some services. These PEPs need to inspect all received packets
              in order to generate the companion IDENT queries. This may
              add complexity to these intermediate devices.</t>

              <t>IDENT queries may be generated by illegitimate TCP servers.
              This would require the address sharing function to enforce some
              policies (e.g., rate limit queries, filter based on the source
              IP address, etc.).</t>
            </list></t>

          <t><?rfc subcompact="no" ?></t>
        </section>
      </section>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document does not require any action from IANA.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>The same security concerns apply to the injection of an IP option, a
      TCP Option, and application-related content (e.g., XFF) by the address
      sharing device. If the server trusts the content of the HOST_ID field, a
      third-party user can be impacted by a misbehaving user who reveals a
      "faked" HOST_ID (e.g., original IP address).</t>

      <t>HOST_ID may be used to leak information about the internal structure
      of a network behind an address sharing function. If this behavior is
      undesired by the network administrator, the address sharing function
      can be configured to strip any existing HOST_ID in packets received from
      internal hosts.</t>

      <t>HOST_ID specification documents SHOULD elaborate further on threats
      inherent to each individual solution to convey the HOST_ID (e.g., use of
      the IP-ID field to count hosts behind a NAT <xref
      target="Count"></xref>).</t>
    </section>

    <section title="Acknowledgments">
      <t>Many thanks to D. Wing and C. Jacquenet for their review, comments
      and inputs.</t>

      <t>Thanks also to P. McCann, T. Tsou, Z. Dong, B. Briscoe, T. Taylor, M.
      Blanchet, D. Wing and A. Yourtchenko for the discussions in Prague.</t>

      <t>Some of the issues related to defining a new TCP Option have been
      raised by L. Eggert.</t>

      <t>Privacy text was provided by A. Cooper.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include='reference.RFC.0791'?>

      <?rfc include='reference.RFC.6056'?>

      <?rfc include='reference.RFC.3022'?>
    </references>

    <references title="Informative References">
      <?rfc include='reference.I-D.wing-nat-reveal-option'?>

      <?rfc include='reference.I-D.ietf-appsawg-http-forwarded'?>

      <?rfc include='reference.I-D.iab-privacy-considerations'?>

      <?rfc include='reference.RFC.1413'?>

      <?rfc include='reference.RFC.5201'?>

      <?rfc include='reference.RFC.6462'?>

      <?rfc include='reference.RFC.6346'?>

      <?rfc include='reference.RFC.2753'?>

      <?rfc include='reference.RFC.6302'?>

      <?rfc include='reference.I-D.yourtchenko-nat-reveal-ping'?>

      <?rfc include='reference.I-D.chen-intarea-v4-uid-header-option'?>

      <?rfc include='reference.RFC.6146'?>

      <?rfc include='reference.I-D.abdo-hostid-tcpopt-implementation'?>

      <?rfc include='reference.RFC.6269'?>

      <?rfc include='reference.I-D.ietf-intarea-ipv4-id-update'?>

      <reference anchor="Not_An_Option"
                 target="http://www.eecs.berkeley.edu/Pubs/TechRpts/2005/EECS-2005-24.html">
        <front>
          <title>IP options are not an option</title>

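          <author fullname="R. Fonseca" initials="R." surname="Fonseca">
            <organization></organization>
          </author>

          <author fullname="G. Porter" initials="G." surname="Porter">
            <organization></organization>
          </author>

          <author fullname="R. Katz" initials="R." surname="Katz">
            <organization></organization>
          </author>

          <author fullname="S. Shenker" initials="S." surname="Shenker">
            <organization></organization>
          </author>

          <author fullname="I. Stoica" initials="I." surname="Stoica">
            <organization></organization>
          </author>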

          <date year="2005" />
        </front>
      </reference>

      <reference anchor="Options"
                 target="http://conferences.sigcomm.org/imc/2004/papers/p336-medina.pdf">
        <front>
          <title>Measuring Interactions Between Transport Protocols and
          Middleboxes</title>

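          <author fullname="Alberto Medina" initials="A." surname="Medina">
            <organization></organization>
          </author>

          <author fullname="Mark Allman" initials="M." surname="Allman">
            <organization></organization>
          </author>

          <author fullname="Sally Floyd" initials="S." surname="Floyd">
            <organization></organization>
          </author>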

          <date year="2004" />
        </front>
      </reference>

      <reference anchor="ExtendTCP"
                 target="http://nrg.cs.ucl.ac.uk/mjh/tmp/mboxes.pdf">
        <front>
          <title>Is it still possible to extend TCP?</title>

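          <author fullname="M. Honda" initials="M." surname="Honda">
            <organization></organization>
          </author>

          <author fullname="Y. Nishida" initials="Y." surname="Nishida">
            <organization></organization>
          </author>

          <author fullname="C. Raiciu" initials="C." surname="Raiciu">
            <organization></organization>
          </author>

          <author fullname="A. Greenhalgh" initials="A." surname="Greenhalgh">
            <organization></organization>
          </author>

          <author fullname="M. Handley" initials="M." surname="Handley">
            <organization></organization>
          </author>

          <author fullname="H. Tokuda" initials="H." surname="Tokuda">
            <organization></organization>
          </author>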

          <date month="November" year="2011" />
        </front>
      </reference>

      <reference anchor="Trusted_ISPs"
                 target="http://meta.wikimedia.org/wiki/XFF_project#Trusted_XFF_list">
        <front>
          <title>Trusted XFF list</title>

          <author>
            <organization></organization>
          </author>

          <date />
        </front>
      </reference>

      <reference anchor="Count"
                 target="http://www.cs.columbia.edu/~smb/papers/fnat.pdf">
        <front>
          <title>A technique for counting NATted hosts</title>

          <author fullname="Steve Bellovin" initials="S." surname="Bellovin">
            <organization></organization>
          </author>

          <date />
        </front>
      </reference>

      <reference anchor="Proxy"
                 target="http://haproxy.1wt.eu/download/1.5/doc/proxy-protocol.txt">
        <front>
          <title>The PROXY protocol</title>

          <author fullname="Willy Tarreau" initials="W." surname="Tarreau">
            <organization></organization>
          </author>

          <date month="November" year="2010" />
        </front>
      </reference>

      <reference anchor="IDENT_NAT" target="draft-wing-intarea-ident">
        <front>
          <title>Using the Identification Protocol with an Address Sharing
          Device</title>

          <author fullname="Dan Wing" initials="D." surname="Wing">
            <organization></organization>
          </author>

          <date month="August" year="2012" />
        </front>
      </reference>
    </references>
  </back>
</rfc>
