<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="yes"?>
<rfc category="info" docName="draft-ietf-intarea-nat-reveal-analysis-08"
     ipr="trust200902">
  <front>
    <title abbrev="Revealing HOST_ID">Analysis of Solution Candidates to
    Reveal a Host Identifier (HOST_IDENT) in Shared Address
    Deployments</title>

    <author fullname="Mohamed Boucadair" initials="M." surname="Boucadair">
      <organization>France Telecom</organization>

      <address>
        <postal>
          <street></street>

          <city>Rennes</city>

          <region></region>

          <code>35000</code>

          <country>France</country>
        </postal>

        <email>mohamed.boucadair@orange.com</email>
      </address>
    </author>

    <author fullname="Joe Touch" initials="J." surname="Touch">
      <organization>USC/ISI</organization>

      <address>
        <postal>
          <street></street>

          <city></city>

          <region></region>

          <code></code>

          <country></country>
        </postal>

        <email>touch@isi.edu</email>
      </address>
    </author>

    <author fullname="Pierre Levis" initials="P." surname="Levis">
      <organization>France Telecom</organization>

      <address>
        <postal>
          <street></street>

          <city>Caen</city>

          <region></region>

          <code>14000</code>

          <country>France</country>
        </postal>

        <email>pierre.levis@orange.com</email>
      </address>
    </author>

    <author fullname="Reinaldo Penno" initials="R." surname="Penno">
      <organization>Cisco</organization>

      <address>
        <postal>
          <street></street>

          <code></code>

          <country>USA</country>
        </postal>

        <email>repenno@cisco.com</email>
      </address>
    </author>

    <date day="11" month="April" year="2013" />

    <workgroup>INTAREA WG</workgroup>

    <keyword>NAT, Host Identifier</keyword>

    <abstract>
      <t>This document is a collection of solutions to reveal a host
      identifier (denoted as HOST_IDENT) when a Carrier Grade NAT (CGN) or
      application proxies are involved in the path. This host identifier could
      be used by a remote server to sort out the packets by sending host. The
      host identifier must be unique to each host under the same shared IP
      address.</t>

      <t>This document analyzes a set of solution candidates to reveal a host
      identifier; it does not make any recommendation.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>As reported in <xref target="RFC6269"></xref>, several issues are
      encountered when an IP address is shared among several subscribers.
      These issues are encountered in various deployment contexts: e.g.,
      Carrier Grade NAT (CGN), application proxies, or A+P <xref
      target="RFC6346"></xref>. Examples of such issues are: implicit
      identification (Section 13.2 of <xref target="RFC6269"></xref>), spam
      (Section 13.3 of <xref target="RFC6269"></xref>), blacklisting a
      misbehaving host (Section 13.1 of <xref target="RFC6269"></xref>), and
      redirecting users with infected machines to a dedicated portal (Section
      5.1 of <xref target="RFC6269"></xref>).</t>

      <t>In particular, some servers use the source IPv4 address as an
      identifier to treat some incoming connections differently. Due to the
      deployment of CGNs (e.g., NAT44 <xref target="RFC3022"></xref>, NAT64
      <xref target="RFC6146"></xref>), that address will be shared. When a
      server receives packets from a shared source address, it cannot tell
      which host sent them <xref target="RFC6269"></xref>. The IPv4 address
      alone is not sufficient to uniquely distinguish a host. As a
      mitigation, it is tempting to investigate means of disclosing
      information that the remote server can use to uniquely disambiguate
      packets of hosts using the same IPv4 address.</t>

      <t>The risks of not mitigating these issues include: OPEX (Operational
      Expenditure) increase for IP connectivity service providers (costs
      induced by calls to a hotline), revenue loss for content providers (loss
      of audience), and customer dissatisfaction (low quality of
      experience, service segregation, etc.).</t>

      <t>The purpose of this document is to analyze a set of alternative
      channels to convey a host identifier and to assess to what extent they
      solve the problem described in <xref target="problem_space"></xref>.
      Below are listed the alternatives analyzed in the document:</t>

      <t><?rfc subcompact="yes" ?><list style="symbols">
          <t>Use the Identification field of the IP header (denoted as IP-ID,
          <xref target="ip_id"></xref>).</t>

          <t>Define a new IP option (<xref target="ip_option"></xref>).</t>

          <t>Define a new TCP Option (<xref target="tcp"></xref>).</t>

          <t>Inject application headers (<xref target="xff"></xref>).</t>

          <t>Enable Proxy Protocol (<xref target="PROXY"></xref>).</t>

          <t>Assign port sets (<xref target="server"></xref>).</t>

          <t>Activate HIP (Host Identity Protocol, <xref
          target="hip"></xref>).</t>

          <t>Use a notification channel (<xref target="icmp"></xref>).</t>

          <t>Use an out-of-band mechanism (<xref target="ident"></xref>).</t>
        </list><?rfc subcompact="no" ?></t>

      <t>A synthesis is provided in <xref target="synthesis"></xref> while the
      detailed analysis is elaborated in <xref target="Analysis"></xref>.</t>

      <t><xref target="privacy"></xref> discusses privacy issues common to all
      candidate solutions. It is out of scope of this document to elaborate on
      privacy issues specific to each solution.</t>

      <t>This document does not include any recommendation because the working
      group felt it was premature to include one.</t>
    </section>

    <section anchor="problem_space" title="On HOST_IDENT">
      <t>Policies relying on source IP address which are enforced by some
      servers will be applied to all hosts sharing the same IP address. For
      example, blacklisting the IP address of a spammer host will result in
      all other hosts sharing that address having their access to the
      requested service restricted. <xref target="RFC6269"></xref> describes
      the issues in detail. Therefore, due to address sharing, servers need
      extra information beyond the source IP address to differentiate the
      sending host. We call this information the HOST_IDENT.</t>

      <t>HOST_IDENT identifies a host under a shared IP address.
      Privacy-related considerations are discussed in <xref
      target="privacy"></xref>.</t>

      <t>Within this document, a host can be any computer located behind a
      Home Gateway or directly connected to an address-sharing function
      located in the network provider's domain (typically this would be the
      Home Gateway itself). </t>

      <t>Because HOST_IDENT is used by a remote server to sort out the packets
      by sending host, HOST_IDENT must be unique to each host under the same
      shared IP address, where possible. In the case where only the Home
      Gateway is revealed to the operator side of the translation function,
      HOST_IDENT need only be unique to the Home Gateway. HOST_IDENT does not
      need to be globally unique. Of course, the combination of the (public)
      IP source address and the identifier (i.e., HOST_IDENT) ends up being
      unique. </t>

      <t>If the HOST_IDENT is conveyed at the IP level, all packets will have
      to bear the identifier. If it is conveyed at a higher,
      connection-oriented level, the identifier is only needed once, in the
      session establishment phase (for instance, the TCP three-way handshake);
      all packets subsequently received in this session are then attributed to
      the HOST_IDENT designated during the session opening. </t>

      <t>Within this document, we assume the operator-side address-sharing
      function injects the HOST_IDENT. Another deployment option to avoid
      potential performance degradation is to let the host or Home Gateway
      inject its HOST_IDENT but the address-sharing function will check its
      content (just like an IP anti-spoofing function). For some proposals,
      the HOST_IDENT is retrieved using an out-of-band mechanism or signaled
      in a dedicated notification channel. </t>

      <t>For A+P <xref target="RFC6346"></xref> and its variants, port set
      announcements may be needed as discussed in <xref
      target="server"></xref>.</t>

      <t>Security considerations are common to all analyzed solutions (see
      <xref target="Security"></xref>). Privacy-related aspects are discussed
      in <xref target="privacy"></xref>.</t>

      <t>A HOST_IDENT that is common to all packets issued by a
      multi-interfaced host, or that is globally unique, is out of scope of
      this document. </t>
    </section>

    <section anchor="privacy" title="HOST_IDENT and Privacy">
      <t>IP address sharing is motivated by a number of different factors. For
      years, many network operators have conserved the use of public IPv4
      addresses by making use of Customer Premises Equipment (CPE) that
      assigns a single public IPv4 address to all hosts within the customer's
      local area network and uses NAT <xref target="RFC3022"></xref> to
      translate between locally unique private IPv4 addresses and the CPE's
      public address. With the exhaustion of IPv4 address space, address
      sharing between customers on a much larger scale is likely to become
      much more prevalent. While many individual users are unaware of and
      uninvolved in decisions about whether their unique IPv4 addresses get
      revealed when they send data via IP, some users realize privacy benefits
      associated with IP address sharing, and some may even take steps to
      ensure that NAT functionality sits between them and the public Internet.
      IP address sharing makes the actions of all users behind the NAT
      function unattributable to any single host, creating room for abuse but
      also providing some identity protection for non-abusive users who wish
      to transmit data with reduced risk of being uniquely identified.</t>

      <t>The proposals considered in this document add a measure of
      identifiability back to hosts that share a public IP address. The extent
      of that identifiability depends on what information is included in the
      HOST_IDENT.</t>

      <t>The volatility of the HOST_IDENT information is similar to that of
      the internal IP address: a distinct HOST_IDENT may be used by the
      address-sharing function when the host reboots or gets a new internal IP
      address. As with persistent IP addresses, persistent HOST_IDENTs
      facilitate user tracking over time.</t>

      <t>As a general matter, the HOST_IDENT proposals do not seek to make
      hosts any more identifiable than they would be if they were using a
      public, non-shared IP address. However, depending on the solution
      proposal, the addition of HOST_IDENT information may allow a device to
      be fingerprinted more easily than it otherwise would be. To prevent
      this, the following design considerations are to be taken into
      account:<list style="symbols">
          <t hangText="Uniqueness of identifiers in HOST_IDENT:">It is
          recommended that HOST_IDENTs be limited to providing local
          uniqueness rather than global uniqueness.</t>

          <t hangText="Refresh rate of HOST_IDENT:">The address-sharing
          function should not use permanent HOST_IDENT values.</t>
        </list></t>

      <t>Should multiple solutions be combined (e.g., TCP Option and Forwarded
      header) that include different pieces of information in the HOST_IDENT,
      fingerprinting may become even easier. To prevent this, an
      address-sharing function, able to inject HOST_IDENTs in several layers,
      should reveal the same subsets of information at each layer. For
      example, if one references the lower 16 bits of an IPv4 address, the
      other should reference these 16 bits too.</t>

      <t>A HOST_IDENT can be spoofed, just as an IP address can be spoofed.
      Furthermore, users of network-based anonymity services (like
      Tor) may be capable of stripping HOST_IDENT information before it
      reaches its destination.</t>

      <t>In order to control the information revealed to external parties, an
      address-sharing function should be able to strip, rewrite and add
      HOST_IDENT fields.</t>

      <t>An address-sharing function may be configured to enforce different
      end-user preferences with regard to HOST_IDENT injection. For example,
      HOST_IDENT injection can be disabled for some users. This feature is
      policy-based and deployment-specific.</t>

      <t>HOST_IDENT specification document(s) should explain the privacy
      impact of the solutions they specify, including the extent of HOST_IDENT
      uniqueness and persistence, assumptions made about the lifetime of the
      HOST_IDENT, whether and how the HOST_IDENT can be obfuscated or
      recycled, whether location information can be exposed, and the impact of
      the use of the HOST_IDENT on device or implementation fingerprinting.
      <xref target="I-D.iab-privacy-considerations"></xref> provides further
      guidance.</t>

      <t>For more discussion about privacy, refer to <xref
      target="RFC6462"></xref>.</t>
    </section>

    <section anchor="Analysis" title="Detailed Solutions Analysis">
      <t></t>

      <section anchor="ip_id"
               title="Use the Identification Field of IPv4 Header (IP-ID)">
        <t></t>

        <section title="Description">
          <t>The IPv4 ID (Identification field of IP header, i.e., IP-ID) can
          be used to insert information which uniquely distinguishes a host
          among those sharing the same IPv4 address. The use of IP-ID as a
          channel to convey HOST_IDENT is a theoretical construct (i.e., it is
          an undocumented proposal).</t>

          <t>An address-sharing function can re-write the IP-ID field to
          insert a value unique to the host (16 bits are sufficient to
          uniquely disambiguate hosts sharing the same IP address). The
          address-sharing function injecting the HOST_IDENT must follow the
          rules defined in <xref target="RFC6864"></xref>; in particular the
          same HOST_IDENT is not re-assigned to another host sharing the same
          IP address during a given time interval.</t>

          <t>A variant of this approach relies upon the format of certain
          packets, such as TCP SYN, where the IP-ID can be modified to contain
          a 16-bit HOST_IDENT.</t>

          <t>Address-sharing devices using this solution would be required to
          indicate that they do so, possibly using a special DNS record.</t>
        </section>

        <section title="Analysis">
          <t>This usage is not consistent with the fragment reassembly use of
          the Identification field <xref target="RFC0791"></xref> or the
          updated handling rules for the Identification field <xref
          target="RFC6864"></xref>.</t>

          <t>Complications may arise if the packet is fragmented before
          reaching the device injecting the HOST_IDENT. To appropriately
          handle those packet fragments, the address-sharing function will
          need to maintain a lot of state.</t>

          <t>Another complication arises where translation is balanced among
          several NATs: having a given NAT set the appropriate HOST_IDENT
          would alter the coordination between those NATs. Of
          course, one can argue this coordinated NAT scenario is not a typical
          deployment scenario; regardless, using IP-ID as a channel to convey
          a HOST_IDENT is ill-advised.</t>
        </section>
      </section>

      <section anchor="ip_option" title="Define an IP Option">
        <t></t>

        <section title="Description">
          <t>An alternative solution to convey the HOST_IDENT is to define an
          IP option <xref target="RFC0791"></xref>. A HOST_IDENT IP option can
          be inserted by the address-sharing function to uniquely distinguish
          a host among those sharing the same IP address. An example of such
          an option is documented in <xref
          target="I-D.chen-intarea-v4-uid-header-option"></xref>. This IP
          option allows the conveyance of an IPv4 address, an IPv6 prefix, a
          GRE (Generic Routing Encapsulation) key, an IPv6 Flow Label,
          etc.</t>

          <t>Another way of using an IP option has been described in Section
          4.6 of <xref target="RFC3022"></xref>.</t>
        </section>

        <section title="Analysis">
          <t>This proposal can apply to any transport protocol. Nevertheless,
          it is widely known that routers and other middleboxes filter IP
          options (e.g., drop IP packets with unknown IP options, strip
          unknown IP options, etc.). </t>

          <t>Injecting the HOST_IDENT IP Option introduces some
          implementation complexity in the following cases: <list
              style="symbols">
              <t>The packet is at or close to the MTU size.</t>

              <t>The options space is exhausted.</t>
            </list></t>

          <t>Previous studies demonstrated that "IP Options are not an option"
          (Refer to <xref target="Not_An_Option"></xref>, <xref
          target="Options"></xref>).</t>

          <t>In conclusion, using an IP option to convey a HOST_IDENT is not
          viable.</t>
        </section>
      </section>

      <section anchor="tcp" title="Define a TCP Option">
        <t></t>

        <section title="Description">
          <t>HOST_IDENT may be conveyed in a dedicated TCP Option. An example
          is specified in <xref target="I-D.wing-nat-reveal-option"></xref>.
          This option encloses the TCP client's identifier (e.g., the lower 16
          bits of its IPv4 address, its VLAN ID, VRF ID, or subscriber ID).
          The address-sharing device inserts this TCP Option into the TCP SYN
          packet.</t>
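
          <t>As an illustration only, the following sketch derives a 16-bit
          value from the lower 16 bits of a client's internal IPv4 address,
          one of the identifier choices listed above. The function name
          host_ident_from_ipv4 and the sample addresses are assumptions made
          for this example; they are not taken from the referenced option
          specification. The sketch also assumes that hosts sharing an
          external address differ in the lower 16 bits of their internal
          addresses; otherwise the derived values collide.</t>

          <t><figure align="center">
              <artwork><![CDATA[
```python
import ipaddress


def host_ident_from_ipv4(internal_address: str) -> int:
    """Return the lower 16 bits of an internal IPv4 address.

    Hypothetical helper: one possible way for an address-sharing
    function to derive a 16-bit HOST_IDENT.
    """
    as_integer = int(ipaddress.IPv4Address(internal_address))
    return as_integer & 0xFFFF  # keep only the two low-order octets


# Two hosts behind the same external address (sample addresses).
first = host_ident_from_ipv4("10.0.1.2")
second = host_ident_from_ipv4("10.0.1.3")
```
]]></artwork>
            </figure></t>

          <t>Because only the low-order octets are kept, the value is
          locally meaningful and does not expose the full internal
          address.</t>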
        </section>

        <section title="Analysis">
          <t>Using a new TCP Option to convey the HOST_IDENT does not require
          any modification to the applications, but it is applicable only to
          TCP-based applications. Applications relying on other transport
          protocols are therefore not covered.</t>

          <t><xref target="I-D.wing-nat-reveal-option"></xref> discusses the
          interference with other TCP Options.</t>

          <t>The risk of session failures due to handling a new TCP
          Option is low, as measured in <xref target="Options"></xref>. <xref
          target="I-D.abdo-hostid-tcpopt-implementation"></xref> provides a
          detailed implementation and experimentation report of a HOST_IDENT
          TCP Option. This document investigated in depth the impact of
          activating HOST_IDENT on the host, the address-sharing function, and
          the enforcement of policies at the server side. It also reports a
          failure ratio of 0.103% among the top 100,000 websites.</t>

          <t>Some downsides have been raised against defining a TCP Option to
          reveal a host identity:</t>

          <t><list style="symbols">
              <t>Conveying an IP address in a TCP Option may be seen as a
              violation of OSI layers, but since IP addresses are already used
              for the checksum computation, this is not seen as a blocking
              point. Moreover, the updated version of <xref
              target="I-D.wing-nat-reveal-option"></xref> no longer allows
              conveyance of a full IP address, as the HOST_IDENT is encoded in
              16 bits.</t>

              <t>TCP Option space is limited and might be consumed by the TCP
              client. <xref
              target="I-D.abdo-hostid-tcpopt-implementation"></xref> discusses
              two approaches to sending the HOST_IDENT: sending the HOST_IDENT
              in the TCP SYN (which consumes more bytes in the TCP header of
              the TCP SYN) and sending the HOST_IDENT in a TCP ACK (which
              consumes only two bytes in the TCP SYN).</t>

              <t>Content providers may find it more desirable to receive the
              HOST_IDENT in the TCP SYN, as that more closely preserves the
              HOST_IDENT received in the source IP address as per current
              practices. Moreover, sending the HOST_IDENT in the TCP SYN does
              not interfere with <xref
              target="I-D.ietf-tcpm-fastopen"></xref>. In the ACK mode, if the
              server is configured to deliver different data based on
              HOST_IDENT, then it would have to wait for the ACK before
              transmitting data.</t>

              <t>Injecting the HOST_IDENT TCP Option introduces some
              implementation complexity if the options space is exhausted.
              Specification document(s) should specify in detail the behavior
              of the address-sharing function in such a case.</t>

              <t>It is more complicated to implement sending the HOST_IDENT in
              a TCP ACK, as it can introduce MTU issues if the ACK packet also
              contains TCP data, or if a TCP segment is lost. Note that MTU
              complications can also be experienced if user data is included
              in a SYN packet (e.g., <xref
              target="I-D.ietf-tcpm-fastopen"></xref>).</t>

              <t>When there are several NATs in the path, the original
              HOST_IDENT may be lost. The loss of the original HOST_IDENT may
              not be a problem as the target usage is between proxies or a CGN
              and server. Only the information leaked in the last
              communication leg (i.e., between the last address-sharing
              function and the server) is likely to be useful.</t>

              <t>Interference with usages such as Forwarded HTTP header (see
              <xref target="xff"></xref>) should be elaborated to specify the
              behavior of servers when both options are used; in particular,
              specify which information to use: the content of the TCP Option
              or what is conveyed in the application headers.</t>

              <t>When load-balancers or proxies are in the path, this option
              does not allow the preservation of the original source IP
              address and source port. Preserving such information is required
              for logging purposes for instance (e.g., <xref
              target="RFC6302"></xref>). <xref
              target="I-D.abdo-hostid-tcpopt-implementation"></xref> defines a
              TCP Option which allows revealing various combinations of source
              information (e.g., source port, source port and source IP
              address, source IPv6 prefix, etc.).</t>
            </list>More discussion about issues raised when extending TCP can
          be found at <xref target="ExtendTCP"></xref>.</t>
        </section>
      </section>

      <section anchor="xff"
               title="Inject Application Protocol Message Headers">
        <t></t>

        <section title="Description">
          <t>Another option is to require no change at the transport or IP
          level, and instead to convey in the application payload the
          information that will be used to disambiguate hosts. The
          format of the conveyed information and the related semantics depend
          on the application (e.g., HTTP, SIP, SMTP, etc.).</t>

          <t>Related mechanisms could be developed for other application-layer
          protocols, but the discussion in this document is limited to HTTP
          and similar protocols.</t>

          <t>For HTTP, the Forwarded header (<xref
          target="I-D.ietf-appsawg-http-forwarded"></xref>) can be used to
          convey the original IP address when an address-sharing device is
          involved. Service Providers operating address-sharing devices can
          enable injection of the Forwarded header, which will
          enclose the original IPv4 address or the IPv6 prefix part (see the
          example shown in <xref target="xff_eg"></xref>). The address-sharing
          device has to strip all included Forwarded headers before injecting
          its own. Servers may rely on the contents of this field to enforce
          some policies such as blacklisting misbehaving users.</t>

          <t>Note that the X-Forwarded-For (XFF) header is obsoleted by <xref
          target="I-D.ietf-appsawg-http-forwarded"></xref>.</t>

          <t><figure align="center" anchor="xff_eg"
              title="Example of Forwarded Header">
              <artwork><![CDATA[Forwarded: for=192.0.2.1,for=[2001:db8::1]
Forwarded: proto=https;by=192.0.2.15]]></artwork>
            </figure></t>
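
          <t>To make the structure of the example above concrete, the
          following non-normative sketch shows how a server might split
          Forwarded header field values into per-hop parameter sets. The
          function name parse_forwarded is an assumption made for this
          example. The sketch is deliberately simplified: it assumes that
          quoted values contain no commas or semicolons, and it performs no
          validation of the addresses it extracts.</t>

          <t><figure align="center">
              <artwork><![CDATA[
```python
def parse_forwarded(header_values):
    """Split Forwarded field values into one dict per forwarding hop.

    Hypothetical helper. Elements are comma-separated; the parameters
    within an element are semicolon-separated name=value pairs.
    """
    hops = []
    for value in header_values:
        for element in value.split(","):
            params = {}
            for pair in element.split(";"):
                name, sep, val = pair.strip().partition("=")
                if not sep:
                    continue  # ignore anything that is not name=value
                # Parameter names are compared case-insensitively.
                params[name.lower()] = val.strip().strip('"')
            if params:
                hops.append(params)
    return hops


# The two header lines from the example figure.
hops = parse_forwarded([
    "for=192.0.2.1,for=[2001:db8::1]",
    "proto=https;by=192.0.2.15",
])
```
]]></artwork>
            </figure></t>

          <t>In this sketch, the first element of the returned list
          corresponds to the first address listed in the header; whether
          that value can be trusted depends on which devices in the path
          are trusted to inject it.</t>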
        </section>

        <section title="Analysis">
          <t>Not all applications impacted by address sharing can support the
          ability to disclose the original IP address. Only a subset of
          protocols (e.g., HTTP) can rely on this solution.</t>

          <t>For the HTTP case, to prevent users injecting invalid
          HOST_IDENTs, an initiative has been launched by Wikipedia to
          maintain a list of trusted ISPs (Internet Service Providers) using
          XFF (see the list available at <xref target="Trusted_ISPs"></xref>).
          If an address-sharing device is on the trusted XFF ISPs list, users
          editing Wikipedia located behind the address-sharing device will
          appear to be editing from their "original" IP address and not from
          the NATed IP address. If an offending activity is detected,
          individual hosts can be blacklisted instead of all hosts sharing the
          same IP address.</t>

          <t>XFF header injection is a common practice of load balancers. When
          a load balancer is in the path, the original content of any included
          XFF header should not be stripped. Otherwise the information about
          the "origin" IP address will be lost.</t>

          <t>When several address-sharing devices are crossed, the Forwarded
          header can convey the list of IP addresses (e.g., <xref
          target="xff_eg"></xref>). The origin HOST_IDENT can be exposed to
          the target server.</t>

          <t>Injecting the Forwarded header also introduces some implementation
          complexity if the HTTP message is at or close to the MTU size.</t>

          <t>It has been reported that "poor" HTTP proxy implementations may
          encounter parsing issues when injecting an XFF header.</t>

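          <t>Injecting the Forwarded header for all HTTPS traffic is
          infeasible, since an address-sharing function that does not
          terminate the secure session cannot modify the encrypted HTTP
          message. This may be problematic given the current HTTPS usage
          trends.</t>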
        </section>
      </section>

      <section anchor="PROXY" title="PROXY Protocol">
        <t></t>

        <section title="Description">
          <t>The solution, referred to as Proxy Protocol <xref
          target="Proxy"></xref>, does not require any application-specific
          knowledge. The rationale behind this solution (Proxy Protocol
          Version 1) is to insert identification data directly into the
          application data stream prior to the actual protocol data being
          sent, regardless of the protocol. Every application protocol would
          begin with a textual string of "PROXY", followed by some textual
          identification data, ending with a CRLF, and only then the
          application data would be inserted. <xref target="proxy_eg"></xref>
          shows an example of a line of data used for this, in this case for a
          TCP over IPv4 connection received from 192.0.2.1:56324 and destined
          to 192.0.2.15:443.</t>

          <t><figure align="center" anchor="proxy_eg"
              title="Example of PROXY connection report">
              <artwork><![CDATA[    PROXY TCP4 192.0.2.1 192.0.2.15 56324 443\r\n
]]></artwork>
            </figure></t>

          <t>Upon receipt of a message conveying this line, the server removes
          the line. The line is parsed to retrieve the transported protocol.
          The content of this line is recorded in logs and used to enforce
          policies.</t>
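
          <t>The server-side processing described above can be sketched as
          follows. This is a minimal, non-normative illustration: the
          function name parse_proxy_v1 and the shape of the returned
          dictionary are assumptions made for this example, and only the
          TCP4 and TCP6 forms of the Version 1 line are handled.</t>

          <t><figure align="center">
              <artwork><![CDATA[
```python
import ipaddress


def parse_proxy_v1(data: bytes):
    """Remove a leading PROXY line and return (info, remaining data).

    Hypothetical helper for the Version 1 textual format only.
    """
    line, sep, rest = data.partition(b"\r\n")
    if not sep:
        raise ValueError("PROXY line is not terminated by CRLF")
    fields = line.decode("ascii").split(" ")
    if len(fields) != 6 or fields[0] != "PROXY":
        raise ValueError("not a PROXY line with six fields")
    _, protocol, src, dst, src_port, dst_port = fields
    if protocol not in ("TCP4", "TCP6"):
        raise ValueError("unsupported protocol in this sketch")
    info = {
        "protocol": protocol,
        # ip_address() raises ValueError on malformed addresses.
        "source": (str(ipaddress.ip_address(src)), int(src_port)),
        "destination": (str(ipaddress.ip_address(dst)), int(dst_port)),
    }
    return info, rest


# The line from the example figure, followed by application data.
info, payload = parse_proxy_v1(
    b"PROXY TCP4 192.0.2.1 192.0.2.15 56324 443\r\nGET / HTTP/1.1\r\n"
)
```
]]></artwork>
            </figure></t>

          <t>The returned information would be what is logged and used
          for policy enforcement, while the remaining data is handed to the
          application unchanged.</t>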

          <t>Proxy Protocol Version 2 is designed to accommodate IPv4/IPv6 and
          also non-TCP protocols (see <xref target="Proxy"></xref> for more
          details).</t>
        </section>

        <section title="Analysis">
          <t>This solution can be deployed in a controlled environment, but it
          cannot be deployed to all access services available in the
          Internet. If the remote server does not support the Proxy Protocol,
          the session will fail. Other complications will arise due to the
          presence of firewalls, for instance.</t>

          <t>As a consequence, this solution is infeasible and cannot be
          recommended.</t>
        </section>
      </section>

      <section anchor="server" title="Assign Port Sets">
        <t></t>

        <section title="Description">
          <t>This solution does not require any action from the
          address-sharing function to disclose a host identifier. Instead of
          assuming all transport ports are associated with one single host,
          each host under the same external IP address is assigned a
          restricted port set. These port sets are then advertised to remote
          servers using off-line means. This announcement is not required for
          the delivery of internal services (i.e., offered by the service
          provider deploying the address-sharing function) relying on implicit
          identification.</t>

          <t>Port sets assigned to hosts may be static or dynamic.</t>

          <t>Port set announcements to remote servers are not required to
          reveal the identity of individual hosts but only to advertise the
          enforced policy to generate non-overlapping port sets (e.g., the
          transport space associated with an IP address is fragmented to
          contiguous blocks of 2048 port numbers).</t>
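
          <t>As a hedged illustration of how a remote server could use such
          an advertised policy, the sketch below maps a source port to the
          index of its port set. It assumes contiguous, equally sized blocks
          aligned on multiples of the block size and starting at port 0; the
          function names port_set_index and port_set_range are hypothetical.
          With blocks of 2048 ports, the 65536-port space yields 32 port
          sets per shared address.</t>

          <t><figure align="center">
              <artwork><![CDATA[
```python
PORT_SPACE = 65536  # size of the 16-bit transport port space


def port_set_index(port: int, block_size: int = 2048) -> int:
    """Return the index of the contiguous block containing 'port'."""
    if not 0 <= port < PORT_SPACE:
        raise ValueError("port out of range")
    return port // block_size


def port_set_range(index: int, block_size: int = 2048):
    """Return the (first, last) port numbers of the block 'index'."""
    first = index * block_size
    if not 0 <= first < PORT_SPACE:
        raise ValueError("port set index out of range")
    return first, min(first + block_size, PORT_SPACE) - 1


# Two connections seen from the same shared address: under the
# assumed policy, equal indexes mean the same port set.
same_set = port_set_index(56324) == port_set_index(57000)
other_set = port_set_index(57344)
```
]]></artwork>
            </figure></t>

          <t>A server applying this policy would key its per-host state on
          the pair (source IP address, port set index) rather than on the
          source IP address alone.</t>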

          <t>Examples of such an option are documented in <xref
          target="RFC6346"></xref> and <xref
          target="I-D.donley-behave-deterministic-cgn"></xref>.</t>
        </section>

        <section title="Analysis">
          <t>The solution does not require defining new fields or options; it
          is policy-based.</t>

          <t>The solution may conflict with port randomization (<xref
          target="RFC6056"></xref>) as identified in <xref
          target="RFC6269"></xref>. A mitigation would be to avoid assigning
          static port sets to individual hosts.</t>

          <t>The method is convenient for the delivery of services offered by
          the service provider also offering the Internet access service.</t>
        </section>
      </section>

      <section anchor="hip" title="Host Identity Protocol (HIP)">
        <t></t>

        <section title="Description">
          <t><xref target="RFC5201"></xref> specifies an architecture which
          introduces a new namespace to convey identity information.</t>
        </section>

        <section title="Analysis">
          <t>This solution requires both the client and the server to support
          HIP <xref target="RFC5201"></xref>. Additional architectural
          considerations, such as key exchanges, need to be taken into
          account.</t>

          <t>An alternative deployment model, which does not require the
          client to be HIP-enabled, is having the address-sharing function
          behave as a UDP/TCP-HIP relay. This model is also not viable as it
          assumes all servers are HIP-enabled.</t>

          <t>This solution is a theoretical construct (i.e., the proposal is
          not documented).</t>
        </section>
      </section>

      <section anchor="icmp"
               title="Use of a Notification Channel (e.g., ICMP)">
        <t></t>

        <section title="Description">
          <t>Another alternative is to convey the HOST_IDENT using a
          notification channel separate from the packets issued to invoke the
          service.</t>

          <t>An implementation example is defined in <xref
          target="I-D.yourtchenko-nat-reveal-ping"></xref>. This solution
          relies on a mechanism where the address-sharing function
          encapsulates the necessary host-identifying information into an ICMP
          Echo Request packet that it sends in parallel with the initial
          session creation (e.g., SYN). The information included in the ICMP
          Request Data portion describes the five-tuples as seen on both of
          the sides of the address-sharing function.</t>
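
          <t>The construction of such a companion message can be sketched as
          follows. This is a minimal illustration in Python, not the encoding
          defined in the referenced proposal: the magic number, the payload
          layout, and the function names are assumptions made for the
          example. Only the ICMP Echo Request header (type 8, code 0) and the
          Internet checksum follow the ICMP specification.</t>

          <figure>
            <artwork><![CDATA[
```python
import ipaddress
import struct

# Hypothetical magic number ("HOST" in ASCII) hinting that the Echo
# Request carries a HOST_IDENT; the value is an assumption of this sketch.
MAGIC = 0x484F5354


def inet_checksum(data: bytes) -> int:
    """Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pack_five_tuple(proto, src, sport, dst, dport) -> bytes:
    """Pack an IPv4 five-tuple into 13 bytes (hypothetical layout)."""
    return struct.pack(
        "!B4sH4sH",
        proto,
        ipaddress.IPv4Address(src).packed, sport,
        ipaddress.IPv4Address(dst).packed, dport,
    )


def build_echo_request(ident, seq, internal, external) -> bytes:
    """Build an ICMP Echo Request (type 8, code 0) describing a mapping."""
    payload = (struct.pack("!I", MAGIC)
               + pack_five_tuple(*internal)    # as seen by the host
               + pack_five_tuple(*external))   # as seen by the server
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    checksum = inet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, ident, seq) + payload
```
]]></artwork>
          </figure>

          <t>The sketch only builds the message; sending it alongside the
          initial SYN would additionally require a raw socket on the
          address-sharing function.</t>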
        </section>

        <section title="Analysis">
          <t><?rfc subcompact="yes" ?><list style="symbols">
              <t>This ICMP proposal is valid for any transport protocol that
              uses a port number. The address-sharing function may be
              configured with the transport protocols which will trigger
              issuing those ICMP messages.</t>

              <t>A hint should be provided to the ultimate server (or
              intermediate nodes) that the ICMP Echo Request conveys a
              HOST_IDENT. This may be implemented using magic numbers.</t>

              <t>Even if ICMP packets are blocked in the communication path,
              the user connection does not have to be impacted.</t>

              <t>Implementations requiring delay of the establishment of a
              session until receipt of the companion ICMP Echo Request may
              lead to some user experience degradation.</t>

              <t>Because of the presence of load-balancers in the path, the
              ultimate server receiving the SYN packet may not be the one
              which receives the ICMP message conveying the HOST_IDENT.</t>

              <t>Because of the presence of load-balancers in the path, the
              port number assigned by address sharing may be lost. Therefore
              the mapping information conveyed in the ICMP may not be
              sufficient to associate a SYN packet with a received ICMP.</t>

              <t>The proposal is not compatible with the presence of cascaded
              NAT. The main reason is that each NAT in the path will generate an
              ICMP message to reveal the internal host identifier. Because
              these messages will be translated by the downstream
              address-sharing devices, the remote server will receive multiple
              ICMP messages and will need to decide which host identifier to
              use.</t>

              <t>The ICMP proposal will add traffic overhead for both the
              server and the address-sharing device.</t>

              <t>The ICMP proposal is similar to other mechanisms (e.g.,
              Syslog <xref target="I-D.ietf-behave-syslog-nat-logging"></xref>,
              IPFIX <xref target="I-D.ietf-behave-ipfix-nat-logging"></xref>)
              for reporting dynamic mappings to a mediation platform (mainly
              for legal traceability purposes). Performance degradation is
              likely to be experienced by address-sharing functions because
              ICMP messages are sent for each newly instantiated mapping (and
              even when a mapping already exists).</t>

              <t>In some scenarios (e.g., Section 3 of <xref
              target="I-D.boucadair-pcp-nat-reveal"></xref>), HOST_IDENT
              should be interpreted by intermediate devices which embed Policy
              Enforcement Points (PEP, <xref target="RFC2753"></xref>)
              responsible for granting access to some services. These PEPs
              need to inspect all received packets in order to find the
              companion (traffic) messages to be correlated with ICMP messages
              conveying HOST_IDENTs. This adds complexity to these
              intermediate devices.</t>
            </list></t>

          <t><?rfc subcompact="no" ?></t>
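
          <t>The correlation burden discussed above can be illustrated with a
          small sketch. It assumes a server (or PEP) that keys received ICMP
          notifications by the external five-tuple and only waits a bounded
          time for the companion message; the class name, the timeout value,
          and the tuple representation are hypothetical. As noted in the
          list, the lookup fails whenever a load-balancer delivers the SYN
          and the ICMP message to different servers or rewrites the port
          number.</t>

          <figure>
            <artwork><![CDATA[
```python
import time


class CompanionTable:
    """Associates companion ICMP notifications with new sessions."""

    def __init__(self, max_wait=0.2, clock=time.monotonic):
        self.max_wait = max_wait      # seconds; arbitrary for the sketch
        self.clock = clock
        self.host_ids = {}            # five-tuple -> (host_ident, time)

    def on_icmp(self, five_tuple, host_ident):
        """Record the HOST_IDENT announced for an external five-tuple."""
        self.host_ids[five_tuple] = (host_ident, self.clock())

    def lookup(self, five_tuple):
        """Return the HOST_IDENT for a new session, or None if the
        companion message is missing or too old."""
        entry = self.host_ids.pop(five_tuple, None)
        if entry is None:
            return None
        host_ident, seen = entry
        if self.clock() - seen > self.max_wait:
            return None
        return host_ident
```
]]></artwork>
          </figure>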
        </section>
      </section>

      <section anchor="ident" title="Use Out-of-Band Mechanisms (e.g., IDENT)">
        <t></t>

        <section title="Description">
          <t>Another alternative is to retrieve the HOST_IDENT using a
          dedicated query channel.</t>

          <t>An implementation example may rely on the Identification Protocol
          (IDENT, <xref target="RFC1413"></xref>). This solution assumes the
          address-sharing function implements the server part of IDENT, while
          remote servers implement the client part of the protocol. IDENT
          needs to be updated (see <xref target="IDENT_NAT"></xref>) to be
          able to return a host identifier instead of the user-id as defined
          in <xref target="RFC1413"></xref>. The IDENT response syntax uses
          the same USERID field described in <xref target="RFC1413"></xref>
          but rather than returning a username, a host identifier (e.g., a
          16-bit value) is returned <xref target="IDENT_NAT"></xref>. For any
          new incoming connection, the server contacts the IDENT server to
          retrieve the associated identifier. During that phase, the
          connection may be delayed.</t>
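
          <t>The exchange can be sketched from the point of view of the
          remote server acting as IDENT client. The query and reply syntax
          below follows <xref target="RFC1413"></xref>; treating the returned
          user-id as a host identifier is the assumption carried over from
          <xref target="IDENT_NAT"></xref>, and the function names and the
          sample identifier are hypothetical. The sketch trims whitespace
          around the returned identifier for convenience, although <xref
          target="RFC1413"></xref> treats every octet after the last colon as
          part of the identifier.</t>

          <figure>
            <artwork><![CDATA[
```python
IDENT_PORT = 113  # TCP port assigned to the Identification Protocol


def build_ident_query(port_on_nat: int, port_on_server: int) -> bytes:
    """Build the query line sent to the IDENT server.

    The first port is the one on the queried side (here, the external
    source port used by the address-sharing function); the second is the
    port on the querying side (the remote server's own port).
    """
    return f"{port_on_nat} , {port_on_server}\r\n".encode("ascii")


def parse_ident_reply(line: bytes) -> dict:
    """Parse a USERID or ERROR reply into a small dictionary."""
    text = line.decode("ascii", errors="replace").rstrip("\r\n")
    fields = text.split(":", 3)
    if len(fields) < 3:
        raise ValueError("malformed IDENT reply")
    first, second = (int(p) for p in fields[0].split(","))
    kind = fields[1].strip()
    if kind == "USERID" and len(fields) == 4:
        # The last field carries the host identifier in this scheme.
        return {"ports": (first, second), "host_ident": fields[3].strip()}
    if kind == "ERROR":
        return {"ports": (first, second), "error": fields[2].strip()}
    raise ValueError("unexpected IDENT reply type")
```
]]></artwork>
          </figure>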
        </section>

        <section title="Analysis">
          <t><?rfc subcompact="yes" ?><list style="symbols">
              <t>IDENT is specific to TCP. Alternative out-of-band mechanisms
              may be designed to cover other transport protocols such as
              UDP.</t>

              <t>This solution requires the address-sharing function to embed
              an IDENT server.</t>

              <t>A hint should be provided to the ultimate server (or
              intermediate nodes) that the address-sharing function implements
              the IDENT protocol. A solution example is to publish this
              capability using DNS; other solutions can be envisaged.</t>

              <t>An out-of-band mechanism may require some administrative
              setup (e.g., contract agreement) between the entity managing the
              address-sharing function and the entity managing the remote
              server. Such a deployment is not feasible in the Internet at
              large because establishing and maintaining agreements between
              ISPs and all service actors is burdensome and not scalable.</t>

              <t>Implementations requiring delay of the establishment of a
              session until receipt of the companion IDENT response may lead
              to some user experience degradation.</t>

              <t>The IDENT proposal will add traffic overhead for both the
              server and the address-sharing device.</t>

              <t>Performance degradation is likely to be experienced by
              address-sharing functions embedding the IDENT server. This is
              further exacerbated if the address-sharing function has to
              handle an IDENT query for each newly instantiated mapping (and
              even when a mapping already exists).</t>

              <t>In some scenarios (e.g., Section 3 of <xref
              target="I-D.boucadair-pcp-nat-reveal"></xref>), HOST_IDENT
              should be interpreted by intermediate devices which embed Policy
              Enforcement Points (PEP, <xref target="RFC2753"></xref>)
              responsible for granting access to some services. These PEPs
              need to inspect all received packets in order to generate the
              companion IDENT queries. This may add complexity to
              these intermediate devices.</t>

              <t>IDENT queries may be generated by illegitimate TCP servers.
              This would require the address-sharing function to enforce some
              policies (e.g., rate limit queries, filter based on the source
              IP address, etc.).</t>
            </list></t>

          <t><?rfc subcompact="no" ?></t>
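
          <t>The policies mentioned in the last item (rate limiting and
          filtering on the source IP address) could, for instance, be
          combined as in the following sketch. It is one possible approach
          under stated assumptions: a per-source token bucket, an allow-list
          of prefixes, and arbitrary rate and burst values. None of these
          names or parameters come from the IDENT specification.</t>

          <figure>
            <artwork><![CDATA[
```python
import ipaddress
import time


class IdentQueryPolicy:
    """Source filter plus per-source token bucket for IDENT queries."""

    def __init__(self, allowed_prefixes, rate=10.0, burst=20,
                 clock=time.monotonic):
        self.allowed = [ipaddress.ip_network(p) for p in allowed_prefixes]
        self.rate = rate          # tokens added per second
        self.burst = burst        # bucket capacity
        self.clock = clock
        self.buckets = {}         # source address -> (tokens, last time)

    def permit(self, src: str) -> bool:
        """Return True if a query from 'src' may be answered."""
        addr = ipaddress.ip_address(src)
        if not any(addr in net for net in self.allowed):
            return False          # source is not on the allow-list
        now = self.clock()
        tokens, last = self.buckets.get(src, (float(self.burst), now))
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self.buckets[src] = (tokens, now)
            return False          # rate limit exceeded
        self.buckets[src] = (tokens - 1.0, now)
        return True
```
]]></artwork>
          </figure>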
        </section>
      </section>
    </section>

    <section anchor="synthesis" title="Solutions Analysis: Synthesis">
      <t>Table 1 summarizes the approaches analyzed in this document.</t>

      <t><?rfc subcompact="yes" ?></t>

      <t><list style="symbols">
          <t>"Encrypted Traffic" refers to TLS. The use of IPsec and its
          complications to traverse NATs are discussed in Section 2.2 of <xref
          target="I-D.ietf-behave-64-analysis"></xref>. Similar to what is
          suggested in Section 13.5 of <xref target="RFC6269"></xref>,
          HOST_IDENT specification document(s) should analyze in detail the
          compatibility of each IPsec mode.</t>

          <t>"Success ratio" indicates the ratio of successful communications
          with remote servers when the HOST_IDENT is injected using a
          candidate solution. More details are provided below to explain how
          the success ratio is computed for each candidate solution.</t>

          <t>"Possible Perf Impact" indicates the level of expected
          performance degradation. The indicated level reflects whether the
          injection requires some treatment at the IP level.</t>

          <t>"OS TCP/IP Modif" indicates whether a modification of the OS
          TCP/IP stack is required at the server side.</t>

          <t>"Deployable today" indicates if the solution can be generalized
          without any constraint on current architectures and practices.</t>
        </list></t>

      <t><?rfc subcompact="no" ?></t>

      <figure align="center" title="Table 1: Summary of analyzed solutions.">
        <artwork><![CDATA[          +-----+------+------+------+-----+-----+-----+-----+-----+
          |IP-ID| IP   | TCP  |HTTP  |PROXY|Port | HIP |ICMP |IDENT|
          |     |Option|Option|Header|     | Set |     |     |     |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
UDP       | Yes | Yes  | No   | No   | No  | Yes |     | Yes | No  |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
TCP       | Yes | Yes  | Yes  | No   | Yes | Yes |     | Yes | Yes |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
HTTP      | Yes | Yes  | Yes  | Yes  | Yes | Yes |     | Yes | Yes |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
Encrypted | Yes | Yes  | Yes  | No   | Yes | Yes |     | Yes | Yes |
Traffic   |     |      |      |      |     |     |     |     |     |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
Success   | 100%| 30%  | 99%  | 100% | Low | 100%|Low  |~100%|~100%|
Ratio     |     |      |      |      |     |     |     |     |     |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
Possible  | Low | High | Low  |  Med | High| No  | N/A | High|High |
Perf      |  to |      |  to  |   to |     |     |     |     |     |
Impact    | Med |      | Med  | High |     |     |     |     |     |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
OS TCP/IP | Yes | Yes  | Yes  | No   | No  | No  |     | Yes | Yes |
Modif     |     |      |      |      |     |     |     |     |     |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
Deployable| Yes | Yes  | Yes  | Yes  | No  | Yes | No  | Yes | Yes |
Today     |     |      |      |      |     |     |     |     |     |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
Notes     | (1) |  (8) | (8)  |  (2) | (8) | (1) | (4) | (6) | (1) |
          | (7) |      |      |      |     | (3) | (7) | (8) | (6) |
          |     |      |      |      |     |     |     |     | (8) |
----------+-----+------+------+------+-----+-----+-----+-----+-----+
 Notes:
 (1)  Requires mechanism to advertise NAT is participating in this
      scheme (e.g., DNS PTR record).
 (2)  This solution is widely deployed (e.g., HTTP Servers,
      Load-Balancers, etc.).
 (3)  When the port set is not advertised, the solution is less
      efficient for third-party services.
 (4)  Requires the client and the server to be HIP-compliant and HIP
      infrastructure to be deployed. If the client and the server are 
      HIP-enabled, the address-sharing function does not need to 
      insert an identifier. If the client is not HIP-enabled, 
      designing the device that performs address sharing to act 
      as a UDP/TCP-HIP relay is not viable.
 (6)  The solution is inefficient in some scenarios (see Section 5).
 (7)  The solution is a theoretical construct (i.e., the solution 
      is not documented).
 (8)  The solution is a documented proposal.  
]]></artwork>
      </figure>

      <t><?rfc subcompact="no" ?></t>

      <t>The success ratio figures provided for TCP and IP options are
      inspired by the results documented in <xref target="Options"></xref>
      and <xref target="I-D.abdo-hostid-tcpopt-implementation"></xref>.</t>

      <t>The provided success ratio for IP-ID is theoretical; it assumes the
      address-sharing function follows the rules in <xref
      target="RFC6864"></xref> to re-write the IP Identification field.</t>

      <t>Since PROXY and HIP are not widely deployed, the success ratio for
      establishing a communication with remote servers using these protocols
      is low.</t>

      <t>The success ratio for the ICMP-based solution is
      implementation-specific but it is likely to be close to 100%. The
      success ratio depends on how efficiently the solution is implemented on
      the server side. A remote server which does not support the ICMP-based
      solution will ignore received companion ICMP messages. An upgraded
      server will need to delay accepting a session until receiving the
      companion ICMP message.</t>

      <t>The success ratio for the IDENT solution is implementation-specific
      but it is likely to be close to 100%. The success ratio depends on how
      efficiently the solution is implemented on the server side. A remote
      server which does not support IDENT will accept a session establishment
      request following its normal operation. An upgraded server will need to
      delay accepting a session until receipt of the response to the IDENT
      request it will send to the host.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document does not require any action from IANA.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>The same security concerns apply to the injection of an IP option,
      a TCP option, and application-related content (e.g., Forwarded HTTP
      header) by the address-sharing device. If the server trusts the content
      of the HOST_IDENT field, a third-party user can be impacted by a
      misbehaving user who supplies a "faked" HOST_IDENT (e.g., original IP
      address).</t>

      <t>HOST_IDENT may be used to leak information about the internal
      structure of a network behind an address-sharing function. If this
      behavior is undesired by the network administrator, the address-sharing
      function can be configured to strip any existing HOST_IDENT in received
      packets from internal hosts.</t>
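
      <t>For the application-level case, the stripping behavior can be
      sketched as below. The example assumes an address-sharing function
      that already parses HTTP requests into (name, value) header pairs; the
      set of header names and the function name are assumptions of the
      sketch, and equivalent handling of IP or TCP options would have to be
      done at the corresponding layer.</t>

      <figure>
        <artwork><![CDATA[
```python
# Header names assumed to carry a HOST_IDENT in this sketch.
HOST_IDENT_HEADERS = {"forwarded", "x-forwarded-for"}


def strip_host_ident(headers):
    """Return the header list without host-identifying headers.

    'headers' is a list of (name, value) pairs as received from an
    internal host; header names are compared case-insensitively.
    """
    return [(name, value) for name, value in headers
            if name.strip().lower() not in HOST_IDENT_HEADERS]
```
]]></artwork>
      </figure>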

      <t>HOST_IDENT specification documents should elaborate further on
      threats inherent to each individual solution used to convey the
      HOST_IDENT (e.g., use of the IP-ID field to count hosts behind a NAT
      <xref target="Count"></xref>).</t>

      <t>For more discussion of privacy issues related to HOST_IDENT, see
      <xref target="privacy"></xref>.</t>
    </section>

    <section title="Acknowledgments">
      <t>Many thanks to D. Wing, C. Jacquenet, J. Halpern, B. Haberman, and P.
      Yee for their review, comments and inputs.</t>

      <t>Thanks also to P. McCann, T. Tsou, Z. Dong, B. Briscoe, T. Taylor, M.
      Blanchet, D. Wing, and A. Yourtchenko for the discussions in Prague.</t>

      <t>Some of the issues related to defining a new TCP Option have been
      raised by L. Eggert.</t>

      <t>The privacy text was provided by A. Cooper.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include='reference.RFC.0791'?>

      <?rfc include='reference.RFC.6056'?>

      <?rfc include='reference.RFC.3022'?>
    </references>

    <references title="Informative References">
      <?rfc include='reference.I-D.wing-nat-reveal-option'?>

      <?rfc include='reference.I-D.ietf-behave-ipfix-nat-logging'?>

      <?rfc include='reference.I-D.ietf-behave-syslog-nat-logging'?>

      <?rfc include='reference.I-D.ietf-appsawg-http-forwarded'?>

      <?rfc include='reference.I-D.iab-privacy-considerations'?>

      <?rfc include='reference.RFC.1413'?>

      <?rfc include='reference.RFC.5201'?>

      <?rfc include='reference.RFC.6462'?>

      <?rfc include='reference.RFC.6346'?>

      <?rfc include='reference.RFC.2753'?>

      <?rfc include='reference.RFC.6302'?>

      <?rfc include='reference.I-D.yourtchenko-nat-reveal-ping'?>

      <?rfc include='reference.I-D.chen-intarea-v4-uid-header-option'?>

      <?rfc include='reference.I-D.boucadair-pcp-nat-reveal'?>

      <?rfc include='reference.I-D.ietf-tcpm-fastopen'?>

      <?rfc include='reference.I-D.donley-behave-deterministic-cgn'?>

      <?rfc include='reference.RFC.6146'?>

      <?rfc include='reference.I-D.abdo-hostid-tcpopt-implementation'?>

      <?rfc include='reference.RFC.6269'?>

      <?rfc include='reference.RFC.6864'?>

      <?rfc include='reference.I-D.ietf-behave-64-analysis'?>

      <reference anchor="Not_An_Option"
                 target="http://www.eecs.berkeley.edu/Pubs/TechRpts/2005/EECS-2005-24.html">
        <front>
          <title>IP options are not an option</title>

          <author initials="R." surname="Fonseca" />

          <author initials="G." surname="Porter" />

          <author initials="R." surname="Katz" />

          <author initials="S." surname="Shenker" />

          <author initials="I." surname="Stoica" />

          <date year="2005" />
        </front>
      </reference>

      <reference anchor="Options"
                 target="http://conferences.sigcomm.org/imc/2004/papers/p336-medina.pdf">
        <front>
          <title>Measuring Interactions Between Transport Protocols and
          Middleboxes</title>

          <author fullname="Alberto Medina" initials="A." surname="Medina" />

          <author fullname="Mark Allman" initials="M." surname="Allman" />

          <author fullname="Sally Floyd" initials="S." surname="Floyd" />

          <date year="2004" />
        </front>
      </reference>

      <reference anchor="ExtendTCP"
                 target="http://nrg.cs.ucl.ac.uk/mjh/tmp/mboxes.pdf">
        <front>
          <title>Is it still possible to extend TCP?</title>

          <author initials="M." surname="Honda" />

          <author initials="Y." surname="Nishida" />

          <author initials="C." surname="Raiciu" />

          <author initials="A." surname="Greenhalgh" />

          <author initials="M." surname="Handley" />

          <author initials="H." surname="Tokuda" />

          <date month="November" year="2011" />
        </front>
      </reference>

      <reference anchor="Trusted_ISPs"
                 target="http://meta.wikimedia.org/wiki/XFF_project#Trusted_XFF_list">
        <front>
          <title>Trusted XFF list</title>

          <author>
            <organization></organization>
          </author>

          <date />
        </front>
      </reference>

      <reference anchor="Count"
                 target="http://www.cs.columbia.edu/~smb/papers/fnat.pdf">
        <front>
          <title>A technique for counting NATted hosts</title>

          <author fullname="Steve Bellovin" initials="S." surname="Bellovin">
            <organization></organization>
          </author>

          <date />
        </front>
      </reference>

      <reference anchor="Proxy"
                 target="http://haproxy.1wt.eu/download/1.5/doc/proxy-protocol.txt">
        <front>
          <title>The PROXY protocol</title>

          <author fullname="Willy Tarreau" initials="W." surname="Tarreau">
            <organization></organization>
          </author>

          <date month="November" year="2010" />
        </front>
      </reference>

      <reference anchor="IDENT_NAT" target="draft-wing-intarea-ident">
        <front>
          <title>Using the Identification Protocol with an Address Sharing
          Device</title>

          <author fullname="Dan Wing" initials="D." surname="Wing">
            <organization></organization>
          </author>

          <date month="August" year="2012" />
        </front>
      </reference>
    </references>
  </back>
</rfc>
