One document matched: draft-donley-behave-deterministic-cgn-01.xml


<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="exp" docName="draft-donley-behave-deterministic-cgn-01"
     ipr="trust200902">
  <front>
    <title abbrev="deterministic-cgn">Deterministic Address Mapping to Reduce
    Logging in Carrier Grade NAT Deployments</title>

    <author fullname="Chris Donley" initials="C.D." surname="Donley">
      <organization>CableLabs</organization>

      <address>
        <postal>
          <street>858 Coal Creek Cir</street>

          <city>Louisville</city>

          <region>CO</region>

          <code>80027</code>

          <country>US</country>
        </postal>

        <email>c.donley@cablelabs.com</email>
      </address>
    </author>

    <author fullname="Chris Grundemann" initials="C.G." surname="Grundemann">
      <organization>CableLabs</organization>

      <address>
        <postal>
          <street>858 Coal Creek Cir</street>

          <city>Louisville</city>

          <region>CO</region>

          <code>80027</code>

          <country>US</country>
        </postal>

        <email>c.grundemann@cablelabs.com</email>
      </address>
    </author>

    <author fullname="Vikas Sarawat" initials="V.S." surname="Sarawat">
      <organization>CableLabs</organization>

      <address>
        <postal>
          <street>858 Coal Creek Cir</street>

          <city>Louisville</city>

          <region>CO</region>

          <code>80027</code>

          <country>US</country>
        </postal>

        <email>v.sarawat@cablelabs.com</email>
      </address>
    </author>

    <author fullname="Karthik Sundaresan" initials="K.S." surname="Sundaresan">
      <organization>CableLabs</organization>

      <address>
        <postal>
          <street>858 Coal Creek Cir</street>

          <city>Louisville</city>

          <region>CO</region>

          <code>80027</code>

          <country>US</country>
        </postal>

        <email>k.sundaresan@cablelabs.com</email>
      </address>
    </author>

    <date day="18" month="January" year="2012" />

    <abstract>
      <t>Many Carrier Grade NAT solutions require per-connection logging.
      Unfortunately, such logging is not scalable to many residential
      broadband services. This document suggests a way to manage Carrier Grade
      NAT translations in such a way as to significantly reduce the amount of
      logging required while providing traceability for abuse response.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>The world is rapidly running out of unallocated IPv4 addresses. To
      ensure IPv4 service continuity under the growing demands from new
      subscribers, devices, and service types, ISPs will be forced to share a
      single public IPv4 address among multiple subscribers using techniques
      such as <xref target="RFC6264">Carrier Grade Network Address Translation
      (CGN)</xref> (e.g., <xref target="I-D.shirasaki-nat444">NAT444</xref>,
      <xref target="RFC6333">DS-Lite</xref>, etc.). However, address sharing
      poses additional challenges to ISPs in responding to law enforcement
      requests or attack/abuse reports <xref target="RFC6269"></xref>. In
      order to respond to such requests to identify a specific user associated
      with an IP address, an ISP will need to map a subscriber's internal
      source IP address and source port with the global public IP address and
      source port provided by the CGN for every connection initiated by the
      user.</t>

      <t>CGN connection logging satisfies the need to identify attackers and
      respond to abuse/law enforcement requests, but it imposes significant
      operational challenges to ISPs. In lab testing, we have observed CGN log
      messages to be approximately 150 bytes long for <xref
      target="I-D.shirasaki-nat444">NAT444 </xref>, and 175 bytes for <xref
      target="RFC6333">DS-Lite </xref> (individual log messages vary somewhat
      in size). Although we are not aware of definitive studies of connection
      rates per subscriber, reports from several ISPs in the US sets the
      average number of connections per household per day at approximately
      33,000 connections per day. If each connection is individually logged,
      this translates to a data volume of approximately 5 MB per subscriber
      per day, or about 150 MB per subscriber per month; however, specific
      data volumes may vary across different ISPs based on myriad factors.
      Based on available data, a 1-million subscriber service provider will
      generate approximately 150 terabytes of log data per month, or 1.8
      petabytes per year.</t>

      <t>The volume of CGN logging can be reduced by assigning port ranges
      instead of individual ports. Using this method, only the assignment of a
      new port range is logged. This may massively reduce logging volume. The
      log reduction may vary depending on the length of the assigned port
      range, whether the port range is static or dynamic, etc. This has been
      acknowledged in <xref target="RFC6269">RFC6292</xref>:</t>

      <t>"Address sharing solutions may mitigate these issues to some extent
      by pre-allocating groups of ports. Then only the allocation of the group
      needs to be recorded, and not the creation of every session binding
      within that group. There are trade-offs to be made between the sizes of
      these port groups, the ratio of public addresses to subscribers, whether
      or not these groups timeout, and the impact on logging requirements and
      port randomization security <xref
      target="RFC6056">(RFC6056)</xref>."</t>

      <t>For the greatest reduction in logging, CGNs could be designed and/or
      configured to deterministically map internal addresses to {external
      address + port range} in such a way as to be able to algorithmically
      calculate the mapping. Only inputs and configuration of the algorithm
      need to be logged.</t>

      <t>This document describes a method for such CGN address mapping,
      combined with block port reservations, that significantly reduces the
      burden on ISPs while offering the ability to map a subscriber's inside
      IP address with an outside address and external port number observed on
      the Internet.</t>

      <t>The activation of the proposed port range allocation scheme is
      compliant with BEHAVE requirements such as the support of APP.</t>
    </section>

    <section title="Deterministic Port Ranges">
      <t>While a subscriber uses thousands of connections per day, most
      subscribers use far fewer at any given time. When the compression ratio
      (see Appendix B of <xref target="RFC6269">RFC6269</xref>) is low (e.g.,
      the ratio of the number of subscribers to the number of public IPv4
      addresses allocated to a CGN is closer to 10:1 than 1000:1), each
      subscriber could expect to have access to thousands of TCP/UDP ports at
      any given time. Thus, as an alternative to logging each connection, CGNs
      could deterministically map customer private addresses (received on the
      customer-facing interface of the CGN, a.k.a., internal side) to public
      addresses extended with port ranges (used on the Internet-facing
      interface of the CGN, a.k.a., external side). This algorithm allows an
      operator to identify a subscriber internal IP address when provided the
      public side IP and port number without having to examine the CGN
      translation logs. This prevents an operator from having to transport and
      store massive amounts of session data from the CGN and then process it
      to identify a subscriber.</t>

      <t>The algorithmic mapping can be expressed as:</t>

      <t>(External IP Address, Port Range) = function 1 (Internal IP
      Address)</t>

      <t>Internal IP Address = function 2 (External IP Address, Port
      Number)</t>

      <t>The CGN SHOULD provide a method for users to test both mapping
      functions (e.g., enter an External IP Address + Port Number and recieve
      the corrosponding Internal IP Address).</t>

      <t>Deterministic Port Range allocation requires configuration of the
      following variables:</t>

      <t><list style="symbols">
          <t>Inside IPv4/IPv6 address range (I);</t>

          <t>Outside IPv4 address range (O);</t>

          <t>Compression ratio (e.g. inside IP addresses I/outside IP
          addresses O) (C);</t>

          <t>Dynamic address pool factor (D), to be added to the compression
          ratio in order to create an overflow address pool;</t>

          <t>Maximum ports per user (M); and</t>

          <t>Reserved TCP/UDP port list</t>
        </list></t>

      <t>Note: The inside address range (I) will be an IPv4 range in NAT444
      operation (<xref target="I-D.shirasaki-nat444">NAT444</xref>) and an
      IPv6 range in DS-Lite operation (<xref
      target="RFC6333">DS-Lite</xref>).</t>

      <t>A subscriber is identified by an internal IPv4 address (e.g., NAT44)
      or an IPv6 prefix (e.g., DS-Lite or NAT64).</t>

      <t>The algorithm may be generalized to <xref
      target="I-D.miles-behave-l2nat">L2-aware NAT </xref> but this requires
      the configuration of the Internal interface identifiers (e.g., MAC
      addresses).</t>

      <t>The algorithm is not designed to retrieve an internal host among
      those sharing the same internal IP address (e.g., in a DS-Lite context,
      only an IPv6 address/prefix can be retrieved using the algorithm while
      the internal IPv4 address used for the encapsulated IPv4 datagram is
      lost).</t>

      <t>The assigned range of ports MAY also be used when translating ICMP
      requests (when re-writing the Identifier field).</t>

      <t>The CGN then reserves ports as follows:</t>

      <t><list style="numbers">
          <t>The CGN removes reserved ports from the port candidate list
          (e.g., 0-1023 for TCP and UDP). At a minimum, the CGN SHOULD remove
          system ports <xref target="RFC6335">(RFC6335)</xref> from the port
          candidate list reserved for deterministic assignment.</t>

          <t>The CGN calculates the total compression ratio (C+D), and
          allocates 1/(C+D) of the available ports to each internal IP
          address. Any remaining ports are allocated to the dynamic pool. Port
          allocation could be made sequentially (e.g. the first block goes to
          address 1, the second block to address 2, etc.), staggered (e.g.
          address 1 receives ports n*(C+D), address 2 receives ports
          1+n*(C+D), etc.), or through some other deterministic algorithm left
          to CGN implementation. Subscribers could be restricted to ports from
          a single IPv4 address, or could be allocated ports across all
          addresses in a pool, for example. The range of ports can be
          contiguous or be randomly assigned (e.g., using the random function
          defined in Section 2.2 of <xref target="RFC6431">RFC6431</xref>.</t>

          <t>When a subscriber initiates a connection, the CGN creates a
          translation mapping between the subscriber's inside local IP
          address/port and the CGN outside global IP address/port. The CGN
          MUST use one of the ports allocated in step 2 for the translation as
          long as such ports are available. The CGN MUST use the preallocated
          port range from step 2 for Port Control Protocol (PCP, <xref
          target="I-D.ietf-pcp-base"></xref>) reservations as long as such
          ports are available. While the CGN maintains its mapping table, it
          need not generate a log entry for translation mappings created in
          this step.</t>

          <t>The CGN will have a pool of ports left for dynamic assignment. If
          a subscriber uses more than the range of ports allocated in step 2
          (but fewer than the configured maximum ports M), the CGN uses a port
          from the dynamic assignment range for such a connection or for PCP
          reservations. The CGN MUST log dynamically assigned ports to
          facilitate subscriber-to-address mapping. The CGN SHOULD manage
          dynamic ports as described in <xref
          target="I-D.tsou-behave-natx4-log-reduction"> </xref>.</t>

          <t>Configuration of reserved ports (e.g., system ports) is left to
          operator configuration.</t>
        </list></t>

      <t>Thus, the CGN will maintain translation mapping information for all
      connections within its internal translation tables; however, it only
      needs to externally log translations for dynamically-assigned ports.</t>

      <section title="Stability and Load-Balancing Considerations">
        <t>Using the procedure defined in this document assumes a
        deterministic distribution of customers among deployed CGN devices.
        Balancing the traffic among several CGNs based on their actual load
        may not be supported because of the potential conflict of enforced
        algorithmic mapping rule. When CGN redundancy group is used, the same
        mapping rule, including in particular the external IP address, MUST be
        used. Furthermore, traffic oscillation MUST be avoided (because,
        unless state synchronization is used, the actual NAT state may not be
        instantiated in the redundancy group).</t>
      </section>

      <section title="IPv4 Port Utilization Efficiency">
        <t>For Service Providers requiring an aggressive address sharing
        ration, the use of the algorithmic mapping may impact the efficiency
        of the address sharing. Using a dynamic port range scheme, dynamic
        port assignment or a mix of static mapping and dynamic port assignment
        is more suitable for those SPs.</t>
      </section>

      <section title="Planning & Dimensioning">
        <t>Unlike dynamic approaches, the use of the algorithmic mapping
        requires more effort from operational teams to tweak the algorithm
        (e.g., size of the port range, address sharing ratio, etc.). Dedicated
        alarms SHOULD be configured when some port utilization thresholds are
        fired so that the configuration can be refined.</t>
      </section>

      <section title="Deterministic CGN Example">
        <t>To illustrate the use of deterministic NAT, let's consider a simple
        example. The operator configures an inside address range (I) of
        192.168.0.0/28 and outside address (O) of 203.0.113.1. The dynamic
        buffer factor is set to '2'. Thus, the total compression ratio is
        1:(14+2) = 1:16. Only the system ports (e.g. ports < 1024) are
        reserved. This configuration causes the CGN to preallocate
        ((65536-1024)/16 =) 4032 TCP and 4032 UDP ports per inside IPv4
        address. For the purposes of this example, let's assume that they are
        allocated sequentially, where 192.168.0.1 maps to 203.0.113.1 ports
        1024-5055, 192.168.0.2 maps to 203.0.113.1 ports 5056-9087, etc. The
        dynamic port range thus contains ports 57472-65535 (port allocation
        illustrated in the table below). Finally, the maximum ports/subscriber
        is set to 5040.</t>

        <texttable suppress-title="false">
          <ttcol>Inside Address / Pool</ttcol>

          <ttcol>Outside Address & Port</ttcol>

          <c>Reserved</c>

          <c>203.0.113.1:0-1023</c>

          <c>192.168.0.1</c>

          <c>203.0.113.1:1024-5055</c>

          <c>192.168.0.2</c>

          <c>203.0.113.1:5056-9087</c>

          <c>192.168.0.3</c>

          <c>203.0.113.1:9088-13119</c>

          <c>192.168.0.4</c>

          <c>203.0.113.1:13120-17151</c>

          <c>192.168.0.5</c>

          <c>203.0.113.1:17151-21183</c>

          <c>192.168.0.6</c>

          <c>203.0.113.1:21184-25215</c>

          <c>192.168.0.7</c>

          <c>203.0.113.1:25216-29247</c>

          <c>192.168.0.8</c>

          <c>203.0.113.1:29248-33279</c>

          <c>192.168.0.9</c>

          <c>203.0.113.1:33280-37311</c>

          <c>192.168.0.10</c>

          <c>203.0.113.1:37312-41343</c>

          <c>192.168.0.11</c>

          <c>203.0.113.1:41344-45375</c>

          <c>192.168.0.12</c>

          <c>203.0.113.1:45376-49407</c>

          <c>192.168.0.13</c>

          <c>203.0.113.1:49408-53439</c>

          <c>192.168.0.14</c>

          <c>203.0.113.1:53440-57471</c>

          <c>Dynamic</c>

          <c>203.0.113.1:57472-65535</c>
        </texttable>

        <t>When subscriber 1 using 192.168.0.1 initiates a low volume of
        connections (e.g. < 4032 concurrent connections), the CGN maps the
        outgoing source address/port to the preallocated range. These
        translation mappings are not logged.</t>

        <t>Subscriber 2 concurrently uses more than the allocated 4032 ports
        (e.g. for peer-to-peer, mapping, video streaming, or other
        connection-intensive traffic types), the CGN allocates up to an
        additional 1008 ports using bulk port reservations. In this example,
        subscriber 2 uses outside ports 5056-9087, and then 100-port blocks
        between 58000-58999. Connections using ports 5056-9087 are not logged,
        while 10 log entries are created for ports 58000-58099, 58100-58199,
        58200-58299, ..., 58900-58999.</t>

        <t>If a law enforcement agency reports abuse from 203.0.113.1, port
        2001, the operator can reverse the mapping algorithm to determine that
        the internal IP address subscriber 1 has been assigned generated the
        traffic without consulting CGN logs (by correlating the internal IP
        address with DHCP/PPP lease connection records). If a second abuse
        report comes in for 203.0.113.1, port 58204, the operator will
        determine that port 58204 is within the dynamic pool range, consult
        the log file, corrolate with connection records, and determine that
        subscriber 2 generated the traffic (assuming that the law enforcement
        timestamp matches the operator timestamp. As noted in <xref
        target="RFC6292">RFC6292</xref>, accurate time-keeping (e.g., use of
        NTP or Simple NTP) is vital).</t>

        <t>In this example, there are no log entries for the majority of
        subscribers, who only use pre-allocated ports. Only minimal logging
        would be needed for those few subscribers who exceed their
        pre-allocated ports and obtain extra bulk port assignments from the
        dynamic pool. Logging data for those users will include inside
        address, outside address, outside port range, and timestamp.</t>
      </section>
    </section>

    <section title="Additional Logging Considerations">
      <t>In order to be able to identify a subscriber based on observed
      external IPv4 address, port, and timestamp, an operator needs to know
      how the CGN was configured with regards to internal and external IP
      addresses, dynamic address pool factor, maximum ports per user, and
      reserved port range at any given time. Therefore, the CGN MUST generate
      a log message any time such variables are changed. Also, the CGN SHOULD
      generate such a log message once per day to facilitate quick
      identification of the relevant configuration in the event of an abuse
      notification.</t>

      <t>Such a log message MUST, at minimum, include the timestamp, inside
      prefix I, inside mask, outside prefix O, outside mask, D, M, and
      reserved port range; for example:</t>

      <t>[Wed Oct 11 14:32:52
      2000]:192.168.0.0:28:203.0.113.0:32:2:5040:1-1023,5004,5060.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document makes no request of IANA.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>The security considerations applicable to NAT operation for various
      protocols as documented in, for example, <xref target="RFC4787">RFC
      4787</xref> and <xref target="RFC5382">RFC 5382</xref> also apply to
      this document.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>TBD</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include="reference.RFC.4787"?>

      <?rfc include="reference.RFC.6269"?>

      <?rfc include="reference.RFC.5382"?>

      <?rfc include="reference.RFC.6264"?>
    </references>

    <references title="Informative References">
      <?rfc include="reference.RFC.6333"?>

      <?rfc include='reference.I-D.draft-ietf-pcp-base-21'?>

      <?rfc include='reference.RFC.6292'?>

      <?rfc include='reference.RFC.6056'?>

      <?rfc include='reference.RFC.6335'?>

      <?rfc include='reference.RFC.6431'?>

      <?rfc include='reference.I-D.draft-miles-behave-l2nat-00'?>

      <?rfc include="reference.I-D.draft-shirasaki-nat444-05"?>

      <?rfc include="reference.I-D.draft-tsou-behave-natx4-log-reduction-02"?>
    </references>
  </back>
</rfc>

PAFTECH AB 2003-20262026-04-24 02:54:35