<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY RFC1034 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1034.xml">
<!ENTITY RFC1918 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1918.xml">
<!ENTITY RFC3582 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3582.xml">
<!ENTITY RFC3596 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3596.xml">
<!ENTITY RFC4193 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4193.xml">
<!ENTITY RFC4271 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4271.xml">
<!ENTITY RFC5881 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5881.xml">
<!ENTITY RFC6296 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6296.xml">
<!ENTITY I-D.ietf-lisp SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-lisp.xml">
<!ENTITY I-D.rja-ilnp-intro SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.rja-ilnp-intro.xml">
<!ENTITY I-D.ietf-idr-best-external SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-idr-best-external.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- For a complete list and description of processing instructions (PIs), 
     please see http://xml.resource.org/authoring/README.html. -->
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<rfc category="exp" docName="draft-bonica-v6-multihome-01" ipr="trust200902"
     updates="6296">
  <!-- ***** FRONT MATTER ***** -->

  <front>
    <title abbrev="Multihoming with NPTv6">Multihoming with IPv6-to-IPv6
    Network Prefix Translation (NPTv6)</title>

    <author fullname="Ron Bonica" initials="R." role="editor" surname="Bonica">
      <organization>Juniper Networks</organization>

      <address>
        <postal>
          <street></street>

          <city>Sterling</city>

          <code>20164</code>

          <region>Virginia</region>

          <country>USA</country>
        </postal>

        <email>rbonica@juniper.net</email>
      </address>
    </author>

    <author fullname="Fred Baker" initials="F.J." surname="Baker">
      <organization>Cisco Systems</organization>

      <address>
        <postal>
          <street></street>

          <city>Santa Barbara</city>

          <code>93117</code>

          <region>California</region>

          <country>USA</country>
        </postal>

        <email>fred@cisco.com</email>
      </address>
    </author>

    <author fullname="Margaret Wasserman" initials="M." surname="Wasserman">
      <organization>Painless Security</organization>

      <address>
        <postal>
          <street>356 Abbott Street</street>

          <city>North Andover</city>

          <region>Massachusetts</region>

          <code>01845</code>

          <country>USA</country>
        </postal>

        <phone>+1 781 405 7464</phone>

        <email>mrw@painless-security.com</email>

        <uri>http://www.painless-security.com</uri>
      </address>
    </author>

    <author fullname="Gregory J. Miller" initials="G.J." surname="Miller">
      <organization>Verizon</organization>

      <address>
        <postal>
          <street></street>

          <city>Ashburn</city>

          <region>Virginia</region>

          <code>20147</code>

          <country>USA</country>
        </postal>

        <email>gregory.j.miller@verizon.com</email>
      </address>
    </author>

    <date year="2011" />

    <area>INT</area>

    <workgroup>INTAREA</workgroup>

    <keyword>multihoming</keyword>

    <keyword>IPv6</keyword>

    <abstract>
      <t>This memo describes an architecture for sites that are homed to
      multiple upstream providers. The architecture described herein uses
      IPv6-to-IPv6 Network Prefix Translation (NPTv6) to achieve redundancy,
      transport-layer survivability, load sharing and address
      independence.</t>

      <t>This memo updates Section 2.4 of RFC 6296.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="Introduction" title="Introduction">
      <t><xref target="RFC3582"></xref> establishes the following goals for
      IPv6 site multihoming:</t>

      <t><list hangIndent="6" style="hanging">
          <t>Redundancy - A site's ability to remain connected to the
          Internet, even when connectivity through one or more of its upstream
          providers fails.</t>

          <t>Transport-Layer Survivability - A site's ability to maintain
          transport-layer sessions across failover and restoration events.
          During a failover/restoration event, the transport-layer session may
          detect packet loss or reordering, but neither of these causes the
          transport-layer session to fail.</t>

          <t>Load Sharing - The ability of a site to distribute both inbound
          and outbound traffic across its upstream providers.</t>
        </list></t>

      <t><xref target="RFC3582"></xref> notes that a multihoming solution may
      require interactions with the routing subsystem. However, multihoming
      solutions must be simple and scalable. They must not require excessive
      operational effort and must not cause excessive routing table
      expansion.</t>

      <t><xref target="RFC6296"></xref> explains how a site can achieve
      address independence using IPv6-to-IPv6 Network Prefix Translation
      (NPTv6). In order to achieve address independence, the site assigns an
      inside address to each of its resources (e.g., hosts). Nodes outside of
      the site identify those same resources using a corresponding Provider
      Allocated (PA) address.</t>

      <t>The site resolves this addressing dichotomy by deploying an NPTv6
      translator between itself and its upstream provider. The NPTv6
      translator maintains a static, one-to-one mapping between each inside
      address and its corresponding PA address. That mapping persists across
      flows and over time.</t>

      <t>If the site disconnects from one upstream provider and connects to
      another, it may lose its PA assignment. However, the site will not need
      to renumber its resources. It will only need to reconfigure the mapping
      rules on its local NPTv6 translator.</t>

      <t>Section 2.4 of <xref target="RFC6296"></xref> describes an NPTv6
      architecture for sites that are homed to multiple upstream providers.
      While that architecture fulfills many of the goals identified by <xref
      target="RFC3582"></xref>, it does not achieve transport-layer
      survivability, because in that architecture a PA address is usable
      only when the multi-homed site is directly connected to the allocating
      provider.</t>

      <t>This memo describes an alternative architecture for multihomed sites
      that require transport-layer survivability. It updates Section 2.4 of
      <xref target="RFC6296"></xref>. In this architecture, PA addresses
      remain usable, even when the multihomed site loses its direct connection
      to the allocating provider.</t>

      <t>The architecture described in this document can be deployed in sites
      that are served by two or more upstream providers. For the purpose of
      example, this document demonstrates how the architecture can be deployed
      in a site that is served by two upstream providers.</t>

      <section title="Terminology">
        <t>The following terms are used in this document:</t>

        <t><list style="hanging">
            <t>inbound packet - A packet that is destined for the multi-homed
            site</t>

            <t>outbound packet - A packet that originates at the multi-homed
            site and is destined for a point outside of the multi-homed
            site</t>

            <t>NPTv6 inside interface - An interface that connects an NPTv6
            translator to the site</t>

            <t>NPTv6 outside interface - An interface that connects an NPTv6
            translator to an upstream provider</t>
          </list></t>
      </section>
    </section>

    <section anchor="NPTv6_deployment" title="NPTv6 Deployment">
      <t>This section demonstrates how NPTv6 can be deployed in order to
      achieve the goals of <xref target="RFC3582"></xref>.</t>

      <section anchor="Topology" title="Topology">
        <figure anchor="npt-topo" title="NPTv6 Multihomed Topology">
          <artwork align="center"><![CDATA[

               Upstream                Upstream
              Provider #1             Provider #2
                /     \                 /       \
               /       \               /         \
              /     +------+       +------+       \        
         +------+   |Backup|       |Backup|   +------+   
         |  PE  |   |  PE  |       |  PE  |   |  PE  |   
         |  #1  |   |  #1  |       |  #2  |   |  #2  |   
         +------+   +------+       +------+   +------+        
             |                                    |      
             |                                    |              
         +------+                             +------+  
         |NPTv6 |                             |NPTv6 |
         |  #1  |                             |  #2  |
         +------+                             +------+  
             |                                    |
             |                                    |
      ------------------------------------------------------
                       Internal Network

          ]]></artwork>
        </figure>

        <t>In <xref target="npt-topo"></xref>, a site attaches all of its
        assets, including two NPTv6 translators, to an Internal Network. NPTv6
        #1 is connected to Provider Edge (PE) Router #1, which is maintained
        by Upstream Provider #1. Likewise, NPTv6 #2 is connected to PE Router
        #2, which is maintained by Upstream Provider #2.</t>

        <t>Each upstream provider also maintains a Backup PE Router. A
        forwarding tunnel connects the loopback interface of Backup PE Router
        #1 to the outside interface of NPTv6 #2. Likewise, another forwarding
        tunnel connects Backup PE Router #2 to NPTv6 #1. Network operators can
        select from many encapsulation techniques (e.g., GRE) to realize the
        forwarding tunnel. Tunnels are not depicted in <xref
        target="npt-topo"></xref>.</t>

        <t>In the figure, NPTv6 #1 and NPTv6 #2 are depicted as separate
        boxes. While vendors may produce a separate box to support the NPTv6
        function, they may also integrate the NPTv6 function into a
        router.</t>

        <t>During periods of normal operation, the Backup PE routers are very
        lightly loaded. Therefore, a single Backup PE router may back up
        multiple PE routers. Furthermore, a Backup PE router may be used for
        other purposes (e.g., as the primary PE router for another
        customer).</t>
      </section>

      <section anchor="Addressing" title="Addressing">
        <section anchor="providerAddressing"
                 title="Upstream Provider Addressing">
          <t>A Regional Internet Registry (RIR) allocates Provider Address
          Block (PAB) #1 to Upstream Provider #1. From PAB #1, Upstream
          Provider #1 allocates two sub-blocks, using them as follows.</t>

          <t>Upstream Provider #1 uses the first sub-block for its internal
          address assignments. It also uses that sub-block for numbering both
          ends of the interfaces between itself and its customers.</t>

          <t>Upstream Provider #1 uses the second sub-block for address
          allocation to its customers. We refer to a particular allocation
          from this sub-block as a Customer Network Block (CNB). A CNB
          allocated for a particular customer must be large enough to provide
          addressing for the customer's entire Internal Network. In our
          example, Upstream Provider #1 allocates a /60, called CNB #1, to its
          customer.</t>
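          <t>As an illustrative sketch only (2001:db8::/32 is documentation
          space, and the /33 sub-block split is hypothetical), the
          allocations above can be modeled with Python's ipaddress
          module:</t>

          <figure>
            <artwork><![CDATA[
```python
import ipaddress

# Hypothetical PAB #1: an RIR-allocated /32 (documentation space here).
pab1 = ipaddress.ip_network("2001:db8::/32")

# Illustrative split into two sub-blocks: one for the provider's own
# infrastructure and interface numbering, one for customer allocations.
infra, customers = pab1.subnets(new_prefix=33)

# Carve a /60 CNB for this customer out of the customer sub-block.
cnb1 = next(customers.subnets(new_prefix=60))
print(cnb1)  # 2001:db8:8000::/60
```
]]></artwork>
          </figure>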

          <t>The customer configures translation rules that reference CNB #1
          on NPTv6 #1 and NPTv6 #2. This makes selected hosts that are
          connected to the Internal Network accessible using CNB #1 addresses.
          See <xref target="Address_Translation"></xref> for details.</t>

          <t>In a similar fashion, a Regional Internet Registry (RIR)
          allocates PAB #2 to Upstream Provider #2. Upstream Provider #2, in
          turn, allocates CNB #2 to the multihomed customer.</t>
        </section>

        <section anchor="siteAddressing" title="Site Addressing">
          <t>The site obtains a Site Address Block (SAB), either from <xref
          target="RFC4193">Unique Local Address (ULA)</xref> space, or by some
          other means. The SAB is as large as all of the site's CNBs
          combined. In this example, because CNB #1 and CNB #2 are both /60s,
          the SAB is a /59.</t>

          <t>The site divides its SAB into smaller blocks, with each block
          being exactly as large as one CNB. It also associates each of the
          resulting sub-blocks with one of its CNBs. In this example, the site
          divides the SAB into a lower half and an upper half. It associates
          the lower half of the SAB with CNB #1 and the upper half of the SAB
          with CNB #2.</t>
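          <t>As a sketch under hypothetical prefixes (a ULA /59 for the SAB
          and documentation-space /60s for the CNBs), the division and
          association described above can be expressed with Python's
          ipaddress module:</t>

          <figure>
            <artwork><![CDATA[
```python
import ipaddress

# Hypothetical SAB: a ULA /59 (illustrative value from fd00::/8).
sab = ipaddress.ip_network("fd00:1:2:20::/59")

# Hypothetical CNBs allocated by the two upstream providers.
cnb1 = ipaddress.ip_network("2001:db8:a:10::/60")  # from Upstream Provider #1
cnb2 = ipaddress.ip_network("2001:db8:b:20::/60")  # from Upstream Provider #2

# Divide the /59 SAB into two /60 halves and associate each with a CNB.
lower_half, upper_half = sab.subnets(new_prefix=60)
mapping = {lower_half: cnb1, upper_half: cnb2}

for inside, outside in mapping.items():
    print(inside, "->", outside)
```
]]></artwork>
          </figure>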

          <t>Finally, the site assigns one SAB address to each interface that
          is connected to the Internal Network, including the inside
          interfaces of the two NPTv6 translators. The site also assigns a SAB
          address to the loopback interface of each NPTv6 translator. During
          periods of normal operation, interfaces that are assigned addresses
          from the lower half of the SAB receive traffic through Upstream
          Provider #1. Likewise, interfaces that are assigned addresses from
          the upper half of the SAB receive traffic through Upstream Provider
          #2.</t>

          <t>Selected interfaces, because they receive a great deal of
          traffic, must receive that traffic through both upstream providers
          simultaneously. Furthermore, the site must control the portion of
          that traffic arriving through each upstream provider. To do so,
          the site assigns multiple addresses to those interfaces, some from
          the lower half and others from the upper half of the SAB. For any
          interface, the ratio of upper-half to lower-half assignments
          roughly controls the portion of traffic arriving through each
          upstream provider. See
          <xref target="Address_Translation"></xref>, <xref
          target="Routing"></xref> and <xref target="Load_Balancing"></xref>
          for details.</t>
        </section>
      </section>

      <section anchor="Address_Translation" title="Address Translation">
        <t>Both NPTv6 translators are configured with the following rules:</t>

        <t><list hangIndent="6" style="hanging">
            <t>For outbound packets, if the first 60 bits of the source
            address identify the lower half of the SAB, overwrite those 60
            bits with the 60 bits that identify CNB #1</t>

            <t>For outbound packets, if the first 60 bits of the source
            address identify the upper half of the SAB, overwrite those 60
            bits with the 60 bits that identify CNB #2</t>

            <t>For outbound packets, if none of the conditions above are met,
            either drop or pass the packet without translation, according to
            local security policy</t>

            <t>For inbound packets, if the first 60 bits of the destination
            address identify CNB #1, overwrite those 60 bits with the 60 bits
            that identify the lower half of the SAB</t>

            <t>For inbound packets, if the first 60 bits of the destination
            address identify CNB #2, overwrite those 60 bits with the 60 bits
            that identify the upper half of the SAB</t>

            <t>For inbound packets, if none of the conditions above are met,
            either drop or pass the packet without translation, according to
            local security policy</t>
          </list></t>
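        <t>The rules above amount to a stateless overwrite of the first 60
        bits of an address. The following Python sketch illustrates them
        using hypothetical prefixes. Note that it is deliberately
        simplified: a conforming NPTv6 translator (RFC 6296) also adjusts a
        16-bit word within the address so that the mapping is
        checksum-neutral, a step omitted here.</t>

        <figure>
          <artwork><![CDATA[
```python
import ipaddress

# Hypothetical prefixes (ULA and documentation space); a real site uses
# its own SAB halves and the CNBs its providers allocated.
SAB_LOWER = ipaddress.ip_network("fd00:1:2:20::/60")
SAB_UPPER = ipaddress.ip_network("fd00:1:2:30::/60")
CNB1 = ipaddress.ip_network("2001:db8:a:10::/60")
CNB2 = ipaddress.ip_network("2001:db8:b:20::/60")

HOST_MASK = (1 << 68) - 1  # the 68 bits that follow a /60 prefix

def _overwrite_prefix(addr, new_prefix):
    # Replace the first 60 bits of addr with those of new_prefix.
    # NOTE: real NPTv6 (RFC 6296) also adjusts a 16-bit word inside the
    # address to keep transport checksums valid; omitted in this sketch.
    host_bits = int(addr) & HOST_MASK
    return ipaddress.ip_address(int(new_prefix.network_address) | host_bits)

def translate_outbound(src):
    addr = ipaddress.ip_address(src)
    if addr in SAB_LOWER:
        return _overwrite_prefix(addr, CNB1)
    if addr in SAB_UPPER:
        return _overwrite_prefix(addr, CNB2)
    return None  # drop or pass untranslated, per local security policy

def translate_inbound(dst):
    addr = ipaddress.ip_address(dst)
    if addr in CNB1:
        return _overwrite_prefix(addr, SAB_LOWER)
    if addr in CNB2:
        return _overwrite_prefix(addr, SAB_UPPER)
    return None

print(translate_outbound("fd00:1:2:21::5"))   # 2001:db8:a:11::5
print(translate_inbound("2001:db8:b:2f::1"))  # fd00:1:2:3f::1
```
]]></artwork>
        </figure>

        <t>Because the functions keep no per-flow state, two translators
        configured with identical rules produce identical mappings.</t>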

        <t>Due to the nature of the rules described above, NPTv6 translation
        is stateless. Therefore, traffic flows do not need to be symmetric
        across NPTv6 translators. Furthermore, a traffic flow can shift from
        one NPTv6 translator to another without causing transport-layer
        session failure.</t>
      </section>

      <section anchor="DNS" title="Domain Name System (DNS)">
        <t>In order to make all site resources reachable by <xref
        target="RFC1034">domain name</xref>, the site publishes <xref
        target="RFC3596">AAAA records</xref> associating each resource with
        all of its CNB addresses. While this DNS architecture is sufficient,
        it is suboptimal. Traffic that both originates and terminates within
        the site traverses NPTv6 translators needlessly. Several optimizations
        are available. These optimizations are well understood and have been
        applied to <xref target="RFC1918"></xref> networks for many years.</t>
      </section>

      <section anchor="Routing" title="Routing">
        <t>Upstream Provider #1 uses an Interior Gateway Protocol to flood
        topology information throughout its domain. It also uses <xref
        target="RFC4271">BGP</xref> to distribute customer and peer
        reachability information.</t>

        <t>PE #1 acquires a route to CNB #1 with NEXT-HOP equal to the outside
        interface of NPTv6 #1. PE #1 can either learn this route from a
        single-hop eBGP session with NPTv6 #1, or acquire it through static
        configuration. In either case, PE #1 overwrites the NEXT-HOP of this
        route with its own loopback address and distributes the route
        throughout Upstream Provider #1 using iBGP. The LOCAL_PREF for this
        route is set high, so that the route will be preferred to alternative
        routes to CNB #1. Upstream Provider #1 does not distribute this route
        to CNB #1 outside of its own borders because it is part of the larger
        aggregate PAB #1, which is itself advertised.</t>

        <t>NPTv6 #1 acquires a default route with NEXT-HOP equal to the
        directly connected interface on PE #1. NPTv6 #1 can either learn this
        route from a single-hop eBGP session with PE #1, or acquire it through
        static configuration.</t>

        <t>Similarly, Backup PE #1 acquires a route to CNB #1 with NEXT-HOP
        equal to the outside interface of NPTv6 #2. Backup PE #1 can either
        learn this route from a multi-hop eBGP session with NPTv6 #2, or
        acquire it through static configuration. In either case, Backup PE #1
        overwrites the NEXT-HOP of this route with its own loopback address
        and distributes the route throughout Upstream Provider #1 using iBGP.
        Distribution procedures are defined in <xref
        target="I-D.ietf-idr-best-external"></xref>. The LOCAL_PREF for this
        route is set low, so that the route will not be preferred to
        alternative routes to CNB #1. Upstream Provider #1 does not distribute
        this route to CNB #1 outside of its own borders.</t>
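        <t>The effect of the high and low LOCAL_PREF settings can be
        sketched as follows. The values are illustrative, and only the
        LOCAL_PREF step of BGP best-path selection is modeled:</t>

        <figure>
          <artwork><![CDATA[
```python
# Candidate iBGP routes to CNB #1 inside Upstream Provider #1.
# LOCAL_PREF values are illustrative.
routes_to_cnb1 = [
    {"advertised_by": "PE #1", "local_pref": 200},
    {"advertised_by": "Backup PE #1", "local_pref": 50},
]

def best_path(routes):
    # BGP prefers the route with the highest LOCAL_PREF; all other
    # tie-breaking steps are omitted from this sketch.
    return max(routes, key=lambda r: r["local_pref"])

# Normal operation: PE #1's route is best, so the backup stays dormant.
print(best_path(routes_to_cnb1)["advertised_by"])  # PE #1

# PE #1 withdraws its route; the backup route is now best.
remaining = [r for r in routes_to_cnb1 if r["advertised_by"] != "PE #1"]
print(best_path(remaining)["advertised_by"])  # Backup PE #1
```
]]></artwork>
        </figure>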

        <t>Even if Backup PE #1 maintains an eBGP session with NPTv6 #2, it
        does not advertise the default route through that eBGP session.
        Therefore, even during failures, Backup PE #1 does not attract
        outbound traffic to itself.</t>

        <t>Finally, the Autonomous System Border Routers (ASBR) contained by
        Upstream Provider #1 maintain eBGP sessions with their peers. The
        ASBRs advertise only PAB #1 through those eBGP sessions. Upstream
        Provider #1 does not advertise any of the following to its eBGP
        peers:</t>

        <t><list hangIndent="6" style="hanging">
            <t>any prefix that is contained by PAB #1 (i.e., more
            specific)</t>

            <t>PAB #2 or any part thereof</t>

            <t>the SAB or any part thereof</t>
          </list></t>

        <t>Upstream Provider #2 is configured in a manner analogous to that
        described above.</t>
      </section>

      <section anchor="Failure_Detection_Recovery"
               title="Failure Detection and Recovery">
        <t>When PE #1 loses its route to CNB #1, it withdraws its iBGP
        advertisement for that prefix from Upstream Provider #1. The route
        advertised by Backup PE #1 remains and Backup PE #1 attracts traffic
        bound for CNB #1 to itself. Backup PE #1 forwards that traffic through
        the tunnel to NPTv6 #2. NPTv6 #2 performs translations and delivers
        the traffic to the Internal Network.</t>

        <t>Likewise, when NPTv6 #1 loses its default route, it makes itself
        unavailable as a gateway for other hosts on the Internal Network.
        NPTv6 #2 attracts all outbound traffic to itself and forwards that
        traffic through Upstream Provider #2. The mechanism by which NPTv6 #1
        makes itself unavailable as a gateway is beyond the scope of this
        document.</t>

        <t>If PE #1 maintains a single-hop eBGP session with NPTv6 #1, the
        failure of that eBGP session will cause both routes mentioned above to
        be lost. Otherwise, another failure detection mechanism such as <xref
        target="RFC5881">BFD</xref> is required.</t>

        <t>Regardless of the failure detection mechanism, inbound traffic
        traverses the tunnel only during failure periods and outbound traffic
        never traverses the tunnel. Furthermore, restoration is localized. As
        soon as the advertisement for CNB #1 is withdrawn throughout Upstream
        Provider #1, restoration is complete.</t>

        <t>Transport-layer connections survive failure/recovery events because
        both NPTv6 translators implement identical translation rules. When a
        traffic flow shifts from one translator to another, neither the source
        address nor the destination address changes.</t>
      </section>

      <section anchor="Load_Balancing" title="Load Balancing">
        <t>In the architecture described above, site addressing determines
        load balancing. If a host is numbered from the lower half of the SAB,
        its address is mapped to CNB #1, which is announced only by Upstream
        Provider #1 (as part of PAB #1). Therefore, during periods of normal
        operation, all traffic bound for that host traverses Upstream Provider
        #1 and NPTv6 #1. Likewise, if a host is numbered from the upper half
        of the SAB, its address is mapped to CNB #2, which is announced only
        by Upstream Provider #2 (as part of PAB #2). Therefore, during periods
        of normal operation, all traffic bound for that host traverses
        Upstream Provider #2 and NPTv6 #2.</t>

        <t>Hosts that receive a great quantity of traffic can be assigned
        multiple addresses, with some from the lower half and others from the
        upper half of the SAB. The address chosen for any particular flow
        determines the path of inbound traffic for that flow. For flows
        initiated outside of the Internal Network, the site influences the
        probability that a particular address will be used by manipulating the
        type and number of CNB addresses advertised in DNS.</t>
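        <t>A rough sketch of this DNS-based weighting, assuming that remote
        peers pick uniformly among the published AAAA records (the addresses
        are hypothetical):</t>

        <figure>
          <artwork><![CDATA[
```python
import random

# Hypothetical AAAA set for a heavily loaded host: two CNB #1 addresses
# and one CNB #2 address, aiming for a rough 2:1 inbound split.
aaaa_records = [
    "2001:db8:a:11::80",  # CNB #1 (via Upstream Provider #1)
    "2001:db8:a:12::80",  # CNB #1 (via Upstream Provider #1)
    "2001:db8:b:21::80",  # CNB #2 (via Upstream Provider #2)
]

# If each remote peer picks one published address uniformly at random,
# about two thirds of inbound flows arrive through Provider #1.
picks = [random.choice(aaaa_records) for _ in range(30000)]
share_p1 = sum(a.startswith("2001:db8:a") for a in picks) / len(picks)
print(f"approximate share via Provider #1: {share_p1:.2f}")
```
]]></artwork>
        </figure>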
      </section>
    </section>

    <section anchor="Discussion" title="Discussion">
      <t>This section discusses the merits of the proposed architecture, as
      compared with other multihoming approaches <xref
      target="I-D.ietf-lisp"></xref> <xref
      target="I-D.rja-ilnp-intro"></xref>. The following are benefits of the
      proposed architecture:</t>

      <t><list hangIndent="6" style="hanging">
          <t>Address mapping information is required only at the NPTv6
          translator. There is no need to distribute mapping information
          beyond the boundaries of the multihomed site.</t>

          <t>Because only a small number of mapping rules are required at each
          multihomed site, there is no need to cache these rules.</t>

          <t>During periods of normal operation, packets do not need to be
          encapsulated. Inbound traffic traverses a tunnel only during failure
          periods and outbound traffic never traverses a tunnel.</t>

          <t>The proposal can be realized using a wide variety of existing
          encapsulation methods. It does not require a new encapsulation
          method.</t>

          <t>The failover/restoration mechanism is localized to a single
          autonomous system. Once updated routing information has been
          distributed throughout the autonomous system, the
          failover/restoration event is complete.</t>

          <t>Benefit can be derived from incremental, partial and even minimal
          deployment.</t>

          <t>The cost of the solution is borne by its beneficiaries (i.e.,
          primarily the multihomed site and secondarily the multihomed
          site's upstream providers).</t>
        </list></t>

      <t>The following are disadvantages of the proposed architecture:</t>

      <t><list hangIndent="6" style="hanging">
          <t>By modifying IPv6 addresses, this architecture violates the
          end-to-end principle.</t>

          <t>The load balancing capabilities described in this memo may not
          suffice for all sites. Those sites might be required to fall back
          upon other load balancing solutions (e.g., advertising multiple
          prefixes).</t>

          <t>The time required to redistribute traffic from one path to
          another is determined by the DNS TTL.</t>
        </list></t>
    </section>

    <section anchor="iana_considerations" title="IANA Considerations">
      <t>This document requires no IANA actions.</t>
    </section>

    <section anchor="security_considerations" title="Security Considerations">
      <t>As with any architecture that modifies source and destination
      addresses, the operation of access control lists, firewalls and
      intrusion detection systems may be impacted. Also, many users may
      confuse NPTv6 translation with NAT. Two limitations of NAT are that a)
      it does not support incoming connections without special configuration
      and b) it requires symmetric routing across the NAT device. Many users
      understand these limitations to be security features. Because NPTv6
      has neither of these limitations, it also offers neither of these
      features.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>Thanks to John Scudder, Yakov Rekhter and Warren Kumari for their
      helpful comments, encouragement and support. Special thanks to Johann
      Jonsson, James Piper, Ravinder Wali, Ashte Collins, Inga Rollins and an
      anonymous donor, without whom this memo would not have been written.</t>
    </section>

  </middle>

  <!--  *****BACK MATTER ***** -->

  <back>
    <references title="Normative References">
      &RFC1034;

      &RFC1918;

      &RFC3582;

      &RFC3596;

      &RFC4193;

      &RFC4271;

      &RFC5881;

      &RFC6296;
    </references>

    <references title="Informative References">
      &I-D.ietf-lisp;

      &I-D.rja-ilnp-intro;

      &I-D.ietf-idr-best-external;
    </references>
  </back>
</rfc>
