<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
    which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
    There has to be one entity for each item to be referenced. 
    An alternate method (rfc include) is described in the references. -->
<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY RFC3552 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">
<!ENTITY I-D.narten-iana-considerations-rfc2434bis SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.narten-iana-considerations-rfc2434bis.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
    please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
    (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
    (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="info" docName="draft-ietf-bess-virtual-subnet-07"
     ipr="trust200902">
  <front>
    <title abbrev="Virtual Subnet">Virtual Subnet: A BGP/MPLS IP VPN-based
    Subnet Extension Solution</title>

    <author fullname="Xiaohu Xu" initials="X.X." surname="Xu">
      <organization>Huawei Technologies</organization>

      <address>
        <postal>
          <street>No.156 Beiqing Rd</street>

          <city>Beijing</city>

          <region/>

          <code>100095</code>

          <country>CHINA</country>
        </postal>

        <phone/>

        <facsimile/>

        <email>xuxiaohu@huawei.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Christian Jacquenet" initials="C.J." surname="Jacquenet">
      <organization>Orange</organization>

      <address>
        <postal>
          <street>4 rue du Clos Courtel</street>

          <city>Cesson-S&#233;vign&#233;</city>

          <region/>

          <code>35512</code>

          <country>FRANCE</country>
        </postal>

        <phone/>

        <facsimile/>

        <email>christian.jacquenet@orange.com</email>

        <uri/>
      </address>
    </author>

    <author fullname="Robert Raszuk" initials="R.R." surname="Raszuk">
      <organization>Bloomberg LP</organization>

      <address>
        <postal>
          <street>731 Lexington Ave</street>

          <city>New York City</city>

          <region>NY</region>

          <code>10022</code>

          <country>USA</country>
        </postal>

        <phone/>

        <facsimile/>

        <email>robert@raszuk.net</email>

        <uri/>
      </address>
    </author>

    <author fullname="Truman Boyes" initials="T.B." surname="Boyes">
      <organization>Bloomberg LP</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country/>
        </postal>

        <phone/>

        <facsimile/>

        <email>tboyes@bloomberg.net</email>

        <uri/>
      </address>
    </author>

    <author fullname="Brendan Fee" initials="B.F." surname="Fee">
      <organization>Extreme Networks</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <region/>

          <code/>

          <country/>
        </postal>

        <phone/>

        <facsimile/>

        <email>bfee@extremenetworks.com</email>

        <uri/>
      </address>
    </author>


    <date day="" month="" year="2015"/>

    <abstract>
      <t>This document describes a BGP/MPLS IP VPN-based subnet extension
      solution referred to as Virtual Subnet, which can be used for building
      Layer 3 network virtualization overlays within and/or between data
      centers.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>For business continuity purposes, Virtual Machine (VM) migration
      across data centers is commonly used in situations such as data center
      maintenance, migration, consolidation, expansion, or disaster avoidance.
      The IETF community has recognized that IP renumbering of servers (i.e.,
      VMs) after the migration is usually complex and costly. To allow the
      migration of a VM from one data center to another without IP
      renumbering, the subnet on which the VM resides needs to be extended
      across these data centers.</t>

      <t>To achieve subnet extension across multiple cloud data centers in a
      scalable way, the following requirements and challenges must be
      considered:</t>

      <t><list style="letters">
          <t>VPN Instance Space Scalability: In a modern cloud data center
          environment, thousands or even tens of thousands of tenants could be
          hosted over a shared network infrastructure. For security and
          performance isolation purposes, these tenants need to be isolated
          from one another.</t>

          <t>Forwarding Table Scalability: With the development of server
          virtualization technologies, it's not uncommon for a single cloud
          data center to contain millions of VMs. This number alone poses
          a significant challenge to the forwarding table scalability of data
          center switches. If multiple data centers of such scale were
          interconnected at Layer 2, this challenge would become even
          greater.</t>

          <t>ARP/ND Cache Table Scalability: <xref target="RFC6820"/> notes
          that the Address Resolution Protocol (ARP)/Neighbor Discovery (ND)
          cache tables maintained by default gateways within cloud data
          centers can raise scalability issues. Therefore, controlling the
          size of the ARP/ND cache tables is critical as the number of data
          centers to be connected increases.</t>

          <t>ARP/ND and Unknown Unicast Flooding: It's well-known that the
          flooding of ARP/ND broadcast/multicast messages as well as unknown
          unicast traffic within large Layer 2 networks is likely to affect
          network and host performance. When multiple data centers that each
          host millions of VMs are interconnected at Layer 2, the impact of
          such flooding becomes even worse. As such, it becomes
          increasingly important to avoid the flooding of ARP/ND
          broadcast/multicast as well as unknown unicast traffic across data
          centers.</t>

          <t>Path Optimization: A subnet usually indicates a location in the
          network. However, when a subnet has been extended across multiple
          geographically-dispersed data center locations, the location
          semantics of such a subnet are no longer retained. As a result,
          traffic exchanged between a specific user and a server that are
          located in different data centers may first be forwarded through a
          third data center. This suboptimal routing would result in
          an unnecessary consumption of the bandwidth resources between data
          centers. Furthermore, in the case where traditional VPLS technology
          <xref target="RFC4761"/> <xref target="RFC4762"/> is used for data
          center interconnect, return traffic from a server may be forwarded
          to a default gateway located in a different data center due to the
          configuration of a virtual router redundancy group. This suboptimal
          routing would also unnecessarily consume the bandwidth resources
          between data centers.</t>
        </list>This document describes a BGP/MPLS IP VPN-based subnet
      extension solution referred to as Virtual Subnet, which can be used for
      data center interconnection while addressing all of the aforementioned
      requirements and challenges. Here the BGP/MPLS IP VPN means both
      BGP/MPLS IPv4 VPN <xref target="RFC4364"/> and BGP/MPLS IPv6 VPN <xref
      target="RFC4659"/>. In addition, since Virtual Subnet is mainly built on
      proven technologies such as BGP/MPLS IP VPN and ARP/ND proxy <xref
      target="RFC0925"/><xref target="RFC1027"/><xref target="RFC4389"/>,
      those service providers that provide Infrastructure as a Service (IaaS)
      cloud services can rely upon their existing BGP/MPLS IP VPN
      infrastructure and take advantage of their BGP/MPLS VPN operational
      experience to interconnect data centers.</t>

      <t>Although Virtual Subnet is described in this document as an approach
      for data center interconnection, it can be used within data centers as
      well.</t>

      <t>Note that the approach described in this document is not intended to
      achieve an exact emulation of Layer 2 connectivity and therefore it can
      only support a restricted Layer 2 connectivity service model with
      limitations that are discussed in Section 4. Discussion of where this
      service model can apply is outside the scope of this
      document.</t>
    </section>

    <section anchor="Teminology" title="Terminology">
      <t>This memo makes use of the terms defined in <xref
      target="RFC4364"/>.</t>
    </section>

    <section anchor="Advertising" title="Solution Description">
      <section title="Unicast">
        <section title="Intra-subnet Unicast">
          <t><figure>
              <artwork align="center"><![CDATA[                           +--------------------+
    +------------------+   |                    |   +------------------+
    |VPN_A:192.0.2.1/24|   |                    |   |VPN_A:192.0.2.1/24|
    |              \   |   |                    |   |  /               |
    |    +------+   \ ++---+-+                +-+---++/   +------+     |
    |    |Host A+-----+ PE-1 |                | PE-2 +----+Host B|     |
    |    +------+\    ++-+-+-+                +-+-+-++   /+------+     |
    |     192.0.2.2/24 | | |                    | | |  192.0.2.3/24    |
    |                  | | |                    | | |                  |
    |     DC West      | | |  IP/MPLS Backbone  | | |     DC East      |
    +------------------+ | |                    | | +------------------+
                         | +--------------------+ |
                         |                        |
VRF_A :                  V                VRF_A : V
+------------+---------+--------+      +------------+---------+--------+
|   Prefix   | Nexthop |Protocol|      |   Prefix   | Nexthop |Protocol|
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.1/32|127.0.0.1| Direct |      |192.0.2.1/32|127.0.0.1| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.2/32|192.0.2.2| Direct |      |192.0.2.2/32|   PE-1  |  IBGP  |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.3/32|   PE-2  |  IBGP  |      |192.0.2.3/32|192.0.2.3| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.0/24|192.0.2.1| Direct |      |192.0.2.0/24|192.0.2.1| Direct |
+------------+---------+--------+      +------------+---------+--------+
                   Figure 1: Intra-subnet Unicast Example
]]></artwork>
            </figure>As shown in Figure 1, two hosts (i.e., Hosts A and B)
          belonging to the same subnet (i.e., 192.0.2.0/24) are located in
          different data centers (i.e., DC West and DC East, respectively). The
          PE routers (i.e., PE-1 and PE-2) that interconnect
          these two data centers each create host routes for their own local
          hosts and then advertise these routes by means of
          BGP/MPLS IP VPN signaling. Meanwhile, an ARP proxy is enabled on the
          Virtual Routing and Forwarding (VRF) attachment circuits of these PE
          routers.</t>

          <t>Let's now assume that Host A sends an ARP request for Host B
          before communicating with Host B. Upon receiving the ARP request,
          PE-1, acting as an ARP proxy, returns its own MAC address as a
          response. Host A then sends IP packets for Host B to PE-1. PE-1
          tunnels such packets towards PE-2, which in turn forwards them to
          Host B. Thus, Hosts A and B can communicate with each other as if
          they were located within the same subnet.</t>
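
          <t>The forwarding decision that PE-1 makes in this example can be
          sketched as a longest-prefix-match lookup over the VRF_A entries of
          Figure 1. The following Python fragment is only an illustration: the
          table layout and the lookup() helper are assumptions made for this
          sketch and do not describe any particular router
          implementation.</t>

          <figure>
            <artwork><![CDATA[
```python
import ipaddress

# PE-1's VRF_A as listed in Figure 1: (prefix, next hop, protocol).
VRF_A_PE1 = [
    ("192.0.2.1/32", "127.0.0.1", "Direct"),
    ("192.0.2.2/32", "192.0.2.2", "Direct"),
    ("192.0.2.3/32", "PE-2", "IBGP"),
    ("192.0.2.0/24", "192.0.2.1", "Direct"),
]


def lookup(vrf, destination):
    """Return (network, next hop, protocol) of the longest match, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, nexthop, protocol in vrf:
        net = ipaddress.ip_network(prefix)
        if addr not in net:
            continue
        # Prefer the most specific (longest) matching prefix.
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, nexthop, protocol)
    return best
```
]]></artwork>
          </figure>

          <t>With this table, a lookup for 192.0.2.3 selects the /32 host
          route learned via IBGP from PE-2 rather than the directly connected
          /24 route, which is why PE-1 tunnels Host A's packets towards
          PE-2.</t>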
        </section>

        <section title="Inter-subnet Unicast">
          <t><figure>
              <artwork align="center"><![CDATA[                           +--------------------+
    +------------------+   |                    |   +------------------+
    |VPN_A:192.0.2.1/24|   |                    |   |VPN_A:192.0.2.1/24|
    |              \   |   |                    |   |  /               |
    |  +------+     \ ++---+-+                +-+---++/     +------+   |
    |  |Host A+-------+ PE-1 |                | PE-2 +-+----+Host B|   |
    |  +------+\      ++-+-+-+                +-+-+-++ |   /+------+   |
    |   192.0.2.2/24   | | |                    | | |  | 192.0.2.3/24  |
    |   GW=192.0.2.4   | | |                    | | |  | GW=192.0.2.4  |
    |                  | | |                    | | |  |    +------+   |
    |                  | | |                    | | |  +----+  GW  +-- |
    |                  | | |                    | | |      /+------+   |
    |                  | | |                    | | |    192.0.2.4/24  |
    |                  | | |                    | | |                  |
    |     DC West      | | |  IP/MPLS Backbone  | | |      DC East     |
    +------------------+ | |                    | | +------------------+
                         | +--------------------+ |
                         |                        |
VRF_A :                  V                VRF_A : V
+------------+---------+--------+      +------------+---------+--------+
|   Prefix   | Nexthop |Protocol|      |   Prefix   | Nexthop |Protocol|
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.1/32|127.0.0.1| Direct |      |192.0.2.1/32|127.0.0.1| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.2/32|192.0.2.2| Direct |      |192.0.2.2/32|  PE-1   |  IBGP  |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.3/32|   PE-2  |  IBGP  |      |192.0.2.3/32|192.0.2.3| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.4/32|   PE-2  |  IBGP  |      |192.0.2.4/32|192.0.2.4| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.0/24|192.0.2.1| Direct |      |192.0.2.0/24|192.0.2.1| Direct |
+------------+---------+--------+      +------------+---------+--------+
| 0.0.0.0/0  |   PE-2  |  IBGP  |      | 0.0.0.0/0  |192.0.2.4| Static |
+------------+---------+--------+      +------------+---------+--------+
                   Figure 2: Inter-subnet Unicast Example (1)
]]></artwork>
            </figure>As shown in Figure 2, only one data center (i.e., DC
          East) is deployed with a default gateway (i.e., GW). PE-2, which is
          connected to GW, would either be configured with or learn from GW a
          default route whose next hop points to GW. Meanwhile, this
          route is distributed to other PE routers (i.e., PE-1) as per normal
          <xref target="RFC4364"/> operation. Assume Host A sends an ARP
          request for its default gateway (i.e., 192.0.2.4) prior to
          communicating with a destination host outside of its subnet. Upon
          receiving this ARP request, PE-1, acting as an ARP proxy, returns its
          own MAC address as a response. Host A then sends a packet for that
          destination host to PE-1. PE-1 tunnels the packet towards PE-2
          according to the default route learned from PE-2, and PE-2 in turn
          forwards the packet to GW.</t>

          <t><figure>
              <artwork align="center"><![CDATA[                           +--------------------+
    +------------------+   |                    |   +------------------+
    |VPN_A:192.0.2.1/24|   |                    |   |VPN_A:192.0.2.1/24|
    |              \   |   |                    |   |  /               |
    |  +------+     \ ++---+-+                +-+---++/     +------+   |
    |  |Host A+----+--+ PE-1 |                | PE-2 +-+----+Host B|   |
    |  +------+\   |  ++-+-+-+                +-+-+-++ |   /+------+   |
    |  192.0.2.2/24 |  | | |                    | | |  | 192.0.2.3/24  |
    |  GW=192.0.2.4 |  | | |                    | | |  | GW=192.0.2.4  |
    |  +------+    |   | | |                    | | |  |    +------+   |
    |--+ GW-1 +----+   | | |                    | | |  +----+ GW-2 +-- |
    |  +------+\       | | |                    | | |      /+------+   |
    |  192.0.2.4/24    | | |                    | | |    192.0.2.4/24  |
    |                  | | |                    | | |                  |
    |     DC West      | | |  IP/MPLS Backbone  | | |      DC East     |
    +------------------+ | |                    | | +------------------+
                         | +--------------------+ |
                         |                        |
VRF_A :                  V                VRF_A : V
+------------+---------+--------+      +------------+---------+--------+
|   Prefix   | Nexthop |Protocol|      |   Prefix   | Nexthop |Protocol|
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.1/32|127.0.0.1| Direct |      |192.0.2.1/32|127.0.0.1| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.2/32|192.0.2.2| Direct |      |192.0.2.2/32|  PE-1   |  IBGP  |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.3/32|   PE-2  |  IBGP  |      |192.0.2.3/32|192.0.2.3| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.4/32|192.0.2.4| Direct |      |192.0.2.4/32|192.0.2.4| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.0/24|192.0.2.1| Direct |      |192.0.2.0/24|192.0.2.1| Direct |
+------------+---------+--------+      +------------+---------+--------+
| 0.0.0.0/0  |192.0.2.4| Static |      | 0.0.0.0/0  |192.0.2.4| Static |
+------------+---------+--------+      +------------+---------+--------+
                   Figure 3: Inter-subnet Unicast Example (2)
]]></artwork>
            </figure>As shown in Figure 3, in the case where each data center
          is deployed with a default gateway, hosts will get ARP responses
          directly from their local default gateways, rather than from their
          local PE routers when sending ARP requests for their default
          gateways.</t>

          <t><figure>
              <artwork align="center"><![CDATA[                                  +------+
                           +------+ PE-3 +------+
    +------------------+   |      +------+      |   +------------------+
    |VPN_A:192.0.2.1/24|   |                    |   |VPN_A:192.0.2.1/24|
    |              \   |   |                    |   |  /               |
    |  +------+     \ ++---+-+                +-+---++/     +------+   |
    |  |Host A+-------+ PE-1 |                | PE-2 +------+Host B|   |
    |  +------+\      ++-+-+-+                +-+-+-++     /+------+   |
    |  192.0.2.2/24    | | |                    | | |    192.0.2.3/24  |
    |  GW=192.0.2.1    | | |                    | | |    GW=192.0.2.1  |
    |                  | | |                    | | |                  |
    |     DC West      | | |  IP/MPLS Backbone  | | |      DC East     |
    +------------------+ | |                    | | +------------------+
                         | +--------------------+ |
                         |                        |
VRF_A :                  V                VRF_A : V
+------------+---------+--------+      +------------+---------+--------+
|   Prefix   | Nexthop |Protocol|      |   Prefix   | Nexthop |Protocol|
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.1/32|127.0.0.1| Direct |      |192.0.2.1/32|127.0.0.1| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.2/32|192.0.2.2| Direct |      |192.0.2.2/32|  PE-1   |  IBGP  |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.3/32|   PE-2  |  IBGP  |      |192.0.2.3/32|192.0.2.3| Direct |
+------------+---------+--------+      +------------+---------+--------+
|192.0.2.0/24|192.0.2.1| Direct |      |192.0.2.0/24|192.0.2.1| Direct |
+------------+---------+--------+      +------------+---------+--------+
| 0.0.0.0/0  |   PE-3  |  IBGP  |      | 0.0.0.0/0  |   PE-3  |  IBGP  |
+------------+---------+--------+      +------------+---------+--------+
                   Figure 4: Inter-subnet Unicast Example (3)
]]></artwork>
            </figure>Alternatively, as shown in Figure 4, PE routers
          themselves could be configured as default gateways for their locally
          connected hosts as long as these PE routers have routes to reach
          outside networks.</t>
        </section>
      </section>

      <section title="Multicast">
        <t>To support IP multicast between hosts of the same Virtual Subnet,
        MVPN technologies <xref target="RFC6513"/> could be used without any
        change. For example, PE routers attached to a given VPN join a default
        provider multicast distribution tree which is dedicated to that VPN.
        Ingress PE routers, upon receiving multicast packets from their local
        hosts, forward them towards remote PE routers through the
        corresponding default provider multicast distribution tree. Within
        this context, the IP multicast doesn't include link-local
        multicast.</t>
      </section>

      <section title="Host Discovery">
        <t>PE routers should be able to dynamically discover their local hosts
        and keep the list of these hosts up-to-date in a timely manner so as
        to ensure the availability and accuracy of the corresponding host
        routes originated from them. PE routers could accomplish local host
        discovery by some traditional host discovery mechanisms using ARP or
        ND protocols.</t>
      </section>

      <section title="ARP/ND Proxy">
        <t>Acting as an ARP or ND proxy, a PE router should only respond to an
        ARP request or Neighbor Solicitation (NS) message for a target host
        when it has a best route for that target host in the associated VRF
        and the outgoing interface of that best route is different from the
        one over which the ARP request or NS message is received. In the
        scenario where a given VPN site (i.e., a data center) is multi-homed
        to more than one PE router via an Ethernet switch or an Ethernet
        network, the Virtual Router Redundancy Protocol (VRRP) <xref
        target="RFC5798"/> is usually enabled on these PE routers. In this
        case, only the PE router being elected as the VRRP Master is allowed
        to perform the ARP/ND proxy function.</t>
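
        <t>The proxy rule above can be sketched as follows. This is a minimal
        illustration rather than a definitive implementation: the Route type,
        the interface names "ac0" and "mpls0", and the should_proxy_reply()
        helper are assumptions introduced for this example, and the route
        list is a subset of PE-1's VRF_A from Figure 1.</t>

        <figure>
          <artwork><![CDATA[
```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str   # destination prefix, e.g. "192.0.2.3/32"
    out_if: str   # outgoing interface; the names used here are hypothetical


# A subset of PE-1's VRF_A from Figure 1. "ac0" stands for the attachment
# circuit towards DC West and "mpls0" for the tunnel towards PE-2.
PE1_ROUTES = [
    Route("192.0.2.2/32", "ac0"),
    Route("192.0.2.3/32", "mpls0"),
    Route("192.0.2.0/24", "ac0"),
]


def best_route(routes, target):
    """Return the longest-prefix-match route for target, or None."""
    addr = ipaddress.ip_address(target)
    matches = [r for r in routes if addr in ipaddress.ip_network(r.prefix)]
    return max(
        matches,
        key=lambda r: ipaddress.ip_network(r.prefix).prefixlen,
        default=None,
    )


def should_proxy_reply(routes, target, in_if, is_vrrp_master=True):
    """Apply the proxy rule to an ARP request or NS received on in_if."""
    if not is_vrrp_master:
        return False  # only the VRRP Master performs the proxy function
    route = best_route(routes, target)
    # Reply only if a best route exists and it points away from in_if.
    return route is not None and route.out_if != in_if
```
]]></artwork>
        </figure>

        <t>In this sketch, a request received on "ac0" for the remote Host B
        (192.0.2.3) is answered, whereas a request for the local Host A
        (192.0.2.2) or for an unknown address covered only by the directly
        connected subnet route is not, because the best route points back
        to the interface on which the request arrived.</t>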
      </section>

      <section title="Host Mobility">
        <t>During the VM migration process, the PE router to which the moving
        VM is now attached would create a host route for that VM upon
        receiving a notification message of VM attachment (e.g., a gratuitous
        ARP or unsolicited NA message). The PE router to which the moving VM
        was previously attached would withdraw the corresponding host route
        when noticing the detachment of that VM. Meanwhile, the latter PE
        router could optionally broadcast a gratuitous ARP or send an
        unsolicited NA message on behalf of that VM, with the source MAC
        address being one of its own. In this way, the ARP/ND entry for the
        moved VM that has been cached on any local host would be updated
        accordingly. In the case where there is no explicit VM detachment
        notification mechanism, the PE router could also use the following
        technique to detect the VM detachment: upon learning a route update
        for a local host from a remote PE router for the first time, the PE
        router could immediately check whether that local host is still
        attached to it by some means (e.g., ARP/ND PING and/or ICMP PING).</t>

        <t>It is important to ensure that the same MAC and IP addresses are
        associated with the default gateway active in each data center, as the
        VM would most likely continue to send packets to the same default
        gateway address after having migrated from one data center to another.
        One possible way to achieve this goal is to configure the same VRRP
        group at each location so as to ensure that the default gateways
        active in the data centers share the same virtual MAC and virtual IP
        addresses.</t>
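
        <t>The detachment check described in this section, for the case where
        no explicit detachment notification exists, can be sketched as
        follows. The function name, the returned action strings, and the
        is_reachable callback (which stands in for an ARP/ND or ICMP probe)
        are hypothetical. The sketch does not track whether the update is the
        first one learned for the host, and the action taken when the host
        still answers is not specified in this document, so the sketch simply
        keeps the local route in that case.</t>

        <figure>
          <artwork><![CDATA[
```python
def handle_remote_route_update(local_hosts, host_ip, is_reachable):
    """Decide what a PE does when a remote PE advertises host_ip.

    local_hosts  -- set of IP addresses this PE currently treats as local
    is_reachable -- callable(ip) -> bool standing in for an ARP/ND or ICMP
                    probe on the attachment circuit (hypothetical hook)
    """
    if host_ip not in local_hosts:
        # Not one of our hosts: nothing to verify locally.
        return "install-remote-route"
    if is_reachable(host_ip):
        # The host still answers locally; keep advertising it.
        return "keep-local-route"
    # The host no longer answers: treat it as detached.
    local_hosts.discard(host_ip)
    return "withdraw-local-route"
```
]]></artwork>
        </figure>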
      </section>

      <section title="Forwarding Table Scalability on Data Center Switches">
        <t>In a Virtual Subnet environment, the MAC learning domain associated
        with a given Virtual Subnet which has been extended across multiple
        data centers is partitioned into segments and each segment is confined
        within a single data center. Therefore data center switches only need
        to learn local MAC addresses, rather than learning both local and
        remote MAC addresses.</t>
      </section>

      <section title="ARP/ND Cache Table Scalability on Default Gateways">
        <t>When default gateway functions are implemented on PE routers as
        shown in Figure 4, the ARP/ND cache table on each PE router only needs
        to contain ARP/ND entries of local hosts. As a result, the ARP/ND
        cache table size would not grow as the number of data centers to be
        connected increases.</t>
      </section>

      <section title="ARP/ND and Unknown Unicast Flood Avoidance">
        <t>In a Virtual Subnet environment, the flooding domain associated
        with a given Virtual Subnet that has been extended across multiple
        data centers is partitioned into segments and each segment is
        confined within a single data center. Therefore, the performance
        impact on networks and servers imposed by the flooding of ARP/ND
        broadcast/multicast and unknown unicast traffic is minimized.</t>
      </section>

      <section title="Path Optimization">
        <t>Taking the scenario shown in Figure 4 as an example, to optimize the
        forwarding path for the traffic between cloud users and cloud data
        centers, PE routers located in cloud data centers (i.e., PE-1 and
        PE-2), which are also acting as default gateways, propagate host
        routes for their own local hosts to remote PE routers
        that are attached to cloud user sites (i.e., PE-3). As such, traffic
        from cloud user sites to a given server on the Virtual Subnet which
        has been extended across data centers would be forwarded directly to
        the data center location where that server resides, since traffic is
        now forwarded according to the host route for that server, rather than
        the subnet route. Furthermore, for traffic coming from cloud data
        centers and forwarded to cloud user sites, each PE router acting as a
        default gateway would forward traffic according to the longest-match
        route in the corresponding VRF. As a result, traffic from data centers
        to cloud user sites is forwarded along an optimal path as well.</t>
      </section>
    </section>

    <!---->

    <section title="Limitations">
      <section title="Non-support of Non-IP Traffic">
        <t>Although most traffic within and across data centers is IP traffic,
        there may still be a few legacy clustering applications which rely on
        non-IP communications (e.g., heartbeat messages between cluster
        nodes). Since Virtual Subnet is strictly based on L3 forwarding, those
        non-IP communications cannot be supported in the Virtual Subnet
        solution. In order to support such non-IP traffic (if present) in
        the environment where the Virtual Subnet solution has been deployed,
        the approach following the idea of "route all IP traffic, bridge
        non-IP traffic" could be considered. In other words, all IP
        traffic, including both intra- and inter-subnet traffic, would be
        processed according to the Virtual Subnet design, while non-IP traffic
        would be forwarded according to a particular Layer 2 VPN approach. Such
        a unified L2/L3 VPN approach requires ingress PE routers to classify
        packets
        received from hosts before distributing them to the corresponding L2
        or L3 VPN forwarding processes. Note that more and more cluster
        vendors are offering clustering applications based on Layer 3
        interconnection.</t>
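
        <t>The classification step that such a unified L2/L3 approach
        requires on ingress PE routers can be sketched as a dispatch on the
        EtherType of each received frame. The fragment below is an
        illustration under stated assumptions: it handles untagged Ethernet
        frames only, the returned labels are hypothetical, and handing ARP to
        the PE's proxy function rather than bridging it is a choice made for
        this sketch.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
ETHERTYPE_IPV6 = 0x86DD


def classify(frame: bytes) -> str:
    """Pick a forwarding process for an untagged Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # The EtherType follows the 6-byte destination and source addresses.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype in (ETHERTYPE_IPV4, ETHERTYPE_IPV6):
        return "l3vpn"      # routed according to the Virtual Subnet design
    if ethertype == ETHERTYPE_ARP:
        return "arp-proxy"  # assumption: answered by the PE's ARP proxy
    return "l2vpn"          # non-IP traffic is bridged
```
]]></artwork>
        </figure>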
      </section>

      <section title="Non-support of IP Broadcast and Link-local Multicast">
        <t>As illustrated before, intra-subnet traffic across PE routers is
        forwarded at Layer 3 in the Virtual Subnet solution. Therefore, IP
        broadcast and link-local multicast traffic cannot be forwarded across
        PE routers in the Virtual Subnet solution. In order to support the IP
        broadcast and link-local multicast traffic in the environment where
        the Virtual Subnet solution has been deployed, the unified L2/L3
        overlay approach as described in Section 4.1 could be considered as
        well. That is, IP broadcast and link-local multicast messages would be
        forwarded at Layer 2 while routable IP traffic would be processed
        according to the Virtual Subnet design.</t>
      </section>

      <section title="TTL and Traceroute">
        <t>As mentioned before, intra-subnet traffic is forwarded at Layer 3
        in the Virtual Subnet context. Since Virtual Subnet doesn't require
        any change to the Time To Live (TTL) handling mechanism of the
        BGP/MPLS IP VPN, a traceroute from one host to another host that is
        within the same subnet but attached to a different site would display
        the PE routers to which those two hosts are connected. The traceroute
        output would therefore reflect the fact that the two hosts are
        actually connected via a Virtual Subnet rather than a Layer 2
        connection. In addition, any other applications that generate
        intra-subnet traffic with the TTL set to 1 may not work properly in
        the Virtual Subnet context unless special TTL processing and
        loop-prevention mechanisms for such a context have been implemented.
        Details about such special TTL processing and loop-prevention
        mechanisms are outside the scope of this document.</t>
      </section>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>Thanks to Susan Hares, Yongbing Fan, Dino Farinacci, Himanshu Shah,
      Nabil Bitar, Giles Heron, Ronald Bonica, Monique Morrow, Rajiv Asati,
      Eric Osborne, Thomas Morin, Martin Vigoureux, Pedro Roque Marque, Joe
      Touch, Wim Henderickx, Alia Atlas and Stephen Farrell for their valuable
      comments and suggestions on this document. Thanks to Loa Andersson for
      his WG LC review on this document. Thanks to Alvaro Retana for his AD
      review on this document. Thanks to Ronald Bonica for his RtgDir review.
      Thanks to Donald Eastlake for his Sec-DIR review of this document.
      Thanks to Jouni Korhonen for the OPS-Dir review of this document. Thanks
      to Roni Even for the Gen-ART review of this document. Thanks to Sabrina
      Tanamal for the IANA review of this document.</t>

      <!---->
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>There is no requirement for any IANA action.</t>

      <!---->
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>Since the BGP/MPLS IP VPN signaling is reused without any change,
      those security considerations as described in <xref target="RFC4364"/>
      are applicable to this document. Meanwhile, since security issues
      associated with the NDP are inherited due to the use of NDP proxy, those
      security considerations and recommendations as described in <xref
      target="RFC6583"/> are applicable to this document as well.</t>

      <t>Inter data-center traffic often carries highly sensitive information
      at higher layers that is not directly understood (parsed) within an
      egress or ingress PE. For example, migrating a VM will often mean moving
      private keys and other sensitive configuration information. For this
      reason inter data-center traffic should always be protected for both
      confidentiality and integrity using a strong security mechanism such as
      IPsec <xref target="RFC4301"/>. In the future, it may be feasible to protect
      that traffic within the MPLS layer <xref
      target="I-D.ietf-mpls-opportunistic-encrypt"/> though at the time of
      writing the mechanism for that is not sufficiently mature to recommend.
      Exactly how such security mechanisms are deployed will vary from case to
      case, so securing the inter data-center traffic may or may not involve
      deploying security mechanisms on the ingress/egress PEs or further
      "inside" the data centers concerned. Note though that if security is not
      deployed on the egress/ingress PEs there is a substantial risk that some
      sensitive traffic may be sent in the clear and therefore be vulnerable to
      pervasive monitoring <xref target="RFC7258"/> or other attacks.</t>

      <!---->
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.4364"?>

      <?rfc include="reference.RFC.0925"?>

      <?rfc include="reference.RFC.1027"?>

      <?rfc include="reference.RFC.4389"?>

      <!---->
    </references>

    <references title="Informative References">
      <!---->

      <?rfc include="reference.RFC.6820"?>

      <?rfc include="reference.RFC.4301"?>

      <?rfc include="reference.RFC.4761"?>

      <?rfc include="reference.RFC.4762"?>

      <?rfc include="reference.RFC.4659"?>

      <?rfc include="reference.RFC.5798"?>

      <?rfc include="reference.RFC.6513"?>

      <?rfc include="reference.RFC.6583"?>

      <?rfc include="reference.RFC.7258"?>

      <?rfc include="reference.I-D.ietf-mpls-opportunistic-encrypt"?>
    </references>
  </back>
</rfc>