One document matched: draft-baker-ipv6-ospf-dst-src-routing-01.xml


<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- Some of the more generally applicable PIs that most I-Ds might want to use -->
<!-- Try to enforce the ID-nits conventions and DTD validity -->
<?rfc strict="yes" ?>
<!-- Items used when reviewing the document -->
<?rfc comments="no" ?>
<!-- Controls display of <cref> elements -->
<?rfc inline="no" ?>
<!-- When no, put comments at end in comments section, 
                                 otherwise, put inline -->
<?rfc editing="no" ?>
<!-- When yes, insert editing marks: editing marks consist of a 
                                 string such as <29> printed in the blank line at the 
                                 beginning of each paragraph of text. -->
<!-- Create Table of Contents (ToC) and set some options for it.  
         Note the ToC may be omitted for very short documents, but idnits insists on a ToC 
         if the document has more than 15 pages. -->
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<!-- If "yes" eliminates blank lines before main section entries. -->
<?rfc tocdepth="4"?>
<!-- Sets the number of levels of sections/subsections... in ToC -->
<!-- Choose the options for the references. 
         Some like symbolic tags in the references (and citations) and others prefer 
         numbers. The RFC Editor always uses symbolic tags.
         The tags used are the anchor attributes of the references. -->
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc rfcprocack="yes" ?>
<!-- If "yes", causes the references to be sorted in order of tags.
                                 This doesn't have any effect unless symrefs is "yes" also. -->
<!-- These two save paper: Just setting compact to "yes" makes savings by not starting each 
         main section on a new page but does not omit the blank lines between list items. 
         If subcompact is also "yes" the blank lines between list items are also omitted. -->
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<!-- end of list of popular I-D processing instructions -->
<!-- end of list of processing instructions -->
<rfc category="std" docName="draft-baker-ipv6-ospf-dst-src-routing-01"
     ipr="trust200902">
  <front>
    <title abbrev="OSPF Source/Destination Routing">IPv6 Source/Destination
    Routing using OSPFv3</title>
    <author fullname="Fred Baker" initials="F.J." surname="Baker">
      <organization>Cisco Systems</organization>
      <address>
        <postal>
          <street></street>
          <city>Santa Barbara</city>
          <code>93117</code>
          <region>California</region>
          <country>USA</country>
        </postal>
        <email>fred@cisco.com</email>
      </address>
    </author>
    <date />
    <area>Routing</area>
    <workgroup>OSPF</workgroup>
    <abstract>
      <t>This note describes the changes necessary for OSPFv3 to route classes
      of IPv6 traffic that are defined by an IPv6 source prefix and a
      destination prefix. This implies not simply routing "to a destination",
      but "traffic going to that destination AND coming from a specified
      source". It may be combined with other qualifying attributes, such as
      "traffic going to that destination AND using a specified flow label AND
      from a specified source prefix". The obvious application is egress
      routing, as required for a multihomed entity with a provider-allocated
      prefix from each of several upstream networks. Traffic within the
      network could be source/destination routed as well, or could be routed
      from "any prefix", ::/0. If traffic is routed from the relevant PA
      prefixes but in fact has a source address that is in none of them, the
      traffic in effect has no route.</t>
    </abstract>
    <!--		
		<note title="Foreword">
		</note>
		-->
    <!--
      <?rfc needLines="10" ?>
      <texttable anchor="table_example" title="A Very Simple Table">
      <preamble>Tables use ttcol to define column headers and widths.
      Every cell then has a "c" element for its content.</preamble>
          <ttcol align="center">ttcol #1</ttcol>
                                    <ttcol align="center">ttcol #2</ttcol>
                      <c>c #1</c>		<c>c #2</c>
                      <c>c #3</c>		<c>c #4</c>
                      <c>c #5</c>		<c>c #6</c>
      <postamble>which is a very simple example.</postamble>
      </texttable>
    -->
  </front>
  <middle>
    <!--		
      <t>There are multiple list styles: "symbols", "letters", "numbers", 
"hanging", "format", etc.</t>
      <t>
	<list style="symbols">
	    <t>First bullet</t>
	    <t>Second bullet</t>
	</list>
     </t>
-->
    <!--
<figure anchor="reference" title="Figure">
<artwork align="center">
<![CDATA[
	ASCII artwork goes here... 
]]>
</artwork>
</figure>
-->
    <section title="Introduction">
      <t>This specification builds on <xref target="RFC5340">OSPF for
      IPv6</xref> and the extensible LSAs defined in <xref
      target="I-D.acee-ospfv3-tlv-extend"></xref>. It adds the sub-TLV option
      for an IPv6 Source Prefix, to define routes defined by a source and a
      destination prefix.</t>
      <section title="Requirements Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119"></xref>.</t>
      </section>
    </section>
    <section anchor="extensions" title="Theory of Routing">
      <t>Both IS-IS and OSPF perform their calculations by building a lattice
      of routers and routes from the router performing the calculation to each
      router, and then use those routes to get to destinations that those
      routes advertise connectivity to. Following the SPF algorithm,
      calculation starts by selecting a starting point (typically the router
      doing the calculation), and successively adding {link, router) pairs
      until one has calculated a route to every router in the network. As each
      router is added, including the original router, destinations that it is
      directly connected to are turned into routes in the route table: "to get
      to 2001:db8::/32, route traffic to {interface, list of next hop
      routers}". For immediate neighbors to the originating router, of course,
      there is no next hop router; traffic is handled locally.</t>
      <t>In this context, the route is qualified by a source prefix; It is
      installed into the FIB with the value, and the FIB applies the route if
      and only if the IPv6 source address matches the advertised prefix. Of
      course, there may be multiple LSAs in the LSDB with the same destination
      and differing source prefixes; these may also have the same or differing
      next hop lists. The intended forwarding action is to forward matching
      traffic to one of the next hop routers associated with this destination
      and source prefix, or to discard non-matching traffic as "destination
      unreachable".</t>
      <t>LSAs that lack a source prefix sub-TLV match any flow label, by
      definition.</t>
      <section title="Dealing with ambiguity">
        <t>In any routing protocol, there is the possibility of ambiguity. An
        area border router might, for example, summarize the routes to other
        areas into a small set of relatively short prefixes, which have more
        specific routes within the area. Traditionally, we have dealt with
        that using a "longest match first" rule. If the same datagram matches
        more than one destination prefix advertised within the area, we follow
        the route to the longest matching prefix.</t>
        <t>When routing a class of traffic, we follow an analogous "most
        specific match" rule; we follow the route for the most specific
        matching tuple. In cases of simple overlap, such as routing to
        2001:db8::/32 or 2001:db8:1::/48, that is exactly analogous; we choose
        one of the two routes.</t>
        <t>It is possible, however, to construct an ambiguous case in which
        neither class subsumes the other. For example, presume that <list
            style="symbols">
            <t>A is a prefix,</t>
            <t>B is a more-specific prefix within A,</t>
            <t>C is a different prefix, and</t>
            <t>D is a more-specific prefix of C.</t>
          </list></t>
        <t>The two classes {A, D, *, *} and {B, C, *, *} are ambiguous: a
        datagram within {B, D, *, *} matches both classes, and it is not clear
        in the data plane what decision to make. Solving this requires the
        addition of a third route in the FIB corresponding to the class {B, D,
        *, *}, which is more-specific than either of the first two, and can be
        given routing guidance based on metrics or other policy in the usual
        way.</t>
      </section>
    </section>
    <section title="Extensions necessary for IPv6 Source/Destination Routing in OSPFv3">
      <t>The extensible LSA format defined in <xref
      target="I-D.acee-ospfv3-tlv-extend"></xref> requires one additional
      option to accomplish label+destination routing: the source prefix. This
      is defined here.</t>
      <t><list style="empty">
          <t>Editor's note-to-self: the following statement is my expectation.
          That said, the authors of <xref
          target="I-D.acee-ospfv3-tlv-extend"></xref> suggest that an area
          should have one type of LSA (as specified in <xref
          target="RFC5340"></xref>) or the extended LSA. I'll leave the
          statement for the moment, and remove it if the OSPF working group
          tells me to.</t>
        </list></t>
      <t>In addition, should (as one might expect is normal) destination-only
      intra-area-prefix, inter-area-prefix, and AS-external-prefix LSAs be
      encountered, we need a rule for interpretation. The rule is that they
      are treated exactly as the extensible version if the source prefix
      option is not specified or is specified to be ::/0 (any IPv6
      address).</t>
      <section anchor="OSPFv3-IPv6-source" title="IPv6 Source Prefix TLV">
        <t>The IPv6 Source Prefix TLV is derived from the Link-LSA and
        Inter-Area-Prefix-LSA address format.</t>
        <?rfc needLines="6"?>
        <figure title="Source Prefix TLV">
          <artwork align="center"><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Type             |             Length            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  PrefixLength | PrefixOptions |             0                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Address Prefix                         |
|                             ...                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
        </figure>
        <t><list style="hanging">
            <t hangText="Type:">assigned by IANA</t>
            <t hangText="TLV Length:">Length of the value portion of the TLV
            in octets. This is by definition 20.</t>
            <t
            hangText="PrefixLength, PrefixOptions, and Address Prefix:">Representation
            of an IPv6 address prefix, as described in <xref
            target="RFC5340"></xref> Appendix A.4.1</t>
          </list></t>
      </section>
    </section>
    <section anchor="IANA" title="IANA Considerations">
      <t>The OSPF Working Group will need a registry for sub-TLV Types. To be
      discussed</t>
    </section>
    <section anchor="Security" title="Security Considerations">
      <t>While source/destination routing could be used as part of a security
      solution, it is not really intended for the purpose. The approach limits
      routing, in the sense that it routes traffic to an appropriate egress,
      or gives a way to prevent communication between systems not included in
      a source/destination route, and in that sense could be considered
      similar to an access list that is managed by and scales with
      routing.</t>
    </section>
    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>
</t>
    </section>
  </middle>
  <back>
    <!-- references split to informative and normative -->
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>
      <?rfc include="reference.RFC.2460"?>
      <?rfc include="reference.RFC.5340" ?>
      <!-- rfc include="reference.I-D.acee-ospfv3-tlv-extend" -->
            <reference anchor="I-D.acee-ospfv3-tlv-extend">
        <front>
          <title>OSPFv3 LSA Extendibility</title>

          <author fullname="Acee Lindem" initials="A." surname="Lindem">
            <organization>Ericsson</organization>

            <address>
              <postal>
                <street>102 Carric Bend Court</street>

                <city>Cary</city>

                <region>NC</region>

                <country>USA</country>

                <code>27519</code>
              </postal>

              <email>acee.lindem@ericsson.com</email>
            </address>
          </author>

          <author fullname="Sina Mirtorabi" initials="S." surname="Mirtorabi">
            <organization>Cisco Systems</organization>

            <address>
              <postal>
                <street>170 Tasman Drive</street>

                <city>San Jose</city>

                <region>CA</region>

                <country>USA</country>

                <code>95134</code>
              </postal>

              <email>sina@cisco.com</email>
            </address>
          </author>

          <author fullname="Abhay Roy" initials="A." surname="Roy">
            <organization>Cisco Systems</organization>

            <address>
              <postal>
                <street>170 Tasman Drive</street>

                <city>San Jose</city>

                <region>CA</region>

                <country>USA</country>

                <code>95134</code>
              </postal>

              <email>akr@cisco.com</email>
            </address>
          </author>

          <author fullname="Fred Baker" initials="F." surname="Baker">
            <organization>Cisco Systems</organization>

            <address>
              <postal>
                <street></street>

                <city>Santa Barbara</city>

                <region>CA</region>

                <country>USA</country>

                <code>93117</code>
              </postal>

              <email>fred@cisco.com</email>
            </address>
          </author>

          <date month="April" year="2013" />

          <abstract>
            <t></t>
          </abstract>
        </front>
      </reference>
    </references>
    <references title="Informative References">
      <?rfc include="reference.RFC.2827" ?>
      <reference anchor="PATRICIA">
        <front>
          <title>Practical Algorithm to Retrieve Information Coded in
          Alphanumeric</title>
          <author fullname="D.R. Morrison" initials="D.R." surname="Morrison">
            <organization>Association for Computing Machinery</organization>
          </author>
          <date month="October" year="1968" />
        </front>
        <seriesInfo name="Journal of the ACM" value="15(4) pp514-534" />
        <format target="http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Tree/PATRICIA/"
                type="HTML" />
      </reference>
    </references>
    <section anchor="log" title="Change Log">
      <t><list style="hanging">
          <t hangText="Initial Version:">February 2013</t>
          <t hangText="First revision:">April 2013</t>
        </list></t>
    </section>
    <section anchor="egress-routing" title="Use case: Egress Routing">
      <t>Using this technology for egress routing is straightforward. Presume
      a multihomed edge (residential or enterprise) network with multiple
      egress points to the various ISPs. These ISPs allocate PA prefixes to
      the network. Due to <xref target="RFC2827">BCP 38</xref>, the network
      must presume that its upstream ISPs will filter out any traffic
      presented to them that does not use their PA prefix.</t>
      <t>Within the network, presume that a /64 prefix from each of those PA
      prefixes is allocated on each LAN, and that hosts generate and use
      multiple addresses on each interface.</t>
      <t>Within the network, we permit any host to communicate with any other.
      Hence, routing advertisements within the network use traditional
      destination routing, which is understood to be advertising the traffic
      class <list style="empty">
          <t>{destination, ::/0}.</t>
        </list></t>
      <t>From the egresses, the firewall or its neighboring router injects a
      default route for traffic "from" its PA prefix: <list style="empty">
          <t>{::/0, PA prefix}.</t>
        </list></t>
      <t>Routing is calculated as normal, with the exception that traffic
      following a default route will select that route based on the source
      address. Traffic will never be lost to BCP 38 filters, because by
      definition the only traffic sent to the ISP is using the PA prefix
      assigned by the ISP. In addition, while hosts can use spoofed addresses
      outside of their PA prefixes to attack each other, they cannot send
      traffic using spoofed addresses to their upstream networks; such traffic
      has no route.</t>
    </section>
    <section title="FIB Design">
      <t>While the design of the Forwarding Information Base is not a matter
      for standardization, as it only has to work correctly, not interoperate
      with something else, the design of a FIB for this type of lookup may
      differ from approaches used in destination routing. We describe one
      possible approach that is known to work, from the perspective of a proof
      of concept.</t>
      <section title="Linux Source-Address Forwarding">
        <t>The University of Waikato has added to the Linux Advanced Routing
        & Traffic Control facility the ability to maintain multiple FIBS,
        one for each of a set of prefixes. Implementing source/destination
        routing using this mechanism is not difficult.</t>
        <t>The router must know what source prefixes might be used in its
        domain. This may be by configuration or, at least in concept, learned
        from the routing protocols themselves. In whichever way that is done,
        one can imagine two fundamental FIB structures to serve N source
        prefixes; N FIBs, one per prefix, or N+1 FIBs, one per prefix plus one
        for destinations for which the source prefix is unspecified.</t>
        <section title="One FIB per source prefix">
          <t>In an implementation with one FIB per source prefix, the routing
          algorithm has two possibilities. <list style="symbols">
              <t>If it calculates a route to a prefix (such as a default
              route) associated with a given source prefix, it stores the
              route in the FIB for the relevant source prefix.</t>
              <t>If it calculates a route for which the source prefix is
              unspecified, it stores that route in all N FIBs.</t>
            </list></t>
          <t>When forwarding a datagram, the IP forwarder looks at the source
          address of the datagram to determine which FIB it should use. If it
          is from an address for which there is no FIB, the forwarder discards
          the datagram as containing a forged source address. If it is from an
          address within one of the relevant prefixes, it looks up the
          destination in the indicated FIB and forwards it in the usual
          way.</t>
          <t>The argument for this approach is simplicity: there is one place
          to look in making a forwarding decision for any given datagram. The
          argument against it is memory space; it is likely that the FIBs will
          be similar, but every destination route not associated with a source
          prefix is duplicated in each FIB. In addition, since it
          automatically removes traffic whose source address is not among the
          configured list, it limits the possibility of user software using
          improper addresses.</t>
        </section>
        <section title="One FIB per source prefix plus a general FIB">
          <t>In an implementation with N+1 FIBs, the algorithm is slightly
          more complex. <list style="symbols">
              <t>If it calculates a route to a prefix (such as a default
              route) associated with a given source prefix, it stores the
              route in the FIB for the relevant source prefix.</t>
              <t>If it calculates a route for which the source prefix is
              unspecified, it stores that route in the FIB that is not
              associated with a source prefix.</t>
            </list></t>
          <t>When forwarding a datagram, the IP forwarder looks at the source
          address of the datagram to determine which FIB it should use. If it
          is from one of the configured prefixes, it looks the destination up
          in the indicated FIB. In any event it also looks the destination up
          in the "unspecified source address" FIB. If the destination is found
          in only one of the two, the indicated route is followed. If the
          destination is found in both, the more specific route is
          followed.</t>
          <t>The argument for this approach is memory space; if a large
          percentage of routes are only in the general FIB, such as when
          egress routing is used for the default route and all other routes
          are internal, the other FIBs are likely to be very small - perhaps
          only a single default route. The argument against this approach is
          complexity: most lookups if not all will be done in a
          prefix-specific FIB and in the general FIB.</t>
        </section>
      </section>
      <section title="PATRICIA">
        <t>One approach is a <xref target="PATRICIA"></xref> Tree. This is a
        relative of a Trie, but unlike a Trie, need not use every bit in
        classification, and does not need the bits used to be contiguous. It
        depends on treating the bit string as a set of slices of some size,
        potentially of different sizes. Slice width is an implementation
        detail; since the algorithm is most easily described using a slice of
        a single bit, that will be presumed in this description.</t>
        <section title="Virtual Bit String">
          <t>It is quite possible to view the fields in a datagram header
          incorporated into the classification tuple as a virtual bit string
          such as is shown in <xref target="conceptual"></xref>. This bit
          string has various regions within it. Some vary and are therefore
          useful in a radix tree lookup. Some may be essentially constant -
          all global IPv6 addresses at this writing are within 2000::/3, for
          example, so while it must be tested to assure a match, incorporating
          it into the radix tree may not be very helpful in classification.
          Others are ignored; if the destination is a remote /64, we really
          don't care what the EID is. In addition, due to variation in prefix
          length and other details, the widths of those fields vary among
          themselves. The algorithm the FIB implements, therefore, must
          efficiently deal with the fact of a discontiguous lookup key.</t>
          <figure anchor="conceptual"
                  title="Treating a traffic class as a virtual bit string">
            <artwork align="center"><![CDATA[
+---------------------+----------------------+-----+-----------+
|Destination Prefix   |Source Prefix         |DSCP | Flow Label|
+------+------+-------+------+-------+-------+-----+-----------+
 Common|Varying|Ignored|Common|Varying|Ignored|Varying or ignored
]]></artwork>
          </figure>
        </section>
        <section title="Tree Construction">
          <t>The tree is constructed by recursive slice-wise decomposition. At
          each stage, the input is a set of classes to be classified. At each
          stage, the result is the addition of a lookup node in the tree that
          identifies the location of its slice in the virtual bit string
          (which might be a bit number), the width of the slice to be
          inspected, and an enumerated set of results. Each result is a
          similar set of classes, and is analyzed in a similar manner.</t>
          <t>The analysis is performed by enumerating which bits that have not
          already been considered are best suited to classification. For a
          slice of N bits, one wants to select a slide that most evenly
          divides the set of classes into 2^N subsets. If one or more bits in
          the slice is ignored in some of the classes, those classes must be
          included in every subset, as the actual classification of them will
          depend on other bits.</t>
          <figure anchor="patricia-tree" title="Example PATRICIA Tree">
            <artwork align="center"><![CDATA[
Input:{2001:db8::/32, ::/0, *, *}
      {2001:db8:1::/48, ::/0, AF41, *}
      {2001:db8:1::/48, ::/0, AF42, *}
      {2001:db8:1::/48, ::/0, AF43, *}
Common parts: Destination prefix 2001:dba, source prefix, and label
Varying parts: DSCP and the third set of sixteen bits in the
               destination prefix
One possible decomposition:
(1) slice = DSCP
    enumerated cases:
(a) { {2001:db8::/32, ::/0, *, *}, {2001:db8:1::/48, ::/0, AF41, *} }
(b) { {2001:db8::/32, ::/0, *, *}, {2001:db8:1::/48, ::/0, AF42, *} }
(c) { {2001:db8::/32, ::/0, *, *}, {2001:db8:1::/48, ::/0, AF43, *} }
(2) slice = third sixteen bit field in destination
    This divides each enumerated case into those containing 0001 and
    "everything else", which would imply 2001:db8::/32
                           (1) DSCP
                 --------------------------
                (1a)       (1b)         (1c)
               /    \     /    \       /    \
             /32   /48  /32   /48    /32   /48
]]></artwork>
          </figure>
        </section>
        <section title="Tree Lookup">
          <t>To look something up in a PATRICIA Tree, one starts at the root
          of the tree and performs the indicated comparisons recursively
          walking down the tree until one reaches a terminal node. When the
          enumerated subset is empty or contains only a single class,
          classification stops. Either classification has failed (there was no
          matching class, or one has presumably found the indicated class. At
          that point, every bit in the virtual bit string must be compared to
          the classifier; classification is accepted on a perfect match.</t>
          <t>In the example in <xref target="patricia-tree"></xref>, if a
          packet {2001:db8:1:2:3:4:5:6, 2001:db8:2:3:4:5:6:7, AF41, 0}
          arrives, we start at the root. Since it is an AF41 packet, we deduce
          that case (1a) applies, and since the destination has 0001 in the
          third sixteen bit field of the destination address, we are comparing
          to {2001:db8:1::/48, ::/0, AF41, *}. Since the destination address
          is within 2001:db8:1::/48, classification as that succeeds.</t>
        </section>
      </section>
    </section>
  </back>
</rfc>

PAFTECH AB 2003-20262026-04-23 19:35:20