

Internet Draft                                            I. van Beijnum
Document: draft-van-beijnum-multi6-isp-int-aggr-00.txt      October 2002
Expires: April 2003


             Provider-Internal Aggregation based on Geography
                     to Support Multihoming in IPv6


1 Mandatory Statements

         This document is an Internet-Draft and is subject to
         all provisions of Section 10 of RFC2026.

     Internet-Drafts are working documents of the Internet Engineering
     Task Force (IETF), its areas, and its working groups.  Note that
     other groups may also distribute working documents as
     Internet-Drafts.

     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other
     documents at any time.  It is inappropriate to use Internet-
     Drafts as reference material or to cite them other than as
     "work in progress."

     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/1id-abstracts.html

     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html


2 Abstract

Current 6Bone backbone routing guidelines prohibit traditional
multihoming in IPv6, because current IPv4-style multihoming doesn't
scale. This stands in the way of successful adoption of IPv6. The
solution outlined in this memo proposes aggregating the routing
information for multihomed destinations inside service provider
networks based on geography to accomplish scalable multihoming in IPv6
using current protocols and implementations. This solution does not
require network operators to increase the density of interconnection;
nor does it require significant cooperation or simultaneous adoption.


3 Introduction

Current IPv4 and IPv6 interdomain routing operational practices depend
heavily on aggregation in order to reach the necessary scalability.
Current aggregation is exclusively service provider based: ISPs
(Internet Service Providers) obtain blocks of address space from the
Regional Internet Registries (RIRs) and assign their customers addresses
from these blocks. Then they announce a single route for each block to
other networks. This aggregation makes it possible for millions of



Van Beijnum                                                       Page 1
draft-van-beijnum-multi6-isp-int-aggr-00.txt                October 2002


organizations to be connected to the internet while limiting the global
routing table to only slightly more than a hundred thousand destination
prefixes.

Unfortunately, provider-based aggregation doesn't work for networks
connected to the internet over more than one connection ("multi-homed"
networks). In the current IPv4 internet, multihoming is typically done
by announcing a route for an independent address block to two or more
ISPs. The address block may actually be part of a larger PA (provider
aggregatable) block, but it must be visible in the global routing table
independently from possible aggregates to make multihoming work under
all circumstances. This makes it impossible for many millions of
networks to multihome: the global routing table would grow beyond what
routers can handle.

There are efforts underway to provide in IPv6 the failover and load
balancing functionality present in current "IPv4 style" multihoming in
different ways that wouldn't increase the size of the global routing
table. However, all these new multihoming solutions are still on the
drawing board and need changes to protocols and implementations. In the
meantime, the current 6Bone backbone routing guidelines [RFC2772] don't
allow non-aggregated routes in the IPv6 global routing table and thereby
make IPv4-style multihoming impossible.

This draft proposes new operational practices that will allow networks
to handle a much larger global routing table, so multihoming in IPv6 can
be made possible within a very short time frame. However, it is very
important to note this isn't a perfect "one size fits all" solution that
scales to huge numbers of multihomed networks without any pain or
effort. (See the Limitations section later in this document.) But at
least this mechanism makes multihoming possible almost immediately,
without having to wait for protocols and implementations to be changed
or even for network operators to reconfigure their networks. The latter
can be done later, and on a per-network basis, as the size of the global
routing table becomes problematic for individual networks. The idea is
to make multihoming possible now, while providing networks with the
means to control the size of the routing table in their routers later as
necessary.

After implementing the necessary filtering mechanisms, growth to several
million multihomed networks worldwide should be possible without much
trouble. In theory, this mechanism can support many hundreds of millions
of multihomed networks, but this will be hard to accomplish in practice,
so work on more advanced multihoming solutions should continue. There is
at least one multihoming solution under development [MHAP] that can use
the same addressing mechanism as is needed for the solution proposed
here, so there is potential for gradual replacement by a more permanent
solution. This addressing mechanism is explained in a separate document
[GAPI].





4 How It Works

To make multihoming (as we know it today) possible, individual routes
must be present in the global routing table. But in order to fit the
routing table into a router, there must be aggregation. These
requirements seem at odds with each other. This is because there is an
unspoken assumption: the full global routing table must be present in
all routers that are part of the default-free zone. Dropping this
requirement makes everything much more complex, but it is possible. The
global routing table can then be split into several parts, where
individual routers all handle one (or a few) of those parts.

This works as long as traffic for a certain subset of the destination
networks present in the global routing table is always sent to a router
containing that part of the global routing table. The obvious way to
accomplish this is for each router to announce an aggregate covering the
part of the global routing table it serves. For instance, if a network
has four routers and wants to divide routing information for the IPv6
global unicast address space over those routers, it could have router A
handle 2000::/5, router B 2800::/5, router C 3000::/5 and router D
3800::/5. So if this network peers with another network that announces
2200:abc::/35 and 3ffe:def::/35, all routers except router A filter out
the first route, and all routers except router D filter out the second
route. When router C then has a packet for 2200:abc:1:2::1, it sends the
packet to router A (because router A announces the 2000::/5 aggregate)
and router A delivers the packet to the right peer. Note that this
behavior is completely hidden from the peer: the aggregates are only
used within the local network, they are not announced to peers. To avoid
confusion with regular provider aggregatable routes, the term "pilot
routes" will be used for private aggregates of this type.
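
As a minimal sketch of the example above (an illustration only, not
router configuration), the following Python uses the standard ipaddress
module to decide which router keeps each peer route and where router C
sends a packet. The helper names keeper and next_hop_router, and the
representation of the network as a dictionary, are assumptions made for
this sketch.

```python
import ipaddress

# Pilot routes from the example: each router serves one /5 of 2000::/3.
PILOTS = {
    "A": ipaddress.ip_network("2000::/5"),
    "B": ipaddress.ip_network("2800::/5"),
    "C": ipaddress.ip_network("3000::/5"),
    "D": ipaddress.ip_network("3800::/5"),
}

def keeper(route):
    """Return the router whose pilot route covers the given peer route."""
    net = ipaddress.ip_network(route)
    for router, pilot in PILOTS.items():
        if net.subnet_of(pilot):
            return router
    return None  # not covered by any pilot route

def next_hop_router(local_router, destination, kept_routes):
    """Pick the router a packet is handed to inside the network.

    kept_routes maps a router name to the more specific routes it kept.
    A router holding a matching more specific delivers to the peer
    itself; any other router follows the pilot route.
    """
    addr = ipaddress.ip_address(destination)
    for route in kept_routes.get(local_router, []):
        if addr in ipaddress.ip_network(route):
            return local_router
    for router, pilot in PILOTS.items():
        if addr in pilot:
            return router
    return None

# Every router except the keeper filters the route out.
peer_routes = ["2200:abc::/35", "3ffe:def::/35"]
kept = {}
for r in peer_routes:
    kept.setdefault(keeper(r), []).append(r)

print(kept)
print(next_hop_router("C", "2200:abc:1:2::1", kept))
```

Here router A keeps 2200:abc::/35, router D keeps 3ffe:def::/35, and a
packet for 2200:abc:1:2::1 arriving at router C is handed to router A.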

This practice scales relatively well: by adding more routers, it is
possible to accommodate a global routing table of arbitrary size. (These
extra routers must be "border routers" that interconnect with other
networks.) However, there is a major problem: traffic for certain
address ranges must always first be transported to the location of the
router handling this address range. So if two end-users in Europe want
to communicate, but the address range for one of them is handled in
North America by the other's ISP, and the other's address range is
handled by a router in Japan, traffic that could have stayed within the
region has to circle the globe.

This "scenic routing" can be avoided by assigning address space to
multihomers in a geographically aggregatable manner. This way, networks
can have a range of addresses be handled by a router in the region where
the addresses are used. However, this is not a strict requirement. For
instance, a network that only has a presence in the US doesn't
necessarily have to interconnect with other networks in Europe or Asia.
In practice, it will have routers at the US East Coast (where many
European networks are present) handle the European address ranges, and
routers at the US West Coast (where many Asian networks are present)
handle the Asian address ranges.

5 Operational Details

First of all: more specific routes from customers are usually not
filtered. They are announced to peers at all interconnect locations. It
is up to the network receiving the routes to filter them. Only when two
networks agree on where to exchange routing information for certain
geographic aggregates may there be outbound filtering of more specific
multihomed routes.

The aggregation scheme works as follows. The network is divided into
zones. The exact way in which this is done depends on the particular
topology of the network, and doesn't have to match the layout of other
networks. Static pilot routes for all address ranges used within the
zone are configured on at least two routers (for redundancy) in that
zone (or as close to the zone as is practical). Then both EBGP and IBGP
filters are configured per peer. The IBGP filters are applied to all
sessions with routers in other zones (not to sessions with other routers
within the zone) and filter out the more specific routes falling within
the address ranges used in the zone. The EBGP filters do the opposite
and allow only more specific routes for destinations within the zone.
This makes sure more specific multihomed routes are allowed in the
routing table within the zone, but aren't announced over IBGP to other
zones.
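
The two filter decisions described above can be sketched in Python as
follows. This is only an illustration under stated assumptions: the zone
names, the address ranges assigned to them, and the function names are
hypothetical, and only geographically aggregatable more specifics are
modelled (regular provider aggregatable routes are left out).

```python
import ipaddress

# Hypothetical zone layout: the address ranges each zone handles.
ZONE_RANGES = {
    "zone1": [ipaddress.ip_network("2000::/5")],
    "zone2": [ipaddress.ip_network("2800::/5")],
}

def in_zone_ranges(prefix, zone):
    """True if prefix is a more specific of a range handled by zone."""
    net = ipaddress.ip_network(prefix)
    return any(net != rng and net.subnet_of(rng)
               for rng in ZONE_RANGES[zone])

def ebgp_accept(prefix, local_zone):
    """EBGP inbound filter: accept only more specifics for the local zone."""
    return in_zone_ranges(prefix, local_zone)

def ibgp_advertise(prefix, local_zone, neighbor_zone):
    """IBGP outbound filter: keep the zone's more specifics in the zone."""
    if neighbor_zone == local_zone:
        return True  # no filtering between routers of the same zone
    return not in_zone_ranges(prefix, local_zone)

print(ebgp_accept("2200:abc::/48", "zone1"))
print(ibgp_advertise("2200:abc::/48", "zone1", "zone2"))
print(ibgp_advertise("2000::/5", "zone1", "zone2"))
```

In this sketch the pilot route itself (2000::/5) still passes the IBGP
filter towards other zones, while the more specifics it covers do not.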

5.1 Interconnection

Since interconnection is not an exact science, there may not be adequate
interconnection within the zone with some peer networks. When this is
the rule rather than the exception, this indicates the zones are too
small. Enlarging the zone or merging several zones will make sure there
is interconnection with most peer networks within the zone itself. For
the few networks for which interconnection within the zone isn't
possible, EBGP filters that always allow all more specific routes are
used. Also, these routes are tagged with an internal community that
prevents them from being filtered in IBGP. As a result, there is no
aggregation for these peers, but there is still full connectivity. It
should be possible to limit this de-aggregation to a small number of
zones rather than the entire network with more sophisticated filtering.

5.2 Zone Partitioning

It is important that zones are never partitioned, because when this
happens, packets for certain destinations will loop. The router inside
the zone will route them outside the zone because of the more specifics
pointing to the other partition of the zone over a router that isn't
part of the zone, and the first router outside the zone will route the
packets back into the zone to the closest router announcing the pilot
route.
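
The loop can be reproduced with a toy forwarding model. The sketch below
is an assumption-laden illustration: the three routers R1, R2 and R3,
the prefixes, and the forwarding tables are hypothetical, and each
table simply maps a prefix to the next router.

```python
import ipaddress

# Toy topology: R1 and R2 form a zone, but the R1-R2 link is down, so
# the only path between them runs through R3, which is outside the zone.
PILOT = "2000::/5"
SPECIFIC = "2200:abc::/48"   # learned by R2 from an external peer
FIB = {
    "R1": {SPECIFIC: "R3", PILOT: "local"},   # reaches R2 only via R3
    "R2": {SPECIFIC: "peer", PILOT: "local"},
    "R3": {PILOT: "R1"},                      # R1 is the closest pilot
}

def lookup(router, destination):
    """Longest-prefix match in the router's table; returns the next hop."""
    addr = ipaddress.ip_address(destination)
    matches = [(ipaddress.ip_network(p), hop)
               for p, hop in FIB[router].items()
               if addr in ipaddress.ip_network(p)]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

def trace(start, destination, max_hops=8):
    """Follow a packet hop by hop and report delivery or a loop."""
    path = [start]
    while len(path) <= max_hops:
        hop = lookup(path[-1], destination)
        if hop in ("peer", "local"):
            return path, "delivered"
        if hop in path:
            return path + [hop], "loop"
        path.append(hop)
    return path, "hop limit exceeded"

print(trace("R1", "2200:abc::1"))
print(trace("R2", "2200:abc::1"))
```

A packet entering at R1 bounces between R1 and R3, while the same
packet entering at R2 is delivered to the peer.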






5.3 Example Picture

The following picture represents an AS with four routers and eight
peers, divided into two zones that each handle routing for three
regions:

       [S]   [T]               [U]   [V]
        |     |                 |     |
      b E   c E     ZONE 1      E c   E c
      a B   b B     REGIONS     B c   B b
      a G   a G     A, B, C     G b   G a
      v P   v P                 P v   P v
        |     |                 |     |
     +--+-----+-+  <abbccc    +-+-----+--+
     |   RTR 1  +---I-B-G-P---+   RTR 2  |
     +--+------++    cbbaaa>  +-+-----+--+
      C |      |                |     | C
      B I    +-|----I-B-G-P-----+     I B
      A |    | | GFE>     <ABC        | A
      v B    | |                      B v
  ======|====|=|======================|=======
        G ^  | |                    ^ G
        | E  | | CBA>     <EFG      E |
        P F  | +----I-B-G-P-----+   F P
        | G  |                  |   G |
     +--+----+--+  <eefggg    +-+-----+--+
     |   RTR 3  +---I-B-G-P---+   RTR 4  |
     +--+-----+-+    gfffee>  +-+-----+--+
        |     |                 |     |
      ^ E   ^ E                 E ^   E ^
      e B   e B     ZONE 2      B e   B e
      f G   f G     REGIONS     G f   G g
      f P   g P     E, F, G     P g   P g
        |     |                 |     |
       [W]   [X]               [Y]   [Z]

[S], [T], [U], [V],  Peer EBGP routers
[W], [X], [Y], [Z]
RTR 1, RTR 2         Routers in zone 1
RTR 3, RTR 4         Routers in zone 2
A, B, C, E, F, G     Pilot (aggregate) routes
a, b, c, e, f, g     Individual /48 routes for end-user networks
<, >, ^, v           The direction of the routing information flow


6 Migration

Migration from a regular, non-aggregated setup to full geographical
aggregation doesn't have to be immediate. The process can be carried out
in several steps:

1. The border router handling most of the traffic to a specific
   geographical destination or aggregate of several destinations is
   promoted to "designated router" for the matching address range. The
   designated router is configured to announce a pilot route over IBGP
   and with filters that don't allow more specifics for the destinations
   covered by the pilot route to be announced over IBGP to non-border
   routers. Now only border routers have the more specific routes.

2. Border routers are configured with EBGP filters to filter out
   incoming more specific routes covered by pilot routes announced by 
   far away designated routers. (For instance, routers in Europe are 
   configured to filter out American more specifics for which an 
   American router announces a pilot route.) The designated router is 
   configured to no longer send these more specifics over IBGP to the 
   routers that now filter those same routes on EBGP sessions. (For the 
   American routers, their European IBGP neighbors now essentially 
   become part of the group "non-border routers".) Now each border 
   router only has a subset of all multihomed more specifics in its 
   routing table.

Step 1 can be implemented on individual routers one at a time, and,
barring configuration mistakes, doesn't pose any risks. There is only
one pilot route, and only more specific routes announced by the same
router as the pilot route are suppressed. Since both the new pilot route
and the now suppressed more specific routes point to the same border
router, the way packets are routed through the network is completely
identical and there is no risk of loops. If a router other than the
designated router has the preferred external route for a more specific,
this more specific route will be announced as before, since only the
designated router is configured to filter out these more specifics.

When the designated router is the one holding the best external route,
non-border routers won't see any more specific routes for this
destination. The designated router has a filter, and the other border
routers don't announce the route over IBGP because they aren't the ones
holding the best route. To aid aggregation, the designated router can be
configured to increase the IBGP Local Preference attribute for the more
specifics it acts as designated router for. This way, the route over the
designated router is always preferred, even if another router has a
matching more specific with a shorter AS path or better Multi Exit
Discriminator metric.
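
The effect of raising Local Preference can be sketched with a heavily
simplified route comparison. This is not the full BGP decision process:
only Local Preference, AS path length and MED are compared, and the
router names and attribute values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Route:
    via: str            # border router announcing the route over IBGP
    local_pref: int     # higher wins
    as_path_len: int    # shorter wins
    med: int            # lower wins

def best(routes):
    """Simplified decision: Local Preference, then AS path, then MED."""
    return min(routes, key=lambda r: (-r.local_pref, r.as_path_len, r.med))

# Another border router holds a route with a shorter AS path.
other = Route(via="border-2", local_pref=100, as_path_len=2, med=0)
# The designated router's route, before and after raising Local Preference.
plain = Route(via="designated", local_pref=100, as_path_len=4, med=50)
raised = Route(via="designated", local_pref=200, as_path_len=4, med=50)

print(best([other, plain]).via)
print(best([other, raised]).via)
```

With equal Local Preference the shorter AS path via border-2 wins; once
the designated router raises Local Preference, its route is preferred
and the other border router has no reason to announce its own.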

When the designated router becomes unreachable or loses its external
routes, there will be automatic de-aggregation: more specific routes are
announced by other routers.

Step 2 can also be implemented one router at a time. The new EBGP
filters should be installed first, after which the designated router can
be configured to no longer announce more specifics to the border routers
with the new EBGP filters. If this is done the other way around, more
specifics will leak over IBGP and there will be non-optimal routing.
Without step 2, there is no aggregation in border routers: they need to


Van Beijnum                                                       Page 6
draft-van-beijnum-multi6-isp-int-aggr-00.txt                October 2002


hear the designated router announce a "better" more specific, or they
will start to announce their own over IBGP. Introducing step 2
introduces the risk that certain destinations become unreachable when
there is an outage. For instance, when European routers no longer see
American more specifics, and the European and American parts of the
network become partitioned, it is no longer possible for the European
routers to send traffic to American destinations, even if there is
peering in Europe that would have made this possible before. This step
should only be taken if the risk of network partitions is negligible.


7 Address Allocation Requirements

In order for the practices described here to work, a new address
allocation architecture must be implemented. Since several architectures
are possible, and it would be beneficial to share such an architecture
with other proposed IPv6 multihoming solutions, this document doesn't
specify an address allocation architecture, but rather lists the
requirements such an architecture must meet in order to be usable for
geographic aggregation as outlined in this document.

To allow coexistence between regular provider aggregatable address space
which is already extensively used in IPv6 and the addresses assigned in
accordance with any new allocation architecture, it must be possible to
identify the type of address space easily.

The level of aggregation used by network operators will very likely
change radically over the next five, ten or twenty years. At present,
there are only a few networks in IPv6 that can be called "multihomed"
in the IPv4 sense, and fewer than 20000 multihomed networks in IPv4.
When geographic aggregation becomes necessary because of the growth in
the number of multihomed networks, aggregation at the continent level
will probably suffice at first. (Meaning all the more specific routes
for a continent are present in routers throughout this continent.) As
the number of multihomed networks continues to grow, it will become
necessary to aggregate at the country level for small to medium sized
countries (such as those in Europe) or at the state or province level
for large countries (such as the United States, India or China).
Eventually, the aggregation needs will reach the city/metropolitan area
level.

At each stage of aggregation, the number of prefixes that must be listed
to identify a geographic area should be as low as possible.
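
To illustrate why this matters, the sketch below counts how many
prefixes are needed to cover a set of assignments, using
collapse_addresses from Python's standard ipaddress module. The /32
blocks are hypothetical and do not reflect any actual or proposed
allocation.

```python
import ipaddress

def prefixes_needed(assigned):
    """Smallest list of prefixes that exactly covers the assigned blocks."""
    nets = [ipaddress.ip_network(p) for p in assigned]
    return list(ipaddress.collapse_addresses(nets))

# Hypothetical assignments within one geographic area.
contiguous = ["2200:0::/32", "2200:1::/32", "2200:2::/32", "2200:3::/32"]
scattered = ["2200:0::/32", "2200:5::/32", "2200:a::/32", "2200:f::/32"]

print(prefixes_needed(contiguous))
print(len(prefixes_needed(scattered)))
```

The contiguous blocks collapse into the single prefix 2200::/30, so one
filter entry or pilot route identifies the area; the scattered blocks
still need four.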


8 Limitations

Since this scheme depends on geography for aggregation, it only works
well for organizations that connect to the internet in locations that
are close together. An organization with a network spanning multiple
countries and connecting to the internet in all those countries isn't
geographically aggregatable, and neither is an organization connecting


Van Beijnum                                                       Page 7
draft-van-beijnum-multi6-isp-int-aggr-00.txt                October 2002


to ISPs very far way, for instance by means of a satellite circuit.

These types of organizations must choose address space falling within a
geographic area that doesn't (fully) fit if they elect to use the type
of address space this aggregation scheme uses. This choice will have
consequences on routing efficiency, and when the infrastructure changes,
the organization may need to adopt a new address range to minimize the
routing inefficiencies created by the change.


9 Route Visibility for Customers

In order to be able to do traffic engineering for outbound traffic,
multihomed customers need to receive a consistent view of the global
routing table from all their ISPs. If the aggregation levels of
different ISPs used by a multihomed customer don't match, because of the
longest match first rule, most of the traffic will flow over the ISP
doing the least aggregation. To avoid this, ISPs are strongly encouraged
to provide their customers with a full, unaggregated view of the global
routing table. If an ISP aggregates internally, such a view could be
obtained by the customer by having an EBGP (multihop if necessary)
session with one or more route servers, in addition to the regular EBGP
session to the next hop router.
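
The effect of mismatched aggregation levels can be sketched as a
longest-match comparison between the views two ISPs offer. The ISP
names, the prefixes, and the function choose_isp are hypothetical; a tie
is reported as None because other BGP attributes would then decide.

```python
import ipaddress

def choose_isp(destination, views):
    """Return the ISP offering the longest matching prefix, or None.

    views maps an ISP name to the prefixes it announces to the customer.
    """
    addr = ipaddress.ip_address(destination)
    best_len, best_isp = -1, None
    for isp, prefixes in views.items():
        for p in prefixes:
            net = ipaddress.ip_network(p)
            if addr in net:
                if net.prefixlen > best_len:
                    best_len, best_isp = net.prefixlen, isp
                elif net.prefixlen == best_len and isp != best_isp:
                    best_isp = None  # equal length: longest match ties
    return best_isp

# isp1 aggregates; isp2 also passes on the more specific.
mismatched = {
    "isp1": ["2000::/5"],
    "isp2": ["2000::/5", "2200:abc::/48"],
}
# Both ISPs provide the same, unaggregated view.
consistent = {
    "isp1": ["2000::/5", "2200:abc::/48"],
    "isp2": ["2000::/5", "2200:abc::/48"],
}

print(choose_isp("2200:abc::1", mismatched))
print(choose_isp("2200:abc::1", consistent))
```

With mismatched views the more specific always pulls traffic to isp2;
with consistent views longest match no longer decides, so the customer
can again apply its own policy.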

ISPs should also provide their customers with pilot routes at all
aggregation levels, even if the ISPs themselves don't (yet) aggregate.
This makes it possible for customers to filter out more specifics and
still maintain a consistent view of the global routing table. If an ISP
can't do this immediately (adding a large number of pilot routes is a
lot of work) the ISP should establish a time frame for implementing the
necessary pilot routes and communicate this to existing and potential
customers. A reasonable time frame would be six months to implement
continent/country/province/state level pilot routes for the whole world,
a year to implement metropolitan area pilot routes for the regions in
which the ISP is active, and 18 months to implement worldwide
metropolitan area pilot routes, starting from the moment a
geographically aggregatable address allocation mechanism is implemented.


10 Traffic Flow

Larger ISP and ISP-like networks that interconnect with other networks
in more than one location must have a policy on how to select the
interconnect location used for traffic to those other networks. At
present, the most widely adopted policy is "early exit" or "hot potato":
packets are routed to the closest interconnect location where the other
network is present and delivered to the destination network there. As a
result, packets travel most of the way over the destination network. If
both networks use the early exit policy, traffic in one direction will
travel most of the way over one network, and traffic in the other
direction most of the way over the other network, so the policy is
"fair" as long as the traffic volumes are fairly equal in both


Van Beijnum                                                       Page 8
draft-van-beijnum-multi6-isp-int-aggr-00.txt                October 2002


directions. This policy is implemented by not changing the default
behavior for the most widely available BGP implementations.

Since the aggregation scheme described in this document requires traffic
to be transported to a location where more specific routing information
is known, and this location is presumably close to the destination of
the packet, adoption of this scheme leads to a "late exit" routing
policy for multihomed traffic. Assuming early exit is still used for
single homed traffic, there are four possible combinations for the
traffic flow between any two hosts:

1. Hosts A and B both single homed: both early exit = "fair"
2. Host A single homed, host B multihomed: traffic is exchanged close to
   host B = host A's network does most of the work
3. Host A multihomed, host B single homed: traffic is exchanged close to
   host A = host B's network does most of the work
4. Hosts A and B both multihomed: both late exit = "fair"

Since networks can control the level of late exit routing by
(selectively) de-aggregating and many interconnection (peering)
agreements call for equal traffic volumes in both directions, the
potential for changes in the flow of traffic should not adversely affect
existing networks.
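
The four cases listed above can be enumerated with a small sketch. The
function names are hypothetical; the model assumes, as the text does,
early exit towards single homed destinations and late exit towards
multihomed ones.

```python
def exchange_point(dst_multihomed):
    """Where traffic towards a host is handed to the other network.

    Early exit hands over close to the sender; late exit (the result of
    this aggregation scheme) hands over close to a multihomed receiver.
    """
    return "near receiver" if dst_multihomed else "near sender"

def classify(a_multihomed, b_multihomed):
    """Return (A-to-B exchange, B-to-A exchange, verdict)."""
    a_to_b = exchange_point(b_multihomed)
    b_to_a = exchange_point(a_multihomed)
    if a_multihomed == b_multihomed:
        verdict = "fair"
    elif b_multihomed:
        verdict = "host A's network does most of the work"
    else:
        verdict = "host B's network does most of the work"
    return a_to_b, b_to_a, verdict

for a in (False, True):
    for b in (False, True):
        print(a, b, classify(a, b))
```

In the mixed cases both directions are exchanged close to the
multihomed host, so the single homed host's network carries the traffic
most of the way in both directions.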


11 IANA Considerations

The Regional Internet Registries should take the requester's geographic
location into consideration when assigning address space.

If this scheme is adopted, the number of networks requiring an
Autonomous System number will rise beyond what can be accommodated using
the current 16-bit AS number space. There is a draft proposing the use
of 32-bit AS numbers [32bitAS]. Since having a universally recognized AS
number is less important for a multihomed "leaf" network than for a
transit network, it is recommended that the 32-bit AS number capability
be implemented as soon as possible. All multihomed networks requesting
an AS number that are capable of using a 32-bit AS number should be
assigned an AS number higher than 65535, so 16-bit compatible AS numbers
remain available for transit networks.


12 Security Considerations

This aggregation scheme doesn't propose any changes to protocols or
implementations, so it doesn't introduce any new protocol or
implementation risks. However, there is one problem: since routing
information is removed from large parts of the network, it is no longer
possible to use the routing table to do ingress filtering [RFC2267]
using the "unicast RPF" feature implemented by several router vendors.
The alternative, having statically configured filter lists, doesn't
scale. This leaves networks implementing this aggregation scheme with no


Van Beijnum                                                       Page 9
draft-van-beijnum-multi6-isp-int-aggr-00.txt                October 2002


protection against incoming packets with falsified source addresses, so
it is highly recommended that network operators make sure they don't
generate or accept from customers packets with falsified source
addresses and that vendors implement mechanisms to trace back the source
of these falsified packets.
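
Why the routing table can no longer be used for this check can be
sketched with a simplified strict reverse path test. The interface
names, the prefixes, and the tables are hypothetical, and the check is
reduced to a single longest-match lookup; actual vendor implementations
differ.

```python
import ipaddress

def best_route(table, address):
    """Longest-prefix match; table maps prefix -> outgoing interface."""
    addr = ipaddress.ip_address(address)
    matches = [(ipaddress.ip_network(p), iface)
               for p, iface in table.items()
               if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

def strict_rpf_ok(table, source, arrival_interface):
    """Strict unicast RPF: the route back to the source must use the
    interface the packet arrived on."""
    return best_route(table, source) == arrival_interface

# A router outside the zone that handles 2000::/5.
full_table = {"2000::/5": "ibgp-to-zone1", "2200:abc::/48": "peer-X"}
aggregated = {"2000::/5": "ibgp-to-zone1"}   # more specific filtered out

print(strict_rpf_ok(full_table, "2200:abc::1", "peer-X"))
print(strict_rpf_ok(aggregated, "2200:abc::1", "peer-X"))
```

With the more specific present, a legitimate packet arriving from the
peer passes the check; once only the pilot route remains, the same
packet fails, so the check can no longer tell legitimate sources from
falsified ones.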


13 Document and Author Information

This document expires April, 2003. The latest version of this document
will always be available at http://www.muada.com/drafts/. Comments are
welcome at:

Iljitsch van Beijnum
Karel Roosstraat 95
2571 BG  Den Haag
Netherlands

Email: iljitsch@muada.com


14 References

[RFC2267] RFC 2267, "Network Ingress Filtering: Defeating Denial of
          Service Attacks which employ IP Source Address Spoofing"

[RFC2772] RFC 2772, "6Bone Backbone Routing Guidelines"

[32bitAS] "BGP support for four-octet AS number space", work in progress

[GAPI] "A Geographically Aggregatable Provider Independent Address Space 
       to Support Multihoming in IPv6", work in progress

[MHAP] MHAP draft, work in progress



















