Internet Engineering Task Force Olivier Bonaventure
INTERNET DRAFT Steve Uhlig
Bruno Quoitin
UCL
July, 2004
The case for more versatile BGP Route Reflectors
<draft-bonaventure-bgp-route-reflectors-00.txt>
Status of this Memo
By submitting this Internet-Draft, we certify that any applicable
patent or other IPR claims of which we are aware have been
disclosed, and any of which we become aware will be disclosed, in
accordance with RFC 3668.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 31, 2004.
Copyright Notice
Copyright (C) The Internet Society (2004). All Rights Reserved.
Abstract
The Border Gateway Protocol (BGP) is the standard interdomain
routing protocol in the Internet. Inside an Autonomous System (AS),
the interdomain routes are often distributed by using BGP Route
Reflectors (RR). Today, most RRs are simple BGP routers. We show that
by adding intelligence to the RR, it is possible to improve both the
Bonaventure/Uhlig/Quoitin [Page 1]
draft-bonaventure-bgp-route-reflectors-00.txt July 2004
routing and the packet forwarding in ASes. We show how a versatile RR
can help an AS to engineer the flow of its incoming or outgoing
interdomain traffic. We also discuss how a versatile RR could help to
reduce the BGP convergence time or reduce the size of the routing
tables when providing BGP/MPLS VPN services.
1 Introduction
The Border Gateway Protocol (BGP) [1] is used today by more than
16,000 Autonomous Systems (ASes) to exchange their interdomain routes.
The stability and performance of BGP are key factors for the
stability and performance of the global Internet. Although BGP
suffers from slow convergence in case of failure and some BGP
routers tend to transmit too many routing messages, recent studies
have shown that BGP routing is stable [2, 3], at least when
considering the routes towards destinations receiving lots of
packets. BGP is also used inside many ISPs to distribute several
other types of information such as BGP/MPLS VPN routes [4] or flow
specifications [5].
When used for interdomain routing, BGP relies on two types of
sessions that are established over TCP connections. Two BGP routers
from different domains connected with a physical link will use an
eBGP session to exchange their interdomain routes. The interdomain
routes received by the border routers of an AS need to be propagated
through the AS. This is usually done by relying on iBGP sessions. The
initial BGP specification assumed that a full-mesh of iBGP sessions
would be established inside each AS to distribute the interdomain
routes. A consequence of this full-mesh of iBGP sessions is that a
BGP router will not distribute over an iBGP session a route received
over another iBGP session. However, this full-mesh quickly appeared
unscalable since an AS with N routers needs to support N*(N-1)/2
iBGP sessions.
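The quadratic growth of the full mesh is easy to check numerically.
The short sketch below only evaluates the N*(N-1)/2 formula given
above; the function name and the sample router counts are
illustrative and do not come from any measurement.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions needed for a full mesh of n routers."""
    return n * (n - 1) // 2


# Sample AS sizes chosen for illustration only.
for routers in (10, 100, 500):
    print(routers, "routers ->", full_mesh_sessions(routers), "sessions")
```

With 500 routers the full mesh already requires 124750 sessions,
which illustrates why it does not scale.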
Two solutions have been proposed to solve this scaling problem.
With the confederation approach, each AS is divided into smaller
sub-ASes, each containing a few tens of routers. Inside each sub-AS,
a full mesh of iBGP sessions is established between the routers of
the sub-AS and special eBGP sessions are used between routers of
different sub-ASes. A second approach, which, based on discussions
with ISP operators, appears to be more often used by large ASes, is
to rely on BGP Route Reflectors (RR) [6]. A RR is a special BGP
router which is allowed to redistribute over iBGP sessions routes
that it has received over other iBGP sessions. A RR has two types of
iBGP peers: its client-peers and its non-client peers. The
non-client peers are usually other RRs. A RR will receive routes from
all its iBGP peers and will use its BGP decision process and its IGP
table to determine the best routes to reach each destination. If the
best route was received on an iBGP session with a client peer, it
will be advertised to all the iBGP peers. On the other hand, if the
route was received from a non-client peer it will only be advertised
to client-peers.
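The reflection rule described above can be sketched in a few lines.
This is a minimal illustration, not an implementation of [6]: the
function name and the peer names are hypothetical, and we assume
that the route is not sent back to the peer it was learned from.

```python
def reflect_targets(sender, clients, non_clients):
    """Return the iBGP peers to which a RR re-advertises a best route
    that it learned from the iBGP peer `sender`."""
    clients = set(clients)
    non_clients = set(non_clients)
    if sender in clients:
        # Learned from a client peer: advertise to all iBGP peers.
        targets = clients | non_clients
    elif sender in non_clients:
        # Learned from a non-client peer: advertise to client peers only.
        targets = set(clients)
    else:
        raise ValueError(f"{sender} is not an iBGP peer of this RR")
    # Assumption: the route is not echoed back to its sender.
    return targets - {sender}


# Hypothetical RR with two clients and one non-client peer.
CLIENTS = {"R1", "R2"}
NON_CLIENTS = {"RR2"}
print(sorted(reflect_targets("R1", CLIENTS, NON_CLIENTS)))
print(sorted(reflect_targets("RR2", CLIENTS, NON_CLIENTS)))
```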
The number of RRs in an IP network is much smaller than the number
of routers [7]. A network with several tens of routers would
typically have one (or two for redundancy reasons) RR. Larger
networks with several hundred routers in various countries may use
up to a few tens of RRs connected in a full mesh or with a RR
hierarchy.
Discussions with ISPs indicate that there are three ways to deploy
RRs. The first solution is to place the RR function on existing
backbone routers. In this case, the router needs to have enough CPU
and memory capabilities to support the RR function while handling its
normal load. Another approach is to use a dedicated router that
does not forward IP packets but is equipped with a good CPU and large
memory. Finally, smaller ASes sometimes rely on PCs or workstations
running open-source RRs.
In most deployments of RRs today, the goal is often to minimise the
CPU load on the RR. RRs are often only considered as a way to solve
the iBGP distribution problem. In this paper, we assume that the RR
service is provided by a carrier-class workstation or a cluster of
workstations where the CPU and the memory are not as limited as on
current routers.
We show in this article that by correctly exploiting the knowledge
of the RR, it is possible to provide new services both inside and
between ASes. We discuss several examples that could each lead to
entire papers on the topic. We show in section 2 that a more
intelligent RR could avoid the forwarding loops that may occur with a
badly placed current RR. Then, in section 3 we show how a versatile
RR could allow a transit AS to efficiently engineer its interdomain
traffic. Finally, in section 4, we discuss the role that versatile
RRs could play in ASes using MPLS to support VPN services or
interdomain LSPs.
2 Limitations of current RR
The currently deployed RRs advertise their own best route to each
of their client peers. This allows the RR to compute a single best
route, but this creates several problems. The first problem is that
routing and even forwarding loops can occur when RRs are used. Several
of those problems have been described in the literature [8] and
reported in real networks [9].
As an example, consider the topology shown in Figure 1 based on
[8]. The arrows show the BGP sessions. The IGP weight of each
physical link is also shown. In this network, RX and RY advertise the
prefix P. The two RRs prefer the route learned via eBGP and
advertise it to their client. If R1 receives a packet destined to P,
its BGP table forces it to send it to RR1. However, the IGP topology
will cause the packet to be sent to R2. R2's BGP table forces it to
send the packet to RR2, but to reach this nexthop, R2 will send the
packet via R1 ...
<See PostScript version of this document>
Figure 1: Simple network topology with a forwarding loop
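Since the figure is not available in this text version, the
following sketch reproduces the kind of loop described above on a
hypothetical topology. The link weights are our own assumptions and
are not taken from Figure 1; they are only chosen so that the IGP
shortest path from R1 to RR1 passes via R2, and the shortest path
from R2 to RR2 passes via R1. The helper names are illustrative.

```python
import heapq

# Hypothetical IGP weights (NOT the weights of Figure 1).
LINKS = {
    ("R1", "R2"): 1,
    ("R1", "RR2"): 1,
    ("R2", "RR1"): 1,
    ("R1", "RR1"): 5,
    ("R2", "RR2"): 5,
}
# BGP nexthop selected by each client, as imposed by its own RR.
BGP_NEXT_HOP = {"R1": "RR1", "R2": "RR2"}


def build_graph(links):
    """Turn the undirected link list into an adjacency map."""
    graph = {}
    for (a, b), weight in links.items():
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def first_hop(graph, src, dst):
    """Dijkstra: neighbour of src on a shortest IGP path to dst."""
    dist = {src: 0}
    heap = [(0, src, None)]  # (distance, node, first hop used from src)
    done = set()
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == dst:
            return hop
        for nbr, weight in graph[node].items():
            nd = d + weight
            if nbr not in dist or nd < dist[nbr]:
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr, nbr if hop is None else hop))
    return None


def forward(graph, bgp_next_hop, start, max_hops=8):
    """Follow hop-by-hop forwarding: each client forwards the packet
    along its IGP shortest path towards its own BGP nexthop."""
    path = [start]
    node = start
    while node in bgp_next_hop and len(path) <= max_hops:
        node = first_hop(graph, node, bgp_next_hop[node])
        if node in path:
            return path + [node], True  # same router visited twice
        path.append(node)
    return path, False


path, loop = forward(build_graph(LINKS), BGP_NEXT_HOP, "R1")
print(path, "loop" if loop else "no loop")
```

On this assumed topology, a packet entering at R1 is forwarded to R2
and then back to R1, which is the behaviour described in the text.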
Extensions to BGP [10] have been proposed to solve this problem, but
they are neither implemented nor deployed. The current solution is to
apply guidelines when designing iBGP topologies [7, 11, 12]. Those
guidelines impose restrictions on the graph of the iBGP sessions.
Those restrictions depend on the IGP topology and the location of the
RR. In practice, the IGP topology changes frequently as links or
routers fail or are added to the network or when traffic engineering
tools are used to engineer the intradomain traffic by setting the IGP
weights [13]. Ensuring that the guidelines are preserved after each
IGP change is not an easy task.
If the CPU of the RR is not a severe bottleneck, a solution to
avoid the routing and forwarding loops induced by RRs would be to
change the behaviour of the RR. Instead of computing its own best
route which is then distributed to all its clients, a RR could
compute the best route that would be computed by each client if it
had the same BGP table as the RR. Since one step of the BGP decision
process uses the IGP distance between the router and the nexthop
contained in each BGP route, the RR would need to know the IGP
distance between each of its clients and each BGP nexthop. This
information can be obtained by computing the IGP table of each
client or by defining a new protocol to allow a client to report this
information to its RR [14]. If the RR recomputes the IGP tables of
its clients, they need to be updated after each IGP change. Several
algorithms [15] have been proposed to incrementally update the
routing table of a router after a topology change. There are also
incremental versions of the all-pairs shortest paths algorithms
[16]. Based on those algorithms, it should be possible to build
incremental algorithms to determine the BGP updates to be sent to
the clients of a RR after a BGP or an IGP change.
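A minimal sketch of this per-client computation is given below. It
assumes a strongly simplified decision process (local preference,
AS-Path length, then IGP distance to the nexthop) and omits the
other steps of the real BGP decision process. The Route fields, the
function name and the IGP distances are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    local_pref: int = 100
    as_path_len: int = 1


def best_route_for(client, routes, igp_dist):
    """Route that `client` would select if it had the RR's BGP table.

    Simplified ranking: highest local-pref, shortest AS-Path, lowest
    IGP distance from the client to the nexthop, then nexthop name as
    a deterministic tie-break.
    """
    return min(
        routes,
        key=lambda r: (-r.local_pref, r.as_path_len,
                       igp_dist[client][r.next_hop], r.next_hop),
    )


# Hypothetical IGP distances from each client to each BGP nexthop.
IGP_DIST = {
    "R1": {"RR1": 2, "RR2": 1},
    "R2": {"RR1": 1, "RR2": 2},
}
ROUTES = [Route("P", "RR1"), Route("P", "RR2")]

for client in ("R1", "R2"):
    print(client, "->", best_route_for(client, ROUTES, IGP_DIST).next_hop)
```

With these assumed distances, R1 receives the route via RR2 and R2
the route via RR1, so each client is given the route that matches
its own position in the IGP topology.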
Another issue with RR is the convergence time in case of failure.
Consider again the network topology shown in Figure 1. Assume that
the bottom AS is a provider advertising prefix P at both RX and RY.
Assume that the forwarding loop problem mentioned above has been
solved by forcing each RR to compute the best route for each client.
In this case, R1 sends its packets to P via RR2. If the link RR2-RY
fails, RR2 would withdraw its route to P on its iBGP session with
RR1. RR1 would then send a new route to R1. Now suppose that, instead
of a single prefix, 100,000 routes used by R1 pass via RR2. Then, when
the RR2-RY eBGP session fails, RR2 needs to withdraw 100,000 routes
on the RR2-RR1 iBGP session. RR1 would then need to update the
100,000 routes on the RR1-R1 iBGP session. This could take several
seconds or more depending on the performance of the RR. If instead a
full-mesh of iBGP sessions was used in this network, R1 would have
received all the eBGP routes learned by RR2 and RR1. When the failure
of link RR2-RY is reported by the IGP, R1 could consider all the
routes via RR2 as unreachable and could switch to the routes learned
from RR1. In large ISPs with a hierarchy of RRs, the impact of the
RRs on the BGP convergence time may be even larger.
With today's stringent SLAs, there is a clear need to reduce the
convergence time in case of failures. A versatile RR could help to
reduce it by advertising several routes to its clients. Knowing the
IGP table of each of its clients, the RR can easily determine the
best BGP route, but also the second route that it would select if
the first becomes unreachable. By using the BGP extensions proposed in
[10], the RR could advertise the best and the second route to each
client. This would ensure that the client can quickly switch to a new
route when the primary one becomes unreachable. This solution could
probably be even more useful in networks providing RFC2547 BGP/MPLS
VPN services given their tight SLA constraints.
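The idea can be sketched as follows, assuming that the candidate
routes are equal on all the steps of the decision process that
precede the IGP distance. The function names and the IGP distances
are hypothetical, and the encoding of the two routes in BGP messages
(for example with [10]) is not modelled.

```python
def ranked_next_hops(client, next_hops, igp_dist, keep=2):
    """Nexthops that the RR would advertise to `client`, best first."""
    ranked = sorted(next_hops, key=lambda nh: (igp_dist[client][nh], nh))
    return ranked[:keep]


def active_next_hop(advertised, unreachable):
    """Client side: first advertised nexthop that the IGP still reaches."""
    for next_hop in advertised:
        if next_hop not in unreachable:
            return next_hop
    return None


# Hypothetical IGP distances from client R1 to the candidate nexthops.
IGP_DIST = {"R1": {"RR1": 2, "RR2": 1}}

advertised = ranked_next_hops("R1", ["RR1", "RR2"], IGP_DIST)
print(advertised)
print(active_next_hop(advertised, unreachable=set()))
print(active_next_hop(advertised, unreachable={"RR2"}))
```

Because the second route is already installed, the client can switch
locally as soon as the IGP reports the failure, without waiting for
new iBGP messages from its RR.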
3 RR-assisted traffic engineering
Another important problem in the global Internet is the need to
perform traffic engineering. Several solutions to engineer the flow
of the IP packets in the network are used. Some tune the intradomain
traffic by setting the IGP weights [13] or establishing MPLS LSPs
[17]. Another problem is to engineer the flow of the inter-domain
traffic. As mentioned in [17], ``inter-domain Internet traffic
engineering is crucial to the performance enhancement of the global
Internet infrastructure.'' However, inter-domain traffic engineering
today often relies on tweaking the configurations of the routers
[18, 19] and is often more an art than a science.
3.1 Reference environment
To perform traffic engineering, a RR needs two types of
information. Traffic statistics constitute the first type of
information. For intradomain traffic engineering, those statistics
are collected as POP-POP or router-router traffic matrices. For
interdomain traffic engineering purposes, more precise statistics are
required at the granularity of the BGP routes. However, in practice,
accurate statistics for each route are not required [20]. Studies of
the traffic characteristics in different networks [18, 19] have shown
that a small number of prefixes are responsible for most of the
traffic. We assume in this paper that per-BGP route volume statistics
are maintained by the border routers and sent to the RR for those
heavy prefixes.
The second type of information required to engineer the flow of the
interdomain traffic consists of the routing tables of the border
routers. A RR is ideally placed to obtain this information since it
already collects the BGP routes and participates in the IGP.
Figure 2 provides the typical network environment for RR-based
traffic engineering. At regular time intervals or when some routing
change occurs, the RR computes the best route that each ingress BGP
router should use to reach a particular destination prefix outside
the AS.
<See PostScript version of this document>
Figure 2: Typical configuration of traffic engineering RR.
When considering interdomain traffic engineering, we need to
distinguish between the control of the outbound traffic and the
control of the inbound traffic.
3.2 Outbound interdomain traffic engineering
Let us first consider the case of a stub AS that needs to engineer
its outgoing traffic to a few transit providers. In principle, this
engineering is simple since the network operator can define filters
on all its border routers to prefer some upstream provider for some
prefixes. However, the size of the BGP routing tables (more than
140,000 routes today) makes the search for the ideal configuration
difficult [21].
Furthermore, the traffic pattern changes regularly [3] and thus a
perfect configuration at time t may become inconvenient at time t+1.
In [22, 23], we have shown that by using intelligent route
reflectors, it is possible to engineer the flow of the outbound
interdomain traffic even when the traffic patterns change with time.
The principles of the solution described in [22, 23] can be
summarised as follows. First, the RR collects traffic statistics
regularly as explained above. Second, the RR receives all the routes
from the stub's providers. This can be obtained by establishing
multi-hop eBGP sessions between the RR and the border router of each
provider. Another solution is the BGP extension proposed in [10] to
force the stub's border routers to advertise all their routes and not
only their best routes. To control the flow of the outgoing traffic,
the RR simply has to control the iBGP advertisements that it sends to
the stub's border routers.
Based on this routing and traffic information, the RR regularly
runs an evolutionary algorithm. This algorithm can be configured
with different objective functions such as balancing the traffic
among providers or reducing the total cost based on the billing used.
To fulfil the objective function, the evolutionary algorithm will
select, from time to time, a few prefixes to be moved from one
provider to another. We have shown in [23] that load balancing was
possible with only a few iBGP messages per minute, an iBGP load much
lower than the normal load of BGP messages in the global Internet.
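To make the idea of moving a few prefixes concrete, the sketch below
uses a simple greedy heuristic. It is deliberately not the
evolutionary algorithm of [22, 23]; it only illustrates how a small
number of moves can balance the load. We assume that every provider
offers a route towards every prefix. The prefix names, the provider
names and the traffic volumes are hypothetical.

```python
def rebalance(assignment, volume, providers, max_moves=3):
    """Greedy load balancing: at each step, move one prefix from the
    most loaded provider to the least loaded one, choosing the prefix
    that leaves the smallest remaining gap between the two."""
    assignment = dict(assignment)
    moves = []
    for _ in range(max_moves):
        load = {p: 0 for p in providers}
        for prefix, provider in assignment.items():
            load[provider] += volume[prefix]
        heavy = max(providers, key=lambda p: load[p])
        light = min(providers, key=lambda p: load[p])
        gap = load[heavy] - load[light]
        # Moving a volume v turns the gap into |gap - 2v|, so only
        # prefixes with 0 < v < gap reduce the imbalance.
        candidates = [x for x, p in assignment.items()
                      if p == heavy and 0 < volume[x] < gap]
        if not candidates:
            break
        chosen = min(candidates, key=lambda x: (abs(gap - 2 * volume[x]), x))
        assignment[chosen] = light
        moves.append((chosen, heavy, light))
    return assignment, moves


# Hypothetical traffic volumes for the heavy prefixes.
VOLUME = {"p1": 50, "p2": 30, "p3": 15, "p4": 5}
CURRENT = {"p1": "A", "p2": "A", "p3": "A", "p4": "B"}

new_assignment, moves = rebalance(CURRENT, VOLUME, ["A", "B"])
print(moves)
print(new_assignment)
```

Each move corresponds to a change in the iBGP advertisements sent by
the RR to the border routers. In this example two moves are enough
to reach an even split between the two providers.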
The RR-based traffic engineering method described above is also
applicable for transit ASes. A more detailed description of this
approach may be found in [24, 25].
In a stub AS, changing iBGP advertisements is possible since the
impact of those advertisements is limited to the stub AS. In a
transit AS such as GEANT, an iBGP change can sometimes lead to eBGP
changes that could force peers to change their best BGP route. To
prevent such BGP route changes from generating instabilities in the
rest of the Internet, aggregation could be used by the local AS so
that changes in the egress point within the AS do not impact customer
ASes. Figure 3 illustrates the use of aggregation to prevent
frequent BGP route changes that do not impact the actual path
followed by IP packets for the ASes upstream from the flow of the
traffic. Suppose that the route reflector RR decides to change the
egress point to reach the external prefix A.B.C.D/Y, from egress E1
to egress E2 for ingress point I1. Under normal conditions, whenever
ingress I1 changes its best BGP route to reach prefix A.B.C.D/Y, a
new BGP route must be advertised by I1 to its external BGP peers. To
avoid having I1 advertise a new BGP route every time its best egress
point changes, I1 could advertise to its upstream customers an
aggregated AS path. This AS
path would contain the set of ASes present in the two BGP routes
that could be used by I1 to reach the destination prefix A.B.C.D/Y,
as illustrated by Figure 3. In practice, the RR does not need to
aggregate the AS-Paths of all the possible routes to a destination,
only the routes that it could select with its modified decision
process. Often, the best routes for a given destination will be
learned from the same peer over different peering sessions. In this
case, the aggregation is trivial since all the routes have the same
AS-Path.
<See PostScript version of this document>
Figure 3: AS path aggregation by RR.
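One possible way to build such an aggregated AS path is sketched
below. We assume, as in standard BGP aggregation, that the ASes
common to the beginning of all the paths are kept as an ordered
sequence and that the remaining ASes are merged into an unordered
set. The function name and the AS numbers are hypothetical.

```python
def aggregate_as_paths(paths):
    """Return (sequence, as_set) for a list of AS paths.

    `sequence` holds the leading ASes shared by every path, in order.
    `as_set` holds all the other ASes that appear in any of the paths.
    """
    common = []
    for hops in zip(*paths):
        if len(set(hops)) != 1:
            break
        common.append(hops[0])
    as_set = set()
    for path in paths:
        as_set.update(path[len(common):])
    return common, as_set


# Hypothetical AS paths of the routes via egress E1 and egress E2.
VIA_E1 = [65010, 65020, 65040]
VIA_E2 = [65010, 65030, 65040]

print(aggregate_as_paths([VIA_E1, VIA_E2]))
# Identical paths (same peer, different sessions): nothing to merge.
print(aggregate_as_paths([VIA_E1, VIA_E1]))
```

When all the candidate routes carry the same AS-Path, the set is
empty and the aggregate is simply that AS-Path.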
The solution described above could be extended to larger transit
ASes that contain more than one (pair of) RR. This would require the
definition of protocols that allow RRs to exchange routing information
and traffic statistics, and of a coordination mechanism between the
RRs. For instance, one could choose that each RR is responsible for the
ingress routers it has an iBGP session with. Each RR would then
compute the best route for its ingress routers towards each
destination prefix and send them to these ingress routers. For this
solution to be scalable in terms of the BGP advertisements, each RR
would advertise aggregated AS paths to all other RRs of the domain.
3.3 Inbound Interdomain Traffic Engineering
Engineering a domain's incoming traffic with BGP is a difficult
task [19, 26, 18]. Indeed engineering the incoming traffic of one
domain requires the ability to influence how distant domains will
select the route that they use to send packets towards the domain.
Different techniques exist: announcing more specific prefixes, making
selective announcements, prepending the AS-Path and using
redistribution communities [19, 27]. However, these methods suffer
from several drawbacks. The first two methods increase the size of
the BGP routing tables of all routers. AS-Path prepending, while
being a widely used method, is known to be coarse and unpredictable.
Finally, the redistribution communities are difficult to set up due to
the combinatorial explosion of possibilities and the inaccurate view
of the topology and policies one has from a single domain's point of
view [27].
In this section, we show that a more deterministic approach to
engineering the flow of the incoming traffic is possible. Our method
relies on a cooperation between the source and destination domains
and results in the establishment of interdomain tunnels. A
destination domain willing to control how it is reached by a source
domain requests the source to establish a tunnel to one of its border
routers. The tunnel is then used by the source to forward the packets
destined to the destination domain. In this way, the packets sent by
the given source enter the destination's network through the desired
access link.
To explain our approach, let us consider the example topology
shown in Figure 4. AS1 is a stub that wants to control how it is
reached by source AS2. On the figure, we can see that there exist
multiple interdomain paths between AS2 and AS1. With normal BGP
routing, the packets from AS2 reach AS1 via router RD1.
<See PostScript version of this document>
Figure 4: Inbound TE using tunnels.
Assume that to reduce the delay or balance its incoming traffic,
AS1 wishes to receive the packets sent by AS2 via ProviderB and thus
router RD2. For this, AS1 will request AS2 to establish a tunnel
with destination RD2 to reach all its prefixes. To this end, we
propose that a route-reflector RD1 inside AS1 establishes an eBGP session
with a route-reflector, RS3, in the source domain AS2. This multi-hop
eBGP session could be established manually as in the case of peering
links or more dynamically. To allow a dynamic establishment of those
sessions, AS2 must advertise the address of its route-reflector that
needs to be contacted. This address can be encoded as an extended
community value attached to the route(s) advertised by AS2. To avoid
security issues, the multi-hop eBGP session should be established
over an IPSec tunnel that provides authentication, data integrity
and anti-replay. Moreover, BGP extensions such as S-BGP [28] or
soBGP [29] should be used to check the validity of the prefixes
advertised by RD1.
The destination domain will typically advertise its own prefixes
over the multi-hop eBGP session and the source domain will not
advertise any prefix. Each BGP advertisement will also contain a
flexible community value [30] indicating the tunnel endpoint in the
destination domain (RD2 in our example), the type of tunnel to be
used (L2TP, GRE, ...) and possible tunnel parameters such as cookies
or identifiers. Instead of using flexible communities, another
possibility would be to use MP_BGP and to carry tunnel related
information in the MP_REACH_NLRI and a tunnel-SAFI as proposed in
[31]. By using as tunnel end-point the IP address of RD2 on the link
with ProviderB in Figure 4, the destination domain can control the
ingress link over which the packets will arrive provided that this
address is advertised by ProviderB.
When RS3 has received the route towards the network of AS1 over the
multi-hop eBGP session, it will select which router(s) will establish
the requested tunnel(s) towards the tunnel end-point. It must also
update the routes that it distributes with iBGP inside AS2 to ensure
that the packets towards AS1 will be forwarded to the tunnel head-
end in AS2. Prior to establishing one tunnel towards AS1, RS3 needs
to check that the tunnel end-point is reachable by verifying that it
has received at least one BGP route to reach it. Depending on the
connectivity of AS2, RS3 may choose to establish one or several
tunnels to reach the endpoint. Since RS3 is a route-reflector, it has
the most complete knowledge of the available routes towards the
tunnel endpoint. RS3 will typically select AS2's best egress router
to reach the endpoint as the head-end of the tunnel. Note that the
selection may depend on other criteria such as the availability of
special hardware to perform the required encapsulation on the
routers. In order to ask a client to establish a tunnel towards RD2,
RS3 sends to this client an iBGP update containing the tunnel
attributes. Upon reception of this update, the client establishes the
tunnel. Once the tunnel is up and running, the client updates its
routing table and sends iBGP advertisements to announce the new route
in AS2.
In the case of a stub source domain, the above procedure will only
cause iBGP changes. In contrast, if the source domain is a
transit AS, the new routes using the tunnel could be advertised
outside the domain. In this case, the BGP updates that are
advertised outside the source domain AS2 should have an AS-Path that
is composed first of AS2 itself, followed by the AS-Path of the route
followed by the tunnel and finally, the destination domain AS1. Using
such a path is necessary to allow BGP to continue to detect loops by
using the AS-Path attribute. Indeed, without this AS-Path, a transit
domain ASX could select AS2 as its next hop to reach AS1 while the
tunnel used by AS2 passes through ASX. The traffic would then pass
twice through the same domain, which would be a waste of resources.
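The construction of this AS-Path and the resulting loop detection
can be sketched as follows. The AS numbers and the function names are
hypothetical. We also assume that the destination AS is not
appended a second time when the route towards the tunnel end-point
already ends with it.

```python
def tunnel_route_as_path(local_as, path_to_endpoint, destination_as):
    """AS-Path advertised by the source domain for a route that uses
    the tunnel: local AS, then the AS-Path of the route followed by
    the tunnel, then the destination AS."""
    path = [local_as] + list(path_to_endpoint)
    if path[-1] != destination_as:  # assumption: avoid a duplicate
        path.append(destination_as)
    return path


def accepts(receiver_as, as_path):
    """Usual BGP loop detection: a router rejects any route whose
    AS-Path already contains its own AS."""
    return receiver_as not in as_path


# Hypothetical AS numbers standing for AS1, AS2, ASX and ProviderB.
AS1, AS2, ASX, PROVIDER_B = 65001, 65002, 65100, 65200

# The tunnel from AS2 to its end-point crosses ASX and ProviderB.
path = tunnel_route_as_path(AS2, [ASX, PROVIDER_B], AS1)
print(path)
print(accepts(ASX, path))        # ASX sees itself and rejects the route
print(accepts(ASX, [AS2, AS1]))  # a short path would have been accepted
```

With the short path [AS2, AS1], ASX would accept the route and its
traffic towards AS1 would cross ASX a second time inside the tunnel.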
IP tunnels such as GRE or L2TP have often been criticised
because of the cost of encapsulating/decapsulating packets and the
risk of fragmentation. The first problem is no longer an issue
since several vendors offer interfaces supporting
encapsulation/decapsulation at line rate. With Packet over SONET/SDH
links, the MTU is less of a problem given the available frame size.
Furthermore, Path MTU discovery is widely deployed and used by almost
all end systems today. Compared to other proposals such as
[26], the solution described above can be used without deploying new
protocols in the transit domains. For example, universities or
research networks could use it to control high-bandwidth flows.
4 Route Reflectors and MPLS
Many large ISPs are currently using MPLS to provide BGP/MPLS VPN
services to their corporate customers [4]. Today, those services are
often provided within a single AS. Three types of routers are usually
distinguished in a network providing BGP/MPLS VPN services. A CE
router is a router owned and maintained by a customer. A PE router is
a router maintained by the network provider and directly attached to
a CE router. A PE router will usually learn the routes reachable via
each of its attached CE routers through a special IGP or BGP session
[4]. To isolate all the different VPNs, a PE router will maintain
one VPN Routing and Forwarding table (VRF) for each supported VPN.
BGP is used by the PE routers to distribute the content of their VRFs
to other PEs that are attached to the same VPN customers. The
forwarding of VPN packets from one PE to another relies on the
utilisation of MPLS, GRE or IPSec tunnels. Thanks to the utilisation
of those tunnels, the core routers, also called P routers, do not
need to maintain per-VPN VRFs. Since BGP is used to distribute the
VPN routes inside the network, RRs are often used to scale the iBGP
full-mesh between the PE routers.
Thanks to the routeviews and RIPE RIS projects, the behaviour of
BGP in the global Internet has received a lot of attention and BGP is
better known than a few years ago. Despite that, few studies have
analysed BGP/MPLS VPNs. A recent study [32] revealed that the
behaviour of BGP is very different when considering VPN services than
when considering the global Internet.
A first difference is the size of the routing tables. In the global
Internet, few routes are more specific than /24 and the BGP routes
are very stable. In the BGP/MPLS network analysed in [32], the
situation is completely different. First, the BGP/MPLS routing table
is already larger than the Internet BGP routing table and is growing
quickly. Second, the BGP/MPLS VPN routing table contains much more
specific prefixes than the BGP Internet routing table. Figure 5,
based on [32], compares the percentage of routes for the most common
prefix lengths. This figure shows that 55% of the Internet routes
are for /24 prefixes and other common prefix sizes are /16 to /23.
In the BGP/MPLS VPN routing tables, 38% of the routes have a /32
IPv4 prefix as destination and 9% correspond to a /30 prefix. A
first consequence of those specific routes is that in the network
studied in [32], the BGP/MPLS VPN routing table in the RR is already
larger than the BGP Internet routing table and the BGP/MPLS routing
table. Another consequence is that the BGP/MPLS VPN routes are less
stable and the BGP messages are much more frequent in the BGP/MPLS
network [32].
<See PostScript version of this document>
Figure 5: Prefix distribution in the BGP Internet and BGP/MPLS VPN
routing tables
The size of the BGP/MPLS routing tables will force operators to
utilise route aggregation mechanisms for the BGP/MPLS VPNs. The
default BGP aggregation [1] is able to aggregate routes for
contiguous prefixes coming from different ASes in a single
advertisement. This technique could be applied by the customers on
the CE routers. However, a CE router could only aggregate its local
routes. A versatile RR, receiving VPN routes from several PE routers,
could perform a better aggregation by considering all the routes
inside each VPN. Given the volatility of some BGP/MPLS routes, the
RR would need to be able to change an aggregate dynamically after an
event in a customer network.
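As an illustration, the sketch below aggregates the contiguous
prefixes of each VPN with the standard Python ipaddress module. We
assume that routes are only merged when they share the same egress
PE, so that an aggregate keeps a single nexthop; this grouping is our
own choice. Route distinguishers and labels are not modelled, and
the VPN names, PE names and prefixes are hypothetical.

```python
import ipaddress
from collections import defaultdict


def aggregate_vpn_routes(routes):
    """Aggregate VPN routes given as (vpn, prefix, egress_pe) tuples.

    Returns a map from (vpn, egress_pe) to the list of aggregated
    prefixes, in which contiguous prefixes have been merged.
    """
    groups = defaultdict(list)
    for vpn, prefix, pe in routes:
        groups[(vpn, pe)].append(ipaddress.ip_network(prefix))
    return {
        key: [str(net) for net in ipaddress.collapse_addresses(nets)]
        for key, nets in groups.items()
    }


# Hypothetical VPN routes, dominated by very specific prefixes.
ROUTES = [
    ("vpnA", "10.1.0.0/32", "PE1"),
    ("vpnA", "10.1.0.1/32", "PE1"),
    ("vpnA", "10.1.0.2/31", "PE1"),
    ("vpnA", "10.1.0.4/30", "PE2"),
    ("vpnB", "10.1.0.0/32", "PE1"),
]

AGGREGATES = aggregate_vpn_routes(ROUTES)
for key, prefixes in AGGREGATES.items():
    print(key, prefixes)
```

Re-running the aggregation after each change in a customer network
is the simplest way to keep the aggregates up to date; an
incremental version would be preferable for volatile routes.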
The next step for the BGP/MPLS VPNs is to provide those services
across different ASes. Several solutions are proposed in [4]. One
of the possible solutions is to directly interconnect the RR of
different ASes with a multi-hop eBGP session to distribute the
inter-provider VPN routes. In this case, the RR should clearly
aggregate the VPN routes that it sends over the multi-hop eBGP
session.
Another problem with BGP/MPLS VPNs is that important VPN sites are
often attached to two different PE routers. This dual attachment is
often required for redundancy, but once the two links are
established, customers often want to be able to use them for both
inbound and outbound traffic. For the packets sent by the CE router
to the network provider, this depends only on the customer network.
For the packets sent by the VPN provider towards the CE router, the
ability to load-balance the traffic between the two PE routers
depends on the configuration of PE routers of the VPN provider. A
possible solution is to use per-site route distinguishers [4] to
ensure that each PE receives all the advertisements from all the PE
routers attached to the same VPN. However, this increases the size of
the BGP/MPLS routing tables. A versatile route reflector could be
configured to advertise a single route when scalability is important
and several routes, for example by using the BGP extensions proposed
in [10], for the VPN sites where load-balancing must be achieved.
Another situation where RRs could play a role in MPLS networks is
when interdomain LSPs [33] need to be established with RSVP-TE. To
establish a LSP with RSVP-TE, the head-end Label Switching Router
(LSR) computes an explicit route. In a single IGP area, this
computation relies on the topology distributed by the IGP. Across
interdomain boundaries, this computation becomes more difficult since
BGP distributes reachability and not topological information. For a
primary LSP, the head-end LSR could use the route distributed by BGP.
For a disjoint secondary LSP, this becomes more difficult as the
head-end usually receives only the best BGP route to each destination.
A RR that collects all the candidate routes learned via BGP could
select among those routes to find a disjoint route for the secondary
LSP.
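A minimal sketch of this selection is shown below. It only checks
disjointness at the level of the AS-Path, which is our own
simplification since BGP carries no finer topological information.
The function names and the AS numbers are hypothetical.

```python
def transit_ases(as_path):
    """ASes crossed before the destination AS (last element)."""
    return set(as_path[:-1])


def pick_disjoint_route(primary, candidates):
    """Among the candidate AS paths known to the RR, return the
    shortest one that shares no transit AS with the primary path,
    or None if no such route exists."""
    used = transit_ases(primary)
    disjoint = [path for path in candidates
                if path != primary and not (transit_ases(path) & used)]
    return min(disjoint, key=len) if disjoint else None


# Hypothetical AS paths towards the destination AS 65001.
PRIMARY = [65010, 65020, 65001]
CANDIDATES = [
    [65010, 65020, 65001],         # the primary route itself
    [65010, 65030, 65001],         # shares AS 65010 with the primary
    [65040, 65050, 65060, 65001],  # AS-disjoint from the primary
]

print(pick_disjoint_route(PRIMARY, CANDIDATES))
```

The head-end LSR could then use the selected route to compute the
explicit route of the secondary LSP.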
5 Conclusion
BGP Route Reflectors were designed to solve the scaling problem of
the iBGP full-mesh. For this, the RR collects the best routes from
all its clients. Instead of only serving as a distributor of iBGP
advertisements, we have shown that by exploiting the routing
knowledge of the RR it is possible to improve the routing in ASes.
We have then shown several situations where versatile RRs could be
used to support very useful services in Autonomous Systems. One of
those situations is the need to engineer the flow of the outgoing
interdomain traffic of a stub or transit AS. Another situation
occurs when an AS wishes to control the flow of its incoming traffic.
Besides those traffic engineering usages, versatile RRs could also be
used to reduce the convergence time in case of failure or the size of
the BGP/MPLS routing tables.
Acknowledgements
This work was supported by the DGTRE in the framework of the TOTEM
project (http://totem.info.ucl.ac.be). We would like to thank
Nicolas Dubois for the data used in figure 5.
References
[1] Y. Rekhter, T. Li, and S. Hares, ``A Border Gateway Protocol 4
(BGP-4),'' April 2003, internet draft, draft-ietf-idr-bgp4-20.txt,
work in progress.
[2] J. Rexford, J. Wang, Z. Xiao, and Y. Zhang, ``BGP routing
stability of popular destinations,'' in Proc. Internet Measurement
Workshop, November 2002.
[3] S. Uhlig, V. Magnin, O. Bonaventure, C. Rapier, and L. Deri,
``Implications of the topological properties of internet traffic on
traffic engineering,'' in ACM Symposium on Applied Computing, March
2004.
[4] E. Rosen and Y. Rekhter, ``BGP/MPLS IP VPNs,'' September 2003,
internet draft, draft-ietf-l3vpn-rfc2547bis-01.txt, work in
progress.
[5] P. Marques, N. Sheth, R. Raszuk, J. Mauch, and D. McPherson,
``Dissemination of flow specification rules,'' June 2003, internet
draft, draft-marques-idr-flow-spec-00.txt, work in progress.
[6] T. Bates, R. Chandra, and E. Chen, ``BGP route reflection - an
alternative to full mesh iBGP,'' April 2000, internet RFC 2796.
[7] B. Halabi, Internet Routing Architectures. Cisco Press, 1997.
[8] T. Griffin and G. Wilfong, ``Analysis of the MED oscillation
problem in BGP,'' in ICNP2002, 2002.
[9] D. McPherson, V. Gill, D. Walton, and A. Retana, ``BGP persistent
route oscillation condition,'' 2002, internet draft,
draft-ietf-idr-route-oscillation-01.txt, work in progress.
[10] D. Walton, D. Cook, A. Retana, and J. Scudder, ``Advertisement of
Multiple Paths in BGP,'' November 2002, internet draft,
draft-walton-bgp-add-paths-01.txt, work in progress.
[11] T. Griffin and G. Wilfong, ``On the correctness of iBGP
configuration,'' in SIGCOMM'02, Pittsburgh, PA, USA, August 2002,
pp. 17--29.
[12] L. Xiao, J. Wang, and K. Nahrstedt, ``Reliability-aware IBGP
Route Reflection Topology Design,'' in 11th IEEE International
Conference on Network Protocols (ICNP 2003), Atlanta, Georgia, USA,
November 2003.
[13] B. Fortz, J. Rexford, and M. Thorup, ``Traffic engineering with
traditional IP routing protocols,'' IEEE Communications Magazine,
October 2002.
[14] R. Musunuri and J. Cobb, ``A complete solution to stable iBGP,''
in IEEE International Conference on Communications (ICC), 2004.
[15] C. Alaettinoglu, V. Jacobson, and H. Yu, ``Towards millisecond
IGP convergence,'' November 2000, internet draft,
draft-alaettinoglu-ISIS-convergence-00.ps, work in progress.
[16] C. Demetrescu and G. F. Italiano, ``A New Approach to Dynamic All
Pairs Shortest Paths,'' in Proceedings of the 35th ACM symposium on
Theory of computing (STOC'03), June 2003, pp. 159--166.
[17] D. Awduche, A. Chiu, A. Elwalid, I. Widjaja, and X. Xiao,
``Overview and principles of internet traffic engineering,'' May
2002, RFC 3272.
[18] N. Feamster, J. Borkenhagen, and J. Rexford, ``Guidelines for
interdomain traffic engineering,'' SIGCOMM Comput. Commun. Rev.,
vol. 33, no. 5, pp. 19--30, 2003.
[19] B. Quoitin, S. Uhlig, C. Pelsser, L. Swinnen, and O. Bonaventure,
``Interdomain traffic engineering with BGP,'' IEEE Communications
Magazine, May 2003.
[20] S. Leinen, ``Evaluation of candidate protocols for IP flow
information export (IPFIX),'' January 2004, internet draft,
draft-leinen-ipfix-eval-contrib-02, work in progress.
[21] T. Ye and S. Kalyanaraman, ``A recursive random search algorithm
for large-scale network parameter configuration,'' in Proc. of ACM
SIGMETRICS, 2003.
[22] S. Uhlig, O. Bonaventure, and B. Quoitin, ``Interdomain Traffic
Engineering with minimal BGP Configurations,'' in Proc. of the 18th
International Teletraffic Congress, Berlin, September 2003.
[23] S. Uhlig, ``Implications of the traffic characteristics on
interdomain traffic engineering,'' Ph.D. dissertation, Computer
Science and Engineering Department, Université catholique de
Louvain, March 2004.
[24] ------, ``A multiple-objectives evolutionary perspective to
interdomain traffic engineering in the internet,'' in Workshop on
Nature Inspired Approaches to Networks and Telecommunications
(NIANT) in PPSN04, Birmingham, UK, September 2004.
[25] S. Uhlig and B. Quoitin, ``BGP-based interdomain traffic
engineering for transit ASes.''
[26] S. Agarwal, C. Chuah, and R. Katz, ``OPCA: Robust Interdomain
Policy Routing and Traffic Control,'' in Proceedings of the 6th
International Conference on Open Architecture and Network
Programming, IEEE OpenArch, April 2003.
[27] B. Quoitin, S. Tandel, S. Uhlig, and O. Bonaventure,
``Interdomain Traffic Engineering with Redistribution Communities,''
Computer Communications, vol. 27, no. 4, pp. 355--363, March 2004.
[28] S. Kent, C. Lynn, and K. Seo, ``Secure Border Gateway Protocol
(S-BGP),'' IEEE Journal on Selected Areas in Communications, vol.
18, no. 4, pp. 582--592, April 2000.
[29] R. White, ``Securing BGP Through Secure Origin BGP,'' The
Internet Protocol Journal, vol. 6, pp. 15--22, June 2003.
[30] A. Lange, ``Flexible BGP Communities,'' March 2004, internet
draft, draft-lange-flexible-communities-02, work in progress.
[31] G. Nalawade, R. Kapoor, and D. Tappan, ``Tunnel SAFI,'' October
2003, internet draft, draft-nalawade-kapoor-tunnel-safi-01, work in
progress.
[32] M. Nicolas, ``BGP/MPLS VPN monitoring for troubleshooting,
scalability verification and network migration safety,'' February
2004, presentation at MPLS2004, Paris (France).
[33] R. Zhang and J. Vasseur, ``MPLS Inter-AS traffic engineering
requirements,'' November 2003, internet draft,
draft-ietf-tewg-interas-mpls-te-req-02.txt, work in progress.
Authors' addresses
Olivier Bonaventure, Steve Uhlig, Bruno Quoitin
Dept. Computing Science and Engineering
Universite catholique de Louvain (UCL)
Place Sainte-Barbe 2
B-1348 Louvain-la-Neuve
Belgium
http://www.info.ucl.ac.be/people/OBO
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has made
any independent effort to identify any such rights. Information on the
procedures with respect to rights in RFC documents can be found in BCP
78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this specification
can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary rights
that may cover technology that may be required to implement this
standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2004). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
This Internet-Draft will expire on December 31, 2004.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.