One document matched: draft-francois-ordered-fib-00.txt
Network Working Group P. Francois
Internet Draft O. Bonaventure
Expiration Date: January 2006 M. Shand
S. Previdi
S. Bryant
July 2005
Loop-free convergence using ordered FIB updates
draft-francois-ordered-fib-00.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet Drafts are working documents of the Internet Engineering
Task Force (IETF), its Areas, and its Working Groups. Note that other
groups may also distribute working documents as Internet Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than a "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This draft proposes a mechanism to prevent transient loops during
non-urgent topology changes by ordering the FIB updates on routers,
in networks which use link state routing protocols. This mechanism
can be used in case of graceful link shutdowns, metric changes and
when a link is enabled. The solution can also be used in conjunction
with an IPFRR mechanism wich turns a sudden link failure in a non-
urgent change, by ensuring a local protection of the link. After a
non-urgent topology change, each routers computes a rank that defines
the time at which it can safely update its FIB. A completion message
mechanism is also proposed in order to speed up the convergence
process.
Francois et al. Expires January 2006 [Page 1]
Internet Draft Ordered FIB Updates July 2005
1. Introduction
With link-state protocols [ISIS,OSPF], when the network topology
changes, some routers need to modify their Forwarding Information
Base (FIB) to take into account the new topology. Each topology
change causes a convergence phase. During this phase, routers may
transiently have inconsistent FIBs, which leads to packet loops and
losses, even if the reachability of the destinations is not
compromised after the topology change. When the link state change is
a metric update and when a new link is brought up in the network,
there is no direct loss of connectivity, but transient loops during
the convergence phase may still cause packet loss. The same
observation can be made in the case of a link failure. If the link is
protected [IPFRR,MPLSFRR], a loop free convergence mechanism may also
be used in place of the usual fast, best effort approach.
The remainder of this document is organised as follows. We first show
in section 2 that there is an ordering for the FIB updates when a
single link fails, is activated or when an IGP metric changes. We
also present an ordering for the cases of node failure and node
activation. In section 3, we present an implementation of the
ordering schemes. In section 4, a mechanism aimed at speeding up the
loop-free convergence process is presented. In section 5, we present
additional shortcuts to the ordering scheme.
2. Ordering of the FIB updates
In this section, we briefly present an ordering of the FIB updates
that allows to avoid transient forwarding loops in the network in the
case of topology changes. A more precise analysis of the rerouting
dynamics and correctness proofs of the mechanism can be found in
[PFOB05].
2.1 Link down / metric increase
We first consider the failure of a link or the increase of the metric
associated with one directed link. In this case, to ensure the
transient consistency of the forwarding tables of the routers, a
router R should update its FIB AFTER all the routers that used R
BEFORE the event, to reach the affected link.
Let us show that this rule ensures the transient consistency of the
routers. We assume that the link X->Y goes down and is protected, or
its metric is increased.
Firstly, once a packet reaches a router R with an outdated FIB for
its destination, it will only traverse routers with an outdated FIB,
according to the rule. It thus reaches X, whose protection will
Francois et al. Expires January 2006 [Page 2]
Internet Draft Ordered FIB Updates July 2005
ensure that the packet reaches its destination. In the case of a
metric increase, the packet will simply be forwarded along X->Y and
reach its destination.
Secondly, a packet cannot loop while being exclusively forwarded by
routers with an updated FIB, so that any packet is proved to never
enter a loop.
In other words, when the ordering is respected, a packet entering the
network that follows a path composed of updated or unaffected routers
cannot loop. If it reaches an outdated router, it cannot reach an
updated router anymore, so that it will not loop.
2.2 Link up / metric decrease
In the case of link up events or metric decreases, to ensure the
transient consistency of the forwarding tables of the routers, a
router R should update its FIB BEFORE the routers that WILL use R to
reach the affected link.
Let us show that this rule ensures the transient consistency of the
routers. We assume that the link X->Y goes up or that its metric is
decreased.
Firstly, when a packet reaches a router R that has updated its FIB,
all the routers on the path from R to X have updated their FIB, so
that the packet will reach X and be forwarded along X->Y, to finally
reach its destination.
Secondly, a packet could not loop between routers that have not yet
updated their FIB, so that any packet is proved to never enter a
loop.
2.3 Router down
An ordering of the FIB updates can also be performed to avoid
transient loops in the case of a router down event. There are
multiple implementation choices to advertise the failure of a node.
Firstly, a router being shut down can send a new link-state packet
that describes the failure of all its links, by setting the age of
each link to MAX_AGE. Secondly, the router can purge its own link-
state packet. Finally, and only in IS-IS, the router can send its
link-state packet by setting the overload-bit. All those solutions
will let the other routers of the network know about the failure by
receiving one single message.
The other routers of the network can avoid loops while adapting to
the node failure by respecting a rule that is very similar to the one
Francois et al. Expires January 2006 [Page 3]
Internet Draft Ordered FIB Updates July 2005
mentioned in the case of a link down event.
A router R should update its FIB AFTER all the routers that use R to
reach the router that is shutdown.
Let us show that this ordering ensures the transient consistency of
the forwarding. We assume that the router X is shut down.
Firstly, if a packet to a destination D reaches a router that has not
updated its FIB and that used X to reach D, it will only traverse
routers that have not updated their FIB, to finally reach X. X will
then forward the packet to its nexthop for D, and the packet will
reach its destination. This nexthop could not have X in its path(s)
to D before the event, as the contrary would mean that there was a
loop before the event. Secondly, a packet that does not reach a
router with an outdated FIB cannot loop.
2.4 Router up
In IS-IS, when a router X is brought up, it should firstly send its
link-state packet with an overload bit set. After a while, each of
its neighbours will have sent a link-state packet describing their
links to X as being up. At that moment, X can send a link-state
packet with an overload bit unset, so that the processing of this
link-state packet by the other routers will make them consider all
the links connected to X as being up, in one single FIB update.
In OSPF, the overload-bit does not exists. An OSPF router that is
brought up should thus wait for a while before sending its own LSA,
or firstly send its LSA with a MAX-METRIC assigned to each of its
links, and only send a LSA with the real metrics after a while.
By proceeding like this, we have the opportunity to order those FIB
updates and avoid forwarding inconsistencies during the convergence.
The ordering of the FIB updates follows a rule that is very similar
to the rule mentioned in the case of a link up event. A router R
should update its FIB BEFORE all the routers that WILL use R to reach
the router that is brought up.
Let us show that this ordering ensures the transient consistency of
the forwarding. We assume that the router X comes in the network.
Firstly, when a packet with destination D reaches a router R that has
already updated its FIB, and that uses X to reach D, the packet will
reach X as it will traverse routers that have already updated their
FIB. X will then forward the packet to its nexthop for destination D.
Francois et al. Expires January 2006 [Page 4]
Internet Draft Ordered FIB Updates July 2005
The path from this nexthop to D cannot contain X, so that the packet
is proved to be consistently forwarded to D. Secondly, a packet that
exclusively traverses routers with an outdated FIB cannot loop.
3. Implementation of the ordering
In this section, we show how the proposed ordering can be applied by
routers that have to adapt to a topology change.
3.1 Link down / metric increase
To respect the proposed ordering, routers can compute a rank that
will be used to determine the time at which they can perform their
FIB update. In the case of the failure of link X->Y or an increase of
the metric of link X->Y, router R will compute the reverse Shortest
Path Tree (rSPT) of Y. This rSPT gives the shortest paths to reach Y.
The branch of the reverse SPT that is below R corresponds to the set
of shortest paths to R that are used by the routers that reach X->Y
via R.
The rank of router R is defined as the depth (in number of hops) of
this branch. To avoid confusion in the case of ECM paths, we can say
that the the maximum depth must be taken into account.
Router R will update its FIB at time : T0 + rank * MAX_FIB where T0
is the arrival time of the link-state packet containing the topology
change and MAX_FIB a network-wide constant that reflects the maximum
time required to update a FIB. The value of MAX_FIB is implementation
specific and is out of the scope of this document. This value must
be agreed by all the routers of the network. This agreement can be
done by using a capability TLV as defined in [SLFTV].
All the routers that use R to reach X->Y will compute a lower rank
than R, so that the order will be respected. It should be noted that
only the routers that used X->Y before the event have to compute
their rank.
3.2 Link up / metric decrease
In the case of a link up or a link metric decrease affecting link
X->Y, a router R must have a rank that is higher than the rank of the
routers that it will use to reach X->Y, according to the rule
mentioned in 2.2. The rank of R is thus equal to the number of hops
between R and X in its Shortest Path Tree. When R has multiple equal
cost paths to X, the rank is the length of the longest path in hops
Francois et al. Expires January 2006 [Page 5]
Internet Draft Ordered FIB Updates July 2005
to X.
Router R will update its FIB at time : T0 + rank * MAX_FIB where T0
is the arrival time of the link-state packet containing the topology
change and MAX_FIB a network-wide constant that reflects the maximum
time required to update a FIB.
It should be noted that only the routers that use X->Y after the
event have to compute a rank, i.e. only the routers that have X->Y in
their SPT after the link-state change.
3.3 Router down
When a router X is shutdown, the rank that has to be applied by a
router R is equal to the depth of the branch under R in rSPT(X). In
the case of ECM paths to R, the maximum of the length must be taken
into account. Note that X will only be shutdown once all the routers
have updated their FIB. The rank that is computed by X thus gives
the time at which X can be effectively shut down.
3.4 Router up
When a router X comes in the network, the rank that has to be applied
by a router R is equal to the maximum length of its ECM paths to X,
in its updated Shortest Path Tree. Note that X will be the first to
update its FIB, as it will have a rank equal to 0. This translates
the fact that X must have built its FIB before any router can use it
to forward packets.
4. Completion messages
As noted in the previous sections, only the routers that are
currently using a link affected by a topology change need to update
their FIB. The FIB of all the other routers does not need to be
updated at all. Furthermore, the routers that need to update their
FIB do not necessarily need MAX_FIB seconds to perform their FIB
update [CCRJULY].
In this section, we show how completion messages can be used to speed
up the convergence after a non-urgent topology change by shortcuting
the computed rank while respecting the transient consistency of the
routers.
To do this, we adapt the mechanism proposed in [INFOCOM]. Routers
maintain a waiting list, that lists the neighbours from which they
have to wait for a completion message before being allowed to update
their own FIB. Upon reception of a completion message from a
Francois et al. Expires January 2006 [Page 6]
Internet Draft Ordered FIB Updates July 2005
neighbour, a router removes this neighbour from its waiting list.
Once its waiting list becomes empty, the router is allowed to update
its FIB. The next sections explain how the waiting list must be
built for each type of event, and when completion messages are sent
by the routers.
4.1 Link down / metric increase
Let us assume that the link X->Y goes down, or its metric is
increased. A router R that currently has X->Y in its SPT computes
rSPT(Y) to determine its rank. By doing this, it computes the set of
routers (and thus implicitly the set of its neighbours) that use it
to reach link X->Y. The waiting list of R is equal to the set of
neighbours that use it to reach X->Y. These are the neighbours below
R in rSPT(Y).
The current SPT of R gives the set of neighbours that it uses to
reach X->Y. These neighbours have to wait for R, and R is thus
present in their waiting list. This set of neighbours is the set of
neighbours to which R will send its completion messages, once it has
updated its FIB.
4.2 Link up / metric decrease
Let us assume that the link X->Y goes up, or its metric is updated.
A router R will compute its new SPT (SPT_new(R)). If X->Y is in
present in SPT_new(R), R will use link X->Y and thus needs to compute
its rank. For this, R computes the set of routers (and implicitly
its set of neighbours) that it uses to reach X->Y. The nexthops for X
in SPT_new(R). R must reroute after those routers to respect the
ordering. Its waiting list is thus equal to those nexthops.
When R has updated its FIB, it can send a completion message to its
neighbours, so that a neighbour that has R in its waiting list will
be allowed to remove R from it.
4.3 Router down
Let us assume that a router X goes down. A router R will compute
rSPT(X) to determine its rank. When doing this, it also computes the
set of its neighbours that uses R to reach X. The waiting list of R
is equal to this set.
The current SPT of R gives the set of neighbours that will have R in
their waiting list. Once R has updated its FIB, it will send its
completion messages to those neighbours.
Francois et al. Expires January 2006 [Page 7]
Internet Draft Ordered FIB Updates July 2005
4.4 Router up
Let us assume that a router X comes in the network. A router R will
compute its new SPT. It thus also computes its nexthops for X. This
set of nexthops (more than one in the case of equal cost paths to X)
is the waiting list that R will use.
Once R has updated its FIB, it sends a completion message to its
neighbours, so that a neighbour that has R in its waiting list will
be allowed to remove R from it.
5. Ranking Shortcuts.
In the case of a link (X->Y) down or metric increase, a router R can
sometimes shortcut its rank because it can decide, when computing the
rSPT, which neighbours are not affected by the event. These are the
neighbours that did not use the link X->Y, so that their paths to all
the destinations do not change.
Let us assume that link X->Y goes down. While R computes rSPT(Y), it
can mark the routers that use the link X->Y. These are the routers
that are below X->Y in the rSPT. If a neighbour N of R is not marked,
all the FIB updates that will let packets be forwarded to N by R can
be performed directly. Note that if N uses Y->X, R can still reroute
to N because R and N are not affected for the same set of
destinations, so that the destinations affected in R, using X->Y, are
never affected in a router N using Y->X.
If a FIB update contains an ECMP entry for a destination d, and not
all the nexthops for d were not marked, then the normal ranking can
be applied for this destination, or the FIB update will have to be
modified in order to only reroute to the unmarked nexthops. In this
case, a new FIB update will have to be performed for d when the rank
timer has elapsed or all the necessary completion messages have
arrived. The choice between these two solutions is a trade-off
between an increased number of FIB updates and a faster convergence.
References
[OSPF] J. Moy, OSPF Version 2. RFC 2328. Apr. 1998.
[IS-IS] "Intermediate system to Intermediate system routeing
information exchange protocol for use in conjunction
with the Protocol for providing the Connectionless-mode
Network Service (ISO 8473)," ISO/IEC 10589:2002,
Second Edition.
Francois et al. Expires January 2006 [Page 8]
Internet Draft Ordered FIB Updates July 2005
[IPFRR] M. Shand, S. Bryant, IP Fast Reroute Framework,
draft-ietf-rtgwg-ipfrr-framework-03.txt, June 2005
[MPLSFRR] Pan, P. et al, "Fast Reroute Extensions to RSVP-TE for LSP
Tunnels", RFC 4090
[PFOB05] P. Francois, O. Bonaventure. Avoiding transient loops
during IGP convergence. IEEE INFOCOM 2005, March 2005,
Miami, Fl., USA
[CCRJULY] P. Francois, C. Filsfils, J. Evans, O. Bonaventure.
Achieving sub-second IGP convergence in large IP networks,
ACM SIGCOMM Computer Communication Review, July 2005
[SLFTV] A. Atlas, S. Bryant, M. Shand, Synchronization of Loop Free
Timer Values, draft-atlas-bryant-shand-lf-timers-00.txt
Appendix A : Pseudo-codes
Figure 1 shows the pseudocode for the router reaction in case of non-
urgent link failure or metric increase.
Pseudo-code for link (X->Y) down/metric increase in router R :
If(X->Y in SPT(R)){
//R is affected by the event
//Compute rspt(Y)
rspt = rSPT(Y);
//compute the rank from rspt (can also be done directly
//during rspt computation)
worst_case_update_delay = depth(R,rspt) * MAX_FIB
//Build the waiting list : these are the neighbours below R in rspt
//
WaitingList = getNeighboursUpstream(R,rspt);
//Build the list of neighbours to which R will send a
//completion message :
//these are the nexthops for X in the current SPT of R
completionMessageSendingList = nexthops(X,SPT(R));
//Wait for the waiting list to be empty or the rank time
//to elapse.
while(not WaitingList.empty()
|| worst_case_update_delay.elapsed()){
Francois et al. Expires January 2006 [Page 9]
Internet Draft Ordered FIB Updates July 2005
if(completionMessageReceived()){
//New completion message received
WaitingList.remove(completionMessage.sender);
}
}
//Rank time has elapsed or waiting list is empty
UpdateFIB();
//FIB has been updated, send completion messages
ForEach(N in completionMessageSendingList){
send(N,"FIB Updated for X->Y");
}
}
else{
//R is not affected by the event, nothing to do
}
Figure 1 : Pseudocode for link down/metric increase
Figure 2 provides the pseudocode for the reaction of a router to a
non-urgent link up or metric decrease event.
Pseudo-code for link (X->Y) up/metric decrease in router R :
//Compute the new SPT.
newSPT = computeSPT(R)
If(X->Y in newSPT){
//R is affected by the event
//compute the rank of the router. This is the length (in hops)
//from R to X in the new spt. (note that it is equal to the
//length from R to X in the old spt.
worst_case_update_delay = PathLength(R,X,newSPT) * MAX_FIB
//Build the waiting list : these are the nexthops for X in the new SPT.
WaitingList = nexthops(R,X,newSPT);
//Wait for the waiting list to be empty or the rank time to elapse.
while(not WaitingList.empty() || worst_case_update_delay.elapsed()){
if(completionMessageReceived()){
//New completion message received
WaitingList.remove(completionMessage.sender);
}
}
//Rank time has elapsed or waiting list is empty
UpdateFIB();
//Fib updated, send completion messages to the neighbours.
ForEach(N in Neighbours(R))
Send(N,"Fib Updated X->Y");
Francois et al. Expires January 2006 [Page 10]
Internet Draft Ordered FIB Updates July 2005
}
else{
//R is not affected by the event, nothing to do
}
Figure 2 : Pseudocode for link up/metric decrease
Authors Addresses
Pierre Francois
Universite catholique de Louvain
Place Sainte Barbe, 2
B-1348, Louvain-La-Neuve
Belgium
Email: pfr@info.ucl.ac.be
Olivier Bonaventure
Universite catholique de Louvain
Place Ste Barbe, 2
B-1348, Louvain-La-Neuve
Belgium
Email: Bonaventure@info.ucl.ac.be
Mike Shand
Cisco Systems,
250, Longwater Avenue,
Green Park,
Reading, RG2 6GB,
United Kingdom
Email: mshand@cisco.com
Stefano Previdi
Cisco Systems
Via Del Serafico 200
00142 Roma - Italy
Email : sprevidi@cisco.com
Stewart Bryant
Cisco Systems,
250, Longwater Avenue,
Green Park,
Reading, RG2 6GB,
United Kingdom
Email: stbryant@cisco.com
Francois et al. Expires January 2006 [Page 11]
Internet Draft Ordered FIB Updates July 2005
Intellectual Property Considerations
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at ietf-
ipr@ietf.org.
Full Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Francois et al. Expires January 2006 [Page 12]
| PAFTECH AB 2003-2026 | 2026-04-23 21:29:54 |