Network Working Group                        (Editor) Srikanth Chavali
INTERNET DRAFT                                          Vasile Radoaca
Expiration Date: October 2004                    Nortel Networks, Inc.
                                                               Mo Miri
                                                             BellSouth
                                                           Luyuan Fang
                                                                  AT&T
                                                  (Editor) Susan Hares
                                                  NextHop Technologies
                                                            April 2004
Peer Prefix Limits Exchange in BGP
draft-chavali-bgp-prefixlimit-01.txt
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document proposes a mechanism to allow BGP peers to coordinate
the setting of a limit on the number of prefixes which one BGP
speaker will send to its peer. Coordination can prevent disruption
of the peering session or discarding of routes, which can occur when
a maximum prefix limit is configured on the "receiving" peer, and the
"sending" peer exceeds the limit.
1. Terms
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.
In this document we use the term "BGP sender" to refer to a BGP
speaker which is advertising prefixes to its peer. We use the term
"BGP receiver" to refer to a BGP speaker which is receiving prefixes
from its peer. Although it is clear that in reality each peer is
usually both a "BGP sender" and a "BGP receiver", we emphasize a
unidirectional relationship in this document for clarity.
2. Introduction
There are many scenarios where BGP [BGP-4] peering may be established
between two speakers in which there is an expectation that some
limited number of prefixes will be announced by a given speaker.
Section 6 describes these scenarios. Several implementations of BGP
offer a configuration option that allows a BGP receiver to provision
a limit to the number of prefixes it will accept from a specific
peer. When the limit is exceeded, then there are generally two
options: the prefixes exceeding the limit can be dropped by the BGP
receiver, or the peering session may be terminated by the BGP
receiver and restarted at a later time. Neither of these options is
desirable.
Dropping prefixes leads to network unreliability, since the dropped
prefixes will be unreachable through the BGP receiver. Terminating
the BGP session is probably worse, since all traffic between the
peers will typically be disrupted, even for those prefixes which were
advertised before the limit was reached. In many cases, the result of
not limiting the number of received BGP prefixes can be much worse
than either case just mentioned. If the BGP receiver becomes
overloaded, it can fail and affect many or all of its peers. The
effects of the disruptions caused by lost peering sessions and device
failures propagate through the Internet, leading to instability as
described in detail in [BGP-STUDY]. Other undesirable effects include
resource utilization on the peers from restarting the peering
session, and the processing load and bandwidth utilization from
withdrawing and re-advertising the prefixes throughout the Internet.
Other issues arising out of this are described in section 6.
The disruption may be due to network changes, misconfigurations,
miscommunications, or other factors where the number of prefixes
advertised from a BGP sender to the receiver exceeds the expected
number, and the configurations must be revised. It may be due to a
specific configuration which is functioning properly in order to
prevent an overload condition, or it may occur when the receiving BGP
speaker becomes overloaded and suffers various consequences. Two
newer sources of additional route overload are Virtual Private
Network (VPN) services and denial of service (DoS) attacks. A denial
of service attack that sends additional more-specific routes to a
BGP speaker can overload the routing table. In VPNs, a sudden
increase in routes may be a true addition of routes, a
misconfiguration, or a denial of service attack.
A basic functionality is proposed here for BGP speakers to exchange
three prefix limits per AFI/SAFI pair: warning, stop receiving, and
disconnect limits. BGP [BGP-4] peers coordinate several types of
information sent via the CAPABILITIES listed in the OPEN message or
the capabilities sent via the CAPABILITY message (Dynamic
capabilities). The BGP peers negotiate routes that will be sent in
Route Refreshes via the Outbound Route Filters (ORFs). This draft
proposes:
1. A BGP capability [BGP-CAP], sent in the OPEN message, to carry the
proposed parameter.
2. A new Route Filter type for the OutBound Route Filter community or
Extended community [ORF,ASPATH ORF].
3. Definition of Prefix limit
Prefix limit is encoded as an optional capability parameter [BGP-CAP]
in the BGP OPEN message [BGP-4]. In addition, for dynamic re-
adjustment of these capabilities the Prefix limit TLV can be included
in:
- Dynamic Capability negotiation (described in section 3.5),
- ORF of Type Prefix (described in section 3.6), and
- error messages related to Dynamic capabilities (section 5.3),
or CEASE codes (section 5.4).
If more than one of these features specifies a maximum prefix, the
precedence of usage is: dynamic capability, ORF, Inform, and Soft
Notify.
By precedence we indicate that the Dynamic capability negotiation
takes priority over the other mechanisms.
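The precedence rule above can be sketched as a simple lookup. This is
an illustration only: the mechanism names and the effective_limits
helper are hypothetical labels chosen for this sketch, and the limits
themselves are treated as opaque values.

```python
# Precedence order as listed in this section, highest first.
# The string labels are hypothetical names for this sketch.
PRECEDENCE = ("dynamic-capability", "orf", "inform", "soft-notify")


def effective_limits(offers):
    """Return (mechanism, limits) from the highest-precedence mechanism.

    `offers` maps a mechanism label to whatever limits it carried.
    Returns None when no listed mechanism supplied limits.
    """
    for mechanism in PRECEDENCE:
        if mechanism in offers:
            return mechanism, offers[mechanism]
    return None
```

For example, if both an ORF and a Dynamic Capability carried limits
for the same AFI/SAFI, the sketch selects the Dynamic Capability
values.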
Sub-codes 1 through 3 are required and MUST be present in the
Maximum Prefix TLV. All optional fields MAY be present in the
Maximum Prefix TLV.
3.1 Layout of Bytes
0 7 15
+---------+---------+
|code |length |
| | |
|1 octet | 1 octet |
+----------+--------+
0 7 15 23
+---------+---------+----------+
| AFI | SAFI |
| | |
|2 octets | 1 octet |
+---------+---------+----------+
0 7 15 23 55
+---------+---------+----------+----------+
|sub | |warning |warning |
|code 1 | length |indicator |prefix |
|[Warn] | | |limit |
| 1 octet | 1 octet | 1 octet | 4 octets |
+---------+-------+------------+----------+
0 7 15 23 55
+---------+---------+----------+-----------+
|sub | |stop |stop adver-|
|code 2 | length |advertise-|tisement |
|[stop] | |ment |prefix |
| | |action |limit |
| 1 octet | 1 octet |1 octet | 4 octets |
+---------+-------+------------+-----------+
0 7 15 23 55
+---------+---------+----------+-----------+
|sub | |reset |reset peer-|
|code 3 | length |peering |ing prefix |
| | |action |limit |
| 1 octet | 1 octet | 1 octet | 4 octets |
+---------+-------+------------+-----------+
0 7 15
+---------+---------+
| option length |
| |
| 2 octets |
+---------+---------+
0 7 15 47
+---------+----------+-----------+
|sub | |current Rx |
|code 4 |length |routes |
|[CurRX] | | |
| 1 octet | 1 octet | 4 octets |
+---------+----------+-----------+
0 7 15 47
+---------+----------+-----------+
|sub | |current Tx |
|code 5 |length |routes |
|[CurTX] | | |
| 1 octet | 1 octet | 4 octets |
+---------+----------+-----------+
0 7 15
+---------+-------+
|sub | |
|code 6 |length |
|[pfxln] | |
|1 octet |1 octet|
+---------+-------+
0 7 15 47 59 91
+-------+--------+----------+-----------+---------+
|prefix |action |warning |stop adver-|reset |
|length |flags |indicator |tisement |peering |
| 1 |for |limit for |limit for |limit for|
| |limits |prefix |prefix |prefix |
| |for |length-1 |length-1 |length-1 |
| |prefix | | | |
| |length-1| | | |
|1 octet|1 octet |4 octets |4 octets |4 octets |
+---------+-------+--------+-------+--------+-----+
.
.
.
.
+-------+--------+----------+-----------+---------+
|prefix |action |warning |stop adver-|reset |
|length |flags |indicator |tisement |peering |
| n |for |limit for |limit for |limit for|
| |limits |prefix |prefix |prefix |
| |for |length-n |length-n |length-n |
| |prefix | | | |
| |length-n| | | |
|1 octet|1 octet |4 octets |4 octets |4 octets |
+---------+-------+--------+-------+--------+-----+
0 7 15
+-------+-------+
|sub | |
|code 7 |length |
|[orfmx]| |
+-------+-------+
0 7 39 71 103 111
+----------+----------+-----------+------------+---------+------+
| action |warning |stop Adver-|reset peer- |ORF type |ORF |
| flags |indicator |tisement |ing | |Info |
|for ORF |prefix |prefix |prefix | | |
|match |limit for |limit for |limit for | | |
| |ORF match |ORF Match |ORF match | | |
| | | | | | |
| |(4 octets)|(4 octets) |(4 octets) |(1 octet)| |
+----------+----------+-----------+------------+---------+------+
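As an illustration of the layout above, the following sketch packs
the required portion of the TLV (code, length, AFI, SAFI, sub-codes 1
through 3, and an empty option length) and arrives at 28 octets. It
rests on several assumptions that the draft does not settle: the
capability code is still TBD, so CAP_CODE_PREFIX_LIMIT is a
hypothetical placeholder; the per-sub-code length is assumed to count
only the action octet and the 4-octet limit; and the top-level length
is assumed to count the whole TLV.

```python
import struct

# Hypothetical placeholder: the draft leaves the capability code TBD.
CAP_CODE_PREFIX_LIMIT = 0xF0


def encode_required_portion(afi, safi, warn, stop, reset):
    """Pack the required part of the Prefix limit TLV.

    warn, stop and reset are (action, limit) pairs for sub-codes 1-3.
    """
    body = struct.pack("!HB", afi, safi)  # AFI (2 octets), SAFI (1 octet)
    for sub_code, (action, limit) in enumerate((warn, stop, reset), start=1):
        # sub-code, assumed sub-length of 5, action octet, 4-octet limit
        body += struct.pack("!BBBI", sub_code, 5, action, limit)
    body += struct.pack("!H", 0)  # option length: no optional sub-codes
    total = 2 + len(body)  # code and length octets plus the body
    return struct.pack("!BB", CAP_CODE_PREFIX_LIMIT, total) + body
```

Calling encode_required_portion(1, 1, (1, 1000), (1, 2000), (0, 3000))
yields a 28-octet value: 2 octets of header, 3 of AFI/SAFI, 7 for
each of the three sub-codes, and 2 for the option length.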
3.2 Byte definitions
Meaning for each of the bitwise indicated capability fields above is
as follows:
Type-Code (1 octet):
code identifying this capability (TBD)
Length (1 octet):
The required portion of the Prefix limit TLV is 28 octets and
includes the type-code, length, AFI, SAFI, sub-codes 1-3, and the
option length field. The optional portion of the Prefix limit TLV is
variable in length. If the total length exceeds 254 octets, the
length byte is set to 255 and the length is determined by adding 28
to the number of octets given in the option length field.
Address Family Identifier AFI (2 octets):
This along with the Subsequent Address Family Identifier field
identifies the Network Layer Protocol associated with the Network
Address.
Subsequent Address Family Identifier SAFI (1 octet):
This along with the Address Family Identifier field identifies the
Network Layer Protocol associated with the Network Address.
sub code 1 (1 octet):
It is used to identify the number of routes sent before raising
warning. This is done by the BGP speaker that detects it.
Warning Indicator (1 octet):
This octet can be assigned a value of 0, 1 or 2.
A value of 0 means that the sender SHOULD NOT raise any warning. The
warning mechanisms are described in the operation section of this
draft. A value of 1 means the warning indication is necessary and
SHOULD be used by the sender when the number of routes it has
advertised reaches the warning prefix limit. If a BGP information
message is supported (such as the BGP INFORM), a value of 2
indicates that such a BGP message will be transmitted to the remote
peer if the route advertisement limit is hit.
Warning prefix limit (4 octets):
Number of routes sent by the BGP sender. The value for this field is
dependent on the maximum prefix limit and SHOULD always be less than
it.
sub code 2 (1 octet):
It is used to identify the number of routes sent before the sender
BGP speaker needs to stop advertising routes to its receiving BGP
speaker.
Stop Advertisement action (1 octet):
This octet can be set to 0, 1 or 2.
Setting the value to 0 means the BGP speaker will ignore any routes
sent after the stop advertisement limit. Setting the value to 1 means
that the route advertisement MUST be stopped by the speaker when the
route advertisement limit is hit. It is implicit that whichever
speaker encounters the situation will stop advertisement to its peer.
If a BGP information message is supported (such as the BGP INFORM),
a value of 2 indicates that such a BGP message will be transmitted to
the remote peer if the route advertisement limit is hit.
maximum prefix limit (4 octets):
Number of routes sent by the sender BGP speaker.
sub code 3 (1 octet):
It is used to identify the number of routes received after which the
BGP speaker will reset the peering session. Note that this situation
will never be encountered if both peers adhere to this draft. In
other words, it happens only during error conditions. The error
conditions are beyond the scope of this document.
reset peering action (1 octet):
This field can be set to 0, 1 or 2.
If the field is zero, the BGP speaker will reset the peering session
if the route sent to the peer exceeds the reset prefix limit. If the
field is 1, the BGP peer will reset the peering session and hold it
down until a manual restart occurs. If the field is 2, the BGP peer
will reset the peering session via mechanisms such as soft-notify.
reset prefix limit (4 octets):
Number of routes sent by the sender BGP speaker. The value for this
field is dependent on the maximum prefix limit and SHOULD always be
greater than it.
optional parameter length (2 octets):
The value of this optional variable length is 13 octets plus the
additional 29 bits of reserve field. This value can change when more
sub codes are added.
sub code 4:
The BGP speaker uses this sub-code to indicate to its peer the
current count of the routes it received from it.
current Rx routes:
Number of routes received by the BGP speaker from its peer. The value
of this field SHOULD always be less than or equal to the maximum
prefix limit configured to receive from the peer.
sub code 5:
The BGP speaker uses this sub-code to indicate to its peer the
current count of the routes sent to it.
current Tx routes:
Number of routes sent by the BGP speaker to its peer. The value of
this field SHOULD always be less than or equal to the maximum prefix
limit it received from the peer in the capability.
sub code 6:
The BGP speakers use this sub-code to indicate a prefix-length based
set of limits: (warning limit, stop advertisement limit, and reset
limit). The field carries an action flag that indicates actions that
occur for all prefixes that hit limits, and the limits per length of
the prefix. An example of a length of a prefix is length 19 for all
/19 routes. All /19 routes will have a warning limit, a stop
advertisement limit and a reset limit. Only 1 sub-code 6 parameter
may be in Prefix limit TLV.
prefix length-1:
The length (in bits) of the prefix group.
action-flags for prefix length-1:
The action flag octet carries the set of action flags for all
prefixes of this length in the following bit pattern:
0x00WWSSRR
The WW bits can be set to the warning indicator values (0,1,2)
indicated in sub-code 1. The SS bits can be set to the stop
advertisement action values (0,1,2) indicated in sub-code 2. The RR
bits can be set to the reset action values (0,1,2) indicated in
sub-code 3.
warning prefix limit for prefix length-1:
The warning limit for the prefix length-1.
stop advertisement limit for prefix length-1:
The stop advertisement prefix limit for prefix length-1.
reset peering limit for prefix length-1:
The reset peering route limit for the prefix of length-1.
sub code 7:
Sub-code 7 allows the 3 basic prefix limits to be specified for the
set of prefixes matching the ORFs. Multiple sub-code 7 TLVs may be in
a Prefix TLV.
Action flags for ORF:
The action flag definitions are the same as for the action-flag for
sub-code 6 (prefix length).
warning indicator prefix limit for ORF match:
The warning indicator prefix limit for any prefix that matches the
ORF filter.
stop advertisement prefix limit for ORF match:
The stop advertisement prefix limit for any prefix that matches the
ORF filter.
reset peering prefix limit for ORF match:
The reset peering prefix limit for any prefix that matches the ORF
filter.
We refer to the warning prefix limit, maximum prefix limit and the
reset prefix limit as prefix limits in this document for the ease of
illustration.
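The action-flag octet and the per-prefix-length entry of sub-code 6
can be sketched as follows. The sketch reads the pattern 0x00WWSSRR
as one octet with three 2-bit fields, most significant bits first;
that reading is an assumption, as are the function names. The check
on prefix length 0 follows the restriction on the 0/0 prefix length
stated in section 3.4.

```python
import struct


def pack_action_flags(warn, stop, reset):
    """Build the 00WWSSRR octet from three action values (each 0, 1 or 2)."""
    for value in (warn, stop, reset):
        if value not in (0, 1, 2):
            raise ValueError("action values are 0, 1 or 2")
    return (warn << 4) | (stop << 2) | reset


def unpack_action_flags(octet):
    """Split the 00WWSSRR octet back into (warn, stop, reset)."""
    return (octet >> 4) & 0x3, (octet >> 2) & 0x3, octet & 0x3


def encode_length_entry(prefix_len, actions, limits):
    """Pack one sub-code 6 entry: length, flags, and three 4-octet limits."""
    if prefix_len == 0:
        raise ValueError("the 0/0 prefix length is not allowed in sub-code 6")
    flags = pack_action_flags(*actions)
    return struct.pack("!BBIII", prefix_len, flags, *limits)
```

An entry for all /19 routes with limits of 100, 200 and 300 would be
encode_length_entry(19, (1, 1, 0), (100, 200, 300)), which occupies 14
octets.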
3.3. Carrying Prefix limits in the Open Capabilities
The BGP OPEN capabilities field uses triples of the form
<Capability Code, Capability Length, Capability Value>, where each
triple is encoded as shown below:
+------------------------------+
| Capability Code (1 octet) |
+------------------------------+
| Capability Length (1 octet) |
+------------------------------+
| Capability Value (variable) |
+------------------------------+
The BGP Maximum Prefix Capability value is to be assigned by IANA.
3.4. Interaction between sub-codes 6-7 and sub-codes 1-3
Within the TLV, if sub-code 6 or sub-code 7 are specified, these
cannot specify the 0/0 prefix length or an ORF match that matches all
routes.
3.5. Carrying Maximum Prefix Limits in the Dynamic Open Capabilities
The BGP Dynamic Capability is carried in the Capability message
(Message type 6), and uses the following fields:
+------------------------------+
| Action (1 octet) |
+------------------------------+
| Capability Code (1 octet) |
+------------------------------+
| Capability Length (1 octet) |
+------------------------------+
| Capability Value (variable) |
+------------------------------+
An Action code of "0" in a dynamic capability adds the maximum prefix
limits specified in the TLV for the corresponding AFI/SAFI. The
Action code of "1" removes the prefix limits for a particular
AFI/SAFI. An Action Code of "0" followed by an action code of "0"
writes over the required fields, and provides an exclusive OR of the
optional fields.
3.6. Carrying Maximum Prefix in ORF Match Field in BGP Route Refresh
+--------------------------------------------------+
| Address Family Identifier (2 octets) |
+--------------------------------------------------+
| Reserved (1 octet) |
+--------------------------------------------------+
| Subsequent Address Family Identifier (1 octet) |
+--------------------------------------------------+
| When-to-refresh (1 octet) |
+--------------------------------------------------+
| ORF Type = Maximum Prefix (08) |
+--------------------------------------------------+
| Length of ORFs (2 octets) |
+--------------------------------------------------+
| First Maximum Prefix ORF sub-code (TLV 1-7) |
+--------------------------------------------------+
+--------------------------------------------------+
| Second Maximum Prefix ORF sub-code (TLV 1-7) |
+--------------------------------------------------+
...
+--------------------------------------------------+
| Nth Maximum Prefix ORF sub-code (TLV 1-7) |
+--------------------------------------------------+
ORF entries are carried in the BGP ROUTE-REFRESH message [BGP-RR]. A
single ROUTE-REFRESH message could carry multiple ORF entries, as
long as all these entries share the same AFI/SAFI.
From the encoding point of view each ORF entry consists of a common
part and type-specific part. The common part consists of <AFI/SAFI,
ORF-Type, Action, Match>.
The "When-to-refresh" field in the route can be one of IMMEDIATE
(0x01) or DEFER (0x02), the semantics and operation of which are
described in [BGP-CRF]. Following this field is a collection of one
or more ORFs, grouped by ORF-Type. The Maximum Prefix ORF type ORF
field can be intermixed with other ORF fields. If the ORF field is
specific to the Maximum Prefix field, the ORF (sub-code 7) should be
utilized to specify the ORF field.
The ORF-Type component is encoded as a one-octet field. The value 0
is reserved. The values currently proposed to be assigned are:
1. reserved (00)
2. Community (02)
3. Extended Community (03)
4. AsPath (xx)
5. Prefix (64)
6. Maximum Prefix (08)
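The field order shown in the diagram of this section can be sketched
as a small encoder. This is a sketch under assumptions: the ORF type
value (08) is read as decimal 8, the "Length of ORFs" field is assumed
to count only the sub-code TLVs that follow it, and the function name
is hypothetical. The sub-code TLVs are passed in as already-encoded
byte strings.

```python
import struct

IMMEDIATE = 0x01            # When-to-refresh values named in this section
DEFER = 0x02
ORF_TYPE_MAX_PREFIX = 8     # proposed value "(08)", read here as decimal 8


def encode_max_prefix_orf(afi, safi, when_to_refresh, sub_tlvs):
    """Pack AFI, reserved, SAFI, when-to-refresh, ORF type, length, TLVs."""
    payload = b"".join(sub_tlvs)
    header = struct.pack(
        "!HBBBBH",
        afi,                  # Address Family Identifier (2 octets)
        0,                    # Reserved (1 octet)
        safi,                 # Subsequent Address Family Identifier
        when_to_refresh,      # IMMEDIATE or DEFER
        ORF_TYPE_MAX_PREFIX,  # ORF Type = Maximum Prefix
        len(payload),         # Length of ORFs (2 octets)
    )
    return header + payload
```

The fixed part is 8 octets; each Maximum Prefix sub-code TLV is
appended after it in the order given.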
3.7. Carrying the Maximum Prefix in a Soft Notify [BGP-SOFT-NOTIFY]
0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| AFI |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SAFI |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type-code |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sub-code |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Variable Data TLV |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The type code of 3 will indicate that a prefix maximum has been
exceeded. The sub-code will indicate which type of prefix maximum has
been exceeded. The value of <1> will indicate a warning prefix
maximum, the value of <2> will indicate that a stop advertisement
prefix maximum has been exceeded, and the value of <0> will indicate
that a reset peering prefix maximum has been exceeded.
The length specifies the length of the optional portion of the soft-
notify. The variable portion of the soft-notify SHOULD contain the
required fields of the Maximum prefix field. The variable Data TLV
MAY also contain the optional fields.
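A minimal sketch of the Soft Notify layout above follows. It assumes
that every fixed field, including SAFI, occupies 16 bits as the
diagram draws them, and that the length field counts the octets of
the variable data; both points are readings of the diagram rather
than statements in [BGP-SOFT-NOTIFY]. The names are hypothetical.

```python
import struct

SOFT_NOTIFY_TYPE_PREFIX_MAX = 3  # type code given in this section

# Sub-code values given in this section.
SOFT_NOTIFY_SUBCODE = {"reset": 0, "warning": 1, "stop": 2}


def encode_soft_notify(afi, safi, which, data=b""):
    """Pack AFI, SAFI, type-code, sub-code, length, then the variable data."""
    header = struct.pack(
        "!HHHHH",
        afi,
        safi,
        SOFT_NOTIFY_TYPE_PREFIX_MAX,
        SOFT_NOTIFY_SUBCODE[which],
        len(data),
    )
    return header + data
```

With no variable data, the message body is the 10-octet fixed part.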
4. Operation
4.1 Exchanging the configured prefix limits
BGP speakers exchange the prefix limits as an optional capability
parameter [BGP-CAP] as described in section 3.
+--------+ +--------+
| A | <-----------------> | B |
+--------+ +--------+
Figure 1
In figure 1 both BGP speakers A and B exchange the prefix limits to
indicate the support for this capability. Each of A and B set the
warning prefix limit, maximum prefix limit and reset prefix limit
along with the actions associated with each of them in the capability
message before exchanging them. The warning prefix limit and reset
limit values are determined based on the configured maximum prefix
limit. They are typically a percentage value of the maximum prefix
limit. The exact percentage values are beyond the scope of this
document. The maximum prefix limit configured on A for the peer B
is the maximum number of prefixes that A expects to receive from B.
B is informed of this limit in the new capability described in
section 3. The same interpretation applies to B too.
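Deriving the warning and reset limits from the configured maximum can
be sketched as below. The percentages are hypothetical defaults
chosen only for illustration, since the exact values are out of scope
for this document. The sketch does enforce the two ordering rules
from section 3.2: the warning limit stays below the maximum and the
reset limit stays above it.

```python
def derive_limits(max_prefixes, warn_pct=80, reset_pct=120):
    """Return (warning, maximum, reset) limits from a configured maximum.

    warn_pct and reset_pct are illustrative assumptions, not values
    taken from the draft.
    """
    if max_prefixes < 1:
        raise ValueError("maximum prefix limit must be at least 1")
    # Keep the warning limit strictly below the maximum.
    warning = min(max_prefixes * warn_pct // 100, max_prefixes - 1)
    # Keep the reset limit strictly above the maximum.
    reset = max(max_prefixes * reset_pct // 100, max_prefixes + 1)
    return warning, max_prefixes, reset
```

With a configured maximum of 2000 and these assumed percentages, the
derived limits are 1600, 2000 and 2400.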
4.2 Dynamic Capability Reset of the Capability
Dynamic Capabilities can set a BGP speaker's maximum prefix values
(warning indicator, stop advertisement, and reset peering values) to
values different from those initially negotiated via the OPEN
Capabilities.
The exact mechanisms for the decision to reset the values are outside
the scope of this specification. Figure 2 indicates how the dynamic
capability can be utilized when a prefix limit is detected by the BGP
speaker.
+--------+ +--------+
| A | <-----------------> | B |
+--------+ +--------+
B detects warning
prefix limit
<------ generates dynamic
capability message
to A
Figure 2
4.2.1 Dynamic Capability use of Sub-code 4 (Current Received Route)
and Sub-Code 5 (Current Transmit Routes)
Sub-code 4 (Current Received Routes) and sub-code 5 (Current Transmit
Routes) provide information to the BGP speakers which aids in
preventing peer disruption. Figure 2 demonstrates the case where BGP
speaker A and B maintain a count of the routes they receive from each
other. Route processing operation is illustrated using the case where
B sends route advertisements to A. (The same operational procedures
apply for the other case of A sending route advertisements to B.)
B, as shown in figure 1, applies the outbound route policies to the
Adjacent-Rib-Out and then checks the prefix limits before advertising
routes. Upon hitting the warning indicator prefix limit, BGP speaker
B sends a Dynamic Capability message to A with 5 sub-codes: warning
indicator (sub-code 1), stop advertisement (sub-code 2), reset
peering indicator (sub-code 3), Current Receive Routes (sub-code 4),
and Current Transmit Routes (sub-code 5). The additional sub-codes,
i.e. 4 and 5, provide information that assists the network
administrators in prioritizing the handling of the warning. For
example, suppose the limits are 1000 routes for warning, 2000 for
stop advertisement, and 3000 for reset peering, and the current
routes number 1010. The network operator can then deduce that the
received routes are well within the tolerance limit, i.e. sub-code
2. If instead, for the same limits (1000, 2000, 3000), the current
received routes (by speaker B) number 1900, the network operator may
want to investigate the customer changes.
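The reasoning in the example above can be sketched as two small
helpers: one reports the highest limit that the current count has
reached, the other reports how many routes remain before the stop
advertisement limit. The function names are hypothetical, and
treating a count equal to a limit as having hit it is an assumption.

```python
def highest_limit_reached(current, warning, stop, reset):
    """Return "reset", "stop", "warning", or None for the current count."""
    if current >= reset:
        return "reset"
    if current >= stop:
        return "stop"
    if current >= warning:
        return "warning"
    return None


def headroom_before_stop(current, stop):
    """Routes that can still be sent before the stop limit is hit."""
    return max(stop - current, 0)
```

For limits of (1000, 2000, 3000), a count of 1010 and a count of 1900
both sit in the warning range, but the first leaves 990 routes of
headroom while the second leaves only 100, which is what would prompt
the operator to investigate.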
In figure 2, in the course of its route advertisements to A, B
generates a dynamic capability [BGP-DYN-CAP] destined for A
comprising sub-codes 1-5. B sends this message because it detects
the warning limit at the time of route advertisement, earlier than A
does. In other words, either A or B or both of them could generate
this message, depending on the timing of warning limit detection. B
and A MAY choose to raise an internal warning when this condition is
detected. Following the warnings, both A and B continue advertising
routes normally to each other.
If B determines that the prefix limits can be increased, the BGP
speaker MAY send these changed values in the Dynamic Capability along
with sub-codes 1-5.
In figure 3, B detects during route advertisement that the maximum
prefix limit for route advertisement is reached. It SHOULD stop
further route advertisements to A. In other words, in this condition
B SHOULD implicitly treat the announce policy to A as stop/deny. B
then SHOULD send a Dynamic Capability [BGP-DYN-CAP] to A indicating
the current Receive and Transmit routes (sub-code 4 and sub-code 5).
As in the case of the warning prefix limit condition, either A or B
or both could send the dynamic capability [BGP-DYN-CAP]. Any route
withdrawal to A is automatically recorded and SHOULD implicitly
result in restoring the announce policy to the configured one (if
any is configured). This helps in preserving the incremental nature
of the protocol and in avoiding the processing, by peers such as B,
of routes which get discarded by speakers such as A when the limit
is reached. In addition, network bandwidth consumption by the route
UPDATES can be avoided. It is expected that conformance to this
document will not lead to any further route advertisements to A by B
unless there exists an unforeseen error. In such a situation A can
reset the peering session as indicated in the maximum prefix limit
sent to B during the capability negotiation.
+--------+ +--------+
| A | <-----------------> | B |
+--------+ +--------+
B detects stop advertisement
maximum prefix limit
and generates dynamic
<------ capability message
to provide additional
information.
Figure 3
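The sender-side behaviour around figures 2 and 3 can be sketched as a
small state holder for one peer. This is an illustration under
assumptions, not a prescribed implementation: the class and attribute
names are hypothetical, routes are plain strings, and "sending" is
modelled only by membership in a set. Holding back routes at the stop
limit and releasing them after a withdrawal is one possible reading
of "restoring the announce policy".

```python
class PeerAdvertiser:
    """Tracks what B has advertised to A against A's limits."""

    def __init__(self, warning, stop):
        self.warning = warning
        self.stop = stop
        self.sent = set()      # prefixes currently advertised to the peer
        self.held_back = []    # prefixes suppressed at the stop limit
        self.warned = False    # set once the warning limit is reached

    def advertise(self, prefix):
        """Return True if the prefix is (or already was) advertised."""
        if prefix in self.sent:
            return True
        if len(self.sent) >= self.stop:
            # Stop limit reached: behave as if the policy were stop/deny.
            if prefix not in self.held_back:
                self.held_back.append(prefix)
            return False
        self.sent.add(prefix)
        if len(self.sent) >= self.warning:
            self.warned = True
        return True

    def withdraw(self, prefix):
        """Record a withdrawal and release held-back routes that now fit."""
        self.sent.discard(prefix)
        while self.held_back and len(self.sent) < self.stop:
            self.sent.add(self.held_back.pop(0))
```

In a real speaker, the points where warned becomes true and where a
route is first held back are where the Dynamic Capability message
with sub-codes 4 and 5 would be generated.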
4.2.2 Prefix limit changes Utilizing Dynamic Capabilities
If a need for a prefix limit change arises, each BGP speaker A whose
configuration changes for its peer B SHOULD dynamically inform the
corresponding peer of this change [BGP-DYN-CAP]. Such changes
SHOULD be handled as described in the following sub-sections.
4.2.2.1 Processing when maximum prefix limit is increased
When the prefix limits are increased in the configuration of A, in
figure 1, it SHOULD inform B about it as described in 4.2. B SHOULD
then restart the route advertisements and it MAY either choose to do
so from the Adjacent-Rib-Out for A incrementally or make use of Route
Refresh mechanism [BGP-RREFRESH], if it has stopped because of
reaching the maximum prefix limit. The former methodology is similar
to the approach taken prior to the introduction of Route Refresh. In
other words it can be handled in the way policy changes were handled
prior to the availability of Route Refresh mechanism, with a minor
change of just sending the routes that were rejected due to the
prefix limit. In doing so, the restart of BGP peering, and the
network traffic and service disruption associated with it, are
avoided. If the maximum prefix limit is not reached and increased
prefix limits are received by the peer B, then peer B SHOULD note
this and continue with its advertisements to A until these limits are
reached.
4.2.2.2 Processing when the maximum prefix limit is decreased
When the prefix limits are decreased in the configuration of A (refer
to figure 1), B SHOULD be informed about it as described in 4.2. B
then SHOULD note this information and SHOULD stop route advertisement
immediately if the number of route advertisements exceeds this new
maximum prefix limit for A. By doing so B can avoid processing the
routes which will be discarded by A when it detects the maximum
prefix limit condition. A does this even before adding the routes to
its Adjacent-Rib-In for the peer or in some cases restarting of the
peering session. Additionally, network bandwidth consumption by the
routing UPDATES can be avoided this way. B at that point follows the
process described in 4.2 for route processing.
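The two cases in 4.2.2.1 and 4.2.2.2 can be sketched with one helper
that B might run when it learns a new maximum prefix limit from A.
The helper and its arguments are hypothetical; it models only the
incremental approach, in which B sends the routes that were
previously held back because of the limit, and it does not model the
Route Refresh alternative.

```python
def apply_new_limit(sent_count, held_back, new_stop):
    """Return (routes_to_send_now, routes_still_held_back).

    sent_count: routes B has already advertised to A.
    held_back:  routes B suppressed because of the old limit, in order.
    new_stop:   the new maximum prefix limit received from A.
    """
    room = max(new_stop - sent_count, 0)
    return held_back[:room], held_back[room:]
```

If the limit is raised from 2000 to 2002 while three routes are held
back, two of them can be sent at once. If the limit is lowered below
the number already sent, nothing further is sent and B simply stops
advertising.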
4.3 ORF based processing
The ORF filters can be carried either in the dynamic capability or in
the Route Refresh message. The processing of the Route Refresh and
ORF is described in 3.6 and [BGP-CRF].
4.4 Soft Notify processing
Soft Notify processing is restricted at first to sub-codes 1-5. Use
of sub-codes 6 and 7 in Soft Notify is left for future study.
4.5 Prefix Length based limits processing
All of the operational procedures described in section 4.1 through
4.4 are applicable to the negotiated prefix length based limits.
5. Error Handling
The Maximum prefix TLV can be sent in an OPEN (Message 1), a Route
Refresh (message 5), or a capability (message 6). The sections below
define the error codes and sub-codes related to these messages for the
maximum prefix draft.
5.1 Open Message responded to with Notification
OPEN messages can be rejected for the listed unsupported capabilities
by the BGP speakers. The error code for an open message negotiation
of Capabilities is sub-code 7 [BGP-CAP]. The maximum prefix TLV will
be included in the list of capabilities.
5.2 Route Refresh caused Notification Errors
[ROUTE-REFRESH] does not specify error messages associated with the
Route-Refresh processing.
5.3 Capability Message responded to with a Notification Error
For errors in Dynamic Capabilities, a NOTIFICATION message may be
sent with the Capability message error code (7) [BGP-DYNCAP] set.
The current sub-codes for this error code are:
Subcode Symbolic Name
1 Invalid Action Value
2 Invalid Capability Length
3 Malformed Capability Value
4 Unsupported Capability Code
Support for the Maximum Prefix value negotiations will require the
addition of the following sub-code:
5 Invalid Capability Value
If the Maximum Prefix code is not supported, the NOTIFICATION message
will be returned with an error code of 7 and a sub-code of 4
(Unsupported Capability Code). If the Maximum Prefix Capability is
supported, but the value is not acceptable to the receiving node, the
NOTIFICATION can be sent with sub-code 5 (Invalid Capability Value)
and the data field set to the Maximum Prefix TLVs that are not
acceptable.
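The choice between the two sub-codes can be sketched as follows. The
function name and its boolean arguments are hypothetical. The draft
does not say what the data field holds for an unsupported capability
code, so the sketch leaves it empty as an assumption.

```python
CAPABILITY_MESSAGE_ERROR = 7       # error code for Capability messages
UNSUPPORTED_CAPABILITY_CODE = 4    # existing sub-code
INVALID_CAPABILITY_VALUE = 5       # sub-code proposed in this section


def notification_for(supported, acceptable, rejected_tlvs=b""):
    """Return (error_code, sub_code, data) or None if no error applies."""
    if not supported:
        # Data field contents are not specified here; empty by assumption.
        return CAPABILITY_MESSAGE_ERROR, UNSUPPORTED_CAPABILITY_CODE, b""
    if not acceptable:
        # Data carries the Maximum Prefix TLVs that were not acceptable.
        return CAPABILITY_MESSAGE_ERROR, INVALID_CAPABILITY_VALUE, rejected_tlvs
    return None
```

A receiver that supports the capability but rejects the offered
limits would therefore answer with error code 7, sub-code 5, and the
offending TLVs in the data field.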
5.4 Cease message for peering reset
When the reset maximum prefix value is exceeded, the peering session
SHOULD be dropped, in which case the CEASE code in the NOTIFICATION
message will be used. The proposed BGP draft [CEASECODE] gives a
subcode of 1 for a Maximum prefix exceeded condition. The data field
has a maximum prefix upper bound. This field should have an optional
1 octet field that allows maximum prefix sub-codes to be encoded
beyond this field.
6. Usage in Current Service Providers
We provide an example to illustrate a typical Service Provider's (SP)
practice with maximum prefix limit. Providers can set one of three
levels: Warning, Stop and Reset. This section provides an example
of setting two limits (warning, stop/reset) versus three limits
(warning, stop, reset).
6.1. Two limits (warning, stop/reset)
The provider may set two levels of threshold on the BGP receivers at
the network edge: a low water mark as the warning threshold and a
high water mark as the stop/reset level. The high water mark is
intended to quickly detect and stop a misconfigured router sending a
full blast of Internet routes. However, the high water mark may also
be exceeded in VPN clients by only a few routes as the routing tables
grow. Let's examine why this is problematic.
When the warning threshold is triggered, SNMP traps are transmitted
by the SP's BGP receiver (router) to the SP's management system. The
operator needs to contact the customer upon receiving the trap. When
the stop/reset threshold with maximum prefix limit is reached, the
BGP session may be dropped by the BGP receiver. Again it would
generate traps on the provider side. (Some implementations may not
drop the session, but drop the customer's routes or prefixes
silently.) Then the operator needs to work with the customer to
correct the problem and restart the session.
There are several issues around using only 2 levels. First, the
provider has to prove to the customer that the session drop was due
to the customer violating the agreed maximum prefix limit rather than
to the operator's network condition. Keeping the warning traps may
help, but session error codes specifying that the maximum prefix was
exceeded will aid in identifying the reason for the BGP session
drop.
Second, the operator has to work with the customer to locate the
root cause and, most likely, manually bring the BGP peering session
back up at an agreed time. This is labor intensive for both the
operator and the customer. If the stop/reset limit catches an upswing
in VPN traffic from a site, the operator and customer must work in
crisis mode to resolve the growth. The customer may be more unhappy
about session drops caused by growth than about those caused by
misconfiguration.
For these reasons, a common practice today is for a provider not to
use the maximum prefix limit feature for its Internet services, in
order to avoid these complications. The same provider may still
choose to use the maximum prefix limit feature on customer
connections in its MPLS VPN services, because of the edge device
resource management needs that are particular to VPN services.
The issues of where to use and not to use the maximum prefix limit
feature are beyond the scope of this draft.
6.2. Three limits (warning, stop, reset)
In this draft, we are promoting a proactive approach to dealing with
maximum prefix limit issues. With reference to the example above on
the relation of provider and customer edge devices (BGP senders), we
propose that both the customer and the provider participate in
setting these three levels of thresholds: warning, stop, and reset,
and reacting to the resulting warnings, traps or error messages.
Any time a threshold is set or changed on either side, it is
communicated to the remote side via BGP signaling, and both sides
communicate dynamically whenever an unexpected event triggers any of
the threshold levels.
The warning level triggers a warning on both the provider and the
customer edge devices, so the customer should act on it without
waiting for the provider to call.
The second level triggers the customer edge device to stop sending
routes, as it is reaching the agreed maximum prefix limit. This may
also result in traps being issued on both the customer and provider
sides. The idea is to have the customer take action to fix the
problem without dropping the session, thereby requiring less human
intervention from the provider side.
The third level triggers the session drop action from the provider
side. This is used as a safeguard for the provider's network in case
the customer edge device does not behave as expected and continues to
send routes after exceeding the second level threshold.
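The three levels described above can be sketched as a simple
classification of the current prefix count. This is a minimal
illustration, not a normative procedure. The names Action,
PrefixLimits and classify are hypothetical, and the choice to trigger
the warning and stop levels when the threshold is reached, but the
reset level only when it is exceeded, is an assumption made for the
example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    WARN = "warn"    # warning raised on both provider and customer edge
    STOP = "stop"    # customer edge (BGP sender) stops sending routes
    RESET = "reset"  # provider edge (BGP receiver) drops the session


@dataclass(frozen=True)
class PrefixLimits:
    warning: int
    stop: int
    reset: int

    def __post_init__(self):
        # The levels only make sense if they are ordered.
        if not (0 < self.warning <= self.stop <= self.reset):
            raise ValueError("expected 0 < warning <= stop <= reset")


def classify(prefix_count, limits):
    """Map a prefix count to the most severe level it triggers."""
    if prefix_count > limits.reset:      # assumed: reset on exceeding
        return Action.RESET
    if prefix_count >= limits.stop:      # assumed: stop on reaching
        return Action.STOP
    if prefix_count >= limits.warning:   # assumed: warn on reaching
        return Action.WARN
    return Action.NONE


if __name__ == "__main__":
    limits = PrefixLimits(warning=800, stop=1000, reset=1200)
    for count in (500, 800, 1000, 1201):
        print(count, classify(count, limits).value)
```

Because both peers know the same three values, each side can run the
same classification locally: the sender reacts to the warning and
stop levels, and the receiver acts only at the reset level.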
We believe this feature can help both providers and customers to
proactively manage their BGP connections through dynamic signaling,
monitoring and corrective action before any drastic action is
necessary. In many cases, this can help avoid service interruption,
avoid finger-pointing when sessions are dropped, lower operational
cost, and increase customer satisfaction. In general, this feature
can be applied to provider-to-provider peering connections as well,
with similar advantages.
7. Security Considerations
This document does not change the underlying security issues in the
BGP protocol. It does, however, provide an additional mechanism to
protect against denial-of-service attacks based on exceeding
configured maximum prefix limits.
8. References
[BGP-4] Rekhter, Y., and T. Li, "A Border Gateway Protocol 4 (BGP-
4)", draft-ietf-idr-bgp4-20.txt. Work in progress.
[BGP-CAP] Chandra, R., Scudder, J., "Capabilities Advertisement with
BGP-4", RFC 3392, November 2002.
[BGP-RREFRESH] Chen, E., "Route Refresh Capability for BGP-4", RFC
2918, September 2000.
[BGP-DYN-CAP] Chen, E., Sangli, S. R., "Dynamic Capability for BGP-
4", draft-ietf-idr-dynamic-cap-03.txt. Work in progress.
[BGP-STUDY] Chang, D., Govindan, R., Heidemann, J., "An Empirical
Study of Router Response to Large BGP Routing Table Load", ACM
SIGCOMM Internet Measurement Workshop, pp. 203-208, Marseille,
France, November 2002.
[BGP-CRF] Chen, E., Rekhter, Y., "Cooperative Route Filtering
Capability for BGP-4", draft-ietf-idr-route-filter-08.txt. Work
in progress.
[CEASECODE] Chen, E., "Subcodes for BGP Cease Notification Message",
draft-ietf-idr-cease-subcode-05.txt. Work in progress.
[BGP-SOFT-NOTIFY] Nalawade, G., Patel, K., Scudder, J., Ward, D.,
"BGPv4 Soft-Notification Message",
draft-nalawade-bgp-soft-notify-00.txt. Work in progress.
9. IANA Considerations
This document uses a new capability type for the support of prefix
limits and a corresponding NOTIFICATION code, along with the
sub-codes for non-support. These must be assigned by IANA.
10. Acknowledgements
The authors would like to thank George Matey, Marten Terpstra, Yakov
Rekhter, Enke Chen, Rob Thomas, Manish Gupta, Dan Joyal, Rajesh
Saluja and Elwyn Davies for their review and comments.
11. Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
12. Authors' Addresses
Srikanth Chavali
Vasile Radoaca
Paul Knight
Nortel Networks
600 Technology Park Drive
Billerica, MA 01821 USA
Email: schavali@nortelnetworks.com
Email: vasile@nortelnetworks.com
Email: paul.knight@nortelnetworks.com
Mo Miri
BellSouth
575 Morosgo Drive
4A62
Atlanta, GA 3032
home: +1 404-499-5526
email: mohammad.miri@bellsouth.com
Luyuan Fang
ATT Labs
200 Laurel Avenue,
Room C2-3B35,
Middletown, NJ 07748
Phone: +1 732 420 1921
Email: luyuanfang@att.com
Susan Hares
NextHop Technologies
825 Victors Way
Suite 100
Ann Arbor, MI 48108
Phone: +1 734 222 1610
Email: skh@nexthop.com