<?xml version="1.0"?>

<!DOCTYPE rfc  SYSTEM "rfc2629.dtd" [
    <!ENTITY rfc793 PUBLIC ''
	'http://xml.resource.org/public/rfc/bibxml/reference.RFC.0793.xml'>
    <!ENTITY rfc2119 PUBLIC ''
	'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
    <!ENTITY rfc4271 PUBLIC ''
	'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4271.xml'>
    <!ENTITY rfc4760 PUBLIC ''
	'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4760.xml'>
    <!ENTITY rfc4364 PUBLIC ''
	'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4364.xml'>
    <!ENTITY rfc4360 PUBLIC ''
	'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4360.xml'>
    <!ENTITY rfc5226 PUBLIC ''
	'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5226.xml'>
]>

<!--

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY RFC3552 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">
<!ENTITY RFC4271 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4271.xml">
<!ENTITY RFC5226 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5226.xml">
<!ENTITY RFC4272 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4272.xml">
<!ENTITY RFC4760 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4760.xml">
<!ENTITY I-D.narten-iana-considerations-rfc2434bis SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.narten-iana-considerations-rfc2434bis.xml">
]>

-->

<?rfc toc="yes" ?>
<!-- <?rfc strict="yes"?> -->
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>

<rfc category="std" docName="draft-ietf-idr-flow-spec-05" ipr="trust200811">

<front>

<title abbrev="flow-spec">
Dissemination of flow specification rules
</title>

<author initials="P" surname="Marques" fullname="Pedro Marques">
	<organization>Juniper Networks</organization>
	<address>
		<postal>
			<street>1194 N. Mathilda Ave.</street>
			<city>Sunnyvale</city> 
			<region>CA</region>
			<code>94089</code>
			<country>US</country>
		</postal>
		<email>roque@juniper.net</email>
	</address>
</author>

<author initials="N" surname="Sheth" fullname="Nischal Sheth">
	<organization>Juniper Networks</organization>
	<address>
		<postal>
			<street>1194 N. Mathilda Ave.</street>
			<city>Sunnyvale</city> 
			<region>CA</region>
			<code>94089</code>
			<country>US</country>
		</postal>
		<email>nsheth@juniper.net</email>
	</address>
</author>

<author initials="R" surname="Raszuk" fullname="Robert Raszuk">
	<organization>Juniper Networks</organization>
	<address>
		<postal>
			<street>1194 N. Mathilda Ave.</street>
			<city>Sunnyvale</city> 
			<region>CA</region>
			<code>94089</code>
			<country>US</country>
		</postal>
		<email>raszuk@juniper.net</email>
	</address>
</author>

<author initials="B" surname="Greene" fullname="Barry Greene">
	<organization>Juniper Networks</organization>
	<address>
		<postal>
			<street>1194 N. Mathilda Ave.</street>
			<city>Sunnyvale</city> 
			<region>CA</region>
			<code>94089</code>
			<country>US</country>
		</postal>
		<email>bgreene@juniper.net</email>
	</address>
</author>

<author initials="J" surname="Mauch" fullname="Jared Mauch">
	<organization>NTT/Verio</organization>
	<address>
		<postal>
			<street>8285 Reese Lane</street>
			<city>Ann Arbor</city> 
			<region>MI</region>
			<code>48103-9753</code>
			<country>US</country>
		</postal>
<!--		<email>jmauch@verio.net</email>      -->
  		<email>jared@puck.nether.net</email>
	</address>
</author>

<author initials="D" surname="McPherson" fullname="Danny McPherson">
	<organization>Arbor Networks</organization>
	<address>
		<email>danny@arbor.net</email>
	</address>
</author>

<date month="January" year="2009" />
<area>Routing</area>
<workgroup>IDR Working Group</workgroup>
<keyword>I-D</keyword>
<keyword>Internet-Draft</keyword>
<keyword>IDR</keyword>


<abstract>

<t>
This document defines a new BGP NLRI encoding format that can be used
to distribute traffic flow specifications. This allows the routing
system to propagate information regarding more-specific components of
the traffic aggregate defined by an IP destination prefix. </t>

<t>
Additionally, it defines two applications of that encoding format:
one that can be used to automate inter-domain coordination of traffic
filtering, such as what is required in order to mitigate
(distributed) denial of service attacks, and a second one that provides
traffic filtering in the context of a BGP/MPLS VPN service. </t>

<t>
The information is carried via the Border Gateway Protocol (BGP), thereby
reusing protocol algorithms, operational experience and administrative
processes such as inter-provider peering agreements.</t>
</abstract>
</front>

<middle>
<section title="Definitions of Terms Used in this Memo">

<t>NLRI - Network Layer Reachability Information</t>
<t>RIB - Routing Information Base</t>
<t>Loc-RIB - Local RIB</t>
<t>AS - Autonomous System</t>
<t>VRF - Virtual Routing and Forwarding instance</t>
<t>PE - Provider Edge router</t>

</section>

<section title="Introduction">
<t>
Modern IP routers contain both the capability to forward traffic according
to aggregate IP prefixes and the capability to classify, shape, limit,
filter, or redirect packets based on administratively defined
policies. </t>

<t>
While forwarding information is, typically, dynamically signaled across
the network via routing protocols, there is no agreed upon mechanism to
dynamically signal flows across autonomous-systems. </t>

<t>
For several applications, it may be necessary to exchange control
information pertaining to aggregated traffic flow definitions which
cannot be expressed using destination address prefixes only. </t>

<t>
An aggregated traffic flow is considered to be an n-tuple consisting
of several matching criteria such as source and destination address
prefixes, IP protocol and transport protocol port numbers.</t>

<t>
The intention of this document is to define a general procedure to
encode such flow specification rules as a
<xref target="RFC4271">BGP</xref>
NLRI which can be reused
for several different control applications. Additionally, we define the
mechanisms required to apply this definition to the problem of
immediate concern to the authors: intra- and inter-provider
distribution of traffic filtering rules to filter (Distributed) Denial
of Service (DoS) attacks. </t>

<t>
By expanding routing information with flow specifications, the routing
system can take advantage of the ACL/firewall capabilities in the
router's forwarding path. Flow specifications can be seen as more
specific routing entries to a unicast prefix and are expected to
depend upon the existing unicast data information.</t>

<t>
A flow specification received from an external
autonomous-system will need to be validated against unicast routing
before being accepted. If the aggregate traffic flow defined by the
unicast destination prefix is forwarded to a given BGP peer, then the
local system can safely install more specific flow rules which result in
different forwarding behavior, as requested by this system.</t>

<t>
The choice of BGP as the carrier of this control information is
also justifiable by the fact that the key issues in terms of complexity are
problems which are common to unicast route distribution and have
already been solved in the current environment.</t>

<t>
From an algorithmic perspective, the main problem that presents itself
is the loop-free distribution of <key, attribute>
pairs from one originator to N ingresses.
The key, in this particular instance, is a flow specification.</t>

<t>
From an operational perspective, the utilization of BGP as the carrier
for this information allows a network service provider to reuse both
internal route distribution infrastructure (e.g., route reflector or
confederation design) and existing external relationships
(e.g., inter-domain BGP sessions to a customer network).</t>

<t>
While it is certainly possible to address this problem using other
mechanisms, the authors believe that this solution offers the substantial
advantage of being an incremental addition to deployed mechanisms.</t>

<t>
In current deployments, the information distributed by the flow-spec
extension is originated both manually and automatically by systems
that are able to detect malicious flows. When automated systems are used,
care should be taken to ensure their correctness and to limit the rate at
which they advertise flow routes. </t>

<t>
This specification defines the protocol extensions required to address the
most common applications of IPv4 unicast and VPNv4 unicast filtering. The
same mechanism can be reused, and new match criteria added, to address
similar filtering needs for other BGP address families (for example, IPv6
unicast). The authors believe that those would be best addressed in a
separate document.</t>

<t>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in
<xref target="RFC2119">RFC 2119</xref>.
</t>
</section>

<section title="Flow specifications">

<t>
A flow specification is an n-tuple consisting of several matching
criteria that can be applied to IP traffic. A given IP packet is said
to match the defined flow if it matches all the specified criteria. </t>

<t>
A given flow may be associated with a set of attributes, depending on
the particular application; such attributes may or may not include
reachability information (i.e., NEXT_HOP). Well-known or
AS-specific community attributes can be used to encode a set of predetermined
actions. </t>

<t>
A particular application is identified by a specific
<xref target="RFC4760">(AFI, SAFI) pair</xref>
and corresponds to a distinct set of RIBs. Those RIBs should be treated
independently from each other in order to assure non-interference
between distinct applications.</t>

<t>
BGP itself treats the NLRI as an opaque key to an entry in its
databases. Entries that are placed in the Loc-RIB are then associated
with a given set of semantics which is application dependent. This is
consistent with existing BGP applications. For instance, IP unicast routing
(AFI=1, SAFI=1) and IP multicast reverse-path information (AFI=1, SAFI=2)
are handled by BGP without any particular semantics being associated with
them until installed in the Loc-RIB.</t>

<t>
Standard BGP policy mechanisms, such as UPDATE filtering by NLRI
prefix and community matching, SHOULD apply to the newly defined
NLRI-type. Network operators can also control propagation of such
routing updates by enabling or disabling the exchange of a particular
(AFI, SAFI) pair on a given BGP peering session.</t>
</section>

<section title="Dissemination of Information">

<t>
We define a "Flow Specification" NLRI type that may include several
components such as destination prefix, source prefix, protocol, ports,
etc. This NLRI is treated as an opaque bit string prefix by BGP. Each
bit string identifies a key to a database entry
with which a set of attributes can be associated.</t>

<t>
This NLRI information is encoded using MP_REACH_NLRI and MP_UNREACH_NLRI
attributes as defined in <xref target="RFC4760">RFC4760</xref>.
Whenever the corresponding
application does not require Next Hop information, this shall be
encoded as a 0 octet length Next Hop in the MP_REACH_NLRI attribute
and ignored on receipt.</t>

<t>
The NLRI field of the MP_REACH_NLRI and MP_UNREACH_NLRI is encoded as
a 1 or 2 octet NLRI length field followed by a variable length NLRI
value. The NLRI length is expressed in octets.</t>

<figure title="flow-spec NLRI">
<artwork>
+------------------------------+
|    length (0xnn or 0xfn nn)  |
+------------------------------+
|    NLRI value  (variable)    |
+------------------------------+
</artwork>
</figure>

<t>
If the NLRI length value is smaller than 240 (0xf0 hex), the length
field can be encoded as a single octet. Otherwise, it is encoded as an
extended-length 2-octet value in which the most significant nibble of
the first byte is all ones.</t>
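
<t>
As an illustration only, the length rule above can be sketched in Python
as follows. The function names encode_nlri_length and decode_nlri_length
are hypothetical and are not defined by this document. The sketch assumes
that the 12 bits following the all-ones nibble carry the length, so that
the largest encodable value is 4095.</t>

<figure><artwork><![CDATA[
```python
from typing import Tuple

def encode_nlri_length(length: int) -> bytes:
    """Encode an NLRI length as 1 octet (< 240) or 2 octets (0xfn nn)."""
    if not 0 <= length <= 0x0FFF:
        raise ValueError("length does not fit in 12 bits")
    if length < 0xF0:
        return bytes([length])
    # Extended form: the top nibble of the first octet is all ones.
    return (0xF000 | length).to_bytes(2, "big")

def decode_nlri_length(buf: bytes) -> Tuple[int, int]:
    """Return (NLRI length, number of length octets consumed)."""
    first = buf[0]
    if first < 0xF0:
        return first, 1
    return ((first & 0x0F) << 8) | buf[1], 2
```
]]></artwork></figure>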

<t>
The Flow Specification NLRI-type consists of several optional subcomponents. A
specific packet is considered to match the flow specification when it
matches the intersection (AND) of all the components present in the
specification.</t>

<t>
The following component types are defined:
<list>

<t>
Type 1 - Destination Prefix
<list>
<t>
Encoding: <type (1 octet), prefix length (1 octet), prefix>
</t>
<t>
Defines the destination prefix to match. Prefixes are encoded as in
BGP UPDATE messages: a length in bits is followed by enough octets to
contain the prefix information.
</t>
</list>
</t>

<t>
Type 2 - Source Prefix
<list>
<t>
Encoding: <type (1 octet), prefix-length (1 octet), prefix>
</t>
<t>
Defines the source prefix to match.
</t>
</list>
</t>

<t>
Type 3 - IP Protocol
<list>
<t>
Encoding: <type (1 octet), [op, value]+>
</t>
<t> 
Contains a set of {operator, value} pairs that are used to match the IP
protocol value byte in IP packets.
</t>
<t>
The operator byte is encoded as:
</t>
</list>

<figure title="Numeric operator"><artwork align="center">
  7   6   5   4   3   2   1   0
+---+---+---+---+---+---+---+---+
| e | a |  len  | 0 |lt |gt |eq | 
+---+---+---+---+---+---+---+---+
</artwork></figure>

<list style="symbols">
<t>
End of List bit. Set in the last {op, value} pair in the list.
</t>

<t>
And bit. If unset the previous term is logically ORed with the current
one. If set the operation is a logical AND. It should be unset in the
first operator byte of a sequence. The AND operator has higher priority
than OR for the purposes of evaluating logical expressions.
</t>

<t>
The length of the value field for this operator is given as (1 << len).
</t>

<t>
lt - less than comparison between data and value.
</t>

<t>
gt - greater than comparison between data and value.
</t>

<t>
eq - equality between data and value.
</t>

<t>
The bits lt, gt, and eq can be combined to produce "less or
equal", "greater or equal" and inequality values.
</t>
</list>
</t>

<t>
Type 4 - Port
<list>
<t>
Encoding: <type (1 octet), [op, value]+>
</t>
<t>
Defines a list of {operation, value} pairs that matches source OR
destination TCP/UDP ports. This list is encoded using the numeric
operand format defined above. Values are encoded as 1 or 2 byte
quantities. </t>
</list>
</t>

<t>
Type 5 - Destination port
<list>
<t>
Encoding: <type (1 octet), [op, value]+>
</t>
<t>
Defines a list of {operation, value} pairs used to match the
destination port of a TCP or UDP packet. Values are encoded as 1 or 2 byte
quantities.</t>
</list>
</t>

<t>
Type 6 - Source port
<list>
<t>
Encoding: <type (1 octet), [op, value]+>
</t>
<t>
Defines a list of {operation, value} pairs used to match the source
port of a TCP or UDP packet. Values are encoded as 1 or 2 byte
quantities. </t>
</list>
</t>

<t>
Type 7 - ICMP type
<list>
<t>
Encoding: <type (1 octet),  [op, value]+>
</t>
<t>
Defines a list of {operation, value} pairs used to match
the type field of an ICMP packet. Values are encoded using a single
byte. </t>
</list>
</t>

<t>
Type 8 - ICMP code
<list>
<t>
Encoding: <type (1 octet),  [op, value]+>
</t>
<t>
Defines a list of {operation, value} pairs used to match
the code field of an ICMP packet. Values are encoded using a single
byte.</t>
</list>
</t>

<t>
Type 9 - TCP flags
<list>
<t>
Encoding: <type (1 octet),  [op, bitmask]+>
</t>

<t>
Bitmask values are encoded using a single byte, using the bit definitions
specified in the <xref target="RFC0793">TCP header format</xref>.
</t>

<t>
This type uses the bitmask operand format, which differs from the
numeric operator format in the lower nibble. </t>
</list>

<figure align="center">
   <artwork align="center">
  7   6   5   4   3   2   1   0
+---+---+---+---+---+---+---+---+
| e | a |  len  | 0 | 0 |not| m | 
+---+---+---+---+---+---+---+---+
</artwork>
</figure>

<list style="symbols">
<t>
Top nibble: (End of List bit, And bit, and Length field), as defined for
the numeric operator format.
</t>

<t>
Not bit. If set, logical negation of operation.
</t>

<t>
Match bit. If set this is a bitwise match operation
defined as "(data & value) == value"; if unset (data & value)
evaluates to true if any of the bits in the value mask are set
in the data.
</t>
</list>
</t>

<t>
Type 10 - Packet length
<list>
<t>
Encoding: <type (1 octet), [op, value]+>
</t>
<t>
Match on the total IP packet length (excluding L2 but including IP header).
Values are encoded as 1 or 2 byte quantities.
</t>
</list>
</t>

<t>
Type 11 - DSCP
<list>
<t>
Encoding: <type (1 octet), [op, value]+>
</t>
<t>
Defines a list of {operation, value} pairs used to match the IP TOS
octet.
</t>
</list>
</t>

<t>
Type 12 - Fragment
<list>
<t>
Encoding: <type (1 octet), [op, bitmask]+>
</t>

<t>
Uses bitmask operand format defined above.
</t>
<t>
Bitmask values:
<list style="symbols">
<t>
Bit 0 - Don't fragment
</t>

<t>
Bit 1 - Is a fragment
</t>

<t>
Bit 2 - First fragment
</t>

<t>
Bit 3 - Last fragment
</t>

</list>
</t>
</list>
</t>
</list>
</t>
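
<t>
The evaluation of a list of {op, value} pairs in the numeric operator
format, and of a single term in the bitmask operand format, might be
sketched in Python as follows. This is a non-normative illustration: all
function and constant names are hypothetical, and the sketch assumes
well-formed input (it does not guard against truncated buffers). AND is
given higher priority than OR by collecting consecutive AND-ed terms into
a group and then OR-ing the groups.</t>

<figure><artwork><![CDATA[
```python
from typing import List, Tuple

END_BIT, AND_BIT = 0x80, 0x40        # "e" and "a" bits of the operator byte
LT, GT, EQ = 0x04, 0x02, 0x01        # numeric operator comparison bits
NOT_BIT, MATCH_BIT = 0x02, 0x01      # bitmask operand bits

def parse_ops(buf: bytes) -> Tuple[List[Tuple[int, int]], int]:
    """Parse {op, value} pairs up to end-of-list; return (pairs, octets used)."""
    pairs, pos = [], 0
    while True:
        op = buf[pos]
        size = 1 << ((op >> 4) & 0x03)          # value length is (1 << len)
        value = int.from_bytes(buf[pos + 1:pos + 1 + size], "big")
        pairs.append((op, value))
        pos += 1 + size
        if op & END_BIT:
            return pairs, pos

def compare(op: int, data: int, value: int) -> bool:
    """Apply the lt/gt/eq bits; combined bits give <=, >= and !=."""
    return bool((op & LT and data < value) or
                (op & GT and data > value) or
                (op & EQ and data == value))

def evaluate_numeric(pairs: List[Tuple[int, int]], data: int) -> bool:
    """OR together groups of AND-ed terms (AND binds tighter than OR)."""
    result, group = False, True
    for i, (op, value) in enumerate(pairs):
        if i > 0 and not op & AND_BIT:          # OR: close the current group
            result = result or group
            group = True
        group = group and compare(op, data, value)
    return result or group

def bitmask_term(op: int, data: int, value: int) -> bool:
    """Evaluate one bitmask term, honouring the match and not bits."""
    if op & MATCH_BIT:
        hit = (data & value) == value           # all mask bits set in data
    else:
        hit = (data & value) != 0               # any mask bit set in data
    return not hit if op & NOT_BIT else hit
```
]]></artwork></figure>

<t>
For instance, parsing the octets 03 89 45 8b 91 1f 90 with this sketch
yields three terms that accept a port value of 138 or 8080 and reject 140.</t>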

<t>
Flow specification components must follow strict type ordering. A
given component type may or may not be present in the specification,
but if present it MUST precede any component of higher numeric type
value.
</t>
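
<t>
A receiver could check the ordering rule while splitting an NLRI value
into its components. The following Python sketch is one possible reading
and not a normative procedure. The function name split_components is
hypothetical, a repeated component type is rejected (an assumption, since
the text above only speaks of higher numeric types), and a component type
outside 1 to 12 raises an error because its length cannot be
determined.</t>

<figure><artwork><![CDATA[
```python
from typing import List, Tuple

def split_components(value: bytes) -> List[Tuple[int, bytes]]:
    """Split an NLRI value into (type, raw octets), enforcing type order."""
    comps, pos, last_type = [], 0, 0
    while pos < len(value):
        ctype, start = value[pos], pos
        if ctype <= last_type:
            # Assumption: types must be strictly increasing (no repeats).
            raise ValueError("component types out of order")
        pos += 1
        if ctype in (1, 2):
            # Prefix: one length octet (in bits), then enough prefix octets.
            pos += 1 + (value[pos] + 7) // 8
        elif 3 <= ctype <= 12:
            # [op, value]+ list, terminated by the end-of-list bit.
            while True:
                op = value[pos]
                pos += 1 + (1 << ((op >> 4) & 0x03))
                if op & 0x80:
                    break
        else:
            raise ValueError("unknown component type %d" % ctype)
        comps.append((ctype, value[start:pos]))
        last_type = ctype
    return comps
```
]]></artwork></figure>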

<t>
If a given component type within a prefix is unknown, the prefix in
question cannot be used for traffic filtering purposes by the
receiver. Since a Flow Specification has the semantics of a logical AND of all
components, a component that cannot be evaluated is FALSE by definition and
the rule as a whole cannot be applied. However, for the purposes of BGP route
propagation this prefix should still be transmitted, since BGP route
distribution is independent of NLRI semantics. </t>

<t>
Flow specification components are to be interpreted as a bit match at
a given packet offset. When more than one component in a flow
specification tests the same packet offset the behavior is
undetermined.</t>

<t>
The <type, value> encoding is chosen in order to account for future
extensibility.</t>

<texttable align='left'>
<preamble> An example of a Flow Specification encoding for:
"all packets to 10.0.1/24 and TCP port 25".</preamble>
<ttcol>destination</ttcol>
<ttcol>proto</ttcol>
<ttcol>port</ttcol>
<c>0x01 18 0a 00 01 </c>
<c>03 81 06 </c>
<c>04 81 19</c>
</texttable>

<texttable align='left'>
<preamble>Decode for protocol:</preamble>
<ttcol align='right'>Value</ttcol>
<ttcol align='left'></ttcol>
<ttcol align='left'></ttcol>
<c>0x03</c>    <c>type</c> <c></c>
<c>0x81</c>    <c>operator</c> <c>end-of-list, value size=1, =</c>
<c>0x06</c>    <c>value</c><c></c>
</texttable>

<texttable align='left'>
<preamble>
An example of a Flow Specification encoding for:
"all packets to 10.0.1/24 from 192/8 and port {range [137, 139] or 8080}".
</preamble>
<ttcol>destination</ttcol>
<ttcol>source</ttcol>
<ttcol>port</ttcol>
<c>0x01 18 0a 00 01 </c>
<c>02 08 c0 </c>
<c>04 03 89 45 8b 91 1f 90</c>
</texttable>

<texttable align='left'>
<preamble>Decode for port:</preamble>

<ttcol align='right'>Value</ttcol>
<ttcol align='left'></ttcol>
<ttcol align='left'></ttcol>
<c>0x04</c>    <c>type</c><c></c>
<c>0x03</c>    <c>operator</c><c>value size=1, >=</c>
<c>0x89</c>    <c>value</c><c> 137</c>
<c>0x45</c>    <c>operator</c><c>&, value size=1, <=</c>
<c>0x8b</c>    <c>value</c><c> 139</c>
<c>0x91</c>    <c>operator</c><c>end-of-list, value size=2, =</c>
<c>0x1f90</c>  <c>value</c><c> 8080</c>
<postamble>
This constitutes an NLRI with an NLRI length of 16 octets.
</postamble>
</texttable>
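
<t>
The flow described in the second example (destination 10.0.1/24, source
192/8, port in the range [137, 139] or equal to 8080) can be assembled
with a short Python sketch. The helper names prefix_component and
numeric_component are hypothetical. The sketch handles only 1 and 2 octet
values and a single-octet NLRI length, which is sufficient for this
example.</t>

<figure><artwork><![CDATA[
```python
import ipaddress

AND, LT, GT, EQ = 0x40, 0x04, 0x02, 0x01

def prefix_component(ctype: int, prefix: str) -> bytes:
    """Encode <type, prefix length in bits, prefix octets>."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([ctype, net.prefixlen]) + net.network_address.packed[:nbytes]

def numeric_component(ctype: int, terms) -> bytes:
    """Encode <type, [op, value]+>; terms is a list of (flag bits, value)."""
    out = bytearray([ctype])
    for i, (flags, value) in enumerate(terms):
        size_code = 0 if value < 0x100 else 1   # 1 or 2 octet value
        op = flags | (size_code << 4)
        if i == len(terms) - 1:
            op |= 0x80                          # end-of-list on the last term
        out.append(op)
        out += value.to_bytes(1 << size_code, "big")
    return bytes(out)

value = (prefix_component(1, "10.0.1.0/24")
         + prefix_component(2, "192.0.0.0/8")
         + numeric_component(4, [(GT | EQ, 137),
                                 (AND | LT | EQ, 139),
                                 (EQ, 8080)]))
# The value is shorter than 240 octets, so one length octet is enough.
nlri = bytes([len(value)]) + value
print(len(value), value.hex())
```
]]></artwork></figure>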

<t>
Implementations wishing to exchange flow
specification rules MUST use BGP's Capability Advertisement facility
to exchange the Multiprotocol Extension Capability Code (Code 1)
as defined in <xref target="RFC4760">RFC4760</xref>.
The (AFI, SAFI) pair carried in the Multiprotocol Extension capability
MUST be the
same as the one used to identify a particular application that uses this
NLRI-type. </t>
</section>

<section title="Traffic filtering">

<t>
Traffic filtering policies have been traditionally considered to be
relatively static.</t>

<t>
The popularity of traffic-based denial of service (DoS) attacks, which
often require the network operator to be able to use traffic filters
for detection and mitigation, brings with it requirements that are not
fully satisfied by existing tools.</t>

<t>
Increasingly, DoS mitigation requires coordination among several
Service Providers in order to be able to identify traffic source(s)
and because the volumes of traffic may be such that they will
otherwise significantly affect the performance of the network.</t>

<t>
Several techniques are currently used to control traffic filtering of
DoS attacks.  Among those, one of the most common is to inject
unicast route advertisements corresponding to a destination prefix
being attacked. One variant of this technique marks such route
advertisements with a community that gets translated into a discard
next-hop by the receiving router. Other variants attract traffic to a
particular node that serves as a deterministic drop point.</t>

<t>
Using unicast routing advertisements to distribute traffic filtering
information has the advantage of using the existing infrastructure and
inter-AS communication channels. This can allow, for instance, a
service provider to accept filtering requests from customers for
address space they own.</t>

<t>
There are several drawbacks, however. An issue that is immediately
apparent is the granularity of filtering control: only destination
prefixes may be specified. Another area of concern is the fact that
filtering information is intermingled with routing information.</t>

<t>
The mechanism defined in this document is designed to address these
limitations. We use the flow specification NLRI defined above to
convey information about traffic filtering rules for traffic that
should be discarded.</t>

<t>
This mechanism is designed primarily to allow an upstream
autonomous system to perform inbound filtering, in its ingress
routers, of traffic that a given downstream AS wishes to drop.</t>

<t>
In order to achieve that goal, we define an application specific
NLRI identifier (AFI=1, SAFI=133) along with specific semantic rules.</t>

<t>
BGP routing updates containing this identifier use the flow
specification NLRI encoding to convey particular aggregated flows
that require special treatment.</t>

<t>
Flow routing information received via this (AFI, SAFI) pair is subject
to the validation procedure detailed below.</t>

<section title="Order of traffic filtering rules">

<t>
With traffic filtering rules, more than one rule may match a
particular traffic flow. Thus it is necessary to define the order in
which rules get matched and applied to a particular traffic flow.
This ordering function must not depend on the
arrival order of the flow specification rules and must be consistent
across the network. </t>

<t>
We choose to order traffic filtering rules such that the order of two
flow specifications is given by the comparison of NLRI key byte
strings as defined by the memcmp() function in the ISO C standard.</t>

<t>
Given the way that flow specifications are encoded, this results in a
flow with a less-specific destination IP prefix being considered
less-than (and thus matching before) a flow specification with a
more-specific destination IP prefix.</t>

<t>
This matches an application model where the user may want to define a
restriction that affects an aggregate of traffic and a subsequent rule
that applies only to a subset of that.</t>

<t>
A flow-specification without a destination IP prefix is considered to
match after all flow-specifications that contain an IP destination
prefix.</t>
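
<t>
The ordering can be illustrated in Python, where comparing bytes objects
is an unsigned, byte-by-byte lexicographic comparison and so agrees with
memcmp() over the common length of two keys. How two keys of different
length that share a common prefix compare is not spelled out above; the
sketch below assumes the shorter key sorts first. The function name
rule_order and the sample keys are illustrative only.</t>

<figure><artwork><![CDATA[
```python
from typing import Iterable, List

def rule_order(nlri_keys: Iterable[bytes]) -> List[bytes]:
    """Return the keys in match order, independent of arrival order."""
    # Assumption: a key that is a strict prefix of another sorts first.
    return sorted(nlri_keys)

less_specific = bytes.fromhex("01100a00")      # destination 10.0/16
more_specific = bytes.fromhex("01180a0001")    # destination 10.0.1/24
no_destination = bytes.fromhex("0208c0")       # source 192/8 only

ordered = rule_order([no_destination, more_specific, less_specific])
for key in ordered:
    print(key.hex())
```
]]></artwork></figure>

<t>
With these sample keys, the 10.0/16 rule sorts before the 10.0.1/24 rule
because the prefix length octet is compared first, and the rule without a
destination prefix sorts last because its first octet is a higher
component type.</t>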
</section>
</section>

<section title="Validation procedure">

<t>
Flow specifications received from a BGP peer and which are accepted in
the respective Adj-RIB-In are used as input to the route selection
process. Although the forwarding attributes of two routes for the same
Flow Specification prefix may be the same, BGP is still required to perform its
path selection algorithm in order to select the correct set of
attributes to advertise.</t>

<t>
The first step of the BGP Route Selection procedure
(section 9.1.2 of [RFC4271]) is to exclude from the selection procedure routes that
are considered non-feasible. In the context of IP routing information
this step is used to validate that the NEXT_HOP attribute of a given
route is resolvable.</t>

<t>
The concept can be extended, in the case of Flow Specification NLRI, to allow
other validation procedures.</t>

<t>
A flow specification NLRI must be validated such that it is considered
feasible if and only if:

<list style='hanging'>
<t hangText="a)">
The originator of the flow specification matches the originator of
the best-match unicast route for the destination prefix embedded in
the flow specification. </t>

<t hangText="b)">
There are no more-specific unicast routes, when compared with the
flow destination prefix, that have been received from a different
neighboring AS than the best-match unicast route, which has been
determined in step a).</t>
</list>
</t>

<t>
By originator of a BGP route, we mean either the BGP originator path
attribute, as used by route reflection, or the transport address of
the BGP peer, if this path attribute is not present.</t>
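
<t>
The two feasibility conditions might be sketched in Python as shown
below. This is a simplified illustration under stated assumptions and not
a normative algorithm: the UnicastRoute record and the function names are
hypothetical, the routes list is assumed to hold the selected unicast
route for each prefix, "best-match" is taken to mean the longest covering
prefix, and the originator is represented as a plain string.</t>

<figure><artwork><![CDATA[
```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class UnicastRoute:                       # hypothetical record
    prefix: ipaddress.IPv4Network
    originator: str                       # originator attribute or peer address
    neighbor_as: int                      # AS the route was received from

def best_match(dest: ipaddress.IPv4Network,
               routes: List[UnicastRoute]) -> Optional[UnicastRoute]:
    """Longest unicast prefix that covers the flow destination prefix."""
    covering = [r for r in routes if dest.subnet_of(r.prefix)]
    return max(covering, key=lambda r: r.prefix.prefixlen, default=None)

def is_feasible(flow_dest: ipaddress.IPv4Network,
                flow_originator: str,
                routes: List[UnicastRoute]) -> bool:
    best = best_match(flow_dest, routes)
    # Rule a): same originator as the best-match unicast route.
    if best is None or best.originator != flow_originator:
        return False
    # Rule b): no more-specific unicast route from a different neighbor AS.
    for r in routes:
        more_specific = r.prefix != flow_dest and r.prefix.subnet_of(flow_dest)
        if more_specific and r.neighbor_as != best.neighbor_as:
            return False
    return True
```
]]></artwork></figure>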

<t>
The underlying concept is that the neighboring AS that advertises the
best unicast route for a destination is allowed to advertise flow-spec
information that conveys a more or equally specific destination
prefix, as long as there are no more-specific unicast routes,
received from a different neighbor AS, which would be affected by that
filtering rule.</t>

<t>
The neighboring AS is the immediate destination of the traffic
described by the Flow Specification. If it requests these flows to be dropped
that request can be honored without concern that it represents a
denial of service in itself. Supposedly, the traffic is being dropped
by the downstream autonomous-system and there is no added value in
carrying the traffic to it.</t>

<t>
BGP implementations MUST also enforce that the AS_PATH attribute of a
route received via eBGP contains the neighboring AS in the left-most
position of the AS_PATH attribute. While this rule is optional in the
BGP specification, it becomes necessary to enforce it for security
reasons.</t>
</section>

<section title="Traffic Filtering Actions">

<t>
This specification defines a minimum set of filtering actions that it
standardizes as 
<xref target="RFC4360">
BGP extended community values</xref>.
This is not meant to be
an inclusive list of all the possible actions but only a subset that
can be interpreted consistently across the network.</t>

<t>
Implementations should provide mechanisms that map an arbitrary
BGP community value (normal or extended) to filtering actions that
require different mappings in different systems in the network. For
instance, providing packets with a worse than best-effort per-hop
behavior is a functionality that is likely to be implemented
differently in different systems and for which no standard behavior
is currently known. Rather than attempting to define it here, this can
be accomplished by mapping a user defined community value to platform / network
specific behavior via user configuration. </t>

<t>
The default action for a traffic filtering flow specification is to
accept IP traffic that matches that particular rule.</t>

<texttable>
<preamble>
The following extended community values can be used to specify
particular actions.
</preamble>
<ttcol>type</ttcol>
<ttcol>extended community</ttcol>
<ttcol>encoding</ttcol>
<c>0x8006</c>	<c>traffic-rate</c>	<c>2-byte as#, 4-byte float</c>
<c>0x8007</c>	<c>traffic-action</c>	<c>bitmask</c>
<c>0x8008</c>	<c>redirect</c>		<c>6-byte Route Target</c>
</texttable>

<list style='hanging'>
<t hangText="Traffic-rate">
The traffic-rate extended community is a non-transitive
extended community across the Autonomous system boundary and uses the
following extended community encoding:
<list>

<t>The first two octets carry the 2 octet id which can be assigned
from a 2 byte AS number. When a 4 byte AS number is locally present, the 2
least significant bytes of that AS number can be used.</t>

<t>
The remaining 4 octets carry the rate information in IEEE floating point 
format, units being bytes per second.  A traffic-rate of 0 should
result in all traffic for the particular flow being discarded.</t> 
</list>
</t>

<t hangText="Traffic-action">
The traffic-action extended community consists of 6 bytes of which
only the 2 least significant bits of the 6th byte (from left to right)
are currently defined.

<list style='symbols'>
<t>
Terminal action (bit 0).

When this bit is set the traffic filtering engine will apply any
subsequent filtering rules (as defined by the ordering procedure). If
not set the evaluation of the traffic filter stops when this rule is
applied.</t>

<t>
Sample (bit 1).
 
Enables traffic sampling and logging for this flow specification.</t>
</list>
</t>

<t hangText="Redirect">
The redirect extended community allows the traffic to be redirected to
a VRF routing instance that lists the specified route-target in its
import policy. If several local instances match these criteria, the
choice between them is a local matter (for example, the instance with
the lowest Route Distinguisher value can be elected).</t>

<t hangText="Traffic-marking">
The traffic marking extended community instructs a system to modify the
DSCP bits of a transiting IP packet to the corresponding value. This
extended community is encoded as a sequence of 5 zero bytes followed
by the DSCP value.</t>

</list>
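
<t>
The 8-octet encodings of the traffic-rate and traffic-action extended
communities might be built as in the following Python sketch. The
function and constant names are hypothetical. The sketch assumes that the
rate is an IEEE single-precision float in network byte order and that
"bit 0" denotes the least significant bit of the last octet; neither
detail is stated explicitly above.</t>

<figure><artwork><![CDATA[
```python
import struct

# Assumption: bit 0 is the least significant bit of the 6th value octet.
TERMINAL, SAMPLE = 0x01, 0x02

def traffic_rate(as_number: int, bytes_per_second: float) -> bytes:
    """Type 0x8006, 2-octet AS (low 16 bits), 4-octet float rate."""
    return struct.pack("!HHf", 0x8006, as_number & 0xFFFF, bytes_per_second)

def traffic_action(flags: int) -> bytes:
    """Type 0x8007, five zero octets, then the two defined flag bits."""
    return struct.pack("!H5xB", 0x8007, flags & 0x03)

discard = traffic_rate(65001, 0.0)        # a rate of 0 discards the flow
sample = traffic_action(SAMPLE)
print(discard.hex(), sample.hex())
```
]]></artwork></figure>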
</section>

<section title="Traffic filtering in RFC2547bis networks">

<t>
Provider-based layer 3 VPN networks, such as the ones using a
<xref target="RFC4364">BGP/MPLS IP VPN</xref>
control plane,
have different traffic filtering requirements than
Internet service providers.</t>

<t>
In these environments, the VPN customer network often has traffic
filtering capabilities towards their external network connections
(e.g. firewall facing public network connection).  Less common is the
presence of traffic filtering capabilities between different VPN
attachment sites. In an any-to-any connectivity model, which is the
default, this means that site to site traffic is unfiltered. </t>

<t>
In circumstances where a security threat does get propagated inside
the VPN customer network, there may not be readily available
mechanisms to provide mitigation via traffic filtering.</t>

<t>
This document proposes an additional BGP NLRI type (AFI=1, SAFI=134),
which can be used to propagate traffic filtering information in
a BGP/MPLS VPN environment.</t>

<t>
The NLRI format for this address family consists of a fixed length
Route Distinguisher field (8 bytes) followed by a flow specification,
following the encoding defined in this document. The NLRI length field
shall include both the 8 bytes of the Route Distinguisher and
the subsequent flow specification.</t>
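
<t>
As a non-normative illustration, the VPN variant of the NLRI could be
assembled as follows in Python. The function name vpn_flow_nlri is
hypothetical, and the sample Route Distinguisher (type 0, 65001:1) is
chosen only for the example. The length octets cover the Route
Distinguisher plus the flow specification.</t>

<figure><artwork><![CDATA[
```python
def vpn_flow_nlri(rd: bytes, flow_spec: bytes) -> bytes:
    """Prefix RD + flow specification with a 1 or 2 octet NLRI length."""
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be 8 octets")
    body = rd + flow_spec
    n = len(body)                         # length covers RD and flow spec
    if n < 0xF0:
        length = bytes([n])
    elif n <= 0x0FFF:
        length = (0xF000 | n).to_bytes(2, "big")
    else:
        raise ValueError("NLRI too long")
    return length + body

rd = bytes.fromhex("0000fde900000001")    # sample RD 65001:1 (type 0)
flow = bytes.fromhex("01180a0001")        # destination 10.0.1/24
nlri = vpn_flow_nlri(rd, flow)
print(nlri.hex())
```
]]></artwork></figure>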

<t>
Propagation of this NLRI is controlled by matching Route Target
extended communities associated with the BGP path advertisement with
the VRF import policy, using the same mechanism as described in
<xref target="RFC4364">"BGP/MPLS IP VPNs"</xref>.</t>

<t>
Flow specification rules received via this NLRI apply only to traffic
that belongs to the VRF(s) in which it is imported. By default,
traffic received from a remote PE is switched via an MPLS forwarding
decision and is not subject to filtering.</t>

<t>
Contrary to the behavior specified for the non-VPN NLRI, flow
rules are accepted by default, when received from remote PE routers.</t>
</section>

<section title="Monitoring">

<t>
Traffic filtering applications require monitoring and traffic
statistics facilities. While this is an implementation-specific
choice, implementations SHOULD provide:</t>

<list style='symbols'>
<t>A mechanism to log the packet header of filtered traffic,</t>

<t>A mechanism to count the number of matches for a given Flow
Specification rule.</t>
</list>
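
<t>
The two facilities listed above could be as simple as the
following sketch, which keeps a per-rule match counter and a
bounded log of packet headers. The class name, the rule
identifiers, and the log size are hypothetical; this document
does not mandate any particular design.</t>

<figure><artwork><![CDATA[
```python
from collections import Counter, deque


class RuleMonitor:
    """Per-rule match counters plus a bounded log of headers."""

    def __init__(self, log_size=100):
        self.matches = Counter()
        self.log = deque(maxlen=log_size)  # oldest entries drop

    def record(self, rule_id, header: bytes):
        self.matches[rule_id] += 1
        self.log.append((rule_id, header))


monitor = RuleMonitor(log_size=2)
for hdr in (b"\x45\x00", b"\x45\x01", b"\x45\x02"):
    monitor.record("rule-1", hdr)
```
]]></artwork></figure>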
</section>

<section title="Security considerations">

<t>
Inter-provider routing is based on a web of trust. Neighboring
autonomous systems are trusted to advertise valid reachability
information. If this trust model is violated, a neighboring autonomous
system may cause a denial-of-service attack by advertising
reachability information for a given prefix for which it does not
provide service.</t>

<t>
As long as traffic filtering rules are restricted to match the
corresponding unicast routing paths for the relevant prefixes, the
security characteristics of this proposal are equivalent to the
existing security properties of BGP unicast routing.</t>

<t>
Where this is not the case, it would open the door to further
denial-of-service attacks.</t>

<t>
Enabling firewall-like capabilities in routers without centralized
management could make certain failures harder to diagnose. For example,
with these extensions it is possible to allow TCP packets to pass
between a pair of addresses but not ICMP packets. It would also be
possible to permit packets smaller than 900 or greater than 1000 bytes
to pass between a pair of addresses, but not packets whose length is in
the range 900-1000. The Internet has become sufficiently aware of
firewalls that such behavior is less likely to be confusing than it was
a few years ago, and there are no new capabilities introduced by these
extensions, just an increased likelihood that such capabilities will be
used.</t>
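
<t>
To make the packet length example concrete, the sketch below
encodes a type 10 (Packet length) component that matches lengths
below 900 or above 1000. It assumes the numeric operator octet
defined earlier in this document (end-of-list bit, AND bit, 2-bit
value length, then the lt, gt, and eq bits); the function and
constant names are hypothetical.</t>

<figure><artwork><![CDATA[
```python
import struct

PACKET_LENGTH = 10  # component type 10
LT, GT, EQ = 0x04, 0x02, 0x01  # comparison bits
END_BIT = 0x80  # marks the last {operator, value} pair


def encode_numeric(comp_type, terms):
    """terms: list of (comparison bits, 16-bit value), OR-ed."""
    out = bytearray([comp_type])
    for i, (cmp_bits, value) in enumerate(terms):
        # len field = 1 selects a 2-octet value; the AND bit
        # stays clear, so successive terms are OR-ed together.
        op = cmp_bits | (1 << 4)
        if i == len(terms) - 1:
            op |= END_BIT
        out += struct.pack("!BH", op, value)
    return bytes(out)


# Length < 900 OR length > 1000, as in the example above.
component = encode_numeric(PACKET_LENGTH, [(LT, 900), (GT, 1000)])
```
]]></artwork></figure>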

</section>

<section title="IANA Considerations">

<t>
A flow specification consists of a sequence of flow components, which
are identified by an 8-bit component type.  Types must be assigned
and interpreted uniquely.  The current specification defines types 1
through 12, with the value 0 being reserved.</t>

<t>
For the purpose of this work, IANA has allocated values for two SAFIs:
SAFI 133 for IPv4 and SAFI 134 for VPNv4 dissemination of flow
specification rules.</t>

<t>
The following traffic filtering flow specification rules are to be
allocated by IANA from the "BGP Extended Communities Type - Experimental
Use" registry. The authors recommend the following type values:</t>

<list>
<t>0x8006 - Flow spec traffic-rate</t>
<t>0x8007 - Flow spec traffic-action</t>
<t>0x8008 - Flow spec redirect</t>
</list>

<t>
The authors ask IANA to create and maintain a new registry
entitled "Flow Spec Component Type". The authors recommend allocating
the following component types:</t>

<list>
<t>Type 1 - Destination Prefix</t>
<t>Type 2 - Source Prefix</t>
<t>Type 3 - IP Protocol</t>
<t>Type 4 - Port</t>
<t>Type 5 - Destination port</t>
<t>Type 6 - Source port</t>
<t>Type 7 - ICMP type</t>
<t>Type 8 - ICMP code</t>
<t>Type 9 - TCP flags</t>
<t>Type 10 - Packet length</t>
<t>Type 11 - DSCP</t>
<t>Type 12 - Fragment</t>
</list>

<t>
In order to manage the limited number space and accommodate several
usages, the following policies defined in
<xref target="RFC5226">RFC 5226</xref> are used:</t>

<texttable align='left'>
<ttcol>Range</ttcol>
<ttcol>Policy</ttcol>
<c>0</c><c>Invalid value</c>
<c>[1 .. 12]</c><c>Defined by this specification</c>
<c>[13 .. 127]</c><c>Specification Required</c>
<c>[128 .. 255]</c><c>Private Use</c>
</texttable>
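
<t>
The component type list and the allocation ranges above can be
summarized in code as follows. This is only an illustrative
sketch; the enumeration and function names are hypothetical.</t>

<figure><artwork><![CDATA[
```python
from enum import IntEnum


class FlowComponent(IntEnum):
    DESTINATION_PREFIX = 1
    SOURCE_PREFIX = 2
    IP_PROTOCOL = 3
    PORT = 4
    DESTINATION_PORT = 5
    SOURCE_PORT = 6
    ICMP_TYPE = 7
    ICMP_CODE = 8
    TCP_FLAGS = 9
    PACKET_LENGTH = 10
    DSCP = 11
    FRAGMENT = 12


def allocation_policy(component_type: int) -> str:
    """Map an 8-bit component type to its registry policy."""
    if not 0 <= component_type <= 255:
        raise ValueError("component type is an 8-bit value")
    if component_type == 0:
        return "Invalid value"
    if component_type <= 12:
        return "Defined by this specification"
    if component_type <= 127:
        return "Specification Required"
    return "Private Use"
```
]]></artwork></figure>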

<t>
The specification of a particular "flow component type" must clearly
identify the criteria used to match packets forwarded by the
router.  These criteria should be meaningful across router hops and
not depend on values that change hop-by-hop, such as TTL or layer-2
encapsulation.</t>

<t>
The "Traffic-action" extended community defined in this document has
46 unused bits, which can be used to convey additional meaning. The
authors ask IANA to create and maintain a new registry entitled
"Traffic Action Fields". These values should be assigned via IETF
Review rules only. The authors recommend allocating the following
traffic action fields:</t>

<list>
<t>0 Terminal Action</t>
<t>1 Sample</t>
<t>2-47 Unassigned</t>
</list>
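
<t>
As a sketch of how these fields might be carried, the code below
builds a traffic-action extended community. It assumes the type
value 0x8007 recommended above, a 6-octet (48-bit) value field,
and that bit 0 is the least significant bit of that field. The
function and constant names are hypothetical.</t>

<figure><artwork><![CDATA[
```python
import struct

TRAFFIC_ACTION_TYPE = 0x8007  # type value recommended above
TERMINAL_ACTION = 1 << 0  # bit 0 (assumed least significant)
SAMPLE = 1 << 1  # bit 1


def encode_traffic_action(flags: int) -> bytes:
    """Build the 8-octet community: 2-octet type + 48-bit field."""
    if not 0 <= flags < (1 << 48):
        raise ValueError("traffic-action carries 48 bits")
    value = flags.to_bytes(6, "big")
    return struct.pack("!H", TRAFFIC_ACTION_TYPE) + value


community = encode_traffic_action(TERMINAL_ACTION | SAMPLE)
```
]]></artwork></figure>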

</section>

<section title="Acknowledgments">
<t>
The authors would like to thank Yakov Rekhter, Dennis Ferguson,
Chris Morrow, Charlie Kaufman and David Smith for their comments.</t>

<t>
Chaitanya Kodeboyina helped design the flow validation procedure.</t>

<t>
Steven Lin and Jim Washburn ironed out all the details necessary to
produce a working implementation.</t>
</section>

</middle>

<back>
<references title="Normative References">
	&rfc793;
	&rfc2119;
	&rfc4271;
	&rfc4760;
	&rfc4360;
	&rfc4364;
	&rfc5226;
</references>

</back>
</rfc>