Network Working Group                                         R. Perlman
Internet Draft                                                       Sun
Expires: November 2005                                          J. Touch
                                                                 USC/ISI
                                                                A. Yegin
                                                                 Samsung
                                                             May 2, 2005
RBridges: Transparent Routing
draft-perlman-rbridge-03.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that
any applicable patent or other IPR claims of which he or she is
aware have been or will be disclosed, and any of which he or she
becomes aware will be disclosed, in accordance with Section 6 of
BCP 79.
This document may not be modified, and derivative works of it may not
be created, except to publish it as an RFC and to translate it into
languages other than English.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This Internet-Draft will expire on November 2, 2005.
Copyright Notice
Copyright (C) The Internet Society (2005). All Rights Reserved.
Perlman Expires November 2, 2005 [Page 1]
Internet-Draft RBridges: Transparent Routing May 2005
Abstract
RBridges provide the ability to have an entire campus, with multiple
physical links, look to IP like a single subnet. The design allows
for zero configuration of switches within a campus, optimal pair-wise
routing, safe forwarding even during periods of temporary loops, and
the ability to cut down on ARP/ND traffic. The design also supports
VLANs, and allows forwarding tables to be based on RBridge
destinations (rather than endnode destinations), which allows
internal routing tables to be substantially smaller than in
conventional bridge systems.
This document is a work in progress; we invite you to participate on
the mailing list at http://www.postel.org/RBridge
Table of Contents
1. Introduction...................................................3
2. Detailed RBridge Design........................................5
2.1. Link State Protocol.......................................5
2.2. Spanning Tree.............................................6
2.3. Designated RBridge........................................7
2.4. Learning Endnode Location.................................8
2.5. Forwarding Behavior.......................................8
2.6. Forwarding Header on 802 Links............................8
2.7. Distributed ARP Query....................................11
3. RBridge Addresses, Parameters, and Constants..................12
4. Handling ARP Queries..........................................12
5. Issues........................................................13
5.1. How Many Spanning Trees?.................................13
5.1.1. Per-ingress Spanning Tree...........................13
5.1.2. Per VLAN............................................13
5.1.3. Single Spanning Tree................................13
5.2. Reasons Not to Optimize Handling of IP packets...........14
5.2.1. Avoiding Encapsulation for On-campus IP Packets.....14
5.2.2. Avoiding Encapsulation for Off-campus IP Packets....15
5.3. Supporting Heterogeneous Link Types......................15
5.4. Effects on L3 TTL........................................15
5.5. Using L3 encapsulation...................................15
5.6. Optimizing ARP/ND........................................16
6. Security Considerations.......................................17
7. Conclusions...................................................17
8. Acknowledgments...............................................17
9. References....................................................17
9.1. Normative References.....................................17
9.2. Informative References...................................18
Author's Addresses...............................................19
Intellectual Property Statement..................................19
Disclaimer of Validity...........................................20
Copyright Statement..............................................20
Acknowledgment...................................................20
1. Introduction
In traditional IPv4 and IPv6 networks, each link must have a unique
prefix. This means that a node that moves from one link to another
must change its IP address, and a node with multiple links must have
multiple addresses. It also means that a company with many links
(separated by routers) will have difficulty making full use of its IP
address block (since any link not fully populated will waste
addresses), and routers require significant configuration.
Bridges avoid these problems because bridges can transparently glue
many physical links into what appears to IP to be a single LAN.
However, bridges have three drawbacks. Routing via the spanning tree
concentrates traffic onto selected links. Bridges forward based on a
header that has no hop count, so temporary loops (which might arise
due to topology changes, lost spanning tree messages, or components
such as repeaters coming up) are very dangerous, since packets may
proliferate exponentially while the loop persists. Finally, routes
cannot be pair-wise shortest paths, but are instead whatever paths
remain after the spanning tree eliminates redundant links.
We define the term "campus" to be the set of links connected by any
combination of RBridges and bridges. In other words, the campus
is terminated by traditional IP routers, in the same way that an IP
subnet would be terminated by an IP router. A campus will look to IP
nodes like a single IP subnet, whether the interconnection of the
links is done with bridges, RBridges, or some combination of the two.
There have been proposals for having routers within a campus
automatically number links with distinct IP subnet numbers. Although
this makes a campus plug-and-play, it requires a large number of IP
subnet numbers, a node must change its address if it moves to a
different link, and addresses of nodes might fluctuate as the
topology changes and links must be renumbered.
This proposal introduces RBridges [8] (Routing Bridges), which
combine the advantages of bridges and routers. Like bridges, RBridges
are zero configuration, and are transparent to IP nodes. Like
routers, RBridges forward on pair-wise shortest paths, and do not
have dangerous behavior during temporary loops. RBridges have the
additional advantage that they can suppress the broadcast/multicast
for neighbor discovery by doing proxy ARP (IPv4) or proxy ND (IPv6).
RBridges are fully compatible with current bridges as well as current
IPv4 and IPv6 routers and endnodes. They are as invisible to current
IP routers as bridges are, and like routers, they terminate a bridged
spanning tree.
The main idea is to have RBridges run a link state protocol amongst
themselves (IS-IS is ideal, since its TLV encoding easily allows new
information to be carried in link state information, as this proposal
requires, and also makes zero configuration easier because IS-IS does
not require assigning IP addresses to the RBridges).
The next step is for RBridges to learn the location of endnodes. They
can learn the location and layer 2 addresses of attached nodes from
the source address of data packets, as bridges do. Additionally, in
order to facilitate proxy ARP or proxy ND optimizations, RBridges can
also learn the (layer 3, layer 2) addresses of attached IP nodes from
ARP or ND replies.
Once an RBridge learns the location of a directly attached endnode,
it informs the other RBridges in its link state information.
RBridge forwarding can be done, as with a router, via pairwise
shortest paths. RBridges could also utilize forwarding
optimizations, e.g., MPLS.
To prevent the temporary loop issues with bridges, RBridges must
always forward based on a header with a hop count. Although the hop
count will quickly discard looping packets, it is also desirable not
to spawn additional copies of packets. This can be accomplished by
having RBridges specify the next RBridge recipient while forwarding
across a shared-media link.
For two reasons, packets must be encapsulated as they are traveling
between RBridges:
1. so that intermediate RBridges (and bridges) will not be confused
about the location of the source by learning the source address
from packets in transit
2. so that the packet can be directed towards the egress RBridge, and
can include a hop count (for links, like Ethernet, that do not
already contain a hop count).
RBridges are similar to Recursive Routers, which provide similar
transit to emulate a single L3 router, in that case using L3 + L2
encapsulation [10][11].
A VLAN is a broadcast domain. That means that a layer 2 broadcast
(or multicast) packet sent to a VLAN must only be delivered to links
that are in that VLAN. A packet for a particular VLAN may transit any
link on the campus, but an unencapsulated VLAN packet must only be
delivered to links that the RBridges know to support that VLAN.
Support of VLANs traditionally requires configuring the bridges (or
in this case RBridges) with which links belong to which VLANs. In
theory some other mechanism might allow an RBridge to know which
VLANs should be supported on which port. The RBridge design does not
care how RBridges discover which VLANs are supported by each of
their ports, but for simplicity we assume here that RBridges (like
bridges) are configured with this information.
RBridges must calculate a spanning tree for each broadcast domain. In
a campus without VLANs, this means a single spanning tree would be
used for delivery of packets with unknown or group address layer 2
destination.
It is possible to support VLANs with a single spanning tree, and just
avoid forwarding the decapsulated packet onto links that do not
support that VLAN. However, it will allow for more optimal delivery
if a different spanning tree is calculated for each broadcast domain.
It is not necessary to use the bridge spanning tree algorithm to
calculate the spanning trees. Instead, they can be calculated based
on the link state information. Using the link state protocol to
calculate spanning trees makes the design very flexible and
efficient. The link state database gives sufficient information so
that RBridges can calculate a single spanning tree, spanning trees
per VLAN, or per-ingress RBridge spanning trees without requiring any
additional exchange of information between RBridges.
2. Detailed RBridge Design
2.1. Link State Protocol
Running a link state protocol among RBridges is straightforward. It
is the same as running a level 1 routing protocol in an area. IS-IS
is a more appropriate choice than OSPF in this case because it is
easy in IS-IS to define new TLVs for carrying new information.
However, the instance of IS-IS that RBridges will implement will be
separate from any routing protocol that IP routers will implement,
just as the spanning tree messages are not implemented by IP routers.
To keep the instances separate, RBridge routing messages should be
sent to a different layer 2 multicast address than IS-IS routing
messages. Alternatively, they can be differentiated by having a
different "area address", where, in order to keep RBridges
configuration-free, the RBridge area address would be a constant for
all RBridges, and would not be one that would ever appear as a real
IS-IS area address.
Additional information that RBridge link state information will carry
is:
o layer 2 addresses of nodes within the campus which have
transmitted packets but have not transmitted ARP or ND replies
o layer 3, layer 2 addresses of IP nodes within the campus. For
data compression, perhaps only the portion of the address
following the campus-wide prefix need be carried. This will be
more of an issue for IPv6 than for IPv4.
o VLANs directly connected to this RBridge
The endnode information need only be
delivered to RBridges supporting the VLAN in which the endnode
resides. So for instance, if endnode E is discovered through a VLAN A
packet, then E's location need only be delivered to other RBridges
that are attached to VLAN A links.
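The VLAN scoping rule above can be sketched as a simple filter. This
is only an illustration: the function name and the dictionary that
maps RBridge IDs to their directly connected VLANs are assumptions
made for the example, not part of the protocol.

```python
def endnode_info_recipients(endnode_vlan, rbridge_vlans, origin):
    """Return the sorted IDs of RBridges that need an endnode's location.

    `rbridge_vlans` (hypothetical) maps RBridge ID -> set of directly
    connected VLANs, as would be learned from the "VLANs directly
    connected to this RBridge" item in the link state information.
    `origin` is the RBridge that discovered the endnode.
    """
    return sorted(
        rb for rb, vlans in rbridge_vlans.items()
        if rb != origin and endnode_vlan in vlans
    )


# Endnode E is discovered by R1 through a VLAN A packet.
rbridge_vlans = {"R1": {"A"}, "R2": {"B"}, "R3": {"A", "B"}}
recipients = endnode_info_recipients("A", rbridge_vlans, origin="R1")
```

Here only R3 would receive E's location, since R2 has no VLAN A links.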
Given that RBridges must support delivery only to links within a VLAN
(for multicast or unknown packets marked with the VLAN's tag), this
mechanism can be used to advertise endnode information solely to
RBridges within a VLAN. Although a separate instance of the link
state protocol could be run for this purpose, the topology is so
restricted (just a single broadcast domain), that it might be
preferable to design a special case mechanism where each DR
advertises its attached endnodes, and receives explicit acks from the
other RBridges.
2.2. Spanning Tree
There will be cases when RBridges may need to send packets to all
links. These cases include:
o layer 2 multicast or broadcast packets
o unknown layer 2 destination addresses
o distributed RBridge layer 3 address location query
In these cases the packets must be sent through a spanning tree.
However, there is no need to implement a separate spanning tree
protocol in addition to the link state protocol. Instead, the link
state information can be used to create a single spanning tree
throughout the campus. This is done by choosing the RBridge with
lowest ID, and calculating the Dijkstra tree with that RBridge as
Root.
In the case of multiple equal cost links, some tie-breaker must be
used to ensure that all RBridges calculate the same spanning tree. We
suggest using the ID of the parent as the tie breaker (if a node can
be attached to either parent P1 or P2 with the same cost, choose P1
if P1's ID is lower than P2's).
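The tree calculation described above can be sketched as follows. This
is a minimal illustration, assuming positive link costs, a link state
database reduced to a dictionary of link costs, and string IDs that
stand in for the RBridge IDs; the function name is hypothetical.

```python
import heapq


def rbridge_spanning_tree(links):
    """Return {node: parent} for the shortest-path tree rooted at the
    RBridge with the lowest ID.

    `links` maps (a, b) -> positive cost for each bidirectional link.
    """
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, []).append((b, cost))
        adj.setdefault(b, []).append((a, cost))

    root = min(adj)                  # lowest RBridge ID becomes Root
    dist = {root: 0}
    parent = {root: None}
    done = set()
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for nbr, cost in adj[node]:
            if nbr in done:
                continue
            nd = d + cost
            better = nbr not in dist or nd < dist[nbr]
            # Tie-breaker: at equal cost, prefer the parent with lower ID.
            tie = nbr in dist and nd == dist[nbr] and node < parent[nbr]
            if better or tie:
                dist[nbr] = nd
                parent[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return parent


# R4 is reachable through R2 or R3 at the same cost.
tree = rbridge_spanning_tree({
    ("R3", "R4"): 1, ("R2", "R4"): 1, ("R1", "R3"): 1, ("R1", "R2"): 1,
})
```

In this example R1 is the Root and R4 attaches to R2 rather than R3.
Because every RBridge holds the same link state database and applies
the same tie-breaker, each one computes the same tree independently.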
In the case of multicast L2 addresses, the RBridge may treat these as
broadcast, or may include existing techniques for emulating multicast
at L2, i.e., snooping IGMP and/or PIM-SM packets to configure an
internal, L2 multicast tree.
For a packet tagged with a VLAN ID (e.g., VLAN A), the packet is only
delivered to links that support VLAN A. It would provide for more
optimal delivery if a different spanning tree were calculated for
each VLAN. This would be done by choosing the RBridge with lowest ID
that connects to that VLAN as root, and calculating a tree of
shortest paths from that RBridge. RBridges that do not support VLAN A
may be on the delivery path for VLAN A packets, but they will not
decapsulate the packet onto links that are not VLAN A links.
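As a rough sketch of the per-VLAN rules above, the following shows how
a root could be selected for a VLAN's tree and how an RBridge could
decide which of its ports to decapsulate onto. The data structures
(a map of RBridge ID to connected VLANs, and a map of port to
configured VLAN) and the function names are assumptions made for
illustration.

```python
def vlan_tree_root(rbridge_vlans, vlan):
    """Return the lowest-ID RBridge connected to `vlan`, or None."""
    members = [rb for rb, vlans in rbridge_vlans.items() if vlan in vlans]
    return min(members) if members else None


def decapsulation_ports(port_vlans, vlan):
    """Return the local ports on which a `vlan` packet may be decapsulated.

    An RBridge with no ports in the VLAN returns an empty list; it may
    still forward the encapsulated packet as a transit RBridge.
    """
    return sorted(port for port, v in port_vlans.items() if v == vlan)


rbridge_vlans = {"R1": {"B"}, "R2": {"A"}, "R3": {"A", "B"}}
root_for_a = vlan_tree_root(rbridge_vlans, "A")
ports = decapsulation_ports({"p1": "A", "p2": "B", "p3": "A"}, "A")
```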
If IGMP snooping is used to know where recipients of a multicast
packet reside, then the total number of packet-hops to deliver the
packet can be optimized by calculating a separate spanning tree per
ingress RBridge. This, however, requires a lot more computation (one
tree per RBridge). The tradeoffs will be discussed in the "Issues"
section at the end of this document.
2.3. Designated RBridge
It is useful for one RBridge on each link to have special duties.
Thus one RBridge per link should be elected Designated RBridge. IS-IS
already holds such an election.
The Designated RBridge is the one on the link that will learn the
identities of attached endnodes, initiate a distributed ARP when an
ARP query is received for an unknown destination, and answer ARP
queries when the target node is known.
2.4. Learning Endnode Location
RBridges learn endnode location from data packets. They learn (layer
3, layer 2) pairs (for the purpose of supporting proxy ARP/ND) from
listening to ARP or ND replies.
This endnode information is learned by the DR, and distributed to
other RBridges through the link state protocol.
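The learning step might be modelled as below. This is a sketch under
assumptions: the class and method names are hypothetical, and the
split into two lists simply mirrors the two kinds of endnode
information listed in Section 2.1.

```python
class EndnodeLearner:
    """Hypothetical per-link state kept by a Designated RBridge."""

    def __init__(self):
        self.l2_seen = set()   # L2 source addresses seen in data packets
        self.l3_to_l2 = {}     # L3 -> L2, learned from ARP/ND replies

    def on_data_packet(self, src_l2):
        self.l2_seen.add(src_l2)

    def on_arp_or_nd_reply(self, sender_l3, sender_l2):
        self.l2_seen.add(sender_l2)
        self.l3_to_l2[sender_l3] = sender_l2

    def advertisement(self):
        """Build the endnode lists to place in link state information."""
        with_l3 = set(self.l3_to_l2.values())
        return {
            # nodes that sent packets but no ARP/ND reply
            "l2_only": sorted(self.l2_seen - with_l3),
            # (layer 3, layer 2) pairs of IP nodes
            "l3_l2": sorted(self.l3_to_l2.items()),
        }


learner = EndnodeLearner()
learner.on_data_packet("aa")
learner.on_data_packet("bb")
learner.on_arp_or_nd_reply("10.0.0.2", "bb")
adv = learner.advertisement()
```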
2.5. Forwarding Behavior
When a DR R1 receives a native packet with layer 2 source address S
and layer 2 destination address D, R1 looks up the location of D. If
D is claimed by egress RBridge R2, then R1 encapsulates the packet,
directing it towards R2.
When an RBridge receives an encapsulated packet, it forwards based on
the specified egress RBridge (rather than the ultimate destination
endnode).
If the packet belongs in VLAN A, then R1 (the ingress RBridge) looks
up D's location in R1's table of VLAN A endnodes.
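The ingress decision can be sketched as a table lookup followed by
encapsulation. The per-VLAN table layout, the frame representation as
a dictionary, and the names used here are illustrative assumptions;
the default hop count of 20 is the value recommended in Section 3.

```python
from dataclasses import dataclass

DEFAULT_TTL = 20  # recommended default hop count (Section 3)


@dataclass
class Encapsulated:
    egress: str   # ID of the egress RBridge (shim header)
    ttl: int      # hop count (shim header)
    inner: dict   # original frame, carried unchanged


def ingress_forward(endnode_tables, frame):
    """Look up the destination in the frame's VLAN table and encapsulate.

    Returns None when the destination is unknown, in which case the
    frame would instead be sent along the spanning tree.
    """
    table = endnode_tables.get(frame.get("vlan"), {})
    egress = table.get(frame["dst"])
    if egress is None:
        return None
    return Encapsulated(egress=egress, ttl=DEFAULT_TTL, inner=frame)


# R1's tables: in VLAN A, endnode D is claimed by egress RBridge R2.
tables = {"A": {"D": "R2"}}
frame = {"src": "S", "dst": "D", "vlan": "A"}
packet = ingress_forward(tables, frame)
```

Transit RBridges never consult these endnode tables; they forward on
the `egress` field alone, which is what keeps their forwarding tables
proportional to the number of RBridges rather than endnodes.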
2.6. Forwarding Header on 802 Links
It is essential that RBridges coexist with ordinary bridges.
Therefore, a packet in transit must look to ordinary bridges like an
ordinary layer 2 packet. However, it must also be differentiable from
a native layer 2 packet by RBridges. To accomplish this, we use a new
layer 2 protocol type ("Ethertype").
A packet in transit on an 802 link will therefore have two 802
headers, since the original frame (including the original 802 header)
will be tunneled by the RBridges. But rather than just having an
additional 802 header, we include additional information between the
two headers; at least a hop count.
An encapsulated packet would look as follows:
+--------------+-------------+-----------------+
| outer header | shim header | original packet |
+--------------+-------------+-----------------+
Figure 1 Encapsulated packet
The outer header contains:
o L2 destination = next RBridge
o L2 source = transmitting RBridge (the one that most recently
handled this packet)
o protocol type = "to be assigned...RBridge encapsulated packet"
The shim header includes:
o TTL = starts at some value and is decremented by each RBridge.
The packet is discarded if TTL=0
o egress RBridge (in the case of unicast), or ingress RBridge (in
the case of multicast)
Note that one variation is to have the egress RBridge specified in
the outer header rather than in the shim header. This will mean that
some packet duplication might occur during temporary loops. But the
advantage is that the header will be 6 bytes smaller. This is
discussed in the "issues" section.
The following is a walk-through of a packet traversing an RBridge
campus. Consider a packet consisting of "data" to be sent from node A
to node B through an RBridge campus (dotted area) as per Figure 2.
            ...............................
            .                             .
+--------+  .+-----+    +-----+    +-----+.   +--------+
|        |  .|     |    |     |    |     |.   |        |
| Host A ----- Rb1 ------ Rb2 ------ Rb3 ------ Host B |
|        |  .|     |    |     |    |     |.   |        |
+--------+  .+-----+    +-----+    +-----+.   +--------+
            .                             .
            .       RBridge campus        .
            ...............................
Figure 2 Sample path for packet traversing an RBridge campus
In this figure, Host A is the source, Host B the sink, and Rb1..Rb3
are nodes of the RBridge campus. Rb1 is the ingress, and Rb3 is the
egress. Additionally, layer 2 (L2) addresses are as shown below the
components on the particular ports in Figure 3; note that addresses
are required for RBridge nodes for encapsulation and routing within
the campus. Different addresses are shown for each port on an RBridge
node for simplicity, although this is not required.
            ...............................
            .                             .
+--------+  .+-----+    +-----+    +-----+.   +--------+
|        |  .|     |    |     |    |     |.   |        |
| Host A ----- Rb1 ------ Rb2 ------ Rb3 ------ Host B |
|       a   b1x   b1y  b2x   b2y  b3x   b3y    b       |
|        |  .|     |    |     |    |     |.   |        |
+--------+  .+-----+    +-----+    +-----+.   +--------+
            .                             .
            .       RBridge campus        .
            ...............................
Figure 3 Sample path including L2 addresses
Consider the originating packet as per Figure 4; "L2 a->b" means the
layer 2 (L2) source address is "a" and the L2 destination address is
"b", and "IP A->B" means the IP source address is A and the IP
destination is B.
+---------+---------+--------+
| L2 a->b | IP A->B | data |
+---------+---------+--------+
Figure 4 Packet as originated at Host A
The ingress RBridge Rb1 looks up 'b' in its encapsulation tables,
which indicate that Rb3 is the egress RBridge. The packet gets
wrapped to direct it to Rb3 using a shim header (SH), where the
destination is based on the L2 address of Rb3 (the egress) and uses a
TTL of 20, as shown in Figure 5.
+-----------------+---------+---------+--------+
| SH ->b3y TTL=20 | L2 a->b | IP A->B | data |
+-----------------+---------+---------+--------+
Figure 5 Packet with shim header
Note that the shim header carries the egress RBridge address only for
unicast packets; for multicast packets, the L2 address of the ingress
RBridge is used instead.
Rb1 then looks up the shim header destination in its (campus)
forwarding tables, yielding Rb2 as the next hop inside the campus.
Rb1 then sends the packet on to Rb2 by adding the appropriate L2
header, as shown in Figure 6.
+-------------+-----------------+---------+---------+--------+
| L2 b1y->b2x | SH ->b3y TTL=20 | L2 a->b | IP A->B | data |
+-------------+-----------------+---------+---------+--------+
Figure 6 Packet as sent from Rb1 to Rb2
Rb2 unwraps the outermost L2, decrements the shim TTL, and looks up
the shim destination's next hop (which is Rb3 here). Rb2 then adds a
new L2 header addressed to Rb3, as shown in Figure 7.
+-------------+-----------------+---------+---------+--------+
| L2 b2y->b3x | SH ->b3y TTL=19 | L2 a->b | IP A->B | data |
+-------------+-----------------+---------+---------+--------+
Figure 7 Packet as sent from Rb2 to Rb3
Rb3 unwraps the outer L2, notices that the shim destination has been
reached (itself), and unwraps the shim too. At that point, it
proceeds to send the original packet shown in Figure 4 to Host B.
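The walk-through in Figures 4 through 7 can be reproduced with a small
simulation. This is only a sketch: packets are modelled as nested
dictionaries rather than real frames, and the function names are
hypothetical. The addresses and the TTL of 20 are those used in the
figures.

```python
def encapsulate(frame, egress_addr, ttl=20):
    """Ingress step (Figure 5): prepend a shim header."""
    return {"shim": {"egress": egress_addr, "ttl": ttl}, "inner": frame}


def transmit(packet, src_addr, next_hop_addr):
    """Add the per-hop outer L2 header (Figures 6 and 7)."""
    return {"outer": {"src": src_addr, "dst": next_hop_addr}, **packet}


def receive(wire_packet, my_addrs):
    """Strip the outer header, then decapsulate or decrement the TTL.

    Returns ("deliver", original_frame) at the egress,
    ("forward", packet) at a transit RBridge, or ("drop", None)
    when the hop count is exhausted.
    """
    shim = dict(wire_packet["shim"])
    inner = wire_packet["inner"]
    if shim["egress"] in my_addrs:
        return "deliver", inner
    shim["ttl"] -= 1
    if shim["ttl"] <= 0:
        return "drop", None
    return "forward", {"shim": shim, "inner": inner}


original = {"l2": ("a", "b"), "ip": ("A", "B"), "data": "data"}

# Rb1 (ingress): wrap towards b3y, then send to Rb2.
hop1 = transmit(encapsulate(original, "b3y"), "b1y", "b2x")
# Rb2 (transit): decrement TTL, then send to Rb3.
action2, pkt2 = receive(hop1, {"b2x", "b2y"})
hop2 = transmit(pkt2, "b2y", "b3x")
# Rb3 (egress): unwrap and deliver the original frame to Host B.
action3, delivered = receive(hop2, {"b3x", "b3y"})
```

Note that the inner frame, including its L2 header a->b, is never
modified in transit; only the outer header and the shim TTL change.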
2.7. Distributed ARP Query
The distributed ARP query is carried by RBridges through the RBridge
spanning tree. Each Designated RBridge, in addition to forwarding the
query through the spanning tree, initiates an ARP query on its
link(s). If a reply is received by Designated RBridge R2, R2
initiates a link state update to inform all the other RBridges of D's
location, layer 3 address, and layer 2 address.
The distributed ARP query must be sent to a (new, to be assigned)
layer 2 multicast address. The fields it must contain are:
Outer Layer 2 header:
o destination = newly defined L2 multicast address
o source = transmitting RBridge (replaced hop by hop)
o protocol type = same as encapsulated RBridge
Shim header:
o TTL (for safety if the RBridge spanning tree has temporary loops,
and where the L2 header lacks an existing TTL)
o ingress RBridge (rather than egress RBridge, which would be
specified in unicast packets to known destinations); this is used
for ingress-specific forwarding, e.g., for VLANs
RBridge payload:
o original ARP or ND query
Intermediate RBridges decrement the above TTL, and replace the source
RBridge with their own layer 2 address on the outgoing interface.
3. RBridge Addresses, Parameters, and Constants
Each RBridge needs a unique ID within the campus. The simplest such
address is a unique 6-byte ID, since such an ID is easily obtainable
as any of the EUI-48's owned by that RBridge. IS-IS already requires
each router to have such an address.
One parameter is the value to which the hop count in the shim header
is initially set. Recommended default=20.
A new Ethertype must be assigned to indicate an RBridge-encapsulated
packet.
A layer 2 multicast address must be assigned for use as the
destination address in distributed ARP queries.
To support VLANs, RBridges (like bridges today) must be configured,
for each port, with the VLAN to which that port belongs.
4. Handling ARP Queries
If the target address is unknown, the Designated RBridge initiates a
distributed ARP query.
If the target address is known, it replies with a proxy ARP reply,
giving the target's true layer 2 address.
When initiating a distributed ARP query (or IPv6 neighbor
solicitation), the Designated RBridge remembers the address of the
requesting node. When the information is discovered, it responds to
the requester.
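The query handling rules in this section might be sketched as follows.
The class and its return values are hypothetical. One behavior here
goes beyond the text and is an assumption of this sketch: a second
query for a target that is already being resolved is queued rather
than triggering another distributed query.

```python
class ArpHandler:
    """Hypothetical ARP/ND handling state for a Designated RBridge."""

    def __init__(self):
        self.known = {}     # L3 address -> L2 address, from link state
        self.pending = {}   # L3 target -> list of waiting requesters

    def on_query(self, requester, target):
        """Return the action taken for a query seen on the local link."""
        if target in self.known:
            return ("proxy_reply", requester, self.known[target])
        first = target not in self.pending
        # Remember the requester so it can be answered later.
        self.pending.setdefault(target, []).append(requester)
        # Assumption: only the first query for a target is distributed.
        return ("distributed_query", target) if first else ("wait", target)

    def on_discovered(self, target, l2_addr):
        """Record a learned mapping and answer every waiting requester."""
        self.known[target] = l2_addr
        waiting = self.pending.pop(target, [])
        return [("proxy_reply", r, l2_addr) for r in waiting]


handler = ArpHandler()
first = handler.on_query("S1", "D")
second = handler.on_query("S2", "D")
replies = handler.on_discovered("D", "dd")
later = handler.on_query("S3", "D")
```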
5. Issues
5.1. How Many Spanning Trees?
5.1.1. Per-ingress Spanning Tree
If a separate spanning tree is calculated per ingress RBridge, then
delivery of both broadcast and multicast packets, where the recipient
locations are known through some mechanism such as IGMP snooping, can
be optimized (for number of packet hops to deliver the multicast
packet).
Also, if a separate spanning tree is calculated per ingress RBridge,
then out of order delivery is minimized when RBridges learn the
location of the destination, since the packet will traverse the same
path whether it is being delivered via the "destination unknown" tree
to that broadcast domain, or the direct path to that destination.
However, there is obvious overhead involved in calculating separate
spanning trees.
This mechanism of avoiding out of order delivery by calculating
separate spanning trees per ingress RBridge was presented at the IETF
TRILL BOF on March 10, 2005.
5.1.2. Per VLAN
If there are not many links that support VLAN A, then the total
number of packet hops to deliver a packet within the VLAN A broadcast
domain is minimized by calculating a separate spanning tree for each
VLAN.
It would be possible to still support VLANs with a single spanning
tree, by having RBridges only decapsulate a VLAN A packet onto VLAN A
links, but the number of transit links such a packet would traverse
would be more than necessary (assuming that the location of VLAN A
links within the campus is somewhat sparse).
5.1.3. Single Spanning Tree
Broadcast and multicast and VLANs can be supported with a single
spanning tree, which is the simplest solution and requires the least
computation and the smallest forwarding tables in the RBridges. In
that case all such packets would be delivered to all the RBridges,
and the Designated RBridges would simply not forward them onto links
that the packet does not belong on. So from the endnodes' point of view,
things are still correct; a packet will only be delivered to the
proper links. But the cost to deliver the packet within the core can
be much greater.
Additionally, the more different spanning trees that are utilized,
the more all the links within the core can be fully utilized.
The cases in which a broadcast/multicast packet is not delivered to
all the links in the campus are:
o when there is a VLAN tag, in which case the packet will only be
delivered to links that support that VLAN
o when the layer 2 multicast is derived from an IP multicast, and
the RBridges have learned, through IGMP snooping, which links wish
to receive the packet
5.2. Reasons Not to Optimize Handling of IP packets
There are two optimizations that were considered but abandoned due to
their impact on transparency, i.e., that an RBridge should appear
like a bridged network to upper layer protocols. These optimizations
focus on ways of merging the shim layer functionality with the
existing headers of IP packets.
5.2.1. Avoiding Encapsulation for On-campus IP Packets
In theory, on-campus IP packets need not be encapsulated with an
additional layer 2 header. The original layer 2 header can be
discarded and replaced with one where the layer 2 destination is
replaced by the next RBridge, and the source layer 2 address is
replaced by something that will not confuse bridge learning (since
packets will be injected into each segment from unpredictable
directions because shortest path routes will be used).
The disadvantages of this approach are:
o the IP header's TTL would be decremented by each RBridge, making
the customer aware that bridges have been replaced by RBridges,
and possibly breaking IP protocols that expect the TTL not to be
decremented over an L2 system
o the original layer 2 addresses might need to be preserved for some
conceivable uses
The real disadvantage, though, is that RBridges would have to have
more complex forwarding behavior. They would need to forward based on
layer 2 addresses sometimes, and layer 3 addresses at other times.
Even if all packets were IP, RBridges would need to forward packets
for off-campus IP destinations based on the layer 2 address of the IP
router.
5.2.2. Avoiding Encapsulation for Off-campus IP Packets
Likewise, in theory, off-campus IP packets need not be encapsulated.
The TTL in the IP header can be decremented. The same disadvantages
as for on-campus IP packets apply, including the concerns on the
impact of decremented TTL on other IP protocol behavior. However,
there is the additional disadvantage that since the actual layer 2
destination has to be preserved end-to-end there is the danger of
packet proliferation if multiple RBridges decide to forward the
packet, which can occur while the topology is adjusting.
5.3. Supporting Heterogeneous Link Types
It is easy to support link types other than 802 links with RBridges.
However, mixing link types within a single campus raises
complexities, such as packet size, incompatible layer 2 addresses,
and other layer 2 features (such as priority) that might be lost when
trying to "bridge" two different link types.
5.4. Effects on L3 TTL
In general, an RBridge should have no effect on a Layer 3 TTL field
(e.g., the IP TTL), since the RBridge is a Layer 2 device. The TTLs which
ensure loop-free operation in an RBridge system should occur in the
encapsulation header, and not affect any of the headers of the packet
passed through the RBridge system. The RBridge should do nothing to
transited packets other than that which would be done by an
equivalent L2 system.
5.5. Using L3 encapsulation
RBridges may use L3 (e.g., IP) encapsulation to provide a routable
internal address and a loop-check indicator. This allows the RBridge
system to use L3 routing algorithms, e.g., OSPF, using existing L3
implementations. As with any RBridge system, packets are forwarded
only within the preconfigured RBridge system. Intermediate L2
bridges are allowed whether L2 or L3 encapsulation is used. L3
encapsulation processing (including ICMP handling, fragmentation,
etc.) is well-defined (e.g., RFC2003).
In this case, the L3 encapsulation should not decrement the TTL of
the inner transited packet, since (as per RFC2003) the RBridge system
would not be considered a forwarding (i.e., L3) 'tunnel'. Further,
changing the IP TTL would potentially affect the reachability of
all-1's broadcast or multicast packets, which would then not reach
the full L2 subnet.
The primary disadvantages of L3 encapsulation are the increased
overhead of encapsulation (e.g., adding both an L3 header and a
subsequent outer L2 header) and the complexity of providing L2
services (notably broadcast) within the L3 subnet (RFC1122, RFC1812).
Note that L3 encapsulation supports fragmentation and reassembly for
tunnels, for both IPv4 and IPv6. Reassembly would be required at the
egress, which increases the load on the egress RBridge in tracking
and storing the fragments, but the process is generally transparent
to the resulting transited packet. The primary impact would arise
under heavy reordering (which increases the reassembly load) or high
packet loss (which results in failed reassembly and thus lost
packets). In the latter case, packet loss is amplified because the
fragments of a single transited packet do not share fate: losing any
one fragment loses the whole packet.
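The amplification of loss can be made concrete with a short
calculation. This is a rough sketch under an assumption the text does
not state, namely that fragments are lost independently with the same
probability; the function name packet_loss is hypothetical.

```python
def packet_loss(fragment_loss: float, fragments: int) -> float:
    """Probability that a transited packet is lost.

    Assumes each of the `fragments` pieces is lost independently with
    probability `fragment_loss`; the packet survives only if every
    fragment arrives, since reassembly at the egress needs all of them.
    """
    if not 0.0 <= fragment_loss <= 1.0:
        raise ValueError("fragment_loss must be between 0 and 1")
    if fragments < 1:
        raise ValueError("fragments must be at least 1")
    return 1.0 - (1.0 - fragment_loss) ** fragments
```

Under this assumption, a 1% per-fragment loss rate with two fragments
per packet gives a packet loss rate of about 2%, and the effective
loss rate grows with the number of fragments.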
5.6. Optimizing ARP/ND
There are various alternatives for how an RBridge could handle
ARPs/NDs when the target is known (because of having been
disseminated through the link state protocol). Listed from most
expensive to least expensive:
o treat ARP/ND like any multicast packet, and send along the
(appropriate) spanning tree, and let the target respond
o route the ARP/ND to the RBridge that claims attachment to the
target
o do proxy ARP/ND
The only reason not to do proxy ARP/ND is that the target node may
have moved without the RBridges having discovered the move yet. If
the actual target must respond, then the response itself confirms
that the target is there. If the query is routed to the expected
link, then there won't be a false positive, but if the target has
moved, its real location may not be found.
Some mix of these strategies might be the best solution. For
instance, if the target's location has not been recently verified
through a broadcast ARP/ND, then the source's RBridge should
broadcast the ARP/ND; otherwise it should do proxy ARP/ND. To support
this, RBridges could keep track of the last time a broadcast ARP/ND
occurred for each endnode E (by any source, and injected by any
RBridge). Suppose the freshness parameter is 20 seconds. If a source
S on
RBridge R1's link issues an ARP/ND query for D, then R1 broadcasts
the query if it has not seen a broadcast ARP/ND for D within the last
20 seconds; otherwise it proxies the reply.
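The mixed strategy above might be sketched as follows. This is an
illustration only, not a specification: the class ArpNdPolicy, its
method names, and the injectable clock are hypothetical, and the
sketch assumes the target's attachment is already known through the
link state protocol. The 20 second default is the example value from
the text.

```python
import time
from typing import Callable, Dict


class ArpNdPolicy:
    """Decide whether R1 broadcasts an ARP/ND query or proxies the reply."""

    def __init__(self, freshness_secs: float = 20.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.freshness_secs = freshness_secs
        self.clock = clock
        # Time of the last broadcast ARP/ND seen for each endnode.
        self.last_broadcast: Dict[str, float] = {}

    def note_broadcast(self, target: str) -> None:
        """Record a broadcast ARP/ND for target, from any source or RBridge."""
        self.last_broadcast[target] = self.clock()

    def decide(self, target: str) -> str:
        """Return "proxy" if target was recently verified, else "broadcast"."""
        seen = self.last_broadcast.get(target)
        if seen is not None and self.clock() - seen < self.freshness_secs:
            return "proxy"
        # R1 injects the broadcast itself, which refreshes the record.
        self.note_broadcast(target)
        return "broadcast"
```

An RBridge would call note_broadcast whenever it sees a broadcast
ARP/ND for an endnode, and decide when a locally attached source
queries for one. A smaller freshness value detects moved endnodes
sooner at the cost of more broadcast traffic.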
6. Security Considerations
The goal is for RBridges not to introduce security issues beyond
those already present with traditional bridges. RBridges will not be
able to prevent nodes from impersonating other nodes, for instance,
by issuing bogus ARP replies. However, RBridges will not interfere
with any schemes that would secure neighbor discovery.
As with routing schemes, authentication of RBridge messages would be
a simple addition to the design (and it would be accomplished the
same way as it would be in IS-IS). However, any sort of
authentication requires additional configuration, which might
interfere with the perception that RBridges, like bridges, are zero
configuration.
7. Conclusions
This design allows transparent interconnection of multiple links into
a single IP subnet. Management would be just like with bridges
(plug-and-play). But this design avoids the disadvantages of
bridges. Temporary loops are not a problem so failover can be as
fast as possible, and shortest paths can be followed.
The design is compatible with current IP nodes and routers, and with
current bridges.
8. Acknowledgments
We anticipate that many people will contribute to this design, and
invite you to join the mailing list at http://www.postel.org/rbridge
9. References
9.1. Normative References
[1] Perkins, C., "IP Encapsulation within IP", RFC 2003 (Standards
Track), October 1996.
[2] Braden, R., "Requirements for Internet Hosts - Communication
Layers", STD 3, RFC 1122, October 1989.
[3] Baker, F., "Requirements for IP Version 4 Routers", RFC 1812
(Standards Track), June 1995.
[4] Plummer, D., "Ethernet Address Resolution Protocol: Or
converting network protocol addresses to 48.bit Ethernet
address for transmission on Ethernet hardware", STD 37, RFC
826, November 1982.
[5] Narten, T., Nordmark, E. and W. Simpson, "Neighbor Discovery
for IP Version 6 (IPv6)", RFC 2461 (Standards Track), December
1998.
[6] Callon, R., "Use of OSI IS-IS for routing in TCP/IP and dual
environments", RFC 1195, December 1990.
[7] IEEE, "IEEE 802.1d bridging standard".
[8] Perlman, R., "RBridges: Transparent Routing", Proc. Infocom
2004, March 2004.
[9] Perlman, R., "Interconnection: Bridges, Routers, Switches, and
Internetworking Protocols", Addison Wesley Chapter 3, 1999.
[10] Touch, J., "Dynamic Internet overlay deployment and management
using the X-Bone", Computer Networks Vol. 36, No. 2-3, July
2001.
[11] Touch, J., Wang, Y., Eggert, L. and G. Finn, "A Virtual
Internet Architecture", ISI Technical Report ISI-TR-570,
Presented at the Workshop on Future Directions in Network
Architecture (FDNA) 2003 at Sigcomm 2003, March 2003.
9.2. Informative References
[12] Harkins, D. and D. Carrel, "The Internet Key Exchange (IKE)",
RFC 2409 (Standards Track), November 1998.
[13] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
November 1990.
[14] Lahey, K., "TCP Problems with Path MTU Discovery", RFC 2923
(Informational), September 2000.
[15] Kent, S., "IP Encapsulating Security Payload (ESP)",
draft-ietf-ipsec-esp-v3-10 (work in progress), March 2005.
[16] Kent, S., "IP Authentication Header",
draft-ietf-ipsec-rfc2402bis-011 (work in progress), March 2005.
[17] Kaufman, C., "Internet Key Exchange (IKEv2) Protocol",
draft-ietf-ipsec-ikev2-17 (work in progress), Oct. 2004.
Authors' Addresses
Radia Perlman
Sun Microsystems
Email: Radia.Perlman@sun.com
Joe Touch
USC/ISI
4676 Admiralty Way
Marina del Rey, CA 90292 U.S.A.
Phone: +1 (310) 448-9151
Email: touch@isi.edu
Alper Yegin
Samsung Advanced Institute of Technology
Email: alper.yegin@samsung.com
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2005).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.