One document matched: draft-jen-apt-01.txt
Differences from draft-jen-apt-00.txt
Network Working Group D. Jen
Internet-Draft M. Meisel
Intended status: Informational D. Massey
Expires: May 21, 2008 L. Wang
B. Zhang
L. Zhang
November 18, 2007
APT: A Practical Transit Mapping Service
draft-jen-apt-01.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 21, 2008.
Copyright Notice
Copyright (C) The IETF Trust (2007).
Abstract
The size of the global routing table is a rapidly growing problem.
Several solutions have been proposed. These solutions commonly
divide the Internet into two address spaces, one for determining the
delivery location, and one to use during transit. Packets destined
for delivery addresses are tunneled through the default-free zone
Jen, et al. Expires May 21, 2008 [Page 1]
Internet-Draft Transit Mapping November 2007
(DFZ), which uses only transit addresses. For this process to work,
there must be a mapping service that can supply an appropriate
destination transit address for any given delivery address. We
present a design for such a mapping service. We adhere to a "do no
harm" design philosophy: maintain all desirable features of the
current architecture without negatively affecting its security or
reliability. Our design aims to minimize delay and prevent loss in
packet encapsulation, minimize the number of modifications to
existing hardware, minimize the number of new devices, and keep the
level of control traffic manageable.
Jen, et al. Expires May 21, 2008 [Page 2]
Internet-Draft Transit Mapping November 2007
Table of Contents
1. Requirements Notation . . . . . . . . . . . . . . . . . . . . 4
2. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 4
3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
4. APT Overview and Requirements . . . . . . . . . . . . . . . . 6
5. The Mapping Service . . . . . . . . . . . . . . . . . . . . . 8
5.1. A Mapping Example . . . . . . . . . . . . . . . . . . . . 9
6. Multihoming Support . . . . . . . . . . . . . . . . . . . . . 11
6.1. Using Alternate ETRs During Failures . . . . . . . . . . . 12
6.1.1. Handling Taddr Prefix Failures . . . . . . . . . . . . 12
6.1.2. Handling Single-ETR Failures . . . . . . . . . . . . . 13
6.1.3. Handling TR-to-DN Link Failures . . . . . . . . . . . 13
7. Exchanging MapSets Between TNs . . . . . . . . . . . . . . . . 14
7.1. MapSet Dissemination via DM-BGP . . . . . . . . . . . . . 14
7.2. Regular MapSet Refresh . . . . . . . . . . . . . . . . . . 15
8. Security and Reliability . . . . . . . . . . . . . . . . . . . 15
8.1. Authenticating the Originator of Mapping Updates . . . . . 15
8.2. Detecting MapSet Misconfigurations . . . . . . . . . . . . 16
8.3. APT Control Messages . . . . . . . . . . . . . . . . . . . 17
9. Scalability through Recursion . . . . . . . . . . . . . . . . 17
10. Mapping Announcements . . . . . . . . . . . . . . . . . . . . 18
11. APT Header and Control Messages . . . . . . . . . . . . . . . 19
11.1. APT Header Fields . . . . . . . . . . . . . . . . . . . . 19
11.2. Cache Add Messages . . . . . . . . . . . . . . . . . . . . 20
11.3. Cache Drop Messages . . . . . . . . . . . . . . . . . . . 20
11.4. ETR Unreachable Messages . . . . . . . . . . . . . . . . . 20
11.5. DN Unreachable Messages . . . . . . . . . . . . . . . . . 21
11.6. The ETR-to-DN Link Failure Message Type . . . . . . . . . 21
12. Incremental Deployment . . . . . . . . . . . . . . . . . . . . 21
13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
14. Security Considerations . . . . . . . . . . . . . . . . . . . 21
15. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21
15.1. Normative References . . . . . . . . . . . . . . . . . . . 21
15.2. Informative References . . . . . . . . . . . . . . . . . . 22
Appendix A. Open Issues . . . . . . . . . . . . . . . . . . . . . 22
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23
Intellectual Property and Copyright Statements . . . . . . . . . . 25
Jen, et al. Expires May 21, 2008 [Page 3]
Internet-Draft Transit Mapping November 2007
1. Requirements Notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Problem Statement
The unexpected, explosive growth of the Internet is causing a greater
and greater strain on its infrastructure. This problem has been
well-documented in [RAWS][AddrAlloc]. Several solutions have been
proposed to address this problem [CRIO][EFIT][EFIT-ID][LISP][SixOne]
the majority of which involve separating the Internet into two parts,
one for determining the delivery location, and one to use during
transit. Routers in transit space would only need to know how to
route to transit prefixes, which are stable and conducive to
topological aggregation. When a packet is sent from source delivery
address A to destination delivery address B, A's provider-edge router
(the ingress tunnel router, or "ITR", as defined in [LISP])
encapsulates the packet and sends it through transit space to B's
provider-edge router (the egress tunnel router, or "ETR"). B's ETR
decapsulates the packet and forwards it to the appropriate recipient,
B.
When encapsulating a packet, A's ITR must somehow determine B's ETR's
transit address and include it in the outer header. In general, any
ITR must be able to map any given delivery address to a corresponding
ETR transit address for proper tunneling through transit space. This
illustrates the need for a mapping service that can provide this
address. The design details of this mapping service will play a
large part in determining the effectiveness of any proposed
implementation of a delivery/transit address space separation. The
mapping service also presents a new opportunity to enhance the
services currently offered by the Internet, which is further reason
to carefully consider how this service should be implemented. Should
mapping information be distributed via a push or a pull model? What
additional information, if any, should be distributed along with the
mapping information? Can we satisfy the mapping requirement without
impacting packet delivery quality?
Our answers to these questions are rooted in a "do no harm" design
philosophy: improve routing scalability without sacrificing any
desirable features in the current architecture or negatively
affecting its security and reliability. To this end, we present APT,
a practical transit mapping service designed with the following goals
in mind.
Jen, et al. Expires May 21, 2008 [Page 4]
Internet-Draft Transit Mapping November 2007
o Minimize delay and prevent loss in packet encapsulation.
o Minimize the number of devices that need to be modified to support
APT.
o Minimize the number of devices that will require additional
resources or complexity.
o Keep the design modular so that the method used to propagate
mapping information is independent from the method used to
retrieve mapping information for tunneling.
3. Terminology
Transit Network (TN) - An AS whose business is to provide packet
transport services for its customers. Transit networks provide
packet forwarding services for delivery networks (see definition
below). As a rule of thumb, if the AS appears in the middle of any
ASPATH in a BGP route today, it is considered a transit network.
Delivery Network (DN) - A network that is a source or destination of
IP packets, but forwards packets between neither TNs nor other
delivery networks.
Transit Space - The IP address space used by transit networks. We
will also use the term "transit space" to refer to the topological
area of the Internet where transit addresses are routable.
Delivery Space - The set of all IP address spaces used by delivery
networks. We will also use the term "delivery space" to refer to the
topological area of the Internet outside of transit space -- that is,
where only delivery addresses are routable.
Transit Address (Taddr) - A Taddr is an address in transit space.
Delivery Address (Daddr) - A Daddr is an address in delivery space.
Default Mapper - A new device required by APT. Each transit network
MUST have at least one default mapper. A default mapper maintains a
complete mapping table. In other words, given any Daddr, default
mappers can return a corresponding Taddr. To support the growing
trend towards multihoming, the mappings stored in default mappers
will map a Daddr prefix to a non-empty SET of destination Taddrs, all
of which are expected to have a direct connection to the DN.
Tunnel Router (TR) - All edge routers in a TN will become TRs. Like
ITRs and ETRs in LISP [LISP], TRs provide the encapsulation and
Jen, et al. Expires May 21, 2008 [Page 5]
Internet-Draft Transit Mapping November 2007
decapsulation services required for tunneling packets through transit
space. A TR has both ITR and ETR functionality, meaning that any TR
can perform both encapsulation and decapsulation of packets. To
properly encapsulate any given packet, TRs can query the default
mappers for mapping information. TRs also cache commonly used
MapRecs locally. Note that TR cache entries are NOT identical to the
mappings stored at default mappers (see the definitions of "MapSet"
and "MapRec" below).
APT Node - A general term referring to any device type introduced by
APT. This includes both default mappers and TRs.
MapSet - A MapSet contains a Daddr prefix and a non-empty SET of ETR
Taddrs associated with the prefix. MapSets also include related
information such as priority rankings for each of the ETRs in the
set. Default mappers store MapSets.
MapRec - A MapRec contains a Daddr prefix and any SINGLE ETR Taddr
associated with that prefix. Any MapRec is a subset of the complete
MapSet for its Daddr prefix. TRs store MapRecs along with an
associated TTL. A MapRec is removed from a TR's cache once its TTL
expires.
4. APT Overview and Requirements
This section is a comprehensive overview of the devices and protocols
introduced by APT. For explanations and justifications, see the
corresponding referenced sections.
Default Mapper Requirements (see Section 5)
o Default mappers must have enough storage space to store the full,
global mapping table and associated metadata.
o Every destination Taddr in a MapSet MUST have an associated time
before retry (TBR, see Section 6.1).
o Default mappers MUST keep track of the Taddrs of the TRs they
serve.
o Default mappers MUST examine the destination Taddr of incoming
packets for addresses other than their own.
TR Requirements (see Section 5)
o TRs MUST keep a small cache to hold recently-used MapRecs and
their TTLs.
Jen, et al. Expires May 21, 2008 [Page 6]
Internet-Draft Transit Mapping November 2007
o TRs MUST have a default route to their default mapper.
o TRs MUST be able to encapsulate and decapsulate IP-in-UDP packets
with an APT header (see Section 11).
Failover for Multihomed DNs (see Section 6.1)
o When a Taddr prefix is withdrawn via BGP (see Section 6.1.1)
* ITRs forward packets destined for unroutable Taddrs to their
default mapper.
* The default mapper forwards the packet to an alternate ETR if
one is available.
* The default mapper sends a Cache Add Message to the originating
ITR.
o When a TR becomes unreachable (see Section 6.1.2)
* Packets destined for the TR are intercepted by its default
mapper.
* The default mapper sets the TBR for the appropriate MapRec.
* The default mapper forwards TR-addressed packets to an
alternate ETR if one is available.
* The default mapper sends an ETR Unreachable packet to the ITR's
default mapper.
* The default mapper broadcasts a Cache Drop Message to its TRs.
* The ITR's default mapper sets the TBR for the appropriate
MapRec.
* The ITR's default mapper broadcasts a Cache Drop Message to its
TRs.
o When a DN becomes unreachable from its TR (see Section 6.1.3)
* The TR forwards packets destined for the DN to its default
mapper, setting the APT packet type to ETR-to-DN link failure
(see Section 11.1).
* The default mapper sets the TBR for the appropriate MapRec.
Jen, et al. Expires May 21, 2008 [Page 7]
Internet-Draft Transit Mapping November 2007
* The default mapper forwards the packet to an alternate ETR if
one is available.
* The default mapper sends a Delivery Network Unreachable packet
to the ITR's default mapper.
* The default mapper broadcasts a Cache Drop Message to its TRs.
* The ITR's default mapper sets the TBR for the appropriate
MapRec.
* The ITR's default mapper broadcasts a Cache Drop Message to its
TRs.
Mapping Dissemination
o Default mappers MUST sign updates with their TN's private key.
o Default mappers MUST verify the signature before processing or
forwarding MapSet updates (see Section 8).
o Default mappers MUST NOT remove or alter the signature when
forwarding the update.
o Default mappers MUST cryptographically sign control messages that
may need to travel between ASes.
o Default mappers MUST speak DM-BGP and peer with other default
mappers (see Section 7.1).
* DM-BGP is a separate instance of standard BGP that runs on a
different TCP port.
* Only default mappers speak DM-BGP.
* DM-BGP updates carry mapping updates in a new attribute type.
5. The Mapping Service
TRs serve as the gateway between delivery and transit space. When a
TR receives a packet from a DN that needs to be routed through
transit space, it maps the packet's destination Daddr to an
appropriate destination Taddr (the mapping lookup details are
presented below). The TR will then encapsulate the packet with a UDP
header containing an APT header followed by the original layer-3
packet as the UDP payload (see Section 11). The packet can then be
routed through transit space.
Jen, et al. Expires May 21, 2008 [Page 8]
Internet-Draft Transit Mapping November 2007
To minimize the latency introduced by encapsulation, APT seeks to
store mapping information as close to the ITR as possible. However,
the global mapping table is likely to grow very large over time. To
avoid undue memory requirements for ITRs while still keeping mapping
information within reach, we introduce the concept of default
mappers.
A TR does not need to store the entire global mapping table.
Instead, it queries a default mapper for mapping information and
caches recently used MapRecs.
Default mappers are the only devices in the network that need to
store the complete global mapping table. As we will see in the
following example, TRs only make use of default mappers in the event
of a cache miss. This means that, given sufficiently sized caches at
the TRs, network latency will not heavily depend upon default mapper
performance. Note that each TN need only have a single default
mapper, but may choose to deploy more to avoid a single point of
failure and to enhance overall performance. In the latter case, a TN
MAY choose to use anycast to reach one of the default mappers or use
multicast to reach all of them.
5.1. A Mapping Example
Jen, et al. Expires May 21, 2008 [Page 9]
Internet-Draft Transit Mapping November 2007
Below is a simple topology for demonstrative purposes. A and B are
DNs, each addressable via a single Daddr prefix, TN1 and TN2 are TNs,
ITR1, ETR1, and ETR2 are TRs, any node labeled "X" is a router, and
M1 and M2 are default mappers. A portion of the mapping table for M1
is shown.
___ ___
/ A \ / B \_________
\___/ \___/ | Delivery Space
- - - - -|- - - - - - - - - - - - - - - - -| - - - - -|- - - - - - - - -
.--+---. .--+---. | Transit Space
__-| ITR1 |-__ __-| ETR1 |-__ |
/ '------' .`--. .--'. '------' .`--+--.
| T ____ | X |------------| X | T ____ | ETR2 |
| N | M1 | '-;-' '-:-' N | M2 | '-;----'
\ 1 '-/\-' / \ 2 '----' /
\_____/ \___/ \____________/
_______/ \___________________
| DN | TS Addr | Priority |
|----------|----------|----------|
| ... | ... | ... |
|----------|----------|----------|
| B | ETR1 | 10 |
| | ETR2 | 20 |
|----------|----------|----------|
| ... | ... | ... |
'--------------------------------'
Figure 1
In this section, we illustrate how TRs and default mappers interact
within a TN to properly tunnel packets through transit space.
In Figure 1, a node in network A sends a packet to a Daddr in network
B. When this packet arrives at ITR1, ITR1 looks up the destination
Daddr in its MapRec cache. If a matching prefix is present in its
cache, ITR1 simply encapsulates the packet with the corresponding
destination Taddr and sends it across transit space. If a matching
prefix is not present, ITR1 will send the packet through its default
mapper, M1. It does this by encapsulating the packet with the
(possibly anycast) address for its default mapper(s) as the
destination Taddr.
This packet will arrive at M1, the only default mapper in TN1. When
M1 receives the packet, it decapsulates the packet and examines the
destination Daddr. Since default mappers store the full, global
mapping table, a default mapper will always be able to encapsulate
the packet with a valid destination Taddr. All packets encapsulated
Jen, et al. Expires May 21, 2008 [Page 10]
Internet-Draft Transit Mapping November 2007
by a default mapper MUST contain the default mapper's Taddr as the
source address.
In addition to forwarding the packet to an appropriate TR (ETR1, in
this case), M1 also treats the incoming packet as an implicit request
from ITR1 for mapping information. M1 responds to ITR1 with a Cache
Add Message (see Section 11.2) containing a MapRec that maps B to
ETR1. This allows ITR1 to add this MapRec to its cache so that ITR1
can tunnel further packets destined for B directly to ETR1. The
MapRec also has an associated time to live (TTL) that is set by M1.
The TTL ensures that ITR1 will occasionally re-request this mapping
information from M1. At this time, if the mapping information has
changed in any way since ITR1's prior request, M1 can respond with an
updated MapRec. Without this TTL, ITR1's cached information may
become stale over time.
6. Multihoming Support
In the example above, the observant reader may have noted that B is
multihomed. That is, B can be reached through both ETR1 and ETR2.
Multihoming provides B with both enhanced reliability in case of a
connectivity failure and the flexibility to distribute incoming
traffic across different tunnel endpoints.
In accordance with our design goals, all of the logic for selecting a
tunnel endpoint for a multihomed DN is contained within default
mappers. Default mappers store full MapSets containing the addresses
of all ETRs for a given Daddr prefix, while TRs only store a single
MapRec per Daddr prefix. When a TR requests a MapRec for a
multihomed DN, it is up to the default mapper to decide which one to
return.
Many DNs will want to have some control over which tunnel endpoint is
used for incoming traffic. Therefore, each MapRec in a MapSet has an
associated priority value, which is made available to all default
mappers throughout the transit space (see Section 7). The number is
to be treated like a ranking -- an ETR with a lower priority value is
more preferable.
At the same time, a sending TN may have its own preference regarding
which of the ETRs to use for a given Daddr prefix. Default mappers
can use a combination of locally configured routing policies and
MapSet priority information to choose from the set of valid ETR
addresses. Going back to Figure 1, assume that ITR1 does not have a
MapRec for B in its cache. When A addresses a packet to B, ITR1 will
send the packet to M1. If M1 has no preference between ETR1 and
ETR2, it will examine the priority values in B's MapSet and select
Jen, et al. Expires May 21, 2008 [Page 11]
Internet-Draft Transit Mapping November 2007
ETR1, B's most preferred ETR. M1 forwards the packet to ETR1 and
returns the corresponding MapRec to ITR1, which stores the MapRec in
its cache.
In the case of a priority value tie, the default mapper can break the
tie by picking the ETR to which it has the shortest path. If some
ETRs are tied in terms of both lowest priority value and shortest
path, the default mapper is free to break the tie arbitrarily. The
address of the selected ETR will be used as the destination address
when encapsulating the packet.
We envision that DNs will be able to manipulate their incoming
traffic load by setting appropriate priority values in their MapSet.
A DN who wants load balancing can assign the same priority value to
all of his MapRecs. A DN who wants to have one TN as a primary
provider and another only as a backup can simply assign a higher
priority value to his ETR at his backup provider.
6.1. Using Alternate ETRs During Failures
When a network failure has rendered an ETR unable to perform its
duties, an affected multihomed user will expect his traffic to be
temporarily routed through an alternate ETR. There are three general
types of failures that would require an ITR to use an alternate ETR:
(1) an ITR may have discovered via BGP that it can no longer reach
the Taddr prefix containing the address of the intended ETR, (2) the
ETR itself may go down or lose connectivity, and (3) the link between
a DN and its TR may be down, a new problem introduced by the
tunneling architecture. This section will explain how each type of
failure is handled, using Figure 1 as a reference. We assume that,
at the time of failure, all TNs are using ETR1 to reach B.
To assist in handling these failures, default mappers store a time
before retry (TBR) for each MapRec. Normally, the TBR for each
MapRec is set to zero, indicating that it is usable. Any MapRec with
a non-zero TBR value is considered invalid. We will refer to the
action of setting a MapRec's TBR to a non-zero value as "invalidating
a MapRec." MapRecs that map to unroutable destinations are also
considered invalid. So long as a MapRec is invalid, default mappers
will not use this entry as a destination address or include it in
mapping responses. The role of the TBR in handling failures will
become clear in the explanations below.
6.1.1. Handling Taddr Prefix Failures
For failures of type (1), ITR1 has no route to ETR1. Assume a host
in network A attempts to send a packet to a host in network B. If
ITR1 does not have a MapRec for B in its cache, it will forward the
Jen, et al. Expires May 21, 2008 [Page 12]
Internet-Draft Transit Mapping November 2007
packet to M1 (see Section 5.1). If ITR1 does have a MapRec for B in
its cache, it will see that it has no route to ETR1, and forward the
packet to its default mapper, M1. M1 will also see that it has no
route to ETR1, and thus select the next-most-preferred ETR for B,
ETR2. If it has a route to ETR2, it sends the packet with ETR2 as
the destination Taddr and replies to ITR1 with the corresponding
MapRec. M1 can assign a relatively short TTL to the MapRec in its
response. Once this TTL expires, ITR1 will forward the next packet
for B to the default mapper, which will respond with the most-
preferred MapRec that is routable at that time. This allows ITRs to
quickly revert to using ETR1 once it becomes reachable again.
6.1.2. Handling Single-ETR Failures
In the second case, the Taddr prefix containing ETR1 is still
routable from ITR1, but ETR1 has failed or is otherwise unreachable.
Since this failure is confined to TN2, all routers in TN2 should be
able to detect that ETR1 is unreachable via TN2's IGP. In order to
prepare for this situation, M2 announces a very high-cost link to all
of the TRs it serves (in this case, ETR1 and ETR2) via IGP. When
ETR1 fails, since the normal IGP path to ETR1 will no longer be
valid, all packets addressed to ETR1 will be forwarded to M2 instead.
When M2 receives a data packet addressed to one of the TRs it serves
(ETR1, in this case), it will assume the TR is unreachable,
invalidate the corresponding MapRec, and broadcast a Cache Drop
Message (see Section 11.3) to all of the TRs it serves. Using the
default mapper address in the APT header (see Section 11), it will
also reply to the sender's default mapper (in this case, M1) with an
ETR Unreachable Message (see Section 11.4). M1 can then also
invalidate the corresponding MapRec and broadcast a Cache Drop
Message to its TRs.
In order to minimize packet losses, M2 should not simply drop data
packets addressed to ETR1. Instead, M2 should attempt to reroute the
packet to an alternate ETR, even if that ETR is in a different TN.
It can do this by simply decapsulating the packet, looking up the
MapSet for the Daddr prefix, and re-encapsulating the packet with a
valid ETR as the destination Taddr according to the normal ETR-
selection guidelines.
6.1.3. Handling TR-to-DN Link Failures
The final case involves a failure of the link connecting ETR1 to B.
When ETR1 discovers it cannot reach B, it will send packets destined
for B to its default mapper, M2, setting the APT message type to ETR-
to-DN Link Failure (see Section 11.6) when encapsulating the packet.
M2 will see that the packet's APT message type is ETR-to-DN Link
Jen, et al. Expires May 21, 2008 [Page 13]
Internet-Draft Transit Mapping November 2007
Failure, and handle this situation in the same way as situation 2
(see Section 6.1.2), except that the message it sends to M1 will be a
DN Unreachable Message (see Section 11.5) instead of an ETR
Unreachable Message.
DN Unreachable and ETR Unreachable Messages are handled the same way.
However, we have kept them as separate notification types in order to
allow for divergent behavior in the future.
7. Exchanging MapSets Between TNs
To avoid introducing latency or packet loss when encapsulating
packets, the default mappers must have all MapSets available locally.
In order for default mappers to store a full, global mapping table,
there must be some way for them to receive MapSets from other TNs.
However, only default mappers should receive MapSets. In this
section, we propose a method for MapSet dissemination. The APT
design in general does not depend on this particular method; it only
requires that SOME method exists for secure, up-to-date, lightweight
MapSet dissemination.
7.1. MapSet Dissemination via DM-BGP
MapSet dissemination can be accomplished using a separate BGP
instance that is only run between default mappers. We refer to this
new BGP instance as 'DM-BGP'. As a protocol, DM-BGP is identical to
BGP, but it serves a different purpose. DM-BGP is used to
disseminate MapSets, not as a reachability protocol. It is simply
run on a different TCP port and is only used by default mappers so as
not to affect the RIB-In of other nodes.
When a default mapper wishes to distribute his TN's mapping
information to other default mappers, he sends out a DM-BGP update
with the mapping information included as an optional, transitive BGP
attribute with a new type. The NLRI included MUST be a prefix that
uniquely identifies the source TN. When other default mappers
receive DM-BGP updates, they store this information in their MapSet
tables, replacing any existing MapSets. BGP policy knobs can still
be tuned as desired by each TN. Upon receiving mapping updates, TNs
can choose whether to forward the update to each of their peers, so
long as their actions are in accordance with the BGP protocol.
A default mapper may receive the same mapping update more than once.
This will occur when there is more than one DM-BGP path from the
source default mapper's TN to the receiving default mapper's TN.
Along with the mapping information, the new attribute should include
a sequence number to allow receivers to detect duplicate mapping
Jen, et al. Expires May 21, 2008 [Page 14]
Internet-Draft Transit Mapping November 2007
updates. Default mappers MUST regularly announce MapSets to the rest
of the network for all of the DNs to which their TN connects. As a
precaution, however, these DM-BGP updates should be infrequent and
rate-limited.
7.2. Regular MapSet Refresh
Regardless of the protocol used to disseminate MapSets, MapSets are
not transient data. In order for default mappers to prevent their
MapSet tables from strictly increasing in size without bound, they
must be able to remove stale MapSets. For this reason, each MapSet
entry MUST contain a time to live (TTL). A default mapper MAY remove
a MapSet from its table at any time after this TTL has expired. In
order to avoid premature removal from the global mapping table,
default mappers MUST (1) regularly re-announce all MapSets for DNs
they connect to and (2) set the TTL for each MapSet to no less than
three times their refresh interval.
8. Security and Reliability
Using DM-BGP to distribute mapping announcements guarantees that they
are only accepted from manually configured DM-BGP peers. This
ensures that mapping updates are no less secure than routing updates
are today. However, mapping updates have the potential to cause far
more damage; with no security measures in place, a mapping update
could direct ALL traffic for an entire Daddr prefix to an arbitrary
Taddr. APT strives to prevent attacks and misconfigurations from
having adverse effects outside of the TN in which they occur.
Therefore, mapping updates will require some level of security.
8.1. Authenticating the Originator of Mapping Updates
Our first step towards authenticating mapping updates is to
authenticate an update's originator. For this reason, each default
mapper MUST cryptographically sign the mapping data in any update it
originates. All default mappers within a single TN SHOULD use the
same key pair, but default mappers in different TNs MUST use
different key pairs. When a default mapper receives a mapping
update, it MUST verify this signature before processing or forwarding
the update. Default mappers MUST NOT remove or alter this signature
when forwarding the update.
Clearly, this scheme can only work if there is a secure way to
distribute all public keys to all default mappers. This should be a
relatively straightforward problem to solve. We describe one simple,
appropriate method for secure key distribution in a network of
manually configured peers in a separate document (forthcoming).
Jen, et al. Expires May 21, 2008 [Page 15]
Internet-Draft Transit Mapping November 2007
8.2. Detecting MapSet Misconfigurations
Though the scheme outlined in Section 8.1 allows for secure
authentication of the originator of a mapping update, it does not
guarantee the correctness of the data. Since DM-BGP peerings are
manually configured and therefore form a relatively closed network,
misconfigurations are far more likely than attacks to be the cause of
inaccurate mapping data.
The types of misconfigurations that could potentially be harmful are
those that result in one TN accidentally interfering with the MapSet
for a DN that it is not connected to. This can happen whenever a
provider accidentally announces a MapSet for the wrong Daddr prefix.
These types of accidental conflicts fall into three categories: (1) a
TN announces a MapSet for the wrong Daddr prefix when that prefix
already has a MapSet in the global mapping table, (2) a TN announces
a MapSet for a Daddr prefix that subsumes a longer Daddr prefix that
already has a MapSet, and (3) a TN announces a MapSet for a Daddr
prefix that is a subset of a shorter Daddr prefix that already has a
MapSet.
The first category of conflicts is the only one that we intend to
actively prevent. Clearly, the DN that owns a particular Daddr
prefix should be the ultimate authority for his mapping information.
However, DNs do not announce their MapSet to the network directly,
but rather through the TNs they connect to. In order to ensure a
mapping update for a Daddr prefix is approved by its rightful owner,
we must first include some sort of prefix owner identification in
each MapSet. To this end, we introduce a DN key field into each
mapping. This field SHOULD contain a cryptographically valid public
key, but it is not currently used as such. When a default mapper
receives a new MapSet that would replace an existing one, it only
needs to ensure that the DN key has not changed. (This scheme is
similar in spirit to the way that OpenSSH uses its 'known_hosts'
file.) Note that DN keys are different from the keys used by default
mappers to authenticate DM-BGP updates.
For the other two categories, it is less clear that such an
announcement is the result of a misconfiguration. It is possible,
for example, that the owner of a /16 Daddr prefix has resold some of
the /24 prefixes it contains to other DNs. In such a case, only the
administrators will know if the announcement is valid. It is for
this reason that (in the spirit of PHAS [PHAS]) we do not attempt to
prevent such changes, but only detect and notify interested parties.
Since legitimate MapSet changes are infrequent, notifying interested
parties of MapSet changes via e-mail is a perfectly viable option.
These notifications could also prove useful in debugging the mapping
service, or a particular TN's configuration.
Jen, et al. Expires May 21, 2008 [Page 16]
Internet-Draft Transit Mapping November 2007
8.3. APT Control Messages
APT never requires Cache Drop and Cache Add Messages to traverse AS
boundaries. Any such message that does traverse an AS boundary must
be an error or an attack. Therefore, TRs MUST ignore Cache Drop and
Cache Add messages with a source Taddr outside of their TN. Since
ISPs already generally drop packets from an external source when they
contain a local source address, this simple policy should be
sufficient to prevent TR cache poisoning, whether accidental or
intentional.
Since any APT control message that may need to travel between ASes
can also affect traffic flow, such control messages MUST be
cryptographically signed. This currently includes ETR Unreachable
Messages (see Section 11.4) and DN Unreachable Messages (see
Section 11.5). Recall that the infrastructure required to generate
and verify cryptographic signatures is already required for mapping
update dissemination (see Section 8.1). When a default mapper
receives such a control message, it MAY choose to verify this
signature.
9. Scalability through Recursion
It is conceivable that the global mapping table could eventually grow
large enough that it would no longer be possible to store it in a
single default mapper. Theoretically, the global mapping table could
grow to contain a separate MapSet for every Daddr prefix. In the
case of IPv6 prefixes, the total number of MapSets would be on the
order of 10^18, far more than we can expect to be able to store on a
single device. If the global mapping table were to approach such
gargantuan proportions, APT can simply be applied recursively.
In the recursive case, the terms "transit" and "delivery" are only
meaningful relative to a particular depth of recursion, or number of
times the packet has been encapsulated. We will refer to the non-
recursive deployment of APT as the global level (G). What we have up
until now referred to as delivery space and transit space are in fact
G delivery space and G transit space.
At one level of recursion, G transit space is split into two address
spaces: recursion depth 1 (R1) delivery space and R1 transit space.
R1 delivery space is just another name for G transit space. Which
name is used will depend on the context. R1 transit space can be
further split into two R2 spaces, and so on. Using this terminology,
all protocols and concepts in APT can be understood to apply
generally at any level of recursion.
Jen, et al. Expires May 21, 2008 [Page 17]
Internet-Draft Transit Mapping November 2007
This figure shows the layout of a packet while being tunneled at an
APT recursion depth of two.
________________________________________
| R2 transit header |
|--------------------------------------|
| R2 delivery a.k.a. R1 transit header |
|--------------------------------------|
| R1 delivery a.k.a. G transit header |
|--------------------------------------|
| G delivery header |
|--------------------------------------|
| |
| payload |
| |
|______________________________________|
Figure 2
10. Mapping Announcements
Each mapping announcement has the following fields:
o Address Type - This field specifies the type of Daddrs used in the
announcement. All Daddr prefixes in a single mapping announcement
MUST be of the same address type. Currently, this is expected to
be either IPv4 or IPv6, but other address types are also allowed.
o Total Length - This field specifies the total number of bytes used
by all MapSets in the announcement. Each mapping announcement can
contain MapSets for multiple prefixes, each with multiple MapRecs.
o Sequence Number - This field reflects the freshness of an update.
Default mappers can avoid processing updates with old sequence
numbers.
o Signature - The message should be cryptographically signed using
the private key of the sending default mapper.
These fields are followed by one or more MapSets. Each MapSet in the
announcement is described by the following fields:
o Daddr Prefix - This is the Daddr prefix for the MapSet.
o Time To Live (TTL) - This is the amount of time in hours that this
MapSet should persist in default mappers before being considered
obsolete and erased. This value MUST be set to at least three
Jen, et al. Expires May 21, 2008 [Page 18]
Internet-Draft Transit Mapping November 2007
times the sender's regular refresh interval. The TTL is specified
in hours to prevent misconfigurations from causing excessive
mapping updates.
o ETR Count - This is the total number of ETRs that the
corresponding Daddr prefix maps to.
o Each ETR in a MapSet is described by the following fields:
* Taddr - The address of this ETR.
* Priority - Priorities are arbitrary integers that only have
meaning in reference to each other. Taddrs with lower priority
values are considered more preferable.
* DN Public Key - This public key SHOULD uniquely identify the DN
that owns this MapSet. It can be used to help identify
configuration errors, and possibly for authoritative,
cryptographic authentication of MapSet data in the future.
11. APT Header and Control Messages
Delivery space packets are encapsulated with a UDP header by an ITR.
The UDP header should specify a well-known port reserved for APT, and
the UDP payload MUST begin with an APT header. For regular data, a
layer-3 header immediately follows the APT header. For other message
types, we describe the fields that follow below.
11.1. APT Header Fields
The APT header contains the following fields:
o Version - The version of APT that should be used to interpret the
header information.
o Tag - Extra field reserved for future use.
o Type - Determines the type of message being sent. Appropriate
values are as follows:
0: Regular Data
1: Cache Add (Section 11.2)
2: Cache Drop (Section 11.3)
Jen, et al. Expires May 21, 2008 [Page 19]
Internet-Draft Transit Mapping November 2007
3: ETR Unreachable (Section 11.4)
4: DN Unreachable (Section 11.5)
5: ETR-to-DN link failure (Section 11.6)
o Default Mapper Taddr - The address of the default mapper for the
ITR that generated this header. This is the Taddr where any
failure notifications from the destination TN will be sent. If
this header was generated by a default mapper, this field SHOULD
contain the same address as the source address in the
encapsulating IP header.
11.2. Cache Add Messages
Cache Add Messages are only sent by default mappers to TRs within
their own TNs, most notably in response to data packets. When a TR
receives a Cache Add Message, it simply adds the enclosed MapRec to
its cache, replacing any existing cache entry.
o Daddr Prefix - This is the Daddr prefix portion of the MapRec to
be added to the receiving TR's cache.
o ETR Taddr - This is the Taddr portion of the MapRec to be added to
the receiving TR's cache. It is the address of the ETR that can
reach the Daddr prefix in the previous field.
o TTL - The TTL specifies the amount of time in seconds before the
added cache entry should expire. Expired cache entries should be
deleted from the TR's cache.
11.3. Cache Drop Messages
Cache Drop Messages are only sent by default mappers to TRs within
their own TNs. When a TR receives a Cache Drop Message, it simply
removes the cache entry corresponding to the enclosed Daddr prefix
from its cache, if such an entry exists.
o Daddr Prefix - This is the Daddr prefix of the MapRec to be
dropped.
11.4. ETR Unreachable Messages
ETR Unreachable Messages are sent by default mappers to other default
mappers to notify them of failures.
o Transit Address - This is the Taddr of the ETR that cannot be
reached.
Jen, et al. Expires May 21, 2008 [Page 20]
Internet-Draft Transit Mapping November 2007
o Signature - The message should be cryptographically signed using
the private key of the sending default mapper.
11.5. DN Unreachable Messages
DN Unreachable Messages are sent by default mappers to other default
mappers to notify them of failures.
o Daddr Prefix - This is the Daddr prefix of the DN that cannot be
reached.
o Signature - The message should be cryptographically signed using
the private key of the sending default mapper.
11.6. The ETR-to-DN Link Failure Message Type
This message type is used by an ETR for two purposes: (1) to indicate
to its default mapper that its direct link to the DN for the enclosed
data packet is down, and (2) to preserve that data packet so that the
ETR's default mapper might deliver it to the DN by way of a different
ETR.
12. Incremental Deployment
Incremental deployment methods and incentives for APT will be
discussed in a separate draft (forthcoming).
13. IANA Considerations
This memo includes no request to IANA.
14. Security Considerations
Security considerations for APT are discussed in Section 8.
15. References
15.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
Jen, et al. Expires May 21, 2008 [Page 21]
Internet-Draft Transit Mapping November 2007
15.2. Informative References
[AddrAlloc]
Meng, X., Xu, Z., Zhang, B., Huston, G., Lu, S., and L.
Zhang, "IPv4 Address Allocation and BGP Routing Table
Evolution", ACM SIGCOMM Computer Communication Review
(CCR) special issue on Internet Vital Statistics, Volume
35, Issue 1, p71-80.
[CRIO] Zhang, X., Francis, P., Wang, J., and K. Yoshida, "Scaling
IP Routing with the Core Router-Integrated Overlay", Proc.
International Conference on Network Protocols , 11 2005.
[EFIT] Massey, D., Wang, L., Zhang, B., and L. Zhang, "A Scalable
Routing System Design for Future Internet", SIGCOMM IPv6
Workshop , 8 2007.
[EFIT-ID] Massey, D., Wang, L., Zhang, B., and L. Zhang, "A Proposal
for Scalable Internet Routing and Addressing", Internet Dr
aft, http://www.ietf.org/internet-drafts/
draft-wang-ietf-efit-00.txt, 2 2007.
[LISP] Farinacci, D., Fuller, V., Oran, D., and D. Meyer,
"Locator/ID Separation Protocol (LISP)", Internet Draft, h
ttp://www.ietf.org/internet-drafts/
draft-farinacci-lisp-05.txt, 2007.
[PHAS] Lad, M., Massey, D., Pei, D., Wu, Y., Zhang, B., and L.
Zhang, "PHAS: A Prefix Hijack Alert System", USENIX
Security .
[RAWS] Meyer, D., Zhang, L., and K. Fall, "Report from the IAB
Workshop on Routing and Addressing", Internet Draft, http:
//www.ietf.org/internet-drafts/
draft-iab-raws-report-02.txt, 2007.
[SixOne] Vogt, C., "Six/One: A Solution for Routing and Addressing
in IPv6", Internet Draft, http://www.ietf.org/
internet-drafts/draft-vogt-rrg-six-one-00.txt.
Appendix A. Open Issues
MapSets contain a priority field for each ETR, but this does not
allow for uneven distribution of traffic across ETRs with the same
priority, e.g. a 75/25 split. To provide a mechanism for DNs to
request such traffic distributions, we should also include a weight
field for each ETR.
Jen, et al. Expires May 21, 2008 [Page 22]
Internet-Draft Transit Mapping November 2007
If a TN sends out inaccurate mapping announcements, other TNs can
identify and respond to the misbehaving source TN. However, there
are no preventative security measures in place. Is detection and
response enough of a security measure?
We are considering automating customer-DN-to-provider-TN mapping
updates. Under our current design, whenever a DN needs to update its
mapping information (it may add, subtract, or change providers, or
change its priority values), the DN must contact its provider TNs
offline and request that they announce the updated mapping
information. It is then up to the provider TNs to update the mapping
information. As we have seen with DNS updates, human involvement
introduces the possibility of human error and delay. We hope to
provide DNs with an automated way to manage their mapping
information.
Is it too much to ask ISPs to change all of their PE routers into
TRs? We suspect that TR implementation should involve only software
changes. Existing router hardware can do everything required by a
TR. Thus, we suspect the cost should be reasonable.
Authors' Addresses
Dan Jen
Email: jenster@cs.ucla.edu
Michael Meisel
Email: meisel@cs.ucla.edu
Dan Massey
Email: massey@cs.colostate.edu
Lan Wang
Email: lanwang@memphis.edu
Beichuan Zhang
Email: bzhang@cs.arizona.edu
Jen, et al. Expires May 21, 2008 [Page 23]
Internet-Draft Transit Mapping November 2007
Lixia Zhang
Email: lixia@cs.ucla.edu
Jen, et al. Expires May 21, 2008 [Page 24]
Internet-Draft Transit Mapping November 2007
Full Copyright Statement
Copyright (C) The IETF Trust (2007).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).
Jen, et al. Expires May 21, 2008 [Page 25]
| PAFTECH AB 2003-2026 | 2026-04-21 18:23:23 |