NSIS Working Group R. Bless
Internet-Draft M. Doll
Expires: January 8, 2008 Univ. of Karlsruhe
Jul 07, 2007
Inter-Domain Reservation Aggregation for QoS NSLP
draft-bless-nsis-resv-aggr-01
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on January 8, 2008.
Copyright Notice
Copyright (C) The IETF Trust (2007).
Abstract
QoS NSLP is a recently proposed signaling protocol for establishing
QoS reservations in the Internet. To enable large-scale deployment,
inter-domain aggregation should be considered as a mechanism that
provides the necessary scalability in the control plane. This draft
describes the major problems that must be solved and proposes
solutions to them that require only modest modifications and
extensions to the currently defined GIST and QoS NSLP
specifications.
Bless & Doll Expires January 8, 2008 [Page 1]
Internet-Draft Inter-Domain Reservation Aggregation Jul 2007
Table of Contents
1. Introduction
2. Aggregation Concept
2.1. Aggregate Setup
2.2. Aggregate Use and Changes
2.3. Aggregate Teardown
3. Problems
3.1. Determination of Aggregator and Deaggregator
3.2. Signaling between Aggregator and Deaggregator
3.3. Route Change Detection for Aggregated Flows in an Aggregate
3.4. A Priori Determination of a Flow's Path
4. Solution Proposals
4.1. Determination of Aggregator and Deaggregator
4.2. Signaling Between Aggregate Endpoints
4.3. Route Change Detection for Aggregated Flows in an Aggregate
4.3.1. IP Layer Solution
4.3.2. GIST Layer Solution
4.3.3. NSLP Layer Solution
4.4. A Priori Determination of a Flow's Path
4.5. Example
5. Security Considerations
6. References
6.1. Normative References
6.2. Informative References
Authors' Addresses
Intellectual Property and Copyright Statements
1. Introduction
A primary objective of NSIS QoS Signaling is to perform resource-
based admission control for data flows. Per-flow signaling, however,
has scalability issues. Aggregation of resource reservations can be
used to achieve better scalability in the control plane (e.g., as
proposed in [RFC3175]). Aggregation achieves two important
reductions: reduction of state (or reservation context) information
and reduction of signaling message processing. Aggregation in the
data plane can be achieved by using packet forwarding mechanisms
according to the Differentiated Services architecture [RFC2475]. For
the sake of simplicity, we assume that the latter is used to provide
QoS for packet forwarding and that Autonomous Systems contain one or
more Differentiated Services domains.
For the following discussion, we assume that the reader is familiar
with RSVP aggregation concepts as described in [RFC3175]. For the
remainder of this memo the terms "aggregate" and "aggregation" are
used in the sense of "reservation aggregation", i.e., aggregation of
reservation state in the control plane. Furthermore, the term
"aggregated flow" denotes a flow that is contained in a reservation
aggregate that encompasses several single reservations (of the
aggregated flows). These aggregated flows also share some properties
in the data path, i.e. usually they belong to the same service class
and share a part of the same data path (but they need not be in the
same address aggregate or have the same destination).
Currently, QoS NSLP describes coarsely the process of reservation
aggregation, and it supports a single aggregation level using two
different router alert option (RAO) values. This is usually
sufficient if only intra-domain aggregation is considered. But if
the ultimate goal of providing QoS for an end-to-end communication is
considered from a global perspective, intra-domain aggregation is not
sufficient for a scalable end-to-end QoS support: if aggregated flows
leave an aggregation domain, the next domain sees all individual
(i.e., non-aggregated) flows again. Thus, larger transit providers in
particular will have to manage a large number of individual flows and
will suffer from scalability problems. Moreover, manually
establishing static aggregates between providers would incur a huge
management overhead. Therefore, we want to design a mechanism that
dynamically creates aggregates between different providers on
demand.
2. Aggregation Concept
This section briefly describes the concept of aggregated reservations
as assumed in this draft.
2.1. Aggregate Setup
Aggregation of QoS reservations is performed between an aggregator
and a deaggregator (which is located downstream from the aggregator,
cf. also RFC 3175 [RFC3175]). Based on some trigger (e.g., the
current number of reservations), the aggregator decides to subsume
several existing flow reservations along the same path segment (e.g.,
same AS hops) into a larger aggregate reservation. In order to make
this decision, the aggregator must find a potential deaggregator.
The flows (and their reservations) must follow exactly the same path,
at least up to the deaggregator. Therefore, the aggregator must
either know the actual path taken by the flows (e.g., by using
corresponding routing protocol information) or it must get notified
by the deaggregator explicitly. In this case the deaggregator must
know that the aggregator is an upstream node which is common to all
reservations under consideration.
The encompassing aggregate reservation is set up by appropriate
signaling between aggregator and deaggregator. During this
procedure, the existing reservations are moved into the aggregate
reservation, i.e., all intermediate nodes between aggregator and
deaggregator delete all state information related to the individual
reservations being aggregated, so they only see a single
aggregate reservation afterwards. This achieves the desired
reduction of managed state. Aggregator and deaggregator manage both
the individual reservations and the aggregate reservation, i.e., they
do not save any state themselves but instead have to manage one
additional state for the aggregate.
Furthermore, in order to save message processing cost, the aggregate
capacity should be somewhat larger than actually required by the
subsumed flows. In this case the aggregator should not need to adapt
the aggregate capacity every time a flow leaves or joins the
aggregate. Thus, the aggregate capacity should change only
infrequently, usually by applying some hysteresis function (cf.
discussion in [RFC3175], sec. 1.4.4).
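The hysteresis idea can be illustrated with a minimal Python sketch.
The function name, the headroom factor, and the shrink threshold are
hypothetical values chosen for illustration only; neither this draft
nor [RFC3175] prescribes them.

```python
def adapt_capacity(reserved, demand, headroom=1.25, shrink_below=0.5):
    """Return the new aggregate capacity for the current total demand.

    All parameters are illustrative assumptions, not protocol values.
    The capacity grows only when demand exceeds it and shrinks only when
    demand falls below shrink_below * reserved; in between it is left
    unchanged, so no aggregate signaling is needed.
    """
    if demand > reserved:
        return demand * headroom      # grow, leaving some spare room
    if demand < reserved * shrink_below:
        return demand * headroom      # shrink, but keep spare room
    return reserved                   # inside the hysteresis band


capacity = 100.0
for demand in (90.0, 95.0, 104.0, 80.0, 40.0):
    capacity = adapt_capacity(capacity, demand)
```

In this run only two of the five demand changes (104.0 and 40.0) lead
to a resize of the aggregate; the others are absorbed by the spare
capacity.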
2.2. Aggregate Use and Changes
When a new reservation request arrives at the aggregator, the
aggregator must determine a priori whether the new flow "fits" into an
existing aggregate. This requires knowing the flow's route and
whether enough residual capacity is left in the aggregate to subsume
the new request. If the aggregate capacity is too small, it must
be increased prior to including the new reservation.
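The admission decision at the aggregator might look roughly like the
following Python sketch. It assumes that the flow's path is already
known as a list of QNE identifiers; the class, the function names, and
the resize callback are hypothetical and only stand in for the actual
signaling.

```python
from dataclasses import dataclass


@dataclass
class Aggregate:
    path: tuple        # QNEs from aggregator to deaggregator (assumed known)
    capacity: float    # currently reserved aggregate capacity
    used: float = 0.0  # sum of the subsumed reservations


def try_admit(aggregate, flow_path, rate, resize):
    """Place a new reservation into the aggregate if it fits.

    resize(aggregate, needed) is a hypothetical callback that signals a
    capacity increase and returns True on success.
    """
    # The flow must follow the aggregate's path up to the deaggregator.
    if tuple(flow_path[:len(aggregate.path)]) != aggregate.path:
        return False
    needed = aggregate.used + rate
    # Increase the aggregate first if the residual capacity is too small.
    if needed > aggregate.capacity and not resize(aggregate, needed):
        return False
    aggregate.used = needed
    return True


def grow(aggregate, needed):
    """Stand-in for aggregate signaling that always succeeds."""
    aggregate.capacity = needed
    return True


agg = Aggregate(path=("q1", "q2"), capacity=10.0)
fits = try_admit(agg, ["q1", "q2", "q3"], 6.0, grow)
wrong_path = try_admit(agg, ["q1", "qx"], 1.0, grow)
```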
Signaling messages for all aggregated flows should be directly
forwarded from aggregator to deaggregator in order to save signaling
message processing by nodes between aggregator and deaggregator.
Furthermore, these nodes no longer have any knowledge about the
aggregated flow sessions, so messages related to these single flows
must not be signaled to them.
2.3. Aggregate Teardown
The aggregate can be torn down some time after the last reservation
has left the aggregate. The aggregator will notice either an
explicit teardown or a refresh timeout for the last reservation. If
no new reservation request shows up after a waiting period, the
aggregate reservation will be torn down completely.
3. Problems
We see several problems for a QoS NSLP to support inter-domain
aggregation, namely:
o Determination of Aggregator and Deaggregator
o Signaling between Aggregator and Deaggregator
o A Priori Determination of a Flow's Path
o Route Change Detection for Aggregated Flows in an Aggregate
These points are discussed in more detail in the following sections.
3.1. Determination of Aggregator and Deaggregator
When aggregation within a domain is considered, it is no problem to
choose an aggregator and deaggregator for a set of flows, because
boundary routers at the domain borders ("aggregation region") are
typically acting as aggregators and deaggregators for flows entering
and leaving the domain respectively. Thus, their role is
predetermined.
But if aggregation across domains is considered, it is not obvious
which routers are aggregators or deaggregators for a set of flows,
because there are many choices due to the fact that flows usually
traverse several different administrative domains. Aggregates are
more efficient the longer they are, because longer aggregates save
more states and control message processing. A set of flows can be
aggregated along the path that they share, i.e., along the set of
nodes that are traversed by all flows within this set. In the example
of Figure 1, three different flows are shown: f1 from host H1 to sink
S1, f2 from H2 to S2, and f3 from H3 to S3. While all three flows can
be aggregated along domains Dd, De, and Df, only f2 and f3 can be
aggregated from Dd up to Dg. Furthermore, the flows are entering the
domain Dd usually at different ingress routers and join somewhere
within the domain. However, they may leave the domain by the same
egress router. For the deaggregation domain, the reverse is true:
the aggregated flows enter the domain via the same ingress, but may
split sooner or later within the domain, leaving it via different
exits.
H1--Da--+
         \            S1
          \          /
H2--Db----Dd---De--Df--Dg--S2
          /              \
         /                S3
H3--Dc--+
f1: data flow H1->S1     Hx: Host x
f2: data flow H2->S2     Sx: Sink x
f3: data flow H3->S3     Dx: Domain x
Example for aggregation of flows along different domains
Figure 1
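The shared path segment of a set of flows can be computed as sketched
below in Python. The domain-level paths are assumptions derived from
the description of Figure 1 (f1 leaves the common path after Df), and
the helper names are hypothetical.

```python
def contains(path, segment):
    """True if segment occurs as a contiguous run inside path."""
    n = len(segment)
    return any(path[i:i + n] == segment for i in range(len(path) - n + 1))


def shared_segment(paths):
    """Longest contiguous run of domains traversed by every path."""
    first, best = paths[0], []
    for start in range(len(first)):
        for end in range(start + 1, len(first) + 1):
            candidate = first[start:end]
            if len(candidate) > len(best) and all(
                contains(p, candidate) for p in paths
            ):
                best = candidate
    return best


# Domain-level paths assumed from the text around Figure 1.
f1 = ["Da", "Dd", "De", "Df"]
f2 = ["Db", "Dd", "De", "Df", "Dg"]
f3 = ["Dc", "Dd", "De", "Df", "Dg"]

all_three = shared_segment([f1, f2, f3])   # ["Dd", "De", "Df"]
f2_and_f3 = shared_segment([f2, f3])       # ["Dd", "De", "Df", "Dg"]
```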
In summary, there are many more choices to determine an aggregator-
deaggregator pair for a set of flows. Moreover, it is important to
consider who initiates the establishment of an aggregate. In RFC
3175 [RFC3175] it is the deaggregator that initiates the reservation,
which corresponds nicely to the receiver-initiated reservation scheme
of RSVP.
For QoS NSLP both ends are basically able to initiate an aggregate
reservation. The more natural choice would be that the aggregator
initiates establishment of an aggregate reservation. In this case,
it is required that the aggregator has knowledge about potential
deaggregators. This information may be collected during
establishment of the reservation for a single flow and reported back
to the initiator. See Section 4 for a possible solution.
3.2. Signaling between Aggregator and Deaggregator
Signaling messages related to flows that are aggregated in an
encompassing aggregate should be forwarded directly from aggregator
to deaggregator and vice versa. This is necessary, because in-
between nodes know only the aggregate flow and do not have any
information about individual flows that are contained in this
aggregate. Moreover, the aggregate should not only save states, but
it should also allow for saving signaling message processing.
The intra-domain aggregation defines that NTLP uses a router alert
option to signal directly from aggregator to deaggregator, i.e., all
NSIS QNEs in-between do not interpret the signaling message.
However, this simple and effective scheme does not work for inter-
domain aggregation, because the space of possible RAO values (16 bit)
is much too small to cover the huge set of potential unique
aggregator-deaggregator pairs, which would be required for different
provider domains.
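A rough calculation shows how quickly the RAO value space is
exhausted. The sketch assumes, as a simplification, that every ordered
aggregator-deaggregator pair would need its own RAO value.

```python
RAO_VALUES = 2 ** 16  # 16-bit RAO value space


def ordered_pairs(routers):
    """Number of (aggregator, deaggregator) pairs among the routers."""
    return routers * (routers - 1)


# Smallest number of candidate routers whose pairs no longer fit.
smallest = next(n for n in range(2, 1000) if ordered_pairs(n) > RAO_VALUES)
```

Under this assumption, 257 candidate routers across all domains are
already enough to exceed the 65536 available values.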
Moreover, when considering GIST, a further problem occurs if the
aggregator periodically has to send a Query message for every flow in
order to detect any route changes for this flow. However, this
should not cancel the aggregation gain, i.e., nodes between
aggregator and deaggregator should ideally not process these messages
or store state about these flows. Thus, on the one hand these query
messages should detect any change in the path between aggregator and
deaggregator, on the other hand, nodes in-between should preferably
not process these per-flow signaling messages.
3.3. Route Change Detection for Aggregated Flows in an Aggregate
It may occur that the route of an aggregated flow changes during its
lifetime. If the routing change does not affect the part of the data
path that is also covered by the aggregate, it is not a problem,
because it will be managed by the usual GIST/QoS NSLP mechanisms. If
the route change also affects the encompassing aggregate in the same
way as the aggregated flows, it would be covered by trying to reserve
resources for the re-routed aggregate.
However, the flow may actually leave the aggregate's path and either
return to it before or after the deaggregator (cf. flows f1 and f2
respectively in Figure 2). An alternative, as mentioned in
[RFC3175], would be to "tunnel" the data packets between aggregator
and deaggregator. However, due to the burden for routers caused by
the overhead of tunneling data packets as well as MTU related
problems, we do not consider such solutions in this draft.
Therefore, a mechanism must be defined to detect any route changes
affecting aggregated flows.
  +-----+
 /  Dst  \
|    |    |
|De  |&&&&&&&&&&&
 \   |   /       &&
  +--|--+          && f2
     |               &
  +--D--+          +-&---+
 /   #   \        /  &    \
|Dd  #    |      | &&    Dg|
|    #%%%%|      |&&       |
 \   #   /%%    &&\       /
  +--#--+   %%&&   +-----+
     #      &&%%
  +--#--+ &&    %% +-----+
 /   #  &&        %       \
|Dc  #&&& |      | %%%   Df|
|    #    |      |    %    |
 \   #   /        \   %   /
  +--#--+          +--%--+
     #                %
  +--#--+             %
 /   #   \     f1     %
|Db  #%%%%|%%%%%%%%%%%%
|    #    |
 \   #   /
  +--A--+                       Dx: Domain x
     |                          #: Aggregate
  +--|--+                       A: Aggregator
 /   |   \                      D: Deaggregator
|Da  |    |                     %: f1, returning to aggregate route
|   Src   |                        before deaggregator
 \       /                      &: f2, returning to aggregate route
  +-----+                          behind deaggregator
Possible route changes of aggregated flows
Figure 2
3.4. A Priori Determination of a Flow's Path
In order to utilize an already established aggregate reservation, an
aggregator must know if a new incoming reservation can be integrated
into an already established aggregate. This requires that the
aggregator is able to determine the path that the flow will take a
priori. In case the flow runs along the same path as an already
established aggregate and the aggregate has enough unused capacity,
the aggregator may include the request into the aggregate and forward
it directly to the deaggregator. However, predicting a flow's path
is difficult in the inter-domain case: usually only an AS path (i.e.,
a sequence of AS numbers) for a given destination prefix can be
determined by using BGP routing table information. Thus, an
aggregator usually does not know the exact ingress or egress border
routers for a given AS. Especially multi-homing techniques between
ASes make it difficult to predict an exact path, e.g., flows whose AS
paths differ only in their destination AS may enter the same
penultimate AS through different ingress routers. Furthermore, some
mechanism must be provided in order to verify the prediction, i.e.,
to revert if the prediction was wrong.
An alternative to prediction would be to probe the actual path first,
preferably without installing any state. This would, however,
increase the reservation setup time, because a round-trip signaling
message exchange would be required before one could determine whether
there is an existing aggregate that would match the flow.
4. Solution Proposals
This section sketches some proposals for solving the previously
described problems. This is preliminary work and some details still
need to be worked out in forthcoming versions of this draft.
4.1. Determination of Aggregator and Deaggregator
Determination of aggregator and deaggregator could be accomplished by
using a QoS NSLP mechanism to record the route for individual
reservations. To this end, each QNE that is able and willing (i.e., if
local policy allows it) to serve as a deaggregator may simply append
its IP address to a new protocol object ("Route-Record") that holds a
list of such addresses. This protocol object would be carried in a
RESERVE message. Usually, it is sufficient to record only two
addresses per domain, i.e., ingress and egress QNE. The average AS
path length is usually well below 4 ASes, so the total number of
recorded addresses would still be small. It may be useful to record
also the AS number in addition to the QNE addresses. Moreover, the
ideal object to record would be the peer identity, but due to its
non-unique and potentially lengthy format it is probably harder to
process efficiently than an IP address. The completed
"Route-Record" object would be reported back to potential aggregators in the
RESPONSE message. So if aggregation is going to be used, requesting
a RESPONSE message by inserting an RII object at the aggregator would
be required. If an RII object is already present, the response must
be checked by the aggregator.
An aggregator may then store the list of traversed QNEs together with
the per-flow session data and pick a deaggregator according to its
own criteria later. Usually, the aggregator should choose a
deaggregator that is far away in order to achieve long and thus
efficient aggregates.
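The recording and selection steps can be sketched as follows in
Python. The function names are hypothetical, the addresses are
documentation addresses, and the selection rule (farthest QNE common
to all flows) is just one possible criterion; the sketch assumes that
every record lists QNEs in downstream order starting right after the
aggregator.

```python
def append_route_record(record, qne_address, willing):
    """Return the Route-Record after this QNE has handled the RESERVE."""
    return record + [qne_address] if willing else record


def pick_deaggregator(records):
    """Pick the farthest QNE recorded identically by every flow.

    Assumes each record starts right after the aggregator, so the
    shared path is the common prefix of all records.
    """
    common = []
    for hops in zip(*records):
        if len(set(hops)) != 1:
            break
        common.append(hops[0])
    return common[-1] if common else None


record = append_route_record([], "192.0.2.1", willing=True)
record = append_route_record(record, "192.0.2.5", willing=False)

flow_a = ["192.0.2.1", "192.0.2.9", "198.51.100.1", "198.51.100.9"]
flow_b = ["192.0.2.1", "192.0.2.9", "198.51.100.1", "203.0.113.5"]
deaggregator = pick_deaggregator([flow_a, flow_b])
```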
Once the aggregator has selected a QNE to serve as deaggregator, it
sends a new RESERVE towards the deaggregator. The RESERVE message
would contain the total capacity of all individual reservations and a
list of the session IDs for all flows that should be aggregated.
This would require the use of a SESSION_ID_LIST object that is
contained in the aggregation messages RESERVE and RESPONSE. The list
in the RESPONSE message will contain all sessions that could not be
aggregated, e.g., in case of aggregation conflicts, i.e., when some
flows were already aggregated in a way that they cannot be aggregated
as intended by the new aggregation request. Furthermore, a flag
(AGGREGATION bit) in the RESERVE or RESPONSE message could indicate
that it is a special type of RESERVE and RESPONSE message containing
the additional SESSION_ID_LIST object. Using a flag instead of a new
message type may have some implementation advantages, because most of
the code is completely identical to a normal RESERVE processing.
However, it is also possible to define a new message type for
aggregation establishment (e.g., ARESERVE and ARESPONSE).
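The content of the two aggregation messages can be illustrated with a
small Python sketch. The dictionary layout and the function names are
hypothetical; the draft does not yet define an object format.

```python
def build_aggregate_reserve(reservations):
    """reservations maps session ID -> reserved rate of a single flow."""
    return {
        "type": "RESERVE",
        "aggregation": True,                     # AGGREGATION bit
        "capacity": sum(reservations.values()),  # total capacity
        "session_id_list": sorted(reservations),
    }


def build_aggregate_response(reserve, conflicting):
    """List the sessions that could not be aggregated."""
    refused = [s for s in reserve["session_id_list"] if s in conflicting]
    return {"type": "RESPONSE", "aggregation": True,
            "session_id_list": refused}


reserve = build_aggregate_reserve({"s2": 3, "s1": 2, "s3": 5})
response = build_aggregate_response(reserve, conflicting={"s2"})
```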
4.2. Signaling Between Aggregate Endpoints
The objective is to forward per flow signaling messages (e.g.,
refreshing RESERVEs of a flow's session) between aggregator and
deaggregator directly, so that no intermediate QNE has to process
these messages. The QNI sends a per flow message (e.g., a refreshing
RESERVE) that arrives at the aggregator. The aggregator detects that
this flow is part of a larger aggregate reservation and performs
"aggregate signaling", i.e., it sends the message along a special
direct messaging association (MA) that must be established between
aggregator and deaggregator for the aggregate. A possible mechanism
to establish a corresponding messaging association is described
below. At QoS NSLP level the aggregator should also insert the
BOUND_SESSION_ID object containing the session ID of the aggregate's
session. When the signaling message arrives at the deaggregator, it
notices that this message arrived via the direct message association
between the aggregate endpoints and removes the inserted
BOUND_SESSION_ID. Then normal message processing at QoS NSLP level
continues.
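The handling of BOUND_SESSION_ID at both aggregate endpoints can be
sketched as follows in Python. Messages are modeled as plain
dictionaries and the function names are hypothetical; the sketch also
ignores the case in which a message already carries a
BOUND_SESSION_ID of its own.

```python
def aggregator_send(message, aggregate_of):
    """Decide how the aggregator forwards a per-flow message.

    aggregate_of maps a flow session ID to its aggregate's session ID.
    """
    aggregate_sid = aggregate_of.get(message["session_id"])
    if aggregate_sid is None:
        return "hop-by-hop", message           # flow is not aggregated
    bound = dict(message, bound_session_id=aggregate_sid)
    return "direct-ma", bound                  # bypass intermediate QNEs


def deaggregator_receive(arrived_via, message):
    """Strip the inserted object before normal QoS NSLP processing."""
    if arrived_via == "direct-ma":
        message = {k: v for k, v in message.items()
                   if k != "bound_session_id"}
    return message


refresh = {"type": "RESERVE", "session_id": "flow-1"}
via, sent = aggregator_send(refresh, {"flow-1": "aggregate-A"})
delivered = deaggregator_receive(via, sent)
```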
In order to establish a direct signaling message transport between
aggregator and deaggregator, the GIST Query message must be conveyed
to the deaggregator. This could be done in several ways, e.g., at
IP layer or at GIST layer as described in the next section in more
detail. However, the aggregator will create a new GIST session that
solely serves the purpose of directly transferring signaling messages
between aggregator and deaggregator. The Query will be sent as
Q-mode encapsulated message with the single flow as destination
address and a special MRM (further details for this special query
encapsulation are described in the next section).
Once the Query arrives at the deaggregator it will send a UDP
encapsulated Response directly back to the aggregator and a messaging
association should be created. Using an SCTP connection for a
messaging association would be a good choice so that messages for
different flows can be mapped to different streams. Signaling
messages for individual flows that arrive at the aggregator are
mapped to the GIST session for aggregate signaling, i.e., they are
directly sent to the deaggregator. A problem is, however, that the
session ID for the individual flow must be conveyed by additional
means, because GIST must use its present session ID for aggregate
encapsulation.
4.3. Route Change Detection for Aggregated Flows in an Aggregate
As mentioned above, it should be possible to establish a messaging
association directly between aggregator and deaggregator, e.g., using
the GIST bypass mechanism, so that intermediate QNEs drop out of the
signaling path. However, if all messages for aggregated
single reservations were passed over this direct messaging association
unconditionally, reservation path and data path would probably
diverge once the route of a flow changes and veers off the
aggregate path.
Before describing the details of the method, we summarize the
sequence of the overall aggregate operations:
1. Deaggregator is discovered.
2. Aggregator establishes an aggregate reservation.
3. Aggregator initiates a direct signaling messaging association.
The messaging associations for all aggregated flows at both sides
are updated or installed, so that signaling messages for the
aggregated flows use the direct signaling MA.
4. Aggregator periodically performs route change checks for
aggregated flows.
5. Additional flows may be added to the aggregate later (leaving
flows are straightforward).
A direct forwarding of signaling messages between aggregator and
deaggregator would be a problem if a flow within the aggregate
changes its route within the aggregate, leaving the aggregate's path
(maybe even re-joining it later on, cf. flow f2 in Figure 2). In
case a flow diverges from the aggregate route it must establish a new
reservation along the new part of the path from the branching QNE. A
further problem occurs due to the fact that the QNEs within the
aggregate don't know anything about this particular flow session.
Furthermore, the single flow should be removed from the aggregate
reservation, otherwise new requests have to be rejected
unnecessarily.
Therefore, regular checks for path divergence between the flow's
path and the aggregate's path are required. With RSVP aggregation or
the intra-domain aggregation for QoS NSLP this is automatically done
by using the router alert option, i.e. per flow signaling messages
will be routed along their natural path, possibly swerving from the
aggregate's path. The interior nodes still don't have to process the
signaling messages. Some boundary node will intercept the message in
its role as potential deaggregator and possibly trigger creation of a
new aggregate or initiate integration of the single flow into an
existing aggregate. However, as explained earlier, for inter-domain
aggregation, the RAO-based approach is not usable due to the limited
RAO value space with respect to the potential number of aggregator/
deaggregator peers, so another approach must be developed.
Route detection is the task of the NTLP layer. In GIST, a periodic
Query per routing entry is triggered in order to discover new routes
or route changes respectively. The GIST Query is sent with Q-mode
encapsulation. This also works with intra-domain aggregation by
setting the corresponding RAO. For inter-domain aggregation we want
to use a similar mechanism.
A GIST node cannot easily detect if the flow's path diverges from the
aggregate's route. A GIST node could detect that the IP next hop of
the flow and the IP next hop of the aggregate flow differ. But this
need not necessarily result in a change of the next GIST peer.
Therefore, a check for route divergence in a GIST node is not a
reliable indication that the flow actually left the aggregate route.
Thus, a better indication would be that a signaling message arrives
at a GIST node where the aggregate reservation is yet unknown (e.g.,
the session ID for the aggregate is unknown). In this case the node
should not establish a new GIST signaling session for the aggregate,
but send back an error indication to the aggregator instead. This
would at least detect the case when a flow leaves the aggregate and
hits a different GIST node. Not covered by this detection is the
case when a flow diverges from the route and rejoins the aggregate
route later thereby leaving out some GIST nodes on the aggregate
route. To detect such situations a GIST node must either determine
whether the previous peer who sent the message is still the same peer
as in the aggregate, or, use the GIST Hop Count as indicator that a
QNE was skipped.
Therefore, we define an "aggregate query encapsulation" mode that
detects any divergent routes for a flow and its encompassing
aggregate. This query aggregate encapsulation must not use the
direct messaging association between the end points, because it would
not detect any route changes then. It can, however, also be used to
initially establish such a direct signaling relationship.
4.3.1. IP Layer Solution
The most efficient solution would be a bypass at IP level, like the
special Router Alert Option for aggregate signaling. But, in this
inter-domain case, a simple Router Alert Option codepoint is not
enough to cover the huge set of different deaggregators in different
domains.
One solution would be a new IP option that carries an additional
destination address (a new hop-by-hop option
in IPv6, which we call Route Verify). On receipt of such a packet
the router simply performs one additional routing lookup for the
conveyed destination address and must compare the next hop for the
normal destination address and the additional destination address.
In our case, the outer IP destination address would be the one from
the deaggregator and the additional destination address the same as
the destination from the flow's path-coupled MRI. In case that the
next hop entry is the same for both destination addresses, the router
simply forwards the message, keeping this additional option. If the
next hops differ, however, flow and aggregate diverge at this router
and the originator of the signaling message should be notified of
this fact. This would require sending a special ICMP message back to
the aggregator indicating the route divergence. This ICMP message
must be correlated to the GIST signaling message at the aggregator
again. However, as mentioned above, route divergence between GIST
nodes may not be relevant. Furthermore, because only the IP
destination address is used for a routing decision and not the MRI,
routes may be different from MRI-based routing decisions.
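The per-router check implied by the Route Verify option can be
sketched in Python as two longest-prefix lookups followed by a
comparison. The routing table, the router names, and the returned
action strings are hypothetical; the addresses are taken from the
IPv6 documentation prefix.

```python
import ipaddress


def next_hop(table, address):
    """Longest-prefix match over a list of (network, next_hop) entries."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


def route_verify(table, outer_dst, extra_dst):
    """Compare next hops for the deaggregator and the flow destination."""
    if next_hop(table, outer_dst) == next_hop(table, extra_dst):
        return "forward"          # keep the option, pass the packet on
    return "notify-aggregator"    # would trigger the special ICMP message


table = [
    (ipaddress.ip_network("2001:db8::/32"), "router-1"),
    (ipaddress.ip_network("2001:db8:a::/48"), "router-2"),
]
deaggregator = "2001:db8:1::1"
same_route = route_verify(table, deaggregator, "2001:db8:2::9")
diverging = route_verify(table, deaggregator, "2001:db8:a::9")
```

The sketch also shows the limitation noted above: the decision is
based purely on IP next hops, so a divergence between two GIST nodes
would be reported even if the next GIST peer is unchanged.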
4.3.2. GIST Layer Solution
The GIST-only method would be as follows: where the aggregator would
otherwise send a Q-mode encapsulated message for a single reservation,
a GIST Query message with a new MRM (aggregate forwarding - AF) is sent
instead. We call this mode Aggregate Q-mode (AQ-mode). The
Aggregate Q-mode encapsulation is as follows: the query is sent with
Q-mode encapsulation, i.e., it has the RAO set, uses the flow
destination address as IP destination address and uses the
aggregator's address as source address (S-Bit is set in the GIST
header). The session-ID is the one of the single flow, but it
contains the new aggregate AF-MRM instead of the normal path-coupled
MRM. The AF-MRM contains a type field, the session ID of the
aggregate, and the path-coupled MRI of the single flow. In summary
its structure is like this:
MRI = type
      session-id (aggregate flow)
      PC-MRI (single flow)
The type field distinguishes between different AQ-modes, in which
the interpretation of the contained session IDs and MRIs differs. It
contains one of the following values:
o "Route Check": Perform a route check for the specified flow only.
o "Establish Direct": Establish a new direct signaling session for
the given flow.
o "Add Flow": If the message reaches the endpoint, it should be
passed to the NSLP.
The session ID carried within the aggregate forwarding MRI (AF-MRI)
is the one of the established aggregate (which we call session ID A
for now), the PC-MRI is a fully encapsulated path-coupled MRI object
(i.e., including the common object header).
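As a data structure, the AF-MRI could be modeled as in the following
Python sketch. The class and field names are hypothetical, the type
values are represented by their names because no code points are
defined yet, and the PC-MRI is treated as an opaque byte string.

```python
from dataclasses import dataclass
from enum import Enum


class AqType(Enum):
    ROUTE_CHECK = "Route Check"
    ESTABLISH_DIRECT = "Establish Direct"
    ADD_FLOW = "Add Flow"


@dataclass(frozen=True)
class AfMri:
    type: AqType
    aggregate_session_id: bytes  # session ID A of the aggregate
    pc_mri: bytes                # encapsulated PC-MRI of the single flow


def pass_to_nslp(mri, at_endpoint):
    """Only "Add Flow" messages reaching the endpoint go up to the NSLP."""
    return at_endpoint and mri.type is AqType.ADD_FLOW


check = AfMri(AqType.ROUTE_CHECK, bytes(16), b"opaque-pc-mri")
add = AfMri(AqType.ADD_FLOW, bytes(16), b"opaque-pc-mri")
```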
First we describe how the route change check is performed and then
how a direct signaling messaging association can be established by
using the AQ-mode encapsulation. When the aggregator sends such a
signaling message in direction towards the deaggregator, the next
GIST node (supporting QoS NSLP) will intercept the message due to the
RAO and detect the AF-MRM. It then checks whether session ID A is
known. If session ID A is known and the node is not the endpoint
(deaggregator) for this session, it forwards the message further
downstream according to the flow's PC-MRI. The message is forwarded
basically unchanged; only the IP hop count and the GIST Hop Count are
decremented. If the node is the endpoint for the session, message
forwarding is terminated and the message is processed by GIST, i.e.,
GIST must refresh the routing state for the single flow. In contrast
to a normal Query, the R flag in the GIST header need not be set, so a
Response may be suppressed: the primary objective is to check for
diverging routes, which are indicated by Error messages. If the R flag
is set, however, a Response message must be sent back directly to the
aggregator via the existing direct messaging association.
If a GIST node receives such an AQ-mode encapsulated message but does
not have any installed state for session ID A, it MUST send back an
error message (yet to be defined but indicating that the signaling
message left the aggregate's path) directly to the IP source of the
query message, which is the aggregator. The aggregator should then
indicate a route change to the QoS NSLP and should remove the single
flow reservation from the aggregate and initiate a normal single flow
reservation along the further path (optimizations like shifting the
aggregate reserved resources for this flow along the unchanged
aggregate path are for further study).
The GIST Hop Count must be checked on reception at the endpoint,
because it indicates whether a GIST-aware node was skipped due to
rerouting. Its value must be compared with the one that was received
and stored during establishment of the direct signaling connection.
This requires that the latter is set up immediately after the
establishment of the reservation aggregate as described earlier.
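The per-node decision for a "Route Check" Query described above can be
summarized in a short sketch. All names are hypothetical, session IDs
are shown as plain strings, and the reaction to a GIST Hop Count
mismatch at the endpoint (reported here as a separate outcome) is an
assumption, since the text only requires the comparison.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Action(Enum):
    FORWARD = "forward"              # bypass: send on along the flow's PC-MRI
    REFRESH = "refresh"              # endpoint: refresh the flow's routing state
    ERROR = "error"                  # aggregate unknown: report to the aggregator
    ROUTE_CHANGED = "route_changed"  # endpoint: hop count mismatch (assumed outcome)


@dataclass
class AggregateState:
    is_deaggregator: bool
    # GIST Hop Count stored when the direct MA was established;
    # only meaningful at the deaggregator.
    stored_hop_count: Optional[int] = None


def handle_aq_query(state: Dict[str, AggregateState],
                    aggregate_sid: str,
                    gist_hop_count: int) -> Action:
    entry = state.get(aggregate_sid)
    if entry is None:
        # No installed state for session ID A: the message left the
        # aggregate's path.
        return Action.ERROR
    if not entry.is_deaggregator:
        return Action.FORWARD
    if (entry.stored_hop_count is not None
            and gist_hop_count != entry.stored_hop_count):
        # A GIST-aware node may have been skipped due to rerouting.
        return Action.ROUTE_CHANGED
    return Action.REFRESH
```

An intermediate node would hold an entry with is_deaggregator set to
False for session ID A, whereas the deaggregator would hold the hop
count recorded during setup of the direct signaling connection.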
Setting up a direct signaling messaging association is possible by
using the same AQ-mode encapsulation for the initial Query, with
the AF-MRI type set to "Establish Direct". The session ID carried by
GIST is that of the direct signaling MA that is to be established.
The PC-MRI in the AF-MRI contains a description of the MA signaling
flow between the two aggregate endpoints. For this AF-MRM type the R
flag must be set (R=1) so the endpoint (deaggregator) must send a
Response message directly back to the aggregator to continue the
initial GIST handshake. Therefore, the S flag must be set, too.
Details for the last type of this AQ-mode (Add Flow) are described in
Section 4.4.
4.3.3. NSLP Layer Solution
In principle it is also possible to use the NSLP layer for the
functions described in the previous GIST related section, e.g.,
signaling messages are received by the NSLP layer and bypassed if the
node is not the deaggregator. The Bound-Session-ID object could be
used to refer to the aggregate session and a new flag in the NSLP
common header could be used to indicate that this message should be
forwarded unless it has arrived at the deaggregator. If this
Bound-Session-ID is unknown, the signaling message has left the
aggregate and an error message should indicate this fact.
The disadvantage of this method is that it incurs more processing
overhead than bypassing via GIST as described above.
4.4. A Priori Determination of a Flow's Path
This is the most difficult task of inter-domain aggregation. We
propose to solve it by using (BGP) routing tables, the GIST AQ-mode,
and new QoS NSLP mechanisms.
A QNE may detect that the flow's destination address of a new
incoming reservation request is in the same prefix or AS as already
aggregated reservations. In this case, it may try to integrate this
reservation into the same already existing aggregate (either having
some capacity left or having to increase the capacity of the
aggregate reservation first). Moreover, the QNE may try to use an
existing aggregate if the flow traverses the same AS path as the
aggregate, but as stated in Section 3, this prediction may be
inaccurate. Thus, an optimistic approach would try to use the
existing aggregate first, but perform path verification in nearly
the same way as for route change detection. Therefore, we use a
variant of the AQ-mode for that purpose, too.
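The prefix-based prediction can be illustrated with a longest-prefix
lookup over the prefixes of existing aggregates. This is a sketch
under simplifying assumptions: the mapping from destination prefix to
aggregate session ID is hypothetical, and a real QNE would derive it
from its (BGP) routing table, possibly also matching on the AS path.

```python
import ipaddress
from typing import Dict, Optional


def candidate_aggregate(dst: str, aggregates: Dict[str, str]) -> Optional[str]:
    """Return the session ID of the aggregate whose prefix covers dst.

    aggregates maps a destination prefix (e.g. "192.0.2.0/24") to the
    session ID of an existing aggregate. The longest match wins.
    """
    addr = ipaddress.ip_address(dst)
    best_len = -1
    best_sid = None
    for prefix, sid in aggregates.items():
        net = ipaddress.ip_network(prefix)
        if addr.version != net.version:
            continue  # never compare IPv4 with IPv6
        if addr in net and net.prefixlen > best_len:
            best_len = net.prefixlen
            best_sid = sid
    return best_sid
```

A match only yields a candidate; because the prediction may be wrong,
the path still has to be verified before the flow is added.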
However, because the Query in the AQ-mode cannot carry larger
payloads it is not well suited to carry any larger QoS NSLP RESERVE
message. Thus, we use the following mechanism: The RESERVE for the
flow that is to be established is sent directly to the (predicted)
deaggregator over the direct signaling messaging association. The
deaggregator must not forward it, however, before it is certain that
the flow follows the path of the aggregate. Therefore, the RESERVE
has to wait until a path verification message sent via AQ-mode
arrives at the deaggregator. The path verification message would be
a new type of QoS NSLP message, simply carrying a unique message ID
(e.g., a 128-bit value) that is chosen by the aggregator. This
message ID must also be contained in the RESERVE message to allow for
a successful matching of these mutually dependent signaling messages.
This MSG_ID/
BOUND_MSG_ID mechanism was introduced in version 14 of the QoS NSLP
specification draft [I-D.ietf-nsis-qos-nslp].
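On the aggregator side, the two mutually dependent messages could be
created as in the following sketch. The dictionary layout and the
function name are purely illustrative and not a wire format; the only
property taken from the text is that both messages carry the same
128-bit message ID chosen by the aggregator.

```python
import secrets
from typing import Dict, Tuple


def build_add_flow_pair(flow_sid: str, aggregate_sid: str) -> Tuple[Dict, Dict]:
    # 128-bit message ID chosen by the aggregator.
    msg_id = secrets.token_bytes(16)
    reserve = {
        "via": "direct MA",
        "type": "RESERVE",
        "sid": flow_sid,
        "bound_msg_id": msg_id,  # waits for the message with this ID
    }
    verification = {
        "via": "AQ-mode",
        "af_type": "Add Flow",
        "sid": flow_sid,
        "aggregate_sid": aggregate_sid,
        "msg_id": msg_id,        # satisfies the waiting condition
    }
    return reserve, verification
```

Both messages can be sent back to back; the deaggregator matches them
by message ID regardless of the order in which they arrive.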
If the AQ-mode Query message arrives before the RESERVE message, the
deaggregator will note that the message ID was received and can
immediately forward the RESERVE, because the "Waiting Condition" is
already satisfied. However, waiting messages will time out after a
while, because the path prediction may have been wrong and the flow
diverged from the predicted path. Additionally, one could design an
explicit cancellation mechanism, so that the aggregator could
explicitly cancel waiting messages if it has been notified of a
diverging route.
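The waiting condition at the deaggregator can be sketched as follows.
The class and method names are hypothetical, and the RESERVE is
represented by an opaque string. The sketch handles both arrival
orders, the timeout, and the optional explicit cancellation mentioned
above; the timeout value itself is left to the caller because the
text does not specify one.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class WaitingRoom:
    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout
        self.clock = clock
        # msg_id -> (held RESERVE, deadline)
        self.waiting: Dict[bytes, Tuple[str, float]] = {}
        # msg_id of verification messages that arrived first -> deadline
        self.verified: Dict[bytes, float] = {}

    def _expire(self) -> None:
        now = self.clock()
        self.waiting = {k: v for k, v in self.waiting.items() if v[1] > now}
        self.verified = {k: d for k, d in self.verified.items() if d > now}

    def on_reserve(self, bound_msg_id: bytes, reserve: str) -> Optional[str]:
        """Return the RESERVE if it may be forwarded now, else hold it."""
        self._expire()
        if bound_msg_id in self.verified:
            del self.verified[bound_msg_id]
            return reserve
        self.waiting[bound_msg_id] = (reserve, self.clock() + self.timeout)
        return None

    def on_path_verification(self, msg_id: bytes) -> Optional[str]:
        """Return the held RESERVE released by this message, if any."""
        self._expire()
        if msg_id in self.waiting:
            return self.waiting.pop(msg_id)[0]
        self.verified[msg_id] = self.clock() + self.timeout
        return None

    def cancel(self, msg_id: bytes) -> bool:
        """Explicit cancellation by the aggregator (optional mechanism)."""
        return self.waiting.pop(msg_id, None) is not None
```

The clock is injectable so that the timeout behavior can be exercised
without actually waiting.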
The proposed method has the advantage of saving more than one round
trip time compared to a mechanism where the path is probed first.
Furthermore, it is advantageous when nested aggregates have to be
increased in their capacity. Details of such a concept and using
waiting conditions for messages in a signaling protocol are described
in [DARIS].
4.5. Example
We now describe an example of how the signaling messages will be
exchanged. First, we assume that the deaggregator can be determined
by feedback from the proposed Route Record object in all RESPONSE
messages for the individual flows. An aggregate capacity will be
determined and all flows sharing a part of the same route will be
aggregated in a corresponding reservation between aggregator and
deaggregator (RA and RD, respectively, in Figure 4).
RA       R1       R2       R3       R4       R5       RD
|        |        |        |        |        |        |
|--- (1) RESERVE(RA->RD, SID=A, SESSION_ID_LIST) ---->| Establish
|                                                     | an Aggregate
|<-- (2) RESPONSE(RA->RD, SID=A, SESSION_ID_LIST) ----| Reservation
|        |        |        |        |        |        |
|- (3) Query(SID=S,AF-MRI(EstDirect,SID=A,RA->RD)) -->| Establish
|        |        |        |        |        |        | a direct
|<========== (4) Response(SID=S,AF-MRI) ==============| signaling
|        |        |        |        |        |        | association
|<====================== New MA =====================>| between
|                                                     | RA and RD
----> hop-by-hop signaling (intercepted/inspected by R1-R5)
====> direct signaling (no interpretation by R1-R5)
Example for establishing an aggregate
Figure 4
Message (1) is a RESERVE for the aggregate reservation from RA to RD
that is sent as usual, i.e., using a path-coupled MRM via Q-mode
encapsulation. There are only two additions. First, a SESSION_ID_LIST
object lists all the flows that are included in the aggregate. Second,
a MSG_ID is included so that message number (3) will wait if it arrives
earlier than message (4). Each intermediate node (R1-R5) performs
admission control for the requested aggregate bandwidth while taking
into account that the resources for the listed single flows are part
of the requested capacity. Intermediate nodes R1-R5 will delete
information about the single flows that are listed in the
SESSION_ID_LIST object if message (2) indicates a successful
reservation.
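The admission control step at an intermediate node can be illustrated
as follows. This is a sketch under simplifying assumptions: a single
capacity figure per node, reservations kept in a dictionary keyed by
session ID, and hypothetical function names. The point shown is that
bandwidth already held by the listed single flows counts towards the
requested aggregate capacity rather than being added on top of it.

```python
from typing import Dict, List


def admit_aggregate(capacity: int,
                    reservations: Dict[str, int],
                    aggregate_bw: int,
                    session_id_list: List[str]) -> bool:
    """Check whether the aggregate fits, crediting the listed flows."""
    in_use = sum(reservations.values())
    # Resources of listed flows will move into the aggregate.
    already_held = sum(reservations.get(sid, 0) for sid in session_id_list)
    return in_use - already_held + aggregate_bw <= capacity


def commit_aggregate(reservations: Dict[str, int],
                     aggregate_sid: str,
                     aggregate_bw: int,
                     session_id_list: List[str]) -> None:
    """Called once the RESPONSE indicates a successful reservation."""
    for sid in session_id_list:
        reservations.pop(sid, None)  # delete per-flow state
    reservations[aggregate_sid] = aggregate_bw
```

For example, with a capacity of 100 units and three existing
reservations of 30 units each, an aggregate of 70 units that lists two
of these flows is admitted, whereas the same request without the list
would be rejected.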
Message (3) is requesting the setup of a direct signaling messaging
association between RA and RD. It will be an independent signaling
session (SID=S) but can be bound to the aggregate session (SID=A).
This session must be handled in a special manner by the RMF along
with all the reservation aggregation mechanisms anyway: any
individual flow that is now contained in the aggregate A should use
this MA instead. The PC-MRI in the AF-MRI is now the same as for the
signaling connection. A QoS NSLP Query message can be used to
initially set up this direct MA. It should, however, carry some kind of
NULL QSPEC. This message must be sent in AQ-mode and every
intermediate node (R1-R5) will bypass the message if it is neither
the destination node of the PC-MRI nor the deaggregator of the
session-ID in the AF-MRI (session id is A in this case). The
deaggregator will set up an MA with RA as direct GIST peer (previous
hop). Subsequent messages for the aggregated flows could now be sent
explicitly routed by using the SII handle from the session S.
RA       R1       R2       R3       R4       R5       RD
|        |        |        |        |        |        |
|============= (1) RESERVE(S->D, SID=f) =============>| per-flow
|                                                     | message
|<=========== (2) RESPONSE(S->D, SID=f) ==============| (not refresh)
|        |        |        |        |        |        |
|-- (3) Query(SID=f,AF-MRI(RouteCheck,SID=A,S->D)) -->| GIST refresh
|        |        |        |        |        |        |
|        |        |        |        |        |        |
----> hop-by-hop signaling (intercepted/inspected by R1-R5)
====> direct signaling (no interpretation by R1-R5)
Example for single flow end-to-end message and GIST route refresh
Figure 5
Figure 5 shows that end-to-end per-flow messages (messages (1) and
(2), e.g., a TEAR or a reservation update) will now be sent via the
direct signaling connection (using the explicit routing feature by
specifying an appropriate SII handle). If GIST needs to refresh the
route of an aggregated flow, it will send a Query in AQ-mode (message
(3)). This Query will be intercepted by each intermediate node
(R1-R5) and the AF-MRI is checked as described in Section 4.3.2: if
session ID A from the AF-MRI is known, the message will be bypassed
and forwarded in the direction of the PC-MRI (S->D) of the single flow
that is contained in the AF-MRI. If the node is the endpoint of the
aggregate, i.e., it is the deaggregator RD for A, it will stop
forwarding and refresh the flow's routing state instead. In case the
route of the single flow diverges from the aggregate's path, an error
will be returned to the aggregator RA indicating that session A is
unknown at this node.
RA       R1       R2       R3       R4       R5       RD
|        |        |        |        |        |        |
|==== (1) RESERVE(S->D, SID=g, BOUND_MSG_ID(x)) =====>| per-flow
|        |        |        |        |        |        | message
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|-- (2) Query(SID=g,AF-MRI(AddFlow,SID=A,S->D),       |
|             QUERY,MSG_ID(x)) ---------------------->| path
|        |        |        |        |        |        | verification
|        |        |        |        |        |        |
----> hop-by-hop signaling (intercepted/inspected by R1-R5)
====> direct signaling (no interpretation by R1-R5)
Example for a new flow that is included into the already existing
aggregate
Figure 6
Figure 6 shows a new RESERVE message for a single flow g that should
also be aggregated into A. Thus, it is sent directly to RD via the
direct MA. However, in order to verify the path prediction, a second
message, e.g., a QUERY, will be sent in AQ-mode using an AF-MRI of
type AddFlow. Every intermediate node will bypass this message until
it reaches the deaggregator of A. Message (2) is the triggering
message for message (1), whose processing is suspended by the waiting
condition on message (2). If message (2) arrives at RD, the path is
verified and the prediction was correct. Any divergence from the
path will result in an error that is sent back to the
aggregator.
5. Security Considerations
Basically, the security considerations of GIST and QoS NSLP apply.
Inter-domain aggregation, however, may raise new issues due to
different trust relationships between domains, so not every provider
may be willing to accept aggregate reservations. On the other hand,
with the proposed mechanisms for deaggregator discovery, a provider
can easily avoid acting as deaggregator by not writing its own
addresses into the Route-Record object. Thus, the particular policy of
a provider can be easily realized. Furthermore, domains that share or
carry a lot of end-to-end reservations would likely cooperate with
each other.
The newly proposed waiting condition for messages cannot be used for
DoS attacks that try to exhaust state memory, because every
deaggregator will accept such messages only within an aggregate
context. Usually, a trust relationship between aggregator and
deaggregator exists and they may also use a secure direct signaling
messaging association (which is recommended). Thus, messages from
the aggregator could be authenticated. Attackers are not able to
send such messages blindly, because the deaggregator would drop them
due to their unauthorized origin and a non-matching session ID.
These are preliminary considerations, so they probably do not cover
all possible aspects of the proposed solutions. There will be more
details in the next versions of this draft.
6. References
6.1. Normative References
[I-D.ietf-nsis-ntlp]
Schulzrinne, H. and R. Hancock, "GIST: General Internet
Signalling Transport", draft-ietf-nsis-ntlp-13 (work in
progress), April 2007.
[I-D.ietf-nsis-qos-nslp]
Manner, J., "NSLP for Quality-of-Service Signaling",
draft-ietf-nsis-qos-nslp-14 (work in progress), June 2007.
6.2. Informative References
[DARIS] Bless, R., "Dynamic Aggregation of Reservations for
Internet Services", Telecommunications Systems Volume 26,
Issue 1, pp. 33-52, Kluwer,
http://tm.uka.de/doc/2003/ictsm-daris-journal-crc-web.pdf,
May 2004.
[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
and W. Weiss, "An Architecture for Differentiated
Services", RFC 2475, December 1998.
[RFC3175] Baker, F., Iturralde, C., Le Faucheur, F., and B. Davie,
"Aggregation of RSVP for IPv4 and IPv6 Reservations",
RFC 3175, September 2001.
Authors' Addresses
Roland Bless
Institute of Telematics, Universitaet Karlsruhe (TH)
Zirkel 2
Karlsruhe 76187
Germany
Phone: +49 721 608 6413
Email: bless@tm.uka.de
URI: http://www.tm.uka.de/~bless
Mark Doll
Institute of Telematics, Universitaet Karlsruhe (TH)
Zirkel 2
Karlsruhe 76187
Germany
Phone: +49 721 608 6403
Email: doll@tm.uka.de
URI: http://www.tm.uka.de/~doll
Full Copyright Statement
Copyright (C) The IETF Trust (2007).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).