Internet Draft
Expires February 2005                               Peter Willis et al
                                                                    BT
                                                        September 2004


                 Service Provider requirements for PWs
                 draft-willis-pwe3-requirements-00.txt

Status of this Memo


   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   or will be disclosed, and any of which I become aware will be
   disclosed, in accordance with RFC 3668.
   
   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.
   
   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.
   
   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."
   
   The list of current Internet-Drafts can be accessed at:
   http://www.ietf.org/ietf/1id-abstracts.txt.
   
   The list of Internet-Draft Shadow Directories can be accessed at:
   http://www.ietf.org/shadow.html.


Abstract

   This internet draft provides some requirements to help steer future
   PWE3 work from the perspective of a Service Provider.

   Willis et al           Expires February 2005               [Page 1]

1.      Introduction
   
This document, although not exhaustive, captures some of the more
important requirements to be met when considering further work on
Pseudo-Wires (PW), such as PW stitching.  This draft is NOT aimed at
modifying existing PWE3 encapsulations but is aimed at providing
guidance to future PWE3 work e.g. PW stitching.
   
Note on terminology used:

This document uses the more generic term "client" to refer to the PWE3
payload and the more generic term "server" to refer to the PSN. It is
commonly understood (by software developers and network architects)
that client/server relationships can be recursive so the terms
"client" and "server" are used in this document to avoid the need to
enumerate all client/server stacks fully, as would be the case if we
used the terms "payload" and "PSN". It should be noted that
client/server recursion is a fundamental requirement for Service
Providers (SP) and not just an architectural possibility. For example,
SP A may buy connectivity from SP B and also sell connectivity to SP C
(in this example A, B and C have client/server relationships and are
NOT peers). SP C may sell connectivity to an Enterprise. Even in this
simple example we have 4 recursions of a client/server relationship
and 4 respective layer networks.
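
The recursion in this example can be sketched in a few lines of
Python.  This is only an illustration: the LayerNetwork class and its
field names are hypothetical and are not defined by this document or
by any PWE3 specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LayerNetwork:
    owner: str
    # The server layer network that carries this one (None for the
    # lowest layer in the example).
    server: Optional["LayerNetwork"] = None

    def stack(self) -> List[str]:
        """List owners from this layer down through its servers."""
        owners: List[str] = []
        layer: Optional[LayerNetwork] = self
        while layer is not None:
            owners.append(layer.owner)
            layer = layer.server
        return owners


# The example from the text: B serves A, A serves C, C serves an
# Enterprise.  None of these are peers.
sp_b = LayerNetwork("SP B")
sp_a = LayerNetwork("SP A", server=sp_b)
sp_c = LayerNetwork("SP C", server=sp_a)
enterprise = LayerNetwork("Enterprise", server=sp_c)

print(enterprise.stack())
```

Because each layer refers only to its own server, the stack never has
to be enumerated in full by any one party.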

   
1.1     PWE3 Payload/PSN Independence Relationships

PWs create a client/server relationship between 2 layer networks.
There are two types of client/server relationships that must be
considered:

Case 1 - where the client and server layer networks are owned by the
same Service Provider (SP).

Case 2 - where the client layer network is owned by a customer, who
may be a SP (SP A), and the server layer network is owned by a
(different) SP (SP B).

For case 2, the functional components (such as routing, signalling,
OAM, management etc.) of the client layer network must be completely
independent of the functional components of the server layer network.
Although it is possible to design solutions where the server layer
network's functional components interact with the client layer
network's functional components, this approach leads to the following
undesirable consequences:

1.      The service may break if the client changes any of their
        functional components.
2.      The SP has to track developments in the clients' technology
        and implement upgrades in their network accordingly.
3.      Under fault conditions it becomes difficult to establish if
        the fault is in the client or server network.
   
By requiring that the client and server layers are able to be run
independently of one another, it naturally follows that the server
layer should transparently transfer the client layer.  For example
consider the case where the client is an ATM network.  The client may
implement a proprietary feature (e.g. AAL, non-PNNI routing and
signalling, OAM) which if not carried transparently would break the
service.

This is not only a technical requirement but it also has commercial
implications because a SP is likely to consider the details of their
network to be commercially sensitive and will therefore wish to hide
those details from any client layer networks.  For example it would
not be desirable for the server layer network to peer with the routing
& signalling of the client in the example above.

Where the client and server layer networks are owned and operated by
the same SP (case 1 above) it may be possible to relax the degree of
independence between client and server layers.  However, the server
layer should still be able to transparently transfer the client layer
network's data-plane.  Non-transparent transfer optimisations may save
a tiny amount of bandwidth and possibly improve routing/signalling
protocol scalability but they ultimately increase operational
complexity, e.g. each variant of client compression behaves
differently and requires case-by-case consideration.
   
   
   
1.2     Basic OAM requirements
   
The transport defects of a layer network are determined by its mode.
There are 3 basic modes:

1. Connectionless Packet Switching (CL-PS); examples are IP and
   Ethernet.
2. Connection Oriented Packet Switching (CO-PS); examples are Frame
   Relay, ATM and MPLS (RSVP-TE).
3. Connection Oriented Circuit Switching (CO-CS); examples are
   SDH/SONET and optical wavelength switching networks.

It should be noted that multi-point to point (MP2P) LDP creates a type
of MPLS that is a special case of the CO-PS mode. The MP2P LDP mode is
widely implemented today and its requirements are also addressed in
this document.

 The transport defects can be summarized as follows:

   
Connectionless Packet Switching (cl-ps)
---------------------------------------
(i)   Breaks


Connection Oriented Circuit Switching (co-cs)
---------------------------------------------
(i)   Breaks
(ii)  Swaps, but only between exactly alike entities (e.g. an SDH VC4
can swap with another SDH VC4 but not with an SDH VC12.
Alternatively a wavelength with colour A can swap with another
wavelength of colour A but not with a wavelength of colour B.)
This is because the link-connection identifiers are constrained to
take on a real/physical appearance in time, frequency or space.


Connection Oriented Packet Switching (co-ps)
--------------------------------------------
(i)   Breaks
(ii)  Swaps between any entities
(iii) Mismerge.  A mismerge is where traffic from one connection (e.g.
LSP, ATM VC) leaks into another connection.
Mismerges have the following subcases (where A1 and B1 are sources,
and A2 and B2 are sinks)
- A1->A2 mismerging into B1->B2, with A1->A2 traffic seemingly
unaffected
- A1->A2 mismerging into B1->B2, with A1->A2 traffic broken
- A1->A2 self-mismerging back into A1->A2 (a routing loop is one
example of how a self-mismerge can occur)

MP2P LDP MPLS
-------------
(i)   Breaks. This is a complete failure of the whole MP2P tree.
(ii)  Swaps between any entities.
(iii) Mismerges, with the following subcases (where A1 and B1 are
sources, and A2 and B2 are sinks)
- A1->A2 mismerging into B1->B2, with A1->A2 traffic seemingly
unaffected
- A1->A2 mismerging into B1->B2, with A1->A2 traffic broken
- A1->A2 self-mismerging back into A1->A2 (a routing loop is one
  example of how a self-mismerge can occur)
(iv)  Partial Breaks. It is possible for only partial failure of the
MP2P tree topology i.e. only some branches break so only some
ingresses are disconnected from the egress.
(v)   Partial swaps. It is possible that only some of the branches of
the MP2P tree experience swaps. The result would be that the traffic
egressing would be a mix of correct & incorrect traffic.
   
Many Service Providers operate ECMP on their MP2P LDP MPLS networks.
If ECMP is used then further types of partial breaks & swaps are
introduced, where only some of the flows on the MP2P tree experience
breaks or swaps. In this case no ingress would be totally
disconnected from the egress but some of the flows from some of the
ingresses could be disconnected or swapped to an incorrect output.
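
The defect lists above can be captured as a simple lookup table.  The
following Python sketch is illustrative only: the names Defect,
DEFECTS_BY_MODE and must_detect are hypothetical, and the per-flow
ECMP variants are folded into the partial break and partial swap
entries.

```python
from enum import Enum


class Defect(Enum):
    BREAK = "break"
    SWAP_ALIKE = "swap between exactly alike entities"
    SWAP_ANY = "swap between any entities"
    MISMERGE = "mismerge"
    PARTIAL_BREAK = "partial break"
    PARTIAL_SWAP = "partial swap"


# Transport defects per mode, as listed in this section.
DEFECTS_BY_MODE = {
    "cl-ps": {Defect.BREAK},
    "co-cs": {Defect.BREAK, Defect.SWAP_ALIKE},
    "co-ps": {Defect.BREAK, Defect.SWAP_ANY, Defect.MISMERGE},
    "mp2p-ldp": {
        Defect.BREAK,
        Defect.SWAP_ANY,
        Defect.MISMERGE,
        Defect.PARTIAL_BREAK,
        Defect.PARTIAL_SWAP,
    },
}


def must_detect(mode: str, defect: Defect) -> bool:
    """True if OAM for the given mode has to detect this defect."""
    return defect in DEFECTS_BY_MODE[mode]


print(must_detect("co-ps", Defect.MISMERGE))
print(must_detect("co-cs", Defect.MISMERGE))
```

A table of this kind makes it easy to check that an OAM design for a
given mode covers every defect that the mode can exhibit.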

Suitable OAM must be provided in the traffic data-plane of each mode
and the special case of MP2P LDP to automatically detect the above.
Defect detection is required to be unidirectional in the co-ps and
co-cs modes, i.e. errors must be detected in each direction
independently so that it is possible to unambiguously resolve which
direction the defect is operating in. It must also be possible to
unambiguously resolve whether the fault is in the control plane or
the traffic data plane. For example if the control plane fails in
direction B to A and forwarding A to B continues (because service
providers require data forwarding even when the control plane fails)
then the fault must be correctly identified as a control plane fault
without false alarms from the traffic data plane OAM.
   
We must specify appropriate entry/exit criteria and consequent actions
for each defect. The entry/exit criteria define when the network
service is "up" or "down". It is essential to accurately define
network availability for Service Level Agreements based on persistent
defects (10s is the normal default here), especially for performance
SLAs as performance measures taken whilst the network is "unavailable"
should be disregarded (See the "Requirements for SLA verification"
section). An example of a "consequent action" would be to suppress the
traffic on a connection (LSP, ATM VC) if it is swapped. A consequent
action for a break would be to raise an appropriate alarm and
possibly to initiate a reroute. Generation of forward and backward
defect indicators (FDI and BDI) would also be a consequent action.
BDI should ideally be supported (this can be in-band or out-of-band)
to allow defect/availability monitoring of both directions from a
single end.

Further work is required to define all the consequential actions for
all the possible defects.
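
As an illustration of entry/exit criteria based on defect persistency,
the following Python sketch tracks the "up"/"down" state from
one-second defect observations.  It makes two assumptions that this
document does not state: the exit criterion mirrors the entry
criterion (10 consecutive defect-free seconds), and the state changes
at the end of the persistency period rather than retrospectively.  The
class and method names are hypothetical.

```python
class AvailabilityTracker:
    """Tracks service availability from per-second defect observations."""

    def __init__(self, persistency_s: int = 10):
        self.persistency_s = persistency_s
        self.available = True
        # Consecutive seconds that contradict the current state.
        self._run = 0

    def observe_second(self, defect_present: bool) -> bool:
        """Record one second of observation; return the resulting state."""
        if self.available:
            contradicts = defect_present
        else:
            contradicts = not defect_present
        self._run = self._run + 1 if contradicts else 0
        if self._run >= self.persistency_s:
            # Entry or exit criterion met: flip the state.
            self.available = not self.available
            self._run = 0
        return self.available


tracker = AvailabilityTracker()
# A 5 second defect is too short to make the service unavailable.
short_burst = [tracker.observe_second(True) for _ in range(5)]
tracker.observe_second(False)
# A defect persisting for 10 seconds takes the service down.
long_burst = [tracker.observe_second(True) for _ in range(10)]
print(short_burst[-1], long_burst[-1])
```

Performance measurements would only be collected while
observe_second() reports the available state.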
   
For the co-ps and co-cs modes the OAM must be independent of the
manner in which the data-plane path is instantiated, i.e. whether by
signalling (any protocol) or provisioning.  If OAM is not independent
of the PW instantiation method then not only is operational complexity
increased (N types of OAM messages, N MIBs, N fault finding tools) but
there is no guarantee that the different OAM methods are compatible
(e.g. a LDP provisioned PW might be mismerged with a statically
provisioned PW and the fault may not be detected).

Further, the OAM activation/deactivation must be harmonized with the
set-up/tear-down of the path.  Failure to harmonize OAM
activation/deactivation with PW set-up/tear-down will lead to either:
- lack of OAM protection when the PW is set-up, or false alarms
  when the PW is torn-down;  or
- OAM being activated prior to PW set-up and significant problems due
  to operator error.
   
   
   
1.3     Client/server OAM requirements
   
Defects must be detected/handled at the path (co-cs and co-ps) or flow
(cl-ps) termination point of a layer network.  Failure to do this will
lead to ambiguous fault indications which significantly increase
operational complexity and the time taken to resolve a fault (e.g.
when a fault happens we should avoid passing trouble tickets between
SPs to locate the fault - defect detection at the correct termination
point of a layer network will aid this).

To prevent alarm storms in any co-cs or co-ps client layer networks a
FDI (Forward Defect Indication) signal should be passed to the client
layer networks.  This must use the appropriate FDI syntax of the OAM
used by the particular client layer technology affected.
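
A minimal sketch of this behaviour follows.  It assumes a server layer
that has detected a defect at its termination point and knows the mode
of each client layer it carries; the ClientLayer class, the
on_server_defect function and the action strings are hypothetical.
The handling of cl-ps clients is not specified by the text above and
is marked as such.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClientLayer:
    name: str
    mode: str  # "co-cs", "co-ps" or "cl-ps"


def on_server_defect(server_name: str,
                     clients: List[ClientLayer]) -> Dict[str, str]:
    """Return the action taken for the server and each client layer."""
    # The defect is alarmed once, in the layer where it was detected.
    actions = {server_name: "raise alarm"}
    for client in clients:
        if client.mode in ("co-cs", "co-ps"):
            # FDI lets the client suppress its own alarm.
            actions[client.name] = "send FDI in client OAM syntax"
        else:
            actions[client.name] = "no FDI (not covered by the text)"
    return actions


clients = [ClientLayer("ATM", "co-ps"), ClientLayer("Ethernet", "cl-ps")]
print(on_server_defect("MPLS PSN", clients))
```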
   
   
1.4     Client/server adaptation requirements
   
Service Providers who deploy MPLS networks wish to obtain maximum
benefit from their MPLS network. If the PWE3 functions assume IP and
MPLS are the same then the SP using MPLS gains less benefit from their
MPLS network than is possible. By recognizing that MPLS and IP PSNs
are different, it should be possible to optimize the PWE3 functions
for MPLS, which may give benefits. This section discusses the
general case by considering some of the 9 possible client/server
combinations between the 3 network modes of co-cs, co-ps and cl-ps.
   
The adaptation between a given client and server layer network should
be a function of the nature of both the client mode and the server
mode, and it is not the same in all cases.

To fully address each one of the possible 9 client/server modal
combinations would be an onerous task (noting that we would also
need to consider each particular technology as there are some
differences at this level too).  However, some examples of the issues
that need considering for client/server adaptation are given in the
rest of this section.

When there is a 'many to one' relationship between the client and the
server, and the server is either co-ps or co-cs, then the adaptation
function must include a muxing capability.  This capability is not
required if the server is cl-ps (for any client mode) since the cl-ps
layer network does muxing as a consequence of its intrinsic nature.
I.e. each packet here carries a network-unique DA/SA pair and client
layer identification is carried out via the 'protocol' or 'next
protocol' field.
   
In the case of client/server relationships between the cl-ps and co-ps
modes (either way round) the adaptation function may require a
fragmentation/reassembly capability if the server layer packet MTU is
less than the client packet size.

A fragmentation/reassembly function is clearly not required when the
server layer is co-cs for any client layer mode.  However, a co-cs
server layer does require the server layer payload bit rate to exceed
the average client layer bit rate, but this is a general traffic
requirement anyway for a co-cs server layer.  Further, if the client
is cl-ps or co-ps, it also implies a need for rate-decoupling (i.e.
idle-fill) and client traffic unit delineation.
   
As a final example, when the server layer is cl-ps then this may
create misordering of the client (any mode).  Misordering should
never occur in a co-ps or co-cs server layer, of course, for any
client mode.

So as we can see, the client/server adaptation cannot be the same for
all cases and each pair of modes that form a client/server
relationship must be considered in their own right.
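
The examples in this section can be summarized as a function of the
client mode and the server mode.  The Python sketch below encodes only
the cases discussed above and is not a complete treatment of all 9
combinations.  The function name, parameter names and the labels in
the returned set are hypothetical; in particular "misordering
handling" is an inference from the statement that a cl-ps server may
misorder the client.

```python
from typing import Optional, Set


def adaptation_functions(client: str, server: str,
                         many_to_one: bool = False,
                         server_mtu: Optional[int] = None,
                         client_packet_size: Optional[int] = None
                         ) -> Set[str]:
    """Adaptation functions implied by the examples in this section."""
    needed: Set[str] = set()
    # Muxing: a cl-ps server muxes intrinsically, so only co-* servers
    # need it in the adaptation function.
    if many_to_one and server in ("co-ps", "co-cs"):
        needed.add("muxing")
    # Fragmentation: only discussed for cl-ps/co-ps pairs, either way
    # round, when the server MTU is smaller than the client packet.
    if ({client, server} == {"cl-ps", "co-ps"}
            and server_mtu is not None
            and client_packet_size is not None
            and server_mtu < client_packet_size):
        needed.add("fragmentation/reassembly")
    # A co-cs server carrying packets needs idle-fill and delineation.
    if server == "co-cs" and client in ("cl-ps", "co-ps"):
        needed.add("rate decoupling (idle fill)")
        needed.add("client traffic unit delineation")
    # A cl-ps server may misorder the client, whatever its mode.
    if server == "cl-ps":
        needed.add("misordering handling")
    return needed


print(sorted(adaptation_functions("co-ps", "co-cs", many_to_one=True)))
```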

In PWE3 terminology this means that the PWE3 processing functions
need not be the same for IP and MPLS PSNs. We need to weigh the
benefits of reusing the same PWE3 functions for IP and MPLS against
those of optimizing the PWE3 functions for the particular PSN that is
deployed.
   
1.5     Requirement for OOB management/control
   
The sensitive internal control/management-plane protocols must be made
secure from attack, and the network must remain stable under
situations of extreme stress, i.e. serious failures.

It is therefore a requirement that such protocols should be separated,
in terms of performance, addressing and availability (fate sharing),
from the customer traffic where this is possible.

It is noted that running the control/management-plane protocols OOB
(Out-Of-Band) in relation to customer traffic is an excellent way to
satisfy this requirement.  OOB may be logical or physical separation.
It is possible that current state of the art methods of control plane
separation e.g. separate queues for control plane protocols operating
in an address space separated from the customer traffic, might satisfy
the security requirement but this will be dependent upon
implementation detail.
   
A key implication of the use of an OOB control plane is that the
control messages are no longer a reliable means of determining the
integrity of the user's data, as the control plane traffic and the
traffic data plane traffic are subject to different failure
mechanisms.
For example, it cannot be assumed that a failure of the control plane
will mean that there is also a failure in the traffic data plane and
vice versa. It is highly desirable, and normal operational practice in
current connection oriented networks, that any failure in the control
plane does not force a failure on the traffic data plane if the
traffic data plane is otherwise working correctly. If an OOB control
plane is to have reliable information on the state of the traffic data
plane, then the traffic data plane must have an in-band OAM flow in
order to verify the state of the traffic data plane and pass this
state information to the OOB control plane.
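
The independence of the two failure domains can be illustrated with a
small Python sketch.  It assumes two separately derived inputs, the
health of the OOB control plane and the result of the in-band OAM
flow; the function name and the returned strings are hypothetical.

```python
def locate_fault(control_plane_ok: bool, in_band_oam_ok: bool) -> str:
    """Resolve where a fault lies from two independent observations."""
    if control_plane_ok and in_band_oam_ok:
        return "no fault"
    if in_band_oam_ok:
        # Control plane failure must not force a data plane failure.
        return "control plane fault, data plane still forwarding"
    if control_plane_ok:
        # Only the in-band OAM flow can reveal this case.
        return "data plane fault, control plane unaware"
    return "control plane and data plane faults"


print(locate_fault(control_plane_ok=False, in_band_oam_ok=True))
```

Without the in-band OAM input, the second and third cases cannot be
told apart from the state of the control plane alone.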
   
   
1.6     Requirements for SLA verification
   
SLAs are an increasingly important issue for a SP.

SLAs exist between a SP and a customer (where the customer may be
another SP).  SLAs between different SPs can be either in a
client/server sense (i.e. SP A leases capacity from SP B) or in a
peer-partition sense across an E-NNI between two SP domains in the
same layer network.

 Note - It is a networking truth that a link-connection (i.e. 'hop')
 in a client layer network is provided by an end-end path/flow in a
 server layer network.

 The performance of a client layer network is therefore determined by
 its own performance and that inherited from its server layer.
 This behaviour is recursive down to the duct and is something that
 should be taken into account when considering SLAs.

 Although there are 9 possible client/server combinations between the
 3 network modes of co-cs, co-ps and cl-ps, some may make more sense
 than others from performance inheritance and SLA considerations.

 This also raises questions regarding end-end performance allocations
 and their fair apportionment to SP domains.  Although this topic is
 not covered here, it is mentioned because it has significant
 commercial importance.  The network architecture therefore requires
 careful consideration regarding its ability to allow such a
 specification and measurement.

 
A service provider can simplify its SLA monitoring by monitoring at
only the layer network(s) that offer service(s) to customers rather
than all its server layers. However SLA measuring is not the same as
OAM for defect detection/handling. OAM is a generic requirement for
all layer networks and is an enabler to SLA monitoring.

SLAs have 2 distinct parts:

-    There is an availability part that defines the amount of time
     (usually as a percentage) that the service must be in the up-
     state, and

-    there is a Network Performance part that defines the transfer
     metrics/objectives that must be met whilst the service is in the
     up-state.
 
There is a logical ordering of processing required to ensure that SLAs
are implemented in the correct and most efficient manner.

Firstly, we must respect the allowed connectivity constraints for the
mode considered.  This is vital so that we can correctly identify the
defect types that are relevant.  Failure to identify all the ways a
network can break will lead to situations where the Operations
instrumentation will say a service is working whilst the customers are
arguing the service is broken.
 
Secondly, we must define suitable OAM techniques for the data-plane of
the mode considered.  This must define appropriate defect entry/exit
criteria and consequent actions.

Thirdly, based on defect persistency (usually 10s) we must define the
unavailable state entry/exit criteria and consequent actions.

Fourthly, once we have defined/specified the defects and
unavailability we now have a temporal basis against which we can
define/measure the up-state Network Performance SLA.

Note also that for many services/applications, availability must be a
bi-directional parameter.  That is, even if only one direction is
broken the total service (i.e. both directions) is considered broken.
This has implications for the starting/stopping of the collection of
Network Performance measurements for the up-state SLA (noting that
during the up-state Network Performance is a unidirectional
measurement).
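
The two SLA parts and the bi-directional availability rule can be
sketched as follows.  The Python below assumes per-second samples that
already carry the availability state of each direction, and uses mean
delay in one direction as a stand-in for whatever Network Performance
metric an SLA actually specifies.  The names Sample and sla_report are
hypothetical.

```python
from typing import List, Optional, Tuple

# (a_to_b_available, b_to_a_available, a_to_b_delay_ms)
Sample = Tuple[bool, bool, float]


def sla_report(samples: List[Sample]) -> Tuple[float, Optional[float]]:
    """Return (availability in percent, mean up-state A-to-B delay)."""
    if not samples:
        raise ValueError("at least one sample is required")
    # The service is up only when both directions are available;
    # measurements taken while it is down are disregarded.
    up_delays = [delay for fwd, rev, delay in samples if fwd and rev]
    availability_pct = 100.0 * len(up_delays) / len(samples)
    mean_delay = sum(up_delays) / len(up_delays) if up_delays else None
    return availability_pct, mean_delay


samples = ([(True, True, 10.0)] * 90
           + [(True, False, 12.0)] * 5
           + [(False, False, 500.0)] * 5)
print(sla_report(samples))
```

In this example the 5 seconds in which only the B to A direction is
broken count as unavailable time, and the A to B delay measured during
them does not contribute to the up-state figure.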
   
   
   
2.      Security Considerations
   
This document raises no security issues per se.  However, it does make
reference to using techniques to ensure that both the traffic data-
plane and the internal/sensitive control/management-plane protocols
have security measures in place.  For example, two key requirements
identified in this document are:

-       A trail termination source identifier should be used in the
      OAM of co-ps and co-cs trails to detect instances of misconnectivity;

-       A physical or logical OOB control/management-plane network
      should be used.
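
As an illustration of the first requirement, the following Python
sketch shows how a trail sink might use a source identifier carried in
OAM packets to distinguish the defects described in the "Basic OAM
requirements" section.  It assumes the sink knows the identifier it
expects and collects the identifiers seen during one observation
interval; the function name, the identifier format and the returned
strings are hypothetical.

```python
from typing import Iterable


def classify_trail(expected_ttsi: str,
                   received_ttsis: Iterable[str]) -> str:
    """Classify a trail from the source identifiers seen at the sink."""
    seen = set(received_ttsis)
    if not seen:
        # No OAM packets arrived at all in the interval.
        return "break"
    if seen == {expected_ttsi}:
        return "ok"
    if expected_ttsi in seen:
        # Expected traffic plus traffic leaking in from another trail.
        return "mismerge"
    # Only traffic from an unexpected source is arriving.
    return "swap"


print(classify_trail("B1", ["B1", "A1", "B1"]))
```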

   
   
3.      Acknowledgements
   
   
Many BT folks have contributed in one way or another to this document
including:  Neil Harrison, Peter Willis, Alan McGuire, Richard
Spencer, Ben Niven-Jenkins, Andy Reid, Dave Milham, Tony Flavin,
Adrian Smith.




4.      Author's contact details

   
   Peter Willis
   BT Group CTO
   peter.j.willis@bt.com
   
   
5.      Full Copyright Statement
   
   Copyright (C) The Internet Society (2004).  This document is
   subject to the rights, licenses and restrictions contained in BCP
   78, and except as set forth therein, the authors retain all their
   rights.
   
   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain
   it or assist in its implementation may be prepared, copied,
   published and distributed, in whole or in part, without restriction
   of any kind, provided that the above copyright notice and this
   paragraph are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for rights in submissions defined in the IETF Standards
   Process must be followed, or as required to translate it into
   languages other than English.
   
   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.
   
   This document and the information contained herein is provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.