One document matched: draft-bryant-filsfils-fat-pw-00.txt
PWE3 S. Bryant
Internet-Draft C. Filsfils
Intended status: Standards Track Cisco Systems
Expires: May 11, 2008 J. Kuechemann
Deutsche Telekom
November 8, 2007
Load Balancing Fat MPLS Pseudowires
draft-bryant-filsfils-fat-pw-00
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 11, 2008.
Copyright Notice
Copyright (C) The IETF Trust (2007).
Abstract
Where the payload carried over a pseudowire carries a number of
identifiable flows it can in some circumstances be desirable to carry
those flows over the equal cost multiple paths that exist in the
packet switched network. This draft describes two methods of
achieving that, the one by including an additional label in the label
stack, the other by using a block of alternative pseudowire labels.
Bryant, et al. Expires May 11, 2008 [Page 1]
Internet-Draft LB-fat-pw November 2007
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119 [RFC2119].
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Load Balance Label . . . . . . . . . . . . . . . . . . . . 3
1.2. Pseudowire Labels Block . . . . . . . . . . . . . . . . . 4
2. Native Service Processing Function . . . . . . . . . . . . . . 4
3. Pseudowire Forwarder . . . . . . . . . . . . . . . . . . . . . 4
3.1. Encapsulation when using LB Label . . . . . . . . . . . . 5
4. Load Balance Signaling . . . . . . . . . . . . . . . . . . . . 6
4.1. Signaling Load Balance Label . . . . . . . . . . . . . . . 7
4.1.1. Structure of Load Balance Label TLV . . . . . . . . . 7
4.2. Signalling Label Block . . . . . . . . . . . . . . . . . . 7
4.2.1. Structure of Multiple VC TLV . . . . . . . . . . . . . 8
5. OAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
6. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 9
7. Comparision of the Approaches . . . . . . . . . . . . . . . . 10
8. Security Considerations . . . . . . . . . . . . . . . . . . . 10
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10
10. Congestion Considerations . . . . . . . . . . . . . . . . . . 11
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 11
12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11
12.1. Normative References . . . . . . . . . . . . . . . . . . . 11
12.2. Informative References . . . . . . . . . . . . . . . . . . 12
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12
Intellectual Property and Copyright Statements . . . . . . . . . . 14
Bryant, et al. Expires May 11, 2008 [Page 2]
Internet-Draft LB-fat-pw November 2007
1. Introduction
A pseudowire is defined as a mechanism that carries the essential
elements of an emulated service from one provider edge (PE) to one or
more other PEs over a packet switched network (PSN) [RFC3985].
A pseudowire is normally transported over one single network path,
even if multiple ECMP paths exit between the ingress and egress
PEs[RFC4385] [RFC4928]. This is required to preserve the
characteristics of the emulated service (e.g. avoid misordering for
example for SAToP pseudowire's [RFC4553]). Except in the extreme
case described in Section 6, the new capability proposed in this
draft does not change this default property of pseudowires.
Some pseudowire's (for example Ethernet pseudowires) transport IP
packets between two router locations (creating a virtual direct
Ethernet link between these two routers). Such pseudowire's may
carry from hundred's of Mbps to Gbps of traffic. Such pseudowire's
do not require ordering to be preserved between packets of the
pseudowire. They only require ordering to be preserved within the
context of each individual transported IP flow. Operators have
requested the ability to explicitly configure such a pseudowire to
leverage the availability of multiple ECMP paths. This allows for
better capacity planning as the statistical multiplexing of a larger
number of smaller flows is more efficient than with a smaller set of
larger flows.
This specification describes two methods of load balancing the
pseudowire
o Use of a load balance label
o Allocation of multiple pseudowire labels
The load balance label mechanism is the more general and more
powerful method, and as such is the prefered approach. The
pseudowire lable block is an OPTINAL method that may be negotiated by
PEs unable to support the additional label needed by the load balance
label method.
1.1. Load Balance Label
In this approach an additional label is interposed between the
pseudowire label and the control word, or if the control word is not
present, between the pseudowire label and the pseudowire payload.
This additional label is called the pseudowire load balancing label
(LB label). Indivisible flows within the pseudowire MUST be mapped
to the same pseudowire LB label by the ingress PE. The pseudowire
Bryant, et al. Expires May 11, 2008 [Page 3]
Internet-Draft LB-fat-pw November 2007
load balancing label stimulates the correct ECMP load balancing
behaviour in the PSN. On receipt of the pseudowire packet at the
egress PE (which knows this additional label is present) the label is
discarded without processing.
Note that there is no protocol constraint on the value of a LB label.
This is the prefered method of acheiving load balancing.
1.2. Pseudowire Labels Block
In this approach a contiguous block of pseudowire labels are
allocated to each pseudowire by the egress PE. Flows carried by the
pseudowire labels are spread over the set of pseudowire labels, such
that Indivisible flows within the pseudowire MUST are mapped to the
same pseudowire label by the ingress PE. The use of a multiplicity
of alternate pseudowire labels stimulates the correct ECMP load
balancing behaviour in the PSN.
Note that the pseudowire labels MUST be allocated as a contiguous
block.
Support for this method is OPTIONAL.
2. Native Service Processing Function
The Native Service Processing (NSP) function is a component of a PE
that has knowledge of the structure of the emulated service and is
able to take action on the service outside the scope of the
pseudowire. In this case it is required that the NSP in the ingress
PE identify flows, or groups of flows within the service, and
indicate the flow (group) identity of each packet as it is passed to
the pseudowire forwarder. Since this is an NSP function, by
definition, the method used to identify a flow is outside the scope
of the pseudowire design. Similarly, since the NSP is internal to
the PE, the method of flow indication to the pseudowire forwarder is
outside the scope of this document
3. Pseudowire Forwarder
The pseudowire forwarder must be provided with a method of mapping
flows to load balanced paths. Where the method chosen is the label
block method, the forwarder uses the flow information provided by the
NSP to allocate a flow to one of the VC labels in the load balancing
label block. In all other respects forwarder operation is identical
to the normal single VC label case.
Bryant, et al. Expires May 11, 2008 [Page 4]
Internet-Draft LB-fat-pw November 2007
When the load balance label method is used the forwarder must
generate a label for the flow or group of flows. How the load
balance label values are determined is outside the scope of this
document, however the load balance label allocated to a flow SHOULD
remain constant. It is recommended that the method chosen to
generate the load balancing labels introduces a high degree of
entropy in their values, to maximise the entropy presented to the
ECMP path selection mechanism in the LSRs in the PSN, and hence
distribute the flows as evenly as possible over the available PSN
ECMP paths. The forwarder at the ingress PE prepends the pseudowire
control word (if applicable), then prepends either the pseudowire
load balancing label, followed by the pseudowire label.
Alternatively it prepends the pseudowire control word (if
applicable), then selects and appends one of the allocated pseudowire
labels.
The forwarder at the egress PE uses the pseudowire label to identify
the pseudowire. If the label block approach is used operation is
identical to the current non-load balanced case. Alternatively, from
the pseudowire context, the egress PE can determine whether a
pseudowire load balancing label is present, and if one is present,
the label is discarded.
All other pseudowire forwarding operations are unmodified by the
inclusion of the pseudowire load balancing label.
3.1. Encapsulation when using LB Label
The PWE3 Protocol Stack Reference Model modified to include
pseudowire LB label is shown in Figure 1 below
Bryant, et al. Expires May 11, 2008 [Page 5]
Internet-Draft LB-fat-pw November 2007
+-------------+ +-------------+
| Emulated | | Emulated |
| Ethernet | | Ethernet |
| (including | Emulated Service | (including |
| VLAN) |<==============================>| VLAN) |
| Services | | Services |
+-------------+ +-------------+
| Load balance| | Load balance|
+-------------+ Pseudowire +-------------+
|Demultiplexer|<==============================>|Demultiplexer|
+-------------+ +-------------+
| PSN | PSN Tunnel | PSN |
| MPLS |<==============================>| MPLS |
+-------------+ +-------------+
| Physical | | Physical |
+-----+-------+ +-----+-------+
Figure 1: PWE3 Protocol Stack Reference Model
The encapsulation of a pseudowire with a pseudowire LB label is shown
in Figure 2 below
+-------------------------------+
| MPLS Tunnel label(s) | n*4 octets (four octets per label)
+-------------------------------+
| PW label | 4 octets
+-------------------------------+
| Load Balance label | 4 octets
+-------------------------------+
| Optional Control Word | 4 octets
+-------------------------------+
| Payload |
| |
| | n octets
| |
+-------------------------------+
Figure 2: Encapsulation of a pseudowire with a pseudowire load
balancing label
4. Load Balance Signaling
This section describes the signalling procedures when [RFC4447] is
used. It is expected that only one of the two load balance
mechanisms will chosen, in which case the signalling mechanism
Bryant, et al. Expires May 11, 2008 [Page 6]
Internet-Draft LB-fat-pw November 2007
corresponding to the rejected method will not be progressed.
4.1. Signaling Load Balance Label
When using the signalling procedures in [RFC4447], there is a
Pseudowire Interface Parameter Sub-TLV type used to signal the desire
to include the load balance label when advertising a VC label.
The presence of this parameter indicates that the egress PE requests
that the ingress PE place a load balance label between the pseudowire
label and the control word (or is the control word is not present
between the pseudowire label and the pseudowire payload).
If the ingress PE recognises load balance label indicator parameter
but does not wish to include the load balance label, it need only
issue its own label mapping message for the opposite direction
without including the load balance label Indicator. This will
prevent inclusion of the load balance label in either direction.
If PWE3 signalling [RFC4447] is not in use for a pseudowire, then
whether the load balance label is used MUST be identically
provisioned in both PEs at the pseudowire endpoints. If there is no
provisioning support for this option, the default behaviour is not to
include the load balance label.
4.1.1. Structure of Load Balance Label TLV
The structure of the load balance label TLV is shown in Figure 3.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| LBL | Length | must be zero |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: Multiple VC TLV
Where:
o LBL is the load balance label TLV identifier assigned by IANA.
o Length is the length of the TLV in octets and is 4.
4.2. Signalling Label Block
When using the signalling procedures in [RFC4447], there is a
Pseudowire Interface load balance sub-TLV type used to signal the
desire load balance the pseudowire over a block of pseudowire VC
Bryant, et al. Expires May 11, 2008 [Page 7]
Internet-Draft LB-fat-pw November 2007
labels.
The presence of this TLV indicates that the egress PE requests that
the ingress PE distribute the ingress flows present in the pseudowire
over the block of VC labels sent in this TLV.
This mechanism is fully backwards compatible. If the ingress PE does
not recognise the load balance TLV or does not wish to use it, it
simply ignores this TLV.
4.2.1. Structure of Multiple VC TLV
The field structure is defined in Figure 4.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MultipleVC | Length=12 | must be zero |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| First Label |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Last Label |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: Multiple VC TLV
Where:
o MultipleVC is the TLV identifier assigned by IANA
o Length is 12
o First Label is a 20-bit label value as specified in [RFC3032]
represented as a 20-bit number in a 4 octet field that indicates
the start of the load balance label block
o Last Label is a 20-bit label value as specified in [RFC3032]
represented as a 20-bit number in a 4 octet field that indicates
the end of the load balance label block
First Label SHOULD be the normal VC for this pseudowire.
5. OAM
The following OAM considerations apply to both methods of load
balancing.
Bryant, et al. Expires May 11, 2008 [Page 8]
Internet-Draft LB-fat-pw November 2007
Where the OAM is only to be used to perform a basic test that the
pseudowires have been configured at the PEs VCCV[I-D.ietf-pwe3-vccv]
messages may be sent using any load balance pseudowire path, i.e.
over any of the multiple pseudowire labels, or using any pseudowire
load balance label.
Where it is required to verify that a pseudowire is fully functional
for all flowsVCCV [I-D.ietf-pwe3-vccv] connection verification
message MUST be sent over each ECMP path to the pseudowire egress PE.
This problem is difficult to solve and scales poorly. . We believe
that this problem is addressed by the following two methods:
1. If a failure occurs within the PSN, this failure will normally be
detected by the PSN's IGP (link/node failure, link or BFD or IGP
hello detection), and the IGP convergence will naturally modify
the ECMP set of network paths between the Ingress and Egress
PE's. Hence the PW is only impacted during the normal IGP
convergence time.
2. If the failure is related to the individual corruption of an LFIB
entry in a router, then only the network path using that specific
entry is impacted. If the PW is load balanced over multiple
network paths, then this failure can only be detected if, by
chance, the transported OAM flow is mapped onto the impacted
network path, or all paths are tested. This type of error may be
better solved be solved by other means such as LSP self test
[I-D.ietf-mpls-lsr-self-test].
To troubleshoot the MPLS PSN, including multiple paths, the
techniques described in [RFC4378] and [RFC4379] can be used.
6. Applicability
This design applies to MPLS pseudowires where it is meaningful to
deconstruct the packets presented to the ingress PE into flows. The
mechanism described in this document promotes the distribution of
flows within the pseudowire over different network paths. This in
turn means that whilst packets within a flow are delivered in order
(subject to normal IP delivery perturbations due to topology
variation), order is not maintained amongst packets of different
flows. It is not proposed to associate a different sequence number
with each flow. If sequence number support is required this
mechanism is not applicable.
Where it is known that the traffic carried by the pseudowire is IP
the method of identifying the flows are well known and can be
applied. Of particular importance when the pseudowire is an Ethernet
Bryant, et al. Expires May 11, 2008 [Page 9]
Internet-Draft LB-fat-pw November 2007
the Ethernet control frames are always using the same network path
and hence remain in order.
Methods of identifying separable flows for payloads other than IP are
less well known, but commonly applied in link bundling between
adjacent switches. In this case however the latency distribution
would be larger than is found in the link bundle case. The
acceptability of the increased latency is for further study.
7. Comparision of the Approaches
There are a number of advantages and disadvantages to each approach:
o The LB label method has a better statistical multiplexing
capability.
o The LB label method has a better semantic than the PW Label block
approach as this latter merges the pseudowire emultiplexor and the
load balance semantics.
o The LB label preserves label space and hence the FIB table size.
o The PW Label Block preserves the data plane path of the egress PE
o The PW Label Block has better hardware backwards compatibility.
o Both approach anyway require data plane forwarding change for the
ingress PE.
o The LB Label method requires a data plane forwarding change for
the egress PE.
8. Security Considerations
The pseudowire generic security considerations described in [RFC3985]
and the security considerations applicable to a specific pseudowire
type (for example, in the case of an Ethernet pseudowire [RFC4448]
apply.
There are no additional security risks introduced by this design.
9. IANA Considerations
IANA is requested to allocate the next available values from the IETF
Consensus range in the Pseudowire Interface Parameters Sub-TLV type
Bryant, et al. Expires May 11, 2008 [Page 10]
Internet-Draft LB-fat-pw November 2007
Registry as a Load Balance Label indicator.
Parameter Length Description
TBD 4 Load Balancing Label
TBD 12 Load Balance VC Block
10. Congestion Considerations
The congestion considerations applicable to pseudowires as described
in [RFC3985] and any additional congestion considerations developed
at the time of publication apply to this design.
The ability to explicitly configure a PW to leverage the availability
of multiple ECMP paths is beneficial to capacity planning as, all
other parameters being constant, the statistical multiplexing of a
larger number of smaller flows is more efficient than with a smaller
number of larger flows.
Note that if the classification into flows is only performed on IP
packets the behaviour of those flows in the face of congestion will
be as already defined by the IETF for packets of that type and no
additional congestion processing is required.
Where flows that are not IP are classified pseudowire congestion
avoidance must be applied to each non-IP load balance group.
11. Acknowledgements
The authors wish to thank Ulrich Drafz, Wilfried Maas, Luca Martini
and Mark Townsley for valuable comments and contributions to this
design.
12. References
12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
Encoding", RFC 3032, January 2001.
[RFC4378] Allan, D. and T. Nadeau, "A Framework for Multi-Protocol
Bryant, et al. Expires May 11, 2008 [Page 11]
Internet-Draft LB-fat-pw November 2007
Label Switching (MPLS) Operations and Management (OAM)",
RFC 4378, February 2006.
[RFC4379] Kompella, K. and G. Swallow, "Detecting Multi-Protocol
Label Switched (MPLS) Data Plane Failures", RFC 4379,
February 2006.
[RFC4385] Bryant, S., Swallow, G., Martini, L., and D. McPherson,
"Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for
Use over an MPLS PSN", RFC 4385, February 2006.
[RFC4447] Martini, L., Rosen, E., El-Aawar, N., Smith, T., and G.
Heron, "Pseudowire Setup and Maintenance Using the Label
Distribution Protocol (LDP)", RFC 4447, April 2006.
[RFC4448] Martini, L., Rosen, E., El-Aawar, N., and G. Heron,
"Encapsulation Methods for Transport of Ethernet over MPLS
Networks", RFC 4448, April 2006.
[RFC4553] Vainshtein, A. and YJ. Stein, "Structure-Agnostic Time
Division Multiplexing (TDM) over Packet (SAToP)",
RFC 4553, June 2006.
[RFC4928] Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal
Cost Multipath Treatment in MPLS Networks", BCP 128,
RFC 4928, June 2007.
12.2. Informative References
[I-D.ietf-mpls-lsr-self-test]
Swallow, G., "Label Switching Router Self-Test",
draft-ietf-mpls-lsr-self-test-07 (work in progress),
May 2007.
[I-D.ietf-pwe3-vccv]
Nadeau, T. and C. Pignataro, "Pseudowire Virtual Circuit
Connectivity Verification (VCCV) A Control Channel for
Pseudowires", draft-ietf-pwe3-vccv-15 (work in progress),
September 2007.
[RFC3985] Bryant, S. and P. Pate, "Pseudo Wire Emulation Edge-to-
Edge (PWE3) Architecture", RFC 3985, March 2005.
Bryant, et al. Expires May 11, 2008 [Page 12]
Internet-Draft LB-fat-pw November 2007
Authors' Addresses
Stewart Bryant
Cisco Systems
250 Longwater Ave
Reading RG2 6GB
United Kingdom
Phone: +44-208-824-8828
Email: stbryant@cisco.com
Clarence Filsfils
Cisco Systems
Brussels
Belgium
Email: cfilsfil@cisco.com
Joerg Kuechemann
Deutsche Telekom
Muenster,
Germany
Phone:
Fax:
Email: Joerg.Kuechemann@telekom.de
URI:
Bryant, et al. Expires May 11, 2008 [Page 13]
Internet-Draft LB-fat-pw November 2007
Full Copyright Statement
Copyright (C) The IETF Trust (2007).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).
Bryant, et al. Expires May 11, 2008 [Page 14]
| PAFTECH AB 2003-2026 | 2026-04-24 01:33:14 |