<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
	<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
	<!ENTITY RFC3209 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3209.xml">
	<!ENTITY RFC3473 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3473.xml">
	<!ENTITY RFC3477 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3477.xml">
	<!ENTITY RFC3630 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3630.xml">
	<!ENTITY RFC4203 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4203.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="4"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>

<rfc category="std" docName="draft-fuxh-mpls-delay-loss-te-framework-04"
     ipr="trust200902">
  <front>
    <title abbrev="Link Latency, Jitter and Loss"> Loss and Delay Traffic Engineering Framework for MPLS </title>
    <author fullname="Xihua Fu" initials="X" surname="Fu">
			<organization>ZTE</organization>
			<address>				
				<email>fu.xihua@zte.com.cn</email>				
			</address>
     </author>
    <author fullname="Vishwas Manral" initials="V." surname="Manral">
      <organization>Hewlett-Packard Corp.</organization>
      <address>
        <postal>
          <street>191111 Pruneridge Ave.</street>
          <city>Cupertino</city>
          <region>CA</region>
          <code>95014</code>
          <country>US</country>
        </postal>
        <phone>408-447-1497</phone>
        <email>vishwas.manral@hp.com</email>
        <uri></uri>
      </address>
    </author>
     <author fullname="Dave McDysan" initials="D." surname="McDysan">
			<organization>Verizon</organization>
			<address>				
				<email>dave.mcdysan@verizon.com</email>
			</address>
     </author>
     <author fullname="Andrew Malis" initials="A." surname="Malis">
			<organization>Verizon</organization>
			<address>				
				<email>andrew.g.malis@verizon.com</email>
			</address>
     </author>	       
    <author fullname="Spencer Giacalone" initials="S." surname="Giacalone">
      <organization>Thomson Reuters</organization>
      <address>
        <postal>
          <street>195 Broadway</street>
          <city>New York</city>
          <region>NY</region>
          <code>10007</code>
          <country>US</country>
        </postal>
        <phone>646-822-3000</phone>
        <email>spencer.giacalone@thomsonreuters.com</email>
        <uri></uri>
      </address>
    </author> 
     <author fullname="Malcolm Betts" initials="M" surname="Betts">	
			<organization>ZTE</organization>
			<address>				
				<email>malcolm.betts@zte.com.cn</email>				
			</address>
     </author>
     <author fullname="Qilei Wang" initials="Q.L" surname="Wang">
			<organization>ZTE</organization>
			<address>				
				<email>wang.qilei@zte.com.cn</email>				
			</address>
     </author>	    
    <author fullname="John Drake" initials="J." surname="Drake">
			<organization>Juniper Networks</organization>
			<address>				
				<email>jdrake@juniper.net</email>
			</address>
     </author>
    <date year="2012" />

    <abstract>
          <t> With more and more enterprises using cloud-based services, the
          distances between users and applications are growing. Many
          current applications were designed to work across LANs and
          carry inherent assumptions about the network. For applications such
          as High Performance Computing and electronic financial markets,
          response time and packet loss are critical, while other
          applications require more throughput.</t>

          <t> [RFC3031] describes the architecture of MPLS-based networks.
          This draft extends the MPLS architecture to allow for latency,
          loss and jitter as properties. It describes the requirements and control plane implications
          of using latency and packet loss as traffic engineering performance
          metrics in today's networks, which may consist of multiple
          layers of packet transport and optical transport networks, so that
          an accurate end-to-end latency and loss prediction can be made
          before a path is established.
          </t>

          <t> Note that the MPLS architecture for multicast will be taken up in a
          future version of this draft. </t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in [RFC2119].</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
          <t> In High Frequency trading for Electronic Financial markets, 
          computers make decisions based on the Electronic Data received, without
          human intervention. These trades now account for a majority of the  
          trading volumes and rely exclusively on ultra-low-latency direct
          market access. </t>

          <t> Extremely low latency measurements for MPLS LSP tunnels are
          defined in [ietf-mpls-loss-delay]. These mechanisms
          measure and monitor performance metrics for packet loss and
          one-way and two-way delay, as well as related metrics such as delay
          variation and channel throughput. </t>
          
          <t> These measurements, however, are effective only after the LSP is
          created and cannot be used by an MPLS path computation engine to define
          paths that have the lowest latency. This draft defines the
          architecture used, so that end-to-end tunnels can be set up based on
          latency, loss or jitter characteristics.</t>

          <t> End-to-end service optimization based on latency and packet loss
          is a key requirement for service providers. This type of function will
          be adopted by their "premium" service customers, who are willing to
          pay for such a service. Latency and loss information at the route level will
          also help carriers' customers make their provider selection decisions.
	  </t>

    </section>
    <section title="Architecture requirements overview">
				<section title="Communicate Latency and Loss as TE Metric">
						<t>The solution MUST provide a means to communicate the latency, latency variation and
						   packet loss of links and nodes as traffic engineering performance metrics in the IGP.</t>
						<t>Latency, latency variation and packet loss may be unstable. For example,
						   if queueing latency were included, the IGP could become unstable.
						   The solution MUST provide a means to control the rate of IGP latency and loss advertisements
						   and to avoid instability when the latency, latency variation and packet loss values change frequently.
						</t>
            <t> Where it is known that either the changes are too frequent or a preferred
                backup exists, the solution shall put the node or the link in an unusable state for services
                requiring a particular service capability. This unusable state applies per capability,
                not globally. The condition for entering this state is locally configured, and all routers in a domain should have these criteria synchronized.
            </t>
						<t>The path computation entity MUST be able to compute an end-to-end path
						   subject to latency and packet loss constraints. For example, it can
						   compute a route with X amount of bandwidth, less than Y ms of latency and less than Z%
						   packet loss, based on the latency and packet loss traffic engineering database.
						   It MUST also support path computation with a combination of routing constraints
						   ordered by pre-defined priorities, e.g., SRLG diversity, latency, loss, jitter and cost.
						   If the performance of a link exceeds its configured maximum threshold,
						   the path computation entity may decline to select that link even though the end-to-end performance target would still be met.
            </t>
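            <t>As a non-normative illustration, the bandwidth, latency and loss example above can be sketched in Python. The function name, the tuple layout standing in for the TE database and the units (microseconds, packets per million) are assumptions made for this sketch and are not defined by this document. The sketch prunes links without enough bandwidth, minimizes latency, and then checks both bounds. Because it optimizes latency only, it may miss a feasible path with lower loss but higher latency.</t>
            <figure><artwork><![CDATA[
```python
import heapq


def compute_path(links, src, dst, bw, max_lat_us, max_loss_ppm):
    """links: (a, b, avail_bw, latency_us, loss_ppm), one-way.

    Returns (path, latency_us, loss_ppm), or None when the
    minimum-latency path does not satisfy both bounds.
    """
    adj = {}
    for a, b, avail_bw, lat, loss in links:
        if avail_bw >= bw:  # prune links that cannot carry the LSP
            adj.setdefault(a, []).append((b, lat, loss))
    best = {src: 0}
    heap = [(0, 0, src, [src])]  # (latency, loss, node, path)
    while heap:
        lat, loss, node, path = heapq.heappop(heap)
        if node == dst:
            ok = lat < max_lat_us and loss < max_loss_ppm
            return (path, lat, loss) if ok else None
        if lat > best.get(node, lat):
            continue  # stale heap entry
        for nxt, l_lat, l_loss in adj.get(node, []):
            new_lat = lat + l_lat
            if new_lat < best.get(nxt, new_lat + 1):
                best[nxt] = new_lat
                # Adding ppm values is a small-loss approximation.
                entry = (new_lat, loss + l_loss, nxt, path + [nxt])
                heapq.heappush(heap, entry)
    return None
```
]]></artwork></figure>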
				</section>

         <section title="Requirement for Composite Link">
              <t> An end-to-end LSP may traverse one or more Composite Links [CL-REQ]. Even
                  if the transport technology (e.g., OTN) of the component links is identical,
                  the latency and packet loss characteristics of the component links may
                  differ due to factors such as fiber distance and/or fiber characteristics.
              </t>
              <t> The solution MUST provide a means to indicate that a traffic flow should
                  select a component link with minimum latency and/or packet loss, a maximum
                  acceptable latency and/or packet loss value, and a maximum acceptable delay
                  variation value, as specified by the protocol. The endpoints of the Composite Link
                  take these parameters into account for component link selection or
                  creation. How transient response is handled is specified in Section 4.1 of [CL-REQ]. The exact details for component links will be taken up
                  separately and are not part of this document.
              </t>
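              <t>One possible selection rule at a Composite Link endpoint is sketched below, purely as an illustration. The dictionary keys, the units and the tie-break on loss are assumptions of this sketch; the actual component link procedures are out of scope for this document.</t>
              <figure><artwork><![CDATA[
```python
def select_component(components, max_lat_us, max_loss_ppm,
                     max_jitter_us):
    """components: dicts with hypothetical keys 'id',
    'latency_us', 'loss_ppm' and 'jitter_us'.

    Returns the id of the lowest-latency component that is
    within all three bounds, or None if there is none.
    """
    ok = [c for c in components
          if c["latency_us"] <= max_lat_us
          and c["loss_ppm"] <= max_loss_ppm
          and c["jitter_us"] <= max_jitter_us]
    if not ok:
        return None
    # Prefer minimum latency; break ties on lower loss.
    best = min(ok, key=lambda c: (c["latency_us"], c["loss_ppm"]))
    return best["id"]
```
]]></artwork></figure>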
        </section>

     <section title ="Requirement for Hierarchy LSP">
               <t> Hierarchical LSPs may traverse server layer LSPs. For such LSPs there may be
                 latency and packet loss constraints on the segment in the server layer.
               </t>

               <t> The solution MUST provide a means to indicate FA selection or FA-LSP
                 creation with minimum latency and/or packet loss, a maximum acceptable latency
                 and/or packet loss value, and a maximum acceptable delay variation value. The
                 boundary nodes of the FA-LSP take these parameters into account for FA
                 selection or FA-LSP creation.
               </t>
      </section>

			<section title ="Latency Accumulation and Verification">

						   <t>The solution SHOULD provide a means to accumulate (e.g., sum) the
						   latency information of the links and nodes that an LSP traverses (e.g., Inter-AS, Inter-Area or Multi-Layer), so that the source node can validate whether the desired maximum latency constraint can be satisfied for a packet traversing the
LSP. [Y.1541] provides details of how the latency value is accumulated. </t>
       <t> Both one-way and
						   round-trip latency collection along the LSP by the signaling protocol, and
						   latency verification at the end of the LSP, should be supported.</t>

						   <t> Accumulating the delay is simple for the static component,
						   i.e., it is a linear addition. The dynamic (network loading) component is
						   harder and involves some estimate of the "worst case".
						   However, the method of deriving this worst case appears to be more in the
						   scope of network operator policy than of standards, i.e., the operator
						   needs to decide, based on the SLAs offered, the required confidence
						   level.</t>
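						   <t>The following sketch illustrates the accumulation described above under simple assumptions: each hop reports a static latency and an operator-chosen upper estimate of its dynamic latency, both in microseconds. The function and parameter names are hypothetical. Adding the per-hop upper estimates is only one conservative policy; an operator might instead derive the dynamic term from a percentile that matches its SLA confidence level.</t>
						   <figure><artwork><![CDATA[
```python
def accumulate_latency(hops, max_latency_us):
    """hops: list of (static_us, dynamic_estimate_us) tuples.

    Returns (total_static, total_worst_case, constraint_met).
    """
    # The static component is a plain linear addition.
    total_static = sum(s for s, _ in hops)
    # Assumed policy: add every hop's dynamic upper estimate.
    total_worst = total_static + sum(d for _, d in hops)
    return total_static, total_worst, total_worst <= max_latency_us
```
]]></artwork></figure>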
			</section>
      <section title ="Restoration, Protection and Rerouting">   				
			  <t> Some customers may insist on having the ability to re-route if the
            latency and loss SLA is not being met. If the latency and/or loss of a "provisioned" end-to-end
            LSP cannot meet the latency and loss agreement
            between the operator and its user, the solution SHOULD support pre-defined
            or dynamic re-routing (e.g., make-before-break) to handle this case based on local policy.
            If revertive behaviour is supported, the original LSP must not be released
            and is monitored by the control plane.
            When the end-to-end performance is repaired, the service is restored to the original LSP.
       </t>
       <t> The solution SHOULD support moving an end-to-end LSP away from any link whose performance violates the configured threshold. </t>

       <t> End-to-end measurements of the LSP also need to be performed in addition to the link-by-link measurements. A threshold violation of the End-to-End criteria as measured by the head end node should cause rerouting of the LSP. </t>
        <t> The anomalous path can be switched to a protection path or rerouted to a new path when its end-to-end performance target can no longer be met.
       </t>
        <t> If a "provisioned" end-to-end LSP latency and/or loss performance is 
            improved (i.e., beyond a configurable minimum value), the solution SHOULD support the re-routing
            to optimize latency and/or loss end-to-end cost.</t>

        <t> The latency performance of pre-defined protection or dynamic re-routing LSP
            MUST meet the latency SLA parameter. </t>

        <t> Under flapping conditions the latency and loss of an LSP may change repeatedly, which may cause the LSP to be switched to a new path frequently. In order to avoid churn, the solution SHOULD govern the
            switchover of the LSP according to a maximum acceptable change rate. </t>
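        <t>One way to bound the switchover rate is a sliding-window limiter, sketched below. This is an illustration only: the class name, the window semantics and the use of seconds are assumptions, and this document does not mandate any particular damping algorithm.</t>
        <figure><artwork><![CDATA[
```python
class SwitchoverLimiter:
    """Allow at most max_switches switchovers per window_s."""

    def __init__(self, max_switches, window_s):
        self.max_switches = max_switches
        self.window_s = window_s
        self.history = []  # times of accepted switchovers

    def allow(self, now_s):
        # Forget switchovers that left the sliding window.
        self.history = [t for t in self.history
                        if now_s - t < self.window_s]
        if len(self.history) >= self.max_switches:
            return False  # suppress: LSP stays on current path
        self.history.append(now_s)
        return True
```
]]></artwork></figure>
        <t>With a limit of two switchovers per 60 seconds, a third request inside the window is suppressed and the LSP stays on its current path until an earlier switchover ages out.</t>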
      </section>
    </section>
  
    <section title="End-to-End Latency">
      <t> Procedures to measure latency and loss have been provided in ITU-T [Y.1731],
          [G.709] and [ietf-mpls-loss-delay]. The control plane can be independent
          of the measurement mechanism used, and different mechanisms based on
          different standards can be used for measurement.
      </t>
  
      <t> Latency on a path has two sources: node latency, which results from
          the processing time in each node, and link latency, which results from
          packet/frame transit time between two neighbouring nodes or across an FA-LSP or
          Composite Link [CL-REQ].
      </t>


      <t> Latency, or one-way delay, is the time it takes for a packet within
      a stream to go from measurement point 1 to measurement point 2, as defined in [Y.1540]. </t>

      <t> The architecture assumes that the sum of the latencies of
      the individual components approximates the average latency of
      an LSP. Though the sum may not be exact, it gives a good
      approximation that can be used for Traffic Engineering (TE) purposes. </t>

      <t> The total measured latency of an LSP consists of the sum of the latencies of the
      LSP hops, as well as the average switching latency on each device, which
      may vary based on queuing and buffering. </t>

      <t> Hop latency can be measured between the egress of one MPLS LSR
      and the ingress of the next-hop LSR. This
      value may be constant for the most part, unless there is protection switching
      or a similar change at a lower layer. </t>

      <t> The switching latency on a device can be measured internally, and
      multiple mechanisms and data structures for doing so have been defined.
      [Add references to papers by Verghese, Kompella, Duffield]. </t>

      <t> We also looked at other measurement granularities before deciding on an interface-based measurement. An approximation of flow-based measurement is a per-DSCP measurement from the ingress of one port to the egress of every other port in the device.</t>

      <t> Another approximation that can be used is a per-interface, DSCP-based measurement, which can be an aggregate of the average measurements per interface. The average itself can be calculated in different ways so as to provide a closer approximation. </t>

      <t>For the purpose of this draft it is assumed that the node latency is a small factor of the total latency in the networks where this solution is deployed.  The node latency is hence ignored for the
benefit of simplicity in this solution.</t>
 
      <t> The average link delay over a configurable interval should be reported by the data plane in microseconds.</t>

      </section>

      <section title="End-to-End Jitter">
      <t> Jitter or Packet Delay Variation of a packet within
      a stream of packets is defined for a selected pair of packets in the stream going from measurement point 1 to measurement point 2. </t>

      <t> This architecture uses the assumptions of [Y.1540] to calculate the accumulated jitter from the individual components approximately. Though using this may not be perfect, it however gives a good
      approximation that can be used for Traffic Engineering (TE) purposes.</t>

      <t> The buffering and queuing within a device will lead to the jitter. Just like latency measurements, jitter measurements can be 
      approximated as either per DSCP per port pair (Ingress and Egress) or as per DSCP per egress port, however such measurements have been left out for the sake of simplicity of the solution. </t> 

      <t>For the purpose of this draft it is assumed that the node latency is a small factor of the total latency in the networks where this solution is deployed.  The node latency is hence ignored for the      benefit of simplicity.</t>
  
  <t> The jitter is measured in microseconds. </t>

      </section>

      <section title="End-to-End Loss">
      <t> Loss, or packet drop probability, of a packet within a stream of
      packets is derived from the number of packets dropped within a given
      interval relative to the number of packets sent in that interval. </t>

      <t> This architecture uses the assumptions of [Y.1540] to calculate the accumulated loss from the individual components approximately. Though using the accumulated metrics may not be perfect, it however gives a good approximation that can be used for Traffic Engineering (TE) purposes.</t>

           <t> The buffering and queuing mechanisms within a device decide which packet is to be dropped. Just like latency and jitter measurements, the loss can best be approximated either per DSCP per port pair (ingress and egress) or per DSCP per egress port. However, such mechanisms are not used here in order to keep the solution simple.</t> 

      <t> The loss is measured in terms of the number of packets per million packets. </t>
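      <t>As an illustration of accumulating loss expressed in packets per million, the sketch below combines per-hop values under the assumption that losses on different hops are independent, so that the end-to-end delivery probability is the product of the per-hop delivery probabilities. The function name is hypothetical, and the independence assumption is made for this sketch rather than taken from this document.</t>
      <figure><artwork><![CDATA[
```python
def accumulate_loss_ppm(hop_loss_ppm):
    """Combine per-hop loss (packets per million) end to end,
    assuming independent loss on each hop."""
    survive = 1.0
    for ppm in hop_loss_ppm:
        survive *= 1.0 - ppm / 1_000_000
    return round((1.0 - survive) * 1_000_000)
```
]]></artwork></figure>
      <t>For small per-hop values the result is close to the plain sum: two hops of 1000 packets per million each combine to 1999 rather than 2000.</t>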
      </section>

   <section title="Protocol Considerations">
      <t> The metrics above can be sent in IGP protocol packets as defined in RFC 3630. They can then be used by the source node or the path computation engine to select paths with the desired properties. </t>

      <t> As link-state IGP information is flooded throughout an area,
      frequent changes can cause a lot of control traffic. To prevent such flooding, data should only be flooded when it crosses a configured maximum. </t>

      <t> A separate measurement should be done for an LSP when it is up. Also, an LSP's path should only be recalculated when the end-to-end metrics change such that they exceed the desired values.</t>
    </section>

     <section title="Control Plane Implication">		
       <section title="Implications for Routing">
        <t> The latency and packet loss performance metrics MUST be advertised to the path computation entity by the IGP (OSPF-TE, OSPFv3-TE or IS-IS-TE) so that it can perform route computation and network planning based on latency and packet loss SLA targets. </t>

        <t> Latency, latency variation and packet loss values MUST be reported as average values calculated from data plane measurements.</t>

        <t> The latency and packet loss characteristics of links and nodes may change dynamically. In order to control IGP messaging and avoid instability when the latency, latency variation and packet loss values change, a threshold and a limit on the rate of change MUST be configured in the IGP control plane. </t>

        <t> Latency and packet loss value changes need to be updated and flooded in IGP control messages only when there is a significant change in the value. When the head-end node determines that the IGP update affects an LSP for which it is the ingress, it recalculates the LSP. </t>
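        <t>The threshold and rate limit described above could be combined as in the following sketch. It is illustrative only: the class, its parameters and the choice of an absolute threshold in microseconds with a minimum interval in seconds are assumptions, not protocol definitions.</t>
        <figure><artwork><![CDATA[
```python
class MetricAdvertiser:
    """Decide whether a newly measured latency is re-flooded."""

    def __init__(self, threshold_us, min_interval_s):
        self.threshold_us = threshold_us
        self.min_interval_s = min_interval_s
        self.advertised_us = None
        self.last_adv_s = None

    def update(self, measured_us, now_s):
        """Return True if the new value should be flooded."""
        if self.advertised_us is not None:
            change = abs(measured_us - self.advertised_us)
            if change < self.threshold_us:
                return False  # not a significant change
            if now_s - self.last_adv_s < self.min_interval_s:
                return False  # rate limit on advertisements
        self.advertised_us = measured_us
        self.last_adv_s = now_s
        return True
```
]]></artwork></figure>
        <t>Comparing against the last advertised value, rather than the last measured value, keeps a slow drift from going unreported indefinitely.</t>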

        <t> A target value MUST be configured in the control plane for each link. If the link performance improves beyond this configurable target value, it must be re-advertised. The receiving node determines whether the latency and/or loss performance of a "provisioned" end-to-end LSP has improved. </t>

        <t> It is sometimes important for paths that desire low latency to avoid nodes that have a significant contribution to latency. Control plane should report two components of the delay, "static" and "dynamic". The dynamic component is always caused by traffic loading and queuing. The "dynamic" portion SHOULD be reported as an approximate value. The static component should be a fixed latency through the node without any queuing. Link latency attribute should also take into account the latency of node, i.e., the latency between the incoming port and the outgoing port of a network element. Half of the fixed node latency can be added to each link.
        </t>
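        <t>The rule of adding half of the fixed node latency to each link can be illustrated as follows. The function name and the microsecond units are assumptions of this sketch. Summing the advertised values along a path A-B-C then counts the fixed latency of the transit node B in full, and half of the latency of each end node.</t>
        <figure><artwork><![CDATA[
```python
def link_latency_with_node_share(link_us, node_a_us, node_b_us):
    """Latency to advertise for link A-B when half of each end
    node's fixed latency is folded into the link."""
    return link_us + node_a_us / 2 + node_b_us / 2
```
]]></artwork></figure>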

        <t> When a Composite Link [CL-REQ] is advertised into the IGP, the following considerations apply.
					<list style="symbols">
						<t> One option is that the latency and packet loss of the composite link is advertised as a range (e.g., at least the minimum and maximum) over all component links. It may also be the maximum or average latency value of all component links. In both cases, only partial information is transmitted in the IGP, so the path computation entity has insufficient information to determine whether a particular path can support its latency and packet loss requirements. This can lead to signaling crankback.
 	          </t>
 	          <t>Another option is that the latency and packet loss of each component link within a Composite Link is advertised, while maintaining only one IGP adjacency.
 	          </t>
					</list>
				</t>
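				<t>The limitation of the summarized option can be illustrated with a small sketch. The function names and the three-way classification are hypothetical. When the requested bound falls between the advertised minimum and maximum, the summary shows that some component link meets the bound but not which one, nor whether it has capacity left, so a signaled path may still be rejected.</t>
				<figure><artwork><![CDATA[
```python
def summarize_composite(component_latencies_us):
    """Summary a composite link might advertise (assumed form)."""
    lats = component_latencies_us
    return {"min": min(lats), "max": max(lats),
            "avg": sum(lats) / len(lats)}


def feasibility(summary, required_max_us):
    """Which component links can meet the latency bound?"""
    if required_max_us >= summary["max"]:
        return "all"
    if required_max_us < summary["min"]:
        return "none"
    return "some"  # ambiguous case: crankback is possible
```
]]></artwork></figure>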
				<t>One end-to-end LSP (e.g., in IP/MPLS or MPLS-TP network) may traverse a FA-LSP of server layer (e.g., OTN rings). The boundary nodes of the FA-LSP SHOULD be aware of the latency and packet loss information of this FA-LSP. 
				</t>
		   	<t>If the FA-LSP is able to form a routing adjacency and/or act as a TE link in the client network, the total latency and packet loss value of the FA-LSP can be used as an input to a transformation that results in an FA traffic engineering metric, which is advertised into the client layer routing instances.
		       Note that this metric will include the latency and packet loss of the links and nodes that the trail traverses.
		    </t>					     
			  <t>If the total latency and packet loss information of the FA-LSP changes (e.g., due to a maintenance action or failure in OTN rings),
			     the boundary node of the FA-LSP will receive a TE link information advertisement
			     that includes the changed latency and packet loss value. If the change exceeds the threshold and the limit on the rate of change,
			     the boundary node will compute the total latency and packet loss value of the FA-LSP again.
			     If the total latency and packet loss value of the FA-LSP changes, the client layer MUST also be notified of the latest value of the FA.
			     The client layer can then decide whether it will accept the increased latency and packet loss or request a new path that meets the latency and packet loss requirement.
		    </t>
			</section>
      <section title="Implications for Signaling">
           <t> In order to assign the LSP to one of several component links with different latency and loss
               characteristics, the RSVP-TE message needs to carry an indication of a request for minimum
               latency and/or packet loss, a maximum acceptable latency and/or packet loss value, and
               a maximum acceptable delay variation value for component link selection or creation.
               The composite link will take these parameters into account when assigning the traffic of
               the LSP to a component link.</t>

           <t> An end-to-end LSP (e.g., in an IP/MPLS or MPLS-TP network) may traverse an FA-LSP
               of a server layer (e.g., OTN rings). There will be some latency and packet loss
               constraints on the segment route in the server layer. The RSVP-TE message therefore
               needs to carry an indication of a request for minimum latency and/or packet loss,
               a maximum acceptable latency and/or packet loss value, and a maximum acceptable delay
               variation value. The boundary nodes of the FA-LSP will take these parameters into
               account for FA selection or FA-LSP creation.</t>

           <t> RSVP-TE needs to be extended to accumulate (e.g., sum) the latency information of links
               and nodes along an LSP across multiple domains (e.g., Inter-AS, Inter-Area or Multi-Layer)
               so that a latency verification can be made at the end points. One-way and round-trip
               latency collection along the LSP by the signaling protocol can be supported, so that the end
               points of the LSP can verify whether the total latency meets the
               latency agreement between the operator and its user. When RSVP-TE signaling is used,
               the source can determine whether the latency requirement is met much more rapidly
               than by performing an actual end-to-end latency measurement.</t>

          <t> Restoration, protection and equipment variations can impact "provisioned"
              latency and packet loss (e.g., by increasing them). For example, a
              restoration or provisioning action in the transport network may increase the latency
              seen by the packet network and observable by customers, possibly violating SLAs. A
              change in the latency and packet loss performance of an end-to-end LSP MUST be known
              by the source and/or sink node, so that it can inform the higher layer network of the
              latency and packet loss change. A latency or packet loss change on links and
              nodes affects the total latency or packet loss of an end-to-end LSP.
              Applications can fail beyond an application-specific threshold, so some remedy
              mechanism may be needed. </t>

          <t> Pre-defined protection or dynamic re-routing could be triggered to handle this
              case. In the case of pre-defined protection, large amounts of redundant capacity
              may have a significant negative impact on the overall network cost. A service
              provider may have many layers of pre-defined restoration for this transfer, but
              it has to duplicate restoration resources at significant cost. The solution
              should provide mechanisms to avoid duplicate restoration and reduce
              the network cost. Dynamic re-routing also faces the risk of resource
              limitation. The choice of mechanism MUST therefore be based on SLA or policy. In the
              case where the latency SLA cannot be met after a re-route is attempted, the control
              plane should report an alarm to the management plane. It could also retry restoration
              a configurable number of times. </t>
	  			</section>        	
    </section>
		      

    <section anchor="IANA" title="IANA Considerations">
      <t> No new IANA considerations are raised by this document. </t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>This document raises no new security issues. </t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t> TBD. </t>
    </section>
  </middle>

	<back>
    <references title="Normative References">
      &RFC2119;
      &RFC3209;
      &RFC3473;
      &RFC3477;
      &RFC3630;
      &RFC4203;
    </references>
    
	<references title="Informative References">
		<reference anchor="CL-REQ">
			<front>
				<title>
				Requirements for MPLS Over a Composite Link 
				</title>
				<author initials="C." surname="Villamizar">
				<organization>Infinera Corporation</organization>
			    </author>		
			</front>
			<seriesInfo name="draft-ietf-rtgwg-cl-requirement-04" value=""/>
		</reference>
		
		<reference anchor="G.709">
			<front>
				<title>
				Interfaces for the Optical Transport Network (OTN)
				</title>
				<author>
				<organization>
				ITU-T Recommendation G.709
				</organization>
				</author>
				<date month="December" year="2009"/>
			</front>			
		</reference>

		<reference anchor="Y.1731">
			<front>
				<title>
				OAM functions and mechanisms for Ethernet based networks
				</title>
				<author>
				<organization>
				ITU-T Recommendation Y.1731
				</organization>
				</author>
				<date month="Feb" year="2008"/>
			</front>			
		</reference>
    <reference anchor="ietf-mpls-loss-delay">
			<front>
				<title>
				Packet Loss and Delay Measurement for MPLS Networks
				</title>
				<author initials="D." surname="Frost">
				<organization>Cisco Systems</organization>
			    </author>		
			</front>
			<seriesInfo name="draft-ietf-mpls-loss-delay-03" value=""/>
		</reference>

    <reference anchor="EXPRESS-PATH">
			<front>
				<title>
				OSPF Traffic Engineering (TE) Express Path
				</title>
				<author initials="S." surname="Giacalone">
				<organization>Thomson Reuters</organization>
			    </author>		
			</front>
			<seriesInfo name="draft-giacalone-ospf-te-express-path-01" value=""/>
		</reference>
				
	</references>
	</back>
</rfc>
