<?xml version="1.0" encoding="US-ASCII"?>

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
  <!ENTITY RFC0793 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.0793">
  <!ENTITY RFC1122 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1122">
  <!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119">
  <!ENTITY RFC3542 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3542">
  <!ENTITY MPTCPARCH SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-mptcp-architecture-00">
  <!ENTITY MPTCP SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ford-mptcp-multiaddressed-02">
  <!ENTITY MPTCPSEC SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-mptcp-threat-00">
  <!ENTITY MPTCPCC SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-raiciu-mptcp-congestion-00">
  <!ENTITY SHIMAPI SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-shim6-multihome-shim-api-13">
  <!ENTITY HIPAPI SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-hip-native-api-12">
  <!ENTITY SCTPAPI SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-tsvwg-sctpsocket-21">
  <!ENTITY MIFPRACTICE SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-mif-current-practices-00">
  <!ENTITY AFMP SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-sarolahti-mptcp-af-multipath-01">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

<?rfc toc="yes"?>
<?rfc symrefs="no"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc strict="no"?>
<?rfc rfcedstyle="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>

<rfc category="info" docName="draft-scharf-mptcp-api-01" ipr="trust200902">
  <front>
    <title abbrev="MPTCP API">MPTCP Application Interface Considerations</title>

    <author fullname="Michael Scharf" initials="M." surname="Scharf">
      <organization>Alcatel-Lucent Bell Labs</organization>

      <address>
        <postal>
          <street>Lorenzstrasse 10</street>

          <city>70435 Stuttgart</city>

          <country>Germany</country>
        </postal>

        <email>michael.scharf@alcatel-lucent.com</email>
      </address>
    </author>

    <author fullname="Alan Ford" initials="A." surname="Ford">
      <organization>Roke Manor Research</organization>

      <address>
        <postal>
          <street>Old Salisbury Lane</street>

          <city>Romsey, Hampshire  SO51 0ZN</city>

          <country>UK</country>
        </postal>

        <phone>+44 1794 833 465</phone>

        <email>alan.ford@roke.co.uk</email>
      </address>
    </author>

    <date year="2010"/>

    <area>Transport Area</area>

    <workgroup>Internet Engineering Task Force</workgroup>

    <keyword>MPTCP</keyword>

    <keyword>TCP</keyword>

    <abstract>
      <t>Multipath TCP (MPTCP) adds the capability of using multiple
      paths to a regular TCP session. Even though it is designed to be
      fully backward compatible with applications, the data transport
      differs compared to regular TCP, and there are several
      additional degrees of freedom that applications may wish to
      exploit. This document summarizes the impact that MPTCP may have
      on applications, such as changes in performance.  Furthermore,
      it describes an optional extended application interface that
      provides access to multipath information and enables control of
      some aspects of the MPTCP implementation's behaviour.</t>
    </abstract>

  </front>

  <middle>

    <section title="Introduction">

      <t>Multipath TCP (MPTCP) adds the capability of using multiple
      paths to a regular TCP session <xref target="RFC0793"/>. The
      motivations for this extension include increasing throughput,
      overall resource utilisation, and resilience to network failure,
      and these motivations are discussed, along with high-level
      design decisions, as part of the MPTCP architecture
      <xref target="I-D.ietf-mptcp-architecture"/>. MPTCP
      <xref target="I-D.ford-mptcp-multiaddressed"/> offers the same
      reliable, in-order, byte-stream transport as TCP, and is
      designed to be backward-compatible with both applications and
      the network layer. It requires support inside the network stack
      of both endpoints. This document presents the impacts that MPTCP
      may have on applications, such as performance changes compared
      to regular TCP. Furthermore, it specifies an extended
      Application Programming Interface (API) describing how
      applications can exploit additional features of multipath
      transport. MPTCP is designed to be usable without any
      application changes. The specified API is an optional extension
      that provides access to multipath information and enables
      control of some aspects of the MPTCP implementation's behaviour,
      for example switching on or off the automatic use of MPTCP.</t>

      <t>The de facto standard API for TCP/IP applications is the
      "sockets" interface. This document defines experimental
      MPTCP-specific extensions, in particular additional socket
      options. It is up to the applications, high-level programming
      languages, or libraries to decide whether to use these optional
      extensions. For instance, an application may want to turn on or
      off the MPTCP mechanism for certain data transfers, or provide
      some guidance concerning its usage (and thus the service the
      application receives). The syntax and semantics of
      the specification are in line with the POSIX standard
      <xref target="POSIX"/> as much as possible.</t>

      <t>Some network stack implementations, especially on mobile
      devices, have centralized connection managers or other
      higher-level APIs to solve multi-interface issues, as surveyed
      in <xref target="I-D.ietf-mif-current-practices"/>. Their
      interaction with MPTCP is outside the scope of this note.</t>

      <t>There are also various related extensions of the sockets
      interface: <xref target="I-D.ietf-shim6-multihome-shim-api"/>
      specifies sockets API extensions for a multihoming shim
      layer. The API enables interactions between applications and the
      multihoming shim layer for advanced locator management and for
      access to information about failure detection and path
      exploration. Other experimental extensions to the sockets API
      are defined for the Host Identity Protocol (HIP)
      <xref target="I-D.ietf-hip-native-api"/> in order to manage the
      bindings of identifiers and locators. Other related API
      extensions exist for IPv6 <xref target="RFC3542"/> and SCTP
      <xref target="I-D.ietf-tsvwg-sctpsocket"/>. There can be
      interactions or incompatibilities of these APIs with MPTCP,
      which are discussed later in this document.</t>

      <t>The target readers of this document are application
      programmers who develop application software that may benefit
      significantly from MPTCP. This document also provides the
      necessary information for developers of MPTCP to implement the
      API in a TCP/IP network stack.</t>

    </section>

    <section title="Terminology">
       
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
      NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
      "OPTIONAL" in this document are to be interpreted as described
      in <xref target="RFC2119"/>.</t>

      <t>This document uses the terminology introduced in
      <xref target="I-D.ford-mptcp-multiaddressed"/>.</t>

    </section>

    <section title="Comparison of MPTCP and Regular TCP">
    <!-- <section title="Impact of MPTCP on Applications"> -->

      <t>This section discusses the impact that the use of MPTCP will
      have on applications, in comparison to what may be expected from
      the use of regular TCP.</t>

      <section title="Performance Impact">

        <t>One of the key goals of adding multipath capability to TCP
        is to improve the performance of a transport connection by
        load distribution over separate subflows across potentially
        disjoint paths. Furthermore, it is an explicit goal of MPTCP
        that it should not provide a worse-performing connection than
        would have existed through the use of legacy, single-path
        TCP. A corresponding congestion control algorithm is described
        in <xref target="I-D.raiciu-mptcp-congestion"/>. The
        following sections summarize the performance impact of MPTCP
        as seen by an application.</t>
        
        <section title="Throughput">

          <t>The most obvious performance improvement that will be
          gained with the use of MPTCP is an increase in throughput,
          since MPTCP will pool more than one path (where available)
          between two endpoints. This will provide greater bandwidth
          for an application.  If there are shared bottlenecks between
          the flows, then the congestion control algorithms will
          ensure that load is evenly spread amongst regular and 
          multipath TCP sessions, so that no end user receives
          worse performance than single-path TCP.</t>

          <t>Furthermore, this means that an MPTCP session could
          achieve throughput that is greater than the capacity of a
          single interface on the device. If any applications make
          assumptions about interfaces due to throughput (or vice
          versa), they must take this into account.</t>

          <t>The transport of MPTCP signaling information results in a
          small overhead. If multiple subflows share the same
          bottleneck, this overhead slightly reduces the capacity that
          is available for data transport. Yet, this potential
          reduction of throughput will be negligible in many usage
          scenarios, and the protocol contains optimisations in its
          design so that this overhead is minimal.</t>

        </section>

        <section title="Delay">

          <t>If the delays on the constituent subflows of an MPTCP
          connection differ, the jitter perceivable to an application
          may appear higher as the data is striped across the
          subflows. Although MPTCP will ensure in-order delivery to
          the application, the application must be able to cope with
          the data delivery being burstier than may be usual with
          single-path TCP. Since burstiness is commonplace on the
          Internet today, it is unlikely that applications will suffer
          from such an impact on the traffic profile, but application
          authors may wish to consider this in future development.</t>

          <t>In addition, applications that make round trip time (RTT)
          estimates at the application level may have some
          issues. Whilst the average delay calculated will be
          accurate, whether this is useful for an application will
          depend on what it requires this information for. If a new
          application wishes to derive such information, it should
          consider how multiple subflows may affect its measurements,
          and thus how it may wish to respond. In such a case, an 
          application may wish to express its scheduling preferences,
          as described later in this document.</t>
          
        </section>

        <section title="Resilience">

          <t>The use of multiple subflows simultaneously means that,
          if one should fail, all traffic will move to the remaining
          subflow(s), and additionally any lost packets can be
          retransmitted on these subflows.</t>

          <t>Subflow failure may be caused by issues within the
          network, which an application would be unaware of, or
          interface failure on the node. An application may, under
          certain circumstances, be in a position to be aware of such
          failure (e.g. by radio signal strength, or simply an interface
          enabled flag), but it must not make assumptions about an MPTCP
          flow's stability based on this. MPTCP will never override an
          application's request for a given interface, however, so the
          cases where this issue may be applicable are limited.</t>

        </section>

      </section>

      <section title="Potential Problems">

        <section title="Impact of Middleboxes">

          <t>MPTCP has been designed in order to pass through the
          majority of middleboxes, for example through its ability to
          open subflows in either direction, and through its use of a
          data-level sequence number.</t>

          <t>Nevertheless some middleboxes may still refuse to pass
          MPTCP messages due to the presence of TCP options. If this
          is the case, MPTCP should fall back to regular TCP. Although
          this will not create a problem for the application (its
          communication will be set up either way), there may be
          additional (and indeed, user-perceivable) delay while the
          first handshake fails.</t>

          <t>Empirical evidence suggests that new TCP options can
          successfully be used on most paths in the Internet. But they
          can also have other unexpected implications. For instance,
          intrusion detection systems could be triggered. Full
          analysis of MPTCP's impact on such middleboxes is for
          further study.</t>

        </section>

        <section title="Outdated Implicit Assumptions">

          <t>MPTCP overcomes the one-to-one mapping of the socket
          interface to a flow through the network. As a result,
          applications cannot implicitly rely on this one-to-one
          mapping any more. Applications that require the transport
          along a single path can disable the use of MPTCP as
          described later in this document. Examples include 
          monitoring tools that want to measure the available 
          bandwidth on a path, or routing protocols such as BGP
          that require the use of a specific link.</t>

        </section>

        <section title="Security Implications">

         <t>The support for multiple IP addresses within one MPTCP
         connection can result in additional security vulnerabilities,
         such as possibilities for attackers to hijack
         connections. The protocol design of MPTCP minimizes this
         risk.  An attacker on one of the paths can cause harm, but
         this is hardly an additional security risk compared to
         single-path TCP, which is vulnerable to man-in-the-middle
         attacks, too. A detailed threat analysis of MPTCP is
         published in <xref target="I-D.ietf-mptcp-threat"/>.</t>

        </section>

      </section>

    </section>

    <section title="Operation of MPTCP with Legacy Applications">
    <!-- <section title="Implications of MPTCP on Existing Interfaces"> -->

      <section title="Overview of the MPTCP Network Stack">

        <t>MPTCP is an extension of TCP, but it is designed to be
        backward compatible for legacy applications. TCP interacts
        with other parts of the network stack by different
        interfaces. The de facto standard API between TCP and
        applications is the sockets interface. The position of MPTCP
        in the protocol stack can be illustrated in 
        <xref target="fig_stack"/>.</t>

        <?rfc needLines='15'?>
        <figure title="MPTCP protocol stack" anchor="fig_stack" align="center">
          <artwork align="center"><![CDATA[
    +-------------------------------+
    |          Application          |
    +-------------------------------+
            ^                 |
 ~~~~~~~~~~~|~Socket Interface|~~~~~~~~~~~
            |                 v
    +-------------------------------+
    |             MPTCP             |
    + - - - - - - - + - - - - - - - +
    | Subflow (TCP) | Subflow (TCP) |
    +-------------------------------+
    |       IP      |      IP       |
    +-------------------------------+
          ]]></artwork>
        </figure>

        <t>In general, MPTCP can affect all interfaces that rely on
        the coupling of a TCP connection to a single IP address and
        TCP port pair, to one sockets endpoint, to one network
        interface, or to a given path through the network.</t>

        <t>This means that there are two classes of applications:

          <list style="symbols">
            <t>Legacy applications: These applications use the
            existing API towards TCP without any changes. This is the
            default case.</t>
            <t>MPTCP-aware applications: These applications indicate
            support for an enhanced MPTCP interface.</t>
          </list>

        The following sections discuss the extent to which MPTCP
        affects legacy applications using the existing sockets
        API.</t>

      </section>

      <section title="Usage of Addresses Inside Applications">

        <t>The existing sockets API implies that applications deal
        with data structures that store, amongst others, the IP
        addresses and TCP port numbers of a TCP connection. A design
        objective of MPTCP is that legacy applications can continue to
        use the established sockets API without any changes. However,
        in MPTCP there is a one-to-many mapping between the socket
        endpoint and the subflows. This has several subtle
        implications for legacy applications using sockets API
        functions.</t>

        <t>During binding, an application can either select a specific
        address, or bind to INADDR_ANY. Furthermore, the
        SO_BINDTODEVICE socket option can be used to bind to a
        specific interface. If an application uses a specific address,
        or sets the SO_BINDTODEVICE socket option to bind to a
        specific interface, then MPTCP MUST respect this and not
        interfere in the application's choices. If an application
        binds to INADDR_ANY, it is assumed that the application does
        not care which addresses to use locally. In this case, a local
        policy MAY allow MPTCP to automatically set up multiple
        subflows on such a connection. The extended sockets API will
        allow applications to express specific preferences in an
        MPTCP-compatible way (e.g. bind to a subset of interfaces
        only).</t>
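
        <t>The binding rules above can be sketched from the point of
        view of the stack. The following C fragment is only an
        illustration and not part of this specification: the function
        name mptcp_may_add_subflows() and its parameters are
        hypothetical, and it considers IPv4 only. It returns true only
        if the application bound to INADDR_ANY, did not set
        SO_BINDTODEVICE, and the local policy permits multipath
        operation.</t>

        <figure title="Sketch: binding rules for legacy applications">
          <artwork><![CDATA[
```c
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdbool.h>

/* Hypothetical helper: decide whether the stack may set up
 * additional subflows for a socket, following the rules above.
 *   local         - address the application passed to bind()
 *   bound_to_dev  - true if SO_BINDTODEVICE was set
 *   policy_allows - local policy switch (assumed to exist) */
static bool mptcp_may_add_subflows(const struct sockaddr_in *local,
                                   bool bound_to_dev,
                                   bool policy_allows)
{
    if (bound_to_dev)
        return false;        /* application chose an interface */
    if (local->sin_addr.s_addr != htonl(INADDR_ANY))
        return false;        /* application chose an address */
    return policy_allows;    /* INADDR_ANY: policy MAY allow it */
}
```
          ]]></artwork>
        </figure>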

        <t>Applications can use the getpeername() or getsockname()
        functions in order to retrieve the IP address of the peer or
        of the local socket. These functions can be used for various
        purposes, including security mechanisms, geo-location, or
        interface checks. The socket API was designed with an
        assumption that a socket is using just one address, and since
        this address is visible to the application, the application
        may assume that the information provided by the functions is
        the same during the lifetime of a connection. However, in
        MPTCP, unlike in TCP, there is a one-to-many mapping of a
        connection to subflows, and subflows can be added and removed
        while the connection continues to exist. Therefore, MPTCP
        cannot expose addresses by getpeername() or getsockname() that
        are both valid and constant during the connection's
        lifetime.</t>

        <t>This problem is addressed as follows: If used by a legacy
        application, the MPTCP stack MUST always return the addresses
        of the first subflow of an MPTCP connection, in all
        circumstances, even if that particular subflow is no longer in
        use. As this address may not be valid any more if the first
        subflow is closed, the MPTCP stack MAY close the whole MPTCP
        connection if the first subflow is closed (fate
        sharing). Whether to close the whole MPTCP connection by
        default SHOULD be controlled by a local policy. Further
        experiments are needed to investigate its implications.</t>
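
        <t>As an illustration of this rule, the following C fragment
        models a connection as an array of subflows. It is a sketch
        under stated assumptions, not an implementation: the
        structures, the MAX_SUBFLOWS limit, and the function names
        legacy_getpeername() and should_close_connection() are
        invented for this example, and only IPv4 addresses are
        modelled. The legacy view always reports the peer address of
        the first subflow, and an optional fate-sharing policy decides
        whether the connection is closed once that subflow is
        gone.</t>

        <figure title="Sketch: address reporting for legacy applications">
          <artwork><![CDATA[
```c
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUBFLOWS 8   /* arbitrary limit for this sketch */

struct subflow {
    struct sockaddr_in local;  /* local address and port */
    struct sockaddr_in peer;   /* remote address and port */
    bool               open;   /* false once the subflow is closed */
};

struct mptcp_conn {
    struct subflow flows[MAX_SUBFLOWS]; /* flows[0]: first subflow */
    size_t         nflows;
};

/* Legacy view of getpeername(): always report the peer address of
 * the first subflow, even if that subflow has since been closed. */
static int legacy_getpeername(const struct mptcp_conn *c,
                              struct sockaddr_in *out)
{
    if (c->nflows == 0)
        return -1;             /* not connected */
    *out = c->flows[0].peer;
    return 0;
}

/* Optional fate sharing: a local policy may close the whole
 * connection once the first subflow is gone. */
static bool should_close_connection(const struct mptcp_conn *c,
                                    bool fate_sharing_policy)
{
    return fate_sharing_policy && c->nflows > 0 &&
           !c->flows[0].open;
}
```
          ]]></artwork>
        </figure>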

        <t>Instead of getpeername() or getsockname(), MPTCP-aware
        applications can use new API calls, documented later, in order
        to retrieve the full list of address pairs for the subflows in
        use.</t>

      </section>

      <section title="Usage of Existing Socket Options">

        <t>The existing sockets API includes options that modify the
        behavior of sockets and their underlying communications
        protocols. Various socket options exist on socket, TCP, and IP
        level. The value of an option can usually be set by the
        setsockopt() system function. The getsockopt() function gets
        information.  In general, the existing sockets interface
        functions cannot configure each MPTCP subflow individually. In
        order to be backward compatible, existing APIs therefore
        should apply to all subflows within one connection, as far as
        possible.</t>

<!-- Socket options at TCP level -->

        <t>One commonly used TCP socket option (TCP_NODELAY) disables
        the Nagle algorithm as described in <xref target="RFC1122"/>.
        This option is also specified in the POSIX standard
        <xref target="POSIX"/>. Applications can use this option in
        combination with MPTCP exactly in the same way. It then
        disables the Nagle algorithm for the MPTCP connection, i.e.,
        all subflows.</t>
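
        <t>For reference, the following C fragment shows the usual way
        to set and read back TCP_NODELAY with the standard sockets
        API. Only standard calls are used; the helper names
        disable_nagle() and nagle_disabled() are chosen for this
        example. On an MPTCP-capable stack the same call would be
        expected to affect all subflows, which this fragment cannot
        demonstrate on its own.</t>

        <figure title="Sketch: setting TCP_NODELAY">
          <artwork><![CDATA[
```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Disable the Nagle algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
static int disable_nagle(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
                      &on, sizeof(on));
}

/* Read the option back: 1 if Nagle is disabled, 0 if it is
 * enabled, -1 on error. */
static int nagle_disabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```
          ]]></artwork>
        </figure>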

        <t>TODO: Setting this option could also trigger a different
        path scheduler algorithm - specifically, that which is designed
        for latency-sensitive traffic, as described in a later section.</t>

<!-- Socket options at SOL level: SO_SNDBUF, SO_RCVBUF -->

        <t>Applications can also explicitly configure send and receive
        buffer sizes by the sockets API (SO_SNDBUF, SO_RCVBUF). These
        socket options can also be used in combination with MPTCP and
        then affect the buffer size of the MPTCP connection. However,
        when defining buffer sizes, application programmers should
        take into account that the transport over several subflows
        requires a certain amount of buffer for resequencing. Therefore,
        it does not make sense to use MPTCP in combination with very
        small receive buffers. Small send buffers may prevent MPTCP
        from efficiently scheduling data over different subflows.
        It may be appropriate for an MPTCP implementation to set a
        lower bound for such buffers, or alternatively treat a small
        buffer size request as an implicit request not to use MPTCP.</t>
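
        <t>The two alternatives for handling small buffers can be
        sketched as follows. This C fragment is illustrative only:
        the threshold MPTCP_MIN_RCVBUF, the policy names, and the
        function apply_rcvbuf() are assumptions made for this
        example, and this document does not define a concrete lower
        bound.</t>

        <figure title="Sketch: handling small receive buffers">
          <artwork><![CDATA[
```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed threshold for this sketch only; a real stack would
 * derive it from path properties such as bandwidth and delay. */
#define MPTCP_MIN_RCVBUF ((size_t)64 * 1024)

enum small_buf_policy {
    RAISE_TO_MINIMUM,   /* enforce a lower bound on the buffer */
    FALL_BACK_TO_TCP    /* treat a small buffer as "no MPTCP" */
};

struct buf_decision {
    size_t rcvbuf;      /* buffer size actually applied */
    bool   use_mptcp;   /* whether multipath stays enabled */
};

static struct buf_decision apply_rcvbuf(size_t requested,
                                        enum small_buf_policy policy)
{
    struct buf_decision d = { requested, true };

    if (requested < MPTCP_MIN_RCVBUF) {
        if (policy == RAISE_TO_MINIMUM)
            d.rcvbuf = MPTCP_MIN_RCVBUF;
        else
            d.use_mptcp = false;  /* keep size, use regular TCP */
    }
    return d;
}
```
          ]]></artwork>
        </figure>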

<!-- Other socket options -->

        <t>Some network stacks also provide other
        implementation-specific socket options or interfaces that
        affect TCP's behavior. If a network stack supports MPTCP, it
        must be ensured that these options do not interfere.</t>

<!--  Non-standardized options:

           Linux TCP socket options: TCP_CORK, TCP_DEFER_ACCEPT,
           TCP_INFO, TCP_KEEPCNT, TCP_KEEPIDLE, TCP_KEEPINTVL,
           TCP_LINGER2, TCP_MAXSEG, TCP_NODELAY, TCP_QUICKACK,
           TCP_SYNCNT, TCP_WINDOW_CLAMP
    
           Windows TCP socket options: TCP_BSDURGENT,
           TCP_EXPEDITED_1122, TCP_NODELAY
-->
  
<!-- Other interfaces (e. g., ioctl)? -->

      </section>

      <section title="Default Enabling of MPTCP">

        <t>It is up to a local policy at the end system whether a
        network stack should automatically enable MPTCP for sockets
        even if there is no explicit sign of MPTCP awareness of the
        corresponding application. Such a choice may be under the
        control of the user through system preferences.</t>

      </section>

      <section title="Known Remaining Issues with Legacy Applications">

        <t>TODO: Future experiments will show whether legacy
        applications could break despite the backward-compatible
        API of MPTCP.</t>

      </section>

    </section>

    <section title="Minimal API Enhancements for MPTCP-aware Applications">

      <section title="Indicating MPTCP Awareness">

        <t>While applications can use MPTCP with the unmodified
        sockets API, a clean interface requires small semantic changes
        compared to the existing sockets API. Even if these changes do
        not affect most applications, they are only enabled if an
        application explicitly signals that it supports multipath
        transport and the enhanced interface, in order to maintain
        backward compatibility with legacy applications. An
        application can explicitly indicate multipath capability by
        setting the TCP_MP_ENABLE option described below.</t>

      </section>

      <section title="Modified Address Handling">

        <t>The main change of the sockets API for MPTCP-aware
        applications is as follows: If a socket is MPTCP-aware and
        thus does not use the backward-compatibility mode, the
        functions getpeername() and getsockname() SHOULD fail with a
        new error code EMULTIPATH. Due to their ambiguity, an
        MPTCP-aware application should not use these two
        functions. Instead, the information about the addresses in use
        can be accessed by the extended sockets API, if needed.</t>
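
        <t>The intended behaviour can be modelled as follows. The C
        fragment below is a sketch, not an implementation: EMULTIPATH
        has no assigned value, so the number used here is a
        placeholder, and the structure mp_socket and the function
        mp_getpeername() are hypothetical names that only mimic what
        a stack might do internally.</t>

        <figure title="Sketch: getpeername() on an MPTCP-aware socket">
          <artwork><![CDATA[
```c
#include <errno.h>
#include <netinet/in.h>
#include <stdbool.h>

/* EMULTIPATH is named in the text but has no assigned value; the
 * number below is a placeholder for this sketch. */
#ifndef EMULTIPATH
#define EMULTIPATH 9001
#endif

struct mp_socket {
    bool               mp_aware;    /* set via TCP_MP_ENABLE */
    struct sockaddr_in first_peer;  /* peer of the first subflow */
};

/* Model of getpeername() on an MPTCP-capable stack. */
static int mp_getpeername(const struct mp_socket *s,
                          struct sockaddr_in *out)
{
    if (s->mp_aware) {
        errno = EMULTIPATH;   /* ambiguous for MPTCP-aware sockets */
        return -1;
    }
    *out = s->first_peer;     /* backward-compatibility mode */
    return 0;
}
```
          ]]></artwork>
        </figure>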

      </section>

      <section title="Usage of a New Address Family">

        <t>As an alternative to setting a socket option, an application
        can also use a new, separate address family called
        AF_MULTIPATH
        <xref target="I-D.sarolahti-mptcp-af-multipath"/>.  This
        separate address family can be used to exchange multiple
        addresses between an application and the standard sockets API,
        and additionally acts as an explicit indication that an
        application is MPTCP-aware, i.e., that it can deal with the
        semantic changes of the sockets API, in particular concerning
        getpeername() and getsockname(). The usage of AF_MULTIPATH is
        also more flexible with respect to multipath transport, either
        IPv4 or IPv6, or both in parallel
        <xref target="I-D.sarolahti-mptcp-af-multipath"/>.</t>

      </section>

    </section>

    <section title="Extended MPTCP API">

      <section title="MPTCP Usage Scenarios and Application Requirements">

        <t>Applications that use TCP may have different requirements on
        the transport layer.  While developers have become used to the
        characteristics of regular TCP, new opportunities created by
        MPTCP could allow the service provided to be optimised
        further.  An extended API enables MPTCP-aware applications to
        specify preferences and control certain aspects of the
        behavior, in addition to the simple controls already discussed,
        such as switching on or off the automatic use of MPTCP.</t>

        <t>An application that wishes to transmit bulk data will want
        MPTCP to provide a high throughput service immediately, through 
        creating and maximising utilisation of all available subflows.  
        This is the default MPTCP use case.</t>

        <t>But at the other extreme, there are applications that are
        highly interactive, but require only a small amount of
        throughput, and these are optimally served by low latency and
        jitter stability.  In such a situation, it would be preferable 
        for the traffic to use only the lowest latency subflow (assuming 
        it has sufficient capacity), with one or two additional subflows
        for resilience and recovery purposes.</t>

        <t>The choice between these two options affects the scheduler
        in terms of whether traffic should be, by default, sent on one
        subflow or across all subflows. Even if the total bandwidth required is
        less than that available on an individual path, it is desirable
        to spread this load to reduce stress on potential bottlenecks, 
        and this is why this method should be the default. It is recognised,
        however, that this may not benefit all applications that require
        latency/jitter stability, so the other (single path) option is
        provided.</t>

        <t>In the case of the latter option, however, a further question
        arises: should additional subflows be used whenever the primary 
        subflow is overloaded, or only when the primary path fails 
        (hot-standby)? In other words, is latency stability or bandwidth 
        more important to the application?</t>

        <t>We therefore divide this option into two: Firstly, there is the
        single path which can overflow into an additional subflow; and 
        secondly there is single-path with hot-standby, whereby an 
        application may want an alternative backup subflow in 
        order to improve resilience.  If data delivery on the
        first subflow fails, the data transport could immediately be
        continued on the second subflow, which is idle otherwise.</t>

        <t>In summary, there are three different "application profiles"
        concerning the use of MPTCP:

          <list style="numbers">
            <t>Bulk data transport</t>
            <t>Latency-sensitive transport (with overflow)</t>
            <t>Latency-sensitive transport (hot-standby)</t>
          </list>
        </t>
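
        <t>One possible way in which a scheduler could act on these
        three profiles is sketched below in C. This is an assumption
        made for illustration, not a specified algorithm: the type
        and function names are hypothetical, flows[0] is assumed to
        be the primary (lowest-latency) subflow, and the bulk profile
        is modelled as "send on the lowest-RTT subflow that currently
        has window available", which is only one of several
        conceivable policies.</t>

        <figure title="Sketch: subflow selection per application profile">
          <artwork><![CDATA[
```c
#include <stdbool.h>
#include <stddef.h>

enum mptcp_profile {
    PROFILE_BULK,          /* 1. bulk data transport */
    PROFILE_LAT_OVERFLOW,  /* 2. latency-sensitive, with overflow */
    PROFILE_LAT_STANDBY    /* 3. latency-sensitive, hot-standby */
};

struct subflow_state {
    bool     alive;        /* subflow has not failed */
    bool     has_window;   /* congestion window allows sending */
    unsigned rtt_ms;       /* smoothed round-trip time estimate */
};

/* Pick the subflow for the next segment; flows[0] is the primary
 * subflow. Returns an index, or -1 if nothing may be sent now. */
static int pick_subflow(enum mptcp_profile p,
                        const struct subflow_state *flows, size_t n)
{
    int best = -1;

    if (n == 0)
        return -1;
    if (p != PROFILE_BULK && flows[0].alive) {
        if (flows[0].has_window)
            return 0;              /* stay on the primary */
        if (p == PROFILE_LAT_STANDBY)
            return -1;             /* wait; backups stay idle */
    }
    /* Bulk, overflow, or failed primary: lowest-RTT usable one. */
    for (size_t i = 0; i < n; i++) {
        if (!flows[i].alive || !flows[i].has_window)
            continue;
        if (best < 0 || flows[i].rtt_ms < flows[best].rtt_ms)
            best = (int)i;
    }
    return best;
}
```
          ]]></artwork>
        </figure>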

        <t>These different application profiles affect both the
        management of subflows, i.e., the decisions about when to set
        up additional subflows and to which addresses, and the
        assignment of data (including retransmissions) to the existing
        subflows. In both cases different policies can exist.</t>

        <t>These profiles have been defined to cover the common 
        application use cases. It is not possible to cover all 
        application requirements, however, and as such applications
        may wish to have finer control over subflows and packet
        scheduling. A set of requirements is listed below.</t>

        <t>Although it is intended that such functionality will be 
        achieved through new MPTCP-specific options, it may also be 
        possible to infer some application preferences from existing 
        socket options, such as TCP_NODELAY. Whether this would be 
        reliable, and indeed appropriate, is for further study.</t>

      </section>

      <section title="Requirements on API Extensions">

        <t>Because of the importance of the sockets interface there
        are several fundamental design objectives for the interface
        between MPTCP and applications:

          <list style="symbols">
            <t>Consistency with existing sockets APIs must be
            maintained as far as possible. In order to support the
            large base of applications using the original API, a
            legacy application must be able to continue to use
            standard socket interface functions when run on a system
            supporting MPTCP. Also, MPTCP-aware applications should be
            able to access the socket without any major changes.</t>

            <t>Sockets API extensions must be minimized and independent
            of an implementation.</t>

            <t>The interface should handle both IPv4 and IPv6.</t>
          </list>
        </t>

        <t>The following is a list of specific requirements from
        applications:</t>

        <t>TODO: This list of requirements is preliminary and requires
        further discussion. Some requirements have to be removed.</t>

        <t><list style="format REQ%d:" counter="reqs">

          <t>Turn on/off MPTCP: An application should be able to
          request that MPTCP be turned on or off. This means that an
          application should be able to explicitly request the use of
          MPTCP where this is possible. An application should also be
          able to request that MPTCP not be enabled and that regular
          TCP transport be used instead. This can be implicit in
          many cases, e.g., since MPTCP must be disabled when an
          application binds to a specific address, or may be enabled
          if an application uses AF_MULTIPATH.</t>

<!-- ALAN: I have removed this one, I do not feel it is an API issue

          <t>An application should be able to control MPTCP's behavior
          if the first subflow is closed, i.e., whether to close the
          whole MPTCP connection, or not (address agility/fate sharing).
          </t>
 -->

          <t>An application should be able to restrict MPTCP to a
          given set of addresses or interfaces.</t>

          <t>An application should be able to know if multiple
          subflows are in use.</t>

          <t>An application should be able to enumerate all subflows
          in use, obtain information on the addresses used by a
          subflow, and obtain a subflow's usage (e.g., ratio of
          traffic sent via this subflow).</t>

<!--      Michael: There seems to be no agreement concerning the need for
          a connection identifier so far! -->

          <t>An application should be able to extract a unique
          identifier for the connection (per endpoint), analogous to a
          port, i.e., it should be able to retrieve MPTCP's connection
          identifier. (TODO)</t>
    
          <t>Set/get the application profile, as discussed in the
          previous section.</t>

        </list></t>

        <t>The above requirements have fairly clear benefits for
        applications. Although in some cases they go beyond what
        regular TCP provides, they allow an application to make
        optimal use of the new features that MPTCP provides.</t>
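
        <t>The first requirement notes that binding to a specific
        address implicitly disables MPTCP. The following minimal
        sketch shows how a stack or a wrapper library might classify
        a bind address for both IPv4 and IPv6. The function name
        mp_bind_allows_multipath() is hypothetical, and the rule that
        only a wildcard bind leaves MPTCP available is an assumption
        derived from that requirement.</t>

        <figure><artwork><![CDATA[
```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Return 1 if 'sa' is a wildcard address (INADDR_ANY or the IPv6
 * unspecified address), 0 if it names a specific address, and -1
 * if the address family is neither AF_INET nor AF_INET6.
 * ASSUMPTION: only a wildcard bind leaves the stack free to use
 * MPTCP; a specific address selects regular TCP.
 */
int mp_bind_allows_multipath(const struct sockaddr *sa)
{
    assert(sa != NULL);

    switch (sa->sa_family) {
    case AF_INET: {
        const struct sockaddr_in *in4 =
            (const struct sockaddr_in *)sa;
        return in4->sin_addr.s_addr == htonl(INADDR_ANY) ? 1 : 0;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *in6 =
            (const struct sockaddr_in6 *)sa;
        return IN6_IS_ADDR_UNSPECIFIED(&in6->sin6_addr) ? 1 : 0;
    }
    default:
        return -1;
    }
}
```
]]></artwork></figure>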

        <t>The following requirements are more specific, and most of
        them could be implied by more generic options, such as the
        application profile selection. They are nevertheless included
        here as potential discussion points, since application
        developers may find them useful as explicit configuration
        options rather than only as an implicit part of a profile
        selection.</t>

        <t><list style="format REQ%d:" counter="reqs">

          <t>Constrain the maximum number of subflows to be used by an
          MPTCP connection.</t>

          <t>Request a change in scheduling between subflows.</t>

          <t>Request a change in the number of subflows in use, thus
          triggering removal or addition of subflows. (A finer control
          granularity would be: Request the establishment of a new
          subflow to a provided destination, and request the termination
          of a specified, existing subflow.)</t>

<!--      
          <t>TODO: Request establishment of a new subflow to a
          provided address.</t>

          <t>TODO: Request termination of a specified, existing subflow.</t>
 -->

          <!-- Alan: I don't like this level of application control -->

          <!-- Michael: I'd like to hear some other opinions on this.
          This design choice also depends on other API functions. For
          instance, if the app can assign some kind of preference to
          subflows, there might be no need for an explicit teardown
          function. The app can just set a large weight to its
          favorite subflows and then request a reduction of the number
          of subflows; this should have the same effect ;) -->

          <t>Control automatic establishment/termination of
          subflows?  There could be different configurations of the
          path manager, e.g., 'try ASAP', 'wait until there is a bunch
          of data', etc. (Tied to application profile?)</t>

          <t>Set/get preferred subflows or subflow usage policies?
          There could be different configurations of the multipath scheduler,
          e.g., 'all-or-nothing', 'overflow', etc. (Again, tied to
          application profile?)</t>

          <t>Get/set redundancy, i.e., whether segments are sent on
          more than one path in parallel.</t>

<!--
          <t>Set/get sporadic sending of segments on unused paths
          ("keepalives").</t>
-->

          <t>An application should be able to modify the MPTCP
          configuration while communication is ongoing, i.e., after
          establishment of the MPTCP connection.</t>

<!--      Any other MPTCP features? -->

        </list></t>

      </section>

      <section title="Design Considerations">

        <t>Multipath transport results in many degrees of freedom.
        MPTCP manages the data transport over different subflows
        automatically. By default, this is transparent to the
        application. But applications can use the sockets API
        extensions defined in this section to interface with the MPTCP
        layer and to control important aspects of the MPTCP
        implementation's behaviour. The API uses non-mandatory socket
        options and is designed to be as light-weight as possible.</t>

        <t>MPTCP mainly affects the sending of data. Therefore, most
        of the new socket options must be set on the sender side of a
        data transfer in order to take effect. Nevertheless, a
        receiver may also have preferences about data transfer
        choices, as it too may have performance requirements. (TODO)
        Whether it is feasible for a receiving application to
        influence the sending policy, and if so, how this could be
        implemented, is for further study.</t>

        <t>As this document specifies sockets API extensions, it is
        written so that the syntax and semantics are in line with the
        POSIX standard <xref target="POSIX"/> as much as possible.</t>

      </section>

      <section title="Overview of Sockets Interface Extensions">

        <t>The extended MPTCP API consists of several new socket
        options that are specific to MPTCP.  All of these socket
        options are defined at the TCP level (IPPROTO_TCP). They can
        be used with the getsockopt() and/or the setsockopt() system
        call.</t>

        <t>The new API functions can be classified into general
        configuration and more advanced configuration. The new socket
        options for the general configuration of MPTCP are:</t>

        <t><list style="symbols">
          <t>TCP_MP_ENABLE: Enable/disable MPTCP</t>
          <t>TCP_MP_SUBFLOWS: Get the addresses currently used by
          the MPTCP subflows, optionally complemented by further
          information such as usage ratio</t>
          <t>TCP_MP_PROFILE: Get/set the MPTCP profile</t>
          <t>...</t>
        </list></t>

        <t>Table <xref target="tab_options"/> shows a list of the
        socket options for the general configuration of MPTCP.  The
        first column gives the name of the option. The second and
        third columns indicate whether the option can be handled by
        the getsockopt() system call and/or by the setsockopt() system
        call. The fourth column lists the type of data structure
        specified along with the socket option.</t>

<!--
        TODO: Distinguish before/after connect socket call?
-->   

        <texttable anchor="tab_options" title="Socket options for MPTCP">
          <ttcol align='left'>Option name</ttcol>
          <ttcol align='center'>Get</ttcol>
          <ttcol align='center'>Set</ttcol>
          <ttcol align='center'>Data type</ttcol>

          <c>TCP_MP_ENABLE</c>
          <c>o</c>
          <c>o</c>
          <c>int</c>

          <c>TCP_MP_SUBFLOWS</c>
          <c>o</c>
          <c></c>
          <c>*1</c>

          <c>TCP_MP_PROFILE</c>
          <c>o</c>
          <c>o</c>
          <c>int</c>

          <c>...</c>
          <c></c>
          <c></c>
          <c></c>

          <postamble>*1: Data structure containing the addresses of each subflow, plus further information</postamble>
        </texttable>

        <t>TODO: More options may be added in a future version of this
        note.</t>

      </section>

      <section title="Detailed Description">

        <section title="TCP_MP_ENABLE">

          <t>TODO: Description</t>
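
          <t>Pending that description, the following is a minimal
          sketch of how an MPTCP-aware application might use this
          option, assuming the int-valued semantics listed in the
          overview table. No numeric value has been assigned to
          TCP_MP_ENABLE; the constant below is a placeholder, and
          mp_request_enable() is a hypothetical helper. Since the
          socket options are non-mandatory, the sketch treats an
          unsupported option as a fallback to regular TCP rather than
          as an error.</t>

          <figure><artwork><![CDATA[
```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * ASSUMPTION: no numeric value has been assigned to this option.
 * The placeholder below only lets the sketch compile; a stack
 * without the option is expected to reject it.
 */
#ifndef TCP_MP_ENABLE
#define TCP_MP_ENABLE 0x4d50
#endif

/*
 * Ask the stack to turn MPTCP on (on != 0) or off (on == 0).
 * Returns 1 if the request was accepted, 0 if the option is not
 * supported (the socket keeps using regular TCP), and -1 on any
 * other error, with errno set by setsockopt().
 */
int mp_request_enable(int fd, int on)
{
    int value = on ? 1 : 0;

    if (setsockopt(fd, IPPROTO_TCP, TCP_MP_ENABLE,
                   &value, sizeof(value)) == 0)
        return 1;
    if (errno == ENOPROTOOPT || errno == EOPNOTSUPP)
        return 0;
    return -1;
}
```
]]></artwork></figure>

          <t>A caller would typically invoke such a helper before
          connect() and continue with the same socket regardless of
          whether the result is 1 or 0.</t>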

        </section>

        <section title="TCP_MP_SUBFLOWS">

          <t>TODO: Description</t>
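
          <t>The overview table only states that this option returns
          a data structure containing the addresses of each subflow,
          plus further information. The following sketch shows one
          possible shape of such a structure and how a usage ratio
          could be derived from it. The struct mp_subflow_info
          layout, its field names, and mp_usage_percent() are all
          hypothetical; how the array would be filled by getsockopt()
          is deliberately left out.</t>

          <figure><artwork><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* ASSUMPTION: hypothetical layout; none has been defined yet. */
struct mp_subflow_info {
    struct sockaddr_storage local;   /* local subflow address */
    struct sockaddr_storage remote;  /* remote subflow address */
    uint64_t bytes_sent;             /* bytes sent on this subflow */
};

/*
 * Share of subflow 'idx' in all bytes sent, in percent, rounded
 * down. Returns 0 if nothing has been sent on any subflow yet.
 * The multiplication assumes bytes_sent stays below 2^64 / 100.
 */
unsigned mp_usage_percent(const struct mp_subflow_info *sf,
                          size_t n, size_t idx)
{
    uint64_t total = 0;

    assert(sf != NULL && idx < n);
    for (size_t i = 0; i < n; i++)
        total += sf[i].bytes_sent;
    if (total == 0)
        return 0;
    return (unsigned)(sf[idx].bytes_sent * 100 / total);
}
```
]]></artwork></figure>

          <t>With such a structure, an application could also tell
          whether multiple subflows are in use simply by checking
          whether more than one entry is returned.</t>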

        </section>

        <section title="TCP_MP_PROFILE">

          <t>TODO: Description</t>
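
          <t>Since the overview table lists this option as an int
          that can be both read and written, an application could
          change the profile during a transfer and restore it later.
          The following sketch illustrates that pattern. The numeric
          value of TCP_MP_PROFILE is a placeholder, profile values
          are treated as opaque integers, and mp_switch_profile() is
          a hypothetical helper.</t>

          <figure><artwork><![CDATA[
```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* ASSUMPTION: placeholder value; no number has been assigned. */
#ifndef TCP_MP_PROFILE
#define TCP_MP_PROFILE 0x4d51
#endif

/*
 * Read the current profile, then select 'new_profile'.
 * On success, returns 0 and stores the previous profile in
 * *old_profile so that the caller can restore it later.
 * On failure, returns -1 and leaves *old_profile untouched.
 */
int mp_switch_profile(int fd, int new_profile, int *old_profile)
{
    int current = 0;
    socklen_t len = sizeof(current);

    assert(old_profile != NULL);
    if (getsockopt(fd, IPPROTO_TCP, TCP_MP_PROFILE,
                   &current, &len) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_MP_PROFILE,
                   &new_profile, sizeof(new_profile)) != 0)
        return -1;
    *old_profile = current;
    return 0;
}
```
]]></artwork></figure>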

        </section>

      </section>

      <section title="Usage examples">

<!--
        <t>In this section, we describe the usage of the API using the
        syntax of the C programming language. We limit the description
        to the most important interfaces and data structures that are
        either modified or completely new because the MPTCP's API is
        otherwise identical to the sockets API
        <xref target="POSIX"/>.</t>
-->

        <t>TODO: Example C code for one or more API functions</t>

      </section>

      <section title="Interactions and Incompatibilities with other Multihoming Solutions">

        <t>The use of MPTCP can interact with various related sockets
        API extensions. Care should be taken not to confuse the MPTCP
        API with the overlapping features of these extensions:</t>

        <t><list style="symbols">

          <t>SHIM API
          <xref target="I-D.ietf-shim6-multihome-shim-api"/>: This API
          specifies sockets API extensions for the multihoming shim
          layer.</t>

          <t>HIP API <xref target="I-D.ietf-hip-native-api"/>: The
          Host Identity Protocol (HIP) also results in a new API.</t>

        </list></t>

        <t>The use of a multihoming shim layer conflicts with
        multipath transport such as MPTCP or SCTP
        <xref target="I-D.ietf-shim6-multihome-shim-api"/>. In order
        to avoid any conflict, multiaddressed MPTCP SHOULD NOT be
        enabled if a network stack uses SHIM6 or HIP.  Furthermore,
        applications should not try to use both the MPTCP API and a
        multihoming shim layer API.  It is feasible, however, that
        some of the MPTCP functionality, such as congestion control,
        could be used in a SHIM6 or HIP environment. Such operation is
        outside the scope of this document.</t>

      </section>

      <section title="Other Advice to Application Developers">

        <t><list style="symbols">

          <t>Using the default MPTCP configuration: MPTCP is designed
          to be efficient and robust in the default
          configuration. Application developers should not explicitly
          configure features unless this is really needed.</t>

          <t>Socket buffer dimensioning: Multipath transport requires
          larger buffers in the receiver for resequencing, as already
          explained. Applications should use reasonable buffer sizes
          (such as the operating system default values) in order to
          fully benefit from MPTCP.</t>

        </list></t>
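
        <t>The buffer advice above can be illustrated with the
        standard SO_RCVBUF socket option. The following minimal
        sketch only ever enlarges the receive buffer and never
        shrinks it below the value the operating system chose. The
        helper name mp_grow_rcvbuf() is hypothetical; the calls
        themselves are plain POSIX sockets API.</t>

        <figure><artwork><![CDATA[
```c
#include <sys/socket.h>

/*
 * Ask for a receive buffer of 'wanted' bytes, but only if that is
 * larger than the current size; the buffer is never shrunk.
 * Returns the SO_RCVBUF value in effect afterwards, which may
 * differ from 'wanted' because the system may cap or adjust it,
 * or -1 on error.
 */
int mp_grow_rcvbuf(int fd, int wanted)
{
    int current = 0;
    socklen_t len = sizeof(current);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &current, &len) != 0)
        return -1;
    if (wanted > current &&
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &wanted, sizeof(wanted)) != 0)
        return -1;

    len = sizeof(current);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &current, &len) != 0)
        return -1;
    return current;
}
```
]]></artwork></figure>

        <t>In many cases the simplest way to follow the advice is not
        to set SO_RCVBUF at all and to rely on the operating system
        default.</t>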

      </section>

    </section>

    <section title="Security Considerations">

      <t>Will be added in a later version of this document.</t>

    </section>

    <section title="IANA Considerations">

      <t>No IANA considerations.</t>

    </section>

    <section title="Conclusion">

      <t>This document discusses MPTCP's application implications and
      specifies an extended API. From an architectural point of view,
      MPTCP offers additional degrees of freedom concerning the
      transport of data. The extended sockets API allows MPTCP-aware
      applications to have additional control of some aspects of the
      MPTCP implementation's behaviour and to obtain information about
      its usage.  The new socket options for MPTCP can be used with
      the getsockopt() and/or setsockopt() system calls. The existing
      sockets API continues to work for legacy applications.</t>

    </section>

    <section title="Acknowledgments">

      <t>The authors sincerely thank the following people for their
      helpful comments on the document: Costin Raiciu.</t>

      <t>Michael Scharf is supported by the German-Lab project
      (http://www.german-lab.de/) funded by the German Federal
      Ministry of Education and Research (BMBF). Alan Ford is
      supported by Trilogy (http://www.trilogy-project.org/), a
      research project (ICT-216372) partially funded by the European
      Community under its Seventh Framework Program. The views
      expressed here are those of the author(s) only. The European
      Commission is not liable for any use that may be made of the
      information in this document.</t>

    </section>

  </middle>

  <back>

    <references title="Normative References">

      &RFC0793;
      &RFC1122;
      &RFC2119;
      &MPTCPARCH;
      &MPTCP;
      &MPTCPSEC;
      &MPTCPCC;

      <reference anchor="POSIX">
        <front>
          <title>IEEE Std. 1003.1-2008 Standard for Information Technology --
          Portable Operating System Interface (POSIX). Open Group
          Technical Standard: Base Specifications, Issue 7, 2008.</title>
        </front>
      </reference>	

    </references>

    <references title="Informative References">

      &AFMP;
      &RFC3542;
      &SHIMAPI;
      &HIPAPI;
      &SCTPAPI;
      &MIFPRACTICE;

    </references>

    <section title="Change History of the Document">

      <t>Changes compared to version 00:</t>

      <t><list style="symbols">

        <t>Distinction between legacy and MPTCP-aware applications</t>

        <t>Guidance concerning default enabling, reaction to the
        shutdown of the first subflow, etc.</t>

        <t>Reference to a potential use of AF_MULTIPATH</t>

        <t>Additional references to related work</t>

      </list></t>

    </section>

  </back>

</rfc>
