One document matched: draft-williams-overlaypath-ip-tcp-rfc-02.xml
<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc strict="yes"?>
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<rfc ipr="trust200902" category="std">
<front>
<title>Overlay Path Option for IP and TCP</title>
<author initials="B." surname="Williams" fullname="Brandon Williams">
<organization>Akamai, Inc.</organization>
<address>
<postal>
<street></street>
<city>Cambridge</city>
<region>MA</region>
<code></code>
<country>USA</country>
</postal>
<email>brandon.williams@akamai.com</email>
</address>
</author>
<date month="September" year=" 2012" />
<abstract>
<t>Data transport through overlay networks often uses either connection
termination or network address translation (NAT) in such a way that the
public IP addresses of the true endpoint machines involved in the data
transport are invisible to each other. This document describes IPv4,
IPv6, and TCP options for communicating this information from the
overlay network to the endpoint machines.</t>
</abstract>
</front>
<middle>
<section anchor="intro" title="Introduction">
<t>An overlay network is a network of machines distributed throughout
multiple autonomous systems within the public Internet that can be
used to improve the performance of data transport. IP packets from the
sender are delivered first to one of the machines that make up the
overlay network. That machine will relay the IP packets via one or
more other machines in the overlay network, applying various
performance enhancement methods and eventually delivering the packets
to the real intended receiver. Such overlay networks are used to
improve the performance of content delivery
<xref target="IEEE1344002" />. Overlay networks are also used for
peer-to-peer data transport <xref target="RFC5694" />, and they have
been suggested for use in improved scalability for the internet
routing infrastructure <xref target="RFC6179" />.</t>
<t>Data transport through such an overlay network will often use network
address translation for the source (SNAT) or destination (DNAT)
addresses <xref target="RFC2663" /> in such a way that the public IP
addresses of the true endpoint machines involved in the data
transport are invisible to each other. For example, the actual sender
and receiver may use two completely different pairs of source and
destination addresses to identify the connection on the sending and
receiving networks.</t>
<figure anchor="intro_figure">
<artwork>
---------------------------------------------------------------------
ip hdr contains: ip hdr contains:
SENDER -> src = sender --> OVERLAY --> src = overlay2 --> RECEIVER
dst = overlay1 dst = receiver
---------------------------------------------------------------------
</artwork>
</figure>
<t>This can be problematic for network security and diagnostics, since
the source and destination IP addresses used by the sender are not
accessible by the receiver.</t>
<t>Some application layer protocols provide functionality that allows
the overlay network to communicate the sender's public IP address to
the receiver, such as the HTTP X-FORWARDED-FOR header field. However,
use of such application specific options requires both the overlay
network and the receiver to perform application layer processing with
associated overhead. It also requires separate implementations for
each supported application type.</t>
<t>Some proprietary implementations use IP-in-IP or UDP tunneling in
order to communicate the information in question, but these solutions
require continuous network overhead throughout the life of the
connection.</t>
<t>In order to limit the network and processing overhead associated with
the other commonly used approaches, the mechanism described herein
uses an internet or transport layer protocol option to communication
the required address(es).</t>
<t>Theoretically, the existing IP Record Route option could be used to
provide the required information, but there are a few problems
associated with this approach. First, this use would be a re-purposing
of the option, and thus existing implementations of support for the
option would quite possibly conflict with this new use. More
importantly, many IP-layer devices are configured to drop packets that
include IP protocol options. Allowing the information to be
transmitted at either the internet layer or the transport layer allows
for more reliable packet delivery in a broad range of network
configurations.</t>
<t>This document describes <xref target="RFC0791">IPv4</xref>,
<xref target="RFC2460">IPv6</xref>, and
<xref target="RFC0793">TCP</xref> options for communicating this
information from the overlay to the destination network/machine. The
list of addresses specified in the option may include any addresses
that might be useful to the eventual receiver for security and/or
diagnostic purposes, including but not limited to the source and
destination addresses as seen by the sender.</t>
<t>The IPv4 and TCP protocol options described herein provide limited
support for IPv6 addresses. It should be noted that inclusion of even
a single IPv6 address would require the option to consume nearly half
of the total option space provided by TCP and IPv4, which means that
the entire TCP option space would be consumed for SYN packets that
include the most commonly used options (i.e. MSS, WSOPT, SACK
permitted, and TSOPT). This may prevent effective use of the IPv4 and
TCP protocol options for communicating IPv6 addresses in some
circumstances.</t>
</section>
<section anchor="use_case" title="Detailed Use Case">
<t>An example use case that is satisfied by this design is as
follows.
<list style="numbers">
<t>The sending endpoint host performs a DNS lookup that returns the
IP address of a machine on the overlay network. The sending
endpoint addresses its packets with its own public IP address as
the source and the provided overlay IP address as the
destination.</t>
<t>The overlay ingress host receives the packet on its public IP
address, and forwards the packet through the overlay to the egress
host. The overlay egress host performs SNAT, replacing the source
IP address of the sending endpoint with its own IP address in
order to ensure that return traffic will also use the overlay
network. This use of SNAT hides the client's public IP address for
the receiving network.</t>
<t>For load balancing and diagnostic purposes, it is important for
one or more machines on the receiver's network to be able to
determine the public IP address associated with the sending host
(i.e. the address that was hidden due to the use of SNAT by the
overlay).</t>
<t>The overlay egress host is located on the receiver's network,
which means there is a relatively small set of addresses for
machines that may be producing packets that include the overlay
path option.</t>
</list>
</t>
<t>Under these circumstances, the overlay path option will contain a
single IP address: the public IP address of the sending host. If the
receiving network must use the IP address included in the option for a
purpose that requires trust, the fact that the overlay egress host is
under the receiver's administrative control allows the receiver to
apply the necessary limitations to the network configuration. For
example, under these circumstances, the receiver's firewall device
could be configured to drop packets from external hosts if they
contain the overlay path option.</t>
</section>
<section anchor="format" title="Option Format">
<t>Some implementations already exist for version 1 of the overlay path
option. However, version 1 of the option does not provide support for
communicating IPv6 addresses in either the IPv4 or TCP option. Both
version 1 and version 2 of the option are described here in order to
reflect the requirements of current and future implementors.</t>
<t>It is up to the implementor whether version 1 is supported or both
versions are supported. A receiving implementation that supports
version 2 MUST also support version 1. The format changes defined for
version 2 directly support the required backward compatibility.</t>
<t>When a receiving implementation encounters the overlay path option
with an unsupported version number, the receiver MAY either ignore the
option or drop the packet. The appropriate response will be dependent
upon how the overlay path option's value is used by the receiver.</t>
<section anchor="format_v1" title="Version 1">
<t>Version 1 of the option supports only IPv4 addresses. The option
format for both IPv4 and TCP is identical.</t>
<figure anchor="layout_v1">
<artwork>
+---------+---------+---------+--------------------------------+
|Type/Kind| Length | Version | Addresses ...
+---------+---------+---------+--------------------------------+
1 1 1 4 x Address Count
----------------------------------------------------------------
</artwork>
</figure>
<t><list style="hanging">
<t hangText="IPv4 Type:">The type value for IPv4 is TBD-IP4-FULL
(see also <xref target="iana">IANA Considerations</xref>).
<list style="hanging">
<t hangText="Copied flag:"> 1 (All fragments must carry the
option.)</t>
<t hangText="Option class:"> 2 (debugging/measurement)</t>
<t hangText="Option number:">TBD-IP4 (decimal)</t>
</list>
</t>
<t hangText="TCP Kind:">The option kind value for TCP is TBD-TCP
(see also <xref target="iana">IANA Considerations</xref>).</t>
<t hangText="Length:">The length of the option is variable, based
on the number of addresses provided. The minimum value is 7 (3
1-octet fields plus one 4-octet address). The option MUST be
ignored if the length value cannot represent 3 octets plus a
list of 4-octet address value.</t>
<t hangText="Version:">The version number is 1.</t>
<t hangText="Addresses:">Version 1 of the option supports only IPv4
addresses. The remainder of the option space is filled with
standard 32-bit IPv4 addresses. In practice, the first address
will be the public source address used by the sender and the
second address (if present) will be the public destination
address used by the sender. However, the nature of the
addresses provided may vary depending on the nature of the
overlay network in question and is not required to include every
IP address used for the connection. The list of IP addresses
MUST be provided in order of traversal from sender to
receiver.</t>
</list></t>
</section>
<section anchor="format_v2" title="Version 2">
<t>Version 2 of the options supports either IPv4 addresses or IPv6
addresses, but it does not support a mix of IPv4 and IPv6 options
within the same option value. Version 2 provides not only IPv4 and
TCP options, but also an IPv6 option for inclusion in the IPv6
Hop-by-hop Options header. When IPv6 address support is required,
the implementation SHOULD use the IPv6 header option whenever
possible in order to avoid exhaustion of the TCP option space. The
option format for all three protocols is identical.</t>
<figure anchor="layout_v2_ipv6">
<artwork>
+---------------+---------------+---------------+-------------------\
| Type/Kind | Length |Fmly| Version | Addresses ...
+---------------+---------------+---------------+-------------------\
8b 8b | 3b 5b |
-----------------
1 1 1 Addr Size x Count
---------------------------------------------------------------------
</artwork>
</figure>
<t><list style="hanging">
<t hangText="IPv4 Type:">Identical to Version 1.</t>
<t hangText="TCP Kind:">Identical to Version 1.</t>
<t hangText="IPv6 Type:">The Type value for IPv6 is TBD-IP6
(see also <xref target="iana">IANA Considerations</xref>).
<list style="hanging">
<t hangText="act flag:">00 (skip over option)</t>
<t hangText="chg flag:"> 0 (option data does not change
en-route)</t>
<t hangText="rest:"> TBD-IP6 (decimal)</t>
</list>
</t>
<t hangText="Length:">The length of the option is variable, based
on the address family and the number of addresses provided. The
minimum value is 7 (3 1-octet fields plus one 4-octet IPv4
address). The option MUST be ignored if the length value cannot
represent 3 octets plus a list of addresses of the correct
address family.</t>
<t hangText="Family/Version:">The third octet is comprised of two
fields: family and version.Note that the possible family values
have been selected to support backward compatibility with the
8-bit version field in version 1 of the option format.</t>
<t hangText="Family:">The high order 3 bits of the third octet
indicate the address family for all IP addresses represented in
the variable-length Addresses field. The allowed values are:
<list style="hanging">
<t hangText="0:">Address family is IPv4.</t>
<t hangText="1:">Address family is IPv6.</t>
</list>
</t>
<t hangText="Version:">The low order 5 bits of the third octet
indicate the protocol version number. The version number is
2.</t>
<t hangText="Addresses:">The remainder of the option space is
filled with either 32-bit IPv4 or 128-bit IPv6 addresses, as
indicated by the Family field. In practice, the first address
will be the public source address used by the sender and the
second address (if present) will be the public destination
address used by the sender. However, the nature of the
addresses provided may vary depending on the nature of the
overlay network in question and is not required to include every
IP address used for the connection. The list of IP addresses
MUST be provided in order of traversal from sender to
receiver.</t>
</list></t>
</section>
</section>
<section anchor="net_traversal" title="Network Traversal">
<figure anchor="net_traversal_figure">
<preamble>
The following block diagram illustrates the use of addresses in the
IPv4 header and the overlay path option as a packet traverses the
network from sender to receiver. The diagram assumes that the
overlay network uses separate addresses (overlay1 and overlay2) for
ingress and egress.
</preamble>
<artwork>
-----------------------------------------------------------------
SENDER
|
V
+----------------+
| |
| src: sender |
| dst: overlay1 |
| opt: none |
| |
+----------------+
|
V
OVERLAY
NETWORK
|
V
+----------------+
| |
| src: overlay2 |
| dst: receiver |
| opt: sender |
| |
+----------------+
|
V
RECEIVER
-----------------------------------------------------------------
</artwork>
</figure>
</section>
<section anchor="option_use" title="Option Use">
<t>Use of the TCP option allows an implementation to minimize the impact
of this option on bandwidth utilization. Due to the
connection-oriented nature of TCP, the addresses used by the overlay
network cannot change throughout the life of the connection. For this
reason, it is not necessary for the overlay network to include the
overlay path option on every packet. On the other hand, it is not
enough for the option to be provided exclusively in the TCP SYN packet
because the use of SYN cookies, for example, would mean that
connection state is not stored until completion of the three-way
handshake. For this reason, the overlay network MUST include the TCP
overlay path option in every outgoing packet until the receiver has
either acknowledged or transmitted at least one byte of real data. The
overlay network SHOULD discontinue inclusion of the TCP overlay path
option after the first byte is either received or acknowledged. The
receiver MAY ignore the TCP overlay path option on subsequent packets
after successfully processing one instance of the option attached to a
single in-order TCP packet.</t>
<t>IP is not connection oriented, which means that the above described
optimization is not possible. In order to make effective use of the
TCP optimization, an overlay network SHOULD only send the IP option on
packets that do not use TCP as the transport layer protocol. When the
IP option is in use, the overlay network MUST transmit the option with
every packet. The receiver MUST NOT assume that that addresses in the
IP overlay path option will remain consistent, but instead MUST be
prepared to handle address changes in an application appropriate
way.</t>
<t>Use of the IP option is dependent upon support for IP options in all
routers between the overlay egress point and the packet receiver. If
any router along the path is configured to drop packets with unknown
IPv4 options (or any IP options, as is sometimes done as part of a DoS
protection scheme), then use of the IP option will cause connections
to simply fail. For this reason, the IP option SHOULD only be used in
environments where the full path between the overlay egress machine
and the packet receiver is under common administrative control.</t>
<t>As explained above, the intention of both the TCP and IP options is
to provide the receiver with public IP addresses that it would
otherwise have seen if the overlay network were not in use. There are
security implications associated with exposing a network's use of the
private <xref target="RFC1918" /> address space to the public
internet, and for this reason, the overlay path option SHOULD NOT be
used to communicate RFC1918 addresses in packets that traverse the
public internet.</t>
</section>
<section anchor="security" title="Security Considerations">
<t>This specification provides no authentication/validity verification
for the data contained in the address fields. For this reason, the
data contained in the addresses field of the new option cannot itself
be considered inherently secure. In other words, confidence in the
validity of the source address of the IPv4/IPv6 packet does not
translate into confidence in the validity of the addresses in the
overlay path option. With this exception, this specification does not
alter the inherent security of IPv4, IPv6, or TCP.</t>
<t>The addresses provided in the option SHOULD NOT be used for purposes
that require a trust relationship between the overlay network and the
receiver (e.g. billing and/or intrusion prevention) unless a mechanism
outside the scope of this specification is used to ensure the
necessary level of trust. As noted above, one possible example of
such a mechanism would be to place the overlay egress host on the
receiver's own network and to configure the receiver's firewall to
drop any packets from external hosts that provide the overlay path
option. When the receiving network uses the values provided by the
option in a way that does not require trust (e.g. maintaining session
affinity in a load-balancing system), then use of a mechanism to
enforce the trust relationship may not be required.</t>
</section>
<section anchor="fwd_compat" title="Forward Compatibility Support">
<t>The most common use of this option on the internet today will require
recording IP addresses for a single address family only. However, it
may be important in the future to be able to record a mix of IPv4 and
IPv6 addresses. Alternatively, future security requirements may demand
the use of, for example, a keyed hash for data integrity and
authentication purposes and/or inclusion of additional information
specific to the sender's connection.</t>
<t>To balance current-day performance and efficiency against the need
for future extensibility, the option includes a version field, so that
future requirements can be met without the need to consume a new
option number.</t>
</section>
<section anchor="iana" title="IANA Considerations">
<t>[Paragraphs below in braces should be removed by the RFC Editor upon
publication]</t>
<t>[The TCP Overlay Path Option requires that IANA allocate a value from
the TCP option kind namespace, to be replaced for TBD-TCP throughout
this document.]</t>
<t>[The IPv4 Overlay Path Option requires that IANA allocate a value
from the IP option number namespace. The copy flag for this option is
1 and the class for this option is 2. The assigned number will replace
TBD-IP4 throughout this document, and the full type value
(representing copy, class, and number) will replace TBD-IP4-FULL
throughout the document.]</t>
<t>[The IPv6 Overlay Path Option requires that IANA allocate a value
from the IPv6 parameters: hop-by-hop options namespace. The action for
this option is 00 and the change flag for this option is 0. The
assigned number will replace TBD-IP6 throughout this document.]</t>
<t>This document defines the TCP Overlay Path option, described in <xref
target="format_v1" /> and <xref target="format_v2" />. This option
has been assigned the option number TBD-TCP by IANA action.</t>
<t>This document also defines the IPv4 Overlay Path option, described in
<xref target="format_v1" /> and <xref target="format_v2" />. This
option has been assigned the option number TBD-IP4-FULL by IANA
action.</t>
<t>This document also defines the IPv6 Overlay Path hop-by-hop option,
described in <xref target="format_v2" />. This option has been
assigned the option number TBD-IP6 by IANA action.</t>
<t>This document defines no new namespaces.</t>
</section>
</middle>
<back>
<references title="Informative References">
<reference anchor="RFC0791">
<front>
<title>Internet Protocol</title>
<author initials="J." surname="Postel" fullname="Jon Postel">
<organization abbrev="ISI">
USC/Information Sciences Institute
</organization>
</author>
<date month="September" year="1981" />
</front>
<seriesInfo name="RFC" value="791" />
<seriesInfo name="STD" value="5" />
</reference>
<reference anchor="RFC0793">
<front>
<title>Transmission Control Protocol</title>
<author initials="J." surname="Postel" fullname="Jon Postel">
<organization abbrev="ISI">
USC/Information Sciences Institute
</organization>
</author>
<date month="September" year="1981" />
</front>
<seriesInfo name="RFC" value="793" />
<seriesInfo name="STD" value="7" />
</reference>
<reference anchor="RFC1918">
<front>
<title>Address Allocation for Private Internets</title>
<author initials="Y." surname="Rekhter" fullname="Yakov Rekhter">
<organization>Cisco Systems</organization>
</author>
<author initials="B." surname="Moskowitz" fullname="Robert G. Moskowitz">
<organization abbrev="Chrysler Corp.">Chrysler Corporation</organization>
</author>
<author initials="D." surname="Karrenberg" fullname="Daniel Karrenberg">
<organization abbrev="RIPE NCC">
RIPE Networking Coordination Center
</organization>
</author>
<author initials="G.J." surname="de Groot" fullname="Geert Jan de Groot">
<organization abbrev="RIPE NCC">
RIPE Networking Coordination Center
</organization>
</author>
<author initials="E." surname="Lear" fullname="Eliot Lear">
<organization>Silicon Graphics, Inc.</organization>
</author>
<date month="February" year="1996" />
</front>
<seriesInfo name="RFC" value="1918" />
<seriesInfo name="BCP" value="5" />
</reference>
<reference anchor="RFC2460">
<front>
<title>Internet Protocol, Version 6 (IPv6)</title>
<author initials="S." surname="Deering" fullname="Stephen E. Deering">
<organization abbrev="Cisco">Cisco Systems, Inc.</organization>
</author>
<author initials="R." surname="Hinden" fullname="Robert M. Hinden">
<organization>Nokia</organization>
</author>
<date month="December" year="1998" />
</front>
<seriesInfo name="RFC" value="2460" />
</reference>
<reference anchor="RFC2663">
<front>
<title>IP Network Address Translator (NAT) Terminology and
Considerations</title>
<author initials="P." surname="Srisuresh" fullname="Pyda Srisuresh">
<organization>Lucent Technologies</organization>
</author>
<author initials="M." surname="Holdrege" fullname="Matt Holdrege">
<organization>Lucent Technologies</organization>
</author>
<date month="August" year="1999" />
</front>
<seriesInfo name="RFC" value="2663" />
</reference>
<reference anchor="RFC5694">
<front>
<title>Peer-to-Peer (P2P) Architecture: Definition, Taxonomies,
Examples, and Applicability</title>
<author initials="G." surname="Camarillo" fullname="Gonzalo Camaillo">
<organization abbrev="IAB">Internet Architecture Board</organization>
</author>
<date month="November" year="2009" />
</front>
<seriesInfo name="RFC" value="5694" />
</reference>
<reference anchor="RFC6179">
<front>
<title>The Internet Routing Overlay Network (IRON)</title>
<author initials="F." surname="Templin" fullname="Fred L. Templin">
<organization>Boeing Research & Technology</organization>
</author>
<date month="March" year="2011" />
</front>
<seriesInfo name="RFC" value="6179" />
</reference>
<reference anchor="IEEE1344002">
<front>
<title abbrev="Informed content delivery">Informed content delivery
across adaptive overlay networks: IEEE/ACM Transactions on
Networking, Vol 12, Issue 5, ppg 767-780</title>
<author initials="J.W." surname="Byers" fullname="John Byers">
<organization abbrev="Boston University">
Boston University, Dept. of Computer Science
</organization>
</author>
<author initials="J." surname="Considine" fullname="Jeffrey Considine">
<organization abbrev="Boston University">
Boston University, Dept. of Computer Science
</organization>
</author>
<author initials="M." surname="Mitzenmacher" fullname="Michael Mitzenmacher">
<organization abbrev="Harvard University">
Harvard University, EECS
</organization>
</author>
<author initials="S." surname="Rost" fullname="Stanislav Rost">
<organization abbrev="MIT">
MIT Laboratory for Computer Science
</organization>
</author>
<date month="October" year="2004" />
</front>
</reference>
</references>
</back>
</rfc>
| PAFTECH AB 2003-2026 | 2026-04-24 07:13:08 |