Network Working Group A. Yourtchenko
Internet-Draft D. Wing
Intended status: Standards Track cisco
Expires: February 26, 2011 August 25, 2010
NAT confessions: revealing the hosts behind the translator
draft-yourtchenko-nat-reveal-hash-00
Abstract
When an IP address is shared among several subscribers, it is
impossible to determine which subscriber initiated a given TCP
connection. This memo describes a technique to share the identity of
a subscriber that initiated a TCP connection with the TCP server.
The proposed method avoids altering the application-level payload and
works well with SSL-protected connections.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on February 26, 2011.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
Yourtchenko & Wing Expires February 26, 2011 [Page 1]
Internet-Draft Revealing the hosts behind NAPT August 2010
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Notational Conventions
3. Description
4. Calculating the Internal Address Mapping
5. Calculating the Verifier
6. Encoding of the VFY into the packet: IP ID encoding
7. Encoding of the VFY into the packet: TSval encoding
8. Operation of the mechanism
8.1. Translator Operation
8.2. Server Operation
9. Interaction with TCP SYN cookies
10. Other Mechanisms to Encode Client Identifier
10.1. Defining a new TCP option to store the address
10.2. Using TSecr in TCP SYN
10.3. Reserving the different port ranges per client
11. Security Considerations
12. IANA considerations
13. Acknowledgements
14. References
14.1. Normative References
14.2. Informative References
Authors' Addresses
1. Introduction
There are several scenarios where it is valuable to know the identity
of a TCP client, including geolocation, DoS blocking, and spam
blacklists. Today, this is done by equating an IPv4 address with
'identity'. However, the identity of a TCP client is obscured when
an IP address is shared
[I-D.ietf-intarea-shared-addressing-issues]. IP address sharing is
done by both network address and port translators (NAPT) and by
application-layer proxies (e.g., HTTP or FTP proxies).
The current state of the art requires the address sharing device to
alter the application-level payload to include the identity of the
internal host -- usually the internal host's private IP address. This
incurs several drawbacks:
o adjustment of TCP sequence numbers and acknowledgement numbers for
the duration of the TCP session
o risk of false-positive application matching (e.g., accidentally
inserting an HTTP header into a non-HTTP payload).
o interference with application payload by increasing packet size
(e.g., MTU)
With SSL-protected applications the current state of the art
requires breaking the end-to-end encrypted connection. This results
in several undesirable consequences:
o necessity for the translator to break the end-to-end encryption,
typically by installing an additional Certificate Authority on the
client's CA trust list
o noticeable increase in the processing power required on the
address sharing device to decrypt and re-encrypt that application
payload
This specification avoids the problems described above, and defines
a method of communicating the TCP client's identity to the TCP
server by overloading the TCP timestamp field and IP Identifier field
of the initial TCP SYN.
This extension is necessary because IP address sharing, deployed by
NAT64 devices, will allow malicious users to connect to IPv4-capable
servers. Thus, until a server is only accessible via IPv6 (and
inaccessible via IPv4), the IPv4-capable server will suffer from an
inability to identify individual TCP clients as discussed in
[I-D.ietf-intarea-shared-addressing-issues].
2. Notational Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119 [RFC2119].
3. Description
This proposal leverages the common deployment of TCP timestamps and
the fact that a timestamp-aware TCP server will echo the timestamp.
The translator can therefore encode an identifier of the internal
host into the timestamp value (TSval) of the SYN.
The caveat with the above is that the remote peer must know in
advance whether this technique is in use on the client side or not --
the timestamp looks just the same on the server side. This could be
resolved by manual configuration, but that is impractical, so an
automatic detection mechanism is proposed. The automatic mechanism
calculates a hash over the values of interest and places the result
into another field. The receiver can then perform the same operation
and compare. If the received and computed values match, then the TCP
timestamp received does contain the encoded internal address. The
verifier value is computed as a hash function over the mapped value
encoded into the timestamp, the address after translation, and the
TCP initial sequence number, i.e., the sequence number within the SYN
segment. Including the TCP initial sequence number prevents the
verifier value from being almost always the same, which is needed to
satisfy the protocol constraints of the field that is used to convey
this value.
In order to find a place to store this verification value, we
make another observation: TCP SYN segments are generally rather
small, well below the 576 octets that every IPv4 host is required to
accept. Typical stacks send the TCP SYN with DF=1. Therefore, such
segments are not fragmented in practice. This means the 16-bit IP ID
field can be used to carry the verifier value. The verifier depends
on the initial sequence number (ISN), which should have some
randomness properties as described in RFC 1948 [RFC1948]; therefore
the IP ID will vary enough to still serve its purpose even in the
extremely unlikely case that the TCP SYN is fragmented.
Using a 16-bit value as a verifier gives a 1 in 65536 (about
0.0015%) probability of erroneously judging that the timestamp
contains the encoded internal address. This may be insufficient
assurance for some of the scenarios. Therefore, we calculate the
verifier (referred to as VFY value) to be a 32-bit integer - and
store 16 or more bits of this value - at the expense of storing fewer
bits of the Internal Address Mapping (iAM). However, we expect that
the range of iAM for a single public translation would be relatively
small - so no information will be lost in this process.
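The trade-off above can be made concrete with a little arithmetic.
The following Python sketch is illustrative only; the function name
is hypothetical and not part of this specification. It assumes that,
for a SYN that was not rewritten by a translator, the checked bits
behave as uniformly random, and it computes the chance of an
accidental match when 16 verifier bits travel in the IP ID and some
additional bits travel in the timestamp.

```python
def false_match_probability(extra_bits: int) -> float:
    """Chance that an unrelated SYN passes the verifier check by accident.

    16 verifier bits are carried in the IP ID; `extra_bits` further bits
    are carried in the timestamp (at most 15, per the TSval layout).
    Assumption: the checked bits are uniformly random for unrelated traffic.
    """
    if not 0 <= extra_bits <= 15:
        raise ValueError("extra_bits must be in the range 0..15")
    return 2.0 ** -(16 + extra_bits)


if __name__ == "__main__":
    for extra in (0, 8, 15):
        print(f"{16 + extra} verifier bits: {false_match_probability(extra):.3e}")
```

With no extra bits this reproduces the 1 in 65536 figure; each
additional verifier bit halves the probability.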
4. Calculating the Internal Address Mapping
The main useful property of iAM is that it MUST stay the same for the
same internal address unless the configuration on the translator has
changed. Since the goal is to provide a stable mapping, rather
than to fully reveal the internal address, any method that has this
property is acceptable, and the choice is left to the
implementors of the translator. If the addresses to be translated
are configured as a prefix, then the iAM can be obtained just by
taking the host bits of the address within the prefix. If the
assignment of these addresses is on an individual basis, then a
simple enumeration might be used. If the internal addresses are
assigned to the pool as a set of subnets, then a combination of the
two methods above (the host bits in the least significant part, and
the enumeration in the most significant part) will give good results.
This also encourages allocation of internal addresses in equal-
sized chunks, which should make the maintenance of the network
easier.
As a result, the calculation of the iAM on the outgoing SYN segment
MUST return two values:
o iAM = Internal Address Mapping: a 32-bit unsigned integer
o siAM = Size of Internal Address Mapping, in bits: integer,
allowed range 9..24 - this is the number of significant bits
within the iAM.
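As an illustration of the prefix-based method, the following Python
sketch derives the two values from an internal IPv4 address and its
configured prefix. The function name is hypothetical, and padding a
pool with fewer than 9 host bits up to siAM = 9 is an assumption of
this sketch, chosen so that the result stays within the allowed
range.

```python
import ipaddress


def iam_from_prefix(internal_addr: str, prefix: str) -> tuple[int, int]:
    """Return (iAM, siAM) using the host bits of internal_addr within prefix."""
    net = ipaddress.ip_network(prefix)
    addr = ipaddress.ip_address(internal_addr)
    if addr not in net:
        raise ValueError(f"{internal_addr} is not inside {prefix}")
    host_bits = net.max_prefixlen - net.prefixlen
    if host_bits > 24:
        raise ValueError("prefix too short: siAM may not exceed 24 bits")
    # Assumption of this sketch: small pools are padded to the 9-bit minimum.
    siam = max(host_bits, 9)
    iam = int(addr) & int(net.hostmask)
    return iam, siam


if __name__ == "__main__":
    print(iam_from_prefix("10.0.5.7", "10.0.0.0/16"))
    print(iam_from_prefix("192.168.1.77", "192.168.1.0/24"))
```

The mapping is stable as long as the prefix configuration does not
change, which is the property required of the iAM.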
The minimum value of siAM being 9 was chosen based on the following
logic:
o having room for 512 possible hosts allows the iAM to remain
unchanged across smaller configuration changes, in case the pool is
made up of individual hosts.
o the range 9..24 has exactly 16 possible values, which will be
useful for encoding.
By encoding only the significant bits of the internal address mapping
the operator of the translator can minimize the probability of the
error - all the unused bits are allocated for the value used to
"fingerprint" the presence of the internal identifier. The more bits
this "Verifier" value contains, the smaller the chance of an
accidental match, and of erroneously recording an internal identifier
when there is none.
The range from 9 bits to 24 bits allows encoding between 512 and
16777216 internal identifiers for a single public IP address.
5. Calculating the Verifier
The verifier is calculated as a 32-bit result of a hash function.
This hash function is not expected to be cryptographically strong
(the 'Security considerations' section explains why); however, it
should have good distribution, good collision resistance, good
avalanche behavior, and be fast and cheap to compute. These
properties are satisfied by the Murmur hash function
[URL.Murmur-hash], which is therefore the hash used here.
The calculation of the VFY is performed as follows:
VFY = murmur(iAM | AddrPub | siAM, TCP-ISN)
o iAM is included in the calculation as a 32-bit word.
o siAM is included in the hash calculation as a single byte.
(TBD: the 'selector' referenced below might be a more natural
number to check against, instead of siAM?)
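The calculation above can be sketched in Python as follows. Several
details are assumptions of this sketch rather than statements of this
specification: the Murmur variant (MurmurHash3 x86_32 is used here
purely for illustration), the use of the TCP ISN as the hash seed,
the serialization of iAM and AddrPub in network byte order, and the
function names.

```python
import socket
import struct


def _rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def murmur3_32(key: bytes, seed: int) -> int:
    """MurmurHash3 x86_32 (variant assumed for this sketch)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    n_blocks = len(key) // 4
    for i in range(n_blocks):
        (k,) = struct.unpack_from("<I", key, 4 * i)  # little-endian blocks
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    tail = key[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    # Finalization: mix in the length and avalanche the bits.
    h ^= len(key)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def compute_vfy(iam: int, addr_pub: str, siam: int, tcp_isn: int) -> int:
    """VFY = murmur(iAM | AddrPub | siAM, TCP-ISN), with the ISN as seed."""
    key = struct.pack("!I", iam) + socket.inet_aton(addr_pub) + struct.pack("!B", siam)
    return murmur3_32(key, tcp_isn)


if __name__ == "__main__":
    print(hex(compute_vfy(0x155, "192.0.2.1", 9, 0x12345678)))
```

Because the ISN seeds the hash, two connections from the same internal
host normally produce different verifier values.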
6. Encoding of the VFY into the packet: IP ID encoding
The low 16 bits of the VFY are encoded in network byte order into the
IP ID of the packet after translation. The remaining 16 bits form the
"VFYhi" value, which we attempt to fit into the TSval along with the
other information.
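A minimal Python sketch of this split is shown below; the function
names are hypothetical and serve only to illustrate the bit
manipulation.

```python
import struct


def split_vfy(vfy: int) -> tuple[int, int]:
    """Split a 32-bit VFY into (IP ID value, VFYhi)."""
    ip_id = vfy & 0xFFFF          # low 16 bits go into the IP ID field
    vfyhi = (vfy >> 16) & 0xFFFF  # high 16 bits are candidates for TSval
    return ip_id, vfyhi


def ip_id_wire_bytes(ip_id: int) -> bytes:
    """Serialize the IP ID in network byte order, as it appears on the wire."""
    return struct.pack("!H", ip_id)


if __name__ == "__main__":
    ip_id, vfyhi = split_vfy(0xDEADBEEF)
    print(hex(ip_id), hex(vfyhi), ip_id_wire_bytes(ip_id).hex())
```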
7. Encoding of the VFY into the packet: TSval encoding
The TCP timestamp field encodes the iAM and VFYhi as follows:
    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
    1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E E E E|S S S S| iAM MSB ... iAM LSB | VFYhi MSB ..  VFYhi LSB |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The range of siAM gives 16 possible ways to store iAM (along with the
same number of degrees of assurance for the detection). In order to
distinguish between those, we introduce the encoding selector (S)
field, which determines how the lower 24 bits are split between
the iAM and the upper 16 bits of VFY. Note that because the smallest
value of siAM is 9, at most 15 bits of VFYhi fit, so the most
significant bit of VFY is never stored.
The value of S is the number of zero-fill right-shift operations it
would take on the low 24 bits in order to "normalize" the iAM - or, in
other words, it is the number of bits of VFYhi stored within the
timestamp.
Best practices in [I-D.ietf-tcpm-tcp-timestamps] mention that, to
reduce the TIME-WAIT state, the timestamp value should be
monotonically increasing across connections with the same 5-tuple.
To give translators an opportunity to achieve this property, we
reserve the four most significant bits within the timestamp to
signify the "Epoch" (E). This would require storing some additional
state per 5-tuple, and the implementation of such a mechanism is
outside the scope of this document. Implementations that do not
implement monotonically increasing timestamps MUST keep the Epoch
bits intact from the original value of the timestamp.
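The layout described in this section can be sketched in Python as
follows. The function names are hypothetical. The sketch assumes
that S = 24 - siAM and that the low-order S bits of VFYhi are the
ones stored, which is how the text above is read here.

```python
def pack_tsval(epoch: int, iam: int, siam: int, vfyhi: int) -> int:
    """Build the 32-bit TSval: 4 bits E, 4 bits S, then iAM and VFYhi bits."""
    if not 9 <= siam <= 24:
        raise ValueError("siAM must be in the range 9..24")
    if not 0 <= iam < (1 << siam):
        raise ValueError("iAM does not fit in siAM bits")
    if not 0 <= epoch <= 0xF:
        raise ValueError("Epoch is a 4-bit value")
    s = 24 - siam                              # bits of VFYhi that fit
    low24 = (iam << s) | (vfyhi & ((1 << s) - 1))
    return (epoch << 28) | (s << 24) | low24


def unpack_tsval(tsval: int) -> tuple[int, int, int, int]:
    """Return (epoch, iAM, siAM, stored VFYhi bits) from a 32-bit TSval."""
    epoch = (tsval >> 28) & 0xF
    s = (tsval >> 24) & 0xF
    low24 = tsval & 0xFFFFFF
    iam = low24 >> s                           # "normalize" the iAM
    vfy_bits = low24 & ((1 << s) - 1)
    return epoch, iam, 24 - s, vfy_bits


if __name__ == "__main__":
    ts = pack_tsval(epoch=0xA, iam=0x155, siam=9, vfyhi=0xBEEF)
    print(hex(ts), unpack_tsval(ts))
```

In the example, siAM = 9 leaves 15 bits for the verifier, so the top
bit of VFYhi (0xBEEF) is dropped and 0x3EEF is what the server can
compare against.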
8. Operation of the mechanism
This section outlines the use of this mechanism by the translators
and servers.
8.1. Translator Operation
The translator is involved in processing the initial SYN segment
(calculating the new values of the TCP timestamp and IP ID), as well
as the SYN-ACK segments (restoring the original value of the TCP
timestamp within the TSecr field).
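The bookkeeping this implies can be sketched in Python as a small
per-flow table. The class, its method names, and the choice of flow
key are hypothetical; checksum updates and state expiry, which a real
translator would also need, are omitted from this sketch.

```python
from typing import Dict, Tuple

# Flow key assumed for illustration: (public source address, source port,
# destination address, destination port) after translation.
FlowKey = Tuple[str, int, str, int]


class SynTimestampRewriter:
    """Remembers the original TSval of each rewritten SYN."""

    def __init__(self) -> None:
        self._saved: Dict[FlowKey, Tuple[int, int]] = {}

    def outbound_syn(self, flow: FlowKey, original_tsval: int,
                     encoded_tsval: int) -> int:
        """Record the original TSval and return the value to put on the wire."""
        self._saved[flow] = (encoded_tsval, original_tsval)
        return encoded_tsval

    def inbound_syn_ack(self, flow: FlowKey, tsecr: int) -> int:
        """Restore the client's own timestamp if the server echoed ours."""
        saved = self._saved.get(flow)
        if saved is not None and saved[0] == tsecr:
            return saved[1]
        return tsecr  # not an echo of a rewritten SYN: leave untouched

    def forget(self, flow: FlowKey) -> None:
        """Drop the state once the handshake has completed or timed out."""
        self._saved.pop(flow, None)


if __name__ == "__main__":
    rw = SynTimestampRewriter()
    flow = ("192.0.2.1", 40000, "198.51.100.7", 443)
    rw.outbound_syn(flow, original_tsval=1000, encoded_tsval=0xAFAABEEF)
    print(rw.inbound_syn_ack(flow, 0xAFAABEEF))
```

Keeping the entry until it is explicitly forgotten lets a
retransmitted SYN-ACK be restored in the same way as the first one.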
8.2. Server Operation
The server would operate on every SYN that is of interest for
logging. It would extract the candidate iAM, and calculate the VFY
value based on the public address and TCP ISN within the received SYN
segment. Then it would compare the VFY against the corresponding
bits in the TSval and IP ID fields. If there is a match, it means
(with a reasonable probability) that the iAM was a valid one
calculated by a translator in between. This information is stored
for later access by the application listening on that socket (e.g.,
stored in the TCB).
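The server-side check can be sketched in Python as shown below. The
function names are hypothetical, and the hash is passed in as a
parameter so that the sketch stays self-contained; the demo uses
zlib.crc32 only as a stand-in and not as the Murmur hash. The key
layout (iAM as a 32-bit word, the public address, siAM as one byte)
and the use of the ISN as the seed are assumptions of this sketch.

```python
import socket
import struct
import zlib
from typing import Callable, Optional

Hash32 = Callable[[bytes, int], int]


def recover_iam(tsval: int, ip_id: int, addr_pub: str, tcp_isn: int,
                hash32: Hash32) -> Optional[int]:
    """Return the iAM if the SYN carries a valid verifier, else None."""
    s = (tsval >> 24) & 0xF            # selector: VFYhi bits in TSval
    siam = 24 - s
    low24 = tsval & 0xFFFFFF
    iam = low24 >> s                   # candidate iAM
    key = (struct.pack("!I", iam) + socket.inet_aton(addr_pub)
           + struct.pack("!B", siam))
    vfy = hash32(key, tcp_isn) & 0xFFFFFFFF
    if ip_id != (vfy & 0xFFFF):        # low 16 bits must match the IP ID
        return None
    mask = (1 << s) - 1
    if (low24 & mask) != ((vfy >> 16) & mask):
        return None                    # stored VFYhi bits must match too
    return iam


def demo_hash32(key: bytes, seed: int) -> int:
    """Stand-in hash for the demo only; NOT the Murmur hash."""
    return zlib.crc32(key, seed) & 0xFFFFFFFF


if __name__ == "__main__":
    addr, isn, iam, siam = "192.0.2.1", 0x12345678, 0x155, 9
    key = struct.pack("!I", iam) + socket.inet_aton(addr) + struct.pack("!B", siam)
    vfy = demo_hash32(key, isn)
    tsval = (0xA << 28) | (15 << 24) | (iam << 15) | ((vfy >> 16) & 0x7FFF)
    print(recover_iam(tsval, vfy & 0xFFFF, addr, isn, demo_hash32))
```

A returned value of None means only that the verifier did not match;
the SYN is then treated as an ordinary one.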
9. Interaction with TCP SYN cookies
TCP SYN cookies are commonly deployed to mitigate TCP SYN flooding
attacks [RFC4987]. The mechanism described in this document requires
the server to store extra information that arrives on the TCP SYN,
which increases the TCP server's attack surface. To mitigate this,
the translator should apply a similar algorithm to the timestamp of
the ACK segment that is sent by the initiator of the connection in
response to the server's SYN-ACK. The authors considered that the
server side might use the TSval in its SYN-ACK segment; however, this
would interfere with extended SYN cookies. This section needs
further discussion.
10. Other Mechanisms to Encode Client Identifier
This section outlines other mechanisms that we considered, and
explains why we consider them not applicable.
10.1. Defining a new TCP option to store the address
This would be the cleanest and simplest approach, and is discussed in
[I-D.wing-reveal-address].
10.2. Using TSecr in TCP SYN
This value is set to zero and is effectively unused, so it looks
like a convenient place. However, this violates RFC 1323
[RFC1323], and it would require much more thorough testing, as well
as an update to RFC 1323.
10.3. Reserving the different port ranges per client
This approach has an appeal due to its simplicity, but it would be
specific to each NAPT device operated by each service provider. That
is, there is no way to identify the device or know the source port
range assigned to a TCP client without contacting the administrator
of the NAPT device. Restricting clients to a specific range also
exposes the clients to some security risk
[I-D.ietf-tsvwg-port-randomization].
11. Security Considerations
Connections that happen today without a NAPT necessarily reveal
the source address of the TCP client -- so revealing the identity of
the client should not be a concern except for installations
that attempt to use NAPT for "privacy" reasons. If such an
installation exists, it is easy to see that any 1:1 remapping of,
e.g., the IP ID would cause the validation algorithm to fail,
thereby "protecting the identity".
Therefore, if an organization has more than one level of NAPT and
wants to ensure that the internal translators do not disclose the
information about the internal addresses, it can alter any of the
elements used for the calculations - e.g. randomize the ISN, or remap
the IP ID.
An attacker might use this functionality to appear as if IP
address sharing is occurring, in the hope that a naive server will
allow additional attack traffic. TCP servers and applications SHOULD
NOT assume the mere presence of the functionality described in this
document indicates there are other (benign) users sharing the same IP
address.
The modification of the TSval option value will break TCP-AO
[RFC5925], which provides integrity protection of the TCP SYN
(including TCP options). However, TCP-AO is already known to not
survive address sharing (through a NAPT or through an application
proxy).
12. IANA considerations
None.
13. Acknowledgements
Thanks to Nicholas Leavy for the review.
14. References
14.1. Normative References
[RFC1323] Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
for High Performance", RFC 1323, May 1992.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
Authentication Option", RFC 5925, June 2010.
14.2. Informative References
[I-D.ietf-intarea-shared-addressing-issues]
Ford, M., Boucadair, M., Durand, A., Levis, P., and P.
Roberts, "Issues with IP Address Sharing",
draft-ietf-intarea-shared-addressing-issues-01 (work in
progress), June 2010.
[I-D.ietf-tcpm-tcp-timestamps]
Gont, F., "Reducing the TIME-WAIT state using TCP
timestamps", draft-ietf-tcpm-tcp-timestamps-00 (work in
progress), June 2010.
[I-D.ietf-tsvwg-port-randomization]
Larsen, M. and F. Gont, "Transport Protocol Port
Randomization Recommendations",
draft-ietf-tsvwg-port-randomization-09 (work in progress),
August 2010.
[RFC1948] Bellovin, S., "Defending Against Sequence Number Attacks",
RFC 1948, May 1996.
[RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common
Mitigations", RFC 4987, August 2007.
[URL.Murmur-hash]
"Murmur hash", <http://sites.google.com/site/murmurhash/>.
Authors' Addresses
Andrew Yourtchenko
cisco
6a de Kleetlaan
Diegem 1831
BE
Phone: +32 2 704 5494
Email: ayourtch@cisco.com
Dan Wing
cisco
170 West Tasman Drive
San Jose CA 95134
USA
Email: dwing@cisco.com