One document matched: draft-vyncke-ipv6-traffic-in-p2p-networks-00.xml


<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY RFC3056 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3056.xml">
<!ENTITY RFC3484 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3484.xml">
<!ENTITY RFC5214 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5214.xml">
<!ENTITY RFC4380 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4380.xml">
<!ENTITY I-D.defeche-ipv6-traffic-in-p2p-networks SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-defeche-ipv6-traffic-in-p2p-networks-00.xml">
<!ENTITY I-D.ietf-v6ops-happy-eyeballs SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-v6ops-happy-eyeballs-04.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="3"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<rfc category="info" ipr="trust200902" docName="draft-vyncke-ipv6-traffic-in-p2p-networks-00.txt">
  <front>
    <title abbrev="IPv6 Traffic in BitTorrent">Measuring IPv6 Traffic in BitTorrent Networks</title>

	<author initials="M" surname="Defeche" fullname="Martin Defeche">
      <organization>University of Liege</organization>
      <address>
        <email>martin.defeche@gmail.com</email>
      </address>
    </author>

     <author initials="E" surname="Vyncke" fullname="Eric Vyncke" role="editor">
      <organization>Cisco Systems</organization>
      <address>
        <postal>
          <street>De Kleetlaan 6a</street>
          <city>Diegem</city>
          <country>Belgium</country>
          <code>1831</code>
        </postal>
        <phone>+32 2 778 4677</phone>
        <email>evyncke@cisco.com</email>
      </address>
    </author>

    <date day="24" month="October" year="2011" />

    <area>Operations and Management</area>

<workgroup>IPv6 Operations</workgroup>
<keyword>IPv6 Traffic</keyword>
<keyword>Measurements</keyword>
<keyword>BitTorrent</keyword>

    <abstract>
<t>
This document is a follow-up of a University thesis  which aims to measure the evolution over time of IPv6 traffic and
to analyze
the geographical distribution of IPv6 nodes. 
The first measurements were done during the Summer 2009 using a specific-purpose program which
connects to the BitTorrent peer-to-peer network and this document adds measurements done with the
same program but in October 2011 (2 years later).
</t>
<t>
The study was made in Peer-to-Peer (P2P) networks because they are responsible
for most Internet traffic and because their structure and functioning permit a rapid
discovery of a large number of nodes from all over the world. In addition, the P2P users
are more likely to be interested by IPv6 as IPv6 does not have the same NAT problems
as IPv4.
</t>
   </abstract>

  </front>

  <middle>
    <section title="Introduction">
    <t>
An IPv6 vs. IPv4 measurement was made in Peer-to-Peer (P2P) networks because they are responsible
for most Internet traffic <xref target="TFE3" /> and because their structure and functioning permit a rapid
discovery of a large number of nodes from all over the world. 
In addition, the P2P users
are more likely to be interested by IPv6 as IPv6 does not have the same NAT problems
as IPv4.
</t>
<t>
The measurements were made in October 2011 re-using the application developped in October 2009 and presented in 
<xref target="I-D.defeche-ipv6-traffic-in-p2p-networks"/>. This was part of the Master Thesis
<xref target="THESIS"/>.
</t>	
<t>
Measurements include: number of IPv4 vs. IPv6 nodes, which kind of IPv6 connectivity and geographical distribution.
</t>
</section><!-- introduction-->

<section anchor="bittorrent" title="Some explanations about BitTorrent">
<t>
Let's start with some explanations about BitTorrent.
</t>
	<t>
BitTorrent is based on an hybrid decentralized network which is particularly well
suited to the transfer of large files. BitTorrent generates the largest amount of traffic
of all P2P networks and was installed on 28.20% of PCs in September 2007, and this
number is certainly higher at present. BitTorrent also includes different protocols to
discover peers, namely DHT, PEX and LSD which will be discussed later. Thanks
to these mechanisms BitTorrent can be completely decentralized. The different clients
are all compatible with the core protocol but some divergences concerning PEX, DHT
and LSD appear between Azureus and the mainline, which represents at least the
BitTorrent and µTorrent clients. Swarming is one of the basis of the protocol and IPv6
specification exists although it is not implemented by all clients.
	</t>
	<t>
	Since BitTorrent is the only protocol that offers in theory a good support to IPv6,
our choice is limited. But there are other reasons why BitTorrent is the network protocol
that best matches our needs.
<list style="symbols">
	<t>The number of different copies of the same file(s) is far smaller than in other
networks. This leads to a larger number of peers sharing the same file(s). Therefore
swarming can be more efficient thanks to a larger number of sources.</t>
	<t>As BitTorrent is generally used to share large files, peers stay connected longer, 
	giving us more time to discover them.</t>
	<t>BitTorrent is responsible for the largest part of the P2P traffic throughout the
world.</t>
	<t>BitTorrent is widely used in most regions of the world.</t>
	<t>Thanks to many different extensions like <xref target="PEX">PEX</xref>, 
	<xref target="DHT">DHT</xref>
 and <xref target="LSD">LSD</xref>, the discovery of peers is improved greatly.</t>
</list>
</t>
<section anchor="PWP" title="Peer Wire Protocol">
<t>
The Peer wire Protocol <xref target="TFE5"/>, or PWP, specifies the way peers communicate in an
asynchronous fashion with each other to exchange data and signalling messages. It is
based on TCP connections that function in Full-Duplex and Pipelining mode to get
better performances. This protocol does not define how to choose pieces to request,
nor how to select peers to download from and to upload to. Certain algorithms, which
are explained below, give some solutions to attain a good propagation of pieces in the
swarm and to make peers happy with their download rate compared to their upload
rate.
</t>
</section><!-- PWP-->
<section title="Tracker">
<t>
The trackers <xref target="TFE5"/> act like servers but do not deal with the transfer of files ; their only
purposes are to manage the swarm and to respond to periodic client requests for information
about peers sharing the same torrent. Since the transfer of files is completely
supported by peers, the bandwidth requirement is very low and thus a single tracker
can handle many swarms, each one containing a large number of peers.
This protocol commonly called THP is used by clients to communicate over HTTP
with trackers. As a matter of fact, the trackers run an HTTP server. Peers contact
trackers that are present in the metadata file for the following purposes :
<list style="symbols">
	<t>to enter a swarm</t>
	<t>to leave a swarm</t>
	<t>to inform the tracker that the download is complete</t>
	<t>to periodically give information on the download state and retrieve information
	about a random peer set. This time interval is defined by the tracker. If a peer
	misses a periodical request it can be considered as disconnected by the tracker
	and thus not present in the swarm any more</t>
</list>
This communication permits trackers to keep track of peers that are in the swarm and
to avoid referencing disconnected peers.
</t>
<t>
Tracker responses
do not support IPv6 peers without this extension <xref target="TFE12"/> which means that they do not
include IPv6 peers in their responses. This extension adds two new parameters for the
tracker requests:
<list style="symbols">
	<t>ipv6 : the client IPv6 address. It can be added when the client contacts the tracker
	over IPv4;</t>
	<t>ipv4 : the client IPv4 address. Conversely, when communication is made over IPv6.</t>
</list>
It permits clients having a dual stack to advertize both its addresses in the swarm. The
port of the additional address is the same as the primary one.
</t>
</section><!-- tracker-->
<section anchor="PEX" title="Peer Exchange">
<t>
The Peer Exchange or PEX is a means to discover new peers through peers that the
client is already connected to. As a matter of fact, peers trade information concerning
the peers they are connected to. Only few initial peers are needed to rapidly find a
large number of peers. This mechanism permits a reduction of the tracker load and an
improvement in the robustness as the tracker dependency is decreased.
</t>
</section><!-- pex -->
<section anchor="DHT" title="Distributed Hash Table">
<t>
The DHT technology <xref target="TFE13"/> is a way to store a hash table over a network, thus in a
distributed way, each peers contains a part of it. Files and nodes are identified by a
same length key, which is 20 bytes in BitTorrent. Each peer also maintains a list of
different peers to efficiently route its searches. DHT in BitTorrent is an implementation
of Kademlia which is based on the XOR metric that is the distance between two nodes
or between a node and a file can be determined by a XOR of their keys.
</t>
<t>
This technology is used to decentralize the tracking mechanism to once more decrease
the dependence on trackers. Even if the trackers are down or if no trackers are
specified in the metadata file, the DHT technology permits the discovery of peers sharing
the same torrent thanks to the info key hash as identifier. Conversely to PEX, no
initial peers are needed.
</t>
</section><!-- dht-->
<section anchor="LSD" title="Local Service Delivery">
<t>
The Local Service Discovery or LSD permits the discovery of peers that are downloading
the same torrent in the same local network. The transfer rate is much higher
between two peers in the same local network than between two peers in different networks
since the bandwidth limitation is that of the local network and not the one of
the Internet connection which is far smaller, especially the upload stream. Briefly, LSD
works as follows : the hash of the info key is broadcasted in the local network to find
out peers sharing the same torrent.
</t>
</section><!-- lsd -->
</section> <!-- explanations about bittorrent-->

<section anchor="tools" title="Tools used for Measurement">
<t>
As millions of pieces of information can be reached, the choice of a MySQL database
came naturally for effectiveness reasons and because concurrent access is managed.
Each program requests, inserts, and/or updates information in this database.
</t>
<t>
The computer that holds the database and executes the programs 
has native IPv6 and IPv4 connectivity, which
will permit a better evaluation of the two versions than if we use Teredo, for instance.
</t>
<t>
Our specialized BitTorrent client joins different swarms and periodically changes to other swarms.
While in a swarm we try to get connected to as many peers as possible thanks to all
protocols supported by BitTorrent. In this way we are able to easily, automatically and
efficiently discover peers. Ideally we should choose swarms with large numbers of peers
in order to effectively retrieve information. Concerning the legality issue, we can use a
trick to avoid downloading and uploading any files. In fact, we claim that we are not
interested in any pieces so we will not download anything and we will not upload either
since we have nothing to upload. So we are present in the swarm but without taking
part in the sharing of files.
</t>

</section><!-- tools -->

<section anchor="measures" title="What was measured">
<t>
The measurements were done in two periods:
<list style="symbols">
<t>May 2009 to July 2009: 5,000,000 peers were discovered but we were only
  able to to establish a BitTorrent connection with 1,500,000 peers;</t>
<t>1 day in October 2009: 100,000 peers were discovered.</t>
<t>1 day in October 2011: 180,000 peers were discovered.</t>
</list>
</t>
<section title="IPv6 Addresses">
<t>
We will now analyze the 
2,177 
distinct IPv6 addresses found via the different protocols
and classify them into different categories.
The next section will explain
the utilization of these addresses over the world in greater detail.
</t>
<texttable anchor="ipv6_addresses" title="Different Types of IPv6 Addresses">

<ttcol></ttcol>
<ttcol>Native</ttcol>
<ttcol>Teredo</ttcol>
<ttcol>6to4</ttcol>
<ttcol>ISATAP</ttcol>
<ttcol>Tunnel Brokers</ttcol>

<c>Number in 2009</c>
<c>1,216</c>
<c>99,634</c>
<c>41,425</c>
<c>24</c>
<c>102</c>

<c>Percentage in 2009</c>
<c>0.85</c>
<c>69.72</c>
<c>28.99</c>
<c>0.02</c>
<c>0.08</c>


<c>Number in 2011</c>
<c>258</c>
<c>1,280</c>
<c>636</c>
<c>3</c>
<c>n/a</c>

<c>Percentage in 2011</c>
<c>11.85</c>
<c>58.80</c>
<c>29.22</c>
<c>0.14</c>
<c>n/a</c>
</texttable>

<t>
The next table describes which weird and plain wrong IPv6 address our probe has received.
It can be expected that a normal BitTorrent client will try to contact those addresses
and therefore will generate packets whose destination IPv6 address is illegal.
</t>

<texttable anchor="ipv6_addresses_cont" title="Different Types of IPv6 Addresses (Cont.)">
<ttcol></ttcol>
<ttcol>6bone</ttcol>
<ttcol>Site Local</ttcol>
<ttcol>IPv4 Compatible</ttcol>
<ttcol>IPv4 Mapped</ttcol>
<ttcol>Bogon</ttcol>

<c>Number in 2009</c>
<c>436</c>
<c>24</c>
<c>1</c>
<c>94</c>
<c>74</c>

<c>Percentage in 2009</c>
<c>0.31</c>
<c>0.02</c>
<c>0.00</c>
<c>0.07</c>
<c>0.05</c>


<c>Number in 2011</c>
<c>0</c>
<c>0</c>
<c>0</c>
<c>0</c>
<c>61</c>

<c>Percentage in 2011</c>
<c>0</c>
<c>0</c>
<c>0</c>
<c>0</c>
<c>100</c>

</texttable>
<section title="Native IPv6">
<t>
In the 2011 study, only 197 distinct native IPv6 addresses were discovered in 
25 different countries
during our study. More than 40 % of these addresses came from the French ISP Free.
</t>
</section><!-- native-->
<section title="Teredo">
<t>
Teredo with 59 % of IPv6 addresses found is clearly the most utilized way to get
IPv6 connectivity. This ratio is also unsually high compared to other Internet
measurements. It is probably linked to:
<list style="symbols">
<t>users: BiTorrent users are mainly residential and not professional, therefore
Teredo is allowed to traverse the firewall while most enterprises block all
outbound UDP traffic (and blocking Teredo in the same shot) and the most 
common Teredo implementation (Microsoft) also disables Teredo when used in 
an Active Directory network;</t>
<t>application: Teredo is only used by Windows to connect to an IPv6 which
does not have any IPv4 address (in other word <xref target="RFC3484"/> 
severelly limits
the normal use of Teredo), but, with BitTorrent the peers appear always as
single stack (i.e. it appears like having only an IPv6 address even if it
actually have both version of the IP protocol).</t>
</list>
</t>
<t>
When we analyze the servers employed to configure these addresses we notice that
only four servers represent 100 % of the utilized ones. Their addresses are as
follows :
<list style="numbers">
<t>65.55.158.118  : is deployed by Microsoft (56.44 %);</t>
<t>94.245.115.184 : is deployed by Microsoft ( 2.42 %);</t>
<t>94.245.121.251 : is deployed by Microsoft (17.49 %);</t>
<t>94.245.121.253 : is deployed by Microsoft (23.65 %);</t>
</list>
while in October 2009, three different servers represented 99 % of the traffic:
<list style="numbers">
<t>213.199.162.214 : is deployed by Microsoft (46.12%);</t>
<t>65.55.158.80 : is also set up by Microsoft (41.33%);</t>
<t>207.46.48.150 : is once more owned by Microsoft (12.41%).</t>
</list>
Thus 100 % of the peers that were using Teredo were likely to be using one of
the Microsoft Operating Systems.
</t>
</section><!-- teredo-->
<section title="6to4">
<t>
6to4 is also broadly used to provide IPv6 connectivity as 
29.21 % of the addresses
found in 2011 and was 28.99 % in 2009 were created by this transition mechanism. 
Nevertheless, it is still far behind Teredo. It must be noted that the proportion
of 6to4 nodes has not decreased as indicated by other measurements, but, the
other measurements count only the web access to dual-stack servers where
happy eye-balls 
<xref target="I-D.ietf-v6ops-happy-eyeballs"/> 
and <xref target="RFC3484"/> can influence the browser behavior in order to
avoid all tunneling (this includes Teredo and 6to4).
</t>
</section><!-- 6to4-->

<section title="ISATAP">
<t>
We only discovered 3 different peers that used ISATAP (<xref target="RFC5214"/>)
to get IPv6 connectivity in
a non IPv6-capable infrastructure. This mechanism is less common than Teredo (<xref target="RFC4380"/>)
and 6to4 (<xref target="RFC3056"/>). However, this bad result can be moderated by the fact that ISATAP is mostly
destined to enterprises which often have a strict control on applications used by their
employees. Thus installing a BitTorrent client is not often possible, and even if it is the
firewall can filter its traffic.
</t>
</section><!-- isatap-->
<section title="Other Addresses">
<t>
Although IPv6 addresses with the 6bone prefix should not exist anymore, in October 2009 we found
up to 436 addresses with this prefix. In fact, all these addresses have the old Teredo
prefix 3ffe:831f::/32 which was used in the 6bone network.
</t>
<t>
In October 2009, we found 94 IPv4-mapped addresses that were all discovered by PEX.
</t>
<t>The good news is that in October 2011, we found neither a single 6bone address not
an IPv4-mapped address in our peers.</t>
<t>
Bogons are IP blocks that are reserved for private or special uses plus those that
are not yet allocated by the IANA. Thus a bogon is an illegal IP address that must not
appear on the Internet since they are theoretically unroutable. At present, the IPv6
unicast space is limited to the 2000::/3 prefix.
During our study we discovered up to 61 bogons via PEX.
</t>
</section><!-- other addresses-->

</section> <!-- addresses -->
<section anchor="traffic" title="Traffic Measurements">
<t>

Most of the European countries have at least 1 % of their peers that can use IPv6 with
better results for Finland (3.25%), France (2.53%), Latvia (2.41%), Denmark (2.40%)
and Belgium (2.00%). 
Another surprising fact is that China
has only 1.40 % of its peers using IPv6 which seems strange since China is one of the
leading countries in IPv6 deployment. This result may be negatively affected by the
filter set up in this country. Other countries having peers which are IPv6-enabled
are Chile (3.07%), Mexico (2.10%), Korea (1.98%) as well as Japan (1.88%),
USA (1.34%) and Canada (1.40%).
</t>
<t>
As explained in the analysis of IPv6 addresses section, we only found few native IPv6
addresses. This analysis confirms what we discussed previously, which
is that Japan and France have the highest rate with 1.09% in 2009 (and 1.62% in 2011)
and 0.65% in 2009 (and 0.63% in 2011) respectively.
China with 0.65% in 2009 and 1.01% in 2011, 
has a good percentage in comparison with other countries as well as growth. 
</t>
</section><!-- traffic-->
</section><!-- measures-->

<section title="IANA Considerations">


   <t>There are no extra IANA consideration for this document.
   </t>

</section><!--IANA-->

<section title="Security Considerations">

   <t>This I-D does not describe any specific protocol, so, the security section is mostly irrelevant.   
   The measures were done with the specific security issues in mind:
   <list style="symbols">
		<t>copyrighted content: only a first fragment is downloaded and never stored on long term storage,
			so, it is assumed that even if copyrighted content is actually received it will never be
			user or propagated further to other peers;</t>
		<t>denial of services: timers and rate limiters are used when getting the list of torrents
		from our directory, when contacting other peers, and so on</t>
   </list>
   </t> 

</section><!--security-->

<section title="Acknowledgements">

  <t>
  Many thanks to Professor Guy Leduc of the University of Liege 
  and his research assistants, Cyril Soldani and Sylvain Martin.
  Martin is also grateful to Arvid Norberg who implemented the LibTorrent Rasterbar
library <xref target="TFE23"/>. We also exchanged ideas with Nathan Ward.
  </t>

</section><!--ack-->

</middle>
  <back>
  
<references title="Normative References">

	<reference anchor="TFE5" 
		target="http://www.bittorrent.org/beps/bep_0003.html">
		<front>
		<title>The BitTorrent Protocol Specification</title>
		<author surname="Cohen" initials="B."/>
		<date year="2008"/>
		</front>
	</reference>

	
	<reference anchor="TFE12"
		target="http://www.bittorrent.org/beps/bep_0007.html">
		<front>
		<title>IPv6 Tracker Extension</title>
		<author surname="Hazel" initials="G."/>
		<date month="May" year="2008"/>
		</front>
	</reference>
	
	<reference anchor="TFE13">
		<front>
		<title>Kademlia : A Peer-to-peer Information System Based on the XOR Metric</title>
		<author surname="Maymounkov" initials="P."/>
		</front>
	</reference>
	
	
</references>

<references title="Informative References">
    
	&RFC3056;
	&RFC3484;
	&RFC5214;
	&RFC4380;
	
	&I-D.defeche-ipv6-traffic-in-p2p-networks;
	&I-D.ietf-v6ops-happy-eyeballs;
	
	<reference anchor="TFE3">
		<front>
		<title>Internet Study 2008/2009</title>
		<author surname="Schulze" initials="H."/>
		<date year="2009"/>
		</front>
	</reference>
	
	<reference anchor="TFE23" target="http://www.rasterbar.com/products/libtorrent/">
		<front>
		<title>LibTorrent Rasterbar Library</title>
		<author surname="Norberg" initials="Arvid"/>
		</front>
	</reference>
	
	<reference anchor="THESIS">
		<front>
		<title>Measure and Analysis of IPv6 Traffic in Peer-to-Peer Networks. Master Thesis</title>
		<author surname="Defeche" initials="M">
			<organization>University of Liege</organization>
		</author>
		<date month="August" year="2009"/>
		</front>
	</reference>
	

	<reference anchor="geoip" target="http://www.maxmind.com/app/geolitecountry">
        <front>
          <title>GeoLite Country</title>

          <author>
            <organization>Maxmind</organization>
          </author>

          <date year="2011" />
        </front>
      </reference>
	  
</references>

  </back>
</rfc>

PAFTECH AB 2003-20262026-04-24 06:03:06