Network Working Group E. Marocco
Internet-Draft Telecom Italia
Intended status: Informational V. Gurbani
Expires: May 6, 2009 Bell Laboratories, Alcatel-Lucent
November 2, 2008
Application-Layer Traffic Optimization (ALTO) Problem Statement
draft-marocco-alto-problem-statement-03
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 6, 2009.
Copyright Notice
Copyright (C) The IETF Trust (2008).
Abstract
A significant part of the Internet traffic today is generated by
peer-to-peer applications used, for example, for file sharing,
realtime communications and live media streaming. Such applications
often deal with large amounts of data in direct peer-to-peer
connections, but they usually have little knowledge of the underlying
network topology. As a result, they may choose their peers based on
measurements and statistics which, in some situations, may lead to
Marocco & Gurbani Expires May 6, 2009 [Page 1]
Internet-Draft ALTO Problem Statement November 2008
suboptimal choices. This document describes the problem of
optimizing traffic generated by peer-to-peer applications through the
use of network-layer information, provides a representative set of
use cases that may exhibit this problem, and outlines considerations
that have to be taken into account when arriving at equitable
solutions.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Research or Engineering? . . . . . . . . . . . . . . . . . 4
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. The Problem . . . . . . . . . . . . . . . . . . . . . . . . . 6
4. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 7
4.1. File sharing . . . . . . . . . . . . . . . . . . . . . . . 7
4.2. Cache/Mirror Selection . . . . . . . . . . . . . . . . . . 7
4.3. Live Media Streaming . . . . . . . . . . . . . . . . . . . 8
4.4. Realtime Communications . . . . . . . . . . . . . . . . . 8
4.5. Distributed Hash Tables . . . . . . . . . . . . . . . . . 8
5. Solution Considerations . . . . . . . . . . . . . . . . . . . 8
5.1. ALTO Service Providers . . . . . . . . . . . . . . . . . . 8
5.2. Discovery of ALTO servers . . . . . . . . . . . . . . . . 9
5.3. User Privacy . . . . . . . . . . . . . . . . . . . . . . . 9
5.4. Topology Hiding . . . . . . . . . . . . . . . . . . . . . 9
5.5. Coexistence with Caching . . . . . . . . . . . . . . . . . 9
6. Security Considerations . . . . . . . . . . . . . . . . . . . 10
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 11
8. Informative References . . . . . . . . . . . . . . . . . . . . 11
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12
Intellectual Property and Copyright Statements . . . . . . . . . . 13
1. Introduction
A significant part of the Internet traffic today is generated by
peer-to-peer (P2P) applications used, for example, for file sharing,
realtime communications and live media streaming
[WWW.cachelogic.picture] [WWW.wired.fuel]. Unlike in the
client/server architecture, P2P applications access resources (e.g.
files or media relays) distributed across the Internet and exchange
large amounts of data in connections that they establish directly
with nodes sharing such resources.
One advantage of P2P systems arises from the fact that the resources
such systems offer are often made available through multiple
replicas. Yet applications generally do not have reliable
information about the underlying network and thus have to select among
available instances based on information they deduce from empirical
measurements which, in some situations, lead to suboptimal choices.
For example, popular metrics based on round trip time estimation,
sometimes used for initial source selection (i.e. before actual
data transmissions begin, when goodput values are unknown), perform
quite badly for file sharing applications, as they tend to ignore
the bandwidth and reliability of underlying links, which have much
more influence than delay on file transfers.
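The effect can be illustrated with a rough back-of-the-envelope model, assumed here purely for illustration, in which the time to fetch a file is one round trip plus the file size divided by the available bandwidth. The function name and all figures below are hypothetical; they are not measurements taken from the cited studies.

```python
def estimated_transfer_seconds(size_bytes, rtt_seconds, bandwidth_bits_per_second):
    """Rough estimate: one round trip plus serialization time.

    Simplifying assumption: ignores slow start, packet loss and
    competing flows, all of which matter in practice.
    """
    return rtt_seconds + (size_bytes * 8) / bandwidth_bits_per_second


FILE_SIZE = 100 * 10**6  # hypothetical 100 MB file

# Hypothetical source with a short RTT but a slow link (20 ms, 1 Mbit/s).
near_but_slow = estimated_transfer_seconds(FILE_SIZE, 0.020, 1 * 10**6)

# Hypothetical source with a long RTT but a fast link (200 ms, 10 Mbit/s).
far_but_fast = estimated_transfer_seconds(FILE_SIZE, 0.200, 10 * 10**6)

# An RTT-only metric would prefer the first source, although under this
# model the second one completes the transfer roughly ten times sooner.
rtt_choice_is_slower = near_but_slow > far_but_fast
```

Under these assumed numbers the transfer is dominated by bandwidth, so a selection based only on round trip time picks the slower source.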
Many of the existing P2P systems are based on an overlay network
consisting of direct connections peers establish among themselves;
such connections, obviously, do not account for the underlying
network topology. In addition to simply achieving suboptimal
performance, such networks can lead to congestion and cause serious
inefficiencies. As shown in [ACM.fear], traffic generated by popular
P2P applications often crosses network boundaries multiple times,
overloading links which are frequently subject to congestion
[ACM.bottleneck].
Recent studies [ACM.ispp2p] [WWW.p4p.overview] [ACM.ono] have shown
that if Internet Service Providers (ISPs), network operators or third
parties in general provided reliable network information such as
topology and/or bandwidth to P2P applications, it would be possible
to greatly increase application performance, reduce congestion and
optimize the overall traffic across different networks. This
document gives the problem statement of optimizing traffic generated
by P2P applications using information provided by a separate party.
The rest of this document is structured as follows. Section 2
defines the terminology. Section 3 introduces the problem more
formally. Section 4 describes some use
cases where both P2P applications and network operators would benefit
from a solution to such a problem. Section 5 describes the main
issues to consider when designing such a solution.
1.1. Research or Engineering?
At the time of writing, several solutions have been proposed to
address the problem described in this document, both inside and
outside the IETF, accompanied by encouraging simulation and field test
results [I-D.bonaventure-informed-path-selection] [ACM.ispp2p]
[WWW.p4p.overview]. Such solutions have been proposed independently,
but all consist of two essential parts:
o a discovery mechanism which can be used by a P2P application to
find a reliable information source;
o a protocol used by P2P applications to query such sources in order
to retrieve the information needed to perform better-than-random
selection of the endpoints providing a desired resource.
It is not easy to foresee how such solutions would perform in the
Internet; a more accurate evaluation would require representative
data collected from real systems by a critical mass of users.
However, wide adoption will probably never happen without an
agreement on a common solution based on an open standard.
2. Definitions
The following terms have special meaning in the definition of the
Application-Layer Traffic Optimization (ALTO) problem.
Application: A distributed communication system (e.g., file sharing)
that uses the ALTO service to improve its performance (or quality
of experience) while reducing resource consumption in the
underlying network infrastructure. Applications may use the P2P
model to organize themselves, or they can be simple client-server
based, or even a hybrid of both.
Peer: A specific participant in an application. Colloquially, a
peer refers to a participant in a P2P network or system, and this
definition does not violate that assumption. If the application
is based on a client-server or hybrid model, then the usage of the
terms "client" and "server" imparts enough context for
disambiguation.
Resource: A piece of content (e.g. a file or a chunk of a file) or a
server process (e.g. for relaying a media stream or for performing
a computation) which can be accessed by applications. In the ALTO
context a resource is often available in several equivalent
replicas, shared by different peers.
Resource Identifier: An application layer identifier used to
identify a resource, no matter how many replicas thereof exist.
Resource Provider: For P2P applications, a resource provider is a
specific peer that provides some resources. For client-server or
hybrid applications, a provider is a server that hosts a resource.
Resource Consumer: For P2P applications, a resource consumer is a
specific peer that needs to access resources. For client-server
or hybrid applications, a consumer is a client that needs to
access resources.
Transport Address: All address information that is needed by a
resource consumer to access the desired resource at a specific
resource provider. This information usually consists of the
resource provider's IP address, and it may include other
information, such as a transport protocol identifier or port
numbers.
Overlay Network: A virtual network consisting of direct connections
on top of another network, established by a group of peers. This
logical structure, which can be used to implement distributed
applications, may benefit from guidance from the ALTO service.
Resource Directory: An entity which is separate from the resource
consumer, and which assists a resource consumer to identify a set
of resource providers. In P2P applications, the resource
directory may be referred to as a P2P tracker. Some applications
do not use this concept and do the address mapping directly in the
resource consumer.
Host Location Attribute: Information which is related to the
location of a host in the network topology. The ALTO service
gives recommendations based on this information. A host location
attribute may consist, for example, of an IP address, an address
prefix or address range that contains the host, an autonomous
system (AS) number, or any other localization attribute. These
different options may provide different levels of detail.
Depending on the system architecture, this may have implications
on the quality of the recommendations ALTO is able to provide, on
whether recommendations can be aggregated, and on how much
privacy-sensitive information about users might be disclosed to
additional parties.
ALTO Service: If several resource providers are able to provide the
same resource, the ALTO service gives guidance to a resource
consumer or resource directory, on which resource provider(s) to
select, in order to optimize its performance (or quality of
experience) while minimizing resource consumption in the
underlying network infrastructure.
ALTO Server: A logical entity that provides interfaces that can be
used to query the ALTO service.
ALTO Client: The logical entity that sends ALTO queries. Depending
on the architecture of the application it may be embedded in the
resource consumer or in the resource directory.
ALTO Query: A message sent from an ALTO client to an ALTO server,
which requests guidance from the ALTO Service.
ALTO Reply: A message sent from an ALTO server to an ALTO client,
which contains guiding information from the ALTO service.
ALTO Transaction: An ALTO transaction consists of an ALTO query and
the corresponding ALTO reply.
Local Traffic: Internet traffic which stays within the network
infrastructure of one Internet Service Provider (ISP). This type
of traffic usually incurs the lowest cost for the ISP.
Peering Traffic: Internet traffic exchanged by two Internet Service
Providers whose networks are directly connected. Apart from
infrastructure and operational costs, peering traffic is usually
free, under the terms of a peering agreement.
Transit Traffic: Internet traffic exchanged on the basis of economic
agreements between Internet Service Providers (ISPs). An ISP
generally pays a transit provider for the delivery of traffic
flowing between its network and networks that are not directly
connected.
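As an illustration of the three traffic classes defined above, the following minimal sketch classifies traffic between two ISPs. It rests on simplifying assumptions that are not part of this document: ISPs are identified by opaque strings, peering agreements are given as a set of unordered pairs, and anything that is neither local nor peering is treated as transit. The names TrafficClass and classify are hypothetical.

```python
from enum import IntEnum


class TrafficClass(IntEnum):
    # Ordered by the typical relative cost to the ISP, lowest first.
    LOCAL = 0
    PEERING = 1
    TRANSIT = 2


def classify(src_isp, dst_isp, peerings):
    """Classify traffic exchanged between two ISPs.

    peerings is a set of frozensets, each holding two directly
    connected ISPs that have a peering agreement (an assumed input).
    """
    if src_isp == dst_isp:
        return TrafficClass.LOCAL
    if frozenset((src_isp, dst_isp)) in peerings:
        return TrafficClass.PEERING
    # Simplification: everything else is assumed to go through a
    # transit provider.
    return TrafficClass.TRANSIT


example_peerings = {frozenset(("isp-a", "isp-b"))}
local = classify("isp-a", "isp-a", example_peerings)
peering = classify("isp-b", "isp-a", example_peerings)
transit = classify("isp-a", "isp-c", example_peerings)
```

Because the classes are ordered by typical cost, an ALTO service could, for example, prefer resource providers whose class compares lower.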
3. The Problem
Network engineers have been facing the problem of traffic
optimization for a long time now and have already designed mechanisms
like MPLS [RFC3031] and DiffServ [RFC3260] to deal with it. The
problem they address consists of finding (or setting) optimal routes
for packets traveling between specific source and destination
addresses, based on requirements such as low latency, high
reliability, and priority. Such solutions are usually implemented at
the link and network layers, and tend to be almost transparent. At
best, applications can only "mark" the traffic they generate with the
corresponding properties.
However, P2P applications, which today pose serious challenges to
Internet infrastructures, do not benefit much from the above
techniques. By "cooperating" with external services that are aware of
the network topology, they could greatly optimize the traffic they
generate. In fact, when a P2P application needs to establish a
connection, the logical target is not a host, but rather a resource
(e.g. a file or a media relay) generally available in multiple
instances on different peers; selection of the closest one -- or, in
general, the best one in terms of topological proximity -- has much
more impact on the overall traffic than the route followed by its
packets to reach the endpoint.
Optimization of the peer selection is particularly important in the
initial phase of the process. Consider a P2P protocol such as
BitTorrent, where a querying peer receives a list of candidate
destinations where a resource resides. From this list, the peer will
derive a smaller set of candidates to connect to and exchange
information with. In another example, a streaming video client may
be provided with a list of destinations from which it can download
content. In both cases, the use of topology information at an
early stage will allow applications to improve their performance and
will help ISPs make better use of their network resources (in
particular, reducing the transit traffic on interdomain links).
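The early-stage selection described above can be sketched as follows. This is a minimal illustration and not a proposed mechanism: it assumes that the application already holds a list of candidate transport addresses and that some guidance, modeled here as a hypothetical mapping from address to numeric cost (lower is better), has been obtained from an ALTO service. Candidates for which no guidance is available are considered last, in random order.

```python
import random


def select_peers(candidates, guidance, k, rng=None):
    """Return up to k candidates, preferring those with lower cost.

    candidates: list of transport addresses (strings here)
    guidance:   dict mapping address -> cost; may cover only some
                candidates (hypothetical format)
    """
    if rng is None:
        rng = random.Random()
    ranked = sorted((c for c in candidates if c in guidance),
                    key=lambda c: guidance[c])
    unranked = [c for c in candidates if c not in guidance]
    rng.shuffle(unranked)  # no information: fall back to random order
    return (ranked + unranked)[:k]


# Addresses from documentation ranges, used purely as examples.
candidates = ["192.0.2.10:6881", "198.51.100.7:6881",
              "203.0.113.5:6881", "192.0.2.99:6881"]
guidance = {"198.51.100.7:6881": 10,
            "192.0.2.10:6881": 1,
            "203.0.113.5:6881": 5}

chosen = select_peers(candidates, guidance, 2)
```

With an empty guidance mapping the function degrades to plain random selection, which matches the behavior of an application that does not use ALTO at all.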
Addressing the Application-Layer Traffic Optimization (ALTO) problem
means, on the one hand, deploying an ALTO service to provide
applications with information regarding the underlying network and,
on the other hand, enhancing applications in order to use such
information to perform better-than-random selection of the endpoints
they establish connections with.
4. Use Cases
4.1. File sharing
File sharing applications allow users to search for content shared by
other users and download it. Typically, search results consist of
many instances of the same file (or chunk of a file) available from
multiple sources; the goal of an ALTO solution would be to help peers
find the best ones according to the underlying networks.
On the application side, integration of ALTO functionalities may
happen at different levels. For example, while in the completely
decentralized Gnutella network, selection of the best sources is
totally up to the user, in systems like BitTorrent and eDonkey,
central elements (i.e. trackers or servers) act as mediators.
Therefore, in the former case, optimization would require
modification in the applications, while in the latter it could just
be implemented in some central elements.
4.2. Cache/Mirror Selection
Providers of popular content like media and software repositories
usually resort to geographically distributed caches and mirrors for
load balancing. Selection of the proper mirror/cache for a given
user is today based on inaccurate geolocation data or on proprietary
network location systems, or is often left to the user. An
ALTO solution could be easily adopted to ease such a selection in an
automated way.
4.3. Live Media Streaming
P2P applications for live streaming allow users to receive multimedia
content produced by one source and targeted to multiple destinations,
in a realtime or near-realtime way without resorting to multicast.
Such applications typically participate in the distribution of the
content, acting as both receivers and senders; the goal of an ALTO
solution would be to help peers to find the best sources and the best
destinations for media flows they receive and relay.
4.4. Realtime Communications
P2P realtime communications allow users to establish direct media
flows, usually to place audio and video calls, or to have text chats.
In the basic case, media would flow directly between the two
endpoints; however, in the general case, a significant portion of
communications between users with limited access to the Internet
(e.g. users behind NATs, firewalls or HTTP proxies) needs to be
relayed by other elements. Such media relays are distributed over
the Internet -- in some cases co-located with applications with a
public address; the goal of an ALTO solution would be to help peers
to find the best relays.
4.5. Distributed Hash Tables
Distributed hash tables (DHT) are a class of overlay algorithms used
to implement lookup functionalities in popular P2P systems, without
resorting to centralized elements. In such systems, peers maintain
addresses of other peers participating in the same DHT in a routing
table, sorted according to specific criteria. An ALTO solution would
provide valuable information for DHT algorithms which, in order to
reduce path latency of distributed queries, include round trip time
estimations among such criteria [SIGCOMM.resprox].
5. Solution Considerations
This section introduces some aspects to take into consideration when
designing an ALTO service to provide applications with information
they can use to perform better-than-random peer selection.
5.1. ALTO Service Providers
ALTO services can be provided by at least three different kinds of
entity:
1. Network operators: usually have full knowledge of the network
they administer and are aware of the topology and policies that
transit and peering traffic are subject to;
2. Third parties: entities different from the network operators, but
which may have collected network information. Examples of such
entities are content delivery networks (like Akamai) which
control wide and highly distributed infrastructures, or companies
providing an ALTO service on behalf of ISPs (and thus acquiring the
information from the ISPs themselves);
3. User communities: running distributed algorithms, for example for
estimating the topology of the Internet.
5.2. Discovery of ALTO servers
As a direct consequence of the totally decentralized architecture of
the Internet, it seems almost impossible to centralize all
information P2P applications may need to optimize traffic they
generate. Therefore, any solution for the ALTO problem will need to
specify a mechanism for applications to find a proper ALTO server to
query.
It is important to note that, depending on the implementation of the
ALTO service, an ALTO server could be a centralized entity (for
example, one deployed by the network operator) or a volatile node
participating in a distributed algorithm.
5.3. User Privacy
Information provided by the ALTO client querying the ALTO server
could help increase the level of accuracy in the replies. For
example, if the querying client indicates what kind of application it
is using (e.g. realtime communications or bulk data transfer), the
server will be able to indicate priorities in its replies,
accommodating the requirements of the traffic the application will
generate. However, it is important that an application does not have
to disclose information it may consider sensitive in order to use an
ALTO service.
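One way a client could limit what it discloses, in line with the different levels of detail mentioned in the definition of a host location attribute, is to send an address prefix instead of a full IP address. The sketch below is only an illustration: the function name and the default prefix lengths are assumptions made here, not values suggested by this document.

```python
import ipaddress


def coarsen(address, v4_prefix_len=24, v6_prefix_len=48):
    """Return the prefix containing the address, as a string.

    The default prefix lengths are arbitrary illustrative choices.
    """
    ip = ipaddress.ip_address(address)
    prefix_len = v4_prefix_len if ip.version == 4 else v6_prefix_len
    # strict=False lets the host bits be set; they are masked out.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network)


v4_attribute = coarsen("192.0.2.77")
v6_attribute = coarsen("2001:db8:1234:5678::1")
coarser_v4 = coarsen("192.0.2.77", v4_prefix_len=16)
```

A shorter prefix reveals less about the individual host, but it may also reduce the quality of the recommendations the ALTO service is able to give.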
5.4. Topology Hiding
Operators can play an important role in addressing the ALTO problem,
but they generally consider network information they own to be
confidential; therefore, in order to succeed and achieve wide
adoption, any solution should provide a method to help P2P
applications in peer selection without explicitly disclosing topology
of the underlying network.
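As one possible illustration of guidance that does not expose topology directly, a server could keep its internal path costs private and return only an ordinal ranking of the candidates. The sketch below assumes a hypothetical internal cost table; neither the function name nor the reply format comes from this document or from any of the cited proposals.

```python
def ranks_from_costs(internal_costs):
    """Map each candidate to an ordinal rank (1 = most preferred).

    internal_costs: dict mapping candidate -> private cost value.
    Candidates with equal cost share a rank, so the result carries
    only relative preference, not the cost values themselves.
    """
    distinct = sorted(set(internal_costs.values()))
    rank_of = {cost: position + 1 for position, cost in enumerate(distinct)}
    return {cand: rank_of[cost] for cand, cost in internal_costs.items()}


# Hypothetical private costs known only to the operator.
private_costs = {"a": 3.7, "b": 120.0, "c": 3.7, "d": 45.2}
reply = ranks_from_costs(private_costs)
```

An ordering still conveys some information about the network, so this only reduces, rather than eliminates, what a querying application can infer.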
5.5. Coexistence with Caching
A common approach to optimizing traffic generated by applications
which require large data transfers is based on caching techniques.
In some cases, such techniques have proven to be extremely effective
in both enhancing user experience and saving network resources;
however, they have two main limitations with respect to solutions based
on provision of topology information:
1. Application specificity: since a cache is meant to replace the
source of the content being accessed -- either explicitly or
transparently -- it must be able to speak the same protocol with
the querying peer. For this reason, caching solutions can be
reasonably adopted only for the most popular applications (e.g. HTTP
and BitTorrent).
2. Content awareness: since caches need to actually store the
content being delivered, they are subject to legal threats
whenever the user does not have the right to access or distribute
such content. This limitation makes caching approaches unusable
in today's popular file sharing systems.
In general, solutions based on provision of topology information need
not interfere with caching; on the contrary, if the ALTO service used
by applications is aware of the presence of caches, it can point
them out in its replies with higher priorities and thus achieve
greater optimization.
6. Security Considerations
The approach proposed in this document requires P2P applications to
delegate a portion of their routing capability to third parties,
giving them a significant role in systems from which they would
otherwise be excluded.
In the case where an ALTO solution is deployed by the network
operator, it is conceivable that the P2P community would consider it
hostile because the operator could, for example:
o redirect applications to corrupted mediators providing malicious
content;
o track connections to perform content inspection;
o apply policies based on criteria other than network efficiency
(for example, to avoid peering points regulated by inconvenient
economic agreements).
However, ALTO is completely optional for P2P applications and its
purpose is to help improve performance of such applications. If, for
some reason, it fails to achieve this purpose, it would simply fail
to gain popularity and would not be used.
Even in cases where the ALTO service provider would decide to
maliciously alter results returned by queries only after the solution
has gained popularity (i.e. it behaves well for a while to become popular
and then starts misbehaving), it would be fairly easy for P2P
application maintainers and users to revert to solutions that are not
using it. After all, it would all come down to changing some
application settings in cases where the protocol is implemented
inside the client, and to upgrading centralized elements for
architectures like BitTorrent and eDonkey.
7. Acknowledgments
Vinay Aggarwal and the P4P working group conducted the research work
done outside the IETF. Emil Ivov, Rohan Mahy, Anthony Bryan,
Stanislav Shalunov, Laird Popkin, Stefano Previdi, Reinaldo Penno,
Dimitri Papadimitriou, Sebastian Kiesel, and many others provided
insightful discussions, specific comments and much needed
corrections.
Thanks in particular to Richard Yang for several reviews.
8. Informative References
[ACM.bottleneck]
Akella, A., Seshan, S., and A. Shaikh, "An Empirical
Evaluation of Wide-Area Internet Bottlenecks", Proceedings
of ACM SIGCOMM, October 2003.
[ACM.fear]
Karagiannis, T., Rodriguez, P., and K. Papagiannaki,
"Should ISPs fear Peer-Assisted Content Distribution?",
In ACM USENIX IMC, Berkeley 2005.
[ACM.ispp2p]
Aggarwal, V., Feldmann, A., and C. Scheideler, "Can ISPs
and P2P systems co-operate for improved performance?", In
ACM SIGCOMM Computer Communications Review (CCR), 37:3,
pp. 29-40.
[ACM.ono] Choffnes, D. and F. Bustamante, "Taming the Torrent: A
practical approach to reducing cross-ISP traffic in P2P
systems", Proceedings of ACM SIGCOMM, August 2008.
[I-D.bonaventure-informed-path-selection]
Saucez, D. and B. Donnet, "The case for an informed path
selection service",
draft-bonaventure-informed-path-selection-00 (work in
progress), February 2008.
[RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
Label Switching Architecture", RFC 3031, January 2001.
[RFC3260] Grossman, D., "New Terminology and Clarifications for
Diffserv", RFC 3260, April 2002.
[SIGCOMM.resprox]
Gummadi, K., Gummadi, R., Ratnasamy, S., Gribble, S.,
Shenker, S., and I. Stoica, "The impact of DHT routing
geometry on resilience and proximity", Proceedings of ACM
SIGCOMM, August 2003.
[WWW.cachelogic.picture]
Parker, A., "The true picture of peer-to-peer
filesharing", <http://www.cachelogic.com>.
[WWW.p4p.overview]
Xie, H., Krishnamurthy, A., Silberschatz, A., and R. Yang,
"P4P: Explicit Communications for Cooperative Control
Between P2P and Network Providers",
<http://www.dcia.info/documents/P4P_Overview.pdf>.
[WWW.wired.fuel]
Glasner, J., "P2P fuels global bandwidth binge",
<http://www.wired.com/techbiz/media/news/2005/04/67202>.
Authors' Addresses
Enrico Marocco
Telecom Italia
Via G. Reiss Romoli, 274
Turin 10148
Italy
Email: enrico.marocco@telecomitalia.it
Vijay K. Gurbani
Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane
Naperville, IL 60566
USA
Email: vkg@alcatel-lucent.com
Full Copyright Statement
Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).