Application Layer Traffic                                      N. Weaver
Optimization (ALTO) Working Group         International Computer Science
Internet-Draft                                                 Institute
Intended status: Informational                             March 4, 2009
Expires: September 5, 2009
Peer to Peer Localization Services and Edge Caches
draft-weaver-alto-edge-caches-00
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 5, 2009.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document.
Abstract
Without caches in the infrastructure, peer to peer content delivery's
primary effect is cost shifting rather than cost savings. Even with
Weaver Expires September 5, 2009 [Page 1]
Internet-Draft P2P Localization and Edge Caches March 2009
perfect localization, depending on the relative cost of last-mile
uplink bandwidth versus transport bandwidth, P2P may substantially
increase aggregate cost. Yet the addition of edge caches, caches
located in the ISPs near the customers, radically changes the
economics of P2P content delivery. Edge caches interact very
strongly with localization services for P2P content delivery, and any
localization service must be tightly integrated into edge-cache
operation.
Table of Contents
1. Introduction
2. The Design of Edge Caches
2.1. Safe Incentives for Edge Caches
3. An Economic Model for Delivery Costs
3.1. The Limits of Localization
4. Edge-Cache Interactions with Localization
5. Conclusions
6. Acknowledgements
7. IANA Considerations
8. Security Considerations
9. References
9.1. Normative References
9.2. Informative References
Author's Address
1. Introduction
When compared with conventional content delivery, peer to peer
content delivery of bulk data is effective at shifting costs from
the content provider to the ISPs, but can often significantly magnify
the aggregate cost of delivery. Depending on the particular costs to
an ISP, even perfect localization (restriction of P2P activity to
within the ISP's network) may still result in significantly higher
aggregate costs over conventional content delivery, although
localization does reduce transit costs.
However, if edge-caches are introduced into the architecture, the
economics can change radically. Rather than increasing transport
costs, P2P with ISP-provided edge caches reduces transport costs for
all parties, achieving cost reductions for the ISP analogous to those
seen with edge-based HTTP servers such as Akamai [akamai]. Yet
unlike edge-based web servers, edge-caches for P2P are failure-
transparent: when they fail, or do not have the right data, the
failure does not impact correct operation of the P2P system.
It is critical that ALTO or other localization services for bulk-data
P2P be edge-cache aware and assist edge-caches in their operation.
Localization without edge-caches may not produce significant cost
savings to the ISPs or performance benefits to the customers, while
edge-caches need localization services both to ease client discovery
and to provide the topological information necessary for edge-cache
operation.
This document begins with a brief discussion of edge caches for P2P
(Section 2), then outlines a simple cost model of content delivery
(Section 3), which argues why both localization and edge-caches are
necessary for cost-effective content delivery. It then discusses how
localization and edge-caches should interact (Section 4), before a
brief conclusions section (Section 5).
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2. The Design of Edge Caches
An edge-cache is simply a special P2P node which lives in the ISP's
network close to, but not at, the final recipients. Thus it incurs
no transit cost in communicating with ISP-local peers, is close to
them in latency, and has a high-bandwidth connection into the ISP's
internal network.
The role of an edge cache is to coordinate transfers between local
peers and the rest of the Internet, as well as to cache data for
subsequent use, within the existing or modified P2P protocol. For
example, a BitTorrent edge cache can participate in a swarm under two
rules. Before it has obtained the entire file, it exchanges data with
peers outside the ISP only on a tit-for-tat basis, neither seeding to
them nor purely leeching from them. Once it has a complete file, it
offers data only to ISP-local peers.
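The BitTorrent example above can be read as a small decision rule. The
following is a minimal sketch of one such reading, not a specification:
the names Action and upload_policy are hypothetical, and the choice to
serve ISP-local peers freely before the file is complete is an
assumption that the text does not state.

```python
from enum import Enum


class Action(Enum):
    SERVE = "serve"              # upload freely to this peer
    TIT_FOR_TAT = "tit_for_tat"  # upload only in exchange for blocks
    REFUSE = "refuse"            # do not upload to this peer


def upload_policy(peer_is_local: bool, cache_has_full_file: bool) -> Action:
    """Hypothetical edge-cache upload rule for a single swarm."""
    if peer_is_local:
        # Assumption: ISP-local peers are always served.
        return Action.SERVE
    if cache_has_full_file:
        # With a complete file, data is offered only to ISP-local peers.
        return Action.REFUSE
    # Still fetching: trade with outside peers strictly tit-for-tat.
    return Action.TIT_FOR_TAT
```

Under this reading, an outside peer sees the cache as an ordinary
trading partner while the cache is still downloading, and as absent
once the cache holds the whole file.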
One feature of an edge-cache is that it can be unreliable. Since,
from the point of view of the other peers, it is simply another P2P
participant, if the edge-cache lacks a block or a file, or fails
altogether, the P2P system will still work properly. This is in
sharp contrast to edge-based HTTP caches or CDNs, where a failure in
the node may result in failures visible to the user.
A side consequence of tolerating unreliability is that an edge-cache
can be inexpensive. For example, a 1U server (based on a Mini-ITX
motherboard) capable of holding 4 SATA disks might cost less than
$800. With a price of $130 for a 1.5 TB drive, an edge cache costing
less than $1400 could cache over 5 TB of data. Such a low-cost
system might suffer significantly higher transient failure rates than
a higher-quality server, necessitating a reboot, reimage, and
disabling of bad disks, but as failures are low-consequence, such
caches can be cheap to deploy.
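As a quick check of the figures quoted above, the arithmetic can be
written out as follows. The constant names are illustrative only, and
the $800 chassis price is the upper bound given in the text.

```python
# Figures quoted in the text (2009 prices); names are illustrative.
CHASSIS_COST_USD = 800    # upper bound for the 1U Mini-ITX server
DRIVE_COST_USD = 130      # price of one 1.5 TB SATA drive
DRIVE_CAPACITY_TB = 1.5
DRIVE_BAYS = 4

total_cost = CHASSIS_COST_USD + DRIVE_BAYS * DRIVE_COST_USD
raw_capacity = DRIVE_BAYS * DRIVE_CAPACITY_TB

print(total_cost, raw_capacity)  # 1320 6.0
```

The total of $1320 is under the $1400 bound, and the 6 TB of raw disk
is consistent with the claim of over 5 TB of cached data.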
Finally, a P2P edge-cache doesn't require changing existing P2P
protocols. As long as local peers will find the edge cache, or the
edge-cache can find the local peers, edge-caches can be introduced
into existing protocols without change. In particular, BitTorrent is
highly amenable to edge-caches without requiring client changes.
2.1. Safe Incentives for Edge Caches
The biggest impediment to building edge-caches is not technical but
legal. Given a P2P swarm, a single edge cache or collection of
caches should be able to monitor the swarm and find participants.
But an edge cache needs to be notified both about a particular P2P
swarm and that it is acceptable to cache the swarm.
A detailed discussion is outside the scope of this document, but
many possibilities exist, such as P2P content providers (for example,
distributors of Linux ISO images) registering their content, users of
the ISP asserting that a swarm is legitimate (and consenting to be
identified if a copyright holder objects), and agreements with
third-party data providers (such as Amazon S3) which support
BitTorrent and other P2P content distribution.
3. An Economic Model for Delivery Costs
For purposes of this discussion, we assume that different portions of
the network have different costs to transmit or receive one unit of
data. Although costs really vary by time of day and network
conditions (for example, the cost to an ISP of traffic on an
uncongested uplink on the last mile is effectively 0, but can be huge
if there is congestion, or peering arrangements may make the cost of
uplink transit negative), for simplicity we will ignore these effects
for now.
CP: This is the cost for the content provider to send one unit of
data.
CDN: This is the cost for the content provider to send one unit of
data through a third party, edge-based CDN.
CT: This is the cost for the ISP to receive one unit of data from the
general Internet.
CTU: This is the cost for the ISP to send one unit of data to the
general Internet.
CL: This is the cost for the ISP to send one unit of data to the end
customer across the last mile.
CLU: This is the cost for the ISP to receive one unit of data from an
end customer across the last mile.
With such a basic cost model, it becomes possible to estimate the
costs for different content delivery mechanisms.
Central (conventional) HTTP traffic: For such traffic, the content
provider pays N*CP, while the ISP pays N*(CT+CL), where N is the
number of requests. The costs increase linearly with N.
Edge-located HTTP content delivery networks (such as Akamai): For
such traffic, the content provider pays N*CDN, while the ISP pays
N*CL. This is obviously the best case for the ISP, but the cost of
the CDN may not be favorable to the content provider.
Conventional P2P without localization: If we assume the P2P system is
highly efficient, the content provider pays only CP regardless of the
number of users. The ISP will need to pay N*(CL + CLU) for all users
on the last mile, and some value less than N*(CT + CTU) for transit.
Conventional P2P with perfect localization: If the P2P system is
perfect, including localizing the traffic completely within the ISP,
the content provider pays only CP, while the ISP will need to pay
N*(CL + CLU) but only (CT + CTU) for transit.
Conventional P2P with perfect localization and perfect edge caches:
Adding in edge-caches changes the situation. Now the content
provider pays only CP, while the ISP pays N*CL + CT + CTU.
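The five cases above can be compared side by side with a short sketch.
The names UnitCosts and delivery_costs are hypothetical, the unit costs
in the usage lines are made-up illustrative values, and the unlocalized
P2P transit term uses the N*(CT + CTU) upper bound because the text
gives only "some value less than" that figure.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class UnitCosts:
    cp: float   # content provider sends one unit
    cdn: float  # content provider sends one unit via an edge CDN
    ct: float   # ISP receives one unit from the general Internet
    ctu: float  # ISP sends one unit to the general Internet
    cl: float   # ISP sends one unit down the last mile
    clu: float  # ISP receives one unit up the last mile


def delivery_costs(c: UnitCosts, n: int) -> Dict[str, Tuple[float, float]]:
    """Return {mechanism: (content provider cost, ISP cost)} for n requests."""
    return {
        "central_http": (n * c.cp, n * (c.ct + c.cl)),
        "edge_cdn": (n * c.cdn, n * c.cl),
        # Transit here is an upper bound; the real value is somewhat lower.
        "p2p_unlocalized": (c.cp, n * (c.cl + c.clu) + n * (c.ct + c.ctu)),
        "p2p_localized": (c.cp, n * (c.cl + c.clu) + (c.ct + c.ctu)),
        "p2p_localized_edge_cache": (c.cp, n * c.cl + c.ct + c.ctu),
    }


# Illustrative values only: last-mile uplink three times the other costs.
example = UnitCosts(cp=1, cdn=2, ct=1, ctu=1, cl=1, clu=3)
for name, (provider, isp) in delivery_costs(example, 100).items():
    print(f"{name}: provider={provider} isp={isp}")
```

With these example values the ISP pays 200 for central HTTP, 402 for
perfectly localized P2P, and 102 once edge caches are added, while the
content provider pays 1 in every P2P case.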
3.1. The Limits of Localization
Such a simple cost model illustrates the major limitation of
localization. If CLU, the cost of the last mile uplink, is more than
CT, the cost of the transit downlink, P2P can significantly increase
the costs to the ISP over conventional HTTP delivery, even with
perfect localization and perfect operation. For some networks, such
as DOCSIS cable modems, this is often the case, as increasing network
capacity on the shared last mile may require new infrastructure or
repurposing bandwidth otherwise used for higher-value services such
as television channels.
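The condition on CLU and CT follows directly from the ISP costs given
in Section 3. As a worked step, subtracting the ISP's cost for central
HTTP delivery from its cost for perfectly localized P2P gives:

```latex
\begin{align*}
\Delta_{\mathrm{ISP}}
  &= \bigl[ N\,(\mathit{CL} + \mathit{CLU}) + \mathit{CT} + \mathit{CTU} \bigr]
     - N\,(\mathit{CT} + \mathit{CL}) \\
  &= N\,(\mathit{CLU} - \mathit{CT}) + \mathit{CT} + \mathit{CTU}
\end{align*}
```

Assuming CT and CTU are non-negative (the model sets aside the
negative-cost peering case), the difference is positive whenever CLU
is at least CT, and it grows linearly with N when CLU exceeds CT.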
Yet the model also shows that if edge-caches are added into the
system, everybody sees a cost savings: both the content provider and
the ISP benefit
from lower cost, but without the reliability concerns present in
edge-based HTTP CDNs. Thus edge-caches represent the best of both
worlds: for a content provider, edge-caches in the P2P system have
the same low cost as a conventional P2P system, but for the ISP, the
edge-caches have the same low cost as an edge-located CDN.
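The ISP side of this claim can be checked against the costs in
Section 3. The following worked step compares the edge-cache case with
central HTTP delivery and with an edge-located CDN, using only the
symbols defined there:

```latex
\begin{align*}
\text{HTTP} - \text{edge cache}
  &= N\,(\mathit{CT} + \mathit{CL})
     - \bigl( N\,\mathit{CL} + \mathit{CT} + \mathit{CTU} \bigr)
   = (N - 1)\,\mathit{CT} - \mathit{CTU} \\
\text{edge cache} - \text{CDN}
  &= \bigl( N\,\mathit{CL} + \mathit{CT} + \mathit{CTU} \bigr)
     - N\,\mathit{CL}
   = \mathit{CT} + \mathit{CTU}
\end{align*}
```

Under this model the ISP saves relative to central HTTP as soon as
(N - 1)*CT exceeds CTU, and its cost exceeds that of an edge-located
CDN only by the one-time transit term CT + CTU, which does not grow
with N.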
4. Edge-Cache Interactions with Localization
Since edge-caches are critical to realizing the true potential of P2P
to create an aggregate cost savings, they need to be considered when
developing other portions of a common P2P infrastructure. In
particular, edge-caches both interact with and benefit from
localization services, so it is critical that localization and
edge-caching be codesigned to interoperate. The following edge-cache
concerns relate directly to localization.
Edge-cache discovery: Any localization service which supports the
discovery of "preferable" nodes should give preference to any
relevant edge-caches in the system. Thus the localization service
will drive traffic towards the relevant edge caches, resulting in
greater performance and lower cost-of-delivery.
Edge-cache content notification: Any localization service should also
act as a content notification channel, informing the edge-cache of a
user's desire to fetch a particular piece of content. The edge-cache
may use this information, along with other constraints and heuristics, to
determine whether it should participate in this distribution system.
For example, a particular ISP's edge-cache for BitTorrent could be
configured to cache torrents requested from Amazon S3 or other
sources based on a contractual relationship, but reject torrents
hosted elsewhere.
Peer-access control: The edge-cache, when contacted by a peer, needs
to know whether the peer is local to its network. Thus the
localization service should support queries from the edge cache as to
whether a peer would be considered local to the ISP.
Support for file descriptors: In order for both the localization
service and the edge-cache to track files as they are requested, ALTO
requests from peers should include both a per-file unique ID and a
variable length field containing the protocol's representation of the
file requested (e.g., for BitTorrent, the .torrent file). This has
some minor privacy implications, but greatly enhances both the
ability of localization to know which peers are involved in a
particular transfer and the ability of edge-caches to determine which
data to fetch.
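The four concerns above can be illustrated together with a toy
localization service. This is a sketch under stated assumptions and
not an ALTO message format: the names AltoRequest and
LocalizationService, the notifications list, and the three-tier
ranking are all hypothetical, and the addresses come from
documentation ranges.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AltoRequest:
    file_id: str                 # per-file unique ID
    descriptor: bytes            # protocol's own form, e.g. a .torrent file
    candidates: Tuple[str, ...]  # candidate peer IP addresses to rank


class LocalizationService:
    def __init__(self, isp_prefixes, edge_caches):
        self._prefixes = [ipaddress.ip_network(p) for p in isp_prefixes]
        self._edge_caches = set(edge_caches)
        # Content notifications that an edge cache could consume.
        self.notifications: List[Tuple[str, bytes]] = []

    def is_local(self, peer_ip: str) -> bool:
        """Peer-access control: is this peer inside the ISP's network?"""
        addr = ipaddress.ip_address(peer_ip)
        return any(addr in net for net in self._prefixes)

    def rank(self, request: AltoRequest) -> List[str]:
        """Edge caches first, then ISP-local peers, then everyone else."""
        self.notifications.append((request.file_id, request.descriptor))

        def tier(ip: str) -> int:
            if ip in self._edge_caches:
                return 0
            return 1 if self.is_local(ip) else 2

        # sorted() is stable, so the original order holds within a tier.
        return sorted(request.candidates, key=tier)


svc = LocalizationService(["192.0.2.0/24"], ["192.0.2.10"])
req = AltoRequest("file-1", b"torrent-bytes",
                  ("198.51.100.7", "192.0.2.55", "192.0.2.10"))
print(svc.rank(req))  # ['192.0.2.10', '192.0.2.55', '198.51.100.7']
```

Recording the file ID and descriptor on every ranking request is what
lets the same call double as content notification; whether an edge
cache then joins the swarm remains its own policy decision.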
5. Conclusions
Edge-caches are critical if P2P is to achieve the promised aggregate
cost savings. Without an edge-cache, localization's benefits are
limited, as even perfect localization is unable to reduce the
transfers over the last-mile uplink. Yet edge-caches also need to
rely on localization to drive traffic to the edge cache, to discover
new content, and to determine which peers are allowed to access the
edge-cache. Thus localization protocols should include
edge-caches in their focus, and edge-caches will need to use
localization protocols.
6. Acknowledgements
Grant info here. All opinions are those of the author, not the
funding institution.
Feedback on the general concept and economic models for P2P edge
caches was provided by Richard Woundy, Jason Livingood, Vern Paxson,
Christian Kreibich, and others.
7. IANA Considerations
None
8. Security Considerations
The privacy concerns of edge-caches and localization are only mild to
moderate. It is already possible for P2P nodes to observe what other
nodes are downloading or making available, and an edge-cache simply
represents another such node in the system. Any P2P system which
wishes to avoid this problem will not want to use localization
(because of the impacts on traffic analysis), and ISPs will not want
to cache such data (because most of the data will represent illegal
content).
This is also why localization services such as ALTO should have a
query interface that doesn't just give a list of IP addresses to
rank, but also has query modes which present ALTO with a UUID and a
content identifier, so a localization system can keep track of other
systems which have already requested the same content.
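A query mode of this kind could be sketched as follows. The class
RequestTracker and its query method are hypothetical, and reading the
UUID as the per-file unique ID from Section 4 is an assumption; the
text does not define the query format.

```python
from collections import defaultdict


class RequestTracker:
    """Remembers which systems have asked about each piece of content."""

    def __init__(self):
        # (file UUID, content identifier) -> requesters, in arrival order
        self._seen = defaultdict(list)

    def query(self, file_uuid, content_id, requester):
        """Record this request; return systems that asked for it earlier."""
        key = (file_uuid, content_id)
        earlier = [p for p in self._seen[key] if p != requester]
        if requester not in self._seen[key]:
            self._seen[key].append(requester)
        return earlier


tracker = RequestTracker()
tracker.query("uuid-1", b"descriptor", "192.0.2.1")
print(tracker.query("uuid-1", b"descriptor", "192.0.2.2"))  # ['192.0.2.1']
```

The stored mapping is exactly the information whose exposure this
section weighs, so a deployment would need to decide how long such
records are retained.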
9. References
9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
9.2. Informative References
[akamai] Akamai Inc, "The Akamai CDN", 2008,
<http://www.akamai.com>.
Author's Address
Nicholas Weaver
International Computer Science Institute
1947 Center Street suite 600
Berkeley, CA 94704
USA
Phone: +1 510 666 2903
Email: nweaver@icsi.berkeley.edu