One document matched: draft-deng-ppsp-bfbitmap-00.txt
Network Working Group L. Deng
Internet-Draft J. Peng
Intended status: Informational China Mobile
Expires: January 04, 2014 Y. Zhang
CoolPad
July 03, 2013
Efficient Chunk Availability Compression for PPSP
draft-deng-ppsp-bfbitmap-00.txt
Abstract
This document proposes to employ bloom filter algorithms in
compressing chunk availability information in exchange between peers
and the tracker through the PPSP-TP protocol and PPSPP protocol.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 04, 2014.
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Deng, et al. Expires January 04, 2014 [Page 1]
Internet-DraEfficient Chunk Availability Compression for PPSP July 2013
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Background on Bloom Filter . . . . . . . . . . . . . . . . . 3
3. BF-based Chunk Availability Exchange . . . . . . . . . . . . 4
3.1. A non-BF PPSP session . . . . . . . . . . . . . . . . . . 4
3.2. A PPSP Session with BF-bitmaps . . . . . . . . . . . . . 5
4. Open Issues . . . . . . . . . . . . . . . . . . . . . . . . . 7
4.1. Algorithm Configuration Setup . . . . . . . . . . . . . . 7
5. Security Considerations . . . . . . . . . . . . . . . . . . . 8
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
7. References . . . . . . . . . . . . . . . . . . . . . . . . . 8
7.1. Normative References . . . . . . . . . . . . . . . . . . 8
7.2. Informative References . . . . . . . . . . . . . . . . . 9
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 9
1. Introduction
As it is pointed out by [I-D.ietf-ppsp-problem-statement], current
practices often use a "bitmap" message in order to exchange chunk
availability. The message is of kilobytes in size and exchanged
frequently, e.g., an interval of several seconds or less.
To begin with, in a mobile environment with scarce bandwidth, the
message size may need to be shortened or it may require more
efficient methods for expressing and distributing chunk availability
information, which is different from wire-line P2P streaming.
Even in a wire-line P2P streaming application, frequent exchange of
large volume of bitmap information, is among the key factors that set
a limit to the system's efficiency and scalability [P2P-limit].
Therefore, the following requirements for PPSP protocols in terms of
chunk availability exchange are stated in
[I-D.ietf-ppsp-problem-statement] :
PPSP.TP.REQ-3: The tracker protocol MUST take the frequency of
messages and efficient use of bandwidth into consideration, when
communicating chunk availability information.
PPSP.PP.REQ-7: The peer protocol MUST take the frequency of
messages and efficient use of bandwidth into consideration, when
communicating chunk information.
To this end, we propose to employ bloom filter algorithms in
compressing chunk availability information in exchange between peers
and the tracker through the PPSP-TP protocol and PPSPP protocol, in
view of its widely demonstrated high compacted data structure and low
complexity and cost in storage, transportation and computation.
Deng, et al. Expires January 04, 2014 [Page 2]
Internet-DraEfficient Chunk Availability Compression for PPSP July 2013
2. Background on Bloom Filter
Bloom Filter (or BF for short) was first introduced in 1970s
[BF-bloom], which makes use of multiple hashing functions to build a
mapping from a set to a compact binary array, for the purpose of
highly efficient member queries with a tolerably low error rate of
wrongly reported hits. Despite their extraordinary efficiency in
terms of storage reduction and query acceleration, bloom filters
suffer from the fact that there is possibility that a non-member of
the set be taken as a member after the query. However, research
[BF-analysis] shows that the odds that a BF makes an erroneous query
response can be suppressed to near zero, by tactful selection and
configuration of various system parameters, including the hash
functions used, the number of hash functions to be used, and the
length of the bit array.
------------------------------------------------
BF(set S, integer m, hash set H)
1 filter=allocate m bits initialized to 0;
2 for each xi in S do
3 for each hash functions hi in H do
4 filter[hi(xi)]=1;
5 return filter;
-------------------------------------------------
MT(element elm, BF filter, integar m, hash set H)
1 for each hash functions hi in H do
2 if (filter[hi(elm)]!=1)
3 return false;
4 return true;
-------------------------------------------------
Figure 1: Basic algorithms for BF-bitmap
More specifically, as shown in Figure 1, the BF(S,m) algorithm takes
a n-membered sub-set S={x1,x2,...,xn} from a universal set U, as
input and outputs a m-bit binary array B as a compacted
representation of S. In order to do that, it makes use of k
independent random hash functions, each of which maps a member to a
bit in B (i.e hj: U-> [1,m], j=1...k) by marking all the k bits
mapped from each member of S. The BF algorithm is highly efficient in
the following aspects:
o It is quite simple and straightforward to generate the BF
representation of a set S, B=BF(S): initially, all the bits in B
is set to 0; then, for each member x of the set S, mark each bit
Deng, et al. Expires January 04, 2014 [Page 3]
Internet-DraEfficient Chunk Availability Compression for PPSP July 2013
in B, to which a hash function maps x (as shown in Figure 1 as the
BF algorithm).
o It is highly efficient to check whether or not a given element x
is in any BF-represented set B=BF(S): for each hash function hj,
check the value of B[hj(x)] against 1. It is always safe to
exclude the element x out of set S, once there is a zero-valued
hash bit, otherwise it is assumed that x is a member of S (as
shown in by the MT algorithm in Figure 1).
For instance, given a 2GB movie file, the original bitmap for a
sharing peer would be 1024-bit, if using 2MB-sized chunks. By simply
using 4 random hash functions and a 128-bit BF-bitmap, the
possibility of erroneous hits by MT algorithm would be lower than 3%.
As for a simple illustration, the 4 hash functions may be established
through the MD5 message-digest algorithm [RFC1321], a widely used
cryptographic hash function that produces a 128-bit (16-byte) hash
value from an arbitrary binary input. MD5 has been utilized in a
wide variety of security applications, and is also commonly used to
check data integrity.
Specifically, the processing of 4 hash functions is as follows: use
the MD5 algorithm to turn a given chunk_ID into a 128-bit binary
array, further separate the 128 bits into 4 arrays (32-bit each), and
finally divide each of them using 128 to yield four integers in the
range of [1,m].
3. BF-based Chunk Availability Exchange
3.1. A non-BF PPSP session
We borrow the general message flow (shown in Figure 2) from [I-D
.ietf-ppsp-base-tracker-protocol] to exemplify the usage of BF-bitmap
in PPSP protocols.
+--------+ +--------+ +--------+ +--------+ +-------+
| Player | | Peer 1 | | Portal | | Tracker| | Peer 2|
+--------+ +--------+ +--------+ +--------+ +-------+
| | | | |
|--Page request----------------->| | |
|<--------------Page with links--| | |
|--Select stream (MPD Request)-->| | |
|<--------------------OK+MPD(x)--| | |
|--Start/Resume->|--CONNECT(join x)------------>| |
|<-----------OK--|<----------------OK+Peerlist--| |
| | | |
|--Get(Chunk)--->|<---------- (Peer protocol) ------------->|
Deng, et al. Expires January 04, 2014 [Page 4]
Internet-DraEfficient Chunk Availability Compression for PPSP July 2013
|<--------Chunk--|<---------------------------------Chunks--|
: : : : :
| |--STAT_REPORT---------------->| |
| |<-------------------------Ok--| |
: : : : :
|--Get(Chunk)--->|<---------- (Peer protocol) ------------->|
|<--------Chunk--|<---------------------------------Chunks--|
: : : : :
Figure 2: A typical PPSP session for watching a streaming content.
When a peer wants to receive streaming of a selected content (Leech
mode):
1. Peer connects to a connection tracker and joins a swarm.
2. Peer acquires a list of other peers in the swarm from the
connection tracker.
3. Peer exchanges its content availability with the peers on the
obtained peer list (via peer protocol).
4. Peer identifies the peers with desired content.
5. Peer requests content from the identified peers (peer protocol).
When a peer wants to share streaming contents (Seeder mode) with
other peers:
1. Peer connects to the connection tracker.
2. Peer sends information to the connection tracker about the swarms
it belongs to (joined swarms).
3.2. A PPSP Session with BF-bitmaps
This document proposes to employ bloom filter algorithms in
compressing chunk availability information in exchange between peers
and the tracker through the PPSP-TP protocols and PPSPP protocol.
The key modifications to the current protocols are summarized as
follows: (as shown in Figure 3)
+--------+ +--------+ +--------+ +--------+ +-------+
| Player | | Peer 1 | | Portal | | Tracker| | Peer 2|
+--------+ +--------+ +--------+ +--------+ +-------+
| | | | |
(a1) |--Page request----------------->| | |
|<----Page with links(+BF conf)--| | |
|--Select stream (MPD Request)-->| | |
|<----------OK+MPD(x)(+BF conf)--| | |
(a2) |--Start/Resume->|--CONNECT(join x)------------>| |
Deng, et al. Expires January 04, 2014 [Page 5]
Internet-DraEfficient Chunk Availability Compression for PPSP July 2013
|<-----------OK--|<------OK(+BF conf)+Peerlist--| |
| | | |
(c1) | |<------------ HAVE(BF(S2))----------------|
(c2) |-Get(Chunk s1)->| | | |
(c3) | |------------- REQUEST(BF(s1))------------>|
|<--------Chunk--|<---------------------------------Chunks--|
: : : : :
(a3) | |-STAT_REPORT(BF(ContentMap))->| |
| |<-------------------------Ok--| |
: : : : :
|--Get(Chunk)--->|<---------- (Peer protocol) ------------->|
|<--------Chunk--|<---------------------------------Chunks--|
: : : : :
Figure 3: A typical PPSP session with BF-bitmaps.
a. Integration to the base TP protocol:
* (a1-a2)Configuration Setup: m, The length of the output bit
array and H, the hash functions in use, are system level
parameters that should be configured globally and stored at
the tracker and published to a joining peer through the TP
protocols.
* (a3)Peers use the BF(S,m,H) algorithm for compressing the
subset of locally stored and integrity verified chunks (set S)
in terms of a given swarm-ID, whenever reporting or updating
its chunk availability information with the tracker. The
length of each BF-bitmap is constant (O(m)).
* In response to a JOIN request from a peer, the tracker may
accompany the returned peer list with each recommended peer's
BF-formed chunk availability bitmap, as the initial guidance
for the requestor to start looking for neighbors in the same
swarm. The additional cost for bearing the chunk-level
availability information is constant (O(m)).
b. Integration to the extended TP protocol:
* Peers use the BF(S,m,H) algorithm for indicating its query
intention in the FIND request for a specific chunk sub set S'
of the given swarm to the tracker or the other peer. The
additional cost for bearing the chunk set is constant (O(m)).
Deng, et al. Expires January 04, 2014 [Page 6]
Internet-DraEfficient Chunk Availability Compression for PPSP July 2013
* In response to a FIND request with specific chunk set S' in
need from a peer, the tracker performs the subset containment
check on the query set parameter BF(S' against each registered
peer's chunk availability BF(S) by three simple binary
operations to decided whether or not include the peer into the
peer list in return: check if "F(S) equals (BF(S) bitwise OR
(BF(S) bitwise XOR BF(S'))" holds. The computation cost for
each subset check is constant (O(m)).
c. Integration to the peer protocol:
* Peers use the BF(S,m,H) algorithm for compressing the subset
of locally stored and integrity verified chunks (set S) in
terms of a given swarm-ID, whenever reporting or updating its
chunk availability information with another peer. The length
of each BF-bitmap is constant (O(m)).
* For a downloading peer to decide which neighbor to request for
a given chunk_ID s, it uses the member query algorithm
MT(s,bf,m,H) on each neighbor's BF-bitmap bf. The computation
cost for this member check is constant (O(m)).
4. Open Issues
4.1. Algorithm Configuration Setup
As stated earlier, the BF scheme is based on a mutual arrangement
between the information requestor and the responder of the basic
settings for the hash algorithms (both the number of them and the
specific ones) in use and the coded bitmap's binary length. In other
words, there SHOULD be a way of configuration setup mechanism in a
local system.
To serve as the input for further discussion, we provide two initial
proposals here:
Deng, et al. Expires January 04, 2014 [Page 7]
Internet-DraEfficient Chunk Availability Compression for PPSP July 2013
o Option1: Centralized Server for Uniform Configuration: The most
simple and straightforward way would be to set up a logically
centralized configuration server to do the trick. For instance,
the RELOAD base protocol introduces such a configuration server to
synchronize the hash function for the P2P DHT before a peer/client
joins in the overlay [I-D.ietf-p2psip-base]. The most simple and
straightforward way would be to set up a logically centralized
configuration server to do the trick. For instance, the RELOAD
base protocol introduces such a configuration server to
synchronize the hash function for the P2P DHT before a peer/client
joins in the overlay [I-D.ietf-p2psip-base]. There are two
potential ways to integrate into the base TP protocol's enrollment
and bootstrap process: The Publishing and Searching Portal could
serve as a configuration server and return the BF configuration
information to the peer through player, either
* via the page returned in response to a web page request, or
* via the MPD(Media Presentation Description) file in response to
a MPD request.
o Option2: Configuration Exchange as Joining in a SWARM: In view of
the interworking usage of PPSP as a generally accepted suite of
protocols to bridging different P2P systems, who may differ in
specific choice of hash functions and other parameters, there
SHOULD be a way of parameter negotiation mechanism across
different systems. Negotiation may also introduce flexibility in
a single system. E.g. large files or mobile peers may prefer more
compact way of exchanging this information. Therefore, the
tracker could include a swarm-specific BF configuration parameters
into the OK response to the JOIN request from a new-coming peer
(as labeled by (a2) in Figure 3).
5. Security Considerations
TBA
6. IANA Considerations
None.
7. References
7.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", RFC 2234, November 1997.
Deng, et al. Expires January 04, 2014 [Page 8]
Internet-DraEfficient Chunk Availability Compression for PPSP July 2013
7.2. Informative References
[BF-analysis]
Broder, A. and M. Mitzenmacher, "Network applications of
Bloom Filters: a survey", Internet Mathematics Vol. 1, No.
4, pp. 485-509, 2004.
[BF-bloom]
Bloom, B., "Space/time trade-offs in hash coding with
allowable errors.", Communications of ACM Vol. 13, No. 7,
pp. 422-426, 1970.
[I-D.ietf-huang-extended-tracker-protocol]
Huang, R., Zong, N., Cruz, R., Nunes, ., and J. Taveira,
"PPSP Tracker Protocol-Extended Protocol", draft-ietf-
huang-extended-tracker-protocol-02 (work in progress),
February 2013.
[I-D.ietf-p2psip-base]
Jennings, C., Lowekamp, B., Rescorla, E., Baset, S., and
H. Schulzrinne, "REsource LOcation And Discovery (RELOAD)
Base Protocol", February 2013.
[I-D.ietf-ppsp-base-tracker-protocol]
Cruz, R., Nunes, M., Gu, Y., Xia, J., and J. Taveira,
"PPSP Tracker Protocol-Base Protocol (PPSP-TP/1.0)",
draft-ietf-ppsp-base-tracker-protocol-00 (work in
progress), February 2013.
[I-D.ietf-ppsp-problem-statement]
Zhang, Y. and N. Zong, "Problem Statement and Requirements
of Peer-to-Peer Streaming Protocol (PPSP)", draft-ietf-
ppsp-problem-statement-12 (work in progress), January
2013.
[P2P-limit]
Feng, C., Li, B., and B. Li, "Understanding the
performance gap between pull-based mesh streaming
protocols and fundamental limits", in Proc. of IEEE
INFOCOM , 2009.
[RFC1321] Rivest, . and . Newport, "RFC 1321: The MD5 message-digest
algorithm", draft-ietf-p2psip-base-26 (work in progress),
April 1992.
Authors' Addresses
Deng, et al. Expires January 04, 2014 [Page 9]
Internet-DraEfficient Chunk Availability Compression for PPSP July 2013
Lingli Deng
China Mobile
Email: denglingli@chinamobile.com
Jin Peng
China Mobile
Email: pengjin@chinamobile.com
Yunfei Zhang
CoolPad
Email: hishigh@gmail.com
Deng, et al. Expires January 04, 2014 [Page 10]
| PAFTECH AB 2003-2026 | 2026-04-24 02:42:03 |