<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc1112 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.1112.xml'>
<!ENTITY rfc2119 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc3171 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3171.xml'>
<!ENTITY rfc3376 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3376.xml'>
<!ENTITY rfc3927 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3927.xml'>
<!ENTITY rfc3986 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3986.xml'>
<!ENTITY rfc4122 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4122.xml'>
<!ENTITY rfc4838 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4838.xml'>
<!ENTITY rfc5050 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5050.xml'>
<!ENTITY rfc5052 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5052.xml'>
<!ENTITY rfc5053 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5053.xml'>
<!ENTITY rfc5170 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5170.xml'>
<!ENTITY rfc5445 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5445.xml'>
<!ENTITY rfc5510 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5510.xml'>
<!ENTITY rfc6256 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6256.xml'>
]>
<rfc category="exp" ipr="trust200902" docName="draft-irtf-dtnrg-zinky-random-binary-fec-scheme-00">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes"?>
<?rfc iprnotified="no" ?>
<?rfc strict="no" ?>
<front>
<title abbrev="DTN-EC-Scheme"> Random Binary FEC Scheme for Bundle Protocol</title>
<author initials='J. A.' surname="Zinky" fullname='John Zinky'>
<organization>
Raytheon BBN Technologies
</organization>
<address>
<postal>
<street>10 Moulton St.</street>
<city>Cambridge</city> <region>MA</region>
<code>02138</code>
<country>US</country>
</postal>
<email>jzinky@bbn.com</email>
</address>
</author>
<author initials='A.' surname="Caro" fullname='Armando Caro'>
<organization>
Raytheon BBN Technologies
</organization>
<address>
<postal>
<street>10 Moulton St.</street>
<city>Cambridge</city> <region>MA</region>
<code>02138</code>
<country>US</country>
</postal>
<email>acaro@bbn.com</email>
</address>
</author>
<author initials='G.' surname='Stein' fullname='Gregory Stein'>
<organization>
Laboratory for Telecommunications Sciences
</organization>
<address>
<postal>
<street>8080 Greenmead Drive</street>
<city>College Park</city> <region>MD</region>
<code>20740</code>
<country>US</country>
</postal>
<email>gstein@ece.umd.edu</email>
</address>
</author>
<date/>
<abstract>
<t>
This document describes the Random Binary Forward Error
Correcting (FEC) Scheme for the Erasure Coding Extension
<xref target="ErasureCoding"/> to the DTN Bundle Protocol
<xref target="RFC5050"/>. The Random Binary FEC scheme is a
Fully-Specified FEC scheme adhering to the specification
guidelines of the FEC Building Block
<xref target="RFC5052"/>. The DTN Bundle protocol is used as
the Content Delivery Protocol. This FEC scheme is one of many
possible FEC schemes that may be used by the Erasure Coding
        Extension. The Random Binary FEC scheme has several properties
        that make it efficient in the case where Data Objects are
        divided into hundreds to thousands of source symbols and where
        the resources available for decoding are substantially
        greater than the resources available for encoding.
</t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>
        The Coding Layer of the Erasure Coding Extension encodes an
        ordered array of Chunks into an Encoding, which consists of
        Encoding Data and an Encoding Vector (the coding formula
        coefficients). The DTN Bundle protocol is used as the Content
        Delivery Protocol (CDP) to transfer the Encoding to the
        destination. When a sufficient number of Encodings have
        arrived, they are decoded and the resulting ordered array of
        Chunks is delivered to the Data Object layer at the
        destination.
</t>
<t>
For Random Binary Coding, Encoding Vectors are generated
randomly and are sent with the Encoding Data as a unit in the
same Bundle. This allows the encoding process and transfer
overhead to be relatively efficient at the cost of an
expensive decode process. Each Encoding is effectively a
repair symbol and carries the same potential information about
the Chunks. Any Encoding can be treated as equivalent to any
other Encoding. Thus, the DTN communication channel can be
extremely poor, and can reorder, delay, or drop a large
percentage of the Encodings. The only important factor is the
number of non-duplicate Encodings that arrive. Random Binary
coding can generate a large number (exponential in N) of
non-duplicate Encodings to compensate for huge drop rates,
even greater than 99% drops. Also, receipt feedback does not
have to acknowledge specific Encodings, but only has to
summarize the state of the received Encoding Set, such as the
expected number of Encodings that need to be received before
all Chunks can be decoded.
</t>
<t>
The Random Binary FEC Scheme may be configured to behave like
a wide variety of traditional FEC schemes by restricting
which Encodings are generated. Different Encoding restrictions
may be used depending on the expected conditions of the DTN
and the application transfer requirements. For example, in the
case where bundles are not reordered and the drop rate is low,
the system could be configured to behave like a block parity
FEC. Configuration options are described in <xref
target='configure'/>.
</t>
<t>
This document only addresses the Coding Layer of the Erasure
Coding Extension and follows the organization recommended in <xref
target="RFC5052"/>. The first section introduces the Random
Binary FEC definitions, notation, and formats. Following
sections describe procedures used to implement the scheme and
the actual process of encoding, decoding and recoding. An
additional section describes how to configure Random Binary
FEC to handle different DTN conditions and end user
requirements. The document ends with discussions on Security
and IANA considerations.
</t>
</section>
<section title="Terminology">
<t>
The terminology used in this document follows the
terminology of Erasure Coding Extension to the Bundle
Protocol <xref target="ErasureCoding"/> and the FEC Building
Block <xref target="RFC5052"/>. These documents have more
comprehensive descriptions for the concepts used in this
document.
</t>
<section title="Definitions">
<t>
<list style="hanging">
<t hangText="Data Octets"> is the array of octets that
stores the Data Object and meta data to be
transferred. The Data Octets array is treated as a whole
entity and has a UUID. The Data Object Layer divides the
Data Octets into an ordered array of equal length
Chunks, which are considered the input and output for
the Coding Layer. The Data Object Layer is responsible
for padding the last Chunk and storing the Data Object
length in the meta data.
</t>
<t hangText="Coding Formula">
maps Chunks onto Encoding Data. The
Random Binary FEC coding formula is a linear combination
of Chunks over the mathematical field GF(2).
<vspace blankLines='1' />
E = Vn-1 * Cn-1 + ... + Vi * Ci + ... + V0 * C0
<vspace blankLines='0' />
Where the
V's are the binary coefficients of the coding formula
and the C's are the Chunks. The coding formula is
identified by its array of binary coefficients (V) and
is stored in the Encoding Vector. The coding formula can
also be concisely represented as binary dot product
(see <xref target="notation"/>)
between the vector of coefficients and the vector of
Chunks.
<vspace blankLines='0' />
E = V dot C
</t>
<t hangText="Hamming Weight">
is the number of coefficients with a nonzero value in
an Encoding Vector. Encodings with low (sparse) Hamming
weights need only a few XOR operations to generate the
Encoding Data. Low Hamming weights are desirable for the
encoding process, because they consume fewer encoder
resources to generate.
</t>
<t hangText="Rank"> is a measure of independence for a set
of Encodings. An Encoding Set is a group of Encodings
that share the same UUID. The Encoding Vectors from the
Encoding Set can be combined as the rows of a KxN binary
matrix (S), where K is the number of Encodings in the
Encoding Set and N is the number of Chunks. The Rank of
this matrix is the number of linearly independent
Encodings in the Encoding Set. When an Encoding is added
        to an Encoding Set, it is called Innovative if it is
        not a linear combination of the already received
        Encodings, thereby raising the rank of the matrix S.
        Otherwise, it is called Redundant. If an Encoding Set
        has rank equal to the number of Chunks (N), then the
        Encoding Set has full rank and can be used to solve for
        all Chunks. Calculating the rank of an Encoding Set is
        discussed in <xref target="rank"/>.
</t>
<t hangText="Transmission Overhead">
is the expected number of extra Encodings received in
order to have a full rank Encoding Set. All the
Encodings generated by this FEC scheme can be considered repair
symbols, so when Encodings are added to an Encoding
Set, some of the Encodings may be redundant.
Transmission overhead is the expected number
of redundant Encodings before the Encoding Set can be
solved. </t>
</list>
</t>
</section>
<section anchor='notation' title="Notation">
<t>
<list style="hanging">
<t hangText="number_of_chunks (N or n)">
        is the number of Chunks.
</t>
<t hangText="chunk_length (L)">
is the number of octets in a Chunk. All Chunks for
Data Octets with the same UUID MUST have the same length.
</t>
<t hangText="Chunk Index (i)">
        ranges from 0 to N - 1.
</t>
<t hangText="Chunk (C)">
is an octet array of the raw data from Data Octets.
</t>
<t hangText="Encoding Data (E)">
is an octet array of the coding data.
</t>
<t hangText="Encoding Vector (V)">
is a binary array of coefficients that represents the coding
formula.
</t>
<t hangText="Encoding Set Matrix (S)">
is a binary matrix that is formed from N linearly independent
Encoding Vectors.
</t>
<t hangText="Bitwise Exclusive Or (XOR) ">
is an operator on two octet arrays.
Bitwise XOR is an expensive operation and efficient
implementations are discussed in <xref target='xor'/>.
</t>
<t hangText=" Binary Dot Product (dot)">
is an operator on a vector of binary coefficients and a
vector of octet arrays. The result
is an octet array. The Binary Dot Product XORs together
        the octet arrays that correspond to nonzero values in the
binary vector and ignores the octet arrays that correspond
to zeros.
</t>
</list>
</t>
<t>
The following table summarizes the corresponding terms used
in <xref target="RFC5052"/> and in
<xref target="ErasureCoding"/>.
</t>
<texttable anchor='summary_terms' title='Comparison of Terms'>
<ttcol align='left' >
RFC5052
</ttcol>
<ttcol align='left' >
DTN Erasure Coding
</ttcol>
<c>Source Symbol</c>
<c>Chunk </c>
<c>Repair Symbol </c>
<c>Encoding Data </c>
<c>Encoding Symbol </c>
<c>Chunk or Encoding Data </c>
<c>Encoding Symbol Length</c>
<c>Chunk Length</c>
<c>FEC Payload ID </c>
<c>Encoding Vector </c>
<c>Generator Matrix </c>
<c>Encoding Set Matrix </c>
</texttable>
</section>
<section title="Abbreviations">
<t>
        For convenience, the following abbreviations are used in this document.
<list>
<t>
BPA: Bundle Protocol Agent, see <xref target="RFC5050"/>
</t>
<t >
CDP: Content Delivery Protocol, see <xref target="RFC5052"/>
</t>
<t >
DTN: Delay/Disruption Tolerant Network, see <xref target="RFC5050"/>
</t>
<t >
FEC: Forward Error Correction, see <xref target="RFC5052"/>
</t>
<t >
SDNV: Self-Delimiting Numeric Values, see <xref target="RFC6256"/>
</t>
<t >
XOR: Exclusive or (math operation)
</t>
</list>
</t>
</section>
<section title="Requirements Notation">
<t>
The key words "MUST", "MUST NOT",
"REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be
interpreted as described in <xref target="RFC2119"/>.
</t>
</section>
</section>
<section title="Random Binary FEC Overview">
<t>
The Random Binary FEC scheme is a specific implementation of
the Coding Layer that defines the encoding, decoding, and recoding
process.
</t>
<t> The Random Binary FEC scheme encodes Chunks by creating the
linear combination of Chunks over the mathematical field
GF(2). In GF(2), the values of the coefficients are zero and
one, multiply is the AND Boolean operator, and addition is the
      XOR Boolean operator. A linear combination XORs together the
      Chunks whose coefficients are one in the
      Encoding Vector. In the general case, the coefficients of
an Encoding Vector are selected at random, so a large number
of Encodings can be generated (2^N), where N is the number of
Chunks. Every Encoding can be considered a repair symbol,
except for the degenerate cases where only one coefficient is
nonzero. In this case, only one Chunk is XORed into the
Encoding, so the Encoding acts as a source symbol.
</t>
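    <t>
      As a non-normative illustration, the encoding step above can be
      sketched in Python. The function name, the weight parameter, and
      the use of byte strings for Chunks are assumptions made for this
      example, not part of the specification.
    </t>
    <figure><artwork><![CDATA[
```python
import os, random

def encode(chunks, weight):
    """Generate one Encoding: a random binary Encoding Vector of the
    given Hamming weight, plus the XOR of the selected Chunks."""
    n, length = len(chunks), len(chunks[0])
    ones = random.sample(range(n), weight)   # indices with coefficient 1
    vector = [1 if i in ones else 0 for i in range(n)]
    data = bytearray(length)
    for i in ones:                           # GF(2) addition is XOR
        for j, b in enumerate(chunks[i]):
            data[j] ^= b
    return vector, bytes(data)

chunks = [os.urandom(16) for _ in range(8)]
vec, enc = encode(chunks, weight=3)          # odd weight (see Encoder)
```
]]></artwork></figure>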
<t> Decoding requires collecting enough Encodings to solve the
set of binary linear equations. To accomplish this, at least N
innovative Encodings must be collected in order to solve the
linear equations and decode all N Chunks. For Random Binary
coding, decoding is an expensive operation for which there is
no guarantee that the first N Encodings received will be
sufficient to decode all the Chunks, i.e. it is expected that
      some of the Encodings received will be redundant. Decoding is
      unlikely to recover any Chunks until roughly N distinct
      Encodings have arrived and will, with probability greater than
      1/2, be able to decode all Chunks when N + 2 distinct
      Encodings have arrived <xref target="Stud2006"/>. The
      arrival process is therefore fairly abrupt: no Chunks are
      delivered even though many Encodings have arrived, and then,
      around the time the Nth distinct Encoding arrives, all
      the Chunks can be decoded quickly.
</t>
<t>
In the case where the DTN network has very short contact times
relative to the transmission time of the Data Object and nodes
move randomly, Encodings for the same Data Object may take
radically different paths to the destination. Encodings may be
Recoded by Intermediate Recoders along the path, to reduce the
chance that duplicate Encodings will be delivered to the
destination from alternative paths. Transmissions to multiple
destinations also benefit, because the destinations do not
have to receive the same Encodings, just enough non-duplicate
Encodings.
</t>
</section>
<section title="Formats and Codes">
<t>
This section maps the concepts defined in the FEC Building
Blocks <xref target="RFC5052"/> to the Content Delivery
Protocol services offered by the Erasure Coding Extension of
the DTN Bundle Protocol <xref target="ErasureCoding"/>. This
FEC Scheme fully specifies the functions of a Coding Layer for
the Erasure Coding Extension. Encoding Vector and Encoding
Data are the two data structures used to represent an Encoding
at the Coding Layer. This section shows how they are formatted
into the Erasure Coding Extension Block and Bundle Payload for
delivery by the DTN Bundle Protocol. The section's
organization follows the FEC Building Blocks recommendation
for easy comparison with other FEC schemes.
</t>
<section title="FEC Payload ID">
<t>
As defined in the <xref target="RFC5052"/>, the FEC Payload
ID contains information that indicates to the FEC decoder the
relationships between the encoding symbols (Encoding Data)
carried by a particular packet (Bundle) and the FEC encoding
transformation. For the Random Binary FEC Scheme, the FEC
Payload ID is the Encoding Vector, which holds the
coefficients of the coding formula from all Chunks
to the particular Encoding Data. The raw Encoding Vector is a
binary array of coefficients that is the length of the
        number_of_chunks (N), which can be thousands of bits
long. <xref target='scheme_specific'/> defines how the
        Encoding Vector is efficiently formatted into the FEC Scheme
Parameters field of the Erasure Coding Extension Block.
</t>
</section>
    <section title="FEC Object Transmission Information">
<t>
The FEC Object Transmission Information contains information
which is essential to the decoder in order to decode the
encoded object.
</t>
      <section title="Mandatory">
<t>
<list style='hanging'>
          <t hangText="FEC Encoding ID"> is the ID for the type
of FEC Scheme. For the Erasure Coding Extension, the FEC
Encoding ID also defines the format of the FEC Scheme
Parameters field of the Erasure Coding Extension
Block. The Random Binary Scheme has four different
formats, so it has four different FEC Encoding IDs. The
FEC Encoding ID is stored in the FEC Scheme Type field
of the Erasure Coding Extension Block as an SDNV value.
<xref target='scheme_specific'/> defines the values of
FEC Encoding IDs used by the Random Binary FEC Scheme.
</t>
</list>
</t>
</section>
      <section title="Common">
<t>
The following parameters are common to all FEC Schemes.
</t>
<t>
<list style='hanging'>
<t hangText="Transfer-Length">
is the length of the array of unencoded Data Octets to be
            transferred. The transfer length is
not stored in a field in the bundle, but is
calculated by the formula:
<vspace blankLines='1' />
transfer_length = number_of_chunks * chunk_length
</t>
<t hangText="Encoding-Symbol-Length">
is the length of the octet array that can either hold
an Encoding Data (repair symbol) or a Chunk (source
symbol), i.e. the chunk_length. The Encoding Data is
            transferred as the Bundle Payload. So the
Encoding-Symbol-Length is the length of the Bundle
Payload and is not stored as an explicit field in the
Erasure Coding Extension Block.
</t>
<t hangText="Maximum-Source-Block-Length">
The Random Binary FEC scheme does not use Blocks, so does not
use Maximum Source Block Length.
</t>
<t hangText="Max-Number-of-Encoding-Symbols">
The Random Binary FEC scheme does not limit the number
            of Encoding Bundles that can be generated and sent by
            the Coding Layer. However, the Regulating Layer of the Erasure
            Coding Extension does define a Handling Specification
            field that could be used to limit the rate and amount
            of Redundant Encoding Bundles that are forwarded by
            the Intermediate Regulating Layer, effectively setting
            a limit on the Max-Number-of-Encoding-Symbols
            forwarded by Intermediate Nodes.
</t>
</list>
</t>
</section>
<section anchor='scheme_specific' title="Scheme-Specific">
<t>
<list style='hanging'>
<t hangText="Encoding Data">
(repair symbol) is the result of applying the coding
formula to the Chunks (source symbols). Encoding Data
            is an array of octets. Encoding Data is stored in the
Bundle Payload in highest octet index first order,
with no padding and no trailing zero. The Bundle
Payload only contains the Encoding Data octet array,
no additional payload header is used by the Random
Binary FEC scheme.
</t>
<t hangText="Encoding Vector"> is an array of binary
coefficients of the coding formula with length equal
to the number_of_chunks. The Encoding Vector is
stored into the Coding Parameters Field of the EC
Extension Block. Several formats are specified in this
document, each of which reduces the number of bits
needed to transmit the vector in certain cases. The
number of bits needed depends on the contents of the
vector, i.e., on the number and distribution of the
ones in the vector. An Encoding Vector MAY be
represented in any of the formats, but the choice of
format can dramatically reduce the length of the
Coding Parameter field. Encodings for the same Object
UUID MAY use different vector formats. Encoders SHOULD
dynamically choose the shortest format, when
constructing an Encoding Bundle. Decoders and Recoders
SHALL support all formats.
</t>
</list>
</t>
<section title="Full Binary Array: FEC Scheme Type = 1">
<t>
The most general Encoding Vector format is to send all
the binary coefficients as an array of octets. The
Encoding Vector binary coefficients are packed 8
coefficients to an octet. The lowest octet index and
bit index is 0. If the number of binary coefficients is
not a multiple of 8, padding bits are added to the
          highest indices using the value of zero. The resulting
octet array is sent with the highest index first.
</t>
<texttable anchor='full_vector' title='Full Vector'>
<ttcol width='20%' align='left' >
Field
</ttcol>
<ttcol align='center' >
Type
</ttcol>
<ttcol align='left' >
Description
</ttcol>
<c> Packed Binary Array </c>
<c>Octets </c>
<c>
All binary coefficients of the Encoding Vector packed into
an octet array.
</c>
</texttable>
<t>
The bit array has the following implicit parameters:
<vspace blankLines='1' />
octet_array_length = ceiling( number_of_chunks / 8 )
<vspace blankLines='1' />
octet_index = floor( coeff_index / 8 )
<vspace blankLines='1' />
bit_within_octet = coeff_index MOD 8
</t>
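        <t>
          A non-normative Python sketch of the packing rules above;
          treating bit index 0 as the least significant bit of an octet
          is an assumption made for this example.
        </t>
        <figure><artwork><![CDATA[
```python
def pack_vector(coeffs):
    """Pack binary coefficients 8 per octet, zero-padding the highest
    indices; bit 0 of an octet is taken to be its least significant
    bit (an assumption for this sketch)."""
    out = bytearray((len(coeffs) + 7) // 8)  # ceiling(n / 8)
    for i, c in enumerate(coeffs):
        if c:
            out[i // 8] |= 1 << (i % 8)      # octet floor(i/8), bit i MOD 8
    return bytes(out)

packed = pack_vector([1, 0, 0, 0, 0, 0, 0, 0, 1, 1])
```
]]></artwork></figure>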
</section>
      <section title="List of Chunk Indices: FEC Scheme Type = 2">
<t>
For Encoding Vectors with a low Hamming weight,
i.e. with few coefficients that have the value of one, a
          list of the vector indices for the ones reduces the
          parameter length. The list of indices starts with the
          list length, followed by the list of indices. All
numbers SHALL be in SDNV format. The index list has the
following format:
</t>
        <texttable anchor='list_indicies' title='List of Indices'>
<ttcol width='20%' align='left' >
Field
</ttcol>
<ttcol align='center' >
Type
</ttcol>
<ttcol align='left' >
Description
</ttcol>
<c >List Length</c>
<c> SDNV </c>
          <c> The number of indices in the Encoding Vector
with coefficient of one. </c>
<c> Index List </c>
<c>SDNV </c>
          <c> List of indices with coefficient value equal to
          one. The indices SHOULD be in order from smallest to
          largest. Duplicate indices SHALL NOT be sent by the
Encoder and SHALL be ignored by the Decoder. </c>
</texttable>
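        <t>
          A non-normative Python sketch of this format, using the SDNV
          encoding of <xref target="RFC6256"/> (7 value bits per octet,
          high bit set on every octet except the last); the function
          names are invented for this example.
        </t>
        <figure><artwork><![CDATA[
```python
def sdnv(value):
    """Encode a non-negative integer as an SDNV (RFC 6256)."""
    out = bytearray([value & 0x7F])          # last octet: high bit clear
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)    # earlier octets: high bit set
        value >>= 7
    return bytes(reversed(out))

def encode_index_list(indices):
    """Type 2 format sketch: List Length, then the indices, smallest first."""
    body = sdnv(len(indices))
    for i in sorted(indices):
        body += sdnv(i)
    return body

msg = encode_index_list([300, 5, 42])
```
]]></artwork></figure>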
</section>
<section title="Windowed Binary Array: FEC Scheme Type = 3">
<t>
When an Encoding Vector has its ones grouped in a single small
          range of indices, for example Windowed encoding <xref
          target="Stud2006"/>, a partial bit vector should be
          sent. The starting index is sent along with a bit array
          that contains all the coefficients that are ones.
The Windowed Octet Array has the following format:
</t>
<texttable anchor='partial_vector' title='Partial Vector'>
<ttcol width='20%' align='left' >
Field
</ttcol>
<ttcol align='center' >
Type
</ttcol>
<ttcol align='left' >
Description
</ttcol>
<c>Lowest Index </c>
<c> SDNV </c>
          <c> The lowest index value with a coefficient of one.
          Bit index zero of the Packed Binary Array
          is offset by this index. </c>
<c>Length Octet Array </c>
<c> SDNV </c>
<c>
octet_array_length = ceiling( (highest_index - lowest_index) / 8 )
</c>
<c>Packed Binary Array</c>
          <c>Octets </c>
<c> Encoding Vector packed into octet array using bit array
to octet array mapping from FEC Scheme Type 1. </c>
</texttable>
</section>
<section title="Finite Field Array: FEC Scheme Type = 4">
<t>
Random Binary FEC is a special case of a Random Finite
Field FEC, where the finite field is specifically GF(2),
instead of GF(2^m). In the general Random Finite Field
FEC, Encoding Vector coefficients are represented as a
binary number of length m. The finite field GF(2^m) is
          characterized by an irreducible polynomial from Section
8.1 of <xref target="RFC5510"/>. Higher values of m,
such as m=8, are useful in situations where the number
of Chunks is small and it is desirable to reduce the
number of redundant Encodings that are expected to be
received in order to get a full rank. Random Binary FEC
implementations MUST be able to interpret a Finite Field
Array with m=1, as a Full Binary Array.
</t>
<t>
For transmission, Encoding Vector coefficients are
packed into an array of bits. The lowest bit index and
coefficient index is 0. The coefficient with index 0 is
          packed with its least significant bit into bit
          index 0. Subsequent coefficients are concatenated to
          fill the bit array. The bit array is packed into an
          octet array. If the number of coefficients (n) times their
length (m) is not a multiple of 8, padding bits are
added to the highest bit positions using the value of
zero. The resulting octet array is sent with the highest
index first.
</t>
<texttable anchor='finite_field' title='Finite Field'>
<ttcol width='20%' align='left' >
Field
</ttcol>
<ttcol align='center' >
Type
</ttcol>
<ttcol align='left' >
Description
</ttcol>
<c> Finite Field Degree (m) </c>
<c>SDNV </c>
<c> Specifies the finite field of the form GF(2^m),
where m is the Finite Field Degree.
</c>
<c> Packed Coefficients Array </c>
<c>Octets </c>
<c> Encoding Vector packed into an octet array, with each
          coefficient represented as a binary number of length
m.
</c>
</texttable>
<t>
The octet array has the following implicit parameters:
<vspace blankLines='1' />
octet_array_length = ceiling( (number_of_chunks * m) / 8 )
<vspace blankLines='1' />
octet_index = floor( (coeff_index * m) / 8 )
<vspace blankLines='1' />
starting_bit_within_octet = (coeff_index * m) MOD 8
</t>
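        <t>
          A non-normative Python sketch of the coefficient packing
          described above; the function name is invented for this
          example, and transmission byte order is not modeled.
        </t>
        <figure><artwork><![CDATA[
```python
def pack_gf2m(coeffs, m):
    """Pack GF(2^m) coefficients, each an m-bit number, into an octet
    array: coefficient 0's least significant bit lands in bit index 0,
    and the highest bit positions are zero-padded."""
    bits = 0
    for i, c in enumerate(coeffs):
        bits |= c << (i * m)                     # concatenate m-bit values
    n_octets = (len(coeffs) * m + 7) // 8        # ceiling((n * m) / 8)
    return bits.to_bytes(n_octets, "little")     # bit 0 ends up in octet 0
```
]]></artwork></figure>
        <t>
          With m=1 this reduces to the Full Binary Array packing of FEC
          Scheme Type 1.
        </t>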
</section>
</section>
</section>
</section>
<section title="Procedures">
<t>
      The Random Binary FEC scheme uses the following procedures, which are
      common to the encode, decode, and recode processes.
</t>
<section anchor='xor' title="Bitwise XOR">
<t>
Encoding involves the XOR operation on multiple Chunks to
form an Encoding Data. Decoding involves the XOR operation
on multiple Encoding Datas to recover a Chunk. XORing two
octet arrays logically takes every bit in one array and
performs the XOR operation on the corresponding bit in the
other array. That is, the octet index and the bit position
within the octet are the same. The results are put into the
corresponding bit of a new array. Note that bits that do not
share the same index do not interact with each other. So
even though Chunks and Encoding Data are defined as octet
arrays, the bit-wise XOR can be implemented using any
convenient memory unit, such as byte, int or long.
</t>
<t>
The XOR operation is the most CPU intensive operation used
by this FEC scheme, so the number of XOR operations SHOULD
be minimized and the XOR operation implementation SHOULD be
efficient. To minimize XORs in the encoding process, a low
Hamming weight Encoding Vector SHOULD be used. To maximize
the efficiency of the XOR operation, the largest memory unit
available SHOULD be used, such as 64 bit long.
</t>
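      <t>
        A non-normative Python sketch of the bitwise XOR of two octet
        arrays; converting to Python's arbitrary-precision integers
        stands in here for the wide machine words (e.g. 64 bit longs)
        a C implementation would use.
      </t>
      <figure><artwork><![CDATA[
```python
def xor_arrays(a, b):
    """XOR two equal-length octet arrays.  Bits only interact with the
    bit at the same index, so any convenient word size may be used; a
    single big-integer XOR covers the whole array here."""
    assert len(a) == len(b)
    x = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return x.to_bytes(len(a), "big")

out = xor_arrays(b"\x0f\xf0", b"\xff\x00")
```
]]></artwork></figure>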
</section>
<section anchor='solve' title="Solve">
<t>
To solve the simultaneous equations to decode the Encodings
back into Chunks, the most general solution is to use
Gaussian Elimination to either invert the Encoding Set
matrix or to algebraically solve the equations directly.
The Encoding Vectors are
        used as rows to form an Encoding Set matrix (S).
The Encoding Data can be used to
form vector (e). The encoding process can be represented in
matrix notation as a vector of Encoding Data (e) that was
created by multiplying the Encoding Set matrix (S) by the vector
of Chunks (c).
        <vspace blankLines='0' />
        e = S c
        <vspace blankLines='0' />
        Gaussian Elimination can be used to calculate the inverse of
        the encoding matrix (S^-1). The Chunks can be
        recovered by multiplying the inverted encoding matrix by
        the vector of Encoding Datas.
        <vspace blankLines='0' />
        c = S^-1 e
        <vspace blankLines='0' />
        If the Hamming weights of the Encoding Vectors are
        low, and hence the Encoding Set matrix is sparse, solving the
        equations algebraically instead of using the matrix
        inversion usually results in fewer octet array XOR
        operations.
</t>
<t>
Gaussian Elimination is an expensive operation that involves
O(N^3) operations over the field GF(2) and O(N^2) XOR
operations on Encoding Data octet arrays. A large body
of research has been conducted to create efficient
algorithms to solve simultaneous equations and will not be
presented in this document, but SHOULD be exploited by
implementations of the Random Binary FEC scheme. Many of
these algorithms involve restricting the form of the
Encoding Vectors, with dramatic reductions in encoding or
decoding cost, but with other tradeoffs in terms of
reliability, bandwidth used, or other systemic factors.
<xref target='configure'/> discusses several options on how
to configure the encoding process to best match the tradeoffs.
</t>
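      <t>
        A non-normative sketch of Gaussian Elimination over GF(2) in
        Python. Encoding Vectors are represented as N-bit integers
        (bit i holds the coefficient of Chunk i) so that a GF(2) row
        operation is a single integer XOR; this representation is an
        implementation choice for the example, not a requirement.
      </t>
      <figure><artwork><![CDATA[
```python
def solve(vectors, datas, n):
    """Solve an Encoding Set: vectors are n-bit ints, datas are the
    matching Encoding Data byte strings.  Returns the n Chunks, or
    None if the set does not have full rank."""
    pivots = {}                                  # pivot index -> (vec, data)
    for vec, dat in zip(vectors, (bytearray(d) for d in datas)):
        for i in sorted(pivots):                 # forward elimination
            if vec >> i & 1:
                vec ^= pivots[i][0]
                dat = bytearray(x ^ y for x, y in zip(dat, pivots[i][1]))
        if vec == 0:
            continue                             # Redundant Encoding
        pivots[(vec & -vec).bit_length() - 1] = (vec, dat)
    if len(pivots) < n:
        return None                              # not yet full rank
    for i in sorted(pivots, reverse=True):       # back substitution
        for j in pivots:
            if j != i and pivots[j][0] >> i & 1:
                v, d = pivots[j]
                pivots[j] = (v ^ pivots[i][0],
                             bytearray(x ^ y
                                       for x, y in zip(d, pivots[i][1])))
    return [bytes(pivots[i][1]) for i in range(n)]
```
]]></artwork></figure>
      <t>
        A production implementation would exploit sparsity rather than
        this dense elimination, which favors clarity over speed.
      </t>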
</section>
<section anchor='rank' title="Rank">
<t>
The rank operation determines the number of linearly
        independent Encodings in an Encoding Set, i.e., the rank of the
Encoding Set matrix S. The rank operator
is less expensive than solving the whole Encoding Set. If
the rank is calculated incrementally as each Encoding is
inserted into its Encoding Set, then an insert has O(N^2)
operations in the field GF(2), but needs no XOR operations
on Encoding Data octet arrays. Given the reduced cost of the
rank operator, it can be used to determine which Encodings to
use in the solving process. It is also used to detect
redundant Encodings. As with solving, a large body of
research has been conducted to create efficient algorithms
to calculate rank and will not be presented in this
document, but SHOULD be exploited by implementations of the
Random Binary FEC scheme.
</t>
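      <t>
        A non-normative Python sketch of incremental rank tracking,
        again representing Encoding Vectors as N-bit integers; note
        that no Encoding Data octet arrays are touched.
      </t>
      <figure><artwork><![CDATA[
```python
class RankTracker:
    """Maintain a reduced basis of Encoding Vectors so each insert
    costs only GF(2) vector operations (integer XORs)."""
    def __init__(self):
        self.basis = {}                  # pivot bit index -> reduced vector

    def add(self, vec):
        """Insert a vector; True if Innovative, False if Redundant."""
        while vec:
            p = vec.bit_length() - 1     # highest set bit as pivot
            if p not in self.basis:
                self.basis[p] = vec
                return True
            vec ^= self.basis[p]         # reduce against existing pivot
        return False

    @property
    def rank(self):
        return len(self.basis)
```
]]></artwork></figure>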
</section>
</section>
<section title="Random Binary FEC code specification">
<t>
The Coding layer consists of an Encoder and Decoder at the end
points. Intermediate Recoders may also be used to generate new
Encodings from previously received Encodings to reduce the
chance that duplicate Encodings are propagated over different
paths to the destination.
</t>
<section anchor='encode' title="Encoder">
<t>
The encoding process transforms a linear combination of
Chunks into an Encoding. The encoding coefficients are
stored in the Encoding Vector. For the general Random Binary
Encoding the coefficients are generated randomly, for
        example by setting K indices to one and the rest to
        zero. The Hamming weight of the Encoding Vector SHOULD
        be low to reduce the number of bitwise XOR operations done
        by the Encoder process. Empirical measurements show a Hamming
        weight of O(log N) generates Encodings that are as diverse
        as using Hamming weights of O(N) <xref
        target="Stud2006"/>. One caution on generating Encoding
        Vectors: if all the Hamming weights for Encodings in an
        Encoding Set are even, then the Encoding Set cannot be
        solved <xref target="Stud2006"/>. A simple solution to avoid
this situation is to generate Encoding Vectors with odd
Hamming weights.
</t>
<t>
The Encoding Data (E) is generated by taking the binary dot
product of the Encoding Vector (V) and the vector of Chunks
(C). That is, the Encoding Data accumulates an octet array
that is the XOR of Chunks whose coefficient is one in the
Encoding Vector.
<vspace blankLines='0' />
E = V dot C
<vspace blankLines='0' />
</t>
</section>
<section anchor='decode' title="Decoder">
<t>
When the Encoding Set rank is equal to the number of Chunks
(N), then N linearly independent Encodings can be extracted
and used to solve for the Chunks, see
<xref target="solve"/>. If the encoding process is
restricted, then simplified decoding algorithms can be used
that exploit the restriction. The choice of decoding
algorithm is left to the implementation, but to support
interoperability, implementations MUST support the
unrestricted encoding process. For example, a decoder could
detect the pattern of Encodings and use the appropriate
decoder algorithm, but would default to Gaussian
Elimination.
</t>
<t>
Decoding can be done incrementally by partially solving the
decoding equations as the Encodings arrive. For some
configurations of the encoding process, such as Block
Parity, some Chunks can be solved before all N innovative
Encodings have arrived. In these cases, the Decoder MAY
deliver these Chunks to the Data Object layer before all
Chunks can be decoded.
</t>
</section>
<section anchor='recode' title="Intermediate Recoder">
<t>
Intermediate Recoders generate new Encodings from the
Encodings that they have already received. Recoding reduces
the chance that duplicate Encodings are delivered over
different paths to the destination. The recode operation
selects several Encodings from the Encoding Set at random.
The selected Encodings are combined to form a new Encoding.
The combination procedure is as follows:
<list>
<t>
The new Encoding Vector is the XOR of the coefficients of
all the selected Encoding Vectors
</t>
<t>
The new Encoding Data is the XOR of the octet arrays of
all the selected Encodings' Encoding Data.
</t>
</list>
</t>
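<t>
The combination procedure above can be sketched as follows
(non-normative; the (vector, data) representation of an
Encoding is an assumption of this example).
</t>
<figure>
<artwork><![CDATA[
```python
import random

def recode(encoding_set, k):
    # Combine k randomly selected Encodings into a new Encoding
    # by XORing their Encoding Vectors and their Encoding Data.
    selected = random.sample(encoding_set, k)
    n = len(selected[0][0])
    size = len(selected[0][1])
    vector = [0] * n
    data = bytearray(size)
    for v, d in selected:
        for i in range(n):
            vector[i] ^= v[i]
        for i in range(size):
            data[i] ^= d[i]
    return vector, bytes(data)
```
]]></artwork>
</figure>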
<t>
When two Encodings with Hamming weight less than N/2 are
combined, the resulting Hamming weight tends to be larger
than the originals'. Conversely, when two Encodings with
Hamming weight more than N/2 are combined, the resulting
Hamming weight tends to be smaller than the originals'.
Thus, the recoding process drives the Hamming weight towards
N/2. As Encoding Bundles are transferred across the DTN,
recoding can therefore undo any special configuration
restrictions put on the encoding process. Recoding SHOULD
have the option to be disabled as part of the Regulating
Layer Handling Specification.
</t>
<t>
Care should be given to the recoding process to ensure that
all Encodings in the Encoding Set are represented in the new
stream of recoded Encodings. If each new Encoding always
draws from the whole Encoding Set, then some Encodings will
be chosen less often than others, and their information will
not be propagated as much as that of the Encodings selected
more often. This problem is a form of the Coupon Collector's
Problem, which results in an Encoding Stream that needs up
to N log(N) Encodings, instead of only N + 2, to reach full
rank. One solution is to generate new Encodings in cycles:
each Encoding may be used only once during a cycle, and when
all Encodings have been used, a new cycle begins.
</t>
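<t>
The cycle-based selection described above can be sketched
as a generator (non-normative; the generator shape is an
assumption of this example).
</t>
<figure>
<artwork><![CDATA[
```python
import random

def cycle_source(encoding_set):
    # Yield Encodings so that every Encoding in the set is used
    # exactly once per cycle; a new shuffled cycle begins when
    # the current one is exhausted.
    while True:
        cycle = list(encoding_set)
        random.shuffle(cycle)
        for enc in cycle:
            yield enc
```
]]></artwork>
</figure>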
</section>
</section>
<section anchor='configure' title="Configure">
<t>
The Random Binary Scheme may be configured to
implement the following basic FEC schemes, all of which can
be represented by the formats in <xref
target="scheme_specific"/>. The configurations restrict
the coding formulas, which results in encoding streams
with different properties and potentially different
decoding algorithms. Some of these FEC schemes are described
in other RFCs and are fully specified here for the content
delivery protocol using DTN bundles.
</t>
<section title="Full Random Binary">
<t>
Full Random Binary is the generic configuration against
which the other configurations are compared.
Full Random Binary generates Encodings randomly over the
full range of possible Encoding Vectors. A new Encoding is
generated by randomly setting each coefficient in the
Encoding Vector to one with a probability of 1/2. The
resulting stream of Encodings has an average Hamming weight
of N/2, so the encoding process performs O(N) octet array
XORs. The received Encoding Set has no special structure, so
the decoder must use full Gaussian Elimination. The
algebraic solver algorithm has no advantage over matrix
inversion in this case, so the decoder process performs
O(N^2) octet array XORs. The expected transmission overhead
is only N + 1.6 when the number of Chunks is on the order of
1000s. Finally, the coefficients of the Encoding Vector
have no restrictions, so the Encoding Vector is packed into
the <xref target='full_vector'>full
vector format</xref>.
</t>
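<t>
The Full Random Binary generation rule can be sketched in
one line (non-normative).
</t>
<figure>
<artwork><![CDATA[
```python
import random

def full_random_vector(n):
    # Each coefficient is one independently with probability 1/2,
    # so the expected Hamming weight of the vector is N / 2.
    return [random.getrandbits(1) for _ in range(n)]
```
]]></artwork>
</figure>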
</section>
<section title="Windowed Erasure Codes">
<t>
With a simple restriction on how the random
coefficients are generated, the encoding and decoding costs
can be dramatically reduced while still maintaining the
low transmission overhead of the Full Random configuration
<xref target="Stud2006"/>.
</t>
<t>
The windowed configuration first restricts the Hamming
weight and then restricts the range of coefficients that
can be set to one to a "window" of consecutive indices. The
Hamming weight is restricted to 2 log(N), but should be
odd, so the encoder process is O(log N) instead of the O(N)
of Full Random. The window has a length of 2 sqrt(N), and
the offset is chosen randomly for each new Encoding, with
wrapping from the highest to the lowest index. With a slight
modification to the Gaussian Elimination algorithm, the
decoder can algebraically solve the Encoding Set with a
windowed matrix in O(N^2.5), instead of the O(N^3) for
Full Random. The transmission overhead remains close to
the N + 1.6 overhead of Full Random. Unfortunately,
unconstrained Recoding will disrupt the specialized form
of the windowed encoding matrix, causing decoding times
to again be O(N^3). Finally, the coefficients are
restricted to a window, so the Encoding Vector should be
packed into the <xref target='partial_vector'>partial
vector format</xref>.
</t>
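<t>
A non-normative sketch of the windowed vector generator,
assuming the 2 log(N) weight (forced odd) and 2 sqrt(N)
window described above; the exact rounding of these
parameters is an assumption of this example.
</t>
<figure>
<artwork><![CDATA[
```python
import math
import random

def windowed_vector(n):
    # Odd Hamming weight near 2*log2(N), with the ones confined
    # to a window of about 2*sqrt(N) consecutive indices that may
    # wrap from the highest index back to the lowest.
    weight = int(2 * math.log2(n)) | 1      # force odd weight
    window = max(weight, int(2 * math.sqrt(n)))
    offset = random.randrange(n)
    vector = [0] * n
    for p in random.sample(range(window), weight):
        vector[(offset + p) % n] = 1
    return vector
```
]]></artwork>
</figure>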
</section>
<section title="Compact No-code FEC Scheme">
<t>
The degenerate case of sending only Encodings with a Hamming
weight of one, i.e., only source symbols, can behave like an
additional fragmentation layer or as the test case named
"Compact No-code FEC Scheme" in <xref target="RFC5445"/>. In
this configuration, the encoding and decoding processes
perform no work, but the system is not protected against
dropped bundles. Since the Encoding Vector has only one
coefficient with a value of one, its index should be packed
into the
<xref target='list_indicies'>list of indices format</xref>.
</t>
</section>
<section title="Block Parity">
<t>
To show the flexibility of the Random Binary FEC scheme, the
classic block parity FEC scheme described in
<xref target="RFC5445"/> can be fully specified for the DTN
content delivery protocol using DTN bundles. As in the
Compact No-code FEC Scheme, source symbols are sent
unencoded, with their Chunk indices packed into
the <xref target='list_indicies'>list of indices
format</xref>. The block parity repair symbol has all the
coefficients in the block set to one and
uses the <xref target='partial_vector'>partial vector
format</xref>. The normal incremental decoder will
automatically detect source symbols as solved. The parity
repair symbol is applied if any source symbols are dropped,
or treated as redundant if no bundles were dropped in the
block. Chunks may be delivered as they arrive. The Block
Parity FEC scheme is practical in the case where dropped
bundles are rare and not bursty.
</t>
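<t>
A non-normative sketch of a Block Parity encoder: each source
Chunk is emitted unencoded, followed by one repair Encoding
that is the XOR of the whole block. The (vector, data)
representation is an assumption of this example.
</t>
<figure>
<artwork><![CDATA[
```python
def block_parity_encodings(chunks):
    # Emit each source Chunk with a unit Encoding Vector, then one
    # repair Encoding whose vector has every coefficient in the
    # block set to one (the XOR of all Chunks).
    n = len(chunks)
    encodings = []
    for i, chunk in enumerate(chunks):
        vector = [1 if j == i else 0 for j in range(n)]
        encodings.append((vector, bytes(chunk)))
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for j, b in enumerate(chunk):
            parity[j] ^= b
    encodings.append(([1] * n, bytes(parity)))
    return encodings
```
]]></artwork>
</figure>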
</section>
</section>
<section title="Security Considerations">
<t>
No additional security considerations have been identified
beyond those described in <xref target="ErasureCoding"/>.
</section>
<section title="IANA Considerations">
<t>
The Random Binary Scheme uses three FEC Encoding IDs. The
assigned IDs should be less than 128 in order to fit into
one byte using SDNV values. The reference implementation
uses the following FEC Scheme Types:
<list>
<t>
Full Binary Array = 1
</t>
<t>
List of Chunk Indices = 2
</t>
<t>
Windowed Binary Array = 3
</t>
<t>
Finite Field Array = 4
</t>
</list>
</t>
</section>
</middle>
<back>
<references title="Normative References">
&rfc2119;
&rfc5050;
&rfc5052;
&rfc5510;
&rfc6256;
<reference anchor='ErasureCoding'>
<front>
<title>Bundle Protocol Erasure Coding Extension</title>
<author initials='J.' surname='Zinky'
fullname='John Zinky'>
<organization abbrev='BBN'>
Raytheon BBN Technologies
</organization>
</author>
<author initials='A.' surname='Caro'
fullname='Armando Caro'>
<organization abbrev='BBN'>
Raytheon BBN Technologies
</organization>
</author>
<author initials='G.' surname='Stein'
fullname='Gregory Stein'>
<organization abbrev='LTS'>
Laboratory for Telecommunications Sciences
</organization>
</author>
<date month='Aug' year='2012' />
</front>
<seriesInfo name='Internet-Draft' value='draft-irtf-dtnrg-zinky-erasure-coding-extension-00' />
</reference>
</references>
<references title="Informative References">
&rfc5445;
<reference anchor='Stud2006'>
<front>
<title>Windowed Erasure Codes</title>
<author initials='C.' surname='Studholme'
fullname='Chris Studholme'>
<organization abbrev='utoronto'>
University of Toronto
</organization>
</author>
<author initials='I.' surname='Blake'
fullname='Ian Blake'>
<organization abbrev='utoronto'>
University of Toronto
</organization>
</author>
<date month='July' year='2006' />
</front>
<seriesInfo name='IEEE' value='International Symposium on Information Theory' />
</reference>
</references>
</back>
</rfc>