<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc1112 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.1112.xml'>
<!ENTITY rfc2119 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc3171 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3171.xml'>
<!ENTITY rfc3376 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3376.xml'>
<!ENTITY rfc3927 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3927.xml'>
<!ENTITY rfc3986 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3986.xml'>
<!ENTITY rfc4122 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4122.xml'>
<!ENTITY rfc4838 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4838.xml'>
<!ENTITY rfc5050 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5050.xml'>
<!ENTITY rfc5052 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5052.xml'>
<!ENTITY rfc5053 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5053.xml'>
<!ENTITY rfc5170 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5170.xml'>
<!ENTITY rfc5445 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5445.xml'>
<!ENTITY rfc5510 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5510.xml'>
<!ENTITY rfc6256 PUBLIC '' 
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6256.xml'>
	  ]>

<rfc category="exp" ipr="trust200902" docName="draft-irtf-dtnrg-zinky-random-binary-fec-scheme-00">
  <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
  <?rfc toc="yes" ?>
  <?rfc symrefs="yes" ?>
  <?rfc sortrefs="yes"?>
  <?rfc iprnotified="no" ?>
  <?rfc strict="no" ?>

  <front>
    <title abbrev="DTN-EC-Scheme"> Random Binary FEC Scheme for Bundle Protocol</title>

   <author initials='J. A.' surname="Zinky" fullname='John Zinky'>
      <organization>
	Raytheon BBN Technologies
      </organization>
      <address>
	<postal>
          <street>10 Moulton St.</street>
          <city>Cambridge</city> <region>MA</region>
          <code>02138</code>
          <country>US</country>
	</postal>
	<email>jzinky@bbn.com</email>
      </address>
    </author>

    <author initials='A.' surname="Caro" fullname='Armando Caro'>
      <organization>
	Raytheon BBN Technologies
      </organization>
      <address>
	<postal>
          <street>10 Moulton St.</street>
          <city>Cambridge</city> <region>MA</region>
          <code>02138</code>
          <country>US</country>
	</postal>
	<email>acaro@bbn.com</email>
      </address>
    </author>

  <author initials='G.' surname='Stein' fullname='Gregory Stein'>
      <organization>
	Laboratory for Telecommunications Sciences
      </organization>
      <address>
	<postal>
          <street>8080 Greenmead Drive</street>
          <city>College Park</city> <region>MD</region>
          <code>20740</code>
          <country>US</country>
	</postal>
	<email>gstein@ece.umd.edu</email>
      </address>
    </author>
    <date/>

    <abstract>
      <t>
	This document describes the Random Binary Forward Error
	Correcting (FEC) Scheme for the Erasure Coding Extension
	<xref target="ErasureCoding"/> to the DTN Bundle Protocol
	<xref target="RFC5050"/>.  The Random Binary FEC scheme is a
	Fully-Specified FEC scheme adhering to the specification
	guidelines of the FEC Building Block
	<xref target="RFC5052"/>. The DTN Bundle protocol is used as
	the Content Delivery Protocol. This FEC scheme is one of many
	possible FEC schemes that may be used by the Erasure Coding
	Extension. The Random Binary FEC scheme has several properties
	that make it efficient in the case where Data Objects are
	divided into hundreds to thousands of source symbols and where
	the resources available for decoding are substantially
	greater than the resources available for encoding.
      </t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>
	The Coding Layer of the Erasure Coding Extension encodes an
	ordered array of Chunks into an Encoding, which consists of
	Encoding Data and an Encoding Vector (coding formula
	coefficients).  The DTN Bundle protocol is used as the Content
	Delivery Protocol (CDP) to transfer the Encoding to the
	destination. When a sufficient number of Encodings have
	arrived, they are decoded and the resulting ordered array of
	Chunks is delivered to the Data Object layer at the
	destination.
      </t>
      <t>
	For Random Binary Coding, Encoding Vectors are generated
	randomly and are sent with the Encoding Data as a unit in the
	same Bundle.  This allows the encoding process and transfer
	overhead to be relatively efficient at the cost of an
	expensive decode process. Each Encoding is effectively a
	repair symbol and carries the same potential information about
	the Chunks. Any Encoding can be treated as equivalent to any
	other Encoding. Thus, the DTN communication channel can be
	extremely poor, and can reorder, delay, or drop a large
	percentage of the Encodings. The only important factor is the
	number of non-duplicate Encodings that arrive. Random Binary
	coding can generate a large number (exponential in N) of
	non-duplicate Encodings to compensate for very high drop rates,
	even rates above 99%. Also, receipt feedback does not
	have to acknowledge specific Encodings, but only has to
	summarize the state of the received Encoding Set, such as the
	expected number of Encodings that need to be received before
	all Chunks can be decoded.
      </t>
      <t>
	The Random Binary FEC Scheme may be configured to behave like
	a wide variety of traditional FEC schemes by restricting
	which Encodings are generated. Different Encoding restrictions
	may be used depending on the expected conditions of the DTN
	and the application transfer requirements. For example, in the
	case where bundles are not reordered and the drop rate is low,
	the system could be configured to behave like a block parity
	FEC. Configuration options are described in <xref
	target='configure'/>.
      </t>
      <t>
	This document only addresses the Coding Layer of the Erasure
	Coding Extension and follows the organization recommended in <xref
	target="RFC5052"/>. The first section introduces the Random
	Binary FEC definitions, notation, and formats. Following
	sections describe procedures used to implement the scheme and
	the actual process of encoding, decoding and recoding. An
	additional section describes how to configure Random Binary
	FEC to handle different DTN conditions and end user
	requirements. The document ends with discussions on Security
	and IANA considerations.
      </t>
    </section>

    <section title="Terminology">
	<t>
	  The terminology used in this document follows the
	  terminology of Erasure Coding Extension to the Bundle
	  Protocol <xref target="ErasureCoding"/> and the FEC Building
	  Block <xref target="RFC5052"/>. These documents have more
	  comprehensive descriptions for the concepts used in this
	  document.
	</t>

      <section title="Definitions">
	<t>
	  <list style="hanging">
	    <t hangText="Data Octets"> is the array of octets that
	      stores the Data Object and meta data to be
	      transferred. The Data Octets array is treated as a whole
	      entity and has a UUID. The Data Object Layer divides the
	      Data Octets into an ordered array of equal length
	      Chunks, which are considered the input and output for
	      the Coding Layer.  The Data Object Layer is responsible
	      for padding the last Chunk and storing the Data Object
	      length in the meta data.
	    </t>
	    <t hangText="Coding Formula">
	      maps Chunks onto Encoding Data.  The
	      Random Binary FEC coding formula is a linear combination
	      of Chunks over the mathematical field GF(2).
	      <vspace blankLines='1' /> 
	      E = Vn-1 * Cn-1 + ... + Vi * Ci + ... + V0 * C0 
	      <vspace blankLines='0' /> 
	      Where the
	      V's are the binary coefficients of the coding formula
	      and the C's are the Chunks. The coding formula is
	      identified by its array of binary coefficients (V) and
	      is stored in the Encoding Vector. The coding formula can
	      also be concisely represented as binary dot product 
	      (see  <xref target="notation"/>)
	      between the vector of coefficients and the vector of
	      Chunks. 
	      <vspace blankLines='0' /> 
	      E = V dot C
	    </t>
	    <t hangText="Hamming Weight">
	      is the number of coefficients with a nonzero value in
	      an Encoding Vector. Encodings with low (sparse) Hamming
	      weights need only a few XOR operations to generate the
	      Encoding Data. Low Hamming weights are desirable for the
	      encoding process, because they consume fewer encoder
	      resources to generate.
	    </t>
	    <t hangText="Rank"> is a measure of independence for a set
	      of Encodings.  An Encoding Set is a group of Encodings
	      that share the same UUID.  The Encoding Vectors from the
	      Encoding Set can be combined as the rows of a KxN binary
	      matrix (S), where K is the number of Encodings in the
	      Encoding Set and N is the number of Chunks. The Rank of
	      this matrix is the number of linearly independent
	      Encodings in the Encoding Set. When an Encoding is added
	      to an Encoding Set, it is called Innovative, if it is
	      not a linear combination of the already received
	      encodings, thereby raising the rank of the matrix S.
	      Otherwise, it is called Redundant.  If an Encoding Set
	      has rank equal to the number of Chunks (N), then the
	      Encoding Set has full rank and can be used to solve for
	      all Chunks. Calculating the rank of an Encoding Set is
	      discussed in <xref target="rank"/>.
	    </t>
	    <t hangText="Transmission Overhead"> 
	      is the expected number of extra Encodings received in
	      order to have a full rank Encoding Set. All the
	      Encodings generated by this FEC scheme can be considered repair
	      symbols, so when Encodings are added to an Encoding
	      Set, some of the Encodings may be redundant. 
	      Transmission overhead is the expected number
	      of redundant Encodings before the Encoding Set can be
	      solved.  </t>
	  </list>
	</t>
      </section>

      <section anchor='notation' title="Notation">
	<t>
	  <list style="hanging">

	    <t hangText="number_of_chunks (N or n)"> 
	      is the number of Chunks.
	    </t>
	    <t hangText="chunk_length (L)"> 
	      is the number of octets in a Chunk. All Chunks for 
	      Data Octets with the same UUID MUST have the same length.
	    </t>
	    <t hangText="Chunk Index (i)"> 
	      ranges from 0 to N - 1.
	    </t>
	    <t hangText="Chunk (C)"> 
	      is an octet array of the raw data from Data Octets.
	    </t>
	    <t hangText="Encoding Data (E)"> 
	      is an octet array of the coding data.
	    </t>
	    <t hangText="Encoding Vector (V)"> 
	      is a binary array of coefficients that represents the coding
	      formula.
	    </t>
	    <t hangText="Encoding Set Matrix (S)"> 
	      is a binary matrix that is formed from N linearly independent
	      Encoding Vectors.
	    </t>
	    <t hangText="Bitwise Exclusive Or (XOR)">
	      is an operator on two octet arrays.
	      Bitwise XOR is an expensive operation and efficient
	      implementations are discussed in <xref target='xor'/>.
	    </t>
	    <t hangText="Binary Dot Product (dot)"> 
	      is an operator on a vector of binary coefficients and a
	      vector of octet arrays.  The result
	      is an octet array. The Binary Dot Product XORs together
	      the octet arrays that correspond to nonzero values in the
	      binary vector and ignores the octet arrays that correspond
	      to zeros.
	    </t>
	  </list>
	</t>
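	<t>
	  The Binary Dot Product defined above can be sketched in a
	  few lines of Python. This is an illustration only, not part
	  of the specification: the function names xor_octets and
	  binary_dot are hypothetical, and the Encoding Vector is
	  assumed to be held as a list of 0/1 integers aligned with
	  the list of Chunks.
	</t>
	<figure>
	  <artwork><![CDATA[
```python
from functools import reduce


def xor_octets(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length octet arrays."""
    if len(a) != len(b):
        raise ValueError("octet arrays must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def binary_dot(vector: list, chunks: list) -> bytes:
    """E = V dot C: XOR the Chunks whose coefficient is one.

    Assumption: vector is a list of 0/1 integers and chunks is a
    list of equal-length bytes objects, both of length N.
    """
    if len(vector) != len(chunks):
        raise ValueError("vector and chunk list must have length N")
    chunk_length = len(chunks[0])
    selected = [c for v, c in zip(vector, chunks) if v]
    # Start from an all-zero array so an all-zero vector is handled.
    return reduce(xor_octets, selected, bytes(chunk_length))


chunks = [b"\x0f\x0f", b"\xf0\x01", b"\xff\x00"]
encoding_data = binary_dot([1, 0, 1], chunks)  # C0 XOR C2
```
]]></artwork>
	</figure>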
	<t>
	  The following table summarizes the corresponding terms used
	  in <xref target="RFC5052"/> and in
	  <xref target="ErasureCoding"/>.
	</t>
	  <texttable anchor='summary_terms' title='Comparison of Terms'>
	      <ttcol align='left' > 
		  RFC5052
	      </ttcol>
	      <ttcol align='left' > 
		 DTN Erasure Coding
	      </ttcol>
	      <c>Source Symbol</c>
	      <c>Chunk </c>
	      <c>Repair Symbol </c>
	      <c>Encoding Data </c>
	      <c>Encoding Symbol </c>
	      <c>Chunk or Encoding Data  </c>
	      <c>Encoding Symbol Length</c>
	      <c>Chunk Length</c>
	      <c>FEC Payload ID </c>
	      <c>Encoding Vector </c>
	      <c>Generator Matrix </c>
	      <c>Encoding Set Matrix </c>
	    </texttable>
      </section>
      <section title="Abbreviations">
	<t>
	  For convenience, the following abbreviations are used in this document.
	<list>
	    <t> 
	      BPA: Bundle Protocol Agent,  see <xref target="RFC5050"/>
	    </t>
	    <t > 
	      CDP: Content Delivery Protocol, see <xref target="RFC5052"/>
	    </t>
	    <t > 
	      DTN: Delay/Disruption Tolerant Network, see <xref target="RFC5050"/>
	    </t>
	    <t > 
	      FEC: Forward Error Correction, see <xref target="RFC5052"/>
	    </t>
	    <t > 
	      SDNV: Self-Delimiting Numeric Values, see <xref target="RFC6256"/>
	    </t>
	    <t > 
	      XOR: Exclusive or (math operation)
	    </t>

	</list>
	</t>
      </section>
      <section title="Requirements Notation">
	<t>
	  The key words "MUST", "MUST NOT",
	  "REQUIRED", "SHALL", "SHALL NOT",
	  "SHOULD", "SHOULD NOT", "RECOMMENDED",
	  "MAY", and "OPTIONAL" in this document are to be
	  interpreted as described in <xref target="RFC2119"/>.
	</t>
      </section>
    </section>


    <section title="Random Binary FEC Overview">
      <t>
	The Random Binary FEC scheme is a specific implementation of
	the Coding Layer that defines the encoding, decoding, and recoding
	process.  
      </t>
      <t> The Random Binary FEC scheme encodes Chunks by creating the
	linear combination of Chunks over the mathematical field
	GF(2). In GF(2), the values of the coefficients are zero and
	one, multiply is the AND Boolean operator, and addition is the
	XOR Boolean operator. A linear combination uses the XOR
	operator to sum together the Chunks whose coefficients are one
	in the Encoding Vector. In the general case, the coefficients of
	an Encoding Vector are selected at random, so a large number
	of Encodings can be generated (2^N), where N is the number of
	Chunks. Every Encoding can be considered a repair symbol,
	except for the degenerate cases where only one coefficient is
	nonzero. In this case, only one Chunk is XORed into the
	Encoding, so the Encoding acts as a source symbol.
      </t>
      <t> Decoding requires collecting enough Encodings to solve the
	set of binary linear equations. To accomplish this, at least N
	innovative Encodings must be collected in order to solve the
	linear equations and decode all N Chunks.  For Random Binary
	coding, decoding is an expensive operation, and there is
	no guarantee that the first N Encodings received will be
	sufficient to decode all the Chunks, i.e., it is expected that
	some of the Encodings received will be redundant. Decoding is
	unlikely to recover any Chunks until roughly N distinct
	Encodings have arrived and will, with probability greater than
	1/2, be able to decode all Chunks when N + 2 distinct
	Encodings have arrived <xref target="Stud2006"/>. Decoding
	progress is therefore abrupt: no Chunks are delivered
	while the first Encodings arrive, and then all the Chunks
	can be decoded at around the time the Nth distinct
	Encoding arrives.
      </t>
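      <t>
	As a rough sanity check on the decoding threshold, the
	following Python sketch computes the probability that a set
	of Encoding Vectors has full rank. It rests on a simplifying
	assumption that is not made by this document: every Encoding
	Vector is drawn uniformly at random from all 2^N binary
	vectors, so duplicates and the all-zero vector are possible.
	Encoders that restrict the Hamming weight will behave
	differently. The function name full_rank_probability is
	hypothetical.
      </t>
      <figure>
	<artwork><![CDATA[
```python
def full_rank_probability(num_encodings: int, num_chunks: int) -> float:
    """Probability that K uniformly random GF(2) vectors of length N
    have rank N (K = num_encodings, N = num_chunks).

    Uses the standard counting argument: the product over
    i = 0 .. N-1 of (1 - 2^(i - K)).
    """
    if num_encodings < num_chunks:
        return 0.0  # fewer than N Encodings can never reach rank N
    probability = 1.0
    for i in range(num_chunks):
        probability *= 1.0 - 2.0 ** (i - num_encodings)
    return probability


n = 1000
p_exact_n = full_rank_probability(n, n)       # roughly 0.29
p_n_plus_2 = full_rank_probability(n + 2, n)  # roughly 0.77
```
]]></artwork>
      </figure>
      <t>
	Under this uniform model the result for N + 2 Encodings is
	above 1/2, which is consistent with the bound quoted above.
      </t>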
      <t>
 	In the case where the DTN network has very short contact times
 	relative to the transmission time of the Data Object and nodes
 	move randomly, Encodings for the same Data Object may take
 	radically different paths to the destination. Encodings may be
 	Recoded by Intermediate Recoders along the path, to reduce the
 	chance that duplicate Encodings will be delivered to the
 	destination from alternative paths.  Transmissions to multiple
 	destinations also benefit, because the destinations do not
 	have to receive the same Encodings, just enough non-duplicate
 	Encodings.
      </t>
  
    </section>


    <section title="Formats and Codes">
      <t>
	This section maps the concepts defined in the FEC Building
	Block <xref target="RFC5052"/> to the Content Delivery
	Protocol services offered by the Erasure Coding Extension of
	the DTN Bundle Protocol <xref target="ErasureCoding"/>. This
	FEC Scheme fully specifies the functions of a Coding Layer for
	the Erasure Coding Extension. Encoding Vector and Encoding
	Data are the two data structures used to represent an Encoding
	at the Coding Layer.  This section shows how they are formatted
	into the Erasure Coding Extension Block and Bundle Payload for
	delivery by the DTN Bundle Protocol. The section's
	organization follows the FEC Building Block recommendation
	for easy comparison with other FEC schemes.
      </t>
      <section title="FEC Payload ID">
	<t>
	 As defined in <xref target="RFC5052"/>, the FEC Payload
	 ID contains information that indicates to the FEC decoder the
	 relationships between the encoding symbols (Encoding Data)
	 carried by a particular packet (Bundle) and the FEC encoding
	 transformation. For the Random Binary FEC Scheme, the FEC
	 Payload ID is the Encoding Vector, which holds the
	 coefficients of the coding formula that maps all Chunks
	 to the particular Encoding Data. The raw Encoding Vector is a
	 binary array of coefficients whose length is
	 number_of_chunks (N), which can be thousands of bits
	 long. <xref target='scheme_specific'/> defines how the
	 Encoding Vector is efficiently formatted into the FEC Scheme
	 Parameters field of the Erasure Coding Extension Block.
	</t>
      </section>
      <section title="FEC Object Transmission Information">
	<t>
	  The FEC Object Transmission Information contains information
	  which is essential to the decoder in order to decode the
	  encoded object.
	</t>
	<section title="Mandatory">
	  <t>
	    <list style='hanging'>
	      <t hangText="FEC Encoding ID"> is the ID for the type
	      of FEC Scheme. For the Erasure Coding Extension, the FEC
	      Encoding ID also defines the format of the FEC Scheme
	      Parameters field of the Erasure Coding Extension
	      Block. The Random Binary Scheme has four different
	      formats, so it has four different FEC Encoding IDs.  The
	      FEC Encoding ID is stored in the FEC Scheme Type field
	      of the Erasure Coding Extension Block as an SDNV value.
	      <xref target='scheme_specific'/> defines the values of
	      FEC Encoding IDs used by the Random Binary FEC Scheme.
	      </t>
	    </list>
	  </t>
	</section>
	<section title="Common">
	  <t>
	    The following parameters are common to all FEC Schemes.
	    
	  </t>
	  <t>
	    <list style='hanging'>
	      <t hangText="Transfer-Length">
		is the length of the array of unencoded Data Octets to be
		transferred.  The transfer length is
		not stored in a field in the bundle, but is
		calculated by the formula:
		<vspace blankLines='1' />
		transfer_length = number_of_chunks * chunk_length
	      </t>
	      <t hangText="Encoding-Symbol-Length">
		is the length of the octet array that can either hold
		an Encoding Data (repair symbol) or a Chunk (source
		symbol), i.e. the chunk_length.  The Encoding Data is
		transferred as the Bundle Payload. So the
		Encoding-Symbol-Length is the length of the Bundle
		Payload and is not stored as an explicit field in the
		Erasure Coding Extension Block.
	      </t>
	      <t hangText="Maximum-Source-Block-Length">
		The Random Binary FEC scheme does not use source blocks, so it
		does not use Maximum-Source-Block-Length.
	      </t>
	      <t hangText="Max-Number-of-Encoding-Symbols">
		The Random Binary FEC scheme does not limit the number
		of Encoding Bundles that can be generated and sent by
		the Coding Layer.  However, the Regulating Layer of the Erasure
		Coding Extension defines a Handling Specification
		field that could be used to limit the rate and amount
		of Redundant Encoding Bundles that are forwarded by
		the Intermediate Regulating Layer, thereby effectively setting
		a limit on the Max-Number-of-Encoding-Symbols
		forwarded by Intermediate Nodes.
	      </t>
	    </list>
	  </t>

	</section>
	
	<section anchor='scheme_specific' title="Scheme-Specific">
	  <t>
	    <list style='hanging'>
	      <t hangText="Encoding Data">
		(repair symbol) is the result of applying the coding
		formula to the Chunks (source symbols). Encoding Data
		is an array of octets.  Encoding Data is stored in the
		Bundle Payload in highest octet index first order,
		with no padding and no trailing zero. The Bundle
		Payload contains only the Encoding Data octet array;
		no additional payload header is used by the Random
		Binary FEC scheme.
	      </t>
	      <t hangText="Encoding Vector"> is an array of binary
		coefficients of the coding formula with length equal
		to the number_of_chunks.  The Encoding Vector is
		stored into the Coding Parameters Field of the EC
		Extension Block. Several formats are specified in this
		document, each of which reduces the number of bits
		needed to transmit the vector in certain cases.  The
		number of bits needed depends on the contents of the
		vector, i.e., on the number and distribution of the
		ones in the vector.  An Encoding Vector MAY be
		represented in any of the formats, but the choice of
		format can dramatically reduce the length of the
		Coding Parameter field.  Encodings for the same Object
		UUID MAY use different vector formats. Encoders SHOULD
		dynamically choose the shortest format, when
		constructing an Encoding Bundle. Decoders and Recoders
		SHALL support all formats.
	      </t>
	    </list>
	  </t>
	  <section  title="Full Binary Array:  FEC Scheme Type = 1">
	    <t>
	      The most general Encoding Vector format is to send all
	      the binary coefficients as an array of octets. The
	      Encoding Vector binary coefficients are packed 8
	      coefficients to an octet.  The lowest octet index and
	      bit index is 0. If the number of binary coefficients is
	      not a multiple of 8, padding bits are added to the
	      highest indices using the value of zero. The resulting
	      octet array is sent with the highest index first.
	    </t>
	    <texttable anchor='full_vector' title='Full Vector'>
	      <ttcol width='20%' align='left' > 
		Field 
	      </ttcol>
	      <ttcol align='center' > 
		Type 
	      </ttcol>
	      <ttcol align='left'  > 
		Description
	      </ttcol>
	      <c> Packed Binary Array  </c>
	      <c>Octets  </c>
	      <c> 
		All binary coefficients of the Encoding Vector packed into
		an octet array. 
	      </c>
	    </texttable>
	    <t>
	      The bit array has the following implicit parameters:
	      <vspace blankLines='1' />
	      octet_array_length = ceiling( number_of_chunks / 8 )
	      <vspace blankLines='1' />
	      octet_index = floor( coeff_index / 8 )
	      <vspace blankLines='1' />
	      bit_within_octet = coeff_index MOD 8
	    </t>
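	    <t>
	      The packing rules above can be illustrated with a short
	      Python sketch. It is not normative. The function names
	      are hypothetical, and the sketch assumes that bit index
	      0 within an octet is the least significant bit, which
	      the text above does not state explicitly.
	    </t>
	    <figure>
	      <artwork><![CDATA[
```python
def pack_full_vector(coefficients: list) -> bytes:
    """Pack a list of 0/1 coefficients into the Type 1 wire format."""
    number_of_chunks = len(coefficients)
    octet_array_length = (number_of_chunks + 7) // 8  # ceiling(N / 8)
    octets = bytearray(octet_array_length)  # padding bits stay zero
    for coeff_index, bit in enumerate(coefficients):
        if bit:
            octet_index = coeff_index // 8
            bit_within_octet = coeff_index % 8
            # Assumption: bit 0 is the least significant bit.
            octets[octet_index] |= 1 << bit_within_octet
    # The octet array is sent with the highest index first.
    return bytes(reversed(octets))


def unpack_full_vector(wire: bytes, number_of_chunks: int) -> list:
    """Inverse of pack_full_vector; padding bits are dropped."""
    octets = bytes(reversed(wire))
    return [(octets[i // 8] >> (i % 8)) & 1
            for i in range(number_of_chunks)]


vector = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # ones at indices 0 and 9
wire = pack_full_vector(vector)
```
]]></artwork>
	    </figure>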
	  </section>
	  
	  <section title="List of Chunk Indices: FEC Scheme Type = 2">
	    <t>
	      For Encoding Vectors with a low Hamming weight,
	      i.e., with few coefficients that have the value of one,
	      sending a list of the vector indices of the ones reduces the
	      parameter length. The list of indices starts with the
	      list length, followed by the indices themselves. All
	      numbers SHALL be in SDNV format. The index list has the
	      following format:
	    </t>
	    <texttable anchor='list_indicies' title='List of Indices'>
	      <ttcol width='20%' align='left' > 
		Field 
	      </ttcol>
	      <ttcol align='center' > 
		Type 
	      </ttcol>
	      <ttcol align='left'  > 
		Description
	      </ttcol>
	      <c >List Length</c>
	      <c> SDNV  </c>
	      <c> The number of indices in the Encoding Vector
	      with coefficient of one.  </c>
	      
	      <c> Index List  </c>
	      <c>SDNV  </c>
	      <c> List of indices with coefficient value equal to
	      one. The indices SHOULD be in order from smallest to
	      largest. Duplicate indices SHALL NOT be sent by the
	      Encoder and SHALL be ignored by the Decoder.  </c>
	    </texttable>
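	    <t>
	      A minimal Python sketch of this format follows. It is
	      an illustration only. The SDNV encoder follows the
	      encoding described in RFC 6256 (seven value bits per
	      octet, most significant group first, high bit set on
	      every octet except the last). The function names
	      encode_sdnv and encode_index_list are hypothetical.
	    </t>
	    <figure>
	      <artwork><![CDATA[
```python
def encode_sdnv(value: int) -> bytes:
    """Encode a non-negative integer as an SDNV (RFC 6256)."""
    if value < 0:
        raise ValueError("SDNV values must be non-negative")
    groups = [value & 0x7F]  # last octet: high bit clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier octets: high bit set
        value >>= 7
    return bytes(reversed(groups))


def encode_index_list(coefficients: list) -> bytes:
    """Type 2 parameters: List Length, then each index, all as SDNVs."""
    # enumerate() yields indices in increasing order with no duplicates.
    indices = [i for i, bit in enumerate(coefficients) if bit]
    out = bytearray(encode_sdnv(len(indices)))
    for index in indices:
        out += encode_sdnv(index)
    return bytes(out)


vector = [0] * 200
vector[3] = 1
vector[130] = 1
params = encode_index_list(vector)  # 4 octets instead of 25 for Type 1
```
]]></artwork>
	    </figure>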
	  </section>
	  
	  <section title="Windowed Binary Array: FEC Scheme Type = 3">
	    <t>
	      When an Encoding Vector has its ones grouped in a single small
	      range of indices, for example Windowed encoding <xref
	      target="Stud2006"/>, a partial bit vector should be
	      sent. The starting index is sent along with a bit array
	      that contains all the coefficients that are ones.
	      The Windowed Binary Array has the following format:
	    </t>
	    <texttable anchor='partial_vector' title='Partial Vector'>
	      <ttcol width='20%' align='left' > 
		Field 
	      </ttcol>
	      <ttcol align='center' > 
		Type 
	      </ttcol>
	      <ttcol align='left'  > 
		Description
	      </ttcol>
	      <c>Lowest Index   </c>
	      <c> SDNV  </c>
	      <c> The lowest index value with a coefficient of one.
	      Bit index zero of the Packed Binary Array
	      corresponds to this index.  </c>
	      
	      <c>Length Octet Array  </c>
	      <c> SDNV  </c>
	      <c> 	      
		octet_array_length = ceiling( (highest_index - lowest_index + 1) / 8 )
	      </c>
	      <c>Packed Binary Array</c>
	      <c>Octets   </c>
	      <c> Encoding Vector packed into octet array using bit array
	      to octet array mapping from FEC Scheme Type 1.   </c>
	    </texttable>
	  </section>
	  <section  title="Finite Field Array:  FEC Scheme Type = 4">
	    <t>
	      Random Binary FEC is a special case of a Random Finite
	      Field FEC, where the finite field is specifically GF(2),
	      instead of GF(2^m).  In the general Random Finite Field
	      FEC, Encoding Vector coefficients are represented as a
	      binary number of length m.  The finite field GF(2^m) is
	      characterized by an irreducible polynomial from Section
	      8.1 of <xref target="RFC5510"/>.  Higher values of m,
	      such as m=8, are useful in situations where the number
	      of Chunks is small and it is desirable to reduce the
	      number of redundant Encodings that are expected to be
	      received in order to get a full rank. Random Binary FEC
	      implementations MUST be able to interpret a Finite Field
	      Array with m=1, as a Full Binary Array.
	    </t>
	    <t>
	      For transmission, Encoding Vector coefficients are
	      packed into an array of bits.  The lowest bit index and
	      coefficient index is 0. The coefficient with index 0 is
	      packed with its least significant bit at bit
	      index 0. Subsequent coefficients are concatenated to
	      fill the bit array.  The bit array is packed into an
	      octet array.  If the number of coefficients (n) times their
	      length (m) is not a multiple of 8, padding bits are
	      added to the highest bit positions using the value of
	      zero. The resulting octet array is sent with the highest
	      index first.
	    </t>
	    <texttable anchor='finite_field' title='Finite Field'>
	      <ttcol width='20%' align='left' > 
		Field 
	      </ttcol>
	      <ttcol align='center' > 
		Type 
	      </ttcol>
	      <ttcol align='left'  > 
		Description
	      </ttcol>
	      <c> Finite Field Degree (m) </c>
	      <c>SDNV </c>
	      <c> Specifies the finite field of the form GF(2^m),
		where m is the Finite Field Degree.
	      </c>

	      <c> Packed Coefficients Array  </c>
	      <c>Octets  </c>
	      <c> Encoding Vector packed into an octet array, with each
		coefficient represented as a binary number of length 
		m.   
	      </c>
	    </texttable>
	    <t>
	      The octet array has the following implicit parameters:
	      <vspace blankLines='1' />
	      octet_array_length = ceiling( (number_of_chunks * m) / 8 )
	      <vspace blankLines='1' />
	      octet_index = floor( (coeff_index * m) / 8 )
	      <vspace blankLines='1' />
	      starting_bit_within_octet = (coeff_index * m) MOD 8
	    </t>
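	    <t>
	      The following Python sketch illustrates the Packed
	      Coefficients Array only; the Finite Field Degree SDNV
	      that precedes it is omitted. It is not normative. The
	      function name pack_finite_field is hypothetical, and
	      the sketch assumes that bit index 0 is the least
	      significant bit of octet index 0, so that the whole bit
	      array can be treated as one big-endian integer.
	    </t>
	    <figure>
	      <artwork><![CDATA[
```python
def pack_finite_field(coefficients: list, m: int) -> bytes:
    """Pack GF(2^m) coefficients, each m bits long, highest octet first."""
    if m < 1:
        raise ValueError("field degree m must be at least 1")
    total_bits = len(coefficients) * m
    octet_array_length = (total_bits + 7) // 8  # ceiling(N * m / 8)
    bit_array = 0
    for coeff_index, coeff in enumerate(coefficients):
        if not 0 <= coeff < (1 << m):
            raise ValueError("coefficient does not fit in m bits")
        # The coefficient's least significant bit lands at bit
        # index coeff_index * m of the bit array.
        bit_array |= coeff << (coeff_index * m)
    # Big-endian output sends the highest octet index first; unused
    # high bits are the zero padding.
    return bit_array.to_bytes(octet_array_length, "big")


binary_case = pack_finite_field([1, 0, 0, 0, 0, 0, 0, 0, 0, 1], 1)
gf16_case = pack_finite_field([0x0A, 0x0B, 0x0C], 4)
```
]]></artwork>
	    </figure>
	    <t>
	      With m=1 the output under these assumptions is the same
	      octet array that the Full Binary Array packing rules
	      produce for the same coefficients.
	    </t>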
	  </section>

	</section>
      </section>
    </section>
    <section title="Procedures">
      <t>
	The Random Binary FEC scheme uses the following procedures which are
	common to the encode, decode and recode processes.
      </t>
      <section anchor='xor' title="Bitwise XOR">
	<t>
	  Encoding involves the XOR operation on multiple Chunks to
	  form an Encoding Data. Decoding involves the XOR operation
	  on multiple Encoding Data arrays to recover a Chunk.  XORing two
	  octet arrays logically takes every bit in one array and
	  performs the XOR operation on the corresponding bit in the
	  other array. That is, the octet index and the bit position
	  within the octet are the same. The results are put into the
	  corresponding bit of a new array. Note that bits that do not
	  share the same index do not interact with each other. So
	  even though Chunks and Encoding Data are defined as octet
	  arrays, the bit-wise XOR can be implemented using any
	  convenient memory unit, such as byte, int or long.
	</t>
	<t>
          The XOR operation is the most CPU intensive operation used
          by this FEC scheme, so the number of XOR operations SHOULD
          be minimized and the XOR operation implementation SHOULD be
          efficient. To minimize XORs in the encoding process, a low
          Hamming weight Encoding Vector SHOULD be used. To maximize
          the efficiency of the XOR operation, the largest memory unit
          available SHOULD be used, such as a 64-bit long.
	</t>
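	<t>
	  The point that the XOR can be computed on any convenient
	  memory unit can be illustrated in Python, which has no
	  fixed 64-bit word type. In this sketch an arbitrary
	  precision integer stands in for the "largest memory unit":
	  each octet array is converted to one integer, XORed in a
	  single operation, and converted back. The function names
	  are hypothetical, and this is an illustration rather than
	  a recommended implementation.
	</t>
	<figure>
	  <artwork><![CDATA[
```python
def xor_octets_bytewise(a: bytes, b: bytes) -> bytes:
    """Reference version: XOR one octet at a time."""
    if len(a) != len(b):
        raise ValueError("octet arrays must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def xor_octets_wide(a: bytes, b: bytes) -> bytes:
    """XOR the whole arrays as two large integers.

    Bits with different indices never interact, so the result is
    identical to the octet-at-a-time version.
    """
    if len(a) != len(b):
        raise ValueError("octet arrays must have the same length")
    wide = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return wide.to_bytes(len(a), "big")


chunk_a = bytes(range(16))
chunk_b = bytes([0xFF] * 16)
same = xor_octets_wide(chunk_a, chunk_b) == xor_octets_bytewise(chunk_a, chunk_b)
```
]]></artwork>
	</figure>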
      </section>
      <section anchor='solve' title="Solve">
	<t>
	  To solve the simultaneous equations to decode the Encodings
	  back into Chunks, the most general solution is to use
	  Gaussian Elimination to either invert the Encoding Set
	  matrix or to algebraically solve the equations directly.
	  The Encoding Vectors are
	  used as rows to form an Encoding Set matrix (S).
	  The Encoding Data can be used to
	  form a vector (e). The encoding process can be represented in
	  matrix notation as a vector of Encoding Data (e) that was
	  created by multiplying the Encoding Set matrix (S) by the vector
	  of Chunks (c).
	  <vspace blankLines='0' />
	  e = S c
	  <vspace blankLines='0' />
	  Gaussian Elimination can be used to calculate the inverse of
	  the encoding matrix (S^-1). The Chunks can be
	  recovered by multiplying the inverted encoding matrix by
	  the vector of Encoding Data.
	  <vspace blankLines='0' />
	  c = S^-1 e
	  <vspace blankLines='0' />
	  If the Hamming weights of the Encoding Vectors are
	  low, the Encoding Set matrix is sparse, and solving the
	  equations algebraically instead of using matrix
	  inversion usually results in fewer octet array XOR
	  operations.
	</t>
	<t>
	  Gaussian Elimination is an expensive operation that involves
	  O(N^3) operations over the field GF(2) and O(N^2) XOR
	  operations on Encoding Data octet arrays. A large body
	  of research has been conducted to create efficient
	  algorithms to solve simultaneous equations and will not be
	  presented in this document, but SHOULD be exploited by
	  implementations of the Random Binary FEC scheme. Many of
	  these algorithms involve restricting the form of the
	  Encoding Vectors, with dramatic reductions in encoding or
	  decoding cost, but with other tradeoffs in terms of
	  reliability, bandwidth used, or other systemic factors. 
	  <xref target='configure'/> discusses several options on how
	  to configure the encoding process to best match the tradeoffs.
	</t>
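	<t>
	  For illustration, the following Python sketch solves an
	  Encoding Set with plain Gauss-Jordan elimination over
	  GF(2). It is a baseline only and does not use any of the
	  optimized algorithms mentioned above. The function name
	  decode is hypothetical, and the sketch assumes each
	  Encoding Vector is held as an integer whose bit i is the
	  coefficient of Chunk i.
	</t>
	<figure>
	  <artwork><![CDATA[
```python
from typing import Optional


def decode(encodings: list, number_of_chunks: int) -> Optional[list]:
    """Recover the Chunks from (vector, encoding_data) pairs.

    Returns the list of Chunks, or None if the Encoding Set does
    not have full rank.
    """
    if not encodings:
        return None
    chunk_length = len(encodings[0][1])
    # Hold each Encoding Data array as one integer so that a row
    # operation is a single XOR.
    rows = [[v, int.from_bytes(e, "big")] for v, e in encodings]
    pivot_row = 0
    for col in range(number_of_chunks):
        mask = 1 << col
        found = next((r for r in range(pivot_row, len(rows))
                      if rows[r][0] & mask), None)
        if found is None:
            return None  # no Encoding covers this Chunk: rank deficient
        rows[pivot_row], rows[found] = rows[found], rows[pivot_row]
        pivot_vector, pivot_data = rows[pivot_row]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][0] & mask:
                rows[r][0] ^= pivot_vector  # GF(2) row operation
                rows[r][1] ^= pivot_data    # matching octet array XOR
        pivot_row += 1
    # Row i now has the unit vector for Chunk i.
    return [rows[i][1].to_bytes(chunk_length, "big")
            for i in range(number_of_chunks)]


received = [(0b011, b"\x11\x22"),  # C0 xor C1
            (0b110, b"\xba\x75"),  # C1 xor C2
            (0b111, b"\xbb\x77")]  # C0 xor C1 xor C2
chunks = decode(received, 3)
```
]]></artwork>
	</figure>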
      </section>
      <section anchor='rank' title="Rank">
	<t>
	  The rank operation determines the number of linearly
	  independent Encodings in an Encoding Set, i.e., the rank of the 
	  Encoding Set matrix S. The rank operator
	  is less expensive than solving the whole Encoding Set. If
	  the rank is calculated incrementally as each Encoding is
	  inserted into its Encoding Set, then an insert has O(N^2)
	  operations in the field GF(2), but needs no XOR operations
	  on Encoding Data octet arrays. Given the reduced cost of the
	  rank operator, it can be used to determine which Encodings to
	  use in the solving process. It is also used to detect
	  redundant Encodings.  As with solving, a large body of
	  research exists on efficient algorithms
	  for calculating rank. It is not presented in this
	  document, but SHOULD be exploited by implementations of the
	  Random Binary FEC scheme.
	</t>
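	<t>
	  A minimal sketch of the incremental rank calculation follows. It
	  assumes Encoding Vectors are held as Python integers (bit j is the
	  coefficient of Chunk j); the class name RankTracker and its methods
	  are hypothetical. Only the vectors are touched, so no XOR operations
	  on Encoding Data octet arrays are needed.
	</t>
	<figure><artwork><![CDATA[
```python
class RankTracker:
    """Tracks the rank of an Encoding Set as vectors are inserted."""

    def __init__(self):
        # Maps the position of a leading one to a reduced basis vector.
        self.basis = {}

    def insert(self, vector):
        """Return True if the vector is innovative (raises the rank),
        or False if it is redundant."""
        v = vector
        while v:
            lead = v.bit_length() - 1
            if lead not in self.basis:
                self.basis[lead] = v
                return True
            # Cancel the leading one and keep reducing.
            v ^= self.basis[lead]
        return False

    @property
    def rank(self):
        return len(self.basis)
```
]]></artwork></figure>
	<t>
	  A receiver could call insert() on each arriving Encoding Vector,
	  discard Encodings for which it returns False, and start solving
	  once rank equals the number of Chunks.
	</t>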
      </section>
    </section>
    <section title="Random Binary FEC code specification">
      <t>
	The Coding layer consists of an Encoder and Decoder at the end
	points. Intermediate Recoders may also be used to generate new
	Encodings from previously received Encodings to reduce the
	chance that duplicate Encodings are propagated over different
	paths to the destination.
      </t>
      <section anchor='encode' title="Encoder">
	<t>
	  The encoding process transforms a linear combination of
	  Chunks into an Encoding. The encoding coefficients are
	  stored in the Encoding Vector. For the general Random Binary
	  Encoding, the coefficients are generated randomly, for
	  example by setting K indices to one and the rest to
	  zero. The Hamming weight of the Encoding Vector SHOULD
	  be low to reduce the number of bitwise XOR operations done
	  by the Encoder process. Empirical measurements show that a Hamming
	  weight of O(log N) generates Encodings that are as diverse
	  as those using Hamming weights of O(N) <xref
	  target="Stud2006"/>. One caution when generating Encoding
	  Vectors: if all the Hamming weights for Encodings in an
	  Encoding Set are even, then the Encoding Set cannot be
	  solved <xref target="Stud2006"/>. A simple way to avoid
	  this situation is to generate Encoding Vectors with odd
	  Hamming weights.
	</t>
	<t>
	  The Encoding Data (E) is generated by taking the binary dot
	  product of the Encoding Vector (V) and the vector of Chunks
	  (C). That is, the Encoding Data accumulates an octet array
	  that is the XOR of Chunks whose coefficient is one in the
	  Encoding Vector.
	  <vspace blankLines='0' />
	  E = V dot C
	  <vspace blankLines='0' />
 	</t>
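	<t>
	  The encoding step can be illustrated with the following sketch. It
	  assumes Encoding Vectors are held as Python integers (bit j is the
	  coefficient of Chunk j) and that all Chunks have the same length.
	  The function names and the rule for forcing the Hamming weight to be
	  odd are illustrative choices, not requirements of this document.
	</t>
	<figure><artwork><![CDATA[
```python
import random


def random_odd_weight_vector(n, k, rng=random):
    """Return an Encoding Vector for n Chunks with about k ones.

    k is rounded up to an odd number and capped at the largest odd
    value that does not exceed n.
    """
    k |= 1
    if k > n:
        k = n if n % 2 else n - 1
    vector = 0
    for j in rng.sample(range(n), k):
        vector |= 1 << j
    return vector


def encode(vector, chunks):
    """Binary dot product: XOR together the Chunks whose coefficient
    is one in the Encoding Vector."""
    data = bytearray(len(chunks[0]))
    for j, chunk in enumerate(chunks):
        if (vector >> j) & 1:
            for i, b in enumerate(chunk):
                data[i] ^= b
    return bytes(data)
```
]]></artwork></figure>
	<t>
	  For example, with three one-octet Chunks 0x01, 0x02 and 0x04, the
	  Encoding Vector with coefficients for Chunks 0 and 2 set to one
	  yields the Encoding Data 0x05.
	</t>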
      </section>
      
      <section anchor='decode' title="Decoder">
	<t>
	  When the Encoding Set rank is equal to the number of Chunks
	  (N), then N linearly independent Encodings can be extracted
	  and used to solve for the Chunks, see
	  <xref target="solve"/>.  If the encoding process is
	  restricted, then simplified decoding algorithms can be used
	  that exploit the restriction. The choice of decoding
	  algorithm is left to the implementation, but to support
	  interoperability, implementations MUST support the
	  unrestricted encoding process. For example, a decoder could
	  detect the pattern of Encodings and use the appropriate
	  decoder algorithm, but would default to Gaussian
	  Elimination.
	</t>
	<t>
	  Decoding can be done incrementally by partially solving the
	  decoding equations as the Encodings arrive. For some
	  configurations of the encoding process, such as Block
	  Parity, some Chunks can be solved before all N innovative
	  Encodings have arrived.  In these cases, the Decoder MAY
	  deliver these Chunks to the Data Object layer before all
	  Chunks can be decoded.
	</t>
      </section>

      <section  anchor='recode' title="Intermediate Recoder">
	<t>
	  Intermediate Recoders generate new Encodings from the
	  Encodings that they have already received. Recoding reduces the
	  chance that duplicate Encodings are delivered over different
	  paths to the destination. The recode operation selects
	  several Encodings from the Encoding Set at random. The
	  selected Encodings are combined to form a new Encoding. The
	  combination procedure is as follows:
	<list>
	  <t> 
	    The new Encoding Vector is the XOR of the coefficients of
	    all the selected Encoding Vectors.
	  </t>
	  <t> 
	    The new Encoding Data is the XOR of the octet arrays of
	    all the selected Encoding Data.
	  </t>
	</list>
	</t>
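	<t>
	  The combination procedure above can be sketched as follows, assuming
	  each received Encoding is held as a pair of an integer Encoding
	  Vector and an octet string of Encoding Data. The function name
	  recode and the count parameter are hypothetical.
	</t>
	<figure><artwork><![CDATA[
```python
import random


def recode(encodings, count, rng=random):
    """Combine `count` randomly selected Encodings into a new one.

    encodings: list of (vector, data) pairs, vector as int and data
    as an octet string; all data have the same length.
    """
    chosen = rng.sample(encodings, count)
    vector = 0
    data = bytearray(len(chosen[0][1]))
    for v, d in chosen:
        vector ^= v                   # XOR of the coefficients
        for i, b in enumerate(d):
            data[i] ^= b              # XOR of the octet arrays
    return vector, bytes(data)
```
]]></artwork></figure>
	<t>
	  Note that the result may be the all-zero vector if the selected
	  Encodings are linearly dependent; a Recoder would likely want to
	  discard such a result rather than forward it.
	</t>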
	<t>
	  When two Encodings with Hamming weight less than N/2 are
	  combined, the resulting Hamming weight tends to be larger than
	  the originals. Conversely, when two Encodings with Hamming
	  weight more than N/2 are combined, the resulting Hamming
	  weight tends to be smaller than the originals. Thus, the
	  recoding process drives the Hamming weight towards N/2.  As
	  Encoding Bundles are transferred across the DTN, recoding can
	  undo any special configuration restrictions put on the
	  encoding process. Implementations SHOULD provide an option to
	  disable recoding as part of the Regulating Layer Handling
	  Specification.
	</t>
	<t>
	  Care should be taken in the recoding process to ensure that
	  all Encodings in the Encoding Set are represented in the new
	  stream of recoded Encodings. If each new Encoding always
	  draws from the whole Encoding Set, then some Encodings will
	  be chosen less often than others.  Hence their information
	  will not be propagated as much as that of Encodings that were
	  selected more often. This problem is a form of the Coupon
	  Collector's Problem, which results in an Encoding Stream from
	  which up to N log(N) Encodings must be received, instead of
	  only N + 2, to reach full rank. One solution is to
	  generate new Encodings in cycles.  Each Encoding is allowed
	  to be used only once during a cycle, and when all Encodings
	  have been used a new cycle begins.
	</t>
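	<t>
	  One possible way to realize the cycle-based selection is sketched
	  below. The generator name cycle_selections and the group_size
	  parameter (how many Encodings are combined per recoded Encoding) are
	  assumptions made for illustration only.
	</t>
	<figure><artwork><![CDATA[
```python
import random


def cycle_selections(num_encodings, group_size, rng=random):
    """Yield lists of Encoding indices to combine.

    Within a cycle every index is used exactly once; when all have
    been used, the order is reshuffled and a new cycle begins. The
    last group of a cycle may be shorter than group_size.
    """
    while True:
        order = list(range(num_encodings))
        rng.shuffle(order)
        for start in range(0, num_encodings, group_size):
            yield order[start:start + group_size]
```
]]></artwork></figure>
	<t>
	  Each yielded list of indices would then be passed to the
	  combination procedure to produce one recoded Encoding. If the
	  Encoding Set grows during a cycle, a real Recoder would need to
	  decide when newly arrived Encodings join; this sketch does not
	  address that.
	</t>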

      </section>
    </section>
    <section anchor='configure' title="Configure">
      <t>
	  The Random Binary Scheme may be configured to
	  implement the following basic FEC schemes, all of which can
	  be represented by the formats in <xref
	  target="scheme_specific"/>.  Each configuration restricts
	  the coding formulas, which results in Encoding streams
	  with different properties and potentially different
	  decoding algorithms. Some of these FEC schemes are described
	  in other RFCs and are fully specified here for the content
	  delivery protocol using DTN bundles.
      </t>
      <section title="Full Random Binary">
	<t>
	  Full Random Binary is the generic configuration against which
	  the characteristics of the other configurations are compared.
	  Full Random Binary generates Encodings randomly over the
	  full range of possible Encoding Vectors. A new Encoding is
	  generated by randomly setting each coefficient in the
	  Encoding Vector to one with a probability of 1/2. The
	  resulting stream of Encodings has an average Hamming weight
	  of N / 2, so the encoding process requires O(N) octet array
	  XORs per Encoding. The received Encoding Set has no special structure, so
	  the decoder must use full Gaussian Elimination. The
	  algebraic solver algorithm does not have an advantage over
	  matrix inversion in this case, so the decoder process
	  requires O(N^2) octet array XORs. The expected number of
	  Encodings that must be received is
	  only N + 1.6 when the number of Chunks is on the order of
	  thousands.  Finally, the coefficients of the Encoding Vector
	  have no restrictions, so the Encoding Vector is packed into the
	  <xref target='full_vector'>full
	  vector format</xref>.
	  </t>
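	  <t>
	    As a minimal sketch, a Full Random Binary Encoding Vector can be
	    drawn as N independent random bits. The function name
	    full_random_vector is hypothetical, the vector is held as a
	    Python integer (bit j is the coefficient of Chunk j), and
	    redrawing the all-zero vector is a choice made here because such
	    an Encoding carries no information; this document does not
	    mandate it.
	  </t>
	  <figure><artwork><![CDATA[
```python
import random


def full_random_vector(n, rng=random):
    """Set each of n coefficients to one with probability 1/2,
    redrawing if every coefficient came out zero."""
    while True:
        vector = rng.getrandbits(n)
        if vector:
            return vector
```
]]></artwork></figure>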
      </section>
      <section title="Windowed Erasure Codes">
	  <t>
	    With a simple restriction on how the random
	    coefficients are generated, the encoding and decoding cost
	    can be dramatically reduced while still maintaining the
	    low transmission overhead of the Full Random configuration
	    <xref target="Stud2006"/>. 
	  </t>
	  <t>
	    The windowed configuration first restricts the Hamming
	    weight and then restricts the coefficients that can
	    be set to one to a "window" of consecutive indices. The
	    Hamming weight is restricted to 2 log(N), but should be
	    odd, so the encoder process is O(log N) instead of the O(N)
	    of Full Random. The window has a length of 2 sqrt(N) and
	    its offset is chosen randomly for each new Encoding, with
	    wrapping from the highest to the lowest index. With a slight
	    modification to the Gaussian Elimination algorithm, the
	    decoder can algebraically solve the Encoding Set with a
	    windowed matrix in O(N^2.5), instead of the O(N^3) for
	    Full Random. The transmission overhead remains close to
	    the N + 1.6 of Full Random.  Unfortunately,
	    unconstrained Recoding will disrupt the specialized form
	    of the windowed encoding matrix, which raises the
	    decoding cost back to O(N^3).  Finally, the
	    coefficients are restricted to a window, so the Encoding
	    Vector should be packed into
	    the <xref target='partial_vector'>partial vector
	    format</xref>.
	  </t>
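	  <t>
	    The generation of a windowed Encoding Vector can be sketched as
	    follows. The text above does not state the base of the logarithm
	    or how to round; this sketch assumes base 2, rounds down, forces
	    the weight to be odd, and caps the weight at the window length.
	    The function name windowed_vector is hypothetical, and the
	    vector is held as a Python integer (bit j is the coefficient of
	    Chunk j).
	  </t>
	  <figure><artwork><![CDATA[
```python
import math
import random


def windowed_vector(n, rng=random):
    """Return (offset, vector) for a windowed Encoding over n Chunks."""
    window = min(n, int(2 * math.sqrt(n)))
    weight = int(2 * math.log2(n)) | 1          # assumed: log base 2
    # The weight cannot exceed the window and is kept odd.
    weight = min(weight, window if window % 2 else window - 1)
    offset = rng.randrange(n)                   # random window start
    vector = 0
    for p in rng.sample(range(window), weight):
        # Positions wrap from the highest index back to the lowest.
        vector |= 1 << ((offset + p) % n)
    return offset, vector
```
]]></artwork></figure>
	  <t>
	    Under these assumptions, N = 1024 gives a window of 64 indices
	    and a Hamming weight of 21.
	  </t>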
      </section>
      <section title="Compact No-code FEC Scheme">
	<t>
	  The degenerate case of sending only Encodings with Hamming
	  weight of one, i.e. only source symbols, can behave like an
	  additional fragmentation layer or as the test case named
	  "Compact No-code FEC Scheme" in <xref target="RFC5445"/>.  In
	  this configuration, the encoding and decoding processes
	  perform no coding work, and the system has no protection against
	  dropped bundles.  Since the Encoding Vector has only one
	  coefficient with a value of one, its index should be packed into the
	  <xref target='list_indicies'>list of indicies format</xref>.
	</t>
      </section>
      <section title="Block Parity">
	<t>
	  To show the flexibility of the Random Binary FEC scheme, the
	  classic block parity FEC scheme described in
	  <xref target="RFC5445"/> can be fully specified for the DTN
	  content delivery protocol using DTN bundles.  As in the
	  Compact No-code FEC Scheme, each source symbol is sent
	  unencoded with its Chunk index packed into
	  the <xref target='list_indicies'>list of indicies
	  format</xref>.  The block parity repair symbol has all the
	  coefficients in the block set to one, and
	  uses the <xref target='partial_vector'>partial vector
	  format</xref>. The normal incremental decoder will
	  automatically detect source symbols as solved.  The parity
	  repair symbol is applied if a source symbol is
	  dropped, or is treated as redundant if no bundles were
	  dropped in the block. Chunks may be delivered as they
	  arrive. The Block Parity FEC scheme is practical in the case
	  where dropped bundles are rare and not bursty.
	</t>
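	<t>
	  The Block Parity configuration can be illustrated with the
	  following sketch, which treats all Chunks as a single block. The
	  function names are hypothetical, Encodings are held as pairs of an
	  integer Encoding Vector (bit j is the coefficient of Chunk j) and
	  an octet string, and only the single-loss case that one parity
	  symbol can repair is handled.
	</t>
	<figure><artwork><![CDATA[
```python
def block_parity_encodings(chunks):
    """Return one unencoded Encoding per Chunk plus one parity
    Encoding whose coefficients are all one."""
    n = len(chunks)
    parity = bytearray(len(chunks[0]))
    out = []
    for j, chunk in enumerate(chunks):
        out.append((1 << j, bytes(chunk)))      # source symbol
        for i, b in enumerate(chunk):
            parity[i] ^= b
    out.append(((1 << n) - 1, bytes(parity)))   # repair symbol
    return out


def recover_single_loss(received):
    """XOR all received Encodings together. If exactly one
    coefficient remains, return (chunk_index, chunk); else None."""
    vector = 0
    data = bytearray(len(received[0][1]))
    for v, d in received:
        vector ^= v
        for i, b in enumerate(d):
            data[i] ^= b
    if vector and (vector & (vector - 1)) == 0:
        return vector.bit_length() - 1, bytes(data)
    return None
```
]]></artwork></figure>
	<t>
	  When nothing is dropped the XOR of everything received is the
	  all-zero vector, which corresponds to the parity symbol being
	  redundant.
	</t>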
      </section>

    </section>
    <section title="Security Considerations">
      <t>
	No additional security considerations have been identified
	beyond those described in <xref target="ErasureCoding"/>.
      </t>
    </section>

    <section title="IANA Considerations">
      <t>
	  The Random Binary Scheme uses three FEC Encoding IDs.  The
	  assigned IDs should be less than 128 in order to fit into
	  one byte using SDNV values. The reference implementation
	  uses the following FEC Scheme Types:
	<list>
	  <t>
	   Full Binary Array = 1
	  </t>
	  <t>
	   List of Chunk Indicies = 2
	  </t>
	  <t>
	    Windowed Binary Array = 3
	  </t>
	  <t>
	    Finite Field Array = 4
	  </t>
	</list>
      </t>
    </section>

  </middle>

  <back>
    <references title="Normative References">
      &rfc2119;
      &rfc5050;
      &rfc5052;
      &rfc5510;
      &rfc6256;
      <reference anchor='ErasureCoding'>
        <front>
          <title>Bundle Protocol Erasure Coding Extension</title>
         <author initials='J.' surname='Zinky'
                  fullname='John Zinky'>
            <organization abbrev='BBN'>
              Raytheon BBN Technologies
            </organization>
          </author>
           <author initials='A.' surname='Caro'
                  fullname='Armando Caro'>
            <organization abbrev='BBN'>
              Raytheon BBN Technologies
            </organization>
          </author>
          <author initials='G.' surname='Stein'
                  fullname='Gregory Stein'>
            <organization abbrev='LTS'>
              Laboratory for Telecommunications Sciences
            </organization>
          </author>
          <date month='Aug' year='2012' />
        </front>
        <seriesInfo name='Internet-Draft' value='draft-irtf-dtnrg-zinky-erasure-coding-extension-00' />
      </reference>
    </references>
    
    <references title="Informative References">
      &rfc5445;

      <reference anchor='Stud2006'>
        <front>
          <title>Windowed Erasure Codes</title>
          <author initials='C.' surname='Studholme'
                  fullname='Chris Studholme'>
            <organization abbrev='utoronto'>
             University of Toronto
            </organization>
          </author>
          <author initials='I.' surname='Blake'
                  fullname='Ian Blake'>
            <organization abbrev='utoronto'>
             University of Toronto
            </organization>
          </author>
          <date month='July' year='2006' />
        </front>
        <seriesInfo name='IEEE' value='International Symposium on Information Theory' />
      </reference>

    </references>
  </back>

</rfc>
