<?xml version='1.0' ?>
<!--
Created:       Mon Nov 18 17:55:22 2013 mstenber

split from draft-ietf-homenet-hncp-03-pre - generic parts

TBD: Where should we note that DNCP is fundamentally about individual TLVs,
and they can be queued (and jittered) freely?

-->

<!DOCTYPE rfc SYSTEM 'rfc2629.dtd'>

<?rfc autobreaks="yes"?>
<?rfc compact="yes"?>
<?rfc strict='yes'?>
<?rfc subcompact="no"?>
<?rfc symrefs="yes"?>
<?rfc toc="yes"?>
<?rfc tocindent="yes"?>

<rfc
    ipr='trust200902'
    docName='draft-ietf-homenet-dncp-06'
    category='std'
    >
  <front>
    <title abbrev="Distributed Node Consensus Protocol">
      Distributed Node Consensus Protocol
    </title>
    <author initials="M" surname="Stenberg" fullname="Markus Stenberg">
      <address>
        <postal>
          <street/>
          <city>Helsinki</city>
          <code>00930</code>
          <country>Finland</country>
        </postal>
        <email>markus.stenberg@iki.fi</email>
      </address>
    </author>
    <author initials="S" surname="Barth" fullname="Steven Barth">
      <address>
        <postal>
          <street/>
          <city>Halle</city>
          <code>06114</code>
          <country>Germany</country>
        </postal>
        <email>cyrus@openwrt.org</email>
      </address>
    </author>
    <date month="June" year="2015" />

    <area>Internet</area>
    <workgroup>Homenet Working Group</workgroup>
    <keyword>Homenet</keyword>
    <abstract>

      <t>This document describes the Distributed Node Consensus Protocol
      (DNCP), a generic state synchronization protocol which uses Trickle
      and Merkle trees. DNCP leaves some details unspecified or
      provides alternative options. Therefore, only profiles which specify
      those missing parts define actual implementable DNCP based
      protocols.</t>

    </abstract>
  </front>
  <middle>
    <section title="Introduction">

      <t>DNCP is designed to provide a way for each participating node to
      publish a set of TLV (Type-Length-Value) tuples, and to provide
      a shared and common view about the data published by every currently
      or recently bidirectionally reachable DNCP node in a network.</t>

      <t>For state synchronization a Merkle tree is used. It is formed by
      first calculating a hash for the dataset, called node data, published
      by each node, and then calculating another hash over those node data hashes.
      The single resulting hash, called network state hash, is transmitted using the
      <xref target="RFC6206">Trickle algorithm</xref> to ensure that all nodes
      share the same view of the current state of the published data within the
      network. The use of Trickle with only short network state hashes sent
      infrequently (in steady state) makes DNCP very thrifty when updates happen
      rarely.</t>

      <t>For maintaining liveliness of the topology and the data within it,
      a combination of Trickled network state, keep-alives, and "other"
      means of ensuring reachability are used. The core idea is that if
      every node ensures its neighbors are present, transitively, the whole
      network state also stays up-to-date.</t>

      <t>DNCP is most suitable for data that changes only infrequently, as
      this gains the maximum benefit from using Trickle. As the number of
      nodes in the network or the rate of data changes grows, Trickle is
      used less and less, and the benefit of using DNCP diminishes. In
      these cases Trickle just adds complexity to the specification and
      provides little added value. If constant rapid state changes are
      needed, the preferable choice is to use an additional point-to-point
      channel whose address or locator is published using DNCP.</t>

    </section>

    <section title="Terminology">

      <texttable suppress-title="true" style="none" align="left">
	<ttcol width="25%" /><ttcol width="75%" />

	<c>DNCP profile</c>

	<c>a definition of the set of rules and values listed in <xref
	target="profile-bits"/> specifying the behavior of a DNCP based
	protocol, such as the transport method in use. In this document,
	any DNCP profile specific parameter with a profile-specific
	fixed value is prefixed with DNCP_.</c>

      <c /><c />

      <c>DNCP node</c>
      <c>a single node which runs a protocol based on a DNCP
      profile.</c>

      <c /><c />

      <c>Link</c>
      <c>a link-layer medium over which directly connected nodes can
      communicate.</c>

      <c /><c />

      <c>DNCP network</c>

      <c>a set of DNCP nodes running the same DNCP profile.

      The set consists of nodes that have discovered each other using the
      transport method defined in the DNCP profile, via multicast on local
      links, and/or by using unicast communication.
      </c>

      <c /><c />

      <c>Node identifier</c>
      <c>an opaque fixed-length identifier consisting of
      DNCP_NODE_IDENTIFIER_LENGTH bytes which uniquely identifies a DNCP
      node within a DNCP network.</c>

      <c /><c />

      <c>Interface</c>
      <c>a node's attachment to a particular link.</c>

      <c /><c />

      <c>Endpoint</c>

      <c>a locally configured communication endpoint of a DNCP node, such
      as a network socket. It is either bound to an Interface for multicast
      and unicast communication, or configured for explicit unicast
      communication with a predefined set of remote addresses. Endpoints
      are usually in one of the transport modes specified in <xref
      target="dt" />. </c>

      <c /><c />

      <c>Endpoint identifier</c>

      <c>a 32-bit opaque value, which identifies a particular endpoint of
      a particular DNCP node. The value 0 is reserved for DNCP and DNCP
      profile purposes and not used to identify an actual
      endpoint. This definition is in sync with the interface index
      definition in <xref target="RFC3493"/>, as the non-zero small
      positive integers should comfortably fit within 32 bits.</c>

      <c /><c />

      <c>Peer</c>
      <c>another DNCP node with which a DNCP node communicates using a
      particular local and remote endpoint pair.</c>

      <c /><c />

      <c>Node data</c>
      <c>a set of TLVs published and owned by a node in the DNCP
      network. Other nodes pass it along as-is, even if they cannot
      fully interpret it.</c>

      <c /><c />

      <c>Node state</c>
      <c>a set of metadata attributes for node data. It includes a sequence
      number for versioning, a hash value for comparing equality of stored
      node data, and a timestamp indicating the time passed since its last
      publication. The hash function and the length of the hash value are
      defined in the DNCP profile.</c>

      <c /><c />

      <c>Network state hash</c>
      <c>a hash value which represents the current state of the network.
      The hash function and the length of the hash value are defined in
      the DNCP profile.

      Whenever a node is added or removed, or updates its published node
      data, this hash value changes as well.

      For calculation, please see <xref target="merkle" />.

      </c>

      <c /><c />

      <c>Trust verdict</c>
      <c>a statement about the trustworthiness of a
      certificate announced by a node participating in the certificate
      based trust consensus mechanism.</c>

      <c /><c />

      <c>Effective trust verdict</c>

      <c>the trust verdict with the highest priority within the set of
      trust verdicts announced for the certificate in the DNCP network.</c>

      <c /><c />

      <c>Topology graph</c>
      <c>the undirected graph of DNCP nodes produced by
      retaining only bidirectional peer relationships between nodes.</c>

      </texttable>


    <section anchor="kwd" title='Requirements Language'>

      <t>
       The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
       NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
       RECOMMENDED",  "MAY", and "OPTIONAL" in this document are to
       be interpreted as described in <xref target='RFC2119'>RFC 2119</xref>.
      </t>

    </section>
    </section>


    <section title="Overview">

      <t>DNCP operates primarily using unicast exchanges between nodes, and
      may use multicast for Trickle-based shared state dissemination and
      topology discovery. If used in pure unicast mode with unreliable
      transport, Trickle is also used between peers.</t>

      <t>DNCP discovers the topology of its nodes and maintains the
      liveliness of published node data by ensuring that the publishing
      node was - at least recently - bidirectionally reachable.  This is
      determined, e.g., by a recent and consistent multicast or unicast TLV
      exchange with its peers. New potential peers can be discovered
      autonomously on multicast-enabled links; alternatively, their
      addresses may be manually configured, or they may be found by some
      other means defined in a later specification.</t>

      <t>A Merkle tree is maintained by each node to represent the state of all
      currently reachable nodes and the Trickle algorithm is used to trigger
      synchronization. Consistency among neighboring nodes
      is thereby determined by comparing the current root of their respective
      trees, i.e., their individually calculated network state hashes.</t>

      <t>Before joining a DNCP network, a node starts with a Merkle tree
      (and therefore a calculated network state hash) only consisting of
      the node itself. It then announces said hash by means of the Trickle
      algorithm on all its configured endpoints.</t>

      <t>When an update is detected by a node (e.g., by receiving an
      inconsistent network state hash from a peer) the originator of the
      event is requested to provide a list of the state of all nodes,
      i.e., all the information it uses to calculate its own Merkle
      tree. The node uses the list to determine whether its own information
      is outdated and - if necessary - requests the actual node data that
      has changed.</t>

      <t>Whenever a node's local copy of any node data and its Merkle tree are
      updated (e.g., due to its own or another node's node state changing or
      due to a peer being added or removed) its Trickle instances are reset
      which eventually causes any update to be propagated to all of its
      peers.</t>

    </section>

    <section title="Operation">

      <section title="Merkle Tree" anchor="merkle">

        <t>Each DNCP node maintains a Merkle tree of height 1 to manage
        state updates of individual DNCP nodes, the leaves of the tree, and
        the network as a whole, the root of the tree.</t>

        <t>Each leaf represents one recently bidirectionally reachable node
        (see <xref target="liveliness" />), and is represented by a tuple
        consisting of the node's update sequence number in network byte
        order concatenated with the hash-value of the node's ordered node
        data published in the <xref target="node-state">Node State
        TLV</xref>. These leaves are ordered in ascending order of the
        respective node identifiers.

        The root of the tree - the network state hash - is represented by
        the hash-value calculated over all such leaf tuples concatenated in
        order.  It is used to determine whether the view of the network of
        two or more nodes is consistent and shared.</t>

        <t>The leaves and the root network state hash are updated on-demand
        and whenever any locally stored per-node state changes. This
        includes local unidirectional reachability encoded in the published
        <xref target="neighbor">Neighbor TLV</xref>s and - when combined
        with remote data - results in awareness of bidirectional
        reachability changes.</t>
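
        <t>The following non-normative Python sketch illustrates one way to
        compute the network state hash described above. The hash function
        (SHA-256 truncated to 8 bytes) and the names HASH_LEN, dncp_hash
        and network_state_hash are illustrative assumptions; the actual
        hash function and hash length are defined by the DNCP profile. The
        node data is assumed to be already serialized as ordered TLVs.</t>

        <figure><artwork><![CDATA[
```python
import hashlib
import struct

HASH_LEN = 8  # assumption: hash length in bytes is profile-defined


def dncp_hash(data: bytes) -> bytes:
    """Hypothetical profile hash: SHA-256 truncated to HASH_LEN bytes."""
    return hashlib.sha256(data).digest()[:HASH_LEN]


def network_state_hash(nodes: dict) -> bytes:
    """Compute the root of the height-1 Merkle tree.

    nodes maps node identifier (bytes) to a tuple of
    (update sequence number, serialized ordered node data)
    for every currently reachable node.
    """
    leaves = b""
    for node_id in sorted(nodes):  # ascending node identifier order
        seq, node_data = nodes[node_id]
        # Leaf: 32-bit sequence number in network byte order,
        # concatenated with the hash of the node data.
        leaves += struct.pack("!I", seq) + dncp_hash(node_data)
    return dncp_hash(leaves)
```
]]></artwork></figure>

        <t>Because the leaves are sorted by node identifier before hashing,
        two nodes holding the same per-node state obtain the same root
        regardless of the order in which they learned that state.</t>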

      </section>

      <section anchor="dt" title="Data Transport">

        <t>DNCP has relatively few requirements for the underlying
        transport; it requires some way of transmitting either unicast
        datagram or stream data to a peer and, if used in multicast mode, a
        way of sending multicast datagrams.

        As multicast is used only to identify potential new DNCP nodes and
        to send status messages which merely notify that a unicast exchange
        should be triggered, the multicast transport does not have to be
        secured.

        If unicast security is desired and one of the built-in security
        methods is to be used, support for some TLS-derived transport
        scheme - such as <xref target="RFC5246">TLS</xref> on top of TCP or
        <xref target="RFC6347">DTLS</xref> on top of UDP - is also
        required.

        A specific definition of the transport(s) in use and their parameters
        MUST be provided by the DNCP profile.</t>

        <t>TLVs are sent across the transport as is, and they SHOULD be
        sent together where, e.g., MTU considerations do not recommend
        sending them in multiple batches. TLVs in general are handled
        individually and statelessly, with one exception. To form
        bidirectional peer relationships DNCP requires identification of
        the endpoints used for communication. A DNCP node desiring a
        bidirectional peer relationship therefore MUST send an <xref
        target="endpoint">Endpoint TLV</xref>. When it is sent depends on
        the underlying transport:

        <list style="symbols">

          <t>If using a stream transport, the TLV SHOULD be sent only once
          within the stream. </t>

          <t>If using datagram transport, it MUST be included in every
          datagram.</t>
        </list>

        A bidirectional peer
        relationship is not necessary for read-only access to the DNCP
        state, but it is required to be able to publish data.
        </t>

        <t>Given the assorted transport options as well as potential
        endpoint configuration, a DNCP endpoint may be used in various
        transport modes:

        <list style="hanging">

          <t hangText="Unicast:">
            <list style="symbols">

              <t>If only reliable unicast transport is employed, Trickle is
              not used at all. Where Trickle reset occurs, a single
              <xref target="net-state">Network State TLV</xref> is sent
              instead to every unicast peer. Additionally, recently changed
              <xref target="node-state">Node State TLV</xref>s MAY be
              included. </t>

              <t>If only unreliable unicast transport is employed, Trickle
              state is kept for each peer and is used to send Network
              State TLVs at Trickle-determined intervals, as specified in
              <xref target="trickle-updates" />.</t>
            </list>
          </t>

          <t hangText="Multicast+Unicast:"> If multicast datagram transport
          is available on an endpoint, Trickle state is only maintained for the
          endpoint as a whole. It is used to send Network State TLVs at
          Trickle-determined intervals, as specified in <xref
          target="trickle-updates" />. Additionally, per-endpoint
          keep-alives MAY be defined
          in the DNCP profile, as specified in <xref target="pe-ka" />.</t>

          <t hangText="MulticastListen+Unicast:">
            Just like Unicast, except multicast transmissions are listened to
            in order to detect changes of the highest node identifier.
            This mode is used only if the DNCP profile supports <xref
            target="broadcast">dense broadcast link optimization</xref>.</t>
        </list>
        </t>
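
        <t>As a non-normative illustration, the transport modes above can
        be reduced to a small decision function that states what happens at
        the point where a Trickle reset would occur. The names Mode and
        on_state_change and the returned action labels are hypothetical and
        are not defined by DNCP.</t>

        <figure><artwork><![CDATA[
```python
from enum import Enum


class Mode(Enum):
    UNICAST = 1
    MULTICAST_UNICAST = 2
    MULTICAST_LISTEN_UNICAST = 3


def on_state_change(mode: Mode, reliable_unicast: bool) -> str:
    """Return a label for the action taken where a Trickle reset occurs."""
    if mode is Mode.MULTICAST_UNICAST:
        # Trickle state is kept only for the endpoint as a whole.
        return "reset-endpoint-trickle"
    # Unicast and MulticastListen+Unicast send in the same way.
    if reliable_unicast:
        # No Trickle: one Network State TLV goes to every unicast peer.
        return "send-network-state-to-each-peer"
    # Unreliable unicast: Trickle state is kept per peer.
    return "reset-per-peer-trickle"
```
]]></artwork></figure>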

      </section>

      <section title="Trickle-Driven Status Updates"
               anchor="trickle-updates">

        <t>The Trickle algorithm has 3 parameters: Imin, Imax and k. Imin
        and Imax represent the minimum and maximum values for I, which is
        the time interval during which at least k Trickle updates must be
        seen on an endpoint to prevent local state transmission.  The
        actual suggested Trickle algorithm parameters are DNCP profile
        specific, as described in <xref target="profile-bits"/>.</t>

        <t>The Trickle state for all Trickle instances is considered
        inconsistent and reset if and only if the locally calculated
        network state hash changes. This occurs either due to a change in
        the local node's own node data, or due to receipt of more recent
        data from another node.</t>

        <t>Every time a particular Trickle instance indicates that an
        update should be sent, the node MUST send a <xref
        target="net-state">Network State TLV</xref> if and only if one of
        the following holds:
        <list style="symbols">

          <t>the endpoint is in Multicast+Unicast transport mode, in which
          case the TLV MUST be sent over multicast.</t>

          <t>the endpoint is NOT in Multicast+Unicast transport mode, and the
          unicast transport is unreliable, in which case the TLV MUST be sent
          over unicast.</t>

        </list>
        </t>

        <t>A (sub)set of all <xref target="node-state">Node State
        TLVs</xref> MAY also be included, unless the DNCP profile defines
        this as undesirable, or unless doing so would expose the Node
        State TLVs over insecure multicast while secure unicast is also
        in use.</t>
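
        <t>The Trickle state referred to in this section can be sketched as
        follows. This is a simplified, non-normative rendering of the rules
        in <xref target="RFC6206" />; the class name Trickle and its method
        names are illustrative assumptions, Imax is treated as a maximum
        interval length in seconds, and timer scheduling is left to the
        caller.</t>

        <figure><artwork><![CDATA[
```python
import random


class Trickle:
    """Simplified Trickle state (I, t, c); all times in seconds."""

    def __init__(self, imin, imax, k, rng=random):
        self.imin, self.imax, self.k, self.rng = imin, imax, k, rng
        self.i = imin
        self._start_interval()

    def _start_interval(self):
        self.c = 0  # consistency counter
        # Transmission time within the current interval, in [I/2, I].
        self.t = self.rng.uniform(self.i / 2, self.i)

    def heard_consistent(self):
        """A Network State TLV with a matching hash was received."""
        self.c += 1

    def should_send(self):
        """Evaluated at time t: send only if fewer than k were heard."""
        return self.c < self.k

    def interval_expired(self):
        """Double I, capped at Imax, and begin a new interval."""
        self.i = min(self.i * 2, self.imax)
        self._start_interval()

    def reset(self):
        """Local network state hash changed: fall back to Imin."""
        if self.i > self.imin:
            self.i = self.imin
            self._start_interval()
```
]]></artwork></figure>

        <t>In steady state the interval grows to Imax and transmissions are
        suppressed whenever k consistent hashes have already been heard,
        which is what keeps DNCP traffic low when nothing changes.</t>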

      </section>

      <section title="Processing of Received TLVs" anchor="reception">

        <t>This section describes how received TLVs are processed. The DNCP
        profile may specify when to ignore particular TLVs, e.g., to modify
        security properties - see <xref target="profile-bits" /> for
        what may be safely defined to be ignored in a profile.

        Any 'reply' mentioned in the steps below denotes sending of the
        specified TLV(s) over unicast to the originator of the TLV being
        processed. If the TLV being replied to was received via multicast
        and it was sent to a link with shared bandwidth, the reply SHOULD
        be delayed by a random timespan in [0, Imin/2], to avoid potential
        simultaneous replies that may cause problems on some links. Sending
        of replies MAY also be rate-limited or
        omitted for a short period of time by an implementation. However, an
        implementation MUST eventually reply to similar repeated requests,
        as otherwise state synchronization would break.</t>

        <t>A DNCP node MUST process TLVs received from any valid address,
        as specified by a given DNCP profile and the configuration of a
        particular endpoint, whether this address is known to be the
        address of a neighbor or not. This provision satisfies the needs of
        monitoring or other host software that needs to discover the DNCP
        topology without adding to the state in the network.</t>

        <t>Upon receipt of:
        <list style="symbols">

          <t><xref target="req-net-state">Request Network State TLV</xref>:

          The receiver MUST reply with a <xref target="net-state">Network
          State TLV</xref> and a <xref target="node-state">Node State
          TLV</xref> for each node data used to calculate the network state
          hash. The Node State TLVs MUST NOT contain the optional node
          data part unless explicitly specified in the DNCP profile.</t>

          <t><xref target="req-node-state">Request Node State TLV</xref>:

          If the receiver has node data for the corresponding node, it MUST
          reply with a <xref target="node-state">Node State TLV</xref> for
          the corresponding node. The optional node data part MUST be
          included in the TLV.</t>

          <t><xref target="net-state">Network State TLV</xref>:

          If the network state hash differs from the locally calculated
          network state hash, and the receiver is unaware of any particular
          node state differences with the sender, the receiver MUST reply
          with a <xref target="req-net-state">Request Network State
          TLV</xref>. These replies MUST be rate limited to at most
          one reply per link per unique network state hash within Imin. The
          simplest way to enforce this rate limit is to keep a timestamp of
          the most recent request and to send at most one <xref
          target="req-net-state">Request Network State TLV</xref> per Imin.

          To facilitate faster state synchronization, if a Request Network
          State TLV is sent in a reply, a local, current Network State TLV
          MAY also be sent.</t>

          <t><xref target="node-state">Node State TLV</xref>:

          <list style="symbols">

            <t>If the node identifier matches the local node identifier and
            the TLV has a greater update sequence number than its current
            local value, or the same update sequence number and a different
            hash, the node SHOULD re-publish its own node data with an
            update sequence number significantly (e.g., 1000) greater than
            the received one, to reclaim the node identifier.
            This may occur normally once due to the local
            node restarting and not storing the most recently used update
            sequence number. If this occurs more than once or for nodes
            not re-publishing their own node data, the DNCP profile
            MUST provide guidance on how to handle these situations as
            it indicates the existence of another active node with the same
            node identifier.</t>

            <t>If the node identifier does not match the local node
            identifier, and one or more of the following conditions are
            true:

            <list style="symbols">

              <t>The local information is outdated for the corresponding node
              (local update sequence number is less than that within the
              TLV).</t>

              <t>The local information is potentially incorrect (local update
              sequence number matches but the node data hash differs).</t>

              <t>There is no data for that node altogether.</t>

            </list>

            Then:

            <list style="symbols">

              <t>If the TLV does not contain node data, and the hash of the
              node data differs, the receiver MUST reply with a <xref
              target="req-node-state">Request Node State TLV</xref> for the
              corresponding node.</t>

              <t>Otherwise the receiver MUST update its locally stored
              state for that node (node data if present, update sequence
              number, relative time) to match the received TLV.</t>
            </list>
            </t>


          </list>


          For comparison purposes of the update sequence number,
          a looping comparison function MUST be used to avoid problems in
          case of overflow. The comparison function
          a &lt; b &lt;=&gt; (a - b) % 2^32 &amp; 2^31 != 0
          is RECOMMENDED unless the DNCP profile defines another.

          </t>

          <t>Any other TLV:

          TLVs not recognized by the receiver MUST be silently ignored.</t>

        </list>
        </t>
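
        <t>The recommended looping comparison of update sequence numbers
        can be written as the following non-normative Python sketch. The
        function name seq_less is an illustrative assumption.</t>

        <figure><artwork><![CDATA[
```python
def seq_less(a: int, b: int) -> bool:
    """Return True if 32-bit sequence number a precedes b.

    The difference is taken modulo 2^32; if its top bit is set,
    a is considered older than b even across a wrap-around.
    """
    return ((a - b) % 2**32) & 2**31 != 0
```
]]></artwork></figure>

        <t>With this function a sequence number of 0xFFFFFFFF is treated as
        older than 0, so a node whose counter wraps around is still seen as
        publishing newer data.</t>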

        <t>If secure unicast transport is configured for an endpoint, any
        Node State TLVs received over insecure multicast MUST be silently
        ignored.</t>
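
        <t>The handling of a received Node State TLV described in this
        section can be summarized by the following non-normative sketch.
        The NodeState tuple, the function names and the returned action
        labels are hypothetical; the sketch also assumes that the local
        node's own state is present in the store, and treats a TLV without
        node data for an unknown node as having a differing hash.</t>

        <figure><artwork><![CDATA[
```python
from typing import NamedTuple, Optional


class NodeState(NamedTuple):
    node_id: bytes
    seq: int
    data_hash: bytes
    data: Optional[bytes] = None  # optional node data part


def seq_less(a, b):
    return ((a - b) % 2**32) & 2**31 != 0


def differs(known, tlv):
    """True if the TLV is newer, or same-numbered with another hash."""
    return seq_less(known.seq, tlv.seq) or (
        known.seq == tlv.seq and known.data_hash != tlv.data_hash)


def on_node_state(local_id, store, tlv):
    """Return the action for a received Node State TLV.

    store maps node identifier to the locally stored NodeState.
    """
    if tlv.node_id == local_id:
        # Someone else published newer data under our identifier.
        return "republish" if differs(store[local_id], tlv) else "ignore"
    known = store.get(tlv.node_id)
    if known is not None and not differs(known, tlv):
        return "ignore"
    if tlv.data is None and (
            known is None or known.data_hash != tlv.data_hash):
        return "request-node-state"
    # Keep the stored node data if the TLV did not carry any.
    data = tlv.data if tlv.data is not None else known.data
    store[tlv.node_id] = tlv._replace(data=data)
    return "updated"
```
]]></artwork></figure>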

      </section>

      <section title="Adding and Removing Peers">

        <t>When receiving a <xref target="endpoint">Node Endpoint
        TLV</xref> on an endpoint from an unknown peer:

        <list style="symbols">

          <t>If received over unicast, the remote node MUST be added as a
          peer on the endpoint and a <xref target="neighbor">Neighbor
          TLV</xref> MUST be created for it.
          </t>

          <t>If received over multicast, the node MAY be sent a (possibly
          rate-limited) unicast <xref target="req-net-state">Request
          Network State TLV</xref>.</t>

        </list>
        </t>

        <t>If keep-alives specified in <xref target="ka" /> are NOT sent by
        the peer (either the DNCP profile does not specify the use of
        keep-alives or the particular peer chooses not to send
        keep-alives), some other existing local transport-specific
        means (such as Ethernet carrier-detection or TCP keep-alive)
        MUST be employed to ensure its presence.
        When the peer is no longer present, the Neighbor
        TLV and the local DNCP peer state MUST be removed.</t>

        <t>If the local endpoint is in the MulticastListen+Unicast
        transport mode, a <xref target="neighbor">Neighbor TLV</xref> MUST
        NOT be published for the peers not having the highest node
        identifier.</t>

      </section>

      <section anchor="liveliness" title="Data Liveliness Validation">

        <t>When a Neighbor TLV or a whole node is added or removed, the
        topology graph MUST be traversed either immediately or with a
        small delay shorter than the DNCP profile-defined Trickle Imin.</t>

        <t>The topology graph traversal starts with the local node. An
        edge between two nodes is traversed only if both nodes publish a
        Neighbor TLV that has the other node's node identifier in the
        Neighbor Node Identifier field, with the local and neighbor
        endpoint identifiers swapped between the two TLVs. Each node
        reached is marked currently reachable.</t>

        <t>DNCP nodes that have not been reachable in the most recent
        topology graph traversal MUST NOT be used for calculation of the
        network state hash, be provided to any applications that need to
        use the whole TLV graph, or be provided to remote nodes. They
        MAY be removed immediately after the topology graph traversal;
        however, it is RECOMMENDED to keep them at least briefly to improve the
        speed of DNCP network state convergence and to reduce the number of
        redundant state transmissions between nodes.</t>
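
        <t>The traversal can be sketched as a breadth-first search, shown
        below as a non-normative example. Each Neighbor TLV is modeled as a
        (neighbor node identifier, neighbor endpoint identifier, local
        endpoint identifier) tuple; this representation and the function
        name reachable_nodes are illustrative assumptions.</t>

        <figure><artwork><![CDATA[
```python
from collections import deque


def reachable_nodes(local_id, neighbors):
    """Return the set of nodes reachable over bidirectional edges.

    neighbors maps a node identifier to the set of
    (neighbor node id, neighbor endpoint id, local endpoint id)
    tuples, one per Neighbor TLV published by that node.
    """
    reached = {local_id}
    queue = deque([local_id])
    while queue:
        node = queue.popleft()
        for peer, peer_ep, own_ep in neighbors.get(node, ()):
            if peer in reached:
                continue
            # The edge counts only if the peer publishes the mirrored
            # TLV, i.e. with the endpoint identifiers swapped.
            if (node, own_ep, peer_ep) in neighbors.get(peer, ()):
                reached.add(peer)
                queue.append(peer)
    return reached
```
]]></artwork></figure>

        <t>Nodes absent from the returned set are the ones that are
        excluded from the network state hash calculation.</t>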

      </section>

    </section>

    <section anchor="dm" title="Data Model">

      <t>This section describes the local data structures a minimal
      implementation might use. This section is provided only as a
      convenience for the implementor. Some of the <xref
      target="ext">optional extensions</xref> describe additional data
      requirements, and some optional parts of the core protocol may also
      require more.</t>

      <t>A DNCP node has:

      <list style="symbols">

        <t>A data structure containing data about the most recently sent
        <xref target="req-net-state">Request Network State TLVs</xref>.
        The simplest option is keeping a timestamp of the most recent request
        (required to fulfill reply rate limiting specified in <xref target="reception" />).</t>

      </list>
      </t>

      <t>A DNCP node has for every DNCP node in the DNCP network:

      <list style="symbols">

        <t>Node identifier: the unique identifier of the node. The length,
        how it is produced, and how collisions are handled are up to the
        particular DNCP profile.</t>

        <t>Node data: the set of TLV tuples published by that particular
        node. As they are transmitted ordered (see <xref
        target="node-state">Node State TLV</xref> for details), maintaining
        the order within the data structure here may be reasonable. </t>

        <t>Latest update sequence number: the 32-bit sequence number that
        is incremented any time the TLV set is published. The comparison
        function used to compare them is described in <xref
        target="reception" />.</t>

        <t>Origination time: the (estimated) time when the
        current TLV set with the current update sequence number was
        published.

        It is used to populate the Milliseconds Since Origination field in
        a <xref target="node-state">Node State TLV</xref>. Ideally it also
        has millisecond accuracy.
        </t>

      </list>
      </t>

      <t>Additionally, a DNCP node has a set of endpoints for which DNCP
      is configured to be used. For each such endpoint, a node has:
      <list style="symbols">

        <t>Endpoint identifier: the 32-bit opaque value uniquely
        identifying it within the local node.</t>

        <t>Trickle instance: the endpoint's Trickle instance with
        parameters I, T, and c (only on an endpoint in Multicast+Unicast
        transport mode).</t>

      </list>
      </t>

      <t>and one (or more) of the following:
      <list style="symbols">
	<t>Interface: the assigned local network interface.</t>

        <t>Unicast address: the DNCP node it should connect with.</t>

        <t>Range of addresses: the DNCP nodes that are allowed to connect.</t>
      </list>
      </t>

      <t>For each remote (peer, endpoint) pair detected on a
      local endpoint, a DNCP node has:

      <list style="symbols">

        <t>Node identifier: the unique identifier of the peer.</t>

        <t>Endpoint identifier: the unique endpoint identifier used by the
        peer.</t>

        <t>Peer address: the most recently used address of the peer
        (authenticated and authorized, if security is enabled).</t>

        <t>Trickle instance: the particular peer's Trickle instance with
        parameters I, T, and c (only on a unicast-only endpoint with
        unreliable unicast transport).</t>

      </list>
      </t>
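
      <t>As a non-normative illustration, the data model above could be
      expressed with the following Python data classes. All class and field
      names are hypothetical, the Trickle instances are left as opaque
      objects, and a real implementation would likely carry more state.</t>

      <figure><artwork><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class NodeEntry:
    """State kept for every DNCP node in the network."""
    node_id: bytes
    node_data: bytes = b""          # ordered TLV set, serialized
    update_seq: int = 0             # 32-bit update sequence number
    origination_time: float = 0.0   # estimated publication time


@dataclass
class Peer:
    """State kept for each remote (peer, endpoint) pair."""
    node_id: bytes
    endpoint_id: int
    address: str
    trickle: Optional[object] = None  # only with unreliable unicast


@dataclass
class Endpoint:
    """State kept for each locally configured endpoint."""
    endpoint_id: int
    interface: Optional[str] = None
    trickle: Optional[object] = None  # only in Multicast+Unicast mode
    peers: Dict[Tuple[bytes, int], Peer] = field(default_factory=dict)


@dataclass
class DncpNode:
    """Top-level state of the local DNCP node."""
    node_id: bytes
    last_request_time: float = 0.0  # for request rate limiting
    nodes: Dict[bytes, NodeEntry] = field(default_factory=dict)
    endpoints: Dict[int, Endpoint] = field(default_factory=dict)
```
]]></artwork></figure>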
    </section>



    <section anchor="ext" title="Optional Extensions">

      <t>This section specifies extensions to the core protocol that a DNCP
      profile may use.</t>

      <section anchor="ka" title="Keep-Alives">

        <t><xref target="trickle-updates">Trickle-driven status
        updates</xref> provide a mechanism for handling of new peer
        detection on an endpoint, as well as state change
        notifications. Another mechanism may be needed to get rid of old,
        no longer valid peers if the transport or lower layers do not
        provide one.</t>

        <t>If keep-alives are not specified in the DNCP profile, the rest
        of this subsection MUST be ignored.</t>

        <t>A DNCP profile MAY specify either per-endpoint or per-peer
        keep-alive support. </t>

        <t>For every endpoint that a keep-alive is specified for in the
        DNCP profile, the endpoint-specific keep-alive interval MUST be
        maintained. By default, it is DNCP_KEEPALIVE_INTERVAL. If a
        different local value is preferred for any reason (configuration,
        energy conservation, media type, etc.), it can be used
        instead. If a non-default keep-alive interval is used on any
        endpoint, a DNCP node MUST publish appropriate <xref
        target="ka-interval">Keep-Alive Interval TLV(s)</xref> within its
        node data.</t>

        <section title="Data Model Additions">

          <t>The following additions to the <xref target="dm">Data
          Model</xref> are needed to support keep-alives:</t>

          <t>For each configured endpoint that has per-endpoint keep-alives
          enabled:

          <list style="symbols">
            <t>Last sent: a timestamp which indicates the last time a
            <xref target="net-state">Network State TLV</xref> was sent over
            that interface.</t>
          </list>
          </t>

          <t>For each remote (peer, endpoint) pair detected on a
          local endpoint, a DNCP node has:


          <list style="symbols">

            <t>Last contact timestamp: a timestamp which indicates the last
            time a consistent <xref target="net-state">Network State
            TLV</xref> was received from the peer over multicast, or anything
            was received over unicast. When adding a new peer, it is
            initialized to the current time.</t>

            <t>Last sent: If per-peer keep-alives are enabled, a timestamp
            which indicates the last time a <xref
            target="net-state">Network State TLV</xref> was sent to that
            point-to-point peer. When adding a new peer, it is initialized
            to the current time.</t>

          </list>
          </t>

        </section>

        <section anchor="pe-ka" title="Per-Endpoint Periodic Keep-Alives">

          <t>If per-endpoint keep-alives are enabled on an endpoint in
          Multicast+Unicast transport mode, and if no traffic containing a
          <xref target="net-state">Network State TLV</xref> has been sent
          to a particular endpoint within the endpoint-specific keep-alive
          interval, a <xref target="net-state">Network State TLV</xref>
          MUST be sent on that endpoint, and a new Trickle transmission
          time 't' in [I/2, I] MUST be randomly chosen. The actual sending
          time SHOULD be further delayed by a random timespan in [0,
          Imin/2].</t>
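
          <t>As a non-normative illustration, the rule above can be
          sketched in Python as follows. The function names, the
          parameter names, and the use of seconds as the time unit are
          assumptions made for this example only; DNCP does not define
          them.</t>

          <figure>
            <artwork><![CDATA[
```python
import random


def keepalive_due(now, last_sent, keepalive_interval):
    """True if no Network State TLV was sent within the interval."""
    return now - last_sent >= keepalive_interval


def schedule_after_keepalive(now, trickle_i, trickle_imin, rng=random):
    """Return (send_time, next_trickle_t) once a keep-alive is due.

    send_time:      'now' plus a random delay in [0, Imin/2].
    next_trickle_t: new Trickle transmission time 't' in [I/2, I],
                    relative to the start of the Trickle interval.
    """
    send_time = now + rng.uniform(0, trickle_imin / 2)
    next_trickle_t = rng.uniform(trickle_i / 2, trickle_i)
    return send_time, next_trickle_t
```
]]></artwork>
          </figure>

          <t>A caller would evaluate keepalive_due() per endpoint and, when
          it returns true, send the Network State TLV at send_time and
          re-arm its Trickle timer with next_trickle_t.</t>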

        </section>

        <section title="Per-Peer Periodic Keep-Alives">

          <t>If per-peer keep-alives are enabled on a unicast-only
          endpoint, and if no traffic containing a <xref
          target="net-state">Network State TLV</xref> has been sent to a
          particular peer within the endpoint-specific keep-alive interval,
          a <xref target="net-state">Network State TLV</xref> MUST be sent to
          the peer and a new Trickle transmission time 't' in [I/2, I] MUST
          be randomly chosen.</t>

        </section>

        <section title="Received TLV Processing Additions">

          <t>If a TLV is received over unicast from the peer, the Last
          contact timestamp for the peer MUST be updated.</t>

          <t>On receipt of a <xref target="net-state">Network State TLV</xref>
          which is consistent with the locally calculated network state hash,
          the Last contact timestamp for the peer MUST be updated.</t>

        </section>

        <section title="Neighbor Removal">

          <t>For every peer on every endpoint, the endpoint-specific
          keep-alive interval must be calculated by looking for <xref
          target="ka-interval">Keep-Alive Interval TLVs</xref> published by
          the node, and if none exist, using the default value of
          DNCP_KEEPALIVE_INTERVAL. If the peer's last contact
          timestamp has not been updated for at least a locally chosen,
          potentially endpoint-specific keep-alive multiplier (which
          defaults to DNCP_KEEPALIVE_MULTIPLIER) times the peer's
          endpoint-specific keep-alive interval, the Neighbor TLV for that
          peer and the local DNCP peer state MUST be removed.</t>
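
          <t>A minimal, non-normative Python sketch of this check is
          shown below. The function names and the dictionary
          representation of the peer's Keep-Alive Interval TLVs are
          hypothetical. Treating an interval of zero as "never expire via
          keep-alives" is an interpretation of the Keep-Alive Interval TLV
          description, not something this section states explicitly.</t>

          <figure>
            <artwork><![CDATA[
```python
def peer_keepalive_interval(ka_tlvs, peer_endpoint_id, default_ms):
    """Pick the keep-alive interval that applies to a peer endpoint.

    ka_tlvs maps endpoint identifier -> interval in milliseconds, as
    published by the peer in its Keep-Alive Interval TLVs.  Endpoint
    identifier 0 applies to all endpoints without a specific TLV.
    """
    if peer_endpoint_id in ka_tlvs:
        return ka_tlvs[peer_endpoint_id]
    if 0 in ka_tlvs:
        return ka_tlvs[0]
    return default_ms


def peer_expired(now_ms, last_contact_ms, interval_ms, multiplier):
    """True if the Neighbor TLV and peer state should be removed."""
    if interval_ms == 0:
        # Assumption: no keep-alives are sent at all, so a lower layer
        # mechanism is responsible for detecting that the peer is gone.
        return False
    return now_ms - last_contact_ms >= multiplier * interval_ms
```
]]></artwork>
          </figure>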

        </section>

      </section>

      <section anchor="broadcast" title="Support For Dense Broadcast Links">

        <t>This optimization is needed to avoid a state space explosion.
        Given a large set of DNCP nodes publishing data on an endpoint
        that actually uses multicast on a link, every node will add a
        <xref target="neighbor">Neighbor TLV</xref> for each peer.
        While Trickle limits the amount of traffic on the link in
        stable state to some extent, the total amount of data that is added
        to and maintained in the DNCP network given N nodes on a
        multicast-enabled link is O(N^2). Additionally, if per-peer
        keep-alives are employed, there will be O(N^2) keep-alives running
        on the link unless liveliness of peers is ensured by some other
        means (e.g., TCP connection lifetime, layer 2 notification, or
        per-endpoint keep-alives). </t>

        <t>An upper bound for the number of neighbors that are allowed for
        a particular type of link that an endpoint in Multicast+Unicast
        transport mode is used on SHOULD be provided by a DNCP profile, but
        MAY also be chosen at runtime.

        The main considerations when selecting a bound (if any)
        for a particular type of link should be whether it supports
        broadcast traffic, and whether an excessively large number of
        neighbors is likely to occur during the use of that particular DNCP
        profile on that particular type of link. If neither is likely, there
        is little point in specifying support for this for that particular
        link type.</t>

        <t>If a DNCP profile does not support this extension at all, the
        rest of this subsection MUST be ignored. This is because when this
        extension is employed, the state within the DNCP network only
        contains a subset of the full topology of the network. Therefore
        every node must be aware of the potential of it being used in a
        particular DNCP profile.</t>

        <t>If the specified upper bound is exceeded for some endpoint in
        Multicast+Unicast transport mode and if the node does not have the
        highest node identifier on the link, it SHOULD treat the endpoint
        as a unicast endpoint connected to the node that has the highest
        node identifier detected on the link, therefore transitioning to
        Multicast-listen+Unicast transport mode. The nodes in
        Multicast-listen+Unicast transport mode MUST keep listening to
        multicast traffic, both to receive messages from the node(s) still
        in Multicast+Unicast mode and to react to nodes with a
        greater node identifier appearing. If the highest node identifier
        present on the link changes, the remote unicast address of the
        endpoints in Multicast-listen+Unicast transport mode MUST be
        changed. If the node identifier of the local node is the highest
        one, the node MUST switch back to, or stay in, Multicast+Unicast
        mode and form peer relationships with all peers as usual.</t>
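
        <t>The mode selection described above can be sketched,
        non-normatively, in Python. The function name and return
        convention are hypothetical, and the sketch assumes node
        identifiers are equal-length byte strings compared as unsigned
        big-endian values.</t>

        <figure>
          <artwork><![CDATA[
```python
def choose_transport_mode(local_id, peer_ids, max_neighbors):
    """Return (mode, unicast_target) for one endpoint.

    local_id and peer_ids are equal-length bytes objects; Python
    compares them lexicographically, which matches unsigned
    big-endian ordering for equal lengths (an assumption here).
    """
    if len(peer_ids) <= max_neighbors:
        # Bound not exceeded: peer normally with everyone.
        return ("Multicast+Unicast", None)
    highest = max([local_id, *peer_ids])
    if highest == local_id:
        # The node with the highest identifier stays in (or switches
        # back to) Multicast+Unicast mode.
        return ("Multicast+Unicast", None)
    # Everyone else peers only with the highest-identifier node.
    return ("Multicast-listen+Unicast", highest)
```
]]></artwork>
        </figure>

        <t>Re-running the function whenever a peer appears or disappears
        covers the case where the highest node identifier on the link
        changes.</t>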

      </section>

      <section anchor="fragmentation" title="Node Data Fragmentation">

        <t>A DNCP profile may be required to support node data which would
        not fit the maximum size of a single <xref target="node-state">Node
        State TLV</xref> (roughly 64KB of payload), or use a datagram-only
        transport with a limited MTU and no reliable support for
        fragmentation. To handle such cases, a DNCP profile MAY specify a
        fixed number of trailing bytes in the node identifier to represent
        a fragment number indicating a part of a node's node data. The
        profile MAY also specify an upper bound for the size of a single
        fragment to accommodate limitations of links in the network. Note
        that the maximum size of a fragment also constrains the maximum size
        of a single TLV published by a node. </t>

        <t>The data within Node State TLVs of all fragments MUST be valid,
        as specified in <xref target="node-state" />. The locally used node
        data for a particular node MUST be produced by concatenating node
        data in each fragment, in ascending fragment number order. The
        locally used concatenated node data MUST still follow the ordering
        described in <xref target="node-state" />.</t>
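
        <t>As a non-normative sketch, the concatenation rule can be
        written in Python as follows. The function names and the
        dictionary representation are hypothetical, and the fragment
        number is assumed to be an unsigned integer in network byte
        order.</t>

        <figure>
          <artwork><![CDATA[
```python
def split_node_id(node_id, fragment_bytes):
    """Split a node identifier into (base, fragment_number).

    The fragment number occupies the trailing 'fragment_bytes' bytes;
    big-endian decoding is an assumption of this sketch.
    """
    base = node_id[:-fragment_bytes]
    fragment = int.from_bytes(node_id[-fragment_bytes:], "big")
    return base, fragment


def assemble_node_data(fragments, fragment_bytes):
    """Concatenate node data in ascending fragment number order.

    'fragments' maps the full node identifier (bytes) of each fragment
    of one node to the node data carried in that fragment.
    """
    ordered = sorted(
        fragments.items(),
        key=lambda item: split_node_id(item[0], fragment_bytes)[1],
    )
    return b"".join(data for _, data in ordered)
```
]]></artwork>
        </figure>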

        <t>Any transmitted node identifiers used to identify the local node
        or any other node MUST have the fragment number 0. For algorithm
        purposes, the relative time since the most recent fragment change
        MUST be used, regardless of fragment number. Therefore, even if only
        some of the node data fragments change, they are all considered
        refreshed if one of them is.</t>

        <t>If using fragmentation, the data liveliness validation defined
        in <xref target="liveliness" /> is extended so that if a <xref
        target="fragment-count">Fragment Count TLV</xref> is present within
        fragment number 0, all fragments up to the fragment number
        specified in the Count field are also considered reachable if
        fragment number 0 itself is reachable based on graph
        traversal. </t>

      </section>
    </section>

    <section anchor="tlvs" title="Type-Length-Value Objects">
      <t>
        Each TLV is encoded as a 2 byte type field, followed by a 2 byte
        length field (of the value excluding header, in bytes, 0 meaning no value)
        followed by the value itself, if any. Both type and length fields
        in the header as well as all integer fields inside the value -
        unless explicitly stated otherwise - are represented in network
        byte order. Padding bytes with value zero MUST be added up to the
        next 4 byte boundary if the length is not divisible by 4.  These
        padding bytes MUST NOT be included in the number stored in the length
        field.
      </t>

      <figure>
        <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Type               |           Length              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Value                             |
|                     (variable # of bytes)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        </artwork>
      </figure>

      <t>
        For example, type=123 (0x7b) TLV with value 'x' (120 =
        0x78) is encoded as: 007B 0001 7800 0000.
      </t>
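
      <t>The encoding rules above can be illustrated with the following
      non-normative Python sketch. The function names encode_tlv and
      decode_tlvs are hypothetical; the sketch reproduces the example
      encoding given above.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct


def encode_tlv(tlv_type, value=b""):
    """Encode one TLV: 2-byte type, 2-byte length, value, zero padding.

    The length field counts only the value bytes; padding up to the
    next 4-byte boundary is not included in it.
    """
    header = struct.pack("!HH", tlv_type, len(value))
    padding = b"\x00" * (-len(value) % 4)
    return header + value + padding


def decode_tlvs(buf):
    """Yield (type, value) pairs from a buffer of padded TLVs."""
    offset = 0
    while offset + 4 <= len(buf):
        tlv_type, length = struct.unpack_from("!HH", buf, offset)
        start = offset + 4
        end = start + length
        if end > len(buf):
            raise ValueError("truncated TLV")
        yield tlv_type, buf[start:end]
        # Skip the value and its padding to reach the next header.
        offset = end + (-length % 4)
```
]]></artwork>
      </figure>

      <t>With these definitions, encode_tlv(123, b"x") yields the octets
      007B 0001 7800 0000.</t>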

      <t>In this section, the following special notation is used:
      <list>

        <t>.. = octet string concatenation operation.</t>

        <t>H(x) = non-cryptographic hash function specified by DNCP
        profile. </t>

      </list>
      </t>


      <section title="Request TLVs">

        <section anchor="req-net-state" title="Request Network State TLV">

          <figure>
            <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Type: REQ-NETWORK-STATE (1)  |           Length: 0           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            </artwork>
          </figure>

          <t>This TLV is used to request a response containing a <xref
          target="net-state">Network State TLV</xref> and all <xref
          target="node-state">Node State TLVs</xref> (without node
          data).</t>

        </section>
        <section anchor="req-node-state" title="Request Node State TLV">


          <figure>
            <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Type: REQ-NODE-STATE (2)   |          Length: >0           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Node Identifier                        |
|                  (length fixed in DNCP profile)               |
...
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            </artwork>
          </figure>

          <t>This TLV is used to request a <xref target="node-state">
          Node State TLV</xref> (including node data) for the node
          with the matching node identifier.</t>

        </section>

      </section>
      <section title="Data TLVs">
        <section anchor="endpoint" title="Node Endpoint TLV">

          <figure>
            <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type: NODE-ENDPOINT (3)     |          Length: > 4          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Node Identifier                        |
|                  (length fixed in DNCP profile)               |
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Endpoint Identifier                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            </artwork>
          </figure>

          <t>This TLV identifies both the local node's node identifier and
          the particular endpoint's endpoint identifier. It is used
          when bidirectional peering is desired, as described in <xref
          target="dt" />. </t>

        </section>
        <section anchor="net-state" title="Network State TLV">

          <figure>
            <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Type: NETWORK-STATE (4)    |          Length: > 0          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      H(update number of node 1 .. H(node data of node 1) ..   |
|     .. update number of node N .. H(node data of node N))     |
|                  (length fixed in DNCP profile)               |
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            </artwork>
          </figure>

          <t>This TLV contains the current locally calculated network state
          hash, see <xref target="merkle" /> for how it is calculated.</t>
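
          <t>The value layout shown in the figure can be sketched,
          non-normatively, in Python. Several things here are
          assumptions: the hash H is a stand-in (truncated SHA-256),
          since the real function and its output length are fixed by the
          DNCP profile; the update number is taken to be a 32-bit value
          in network byte order; and nodes are assumed to be processed in
          ascending node identifier order.</t>

          <figure>
            <artwork><![CDATA[
```python
import hashlib
import struct


def H(data, length=8):
    """Placeholder hash; the DNCP profile specifies the real one."""
    return hashlib.sha256(data).digest()[:length]


def network_state_hash(nodes):
    """Compute the network state hash.

    'nodes' maps node identifier (bytes) to a tuple of
    (update number, node data bytes).
    """
    buf = b""
    for node_id in sorted(nodes):  # assumed: ascending node identifier
        update_number, node_data = nodes[node_id]
        buf += struct.pack("!I", update_number) + H(node_data)
    return H(buf)
```
]]></artwork>
          </figure>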

        </section>
        <section anchor="node-state" title="Node State TLV">

          <figure>
            <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Type: NODE-STATE (5)     |          Length: > 8          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Node Identifier                        |
|                  (length fixed in DNCP profile)               |
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Update Sequence Number                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                Milliseconds Since Origination                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         H(node data)                          |
|                  (length fixed in DNCP profile)               |
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|(optionally) Nested TLVs containing node information           |
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            </artwork>
          </figure>

          <t>This TLV represents the local node's knowledge about the
          published state of a node in the DNCP network identified by the
          Node Identifier field in the TLV.</t>

          <t>Every node, including the originating one, MUST update the
          Milliseconds Since Origination whenever it sends a Node
          State TLV based on when the node estimates the data was
          originally published. This is, e.g., to ensure that any relative
          timestamps contained within the published node data can be
          correctly offset and interpreted. Ultimately, what is provided is
          just an approximation, as transmission delays are not accounted
          for. </t>

          <t>Absent any changes, if the originating node notices that the
          32-bit milliseconds since origination value would be close to
          overflow (greater than 2^32-2^16), the node MUST re-publish its
          TLVs even if there is no change. In other words, absent any other
          changes, the TLV set MUST be re-published roughly every 49
          days (2^32 milliseconds is about 49.7 days).</t>
          <!-- TODO: (MSt) ^ is there better place for this paragraph? -->

          <t>The actual node data of the node may be included within the
          TLV as well. In a DNCP profile which supports fragmentation,
          described in <xref target="fragmentation" />, the TLV data may be
          only partial but it MUST contain full individual TLVs.

          This set of TLVs MUST be strictly ordered based on ascending
          binary content (including TLV type and length). This enables,
          e.g., efficient state delta processing and no-copy indexing by TLV
          type by the recipient.</t>
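
          <t>A non-normative Python sketch of producing node data in the
          required order follows. The function names are hypothetical.
          Dropping exact duplicates is an interpretation of "strictly
          ordered" made for this example.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct


def encode_tlv(tlv_type, value=b""):
    """Encode one TLV with zero padding to a 4-byte boundary."""
    header = struct.pack("!HH", tlv_type, len(value))
    return header + value + b"\x00" * (-len(value) % 4)


def build_node_data(tlvs):
    """Return the node data for an iterable of (type, value) pairs.

    Each TLV is encoded first, then the encoded TLVs are sorted by
    ascending binary content (type and length included).  Exact
    duplicates are dropped so that the ordering is strict.
    """
    encoded = {encode_tlv(tlv_type, value) for tlv_type, value in tlvs}
    return b"".join(sorted(encoded))
```
]]></artwork>
          </figure>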

        </section>

      </section>


      <section title="Data TLVs within Node State TLV">

        <t>These TLVs are published by the DNCP nodes, and therefore only
        encoded within the Node State TLVs. If encountered outside Node
        State TLV, they MUST be silently ignored.</t>

        <section anchor="fragment-count"
                 title="Fragment Count TLV">
          <figure>
            <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Type: FRAGMENT-COUNT (7)     |         Length: > 0           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Count                             |
|                  (length fixed in DNCP profile)               |
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            </artwork>
          </figure>

          <t>If the DNCP profile supports node data fragmentation as
          specified in <xref target="fragmentation" />, this TLV indicates
          that the node data is encoded as a sequence of Node State
          TLVs. Following Node State TLVs with Node Identifiers up to Count
          greater than the current one MUST be considered reachable and part
          of the same logical set of node data that this TLV is within. The
          fragment portion of the Node Identifier of the Node State TLV
          this TLV appears in MUST be zero.</t>

        </section>


        <section anchor="neighbor"
                 title="Neighbor TLV">
          <figure>
            <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Type: NEIGHBOR (8)      |          Length: > 8          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Neighbor Node Identifier                   |
|                  (length fixed in DNCP profile)               |
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Neighbor Endpoint Identifier                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Local Endpoint Identifier                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            </artwork>
          </figure>

          <t>This TLV indicates that the node in question vouches that the
          specified neighbor is reachable by it on the specified local
          endpoint.

          The presence of this TLV at least guarantees that the node
          publishing it has received traffic from the neighbor
          recently. For guaranteed up-to-date bidirectional reachability,
          the existence of both nodes' matching Neighbor TLVs needs to be
          checked. </t>
        </section>

        <section anchor="ka-interval"
                 title="Keep-Alive Interval TLV">

          <figure>
            <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type: KEEP-ALIVE-INTERVAL (9) |          Length: 8            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Endpoint Identifier                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Interval                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            </artwork>
          </figure>

          <t>This TLV indicates a non-default interval being used to send
          keep-alives specified in <xref target="ka" />.</t>

          <t>Endpoint identifier is used to identify the particular
          endpoint for which the interval applies. If 0, it applies for
          ALL endpoints for which no specific TLV exists.</t>

          <t>Interval specifies the interval in milliseconds at which the
          node sends keep-alives. A value of zero means no keep-alives are
          sent at all; in that case, some lower layer mechanism that
          ensures presence of nodes MUST be available and used. </t>
        </section>

      </section>

    </section>


    <section title="Security and Trust Management">

      <t>If specified in the DNCP profile, either <xref
      target="RFC6347">DTLS</xref> or <xref target="RFC5246">TLS</xref> may
      be used to authenticate and encrypt either some (if specified
      as optional in the profile) or all unicast traffic. The following
      methods for establishing trust are defined, but it is up to the DNCP
      profile to specify which ones may, should or must be supported.</t>

      <section title="Pre-Shared Key Based Trust Method">

        <t>A PSK-based trust model is a simple security management
        mechanism that allows an administrator to deploy devices to an
        existing network by configuring them with a pre-defined key,
        similar to the configuration of an administrator password or
        WPA key. Although limited in nature, it provides a
        user-friendly security mechanism for smaller networks. </t>

      </section>

      <section title="PKI Based Trust Method">

        <t>A PKI-based trust model enables more advanced management
        capabilities at the cost of increased complexity and
        bootstrapping effort. However, it allows trust to be managed in a
        centralized manner and is therefore useful for larger networks
        that need authoritative trust management.</t>

      </section>

      <section title="Certificate Based Trust Consensus Method">

        <t>The certificate-based consensus model is designed to be a
        compromise between trust management effort and flexibility. It is
        based on X.509 certificates and allows each DNCP node to provide a
        trust verdict on any other certificate. A consensus is then found
        to determine whether a node using this certificate or any
        certificate signed by it is to be trusted. </t>

        <t>A DNCP node not using this security method MUST ignore all
        announced trust verdicts and MUST NOT announce any such verdicts
        by itself, i.e., any other normative language in this subsection
        does not apply to it.</t>

        <t>The current effective trust verdict for any certificate is
        defined as the one with the highest priority from all trust
        verdicts announced for said certificate at the time.</t>

        <section title="Trust Verdicts">

          <t>Trust verdicts are statements of DNCP nodes about the
          trustworthiness of X.509-certificates.  There are 5 possible
          trust verdicts in order of ascending priority:

          <list>

            <t>0 (Neutral): no trust verdict exists but the DNCP network
            should determine one.</t>

            <t>1 (Cached Trust): the last known effective trust verdict was
            Configured or Cached Trust.</t>

            <t>2 (Cached Distrust): the last known effective trust verdict
            was Configured or Cached Distrust.</t>

            <t>3 (Configured Trust): trustworthy based upon an external
            ceremony or configuration.</t>

            <t>4 (Configured Distrust): not trustworthy based upon an
            external ceremony or configuration.</t>

          </list>
          </t>

          <t>
            Trust verdicts fall into 3 groups:

            <list style="symbols">
              <t>Configured verdicts are used to announce explicit
              trust verdicts a node has based on any external trust
              bootstrap or predefined relation a node has formed with a
              given certificate.</t>

              <t>Cached verdicts are used to retain the last known trust
              state in case all nodes with configured verdicts about a
              given certificate have been disconnected or turned off.</t>

              <t>The Neutral verdict is used to announce a new node
              intending to join the network so a final verdict for it can
              be found.</t>
            </list>
          </t>

          <t>
            The current effective trust verdict for any certificate is
            defined as the one with the highest priority within the set of
            trust verdicts announced for the certificate in the DNCP
            network.

            A node MUST be trusted for participating in the DNCP network if
            and only if the current effective trust verdict for its own
            certificate or any one in its certificate hierarchy is (Cached
            or Configured) Trust and none of the certificates in its
            hierarchy have an effective trust verdict of (Cached or
            Configured) Distrust.

            In case a node has a configured verdict, which is different
            from the current effective trust verdict for a certificate, the
            current effective trust verdict takes precedence in deciding
            trustworthiness. Despite that, the node still retains and
            announces its configured verdict.
          </t>
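
          <t>The priority and trust rules above can be sketched,
          non-normatively, in Python. The class and function names are
          hypothetical. The sketch represents "no verdict announced" as
          None.</t>

          <figure>
            <artwork><![CDATA[
```python
from enum import IntEnum


class Verdict(IntEnum):
    """Trust verdicts; a higher value means a higher priority."""
    NEUTRAL = 0
    CACHED_TRUST = 1
    CACHED_DISTRUST = 2
    CONFIGURED_TRUST = 3
    CONFIGURED_DISTRUST = 4


TRUST = {Verdict.CACHED_TRUST, Verdict.CONFIGURED_TRUST}
DISTRUST = {Verdict.CACHED_DISTRUST, Verdict.CONFIGURED_DISTRUST}


def effective_verdict(announced):
    """Highest-priority announced verdict, or None if there is none."""
    return max(announced, default=None)


def node_trusted(hierarchy_verdicts):
    """Decide whether a node may participate in the DNCP network.

    'hierarchy_verdicts' holds one list of announced verdicts per
    certificate in the node's hierarchy (its own certificate included).
    """
    effective = [effective_verdict(v) for v in hierarchy_verdicts]
    if any(e in DISTRUST for e in effective):
        return False
    return any(e in TRUST for e in effective)
```
]]></artwork>
          </figure>

          <t>Because Distrust outranks Trust within each group, a single
          Configured Distrust announcement overrides any number of Trust
          announcements for the same certificate.</t>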
        </section>

        <section title="Trust Cache">

          <t>Each node SHOULD maintain a trust cache containing the current
          effective trust verdicts for all certificates currently announced
          in the DNCP network. This cache is used as a backup of the last
          known state in case there is no node announcing a configured
          verdict for a known certificate.  It SHOULD be saved to a
          non-volatile memory at reasonable time intervals to survive a
          reboot or power outage.</t>

          <t>Every time a node (re)joins the network or detects the change
          of an effective trust verdict for any certificate, it will
          synchronize its cache, i.e., store new effective trust verdicts
          overwriting any previously cached verdicts. Configured verdicts
          are stored in the cache as their respective cached counterparts.
          Neutral verdicts are never stored and do not override existing
          cached verdicts.</t>
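
          <t>A non-normative Python sketch of the cache synchronization
          rule follows. The function name and the dictionary-based cache
          keyed by certificate fingerprint are hypothetical. Verdicts are
          given by their numerical index.</t>

          <figure>
            <artwork><![CDATA[
```python
NEUTRAL = 0
CACHED_TRUST = 1
CACHED_DISTRUST = 2
CONFIGURED_TRUST = 3
CONFIGURED_DISTRUST = 4

# Configured verdicts are stored as their cached counterparts.
TO_CACHED = {
    CACHED_TRUST: CACHED_TRUST,
    CACHED_DISTRUST: CACHED_DISTRUST,
    CONFIGURED_TRUST: CACHED_TRUST,
    CONFIGURED_DISTRUST: CACHED_DISTRUST,
}


def update_trust_cache(cache, fingerprint, effective):
    """Store a new effective verdict, overwriting any cached one.

    Neutral (and a missing verdict, given as None) is never stored
    and leaves an existing cached verdict untouched.
    """
    if effective in TO_CACHED:
        cache[fingerprint] = TO_CACHED[effective]
```
]]></artwork>
          </figure>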
        </section>

        <section title="Announcement of Verdicts">

          <t>A node SHOULD always announce any configured trust verdicts it
          has established by itself, and it MUST do so if announcing the
          configured trust verdict leads to a change in the current
          effective trust verdict for the respective certificate.  In
          absence of configured verdicts, it MUST announce cached trust
          verdicts it has stored in its trust cache, if one of the
          following conditions applies:

          <list style="symbols">

            <t>The stored trust verdict is Cached Trust and the current
            effective trust verdict for the certificate is Neutral or does
            not exist.</t>

            <t>The stored trust verdict is Cached Distrust and the current
            effective trust verdict for the certificate is Cached
            Trust.</t>

          </list>

          A node rechecks these conditions whenever it detects changes of
          announced trust verdicts anywhere in the network.
          </t>
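
          <t>The two conditions for announcing a cached verdict can be
          sketched, non-normatively, in Python. The function name is
          hypothetical. The sketch assumes that the current effective
          verdict passed in does not yet include the node's own pending
          announcement, and that None stands for "does not exist".</t>

          <figure>
            <artwork><![CDATA[
```python
NEUTRAL = 0
CACHED_TRUST = 1
CACHED_DISTRUST = 2


def should_announce_cached(stored, effective):
    """Decide whether a stored cached verdict must be announced.

    'stored' is the verdict in the local trust cache; 'effective' is
    the current effective verdict in the network, or None.
    """
    if stored == CACHED_TRUST:
        return effective is None or effective == NEUTRAL
    if stored == CACHED_DISTRUST:
        return effective == CACHED_TRUST
    return False
```
]]></artwork>
          </figure>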

          <t>Upon encountering a node with a hierarchy of certificates for
          which there is no effective trust verdict, a node adds a Neutral
          Trust-Verdict TLV to its node data for all certificates found in
          the hierarchy, and publishes it until an effective trust verdict
          different from Neutral can be found for any of the certificates,
          or a reasonable amount of time (10 minutes is suggested) with no
          reaction and no further authentication attempts has passed.  Such
          trust verdicts SHOULD also be limited in rate and number to
          prevent denial-of-service attacks.</t>

          <t>Trust verdicts are announced using Trust-Verdict TLVs:
          <figure>
            <artwork>
0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type: Trust-Verdict (10)    |        Length: 37-100         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Verdict    |                 (reserved)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                                                               |
|                                                               |
|                      SHA-256 Fingerprint                      |
|                                                               |
|                                                               |
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Common Name                          |
            </artwork>
          </figure>

          <list>
            <t>Verdict represents the numerical index of the trust
            verdict.</t>

            <t>(reserved) is reserved for future additions and MUST be set
            to 0 when creating TLVs and ignored when parsing them.</t>

            <t>SHA-256 Fingerprint contains the <xref
            target="RFC6234">SHA-256</xref> hash value of the certificate
            in DER-format.</t>

            <t>Common Name contains the variable-length (1-64 bytes) common
            name of the certificate. The final byte MUST have a value of
            0.</t>
          </list>
          </t>
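
          <t>A non-normative Python sketch of encoding a Trust-Verdict TLV
          follows. The function name is hypothetical. Encoding the common
          name as ASCII is an assumption of this example, and the 1-64
          byte limit is taken to include the terminating zero byte, which
          is consistent with the stated length range of 37-100.</t>

          <figure>
            <artwork><![CDATA[
```python
import hashlib
import struct

TRUST_VERDICT_TYPE = 10


def encode_trust_verdict(verdict, cert_der, common_name):
    """Encode a Trust-Verdict TLV for a DER-encoded certificate."""
    if not 0 <= verdict <= 4:
        raise ValueError("verdict must be in 0..4")
    # Assumption: ASCII common name, terminated by a zero byte.
    name = common_name.encode("ascii") + b"\x00"
    if len(name) > 64:
        raise ValueError("common name field exceeds 64 bytes")
    value = (
        struct.pack("!B3x", verdict)          # verdict + 3 reserved bytes
        + hashlib.sha256(cert_der).digest()   # 32-byte fingerprint
        + name
    )
    header = struct.pack("!HH", TRUST_VERDICT_TYPE, len(value))
    # Pad with zero bytes up to the next 4-byte boundary.
    return header + value + b"\x00" * (-len(value) % 4)
```
]]></artwork>
          </figure>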
        </section>

        <section title="Bootstrap Ceremonies">
          <t>The following non-exhaustive list of methods describes
          possible ways to establish trust relationships between
          DNCP nodes and node certificates. Trust establishment is a
          two-way process in which the existing network must trust the
          newly added node and the newly added node must trust at least
          one of its neighboring nodes.

          It is therefore necessary that both the newly added node and an
          already trusted node perform such a ceremony to successfully
          introduce a node into the DNCP network.  In all cases an
          administrator MUST be provided with external means to identify
          the node belonging to a certificate based on its fingerprint
          and a meaningful common name.</t>

          <section title="Trust by Identification">
            <t>A node implementing certificate-based trust MUST provide
            an interface to retrieve the current set of effective trust
            verdicts, fingerprints and names of all certificates currently
            known and set configured trust verdicts to be
            announced. Alternatively it MAY provide a companion DNCP node
            or application with these capabilities with which it has a
            pre-established trust relationship.</t>
          </section>

          <section title="Preconfigured Trust">
            <t>A node MAY be preconfigured to trust a certain set of
            node or CA certificates.  However, such trust relationships
            MUST NOT result in unwanted or unrelated trust for nodes not
            intended to be run inside the same network (e.g., all other
            devices by the same manufacturer).</t>
          </section>

          <section title="Trust on Button Press">
            <t>A node MAY provide a physical or virtual interface to put
            one or more of its internal network interfaces temporarily into
            a mode in which it trusts the certificate of the first
            DNCP node it can successfully establish a connection
            with.</t>
          </section>

          <section title="Trust on First Use">
            <t>A node which is not associated with any other DNCP node MAY
            trust the certificate of the first DNCP node it can
            successfully establish a connection with. This method MUST NOT
            be used when the node has already associated with any other
            DNCP node.</t>
          </section>
        </section>
      </section>
    </section>

    <section anchor="profile-bits" title="DNCP Profile-Specific Definitions">
      <!-- TBD - Pierre has a point - should define guidance on these,
           perhaps -->
      <t>Each DNCP profile MUST specify the following aspects:
      <list style="symbols">

        <t>Unicast and optionally multicast transport protocol(s) to be
        used. If multicast-based node and status discovery is desired, a
        datagram-based transport supporting multicast has to be available.
        </t>

        <t>How the chosen transport(s) are secured: not at all, optionally,
        or always with the TLS scheme defined here using one or more of the
        methods, or with some other mechanism. If the links with DNCP nodes
        can be sufficiently secured or isolated, it is possible to run DNCP
        in a secure manner without using any form of authentication or
        encryption.</t>

        <t>Transport protocol parameters, such as the port numbers or the
        multicast address to be used. Unicast, multicast, and secure
        unicast may each require different parameters, if applicable.</t>

        <t>Which TLVs, if any, are ignored on reception in addition to those
        ignored as specified in <xref target="reception" />, e.g., for
        security reasons.

        A DNCP profile may define the following DNCP TLVs to be safely
        ignored:

        <list style="symbols">

          <t>Anything received over multicast, except <xref
          target="endpoint">Node Endpoint TLV</xref> and <xref
          target="net-state">Network State TLV</xref>.
          </t>

          <t>Any TLVs received over unreliable unicast or multicast at too
          high a rate; Trickle will ensure eventual convergence provided the
          rate slows down at some point.</t>

        </list>
        </t>

        <t>How to deal with node identifier collision as described in <xref
        target="reception" />. Main options are either for one or both
        nodes to assign new node identifiers to themselves, or to notify
        someone about a fatal error condition in the DNCP network.</t>

        <t>Suggested ranges for Imin, Imax and k for implementations to use
        in the Trickle algorithm. The Trickle algorithm does not require
        these to be the same across all implementations for it to work, but
        similar orders of magnitude help implementations of a DNCP profile
        behave more consistently and facilitate estimation of lower and
        upper bounds for the convergence behavior of the network.</t>

        <t>Hash function H(x) to be used, and how many bits of the output
        are actually used. The chosen hash function is used both for
        hashing node-specific data and for the network state hash, which is
        a hash of the node-specific data hashes. SHA-256 defined in <xref
        target="RFC6234" /> is the recommended default choice, but a
        non-cryptographic hash function could be used as well.</t>

        <t>DNCP_NODE_IDENTIFIER_LENGTH: The fixed length of a node
        identifier (in bytes).</t>

        <t>Whether to send keep-alives, and if so, whether per-endpoint
        (requires multicast transport), or per-peer. Keep-alives also have
        associated parameters:

        <list style="symbols">
          <t>DNCP_KEEPALIVE_INTERVAL: How often keep-alives are to be
          sent by default (if enabled).</t>

          <t>DNCP_KEEPALIVE_MULTIPLIER: How many multiples of the
          DNCP_KEEPALIVE_INTERVAL (or of the peer-supplied keep-alive
          interval value) may elapse without hearing from a node while it
          is still considered valid. This is just a default used in the
          absence of any other configuration information, or particular
          per-endpoint configuration.</t>
        </list>
        </t>
        <t>Whether to support fragmentation, and if so, the number of bytes
        reserved for fragment count in the node identifier.</t>
      </list>
      </t>
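      <t>To make the list above more concrete, the following Python sketch
      collects some of the profile-chosen parameters in one structure and
      shows how the hash and keep-alive settings might be applied. The
      class, its field names and the values used with it are hypothetical.
      The sketch also assumes that the hash output is truncated to whole
      bytes and that node data hashes are concatenated in ascending node
      identifier order; neither detail is fixed by this section.</t>

      <figure>
        <artwork><![CDATA[
```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DncpProfile:
    """Hypothetical container for parameters a DNCP profile specifies."""
    node_identifier_length: int   # DNCP_NODE_IDENTIFIER_LENGTH (bytes)
    hash_bytes: int               # bytes of H(x) output actually used
    keepalive_interval: float     # DNCP_KEEPALIVE_INTERVAL (seconds)
    keepalive_multiplier: float   # DNCP_KEEPALIVE_MULTIPLIER
    trickle_imin: float           # suggested Trickle Imin
    trickle_imax: float           # suggested Trickle Imax
    trickle_k: int                # suggested Trickle k

    def h(self, data: bytes) -> bytes:
        """H(x): here SHA-256, truncated to the profile's length."""
        return hashlib.sha256(data).digest()[: self.hash_bytes]

    def network_state_hash(self, node_data: Dict[bytes, bytes]) -> bytes:
        """Hash of the node data hashes (ordering is an assumption)."""
        for node_id in node_data:
            if len(node_id) != self.node_identifier_length:
                raise ValueError("node identifier has wrong length")
        hashes = (self.h(node_data[node_id])
                  for node_id in sorted(node_data))
        return self.h(b"".join(hashes))

    def peer_is_valid(self, now: float, last_heard: float,
                      peer_interval: Optional[float] = None) -> bool:
        """Apply the keep-alive multiplier to the relevant interval."""
        interval = (self.keepalive_interval
                    if peer_interval is None else peer_interval)
        return now - last_heard <= interval * self.keepalive_multiplier
```
]]></artwork>
      </figure>

      <t>A peer-supplied keep-alive interval, when present, replaces the
      profile default in the validity check, mirroring the description of
      DNCP_KEEPALIVE_MULTIPLIER above.</t>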
    </section>

    <section title="Security Considerations">

      <t>DNCP profiles may use multicast to indicate DNCP state changes and
      for keep-alive purposes. However, no actual published data TLVs will
      be sent across that channel. Therefore an attacker may only learn
      hash values of the state within DNCP and may be able to trigger
      unicast synchronization attempts between nodes on a local link this
      way. A DNCP node should therefore rate-limit its reactions to
      multicast packets.</t>
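
      <t>This document does not prescribe how such rate limiting is done.
      One common approach is a token bucket, sketched below in Python. The
      class name and the rate and burst parameters are assumptions made
      for this example only.</t>

      <figure>
        <artwork><![CDATA[
```python
from typing import Optional


class TokenBucket:
    """Allow at most `burst` reactions at once, refilled at `rate`/s."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst          # start with a full bucket
        self.updated: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Return True if a reaction to a packet is permitted at `now`."""
        if self.updated is not None:
            elapsed = max(0.0, now - self.updated)
            self.tokens = min(self.burst,
                              self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```
]]></artwork>
      </figure>

      <t>A node could keep one such bucket per endpoint and consult it,
      with a monotonic clock value, before starting a unicast
      synchronization attempt in response to a multicast packet.</t>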

      <t>When using DNCP to bootstrap a network, PKI-based solutions may have
      issues when validating certificates due to potentially unavailable
      accurate time, or due to inability to use the network to either check
      Certificate Revocation Lists or perform online validation.</t>

      <t>The certificate-based trust consensus mechanism defined in this
      document allows for a consenting revocation. However, in the case of
      a compromised device, the trust cache may be poisoned before the
      actual revocation happens, allowing the distrusted device to rejoin
      the network using a different identity.  Stopping such an attack
      might require physical intervention and flushing of the trust
      caches.</t>

    </section>

    <section anchor="iana" title="IANA Considerations">

      <t>IANA should set up a registry for DNCP TLV types,
      with the following initial contents:</t>

      <t>0: Reserved</t>
      <t>1: Request network state</t>
      <t>2: Request node state</t>
      <t>3: Node endpoint</t>
      <t>4: Network state</t>
      <t>5: Node state</t>
      <t>6: Reserved (was: Custom)</t>
      <t>7: Fragment count</t>
      <t>8: Neighbor</t>
      <t>9: Keep-alive interval</t>
      <t>10: Trust-Verdict</t>
      <t>32-191: Reserved for per-DNCP profile use</t>

      <t>192-255: Reserved for per-implementation experimentation. How
      collision is avoided is out of scope of this document.</t>

      <t>For the remaining values (11-31, 256-65535), the 'Standards
      Action' policy should be used.</t>
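
      <t>For implementers, the initial registry contents and ranges listed
      above could be mirrored as in the following Python sketch. The
      identifier names and the category labels returned by the helper are
      hypothetical; only the numeric values come from this section.</t>

      <figure>
        <artwork><![CDATA[
```python
from enum import IntEnum


class DncpTlvType(IntEnum):
    """Initially assigned DNCP TLV types (0 and 6 are reserved)."""
    REQ_NETWORK_STATE = 1
    REQ_NODE_STATE = 2
    NODE_ENDPOINT = 3
    NETWORK_STATE = 4
    NODE_STATE = 5
    FRAGMENT_COUNT = 7
    NEIGHBOR = 8
    KEEPALIVE_INTERVAL = 9
    TRUST_VERDICT = 10


def classify_tlv_type(code: int) -> str:
    """Map a 16-bit TLV type code to its range in the registry."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("TLV type must fit in 16 bits")
    if code in (0, 6):
        return "reserved"
    if 1 <= code <= 10:
        return "dncp"
    if 32 <= code <= 191:
        return "profile"
    if 192 <= code <= 255:
        return "experimental"
    # 11-31 and 256-65535 are not assigned in the initial registry.
    return "unassigned"
```
]]></artwork>
      </figure>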

    </section>

  </middle>
  <back>
    <references title="Normative references">
      <?rfc include="reference.RFC.2119.xml"?>
      <?rfc include="reference.RFC.6206.xml"?>
      <?rfc include="reference.RFC.6347.xml"?>
      <?rfc include="reference.RFC.5246.xml"?>
    </references>
    <references title="Informative references">
      <?rfc include="reference.RFC.3493.xml"?>
      <?rfc include="reference.RFC.6234.xml"?>
    </references>

    <section title="Some Questions and Answers [RFC Editor: please remove]">

      <t>Q: 32-bit endpoint id?</t>
      <t>A: Here, it would save 32 bits per neighbor if it was 16 bits (and
      less is not realistic). However, TLVs defined elsewhere would not
      seem to even gain that much on average.  32 bits is also used for
      ifindex in various operating systems, making for simpler
      implementation.</t>

      <t>Q: Why have topology information at all?</t>
      <t>A: It is an alternative to the more traditional seq#/TTL-based flooding
      schemes. In steady state, there is no need to, e.g., re-publish every now
      and then.</t>

    </section>
    <section title="Changelog [RFC Editor: please remove]">
      <t>draft-ietf-homenet-dncp-06:
      <list style="symbols">

        <t>Removed custom TLV.</t>

        <t>Made keep-alive multipliers a local implementation choice;
        profiles just provide guidance on a sane default value.</t>

        <t>Removed the DNCP_GRACE_INTERVAL as it is really an
        implementation choice.</t>

        <t>Simplified the suggested structures in data model.</t>

        <t>Reorganized the document and provided an overview section.</t>

      </list>
      </t>
      <t>draft-ietf-homenet-dncp-04:
      <list style="symbols">

        <t>Added mandatory rate limiting for network state requests, and
        optional slightly faster convergence mechanism by including current
        local network state in the remote network state requests.</t>

      </list>
      </t>

      <t>draft-ietf-homenet-dncp-03:
      <list style="symbols">

        <t>Renamed connection -> endpoint.</t>

        <t>!!! Backwards incompatible change: Renumbered TLVs, and got rid
        of node data TLV; instead, node data TLV's contents are optionally
        within node state TLV.</t>

      </list>
      </t>

      <t>draft-ietf-homenet-dncp-02:
      <list style="symbols">

        <t>Changed DNCP "messages" into series of TLV streams, allowing
        optimized round-trip saving synchronization.</t>

        <t>Added fragmentation support for bigger node data and for chunking
        in absence of reliable L2 and L3 fragmentation.</t>
      </list>
      </t>

      <t>draft-ietf-homenet-dncp-01:
      <list style="symbols">

        <t>Fixed keep-alive semantics to also treat unicast requests as
        updates of the most recently consistent timestamp, and added a
        proactive unicast request to ensure that even inconsistent
        keep-alive messages eventually trigger a consistency timestamp
        update.</t>

        <t>Facilitated (simple) read-only clients by making Node Connection
        TLV optional if just using DNCP for read-only purposes.</t>

        <t>Added text describing how to deal with "dense" networks, but left
        actual numbers and mechanics up to DNCP profiles and (local)
        configurations.</t>
      </list>
      </t>

      <t>draft-ietf-homenet-dncp-00: Split from pre-version of
      draft-ietf-homenet-hncp-03 generic parts. Changes that affect
      implementations:
      <list style="symbols">

        <t>TLVs were renumbered.</t>

        <t>TLV length does not include header (=-4). This facilitates,
        e.g., use of DHCPv6 option parsing libraries (same encoding), and
        reduces complexity (no need to handle error values of length less
        than 4).</t>

        <t>Trickle is reset only when the locally calculated network state
        hash changes, not when a different remote network state hash is
        seen. This prevents, e.g., attacks in which a single multicast
        packet forces a Trickle reset on every interface of every node on
        a link.</t>

        <t>Instead of 'ping', use 'keep-alive' (optional) for dead peer
        detection. Different message used!</t>

      </list>
      </t>

    </section>

    <section title="Draft Source [RFC Editor: please remove]">
      <t>As usual, this draft is available at <eref
      target="https://github.com/fingon/ietf-drafts/">
      https://github.com/fingon/ietf-drafts/</eref>
      in source format (with nice Makefile too). Feel free to send comments
      and/or pull requests if and when you have changes to it! </t>
    </section>

    <section title="Acknowledgements">

      <t>Thanks to Ole Troan, Pierre Pfister, Mark Baugher, Mark Townsley,
      Juliusz Chroboczek, Jiazi Yi, Mikael Abrahamsson, Brian Carpenter and
      Thomas Clausen for their contributions to the draft.</t>

    </section>

  </back>
</rfc>
