<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">

<rfc ipr="trust200902" category="exp" docName="draft-ietf-ipfix-anon-04.txt">
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>

<front>
  <title abbrev="IP Flow Anonymisation Support">
    IP Flow Anonymisation Support 
  </title>
  <author initials="E." surname="Boschi" fullname="Elisa Boschi">
    <organization abbrev="ETH Zurich">
      Swiss Federal Institute of Technology Zurich 
    </organization>
    <address>
      <postal>
        <street>Gloriastrasse 35</street>
        <city>8092 Zurich</city>
        <country>Switzerland</country>
      </postal>
      <email>boschie@tik.ee.ethz.ch</email>
    </address>
  </author>
  <author initials="B." surname="Trammell" fullname="Brian Trammell">
    <organization abbrev="ETH Zurich">
      Swiss Federal Institute of Technology Zurich 
    </organization>
    <address>
      <postal>
        <street>Gloriastrasse 35</street>
        <city>8092 Zurich</city>
        <country>Switzerland</country>
      </postal>
      <phone>+41 44 632 70 13</phone>
      <email>trammell@tik.ee.ethz.ch</email>
    </address>
  </author>
  <date month="October" day="8" year="2010"></date>
  <area>Operations</area>
  <workgroup>IPFIX Working Group</workgroup>
  <abstract> 

    <t>This document describes anonymisation techniques for IP flow data and
    the export of anonymised data using the IPFIX protocol. It categorizes
    common anonymisation schemes and defines the parameters needed to describe
    them. It provides guidelines for the implementation of anonymised data
    export and storage over IPFIX, and describes an information model and
    Options-based method for anonymisation metadata export within the IPFIX
    protocol or storage in IPFIX Files.</t>

  </abstract>
</front>

<middle>

  <section title="Introduction">

    <t>The standardisation of an IP flow information export protocol <xref
    target="RFC5101"/> and associated representations removes a technical
    barrier to the sharing of IP flow data across organizational boundaries
    and with network operations, security, and research communities for a wide
    variety of purposes. However, with wider dissemination comes greater risks
    to the privacy of the users of networks under measurement, and to the
    security of those networks. While it is not a complete solution to the
    issues posed by distribution of IP flow information, anonymisation (i.e.,
    the deletion or transformation of information that is considered sensitive
    and could be used to reveal the identity of subjects involved in a
    communication) is an important tool for the protection of privacy within
    network measurement infrastructures.</t>

    <t>This document presents a mechanism for representing anonymised data
    within IPFIX and guidelines for using it. It begins with a categorization
    of anonymisation techniques. It then describes applicability of each
    technique to commonly anonymisable fields of IP flow data, organized by
    information element data type and semantics as in <xref
    target="RFC5102"></xref>; enumerates the parameters required by each of
    the applicable anonymisation techniques; and provides guidelines for the
    use of each of these techniques in accordance with best practices in data
    protection. Finally, it specifies a mechanism for exporting anonymised
    data and binding anonymisation metadata to Templates and Options Templates
    using IPFIX Options.</t>

    <section title="IPFIX Protocol Overview">

      <t>In the IPFIX protocol, { type, length, value } tuples are expressed
      in Templates containing { type, length } pairs, specifying which { value
      } fields are present in data records conforming to the Template, giving
      great flexibility as to what data is transmitted. Since Templates are
      sent very infrequently compared with Data Records, this results in
      significant bandwidth savings. Various different data formats may be
      transmitted simply by sending new Templates specifying the { type,
      length } pairs for the new data format. See <xref target="RFC5101"></xref> for more information.</t>

      <t>The <xref target="RFC5102">IPFIX information model</xref> defines a
      large number of standard Information Elements which provide the
      necessary { type } information for Templates. The use of standard
      elements enables interoperability among different vendors'
      implementations. Additionally, non-standard enterprise-specific elements
      may be defined for private use.</t>

    </section>

    <section title="IPFIX Documents Overview" anchor="intro-docs">

      <t><xref target="RFC5101">"Specification of the IPFIX
      Protocol for the Exchange of IP Traffic Flow Information"</xref>
      and its associated documents
      define the IPFIX Protocol, which provides network engineers and
      administrators with access to IP traffic flow information.</t>

      <t><xref target="RFC5470">"Architecture for IP Flow
      Information Export"</xref> defines
      the architecture for the export of measured IP flow information out of
      an IPFIX Exporting Process to an IPFIX Collecting Process, and the
      basic terminology used to describe the elements of this architecture,
      per the requirements defined in <xref target="RFC3917">"Requirements
      for IP Flow Information Export"</xref>. The IPFIX Protocol document
      <xref target="RFC5101"></xref> then covers the details of the method for
      transporting IPFIX Data Records and Templates via a congestion-aware
      transport protocol from an IPFIX Exporting Process to an IPFIX
      Collecting Process.</t>

      <t><xref target="RFC5102">"Information Model for IP Flow Information
      Export"</xref> describes the Information Elements used by IPFIX,
      including details on Information Element naming, numbering, and data
      type encoding. Finally, <xref target="RFC5472">"IPFIX
      Applicability"</xref> describes the various applications of the IPFIX
      protocol and their use of information exported via IPFIX, and relates
      the IPFIX architecture to other measurement architectures and
      frameworks.</t>

      <t>Additionally, <xref target="RFC5655">"Specification
      of the IPFIX File Format"</xref> describes a file format based upon the
      IPFIX Protocol for the storage of flow data.</t>

      <t>This document references the Protocol and Architecture documents for
      terminology, and extends the IPFIX Information Model to provide new
      Information Elements for anonymisation metadata. The anonymisation
      techniques described herein are equally applicable to the IPFIX Protocol
      and data stored in IPFIX Files.</t>

    </section>
    
    <section title="Anonymisation within the IPFIX Architecture" anchor="intro-arch">

      <t>According to <xref target="RFC5470"/>, IPFIX Message anonymisation is
      optionally performed as the final operation before handing the Message
      to the transport protocol for export. While no provision is made in the
      architecture for anonymisation metadata as in <xref
      target="aes-section"></xref>, this arrangement does allow for the
      rewriting necessary for comprehensive anonymisation of IPFIX
      export as in <xref target="export-anon-section"></xref>. The development
      of the <xref target="I-D.ietf-ipfix-mediators-framework">IPFIX
      Mediation</xref> framework and the <xref target="RFC5655">IPFIX File
      Format</xref> expand upon this initial architectural allowance for
      anonymisation by adding to the list of places that anonymisation may be
      applied. The former specifies IPFIX Mediators, which rewrite existing
      IPFIX Messages, and the latter specifies a method for storage of IPFIX
      data in files.</t>
      
       <t>More detail on the applicable architectural arrangements of
      anonymisation can be found in <xref
      target="export-anon-arrangement"></xref>.</t>

    </section>
    
  </section>

  <section title="Terminology">

    <t>Terms used in this document that are defined in the Terminology section
    of the <xref target="RFC5101">IPFIX Protocol</xref> document are to be
    interpreted as defined there. In addition, this document defines the
    following terms:</t>

    <list style="hanging"> 

      <t hangText="Anonymisation Record: ">A record, defined by the
      Anonymisation Options Template in section <xref target="opt-section"/>,
      that defines the properties of the anonymisation applied to a single
      Information Element within a single Template or Options Template.</t>

      <t hangText="Anonymised Data Record: ">A Data Record within a Data Set
      containing at least one Information Element with anonymised values. The
      Information Element(s) within the Template or Options Template
      describing this Data Record SHOULD have a corresponding Anonymisation
      Record.</t>

      <t hangText="Intermediate Anonymisation Process: ">An intermediate
      process which takes Data Records and transforms them into Anonymised
      Data Records.</t>

    </list>

    <t>Note that there is an explicit difference in this document between a
    "Data Set" (which is defined as in <xref target="RFC5101"/>) and a "data
    set". When in lower case, this term refers to any collection of data
    (usually, within the context of this document, flow or packet data) which
    may contain identifying information and is therefore subject to
    anonymisation.</t>

    <t>Note also that when the term Template is used in this document, unless
    otherwise noted, it applies both to Templates and Options Templates as
    defined in <xref target="RFC5101"/>. Specifically, Anonymisation Records
    may apply to both Templates and Options Templates.</t>

    <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
    document are to be interpreted as described in <xref target="RFC2119">RFC
    2119</xref>.</t>

  </section>

  <section title="Categorisation of Anonymisation Techniques">

    <t>Anonymisation modifies a data set in order to
    protect the identity of the people or entities described by the data set
    from disclosure. With respect to network traffic data, anonymisation
    generally attempts to preserve some set of properties of the network
    traffic useful for a given application or applications, while ensuring the
    data cannot be traced back to the specific networks, hosts, or users
    generating the traffic.</t>

    <t>Anonymisation may be broadly classified according to two properties:
    recoverability and countability. All anonymisation techniques map the real
    space of identifiers or values into a separate, anonymised space,
    according to some function. A technique is said to be recoverable when the
    function used is invertible or can otherwise be reversed and a real
    identifier can be recovered from a given replacement identifier.</t>

    <t>Countability compares the dimension of the anonymised space (N) to the
    dimension of the real space (M), and denotes how the count of unique
    values is preserved by the anonymisation function. If the anonymised space
    is smaller than the real space, then the function is said to generalise
    the input, mapping more than one input point to each anonymous value
    (e.g., as with aggregation). By definition, generalisation is not
    recoverable.</t>

    <t>If the dimensions of the anonymised and real spaces are the same, such
    that the count of unique values is preserved, then the function is said to
    be a direct substitution function. If the dimension of the anonymised
    space is larger, such that each real value maps to a set of anonymised
    values, then the function is said to be a set substitution function. Note
    that with set substitution functions, the sets of anonymised values are
    not necessarily disjoint. Either direct or set substitution functions are
    said to be one-way if there exists no non-brute force method for
    recovering the real data point from an anonymised one in isolation (i.e.,
    if the only way to recover the data point is to attack the anonymised data
    set as a whole, e.g. through fingerprinting or data injection).</t>

    <t>This classification is summarised in the table below.</t>

      <texttable> 
        <ttcol align="left">Recoverability / Countability</ttcol> 
        <ttcol align="left">Recoverable</ttcol> 
        <ttcol align="left">Non-recoverable</ttcol>
        <c>N &lt; M </c><c>N.A.</c><c>Generalisation</c>
        <c>N = M </c><c>Direct Substitution</c><c>One-way Direct Substitution</c>
        <c>N &gt; M </c><c>Set Substitution</c><c>One-way Set Substitution</c> 
      </texttable>

  </section>

  <section title="Anonymisation of IP Flow Data">

    <t>Due to the restricted semantics of IP flow data, there is a relatively
    limited set of specific anonymisation techniques available on flow data,
    though each falls into the broad categories above. Each type of field that
    may commonly appear in a flow record may have its own applicable specific
    techniques.</t>

    <t>While anonymisation is generally applied at the resolution of single
    fields within a flow record, attacks against anonymisation use entire
    flows and relationships between hosts and flows within a given data set.
    Therefore, fields which may not necessarily be identifying by themselves
    may be anonymised in order to increase the anonymity of the data set as a
    whole.</t>

    <t>Of all the fields in an IP flow record, IP addresses are the most
    likely to be used to directly identify entities in the real world. Each IP
    address is associated with an interface on a network host, and can
    potentially be identified with a single user. Additionally, IP addresses
    are structured identifiers; that is, partial IP address prefixes may be
    used to identify networks just as full IP addresses identify hosts. This
    makes anonymisation of IP addresses particularly important.</t>

    <t>MAC addresses uniquely identify devices on the network; while they
    are not often available in traffic data collected at Layer 3, and cannot
    be used to locate devices within the network, some traces may contain
    sub-IP data including MAC address data. Hardware addresses may be mappable
    to device serial numbers, and to the entities or individuals who purchased
    the devices, when combined with external databases. MAC addresses are also
    often used in constructing IPv6 addresses (see section 2.5.1 of <xref
    target="RFC4291"/>), and as such may be used to reconstruct the low-order
    bits of anonymised IPv6 addresses in certain circumstances. Therefore, MAC
    address anonymisation is also important.</t>

    <t>Port numbers identify abstract entities (applications) as opposed to
    real-world entities, but they can be used to classify hosts and user
    behavior. Passive port fingerprinting, both of well-known and ephemeral
    ports, can be used to determine the operating system running on a host.
    Relative data volumes by port can also be used to determine the host's
    function (workstation, web server, etc.); this information can be used to
    identify hosts and users.</t>

    <t>While not identifiers in and of themselves, timestamps and counters
    can reveal the behavior of the hosts and users on a network. Any given
    network activity is recognizable by a pattern of relative time differences
    and data volumes in the associated sequence of flows, even without host
    address information. They can therefore be used to identify hosts and
    users. Timestamps and counters are also vulnerable to traffic injection
    attacks, where traffic with a known pattern is injected into a network
    under measurement, and this pattern is later identified in the anonymised
    data set. </t>

    <t>The simplest and most extreme form of anonymisation, which can be
    applied to any field of a flow record, is black-marker anonymisation, or
    complete deletion of a given field. Note that black-marker anonymisation
    is equivalent to simply not exporting the field(s) in question.</t>

    <t> While black-marker anonymisation completely protects the data in
    the deleted fields from the risk of disclosure, it also reduces the
    utility of the anonymised data set as a whole. Techniques that retain some
    information while reducing (though not eliminating) the disclosure risk
    will be extensively discussed in the following sections; note that the
    techniques specifically applicable to IP addresses, timestamps, ports, and
    counters will be discussed in separate sections.</t>

    <section title="IP Address Anonymisation">

      <t>Since IP addresses are the most common identifiers within flow data
      that can be used to directly identify a person, organization, or host,
      most of the work on flow and trace data anonymisation has gone into IP
      address anonymisation techniques. Indeed, the aim of most attacks
      against anonymisation is to recover the map from anonymised IP addresses
      to original IP addresses, thereby identifying the hosts involved. There
      is therefore a wide range of IP address anonymisation schemes that fit
      into the following categories.</t>

      <texttable> 
        <ttcol align="left">Scheme</ttcol> 
        <ttcol align="left">Action</ttcol> 
        <c>Truncation</c><c>Generalisation</c>
        <c>Reverse Truncation</c><c>Generalisation</c>
        <c>Permutation</c><c>Direct Substitution</c>
        <c>Prefix-preserving Pseudonymisation</c><c>Direct Substitution</c>
      </texttable>

      <section title="Truncation">

        <t>Truncation removes "n" of the least significant bits from an IP
        address, replacing them with zeroes. In effect, it replaces a host
        address with a network address for some fixed netblock; for IPv4
        addresses, 8-bit truncation corresponds to replacement with a /24
        network address. Truncation is a non-reversible generalisation scheme.
        Note that while truncation is effective for making hosts
        non-identifiable, it preserves information which can be used to
        identify an organization, a geographic region, a country, or a
        continent.</t>

        <t>Truncation to an address length of 0 is equivalent to black-marker
        anonymisation. Complete removal of IP address information is only
        recommended for analysis tasks which have no need to separate flow
        data by host or network; e.g. as a first stage to per-application
        (port) or time-series total volume analyses.</t>
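
        <t>As a non-normative illustration, truncation can be sketched in
        Python as follows. The function name truncate_ip and its parameters
        are assumptions made for this example and are not defined by this
        document; the sketch relies only on the standard ipaddress
        module.</t>

        <figure><artwork><![CDATA[
```python
import ipaddress


def truncate_ip(address: str, n: int) -> str:
    """Zero the n least significant bits of an IPv4 or IPv6 address.

    Hypothetical helper for illustration only.
    """
    ip = ipaddress.ip_address(address)
    width = ip.max_prefixlen  # 32 for IPv4, 128 for IPv6
    if not 0 <= n <= width:
        raise ValueError("n is out of range for this address family")
    # All ones, except for the n low-order bits.
    mask = ((1 << width) - 1) ^ ((1 << n) - 1)
    # Rebuild with the same address class so IPv6 stays IPv6.
    return str(type(ip)(int(ip) & mask))
```
]]></artwork></figure>

        <t>Under these assumptions, truncate_ip("192.0.2.77", 8) yields the
        /24 network address 192.0.2.0, and truncating all 32 bits of an IPv4
        address leaves no address information at all.</t>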

      </section>

      <section title="Reverse Truncation">

        <t>Reverse truncation removes "n" of the most significant bits from an
        IP address, replacing them with zeroes. Reverse truncation is a
        non-reversible generalisation scheme. Reverse truncation is effective
        for making networks unidentifiable, partially or completely removing
        information which can be used to identify an organization, a
        geographic region, a country, or a continent (or RIR region of
        responsibility). However, it may cause ambiguity when applied to data
        collected from more than one network, since it treats all the hosts
        with the same address on different networks as if they are the same
        host. It is not particularly useful when publishing data where the
        network of origin is known or can be easily guessed by virtue of the
        identity of the publisher.</t>

        <t>Like truncation, reverse truncation to an address length of 0 is
        equivalent to black-marker anonymisation.</t>
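
        <t>A minimal, non-normative Python sketch of reverse truncation is
        given below. The name reverse_truncate_ip is an assumption made for
        this example. Note how two hosts on different networks collapse to
        the same anonymised address, which is the ambiguity described
        above.</t>

        <figure><artwork><![CDATA[
```python
import ipaddress


def reverse_truncate_ip(address: str, n: int) -> str:
    """Zero the n most significant bits of an IPv4 or IPv6 address.

    Hypothetical helper for illustration only.
    """
    ip = ipaddress.ip_address(address)
    width = ip.max_prefixlen  # 32 for IPv4, 128 for IPv6
    if not 0 <= n <= width:
        raise ValueError("n is out of range for this address family")
    # Keep only the (width - n) low-order bits.
    mask = (1 << (width - n)) - 1
    return str(type(ip)(int(ip) & mask))


# Hosts with the same low-order bits on different networks collide.
same = (reverse_truncate_ip("192.0.2.77", 24)
        == reverse_truncate_ip("198.51.100.77", 24))
```
]]></artwork></figure>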

      </section>

      <section title="Permutation">

        <t>Permutation is a direct substitution technique, replacing each IP
        address with an address selected from the set of possible IP
        addresses, such that each anonymised address represents a
        unique original address. The selection function is often random,
        though it is not necessarily so. Permutation does not preserve any
        structural information about a network, but it does preserve the
        unique count of IP addresses. Any application that requires more
        structure than host-uniqueness will not be able to use permuted IP
        addresses.</t>

        <t>While permutation ideally guarantees that each anonymised address
        represents a unique original address, this requires significant state
        in the Intermediate Anonymisation Process. Therefore, permutation may
        be implemented by hashing for performance reasons, with hash functions
        that may have relatively small collision probabilities. Such
        techniques are still essentially direct substitution techniques,
        despite the nonzero error probability.</t>
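
        <t>The two implementation strategies above can be contrasted in a
        short, non-normative Python sketch. The class IPv4Permuter and the
        function hashed_ipv4 are hypothetical names introduced for this
        example. The choice of HMAC-SHA-256 truncated to 32 bits is likewise
        an assumption, standing in for whatever hash function an
        implementation might select.</t>

        <figure><artwork><![CDATA[
```python
import hashlib
import hmac
import ipaddress
import secrets


class IPv4Permuter:
    """Stateful permutation: collision-free, but keeps a table."""

    def __init__(self):
        self._forward = {}   # original address (int) -> anonymised (int)
        self._used = set()   # anonymised values already handed out

    def anonymise(self, address: str) -> str:
        original = int(ipaddress.IPv4Address(address))
        if original not in self._forward:
            candidate = secrets.randbits(32)
            while candidate in self._used:  # redraw on collision
                candidate = secrets.randbits(32)
            self._forward[original] = candidate
            self._used.add(candidate)
        return str(ipaddress.IPv4Address(self._forward[original]))


def hashed_ipv4(address: str, key: bytes) -> str:
    """Stateless variant: keyed hash truncated to 32 bits.

    Two distinct inputs may collide with small probability.
    """
    packed = ipaddress.IPv4Address(address).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))
```
]]></artwork></figure>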

      </section>

      <section title="Prefix-preserving Pseudonymisation">

        <t>Prefix-preserving pseudonymisation is a direct substitution
        technique, like permutation but further restricted such that the
        structure of subnets is preserved at each level while anonymising IP
        addresses. If two real IP addresses match on a prefix of "n" bits, the
        two anonymised IP addresses will match on a prefix of "n" bits as
        well. This is useful when relationships among networks must be
        preserved for a given analysis task, but introduces structure into the
        anonymised data which can be exploited in attacks against the
        anonymisation technique.</t>

        <t>Scanning in Internet background traffic can cause particular
        problems with this technique: if a scanner uses a predictable and
        known sequence of addresses, this information can be used to reverse
        the substitution. The low order portion of the address can be left
        unanonymised as a partial defense against this attack.</t>
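
        <t>One way to obtain the prefix-preserving property is to flip each
        address bit according to a keyed pseudorandom function of the bits
        that precede it. The non-normative Python sketch below illustrates
        this construction for IPv4. The function name, the key handling, and
        the use of HMAC-SHA-256 as the pseudorandom function are assumptions
        made for this example; it is not an implementation of any particular
        published scheme.</t>

        <figure><artwork><![CDATA[
```python
import hashlib
import hmac
import ipaddress


def prefix_preserving_ipv4(address: str, key: bytes) -> str:
    """Anonymise an IPv4 address, preserving shared prefix lengths.

    Hypothetical helper for illustration only.
    """
    original = int(ipaddress.IPv4Address(address))
    result = 0
    for i in range(32):
        # The i most significant bits of the original address.
        prefix = original >> (32 - i)
        # Pseudorandom bit depending only on the key and that prefix;
        # the length i is included so prefixes of different lengths
        # with equal numeric value are kept distinct.
        message = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, message, hashlib.sha256).digest()[0] & 1
        bit = (original >> (31 - i)) & 1
        result = (result << 1) | (bit ^ flip)
    return str(ipaddress.IPv4Address(result))
```
]]></artwork></figure>

        <t>Because the flip applied to bit i depends only on bits 0 through
        i-1, two inputs that agree on exactly "n" leading bits produce
        outputs that agree on exactly "n" leading bits.</t>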

      </section>
        
    </section>

    <section title="MAC Address Anonymisation">

      <t>Flow data containing sub-IP information can also contain identifying
      information in the form of the hardware (MAC) address. While MAC
      address information cannot be used to locate a node within a network, it
      can be used to uniquely identify a specific device. Vendors or
      organizations within the supply chain may then have the information
      necessary to identify the entity or individual that purchased the
      device.</t>

      <t>MAC address information is not as structured as IP address
      information. EUI-48 and EUI-64 MAC addresses contain an Organizationally
      Unique Identifier (OUI) in the three most significant bytes of the
      address; this OUI additionally contains bits noting whether the address
      is locally or globally administered. Beyond this, the address is
      unstructured, and there is no particular relationship among the OUIs
      assigned to a given vendor.</t>

      <t>Note that MAC address information also appears within IPv6
      addresses, as the EUI-64 address, or EUI-48 address encoded as an EUI-64
      address, is used as the least significant 64 bits of the IPv6 address in
      the case of link local addressing or stateless autoconfiguration; the
      considerations and techniques in this section may then apply to such
      IPv6 addresses as well.</t>

      <texttable> 
        <ttcol align="left">Scheme</ttcol> 
        <ttcol align="left">Action</ttcol> 
        <c>Reverse Truncation</c><c>Generalisation</c>
        <c>Permutation</c><c>Direct Substitution</c>
        <c>Structured Pseudonymisation</c><c>Direct Substitution</c>
      </texttable>

      <section title="Reverse Truncation">

        <t>Reverse truncation removes "n" of the most significant bits from a
        MAC address, replacing them with zeroes. Reverse truncation is a
        non-reversible generalisation scheme. This has the effect of removing
        bits of the OUI, which identify manufacturers, before removing the
        least significant bits. Reverse truncation of 24 bits zeroes out the
        OUI.</t>

        <t>Reverse truncation is effective for making device manufacturers
        partially or completely unidentifiable within a dataset. However, it
        may cause ambiguity by introducing the possibility of truncated MAC
        address collision. Also note that the utility of removing manufacturer
        information is dubious, and not particularly well-covered by the
        literature.</t>

        <t>Reverse truncation to an address length of 0 is
        equivalent to black-marker anonymisation.</t>
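
        <t>For a 48-bit MAC address, the operation can be sketched
        non-normatively in Python as follows. The function name
        reverse_truncate_mac and the colon-separated text representation are
        assumptions made for this example.</t>

        <figure><artwork><![CDATA[
```python
def reverse_truncate_mac(mac: str, n: int) -> str:
    """Zero the n most significant bits of a 48-bit MAC address.

    Hypothetical helper for illustration only.
    """
    if not 0 <= n <= 48:
        raise ValueError("n must be between 0 and 48")
    value = int(mac.replace(":", "").replace("-", ""), 16)
    # Keep only the (48 - n) low-order bits.
    value &= (1 << (48 - n)) - 1
    raw = value.to_bytes(6, "big")
    return ":".join(f"{octet:02x}" for octet in raw)
```
]]></artwork></figure>

        <t>With n = 24, the three OUI bytes are zeroed and only the
        vendor-assigned portion of the address remains.</t>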

      </section>

      <section title="Permutation">

        <t>Permutation is a direct substitution technique, replacing each
        MAC address with an address selected from the set of possible
        MAC addresses, such that each anonymised address
        represents a unique original address. The selection function is often
        random, though it is not necessarily so. Permutation does not preserve
        any structural information about a network, but it does preserve the
        unique count of devices on the network. Any application that requires
        more structure than host-uniqueness will not be able to use permuted
        MAC addresses.</t>

        <t>While permutation ideally guarantees that each anonymised address
        represents a unique original address, this requires significant state
        in the Intermediate Anonymisation Process. Therefore, permutation may
        be implemented by hashing for performance reasons, with hash functions
        that may have relatively small collision probabilities. Such
        techniques are still essentially direct substitution techniques,
        despite the nonzero error probability.</t>

      </section>

      <section title="Structured Pseudonymisation">

        <t>Structured pseudonymisation for MAC addresses is a direct
        substitution technique, like permutation, but restricted such
        that the OUI (the most significant three bytes) is permuted separately
        from the node identifier, the remainder. This is useful when the
        uniqueness of OUIs must be preserved for a given analysis task, but
        introduces structure into the anonymised data which can be exploited
        in attacks against the anonymisation technique.</t>
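
        <t>A non-normative Python sketch of this technique follows. It keeps
        two independent random substitution tables, one for the 24-bit OUI
        and one for the 24-bit remainder. The class and helper names are
        hypothetical, and applying a single node-identifier table across all
        OUIs is an assumption of this example rather than a requirement of
        the technique.</t>

        <figure><artwork><![CDATA[
```python
import secrets


def _substitute(table: dict, used: set, value: int) -> int:
    """Return a stable, collision-free random 24-bit replacement."""
    if value not in table:
        candidate = secrets.randbits(24)
        while candidate in used:  # redraw on collision
            candidate = secrets.randbits(24)
        table[value] = candidate
        used.add(candidate)
    return table[value]


class StructuredMacPseudonymiser:
    """Permute the OUI and the node identifier separately."""

    def __init__(self):
        self._oui_map, self._oui_used = {}, set()
        self._node_map, self._node_used = {}, set()

    def anonymise(self, mac: str) -> str:
        value = int(mac.replace(":", ""), 16)
        oui, node = value >> 24, value & 0xFFFFFF
        new_oui = _substitute(self._oui_map, self._oui_used, oui)
        new_node = _substitute(self._node_map, self._node_used, node)
        raw = ((new_oui << 24) | new_node).to_bytes(6, "big")
        return ":".join(f"{octet:02x}" for octet in raw)
```
]]></artwork></figure>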

      </section>
      
    </section>
    
    <section title="Timestamp Anonymisation">

      <t>The particular time at which a flow began or ended is not
      particularly identifying information, but it can be used as part of
      attacks against other anonymisation techniques or for user profiling.
      Precise timestamps can be used in injected-traffic fingerprinting
      attacks, which use known information about a set of traffic generated or
      otherwise known by an attacker to recover mappings of other anonymised
      fields, as well as to identify certain activity by response delay and
      size fingerprinting, which compares response sizes and inter-flow times
      in anonymised data to known values. Therefore, timestamp information may
      be anonymised in order to ensure the protection of the entire data
      set.</t>

      <texttable> 
        <ttcol align="left">Scheme</ttcol> 
        <ttcol align="left">Action</ttcol> 
        <c>Precision Degradation</c><c>Generalisation</c>
        <c>Enumeration</c><c>Direct or Set Substitution</c>
        <c>Random Shifts</c><c>Direct Substitution</c>
      </texttable>

      <section title="Precision Degradation">

        <t>Precision Degradation is a generalisation technique that removes
        the most precise components of a timestamp, accounting all events
        occurring in each given interval (e.g. one millisecond for millisecond
        level degradation) as simultaneous. This has the effect of potentially
        collapsing many timestamps into one. With this technique time
        precision is reduced, and sequencing may be lost, but the approximate
        time at which each event occurred is preserved. The anonymised data may
        not be generally useful for applications which require strict
        sequencing of flows.</t>

        <t>Note that flow meters with low time precision (e.g. second
        precision, or millisecond precision on high-capacity networks) perform
        the equivalent of precision degradation anonymisation by their
        design.</t>

        <t>Note also that degradation to a very low precision (e.g. on the
        order of minutes, hours, or days) is commonly used in analyses
        operating on time-series aggregated data, and may also be described as
        binning; though the time scales are longer and applicability more
        restricted, this is in principle the same operation.</t>

        <t>Precision degradation to infinitely low precision is equivalent to
        black-marker anonymisation. Removal of timestamp information is only
        recommended for analysis tasks which have no need to separate flows in
        time, for example for counting total volumes or unique occurrences of
        other flow keys in an entire dataset.</t>
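
        <t>As a non-normative illustration, precision degradation amounts to
        rounding each timestamp down to the start of its interval. In the
        Python sketch below, the function name and the choice of integer
        milliseconds since the epoch as the timestamp representation are
        assumptions made for this example.</t>

        <figure><artwork><![CDATA[
```python
def degrade_timestamp(timestamp_ms: int, interval_ms: int) -> int:
    """Round a millisecond timestamp down to a multiple of interval_ms.

    Hypothetical helper for illustration only.
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return timestamp_ms - (timestamp_ms % interval_ms)


# Two flows starting within the same second become simultaneous.
first = degrade_timestamp(1286524805123, 1000)
second = degrade_timestamp(1286524805999, 1000)
```
]]></artwork></figure>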

      </section>
      
      <section title="Enumeration">

        <t>Enumeration is a substitution function that retains the
        chronological order in which events occurred while eliminating time
        information. Timestamps are substituted by equidistant timestamps (or
        numbers) starting from a randomly chosen start value. The resulting
        data is useful for applications requiring strict sequencing, but not
        for those requiring good timing information (e.g. delay- or jitter-
        measurement for QoS applications or SLA validation).</t>
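
        <t>A minimal, non-normative Python sketch of enumeration is shown
        below. The function name, the default step of 1, and the decision to
        map equal input timestamps to the same output value (making this a
        direct substitution) are assumptions made for this example.</t>

        <figure><artwork><![CDATA[
```python
import secrets


def enumerate_timestamps(timestamps, step=1):
    """Replace timestamps by equidistant values, preserving order.

    Hypothetical helper for illustration only.
    """
    start = secrets.randbelow(2 ** 32)  # randomly chosen start value
    ordered = sorted(set(timestamps))
    mapping = {t: start + i * step for i, t in enumerate(ordered)}
    # Output is in the same record order as the input.
    return [mapping[t] for t in timestamps]
```
]]></artwork></figure>

        <t>Given the inputs 30, 10, 20, 10, the outputs keep the ordering of
        the inputs, but consecutive distinct values differ by exactly one
        step regardless of the original gaps.</t>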

      </section>
      
      <section title="Random Shifts">

        <t>Random time shifts add a random offset to every timestamp within a
        dataset. This reversible substitution technique therefore retains
        duration and inter-event interval information as well as chronological
        order of flows. It is primarily intended to defeat traffic injection
        fingerprinting attacks.</t>
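
        <t>The following non-normative Python sketch applies one random
        offset to a whole data set. The function name, the millisecond
        timestamp representation, and the symmetric bound max_shift_ms on
        the offset are assumptions made for this example.</t>

        <figure><artwork><![CDATA[
```python
import secrets


def shift_timestamps(timestamps, max_shift_ms):
    """Add one random offset to every timestamp in the data set.

    Hypothetical helper for illustration only. The offset is drawn
    uniformly from [-max_shift_ms, +max_shift_ms].
    """
    offset = secrets.randbelow(2 * max_shift_ms + 1) - max_shift_ms
    return [t + offset for t in timestamps]
```
]]></artwork></figure>

        <t>Because the same offset is applied throughout, differences
        between any two timestamps are unchanged, while absolute times no
        longer match those known to an attacker who injected traffic.</t>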

      </section>
      
    </section>

    <section title="Counter Anonymisation">

      <t>Counters (such as packet and octet volumes per flow) are subject to
      fingerprinting and injection attacks against anonymisation, and can be
      used for user profiling, just as timestamps are. Counter anonymisation
      can help defeat these attacks, but anonymised counters are only usable
      for analysis tasks for which relative or imprecise magnitudes of
      activity are useful. Counter information can
      also be completely removed, but this is only recommended for analysis
      tasks which have no need to evaluate the removed counter, for example
      for counting only unique occurrences of other flow keys.</t>

      <texttable> 
        <ttcol align="left">Scheme</ttcol> 
        <ttcol align="left">Action</ttcol> 
        <c>Precision Degradation</c><c>Generalisation</c>
        <c>Binning</c><c>Generalisation</c>
        <c>Random noise addition</c><c>Direct or Set Substitution</c>
      </texttable>

      <section title="Precision Degradation">

        <t>As with precision degradation in timestamps, precision degradation
        of counters removes lower-order bits of the counters, treating all the
        counters in a given range as having the same value. Depending on the
        precision reduction, this loses information about the relationships
        between sizes of similarly-sized flows, but keeps relative magnitude
        information. Precision degradation to an infinitely low precision is
        equivalent to black-marker anonymisation.</t>
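
        <t>A minimal sketch of this operation, assuming non-negative integer
        counters and a hypothetical dropped_bits parameter chosen by the
        operator, might look as follows.</t>

        <figure>
          <artwork><![CDATA[
```python
def degrade_counter(value, dropped_bits):
    """Clear the low-order bits of a non-negative counter."""
    return (value >> dropped_bits) << dropped_bits

# With 8 dropped bits, every counter in 1024..1279 maps to 1024.
low = degrade_counter(1024, 8)
high = degrade_counter(1279, 8)
```
          ]]></artwork>
        </figure>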

      </section>

      <section title="Binning">

        <t>Binning can be seen as a generalisation of precision degradation; the
        operation is identical, except that in precision degradation the
        counter ranges are uniform, and in binning they need not be. For
        example, a common counter binning scheme for packet counters could be
        to bin values 1-2 together, and 3-infinity together, thereby
        separating potentially completely-opened TCP connections from unopened
        ones. Binning schemes are generally chosen to keep precisely the
        amount of information required in a counter for a given analysis task.
        Note that, also unlike precision degradation, the bin label need not
        be within the bin's range. Binning counters to a single bin is
        equivalent to black-marker anonymisation. </t>
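
        <t>The packet counter example above could be sketched as follows.
        The bin boundaries follow the example; the bin labels and the
        treatment of a zero counter are assumptions made for illustration.</t>

        <figure>
          <artwork><![CDATA[
```python
import bisect

# Lower bounds of the bins 1-2 and 3-infinity.
BIN_LOWER_BOUNDS = [1, 3]
# Hypothetical labels: 0 for a zero counter, 1 for 1-2, 2 for 3+.
# The label for the last bin need not lie within that bin.
BIN_LABELS = [0, 1, 2]

def bin_counter(value):
    """Map a packet counter to the label of its bin."""
    index = bisect.bisect_right(BIN_LOWER_BOUNDS, value)
    return BIN_LABELS[index]
```
          ]]></artwork>
        </figure>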

      </section>

      <section title="Random Noise Addition">

        <t>Random noise addition adds a random amount to a counter in each
        flow; this is used to keep relative magnitude information and minimize
        the disruption to size relationship information while avoiding
        fingerprinting attacks against anonymisation. Note that there is no
        guarantee that random noise addition will maintain ranking order by a
        counter among members of a set. Random noise addition is particularly
        useful when the derived analysis data will not be presented in such a
        way as to require the lower-order bits of the counters.</t>
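
        <t>One possible sketch of random noise addition is shown below. The
        uniform distribution, the max_noise bound, and the clamping of
        negative results to zero are assumptions of this example, not
        requirements of the technique.</t>

        <figure>
          <artwork><![CDATA[
```python
import random

def add_noise(counter, max_noise, rng=None):
    """Add a bounded random amount to a single counter value."""
    rng = rng or random.Random()
    noisy = counter + rng.randint(-max_noise, max_noise)
    # Assumption: counters are unsigned, so clamp at zero.
    return max(noisy, 0)

# Each flow receives independent noise, so the ranking of flows
# with similar counters is not guaranteed to be preserved.
noisy_octets = [add_noise(c, 16) for c in (1000, 1010, 50000)]
```
          ]]></artwork>
        </figure>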

      </section>

    </section>

    <section title="Anonymisation of Other Flow Fields">

      <t>Other fields, particularly port numbers and protocol numbers, can
      be used to partially identify the applications that generated the
      traffic in a given flow trace. This information can be used in
      fingerprinting attacks, and may be of interest on its own (e.g., to
      reveal that a certain application with suspected vulnerabilities is
      running on a given network). These fields are generally
      anonymised using one of two techniques.</t>

      <texttable> 
        <ttcol align="left">Scheme</ttcol> 
        <ttcol align="left">Action</ttcol> 
        <c>Binning</c><c>Generalisation</c>
        <c>Permutation</c><c>Direct Substitution</c>
      </texttable>
      
      <section title="Binning">

        <t>Binning is a generalisation technique mapping a set of potentially
        non-uniform ranges into a set of arbitrarily labeled bins. Common bin
        arrangements depend on the field type and the analysis application.
        For example, an IP protocol bin arrangement may preserve 1, 6, and 17
        for ICMP, TCP, and UDP traffic, and bin all other protocols into a
        single bin, to mitigate the use of uncommon protocols in
        fingerprinting attacks. Another example arrangement may bin source and
        destination ports into low (0-1023) and high (1024-65535) bins in
        order to tell service from ephemeral ports without identifying
        individual applications.</t>

        <t>Binning other flow key fields to a single bin is equivalent to
        black-marker anonymisation. Removal of other flow key information is
        only recommended for analysis tasks which have no need to
        differentiate flows on the removed keys, for example for total traffic
        counts or unique counts of other flow keys.</t>
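
        <t>The two example arrangements above could be sketched as follows.
        The label used for the catch-all protocol bin and the labels used
        for the two port bins are hypothetical choices for this example.</t>

        <figure>
          <artwork><![CDATA[
```python
PRESERVED_PROTOCOLS = (1, 6, 17)  # ICMP, TCP, UDP
OTHER_PROTOCOL = 255              # hypothetical catch-all label

def bin_protocol(protocol):
    """Keep ICMP, TCP and UDP; merge all other protocols."""
    if protocol in PRESERVED_PROTOCOLS:
        return protocol
    return OTHER_PROTOCOL

def bin_port(port):
    """Label ports 0-1023 as 0 and ports 1024-65535 as 1024."""
    return 0 if port <= 1023 else 1024
```
          ]]></artwork>
        </figure>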

      </section>      

      <section title="Permutation">

        <t>Permutation is a direct substitution technique, replacing
        each value with a value selected from the set of possible
        values, such that each anonymised value represents a unique
        original value. This is used to preserve the count of unique values
        without preserving information about, or the ordering of, the values
        themselves.</t>

        <t>While permutation ideally guarantees that each anonymised value
        represents a unique original value, doing so may require significant state
        in the Intermediate Anonymisation Process. Therefore, permutation may
        be implemented by hashing for performance reasons, with hash functions
        that may have relatively small collision probabilities. Such
        techniques are still essentially direct substitution techniques,
        despite the nonzero error probability.</t>
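
        <t>The trade-off between the two approaches can be illustrated for
        16-bit port numbers. In this sketch, which is not a recommended
        implementation, make_port_permutation builds a true permutation at
        the cost of a full lookup table, while hash_port keeps no state but
        may map distinct ports to the same value. The function names, the
        use of HMAC-SHA-256, and the seeding of a non-cryptographic
        generator are assumptions made for illustration.</t>

        <figure>
          <artwork><![CDATA[
```python
import hashlib
import hmac
import random

def make_port_permutation(secret_seed):
    """True permutation: needs a 65536-entry table of state."""
    table = list(range(65536))
    # random.Random is not a cryptographic generator; it is
    # used here only to keep the sketch short.
    random.Random(secret_seed).shuffle(table)
    return table

def hash_port(port, key):
    """Stateless keyed hash; distinct ports may collide."""
    digest = hmac.new(key, port.to_bytes(2, "big"),
                      hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "big")

table = make_port_permutation(secret_seed=12345)
anonymised_port = table[443]
hashed_port = hash_port(443, b"hypothetical secret key")
```
          ]]></artwork>
        </figure>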

      </section>

    </section>

  </section>

  <section title="Parameters for the Description of Anonymisation Techniques"> 

    <t>This section details the abstract parameters used to describe the
    anonymisation techniques examined in the previous section, on a
    per-parameter basis. These parameters and their export safety inform the
    design of the IPFIX anonymisation metadata export specified in the
    following section.</t>

    <section title="Stability" anchor="params-stability">

      <t>A stable anonymisation will always map a given value in the real
      space to a single given value in the anonymised space, while an unstable
      anonymisation will change this mapping over time; a completely unstable
      anonymisation is essentially indistinguishable from black-marker
      anonymisation. Any given anonymisation technique may be applied with a
      varying range of stability. Stability is important for assessing the
      comparability of anonymised information in different data sets, or in
      the same data set over different time periods. In practice, an
      anonymisation may also be stable for every data set published by a
      particular producer to a particular consumer, stable for a stated time
      period within a dataset or across datasets, or stable only for a single
      data set.</t>

      <t>If no information about stability is available, users of anonymised
      data MAY assume that the techniques used are stable across the entire
      dataset, but unstable across datasets. Note that stability presents a
      risk-utility tradeoff, as completely stable anonymisation can be used
      for longer-term trend analysis tasks but also presents more risk of
      attack given the stable mapping. Information about the stability of
      a mapping SHOULD be exported along with the anonymised data.</t>

    </section>

    <section title="Truncation Length">

      <t>Truncation and precision degradation are described by the truncation
      length, or the amount of data still remaining in the anonymised field
      after anonymisation.</t>

      <t>Truncation length can generally be inferred from a given data set,
      and need not be specially exported or protected. For bit-level
      truncation, the truncated bits are generally inferable by the least
      significant bit set for an instance of an Information Element described
      by a given Template (or the most significant bit set, in the case of
      reverse truncation). For precision degradation, the truncation is
      inferable from the maximum precision given. Note that while this
      inference method is generally applicable, it is data-dependent: there is
      no guarantee that it will recover the exact truncation length used to
      prepare the data.</t>
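
      <t>The inference for bit-level truncation can be sketched as follows,
      assuming the values of one Information Element are available as
      unsigned integers of a known width. The function name is
      hypothetical. In the example, two IPv4 addresses truncated to 24 bits
      (192.0.2.0 and 198.51.100.0) yield an inferred 9 truncated bits
      rather than 8, which shows the data dependence noted above.</t>

      <figure>
        <artwork><![CDATA[
```python
import ipaddress

def infer_truncated_bits(values, width=32):
    """Count the low-order bits that are zero in every value."""
    combined = 0
    for value in values:
        combined |= value
    if combined == 0:
        return width
    # Position of the least significant bit set in any value.
    return (combined & -combined).bit_length() - 1

addresses = [int(ipaddress.IPv4Address(a))
             for a in ("192.0.2.0", "198.51.100.0")]
inferred = infer_truncated_bits(addresses)
```
        ]]></artwork>
      </figure>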

      <t>In the special case of IP address export with variable (per-record)
      truncation, the truncation MAY be expressed by exporting the prefix
      length alongside the address.</t>

    </section>
    
    <section title="Bin Map">

      <t>Binning is described by the specification of a bin mapping function.
      This function can be generally expressed in terms of an associative
      array that maps each point in the original space to a bin, although from
      an implementation standpoint most bin functions are much simpler and
      more efficient.</t>

      <t>Since knowledge of the bin mapping function can be used to partially
      deanonymise binned data, depending on the degree of generalisation, no
      information about the bin mapping function should be exported.</t>
      
    </section>
      
    <section title="Permutation">

      <t>Like binning, permutation is described by the specification of a
      permutation function. In the general case, this can be expressed in
      terms of an associative array that maps each point in the original space
      to a point in the anonymised space. Unlike binning, each point in the
      anonymised space corresponds to a single, unique point in the
      original space.</t>

      <t>Since knowledge of the permutation function may, depending on the
      function, be used to completely deanonymise permuted data, no
      information about the permutation function or its parameters should be
      exported.</t>

    </section>

    <section title="Shift Amount">

      <t>Shifting requires an amount to shift each value by. Since the shift
      amount can be used to deanonymise data protected by shifting, no
      information about the shift amount should be exported.</t>

    </section>

  </section> 

  <section title="Anonymisation Export Support in IPFIX" anchor="aes-section">

    <t>Anonymised data exported via IPFIX SHOULD be annotated with
    anonymisation metadata, which details which fields described by which
    Templates are anonymised, and provides appropriate information on the
    anonymisation techniques used. This metadata SHOULD be exported in Data
    Records described by the recommended Options Templates described in this
    section; these Options Templates use the additional Information Elements
    described in the following subsection.</t>

    <t>Note that fields anonymised using the black-marker (removal) technique
    do not require any special metadata support: black-marker anonymised
    fields SHOULD NOT be exported at all, by omitting the corresponding
    Information Elements from the Template describing the Data Set. In the case
    where application requirements dictate that a black-marker anonymised
    field must remain in a Template, then an Exporting Process MAY export
    black-marker anonymised fields with their native length as all-zeros, but
    only in cases where enough contextual information exists within the record
    to differentiate a black-marker anonymised field exported in this way from
    a real zero value.</t>

    <section title="Anonymisation Records and the Anonymisation Options Template" anchor="opt-section">

      <t>The Anonymisation Options Template describes Anonymisation Records,
      which allow anonymisation metadata to be exported inline over IPFIX or
      stored in an IPFIX File, by binding information about anonymisation
      techniques to Information Elements within defined Templates or Options
      Templates. IPFIX Exporting Processes SHOULD export anonymisation records
      for any Template describing exported anonymised Data Records; IPFIX
      Collecting Processes and processes downstream from them MAY use
      anonymisation records to treat anonymised data differently depending on
      the applied technique.</t>

      <t>Anonymisation Records contain ancillary information bound to a
      Template, so many of the considerations for Templates apply to
      Anonymisation Records as well. First, reliability is important: an
      Exporting Process SHOULD export Anonymisation Records after the
      Templates they describe have been exported, and SHOULD export
      anonymisation records reliably.</t>

      <t>Anonymisation Records MUST be handled by Collecting Processes as
      scoped to the Template to which they apply within the Transport Session
      in which they are sent. When a Template is withdrawn via a Template
      Withdrawal Message or expires during a UDP transport session, the
      accompanying Anonymisation Records are withdrawn or expire as well, and
      do not apply to subsequent Templates with the same Template ID within
      the Session unless re-exported.</t>
      
      <t>The Stability Class within the anonymisationFlags IE can be used to
      declare that a given anonymisation technique's mapping will remain
      stable across multiple sessions, but this does not mean that
      anonymisation technique information given in the Anonymisation Records
      themselves persist across Sessions. Each new Transport Session MUST
      contain new Anonymisation Records for each Template describing
      anonymised Data Sets.</t>

      <t>SCTP per-stream export <xref
      target="I-D.ietf-ipfix-export-per-sctp-stream"/> may be used to ease
      management of Anonymisation Records if appropriate for the
      application.</t>

      <texttable>
        <ttcol align="left">IE</ttcol>
        <ttcol align="left">Description</ttcol>
        <c>templateId [scope]</c>
        <c>

          The Template ID of the Template or Options Template containing the
          Information Element described by this anonymisation record. This
          Information Element MUST be defined as a Scope Field.

        </c>
        <c>informationElementId [scope]</c>
        <c>

          The Information Element identifier of the Information Element
          described by this anonymisation record. This Information Element
          MUST be defined as a Scope Field. Exporting Processes MUST clear
          the Enterprise bit of the informationElementId and Collecting
          Processes SHOULD ignore it; information about enterprise-specific
          Information Elements is exported via the privateEnterpriseNumber
          Information Element.

        </c>
       <c>privateEnterpriseNumber [scope] [optional]</c>
        <c>

          The Private Enterprise Number of the enterprise-specific Information
          Element described by this anonymisation record. This Information
          Element MUST be defined as a Scope Field if present. A
          privateEnterpriseNumber of 0 signifies that the Information Element
          is IANA-registered.

        </c>
        <c>informationElementIndex [scope] [optional]</c>
        <c>

          The Information Element index of the instance of the Information
          Element described by this anonymisation record identified by the
          informationElementId within the Template. Optional; need only be
          present when describing Templates that have multiple instances of
          the same Information Element. This Information Element MUST be
          defined as a Scope Field if present. This Information Element is
          defined in <xref target="ie-section"></xref>, below.

        </c>
        <c>anonymisationFlags</c>
        <c>

          Flags describing the mapping stability and specialized modifications
          to the Anonymisation Technique in use. SHOULD be present. This
          Information Element is defined in <xref target="ie-af-section"/>,
          below.

        </c>
        <c>anonymisationTechnique</c>
        <c>

          The technique used to anonymise the data. MUST be present. This
          Information Element is defined in <xref target="ie-at-section"/>,
          below.

        </c>

       </texttable>
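
      <t>As a non-normative illustration, a Collecting Process might hold
      Anonymisation Records in memory keyed by their scope fields, and drop
      them when the Template they describe is withdrawn. The class and
      function names below are hypothetical, and the defaults of 0 for the
      optional scope fields are an assumption of this sketch. The example
      values use Template ID 256, sourceIPv4Address (Information Element
      8), technique 6 (Structured Permutation), and a Stability Class of
      Session.</t>

      <figure>
        <artwork><![CDATA[
```python
from dataclasses import dataclass

ENTERPRISE_BIT = 0x8000

@dataclass(frozen=True)
class AnonymisationRecord:
    template_id: int                    # scope
    information_element_id: int         # scope
    anonymisation_technique: int
    anonymisation_flags: int = 0
    private_enterprise_number: int = 0  # scope, optional
    information_element_index: int = 0  # scope, optional

    def scope(self):
        # Ignore the Enterprise bit of the element identifier.
        ie_id = self.information_element_id & ~ENTERPRISE_BIT
        return (self.template_id, ie_id,
                self.private_enterprise_number,
                self.information_element_index)

def store(records, record):
    """Keep the most recent record for each scope."""
    records[record.scope()] = record

def withdraw_template(records, template_id):
    """Drop every record scoped to a withdrawn Template."""
    for key in [k for k in records if k[0] == template_id]:
        del records[key]

records = {}
store(records, AnonymisationRecord(256, 8, 6, 1))
withdraw_template(records, 256)
```
        ]]></artwork>
      </figure>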
    </section>
    
    <section title="Recommended Information Elements for Anonymisation Metadata" anchor="ie-section">

      <section title="informationElementIndex" anchor="ie-iei-section">
       <list style="hanging">
         <t hangText="Description: ">
           A zero-based index of an Information Element referenced by informationElementId within a Template referenced by templateId; used to disambiguate scope for templates containing multiple identical Information Elements.</t>
         <t hangText="Abstract Data Type: ">unsigned16</t>
         <t hangText="ElementId: ">TBD3</t>
         <t hangText="Status: ">Proposed</t>
       </list>
      </section>      

      <section title="anonymisationTechnique" anchor="ie-at-section">
      	<list style="hanging">
      	  <t hangText="Description: ">

            A description of the anonymisation technique applied to a
            referenced Information Element within a referenced Template. Each
            technique may be applicable only to certain Information Elements
            and recommended only for certain Information Elements; these
            restrictions are noted in the table below.

            <texttable>
            <ttcol align="left">Value</ttcol>
            <ttcol align="left">Description</ttcol>
            <ttcol align="left">Applicable to</ttcol>
            <ttcol align="left">Recommended for</ttcol>

       	    <c>0</c>
       	    <c>Undefined: the Exporting Process makes no representation as to whether the defined field is anonymised or not. While the Collecting Process MAY assume that the field is not anonymised, it is not guaranteed not to be. This is the default anonymisation technique.</c>
       	    <c>all</c>
       	    <c>all</c>
     	        
       	    <c>1</c>
       	    <c>None: the values exported are real.</c>
       	    <c>all</c>
       	    <c>all</c>
       	    
       	    <c>2</c>
       	    <c>Precision Degradation/Truncation: the values exported are anonymised using simple precision degradation or truncation. The new precision or number of truncated bits is implicit in the exported data, and can be deduced by the Collecting Process.</c>
       	    <c>all</c>
       	    <c>all</c>

       	    <c>3</c>
       	    <c>Binning: the values exported are anonymised into bins.</c>
       	    <c>all</c>
       	    <c>all</c>

       	    <c>4</c><c>Enumeration: the values exported are anonymised by enumeration.</c>
       	    <c>all</c>
       	    <c>timestamps</c>
       	    
       	    <c>5</c>
       	    <c>Permutation: the values exported are anonymised by permutation.</c>
       	    <c>all</c>
       	    <c>identifiers</c>
       	    
       	    <c>6</c><c>Structured Permutation: the values exported are anonymised by  permutation, preserving bit-level structure as appropriate; this represents prefix-preserving IP address anonymisation or structured MAC address anonymisation.</c>
       	    <c>addresses</c>
            <c></c>

            <c>7</c><c>Reverse Truncation: the values exported are anonymised using reverse truncation. The number of truncated bits is implicit in the exported data, and can be deduced by the Collecting Process.</c>
         	<c>addresses</c>
       	    <c></c>

            <c>8</c><c>Noise: the values exported are anonymised by adding random noise to each value.</c>
       	    <c>non-identifiers</c>
       	    <c>counters</c>

            <c>9</c><c>Offset: the values exported are anonymised by adding a single offset to all values.</c>
       	    <c>all</c>
       	    <c>timestamps</c>
       	  </texttable>

         </t>
       	<t hangText="Abstract Data Type: ">unsigned16</t>
       	<t hangText="ElementId: ">TBD2</t>
       	<t hangText="Status: ">Proposed</t>
       </list>
      </section>      

      <section title="anonymisationFlags" anchor="ie-af-section">
          <list style="hanging">
            <t hangText="Description: ">

              A flag word describing specialized modifications to the
              anonymisation policy in effect for the anonymisation technique
              applied to a referenced Information Element within a referenced
              Template. When flags are clear (0), the normal policy (as
              described by anonymisationTechnique) applies without
              modification.

              <figure title="anonymisationFlags IE">
                  <artwork><![CDATA[
   MSB   14  13  12  11  10   9   8   7   6   5   4   3   2   1  LSB
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   |                Reserved                       |LOR|PmA|   SC  |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
                    ]]></artwork>
              </figure>

              <texttable>
              <ttcol align="left">bit(s) (LSB = 0)</ttcol>
              <ttcol align="left">name</ttcol>
              <ttcol align="left">description</ttcol>

              <c>0-1</c><c>SC</c><c>Stability Class: see the Stability Class
              table below, and section <xref target="params-stability"/>.</c>

         	    <c>2</c><c>PmA</c><c>Perimeter Anonymisation: when set (1),
              source- Information Elements as described in <xref
              target="RFC5103"/> are interpreted as external addresses, and
              destination- Information Elements as described in <xref
              target="RFC5103"/> are interpreted as internal addresses, for
              the purposes of associating anonymisationTechnique to
              Information Elements only; see <xref
              target="perimeter-anon"/> for details. This bit MUST NOT be set
              when associated with a non-endpoint Information Element (i.e.,
              one that is neither a source- nor a destination- Information
              Element). SHOULD be consistent within a
              record (i.e., if a source- Information Element has this flag
              set, the corresponding destination- element SHOULD have this
              flag set, and vice-versa.)</c>

         	    <c>3</c><c>LOR</c><c>Low-Order Unchanged: when set (1), the
              low-order bits of the anonymised Information Element contain
              real data. This modification is intended for the anonymisation
              of network-level addresses while leaving host-level addresses
              intact in order to preserve host-level structure, which could
              otherwise be used to reverse anonymisation. MUST NOT be set when
              associated with a truncation-based anonymisationTechnique.</c>

         	    <c>4-15</c><c>Reserved</c><c>Reserved for future use: SHOULD be
              cleared (0) by the Exporting Process and MUST be ignored by the
              Collecting Process.</c>

         	  </texttable>

              The Stability Class portion of this flags word describes the
              stability class of the anonymisation technique applied to a
              referenced Information Element within a referenced Template.
              Stability classes refer to the stability of the parameters of
              the anonymisation technique, and therefore the comparability of
              the mapping between the real and anonymised values over time.
              This determines which anonymised datasets may be compared with
              each other. Values are as follows:

              <texttable>
                <ttcol align="left">Bit 1</ttcol>
                <ttcol align="left">Bit 0</ttcol>
                <ttcol align="left">Description</ttcol>
           	    <c>0</c><c>0</c><c>Undefined: the Exporting Process makes no representation as to how stable the mapping is, or over what time period values of this field will remain comparable; while the Collecting Process MAY assume Session level stability, Session level stability is not guaranteed. Processes SHOULD assume this is the case in the absence of stability class information; this is the default stability class.</c>
           	    <c>0</c><c>1</c><c>Session: the Exporting Process will ensure that the parameters of the anonymisation technique are stable during the Transport Session. All the values of the described Information Element for each Record described by the referenced Template within the Transport Session are comparable. The Exporting Process SHOULD endeavour to ensure at least this stability class.</c>
           	    <c>1</c><c>0</c><c>Exporter-Collector Pair: the Exporting Process will ensure that the parameters of the anonymisation technique are stable across Transport Sessions over time with the given Collecting Process, but may use different parameters for different Collecting Processes. Data exported to different Collecting Processes is not comparable.</c>
           	    <c>1</c><c>1</c><c>Stable: the Exporting Process will ensure that the parameters of the anonymisation technique are stable across Transport Sessions over time, regardless of the Collecting Process to which it is sent.</c>
              </texttable>
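
              As a non-normative illustration of the bit layout above, the
              flag word could be packed and unpacked as sketched below. The
              function and constant names are hypothetical; the bit
              positions and Stability Class values are taken from the
              tables above, and reserved bits are ignored on decoding.

              <figure>
                  <artwork><![CDATA[
```python
SC_MASK = 0x0003   # bits 0-1: Stability Class
PMA_BIT = 0x0004   # bit 2: Perimeter Anonymisation
LOR_BIT = 0x0008   # bit 3: Low-Order Unchanged

STABILITY_CLASSES = {
    0: "Undefined",
    1: "Session",
    2: "Exporter-Collector Pair",
    3: "Stable",
}

def encode_flags(stability_class, pma=False, lor=False):
    """Build an anonymisationFlags value; reserved bits stay 0."""
    if stability_class not in STABILITY_CLASSES:
        raise ValueError("stability class must be 0..3")
    flags = stability_class
    if pma:
        flags |= PMA_BIT
    if lor:
        flags |= LOR_BIT
    return flags

def decode_flags(flags):
    """Interpret a flag word, ignoring the reserved bits."""
    return {
        "stability": STABILITY_CLASSES[flags & SC_MASK],
        "pma": bool(flags & PMA_BIT),
        "lor": bool(flags & LOR_BIT),
    }
```
                  ]]></artwork>
              </figure>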

            </t>
         	<t hangText="Abstract Data Type: ">unsigned16</t>
         	<t hangText="ElementId: ">TBD1</t>
         	<t hangText="Status: ">Proposed</t>
         </list>
      </section>
      
    </section>

  </section>

  <section title="Applying Anonymisation Techniques to IPFIX Export and Storage" anchor="export-anon-section">

    <t>When exporting or storing anonymised flow data using IPFIX, certain
    interactions between the IPFIX Protocol and the anonymisation techniques
    in use must be considered; these are treated in the subsections below.</t>

    <section title="Arrangement of Processes in IPFIX Anonymisation" anchor="export-anon-arrangement">

      <t>Anonymisation may be applied to IPFIX data at three stages within the
      collection infrastructure: on initial export, at a mediator, or after
      collection, as shown in <xref target="loc-fig"></xref>. Each of these
      locations has specific considerations and applicability.</t>

      <figure title="Potential Anonymisation Locations" anchor="loc-fig">
        <artwork><![CDATA[
            +==========================================+
            | Exporting Process                        |
            +==========================================+
              |                                      |
              |    (Anonymised at Original Exporter) |
              V                                      |
            +=============================+          |
            | Mediator                    |          |
            +=============================+          |
              |                                      |
              | (Anonymising Mediator)               |
              V                                      V
            +==========================================+
            | Collecting Process                       |
            +==========================================+
                    |
                    | (Anonymising CP/File Writer)
                    V
            +--------------------+
            | IPFIX File Storage |
            +--------------------+
        ]]></artwork>
      </figure>

      <t>Anonymisation is generally performed before the wider dissemination
      or repurposing of a flow data set, e.g., adapting operational
      measurement data for research. Therefore, direct anonymisation of flow
      data on initial export is only applicable in certain restricted
      circumstances: when the Exporting Process is "publishing" data to a
      Collecting Process directly, and the Exporting Process and Collecting
      Process are operated by different entities. Note that certain guidelines
      in <xref target="header-anon"></xref> with respect to timestamp
      anonymisation may not apply in this case, as the Collecting Process may
      be able to deduce certain timing information from the time at which each
      Message is received.</t>

      <t>A much more flexible arrangement is to anonymise data within a <xref
      target="I-D.ietf-ipfix-mediators-framework">Mediator</xref>. Here,
      original data is sent to a Mediator, which performs the anonymisation
      function and re-exports the anonymised data. Such a Mediator could be
      located at the administrative domain boundary of the initial Exporting
      Process operator, exporting anonymised data to other consumers outside
      the organisation. In this case, the original Exporter SHOULD use TLS as
      specified in <xref target="RFC5101"></xref> to secure the channel to the
      Mediator, and the Mediator should follow the guidelines in <xref
      target="guidelines"></xref>, to mitigate the risk of original data
      disclosure.</t>

      <t>When data is to be published as an anonymised data set in an <xref
      target="RFC5655">IPFIX File</xref>, the anonymisation may be done at the
      final Collecting Process before storage and dissemination, as well. In
      this case, the Collector should follow the guidelines in <xref
      target="guidelines"></xref>, especially as regards File-specific Options
      in <xref target="opt-anon"></xref>.</t>

      <t>In each of these data flows, the anonymisation of records is
      undertaken by an Intermediate Anonymisation Process (IAP); the data
      flows into and out of this IAP are shown in <xref target="iap-dataflows"></xref> below.</t>

      <figure title="Data flows through the anonymisation process" anchor="iap-dataflows">
                <artwork><![CDATA[
packets --+                     +- IPFIX Messages -+
          |                     |                  |
          V                     V                  V
+==================+ +====================+ +=============+
| Metering Process | | Collecting Process | | File Reader |
+==================+ +====================+ +=============+
          |      Non-anonymised | Records          |
          V                     V                  V
+=========================================================+
|          Intermediate Anonymisation Process (IAP)       |
+=========================================================+
          | Anonymised     ^            Anonymised |
          | Records        |               Records |
          V                |                       V
+===================+    Anonymisation      +=============+
| Exporting Process |<--- Parameters ------>| File Writer |
+===================+                       +=============+
          |                                        |
          +------------> IPFIX Messages <----------+
        ]]></artwork>
              </figure>

      <t>Anonymisation parameters must also be available to the Exporting
      Process and/or File Writer in order to ensure header data is also
      appropriately anonymised as in <xref target="header-anon"></xref>.</t>

      <t>Following each of the data flows through the IAP, we describe
      five basic types of anonymisation arrangements within this framework in
      <xref target="iap-arrangements"></xref>. In addition to the three arrangements
      described in detail above, anonymisation can also be done at a
      collocated Metering Process and File Writer (see section 7.3.2 of <xref target="RFC5655"></xref>), or at a file manipulator (see section
      7.3.7 of <xref target="RFC5655"></xref>).</t>

        <figure title="Possible anonymisation arrangements in the IPFIX architecture" anchor="iap-arrangements">
                <artwork><![CDATA[
         +----+  +-----+  +----+
 pkts -> | MP |->| IAP |->| EP |-> Anonymisation on Original Exporter
         +----+  +-----+  +----+
         +----+  +-----+  +----+
 pkts -> | MP |->| IAP |->| FW |-> Anonymising collocated MP/File Writer
         +----+  +-----+  +----+
         +----+  +-----+  +----+
IPFIX -> | CP |->| IAP |->| EP |-> Anonymising Mediator (Masq. Proxy)
         +----+  +-----+  +----+
         +----+  +-----+  +----+
IPFIX -> | CP |->| IAP |->| FW |-> Anonymising collocated CP/File Writer
         +----+  +-----+  +----+
         +----+  +-----+  +----+
IPFIX -> | FR |->| IAP |->| FW |-> Anonymising file manipulator
 File    +----+  +-----+  +----+
                ]]></artwork>
              </figure>

      <t>Note that anonymisation may occur at more than one location within a  
      given collection infrastructure, to provide varying levels of anonymisation, 
      disclosure risk, or data utility for specific purposes.</t>

    </section>
    
    <section title="IPFIX-Specific Anonymisation Guidelines" anchor="guidelines">

      <t>In implementing and deploying the anonymisation techniques described
      in this document, implementors should note that IPFIX already provides
      features that support anonymised data export, and use these where
      appropriate. Care must also be taken that data structures supporting the
      operation of the protocol itself do not leak data that could be used to
      reverse the anonymisation applied to the flow data. Such data structures
      may appear in the header, or within the data stream itself, especially
      as options data. Each of these and their impact on specific
      anonymisation techniques is noted in a separate subsection below.</t>

      <section title="Appropriate Use of Information Elements for Anonymised Data" anchor="iespec-anon">

        <t>Note, as in <xref target="aes-section"></xref> above, that
        black-marker anonymised fields SHOULD NOT be exported at all; the
        absence of the field in a given Data Set is implicitly declared by not
        including the corresponding Information Element in the Template
        describing that Data Set.</t>

        <t>When using precision degradation of timestamps, Exporting Processes
        SHOULD export timing information using Information Elements of an
        appropriate precision, as explained in Section 4.5 of <xref
        target="RFC5153"></xref>. For example, timestamps measured in
        millisecond-level precision and degraded to second-level precision
        should use flowStartSeconds and flowEndSeconds, not
        flowStartMilliseconds and flowEndMilliseconds.</t>
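
        <t>The following is a minimal sketch of this guideline, not a
        normative procedure. It assumes timestamps are held as milliseconds
        since the epoch and that degradation is done by truncation; the
        function name degrade_to_seconds is hypothetical. The point is only
        that the degraded values are labelled with the second-precision
        Information Elements.</t>

        <figure>
          <artwork><![CDATA[
```python
def degrade_to_seconds(flow_start_ms: int, flow_end_ms: int) -> dict:
    """Truncate millisecond timestamps to whole seconds.

    Truncation (rather than rounding) is an assumption of this sketch.
    The keys name the Information Elements that match the degraded
    precision, so no millisecond-typed element is exported.
    """
    return {
        "flowStartSeconds": flow_start_ms // 1000,
        "flowEndSeconds": flow_end_ms // 1000,
    }


record = degrade_to_seconds(1271227681123, 1271227682987)
print(record)
```
          ]]></artwork>
        </figure>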

        <t>When exporting anonymised data and anonymisation metadata,
        Exporting Processes SHOULD ensure that each Information Element and
        its declared anonymisation technique are compatible.
        Specifically, the applicable and recommended Information Element types
        and semantics for each technique are noted in the description of the
        anonymisationTechnique Information Element in <xref
        target="ie-at-section"></xref>. In this description, a timestamp is an
        Information Element with the data type dateTimeSeconds,
        dateTimeMilliseconds, dateTimeMicroseconds, or dateTimeNanoseconds; an
        address is an Information Element with the data type ipv4Address,
        ipv6Address, or macAddress; and an identifier is an Information
        Element with identifier data type semantics. Exporting Processes MUST
        NOT export Anonymisation Options records binding techniques to
        Information Elements to which they are not applicable, and SHOULD NOT
        export Anonymisation Options records binding techniques to Information
        Elements for which they are not recommended.</t>

      </section>
      
      <section title="Export of Perimeter-Based Anonymisation Policies" anchor="perimeter-anon">

          <t>Data collected from a single network may require different
          anonymisation policies for addresses internal and external to the
          network. For example, internal addresses could be subject to simple
          permutation, while external addresses could be aggregated into
          networks by truncation. When exporting anonymised perimeter
          bidirectional flow (biflow) data as in section 5.2 of <xref
          target="RFC5103"/>, this arrangement may be easily represented by
          specifying one technique for source endpoint information (which
          represents the external endpoint in a perimeter biflow) and one
          technique for destination endpoint information (which represents the
          internal address in a perimeter biflow).</t>

          <t>However, it can also be useful to represent perimeter-based
          anonymisation policies with unidirectional flow (uniflow), or
          non-perimeter biflow data. In this case, the Perimeter Anonymisation
          bit (bit 2) in the anonymisationFlags Information Element describing
          the anonymised address Information Elements can be set to change the
          meaning of "source" and "destination" of Information Elements to
          mean "external" and "internal" as with perimeter biflows, but only
          with respect to anonymisation policies.</t>
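
          <t>As an illustration only, a Collecting Process might interpret
          the flags along the following lines. The flag values are taken from
          the examples in this document (0x0004 for Perimeter Anonymisation,
          0x0001 and 0x0003 for the Session and Stable stability classes);
          treating the low two bits as the stability class is an assumption
          of this sketch, and the function names are hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
PERIMETER_BIT = 0x0004       # bit 2, Perimeter Anonymisation
STABILITY_MASK = 0x0003      # assumed: low two bits hold the stability class
STABILITY_NAMES = {0x0001: "Session", 0x0003: "Stable"}


def describe_flags(flags: int) -> dict:
    """Split an anonymisationFlags value into its perimeter and stability parts."""
    stability = flags & STABILITY_MASK
    return {
        "perimeter": bool(flags & PERIMETER_BIT),
        "stability": STABILITY_NAMES.get(stability, "other (%d)" % stability),
    }


def endpoint_role(ie_name: str, flags: int) -> str:
    """Return how an address IE is read for anonymisation policy purposes."""
    is_source = ie_name.startswith("source")
    if flags & PERIMETER_BIT:
        # With the bit set, "source" means external and "destination" internal.
        return "external" if is_source else "internal"
    return "source" if is_source else "destination"


print(describe_flags(0x0005), endpoint_role("sourceIPv4Address", 0x0005))
```
            ]]></artwork>
          </figure>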

      </section>

      <section title="Anonymisation of Header Data" anchor="header-anon">

        <t>Each IPFIX Message contains a Message Header. Two fields in this
        header may be used to break certain anonymisation techniques: the
        Export Time and the Observation Domain ID.</t>

        <t>When exporting IPFIX Messages containing anonymised timestamp data
        where the original Export Time header field has some relationship to
        the anonymised timestamps, Exporting Processes SHOULD anonymise the
        Export Time header field so that it is consistent with the anonymised
        timestamp data. Otherwise, relationships between export and flow time
        could be used to partially or totally reverse timestamp
        anonymisation. Anonymisation of timestamps and of the Export Time
        header field should take care to avoid times too far in the past or
        future; while <xref target="RFC5101"/> does not make any allowance
        for Export Time error detection, Collecting Processes may reasonably
        interpret Messages with seemingly nonsensical Export Times as
        erroneous. Specific limits are implementation-dependent, but
        anonymising the Export Time header field may therefore cause
        interoperability problems.</t>
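
        <t>One way to keep the Export Time consistent can be sketched as
        follows, under the assumption that flow timestamps are anonymised by
        a single constant shift; other timestamp techniques would need a
        different treatment. The shift value and the function name are
        hypothetical. The sketch applies the same shift to the Export Time,
        so the lag between flow time and export time is unchanged, and
        rejects results that do not fit the 32-bit header field.</t>

        <figure>
          <artwork><![CDATA[
```python
MAX_EXPORT_TIME = 0xFFFFFFFF  # Export Time is a 32-bit count of seconds


def shift_export_time(export_time: int, shift_seconds: int) -> int:
    """Apply the flow-timestamp shift to the Message Header Export Time."""
    shifted = export_time + shift_seconds
    if not 0 <= shifted <= MAX_EXPORT_TIME:
        raise ValueError("shifted Export Time does not fit the header field")
    return shifted


SHIFT = -86400  # hypothetical shift applied to every flow timestamp
flow_start = 1271227681 + SHIFT
export_time = shift_export_time(1271227717, SHIFT)
print(export_time - flow_start)  # lag between flow start and export
```
          ]]></artwork>
        </figure>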

        <t>The similarity in size between an Observation Domain ID and an IPv4
        address (32 bits) may lead to a temptation to use an IPv4 interface
        address on the Metering or Exporting Process as the Observation Domain
        ID. If this address bears some relation to the IP addresses in the
        flow data (e.g., shares a network prefix with internal addresses) and
        the IP addresses in the flow data are anonymised in a
        structure-preserving way, then the Observation Domain ID may be used
        to break the IP address anonymisation. Use of an IPv4 interface
        address on the Metering or Exporting Process as the Observation Domain
        ID is NOT RECOMMENDED in this case.</t>
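
        <t>A deployment could check for this condition with something like
        the following sketch, which reads the Observation Domain ID as if it
        were an IPv4 address and tests it against the internal prefixes. The
        function name and the prefix list are hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
import ipaddress


def odid_leaks_prefix(observation_domain_id: int, internal_networks) -> bool:
    """True if the ID, read as an IPv4 address, lies in an internal network."""
    as_address = ipaddress.IPv4Address(observation_domain_id)
    return any(
        as_address in ipaddress.ip_network(net) for net in internal_networks
    )


internal = ["198.51.100.0/24"]
suspicious_id = int(ipaddress.IPv4Address("198.51.100.1"))
print(odid_leaks_prefix(suspicious_id, internal))
print(odid_leaks_prefix(1, internal))
```
          ]]></artwork>
        </figure>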

      </section>

      <section title="Anonymisation of Options Data" anchor="opt-anon">

        <t>IPFIX uses the Options mechanism to export, among other things,
        metadata about exported flows and the flow collection infrastructure.
        As with the IPFIX Message Header, certain Options recommended in <xref
        target="RFC5101"></xref> and <xref target="RFC5655"></xref> containing
        flow timestamps and network addresses of Exporting and Collecting
        Processes may be used to break certain anonymisation techniques; care
        should be taken while using them with anonymised data export and
        storage.</t>

        <t>The Exporting Process Reliability Statistics Options Template,
        recommended in <xref target="RFC5101"></xref>, contains an Exporting
        Process ID field, which may be an exportingProcessIPv4Address
        Information Element or an exportingProcessIPv6Address Information
        Element. If the Exporting Process address bears some relation to the
        IP addresses in the flow data (e.g., shares a network prefix with
        internal addresses) and the IP addresses in the flow data are
        anonymised in a structure-preserving way, then the Exporting Process
        address may be used to break the IP address anonymisation. Exporting
        Processes exporting anonymised data in this situation SHOULD mitigate
        the risk of attack either by omitting Options described by the
        Exporting Process Reliability Statistics Options Template, or by
        anonymising the Exporting Process address using a similar technique to
        that used to anonymise the IP addresses in the exported data.</t>

        <t>Similarly, the Export Session Details Options Template and Message
        Details Options Template specified for the <xref
        target="RFC5655">IPFIX File Format</xref> may contain the
        exportingProcessIPv4Address Information Element or the
        exportingProcessIPv6Address Information Element to identify an
        Exporting Process from which a flow record was received, and the
        collectingProcessIPv4Address Information Element or the
        collectingProcessIPv6Address Information Element to identify the
        Collecting Process which received it. If the Exporting Process or
        Collecting Process address bears some relation to the IP addresses in
        the data set (e.g., shares a network prefix with internal addresses)
        and the IP addresses in the data set are anonymised in a
        structure-preserving way, then the Exporting Process or Collecting
        Process address may be used to break the IP address anonymisation.
        Since these Options Templates are primarily intended for storing IPFIX
        Transport Session data for auditing, replay, and testing purposes, it
        is NOT RECOMMENDED that storage of anonymised data include these
        Options Templates in order to mitigate the risk of attack.</t>

        <t>The Message Details Options Template specified for the <xref
        target="RFC5655">IPFIX File Format</xref> also contains the
        collectionTimeMilliseconds Information Element. As with the Export
        Time Message Header field, if the exported data set contains
        anonymised timestamp information, and the collectionTimeMilliseconds
        Information Element in a given Message has some relationship to the
        anonymised timestamp information, then this relationship can be
        exploited to reverse the timestamp anonymisation. Since this Options
        Template is primarily intended for storing IPFIX Transport Session
        data for auditing, replay, and testing purposes, it is NOT RECOMMENDED
        that storage of anonymised data include this Options Template in order
        to mitigate the risk of attack.</t>
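
        <t>As a rough aid, an implementation might screen Options Templates
        before writing anonymised data. The sketch below represents a
        template simply as a list of Information Element names, which is an
        assumption made for brevity; the names in the set are those used in
        this section, and the function name is hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
# Information Elements named in this section as able to undermine
# address or timestamp anonymisation when left unprotected.
RISKY_OPTION_IES = {
    "exportingProcessIPv4Address",
    "exportingProcessIPv6Address",
    "collectingProcessIPv4Address",
    "collectingProcessIPv6Address",
    "collectionTimeMilliseconds",
}


def risky_fields(template_ie_names):
    """Return the risky Information Elements present in a template."""
    return sorted(set(template_ie_names) & RISKY_OPTION_IES)


message_details = ["collectionTimeMilliseconds", "exportingProcessIPv4Address"]
print(risky_fields(message_details))
```
          ]]></artwork>
        </figure>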

        <t>Since the Time Window Options Template specified for the 
        <xref target="RFC5655">IPFIX File Format</xref> refers to the
        timestamps within the data set to provide partial table of contents
        information for an IPFIX File, care must be taken to ensure that
        Options described by this template are written using the anonymised
        timestamps instead of the original ones.</t>

      </section>

      <section title="Special-Use Address Space Considerations" anchor="sua-anon">

          <t>When anonymising data for transport or storage using IPFIX
          containing anonymised IP addresses, and the analysis purpose permits
          doing so, it is recommended to filter out or leave unanonymised data
          containing the special-use IPv4 addresses enumerated in <xref
          target="RFC5735"/> or the special-use IPv6 addresses enumerated in
          <xref target="RFC5156"/>. Data containing these addresses (e.g.
          0.0.0.0 and 169.254.0.0/16 for link-local autoconfiguration in IPv4
          space) are often associated with specific, well-known behavioral
          patterns. Detection of these patterns in anonymised data can lead to
          deanonymisation of these special-use addresses, which increases the
          chance of a complete reversal of anonymisation by an attacker,
          especially of prefix-preserving techniques.</t>
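
          <t>The filtering option can be sketched as below. The prefix list
          is deliberately partial and only illustrative; a real deployment
          would take the complete lists from the referenced documents.
          Records are represented as dictionaries with hypothetical keys
          "sip" and "dip".</t>

          <figure>
            <artwork><![CDATA[
```python
import ipaddress

# Partial, illustrative list of special-use IPv4 prefixes.
SPECIAL_USE_V4 = [
    ipaddress.ip_network(prefix)
    for prefix in (
        "0.0.0.0/8",       # "this" network
        "10.0.0.0/8",      # private use
        "127.0.0.0/8",     # loopback
        "169.254.0.0/16",  # link local
        "172.16.0.0/12",   # private use
        "192.168.0.0/16",  # private use
        "224.0.0.0/4",     # multicast
    )
]


def is_special_use(address: str) -> bool:
    """True if the address falls in one of the listed prefixes."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in SPECIAL_USE_V4)


def drop_special_use(records):
    """Keep only records with no special-use source or destination."""
    return [
        r for r in records
        if not (is_special_use(r["sip"]) or is_special_use(r["dip"]))
    ]
```
            ]]></artwork>
          </figure>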

      </section>

      <section title="Protecting Out-of-Band Configuration and Management Data">

       <t>Special care should be taken when exporting or sharing anonymised
       data to avoid information leakage via the configuration or management
       planes of the IPFIX Device containing the Exporting Process or the File
       Writer. For example, adding noise to counters is useless if the
       receiver can deduce the values in the counters from SNMP information,
       and concealing the network under test is similarly useless if such
       information is available in a configuration document. As the specifics
       of these concerns are largely implementation- and deployment-dependent,
       specific mitigation is out of scope for this draft. The general ground
       rule is that information of similar type to that anonymised should not
       be made available to the receiver by any means, whether in the Data
       Records, in IPFIX protocol structures such as Message Headers, or
       out-of-band.</t>

        </section>

    </section>
  </section>

  <section title="Examples">

      <t>In this example, consider the export or storage of an anonymised
      IPv4 data set from a single network described by a simple template
      containing a timestamp in seconds, a five-tuple, and packet and octet
      counters. The template describing each record in this data set is shown
      in figure <xref target="af-template"/>.</t>
      
         <figure title="Example Flow Template" anchor="af-template">
           <artwork><![CDATA[
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 2           |          Length =  40         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Template ID = 256        |        Field Count = 8        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| flowStartSeconds        150 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| sourceIPv4Address         8 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| destinationIPv4Address   12 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| sourceTransportPort       7 |       Field Length =  2       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| destinationTransportPort 11 |       Field Length =  2       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| packetDeltaCount          2 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| octetDeltaCount           1 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| protocolIdentifier        4 |       Field Length =  1       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
           ]]></artwork>
         </figure>
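
    <t>For readers who prefer code to bit diagrams, the Template Set above
    can be reproduced with the short sketch below. It is illustrative only;
    the function name is hypothetical. The field list is copied from the
    figure, and the computed Set Length agrees with the value of 40 shown
    there.</t>

    <figure>
      <artwork><![CDATA[
```python
import struct

# (Information Element ID, field length) pairs, in template order.
FLOW_TEMPLATE_FIELDS = [
    (150, 4),  # flowStartSeconds
    (8, 4),    # sourceIPv4Address
    (12, 4),   # destinationIPv4Address
    (7, 2),    # sourceTransportPort
    (11, 2),   # destinationTransportPort
    (2, 4),    # packetDeltaCount
    (1, 4),    # octetDeltaCount
    (4, 1),    # protocolIdentifier
]


def encode_template_set(template_id, fields):
    """Encode one Template Record inside a Template Set (Set ID 2)."""
    body = struct.pack("!HH", template_id, len(fields))
    for ie_id, length in fields:
        body += struct.pack("!HH", ie_id, length)
    # Set header: Set ID, then total length including the 4-byte header.
    return struct.pack("!HH", 2, 4 + len(body)) + body


template_set = encode_template_set(256, FLOW_TEMPLATE_FIELDS)
record_length = sum(length for _, length in FLOW_TEMPLATE_FIELDS)
print(len(template_set), record_length)
```
      ]]></artwork>
    </figure>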

    <t>Suppose that this data set is anonymised according to the following policy:</t>

      <list style="symbols">
          <t>IP addresses within the network are protected by reverse truncation.</t>
          <t>IP addresses outside the network are protected by prefix-preserving anonymisation.</t>
          <t>Octet counts are exported using degraded precision in order to provide minimal protection against fingerprinting attacks.</t>
          <t>All other fields are exported unanonymised.</t>
      </list>

    <t>To export Anonymisation Records for this template and policy, the
    Anonymisation Options Template shown in figure <xref
    target="anon-opt-template"/> is exported first. For this example, the
    optional privateEnterpriseNumber and informationElementIndex Information
    Elements are omitted, because they are not used.</t>

           <figure title="Example Anonymisation Options Template" anchor="anon-opt-template">
              <artwork><![CDATA[
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 3           |          Length =  26         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Template ID = 257        |        Field Count = 4        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Scope Field Count = 2      |0| templateID              145 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length = 2        |0| informationElementId    303 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length = 2        |0| anonymisationFlags     TBD1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length = 2        |0| anonymisationTechnique TBD2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length = 2        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
              ]]></artwork>
          </figure>
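
    <t>The Options Template Set above can be sketched in the same way. The
    Information Element numbers for anonymisationFlags and
    anonymisationTechnique are not yet assigned (TBD1 and TBD2), so the
    sketch takes them as arguments; the values passed in the usage lines are
    arbitrary placeholders, not assignments. The function name is
    hypothetical.</t>

    <figure>
      <artwork><![CDATA[
```python
import struct


def encode_anon_options_template(template_id, flags_ie, technique_ie):
    """Encode the Anonymisation Options Template in an Options Template Set."""
    fields = [
        (145, 2),           # templateID (scope)
        (303, 2),           # informationElementId (scope)
        (flags_ie, 2),      # anonymisationFlags, number supplied by caller
        (technique_ie, 2),  # anonymisationTechnique, number supplied by caller
    ]
    # Options Template Record header: Template ID, Field Count,
    # Scope Field Count (the first two fields are scope fields).
    body = struct.pack("!HHH", template_id, len(fields), 2)
    for ie_id, length in fields:
        body += struct.pack("!HH", ie_id, length)
    return struct.pack("!HH", 3, 4 + len(body)) + body


PLACEHOLDER_TBD1 = 1001  # placeholder only, NOT an assigned number
PLACEHOLDER_TBD2 = 1002  # placeholder only, NOT an assigned number
options_set = encode_anon_options_template(
    257, PLACEHOLDER_TBD1, PLACEHOLDER_TBD2
)
print(len(options_set))
```
      ]]></artwork>
    </figure>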

    <t>Following the Anonymisation Options Template comes a Data Set
    containing Anonymisation Records. This data set has an entry for each
    Information Element Specifier in Template 256 describing the flow records.
    This Data Set is shown in figure <xref target="anon-records"/>. Note that
    sourceIPv4Address and destinationIPv4Address have the Perimeter
    Anonymisation (0x0004) flag set in anonymisationFlags, meaning that the
    source address should be treated as network-external, and the destination
    address as network-internal.</t>


         <figure title="Example Anonymisation Records" anchor="anon-records">
     <artwork><![CDATA[
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 257         |          Length =  68         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Template 256         | flowStartSeconds       IE 150 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| no flags               0x0000 | Not Anonymised              1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Template 256         | sourceIPv4Address        IE 8 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Perimeter, Session SC  0x0005 | Structured Permutation      6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Template 256         | destinationIPv4Address  IE 12 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Perimeter, Stable      0x0007 | Reverse Truncation          7 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Template 256         | sourceTransportPort      IE 7 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| no flags               0x0000 | Not Anonymised              1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Template 256         | dest.TransportPort      IE 11 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| no flags               0x0000 | Not Anonymised              1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Template 256         | packetDeltaCount         IE 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| no flags               0x0000 | Not Anonymised              1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Template 256         | octetDeltaCount          IE 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Stable                 0x0003 | Precision Degradation       2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Template 256         | protocolIdentifier      IE 4  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| no flags               0x0000 | Not Anonymised              1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
           ]]></artwork>
         </figure>
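
    <t>As a cross-check of the figure above, the following sketch encodes
    the same eight Anonymisation Records. Each record is four 16-bit values
    (template ID, Information Element ID, flags, technique), copied from the
    figure; the function name is hypothetical.</t>

    <figure>
      <artwork><![CDATA[
```python
import struct

# (templateID, informationElementId, anonymisationFlags, technique)
ANON_RECORDS = [
    (256, 150, 0x0000, 1),  # flowStartSeconds: not anonymised
    (256, 8, 0x0005, 6),    # sourceIPv4Address: perimeter, session
    (256, 12, 0x0007, 7),   # destinationIPv4Address: perimeter, stable
    (256, 7, 0x0000, 1),    # sourceTransportPort: not anonymised
    (256, 11, 0x0000, 1),   # destinationTransportPort: not anonymised
    (256, 2, 0x0000, 1),    # packetDeltaCount: not anonymised
    (256, 1, 0x0003, 2),    # octetDeltaCount: stable, precision degradation
    (256, 4, 0x0000, 1),    # protocolIdentifier: not anonymised
]


def encode_anon_records(set_id, records):
    """Encode Anonymisation Records as a Data Set for the given Set ID."""
    body = b"".join(struct.pack("!HHHH", *rec) for rec in records)
    return struct.pack("!HH", set_id, 4 + len(body)) + body


anon_set = encode_anon_records(257, ANON_RECORDS)
print(len(anon_set))
```
      ]]></artwork>
    </figure>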

    <t>Following the Anonymisation Records come the data sets containing the
    anonymised data, exported according to the template in figure <xref
    target="af-template"/>. Bringing it all together, consider an IPFIX
    Message containing three real data records and the necessary templates to
    export them, shown in <xref target="af-complete-real"/>. (Note that the
    scale of this message is 8 bytes per line, for compactness; lines of dots
    '. . . . . ' represent shifting of the example bit structure for
    clarity.)</t>

         <figure title="Example Real Message" anchor="af-complete-real">
           <artwork><![CDATA[
           1         2         3         4         5         6
 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0x000a        | length 135    | export time 1271227717        | msg 
| sequence 0                    | domain 1                      | hdr
| SetID 2       | length 40     | tid 256       | fields 8      | tmpl
| IE 150        | length 4      | IE 8          | length 4      | set     
| IE 12         | length 4      | IE 7          | length 2      |      
| IE 11         | length 2      | IE 2          | length 4      |      
| IE 1          | length 4      | IE 4          | length 1      |      
| SetID 256     | length 79     | time 1271227681               | data
| sip 192.0.2.3                 | dip 198.51.100.7              | set
| sp 53         | dp 53         | packets 1                     |
| bytes 74                      | prt 17  | . . . . . . . . . . . 
| time 1271227682               | sip 198.51.100.7              |
| dip 192.0.2.88                | sp 5091       | dp 80         |
| packets 60                    | bytes 2896                    |
| prt 6   | . . . . . . . . . . . . . . . . . . . . . . . . . . .
| time 1271227683               | sip 198.51.100.7              |
| dip 203.0.113.9               | sp 5092       | dp 80         |
| packets 44                    | bytes 2037                    |
| prt 6   |
+---------+
            ]]></artwork>
         </figure>
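
    <t>The lengths in the message above can be checked with a short sketch
    that encodes the three data records in template order. The record values
    are copied from the figure; the function name is hypothetical, and the
    16-byte Message Header and 40-byte Template Set sizes are taken as
    given.</t>

    <figure>
      <artwork><![CDATA[
```python
import ipaddress
import struct

# (time, sip, dip, sp, dp, packets, bytes, protocol), as in the figure.
RECORDS = [
    (1271227681, "192.0.2.3", "198.51.100.7", 53, 53, 1, 74, 17),
    (1271227682, "198.51.100.7", "192.0.2.88", 5091, 80, 60, 2896, 6),
    (1271227683, "198.51.100.7", "203.0.113.9", 5092, 80, 44, 2037, 6),
]


def encode_data_set(set_id, records):
    """Encode flow records following the field order of Template 256."""
    body = b""
    for start, sip, dip, sp, dp, pkts, octets, proto in records:
        body += struct.pack(
            "!I4s4sHHIIB",
            start,
            ipaddress.IPv4Address(sip).packed,
            ipaddress.IPv4Address(dip).packed,
            sp, dp, pkts, octets, proto,
        )
    return struct.pack("!HH", set_id, 4 + len(body)) + body


data_set = encode_data_set(256, RECORDS)
MESSAGE_HEADER_LEN = 16
TEMPLATE_SET_LEN = 40
message_len = MESSAGE_HEADER_LEN + TEMPLATE_SET_LEN + len(data_set)
print(len(data_set), message_len)
```
      ]]></artwork>
    </figure>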

        <t>The corresponding anonymised message is then shown in <xref
        target="af-complete-anon"/>. The options template set describing
        Anonymisation Records and the Anonymisation Records themselves are
        added; IP addresses and byte counts are anonymised as declared.</t>

         <figure title="Corresponding Anonymised Message" anchor="af-complete-anon">
           <artwork><![CDATA[
           1         2         3         4         5         6
 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0x000a        | length 229    | export time 1271227717        | msg 
| sequence 0                    | domain 1                      | hdr
| SetID 2       | length 40     | tid 256       | fields 8      | tmpl
| IE 150        | length 4      | IE 8          | length 4      | set     
| IE 12         | length 4      | IE 7          | length 2      |      
| IE 11         | length 2      | IE 2          | length 4      |      
| IE 1          | length 4      | IE 4          | length 1      |      
| SetID 3       | length 26     | tid 257       | fields 4      | opt
| scope 2       | . . . . . . . . . . . . . . . . . . . . . . . . tmpl
| IE 145        | length 2      | IE 303        | length 2      | set
| IE TBD1       | length 2      | IE TBD2       | length 2      |
| SetID 257     | length 68     | . . . . . . . . . . . . . . . . anon
| tid 256       | IE 150        | flags 0       | tech 1        | recs
| tid 256       | IE 8          | flags 5       | tech 6        |
| tid 256       | IE 12         | flags 7       | tech 7        |
| tid 256       | IE 7          | flags 0       | tech 1        |
| tid 256       | IE 11         | flags 0       | tech 1        |
| tid 256       | IE 2          | flags 0       | tech 1        |
| tid 256       | IE 1          | flags 3       | tech 2        |
| tid 256       | IE 4          | flags 0       | tech 1        |
| SetID 256     | length 79     | time 1271227681               | data
| sip 254.202.119.209           | dip 0.0.0.7                   | set
| sp 53         | dp 53         | packets 1                     |
| bytes 100                     | prt 17  | . . . . . . . . . . . 
| time 1271227682               | sip 0.0.0.7                   |
| dip 254.202.119.6             | sp 5091       | dp 80         |
| packets 60                    | bytes 2900                    |
| prt 6   | . . . . . . . . . . . . . . . . . . . . . . . . . . .
| time 1271227683               | sip 0.0.0.7                   |
| dip 2.19.199.176              | sp 5092       | dp 80         |
| packets 44                    | bytes 2000                    |
| prt 6   |
+---------+
            ]]></artwork>
         </figure>
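
   <t>Two of the transformations visible in the anonymised message can be
   sketched as follows. Both parameter choices are assumptions inferred from
   the example values rather than stated requirements: reverse truncation is
   taken to keep only the low 8 bits of the address, and octet counts are
   taken to be rounded to the nearest 100. The function names are
   hypothetical. Prefix-preserving anonymisation of the external addresses
   is not shown.</t>

   <figure>
     <artwork><![CDATA[
```python
import ipaddress


def reverse_truncate(address: str, keep_bits: int = 8) -> str:
    """Zero the high-order bits, keeping only the low keep_bits bits."""
    mask = (1 << keep_bits) - 1
    value = int(ipaddress.IPv4Address(address)) & mask
    return str(ipaddress.IPv4Address(value))


def degrade_octets(count: int, step: int = 100) -> int:
    """Round a counter to the nearest multiple of step (halves round up)."""
    return (count + step // 2) // step * step


print(reverse_truncate("198.51.100.7"))
print([degrade_octets(n) for n in (74, 2896, 2037)])
```
     ]]></artwork>
   </figure>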

   </section>

  <section title="Security Considerations">

    <t>This document provides guidelines for exporting metadata about
    anonymised data in IPFIX, or storing metadata about anonymised data in
    IPFIX Files. It is not intended as a general statement on the
    applicability of specific flow data anonymisation techniques. Exporters or
    publishers of anonymised data must take care that the applied
    anonymisation technique is appropriate for the data source, the purpose,
    and the risk of deanonymisation of a given application.</t>

    <t>We note specifically that anonymisation is not a replacement for
    encryption for confidentiality. It is only appropriate for protecting
    identifying information in data to be used for purposes in which the
    protected data is irrelevant. Confidentiality in export is best served by
    using TLS or DTLS as in the Security Considerations section of <xref
    target="RFC5101"/>, and in long-term storage by implementation-specific
    protection applied as in the Security Considerations section of <xref
    target="RFC5655"/>. Indeed, confidentiality and anonymisation
    are not mutually exclusive, as encryption for confidentiality may be
    applied to anonymised data export or storage, as well, when the anonymised
    data is not intended for public release.</t>

    <t>When using pseudonymisation techniques that have a mutable mapping,
    there is an inherent tradeoff in the stability of the map between
    long-term comparability and security of the data set against
    deanonymisation. In general, deanonymisation attacks are more effective
    given more information, so the longer a given mapping is valid, the more
    information can be applied to deanonymisation. The specific details of
    this are technique-dependent and therefore out of the scope of this
    document.</t>

    <t>When releasing anonymised data, publishers need to ensure that data
    that could be used in deanonymisation is not leaked through the export
    protocol; guidelines for addressing this risk are provided in <xref
    target="guidelines"/>.</t>

    <t>Note as well that the Security Considerations section of <xref
    target="RFC5101"/> applies to the export of anonymised data, and
    the Security Considerations section of <xref
    target="RFC5655"/> applies to the storage of anonymised data and to the
    publication of anonymised traces.</t>

  </section>

  <section title="IANA Considerations">

    <t>This document specifies the creation of several new IPFIX Information
    Elements in the IPFIX Information Element registry located at
    http://www.iana.org/assignments/ipfix, as defined in <xref
    target="ie-section"></xref> above. IANA has assigned the following
    Information Element numbers for their respective Information Elements as
    specified below:</t>

    <list style="symbols">

      <t>Information Element number TBD1 for the 
      anonymisationFlags Information Element.</t>

      <t>Information Element number TBD2 for the anonymisationTechnique
      Information Element.</t>

      <t>Information Element number TBD3 for the informationElementIndex
      Information Element.</t>
    </list>

    <t>[NOTE for IANA: The text TBDn should be replaced with the respective
    assigned Information Element numbers where they appear in this
    document.]</t>

     </section>

  <section title="Acknowledgments">

    <t>We thank Paul Aitken and John McHugh for their comments and insight,
    and Carsten Schmoll, Benoit Claise, and Lothar Braun for their reviews.
    Special thanks to the ICT-PRISM project for its material support of this
    work.</t>

  </section>

</middle>   

<back>

  <references title="Normative References">
	<?rfc include="reference.RFC.5101" ?>
	<?rfc include="reference.RFC.5102" ?>
	<?rfc include="reference.RFC.5103" ?>
	<?rfc include="reference.RFC.5655" ?>
	<?rfc include="reference.RFC.2119" ?>
	<?rfc include="reference.RFC.5735" ?>
	<?rfc include="reference.RFC.5156" ?>
  </references>

  <references title="Informative References">
	<?rfc include="reference.RFC.5470" ?>
	<?rfc include="reference.RFC.5472" ?>
    <?rfc include="reference.I-D.ietf-ipfix-mediators-framework" ?>
    <?rfc include="reference.I-D.ietf-ipfix-export-per-sctp-stream" ?>
	<?rfc include="reference.RFC.5153" ?>
	<?rfc include="reference.RFC.3917" ?>
	<?rfc include="reference.RFC.4291" ?>
<!--    
    <reference anchor='cryptopan'>
      <front>
        <title>Prefix-Preserving IP Address Anonymisation</title>
        <author initials='J' surname='Fan' fullname='Jinliang Fan'>
          <organization />
        </author>
        <author initials='J' surname='Xu' fullname='Jun Xu'>
          <organization />
        </author>
        <author initials='M' surname='Ammar' fullname='Mostafa H. Ammar'>
          <organization />
        </author>
        <author initials='S' surname='Moon' fullname='Sue B. Moon'>
          <organization />
        </author>
        <date month='October' day='7' year='2004' />
        <abstract/>
      </front>

      <seriesInfo name='' value='Computer Networks, Volume 46, Issue 2, Pages 253-272, Elsevier'/>
    </reference>
-->
  </references>

</back>
</rfc>
