<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY draftIpfixArch PUBLIC "" "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-arch.xml">
<!ENTITY draftIpfixAs PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-as.xml'>
<!ENTITY draftIpfixArchitecture PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-architecture.xml'>
<!ENTITY draftIpfixRR PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-reducing-redundancy.xml'>
<!ENTITY rfc3917 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3917.xml'>
<!ENTITY rfc5101 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5101.xml">
<!ENTITY rfc5102 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5102.xml">
<!ENTITY rfc2119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">

] >
<rfc ipr="full3978" category="exp" docName="draft-boschi-ipfix-anon-00.txt">
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>

<front>
  <title abbrev="IP Flow Anonymisation Support">
    IP Flow Anonymisation Support 
  </title>
  <author initials="E." surname="Boschi" fullname="Elisa Boschi">
    <organization abbrev="Hitachi Europe">
      Hitachi Europe 
    </organization>
    <address>
      <postal>
        <street>c/o ETH Zurich</street>
        <street>Gloriastrasse 35</street>
        <city>8092 Zurich</city>
        <country>Switzerland</country>
      </postal>
      <phone>+41 44 632 70 57</phone>
      <email>elisa.boschi@hitachi-eu.com</email>
    </address>
  </author>
  <author initials="B." surname="Trammell" fullname="Brian Trammell">
    <organization abbrev="Hitachi Europe">
      Hitachi Europe 
    </organization>
    <address>
      <postal>
        <street>c/o ETH Zurich</street>
        <street>Gloriastrasse 35</street>
        <city>8092 Zurich</city>
        <country>Switzerland</country>
      </postal>
      <phone>+41 44 632 70 13</phone>
      <email>trammell@tik.ee.ethz.ch</email>
    </address>
  </author>
  <date month="July" day="7" year="2008"></date>
  <area>Operations</area>
  <workgroup>IPFIX Working Group</workgroup>
  <abstract> 
    <t>This document describes anonymisation techniques for IP flow data. It
    provides a categorisation of common anonymisation schemes and defines the
    parameters needed to describe them. It describes support for anonymisation
    within the IPFIX protocol, providing the basis for the definition of
    information models for configuring anonymisation techniques within an
    IPFIX Metering or Exporting Process, and for reporting the technique in
    use to an IPFIX Collecting Process.</t>
  </abstract>
</front>

<middle>

  <section title="Introduction">

    <t>The standardisation of an IP flow information export protocol <xref
    target="RFC5101"></xref> and associated representations removes a
    technical barrier to the sharing of IP flow data across organizational
    boundaries and with network operations, security, and research communities
    for a wide variety of purposes. However, with wider dissemination come
    greater risks to the privacy of the users of networks under measurement,
    and to the security of those networks. While it is not a complete solution
    to the issues posed by distribution of IP flow information, anonymisation
    is an important tool for the protection of privacy within network
    measurement infrastructures. Additionally, various jurisdictions define
    data protection laws and regulations that flow measurement activities must
    comply with, and anonymisation may be a part of such compliance [IMC07,
    FloCon08].</t>

    <t>This document presents a mechanism for representing anonymised data
    within IPFIX and guidelines for using it. It begins with a categorisation
    of anonymisation techniques. It then describes applicability of each
    technique to commonly anonymisable fields of IP flow data, organized by
    information element data type and semantics as in <xref
    target="RFC5102"></xref>; enumerates the parameters required by each of
    the applicable anonymisation techniques; and provides guidelines for the
    use of each of these techniques in accordance with best practices in data
    protection. Finally, it specifies a mechanism for exporting anonymised
    data and binding anonymisation metadata to templates using IPFIX
    Options.</t>

    <section title="IPFIX Protocol Overview">

      <t>In the IPFIX protocol, { type, length, value } tuples are expressed
      in Templates containing { type, length } pairs, which specify the {
      value } fields present in Data Records conforming to the Template. This
      gives great flexibility as to what data is transmitted.</t>

      <t>Since Templates are sent very infrequently compared with Data
      Records, this results in significant bandwidth savings.</t>

      <t>Different Data Records may be transmitted simply by sending new
      Templates specifying the { type, length } pairs for the new data format.
      See <xref target="RFC5101"></xref> for more information.</t>
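
      <t>As a non-normative illustration of the Template mechanism described
      above, the following Python sketch encodes a Template Record and a
      matching Data Record. The Template ID of 256, the choice of the three
      Information Elements, and the helper function names are assumptions
      made for this example; Set and Message headers are omitted for
      brevity.</t>

      <figure><artwork><![CDATA[
```python
import struct

# Hypothetical Template ID (IDs below 256 are reserved for Sets).
TEMPLATE_ID = 256

# (type, length) pairs; the numbers are the Information Element
# identifiers of the elements named in the comments.
FIELDS = [
    (8, 4),   # sourceIPv4Address
    (12, 4),  # destinationIPv4Address
    (1, 8),   # octetDeltaCount
]


def encode_template_record(template_id, fields):
    """Template ID, field count, then one (type, length) pair each."""
    out = struct.pack("!HH", template_id, len(fields))
    for ie_id, length in fields:
        out += struct.pack("!HH", ie_id, length)
    return out


def encode_data_record(fields, values):
    """Values only, in Template order; no type or length on the wire."""
    parts = []
    for (_, length), value in zip(fields, values):
        parts.append(value.to_bytes(length, "big"))
    return b"".join(parts)


def decode_data_record(fields, data):
    """Split a Data Record using only the lengths from the Template."""
    values, offset = [], 0
    for _, length in fields:
        chunk = data[offset:offset + length]
        values.append(int.from_bytes(chunk, "big"))
        offset += length
    return values


VALUES = [0xC0000201, 0xC6336407, 1500]  # 192.0.2.1, 198.51.100.7
template = encode_template_record(TEMPLATE_ID, FIELDS)
record = encode_data_record(FIELDS, VALUES)
```
]]></artwork></figure>

      <t>In this sketch the Data Record carries no type or length
      information at all, so a receiver can only interpret it with the
      corresponding Template at hand.</t>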

      <t>The <xref target="RFC5102">IPFIX information model</xref> defines a
      large number of standard Information Elements which provide the
      necessary { type } information for Templates.</t>

      <t>The use of standard elements enables interoperability among different
      vendors' implementations. Additionally, non-standard enterprise-specific
      elements may be defined for private use.</t>
      
    </section>

    <section title="IPFIX Documents Overview" anchor="intro-docs">

      <t><xref target="RFC5101">"Specification of the IPFIX
      Protocol for the Exchange of IP Traffic Flow Information"</xref>
      (informally, the IPFIX Protocol document) and its associated documents
      define the IPFIX Protocol, which provides network engineers and
      administrators with access to IP traffic flow information.</t>

      <t><xref target="I-D.ietf-ipfix-arch">"Architecture for IP Flow
      Information Export"</xref> (the IPFIX Architecture document) defines
      the architecture for the export of measured IP flow information out of
      an IPFIX Exporting Process to an IPFIX Collecting Process, and the
      basic terminology used to describe the elements of this architecture,
      per the requirements defined in <xref target="RFC3917">"Requirements
      for IP Flow Information Export"</xref>. The IPFIX Protocol document
      <xref target="RFC5101"></xref> then covers the details of the method for
      transporting IPFIX Data Records and Templates via a congestion-aware
      transport protocol from an IPFIX Exporting Process to an IPFIX
      Collecting Process.</t>

      <t><xref target="RFC5102">"Information Model for IP Flow
      Information Export"</xref> (informally, the IPFIX Information Model
      document) describes the Information Elements used by IPFIX, including
      details on Information Element naming, numbering, and data type
      encoding. Finally, <xref target="I-D.ietf-ipfix-as">"IPFIX
      Applicability"</xref> describes the various applications of the IPFIX
      protocol and their use of information exported via IPFIX, and relates
      the IPFIX architecture to other measurement architectures and
      frameworks.</t>

      <t>This document references the Protocol and Architecture documents for
      terminology and extends the IPFIX Information Model to provide new
      Information Elements for anonymisation metadata.</t>

    </section>

  </section>

  <section title="Terminology">

    <t>The terminology used in this document is fully aligned with the
    terminology defined in <xref target="RFC5101"></xref>. Therefore, the
    terms defined in the IPFIX terminology are capitalized in this document,
    as in other IPFIX documents (<xref target="RFC5101"></xref>, <xref
    target="RFC5102"></xref>).</t>

    <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
    NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
    in this document are to be interpreted as described
    in <xref target="RFC2119">RFC 2119</xref>.</t>

  </section>

  <section title="Categorisation of Anonymisation Techniques">

    <t>Anonymisation modifies a data set in order to
    protect the identity of the people or entities described by the data set
    from disclosure. With respect to network traffic data, anonymisation
    generally attempts to preserve some set of properties of the network
    traffic useful for a given application or applications, while ensuring the
    data cannot be traced back to the specific networks, hosts, or users
    generating the traffic.</t>

    <t>Anonymisation may be broadly split into three categories:
    generalisation, reversible substitution, and irreversible substitution.
    When generalisation is used, identifying information is grouped into sets,
    and a single value is used to represent every element of each set. Note
    that this may cause multiple records to become indistinguishable, thereby
    aggregating them into a single record. Generalisation is an irreversible
    operation, in that the information needed to identify a single record from
    its "generalised value" is lost.</t>

    <t>Substitution (or pseudonymisation) substitutes a false identifier for a
    real one, and can be reversible or irreversible. Reversible substitution
    uses an invertible or otherwise reversible function, so that the real
    identifier may be recovered later. Irreversible substitution, by contrast,
    uses a one-way or randomising function, so that the real identifier cannot
    be recovered.</t>
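
    <t>The difference between the two kinds of substitution can be sketched
    as follows. This is a non-normative Python illustration: the class and
    function names, the 16-character pseudonym length, and the use of
    HMAC-SHA256 as the one-way function are assumptions of this example, not
    techniques specified by this document.</t>

    <figure><artwork><![CDATA[
```python
import hashlib
import hmac
import secrets


class ReversibleSubstitution:
    """Random pseudonyms plus a retained table that inverts them."""

    def __init__(self):
        self._forward = {}
        self._reverse = {}

    def anonymise(self, real):
        if real not in self._forward:
            pseudonym = secrets.token_hex(8)
            while pseudonym in self._reverse:  # avoid collisions
                pseudonym = secrets.token_hex(8)
            self._forward[real] = pseudonym
            self._reverse[pseudonym] = real
        return self._forward[real]

    def recover(self, pseudonym):
        """Only possible for whoever holds the mapping table."""
        return self._reverse[pseudonym]


def irreversible_substitution(real, key):
    """Keyed one-way mapping (illustrative choice: HMAC-SHA256).

    For small identifier spaces such as IPv4 addresses this is only
    one-way while the key stays secret or is discarded after use.
    """
    mac = hmac.new(key, real.encode(), hashlib.sha256)
    return mac.hexdigest()[:16]
```
]]></artwork></figure>

    <t>In the reversible variant the mapping table itself becomes sensitive
    data, since anyone holding it can recover the real identifiers.</t>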

    <t>While anonymisation is generally applied at the resolution of single
    fields within a record, attacks against anonymisation use entire records
    and relationships between records within a data set. Therefore, fields
    which may not necessarily be identifying by themselves may be anonymised
    in order to increase the anonymity of the data set as a whole.</t>

  </section>

  <section title="Anonymisation of IP Flow Data">

    <t>Due to the restricted semantics of IP flow data, there is a relatively
    limited set of specific anonymisation techniques applicable to flow data,
    though each falls into one of the broad categories above. Each type of
    field that commonly appears in a flow record may have its own applicable
    techniques.</t>

    <t>Of all the fields in an IP flow record, the most attention in the
    literature has been paid to IP addresses [TODO: cite]. IP addresses are
    structured identifiers; that is, IP address prefixes may be used to
    identify networks just as full IP addresses identify hosts. This leads
    to the application of prefix-preserving anonymisation of IP address
    information [TODO: cite]. Prefix-preserving anonymisation is a (generally
    irreversible) substitution technique which has the additional property
    that the structure of the IP address space is maintained in the anonymised
    data.</t>
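
    <t>The prefix-preserving property can be illustrated with the following
    non-normative Python sketch, in which each output bit is the input bit
    XORed with a pseudorandom bit derived from the preceding input bits. The
    use of HMAC-SHA256 as the pseudorandom function, the example key, and
    all function names are assumptions of this example; it is not intended
    as a vetted anonymisation scheme.</t>

    <figure><artwork><![CDATA[
```python
import hashlib
import hmac
import ipaddress

KEY = b"example-key"  # hypothetical secret key


def _flip_bit(key, prefix_bits):
    """One pseudorandom bit that depends only on the preceding bits."""
    mac = hmac.new(key, prefix_bits.encode(), hashlib.sha256)
    return mac.digest()[0] & 1


def anonymise_ipv4(address, key):
    """Flip each bit according to the bits that come before it."""
    bits = format(int(ipaddress.IPv4Address(address)), "032b")
    out = []
    for i, bit in enumerate(bits):
        out.append(str(int(bit) ^ _flip_bit(key, bits[:i])))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))


def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv4 addresses."""
    x = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - x.bit_length()
```
]]></artwork></figure>

    <t>Because the pseudorandom bit for position i depends only on the first
    i bits of the input, two addresses that share a k-bit prefix map to
    outputs that share exactly a k-bit prefix.</t>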

    <t>While not identifiers in and of themselves, timestamps are vulnerable
    to fingerprinting attacks, wherein relationships between the start and end
    timestamps of flows within a data set can be used to identify hosts or
    networks [TODO: cite]. Therefore, a variety of anonymisation techniques
    are available, including loss of precision (a form of generalisation), or
    noise addition (substitution), which may or may not preserve the
    sequencing of flows in the data set.</t>
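
    <t>The two timestamp techniques named above can be sketched as follows.
    This non-normative Python example assumes millisecond timestamps held as
    integers; the function names, the uniform noise distribution, and the
    rank-based method used to keep flow sequencing are assumptions of this
    example.</t>

    <figure><artwork><![CDATA[
```python
import random


def reduce_precision(ts_ms, granularity_ms):
    """Generalisation: round down to a multiple of the granularity."""
    return ts_ms - (ts_ms % granularity_ms)


def add_noise(ts_ms, max_shift_ms, rng):
    """Substitution: shift by a uniform random offset.

    Applied independently per flow, this may reorder flows.
    """
    return ts_ms + rng.randint(-max_shift_ms, max_shift_ms)


def add_noise_keep_order(timestamps_ms, max_shift_ms, rng):
    """Noise addition that preserves the sequencing of flows.

    The noisy values are sorted and handed out by original rank, so
    the earliest flow receives the earliest noisy timestamp.
    """
    count = len(timestamps_ms)
    order = sorted(range(count), key=lambda i: timestamps_ms[i])
    noisy = sorted(
        add_noise(t, max_shift_ms, rng) for t in timestamps_ms
    )
    result = [0] * count
    for rank, index in enumerate(order):
        result[index] = noisy[rank]
    return result
```
]]></artwork></figure>

    <t>For example, reducing precision to one minute maps every flow start
    time within the same minute to a single value, at the cost of losing
    the ordering of flows within that minute.</t>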

    <t>Counters and other flow values can also be used to break anonymisation
    in fingerprinting attacks, so the same techniques, precision loss and
    noise addition, are available for these fields as well.</t>

    <t>Of course, the simplest form of anonymisation and the most extreme form
    of generalisation is black-marker anonymisation, or full deletion of a
    field from each record of the flow data. The black-marker technique is
    applicable to any type of field in a flow record.</t>

    <t>[TODO: This section is incomplete; the set of techniques should be more
    exhaustive.]</t>

    <section title="IP Address Anonymisation">

      <t>The following table gives an overview of the schemes for IP address
      anonymisation described in this document and their categorisation.</t>

      <texttable> <ttcol align="left">Scheme</ttcol> <ttcol
      align="left">Action</ttcol> <ttcol align="left">Reversibility</ttcol>
      <c>Truncation</c><c>Generalisation</c><c>N</c>
      <c>Scrambling</c><c>Substitution</c><c>Y</c>
      <c>Prefix-preserving</c><c>Substitution</c><c>Y</c> 
      <c>Random noise addition</c><c>Substitution</c><c>N</c>
      </texttable>
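
      <t>As a non-normative illustration of the first row of the table, the
      following Python sketch truncates IPv4 addresses to a fixed number of
      leading bits and shows how hosts then collapse into a single
      generalised value. The function names and the choice of a 24-bit
      prefix are assumptions of this example.</t>

      <figure><artwork><![CDATA[
```python
import ipaddress
from collections import Counter


def truncate_ipv4(address, keep_bits):
    """Zero all but the first keep_bits bits of an IPv4 address."""
    network = ipaddress.ip_network(
        f"{address}/{keep_bits}", strict=False
    )
    return str(network.network_address)


def count_truncated(addresses, keep_bits):
    """Count how many addresses share each truncated value."""
    return Counter(truncate_ipv4(a, keep_bits) for a in addresses)
```
]]></artwork></figure>

      <t>Once truncated, the discarded low-order bits cannot be recovered
      from the result, which is why truncation is listed as not
      reversible.</t>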

      <t>[TODO: This section is incomplete; text here should expand
      on the table.]</t>

    </section>

    <section title="Timestamp Anonymisation">

      <t>[TODO: as section 4.1]</t>

      <t>[EDITOR'S NOTE: Counters might go here, since they are subject to the
      same techniques for largely the same reasons.]</t>

    </section>

    <section title="Anonymisation of Other Flow Fields">

      <t>[TODO: as section 4.1]</t>

      <t>[EDITOR'S NOTE: Port Numbers go here. Counters might, if not above.
      It might make sense to split this into flow key anonymisation versus
      flow value anonymisation.]</t>

    </section>

  </section>

  <section title="Parameters for the Description of Anonymisation Techniques"> 

    <t>[TODO: see corresponding section of draft-ietf-psamp-sample-tech for
    the proposed structure of this section.] </t>

  </section> 

  <section title="Anonymisation Support in IPFIX">

    <t>[TODO: Here we'll describe how the information specified above can be
    transmitted on the wire using an option template. The idea is to scope the
    option to the Template ID and for each field specify which are anonymised,
    providing info on the output characteristics of the technique, and which
    ones aren't.]</t>

    <t>[EDITOR'S NOTE: Multiple anonymisation techniques applied to an IE at
    the same time are indicated by multiple elements of the same type (in
    application order, as in PSAMP).]</t>

    <t>[EDITOR'S NOTE: For black-marker anonymisation, we will recommend not
    exporting the information at all, following the data protection law
    principle that only necessary information should be exported.]</t>


  </section>

  <section title="Security Considerations">

    <t>[TODO: write this section.]</t>

   </section>
  
  <section title="IANA Considerations">
    <t>This document contains no actions for IANA.</t>
  </section>

</middle>   

<back>

  <references title="Normative References">
    &rfc5101;
    &rfc5102;
  </references>

  <references title="Informative References">
    &draftIpfixArch;
    &draftIpfixAs;
    &draftIpfixArchitecture;
    &draftIpfixRR;     
    &rfc3917;
    &rfc2119; 
  </references>

</back>
</rfc>
