<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY draftIpfixArch PUBLIC "" "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-arch.xml">
<!ENTITY draftIpfixAs PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-as.xml'>
<!ENTITY draftIpfixArchitecture PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-architecture.xml'>
<!ENTITY draftIpfixRR PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-reducing-redundancy.xml'>
<!ENTITY rfc3917 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3917.xml'>
<!ENTITY rfc5101 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5101.xml">
<!ENTITY rfc5102 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5102.xml">
<!ENTITY rfc2119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
] >
<rfc ipr="full3978" category="exp" docName="draft-boschi-ipfix-anon-00.txt">
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<front>
<title abbrev="IP Flow Anonymisation Support">
IP Flow Anonymisation Support
</title>
<author initials="E." surname="Boschi" fullname="Elisa Boschi">
<organization abbrev="Hitachi Europe">
Hitachi Europe
</organization>
<address>
<postal>
<street>c/o ETH Zurich</street>
<street>Gloriastrasse 35</street>
<city>8092 Zurich</city>
<country>Switzerland</country>
</postal>
<phone>+41 44 632 70 57</phone>
<email>elisa.boschi@hitachi-eu.com</email>
</address>
</author>
<author initials="B." surname="Trammell" fullname="Brian Trammell">
<organization abbrev="Hitachi Europe">
Hitachi Europe
</organization>
<address>
<postal>
<street>c/o ETH Zurich</street>
<street>Gloriastrasse 35</street>
<city>8092 Zurich</city>
<country>Switzerland</country>
</postal>
<phone>+41 44 632 70 13</phone>
<email>trammell@tik.ee.ethz.ch</email>
</address>
</author>
<date month="July" day="7" year="2008"></date>
<area>Operations</area>
<workgroup>IPFIX Working Group</workgroup>
<abstract>
<t>This document describes anonymisation techniques for IP flow data. It
provides a categorization of common anonymisation schemes and defines the
parameters needed to describe them. It describes support for anonymisation
within the IPFIX protocol, providing the basis for the definition of
information models for configuring anonymisation techniques within an
IPFIX Metering or Exporting Process, and for reporting the technique in
use to an IPFIX Collecting Process.</t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>The standardisation of an IP flow information export protocol <xref
target="RFC5101"></xref> and associated representations removes a
technical barrier to the sharing of IP flow data across organizational
boundaries and with network operations, security, and research communities
for a wide variety of purposes. However, with wider dissemination come
greater risks to the privacy of the users of networks under measurement,
and to the security of those networks. While it is not a complete solution
to the issues posed by distribution of IP flow information, anonymisation
is an important tool for the protection of privacy within network
measurement infrastructures. Additionally, various jurisdictions define
data protection laws and regulations that flow measurement activities must
comply with, and anonymisation may be a part of such compliance [IMC07,
FloCon08].</t>
<t>This document presents a mechanism for representing anonymised data
within IPFIX and guidelines for using it. It begins with a categorization
of anonymisation techniques. It then describes applicability of each
technique to commonly anonymisable fields of IP flow data, organized by
information element data type and semantics as in <xref
target="RFC5102"></xref>; enumerates the parameters required by each of
the applicable anonymisation techniques; and provides guidelines for the
use of each of these techniques in accordance with best practices in data
protection. Finally, it specifies a mechanism for exporting anonymised
data and binding anonymisation metadata to templates using IPFIX
Options.</t>
<section title="IPFIX Protocol Overview">
<t>In the IPFIX protocol, { type, length, value } tuples are expressed
in Templates containing { type, length } pairs, which specify the { value
} fields present in Data Records conforming to the Template. This gives
great flexibility as to what data is transmitted.</t>
<t>Since Templates are sent very infrequently compared with Data
Records, this results in significant bandwidth savings.</t>
<t>Different Data Records may be transmitted simply by sending new
Templates specifying the { type, length } pairs for the new data format.
See <xref target="RFC5101"></xref> for more information.</t>
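<t>As an illustration of the Template mechanism described above, the
following Python sketch decodes one Data Record using an ordered list of
{ type, length } pairs. It is a simplification and not the wire format of
the IPFIX Protocol document: Message and Set headers, variable-length
fields, and enterprise-specific elements are omitted, and the names
TEMPLATE and decode_record are hypothetical. The Information Element
numbers are those assigned to sourceIPv4Address, destinationIPv4Address,
and octetDeltaCount in the IPFIX information model.</t>
<figure><artwork><![CDATA[
```python
import struct

# Information Element IDs from the IPFIX information model.
SOURCE_IPV4_ADDRESS = 8
DESTINATION_IPV4_ADDRESS = 12
OCTET_DELTA_COUNT = 1

# A Template is an ordered list of (type, length) pairs; it carries
# no values. (Hypothetical in-memory form, not the wire encoding.)
TEMPLATE = [
    (SOURCE_IPV4_ADDRESS, 4),
    (DESTINATION_IPV4_ADDRESS, 4),
    (OCTET_DELTA_COUNT, 8),
]

def decode_record(template, data):
    """Split one Data Record into {type: raw value bytes}."""
    fields, offset = {}, 0
    for ie_id, length in template:
        fields[ie_id] = data[offset:offset + length]
        offset += length
    if offset != len(data):
        raise ValueError("record length does not match template")
    return fields

# One Data Record: only values are present, in Template order.
record = (bytes([192, 0, 2, 1]) + bytes([198, 51, 100, 7])
          + struct.pack("!Q", 1500))
decoded = decode_record(TEMPLATE, record)
```
]]></artwork></figure>
<t>Because the record carries only values, its 16 octets are meaningful
only to a receiver that already holds the corresponding Template.</t>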
<t>The <xref target="RFC5102">IPFIX information model</xref> defines a
large number of standard Information Elements which provide the
necessary { type } information for Templates.</t>
<t>The use of standard elements enables interoperability among different
vendors' implementations. Additionally, non-standard enterprise-specific
elements may be defined for private use.</t>
</section>
<section title="IPFIX Documents Overview" anchor="intro-docs">
<t><xref target="RFC5101">"Specification of the IPFIX
Protocol for the Exchange of IP Traffic Flow Information"</xref>
(informally, the IPFIX Protocol document) and its associated documents
define the IPFIX Protocol, which provides network engineers and
administrators with access to IP traffic flow information.</t>
<t><xref target="I-D.ietf-ipfix-arch">"Architecture for IP Flow
Information Export"</xref> (the IPFIX Architecture document) defines
the architecture for the export of measured IP flow information out of
an IPFIX Exporting Process to an IPFIX Collecting Process, and the
basic terminology used to describe the elements of this architecture,
per the requirements defined in <xref target="RFC3917">"Requirements
for IP Flow Information Export"</xref>. The IPFIX Protocol document
<xref target="RFC5101"></xref> then covers the details of the method for
transporting IPFIX Data Records and Templates via a congestion-aware
transport protocol from an IPFIX Exporting Process to an IPFIX
Collecting Process.</t>
<t><xref target="RFC5102">"Information Model for IP Flow
Information Export"</xref> (informally, the IPFIX Information Model
document) describes the Information Elements used by IPFIX, including
details on Information Element naming, numbering, and data type
encoding. Finally, <xref target="I-D.ietf-ipfix-as">"IPFIX
Applicability"</xref> describes the various applications of the IPFIX
protocol and their use of information exported via IPFIX, and relates
the IPFIX architecture to other measurement architectures and
frameworks.</t>
<t>This document references the Protocol and Architecture documents for
terminology and extends the IPFIX Information Model to provide new
Information Elements for anonymisation metadata.</t>
</section>
</section>
<section title="Terminology">
<t>The terminology used in this document is fully aligned with the
terminology defined in <xref target="RFC5101"></xref>. Therefore, the
terms defined in the IPFIX terminology are capitalized in this document,
as in other IPFIX drafts (<xref target="RFC5101"></xref>, <xref
target="RFC5102"></xref>).</t>
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
in this document are to be interpreted as described
in <xref target="RFC2119">RFC 2119</xref>.</t>
</section>
<section title="Categorisation of Anonymisation Techniques">
<t>Anonymisation modifies a data set in order to
protect the identity of the people or entities described by the data set
from disclosure. With respect to network traffic data, anonymisation
generally attempts to preserve some set of properties of the network
traffic useful for a given application or applications, while ensuring the
data cannot be traced back to the specific networks, hosts, or users
generating the traffic.</t>
<t>Anonymisation techniques may be broadly split into three categories:
generalisation, reversible substitution, and irreversible substitution.
When generalisation is used, identifying information is grouped into sets,
and a single value is used to represent every element of each set. Note
that this may cause multiple records to become indistinguishable, thereby
aggregating them into a single record. Generalisation is an irreversible
operation, in that the information needed to identify a single record from
its "generalised value" is lost.</t>
<t>Substitution (or pseudonymisation) substitutes a false identifier for a
real one, and can be reversible or irreversible. Reversible substitution
uses an invertible or otherwise reversible function, so that the real
identifier may be recovered later. Irreversible substitution, by contrast,
uses a one-way or randomising function, so that the real identifier cannot
be recovered.</t>
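<t>The difference between the two kinds of substitution can be sketched
in Python as follows. Both functions and their parameters are
hypothetical and are not defined by this document. XOR with a fixed key
is used only because it is trivially invertible, and a truncated
HMAC-SHA256 is assumed as one possible one-way function; neither is a
recommendation.</t>
<figure><artwork><![CDATA[
```python
import hashlib
import hmac
import ipaddress

def reversible_substitute(address, key):
    """XOR with a secret 32-bit key; applying the function a
    second time with the same key recovers the real address."""
    value = int(ipaddress.IPv4Address(address))
    return str(ipaddress.IPv4Address(value ^ key))

def irreversible_substitute(address, secret):
    """Keyed one-way hash, truncated to 32 bits so that the
    pseudonym still has the form of an IPv4 address."""
    packed = ipaddress.IPv4Address(address).packed
    digest = hmac.new(secret, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))

KEY = 0x5A5A5A5A            # illustrative value only
pseudonym = reversible_substitute("192.0.2.1", KEY)
recovered = reversible_substitute(pseudonym, KEY)
one_way = irreversible_substitute("192.0.2.1", b"example secret")
```
]]></artwork></figure>
<t>In this sketch the holder of KEY can map a pseudonym back to the real
address, whereas the truncated hash provides no inverse function. Because
the output is truncated, distinct addresses may occasionally share a
pseudonym.</t>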
<t>While anonymisation is generally applied at the resolution of single
fields within a record, attacks against anonymisation use entire records
and relationships between records within a data set. Therefore, fields
which may not necessarily be identifying by themselves may be anonymised
in order to increase the anonymity of the data set as a whole.</t>
</section>
<section title="Anonymisation of IP Flow Data">
<t>Due to the restricted semantics of IP flow data, there is a relatively
limited set of specific anonymisation techniques applicable to flow data,
though each falls into one of the broad categories above. Each type of field
that may commonly appear in a flow record may have its own applicable
specific techniques.</t>
<t>Of all the fields in an IP flow record, the most attention in the
literature has been paid to IP addresses [TODO: cite]. IP addresses are
structured identifiers; that is, partial IP address prefixes may be used
to identify networks just as full IP addresses identify hosts. This leads
to the application of prefix-preserving anonymisation of IP address
information [TODO: cite]. Prefix-preserving anonymisation is a (generally
irreversible) substitution technique which has the additional property
that the structure of the IP address space is maintained in the anonymised
data.</t>
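<t>The prefix-preserving property can be sketched in Python as follows.
Each output bit is the input bit, optionally flipped according to a keyed
function of the bits that precede it, so two addresses that share a k-bit
prefix map to pseudonyms that share a k-bit prefix. This is similar in
spirit to published schemes such as Crypto-PAn, but the use of
HMAC-SHA256 as the keyed function and the names prefix_preserving and
common_prefix_len are assumptions made here for illustration only.</t>
<figure><artwork><![CDATA[
```python
import hashlib
import hmac
import ipaddress

def prefix_preserving(address, secret):
    """Flip each bit according to a keyed function of the bits
    that precede it (assumed construction, for illustration)."""
    value = int(ipaddress.IPv4Address(address))
    result = 0
    for i in range(32):
        prefix = value >> (32 - i)     # the i most significant bits
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        digest = hmac.new(secret, msg, hashlib.sha256).digest()
        flip = digest[0] & 1
        bit = (value >> (31 - i)) & 1
        result = (result << 1) | (bit ^ flip)
    return str(ipaddress.IPv4Address(result))

def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv4 addresses."""
    x = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - x.bit_length()

SECRET = b"example secret"             # illustrative value only
anon_a = prefix_preserving("192.0.2.17", SECRET)
anon_b = prefix_preserving("192.0.2.200", SECRET)
```
]]></artwork></figure>
<t>The two example addresses share exactly 24 leading bits, and so do
their pseudonyms, so the structure of the address space remains visible
in the anonymised data.</t>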
<t>While not identifiers in and of themselves, timestamps are vulnerable
to fingerprinting attacks, wherein relationships between the start and end
timestamps of flows within a data set can be used to identify hosts or
networks [TODO: cite]. Therefore, a variety of anonymisation techniques
are available, including loss of precision (a form of generalisation), or
noise addition (substitution), which may or may not preserve the
sequencing of flows in the data set.</t>
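<t>Both timestamp techniques can be sketched in Python as follows.
Timestamps are assumed to be integers in milliseconds, and the function
names, the default resolution, and the way ordering is preserved (the
noisy values are reassigned to flows by original rank) are choices made
for this illustration rather than anything specified by this
document.</t>
<figure><artwork><![CDATA[
```python
import random

def reduce_precision(timestamp_ms, resolution_ms=60_000):
    """Generalise a timestamp by rounding it down to a coarser
    resolution (one minute by default)."""
    return timestamp_ms - (timestamp_ms % resolution_ms)

def add_noise(timestamps_ms, max_shift_ms, rng, preserve_order=True):
    """Shift each timestamp by a random offset drawn from
    [-max_shift_ms, +max_shift_ms]."""
    noisy = [t + rng.randint(-max_shift_ms, max_shift_ms)
             for t in timestamps_ms]
    if preserve_order:
        # Give the k-th smallest noisy value to the flow that had
        # the k-th smallest original timestamp.
        order = sorted(range(len(timestamps_ms)),
                       key=lambda i: timestamps_ms[i])
        ranked = sorted(noisy)
        for rank, i in enumerate(order):
            noisy[i] = ranked[rank]
    return noisy

starts = [1000, 1500, 1600, 5000]
coarse = [reduce_precision(t, 1000) for t in starts]
shifted = add_noise(starts, 400, random.Random(1))
```
]]></artwork></figure>
<t>With a resolution of one second, the second and third flows in the
example receive the same start time, which illustrates how precision loss
can remove the fine-grained relationships that fingerprinting relies
on.</t>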
<t>Counters and other flow values can also be used to break anonymisation
in fingerprinting attacks, so the same techniques, precision loss and
noise addition, are available for these fields as well.</t>
<t>Of course, the simplest form of anonymisation and the most extreme form
of generalisation is black-marker anonymisation, or full deletion of a
field from each record of the flow data. The black-marker technique is
applicable to any type of field in a flow record.</t>
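<t>As a minimal sketch, black-marker anonymisation amounts to dropping
one field from every record. Flow records are modelled here as Python
dictionaries, and the function name and field names are hypothetical.</t>
<figure><artwork><![CDATA[
```python
def black_marker(records, field):
    """Return copies of the records with one field deleted."""
    return [{k: v for k, v in r.items() if k != field}
            for r in records]

flows = [
    {"src": "192.0.2.17", "dst": "198.51.100.5", "octets": 1500},
    {"src": "192.0.2.200", "dst": "198.51.100.5", "octets": 40},
]
marked = black_marker(flows, "src")
```
]]></artwork></figure>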
<t>[TODO: This section is incomplete; the set of techniques should be more
exhaustive.]</t>
<section title="IP Address Anonymisation">
<t>The following table gives an overview of the schemes for IP address
anonymisation described in this document and their categorization.</t>
<texttable>
<ttcol align="left">Scheme</ttcol>
<ttcol align="left">Action</ttcol>
<ttcol align="left">Reversibility</ttcol>
<c>Truncation</c><c>Generalisation</c><c>N</c>
<c>Scrambling</c><c>Substitution</c><c>Y</c>
<c>Prefix-preserving</c><c>Substitution</c><c>Y</c>
<c>Random noise addition</c><c>Substitution</c><c>N</c>
</texttable>
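<t>Truncation, the generalisation scheme in the table above, can be
sketched in Python as follows. The prefix length of 24 bits, the record
layout, and the function name truncate_ipv4 are assumptions chosen for
illustration. The example also shows how generalised records can become
indistinguishable and collapse into a single aggregated entry.</t>
<figure><artwork><![CDATA[
```python
import ipaddress
from collections import Counter

def truncate_ipv4(address, prefix_len=24):
    """Generalise an IPv4 address by zeroing every bit after
    the first prefix_len bits."""
    network = ipaddress.ip_network(f"{address}/{prefix_len}",
                                   strict=False)
    return str(network.network_address)

flows = [
    {"src": "192.0.2.17", "dst_port": 80},
    {"src": "192.0.2.200", "dst_port": 80},
    {"src": "198.51.100.5", "dst_port": 80},
]
generalised = [{**f, "src": truncate_ipv4(f["src"])} for f in flows]

# Records that are now identical are counted as one entry.
aggregated = Counter((f["src"], f["dst_port"]) for f in generalised)
```
]]></artwork></figure>
<t>After truncation, the first two flows both carry the source address
192.0.2.0, and nothing in the output allows the original host addresses
to be recovered.</t>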
<t>[TODO: This section is incomplete; text here should expand
on the table.]</t>
</section>
<section title="Timestamp Anonymisation">
<t>[TODO: as section 4.1]</t>
<t>[EDITOR'S NOTE: Counters might go here, since they are subject to the
same techniques for largely the same reasons.]</t>
</section>
<section title="Anonymisation of Other Flow Fields">
<t>[TODO: as section 4.1]</t>
<t>[EDITOR'S NOTE: Port Numbers go here. Counters might, if not above.
It might make sense to split this into flow key anonymisation versus
flow value anonymisation.]</t>
</section>
</section>
<section title="Parameters for the Description of Anonymisation Techniques">
<t>[TODO: see corresponding section of draft-ietf-psamp-sample-tech for
the proposed structure of this section.] </t>
</section>
<section title="Anonymisation Support in IPFIX">
<t>[TODO: Here we will describe how the information specified above can be
transmitted on the wire using an Options Template. The idea is to scope the
option to the Template ID and to specify, for each field, whether it is
anonymised and, if so, the output characteristics of the technique
applied.]</t>
<t>[EDITOR'S NOTE: The application of multiple anonymisation techniques to
an IE at the same time is indicated with multiple elements of the same type
(in application order, as in PSAMP).]</t>
<t>[EDITOR'S NOTE: For black-marker anonymisation, we will recommend not
exporting the information at all, following the data protection law
principle that only necessary information should be exported.]</t>
</section>
<section title="Security Considerations">
<t>[TODO: write this section.]</t>
</section>
<section title="IANA Considerations">
<t>This document contains no actions for IANA.</t>
</section>
</middle>
<back>
<references title="Normative References">
&rfc5101;
&rfc5102;
</references>
<references title="Informative References">
&draftIpfixArch;
&draftIpfixAs;
&draftIpfixArchitecture;
&draftIpfixRR;
&rfc3917;
&rfc2119;
</references>
</back>
</rfc>