One document matched: draft-hansen-privacy-terminology-03.txt
Differences from draft-hansen-privacy-terminology-02.txt
Network Working Group M. Hansen
Internet-Draft ULD Kiel
Intended status: Informational H. Tschofenig
Expires: May 1, 2012 Nokia Siemens Networks
R. Smith, Ed.
JANET(UK)
October 29, 2011
Privacy Terminology
draft-hansen-privacy-terminology-03.txt
Abstract
Privacy is a concept that has been debated and argued throughout the
last few millennia by all manner of people. Its most striking
feature is that nobody seems able to agree upon a precise definition
of what it actually is. In order to discuss privacy in any
meaningful way a tightly defined context needs to be elucidated. The
specific context of privacy used within this document is that of
"personal data", information about an individual stored and/or
transmitted electronically in Internet protocols. This context is
highly relevant since a lot of work within the IETF involves defining
protocols that can potentially transport (either explicitly or
implicitly) personal data.
This document aims to establish a basic lexicon around privacy so
that IETF contributors who wish to discuss privacy considerations
within their work can do so using terminology consistent across the
area.
Note: This document is discussed at
https://www.ietf.org/mailman/listinfo/ietf-privacy
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
Hansen, et al. Expires May 1, 2012 [Page 1]
Internet-Draft Privacy Terminology October 2011
This Internet-Draft will expire on May 1, 2012.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Context . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Anonymity . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4. Unlinkability . . . . . . . . . . . . . . . . . . . . . . . . 7
5. Undetectability . . . . . . . . . . . . . . . . . . . . . . . 9
6. Pseudonymity . . . . . . . . . . . . . . . . . . . . . . . . . 10
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 12
8. Security Considerations . . . . . . . . . . . . . . . . . . . 13
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15
10.1. Normative References . . . . . . . . . . . . . . . . . . 15
10.2. Informative References . . . . . . . . . . . . . . . . . 15
Hansen, et al. Expires May 1, 2012 [Page 2]
Internet-Draft Privacy Terminology October 2011
1. Introduction
Privacy is a concept that has been debated and argued throughout the
last few millennia by all manner of people, including philosophers,
psychologists, lawyers, and more recently, computer scientists. Its
most striking feature is that nobody seems able to agree upon a
precise definition of what it actually is. Every individual, every
group, and every culture have their own different views and
preconceptions about the concept - some mutually complimentary, some
distinctly different. However, it is generally (but not
unanimously!) agreed that the protection of privacy is "A Good Thing"
and often, people only realize what it was when they feel that they
have lost it.
Even within the specific content of computing and computer science,
there are still many facets to privacy. For example, consideration
of privacy in terms of personal information is distinctly different
from consideration of privacy in a geographical information sense: in
the former a loss of privacy might be framed as the uncontrolled
release of personal information without the subject's consent, while
in the latter it might be the ability to compute the location of an
individual beyond a certain degree of accuracy.
In order to discuss privacy in any meaningful way a tightly defined
context needs to be elucidated. The specific context of privacy used
within this document is that of "personal data", information about an
individual stored and/or transmitted electronically in Internet
protocols. This context is highly relevant since a lot of work
within the IETF involves defining protocols that can potentially
transport (either explicitly or implicitly) personal data and can
therefore either, by dint of design decisions when creating them,
enable either privacy protection or result in privacy breaches. In
this specific context, discussions of privacy largely centre around
the collection minimalization, the usage, and release of such
personal data.
Work in this area of privacy and privacy protection over the last few
decades has centered on the idea of data minimization; it uses
terminologies such as anonymity, unlinkability, unobservability, and
pseudonymity. These terms are often used in discussions about the
privacy properties of systems.
The core principal of data minimization is that the ability for
others to collect any personal data should be removed. Often,
however, the collection of personal data cannot not be prevented
entirely, in which case the goal is to minimize the amount of
personal data that can be collected for a given purpose and to offer
ways to control the dissemination of personal data.
Hansen, et al. Expires May 1, 2012 [Page 3]
Internet-Draft Privacy Terminology October 2011
Data minimization is the only generic strategy to enhance individual
privacy in cases where valid personal information is used since all
valid personal data inherently provides some linkability. Other
techniques have been proposed and implemented that aim to enhance
privacy by providing misinformation (inaccurate or erroneous
information, provided usually without conscious effort to mislead or
deceive) or disinformation (deliberately false or distorted
information provided in order to mislead or deceive). However, these
techniques are out of scope for this document.
This document aims to establish a basic lexicon around privacy so
that IETF contributors who wish to discuss privacy considerations
within their work (see [I-D.iab-privacy-considerations]) can do so
using terminology consistent across areas. Note that it does not
attempt to define all aspects of privacy terminology, rather it just
establishes terms to some of the most common ideas and concepts.
Hansen, et al. Expires May 1, 2012 [Page 4]
Internet-Draft Privacy Terminology October 2011
2. Context
To keep discussion as simple as possible in many cases it is usual to
not distinguish between a human using some software, the software
itself, and the device on which it is running. In this case, it is
assumed that there is a one-to-one relationship between the device
running the software that is the scope of Internet protocol
development and the human using that software.
There are various cases, however, when this human-to-software link is
not one-to-one. Protocols developed in the IETF typically do not
mandate any specific relationship but typically envision that uses of
a specific protocol may reveal those relationships. For example,
multiple hosts used by different persons may be attached to an single
Internet gateway within a household. From the Internet Service
Provider point of view all these devices belong to a single person:
the subscriber with whom a contract was established. Unless there
are good reasons to highlight the more complex one-to-many
relationship this document will present scenarios using the simpler
one-to-one relationship, without loss of generality, for editorial
reasons.
When necessary we use the term initiator and responder to refer to
the communication interaction of a protocol. This particular
terminology is used to highlight that many protocols utilize
bidirectional communication where both ends send and receive data.
Finally, we assume that the attacker uses all information available
to infer (probabilities of) his items of interest (IOIs). These IOIs
may be attributes (and their values) of personal data, or may be
actions such as who sent, or who received, which messages.
Hansen, et al. Expires May 1, 2012 [Page 5]
Internet-Draft Privacy Terminology October 2011
3. Anonymity
Definition: Anonymity of a subject from an attacker's perspective
means that the attacker cannot sufficiently identify the subject
within a set of subjects, the anonymity set.
To enable anonymity of a subject, there always has to be an
appropriate set of subjects with potentially the same attributes.
The set of all possible subjects is known as the anonymity set, and
membership of this set may vary over time.
The set of possible subjects depends on the knowledge of the
attacker. Thus, anonymity is relative with respect to the attacker.
Therefore, an initiator may be anonymous (initiator anonymity) only
within a set of potential initiators - their initiator anonymity set
- which itself may be a subset of all subjects who may send a
message. Conversely a responder may be anonymous (responder
anonymity) only within a set of potential responders - their
responder anonymity set. Both anonymity sets may be disjoint, may
overlap, or may be the same.
As an example consider RFC 3325 (P-Asserted-Identity, PAI)
[RFC3325], an extension for the Session Initiation Protocol (SIP),
that allows subjects, such as a VoIP caller, to instruct an
intermediary he or she trusts not to populate the SIP From header
field with its authenticated and verified identity. The recipient
of the call, as well as any other entity outside the user's trust
domain, would therefore only learn that the SIP message (typically
a SIP INVITE) was sent with a header field 'From: "Anonymous"
<sip:anonymous@anonymous.invalid>' rather than the subject's
address-of-record, which is typically thought of as the "public
address" of the user. When PAI is used the subject becomes
anonymous within the initiator anonymity set that is populated by
every subject making use of that specific intermediary.
Note that this example assumes that other personal data cannot be
inferred from the other SIP protocol payloads, which is a useful
assumption to be made in the analysis of one specific protocol
extension but not for analysis of an entire architecture.
Hansen, et al. Expires May 1, 2012 [Page 6]
Internet-Draft Privacy Terminology October 2011
4. Unlinkability
Definition: Unlinkability of two or more Items Of Interest (e.g.,
subjects, messages, actions, ...) from an attacker's perspective
means that within a particular set of information, the attacker
cannot distinguish whether these IOIs are related or not (with a
high enough degree of probability to be useful).
Unlinkability of two (or more) messages may of course depend on
whether their content is protected against the attacker. In the
cases where this is not true, messages may only be unlinkable if we
assume that the attacker is not able to infer information about the
initiator or responder from the message content itself. It is worth
noting that even if the content itself does not betray linkable
information explicitly, deep semantical analysis of a message
sequence can often detect certain characteristics which link them
together, e.g., similarities in structure, style, use of some words
or phrases, consistent appearance of some grammatical errors, etc.
The unlinkability property can be considered as a more "fine-grained"
version of anonymity since there are many more relations where
unlinkability might be an issue than just the relation of "anonymity"
between subjects and IOIs. As such, it may sometimes be necessary to
explicitly state to which attributes anonymity refers to (beyond the
subject to IOI relationship). An attacker might get to know
information on linkability of various messages while not necessarily
reducing anonymity of the particular subject. As an example an
attacker, in spite of being able to link all encrypted messages in a
set of transactions, does not learn the identify of the subject who
is the source of the transactions.
There are several items of terminology heavily related to
unlinkability:
Definition: We use the term "profiling" to mean learning information
about a particular subject while that subject remains anonymous to
the attacker. For example, if an attacker concludes that a
subject plays a specific computer game, reads specific news
article on a website, and uploads certain videos, then the
subjects activities have been profiled, even if the attacker is
unable to identify that specific subject.
Definition: "Relationship anonymity" of a pair of subjects means
that sender and recipient (or each recipient in case of multicast)
are unlinkable. The classical MIX-net [Chau81] without dummy
traffic is one implementation with just this property: The
attacker sees who sends messages when, and who receives messages
when, but cannot figure out who is sending messages to whom.
Hansen, et al. Expires May 1, 2012 [Page 7]
Internet-Draft Privacy Terminology October 2011
Definition: The term "unlinkable session" refers the ability of the
system to render a set of actions by a subject unlinkable from one
another over a sequence of protocol runs (sessions). This term is
useful for cases where a sequence of interactions between an
initiator and a responder is necessary for the application logic
rather than a single-shot message. We refer to this as a session.
When doing an analysis with respect to unlinkability we compare
this session to a sequence of sessions to determine linkability.
Definition: We refer as a "linking identifier" to any parameter that
an attacker can observe about an IOI and use to link it to similar
IOIs. For example, the window size header transmitted in a
typical HTTP request is a linking identifier.
Hansen, et al. Expires May 1, 2012 [Page 8]
Internet-Draft Privacy Terminology October 2011
5. Undetectability
Definition: Undetectability of an item of interest (IOI) from an
attacker's perspective means that the attacker cannot sufficiently
distinguish whether it exists or not.
In contrast to anonymity and unlinkability, where the IOI is
protected indirectly through protection of the IOI's relationship to
a subject or other IOI, undetectability is the direct protection of
an IOI. For example, undetectability can be regarded as a possible
and desirable property of steganographic systems.
If we consider messages as IOIs, then undetectability means that
messages are not sufficiently discernible from, e.g., "random noise".
Hansen, et al. Expires May 1, 2012 [Page 9]
Internet-Draft Privacy Terminology October 2011
6. Pseudonymity
Definition: A pseudonym is an identifier of a subject other than one
of the subject's real names.
Achieving anonymity, unlinkability, and maybe undetectability may
enable the ideal of data minimization. Unfortunately, it would also
prevent a certain class of useful two-way communication scenarios.
Therefore, for many applications, we need to accept a certain amount
of linkability and detectability while attempting to retain
unlinkability between the subject and their transactions. This is
achieved through appropriate kinds of pseudonymous identifiers.
These identifiers are then often used to refer to established state
or are used for access control purposes. An identifier is defined in
[id] as "a lexical token that names entities".
The term 'real name' is the antonym to "pseudonym". There may be
multiple real names over a lifetime -- in particular legal names.
For example, a human being may possess the names which appear on
their birth certificate or on other official identity documents
issued by the State; for a legal person the name under which it
operates and which is registered in official registers (e.g.,
commercial register or register of associations). A human being's
real name typically comprises their given name and a family name.
Note that from a mere technological perspective it cannot always be
determined whether an identifier of a subject is a pseudonym or a
real name.
Additional useful terms are:
Definition: The "holder" of the pseudonym is the subject to whom the
pseudonym refers.
Definition: A subject is "pseudonymous" if a pseudonym is used as
identifier instead of one of its real names.
Definition: Pseudonymity is the state of remaining pseudonymous
through the use of pseudonyms as identifiers.
Sender pseudonymity is defined as the sender being pseudonymous,
recipient pseudonymity is defined as the recipient being
pseudonymous.
In order to be useful in the context of Internet communication we use
the term digital pseudonym and declare it as a pseudonym that is
suitable to be used to authenticate the holder's IOIs.
Anonymity through the use of pseudonyms is stronger where ...
Hansen, et al. Expires May 1, 2012 [Page 10]
Internet-Draft Privacy Terminology October 2011
o the less personal data of the pseudonym holder can be linked to
the pseudonym;
o the less often and the less context-spanning pseudonyms are used
and therefore the less data about the holder can be linked;
o the more often independently chosen pseudonyms are used for new
actions (i.e., making them, from an observer's perspective,
unlinkable)
For Internet protocols it is important whether protocols allow
identifiers to be recycled dynamically, what the lifetime of the
pseudonyms are, to whom they get exposed, how subjects are able to
control disclosure, and how often they can be changed over time (and
what the consequences are when they are regularly changed). These
aspects are described in [I-D.iab-privacy-considerations].
Hansen, et al. Expires May 1, 2012 [Page 11]
Internet-Draft Privacy Terminology October 2011
7. Acknowledgments
Parts of this document utilizes content from [anon_terminology],
which had a long history starting in 2000 and whose quality was
improved due to the feedback from a number of people. The authors
would like to thank Andreas Pfitzmann for his work on an earlier
draft version of this document.
Within the IETF a number of persons had provided their feedback to
this document. We would like to thank Scott Brim, Marc Linsner,
Bryan McLaughlin, Nick Mathewson, Eric Rescorla, Alissa Cooper, Scott
Bradner, Nat Sakimura, Bjoern Hoehrmann, David Singer, Dean Willis,
Christine Runnegar, Lucy Lynch, Trend Adams, Mark Lizar, Martin
Thomson, Josh Howlett, and Mischa Tuffield.
Hansen, et al. Expires May 1, 2012 [Page 12]
Internet-Draft Privacy Terminology October 2011
8. Security Considerations
This document introduces terminology for talking about privacy within
IETF specifications. Since privacy protection often relies on
security mechanisms then this document is also related to security in
its broader context.
Hansen, et al. Expires May 1, 2012 [Page 13]
Internet-Draft Privacy Terminology October 2011
9. IANA Considerations
This document does not require actions by IANA.
Hansen, et al. Expires May 1, 2012 [Page 14]
Internet-Draft Privacy Terminology October 2011
10. References
10.1. Normative References
[I-D.iab-privacy-considerations] Cooper, A., Tschofenig, H., Aboba,
B., Peterson, J., and J. Morris,
"Privacy Considerations for
Internet Protocols",
draft-iab-privacy-considerations-01
(work in progress), October 2011.
[id] "Identifier - Wikipeadia",
Wikipedia , 2011.
10.2. Informative References
[Chau81] Chaum, D., "Untraceable Electronic
Mail, Return Addresses, and Digital
Pseudonyms", Communications of the
ACM , 24/2, 84-88, 1981.
[RFC3325] Jennings, C., Peterson, J., and M.
Watson, "Private Extensions to the
Session Initiation Protocol (SIP)
for Asserted Identity within
Trusted Networks", RFC 3325,
November 2002.
[anon_terminology] Pfitzmann, A. and A. Pfitzmann, "A
terminology for talking about
privacy by data minimization:
Anonymity, Unlinkability,
Undetectability, Unobservability,
Pseudonymity, and Identity
Management", URL: http://
dud.inf.tu-dresden.de/literatur/
Anon_Terminology_v0.34.pdf ,
version 034, 2010.
Hansen, et al. Expires May 1, 2012 [Page 15]
Internet-Draft Privacy Terminology October 2011
Authors' Addresses
Marit Hansen
ULD Kiel
EMail: marit.hansen@datenschutzzentrum.de
Hannes Tschofenig
Nokia Siemens Networks
Linnoitustie 6
Espoo 02600
Finland
Phone: +358 (50) 4871445
EMail: Hannes.Tschofenig@gmx.net
URI: http://www.tschofenig.priv.at
Rhys Smith (editor)
JANET(UK)
EMail: rhys.smith@ja.net
Hansen, et al. Expires May 1, 2012 [Page 16]
| PAFTECH AB 2003-2026 | 2026-04-24 04:59:16 |