<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [

<!ENTITY rfc1033 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.1033.xml'>
<!ENTITY rfc1034 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.1034.xml'>
<!ENTITY rfc1035 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.1035.xml'>
<!ENTITY rfc2045 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2045.xml'>
<!ENTITY rfc2119 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc2782 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2782.xml'>
<!ENTITY rfc4055 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4055.xml'>
<!ENTITY rfc4075 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4075.xml'>
<!ENTITY rfc6762 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6762.xml'>
<!ENTITY rfc6763 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6763.xml'>
<!ENTITY rfc7626 PUBLIC ''
   'http://xml.resource.org/public/rfc/bibxml/reference.RFC.7626.xml'>

<!ENTITY I-D.ietf-intarea-hostname-practice PUBLIC ''  
   "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-intarea-hostname-practice.xml"> 
<!ENTITY I-D.ietf-dprive-dns-over-tls PUBLIC ''  
   "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-dprive-dns-over-tls.xml">
<!ENTITY I-D.ietf-dprive-dnsodtls PUBLIC ''  
   "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-dprive-dnsodtls.xml">
<!ENTITY I-D.ietf-dhc-anonymity-profile PUBLIC ''  
   "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-dhc-anonymity-profile.xml">
]>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc compact="yes"?>
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>

<!-- Expand crefs and put them inline -->
<?rfc comments='yes' ?>
<?rfc inline='yes' ?>

<rfc category="std" 
     docName="draft-huitema-dnssd-privacy-00.txt" 
     ipr="trust200902">

<front>
    <title abbrev="DNS-SD Privacy Extensions">
      Privacy Extensions for DNS-SD
    </title>

   <author fullname="Christian Huitema" initials="C." surname="Huitema">
      <organization>Microsoft</organization>
      <address>
        <postal>
          <street> </street>
          <city>Redmond</city>
          <code>98052</code>
          <region>WA</region>
          <country>U.S.A.</country>
        </postal>
        <email>huitema@microsoft.com</email>
      </address>
    </author>

    <date year="2016" />

    <abstract>
        <t> 
DNS-SD allows discovery of services published in DNS or MDNS. The publication
normally discloses information about the device publishing the services.
There are use cases where devices want to communicate without disclosing
their identity, for example two mobile devices visiting the same
hotspot. We propose a method to obfuscate the identification
information published by DNS-SD.
        </t>
    </abstract>
</front>

<middle>
<section title="Introduction">
<t>
There are cases when nodes connected to a network want to provide
or consume services without exposing their identity to the other
parties connected to the same network. Consider for example a
traveller wanting to upload pictures from a phone to a laptop
when connected to the Wi-Fi network of an Internet cafe, or
two travellers who want to share files between their laptops
when waiting for their plane in an airport lounge. 
</t>
<t>
We expect that these exchanges will start with a discovery 
procedure using DNS-SD <xref target="RFC6763" />. One of the devices
will publish the availability of a service, such as a picture library
or a file store in our examples. The user of the other device will
discover this service, and then connect to it.
</t>
<t>
When analysing these scenarios in <xref target="analysis"/>, we find that
the DNS-SD messages leak identifying information such as instance name,
host name or service properties. We review the design constraints of a solution 
in <xref target="design"/>, and describe the proposed solution in
<xref target="solution"/>.
</t>
<section title="Requirements">
<t>
  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
  "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
  document are to be interpreted as described in <xref target="RFC2119" />.
</t>
</section>
</section>

<section title="Privacy implications of DNS-SD" anchor="analysis">
<t>
DNS-Based Service Discovery (DNS-SD) is defined in <xref target="RFC6763" />. 
It allows nodes to publish the availability of an instance of a service by
inserting specific records in the DNS (<xref target="RFC1033"/>, 
<xref target="RFC1034"/>, <xref target="RFC1035"/>) or by publishing 
these records locally using
multicast DNS (MDNS) <xref target="RFC6762"/>. The service availability
will be described in three types of records:
</t>
<t>
<list style="hanging">
<t hangText="PTR Record:">Associates the service name in the domain with 
the "instance" name published by the node.
</t>
<t hangText="SRV Record:">Provides the node name, port number, priority and 
weight associated with the service instance, in conformance with <xref target="RFC2782" />.
</t>
<t hangText="TXT Record:">Provides a set of attribute-value pairs describing
specific properties of the service instance.
</t>
</list>
</t>
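<t>
To make the relationship between the three records concrete, here is a minimal
sketch that prints them in a zone-file-like notation. The function name, the
service type "_example._tcp", the host name and the attribute are hypothetical
values chosen for illustration; they do not come from any specification, and a
real zone file would need to escape the spaces in the instance name.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
from typing import Dict, List


def dnssd_records(instance: str, service: str, domain: str,
                  host: str, port: int, txt: Dict[str, str],
                  priority: int = 0, weight: int = 0) -> List[str]:
    """Return illustrative PTR, SRV and TXT lines for one service instance."""
    service_name = f"{service}.{domain}"
    instance_fqdn = f"{instance}.{service_name}"
    # TXT data is a set of attribute=value pairs.
    txt_data = " ".join(f'"{key}={value}"' for key, value in txt.items())
    return [
        # PTR: service name -> instance name
        f"{service_name} PTR {instance_fqdn}",
        # SRV: instance name -> priority, weight, port and node name
        f"{instance_fqdn} SRV {priority} {weight} {port} {host}",
        # TXT: instance name -> service attributes
        f"{instance_fqdn} TXT {txt_data}",
    ]


if __name__ == "__main__":
    for line in dnssd_records("Alice's notebook", "_example._tcp", "local.",
                              "alice-pc.local.", 8080, {"dir": "holiday"}):
        print(line)
```
]]></artwork>
</figure>
</t>
<t>
In this sketch the instance name, the host name and the attribute value each
appear in clear text in at least one record, which is the starting point of
the analysis in the following subsections.
</t>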
<t>
In the remaining subsections, we will review the privacy issues related to publishing
instance names, node names, service attributes and other data, as well as review 
the implications of using the discovery service as a client.
</t>

<section title="Privacy implication of publishing instance names" anchor="instanceLeak" >
<t>
In the first phase of discovery, the client will obtain a copy of all
the PTR records associated to a service in a given naming domain. Each
record contains a domain name starting with an instance name.
Instance names are free-form descriptions of the instance, and are meant to
convey enough information so discovery clients can easily select the
desired service. 
Section 4 of <xref target="RFC6763" /> gives the
following example for the instance names of a printer service:
</t>
<t>
<figure>
<artwork>
     Building 2, 1st Floor  .  example  .  com  .
     Building 2, 2nd Floor  .  example  .  com  .
     Building 2, 3rd Floor  .  example  .  com  .
     Building 2, 4th Floor  .  example  .  com  .
</artwork>
</figure>
</t>
<t>
Nodes that use DNS-SD in a mobile environment will rely on the specificity
of the instance name to identify the desired service. In our example of users
wanting to upload pictures to a laptop in an Internet Cafe, the list of 
available services may look like:
</t>
<t>
<figure>
<artwork>
     Alice's notebook       .  local  .
     Bob's laptop           .  local  .
     Image store for Carol  .  local  .
</artwork>
</figure>
</t>
<t>
Alice will see the list on her phone and understand intuitively that she should
pick the first item. The discovery will "just work." It will also reveal 
to anybody who cares that Alice is currently visiting the Internet Cafe.
</t>
</section>

<section title="Privacy implication of publishing node names">
<t>
The SRV records contain the DNS name of the node publishing the
service. Typical implementations construct this DNS name by
concatenating the "host name" of the node with the name of the 
local domain. The privacy implications of this
practice are reviewed in <xref target="I-D.ietf-intarea-hostname-practice" />.
Depending on naming practices, the host name is either a strong 
identifier of the device, or at a minimum a partial identifier.
It enables tracking of the device, and by extension of the device's owner.
</t>
</section>

<section title="Privacy implication of publishing service attributes">
<t>
The TXT records contain a set of attribute-value pairs characteristic of
the service implementation. These attributes reveal some information
about the device that publishes the service. The amount of information
will vary widely with the particular service and its implementation:
</t>
<t>
<list style="symbols">
<t>
Some attributes, like the paper size available in a printer, are the
same on many devices, and thus only provide limited information
to a tracker.
</t>
<t>
Attributes that have freeform values, such as the name of a directory,
may reveal much more information.
</t>
</list>
</t>
<t>
Combinations of attributes carry more information than individual attributes,
and can potentially be used for "fingerprinting" a specific device.
</t>

</section>

<section title="Device fingerprinting">
<t>
The combination of information published in DNS-SD has the potential to
provide a "fingerprint" of a specific device. Such information includes: 
</t>
<t>
<list style="symbols">
<t>
The list of services published by the device, which can be retrieved because the
SRV records will point to the same host name.
</t>
<t>
The specific attributes describing these services.
</t>
<t>
The port numbers used by the services.
</t>
<t>
The values of the priority and weight attributes in the SRV records.
</t>
</list>
</t>
<t>
This combination of services and attributes will often be sufficient to identify
the version of the software running on a device. If a device publishes
many services with rich sets of attributes, the combination may be
sufficient to identify the specific device. 
</t>
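<t>
The following sketch illustrates how an observer could condense the items
listed above into a single fingerprint value. It is only an illustration of the
risk, not a description of any known tracking tool: the record layout (a list
of dictionaries with the keys shown) and the use of SHA-256 over a JSON
encoding are assumptions made for this example.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import hashlib
import json
from typing import Any, Dict, List


def device_fingerprint(services: List[Dict[str, Any]]) -> str:
    """Hash the service list, ports, SRV values and TXT attributes.

    Each entry is assumed to have the keys "service", "port",
    "priority", "weight" and "txt" (a dict of string attributes).
    The host name is deliberately not part of the input.
    """
    canonical = sorted(
        (s["service"], s["port"], s["priority"], s["weight"],
         sorted(s["txt"].items()))
        for s in services
    )
    blob = json.dumps(canonical, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    photo = {"service": "_example._tcp", "port": 8080,
             "priority": 0, "weight": 0, "txt": {"dir": "holiday"}}
    files = {"service": "_other._tcp", "port": 9090,
             "priority": 0, "weight": 0, "txt": {}}
    print(device_fingerprint([photo, files]))
```
]]></artwork>
</figure>
</t>
<t>
Because the host name is not an input, the value computed by this sketch stays
the same when only the host name changes, which shows why randomizing names
alone does not defeat fingerprinting.
</t>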
</section>

<section title="Privacy implication of discovering services" anchor="clientPrivacy" >
<t>
The consumers of services engage in discovery, and in doing so 
reveal some information such as the list of services that they
are interested in and the domains in which they are looking for the
services. When the clients select specific instances of services,
they reveal their preference for these instances.
</t>
<t>
At first analysis, the leakage of information by clients looks benign
compared to the disclosures made by the servers. There may be a 
concern when the client is attempting to use rare services.
</t>
</section>

</section>

<section title="Design of DNS-SD privacy mitigations" anchor="design" >
<t>
Ah Ah.
</t>

<section title="Obfuscated instance names" >
<t>
The privacy issues described in <xref target="instanceLeak"/> 
can be solved by obfuscating the instance names. Instead
of a user-friendly description of the instance,
the nodes will publish a random-looking string of characters.
To prevent tracking over time and location, different string
values should be used at different locations, or at different times.
</t>
<t>
Authorized parties should be able to "de-obfuscate" the names,
while non-authorized third parties should not. For example,
if both Alice's notebook and Bob's laptop use an obfuscation process, 
the list of available services should appear differently 
to them and to third parties. Alice's phone will be able to
de-obfuscate the name of Alice's notebook, but not that of 
Bob's laptop. Bob's phone will do the opposite. Carol will do
neither.
</t>
<t>
Alice will see something like:
</t>
<t>
<figure>
<artwork>
     GobbeldygookBlaBlaBla (Alice's notebook) .  local  .
     Abracadabragooklybok                     .  local  .
     Image store for Carol                    .  local  .
</artwork>
</figure>
</t>
<t>
Bob will see:
</t>
<t>
<figure>
<artwork>
     GobbeldygookBlaBlaBla                   .  local  .
     Abracadabragooklybok (Bob's laptop)     .  local  .
     Image store for Carol                   .  local  .
</artwork>
</figure>
</t>
<t>
Carol will see:
</t>
<t>
<figure>
<artwork>
     GobbeldygookBlaBlaBla  .  local  .
     Abracadabragooklybok   .  local  .
     Image store for Carol  .  local  .
</artwork>
</figure>
</t>
<t>
In that example, Alice, Bob and Carol will be able to select the
appropriate instance. It would probably be preferable to filter out the
obfuscated instance names, to avoid confusing the user. In our example, Alice 
and Bob have updated their software to understand obfuscation, and they
could easily filter out the obfuscated strings that they cannot de-obfuscate.
But Carol is not using this system, and we could argue that her experience 
is suboptimal.
</t>
<t>
The suboptimal experience with unmodified software could be avoided if
the obfuscated service records were published using different service names,
or using different domain names. This would of course make management a bit
more complex, and is thus debatable.
</t>
</section>

<section title="Randomized host names" >
<t>
Instead of publishing their actual name in the SRV records, nodes 
could publish a randomized name. That is the solution argued for
in <xref target="I-D.ietf-intarea-hostname-practice" />.
</t>
<t>
Randomized host names will prevent some of the tracking.
Host names are typically not visible to users, and
randomizing host names will probably not cause many
usability issues.
</t>
</section>


<section title="Timing of obfuscation and randomization" anchor="timing" >
<t>
It is important that obfuscation of instance names be performed at the right time,
and that the obfuscated names change in synchrony with other identifiers,
such as MAC Addresses, IP Addresses or host names.
If the randomized host name changed
but the instance name remained constant, an adversary would have no difficulty
linking the old and new host names. Similarly, if IP or MAC addresses changed but 
host names remained constant, the adversary could link the new addresses to the
old ones using the published name.
</t>
<t>
The problem is handled in <xref target="I-D.ietf-intarea-hostname-practice" />, 
which recommends picking a new random host name at the time of connecting to 
a new network. The instance names should be obfuscated at the same time,
or perhaps use the randomized host name as input to the obfuscation
process.
</t>

</section>

<section title="Fingerprint resistance" >
<t>
Difficult...
</t>
</section>


<section title="A note on Private DNS services" >
<t>
The DNS Private Exchange working group develops mechanisms to
provide confidentiality to DNS transactions, addressing the problems 
outlined in <xref target="RFC7626" />. The solutions being developed 
include DNS over TLS <xref target="I-D.ietf-dprive-dns-over-tls" />
and DNS over DTLS <xref target="I-D.ietf-dprive-dnsodtls" />.
</t>
<t>
We could imagine that DNS-SD nodes are configured to update and
retrieve DNS records using DNS over TLS or DNS over DTLS, but 
a number of problems can arise:
</t>
<t>
<list style="symbols" >
<t>
Discovery queries are scoped by the domain name within which services
are published. As nodes move and visit arbitrary networks, there
is no guarantee that the domain services for these networks
will be accessible using DNS over TLS or DNS over DTLS.
</t>
<t>
Information placed in the DNS is considered public. Even if
the server does support DNS over TLS, third parties will 
still be able to discover the content of PTR, SRV and TXT
records.
</t>
<t>
Neither DNS over TLS nor DNS over DTLS applies to MDNS.
</t>
</list>
</t>
<t>
In short, DNS over TLS and DNS over DTLS solve a different problem,
and are not a solution for DNS-SD privacy.
</t>
</section>

</section>

<section title="Privacy extensions for DNS-SD" anchor="solution" >
<t>
The proposed solution uses the following components:
</t>
<t>
<list style="symbols">
<t>
The host names are randomized to prevent tracking.
</t>
<t>
Nodes provide an Instance Discovery Key to other nodes authorized to discover the service instance.
</t>
<t>
The Instance Discovery Key is combined with a random seed to obfuscate the instance names.
</t>
<t>
Nodes engaged in discovery attempt to de-obfuscate the instance names using the set of Instance Discovery Keys that they know about.
</t>
</list>
</t>
<t>
These components are detailed in the following subsections.
</t>
<section title="Randomized Host Name" >
<t>
Nodes publishing services with DNS-SD and concerned about their privacy MUST
use a randomized host name. The randomized name MUST be changed when
network connectivity changes, to avoid the correlation issues described in
<xref target="timing" />. The randomized host name MUST be used in
the SRV records describing the service instance, and the corresponding 
A or AAAA records MUST be made available through DNS or MDNS, within the
same scope as the PTR, SRV and TXT records used by DNS-SD.
</t>
<t>
If the link-layer address of the network connection is properly obfuscated 
(e.g. using MAC Address Randomization), 
the randomized host name MAY be computed using the algorithm described
in section 3.7 of <xref target="I-D.ietf-dhc-anonymity-profile" />. 
If this is not possible, the randomized host name SHOULD be constructed by
picking a 48-bit random number meeting the 
Randomness Requirements for Security expressed in <xref target="RFC4075" />,
and then using the hexadecimal representation of this number as the
randomized host name.
</t>
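<t>
The fallback construction can be sketched as follows. This is a minimal
example, assuming that the operating system's random generator, reached here
through Python's "secrets" module, meets the randomness requirements cited
above. The function name and the choice of lowercase hexadecimal digits are
illustrative.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import secrets


def random_host_name() -> str:
    """Return a 48-bit random value as 12 hexadecimal digits."""
    # 6 bytes = 48 bits, drawn from the operating system's CSPRNG.
    return secrets.token_hex(6)


if __name__ == "__main__":
    # A new name would be drawn each time network connectivity changes.
    print(random_host_name())
```
]]></artwork>
</figure>
</t>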
</section>

<section title="Instance Discovery Key" >
<t>
The obfuscation and de-obfuscation of instance names is controlled by the Instance Discovery Key. 
Each device publishing a service instance configures an Instance Discovery Key associated with
the service instance.
</t>
<t>
The Instance Discovery Key SHOULD be at least 16 bytes long (128 bits). Its content SHOULD meet the 
Randomness Requirements for Security expressed in <xref target="RFC4075" />.
</t>
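<t>
As an illustration, a key of the recommended size could be generated as shown
below. The function name is hypothetical, and we again assume that Python's
"secrets" module provides randomness of sufficient quality. Rejecting shorter
lengths is a choice made in this sketch; the text above only says SHOULD.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import secrets

MIN_KEY_LENGTH = 16  # bytes, i.e. 128 bits


def new_instance_discovery_key(length: int = MIN_KEY_LENGTH) -> bytes:
    """Generate a random Instance Discovery Key of `length` bytes."""
    if length < MIN_KEY_LENGTH:
        raise ValueError("key should be at least 16 bytes long")
    return secrets.token_bytes(length)


if __name__ == "__main__":
    print(new_instance_discovery_key().hex())
```
]]></artwork>
</figure>
</t>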
</section>

<section title="Composing Obfuscated Instance Names" anchor="obfuscation" >
<t>
The obfuscated instance name is composed of two components,
a seed and a hash, encoded in BASE64 (<xref target="RFC2045" /> section 6.8)
and separated by a dot:
</t>
<t>
<figure>
<artwork>
   instance_name = &lt;base64_seed&gt; "." &lt;base64_hash&gt;
</artwork>
</figure>
</t>
<t>
The seed is derived algorithmically from the randomized host name. 
If the randomized name changes, new instance names SHOULD be computed
and the corresponding records SHOULD be published 
in order to meet the requirement defined in <xref target="timing" />.
</t>
<t>
The complete instance name MUST be generated using the following process:
</t>
<t>
<figure>
<artwork>
   long_seed = HASH(randomized_host_name)
   seed = first 12 bytes of long_seed
   long_hash = HASH(seed | instance_discovery_key )
   instance_hash = first 12 bytes of long_hash 
   instance_name = BASE64(seed) "." BASE64(instance_hash) 
</artwork>
</figure>
</t>
<t>
In this formula, HASH SHOULD be the function SHA256 
defined in <xref target="RFC4055"/>, unless otherwise specified. 
Implementers MAY eventually replace SHA256 with a stronger algorithm.
</t>
<t>
The algorithm produces a 12-byte seed and a 12-byte hash, each of which is encoded as 16 BASE64 characters.
The resulting instance name is 33 characters long, which fits
within the 63-character limit defined in
<xref target="RFC6763"/>.
</t>
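<t>
The process above can be sketched in a few lines of Python. Two points are
assumptions of this sketch rather than statements of the specification: the
randomized host name is hashed as its UTF-8 encoding, and the "|" operator is
read as byte-string concatenation. The function name is illustrative.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import base64
import hashlib


def obfuscated_instance_name(randomized_host_name: str,
                             instance_discovery_key: bytes) -> str:
    """Compose BASE64(seed) "." BASE64(instance_hash) using SHA-256."""
    # Assumption: the host name is hashed as UTF-8 bytes.
    long_seed = hashlib.sha256(randomized_host_name.encode("utf-8")).digest()
    seed = long_seed[:12]
    # Assumption: "|" means concatenation of seed and key.
    long_hash = hashlib.sha256(seed + instance_discovery_key).digest()
    instance_hash = long_hash[:12]
    # 12 bytes encode to exactly 16 BASE64 characters, with no padding.
    return (base64.b64encode(seed).decode("ascii") + "."
            + base64.b64encode(instance_hash).decode("ascii"))


if __name__ == "__main__":
    key = bytes(range(16))  # placeholder key for the example only
    print(obfuscated_instance_name("3f9a1c0b7e2d", key))
```
]]></artwork>
</figure>
</t>
<t>
Since both inputs to the second hash depend on the randomized host name or the
key, picking a new host name yields a new instance name for the same key.
</t>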
</section>

<section title="De-Obfuscation of Instance Names" >
<t>
De-obfuscation of instance names assumes that authorized nodes are provisioned with 
three elements for each discoverable instance:
</t>
<t>
<list style="symbols">
<t>
the de-obfuscated instance name,
</t>
<t>
a copy of the instance_discovery_key,
</t>
<t>
optionally, the identifier of the HASH function used by the publisher.
</t>
</list>
</t>
<t>
A given node may be provisioned to discover many instances. For example,
Alice's phone may know about Alice's laptop and Alice's desktop. It might
also know of Bob's laptop, if Alice and Bob have agreed to share such 
information.
</t>
<t>
To de-obfuscate the instance names, nodes performing discovery should 
obtain the list of PTR records published for the service and domain being 
searched and then do the following:
</t>
<t>
<list style="symbols">
<t>
Test whether the instance name contains the base64 encoding of a
seed and hash as defined in <xref target="obfuscation" />. If it
is not in that form, the name is not considered obfuscated.
</t>
<t>
Retrieve the binary seed and hash from the base64 encoding.
</t>
<t>
For each known instance discovery key, compute the
hash of the seed and key, and compare it to the published
hash.
</t>
<t>
If there is a match, the de-obfuscated name of the instance
is the de-obfuscated name associated with the matching
instance discovery key.
</t>
</list>
</t>
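<t>
The steps above can be sketched as follows. As in the composition process,
this example assumes SHA-256, truncation to 12 bytes, and concatenation of seed
and key. The function name and the dictionary mapping each known key to its
de-obfuscated name are illustrative choices, not part of the specification.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import base64
import binascii
import hashlib
import hmac
from typing import Dict, Optional


def deobfuscate(instance_name: str,
                known_keys: Dict[bytes, str]) -> Optional[str]:
    """Return the friendly name if a known key matches, else None."""
    # Step 1: check for the "16 chars . 16 chars" form.
    parts = instance_name.split(".")
    if len(parts) != 2 or len(parts[0]) != 16 or len(parts[1]) != 16:
        return None  # not considered obfuscated
    # Step 2: recover the binary seed and hash.
    try:
        seed = base64.b64decode(parts[0], validate=True)
        published = base64.b64decode(parts[1], validate=True)
    except (binascii.Error, ValueError):
        return None
    # Steps 3 and 4: try each known key and compare the hashes.
    for key, friendly_name in known_keys.items():
        candidate = hashlib.sha256(seed + key).digest()[:12]
        if hmac.compare_digest(candidate, published):
            return friendly_name
    return None


if __name__ == "__main__":
    print(deobfuscate("Image store for Carol", {}))  # prints None
```
]]></artwork>
</figure>
</t>
<t>
A name that is not in the obfuscated form, or that matches none of the known
keys, yields no result in this sketch, and the client can decide whether to
display or hide it.
</t>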

</section>

</section>

<section title="Security Considerations">
<t> 
This document specifies a method to protect the privacy of 
service publishing nodes. This is especially useful when operating
in a public space.
Obfuscating the identity of the publishing nodes prevents
some forms of "targeting" of high value nodes.
</t>
<t>
Obfuscating the identity of the publishing nodes does
not provide any form of access control. It will not prevent
attackers from trying to access the services.
</t>
<t>
The cost of the de-obfuscation algorithm scales as the product
of the number of authorized publishers known by the client
and the number of obfuscated services published in the
searched name domain. Attackers could potentially publish
a large number of bogus instances of a service, forcing a
high computation cost on discovery clients. While this
potential denial of service attack is concerning, we note
that this is merely an aggravation of flooding attacks
against DNS-SD. 
</t>
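<t>
The scaling can be illustrated with a short calculation. This sketch assumes
the worst case, in which no published name matches any known key, so that
every key is tried against every obfuscated name. The function name and the
figures in the example are hypothetical.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
def worst_case_hash_count(known_keys: int, obfuscated_instances: int) -> int:
    """Number of hash computations when every key is tried on every name."""
    return known_keys * obfuscated_instances


if __name__ == "__main__":
    # A client holding 10 keys that receives 10,000 bogus instances.
    print(worst_case_hash_count(10, 10_000))
```
]]></artwork>
</figure>
</t>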
</section>

<section title="IANA Considerations" anchor="iana">
<t> 
This draft does not require any IANA action.
</t> 
</section>

<section title="Acknowledgments">
    <t>
This draft results from initial discussions with Dave Thaler.
    </t>
</section>
</middle>

<back>
<references title="Normative References">
       &rfc2045;
       &rfc2119;
       &rfc4055;
       &rfc4075;
       &rfc6763;
</references>
<references title="Informative References">
       &rfc1033;
       &rfc1034;
       &rfc1035;
       &rfc2782;
       &rfc6762;
       &rfc7626;
       &I-D.ietf-intarea-hostname-practice;
       &I-D.ietf-dprive-dns-over-tls;
       &I-D.ietf-dprive-dnsodtls;
       &I-D.ietf-dhc-anonymity-profile;
</references>  

</back>
</rfc>
