<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
     which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
     There has to be one entity for each item to be referenced. 
     An alternate method (rfc include) is described in the references. -->

<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2181 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2181.xml">
<!ENTITY RFC2629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
     please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
     (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
     (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="info" docName="draft-weaver-dnsext-comprehensive-resolver-00" ipr="full3978">
  <!-- category values: std, bcp, info, exp, and historic
     ipr values: full3667, noModification3667, noDerivatives3667
     you can add the attributes updates="NNNN" and obsoletes="NNNN" 
     they will automatically be output with "(if approved)" -->

  <!-- ***** FRONT MATTER ***** -->

  <front>
    <!-- The abbreviated title is used in the page header - it is only necessary if the 
         full title is longer than 39 characters -->

    <title abbrev="Comprehensive DNS Resolver Defenses">Comprehensive DNS Resolver Defenses Against Cache Poisoning</title>

    <!-- add 'role="editor"' below for the editors if appropriate -->

    <author fullname="Nicholas Weaver" initials="N." 
            surname="Weaver">
      <organization>International Computer Science Institute</organization>

      <address>
        <postal>
          <street>1947 Center Street, Suite 600</street>

          <!-- Reorder these if your country does things differently -->

          <city>Berkeley</city>

          <region>CA</region>

          <code>94704</code>

          <country>USA</country>
        </postal>

        <phone>+1 510 666 2903</phone>

        <email>nweaver@icsi.berkeley.edu</email>

        <!-- uri and facsimile elements may also be added -->
      </address>
    </author>

    <date year="2008" />

    <!-- If the month and year are both specified and are the current ones, xml2rfc will fill 
         in the current day for you. If only the current year is specified, xml2rfc will fill 
	 in the current day and month for you. If the year is not the current one, it is 
	 necessary to specify at least a month (xml2rfc assumes day="1" if not specified for the 
	 purpose of calculating the expiry date).  With drafts it is normally sufficient to 
	 specify just the year. -->

    <!-- Meta-data Declarations -->

    <area>General</area>

    <workgroup>DNS Extensions Working Group</workgroup>

    <keyword>DNS</keyword>

    <!-- Keywords will be incorporated into HTML output
         files in a meta tag but they have no effect on text or nroff
         output. If you submit your draft to the RFC Editor, the
         keywords will be used for the search engine. -->

    <abstract>
<t>
DNS resolvers are vulnerable to many attacks on their network
communication, ranging from blind attacks to a full man-in-the-middle.
Although a full man-in-the-middle can only be countered with
cryptography, there are many layers of defenses which apply to less
powerful attackers.  Of particular interest are defenses which only
require changing the DNS resolvers, not the authoritative servers or
the DNS protocols.  This document begins with a taxonomy of attacker
capabilities and desires, and then discusses defenses against classes
of attackers, including detecting non-disruptive attacks, entropy
budgeting, detecting entropy stripping, semantics of duplication, and
cache policies to eliminate "race-until-win" conditions.  Proposed
defenses were evaluated with traces of network behavior.
</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>DNS resolvers are susceptible to many attacks on their network
      traffic, ranging from an attacker performing blind packet
      injection to a full man-in-the-middle, capable of controlling
      all traffic the resolver receives.</t>

      <t>With the recent discovery of the <xref target="kaminski">Kaminsky attack</xref>, new
      attention has been focused on securing DNS against adversaries.
      This document focuses on a subset of the problem: securing DNS
      resolvers without changing the DNS authoritative servers or
      protocols, including authorities that do not actively follow the
      DNS specification.</t>

      <t>This document begins with a taxonomy of attacker properties
      (<xref target="taxonomy"></xref>), 
      observations on race-until-win blind attacks (<xref
      target="race"></xref>), 
      the limitations of blind
      transaction attacks (<xref target="blindtransaction"></xref>),
      the evaluation strategy used
      to study possible defenses (<xref target="eval"></xref>),
      directly detecting non-disruptive attacks
      (<xref target="non-disruptive"></xref>), entropy budgeting (<xref
      target="budget"></xref>), 
      detecting entropy stripping (<xref
      target="entropy_stripping"></xref>), 
      the effects of duplication on protecting the transaction and
      cache while maintaining compatibility (<xref
      target="duplication"></xref>), and finally, the effects of cache
      policies which resist race-until-win attacks (<xref
      target="policy"></xref>).</t>

        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.</t>
    </section>

    <section anchor="taxonomy" title="A Taxonomy of Attacks">
     <t>Not all attacker capabilities are equal, and different
     defenses are able to block some classes of attackers.  By forming
     a taxonomy, one can describe the classes of attackers which each
     defense can detect or mitigate.</t>

     <section anchor="passivevactive" title="Passive or Active Attacks">
     <t>A passive attacker must wait for the targeted resolver to make
     requests, while an active attacker is able to trigger specific requests by
     the resolver.  In general, there are numerous mechanisms which an
     attacker can use to trigger requests.  One must assume that,
     should the attacker desire it, the attacker can be active.</t>
     </section>

     <section anchor="blindvaware" title="Blind or Aware Attacks">
     <t>A blind attacker does not know any of the entropy sources
     (e.g., the transaction ID, port selection, or capitalization) of the
     request.  An aware attacker knows this information because he can
     directly observe the request of the resolver.  Increasing the
     entropy of a request increases the difficulty for blind
     attackers, but has no effect on aware attackers.</t> </section>

    <section anchor="non-disruptivevdisruptive" title="Non-Disruptive or
     Disruptive Attacks"> <t>A non-disruptive attacker is unable to block
     either the resolver's request or the authority's response, while
     a disruptive attacker can halt either the legitimate request or the
     legitimate response.  Disruptions can take the form of a DoS attack
     (if the attacker is not on the packet path), exploitation of
     a failure of the authoritative DNS server, physical-layer
     mechanisms that squelch a sender, or dropped packets if
     the attacker is a man-in-the-middle.</t>
     </section>

    <section anchor="transactionvcache" title="Transaction or Cache Attacks">
     <t>A transaction attack targets the individual transaction
     requested by the end-client: the attacker needs to get his
     response immediately accepted by the victim application, while a
     cache attack requires that the resolver cache the attacker's
     result for future use.</t>

     <t>Transaction attacks are often less powerful than cache
     attacks: A transaction attack targets the single victim
     request, while cache attacks can have lasting effects until the
     TTL expires.  For blind attacks, as discussed in <xref
     target="blindtransaction"></xref>, blind transaction attacks are
     strictly less powerful than blind cache attacks.</t>
     </section>


     <section anchor="directvancillary" title="Direct Mapping or Ancillary Data Attacks">
     <t>A direct attack targets the immediate answer to the question
     asked by the resolver.  An ancillary data attack requires that
     the resolver accept some data, be it an NS RRSet, a CNAME, an A
     record, or other result, which is not the direct answer to the
     question.  A transaction attack must target the direct mapping,
     while cache attacks can target ancillary data.</t>
     </section>
    </section>



    <section anchor="race" title="Race-Until-Win Blind Attacks">
      <t>The Kaminsky attack is a blind cache attack targeting
      ancillary data.  The additional power of the Kaminsky attack
      is not a reduction in the number of packets required for a
      successful blind attack, but a reduction in time.  Rather than
      running one race-condition attack every TTL, the Kaminsky
      attack and its variations rely on ancillary information to
      achieve a "race-until-win" property: when the attack fails, it
      can be retried immediately instead of waiting for
      the TTL to expire.</t>

      <t>
      If a blind attacker targets an actual record, that race can only
      be run once every TTL, as subsequent requests will simply hit in
      the cache.  Thus, creating any race-until-win attack on a single
      name requires targeting ancillary data.
      </t>
    </section>

     <section anchor="blindtransaction" title="Requirements for Blind Transaction Attacks">

     <t>A blind transaction attack is strictly harder than a blind
     cache attack.  In a blind cache attack, the attacker only needs
     to coerce the victim into generating DNS requests.  A blind
     transaction attack not only requires coercing the victim into
     generating a DNS request, but also requires that the victim
     act upon the result of that request.</t>

     <t>Blind transaction attacks are also less powerful than blind
     cache attacks.  To target a name with a race-until-win
     attack, the attacker must not only get the victim to
     generate a DNS request and act on the result, but the request
     must be valid for any arbitrary subname within the target domain.
     If the blind transaction attack
     targets a single name, it can never gain a
     race-until-win property, as it must target the direct mapping
     rather than ancillary data.</t>

     <t>However, there does exist at least one
     significant blind transaction attack which can be conducted with
     a "race-until-win" property: targeting cookies contained in a web
     browser.
     In this attack, the attacker coerces the victim into visiting
     a site of the attacker's choosing.  This site opens an iframe
     pointing to 1.www.target.com, which the attacker attempts to
     poison with a blind attack.  If it fails, JavaScript on the
     site creates a second iframe, 2.www.target.com.  This process
     iterates until success, whereupon the victim's browser contacts
     the attacker's web site, presenting all relevant cookies.  Since
     each attempt uses a different name, the attacker can try
     continuously until successful.
     </t>

     <t>Although such cookie stealing is noteworthy, any site which
     allows cookies to be stolen in this manner is also trivially
     vulnerable to many attack tools such as <xref
     target="perry">Cookie Monster</xref>, and any web service which
     resists such tools also resists this attack.  It is unclear
     whether the DNS infrastructure should be concerned with this
     particular attack.</t>

     <t>It is also unclear what other blind transaction attacks are
     possible with this "race-until-win" property, but any such attack
     relies on the end application trusting arbitrary subnames and
     subdomains for attacker-triggered requests.  Without this
     property, the attacker cannot perform a "race-until-win" blind
     transaction attack.  Thus blind transaction attacks, although
     they need to be considered, are a far narrower threat than blind
     cache attacks.</t>
     </section>



    <section anchor="eval" title="General Evaluation Strategy">
    <t>We evaluate multiple defenses in this document using
    network traces analyzed by "Bro", a network analysis environment
    primarily used for intrusion detection.  Bro
    includes a detailed analyzer for DNS behavior, coupled to a
    scriptable analysis language.</t>

    <t>The primary evaluation was conducted on traces of the ICSI
    border, capturing 18,329,249 UDP external DNS requests generated
    by ICSI's resolvers over a period of 19 days.  This initial
    evaluation ignored PTR lookups.</t>
    </section>


    <section anchor="non-disruptive" title="Directly Detecting Non-Disruptive Attacks">
       <t>Any non-disruptive attack will, with high probability, cause the
       resolver to receive two separate and valid responses: one from the
       attacker and one from the authoritative server.  This will
       occur in blind attacks where the attacker does not suppress
       the correct operation of the authoritative
       server, such as by flooding it to achieve a denial-of-service.
       It will also occur if an
       attacker is acting as a packet injector on a broadcast medium
       and is not able to squelch the sender.</t>

       <t>The presence of a changed duplicate response, where a second
       response arriving within a short period differs from the first,
       can thus be used to directly detect that a non-disruptive
       attack has occurred on a transaction, and to mitigate any
       poisoning of the cache.</t>

       <t>The algorithm is simple: the resolver maintains a list of
       all answered responses for a timeout which should be on the
       order of one or two seconds.  If a second response for a
       request arrives whose transaction and other entropy matches the
       accepted response but whose value is different, this should be
       treated as an attack, and all cache entries set or dependent on
       the first result should be voided, mitigating the attack's
       effect on the cache.  Thus a transaction attack is detected but
       not mitigated (as the final victim may have acted on the
       result), while a cache attack is both detected and mitigated.</t>
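
       <t>A minimal sketch of this algorithm follows; the names and
       data layout are illustrative and not drawn from any deployed
       resolver.</t>
       <figure><artwork>
```python
import time

class DuplicateResponseDetector:
    """Retain accepted answers briefly; flag a later, differing
    response with matching entropy as a non-disruptive attack."""

    def __init__(self, timeout=2.0):
        self.timeout = timeout   # retention window, in seconds
        self.accepted = {}       # (txid, port, qname) -> (answer, when)

    def _expire(self, now):
        dead = [k for k, (_, t) in self.accepted.items()
                if now - t > self.timeout]
        for k in dead:
            del self.accepted[k]

    def on_response(self, txid, port, qname, answer, now=None):
        """Return 'accept', 'duplicate', or 'attack'."""
        now = time.monotonic() if now is None else now
        self._expire(now)
        key = (txid, port, qname.lower())
        if key not in self.accepted:
            self.accepted[key] = (answer, now)
            return "accept"
        prev, _ = self.accepted[key]
        if prev == answer:
            return "duplicate"   # benign retransmission
        # Entropy matched but the value changed: treat as an attack
        # and void cache entries derived from the first answer.
        return "attack"
```
       </artwork></figure>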

       <t>Furthermore, if the resolver is a recursive resolver, it
       should also forward the second response to the originator with
       the TTL changed to 0 seconds, which allows the initial
       questioner, if running the same algorithm, to also know that
       the result was compromised, enabling the end system to detect
       the attack on the transaction and mitigate any effects on a local cache.</t>

       <t>We developed a Bro analysis script
       to detect this condition and evaluate the false
       positive rate.  We observed only one such site at ICSI, a realtime
       DNS blackhole service.  A run at Ohio State found a few
       other sites, which we have not yet assessed.</t>

       <t>As importantly, if resolvers implemented this approach, they
       would fail to cache only a handful of authorities which present
       this anomaly on normal results.  Until the stub resolvers
       are updated to treat this condition as an attack, these sites
       would still resolve successfully but would not be cached.</t>

       <t>The state-holding requirements, although nontrivial, are also
       reasonable.  Consider a recursive resolver (or a cluster node in a
       clustered resolver implementation) that generates 10,000
       outstanding external queries per second.  If the timeout is 1
       second, and the state-holding required involves 1 KB of data,
       the additional memory requirements would only be 10 MB of data
       for such a resolver.  Even with a longer timeout or a more
       active resolver, the state-holding requirements should be
       reasonable.</t>

       <t>Furthermore, although the replies themselves need to be
       maintained, for the most part this information is already
       retained in the cache.  For any data element that is
       cached, only a pointer to that element needs to be stored to
       perform this consistency check, rather than the full answer.</t>

       <t>This should also not affect any resolver composed of a
       cluster of systems as long as the load balancer is
       deterministic: always sending the second response back to the
       cluster node which received the first response.  This enables
       each cluster node to maintain its own list of replies to check
       for changed responses.</t>

       <t>This policy would have no effect on the latency of
       resolvers.  Until end-user stub resolvers and applications are
       updated to treat this as an attack, even sites with anomalous
       DNS authorities will still resolve properly, but may experience
       slightly higher load due to lack of cacheability.</t>


    </section>


    <section anchor="budget" title="Entropy Budgeting">
    <t>An important question is "how much entropy is necessary" to
    prevent blind attacks.  It's obvious that 16 bits (16b)
    of entropy in a query
    is insufficient, both to protect transactions and to protect the
    cache.  32b of entropy may in practice currently suffice to mitigate
    most attacks today, as attackers will prefer victims that do not
    implement proper randomization.</t>
 
    <t>Thus the question becomes: assuming all resolvers implement
    increased-entropy defenses, how much query entropy is sufficient
    to protect the cache?  We argue that 40b of entropy is more than
    sufficient.  As the probability of attack success with K tries and
    N bits of entropy is 1-(1-1/2^N)^K, 40b of entropy requires nearly
    a terapacket of attack traffic to succeed with reasonable
    probability.</t>
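
    <t>The formula can be checked numerically; the figures below are a
    direct evaluation of the expression above, not new
    measurements.</t>
    <figure><artwork>
```python
def success_probability(n_bits, k_packets):
    """Probability that a blind attack succeeds after k_packets tries
    against n_bits of query entropy: 1-(1-1/2^N)^K."""
    return 1.0 - (1.0 - 1.0 / 2.0**n_bits) ** k_packets

# 16b of entropy: half a million forged packets all but guarantee success.
p16 = success_probability(16, 500_000)
# 40b of entropy: even a billion packets give well under a 0.1% chance.
p40 = success_probability(40, 1_000_000_000)
```
    </artwork></figure>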

    <t>40b of entropy is easily obtainable for resolvers not behind
    entropy-stripping devices such as NATs
    (<xref target="entropy_stripping"></xref>).  With 16b of entropy
    for the transaction ID, over 15b of entropy from source-port
    randomization, and names of at least 9 characters using 0x20
    randomization (observing that most DNS authoritative servers
    preserve capitalization, so random capitalization can provide a
    nonce value) <xref target="0x20"></xref>, a resolver achieves the
    necessary 40b threshold without resorting to duplication in most
    cases (<xref target="duplication"></xref>).</t>
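
    <t>As a rough illustration of this accounting (the port-pool size
    and the one-bit-per-letter contribution are assumptions for the
    sketch, not measurements):</t>
    <figure><artwork>
```python
import math

def query_entropy_bits(qname, port_pool=64512):
    """Estimate per-query entropy: 16b of transaction ID,
    log2(port_pool) bits of source-port randomization (ports
    1024-65535 assumed), and one bit per letter of the name
    under 0x20 randomization."""
    txid_bits = 16.0
    port_bits = math.log2(port_pool)   # roughly 15.98b
    letter_bits = sum(1 for c in qname if c.isalpha())
    return txid_bits + port_bits + letter_bits

# A 9-letter name already clears the 40b threshold:
# 16 + 15.98 + 9 is about 40.98 bits.
```
    </artwork></figure>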

    <t>It is possible for a resolver to use a simple count of
    failures (responses which match outstanding requests and appear to
    come from the proper authority but have incorrect entropy values)
    to detect that it is under attack, and to respond with
    duplication (<xref target="duplication"></xref>) while the attack is
    ongoing, well before an attacker could affect the cache
    given large entropy.  In our observations of the ICSI traces,
    some authoritative servers produce such responses naturally, but
    the rate is low.</t>

    <t>If the resolver responds to 1k packets per second of attack
    traffic with duplication, and the entropy budget is 40b, an
    attacker attempting to stay below the threshold by sending 990
    forged packets per second would need over 2 hours to have even a
    .001% chance of success, or over 80 days of continual attack to
    have just a 1% chance.  Since duplication only needs to be in
    place while under attack, requiring the attacker to send 1k
    packets per second to trigger it should not prove an effective
    DoS.</t>
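
    <t>These durations follow directly from inverting the
    success-probability formula above; the sketch below simply
    re-derives them.</t>
    <figure><artwork>
```python
import math

def seconds_to_probability(p, n_bits=40, pps=990):
    """Seconds of continuous attack at pps forged packets per second
    before the success probability reaches p, given n_bits of entropy.
    Inverts p = 1-(1-1/2^N)^K for K, then divides by the rate."""
    k = math.log(1.0 - p) / math.log(1.0 - 1.0 / 2.0**n_bits)
    return k / pps

hours_to_1e_5 = seconds_to_probability(1e-5) / 3600    # about 3.1 hours
days_to_1pct = seconds_to_probability(0.01) / 86400    # about 129 days
```
    </artwork></figure>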

    <t>However, the author initially advocated clearing the cache at a
    threshold of attack.  This was an error: voiding the cache
    provides no benefit, since the attacker knows when his attack has
    succeeded and could simply accept the voided cache and keep trying
    until successful.</t>

    </section>

    <section anchor="entropy_stripping" title="Entropy Stripping">

      <t> Many network devices, such as NATs, firewalls, and mandatory
       proxies, can exist between a resolver and the remote authority it
       is querying.  Some of these devices will rewrite important
       information, such as IP addresses, UDP port numbers, or even
       DNS transaction IDs, fields that the resolver relies on as
       sources of query entropy.  These devices are likely to be
       located near one or the other endpoint system.
      </t>

      <t>If a DNS resolver is attempting to increase its request
      entropy by using one or more of these sources, the resolver must
      know if these entropy sources are being stripped from its
      network communication.
      </t>

      <t>The simple solution is to query a specialized authority which
      returns the entropy value it receives as an A record.  An
      example of such an authority has been temporarily operating at
      {port,id,server}.nettest.icir.org, where the transaction ID and
      port are returned as the least significant two bytes of the
      address, while server returns the IP address of the contacting
      DNS resolver.  An alternate version, entropy.nettest.icir.org,
      returns a CNAME to a human readable form of all three values,
      while respecting the incoming capitalization to verify 0x20
      operation.
      </t>
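
      <t>Decoding such a response is straightforward; the sketch below
      assumes the low-two-bytes convention described above.</t>
      <figure><artwork>
```python
def decode_echoed_value(a_record):
    """Extract the value echoed in the low 16 bits of a dotted-quad
    A record, per the convention described above."""
    octets = [int(o) for o in a_record.split(".")]
    return octets[2] * 256 + octets[3]

def entropy_is_stripped(sent_value, a_record):
    """True if the authority saw a different port or transaction ID
    than we sent, indicating a rewriting device on the path."""
    return decode_echoed_value(a_record) != sent_value
```
      </artwork></figure>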

      <t>
      When a DNS resolver starts, and when a resolver notices
      that its IP address has changed, it should query such an
      authority, provided by either the resolver's software developer
      or a third party, to determine if it is located behind a NAT or
      other entropy-stripping device.  If the resolver is behind a
      NAT, it must then use duplication (<xref target="duplication">
      </xref>) to protect the cache if the remaining entropy is not
      sufficient to meet the security goals (<xref
      target="budget"></xref>).
      </t>

      <t>
      One case where this does not provide protection is when the
      authority and the attacker, rather than the resolver, are behind
      an entropy-stripping network device.  In such a case, an
      attacker capable of forging packets from within the authority's
      network is likely able to perform activities far more damaging
      than a blind attack on DNS queries to that authority.  Also, the
      entropy stripping affects only queries to this authority, not
      all queries performed by the resolver.  This also assumes that
      any entropy-stripping device is not malicious and would
      therefore not benefit from actively whitelisting the test.
      </t>
    </section>




    <section anchor="duplication" title="On Duplication for Entropy Increase">

    <t>If 40b of entropy are available on the request and the resolver
    is not under a significant attack, duplication is not necessary.
    Thus duplication should be viewed as a fallback
    position for resolvers which are behind a NAT or other
    entropy-stripping device, are accessing authoritative servers
    which do not respond to 0x20 (per Dagon et al.'s
    proposal <xref target="0x20"></xref>), or are known to be under an
    active blind attack.</t>

    <t>Most authorities respond deterministically to repeated queries:
    if two requests for the same name are received in a short period
    of time, the returned values will be identical.  Issuing two
    identical requests with different entropy values therefore nearly
    doubles the entropy if both responses must be identical before
    acceptance.  Specifically, if each request has K bits of entropy,
    a pair of requests accepted only when the responses are identical
    provides 2K-1 bits of entropy.</t>
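
    <t>One reading of this arithmetic (the ordering argument is our
    interpretation of the claim, not spelled out elsewhere): a forged
    pair must supply both entropy values, and either assignment of the
    two guesses to the two outstanding requests succeeds, so the
    per-pair success probability is 2/2^(2K), i.e. 2K-1 effective
    bits.</t>
    <figure><artwork>
```python
def effective_bits(k_bits, duplicated):
    """Effective entropy in bits: K for a single query, 2K-1 for a
    duplicated pair accepted only on identical responses."""
    return 2 * k_bits - 1 if duplicated else k_bits

# e.g. 32b per query becomes 63b with duplication: the per-attempt
# success probability falls from 2^-32 to 2^-63.
```
    </artwork></figure>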

    <t>Dagon et al.'s 0x20
    proposal <xref target="0x20"></xref> uses this to increase query
    entropy when capitalization is not preserved by an authority, as
    without 0x20 the available entropy is only around 32b per query.
    Similarly, duplication can be used in any case where there is
    insufficient entropy, such as the impact of an entropy-stripping
    network device, or if the resolver knows it is currently under a
    blind attack.
    </t>

    <t>Unfortunately, not all authorities are deterministic in this
    manner, including some critical authoritative servers belonging to
    Content Distribution Networks such as
    Akamai <xref target="akamai"></xref> and others that use frequently
    changing DNS responses for load-balancing.
    If two distinct responses are received, and the
    resolver randomly selects one, this reduces the entropy
    of the request by 1 bit compared with no duplication.  Thus, for
    purposes of the cache, it is clear that a resolver must not cache
    a response when the two responses differ.
    </t>

    <t>However, it appears reasonable to return one of the values as
    the answer to the transaction, as blind transaction attacks are
    less powerful than blind cache attacks
    (<xref target="blindtransaction"></xref>).  For a blind
    transaction attack to work, the attacker must target a domain
    served by such a volatile authority and also coerce the victim
    into acting on the result.  Although this does leave a small
    window of vulnerability open, it is probably preferable to the
    alternative of not resolving these names.</t>

    <t>Setting the TTL to 0 and
    returning a randomly selected response enables compatibility
    with nondeterministic authorities without compromising the
    integrity of either the resolver's cache or the final client's
    cache.  Since it reduces the transactional entropy by only
    1b, it does not make transactions significantly less secure than
    they would be without duplication.</t>

    <t>The alternate approach, iterating until convergence as proposed
    by <xref target="barwood">Barwood</xref>, will succeed with very
    high probability if the remote authority returns single answers,
    but fails when the authority returns randomly chosen RRSets
    containing more than one element, if a complete match on the RRSet
    is required.  If a "return one of N of the RRSets" policy is
    employed, it works well.</t>

    </section>

    <section anchor="policy" title="Cache Policy, Scoping, and Ancillary Data Attacks">
    
      <t><xref target="RFC2181">RFC 2181 section 5.4.1</xref> is the
    current specification for how to accept ancillary data.  It
    allows ancillary data, if "in bailiwick" (from a server which
    should be an authority for the name), to be cached for subsequent
    use, although it should not be returned as an answer.  It is this
    caching of ancillary data that enables the "race-until-win"
    Kaminsky blind cache attack.  Likewise, before bailiwick checking
    was deployed, ancillary data was used for classic glue
    poisoning.</t>

    <t>The ancillary data comprises all the data beyond the direct
    answer to the query (including the NS RRSet, A records associated
    with the NS RRSet, A records associated with a CNAME for the
    direct answer, or any other data).  This data serves four
    purposes:</t>
    
    <t><list style="symbols">
    
    <t>The NS RRSet and associated A records needed to resolve the
    current request.</t>

    <t>Items, such as an A record for a CNAME alias, which if accepted
    will speed the current request's processing by removing the need
    to fetch additional records.</t>

    <t>Items which if placed in the cache will speed subsequent lookups.</t>

    <t>An indication that an item in the cache is now obsolete.</t>
    </list></t>

    <t>This suggests that items have a different role in two scopes,
    analogous to how programming languages view scope: the local scope
    of the current recursive query and the global scope of the cache.
    Within the local scope, for purposes of resolving the direct
    question, ancillary data often must be trusted: otherwise
    authoritative nameservers may not be reachable when they exist
    within the domain being queried, or in other cases where domains
    host each other's nameservers.  Yet for the purposes of resolving a
    query, if an authority lies about the ancillary data, it could
    just as easily lie about the direct answer, making this data
    no less trustworthy for processing this answer.</t>

    <t>Yet for purposes of inclusion into the global scope, or for
    returning as the response to a query, ancillary data must not be
    trusted.  It is, by definition, unsolicited information rather
    than an authoritative response, and it lies at the heart of the
    Kaminsky attack.  If a recursive resolver never accepts ancillary
    data into the cache, it becomes impossible to target a single name
    with a race-until-win blind attack.</t>

    <t>However, a resolver may safely perform an independent fetch for
    any piece of ancillary data.  This second fetch must not reuse
    the local scope of the previous fetch, but must instead be
    performed in a new local context.  This enables both the use of
    ancillary data in responses (such as A records involved in
    CNAMEs) and the speeding of subsequent responses (such as
    obtaining the NS RRSet for subsequent lookups).</t>

    <t>As an example, suppose the query for 'www.example.com' returns
    'www.example.com CNAME server.example.com, example.com NS
    ns.example.com, ns.example.com A 12.34.56.78, server.example.com A
    12.34.56.90'.  The resolver can successfully cache and return
    'www.example.com CNAME server.example.com', but must perform
    independent queries to fetch the NS RRSet and the A records for
    ns.example.com and server.example.com, and it cannot return
    'www.example.com CNAME server.example.com, server.example.com A
    12.34.56.90' to the final client until the lookup for
    server.example.com completes.</t>
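
    <t>The separation above might be sketched as follows; the resolver
    internals are hypothetical, and 'refetch' stands in for issuing a
    fresh, independent query in a new local context.</t>
    <figure><artwork>
```python
def process_response(question, records, cache, refetch):
    """Accept only the direct answer from a response; every ancillary
    record must be re-fetched independently before entering the cache.
    records: list of (name, rtype, value) tuples.
    refetch(name, rtype): issues a new query and returns the validated
    direct answer for that name and type."""
    direct, ancillary = [], []
    for name, rtype, value in records:
        bucket = direct if name.lower() == question.lower() else ancillary
        bucket.append((name, rtype, value))
    for name, rtype, value in direct:
        cache[(name.lower(), rtype)] = value   # direct answers may be cached
    # Ancillary data never enters the cache as received: validate it
    # by querying the proper authority independently.
    for name, rtype, _ in ancillary:
        cache[(name.lower(), rtype)] = refetch(name, rtype)
    return direct
```
    </artwork></figure>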

    <t>As a result, all information which is either placed in the
    global scope or returned to the final client will be validated
    directly by querying the proper authoritative server.  No
    race-until-win attack for inserting a name into the cache is
    possible, as insertion only takes place when the name is absent
    from the cache, limiting any attempt to once every TTL.</t>

    <t>This is more restrictive than the policy
    in <xref target="RFC2181">RFC 2181 section 5.4.1</xref>.  The RFC
    policy allows ancillary data to be accepted into the global scope
    (cache) for purposes of subsequent query processing, which enables
    race-until-win attacks.</t>

    <t>Although the RFC states that the final client should also not
    accept these authority or additional records, it is unclear
    whether stub resolvers follow the RFC.  Thus a well-structured
    recursive resolver should return results which are safe even if
    the client caches additional records in violation of the
    RFC.</t>

    <t>The simplest mechanism for validation is to perform an explicit
    fetch, a policy being implemented
    in <xref target="unbound">Unbound</xref>: For CNAMEs and A records
    that are not the direct answer to the query, such records are not
    accepted directly but are instead fetched independently.  For the
    NS record and associated A records, the separate global and local
    scope is approximated by validating the NS RRSet and all elements
    within it using subsequent separate fetches.</t>

    <t>There is some minor uncertainty whether this is a close-enough
    approximation for the NS RRSet.  There exists a short window
    during which the NS RRSet and A records are in the cache but
    unvalidated, which may provide a narrow opportunity for abuse.
    However, this policy does not require new data structures to
    maintain the local scope of a recursive query, and the abuse
    window is probably sufficiently small.</t>

    <t>Note that if an explicit local scope is maintained, such a
    policy of fetching all ancillary data rather than including it
    subsumes bailiwick checking for accepting data into the cache, as
    no data is included into the cache for any use that was not
    explicitly requested from an authoritative server.</t>


    <section anchor="impact" title="Preliminary Estimates of Performance Impact">

    <t>Such a policy, although protecting from race-until-win
    conditions, does impose a higher load on the DNS infrastructure.
    For queries whose direct answer is not a CNAME, it does not add
    any latency to query processing if the resolver returns an empty
    authoritative NS RRset when the NS RRset has yet to be validated,
    but the number of outstanding queries is significantly
    increased.</t>

    <t>We developed a Bro analysis script to estimate the impact of
    this policy.  The script accepts a trace of DNS requests and
    replies and estimates the number of additional queries that would
    be generated to follow CNAME chains, to cache the NS RRSet and
    associated A records, and to cache any A records not associated
    with either a CNAME chain or the NS RRSet.  It builds a model of
    the resolver's own cache and uses it to estimate the load from
    the additional requests.</t>

    <t>For ICSI's 18,329,249 queries, this simple policy would require
    16% additional fetches for NS records, 10% additional fetches for
    A records associated with the NS RRSets, 0.7% additional fetches
    of A records not associated with NS RRSets, and a trivial number
    of CNAME fetches.  Thus this policy increases the number of
    requests by roughly 27%, a nontrivial but still modest overhead.
    Later on, we discuss some policies that can significantly reduce
    the number of fetches while maintaining safety.</t>


    <t>Few queries would see increased latency if the resolver does
    not return the still-unvalidated NS RRset to the client on the
    first query.  The only case where a query incurs additional
    latency is when the answer is a CNAME and the record the CNAME
    points to is included in the same response.  Only 0.2% of ICSI's
    queries showed this behavior.</t>


    <t>If the resolver does include the NS RRSet in its responses, it
    needs to wait for the RRSet to be validated in case the final
    client caches the names for other purposes.  This would increase
    the latency of the 16% of queries which require contacting an
    external authority to obtain the NS RRSet.</t>

    </section>

    <section anchor="optone" title="Optimization: Only One A Record for the NS RRSet">

    <t>It is obviously excessive to fetch all A records for the NS
    RRSet: unless there is a failure, only one nameserver needs to be
    contacted to answer future queries, although that nameserver
    might be sub-optimal from a latency viewpoint.  If only one A
    record is fetched for the NS RRSet rather than all of them, the
    additional fetches made to provide a valid address for each NS
    RRset drop from 10% to 6%.</t>



    <t>However, this may increase the latency of subsequent responses
    if the chosen authoritative server is not the closest.  This
    latency could be mitigated by performing additional fetches of
    alternate nameservers as requests for a domain continue to be
    made.  One suggestion is to use the "closest name" heuristic:
    select the name which most closely matches the domain.  Thus if
    "example.com NS ns.example.com, ns.example2.com, ns.example3.org"
    needed to be cached, ns.example.com would be fetched first.  If
    this lookup failed, ns.example2.com would be fetched, followed by
    ns.example3.org: selecting from the most-matching to the
    least-matching name.  It is unclear what the latency penalty of
    this heuristic would be.</t>
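    <t>The "closest name" ordering can be sketched as follows.  This
    is an illustrative sketch only; the function names and the exact
    label-matching rule are assumptions, not from the draft.</t>

    <figure><artwork><![CDATA[
```python
# Illustrative sketch of the "closest name" heuristic: order the NS
# names by the number of trailing labels shared with the zone, and try
# their A-record fetches in that order.

def shared_suffix_labels(name, zone):
    a, b = name.lower().split("."), zone.lower().split(".")
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def closest_first(ns_names, zone):
    # sorted() is stable, so ties keep their original order
    return sorted(ns_names, key=lambda n: shared_suffix_labels(n, zone),
                  reverse=True)

# The draft's example: ns.example.com shares two labels with the zone,
# ns.example2.com one, and ns.example3.org none.
order = closest_first(
    ["ns.example3.org", "ns.example2.com", "ns.example.com"],
    "example.com")
```
]]></artwork></figure>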
    </section>

    <section anchor="object_scope" title="Optimization: Object Scope">
    <t>There exists an additional data scope, object-local scope,
    which can act as an optimization.  For example, the A records
    associated with an NS RRSet cannot be accepted directly.  Yet
    since they were returned in association with a specific query for
    the NS RRSet, they can be trusted solely for the purpose of
    evaluating the NS RRset.</t>

    <t>As an example, if the response for a query for the nameservers
    of example.com is 'example.com NS ns.example.com, ns.example.com A
    12.34.56.78', it is acceptable to use the A record solely for the
    purposes of a subsequent lookup of a value in example.com, but not
    for any other purposes.  This, in programming language terms,
    represents object-local scope: a data value that can only be used
    in the context of another data value.</t>

    <t>This arises from a simple observation: if the response is
    corrupted, any value in the response could have been corrupted,
    including the legitimate answer.  Thus there is no risk in using
    the ancillary data in a context where the direct answer is
    trusted.  But ancillary data cannot be trusted outside the
    context where the direct answer is trusted, because that enables
    race-until-win attacks.</t>

    <t>There are two mechanisms where support for object-local scope
    can enhance resolver performance without compromising safety.  The
    first is to collapse CNAMEs and return the result, per Ohta's
    suggestion (cite).  Here, an authority that returns
    "www.example.com CNAME a.example2.com, a.example2.com A
    10.34.56.78" could be treated by the resolver (and returned to the
    final client) as "www.example.com A 10.34.56.78".  This
    eliminates the latency penalty for fetching CNAME chains.</t>

    <t>The second area where object-local scope provides significant
    savings is the NS RRset.  When the NS RRset is requested, the A
    records can't be considered authoritative in general, but MAY be
    considered valid solely for the use of the NS RRset in subsequent
    name lookups.  This completely eliminates the need to look up the
    A records associated with the NS RRSet, while still preventing
    auxiliary data from poisoning the cache.  However, it does require
    changes to cache architectures to support this notion: the NS
    RRset records must include internal A records which are not
    exported to the rest of the cache.</t>
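    <t>Such a cache structure can be sketched as follows.  This is an
    illustrative sketch only: the class and method names are invented,
    and a real cache would be considerably more elaborate.</t>

    <figure><artwork><![CDATA[
```python
# Illustrative cache structure for object-local scope: glue A records
# travel with the NS RRset and are consulted only when that RRset is
# used for a lookup; the global-scope cache never sees them.

class NSEntry:
    def __init__(self, zone, ns_names, glue):
        self.zone = zone
        self.ns_names = ns_names
        self._glue = dict(glue)          # object-local: never exported

    def server_address(self, ns_name):
        """Usable only while resolving names via this NS RRset."""
        return self._glue.get(ns_name)

class Cache:
    def __init__(self):
        self.global_scope = {}           # (name, type) -> value

    def lookup_a(self, name):
        return self.global_scope.get((name, "A"))

entry = NSEntry("example.com", ["ns.example.com"],
                {"ns.example.com": "12.34.56.78"})
cache = Cache()
# The glue is visible inside the NS RRset's own scope...
addr = entry.server_address("ns.example.com")
# ...but a global-scope lookup for the same name misses.
missing = cache.lookup_a("ns.example.com")
```
]]></artwork></figure>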

    <t>The savings are significant: support for object-local scope
    allows the resolver to not fetch the A records for the NS entries,
    reducing the total overhead from 27% to 17% for ICSI's
    traffic.</t>

    </section>

    <section anchor="latency" title="Optimization: Lazily Fetching the NS RRSet">

    <t>There exists a latency/performance tradeoff in fetching the NS
    RRset.  Observe that, for many sites, only a single name is used.
    Thus, fetching (and caching) the NS RRset represents wasted effort.
    Instead of eagerly fetching the NS RRset, fetching the NS
    RRset and any associated A records can be delayed until a second
    query is generated.</t>

    <t>The traffic savings are substantial.  In the ICSI traces,
    instead of requiring 16% additional fetches of the NS RRset, only
    4% additional fetches would have been required, with a similar
    reduction in the number of A records which need to be fetched.
    As a consequence, however, the lookup of the second name in a
    zone is delayed while the NS RRset is fetched.  Yet the penalty
    is fairly modest, as only the second (and not subsequent) fetches
    incur this latency penalty; thus only 4% of the queries would see
    it.</t>
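    <t>The lazy trigger can be sketched in a few lines.  This is an
    illustrative sketch only: the class name, the bookkeeping, and the
    second-query threshold are assumptions for the sketch.</t>

    <figure><artwork><![CDATA[
```python
from collections import defaultdict

# Illustrative sketch of lazy NS-RRset fetching: count the distinct
# names queried per zone and launch the validating NS fetch only when
# a second name in the zone appears.

class LazyNSFetcher:
    def __init__(self):
        self.seen = defaultdict(set)     # zone -> names queried so far
        self.fetched = set()             # zones whose NS fetch was launched

    def on_query(self, zone, name):
        """Return True when the NS RRset fetch should be launched."""
        self.seen[zone].add(name)
        if len(self.seen[zone]) >= 2 and zone not in self.fetched:
            self.fetched.add(zone)
            return True
        return False

f = LazyNSFetcher()
first = f.on_query("example.com", "www.example.com")    # no NS fetch yet
second = f.on_query("example.com", "mail.example.com")  # second name: fetch
third = f.on_query("example.com", "ftp.example.com")    # already fetched
```
]]></artwork></figure>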

    </section>

    <section anchor="change" title="Accepting Requests for Change">

    <t>The final use of ancillary records is to indicate a change
    request: a statement that a previously cached value is no longer
    valid despite still being within the item's TTL.  The use of
    ancillary data to indicate changes is somewhat outside the
    protocol specification, but is often considered essential
    behavior.</t>

    <t>It is clear that a resolver must not overwrite an item in the
    cache with an ancillary item for any reason, as otherwise
    ancillary data can be used for a race-until-win attack using data
    replacement instead of data inclusion.</t>

    <t>It is also clear that a resolver must not void a cache entry
    upon a change request for an ancillary item when the response is
    not in bailiwick.  Otherwise, an attacker who controls an
    arbitrary authority could construct a race-until-win attack by
    alternating between attacking the name and performing an
    unrelated query through the attacker's authority to void the
    name.</t>

    <t>However, a resolver may (and probably should) safely respond to
    an in-bailiwick request for change by voiding the cache entry
    associated with the ancillary item.  A blind attacker cannot use
    this behavior to create a race-until-win condition, as the
    attacker would have to win a race against an arbitrary name in the
    same domain to void the cache entry and then win the next
    subsequent race to set the cache entry to the attacker's desired
    value.  As winning two back-to-back races is exponentially harder
    than winning a single race, the policy is safe from attack while
    still enabling ancillary data to act as a notification of change.
    </t>
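    <t>The two rules above can be sketched together.  This is an
    illustrative sketch only: the function names are invented, and
    the bailiwick test is a simplified suffix check rather than a
    full delegation-aware comparison.</t>

    <figure><artwork><![CDATA[
```python
# Illustrative sketch of the change-request rules: ancillary data never
# overwrites a cache entry, and an entry is voided only when the
# response carrying the change is in bailiwick for that name.

def in_bailiwick(name, zone):
    # Simplified: a name is in bailiwick if it is the zone or below it.
    return name == zone or name.endswith("." + zone)

def handle_ancillary_change(cache, rr_name, rr_type, response_zone):
    key = (rr_name, rr_type)
    if key in cache and in_bailiwick(rr_name, response_zone):
        del cache[key]       # void only: the next query re-fetches
        return "voided"      # the value authoritatively
    return "ignored"         # out of bailiwick, or nothing cached

cache = {("www.example.com", "A"): "12.34.56.90"}
# An out-of-bailiwick authority cannot void an example.com name...
out = handle_ancillary_change(cache, "www.example.com", "A", "example.org")
# ...but an in-bailiwick change request voids the entry.
res = handle_ancillary_change(cache, "www.example.com", "A", "example.com")
```
]]></artwork></figure>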
    </section>
    </section>

    <section anchor="conclusions" title="Conclusions">
    <t>There are many defenses which can be layered to provide robust protection for recursive DNS resolvers.</t>

    <t>By looking for duplicate responses to the same transaction
    which have different values, all non-disruptive attacks can be
    directly detected, and their effects on the cache mitigated.  By
    forwarding the second response, the final victim can also be
    notified of the attack.  This requires ~10 MB of state for a
    resolver performing 10,000 external queries a second.</t>

    <t>By setting an entropy budget of 40 bits, blind attacks become
    infeasible, requiring on the order of a terapacket for a high
    probability of success.  By querying special authoritative DNS
    servers, a resolver can detect any process which reduces this
    entropy, and can use duplicate requests to restore it.</t>
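    <t>The terapacket figure follows from simple arithmetic,
    assuming the attacker must guess 40 bits uniformly at random
    (that model is an assumption of this sketch):</t>

    <figure><artwork><![CDATA[
```python
# Back-of-the-envelope check of the 40-bit entropy budget: a blind
# attacker guessing uniformly needs on the order of 2^40 spoofed
# responses for a good chance of success.

total_guesses = 2 ** 40          # 1,099,511,627,776: about a terapacket
even_odds = total_guesses // 2   # expected guesses for even odds
```
]]></artwork></figure>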
    
    <t>For the few sites where duplication produces different results,
    it is probably safe to return a randomly selected result.
    Although this still enables transactional cache attacks, it is
    probably better to accept this very narrow window of vulnerability
    to enable resolution of key sites.</t>

    <t>Finally, cache policy can eliminate 'race-until-win' attacks
    while subsuming most bailiwick checks.  By only accepting and
    returning the direct answer to requests, an attacker can no longer
    conduct any race-until-win attacks targeting specific names.  The
    overhead is reasonable: a study of our resolvers showed a 27%
    increase in requests before any optimizations are applied.</t>

    <t>These changes involve only resolver behavior, and they combine
    to provide better protection than any one defense alone.
    Increasing entropy increases the blind attacker's work in
    packets, while eliminating race-until-win increases the
    attacker's work in time.  With a sufficient entropy budget, a
    resolver can detect that it is under attack and act accordingly.
    Directly detecting non-disruptive attacks, meanwhile, catches
    both packet injectors and many blind injectors.</t>

    </section>



    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>This work is sponsored in part by NSF Grants ITR/ANI-0205519 and NSF-0433702.  All opinions are those of the author, not the funding institution.</t>

      <t>Feedback from Vern Paxson, Robin Sommer, Christian Kreibich, Paul Vixie (who also suggested the "closest name" heuristic), Seth Hall, Wouter Wijngaards, Dan Kaminsky, and David Dagon.</t>
    </section>

    <!-- Possibly a 'Contributors' section ... -->

    <section anchor="IANA" title="IANA Considerations">
      <t>None</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>This text is focused on security concerns for DNS resolvers.  The security aspects of each defense are discussed as part of each section.</t>
    </section>
  </middle>

  <!--  *****BACK MATTER ***** -->

  <back>
    <!-- References split into informative and normative -->

    <!-- There are 2 ways to insert reference entries from the citation libraries:
     1. define an ENTITY at the top, and use "ampersand character"RFC2629; here (as shown)
     2. simply use a PI "less than character"?rfc include="reference.RFC.2119.xml"?> here
        (for I-Ds: include="reference.I-D.narten-iana-considerations-rfc2434bis.xml")

     Both are cited textually in the same manner: by using xref elements.
     If you use the PI option, xml2rfc will, by default, try to find included files in the same
     directory as the including file. You can also define the XML_LIBRARY environment variable
     with a value containing a set of directories to search.  These can be either in the local
     filing system or remote ones accessed by http (http://domain/dir/... ).-->

    <references title="Normative References">
      <!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?-->
      &RFC2119;

      <!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2181.xml"?-->
      &RFC2181;

    </references>

    <references title="Informative References">
      <!-- Here we use entities that we defined at the beginning. -->



      <!-- A reference written by by an organization not a person. -->

      <reference anchor="akamai" target="http://www.akamai.com">
        <!-- the following is the minimum to make xml2rfc happy -->
        <front>
          <title>The Akamai CDN</title>
          <author>
            <organization>Akamai Inc</organization>
          </author>
          <date year="2008" />
        </front>
      </reference>

      <reference anchor="barwood"
                 target="http://www.ops.ietf.org/lists/namedroppers/namedroppers.2008/msg01647.html">
       <front><title>The Birthday Defense (IETF NameDroppers Mailing List)</title>

              <author initials="G" surname="Barwood"><organization></organization></author>
              <date month="September" day="2" year="2008" /></front></reference>

      <reference anchor="0x20"
                 target="http://tools.ietf.org/html/draft-vixie-dnsext-dns0x20-00">
       <front><title>Use of Bit 0x20 in DNS Labels
                    to Improve Transaction Identity</title>

              <author initials="P" surname="Vixie"><organization>ISC</organization></author>
              <author initials="D" surname="Dagon"><organization>GaTech</organization></author>
              <date month="March" day="17" year="2008" /></front></reference>



      <reference anchor="perry"
                 target="http://fscked.org/blog/fully-automated-active-https-cookie-hijacking">
       <front><title>Fully Automated Active HTTPS Cookie Hijacking</title>

              <author initials="M" surname="Perry"><organization></organization></author>
              <date month="August" day="14" year="2008" /></front></reference>

      <reference anchor="kaminski"
                 target="http://www.kb.cert.org/vuls/id/800113">
       <front><title>Multiple DNS implementations vulnerable to cache poisoning</title>

              <author ><organization>US-CERT</organization></author>
              <date month="July" day="8" year="2008" /></front></reference>

      <reference anchor="unbound"
                 target="http://tools.ietf.org/html/draft-wijngaards-dnsext-resolver-side-mitigation-00">
       <front><title>Resolver Side Mitigations</title>

              <author initials="W" surname="Wijngaards"><organization>NLnet Labs</organization></author>
              <date month="August" day="25" year="2008" /></front></reference>
    </references>


  </back>
</rfc>
