One document matched: draft-jabley-dnsop-flush-reqs-00.xml


<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
  <!ENTITY rfc2119 PUBLIC ''
    'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
  <!ENTITY rfc2308 PUBLIC ''
    'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2308.xml'>
]>

<rfc category="exp" ipr="trust200902"
  docName="draft-jabley-dnsop-flush-reqs-00">

  <?rfc toc="no" ?>
  <?rfc symrefs="yes" ?>
  <?rfc sortrefs="yes"?>
  <?rfc iprnotified="no" ?>
  <?rfc strict="yes" ?>

  <front>
    <title abbrev="DNS FLUSH Requirements">Requirements for a Mechanism
      for Remote-Triggered DNS Cache Flushes</title>

    <author initials='J.' surname="Abley" fullname='Joe Abley'>
      <organization>Dyn, Inc.</organization>
      <address>
        <postal>
          <street>470 Moore Street</street>
          <city>London</city>
          <region>ON</region>
          <code>N6C 2C2</code>
          <country>Canada</country>
        </postal>
        <phone>+1 519 670 9327</phone>
        <email>jabley@dyn.com</email>
      </address>
    </author>

    <author initials="W." surname="Kumari" fullname="Warren Kumarui">
      <organization>Google</organization>
      <address>
        <postal>
          <street>1600 Amphitheatre Parkway</street>
          <city>Mountain View</city>
          <region>CA</region>
          <code>94043</code>
        </postal>
        <email>warren@kumari.net</email>
      </address>
    </author>

    <date month="October" year="2013"/>

    <abstract>
      <t>Operational calamaties in the DNS happen from time to time,
        and in many cases problems persist due to DNS caching of bad data.
        Lacking any robust mechanism to signal that bad data has been
        flushed, the operators of DNS authority servers often resort
        to unauthenticated requests for help being sent to mailing
        lists, the results of which are frequently unsatisfying to all
        concerned.</t>

      <t>This document aims to present requirements for a more robust
        mechanism by which authoritative server operators can signal to
        recursive server operators that bad data has been published,
        and that targetted cache flushes may be beneficial.</t>
    </abstract>
  </front>

  <middle>
    <section title="Terminology">
      <t>This document makes use of the following taxonomy. Note
	that although it is thought that these terms (and the
	meanings presented here) are in common use, overloading and
	ambiguity abounds in practice and hence the definitions
	presented here should not be considered universally-applicable.</t>

      <t>
        <list style="hanging">
	  <t hangText="Authoritative Server:">A DNS server that
	    serves one or more DNS zones authoritatively, and which
	    does not process recursive queries. An Authoritative
	    Server may function as a Master Server, or a Slave
	    Server, or both.</t>

	  <t hangText="Master Server:">An Authoritative Server with
	    the ability to respond to zone transfer requests from
	    one or more Slave Servers and hence replicate zone data
	    from master to slave.</t>

	  <t hangText="Slave Server:">An Authoritative Server
	    configured to replicate zone data from one or more
	    Master Servers.</t>

	  <t hangText="Recursive Server:">A DNS server that processes
	    Recursive Queries on behalf of Stub Resolvers. Recursive
	    Servers ultimately obtain responses from Authoritative
	    Servers, although particular queries from Stub Resolvers
	    may be satisfied using data stored in a local cache
	    or obtained from one or more other Recursive Servers.</t>

	  <t hangText="Stub Resolver:">A DNS client that communicates
	    with one or more configured Recursive Servers in order
	    to obtain responses to queries on behalf of an
	    application.</t>
	</list>
      </t>

      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
	"SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY",
	and "OPTIONAL" in this document are to be interpreted as
	described in <xref target="RFC2119"/>.</t>
    </section>

    <section title="Introduction">
     <t>Operational calamaties in the DNS happen from time to time,
        and in many cases problems persist due to DNS caching of bad data.
        Lacking any robust mechanism to signal that bad data has been
        flushed, the operators of DNS authority servers often resort
        to unauthenticated requests for help being sent to mailing
        lists, the results of which are frequently unsatisfying to all
        concerned.</t>

      <t>This document aims to present requirements for a more robust
        mechanism by which authoritative server operators can signal to
        recursive server operators that bad data has been published,
        and that targetted cache flushes may be beneficial.</t>
    </section>

    <section title="Use Cases">
      <t>The following are examples of failures that have been observed
        to cause service disruption, and that a mechanism meeting the
        requirements that follow might provide some relief from.</t>

      <t>This is all woefully incomplete. Specific examples of domain
        names (TLD and otherwise) have been omitted in the interests of
        avoiding undue embarassment.</t>

      <section title="Registrar Compromise, Domain Hijack">
        <t>Unauthorised access to a registrant's account at a registrar
          (or some other registrar-level compromise) facilitates the
          unauthorised redelegation of a domain to new nameservers,
          which serve malicious data with long TTLs.</t>

        <t>Remediation is achieved by restoring the correct delegation.
          Service disruption continues until the malicious data has
          expired from Recursive Server caches used by end-users.</t> 
      </section>

      <section title="Zone Signing Failure">
        <t>A failure to sign a zone correctly (e.g. using the wrong
          keys) results in correct zone data being published with
          signatures that cannot be validated.</t>

        <t>Remediation is achieved by fixing the signing problem
          (e.g. signing with the correct keys) and publishing a new
          revision of the zone. Service disruption may continue until
          particular elements have expired from caches (e.g. apex
          DNSKEY RRSets).</t>
      </section>

      <section title="Zone Integrity Failure">
        <t>Publication of an incomplete (e.g. truncated) zone results in
          missing data, and the absence of that data is subject to
          negative caching <xref target="RFC2308"/>.</t>

        <t>Remediation is achieved by resolving the operational
          problem that led to the incomplete zone being published,
          and publishing a successor zone that is complete. Service
          disruption may continue until all negatively-cached
          elements have expired from caches.</t>
      </section>
    </section>

    <section title="Requirements">
      <t>[These various requirements are somewhat arbitrarily spread over
        the subsections that follow. Some better organisation would make
        them more readable.]</t>

      <section title="Functional Requirements">
        <t>
          <list style="numbers">
            <t>The mechanism MUST be effective; that is, it must be
              capable of being deployed to the extent that it provides
              a significant improvement in remediation of zone publication
              problems.</t>

            <t>The mechanism MUST accommodate a maximally-constrained
              scope for a flush; that is, the resulting cache flushes
              MUST be constrained to specific parts of the namespace
              where a flush is beneficial to resolve a problem, and
              MUST minimise collateral damage to other cached data.</t>

            <t>The mechanism MUST be opt-in from the perspective of
              a Recursive Server operator; that is, no Recursive Server
              operator should be compelled to act upon a request to
              flush their cache.</t>

            <t>The mechanism MUST accommodate timely responses to
              problems.</t>

            <t>The mechanism MUST support idempotency; that is,
              transmission of multiple identical requests to flush
              MUST NOT result in more than one flush operation in
              a single cache.</t>

            <t>The mechanism SHOULD require minimal changes to DNS
              software, but MIGHT reasonably involve changes to or
              deployment of surrounding administrative scaffolding
              (scripts, etc).</t>
	  </list>
	</t>
      </section>

      <section title="Operational Requirements">
        <t>
          <list style="numbers">
            <t>It SHOULD be possible to automate the reception and
              processing of a request using the mechanism on a
              Recursive Server.</t>

            <t>The mechanism SHOULD be simple for Recursive Server
              operators to implement and operate, since those
              operators might be the recipient of many requests
              whereas the operators of Authoritative Servers should
              only reasonably expect to exercise this mechanism
              in the event of serious operational failures.</t>
          </list>
        </t>
      </section>

      <section title="Manageability Requirements">
        <t>
          <list style="numbers">
            <t>The mechanism's effectiveness SHOULD be easily
              measurable; Authoritative Server operators using the
              mechanism ought to be able to gauge its effect, and
              Recursive Server operators ought to be able to tell
              whether problem data was present (i.e. whether the
              remedy actually corresponded to a problem).</t>
          </list>
        </t>
      </section>

      <section title="Security Requirements" anchor="security">
        <t>
          <list style="numbers">
            <t>The mechanism MUST be authenticated, such that a
              Recursive Server operator can trust that a request to
              flush is authentic.</t>

            <t>It MUST be possible to rate-limit reception of cache
              flush requests in order to avoid the mechanism being used
              as a denial-of-service attack vector.</t>

            <t>The mechanism MUST NOT facilitate new exploits or
              compromises against Authoritative Servers or Recursive
              Servers.</t>
          </list>
        </t>
      </section>
    </section>

    <section title="IANA Considerations">
      <t>This document makes no request of the IANA.</t>
    </section>

    <section title="Security Considerations">
      <t>This document presents requirements for future work, and
        does not directly impact the security of the Internet.</t>

      <t>Security requirements are described in <xref
        target="security"/>.</t>
    </section>

    <section title="Acknowledgements">
      <t>Your name here, etc.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      &rfc2119;
    </references>

    <references title="Informative References">
      &rfc2308;
    </references>

    <section title="Editorial Notes">
      <t>This section (and sub-sections) to be removed prior to
        publication.</t>

      <section title="Change History">
        <t>
          <list style="hanging">
            <t hangText="00">Initial idea, circulated for the purposes of
              entertainment.</t>
          </list>
        </t>
      </section>
    </section>
  </back>
</rfc>


PAFTECH AB 2003-20262026-04-24 04:36:26