<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc strict="yes" ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<rfc category="std"
	   docName="draft-ietf-dnsext-rfc2671bis-edns0-03"
		 ipr="trust200902"
		 obsoletes="2671, 2673">
  <front>
    <title abbrev="EDNS0 Extensions">Extension Mechanisms for DNS (EDNS0)</title>
    <author fullname="Michael Graff" initials="M.G." surname="Graff">
      <organization>Internet Systems Consortium</organization>
      <address>
        <postal>
          <street>950 Charter Street</street>
          <city>Redwood City</city>
          <region>California</region>
          <code>94063</code>
          <country>US</country>
        </postal>
        <phone>+1 650.423.1304</phone>
        <email>mgraff@isc.org</email>
      </address>
    </author>
    <author fullname="Paul Vixie" initials="P.V." surname="Vixie">
      <organization>Internet Systems Consortium</organization>
      <address>
        <postal>
          <street>950 Charter Street</street>
          <city>Redwood City</city>
          <region>California</region>
          <code>94063</code>
          <country>US</country>
        </postal>
        <phone>+1 650.423.1301</phone>
        <email>vixie@isc.org</email>
      </address>
    </author>
    <date month="March" year="2010"/>
    <area>General</area>
    <workgroup>DNSEXT Working Group</workgroup>
    <keyword>RFC</keyword>
    <keyword>Request for Comments</keyword>
    <keyword>I-D</keyword>
    <keyword>Internet-Draft</keyword>
    <keyword>edns0</keyword>
    <keyword>dns</keyword>
    <abstract>
      <t>
        The Domain Name System's wire protocol includes a number of fixed
        fields whose range has been or soon will be exhausted and does not
        allow requestors to advertise their capabilities to responders. This
        document describes backward compatible mechanisms for allowing the
        protocol to grow.
			</t>
      <t>
        This document updates the EDNS0 specification (RFC2671) based on 10
        years of deployment experience.
			</t>
    </abstract>
  </front>
  <middle>
    <section title="Introduction">
      <t>
        DNS <xref target="RFC1035"/> specifies a Message Format and within such
        messages there are standard formats for encoding options, errors, and
        name compression. The maximum allowable size of a DNS Message is fixed.
        Many of DNS's protocol limits are too small for uses which are
        common or which are desired to become common. There is no way for
        implementations to advertise their capabilities.
			</t>
      <t>
        Unextended agents will not know how to interpret the protocol
        extensions detailed here.  In practice, these clients will be upgraded
        when they have need of a new feature, and only new features will make
        use of the extensions.  Extended agents must be prepared for behavior
        of unextended clients in the face of new protocol elements, and fall
        back gracefully to unextended DNS.  <xref target="RFC2671"/>
        proposed extensions to the basic DNS protocol to overcome these
        deficiencies.
        This memo refines that specification and obsoletes
        <xref target="RFC2671"/>.
			</t>
      <t>
        <xref target="RFC2671"/> specified extended label types. The only one
        ever proposed was in RFC2673 for a label type called "Bitstring Labels."
        For various reasons introducing a new label type was found to be
        extremely difficult, and RFC2673 was moved to Experimental.
        This document obsoletes Extended Labels.
			</t>
    </section>
    <section title="Terminology">
      <t>
        "Requestor" is the side which sends a request.  "Responder" is
        an authoritative server, recursive resolver, or other DNS component
        which responds to questions.
			</t>
      <t>
        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in
        <xref target="RFC2119">RFC 2119</xref>.
			</t>
    </section>
    <section title="EDNS Support Requirement">
      <t>
        EDNS support is mandatory in a modern world.  DNSSEC requires EDNS
        support, and many other features are made possible only by EDNS support
        to request or advertise them.  Many organizations are beginning to
        require DNSSEC.  Without common interoperability, DNSSEC cannot be as
        easily deployed.
      </t>
      <t>
        DNS publishers want to put more data in answers. DNSSEC
        DNSKEY records, negative answers, and many other DNSSEC queries
        cause larger answers to be returned.  In order to support this, DNS
        servers, middleware, and stub resolvers MUST support larger packet
        sizes advertised via EDNS0.
			</t>
    </section>
    <section title="Affected Protocol Elements">
      <section title="Message Header">
        <t>
          The DNS Message Header's second full 16-bit word is divided into a
          4-bit OPCODE, a 4-bit RCODE, and a number of 1-bit flags
          (see <xref target="RFC1035">, section
          4.1.1</xref>). Some of these were marked for
          future use, and most of these have since been allocated. Also, most of
          the RCODE values are now in use. The OPT pseudo-RR specified below
          contains subfields that carry a bit field extension of the RCODE field
          and additional flag bits, respectively.
        </t>
      </section>
      <section title="Label Types">
        <t>
          The first two bits of a wire format domain label are used to denote the
          type of the label.
          <xref target="RFC1035"/> allocates two of the four
          possible types and reserves the other two.
          More label types were defined in <xref target="RFC2671"/>.
        </t>
      </section>
      <section title="UDP Message Size">
        <t>
          Traditional DNS Messages are limited to 512 octets in size when sent
          over UDP (<xref target="RFC1035"/>).  Today, many organizations wish
          to return many records in a single reply, and special tricks are
          needed to make the responses fit in this 512-byte limit.
          Additionally, DNSSEC signatures can easily generate a much larger
          response than a 512 byte message can hold.
        </t>
        <t>
        	EDNS0 is intended to address these larger packet sizes and continue to
        	use UDP.  It specifies a way to advertise additional features such
        	as larger response size capability, which is intended to help avoid
        	truncated UDP responses which then cause retry over TCP.
        </t>
      </section>
    </section>
    <section title="Extended Label Types">
      <t>
        The first octet in the on-the-wire representation of a DNS label
        specifies the label type; the basic DNS specification
        <xref target="RFC1035"/>
        dedicates the two most significant bits of that octet for this purpose.
      </t>
      <t>
        <xref target="RFC2671"/> defined DNS label type 0b01 for use as an
        indication for Extended Label Types. A specific Extended Label Type
        is selected by the 6 least significant bits of the first octet. Thus,
        Extended Label Types are indicated by the values 64-127 (0b01xxxxxx)
        in the first octet of the label.
      </t>
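      <t>
        As an illustration only, the classification described above can be
        sketched in a few lines of Python.  The function names are
        hypothetical and are not part of this specification; the sketch
        simply inspects the two most significant bits of the first octet
        and, for type 0b01, extracts the 6 least significant bits.
      </t>
      <figure>
        <artwork><![CDATA[
```python
def classify_label_octet(first_octet: int) -> str:
    """Classify a label by the two most significant bits of its first octet."""
    if not 0 <= first_octet <= 0xFF:
        raise ValueError("first octet must be in the range 0..255")
    top_bits = first_octet >> 6
    if top_bits == 0b00:
        return "normal"       # RFC 1035 label; low 6 bits are the length
    if top_bits == 0b11:
        return "compression"  # RFC 1035 compression pointer
    if top_bits == 0b01:
        return "extended"     # RFC 2671 Extended Label Type (64..127)
    return "reserved"         # 0b10 remains reserved


def extended_label_type(first_octet: int) -> int:
    """Return the 6-bit Extended Label Type selector."""
    if classify_label_octet(first_octet) != "extended":
        raise ValueError("not an Extended Label Type octet")
    return first_octet & 0x3F
```
]]></artwork>
      </figure>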
      <t>
        This document does not describe any specific Extended Label Type.
      </t>
      <t>
        In practice, Extended Label Types are difficult to use due to lack of
        support in clients and intermediate gateways. They cause
        interoperability problems, and at present no defined label types are
        in use. Therefore, the registry of Extended Label Types is requested
        to be closed.
      </t>
      <t>
        Bitstring labels were originally created to solve problems with IPv6
        reverse zones.  Due to the problems of introducing a new label type,
        they were moved to Experimental.  This document moves them from
        Experimental to Historic, thereby obsoleting them.
      </t>
    </section>
    <section title="OPT pseudo-RR">
      <section title="OPT Record Definition">
        <t>
          An OPT pseudo-RR (sometimes called a meta-RR) MAY be added to the
          additional data section of a request.
        </t>
        <t>
          The OPT RR has been assigned RR type 41.
        </t>
        <t>
          If an OPT record is present in a request, compliant responders MUST
          include an OPT record in the response.
        </t>
        <t>
          An OPT record does not carry any DNS data. It is used only to contain
          control information pertaining to the question and answer sequence of
          a specific transaction. OPT RRs MUST NOT be cached, forwarded, or
          stored in or loaded from master files.
        </t>
        <t>
          The OPT RR MAY be placed anywhere within the additional data section.
          A DNS message MUST NOT contain more than one OPT RR. If a message
          with more than one OPT RR is received, a FORMERR MUST be returned.
        </t>
      </section>
      <section title="OPT Record Format">
        <t>
          An OPT RR has a fixed part and a variable set of options expressed as
          {attribute, value} pairs. The fixed part holds some DNS metadata and
          also a small collection of basic extension elements which we expect
          to be so popular that it would be a waste of wire space to encode
          them as {attribute, value} pairs.
        </t>
        <t>
          The fixed part of an OPT RR is structured as follows:
        </t>
        <texttable title="OPT RR Format">
          <ttcol>Field Name</ttcol>
          <ttcol>Field Type</ttcol>
          <ttcol>Description</ttcol>
          <c>NAME</c>
          <c>domain name</c>
          <c>empty (root domain)</c>
          <c>TYPE</c>
          <c>u_int16_t</c>
          <c>OPT</c>
          <c>CLASS</c>
          <c>u_int16_t</c>
          <c>requestor's UDP payload size</c>
          <c>TTL</c>
          <c>u_int32_t</c>
          <c>extended RCODE and flags</c>
          <c>RDLEN</c>
          <c>u_int16_t</c>
          <c>describes RDATA</c>
          <c>RDATA</c>
          <c>octet stream</c>
          <c>{attribute,value} pairs</c>
        </texttable>
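        <t>
          The following Python sketch shows one way the fixed part in the
          table above could be serialized and parsed.  It is illustrative
          only: the function names are hypothetical, the TTL field is treated
          as an opaque 32-bit value, and RDATA is passed through as
          uninterpreted octets.
        </t>
        <figure>
          <artwork><![CDATA[
```python
import struct

OPT_TYPE = 41  # RR type assigned to OPT


def pack_opt_rr(udp_payload_size: int, ttl_field: int,
                rdata: bytes = b"") -> bytes:
    """Serialize an OPT RR: root NAME, TYPE, CLASS, TTL, RDLEN, RDATA."""
    name = b"\x00"  # empty (root) domain name
    fixed = struct.pack("!HHIH", OPT_TYPE, udp_payload_size,
                        ttl_field, len(rdata))
    return name + fixed + rdata


def unpack_opt_rr(rr: bytes):
    """Return (udp_payload_size, ttl_field, rdata) from a serialized OPT RR."""
    if len(rr) < 11 or rr[0] != 0:
        raise ValueError("OPT RR must start with the root name")
    rr_type, udp_size, ttl_field, rdlen = struct.unpack("!HHIH", rr[1:11])
    if rr_type != OPT_TYPE:
        raise ValueError("not an OPT RR")
    if len(rr) - 11 < rdlen:
        raise ValueError("RDLEN exceeds available data")
    return udp_size, ttl_field, rr[11:11 + rdlen]
```
]]></artwork>
        </figure>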
        <t>
          The variable part of an OPT RR is encoded in its RDATA and is structured
          as zero or more of the following:
        </t>
        <figure>
          <artwork align="center">
 
              +0 (MSB)                            +1 (LSB)
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0: |                          OPTION-CODE                          |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
2: |                         OPTION-LENGTH                         |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
4: |                                                               |
   /                          OPTION-DATA                          /
   /                                                               /
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
</artwork>
        </figure>
        <t>
          <list style="hanging" hangIndent="6">
            <t hangText="OPTION-CODE"><vspace/>
              Assigned by Expert Review.
            </t>
            <t hangText="OPTION-LENGTH"><vspace/>
              Size (in octets) of OPTION-DATA.
						</t>
            <t hangText="OPTION-DATA"><vspace/>
              Varies per OPTION-CODE.
            </t>
          </list>
        </t>
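        <t>
          A minimal sketch of encoding and decoding the variable part follows.
          The function names are hypothetical.  The sketch raises an error on
          a truncated option, which a responder would presumably map to
          FORMERR; unknown OPTION-CODE values are returned to the caller
          unchanged so that they can be ignored.
        </t>
        <figure>
          <artwork><![CDATA[
```python
import struct


def encode_options(options) -> bytes:
    """Encode an iterable of (OPTION-CODE, OPTION-DATA) pairs as RDATA."""
    out = bytearray()
    for code, data in options:
        # OPTION-CODE and OPTION-LENGTH are 16-bit, network byte order.
        out += struct.pack("!HH", code, len(data))
        out += data
    return bytes(out)


def decode_options(rdata: bytes):
    """Decode RDATA into a list of (OPTION-CODE, OPTION-DATA) pairs."""
    options = []
    offset = 0
    while offset < len(rdata):
        if len(rdata) - offset < 4:
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", rdata, offset)
        offset += 4
        if len(rdata) - offset < length:
            raise ValueError("OPTION-LENGTH exceeds remaining RDATA")
        options.append((code, rdata[offset:offset + length]))
        offset += length
    return options
```
]]></artwork>
        </figure>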
        <t>
          The order of appearance of option tuples is not guaranteed.  If one
          option modifies the behavior of another or multiple options are
          related to one another in some way, they have the same effect
          regardless of ordering in the RDATA wire encoding.
        </t>
        <t>
          Any OPTION-CODE values not understood by a responder or requestor
          MUST be ignored.  Specifications of such options might wish to
          include some kind of signaled acknowledgement.  For example, an option
          specification might say that if a responder sees option XYZ, it MUST
          include option XYZ in its response.
        </t>
      </section>
      <section title="Caching behavior">
        <t>
          The OPT record MUST NOT be cached.
				</t>
			</section>
			<section title="Fallback">
				<t>
				  If a requestor detects that
					the remote end does not support EDNS0, it MAY issue queries
					without an OPT record.  It MAY cache this knowledge
					for a brief time in order to avoid fallback delays in the
					future.  However, if DNSSEC is required, no fallback should
					be performed as DNSSEC is only signaled through EDNS0.
        </t>
      </section>
      <section title="Requestor's Payload Size">
        <t>
          The requestor's UDP payload size (which OPT stores in the RR CLASS
          field) is the number of octets of the largest UDP payload that can be
          reassembled and delivered in the requestor's network stack.
          Note that path MTU, with or without fragmentation, may be smaller
          than this.  Values lower than 512 MUST be treated as equal to 512.
        </t>
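        <t>
          The lower bound can be expressed as a one-line rule.  The following
          Python sketch is illustrative only; the function name is
          hypothetical.
        </t>
        <figure>
          <artwork><![CDATA[
```python
MIN_UDP_PAYLOAD = 512  # values lower than this are treated as 512


def effective_payload_size(advertised: int) -> int:
    """Interpret the 16-bit CLASS field of a received OPT RR."""
    if not 0 <= advertised <= 0xFFFF:
        raise ValueError("payload size is a 16-bit field")
    return max(advertised, MIN_UDP_PAYLOAD)
```
]]></artwork>
        </figure>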
        <t>
          A requestor SHOULD place a value in this field that it can actually
          receive.  For example, if a requestor sits behind a firewall which
          will block fragmented IP packets, the requestor SHOULD NOT choose a
          value which will cause fragmentation.  Doing so will prevent large
          responses from being received, and can cause fallback to occur.
        </t>
        <t>
          Note that a 512-octet UDP payload requires a 576-octet IP reassembly
          buffer.  Choosing 1280 for IPv4 over Ethernet would be reasonable.
          Choosing a very large value will guarantee fragmentation at the IP
          layer, and may prevent answers from being received due to a single
          fragment loss or misconfigured firewalls.
        </t>
        <t>
          The requestor's maximum payload size can change over time. It
          MUST NOT be cached for use beyond the transaction in which it
          is advertised.
        </t>
      </section>
      <section title="Responder's Payload Size">
        <t>
          The responder's maximum payload size can change over time, but can
          be reasonably expected to remain constant between two sequential
          transactions;  for example, a meaningless QUERY to discover a
          responder's maximum UDP payload size, followed immediately by an
          UPDATE which takes advantage of this size.  This is considered
          preferable to the outright use of TCP for oversized requests, if
          there is any reason to suspect that the responder implements
          EDNS, and if a request will not fit in the default 512-octet payload size
          limit.
        </t>
      </section>
      <section title="Payload Size Selection">
        <t>
          Due to transaction overhead, it is unwise to advertise an
          architectural limit as a maximum UDP payload size.  Just because
          your stack can reassemble 64KB datagrams, don't assume that you want
          to spend more than about 4KB of state memory per ongoing transaction.
        </t>
        <t>
          A requestor MAY choose to implement a fallback to smaller advertised
          sizes to work around firewall or other network limitations.
          A requestor SHOULD choose to use a fallback mechanism which begins
          with a large size, such as 4096.  If that fails, a fallback around
          the 1280 byte range SHOULD be tried, as it has a reasonable chance
          to fit within a single Ethernet frame.  Failing that, a requestor MAY
          choose a 512 byte packet, which with large answers may cause a TCP
          retry.
        </t>
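        <t>
          The fallback sequence above might be sketched as follows.  This is
          an illustration under stated assumptions, not a required algorithm:
          send_query is a hypothetical caller-supplied function that issues
          the query with the given advertised size and returns None when no
          usable response arrives.
        </t>
        <figure>
          <artwork><![CDATA[
```python
# Advertised sizes to try, largest first, as suggested in the text.
FALLBACK_SIZES = (4096, 1280, 512)


def query_with_fallback(send_query, sizes=FALLBACK_SIZES):
    """Try each advertised size in turn until a response is received.

    send_query(size) is a hypothetical callable; it returns the response,
    or None on failure (for example, a timeout).
    """
    for size in sizes:
        response = send_query(size)
        if response is not None:
            return size, response
    return None  # every advertised size failed
```
]]></artwork>
        </figure>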
      </section>
      <section title="Middleware Boxes">
        <t>
					Middleware boxes MUST NOT limit DNS messages over UDP to 512 bytes.
				</t>
        <t>
					Middleware boxes which simply forward requests to a recursive
					resolver MUST NOT modify the OPT record contents in either direction.
				</t>
				<t>
				  Middleware boxes which have additional functionality, such as
				  answering certain queries or acting like an intelligent forwarder,
				  MUST understand the OPT record.  These boxes MUST consider the
				  incoming request and any outgoing requests as separate transactions
				  if the characteristics of the messages are different.
			  </t>
      </section>
      <section title="OPT Record TTL Field Use">
        <t>
					The extended RCODE and flags (which OPT stores in the RR TTL field)
					are structured as follows:
				</t>
        <figure>
          <artwork align="center">
              +0 (MSB)                            +1 (LSB)
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0: |         EXTENDED-RCODE        |            VERSION            |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
2: | DO|                           Z                               |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
					</artwork>
        </figure>
        <t>
          <list style="hanging" hangIndent="6">
            <t hangText="EXTENDED-RCODE"><vspace/>
              Forms upper 8 bits of extended 12-bit RCODE.  Note that
              EXTENDED-RCODE value 0 indicates that an unextended RCODE
              is in use (values 0 through 15).
            </t>
            <t hangText="VERSION"><vspace/>
              Indicates the implementation level of whoever sets it.  Full
              conformance with this specification is indicated by version "0".
              Requestors are encouraged to set this to the lowest implemented
              level capable of expressing a transaction, to minimize the responder
              and network load of discovering the greatest common implementation
              level between requestor and responder.  A requestor's version
              numbering strategy MAY ideally be a run time configuration option.
              <vspace/>
              If a responder does not implement the VERSION level of the
              request, then it answers with RCODE=BADVERS.  All responses MUST
              be limited in format to the VERSION level of the request, but the
              VERSION of each response SHOULD be the highest implementation
              level of the responder.  In this way a requestor will learn the
              implementation level of a responder as a side effect of every
              response, including error responses and including RCODE=BADVERS.
            </t>
          </list>
        </t>
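        <t>
          As an illustration, the 32-bit TTL field layout shown above can be
          packed and unpacked as in the following Python sketch.  The function
          names are hypothetical.  The last function combines EXTENDED-RCODE
          with the 4-bit RCODE from the message header to form the 12-bit
          extended RCODE.
        </t>
        <figure>
          <artwork><![CDATA[
```python
DO_BIT = 0x8000  # most significant bit of the second 16-bit word


def pack_ttl(extended_rcode: int, version: int,
             do: bool = False, z: int = 0) -> int:
    """Build the 32-bit TTL field of an OPT RR."""
    if not 0 <= extended_rcode <= 0xFF or not 0 <= version <= 0xFF:
        raise ValueError("EXTENDED-RCODE and VERSION are 8-bit fields")
    if not 0 <= z <= 0x7FFF:
        raise ValueError("Z is a 15-bit field")
    flags = (DO_BIT if do else 0) | z
    return (extended_rcode << 24) | (version << 16) | flags


def unpack_ttl(ttl: int):
    """Return (extended_rcode, version, do, z) from a TTL field."""
    return ((ttl >> 24) & 0xFF, (ttl >> 16) & 0xFF,
            bool(ttl & DO_BIT), ttl & 0x7FFF)


def full_rcode(extended_rcode: int, header_rcode: int) -> int:
    """Combine the upper 8 bits (OPT) with the lower 4 bits (header)."""
    return (extended_rcode << 4) | (header_rcode & 0x0F)
```
]]></artwork>
        </figure>
        <t>
          For example, with these definitions an EXTENDED-RCODE of 1 and a
          header RCODE of 0 combine to the value 16.
        </t>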
      </section>
      <section title="Flags">
				<t>
        	<list style="hanging" hangIndent="6">
						<t hangText="DO"><vspace/>
							DNSSEC OK bit as defined by <xref target="RFC3225"/>.
						</t>
					  <t hangText="Z"><vspace/>
							Set to zero by senders and ignored by receivers, unless modified
							in a subsequent specification.
						</t>
					</list>
				</t>
      </section>
      <section title="OPT Options Code Allocation Procedure">
        <t>
          Option codes are allocated by Expert Review.
          Assignment of Option Codes should be liberal, but duplicate
          functionality is to be avoided.
        </t>
      </section>
    </section>
    <section title="Transport Considerations">
      <t>
        The presence of an OPT pseudo-RR in a request should be taken as an
        indication that the requestor fully implements the given version of EDNS,
        and can correctly understand any response that conforms to that
        feature's specification.
      </t>
      <t>
        Lack of presence of an OPT record in a request MUST be taken as an
        indication that the requestor does not implement any part of this
        specification and that the responder MUST NOT include an OPT record in
        its response.
      </t>
      <t>
        Responders who do not implement these protocol extensions MUST respond
        with FORMERR messages without any OPT record.
      </t>
      <t>
        If there is a problem with processing the OPT record itself, such as an
        option value that is badly formatted or includes out of range values, a
        FORMERR MUST be returned.  If this occurs the response MUST include an
        OPT record.  This is intended to allow the requestor to distinguish
        between servers which do not implement EDNS and format errors within
        EDNS.
      </t>
      <t>
      	The minimal response must be the DNS header, question
        section, and an OPT record.  This must also occur when a truncated
      	response (using the DNS header's TC bit) is returned.
      </t>
    </section>
    <section title="Security Considerations">
      <t>
        Requestor-side specification of the maximum buffer size may open a
        new DNS denial of service attack if responders can be made to send
        messages which are too large for intermediate gateways to forward,
        thus leading to potential ICMP storms between gateways and responders.
      </t>
      <t>
        Announcing very large UDP buffer sizes may result in dropping by
        firewalls.  This could cause retransmissions with no hope of success.
        Some devices reject fragmented UDP packets.
      </t>
      <t>
        Announcing UDP buffer sizes that are too small may result in fallback
        to TCP.  This is especially important with DNSSEC, where answers are
        much larger.
      </t>
    </section>
    <section anchor="iana_considerations" title="IANA Considerations">
      <t>
        The IANA has assigned RR type code 41 for OPT.
      </t>
      <t>
        <xref target="RFC2671"/>
        specified a number of IANA sub-registries
        within "DOMAIN NAME SYSTEM PARAMETERS:"
        <list style="symbols">
          <t>EDNS Extended Label Type</t>
          <t>EDNS Option Codes</t>
          <t>EDNS Version Numbers</t>
          <t>Domain System Response Code</t>
        </list>
        IANA is advised to re-parent these sub-registries to this document.
      </t>
      <t>
        <xref target="RFC2671"/>
        created the "EDNS Extended Label Type Registry".
        We request that this registry be closed.
      </t>
      <t>
				This document assigns option code 65535 in the "EDNS Option Codes"
				registry to "Reserved for future expansion."
			</t>
      <t>
        <xref target="RFC2671"/>
				expands the RCODE space from 4 bits to 12 bits.
				This allows more than the 16 distinct RCODE values
				allowed in <xref target="RFC1035"/>.  IETF Standards
				Action is required to add a new RCODE.  Adding new RCODEs should
				be avoided due to the difficulty in upgrading the installed base.
			</t>
      <t>
				This document assigns EDNS Extended RCODE 16 to "BADVERS".
			</t>
      <t>
        IETF Standards Action is required for assignments of new EDNS0 flags.
        Flags SHOULD be used only when necessary for DNS resolution to
        function.  For many uses, an EDNS Option Code may be preferred.
      </t>
      <t>
				IETF Standards Action is required to create new entries in the
				EDNS Version Number registry.  Expert Review is required for
				allocation of an EDNS Option Code.
			</t>
    </section>
    <appendix title="Document Editing History">
      <t>
        Following is a list of high-level changes made to the original
        RFC2671.
      </t>
      <appendix title="Changes since RFC2671">
        <t>
          <list style="symbols">
            <t>
              Support for the OPT record is now mandatory.
            </t>
            <t>
              Extended label types are obsoleted and the registry is closed.
            </t>
            <t>
              The bitstring label type, which was already moved from draft
              to Experimental, is requested to be moved to Historic.
            </t>
            <t>
              Changes in how EDNS buffer sizes are selected, with recommendations
              on how to select them.
            </t>
            <t>
              Front material (IPR notice and such) was updated to current
              requirements.
            </t>
          </list>
        </t>
      </appendix>
      <appendix title="Changes since -02">
        <t>
          <list style="symbols">
            <t>
              Specified the method for allocation of constants.
            </t>
            <t>
              Cleaned up a lot of wording and made a number of changes to
              the document structure.
            </t>
          </list>
        </t>
      </appendix>
    </appendix>
  </middle>
  <back>
    <references title="Normative References">
      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.1035.xml"?>
      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2671.xml"?>
      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.3225.xml"?>
    </references>
    <references title="Informative References">
      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?>
    </references>
  </back>
</rfc>
