One document matched: draft-ietf-idn-dnsii-mdnp-00.txt


IDN Working Group                             Edmon Chung & David Leung
Internet Draft                                              Neteka Inc.
<draft-ietf-idn-dnsii-mdnp-00.txt>                          August 2000
 
 
              The DNSII Multilingual Domain Name Protocol 
 
 
STATUS OF THIS MEMO 
 
   This document is an Internet-Draft and is in full conformance with 
   all provisions of Section 10 of RFC2026.  
    
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that 
   other groups may also distribute working documents as Internet-
   Drafts.  Internet-Drafts are draft documents valid for a maximum of 
   six months and may be updated, replaced, or obsoleted by other 
   documents at any time.  It is inappropriate to use Internet-Drafts as 
   reference material or to cite them other than as "work in progress."  
    
   The reader is cautioned not to depend on the values that appear in 
   examples to be current or complete, since their purpose is primarily 
   educational.  Distribution of this memo is unlimited. 
    
   The list of current Internet-Drafts can be accessed at  
   http://www.ietf.org/ietf/1id-abstracts.txt 
   The list of Internet-Draft Shadow Directories can be accessed at 
   http://www.ietf.org/shadow.html.  
    
    
Abstract 
    
   Historically, the DNS is capable of handling only names within the 
   basic English alphanumeric character set (plus the hyphen), yet the 
   standards were so elegantly and openly designed that the extension of 
   the DNS into a multilingual and symbols based system proves to be 
   possible with simple adjustments. 
    
   These adjustments will be made on both the client side and the server 
   side. However, DNSII works on the principal that it is preferable to 
   make the transition to multilingual domain names seamless and 
   transparent to the end-user. Which means initially the server, or 
   more specifically, the resolver, SHOULD take the primary 
   responsibility for the technical implementation of the changes 
   required for a multilingual Internet. 
    
   The DNSII protocol is designed to allow the preservation of 
   interoperability, consistency and simplicity of the original DNS, 
   while being expandable and flexible for the handling of any character 
   or symbol used for the naming of an Internet domain. 
    
   This draft forms the introduction of a series of draft including 
   intended resolution processes and other DNSII documents. 
  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
1. Introduction 
    
   This Internet-draft describes details of the DNSII Multilingual 
   Domain Name protocol. The Internet-Draft assumes that the reader is 
   familiar with the concepts discussed in the widely distributed RFCs 
   "Domain Names _ Concepts and Facilities" [RFC 1034] and _Domain Names 
   _Implementation and Specification" [RFC 1035]. 
    
    
1.1 Terminology 
    
   The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED", 
   and "MAY" in this document are to be interpreted as described in RFC 
   2119 [RFC2119]. 
    
   A number of multilingual characters are used in this document for
   examples.  Please select your view encoding type to UTF-8 for it to be 
   displayed properly. 
    
    
1.2 DNSII 
    
   Many of the current proposals for a multilingual domain name system 
   involve working around the current ANSI based DNS.  So doing either 
   affects the integrity of the original spirit of the DNS or does not 
   well address the encoding conflict issues apparent in different 
   character encoding schemes.  
    
   The DNSII specifications takes a radically different approach: it 
   successfully identifies the difference between original DNS and DNSII 
   packets within the labels and at the same time allows the use of 
   multiple charsets to be easily incorporated in a standardized manner.  
   It causes no harm to the current DNS because it embraces the original 
   format for DNS laid out in RFC1035, complemented with the ideas 
   incorporated in EDNS [RFC2671].  
    
    
2. DNSII Protocol 
    
   The DNSII Protocol consists mainly of two parts: the InPacket DNSII 
   Identifier and the InPacket Label Encoding Type.  In addition, there 
   are several special considerations for specific record types. 
    
    
2.1 InPacket DNSII Identifier 
    
   In the DNSII specifications, an InPacket DNSII Identifier MUST be 
   inserted before a label to signify that it contains extended 
   characters that are not supported by the current DNS. 
    
   This DNSII flag, which is the first two bits of a label, effectively 
   distinguishes a DNSII compliant request from the existing format, 
   without having to conduct a guess from a name check whether the 
  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
   incoming packet is multilingual aware.  This is a substantial 
   improvement over character encoding schemes and multilingual 
   implementations in which it is almost impossible to determine the 
   language of an incoming request. The DNSII flag makes the process 
   clear and simple.     
    
   Currently: 
   "00"   regular label [RFC1035] 
   "11"   a redirection for DNS compression [RFC1035] 
   "01"   indicates the use of EDNS for multiple UDP packets [RFC2671] 
    
   DNSII calls for the use of the bit sequence "10" to identify that the 
   querying node is DNSII aware.  This will mean that all the possible 
   variations at top two label bits will be used.  Therefore, in 
   consideration, following two bits MUST be reserved for future 
   flagging use.  The 2 bits SHOULD be arbitrarily set to "00".  This 
   effectively opens up 3 more possible implementations for future 
   enhancements. 
    
   The motivation for this approach is the belief there should be no 
   ambiguity in name resolution.  Any name that the client wishes to 
   resolve, should resolve, regardless of the client side-encoding 
   scheme. 
    
    
2.2 InPacket Label Encoding Type (ILET) 
    
   Immediately following the 2 assigned DNSII flag and the 2 reserved 
   bits are 12 bits assigned to determine the InPacket Label Encoding 
   Type (ILET). 
    
   The ILET is a 12-bit number that is used to determine the encoding 
   scheme used by the characters of the label.  The MIBenum numbers 
   [RFC1700] SHOULD be used in this field.  The allocation of 12 bits 
   aligns perfectly with the MIBenum specification, of which the value 
   goes up to over 2200.  With 12 bits, the total possible values would 
   be 4096 (with 11 bits, the largest value that can be represented is 
   only 2047, slightly short of the specification).  The reason for the 
   adoption of MIBenum is to make use of the existing list of encoding 
   numbering schemes rather than re-inventing the wheel. 
    
   The value in the ILET field SHOULD only be allowed for the valid 
   encoding schemes defined in the MIBenum list. 
    









  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
   After identifying the encoding type, the regular count-label scheme 
   of the DNS resumes.  The resulting label should look like this: 
    
                        1 1 1 1 1 1   
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 
   +---+---+-------+---------------+ 
   |1 0| z |         ILET          | 
   +---------------+---------------+ 
   |     COUNT     | characters... | 
   +---------------+---------------+ 
    
   To minimize the size of a DNS packet, if the entire label is 
   constituted in characters only from the ANSI table, the DNS label 
   will appear identical to current implementations.  The first two bits 
   will remain "00". 
   For example, using the DNSII format the label for "dns" MAY be 
   represented as: 
    
     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ 
   | 1  0| 0  0| 0  0  0  0  0  0  0  0  0  0  1  1|  MIBenum 3 = ANSI 
   +-----------------------------------------------+
   |           3           |     6           4     |  "d"=64 
   +-----------------------------------------------+
   |     6           E     |     7           3     |  "n"=6E  "s"=73 
   +-----------------------------------------------+
    
   Or, the same domain label "dns" MAY also be represented as: 
    
     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ 
   |           3           |           d           | 
   +-----------------------------------------------+
   |           n           |           s           | 
   +-----------------------------------------------+
    
   With a multilingual domain name ns.…––…Ιμ‡ώ©‡΄˜.tld as an example: 
    
                        1 1 1 1 1 1                     1 1 1 1 1 1   
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 
   +---------------------------------------------------------------+ 
   |1 0| z |        ANSI=3         |       2       |       n       | 
   +---------------------------------------------------------------+ 
   |       s       |1 0|0 0|       UCS-2=1000      |       4       | 
   +---------------------------------------------------------------+ 
   |          …––   (U+57DF)        |          …Ιμ    (U+540D)        | 
   +---------------------------------------------------------------+ 
   |          ‡ώ©   (U+7CFB)        |          ‡΄˜    (U+7D71)        | 
   +---------------------------------------------------------------+ 
   |0 0|     3     |       t       |       l       |       d       | 
   +---------------------------------------------------------------+ 
   |       0       | 
   +---------------+ 
  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
   From the above example, we can see that the DNSII format is used for 
   the first label "ns", as well as for the second label, which is in 
   Chinese (the MIBenum for UCS-2 or ISO 10646 [Unicode] is 1000).  The 
   third label "tld" however uses the current format. 
    
   In any case, the count-label-count-label mechanism is largely 
   preserved.  Especially in the case of extended characters where in 
   other proposals, the "count" no longer represents the character 
   count.  In the above example, the domain is still represented as 
   2ns4…––…Ιμ‡ώ©‡΄˜3tld0, exactly in line with the original specifications. 
    
   Note that the first label in any query SHOULD be represented in DNSII 
   format to alert the destination server that it is DNSII aware.  This 
   is specifically configured for the considerations with CNAME, A6, 
   DNAME and PTR records. 
    
   This approach is used to ensure that there is no confusion about the 
   encoding format of the label.  ILET allows the capability of 
   employing all existing encoding schemes (UTF-7, UTF-8, ISO 10646 
   [UCS-2], ISO 10646 [UCS-4]).  ILET also allows the flexibility of 
   employing future encoding schemes. 
    
    
2.3 The Rationale for using ILET 
    
   Besides being able to preserve the count-label-count-label structure, 
   which in itself is actually a very important part because of the 
   problematic non-uniform byte encoding schemes, the use of ILET aligns 
   perfectly with previous IETF specifications as well as beneficial for 
   tricky case folding and canonicalization issues. 
    
   We know that all protocols MUST identify, for all character data, 
   which charset is in use [RFC2277], therefore it is necessary to 
   specify whatever encoding scheme, whether it be UTF-8, UTF-7, 16-bit 
   UCS-2 or ISO 8859 that is being used.  In essence, we understand that 
   it is paramount that a charset be clearly identified, especially in 
   situation like the DNS where no direct communication is established. 
    
   "At times and in specific cases, language information may be required 
   to achieve a particular level of quality for the purpose of 
   displaying a text stream.  For example, UTF-8 encoded Han may require 
   transmission of a language tag to select the specific glyphs to be 
   displayed at a particular level of quality. 
    
   Note that information other than language may be used to achieve the 
   required level of quality in a display process.  In particular, a 
   font tag is sufficient to produce identical results.  However, the 
   association of a language with a specific block of text has 
   usefulness far beyond its use in display.  In particular, as the 
   amount of information available in multiple languages on the World 
   Wide Web grows, it becomes critical to specify which language is in 
   use in particular documents, to assist automatic indexing and 
   retrieval of relevant documents." [RFC2130] 
  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
   In effect, this means for different languages, it is beneficial to be 
   able to identify the language in order to perform specific functions 
   to the characters, including case folding.  With ILET, the local 
   encoding scheme could be used and with them there are well defined 
   folding methods.  Therefore, the use of ILET enables an optimized 
   folding mechanism brought about by the preservation of local encoding 
   schemes, which is otherwise very difficult or virtually impossible to 
   do if only UTF-8 is used.  
    
   For the DNS however, a language tag is less feasible because if a 
   name is consisted of multiple languages, it would be very difficult 
   for tagging to be performed.  The possibility of having multiple 
   languages is very sound, and is used frequently as trademarks around 
   the world.  For example the famous Toys"Ο―"Us name, uses a character 
   from the Cyrillic language set. 
    
    
2.4 Considerations for Specific Requests 
    
   For certain requests, an ANSI only name could result in a 
   multilingual domain as an answer.  These include PTR, CNAME, A6 and 
   DNAME requests.  Special considerations are made within the DNSII 
   protocol to make sure that non-DNSII aware servers will not be fed 
   with a DNSII format packet. 
    
    
2.4.1 PTR Records 
    
   For all PTR requests, the first label of the query MUST use DNSII 
   format to alert the destination server.  Upon which, a DNSII packet 
   will be replied should the name contain extended characters. 
    
   If the DNSII format is not used, and the PTR record stumbles upon a 
   multilingual domain name, one of the following responses SHOULD be 
   given: 
    
   a. The implementer of DNSII MAY chose to reject the request; 
    
   or 
    
   b. An ACE format domain with a "for.ref.only" suffix MAY be returned; 
    
   or 
    
   c. A DNSII compliant server MAY return an 8-bit format of the 
   requested domain. 
    
   Since the PTR record is usually used for display purposes only, the 
   rejection (the IP address will then be used) or ACE format is 
   acceptable.  If the response is however used for further resolution, 
   an ACE format MUST not be used. 
    
    
  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
2.4.2 CNAME, A6 & DNAME 
    
   For queries concerning the record types CNAME, A6 or DNAME, a DNSII 
   aware server should first check to see if the incoming request is 
   DNSII compliant (flagged by the "10" bits in the first label): 
    
   If so, and the domain to be returned includes extended characters, 
   the response SHOULD be in DNSII format. 
    
   If not, any multilingual domains returned should be in an 8 bit form. 
    
   For the above record types it is strongly RECOMMENDED not to 
   associate an alphanumeric label to a multilingual label as the 
   RDATA.  However, it is permissible to associate a multilingual label 
   with an alphanumeric label as the RDATA. 
    
    
3. Alternate Implementations 
    
   The DNSII-MDNP is intended to be a framework for the implementation 
   of multilingual domain names.  While the core concepts and the design 
   principles remain consistent, it is possible to contemplate 
   alternative implementations, which for some people may feel easier to 
   implement. 
    
    
3.1 Restricted ILET Values 
    
   One possible implementation guideline is for the ILET to be 
   restricted to values only representing ISO 10646 transformations 
   including UCS-2, UCS-4, UTF-7, UTF-8, UTF-16 and other as they become 
   available and included as a standard MIBenum. 
    
   Although this takes away some of the benefits of keeping the local 
   encoding scheme which includes the issues of case folding, 
   canonicalization and other related concerns, it creates a system that 
   on one hand contains only encoding schemes from ISO 10646, but on the 
   other hand still provides the flexibility of deploying new encoding 
   schemes that stem from ISO 10646, such as the 32-bit format that is 
   due to be used soon. 
    
   We understand it is specified that in protocols, which up to now have 
   used US-ASCII only, UTF-8 forms a simple upgrade path; however, its 
   use should be negotiated either by negotiating a protocol version or 
   by negotiating charset usage, and a fallback to UTF-7 MUST be 
   available. [RFC2130]  With DNSII, the required fallback to UTF-7 
   could easily be done by setting the ILET value to reflect UTF-7. 
    
    
    
    
    
 
  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
3.2 Reduced ILET Bit Allocation 
    
   Furthering the restriction of the ILET to ISO 10646 transformations 
   only, the ILET bit allocations could also be reduced from 12 bit to 5 
   bit.  This successfully creates a total of 32 possible values.  The 
   reserved bits are also reduced to one. 
    
                        1 1 1 1 1 1   
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 
   +---+-+---------+---------------+ 
   |1 0|z|  ILET   |     COUNT     | 
   +---------------+---------------+ 
   | characters... | 
   +---------------+ 
    
   For example, the label "…––…Ιμ‡ώ©‡΄˜" will now be reflected in DNS packets 
   in the following form: 
    
                        1 1 1 1 1 1                     1 1 1 1 1 1   
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 
   +---------------------------------------------------------------+ 
   |1 0|z| ILET=1  |       4       |          …––   (U+57DF)        | 
   +---------------------------------------------------------------+ 
   |          …Ιμ   (U+540D)         |         ‡ώ©    (U+7CFB)        | 
   +---------------------------------------------------------------+ 
   |          ‡΄˜   (U+7D71)         | 
   +--------------------------------+ 
    
   To start off with, the ILET values MAY be determined as follows: 
    
   0 = reserved for ANSI only 
   1 = 16 bit UCS-2 
   2 = UTF-8 
   3 = UTF-7 
   4 = 32 bit UCS-4 
    
    
4. Implementation & Deployment Strategies 
    
   The first step in any multilingual domain name implementation should 
   be to encourage an 8-bit clean approach to DNS.  However, even when 
   the system is 8-bit clean the problem with conflicting characters 
   still exists.  This is where the DNSII protocol becomes most 
   valuable. 
    
   Although the DNSII protocol could be implemented at any level of the 
   DNS, the following phased rollout is contemplated. 
    
   (1) Registry Level - The most meaningful starting point for 
   deployment would be at the registry level since this creates the 
   demand from the end users to use multilingual and extended character 
   domain names for Second Level Domains. 
    
  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
   (2) Host Level - At the same time, registrants of the new extended 
   domain names could start to implement DNSII to host these special 
   kinds of domain names.  All other hosts that do not wish to use 
   extended characters do not have to migrate to the DNSII. 
    
   (3) Client Level - Once the multilingual aspect and the DNSII 
   specifications become mainstream, the user level resolvers will begin 
   to migrate.  This will include both the client resolver as well as 
   the ISP's DNS. 
    
   (4) Root Level - Eventually, as the DNSII is proven to be stable and 
   beneficial for the Internet at large, it could be used in the Root 
   Level so that new multilingual TLDs could be created. 
    
    
5. IDN Requirements Considerations 
    
   The DNSII protocol specification is in line with most if not all of 
   the requirements identified by the IDN work group. 
    
    
6. DNSSEC, EDNS and IPv6 Considerations 
    
   The use of DNSII should not require any adjustments with the 
   implementation of DNSSEC, EDNS or IPv6.  EDNS as well as compression 
   in fact will be done exactly the same as the existing system. 
    
   For example, the domain host.dns.…––…Ιμ‡ώ©‡΄˜.tld running with EDNS as 
   well as compression after host will look as follows: 
    
    
                          1 1 1 1 1 1                     1 1 1 1 1 1   
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 
     +---------------------------------------------------------------+ 
   20|0 1|    ELT    |0 0|     3     |       d       |       n       | 
     +---------------------------------------------------------------+ 
     |       s       |1 0|0 0|       UCS-2=1000      |       4       | 
     +---------------------------------------------------------------+ 
     |          …––   (U+57DF)        |          …Ιμ    (U+540D)        | 
     +---------------------------------------------------------------+ 
     |          ‡ώ©   (U+7CFB)        |          ‡΄˜    (U+7D71)        | 
     +---------------------------------------------------------------+ 
     |0 0|     3     |       t       |       l       |       d       | 
     +---------------------------------------------------------------+ 
     |       0       | 
     +---------------+ 
    
     +---------------------------------------------------------------+ 
     |0 0|     4     |       h       |       o       |       s       | 
     +---------------------------------------------------------------+ 
     |       t       |1 1|           21              | 
     +-----------------------------------------------+ 
    
  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
7. Intellectual Property Considerations 
    
   It is the intention of Neteka to submit the DNSII protocol and other 
   elements of the multilingual domain name server software to IETF for 
   review, comment or standardization. 
    
   Neteka Inc. has applied for one or more patents on the technology 
   related to multilingual domain name server software and multilingual 
   email server software suite.  If a standard is adopted by IETF and 
   any patents are issued to Neteka with claims that are necessary for 
   practicing the standard, any party will be able to obtain the right 
   to implement, use and distribute the technology or works when 
   implementing, using or distributing technology based upon the 
   specific specifications under fair, reasonable and non-discriminatory 
   terms. 
    
    
8. References 
 
[RFC1700]   J. Reynolds, J. Postel, "ASSIGNED NUMBERS", RFC 
           1700, October 1994. 
  
[ISO10646] ISO/IEC 10646-1:2000. International Standard -- 
           Information technology -- Universal Multiple-Octet Coded 
           Character Set (UCS) 
 
[RFC1034]  Mockapetris, P., "Domain Names - Concepts and  
           Facilities," STD 13, RFC 1034, USC/ISI, November 1987 
    
[RFC1035]  Mockapetris, P., "Domain Names - Implementation and  
           Specification," STD 13, RFC 1035, USC/ISI, November  
           1987 
 
[RFC2119]  S. Bradner, "Key words for use in RFCs to Indicate  
           Requirement Levels," RFC 2119, March 1997 
 
[RFC2130]  C. Weider, et al. _The Report of the IAB Character Set 
           Workshop held 29 February - 1 March, 1996_ RFC 2130, April  
           1997 
 
[RFC2277]  H. Alvestrand, _IETF Policy on Character Sets and Languages_ 
           RFC 2277, January 1998 
 
[RFC2671]  Paul Vixie, "Extension Mechanisms for DNS (EDNS0)", August 
           1999, RFC 2671. 
 
 
 
 
Authors: 
 
Edmon Chung 
Neteka Inc. 
  

DNSII-MDNP        Multilingual Domain Name Protocol         August 2000  
 
2462 Yonge St. Toronto, 
Ontario, Canada M4P 2H5 
edmon@neteka.com 
 
David Leung 
Neteka Inc. 
2462 Yonge St. Toronto, 
Ontario, Canada M4P 2H5 
david@neteka.com 
    











































  



PAFTECH AB 2003-20262026-04-24 13:52:04