<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd"
[
<!ENTITY RFC3490 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3490.xml">
<!ENTITY RFC3491 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3491.xml">
<!ENTITY DRAFTIDNABISPROTOCOL SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-idnabis-protocol.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc strict="yes" ?>
<?rfc toc="no"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<?rfc comments="yes" ?>
<?rfc inline="yes" ?>
<rfc category="std" docName="draft-ietf-idnabis-mappings-03" ipr="pre5378Trust200902">
  <front>
    <title abbrev="IDNA Mapping">Mapping Characters in IDNA</title>

	<author initials="P." surname="Resnick" fullname="Peter W. Resnick">
		<organization>Qualcomm Incorporated</organization>
		<address>
			<postal>
				<street>5775 Morehouse Drive</street>
				<city>San Diego</city>
				<region>CA</region>
				<code>92121-1714</code>
				<country>US</country>
			</postal>
			<phone>+1 858 651 4478</phone>
			<email>presnick@qualcomm.com</email>
			<uri>http://www.qualcomm.com/~presnick/</uri>
		</address>
	</author>
	<author initials='P.' surname="Hoffman" fullname='Paul Hoffman'>
		<organization>VPN Consortium</organization>
		<address>
			<postal>
				<street>127 Segre Place</street>
					<city>Santa Cruz</city>
					<region>CA</region>
					<code>95060</code>
				<country>US</country>
			</postal>
			<phone>1-831-426-9827</phone>
			<email>paul.hoffman@vpnc.org</email>
		</address>
	</author>
		<date month="August" year="2009" />
    <area>Applications</area>
    <workgroup>IDNABIS</workgroup>
    <keyword></keyword>
    <abstract>

			<t>In the original version of the Internationalized Domain Names in Applications (IDNA) protocol, any Unicode code points taken from user input were mapped into a set of Unicode code points that "make sense", which were then encoded and passed to the domain name system (DNS). The current version of IDNA presumes that the input to the protocol comes from a set of "permitted" code points, which it then encodes and passes to the DNS, but does not specify what to do with the result of user input. This document describes the actions taken by an implementation between user input and passing permitted code points to the new IDNA protocol.</t>

    </abstract>

  </front>

  <middle>

    <section anchor="intro" title="Introduction">

			<t>This document describes the operations that can be applied to user input to put it into a form acceptable to the <xref target="I-D.ietf-idnabis-protocol">Internationalized Domain Names in Applications (IDNA) protocol</xref>. It describes a general implementation procedure for mapping in section 
			<xref target="procedure" format="counter" />.</t>

			<t>It should be noted that this document does not specify the behavior of a protocol that appears "on the wire". It describes an operation that is to be applied to user input in order to prepare that user input for use in an "on the network" protocol. As unusual as this may be for an IETF protocol document, it is a necessary operation to maintain interoperability.</t>

		</section>

    <section anchor="procedure" title="The General Procedure">

			<t>This section defines a general algorithm that applications ought to implement in order to produce Unicode code points that will be valid under the IDNA protocol. An application might implement the full mapping as described below, or can choose a different mapping. In fact, an application might want to implement a full mapping that is substantially compatible with the original IDNA protocol instead of the algorithm given here.</t>


			<t>The general algorithm that an application (or the input method provided by an operating system) ought to use is relatively straightforward:</t>
			
			<t>
				<list style="numbers">

					<t>Upper case characters are mapped to their lower case equivalents by using the algorithm for mapping case in Unicode characters.</t>

					<t>Full-width and half-width characters (those defined with Decomposition Types &lt;wide&gt; and &lt;narrow&gt;) are mapped to their decomposition mappings as shown in the Unicode character database.</t>

					<t>All characters are mapped using Unicode Normalization Form C (NFC).</t>

					<t><xref target="I-D.ietf-idnabis-protocol" /> is specified such that the protocol acts on the individual labels of the domain name. If an implementation of this mapping also separates a domain name into labels at the FULL STOP character (U+002E), the following character can be mapped to the FULL STOP before label separation occurs:
						<list style="symbols">
								<!-- t>ARMENIAN FULL STOP (U+0589)</t -->
								<!-- t>ARABIC FULL STOP (U+06D4)</t -->
								<!-- t>SYRIAC SUPRALINEAR FULL STOP (U+0701)</t -->
								<!-- t>SYRIAC SUBLINEAR FULL STOP (U+0702)</t -->
								<!-- t>ETHIOPIC FULL STOP (U+1362)</t -->
								<!-- t>CANADIAN SYLLABICS FULL STOP (U+166E)</t -->
								<!-- t>MONGOLIAN FULL STOP (U+1803)</t -->
								<!-- t>MONGOLIAN MANCHU FULL STOP (U+1809)</t -->
								<!-- t>COPTIC OLD NUBIAN FULL STOP (U+2CF9)</t -->
								<!-- t>COPTIC FULL STOP (U+2CFE)</t -->
								<t>IDEOGRAPHIC FULL STOP (U+3002)</t>
								<!-- t>VAI FULL STOP (U+A60E)</t -->
								<!-- t>PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC FULL STOP (U+FE12)</t -->
								<!-- t>SMALL FULL STOP (U+FE52)</t -->
						</list>
					There are other characters that are used as "full stops" that one could consider mapping as label separators, but their use as such has not been investigated thoroughly.
					</t>
				</list>
			</t>
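
			<t>The four steps above can be sketched in Python as follows. This is a minimal, non-normative illustration, not a reference implementation. The function names map_wide_narrow and map_for_idna are hypothetical. The sketch assumes that Python's str.lower() is an acceptable stand-in for the lower case mapping in step 1, and it uses the Unicode data bundled with the Python interpreter, which may be a later version than Unicode 5.1.</t>

			<figure><artwork><![CDATA[
```python
import unicodedata

IDEOGRAPHIC_FULL_STOP = "\u3002"
FULL_STOP = "\u002e"


def map_wide_narrow(ch):
    """Step 2: map a <wide> or <narrow> character to its decomposition mapping."""
    # unicodedata.decomposition() returns e.g. "<wide> 0041", or "" if none.
    decomposition = unicodedata.decomposition(ch)
    if decomposition.startswith(("<wide>", "<narrow>")):
        code_points = decomposition.split()[1:]
        return "".join(chr(int(cp, 16)) for cp in code_points)
    return ch


def map_for_idna(text, map_label_separators=True):
    """Apply the four mapping steps to user input (hypothetical helper)."""
    # Step 1: lower case (assumption: str.lower() approximates the rule).
    result = text.lower()
    # Step 2: full-width and half-width characters.
    result = "".join(map_wide_narrow(ch) for ch in result)
    # Step 3: Unicode Normalization Form C.
    result = unicodedata.normalize("NFC", result)
    # Step 4: optional, only when this code also splits labels.
    if map_label_separators:
        result = result.replace(IDEOGRAPHIC_FULL_STOP, FULL_STOP)
    return result


if __name__ == "__main__":
    # Full-width "EXAMPLE", IDEOGRAPHIC FULL STOP, full-width "COM".
    print(map_for_idna(
        "\uff25\uff38\uff21\uff2d\uff30\uff2c\uff25\u3002\uff23\uff2f\uff2d"))
```
]]></artwork></figure>

			<t>The map_label_separators flag reflects that step 4 only applies when the same implementation also separates the domain name into labels. A caller that passes whole names to another component for splitting could set it to False.</t>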

			<t>Definitions for the rules in this algorithm can be found in <xref target="Unicode51" />. Specifically:
			
			<list style="symbols">

				<t>Unicode Normalization Form C can be found in Annex #15 of <xref target="Unicode51" />.</t>

				<t>In order to map upper case characters to their lower case equivalents (defined in section 3.13 of <xref target="Unicode51" />), first map characters to the "Lowercase_Mapping" property (the "&lt;lower&gt;" entry in the second column) in <eref target="http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt"/>, if any. Then, map characters to the "Simple_Lowercase_Mapping" property (the fourteenth column) in <eref target="http://www.unicode.org/Public/UNIDATA/UnicodeData.txt"/>, if any.</t>
				
				<t>In order to map full-width and half-width characters to their decomposition mappings, map any character whose "Decomposition_Type" (contained in the first part of the sixth column) in <eref target="http://www.unicode.org/Public/UNIDATA/UnicodeData.txt"/> is either "&lt;wide&gt;" or "&lt;narrow&gt;" to the "Decomposition_Mapping" of that character (contained in the second part of the sixth column) in <eref target="http://www.unicode.org/Public/UNIDATA/UnicodeData.txt"/>.</t>

	    	<t>The <eref target="http://www.unicode.org/Public/UNIDATA/UCD.html"/> web page has useful descriptions of the contents of these files.</t>
	    	
			</list>
			
			If the mappings in this document are applied to versions of Unicode later than Unicode 5.1, the later versions of the Unicode Standard should be consulted.</t>
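
			<t>As an illustration of the column positions named above, the following Python sketch reads lines in the semicolon-separated UnicodeData.txt format and builds two lookup tables: one for Simple_Lowercase_Mapping (fourteenth column) and one for the &lt;wide&gt; and &lt;narrow&gt; decomposition mappings (sixth column). It is a hedged sketch only. The names SAMPLE_LINES, parse_line, and build_tables are hypothetical, the three sample lines are meant to mirror entries in that file and should be checked against the published data, and SpecialCasing.txt is not handled here.</t>

			<figure><artwork><![CDATA[
```python
# Sample lines in the UnicodeData.txt format (assumption: they mirror
# the published file; a real implementation would read the file itself).
SAMPLE_LINES = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "FF21;FULLWIDTH LATIN CAPITAL LETTER A;Lu;0;L;<wide> 0041;;;;N;;;;FF41;",
    "FF76;HALFWIDTH KATAKANA LETTER KA;Lo;0;L;<narrow> 30AB;;;;N;;;;;",
]


def parse_line(line):
    """Return (char, decomposition type, decomposition mapping, simple lower)."""
    fields = line.strip().split(";")
    # Sixth column: optional "<type>" tag followed by hex code points.
    tokens = fields[5].split()
    decomp_type = None
    if tokens and tokens[0].startswith("<"):
        decomp_type, tokens = tokens[0], tokens[1:]
    decomp_mapping = "".join(chr(int(t, 16)) for t in tokens)
    # Fourteenth column: Simple_Lowercase_Mapping, empty if there is none.
    simple_lower = chr(int(fields[13], 16)) if fields[13] else None
    return chr(int(fields[0], 16)), decomp_type, decomp_mapping, simple_lower


def build_tables(lines):
    """Build the lower case table and the wide/narrow table."""
    lowercase, width = {}, {}
    for line in lines:
        ch, decomp_type, decomp_mapping, simple_lower = parse_line(line)
        if simple_lower is not None:
            lowercase[ch] = simple_lower
        if decomp_type in ("<wide>", "<narrow>"):
            width[ch] = decomp_mapping
    return lowercase, width


if __name__ == "__main__":
    lowercase, width = build_tables(SAMPLE_LINES)
    print(sorted(hex(ord(ch)) for ch in width))
```
]]></artwork></figure>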

			<t>This is a minimal set of mappings that an application should strongly consider applying. Many other mappings are possible.</t>

    </section>

    <section anchor="IANA" title="IANA Considerations">

      <t>This memo includes no request to IANA.</t>

    </section>

    <section anchor="Security" title="Security Considerations">

			<t>This document suggests creating mappings that might cause confusion for some users while alleviating confusion for others. Such confusion is not covered in any depth in this document (nor in the other IDNA-related documents).</t>

		</section>

  </middle>

  <back>

    <references title="Normative References">
		&DRAFTIDNABISPROTOCOL;

		<reference anchor="Unicode51">
				<front>
					<title abbrev="Unicode 5.1">The Unicode Standard, Version 5.1.0</title>
				<author>
				<organization>The Unicode Consortium</organization>
				<address />
				</author>
				<date year="2008" />
				</front>
				<annotation>defined by: The Unicode Standard, Version 5.0, Boston, MA, Addison-Wesley, 2007, ISBN 0-321-48091-0, as amended by Unicode 5.1.0 (<eref target="http://www.unicode.org/versions/Unicode5.1.0/" />).</annotation>
		  </reference>
    </references>
  </back>
</rfc>
