Internet Draft Yoshiro Yoneya
draft-ietf-idn-jpchar-00.txt Yasuhiro Morishita
November 17, 2000 JPNIC
Expires May 17, 2001
Japanese characters in multilingual domain name label
Status of this memo
This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document describes Japanese characters and their canonicalization
rules in multilingual domain name labels. It is based on
discussions and examinations within JPNIC.
Despite the IDN WG rough consensus that the character set for multilingual
domain names is UCS [UCS], the most popular character set used in
Japan is Japanese Industrial Standards X 0208 [JISX0208], hereafter
abbreviated as "JIS". As a result, many PCs and most PDAs in Japan,
including mobile phones, can display only JIS and ASCII characters.
Therefore, it is strongly recommended that Japanese characters used in
multilingual domain names be drawn from the common part of JIS, ASCII
and UCS.
Furthermore, for historical reasons, JIS has many compatible code
points for Kana and alphanumerics. Such compatible code points are
still widely used, so these characters SHOULD be accepted,
especially in the user interface, and MUST be canonicalized before
transmission on the wire. The former requirement should be implemented
as localization, and the latter must be implemented as
internationalization.
1. Japanese characters in multilingual domain name labels
In principle, a domain name is a symbolic name for a resource on the
Internet that is easy for Internet users to understand and memorize.
Internationalization or multilingualization of domain names
MUST obey this principle. That is, characters in multilingual
domain name labels SHOULD be unambiguous.
JIS contains many characters, including graphical and compatible
characters. For domain names, however, the characters significant for
representing names are Kanji, Hiragana and Katakana [CJK]. Therefore,
according to this principle, Japanese characters in multilingual domain
names MUST be Kanji, Hiragana or Katakana in JIS.
The file "idntabjp10.txt" defines the Japanese characters that can be
used in multilingual domain name labels. It is in the format of
[VERSION], with the corresponding JIS code point added as a 3rd field.
Some of these characters, such as PROLONGED SOUND MARK (U+30FC), are
categorized as graphical characters in JIS, but they are used as part of
Kanji, Hiragana or Katakana text. All characters in this file are in
canonicalized form.
2. Canonicalization rules of Japanese characters in multilingual
domain name labels
This section describes the canonicalization rules in two parts. The
first covers "localization", and the second covers
"internationalization". In other words, the first applies at the
Input/Display level, and the second at the API level [IDNA].
2.1 Localization: Characters to be canonicalized before NAMEPREP
As mentioned above, JIS has many compatible characters that are
regarded as alphanumeric or Katakana. The former are the so-called
FULL-WIDTH alphanumerics, and the latter are the so-called HALF-WIDTH
kana. These characters are prohibited in [NAMEPREP], but are still
widely used on many PCs and most PDAs in Japan. Hence, application
software that handles Japanese characters in multilingual domain name
labels SHOULD accept these compatible characters as input and
canonicalize them before applying [NAMEPREP].
The file "idntabjpcanon10.txt" defines the compatible characters, with
the canonicalized character code added as a 3rd field; that is, it is a
mapping table from FULL-WIDTH alphanumerics to ASCII, and from
HALF-WIDTH kana to Katakana.
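The kind of mapping described above can be sketched as follows. This is
only an illustration: it does not reproduce the contents of
"idntabjpcanon10.txt". It assumes that FULL-WIDTH digits and Latin
letters sit 0xFEE0 above their ASCII counterparts in UCS, and it uses
the Unicode compatibility decomposition (via Python's unicodedata
module) as a stand-in for the HALF-WIDTH kana entries. The names
build_canon_table and canonicalize_width are hypothetical.

```python
import unicodedata


def build_canon_table():
    """Build an illustrative width-canonicalization table (not the real file)."""
    table = {}
    # FULLWIDTH digits, upper-case and lower-case Latin letters:
    # each is exactly 0xFEE0 above the corresponding ASCII code point.
    for start, end in ((0xFF10, 0xFF19), (0xFF21, 0xFF3A), (0xFF41, 0xFF5A)):
        for cp in range(start, end + 1):
            table[cp] = chr(cp - 0xFEE0)
    # HALFWIDTH katakana letters U+FF66..U+FF9D: take the Unicode
    # compatibility mapping as an assumed equivalent of the draft's table.
    # The HALFWIDTH voiced sound marks (U+FF9E, U+FF9F) are left untouched.
    for cp in range(0xFF66, 0xFF9E):
        table[cp] = unicodedata.normalize("NFKC", chr(cp))
    return table


CANON = build_canon_table()


def canonicalize_width(label):
    """Map FULL-WIDTH alphanumerics to ASCII and HALF-WIDTH kana to Katakana."""
    return label.translate(CANON)
```

With this sketch, a FULL-WIDTH "JPNIC1" becomes the ASCII string
"JPNIC1", and HALF-WIDTH KA, NA becomes Katakana KA, NA. Characters
outside the table pass through unchanged.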
The file "idntabjpcomp10.txt" defines compatible character sequences
to be composed, with the canonicalized character code added as a 3rd
field; that is, it is a composition table for Kana and the voiced sound
mark.
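Composition of a Kana with a following voiced sound mark can be
sketched as below. The actual pairs in "idntabjpcomp10.txt" are not
reproduced here. As an assumption, this sketch treats the spacing marks
U+309B, U+309C, U+FF9E and U+FF9F as equivalent to the combining marks
U+3099 and U+309A, and relies on Unicode canonical composition (NFC) to
decide whether a precomposed Kana exists. The function name
compose_voiced is hypothetical.

```python
import unicodedata

# Spacing (semi-)voiced sound marks mapped to their combining forms.
# Treating them as equivalent is an assumption made for this sketch.
_MARKS = {
    "\u309b": "\u3099",  # KATAKANA-HIRAGANA VOICED SOUND MARK
    "\uff9e": "\u3099",  # HALFWIDTH KATAKANA VOICED SOUND MARK
    "\u309c": "\u309a",  # KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
    "\uff9f": "\u309a",  # HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
}


def compose_voiced(text):
    """Replace 'Kana + sound mark' with the precomposed Kana when one exists."""
    out = []
    for ch in text:
        mark = _MARKS.get(ch)
        if mark is not None and out:
            composed = unicodedata.normalize("NFC", out[-1] + mark)
            if len(composed) == 1:
                # A single precomposed character exists: use it.
                out[-1] = composed
                continue
        # No composition possible: keep the character as it was.
        out.append(ch)
    return "".join(out)
```

A mark that follows a character with no precomposed voiced form (for
example Katakana A) is left as is, so that a later stage can reject it.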
The recommended order for applying the canonicalization rules is as
follows:
(1) "idntabjpcanon10"
(2) "idntabjpcomp10"
This is the local part of canonicalization.
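The following sketch illustrates why the order matters. The two
dictionaries are tiny hypothetical excerpts written for this example,
not the contents of the real tables, and the helper names are
assumptions. The composition entries are keyed on Katakana, so a
HALF-WIDTH kana must first be mapped by step (1) before step (2) can
match it.

```python
# Hypothetical excerpt standing in for "idntabjpcanon10" (single characters).
CANON = {
    "\uff76": "\u30ab",  # HALFWIDTH KA -> KATAKANA KA
    "\uff8a": "\u30cf",  # HALFWIDTH HA -> KATAKANA HA
    "\uff2a": "J",       # FULLWIDTH J  -> ASCII J
    "\uff30": "P",       # FULLWIDTH P  -> ASCII P
}

# Hypothetical excerpt standing in for "idntabjpcomp10" (two-character sequences).
COMP = {
    "\u30ab\uff9e": "\u30ac",  # KA + voiced mark      -> GA
    "\u30cf\uff9f": "\u30d1",  # HA + semi-voiced mark -> PA
}


def apply_char_table(text, table):
    """Step (1): replace each character that has an entry in the table."""
    return "".join(table.get(ch, ch) for ch in text)


def apply_sequence_table(text, table):
    """Step (2): replace each two-character sequence that has an entry."""
    out, i = [], 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in table:
            out.append(table[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def canonicalize_local(label):
    """Apply the two tables in the recommended order: (1) then (2)."""
    return apply_sequence_table(apply_char_table(label, CANON), COMP)
```

With these excerpts, HALF-WIDTH KA followed by the HALF-WIDTH voiced
sound mark becomes Katakana GA. Applying the tables in the opposite
order would leave Katakana KA followed by an uncomposed mark.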
2.2 Internationalization: Characters to be canonicalized in NAMEPREP
Japanese characters in multilingual domain name labels MUST be
characters defined in "idntabjp10". Any other characters
SHOULD be canonicalized by [NAMEPREP].
[NAMEPREP] is the common and recommended rule set for IDN.
This is the international part of canonicalization.
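A repertoire check of this kind might look like the sketch below.
Because the contents of "idntabjp10" are not reproduced here, the sketch
approximates the table under stated assumptions: a character is accepted
if it lies in the UCS ranges for Hiragana, Katakana, PROLONGED SOUND
MARK or CJK Unified Ideographs, and if Python's "shift_jis" codec
encodes it as a two-byte JIS X 0208 character. Letting ASCII letters,
digits and hyphen pass is also an assumption. The function names are
hypothetical.

```python
def is_jis_kanji_or_kana(ch):
    """Approximate test for 'Kanji, Hiragana or Katakana in JIS'."""
    cp = ord(ch)
    in_range = (
        0x3041 <= cp <= 0x3093       # Hiragana letters
        or 0x30A1 <= cp <= 0x30F6    # Katakana letters
        or cp == 0x30FC              # PROLONGED SOUND MARK
        or 0x4E00 <= cp <= 0x9FFF    # CJK Unified Ideographs (Kanji candidates)
    )
    if not in_range:
        return False
    try:
        # JIS X 0208 characters are encoded as two bytes in Shift_JIS.
        return len(ch.encode("shift_jis")) == 2
    except UnicodeEncodeError:
        # The character has no JIS X 0208 code point.
        return False


def rejected_characters(label):
    """Return the characters of a label that fall outside the assumed repertoire."""
    rejected = []
    for ch in label:
        is_ldh = ch.isascii() and (ch.isalnum() or ch == "-")
        if not is_ldh and not is_jis_kanji_or_kana(ch):
            rejected.append(ch)
    return rejected
```

An empty result means every character of the label is acceptable under
these assumptions. HALF-WIDTH kana and JIS graphical symbols such as
WHITE STAR (U+2606) are reported as rejected.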
3. Security considerations
None in particular.
4. References
[UCS] "Universal Multiple-Octet Coded Character Set",
ISO/IEC 10646-1:1993, ISBN 0-201-61633-5
[JISX0208] "Japanese Industrial Standards",
Information Technology (Terms/Code/Date elements)-99,
ISBN 4-542-12976-4
[IDNREQ] "Requirements of Internationalized Domain Names",
draft-ietf-idn-requirements-03.txt, Jun 2000, Z Wenzel, J Seng
[NAMEPREP] "Preparation of Internationalized Host Names",
draft-ietf-idn-nameprep-00.txt, Jul 2000, P Hoffman, M Blanchet
[CJK] "Han Ideograph (CJK) for Internationalized Domain Names",
draft-ietf-idn-cjk-00.txt, Sep 2000, J Seng, Y Yoneya,
K Huang, K Kyongsok
[VERSION] "Handling versions of internationalized domain names protocols",
draft-ietf-idn-version-00.txt, Nov 2000, M Blanchet
5. Acknowledgements
JPNIC IDN-TF members.
6. Authors' Addresses
Yoshiro Yoneya
Japan Network Information Center
Fuundo Bldg 1F, 1-2 Kanda-ogawamachi
Chiyoda-ku Tokyo 101-0052, Japan
yone@nic.ad.jp
Yasuhiro Morishita
Japan Network Information Center
Fuundo Bldg 1F, 1-2 Kanda-ogawamachi
Chiyoda-ku Tokyo 101-0052, Japan
yasuhiro@nic.ad.jp