Tags for the Identification of Languages
Status of this Memo
The file name of this memo is draft-alvestrand-lang-tags-00.txt
This document is an Internet-Draft and is in full
conformance with all provisions of Section 10 of
RFC2026.
Internet-Drafts are working documents of the Internet
Engineering Task Force (IETF), its areas, and its
working groups. Note that other groups may also
distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a
maximum of six months and may be updated, replaced, or
obsoleted by other documents at any time. It is
inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in
progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be
accessed at http://www.ietf.org/shadow.html.
Abstract
This document describes a language tag for use in cases
where it is desired to indicate the language used in an
information object.
It also defines a Content-Language: header, for use in the
case where one desires to indicate the language of
something that has RFC-822-like headers, like MIME body
parts or Web documents, and a new parameter to the
Multipart/Alternative type, to aid in the usage of the
Content-Language: header.
Tags for the names of languages H. T. Alvestrand
Expires December 1999
Comments on this draft should be sent to the mailing list
<ietf-languages@iana.org>
1. Introduction
There are a number of languages spoken by human beings in
this world.
A great number of these people would prefer to have
information presented in a language that they understand.
In some contexts, it is possible to have information in
more than one language, or it might be possible to provide
tools for assisting in the understanding of a language
(like dictionaries).
A prerequisite for any such function is a means of
labelling the information content with an identifier for
the language in which it is written.
In the tradition of solving only problems that we think we
understand, this document specifies an identifier
mechanism, and one possible use for it.
2. The Language tag
The language tag is composed of 1 or more parts: A primary
language tag and a (possibly empty) series of subtags.
The syntax of this tag in RFC-822 EBNF is:
Language-Tag = Primary-tag *( "-" Subtag )
Primary-tag = 1*8ALPHA
Subtag = 1*8ALPHA
Whitespace is not allowed within the tag.
All tags are to be treated as case insensitive; there exist
conventions for capitalization of some of them, but these
should not be taken to carry meaning.
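As an illustration only, the grammar above can be checked with a regular expression. The sketch below is not part of the specification; the names is_well_formed and same_tag are hypothetical. It tests the syntax only and does not check whether a tag is registered.

```python
import re

# Language-Tag = Primary-tag *( "-" Subtag ), each part being 1*8ALPHA.
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def is_well_formed(tag: str) -> bool:
    """Return True if `tag` matches the Language-Tag grammar (syntax only)."""
    return _TAG_RE.fullmatch(tag) is not None


def same_tag(a: str, b: str) -> bool:
    """Compare two tags without regard to case, as the text requires."""
    return a.lower() == b.lower()
```

Under these assumptions, "en-US" and "i-cherokee" are well formed, while "en US" (whitespace) and "abcdefghi" (nine letters) are not.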
The namespace of language tags is administered by the IANA
according to the rules in section 5 of this document.
The following registrations are predefined:
In the primary language tag:
- All 2-letter tags are interpreted according to ISO
standard 639, "Code for the representation of names of
languages" [ISO 639].
- All 3-letter tags are interpreted according to ISO 639
part 2, "Codes for the representation of names of
languages -- Part 2: Alpha-3 code" [ISO 639-2].
- The value "i" is reserved for IANA-defined registrations.
- The value "x" is reserved for private use. Subtags of
"x" will not be registered by the IANA.
- Other values cannot be assigned except by updating this
standard.
The reason for reserving all other tags is to be open
towards new revisions of ISO 639; the use of "i" and "x" is
the minimum we can do here to be able to extend the
mechanism to meet our requirements.
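The rules for the primary tag can be sketched as a small classifier. This is an illustration under stated assumptions: the function name classify_primary and the returned labels are hypothetical, and the input is assumed to be a syntactically valid tag.

```python
def classify_primary(tag: str) -> str:
    """Classify a tag by its primary tag, following the list above.

    The returned labels are illustrative names, not registered values.
    """
    primary = tag.split("-", 1)[0].lower()
    if primary == "i":
        return "iana"        # IANA-defined registration
    if primary == "x":
        return "private"     # private use, never registered
    if len(primary) == 2:
        return "iso-639"     # 2-letter ISO 639 code
    if len(primary) == 3:
        return "iso-639-2"   # 3-letter ISO 639-2 code
    return "reserved"        # not assignable without updating the standard
```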
In the first subtag:
- All 2-letter codes are interpreted as ISO 3166 alpha-2
country codes denoting the area in which the language is
used.
- Codes of 3 to 8 letters may be registered with the IANA
by anyone who feels a need for one, according to the rules
in section 5 of this document.
The information in the subtag may for instance be:
- Country identification, such as en-US (this usage is
described in ISO 639)
- Dialect or variant information, such as no-nynorsk or
en-cockney
- Languages not listed in ISO 639 that are not variants of
any listed language, which can be registered with the
"i-" prefix, such as i-cherokee
- Script variations, such as az-arabic and az-cyrillic
In the second and subsequent subtags, any value can be
registered.
NOTE: The ISO 639/ISO 3166 convention is that language
names are written in lower case, while country codes are
written in upper case.
This convention is recommended, but not enforced; the tags
are case insensitive.
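For display purposes, the recommended capitalization could be applied as in the following sketch. The function name conventional_case is hypothetical, and lowercasing every subtag other than a 2-letter first subtag is an assumption of this example; the text only states the convention for language names and country codes.

```python
def conventional_case(tag: str) -> str:
    """Lowercase the tag, but uppercase a 2-letter first subtag
    (an ISO 3166 country code), following the ISO convention."""
    parts = tag.split("-")
    result = [parts[0].lower()]
    for position, subtag in enumerate(parts[1:], start=1):
        if position == 1 and len(subtag) == 2:
            result.append(subtag.upper())   # country code
        else:
            result.append(subtag.lower())   # assumption: all else lowercase
    return "-".join(result)
```

Because tags are case insensitive, the result denotes the same tag as the input; only its presentation changes.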
ISO 639 defines a registration authority for additions to
and changes in the list of languages in ISO 639. This
authority is:
International Information Centre for Terminology
(Infoterm)
P.O. Box 130
A-1021 Wien
Austria
Phone: +43 1 26 75 35 Ext. 312
Fax: +43 1 216 32 72
The following codes have been added in 1989 (nothing
later): ug (Uigur), iu (Inuktitut, also called Eskimo), za
(Zhuang), he (Hebrew, replacing iw), yi (Yiddish, replacing
ji), and id (Indonesian, replacing in).
The registration agency for ISO 3166 (country codes) is:
ISO 3166 Maintenance Agency Secretariat
c/o DIN Deutsches Institut fuer Normung
Burggrafenstrasse 6
Postfach 1107
D-10787 Berlin
Germany
Phone: +49 30 26 01 320
Fax: +49 30 26 01 231
The country codes AA, QM-QZ, XA-XZ and ZZ are reserved by
ISO 3166 as user-assigned codes.
ISO 639 part 2 reserves the codes qaa through qtz for
"local use", and says that "these codes may not be
exchanged internationally".
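A receiver that wants to detect these reserved values could test for them as sketched below. The function names are hypothetical, and the sketch assumes that both ranges are compared without regard to case, in line with the case-insensitivity of tags.

```python
import re


def is_user_assigned_country(code: str) -> bool:
    """True for the country codes AA, QM-QZ, XA-XZ and ZZ."""
    if not re.fullmatch(r"[A-Za-z]{2}", code):
        return False
    c = code.upper()
    return c in ("AA", "ZZ") or "QM" <= c <= "QZ" or "XA" <= c <= "XZ"


def is_local_use_language(code: str) -> bool:
    """True for the three-letter codes qaa through qtz."""
    if not re.fullmatch(r"[A-Za-z]{3}", code):
        return False
    return "qaa" <= code.lower() <= "qtz"
```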
2.1 Choice of language tag
One may occasionally be faced with several possible tags
for the same body of text.
Interoperability is best served if all users send the same
tag; therefore, the following guideline is recommended:
1. Use the most precise tagging that you are certain of.
2. When a language has both an ISO 639-1 2-character tag and
an ISO 639-2 3-character tag, use the ISO 639-1
2-character tag.
3. When a language has both an ISO 639-2/T (Terminology) tag
and an ISO 639-2/B (Bibliographic) tag, and these differ,
use the Terminology tag. (NOTE: So far, all languages for
which there is a difference have 2-character tags. So
this situation will hopefully not arise.)
4. When a language has both an IANA-registered tag
(i-something) and an ISO registered tag, use the ISO tag.
5. Do NOT use the UND (Undetermined) tag unless the protocol
in use forces you to give a value for the language tag, even
if you don't know the language. Omitting the tag is
preferred.
6. Do NOT use the MUL (Multiple) tag if the protocol allows
you to use multiple languages, as is the case for the
Content-Language: header.
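Guidelines 2 through 4 amount to a fixed order of preference among the codes known for a language. The following sketch shows one way to express that order; the Codes record and the preferred_tag function are hypothetical names introduced for this example, and guidelines 1, 5 and 6 are left to the sender's judgement.

```python
from typing import NamedTuple, Optional


class Codes(NamedTuple):
    """Codes known for one language; any of them may be absent."""
    iso639_1: Optional[str] = None    # 2-character code
    iso639_2t: Optional[str] = None   # 3-character Terminology code
    iso639_2b: Optional[str] = None   # 3-character Bibliographic code
    iana: Optional[str] = None        # IANA-registered "i-" tag


def preferred_tag(codes: Codes) -> Optional[str]:
    """Pick the tag to send: 639-1, then 639-2/T, then 639-2/B, then IANA."""
    for candidate in (codes.iso639_1, codes.iso639_2t,
                      codes.iso639_2b, codes.iana):
        if candidate:
            return candidate
    return None
```

For example, a language with the codes fr, fra and fre is tagged "fr", and a language with only bod and tib would be tagged "bod".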
2.2 Meaning of the language tag
The language tag always defines a language as spoken (or
written) by human beings for communication of information
to other human beings.
Computer languages are explicitly excluded.
There is no guaranteed relationship between languages whose
tags start out with the same series of subtags; especially,
they are NOT guaranteed to be mutually comprehensible,
although this will sometimes be the case.
Applications should always treat language tags as a single
token; the division into main tag and subtags is an
administrative mechanism, not a navigation aid.
The relationship between the tag and the information it
relates to is defined by the standard describing the
context in which it appears.
So, this section can only give possible examples of its
usage.
- For a single information object, it should be taken as
the set of languages that is required for a complete
comprehension of the complete object.
Example: Simple text.
- For an aggregation of information objects, it should be
taken as the set of languages used inside components of
that aggregation. Examples: Document stores and
libraries.
- For information objects whose purpose in life is
providing alternatives, it should be regarded as a hint
that the material inside is provided in several
languages, and that one has to inspect each of the
alternatives in order to find its language or languages.
In this case, multiple languages need not mean that one
needs to be multilingual to get complete understanding of
the document.
Example: MIME multipart/alternative.
- It would be possible to define (for instance) an SGML DTD
that defines a <LANG xx> tag for indicating that
following or contained text is written in this language,
such that one could write "<LANG FR>C'est la vie</LANG>";
the Norwegian-speaking user could then access a French-
Norwegian dictionary to find out what the quote meant.
2.3 Language-range
Since the writing of RFC 1766, it has become apparent that
there's a need for defining a term for a set of languages
that share some common property. The following definition
of language-range is mostly lifted verbatim from RFC 2068
(HTTP/1.1).
language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" )
A language-range matches a language-tag if it exactly
equals the tag, or if it exactly equals a prefix of the tag
such that the first tag character following the prefix is
"-".
The special range "*" matches every tag. A protocol which
uses language ranges may specify more rules about the
semantics of "*"; for instance, HTTP/1.1 specifies that it
only matches languages not matched by any other range
within an Accept-Language: header.
Note: This use of a prefix matching rule does not imply
that language tags are assigned to languages in such a way
that it is always true that if a user understands a
language with a certain tag, then this user will also
understand all languages with tags for which this tag is a
prefix. The prefix rule simply allows the use of prefix
tags if this is the case.
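The matching rule can be written down directly. In this sketch the function name range_matches is hypothetical, and comparing without regard to case is an assumption carried over from the case-insensitivity of language tags. Protocol-specific refinements of "*", such as the HTTP/1.1 rule mentioned above, are not modelled.

```python
def range_matches(language_range: str, tag: str) -> bool:
    """Return True if `language_range` matches `tag`.

    A range matches if it equals the tag, or if it equals a prefix of
    the tag and the next character of the tag is "-". The range "*"
    matches every tag.
    """
    if language_range == "*":
        return True
    wanted = language_range.lower()
    actual = tag.lower()
    return actual == wanted or actual.startswith(wanted + "-")
```

Note that "en" matches "en-US" but not "eng", because the character following the prefix must be "-".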
3. The Content-Language header
The Content-Language header is intended for use in the case where
one desires to indicate the language(s) of something that
has RFC-822-like headers, like MIME body parts or Web
documents.
The RFC-822 EBNF of the Content-Language header is:
Language-Header = "Content-Language" ":" 1#Language-tag
Note that the Language-Header is allowed to list several
languages in a comma-separated list.
Whitespace is allowed, which means also that one can place
parenthesized comments anywhere in the language sequence.
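A receiver therefore has to discard comments and whitespace before it looks at the tags. The sketch below shows one possible approach; the function names are hypothetical, nested comments and backslash-quoted characters inside comments are handled on the assumption that RFC 822 comment rules apply, and whitespace inside a tag is not repaired.

```python
from typing import List


def strip_comments(value: str) -> str:
    """Remove parenthesized comments, which may be nested."""
    kept = []
    depth = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and depth > 0 and i + 1 < len(value):
            i += 2          # quoted character inside a comment: skip both
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
        i += 1
    return "".join(kept)


def parse_content_language(value: str) -> List[str]:
    """Split a Content-Language value into its language tags."""
    items = (item.strip() for item in strip_comments(value).split(","))
    return [item for item in items if item]   # drop empty list elements
```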
3.1 Examples of Content-Language values
NOTE: NONE of the subtags shown in this document have
actually been assigned; they are used for illustration
purposes only.
Norwegian official document, with parallel text in both
official versions of Norwegian. (Both versions are readable
by all Norwegians).
Content-Type: multipart/alternative;
differences=content-language
Content-Language: no-nynorsk, no-bokmaal
Voice recording from the London docks
Content-type: audio/basic
Content-Language: en-cockney
Document in Sami, which does not have an ISO 639 code, and
is spoken in several countries, but with about half the
speakers in Norway, with six different, mutually
incomprehensible dialects:
Content-type: text/plain; charset=iso-8859-10
Content-Language: i-sami-no (North Sami)
An English-French dictionary
Content-type: application/dictionary
Content-Language: en, fr (This is a dictionary)
An official EC document (in a few of its official
languages)
Content-type: multipart/alternative
Content-Language: en, fr, de, da, el, it
An excerpt from Star Trek
Content-type: video/mpeg
Content-Language: x-klingon
4. Use of Content-Language with Multipart/Alternative
When using the Multipart/Alternative body part of MIME, it
is possible to have the body parts giving the same
information content in different languages. In this case,
one should put a Content-Language header on each of the
body parts, and a summary Content-Language header onto the
Multipart/Alternative itself.
4.1 The differences parameter to multipart/alternative
As defined in RFC 1521, Multipart/Alternative only has one
parameter: boundary.
The common usage of Multipart/Alternative is to have more
than one format of the same message (for example, PostScript
and ASCII).
The use of language tags to differentiate between different
alternatives will certainly not lead all MIME UAs to
present the most sensible body part as default.
Therefore, a new parameter is defined, to allow the
configuration of MIME readers to handle language
differences in a sensible manner.
Name: Differences
Value: One or more of
Content-Type
Content-Language
Further values can be registered with IANA; each must be the
name of a header for which a definition exists in a
published RFC. If not present, Differences=Content-Type is
assumed.
The intent is that the MIME reader can look at these
headers of the message component to make an intelligent
choice of what to present to the user, based on knowledge
about the user preferences and capabilities.
(The intent of having registration with IANA of the fields
used in this context is to maintain a list of usages that a
mail UA may expect to see, not to reject usages.)
(NOTE: The MIME specification [RFC 1521], section 7.2,
states that headers not beginning with "Content-" are
generally to be ignored in body parts. People defining a
header for use with "differences=" should take note of
this.)
The mechanism for deciding which body part to present is
outside the scope of this document.
MIME EXAMPLE:

Content-Type: multipart/alternative;
 differences=Content-Language;
 boundary="limit"
Content-Language: en, fr, de

--limit
Content-Language: fr

Le renard brun et agile saute par dessus le chien paresseux
--limit
Content-Language: de
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Der schnelle braune Fuchs h=FCpft =FCber den faulen Hund
--limit
Content-Language: en

The quick brown fox jumps over the lazy dog
--limit--
When composing a message, the choice of sequence may be
somewhat arbitrary. However, non-MIME mail readers will
show the first body part first, meaning that this should
most likely be the language understood by most of the
recipients.
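Since the mechanism for choosing a body part is outside the scope of this document, the following is only a sketch of what a reader might do. It parses a reduced, two-part version of the example with Python's standard email package and returns the first part that matches the user's preference list. The function name pick_part, the RAW constant and the exact-match policy are assumptions of this example, and each part is assumed to carry a single language tag.

```python
from email import message_from_string

# Reduced version of the MIME example: two alternatives only.
RAW = (
    "Content-Type: multipart/alternative;\n"
    " differences=Content-Language;\n"
    ' boundary="limit"\n'
    "Content-Language: fr, en\n"
    "\n"
    "--limit\n"
    "Content-Language: fr\n"
    "\n"
    "Le renard brun et agile saute par dessus le chien paresseux\n"
    "--limit\n"
    "Content-Language: en\n"
    "\n"
    "The quick brown fox jumps over the lazy dog\n"
    "--limit--\n"
)


def pick_part(raw, preferred):
    """Return the body of the first part whose Content-Language is in
    `preferred` (tried in order), or None if no part matches."""
    msg = message_from_string(raw)
    if not msg.is_multipart():
        return None
    by_language = {}
    for part in msg.get_payload():
        lang = (part.get("Content-Language") or "").strip().lower()
        # Keep the first part seen for each language.
        by_language.setdefault(lang, part.get_payload().strip())
    for lang in preferred:
        body = by_language.get(lang.lower())
        if body is not None:
            return body
    return None
```

A reader whose user prefers German and then English would call pick_part(RAW, ["de", "en"]) and, since no German part is present in this reduced message, obtain the English text.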
5. IANA registration procedure for language tags
Any language tag must start with an existing tag, and
extend it.
This registration form should be used by anyone who wants
to use a language tag not defined by ISO or IANA.
-----------------------------------------------------------
LANGUAGE TAG REGISTRATION FORM
Name of requester :
E-mail address of requester:
Tag to be registered :
English name of language :
Native name of language (transcribed into ASCII):
Reference to published description of the language (book or
article):
-----------------------------------------------------------
The language form must be sent to <ietf-languages@iana.org>
for a 2-week review period before it can be submitted to
IANA. (This is an open list. Requests to be added should
be sent to <ietf-languages-request@iana.org>.)
When the two-week period has passed, the language tag
reviewer, who is appointed by the IETF Applications Area
Director, either forwards the request to IANA@ISI.EDU, or
rejects it because of significant objections raised on the
list.
Decisions made by the reviewer may be appealed to the IESG.
All registered forms are available online in the directory
ftp://ftp.isi.edu/in-notes/iana/assignments/languages/
6. Security Considerations
Security issues are believed to be irrelevant to this memo.
7. Character set considerations
Codes may always be expressed using the US-ASCII character
repertoire (a-z), which is present in most character sets.
The issue of deciding upon the rendering of a character set
based on the language tag is not addressed in this memo;
however, it is thought impossible to make such a decision
correctly for all cases unless means of switching language
in the middle of a text are defined (for example, a
rendering engine that decides font based on Japanese or
Chinese language will fail to work when a mixed
Japanese-Chinese text is encountered).
8. Acknowledgements
This document has benefited from innumerable rounds of
review and comments in various fora of the IETF and the
Internet working groups.
As such, any list of contributors is bound to be incomplete;
please regard the following as only a selection from the
group of people who have contributed to make this document
what it is today.
In alphabetical order:
Tim Berners-Lee, Nathaniel Borenstein, Jim Conklin, Dave
Crocker, Ned Freed, Tim Goodwin, Olle Jarnefors, John
Klensin, Keith Moore, Masataka Ohta, Keld Jorn Simonsen,
Rhys Weatherley, and many, many others.
9. Author's Address
Harald Tveit Alvestrand
EDB Maxware
Pirsenteret
7064 TRONDHEIM
NORWAY
EMail: Harald.Alvestrand@maxware.no
Phone: +47 73 54 57 97
10. References
[ISO 639]
ISO 639:1988 (E/F) - Code for the representation of
names of languages - The International Organization
for Standardization, 1st edition, 1988, 17 pages.
Prepared by ISO/TC 37 - Terminology (principles and
coordination).
[ISO 639-2]
ISO 639-2:1998 - Codes for the representation of names
of languages -- Part 2: Alpha-3 code - edition 1,
1998, 66 pages, prepared by ISO/TC 37/SC 2.
[ISO 3166]
ISO 3166:1988 (E/F) - Codes for the representation of
names of countries - The International Organization
for Standardization, 3rd edition, 1988-08-15.
[RFC 1521]
Borenstein, N., and N. Freed, "MIME Part One:
Mechanisms for Specifying and Describing the Format of
Internet Message Bodies", RFC 1521, Bellcore,
Innosoft, September 1993.
[RFC 1327]
Kille, S., "Mapping between X.400(1988) / ISO 10021
and RFC 822", RFC 1327, University College London, May
1992.
Appendix A: List of language tags
This list is NOT authoritative. It was prepared based on
Keld Simonsen's publicly available lists of codes, which
were prepared from drafts of the standards.
639-1  639-2/T  639-2/B  English name
aa     aar      aar      Afar
ab     abk      abk      Abkhazian
       ace      ace      Achinese
       ach      ach      Acoli
       ada      ada      Adangme
       afa      afa      Afro-Asiatic (Other)
       afh      afh      Afrihili
af     afr      afr      Afrikaans
       aka      aka      Akan
       akk      akk      Akkadian
       ale      ale      Aleut
       alg      alg      Algonquian languages
am     amh      amh      Amharic
       ang      ang      English, Old (ca. 450-1100)
       apa      apa      Apache languages
ar     ara      ara      Arabic
       arc      arc      Aramaic
       arn      arn      Araucanian
       arp      arp      Arapaho
       art      art      Artificial (Other)
       arw      arw      Arawak
as     asm      asm      Assamese
       ath      ath      Athapascan languages
       aus      aus      Australian languages
       ava      ava      Avaric
       ave      ave      Avestan
       awa      awa      Awadhi
ay     aym      aym      Aymara
az     aze      aze      Azerbaijani
       bad      bad      Banda
       bai      bai      Bamileke languages
ba     bak      bak      Bashkir
       bal      bal      Baluchi
       bam      bam      Bambara
       ban      ban      Balinese
       bas      bas      Basa
       bat      bat      Baltic (Other)
       bej      bej      Beja
be     bel      bel      Belarussian (ISO 639-1: Byelorussian)
       bem      bem      Bemba
bn     ben      ben      Bengali (ISO 639-1: Bengali; Bangla)
       ber      ber      Berber (Other)
       bho      bho      Bhojpuri
bh     bih      bih      Bihari
       bik      bik      Bikol
       bin      bin      Bini
bi     bis      bis      Bislama
       bla      bla      Siksika
       bnt      bnt      Bantu (Other)
bo     bod      tib      Tibetan
       bra      bra      Braj
br     bre      bre      Breton
       btk      btk      Batak (Indonesia)
       bua      bua      Buriat
       bug      bug      Buginese
bg     bul      bul      Bulgarian
       cad      cad      Caddo
       cai      cai      Central American Indian (Other)
       car      car      Carib
ca     cat      cat      Catalan
       cau      cau      Caucasian (Other)
       ceb      ceb      Cebuano
       cel      cel      Celtic (Other)
cs     ces      cze      Czech
       cha      cha      Chamorro
       chb      chb      Chibcha
       che      che      Chechen
       chg      chg      Chagatai
       chk      chk      Chuukese
       chm      chm      Mari
       chn      chn      Chinook jargon
       cho      cho      Choctaw
       chp      chp      Chipewyan
       chr      chr      Cherokee
       chu      chu      Church Slavic
       chv      chv      Chuvash
       chy      chy      Cheyenne
       cmc      cmc      Chamic languages
       cop      cop      Coptic
       cor      cor      Cornish
co     cos      cos      Corsican
       cpe      cpe      Creoles and pidgins, English-based (Other)
       cpf      cpf      Creoles and pidgins, French-based (Other)
       cpp      cpp      Creoles and pidgins, Portuguese-based (Other)
       cre      cre      Cree
       crp      crp      Creoles and pidgins (Other)
       cus      cus      Cushitic (Other)
cy     cym      wel      Welsh
       dak      dak      Dakota
da     dan      dan      Danish
       day      day      Dayak
       del      del      Delaware
       den      den      Slave (Athapascan)
de     deu      ger      German
       dgr      dgr      Dogrib
       din      din      Dinka
       div      div      Divehi
       doi      doi      Dogri
       dra      dra      Dravidian (Other)
       dua      dua      Duala
       dum      dum      Dutch, Middle (ca. 1050-1350)
       dyu      dyu      Dyula
dz     dzo      dzo      Dzongkha (Bhutani in ISO 639-1)
       efi      efi      Efik
       egy      egy      Egyptian (Ancient)
       eka      eka      Ekajuk
el     ell      gre      Greek, Modern (post 1453)
       elx      elx      Elamite
en     eng      eng      English
       enm      enm      English, Middle (1100-1500)
eo     epo      epo      Esperanto
et     est      est      Estonian
eu     eus      baq      Basque
       ewe      ewe      Ewe
       ewo      ewo      Ewondo
       fan      fan      Fang
fo     fao      fao      Faroese
fa     fas      per      Persian
       fat      fat      Fanti
fj     fij      fij      Fijian (ISO 639-1: Fiji)
fi     fin      fin      Finnish
       fiu      fiu      Finno-Ugrian (Other)
       fon      fon      Fon
fr     fra      fre      French
       frm      frm      French, Middle (ca. 1400-1600)
       fro      fro      French, Old (842-ca. 1400)
fy     fry      fry      Frisian
       ful      ful      Fulah
       fur      fur      Friulian
       gaa      gaa      Ga
       gay      gay      Gayo
       gba      gba      Gbaya
       gem      gem      Germanic (Other)
       gez      gez      Geez
       gil      gil      Gilbertese
gd     gla      gla      Gaelic (Scots)
ga     gle      gle      Irish
gl     glg      glg      Gallegan (Galician in ISO 639-1)
       glv      glv      Manx
       gmh      gmh      German, Middle High (ca. 1050-1500)
       goh      goh      German, Old High (ca. 750-1050)
       gon      gon      Gondi
       gor      gor      Gorontalo
       got      got      Gothic
       grb      grb      Grebo
       grc      grc      Greek, Ancient (to 1453)
gn     grn      grn      Guarani
gu     guj      guj      Gujarati
       gwi      gwi      Gwich'in
       hai      hai      Haida
ha     hau      hau      Hausa
       haw      haw      Hawaiian
he     heb      heb      Hebrew (iw in 639-1 first edition)
       her      her      Herero
       hil      hil      Hiligaynon
       him      him      Himachali
hi     hin      hin      Hindi
       hit      hit      Hittite
       hmn      hmn      Hmong
       hmo      hmo      Hiri Motu
hr     hrv      scr      Croatian
hu     hun      hun      Hungarian
       hup      hup      Hupa
hy     hye      arm      Armenian
       iba      iba      Iban
       ibo      ibo      Igbo
       ijo      ijo      Ijo
iu     iku      iku      Inuktitut
ie     ile      ile      Interlingue
       ilo      ilo      Iloko
ia     ina      ina      Interlingua (International Auxiliary Language Association)
       inc      inc      Indic (Other)
id     ind      ind      Indonesian (in in 639-1 first edition)
       ine      ine      Indo-European (Other)
ik     ipk      ipk      Inupiak
       ira      ira      Iranian (Other)
       iro      iro      Iroquoian languages
is     isl      ice      Icelandic
it     ita      ita      Italian
jw     jaw      jav      Javanese
ja     jpn      jpn      Japanese
       jpr      jpr      Judeo-Persian
       jrb      jrb      Judeo-Arabic
       kaa      kaa      Kara-Kalpak
       kab      kab      Kabyle
       kac      kac      Kachin
kl     kal      kal      Kalaallisut (Greenlandic in 639-1)
       kam      kam      Kamba
kn     kan      kan      Kannada
       kar      kar      Karen
ks     kas      kas      Kashmiri
ka     kat      geo      Georgian
       kau      kau      Kanuri
       kaw      kaw      Kawi
kk     kaz      kaz      Kazakh
       kha      kha      Khasi
       khi      khi      Khoisan (Other)
km     khm      khm      Khmer (Cambodian in 639-1)
       kho      kho      Khotanese
       kik      kik      Kikuyu
rw     kin      kin      Kinyarwanda
ky     kir      kir      Kirghiz
       kmb      kmb      Kimbundu
       kok      kok      Konkani
       kom      kom      Komi
       kon      kon      Kongo
ko     kor      kor      Korean
       kos      kos      Kosraean
       kpe      kpe      Kpelle
       kro      kro      Kru
       kru      kru      Kurukh
       kua      kua      Kuanyama
       kum      kum      Kumyk
ku     kur      kur      Kurdish
       kut      kut      Kutenai
       lad      lad      Ladino
       lah      lah      Lahnda
       lam      lam      Lamba
lo     lao      lao      Lao (Laotian in 639-1)
la     lat      lat      Latin
lv     lav      lav      Latvian (Latvian, Lettish in 639-1)
       lez      lez      Lezghian
ln     lin      lin      Lingala
lt     lit      lit      Lithuanian
       lol      lol      Mongo
       loz      loz      Lozi
       ltz      ltz      Letzeburgesch
       lua      lua      Luba-Lulua
       lub      lub      Luba-Katanga
       lug      lug      Ganda
       lui      lui      Luiseno
       lun      lun      Lunda
       luo      luo      Luo (Kenya and Tanzania)
       lus      lus      Lushai
       mad      mad      Madurese
       mag      mag      Magahi
       mah      mah      Marshall
       mai      mai      Maithili
       mak      mak      Makasar
ml     mal      mal      Malayalam
       man      man      Mandingo
       map      map      Austronesian (Other)
mr     mar      mar      Marathi
       mas      mas      Masai
       mdr      mdr      Mandar
       men      men      Mende
       mga      mga      Irish, Middle (900-1200)
       mic      mic      Micmac
       min      min      Minangkabau
       mis      mis      Miscellaneous languages
mk     mkd      mac      Macedonian
       mkh      mkh      Mon-Khmer (Other)
mg     mlg      mlg      Malagasy
mt     mlt      mlt      Maltese
       mni      mni      Manipuri
       mno      mno      Manobo languages
       moh      moh      Mohawk
mo     mol      mol      Moldavian
mn     mon      mon      Mongolian
       mos      mos      Mossi
mi     mri      mao      Maori
ms     msa      may      Malay
       mul      mul      Multiple languages
       mun      mun      Munda languages
       mus      mus      Creek
       mwr      mwr      Marwari
my     mya      bur      Burmese
       myn      myn      Mayan languages
       nah      nah      Nahuatl
       nai      nai      North American Indian (Other)
na     nau      nau      Nauru
       nav      nav      Navajo
       nbl      nbl      Ndebele, South
       nde      nde      Ndebele, North
       ndo      ndo      Ndonga
ne     nep      nep      Nepali
       new      new      Newari
       nia      nia      Nias
       nic      nic      Niger-Kordofanian (Other)
       niu      niu      Niuean
nl     nld      dut      Dutch
       non      non      Norse, Old
no     nor      nor      Norwegian
       nso      nso      Sotho, Northern
       nub      nub      Nubian languages
       nya      nya      Nyanja
       nym      nym      Nyamwezi
       nyn      nyn      Nyankole
       nyo      nyo      Nyoro
       nzi      nzi      Nzima
oc     oci      oci      Occitan (post 1500)
       oji      oji      Ojibwa
or     ori      ori      Oriya
om     orm      orm      Oromo
       osa      osa      Osage
       oss      oss      Ossetic
       ota      ota      Turkish, Ottoman (1500-1928)
       oto      oto      Otomian languages
       paa      paa      Papuan (Other)
       pag      pag      Pangasinan
       pal      pal      Pahlavi
       pam      pam      Pampanga
pa     pan      pan      Panjabi (Punjabi in 639-1)
       pap      pap      Papiamento
       pau      pau      Palauan
       peo      peo      Persian, Old (ca. 600-400 B.C.)
       phi      phi      Philippine (Other)
       phn      phn      Phoenician
       pli      pli      Pali
pl     pol      pol      Polish
       pon      pon      Pohnpeian
pt     por      por      Portuguese
       pra      pra      Prakrit languages
       pro      pro      Provencal, Old (to 1500)
ps     pus      pus      Pushto (Pashto, Pushto in 639-1)
       qaa-qtz  qaa-qtz  Reserved for local use
qu     que      que      Quechua
       raj      raj      Rajasthani
       rap      rap      Rapanui
       rar      rar      Rarotongan
       roa      roa      Romance (Other)
rm     roh      roh      Raeto-Romance (Rhaeto-Romance in 639-1)
       rom      rom      Romany
ro     ron      rum      Romanian
rn     run      run      Rundi (Kirundi in 639-1)
ru     rus      rus      Russian
       sad      sad      Sandawe
sg     sag      sag      Sango (Sangho in 639-1)
       sah      sah      Yakut
       sai      sai      South American Indian (Other)
       sal      sal      Salishan languages
       sam      sam      Samaritan Aramaic
sa     san      san      Sanskrit
       sas      sas      Sasak
       sat      sat      Santali
       sco      sco      Scots
       sel      sel      Selkup
       sem      sem      Semitic (Other)
       sga      sga      Irish, Old (to 900)
       shn      shn      Shan
       sid      sid      Sidamo
si     sin      sin      Sinhalese
       sio      sio      Siouan languages
       sit      sit      Sino-Tibetan (Other)
       sla      sla      Slavic (Other)
sk     slk      slo      Slovak
sl     slv      slv      Slovenian
       smi      smi      Sami languages
sm     smo      smo      Samoan
sn     sna      sna      Shona
sd     snd      snd      Sindhi
       snk      snk      Soninke
       sog      sog      Sogdian
so     som      som      Somali
       son      son      Songhai
st     sot      sot      Sotho, Southern (Sesotho in 639-1)
es     spa      spa      Spanish (but note that T code changes to esp in 2003)
sq     sqi      alb      Albanian
       srd      srd      Sardinian
sr     srp      scc      Serbian
       srr      srr      Serer
       ssa      ssa      Nilo-Saharan (Other)
ss     ssw      ssw      Swati (Siswati in 639-1)
       suk      suk      Sukuma
su     sun      sun      Sundanese
       sus      sus      Susu
       sux      sux      Sumerian
sw     swa      swa      Swahili
sv     swe      swe      Swedish
       syr      syr      Syriac
       tah      tah      Tahitian
       tai      tai      Tai (Other)
ta     tam      tam      Tamil
tt     tat      tat      Tatar
te     tel      tel      Telugu
       tem      tem      Timne
       ter      ter      Tereno
       tet      tet      Tetum
tg     tgk      tgk      Tajik
tl     tgl      tgl      Tagalog
th     tha      tha      Thai
       tig      tig      Tigre
ti     tir      tir      Tigrinya
       tiv      tiv      Tiv
       tkl      tkl      Tokelau
       tli      tli      Tlingit
       tmh      tmh      Tamashek
       tog      tog      Tonga (Nyasa)
to     ton      ton      Tonga (Tonga Islands)
       tpi      tpi      Tok Pisin
       tsi      tsi      Tsimshian
tn     tsn      tsn      Tswana (Setswana in 639-1)
ts     tso      tso      Tsonga
tk     tuk      tuk      Turkmen
       tum      tum      Tumbuka
tr     tur      tur      Turkish
       tut      tut      Altaic (Other)
       tvl      tvl      Tuvalu
tw     twi      twi      Twi
       tyv      tyv      Tuvinian
       uga      uga      Ugaritic
ug     uig      uig      Uighur
uk     ukr      ukr      Ukrainian
       umb      umb      Umbundu
       und      und      Undetermined
ur     urd      urd      Urdu
uz     uzb      uzb      Uzbek
       vai      vai      Vai
       ven      ven      Venda
vi     vie      vie      Vietnamese
vo     vol      vol      Volapuk
       vot      vot      Votic
       wak      wak      Wakashan languages
       wal      wal      Walamo
       war      war      Waray
       was      was      Washo
       wen      wen      Sorbian languages
wo     wol      wol      Wolof
xh     xho      xho      Xhosa
       yao      yao      Yao
       yap      yap      Yapese
yi     yid      yid      Yiddish (ji in first edition of 639-1)
yo     yor      yor      Yoruba
       ypk      ypk      Yupik languages
       zap      zap      Zapotec
       zen      zen      Zenaga
za     zha      zha      Zhuang
zh     zho      chi      Chinese
       znd      znd      Zande
zu     zul      zul      Zulu
       zun      zun      Zuni
At the moment I have been unable to find an entry in ISO
639-2 for the following 639-1 code:
sh Serbo-Croatian
This may be a political problem related to recent events in
the Balkans; Serbian (sr, srp, scc) and Croatian (hr, hrv,
scr) both have their own language codes.
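For readers who want to load the list into a program, a row can be split into its columns by noting that a 639-1 code, when present, is the only 2-letter token that can start a row. The sketch below rests on that assumption; the function name parse_row is hypothetical, and the list itself remains non-authoritative.

```python
def parse_row(line):
    """Split one row of the list into (639-1, 639-2/T, 639-2/B, name).

    The 639-1 element is None when the row has no 2-letter code.
    """
    tokens = line.split()
    if len(tokens[0]) == 2:
        alpha2, rest = tokens[0], tokens[1:]
    else:
        alpha2, rest = None, tokens
    terminology, bibliographic = rest[0], rest[1]
    name = " ".join(rest[2:])
    return alpha2, terminology, bibliographic, name
```

For instance, the row for Tibetan yields ("bo", "bod", "tib", "Tibetan"), and the row for Achinese yields (None, "ace", "ace", "Achinese").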
Appendix B: Changes from RFC 1766
- Email list address changed from ietf-types@uninett.no to
ietf-languages@iana.org
- Updated author's address
- Added language-range construct from HTTP/1.1
- Added use of ISO 639-2 language codes
- Added list of language codes
Appendix X: TODO list
Here is the list of changes that need to be done to this
doc before advancing it to Draft or reissuing it.
- Find out whether anyone supports the
multipart/alternative;difference= stuff; rip it out if
not.
- Fix document heading, boilerplate and formatting