Internet-Draft                                       H. Alvestrand 
draft-alvestrand-lang-tag-v2-02.txt                    EDB Maxware                   
Target Category: Standards Track                                     
                                                         June 2000 
Obsoletes: RFC 1766                         Expires: December 2000 
 
 
 
 
 
 
 
 
 

Tags for the Identification of Languages 
 

Status of this Memo

     The file name of this memo is draft-alvestrand-lang-tag-v2-02.txt.
     This document is an Internet-Draft and is in full conformance with
     all provisions of Section 10 of RFC 2026.
     Internet-Drafts are working documents of the Internet Engineering
     Task Force (IETF), its areas, and its working groups.  Note that
     other groups may also distribute working documents as
     Internet-Drafts.
     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other
     documents at any time.  It is inappropriate to use
     Internet-Drafts as reference material or to cite them other than
     as "work in progress."
     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/ietf/1id-abstracts.txt
     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html.
     
Comments on this draft should be sent to the mailing list
<ietf-languages@iana.org>.

Abstract 
This document describes a language tag for use in cases where it is 
desired to indicate the language used in an information object. 
It also defines a "Content-Language:" header, for use in the case where
one desires to indicate the language of something that has RFC-822-like
headers, like MIME body parts or Web documents, and a new parameter to
the Multipart/Alternative type, to aid in the usage of the
Content-Language: header.
 
Tags for the names of languages                  Harald Alvestrand 
draft-alvestrand-lang-tag-v2-02.txt          Expires December 2000 
 
1. Introduction 
 
There are a number of languages presently or previously used by human 
beings in this world. 
A great number of these people would prefer to have information 
presented in a language which they understand. 
In some contexts, it is possible to have information available in more 
than one language, or it might be possible to provide tools  (such as 
dictionaries) to assist in the understanding of a language. 
In other cases, it may be desirable to use a computer program to 
convert information from one format (such as plaintext) into another 
(such as computer-synthesized speech, or Braille, or high-quality print 
renderings). 
 
A prerequisite for any such function is a means of labelling the 
information content with an identifier for the language that is used in 
this information content. 
This document specifies an identifier mechanism, and one possible use 
for it. 
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
"SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this 
document are to be interpreted as described in [RFC 2119]. 

2. The Language tag 

2.1 Language tag syntax 
The language tag is composed of one or more parts: A primary language 
tag and a (possibly empty) series of subtags. 
 
The syntax of this tag in RFC 2234 ABNF is: 
 Language-Tag = Primary-tag *( "-" Subtag ) 
 Primary-tag = 1*8ALPHA 
 Subtag = 1*8ALPHA 
 
All tags are to be treated as case insensitive; there exist conventions 
for capitalization of some of them, but these should not be taken to 
carry meaning. For instance, ISO 3166 recommends that country codes are 
capitalized (MN Mongolia), while ISO 639 recommends that language codes 
are written in lower case (mn Mongolian). 
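
As an illustration only, the grammar and the case-insensitivity rule
above can be sketched in Python. The function names is_well_formed and
same_tag are hypothetical and are not defined by this document; the
sketch checks syntax only and says nothing about whether a tag is
actually assigned or registered.

```python
import re

# Language-Tag = Primary-tag *( "-" Subtag ); every part is 1*8ALPHA.
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def is_well_formed(tag: str) -> bool:
    """Return True if the string matches the Language-Tag grammar."""
    return _TAG_RE.fullmatch(tag) is not None


def same_tag(a: str, b: str) -> bool:
    """Compare two tags without regard to case."""
    return a.lower() == b.lower()
```

Under this sketch, "en-US" and "i-mingo" are well formed, "en--US" and
a nine-letter primary tag are not, and "MN" compares equal to "mn".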





 
 
2.2 Language tag sources 
 
The namespace of language tags is administered by the IANA according to
the rules in section 4 of this document.
The following registrations are predefined: 
In the primary language tag: 
- All 2-letter tags are interpreted according to assignments found in
  ISO standard 639, "Code for the representation of names of languages"
  [ISO 639], or subsequently made by the standard's registration
  authority.
  (Note: A revision is underway, and is expected to be released as ISO
  639-1:2000.)
- All 3-letter tags are interpreted according to assignments found in
  ISO 639 part 2, "Codes for the representation of names of languages
  -- Part 2: Alpha-3 code" [ISO 639-2], or subsequently made by the
  standard's registration authority.
 
- The value "i" is reserved for IANA-defined registrations.
- The value "x" is reserved for private use. Subtags of "x" shall not 
  be registered by the IANA. 
- Other values shall not be assigned except by revision of this 
  standard. 
The reason for reserving all other tags is to be open towards new 
revisions of ISO 639; the use of "i" and "x" is the minimum we can do 
here to be able to extend the mechanism to meet our immediate 
requirements. 
In the first subtag: 
- All 2-letter codes are interpreted as ISO 3166 alpha-2 country codes 
  denoting the area in which the language is used. 
- Codes of 3 to 8 letters may be registered with the IANA, according to
  the rules in section 4 of this document.
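
The rules above for the primary tag and the first subtag can be
sketched as a simple classification. This is an illustrative reading of
the rules, not part of the specification; the function names and the
returned strings are hypothetical, and the input is assumed to be a
syntactically valid tag.

```python
def primary_tag_source(tag: str) -> str:
    """Say which rule governs the primary tag of a well-formed tag."""
    primary = tag.split("-", 1)[0].lower()
    if primary == "i":
        return "IANA registration"
    if primary == "x":
        return "private use"
    if len(primary) == 2:
        return "ISO 639"
    if len(primary) == 3:
        return "ISO 639-2"
    # All other values are reserved for future revisions.
    return "reserved"


def first_subtag_kind(tag: str) -> str:
    """Say how the first subtag, if any, is to be interpreted."""
    parts = tag.split("-")
    if len(parts) < 2:
        return "none"
    if parts[0].lower() == "x":
        return "private use"
    if len(parts[1]) == 2:
        return "ISO 3166 country code"
    if 3 <= len(parts[1]) <= 8:
        return "IANA-registered subtag"
    # Single-letter first subtags are not covered by the rules above.
    return "not defined"
```

For example, this sketch classifies "en-US" as an ISO 639 primary tag
with an ISO 3166 country code, and "no-nyn" as an ISO 639 primary tag
with an IANA-registered subtag.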
The information in the subtag may for instance be: 
- Country identification, such as en-US (this usage is described in ISO 
  639) 
- Dialect or variant information, such as no-nyn (nynorsk) or en-scouse 
- Languages not listed in ISO 639 that are not variants of any listed
  language, which can be registered with the i-prefix, such as
  i-cherokee
- Script variations, such as az-Arab and az-Cyrl (Azerbaijani in Arabic 
  or Cyrillic script - these script codes are suggested by the pending 
  script code standard ISO/DIS 15924) 
 

 
 
This document does not place any restriction on what values one can
register here, as long as they conform to the rules in section 4.
     NOTE IN DRAFT: It has been suggested that subtags of 4 characters 
     be reserved for ISO/DIS 15924 codes. Opinions for and against are 
     sought. 
ISO 639 defines a registration authority for additions to and changes 
in the list of languages in ISO 639. This authority is: 
      International Information Centre for Terminology (Infoterm) 
      P.O. Box 130 
      A-1021 Wien 
      Austria 
      Phone: +43 1  26 75 35 Ext. 312 
      Fax:   +43 1 216 32 72 
 
ISO 639-2 defines a registration authority for additions to and changes 
in the list of languages in ISO 639-2. This authority is: 
     Library of Congress 
     Network Development and MARC Standards Office 
     Washington, D.C. 20540 
     USA 
     Phone: +1 202 707 6237 
     Fax:   +1 202 707 0115 
          URL: http://www.loc.gov/standards/iso639 
 
The registration agency for ISO 3166 (country codes) is: 
      ISO 3166 Maintenance Agency Secretariat 
      c/o DIN Deutsches Institut fuer Normung 
      Burggrafenstrasse 6 
      Postfach 1107 
      D-10787 Berlin 
      Germany 
      Phone: +49 30 26 01 320 
      Fax:   +49 30 26 01 231 
 
ISO 3166 reserves the country codes AA, QM-QZ, XA-XZ and ZZ as
user-assigned codes.

2.3 Choice of language tag
One may occasionally be faced with several possible tags for the same 
body of text. 
Interoperability is best served if all users send the same tag, and use 
the same tag for the same language for all documents. Exact 
requirements may need to vary by application area; if so, the 
application protocol specification MUST specify how the procedure 
varies from the one given here. 

 
 
The text below is based on the set of tags known to the tagging entity. 
1. Use the most precise tagging known to the sender that can be 
  ascertained. 
2. When a language has both an ISO 639-1 2-character tag and an
  ISO 639-2 3-character tag, you MUST use the ISO 639-1 2-character tag.
3. When a language has no ISO 639-1 2-character tag, and the ISO 639-2/T
  (Terminology) tag and the ISO 639-2/B (Bibliographic) tag differ, you
  MUST use the Terminology tag.
  NOTE: At present, all languages for which there is a difference have
  2-character tags, and the displeasure of developers about the
  existence of 2 tag sets has been adequately communicated to ISO, so
  this situation will hopefully not arise.
4. When a language has both an IANA-registered tag (i-something) and an 
  ISO registered tag, you MUST use the ISO tag. 
  NOTE: When such a situation is discovered, the IANA-registered tag 
  SHOULD be deprecated as soon as possible. 
5. You SHOULD NOT use the UND (Undetermined) tag unless the protocol in 
  use forces you to give a value for the language tag, even if the 
  language is unknown. Omitting the tag is preferred. 
6. You MUST NOT use the MUL (Multiple) tag if the protocol allows you to 
  use multiple languages, as is the case for the Content-Language: 
  header. 
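
Rules 2 to 5 above can be sketched as a precedence order over the tags
known to the tagging entity. The KnownTags record and the preferred_tag
function are hypothetical names used only for illustration; rule 1
(precision) and rule 6 (MUL) depend on context and are not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownTags:
    """Tags a tagging entity might know for one language (assumed shape)."""
    iso639_1: Optional[str] = None     # 2-character tag
    iso639_2_t: Optional[str] = None   # 3-character Terminology tag
    iso639_2_b: Optional[str] = None   # 3-character Bibliographic tag
    iana: Optional[str] = None         # IANA-registered tag (i-something)


def preferred_tag(known: KnownTags) -> Optional[str]:
    """Pick one tag following the precedence of rules 2 to 5."""
    if known.iso639_1:
        return known.iso639_1          # rule 2: 2-character tag wins
    if known.iso639_2_t:
        return known.iso639_2_t        # rule 3: Terminology over Bibliographic
    if known.iso639_2_b:
        return known.iso639_2_b
    if known.iana:
        return known.iana              # rule 4: only when no ISO tag exists
    return None                        # rule 5: omit the tag rather than use UND
```

With this sketch, a language known as "de", "deu" and "ger" is tagged
"de", and a language known only as "i-mingo" keeps its IANA tag.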
NOTE: In order to avoid versioning difficulties in applications such as 
that of RFC 1766, the ISO 639 RA-JAC has agreed on the following policy 
statement: 
 
  "After the publication of ISO/DIS 639-1 as an International Standard, 
  no new 2-letter code shall be added to ISO 639-1 unless a 3-letter 
  code is also added at the same time to ISO 639-2. In addition, no 
  language with a 3-letter code available at the time of publication of 
  ISO 639-1 which at that time had no 2-letter code shall be 
  subsequently given a 2-letter code."
   
This will ensure that, for example, a user who implements "haw"
(Hawaiian), which currently has no 2-letter code, will not find his or
her data invalidated by eventual addition of a 2-letter code for that
language.
 

2.4 Meaning of the language tag 
 
The language tag always defines a language as spoken (or written, 
signed or otherwise signalled) by human beings for communication of 
information to other human beings. 
Computer languages such as programming languages are explicitly 
excluded. 

 
 
There is no guaranteed relationship between languages whose tags begin 
with the same series of subtags; specifically, they are NOT guaranteed 
to be mutually intelligible, although it will sometimes be the case 
that they are. 
Applications should always treat a language tag as a single token; the 
division into main tag and subtags is an administrative mechanism, not 
a navigation aid. 
The relationship between the tag and the information it relates to is 
defined by the standard describing the context in which it appears. 
Accordingly, this section can only give possible examples of its usage. 
- For a single information object, it should be taken as the set of 
  languages that is required for a complete comprehension of the 
  complete object. 
  Example: Plain text documents. 
- For an aggregation of information objects, it should be taken as the 
  set of languages used inside components of that aggregation.  
  Examples: Document stores and libraries. 
- For information objects whose purpose is to provide alternatives, it 
  should be regarded as a hint that the material inside is provided in 
  several languages, and that one has to inspect each of the 
  alternatives in order to find its language or languages.  In this 
  case, multiple languages need not mean that one needs to be 
  multilingual to get complete understanding of the document. 
  Example: MIME multipart/alternative. 
- In markup languages, such as HTML, it is possible to define a 
  construct embedding a language tag to indicate that contained text is 
  written in this language, such that one could write <DIV 
  lang="FR">C'est la vie</DIV> inside a Norwegian document; the 
  Norwegian-speaking user could then access a French-Norwegian 
  dictionary to find out what the marked section meant. 
  If the user were listening to that document through a speech
  synthesis interface, this information could be used to signal the
  synthesizer to appropriately apply French text-to-speech
  pronunciation rules to that span of text, instead of misapplying the
  Norwegian rules.
 

2.5 Language-range 
Since the publication of RFC 1766, it has become apparent that there is 
a need to define a term for a set of languages that share some common 
property. The following definition of language-range is derived from 
RFC 2616 (HTTP/1.1). 
 
          language-range  = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) / "*" ) 
 

 
 
A language-range matches a language-tag if it exactly equals the tag, 
or if it exactly equals a prefix of the tag such that the first tag 
character following the prefix is "-". 
The special range "*" matches any tag. A protocol which uses language
ranges may specify additional rules about the semantics of "*"; for 
instance, HTTP/1.1 specifies that it only matches languages not matched 
by any other range within an "Accept-Language:" header. 
NOTE: This use of a prefix matching rule does not imply that language 
tags are assigned to languages in such a way that it is always true 
that if a user understands a language with a certain tag, then this 
user will also understand all languages with tags for which this tag is 
a prefix. The prefix rule simply allows the use of prefix tags if this 
is the case. 
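
The matching rule can be sketched as follows. The function name
range_matches is hypothetical; the comparison is made without regard to
case, as section 2.1 requires, and any protocol-specific semantics of
"*" (such as those of HTTP/1.1) are not modelled.

```python
def range_matches(language_range: str, tag: str) -> bool:
    """Return True if the language-range matches the language tag."""
    if language_range == "*":
        return True
    r = language_range.lower()
    t = tag.lower()
    # Exact match, or a prefix immediately followed by "-".
    return t == r or t.startswith(r + "-")
```

Under this sketch the range "en" matches "en" and "en-US" but not
"eng", and the range "en-US" does not match the shorter tag "en".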
 

3. The Content-language header 
The "Content-Language" header is intended for use in the case where one 
desires to indicate the language(s) of something that has RFC-822-like 
headers, such as MIME body parts or Web documents. 
The RFC-822 EBNF of the Content-Language header is: 
 Content-Language = "Content-Language" ":" 1#Language-tag 
 
Or in RFC 2234 ABNF: 
 
Content-Language = "Content-Language" CFWS ":" Language-List 
Language-List = Language-Tag [ CFWS "," CFWS Language-List ]  
 
The Content-Language header may list several languages in a comma-
separated list. 
The CFWS construct is intended to function like the whitespace 
convention in RFC 822, which means also that one can place 
parenthesized comments anywhere in the language sequence, or use 
continuation lines. A formal definition is given in the update to the 
RFC 822 grammar, currently a work in progress. 
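
A simplified reading of the header value can be sketched as below. The
function name parse_content_language is hypothetical. The sketch drops
parenthesized comments (including nested ones), tolerates folded
continuation lines, and splits on commas; it does not handle quoted
pairs inside comments and does not validate the resulting tags.

```python
from typing import List


def parse_content_language(value: str) -> List[str]:
    """Extract the language tags from a Content-Language header value."""
    kept = []
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1                 # entering a (possibly nested) comment
        elif ch == ")" and depth > 0:
            depth -= 1                 # leaving a comment
        elif depth == 0:
            kept.append(ch)            # keep only text outside comments
    cleaned = "".join(kept)
    # Whitespace, including folded line breaks, is stripped around each tag.
    return [part.strip() for part in cleaned.split(",") if part.strip()]
```

For instance, the value "en, fr (This is a dictionary)" yields the two
tags "en" and "fr" under this sketch.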

3.1 Examples of Content-language values 
   
Norwegian official document, with parallel text in both official 
versions of Norwegian. (Both versions are readable by all Norwegians). 
 
   Content-Type: multipart/alternative; 
          differences=content-language 
   Content-Language: no-nyn, no-bok 
 
Voice recording from Liverpool downtown 
 
 
   Content-type: audio/basic 
   Content-Language: en-scouse 
 
Document in Mingo, an American Indian language which does not have an 
ISO 639 code: 
   Content-type: text/plain 
   Content-Language: i-mingo 
 
An English-French dictionary 
 
   Content-type: application/dictionary 
   Content-Language: en, fr (This is a dictionary) 
 
An official European Commission document (in a few of its official 
languages) 
 
   Content-type: multipart/alternative 
   Content-Language: da, de, el, en, fr, it 
 
An excerpt from Star Trek 
   Content-type: video/mpeg 
   Content-Language: i-klingon 
 
(All the tags used in these examples were registered with IANA after 
the publication of RFC 1766) 
 

4. IANA registration procedure for language tags 
Any language tag shall begin with an existing tag, and extend it. 
The registration form given here must be used by anyone who wants to 
use a language tag not defined by ISO or IANA. 
---------------------------------------------------------------------- 
LANGUAGE TAG REGISTRATION FORM 
 
Name of requester          : 
E-mail address of requester: 
Tag to be registered       : 
 
English name of language   : 
 
Native name of language (transcribed into ASCII): 
 
Reference to published description of the language (book or article): 
 
Any other relevant information: 
 
---------------------------------------------------------------------- 
 
 
The registration form must be sent to <ietf-languages@iana.org> for a
2-week review period before it can be submitted to IANA.  (This is an
open list. Requests to be added should be sent to
<ietf-languages-request@iana.org>.)
When the two week period has passed, the language tag reviewer, who is 
appointed by the IETF Applications Area Director, either forwards the 
request to IANA@ISI.EDU, or rejects it because of significant 
objections raised on the list. Note that the reviewer can raise 
objections on the list himself, if he so desires. The important thing 
is that the objection must be made publicly. 
The applicant is free to modify a rejected application with additional 
information and submit it again; this restarts the 2-week comment 
period. 
Decisions made by the reviewer may be appealed to the IESG. 
All registered forms are available online in the directory 
ftp://ftp.isi.edu/in-notes/iana/assignments/languages/ 
Updates of registrations follow the same procedure as registrations. 
The language tag reviewer decides whether to allow a new registrant to 
update a registration made by someone else; in the normal case, 
objections by the original registrant would carry extra weight in such 
a decision. 
There is no deletion of registrations; when some registered tag should 
not be used any more, for instance because a corresponding ISO 639 code 
has been registered, the registration should be amended by adding a
remark like "DO NOT USE: use <new code> instead" to the "other relevant 
information" section. 

5. Security Considerations 
The only security issue that has been raised with language tags since 
the publication of RFC 1766, which stated that "Security issues are 
believed to be irrelevant to this memo", is a concern with language 
ranges used in content negotiation - that they may be used to infer the 
nationality of the sender, and thus identify potential targets for 
surveillance.
This is a special case of the general problem that anything you send is 
visible to the receiving party; it is useful to be aware that such 
concerns can exist in some cases. 
The exact magnitude of the threat, and any possible countermeasures,
are left to each application protocol.

6. Character set considerations 
Codes may always be expressed using the US-ASCII character repertoire 
(a-z), which is present in most character sets. 
The issue of deciding upon the rendering of a character set based on
the language tag is not addressed in this memo; however, it is thought
impossible to make such a decision correctly for all cases unless means
of switching language in the middle of a text are defined (for example,
a rendering engine that decides font based on Japanese or Chinese
language may fail to work when a mixed Japanese-Chinese text is
encountered).

7. Acknowledgements 
This document has benefited from many rounds of review and comments in 
various fora of the IETF and the Internet working groups. 
Any list of contributors is bound to be incomplete; please regard the 
following as only a selection from the group of people who have 
contributed to make this document what it is today. 
In alphabetical order: 
Tim Berners-Lee, Nathaniel Borenstein, Sean M. Burke, Jim Conklin, John 
Cowan, Dave Crocker, Martin Duerst, Michael Everson, Ned Freed, Tim 
Goodwin, Dirk-Willem van Gulik, Paul Hoffman, Olle Jarnefors, John 
Klensin, Keith Moore, Masataka Ohta, Keld Jorn Simonsen, Rhys 
Weatherley, Misha Wolf, Francois Yergeau and many, many others. 
 
Special thanks must go to Michael Everson, who has served as language 
tag reviewer for almost the complete period since the publication of 
RFC 1766, and has provided a great deal of input to this revision. 

8. Author's Address 
Harald Tveit Alvestrand 
EDB Maxware 
Pirsenteret 
N-7462 TRONDHEIM 
NORWAY 
EMail: Harald.Alvestrand@maxware.no 
Phone: +47 73 54 57 97 

9. References 
 
[ISO 639] 
     ISO 639:1988 (E/F) - Code for the representation of names of 
     languages - The International Organization for Standardization, 
     1st edition, 1988-04-01 Prepared by ISO/TC 37 - Terminology 
     (principles and coordination). 
     Note that a new version (ISO 639-1:2000) is in preparation at the 
     time of this writing. 
[ISO 639-2]
     ISO 639-2:1998 - Codes for the representation of names of
     languages -- Part 2: Alpha-3 code - edition 1, 1998-11-01, 66
     pages, prepared by ISO/TC 37/SC 2.
      
[ISO 3166] 
     ISO 3166:1988 (E/F) - Codes for the representation of names of 
     countries - The International Organization for Standardization, 
     3rd edition, 1988-08-15. 
[ISO 15924]
     ISO/DIS 15924 - Codes for the representation of names of scripts
     (under development by ISO TC46/SC2).
[RFC 1327]
     Kille, S., "Mapping between X.400(1988) / ISO 10021 and RFC 822", 
     RFC 1327, University College London, May 1992.  
[RFC 1521] 
     Borenstein, N., and N. Freed, "MIME Part One: Mechanisms for 
     Specifying and Describing the Format of Internet Message Bodies", 
     RFC 1521, Bellcore, Innosoft, September 1993. 
[RFC 2119] 
     Key words for use in RFCs to Indicate Requirement Levels. S. 
     Bradner. March 1997. 
[RFC 2234]
     Augmented BNF for Syntax Specifications: ABNF. D. Crocker, Ed.,
     P. Overell. November 1997.
[RFC 2616] 
     Hypertext Transfer Protocol -- HTTP/1.1. R. Fielding, J. Gettys,  
     J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee. June 
     1999. 

Appendix A: Language Tag Reference Material 
The Library of Congress, maintainers of ISO 639-2, has made the list of 
languages registered available on the Internet. 
At the time of this writing, it can be found at 
http://www.loc.gov/standards/iso639-2/langhome.html 
 
The IANA registration forms for registered language codes can be found 
at 
http://www.isi.edu/in-notes/iana/assignments/languages/ 
 
 




 
 
Appendix B: Changes from RFC 1766 
. Email list address changed from ietf-types@uninett.no to
  ietf-languages@iana.org
. Updated author's address 
. Added language-range construct from HTTP/1.1 
. Added use of ISO 639-2 language codes 
. Added reference to Library of Congress lists of language codes 
. Changed examples to use registered tags 
. Moved Multipart/Alternative-related material to appendix C
. Added "Any other information" to registration form 
. Added description of procedure for updating registrations 

Appendix C: Use of Content-Language with Multipart/Alternative 
 
NOTE: This appendix details an idea that was proposed in RFC 1766 to 
deal with a particular kind of alternative content. However, this has 
not found use in practice, and is therefore not suitable for the IETF 
standards track. It is being preserved here as a non-normative appendix 
only. 
When using the Multipart/Alternative body part of MIME, it is possible 
to have the body parts giving the same information content in different 
languages. In this case, one should put a Content-Language header on 
each of the body parts, and a summary Content-Language header onto the 
Multipart/Alternative itself. 

The differences parameter to multipart/alternative 
As defined in RFC 1521, "Multipart/Alternative" only has one parameter:
boundary.
The common usage of "Multipart/Alternative" is to have more than one
format of the same message (for example, PostScript and ASCII).
The use of language tags to differentiate between different 
alternatives will certainly not lead all MIME UAs to present the most 
meaningful, understandable or significant body part as default. 
Therefore, a new parameter is defined, to allow the configuration of 
MIME readers to handle language differences in a sensible manner. 
     Name: Differences 
     Value: One or more of 
           Content-Type 
           Content-Language 
 
 

 
 
Further values can be registered with IANA; these shall refer to the 
name of a header for which a definition exists in a published RFC.  If 
not present, "Differences=Content-Type" is assumed. 
The intent is that the MIME reader can look at these headers of the 
message component to make an intelligent choice of what to present to 
the user, based on knowledge about the user preferences and 
capabilities. 
(The intent of having registration with IANA of the fields used in this 
context is to maintain a list of usages that a mail UA may expect to 
encounter, not to reject usages.) 
(NOTE: The MIME specification [RFC 1521], section 7.2, states that 
headers not beginning with "Content-" are generally to be ignored in 
body parts. People defining a header for use with "differences=" should 
take note of this.) 
The mechanism for deciding which body part to present is outside the 
scope of this document. 
MIME EXAMPLE: 
 
 
Content-Type: multipart/alternative; differences=Content-Language; 
          boundary="limit" 
Content-Language: en, fr, de 
 
--limit 
Content-Language: fr 
 
Le renard brun et agile saute par dessus le chien paresseux 
--limit 
Content-Language: de 
Content-Type: text/plain; charset=iso-8859-1 
Content-Transfer-encoding: quoted-printable 
 
Der schnelle braune Fuchs h=FCpft =FCber den faulen Hund 
--limit 
Content-Language: en 
 
The quick brown fox jumps over the lazy dog 
--limit-- 
 
When composing a message, the choice of sequence may be arbitrary. 
However, non-MIME mail readers will show the first body part first, 
meaning that this should most likely be the language understood by most 
of the recipients. 
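
How a MIME reader might act on the "differences" parameter can be
sketched with the Python standard library email package. This is an
illustration under stated assumptions, not a mechanism defined by this
document: the function name pick_part, the shortened sample message in
RAW, and the exact-match selection policy are all hypothetical.

```python
from email import message_from_string

# A shortened variant of the example above (assumed for illustration).
RAW = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/alternative; differences=Content-Language;\n"
    '          boundary="limit"\n'
    "Content-Language: en, fr\n"
    "\n"
    "--limit\n"
    "Content-Language: fr\n"
    "\n"
    "Le renard brun et agile saute par dessus le chien paresseux\n"
    "--limit\n"
    "Content-Language: en\n"
    "\n"
    "The quick brown fox jumps over the lazy dog\n"
    "--limit--\n"
)


def pick_part(raw: str, wanted: str):
    """Return the text of the first body part tagged with the wanted language."""
    msg = message_from_string(raw)
    if not msg.is_multipart():
        return None
    # Select by language only when the sender says the parts differ by it.
    differences = (msg.get_param("differences") or "").lower()
    if "content-language" not in differences:
        return None
    for part in msg.get_payload():
        header = part.get("Content-Language", "")
        tags = [t.strip().lower() for t in header.split(",")]
        if wanted.lower() in tags:
            return part.get_payload()
    return None
```

Calling pick_part(RAW, "fr") in this sketch returns the French
sentence; a request for a language that no body part carries returns
None, leaving the choice of fallback to the reader.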

Appendix X1: Changes from draft -00 to -01 
This appendix is to be deleted by the RFC Editor before publication as 
RFC. 
Changes from draft-00: 
 
 
- Fixed up the language tag table 
- Moved multipart/alternative stuff to appendix 
- Changed examples to use registered tags 
- Added * in language tag table to indicate B/T conflicts
- Considered, but did not adopt, changing from recommending T codes to 
  recommending B codes. At the moment, the only argument that appeals 
  to the author is that the T codes look more like the 639-1 codes than 
  the B codes do. 
- Added procedures for updating a registration 
Here is the list of changes that need to be done to this doc before 
advancing it to Draft or reissuing it. 
- Decide whether or not to write anything about use of country codes in 
  other places than the first subtag, or region codes, or script codes 
- Decide whether it is worth it to try to write down any more 
  guidelines for what language tags people should register 
 

Appendix X2: Changes from draft -01 to -02 
This appendix is to be deleted by the RFC Editor before publication as 
RFC. 
- Minor updates 
- Added reference to Library of Congress code lists instead of 
  including code values 
- Changed grammars to use RFC 2234 ABNF 
- Used MUST and SHOULD in label choice algorithm 
 

TODO list 
Consider whether to include Accept-Language: as a generic. If so, 
decide whether to use the HTTP standard's "weight factor" or the 
nonstandard, but commonly used "best language first". 
Suggested language, to be placed as section 3.2: 
     The "Accept-Language" header is intended for use in the case where
     a user or a process desires to identify the language(s) he prefers
     when RFC-822-like headers, such as MIME body parts or Web
     documents, are used.
      
     The RFC-822 EBNF of the Accept-Language header is: 
      
      Accept-Language = "Accept-Language" ":" Language-List 

 
 
      
     The Accept-Language header may list several languages in a 
     comma-separated list. The leftmost is the most preferred. 
     Note that HTTP uses the same header name with a different syntax; 
     see RFC 2616 for the details. 
      
Consider whether more guidance on "appropriate" tags is needed. 
Consider whether we need to allow numbers in language tags.