Network Working Group J. Klensin
Internet-Draft
Expires: April 2, 2006 Y. Ko
MOCOCO, Inc.
September 29, 2005
Overview and Framework for Internationalized Email
draft-klensin-ima-framework-00.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 2, 2006.
Copyright Notice
Copyright (C) The Internet Society (2005).
Abstract
Full use of electronic mail throughout the world requires that people
be able to use their own names, written correctly in their own
languages and scripts, as mailbox names in email addresses. This
document introduces a series of specifications and operational
suggestions that define mechanisms and protocol extensions needed to
fully support internationalized email addresses. These changes
include an SMTP extension and extension of email header syntax to
Klensin & Ko Expires April 2, 2006 [Page 1]
Internet-Draft IMA Framework September 2005
accommodate UTF-8 data. The document set also will include
discussion of key assumptions and issues in deploying fully
internationalized email.
Table of Contents
1. Introduction
1.1. Role of This Specification
1.2. Problem statement
1.3. Terminology
2. Overview of the Approach
3. Document Roadmap
4. Overview of Protocol Extensions and Changes
4.1. SMTP Extension for Internationalized eMail Address
4.2. Transmission of Email Header in UTF-8 Encoding
4.3. Downgrading Mechanism for Backward Compatibility
5. Advice to Designers and Operators of Mail-receiving Systems
6. Internationalization Considerations
7. Additional Issues
7.1. Impact to IRI
7.2. POP and IMAP
8. IANA Considerations
9. Security Considerations
10. Acknowledgements
11. Change History
12. References
12.1. Normative References
12.2. Informative References
Authors' Addresses
Intellectual Property and Copyright Statements
1. Introduction
In order to use internationalized email addresses, we need to
internationalize both the domain part and the local part of an email
address. The domain part of email addresses is already
internationalized [RFC3490], while the local part is not. Without
the extensions described in this document set, the mailbox name is
restricted to a subset of 7-bit ASCII by [RFC2821]. Though MIME
enables the transport of non-ASCII data, it does not provide a
mechanism for internationalized email addresses.
[RFC2047] defines an encoding mechanism for some specific message
header fields to accommodate non-ASCII data. However, it does not
address the issue of email addresses that include non-ASCII
characters.
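The asymmetry between the two parts of an address can be illustrated
with a short sketch. It assumes Python's built-in "idna" codec, which
implements the IDNA of [RFC3490]; the sample address and the helper
names split_address and domain_to_ace are hypothetical and are not
taken from any of the specifications.

```python
def split_address(address: str) -> tuple:
    """Split an address at its last "@" into (local part, domain part)."""
    local, _, domain = address.rpartition("@")
    return local, domain


def domain_to_ace(domain: str) -> str:
    """Convert a domain name to its ASCII-compatible form (RFC 3490)."""
    return domain.encode("idna").decode("ascii")


local, domain = split_address("jürgen@bücher.example")

# The domain part has a standardized ASCII form under IDNA.
ace_domain = domain_to_ace(domain)

# The local part has no comparable standardized conversion, so it
# remains non-ASCII and cannot be carried by unextended SMTP.
local_is_ascii = local.isascii()
```

Here the domain part can be carried by existing software in its ASCII
form, while the local part cannot, which is the gap this document set
addresses.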
1.1. Role of This Specification
This document presents the overview and framework for an approach to
the next stage of email internationalization. This new stage
requires not only internationalization of addresses and headers, but
also associated transport and delivery models. The history of
developments and design ideas leading to this specification is
described in [IMA-history].
This document describes how the various elements of email
internationalization fit together and provides a roadmap for
navigating the various documents involved.
1.2. Problem statement
[[anchor1: Note in draft: this section needs very significant
reworking for both content and presentation. Changed with -01c, but
may still not be good enough]]
Though domain names are already internationalized, the
internationalized forms are far from general adoption by ordinary
users. One of the reasons for this is that we do not yet have fully
internationalized naming schemes. Domain names are just one of the
various names and identifiers that are required to be
internationalized.
Email addresses are a particularly important example of where
internationalization of domain names alone is not sufficient. Unless
email addresses are presented to the user in familiar characters and
formats, the user's perception will not be of internationalization
and behavior that is culturally friendly. One thing most of us have
almost certainly learned from the experience with email usage is that
users strongly prefer email addresses that closely resemble names or
initials. If the names or initials in an email address can be
expressed in the users' native languages, that will be very good news
to those whose native language is not written in a subset of a
Roman-derived script.
Internationalization of email addresses is not merely a matter of
changing the SMTP envelope, or of modifying the From, To, and Cc
headers, or of permitting upgraded mail user agents (MUA) to decode a
special coding and display local characters. To be perceived as
usable by end users, the addresses must be internationalized, and
handled consistently, in all of the contexts in which they occur.
That requirement has far-reaching implications: collections of
patches and workarounds are not adequate. Instead, we need to build
a fully internationalized email environment, focusing on permitting
efficient communication among those who share a language or other
community. That, in turn, implies changes to the mail header
environment to permit the full range of Unicode characters where that
makes sense, an SMTP extension to permit UTF-8 mail addressing and
delivery of those extended headers, and (finally) a requirement for
support of the 8BITMIME option so that all of this can be transported
through the mail system without having to overcome the limitation
that headers not have content-transfer-encodings.
1.3. Terminology
This document assumes a reasonable understanding of the protocols and
terminology of the core email standards as documented in [RFC2821]
and [RFC2822].
Much of the description in this document depends on the abstractions
of "Mail Transfer Agent" ("MTA") and "Mail User Agent" ("MUA").
However, it is important to understand that those terms and the
underlying concepts postdate the design of the Internet's email
architecture and the "protocols on the wire" principle. That email
architecture, as it has evolved, and the "wire" principle have
prevented any strong and standardized distinctions about how MTAs and
MUAs interact on a given origin or destination host (or even whether
they are separate).
In this document, an address is "all-ASCII" if every character in the
address is in the ASCII character repertoire [ASCII]; an address is
"non-ASCII" if any character is not in the ASCII character
repertoire. The term "all-ASCII" is also applied to other protocol
elements when the distinction is important, with "non-ASCII" or
"internationalized" as its opposite.
The term "internationalized email address", or "IMA", refers to an
address permitted by this specification. [[anchor3: Note in Draft/
Placeholder: it appears that the term "IMA" is not used in a precise
and consistent way across the document set. It is sometimes used to
refer simply to a "non-ASCII" address; sometimes to an address that
contains non-ASCII characters, even if that address is encoded into
ASCII characters (i.e., as an ACE); and sometimes as an address that
may contain non-ASCII characters but may also be a traditional
address. The definition needs to be clarified in an upcoming draft
and all uses of the term brought into line with the definition.]]
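The "all-ASCII" and "non-ASCII" distinction defined above can be
expressed as a minimal sketch. The function names are hypothetical.
Note that, under this definition, an address whose domain has been
encoded as an ACE is classified as all-ASCII, which is one of the
ambiguities the note above describes.

```python
def is_all_ascii(address: str) -> bool:
    """True if every character is in the ASCII repertoire (0..127)."""
    return all(ord(ch) < 128 for ch in address)


def classify(address: str) -> str:
    """Label an address with the terminology used in this document."""
    return "all-ASCII" if is_all_ascii(address) else "non-ASCII"
```

For example, classify("user@example.com") yields "all-ASCII", while an
address with a non-ASCII local part yields "non-ASCII".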
The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED",
and "MAY" in this document are to be interpreted as described in RFC
2119 [RFC2119].
2. Overview of the Approach
This set of specifications changes both SMTP and the format of email
headers to permit non-ASCII characters to be represented directly.
Each important component of the work is described in a separate
document. The document set, whose members are described in the next
section, also contains informational documents whose purpose is to
provide operational and implementation suggestions and guidance for
the protocols.
3. Document Roadmap
In addition to this document, the following documents make up this
specification and provide advice and context for it.
o SMTP extensions. This document provides an SMTP extension for
internationalized addresses, as provided for in RFC 2821
[IMA-SMTPext].
o Email headers in UTF-8. This document essentially updates RFC
2822 to permit some information in email headers to be expressed
directly by Unicode characters encoded in UTF-8 when the SMTP
extension is used [IMA-UTF8].
o Downgrading from internationalized addressing with the SMTP
extension and UTF-8 headers to traditional email formats and
characters [IMA-downgrade].
o Operational guidelines and suggestions for the deployment of
internationalized email [IMA-ops].
o Special considerations for mailing lists and similar distributions
during the transition to internationalized email [IMA-Exploder].
o Design decisions, history, and alternative models for
internationalized Internet email [IMA-history].
4. Overview of Protocol Extensions and Changes
4.1. SMTP Extension for Internationalized eMail Address
An SMTP extension, "IMA", is specified that:
o Permits the use of UTF-8 strings in email addresses, both local
parts and domain names
o Permits the selective use of UTF-8 strings in email headers (see
the next subsection)
o Requires support for the 8BITMIME extension so that header
information can be transmitted without using a special content-
transfer-encoding.
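As an illustration only, a client might decide whether the extension
is usable by inspecting the server's EHLO reply. The sketch below
assumes that the EHLO keyword is spelled "IMA", matching the name of
the extension used here; the actual keyword is defined in
[IMA-SMTPext] and may differ. The function names and the sample reply
are hypothetical.

```python
def parse_ehlo_keywords(reply_lines):
    """Collect upper-cased extension keywords from a 250 EHLO reply."""
    keywords = set()
    # The first line carries the server greeting, not a keyword.
    for line in reply_lines[1:]:
        # Lines look like "250-KEYWORD params" or "250 KEYWORD params".
        text = line[4:].strip()
        if text:
            keywords.add(text.split()[0].upper())
    return keywords


def server_offers_ima(reply_lines):
    """True only if both the assumed IMA keyword and 8BITMIME appear."""
    keywords = parse_ehlo_keywords(reply_lines)
    return "IMA" in keywords and "8BITMIME" in keywords


EXAMPLE_REPLY = [
    "250-mail.example.net greets client.example.org",
    "250-8BITMIME",
    "250-IMA",
    "250 SIZE 10240000",
]
```

The check requires 8BITMIME as well because the extension depends on
it, as stated in the last item of the list above.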
Some general principles apply to this work.
1. Whatever encoding is used should apply to the whole address and
be directly compatible with software used at the user interface.
2. An SMTP relay must
* Either recognize the format explicitly, agreeing to do so via
an ESMTP option,
* Select and use an ASCII-only address, or
* Bounce the message so that the sender can make another plan.
3. In the interest of interoperability, charsets other than UTF-8
are prohibited. There is no practical way to identify them
properly with an extension like this without introducing great
complexity.
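The three choices open to a relay under the second principle can be
sketched as a small decision function. This is an illustration under
stated assumptions, not a specified algorithm: the names RelayAction
and choose_relay_action are hypothetical, and the way a relay learns
of an ASCII-only alternative is left to [IMA-downgrade].

```python
from enum import Enum
from typing import Optional


class RelayAction(Enum):
    FORWARD_AS_IS = "forward using the negotiated extension"
    USE_ASCII_ADDRESS = "select and use an ASCII-only address"
    BOUNCE = "bounce so the sender can make another plan"


def choose_relay_action(address: str,
                        next_hop_supports_ima: bool,
                        ascii_alternate: Optional[str] = None) -> RelayAction:
    """Pick one of the three options listed in principle 2."""
    # An all-ASCII address needs no extension at all.
    if address.isascii():
        return RelayAction.FORWARD_AS_IS
    # The next hop has agreed to the format via the ESMTP option.
    if next_hop_supports_ima:
        return RelayAction.FORWARD_AS_IS
    # Otherwise fall back to an ASCII-only address if one is known.
    if ascii_alternate is not None and ascii_alternate.isascii():
        return RelayAction.USE_ASCII_ADDRESS
    return RelayAction.BOUNCE
```

The ordering of the checks is a design choice of this sketch; the
text above only requires that one of the three outcomes be chosen.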
4.2. Transmission of Email Header in UTF-8 Encoding
[[anchor8: Note in Draft: Much better than earlier version and good
enough for now. It could still benefit from a further rework in
-01.]] There are many places in MUAs or in user presentation in
which email addresses or domain names appear. Examples include the
conventional From, To, or Cc header fields; Message-IDs; In-Reply-To
fields that may contain addresses or domain names; in message bodies;
or elsewhere. We must examine all of them from an
internationalization perspective. The user will expect to see
mailbox and domain names in local characters, and to see them
consistently. Variations on that problem will exist with any
internationalization method, whether transport or MUA-only in
structure. Perhaps, if we have to live with it for a short time as a
transition activity, that is worthwhile. But the only practical way
to avoid it, in both the medium and the longer term, is to have the
encodings used in transport be as nearly as possible the same as the
encodings used in message headers and message bodies.
It seems clear that the point at which email local parts are
internationalized is the point at which email headers should simply
be shifted to a fully internationalized form, presumably using UTF-8
rather than ASCII as the base character set for other than protocol
elements such as the header field names themselves. The transition
to that model includes support for address, and address-related,
fields within the headers of legacy systems. This is done by
extending the encoding models of [RFC2045] and [RFC2231]. However,
our target should be fully internationalized headers, as discussed in
[IMA-UTF8].
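The difference between the existing encoded-word approach of
[RFC2047] and a header carried directly in UTF-8 can be seen in a
short sketch. It uses Python's standard email.header module; the
display name and address are hypothetical, and the raw UTF-8 form is
only an approximation of what [IMA-UTF8] specifies.

```python
from email.header import Header, decode_header

display_name = "Jürgen"

# Existing practice: an RFC 2047 encoded-word, which is pure ASCII
# and must be decoded by the MUA before display.
encoded_word = Header(display_name, "utf-8").encode()
decoded_bytes, charset = decode_header(encoded_word)[0]

# Target model: the header value is carried directly as UTF-8, so the
# octets on the wire include values above 0x7F.
raw_utf8 = f"From: {display_name} <user@example.org>".encode("utf-8")
has_8bit_octets = any(octet >= 0x80 for octet in raw_utf8)
```

Because the raw form contains 8-bit octets, it can only travel over a
path that has agreed to carry them, which is why the SMTP extension
and the header changes are specified together.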
4.3. Downgrading Mechanism for Backward Compatibility
As with any use of the SMTP extension mechanism, there is always a
possibility of a client that requires the feature encountering a
server that does not. In the case of IMA, the risk should be
minimized by the fact that the selection of submission servers is
presumably under the control of the client and the selection of
potential intermediate relays is under the control of the
administration of the final delivery server.
For those situations, there are basically two possibilities:
o Reject or bounce the message, requiring the sender to resubmit it
with traditional-format addresses and headers.
o Figure out a way to downgrade the envelope or message body in
transit. Especially when internationalized addresses are
involved, downgrading will require that an all-ASCII address
either be obtained from some source or be computed. An optional
extension parameter is provided as a way of transmitting an
alternate address. Computing an ASCII form of an IMA address
requires that the sender have some knowledge that is normally
restricted to final delivery servers, but save extensions may be
feasible there too. Downgrade issues and a specification are
discussed in [IMA-downgrade].
The first of these two options, that of rejecting or returning the
message to the sender, MAY always be chosen.
There is also a third case, one in which the client is IMA-capable,
the server is not, but the message does not require the extended
capabilities. In other words, both the addresses in the envelope and
the entire set of headers of the message are entirely in ASCII
(perhaps including encoded-words in the headers). In that case, the
client SHOULD send the message whether or not the server announces
the IMA capability.
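The third case can be sketched as a check that a client performs
before deciding whether the extension is needed at all. The function
names are hypothetical, and the sketch treats headers as already
unfolded text lines.

```python
def message_requires_ima(envelope_addresses, header_lines):
    """True if any envelope address or header line is non-ASCII."""
    items = list(envelope_addresses) + list(header_lines)
    return any(not item.isascii() for item in items)


def can_send_unchanged(server_offers_ima, envelope_addresses, header_lines):
    """Decide whether the message may be sent to this server as it is."""
    # An entirely ASCII message (encoded-words included) does not need
    # the extension, so it is sent whether or not IMA is announced.
    if not message_requires_ima(envelope_addresses, header_lines):
        return True
    # Otherwise the server must have announced the capability; if it
    # has not, the message must be rejected, bounced, or downgraded.
    return server_offers_ima
```

A header such as "Subject: =?utf-8?q?caf=C3=A9?=" counts as ASCII
here, since an encoded-word is made up only of ASCII characters.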
5. Advice to Designers and Operators of Mail-receiving Systems
[[anchor10: Note in draft: The material that follows contains some
forward-looking, predictive, statements. Be sure they are true
before Last Call.]]
In addition to the protocol specification materials in this set of
documents, the working group has had extensive discussions about
operational considerations in the use of internationalized addresses.
Those topics include how such addresses should be chosen, how they
should relate to ASCII alternatives if such alternatives exist, the
management of mailing lists that might support and contain a mixture
of all-ASCII and non-ASCII addresses, and so on. Those issues are
discussed in [IMA-ops] and [IMA-Exploder].
6. Internationalization Considerations
This entire specification addresses issues in internationalization
and especially the boundaries between internationalization and
localization and between network protocols and client/user interface
actions.
7. Additional Issues
This section identifies issues that are not covered as part of this
set of specifications, but that will need to be considered as part of
IMA deployment.
7.1. Impact to IRI
The mailto: scheme in IRI [RFC3987] may need to be modified when IMA
is standardized.
7.2. POP and IMAP
While SMTP takes care of the transportation of messages, IMAP
[RFC3501] and POP3 [RFC1939] are among the mechanisms used to handle the
retrieval of mail objects from a mail store by a client. The use of
internationalized mail addresses or UTF-8 headers will require
extensions to POP and IMAP and/or modifications to the design and
implementation of mail stores and the mechanisms that final delivery
SMTP servers use to put mail into them. However, those mechanisms
are separate from those associated with transport across the network
and are not discussed in this series of documents. The general
issues are covered in [IMA-imap-pop].
8. IANA Considerations
This specification does not contemplate any IANA registrations or
other actions.
9. Security Considerations
Any expansion of permitted characters and encoding forms in email
addresses raises some risks. There have been discussions on
so-called "IDN-spoofing". IDN homograph attacks allow an
attacker/phisher to spoof the domain/URLs of businesses. The same
kind of attack is also possible on the local part of
internationalized email addresses. It should be noted that one of
the proposed fixes for, e.g., URLs, does not work for email local
parts since they are case-sensitive. That fix involves forcing all
elements that are displayed to be in lower-case and normalized.
Since email addresses are often transcribed from business cards and
notes on paper, they are subject to problems arising from confusable
characters. These problems are somewhat reduced if the domain
associated with the mailbox is unambiguous and supports a relatively
small number of mailboxes whose names follow local system
conventions; they are increased with very large mail systems in which
users can freely select their own addresses.
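The confusable-character problem can be made concrete with a small
sketch. It uses the first word of each character's Unicode name as a
rough stand-in for its script; that is a crude heuristic assumed here
for illustration, not the Unicode Script property, and it is not a
mitigation proposed by this document set. The local parts shown are
hypothetical.

```python
import unicodedata


def scripts_used(text):
    """Return rough script labels taken from Unicode character names."""
    labels = set()
    for ch in text:
        if ch.isalpha():
            # e.g. "LATIN SMALL LETTER A" -> "LATIN"
            labels.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return labels


latin_local = "admin"
mixed_local = "\u0430dmin"  # begins with CYRILLIC SMALL LETTER A

# The two strings render almost identically but are different mailboxes.
looks_alike_but_differs = latin_local != mixed_local
```

A system that lets users pick their own addresses could use a check
of this kind to flag local parts that mix scripts, although doing so
does not remove the underlying risk.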
The internationalization of email addresses and headers must not
leave the Internet less secure than it is without the required
extensions. The requirements and mechanisms documented in this set
of IMA specifications do not, in general, raise any new security
issues other than those associated with confusable characters -- a
topic that is being explored thoroughly elsewhere. [[anchor16: Note
in Draft: If the IAB-IDN report is completed and published, a
reference to it should go here.]] Specific issues are discussed in
more detail in the other documents in this set. However, in
particular, caution should be taken that any "downgrading" mechanism,
or use of downgraded addresses, does not inappropriately assume
authenticated bindings between the IMA and ASCII addresses.
In addition, email addresses are used in many contexts other than
sending mail, such as for identifiers under various circumstances.
Each of those contexts will need to be evaluated, in turn, to
determine whether the use of non-ASCII forms is appropriate and what
particular issues they raise.
10. Acknowledgements
This document, and the related ones, were originally derived from
drafts by John Klensin and the JET group [Klensin-emailaddr],
[JET-IMA]. The work drew inspiration from discussions on the "IMAA"
mailing list, sponsored by the Internet Mail Consortium, and
especially from an early draft by Paul Hoffman and Adam Costello
[Hoffman-IMAA] that attempted to define an MUA-only solution to the
IMA problem. [[anchor18: Note in draft: may want to move some of this
to "history" or reference it]]
11. Change History
[[anchor20: Note to RFC Editor: this section to be removed prior to
publication]]
Version 00 This version supersedes draft-lee-jet-ima-00 and
draft-klensin-emailaddr-i18n-03. It represents a major rewrite
and change of architecture from the former and incorporates many
ideas and some text from the latter.
12. References
12.1. Normative References
[ASCII] American National Standards Institute (formerly United
States of America Standards Institute), "USA Code for
Information Interchange", ANSI X3.4-1968, 1968.
ANSI X3.4-1968 has been replaced by newer versions with
slight modifications, but the 1968 version remains
definitive for the Internet.
[IMA-Exploder]
"Placeholder: whatever we call the mailing list document",
2005.
[IMA-SMTPext]
Yao, J., Ed., "SMTP Extension for Internationalized Email
Address", draft-yao-smtpext-00 (work in progress),
September 2005.
[IMA-UTF8]
Yeh, J., "Transmission of Email Headers in UTF-8
Encoding", draft-yeh-ima-utf8headers-00 (work in
progress), October 2005.
[IMA-downgrade]
YONEYA, Y., Ed., "Placeholder: whatever we call the
downgrading document", October 2005.
[IMA-ops] "Placeholder: whatever we call the operations document",
2005.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119, March 1997.
[RFC2821] Klensin, J., "Simple Mail Transfer Protocol", RFC 2821,
April 2001.
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
"Internationalizing Domain Names in Applications (IDNA)",
RFC 3490, March 2003.
12.2. Informative References
[Hoffman-IMAA]
Hoffman, P. and A. Costello, "Internationalizing Mail
Addresses in Applications (IMAA)", draft-hoffman-imaa-03
(work in progress), October 2003.
[IMA-history]
Klensin, J., "Decisions and Alternatives for
Internationalization of Email Addresses", Internet-
Draft forthcoming, September 2005.
[IMA-imap-pop]
Klensin, J., "Considerations for IMAP and POP in
Conjunction with Email Address Internationalization",
draft-klensin-ima-imappop-00a (work in progress),
October 2005.
[JET-IMA] Yao, J. and J. Yeh, "Internationalized eMail Address
(IMA)", draft-lee-jet-ima-00 (work in progress),
June 2005.
[Klensin-emailaddr]
Klensin, J., "Internationalization of Email Addresses",
draft-klensin-emailaddr-i18n-03 (work in progress),
July 2005.
[RFC1939] Myers, J. and M. Rose, "Post Office Protocol - Version 3",
STD 53, RFC 1939, May 1996.
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, November 1996.
[RFC2047] Moore, K., "MIME (Multipurpose Internet Mail Extensions)
Part Three: Message Header Extensions for Non-ASCII Text",
RFC 2047, November 1996.
[RFC2231] Freed, N. and K. Moore, "MIME Parameter Value and Encoded
Word Extensions: Character Sets, Languages, and
Continuations", RFC 2231, November 1997.
[RFC2449] Gellens, R., Newman, C., and L. Lundblade, "POP3 Extension
Mechanism", RFC 2449, November 1998.
[RFC2822] Resnick, P., "Internet Message Format", RFC 2822,
April 2001.
[RFC3501] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
4rev1", RFC 3501, March 2003.
[RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource
Identifiers (IRIs)", RFC 3987, January 2005.
Authors' Addresses
John C Klensin
1770 Massachusetts Ave, #322
Cambridge, MA 02140
USA
Phone: +1 617 491 5735
Email: john-ietf@jck.com
YangWoo Ko
MOCOCO, Inc.
996-1, 11F, Mirae Asset Venture Tower, Daechi-dong
Gangnam-gu, Seoul 135-280
Korea
Email: yw@mrko.pe.kr
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.