One document matched: draft-ietf-eai-5738bis-12.xml
<?xml version="1.0" encoding="US-ASCII"?>
<!-- change the "txt" on the previous line to "xml" to make this valid XML -->
<!--
XML2RFC offers an include feature described in the XML2RFC README
file. That syntax, however, contradicts the DTD requirements to
have <reference> elements within the <references> element, so an
XML parser is likely to find your XML file invalid. It may be
possible that XML2RFC will change their DTD so that the XML file
remains valid when their style of include is used.
In the meantime therefore, we use an alternative valid-XML approach
to includes, which unfortunately require that define your includes
at the beginning of the file. Since the biggest benefit of includes
is for references, this requires that your references be defined in
ENTITY clauses here before being "included" and cited elsewhere in
the file.
-->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc1341 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1341.xml">
<!ENTITY rfc2045 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2045.xml">
<!ENTITY rfc2047 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2047.xml">
<!ENTITY rfc2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc2183 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2183.xml">
<!ENTITY rfc2231 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2231.xml">
<!ENTITY rfc5890 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5890.xml">
<!ENTITY rfc3501 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3501.xml">
<!ENTITY rfc3629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3629.xml">
<!ENTITY rfc4013 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4013.xml">
<!ENTITY rfc4466 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4466.xml">
<!ENTITY rfc4469 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4469.xml">
<!ENTITY rfc5161 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5161.xml"> <!ENTITY rfc5198 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5198.xml">
<!ENTITY rfc5234 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5234.xml">
<!ENTITY rfc5258 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5258.xml">
<!ENTITY rfc5259 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5259.xml">
<!ENTITY rfc5322 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5322.xml">
<!ENTITY rfc6532 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6532.xml">
<!ENTITY rfc2342 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2342.xml">
<!ENTITY rfc4314 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4314.xml">
<!ENTITY rfc5738 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5738.xml">
<!ENTITY rfc2049 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2049.xml">
<!ENTITY rfc2088 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2088.xml">
<!ENTITY rfc2277 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2277.xml">
<!ENTITY rfc5721 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5721.xml">
<!ENTITY rfc6530 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6530.xml">
<!ENTITY popimapdowngrade PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-eai-popimap-downgrade-08.xml'>
<!ENTITY simpledowngrade PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-eai-simpledowngrade-07.xml'>
]>
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc strict="no"?>
<?rfc rfcedstyle="yes"?>
<!--
This template is for authors of IETF specifications containing MIB
modules. This template can be used as a starting point to produce
specifications that comply with the Operations & Management Area
guidelines for MIB module documents.
-->
<!--
Throughout this template, the marker "[TODO]" is used to indicate an
element or text that requires replacement or removal.
-->
<!-- Intellectual Property section -->
<!--
The Intellectual Property section will be generated automatically by
XML2RFC, based on the ipr attribute in the rfc element.
-->
<!--
[TODO] specify the name of the output document. This is optional;
the default is to use the base portion of the XML filename.
[TODO]For Internet-drafts, indicate which intellectual property notice
to use per the rules of RFC3978.
Specify this in the ipr attribute. The value can be:
full3978 -
noModification3978 -
noDerivatives3978 -
[TODO] Specify the category attribute per RFC2026
options are info, std, bcp, or exp.
[TODO] if this document updates an RFC, specify the RFC in the
"updates" attribute
-->
<rfc category="std" obsoletes="RFC5738" docName="draft-ietf-eai-5738bis-12"
ipr="pre5378Trust200902">
<front>
<!--
Throughout this template, the marker "[TODO]" is used to indicate an
element or text that requires replacement or removal.
-->
<!--
[TODO] Enter the full document title and an abbreviated version
to use in the page header.
-->
<title abbrev="IMAP Support for UTF-8">IMAP Support for UTF-8</title>
<!-- [TODO] copy the author block as many times as needed, one for each author.-->
<!-- If the author is acting as editor, use the <role=editor> attribute-->
<!-- see RFC2223 for guidelines regarding author names -->
<author fullname="Pete Resnick" initials="P" role="editor" surname="Resnick">
<organization>Qualcomm Incorporated</organization>
<address>
<postal>
<street>5775 Morehouse Drive</street>
<city>San Diego, CA 92121-1714</city>
<country>US</country>
</postal>
<phone>+1 858 651 4478</phone>
<email>presnick@qti.qualcomm.com</email>
</address>
</author>
<author fullname="Chris Newman" initials="C" role="editor" surname="Newman">
<organization>Oracle</organization>
<address>
<postal>
<street>800 Royal Oaks</street>
<city>Monrovia, CA 91016</city>
<country>USA</country>
</postal>
<phone></phone>
<email>chris.newman@oracle.com</email>
</address>
</author>
<author fullname="Sean Shen" initials="S" role="editor" surname="Shen">
<organization>CNNIC</organization>
<address>
<postal>
<street>No.4 South 4th Zhongguancun Street</street>
<city>Beijing, 100190</city>
<country>China</country>
</postal>
<phone>+86 10-58813038</phone>
<email>shenshuo@cnnic.cn</email>
</address>
</author>
<date/>
<area>Applications &app; Application Area</area>
<workgroup>Internet Engineering Task Force</workgroup>
<keyword>IMAP</keyword>
<keyword>IDNA</keyword>
<abstract>
<t>This specification extends the Internet Message Access Protocol
version 4rev1 (IMAP4rev1) to support UTF-8 encoded international
characters in user names, mail addresses and message headers.
This specification replaces RFC 5738.</t>
</abstract>
</front>
<middle>
<section anchor="intro" title="Introduction">
<t> This specification forms part of the Email Address Internationalization
protocols described in the <xref target="RFC6530">Email Address Internationalization Framework
document</xref>. It extends IMAP4rev1 <xref target="RFC3501"/> to permit UTF-8
<xref target="RFC3629"/> in headers as described in "Internationalized Email
Headers" <xref target="RFC6532"/>. It also adds a mechanism to support mailbox
names using the UTF-8 charset. This
specification creates two new IMAP capabilities to allow servers to
advertise these new extensions.</t>
<!-- <t>[**]</t> -->
<t> This specification assumes that the IMAP server
will be operating in a fully internationalized environment,
i.e., one in which all clients accessing the server will be
able to accept non-ASCII message header fields and other
information as specified in <xref target="utf8-accept"/>. At least during a
transition period, that assumption will not be realistic
for many environments; the issues involved are discussed in
<xref target="legacy"/> below.
</t>
<t>
This specification replaces an earlier, experimental, approach to the
same problem <xref target="RFC5738"/>.
</t>
</section>
<section anchor="conventions" title="Conventions Used in this Document">
<t>The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
in this document are to be interpreted as defined in "Key words for
use in RFCs to Indicate Requirement Levels" <xref target="RFC2119"/>. </t>
<t> The formal syntax uses the Augmented Backus-Naur Form (ABNF)
<xref target="RFC5234"/> notation. In addition, rules from IMAP4rev1 <xref target="RFC3501"/>, UTF-8
<xref target="RFC3629"/>, "Collected Extensions to IMAP4 ABNF"
<xref target="RFC4466"/>, and IMAP4
LIST Command Extensions <xref target="RFC5258"/> are also referenced.
This document assumes that the reader
will have a reasonably good understanding of the RFCs above.
</t>
<t> In examples, "C:" and "S:" indicate lines sent by the client and
server, respectively. If a single "C:" or "S:" label applies to
multiple lines, then the line breaks between those lines are for
editorial clarity only and are not part of the actual protocol
exchange.
</t>
</section>
<section anchor="utf8-accept"
title="UTF8=ACCEPT IMAP Capability and UTF-8 in IMAP Quoted Strings">
<t>The "UTF8=ACCEPT" capability indicates that the server supports the
ability to open mailboxes containing internationalized messages with
SELECT and EXAMINE, and can provide UTF-8 responses to the LIST and
LSUB commands. This capability also affects other IMAP extensions that can return
mailbox names
or their prefixes, such as NAMESPACE <xref target="RFC2342"/> and ACL <xref target="RFC4314"/>.
</t>
<t>
The "UTF8=ONLY" capability described in <xref target="utf8-only"/> implies the
"UTF8=ACCEPT" capability. A server is said to "support UTF8=ACCEPT"
if it advertises either "UTF8=ACCEPT" or "UTF8=ONLY".
</t>
<t>
A client MUST use the "ENABLE" command (defined in <xref target="RFC5161"/>) with
the "UTF8=ACCEPT" option (defined in <xref target="utf8-append"/> below) to indicate to
the server that the client accepts UTF-8 in quoted-strings and
supports UTF8=ACCEPT extension. The "ENABLE UTF8=ACCEPT" command
is only valid in the authenticated state.
</t>
<t>
The IMAP4rev1 <xref target="RFC3501"/> base specification forbids the use of 8-bit
characters in atoms or quoted strings. Thus, a UTF-8 string can only
be sent as a literal. This can be inconvenient from a coding
standpoint, and unless the server offers IMAP4 non-synchronizing
literals <xref target="RFC2088"/>, this requires an extra round trip for each UTF-8
string sent by the client. When the IMAP server supports "UTF8=ACCEPT"
it supports UTF-8 in quoted-strings with the following syntax:
</t>
<figure>
<artwork>
quoted =/ DQUOTE *uQUOTED-CHAR DQUOTE
; QUOTED-CHAR is not modified, as it will affect
; other RFC 3501 ABNF non terminal.
uQUOTED-CHAR = QUOTED-CHAR / UTF8-2 / UTF8-3 / UTF8-4
UTF8-2 = <Defined in Section 4 of RFC3629>
UTF8-3 = <Defined in Section 4 of RFC3629>
UTF8-4 = <Defined in Section 4 of RFC3629>
</artwork>
</figure>
<t>
When this extended quoting mechanism is used by the client, then the
server MUST reject with a "BAD" response any octet sequences with the
high bit set that fail to comply with the formal syntax in <xref target="RFC3629"/>.
The IMAP server MUST NOT send UTF-8 in quoted strings to the client
unless the client has indicated support for that syntax by using the
"ENABLE UTF8=ACCEPT" command.
</t>
<t>
If the server supports "UTF8=ACCEPT", the client MAY use extended
quoted syntax with any IMAP argument that permits a string (including
astring and nstring).
However, if characters outside the
US-ASCII repertoire are used in an inappropriate place, the results
would be the same as if other syntactically valid but semantically
invalid characters were used. Specific cases where UTF-8 characters
are permitted or not permitted are described in the following
paragraphs.
</t>
<t>
All IMAP servers that support "UTF8=ACCEPT" SHOULD accept UTF-8 in
mailbox names, and those that also support the
"Mailbox International Naming Convention" described in RFC 3501,
Section 5.1.3, MUST accept utf8-quoted mailbox names and convert them
to the appropriate internal format. Mailbox names MUST comply with
the Net-Unicode Definition (<xref target="RFC5198"/>, Section 2) with the specific
exception that they MUST NOT contain control characters (0000-001F,
0080-009F), delete (007F), line separator (2028), or paragraph
separator (2029).</t>
<t>
Once an IMAP client has enabled UTF-8 support with the
"ENABLE UTF8=ACCEPT" command, it MUST NOT issue a SEARCH
command that contains a CHARSET specification. If an IMAP server
receives such a SEARCH command in that situation,
it SHOULD reject the command
with a BAD response (due to the conflicting charset labels).
</t>
</section>
<section anchor="utf8-append" title="IMAP UTF8 Append Data Extension">
<t>
If the server supports "UTF8=ACCEPT", then the server accepts UTF-8
headers in the APPEND command message argument.
A client that sends a message with UTF-8 headers to the server MUST
send them using the "UTF8" APPEND data extension. If the server also
advertises the CATENATE capability (as specified in <xref target="RFC4469"/>), the
client can use the same data extension to include such a message in a
CATENATE message part. The ABNF for the APPEND data extension and
CATENATE extension follows:
</t>
<figure>
<artwork>
utf8-literal = "UTF8" SP "(" literal8 ")"
literal8 = <Defined in RFC 4466>
append-data =/ utf8-literal
cat-part =/ utf8-literal
</artwork>
</figure>
<t>
If an IMAP server supports "UTF8=ACCEPT" and the IMAP client has not
issued the "ENABLE UTF8=ACCEPT" command, the server MUST reject with
a "NO" response an APPEND command that includes any 8-bit character
in message header fields.
</t>
</section>
<section anchor="login-command" title="LOGIN Command and UTF-8">
<t>
This specification doesn't extend the IMAP LOGIN command <xref target="RFC3501"/>
to support UTF-8 usernames and passwords. Whenever a client needs
to use UTF-8 username/passwords, it MUST use the IMAP AUTHENTICATE
command which is already capable of passing UTF-8 user names and
credentials.
</t>
<t> Although the use of
the IMAP AUTHENTICATE command in this way makes it syntactically legal to have a UTF-8 user name
or password, there is no guarantee the user provisioning system used
by the IMAP server will allow such identities. This is an
implementation decision and may depend on what identity system the
IMAP server is configured to use.</t>
</section>
<section anchor="utf8-only" title="UTF8=ONLY Capability">
<t>
The "UTF8=ONLY" capability indicates that the server supports
"UTF8=ACCEPT" (see <xref target="utf8-append"/>), and also that it requires support
for UTF-8 from clients. In particular, this means that it will send UTF-8 in
quoted strings, and it will not accept the older international
mailbox name convention (modified UTF-7). Because these are
incompatible changes to IMAP, explicit server announcement and
client confirmation is necessary: clients MUST use the "ENABLE
UTF8=ACCEPT" command before using this server. A server that
advertises "UTF8=ONLY" will reject with a "NO [CANNOT]" response
any command that might require UTF-8 support and is not preceded
by an "ENABLE UTF8=ACCEPT" command.
</t>
<t>
IMAP clients that find support for a server that announces
"UTF8=ONLY" problematic are encouraged to at least detect the
announcement and provide an informative error message to the
end-user.
</t>
<t>
Because the "UTF8=ONLY" server capability includes support for
"UTF8=ACCEPT", the capability string will include at most one of
those and never both. For the client, "ENABLE UTF8=ACCEPT" is
always used -- never "ENABLE UTF8-ONLY".
</t>
</section>
<section anchor="legacy" title=" Dealing With Legacy Clients">
<t> In most situations, it will be difficult or impossible for
the implementer or operator of an IMAP (or POP) server to
know whether all of the clients that might access it, or
the associated mail store more generally, will be able to
support the facilities defined in this document. In almost
all cases, servers who conform to this specification will
have to be prepared to deal with clients that do not enable
the relevant capabilities. Unfortunately, there is no
completely satisfactory way to do so other than for systems
that wish to receive email that requires SMTPUTF8
capabilities to be sure that all components of those
systems -- including IMAP and other clients selected by
users -- are upgraded appropriately.
</t>
<t>
Choices available to the server when a message that requires SMTPUTF8
is encountered and the client doesn't enable UTF-8 capability include
hiding the problematic message(s), creating in band or out of band
notifications or error messages, or somehow trying to create a
variation on the message with the intention of providing useful
information to that client about what has occurred. Such variant
messages cannot be actual substitutes for the original message:
they will almost always be impossible to reply to (either at
all or without loss of information); the new header fields or
specialized constructs for server-client communication
may go beyond the requirements of, e.g., RFC 5322; they may
consequently confuse some legacy mail user agents
(including IMAP clients) or otherwise may not provide the
expected information to users.
There are also tradeoffs in
constructing variants of the original message between
accepting complexity and additional computation costs in
order to try to preserve as much information as possible
(for example, in <xref target="I-D.ietf-eai-popimap-downgrade"/>) and
trying to minimize those costs while still providing useful
information (for example, in
<xref target="I-D.ietf-eai-simpledowngrade"/>).
</t>
<t>
Implementations that choose to do downgrading SHOULD use one
of the standardized algorithms, <xref target="I-D.ietf-eai-popimap-downgrade"/> or
<xref target="I-D.ietf-eai-simpledowngrade"/>. Getting downgrade algorithms right,
and minimizing the risk of operational problems and harm to the email
system, is tricky and requires careful engineering. These two
algorithms are well understood and carefully designed.
</t>
<t> Because such messages are really variations on the original
ones, not really "downgraded ones" (although that
terminology is often used for convenience), they inevitably
have relationships to the original ones that the IMAP
specification <xref target="RFC3501"/> did not anticipate.
This brings up two concerns in particular:
First, digital signatures computed over and intended for the original
message will often not be applicable to the variant message, and
will often fail signature verification. (It will be possible for some
digital signatures to be verified, if they cover only parts of the
original message that are not affected in the creation of the variant.)
Second, servers that
may be accessed by the same user with different clients or methods
(e.g., POP or webmail systems in addition to IMAP or IMAP clients
with different capabilities) will need to exert extreme care to be
sure that UIDVALIDITY behaves as the user would expect.
Those issues may be especially
sensitive if the server caches the variant message or
computes and stores it when the message arrives with the
intent of making either form available depending on client
capabilities.
Additionally, in order to cope with the case when a server compliant
with this
extension returns the same UIDVALIDITY to both legacy and
UTF8=ACCEPT-aware clients,
a client upgraded from non UTF8=ACCEPT aware MUST discard its cache of
messages
downloaded from the server.
</t>
<t> The best (or "least bad") approach for any given
environment will depend on local conditions, local
assumptions about user behavior, the degree of control the
server operator has over client usage and upgrading, the
options that are actually available, and so on. It is
impossible, at least at the time of publication of this specification, to give good advice that
will apply to all situations, or even particular profiles
of situations, other than "upgrade legacy clients as soon
as possible".
</t>
</section>
<section anchor="mailstore" title="Issues with UTF-8 Header Mailstore">
<t>
When an IMAP server uses a mailbox format that supports UTF-8 headers
and it permits selection or examination of that mailbox without
issuing "ENABLE UTF8=ACCEPT" first, it is the
responsibility of the server to comply
with the IMAP4rev1 base specification <xref target="RFC3501"/> and <xref target="RFC5322"/> with
respect to all header information transmitted over the wire.
The issue of handling messages containing non-ASCII
characters in legacy environments is discussed in <xref target="legacy"/>.
</t>
</section>
<section anchor="iana" title="IANA Considerations">
<t> This document redefines two capabilities ("UTF8=ACCEPT" and
"UTF8=ONLY") in the IMAP 4 Capabilities registry <xref target="RFC3501"/>.
Three other capabilities that were described in the experimental
predecessor to this document (UTF8=ALL, UTF8=APPEND, UTF8=USER)
are now made OBSOLETE. IANA is asked to change the IMAP 4
Capabilities registry as follows:
</t>
<t>
<figure>
<artwork ><![CDATA[
OLD:
+--------------+---------------------------+
| UTF8=ACCEPT | [RFC5738] |
| UTF8=ALL | [RFC5738] |
| UTF8=APPEND | [RFC5738] |
| UTF8=ONLY | [RFC5738] |
| UTF8=USER | [RFC5738] |
+--------------+---------------------------+
]]> </artwork>
</figure>
<figure>
<artwork ><![CDATA[
NEW:
+--------------+---------------------------+
| UTF8=ACCEPT | [[this RFC]] |
| UTF8=ALL | OBSOLETE (was [RFC5738]) |
| UTF8=APPEND | OBSOLETE (was [RFC5738]) |
| UTF8=ONLY | [[this RFC]] |
| UTF8=USER | OBSOLETE (was [RFC5738]) |
+--------------+---------------------------+
]]> </artwork>
</figure>
</t>
</section>
<section anchor="security" title="Security Considerations">
<t> The security considerations of UTF-8 <xref target="RFC3629"/> and SASLprep <xref target="RFC4013"/>
apply to this specification, particularly with respect to use of
UTF-8 in user names and passwords. Otherwise, this is not believed
to alter the security considerations of IMAP4rev1.</t>
<t>Special considerations, some of them with security
implications, occur if a server that conforms to this
specification is accessed by a client that does not, as well as in some more complex
situations in which a given message is
accessed by multiple clients that might use different
protocols and/or support different capabilities. Those
issues are discussed in <xref target="legacy"/> above.
</t>
</section>
</middle>
<back>
<references title="Normative References">
&rfc2119;
&rfc3501;
&rfc3629;
&rfc4013;
&rfc4466;
&rfc4469;
&rfc5161;
&rfc5198;
&rfc5234;
&rfc5258;
&rfc6532;
&rfc5322;
&rfc6530;
&popimapdowngrade;
&simpledowngrade;
</references>
<references title="Informative References">
&rfc2088;
&rfc5738;
&rfc2342;
&rfc4314;
</references>
<section anchor="rationale" title="Design Rationale">
<t> This non-normative section discusses the reasons behind some of the
design choices in the above specification.</t>
<t> The basic approach of advertising the ability to access a mailbox in
UTF-8 mode is intended to permit graceful upgrade, including servers
that support multiple mailbox formats. In particular, it would be
undesirable to force conversion of an entire server mailstore to
UTF-8 headers, so being able to phase-in support for new mailboxes
and gradually migrate old mailboxes is permitted by this design.</t>
<t> The "UTF8=ONLY" mechanism simplifies diagnosis of interoperability
problems when legacy support goes away. In the situation where
backwards compatibility is broken anyway, just-send-UTF-8 IMAP has
the advantage that it might work with some legacy clients. However,
the difficulty of diagnosing interoperability problems caused by a
just-send-UTF-8 IMAP mechanism is the reason the "UTF8=ONLY"
capability mechanism was chosen.
</t>
<t> </t>
</section>
<section anchor="ack" title="Acknowledgments">
<t>The authors wish to thank the participants of the EAI working group
for their contributions to this document with particular thanks to
Harald Alvestrand, David Black, Randall Gellens, Arnt Gulbrandsen,
Kari Hurtta, John Klensin, Xiaodong Lee, Charles Lindsey, Alexey
Melnikov, Subramanian Moonesamy, Shawn Steele, Daniel Taharlev, and
Joseph Yee for their specific contributions to the discussion.</t>
</section>
</back>
</rfc>
| PAFTECH AB 2003-2026 | 2026-04-24 04:08:22 |