One document matched: draft-ietf-eai-pop-09.xml
<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc strict="yes"?>
<?rfc symrefs="yes"?>
<?rfc linkmailto="no"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<rfc ipr="pre5378Trust200902" docName="draft-ietf-eai-pop-09.txt" category="exp"> <!-- updates="1939" -->
<front>
<title>POP3 Support for UTF-8</title>
<?rfc include="author.rgellens"?>
<?rfc include="author.cnewman"?>
<date month="October" year="2009" />
<area>Applications</area>
<keyword>I-D</keyword>
<keyword>Internet-Draft</keyword>
<abstract><t>
This specification extends the Post Office Protocol version 3 (POP3) to support un-encoded international characters in user names, passwords, mail addresses, message headers, and protocol-level textual error strings.
</t></abstract>
</front>
<middle>
<section title="Introduction">
<t>This document forms part of the Email Address Internationalization
(EAI) experiment described in the <xref target="RFC4952">EAI Framework</xref> document
(for background, please see the charter of the EAI working
group) and should be evaluated within the context of EAI. As part of the overall EAI work, email messages may be transmitted and delivered
containing un-encoded UTF-8 characters, and mail drops accessed using <xref target="RFC1939">POP3</xref> might natively store UTF-8.</t>
<t>This specification extends <xref target="RFC1939">POP3</xref> using the
<xref target="RFC2449">POP3 Extension Mechanism</xref> to
permit un-encoded <xref target="RFC3629">UTF-8</xref> in headers as described in <xref target="RFC5335">Internationalized Email Headers</xref>. It also adds a mechanism to support login names outside the ASCII character set, and a mechanism to support UTF-8 protocol-level error strings in a language appropriate for the user.</t>
<t>This document updates <xref target="RFC1939">POP3</xref>, and the
fact that an Experimental specification updates a Standards-Track
specification means that people who participate in the experiment
have to consider the standard updated. Note that, as an Experimental document, there is no "Updates" header. If and when a version of this document moves to the standards track, an "Updates: 1939" header should be added.</t>
<t>Within this specification, the term down-conversion refers to the process of modifying a message containing <xref target="RFC5335">UTF8 headers</xref> or body parts with 8bit content-transfer-encoding as defined in <xref
target="RFC2045">MIME section 2.8</xref> into conforming 7-bit <xref target="RFC5322">Internet Message Format</xref> with <xref target="RFC2047">Message Header Extensions for Non-ASCII Text</xref> and other 7-bit encodings. Down-conversion is specified by <xref target="RFC5504">Downgrading mechanism for Email Address Internationalization</xref>.</t>
<section title="Conventions Used in this Document">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in
<xref target="RFC2119">"Key words for use in RFCs to Indicate
Requirement Levels"</xref>.</t>
<t>The formal syntax uses the <xref target="RFC5234">Augmented
Backus-Naur Form (ABNF)</xref> notation including the core rules
defined in Appendix B of RFC 5234.</t>
<t>In examples, "C:" and "S:" indicate lines sent by the client and
server respectively. If a single "C:" or "S:" label applies to multiple
lines, then the line breaks between those lines are for editorial
clarity only and are not part of the actual protocol exchange.</t>
<t>Note that examples always use 7-bit ASCII characters due to limitations of this document format; in particular, some examples for the "LANG" command may appear silly as a result.</t>
</section>
<section title="Change History">
<t>This section describes the change history of this Internet draft and will be removed when/if this is published as an RFC.</t>
<section title="Changes from -08 to -09">
<t><list style="symbols">
<t>Added new paragraph to start of Introduction to more clearly explain this document within the larger EAI context.</t>
<t>Added informative reference to EAI Framework (RFC 4952).</t>
<t>Removed "Updates: 1939" header and added a note that one should be added if and when it is published on the standards track.</t>
<t>Added clarifying text that the "language range" argument of RFC 4647 is the "Basic Language Range".</t>
</list>
</t>
</section>
<section title="Changes from -07 to -08">
<t><list style="symbols">
<t>Changed wording on applying SASLprep to APOP digest inputs.</t>
<t>Added mandatory rejection of user names or passwords which fail to comply with formal syntax of RFC 3629.</t>
<t>Added text that, when applying SASLprep, servers MUST reject user names or passwords which contain characters listed in section 2.3 of RFC 4013.</t>
<t>Added normative reference to RFC 3629.</t>
<t>Changed SASLprep language so that both clients and servers MUST apply SASLprep to user names and passwords used to compute APOP digest, and servers SHOULD apply SASLprep to arguments of USER and PASS.</t>
<t>Fixed typo ("ACII" instead of "ASCII").</t>
<t>Clarified that size doesn't include byte-stuffing.</t>
<t>Added explanation to Introduction regarding updating RFC 1939.</t>
<t>Added more prominent text on examples which are silly because they use 7-bit ASCII.</t>
<t>Replaced French examples with Spanish to try and make them slightly less embarrassing.</t>
<t>Replaced French-Canadian (fr-ca) example with Swedish, to try and avoid accented characters. Also added Swedish to language listing.</t>
<t>Added "Examples" to LANG command examples.</t>
<t>Added introductory text to sections on LANG and UTF8 capability tags.</t>
</list>
</t>
</section>
<section title="Changes from -06 to -07">
<t><list style="symbols">
<t>Added discussion about accuracy of size.</t>
<t>Added mention of potential buffer overflow problems because of inaccurate sizes to the Security Considerations.</t>
<t>Added informative reference to SASL for POP3 (RFC 5034).</t>
<t>Removed text making changes to AUTH, as this is handled by POP3 SASL.</t>
<t>Fixed typo ("depricated" instead of "deprecated").</t>
<t>Reworded Design Rationale appendix.</t>
</list>
</t>
</section>
<section title="Changes from -05 to -06">
<t><list style="symbols">
<t>Removed LIST and TOP as possible arguments to the UTF8 tag in the CAPA response.</t>
<t>Clarified that the UTF8 command has no parameters.</t>
<t>Changed "arguments" to "arguments with CAPA tag" to clarify that these are possible arguments to the tag in the CAPA response and not command parameters.</t>
<t>Clarified use of "argument" to refer to CAPA tag and "parameter" to refer to commands.</t>
<t>Clarified that free-form text is non-standard.</t>
<t>Removed open issue (downgrading).</t>
<t>Added discussion of downgrading to Appendix A.</t>
<t>Updated downgrade reference to RFC 5504.</t>
<t>Tweaked RFC 2119 text to satisfy I-D nit checker.</t>
</list>
</t>
</section>
<section title="Changes from -04 to -05">
<t><list style="symbols">
<t>Downgrading is back to an informative, not normative reference, and is suggested as a good idea but explicitly not required.</t>
<t>Language listing now specifies that the human-readable description of a language is in the language itself.</t>
<t>Updated 2822 reference to 5322, made text "Internet Message Format".</t>
<t>Updated reference to utf8headers draft to RFC5335.</t>
<t>Updated reference to RFC4234 to RFC5234.</t>
</list>
</t>
</section>
<section title="Changes from -03 to -04">
<t><list style="symbols">
<t>Specified that it is an error to issue STLS after UTF8.</t>
<t>Removed prior open issues.</t>
<t>Downgrading added as open issue.</t>
</list>
</t>
</section>
<section title="Changes from -02 to -03">
<t><list style="symbols">
<t>Updated references.</t>
<t>Replaced US-ASCII with ASCII.</t>
<t>Added comment to language listing failure example.</t>
<t>Replaced RET8, LST8, and TOP8 commands with a single mode-switch
UTF8 command issued before authentication. This simplifies the
protocol, and allows servers to optionally down-convert a cache of the maildrop prior to issuing the +OK response entering TRANSACTION state.</t>
<t>Removed most up-conversion material.</t>
<t>Removed definition of up-conversion.</t>
<t>Removed IMAP4 reference.</t>
<t>Added AUTH command to those affected by UTF8 capability.</t>
<t>Removed LST8 and TOP8 capability parameters and commands.</t>
<t>Removed NO-RETR capability. POP servers are now unconditionally required to support down-conversion of UTF8-native maildrops.</t>
<t>Added sentence about modifying authentication code to Security Considerations.</t>
<t>eai-downgrade draft is now normative and required.</t>
<t>Deleted references to RFCs 1341, 1847, 2049, 2183, 3501, 3516, and 3490.</t>
</list>
</t>
</section>
<section title="Changes from -01 to -02">
<t><list style="symbols">
<t>Minor grammatical tweaks.</t>
<t>Add passwords to Abstract.</t>
<t>Removed new editor's name from Acknowledgments.</t>
</list>
</t>
</section>
<section title="Changes from -00 to -01">
<t><list style="symbols">
<t>Update references</t>
</list>
</t>
</section>
<section title="Changes from draft-newman-ima-pop">
<t><list style="symbols">
<t>Change title to make this a WG document.</t>
<t>Add LANG command and extension.</t>
<t>Rename RET8 capability to UTF8 and add sub-sections for arguments.</t>
<t>Add TOP8 command.</t>
<t>Add definition of up-conversion and down-conversion.</t>
<t>Some grammar fix-ups and section re-ordering based on RFC editor style.</t>
</list>
</t>
</section>
</section>
<section title="Open Issues">
<t><list style="format %d. ">
<t>none</t>
</list>
</t>
</section>
</section>
<?rfc needLines="8"?>
<section title="LANG Capability">
<t>Per the <xref target="RFC2449">POP3 Extension Mechanism</xref>, this document adds a new capability response tag to indicate support for a new command: LANG. The capability tag and new command are described below.</t>
<t><list style="hanging">
<t hangText="CAPA tag:"><vspace/>LANG</t>
<t hangText="Arguments with CAPA tag:"><vspace/>none</t>
<t hangText="Added Commands:"><vspace/>LANG</t>
<t hangText="Standard commands affected:"><vspace/>All</t>
<t hangText="Announced states / possible differences:"><vspace/>both / no</t>
<t hangText="Commands valid in states:"><vspace/>AUTHENTICATION, TRANSACTION</t>
<t hangText="Specification reference:"><vspace/>this document</t>
<t hangText="Discussion:"></t>
</list>
</t>
<t>POP3 allows most +OK and -ERR server responses to include
human-readable text that in some cases needs to be presented to the
user. But that text is limited to ASCII by the <xref
target="RFC1939">POP3 specification</xref>. The LANG capability and
command permit a POP3 client to negotiate which language the server
should use when sending human-readable text.</t>
<t>A server that advertises the LANG extension MUST use the language
"i-default" as described in <xref target="RFC2277"></xref> as its default
language until another supported language is negotiated by the client.
A server MUST include "i-default" as one of its supported languages.</t>
<t>The LANG command requests that human-readable text included in all
subsequent +OK and -ERR responses be localized to a language matching
the language range argument (the "Basic Language Range" as described by
<xref target="RFC4647"></xref>). If the command succeeds, the server returns a
+OK response followed by a single space, the exact language tag
selected, another space, and the rest of the line is human-readable text
in the appropriate language. This and subsequent protocol-level
human readable text is encoded in the UTF-8 charset.</t>
<t>If the command fails, the server returns an -ERR response and
subsequent human-readable response text continues to use the language
that was previously active (typically i-default).</t>
<t>The special "*" language range argument indicates a request
to use a language designated as preferred by the server administrator. The preferred language MAY vary based on the currently active user.</t>
<t>If no argument is given and the POP3 server issues a positive response, then the response given is multi-line. After the initial +OK, for each language tag the server supports, the POP3 server responds with a line for that language. This line is called a "language listing".</t>
<t>In order to simplify parsing, all POP3 servers are required to use a certain format for language listings. A language listing consists of the <xref
target="RFC4646">language tag</xref> of the message, optionally followed by a single space and a human readable description of the language in the language itself, using the UTF-8 charset.</t>
<figure title="Examples">
<preamble>Examples:</preamble>
<artwork><![CDATA[
< Note that some examples do not include the correct character
accents due to limitations of this document format. >
< The server defaults to using English i-default responses until
the client explicitly changes the language. >
C: USER karen
S: +OK Hello, karen
C: PASS password
S: +OK karen's maildrop contains 2 messages (320 octets)
< Client requests deprecated MUL language. Server replies
with -ERR response >
C: LANG MUL
S: -ERR invalid language MUL
< A LANG command with no parameters is a request for
a language listing. >
C: LANG
S: +OK Language listing follows:
S: en English
S: en-boont English Boontling dialect
S: de Deutsch
S: it Italiano
S: es Espanol
S: sv Svenska
S: i-default Default language
S: .
< A request for a language listing might fail >
C: LANG
S: -ERR Server is unable to list languages
< Once the client changes the language, all responses will be in
that language starting with the response to the LANG command.
C: LANG es
S: +OK es Idioma cambiado
< If a server does not support the requested primary language,
responses will continue to be returned in the current language
the server is using. >
C: LANG uga
S: -ERR es Idioma <<UGA>> no es conocido
C: LANG sv
S: +OK sv Kommandot "LANG" lyckades
C: LANG *
S: +OK es Idioma cambiado
]]></artwork>
</figure>
</section>
<section title="UTF8 Capability">
<t>Per the <xref target="RFC2449">POP3 Extension Mechanism</xref>, this document adds a new capability response tag to indicate support for new server functionality including a new command, UTF8. The capability tag and new command and functionality are described below.</t>
<t><list style="hanging">
<t hangText="CAPA tag:"><vspace/>UTF8</t>
<t hangText="Arguments with CAPA tag:"><vspace/>USER</t>
<t hangText="Added Commands:"><vspace/>UTF8</t>
<t hangText="Standard commands affected:"><vspace/>USER, PASS, APOP, LIST, TOP, RETR</t>
<t hangText="Announced states / possible differences:"><vspace/>both / no</t>
<t hangText="Commands valid in states:"><vspace/>AUTHORIZATION</t>
<t hangText="Specification reference:"><vspace/>this document</t>
<t hangText="Discussion:"></t>
</list>
</t>
<t>This capability adds the "UTF8" command to POP3. The UTF8 command
switches the session from ASCII to UTF8 mode.</t>
<section anchor="UTF8_Command" title="The UTF8 Command">
<t>The UTF8 command enables UTF8 mode. The UTF8 command has no parameters.</t>
<t>Maildrops can natively store UTF8 or be limited to ASCII. UTF8 mode has no effect on messages in an
ASCII-only maildrop. Messages in native-UTF8 maildrops can be ASCII
or UTF8 using <xref
target="RFC5335">internationalized headers</xref>
and/or 8bit content-transfer-encoding as defined in <xref
target="RFC2045">MIME section 2.8</xref>.
In UTF8 mode, both UTF8 and ASCII messages are sent to the client
as-is (without conversion). When not in UTF8 mode, UTF8 messages in
a native UTF8 maildrop MUST be down-converted (downgraded) to comply with unextended POP and Internet Mail Format. POP servers (unlike SMTP and Submit servers) are not required to use <xref target="RFC5504">Downgrading mechanism for Email Address Internationalization</xref>.</t>
<t>
Discussion: The main argument against a single required mechanism for downgrade by a POP server is that the only clients that have any use for
a standardized downgraded message (because they wish to interpret downgrade headers, for example) are ones that can support UTF8 and hence will issue the UTF8 command in the first place. The counter argument to this is that non-UTF8 clients might be upgraded in the future; it's desirable for an upgraded client to be capable of interpreting prior downgraded messages in the local mail store, which is most likely if the messages were downgraded using one standardized procedure.</t>
<t>Therefore, while POP servers are not required to use the <xref target="RFC5504">Downgrading mechanism for Email Address Internationalization</xref>, there are advantages to them doing so.</t>
<t>Note that even in UTF8 mode, MIME binary content-transfer-encoding is still not permitted.</t>
<t>The octet count (size) of a message reported in a response to
the LIST command SHOULD match the actual number of octets sent in a
RETR response (not counting byte-stuffing). Sizes reported elsewhere, such as in STAT responses and
non-standardized free-form text in positive status indicators (following
"+OK") need not be accurate, but it is preferable if they are.</t>
<t>Discussion: Mail stores are either ASCII or native UTF-8, and clients either issue the UTF8 command or not. The message needs converting only when it is native UTF8 and the client has not issued the UTF8 command, in which case the server must downconvert it. The downconverted message may be larger. The server may choose various strategies regarding downconversion, which include when to downconvert, whether to cache or store the downconverted form of a message (and if so, for how long), and whether to calculate or retain the size of a downconverted message independently of the downconverted content. If the server does not have immediate access to the accurate downconverted size, it may be faster to estimate rather than calculate it. Servers are expected to normally follow the <xref target="RFC1939">RFC 1939</xref> text on using the "exact size" in a scan listing, but there may be situations with maildrops containing very large numbers of messages in which this might be a problem. If the server does estimate, reporting a scan listing size smaller than what it turns out to be could be a problem for some clients. In summary, it is better for servers to report accurate sizes, but if not, high guesses are better than small ones. Some POP servers include the message size in the non-standardized text response following "+OK" (the 'text' production of <xref target="RFC2449">RFC 2449</xref>), in a RETR or TOP response (possibly because some examples in <xref target="RFC1939">POP3</xref> do so). There has been at least one known case of a client relying on this to know when it had received all of the message rather than following the <xref target="RFC1939">POP3</xref> rule of looking for a line consisting of a termination octet (".") and a CRLF pair. While any such client is non-compliant, if a server does include the size in such text, it is better if it is accurate.</t>
<t>Clients MUST NOT issue the <xref target="RFC2595">STLS command</xref> after issuing UTF8; servers MAY (but are not required to) enforce this by rejecting with an "-ERR" response an STLS command issued subsequent to a successful UTF8 command. (Because this is a protocol error as
opposed to a failure based on conditions, an <xref target="RFC2449">extended response code</xref> is not specified.)</t>
</section>
<section title="USER Argument to UTF8 Capability">
<t>If the USER argument is included with this capability, it indicates
that the server accepts UTF-8 user names and passwords.</t>
<t>Servers which include the USER argument in the UTF8 capability response SHOULD apply <xref
target="RFC4013">SASLprep</xref> to the arguments of the USER and PASS commands.</t>
<t>A client or server that supports APOP and permits UTF-8 in user
names or passwords MUST apply <xref target="RFC4013">SASLprep</xref> to the user name and password used to compute the APOP digest.</t>
<t>When applying <xref target="RFC4013">SASLprep</xref>, servers MUST reject UTF-8 user names or passwords which contain a Unicode character listed in section 2.3 of <xref target="RFC4013">SASLprep</xref>.</t>
<t>The client does not need to issue the UTF8 command prior to using UTF8 in authentication. However, clients MUST NOT use UTF8 in USER, PASS, or APOP commands unless the USER argument is included in the UTF8 capability response.</t>
<t>The server MUST reject UTF-8 user names or passwords which fail to comply with the formal syntax in <xref
target="RFC3629">UTF-8</xref>.</t>
<t>Use of UTF8 in the AUTH command is governed by the <xref
target="RFC5034">POP3 SASL</xref> mechanism.</t>
</section>
</section>
<section title="Issues with UTF-8 Header maildrop">
<t>When a POP3 server uses a UTF8-native maildrop, it is the responsibility
of the server to comply with the <xref target="RFC1939">POP3 base
specification</xref> and <xref target="RFC5322">Internet Message Format</xref> when not in UTF8 mode. Mechanisms for 7-bit
downgrading to help comply with the standards are described in <xref
target="RFC5504">Downgrading mechanism for Email Address Internationalization</xref>.</t>
</section>
<?rfc needLines="4"?>
<section title="IANA Considerations">
<t>This adds two new capabilities ("UTF8" and "LANG") to the <xref target="RFC2449">POP3 capability registry</xref>.</t>
</section>
<section title="Security Considerations">
<t>The security considerations of <xref target="RFC3629">UTF-8</xref>
and <xref target="RFC4013">SASLprep</xref> apply to this specification,
particularly with respect to use of UTF-8 in user names and passwords.</t>
<t>The "LANG *" command can reveal the existence and preferred language of a user to an active attacker probing the system if the active language changes in response to the USER, PASS, or APOP commands prior to validating the user's credentials. Servers MUST implement a configuration to prevent this exposure.</t>
<t>It is possible for a man-in-the-middle attacker to insert a LANG command in the command stream thus making protocol-level diagnostic responses unintelligible to the user. A mechanism to integrity protect the session, such as <xref target="RFC2595">TLS</xref> can be used to defeat such attacks.</t>
<t>Modifying server authentication code (in this case, to support UTF8) needs to be done with care to avoid introducing vulnerabilities (for
example, in string parsing).</t>
<t>The <xref target="UTF8_Command">UTF8 Command</xref> description contains a discussion on reporting inaccurate sizes. An additional risk to doing so is that, if a client allocates buffers based on the reported size, it may overrun the buffer, crash, or have other problems if the message data is larger than reported.</t>
</section>
</middle>
<back>
<references title="Normative References">
<?rfc include="reference.RFC.1939"?> <!-- POP3 -->
<?rfc include="reference.RFC.2045"?> <!-- MIME Bodies -->
<?rfc include="reference.RFC.2047"?> <!-- MIME Headers -->
<?rfc include="reference.RFC.2119"?> <!-- Keywords -->
<?rfc include="reference.RFC.2277"?> <!-- Policy -->
<?rfc include="reference.RFC.2449"?> <!-- POP3 CAPA -->
<?rfc include="reference.RFC.5322"?> <!-- Message Format -->
<?rfc include="reference.RFC.4646"?> <!-- language tags -->
<?rfc include="reference.RFC.4647"?> <!-- language range -->
<?rfc include="reference.RFC.3629"?> <!-- UTF-8 -->
<?rfc include="reference.RFC.4013"?> <!-- SASLprep -->
<?rfc include="reference.RFC.5234"?> <!-- ABNF -->
<?rfc include="reference.RFC.5335"?> <!-- UTF8headers -->
</references>
<references title="Informative References">
<?rfc include="reference.RFC.2595"?> <!-- IMAP/POP/ACAP TLS -->
<?rfc include="reference.RFC.4952"?> <!-- EAI Framework -->
<?rfc include="reference.RFC.5034"?> <!-- POP3 SASL -->
<?rfc include="reference.RFC.5504"?> <!-- EAI Downgrading Mechanism -->
</references>
<section title="Design Rationale">
<t>This non-normative section discusses the reasons behind some of the design choices in the above specification.</t>
<t>Having servers perform up-conversion so that, at a minimum, RFC2047-encoded words are decoded into UTF8 is tempting, since this is an area that clients often fail to correctly implement. However, after much discussion the group felt that the benefits did not justify the burden.</t>
<t>Due to interoperability problems with RFC 2047 and limited deployment of RFC 2231, it is hoped these 7-bit encoding mechanisms can be deprecated in the future when UTF-8 header support becomes prevalent. </t>
<t>USER is optional because the implementation burden of <xref target="RFC4013">SASLprep</xref> is not well understood and mandating such support in all cases could negatively impact deployment.</t>
<t>While it is possible to provide useful examples for language
negotiation without support for non-ASCII characters, it is
difficult to provide useful examples for commands specifically
designed to use the UTF-8 charset un-encoded when the document
format is limited to ASCII. As a result, there are no plans to
provide examples for that part of the specification as long as this
remains an experimental proposal. However, implementers of this
specification are encouraged to provide examples to the document
author for a future revision.</t> <t>While down-conversion of
native-UTF8 messages is mandatory in the absence of the UTF8
command, servers are not required to do so as specified in <xref
target="RFC5504">Downgrading Mechanism</xref>. As clients are
upgraded with UTF8 support and the ability to intelligently handle
(e.g., display and reply to) UTF8 messages that were downgraded in
transit, it is better if they are also able to handle messages in
the local mail store that were downgraded by the POP server. This
is more likely if the POP server downgrades messages using the same
mechanism as an SMTP server.</t>
</section>
<section title="Acknowledgments">
<t>Thanks to John Klensin, Tony Hansen and other EAI
working group participants who provided helpful suggestions and
interesting debate that improved this specification.</t>
</section>
</back>
</rfc>
| PAFTECH AB 2003-2026 | 2026-04-24 05:55:11 |