Network Working Group M. Blanchet
Internet-Draft Viagenie
Intended status: Informational A. Sullivan
Expires: April 21, 2011 October 18, 2010
Stringprep Revision Problem Statement
draft-ietf-precis-problem-statement-00.txt
Abstract
Using Unicode codepoints in protocol strings that expect comparison
with other strings [[anchor1: The WG will need to decide whether
"other strings" is too broad. In particular, what about protocol
slots that can take strings other than plain ASCII?
--ajs@shinkuro.com]] requires preparation of the string that contains
the Unicode codepoints. Internationalizing Domain Names in
Applications (IDNA2003) defined and used Stringprep and Nameprep.
Other protocols subsequently defined Stringprep profiles. A new
approach different from Stringprep and Nameprep is used for a
revision of IDNA2003 (called IDNA2008). Other Stringprep profiles
need to be similarly updated, or a replacement for Stringprep needs
to be designed. This document outlines the issues to be faced by those
designing a Stringprep replacement.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 21, 2011.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Blanchet & Sullivan Expires April 21, 2011 [Page 1]
Internet-Draft Stringprep Revision Problem Statement October 2010
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Usage and Issues of Stringprep . . . . . . . . . . . . . . . . 5
2.1. Issues raised during newprep BOF . . . . . . . . . . . . . 5
2.2. Specific issues with particular Stringprep profiles . . . 6
2.3. Inclusion vs. exclusion of characters . . . . . . . . . . 6
2.4. Stringprep and NFKC . . . . . . . . . . . . . . . . . . . 7
2.5. Case mapping . . . . . . . . . . . . . . . . . . . . . . . 7
2.6. Whether to use ASCII-compatible encoding . . . . . . . . . 7
2.7. Issues with delimiters . . . . . . . . . . . . . . . . . . 8
3. Considerations for Stringprep replacement . . . . . . . . . . 8
4. Security Considerations . . . . . . . . . . . . . . . . . . . 9
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9
6. Discussion home for this draft . . . . . . . . . . . . . . . . 9
7. Informative References . . . . . . . . . . . . . . . . . . . . 9
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12
1. Introduction
Internationalizing Domain Names in Applications (IDNA2003) [RFC3490],
[RFC3491], [RFC3492], [RFC3454] described a mechanism for encoding
UTF-8 labels making up Internationalized Domain Names (IDNs) as
standard DNS labels. The labels were processed using a method called
Nameprep [RFC3491] and Punycode [RFC3492]. That method was specific
to IDNA2003, but is generalized as Stringprep [RFC3454]. The general
mechanism can be used to help other protocols with similar needs, but
with different constraints than IDNA2003.
Stringprep defines a framework within which protocols define their
Stringprep profiles. Known IETF specifications using Stringprep are
listed below:
o The Nameprep profile [RFC3491] for use in Internationalized Domain
Names (IDNs);
o NFSv4 [RFC3530] and NFSv4.1 [RFC5661];
o The iSCSI profile [RFC3722] for use in Internet Small Computer
Systems Interface (iSCSI) Names;
o EAP [RFC3748];
o The Nodeprep and Resourceprep profiles [RFC3920] for use in the
Extensible Messaging and Presence Protocol (XMPP), and the XMPP to
CPIM mapping [RFC3922];
o The Policy MIB profile [RFC4011] for use in the Simple Network
Management Protocol (SNMP);
o The SASLprep profile [RFC4013] for use in the Simple
Authentication and Security Layer (SASL), and SASL itself
[RFC4422];
o TLS [RFC4279];
o IMAP4 using SASLprep [RFC4314];
o The trace profile [RFC4505] for use with the SASL ANONYMOUS
mechanism;
o The LDAP profile [RFC4518] for use with LDAP [RFC4511] and its
authentication methods [RFC4513];
o Plain SASL using SASLprep [RFC4616];
o NNTP using SASLprep [RFC4643];
o PKIX subject identification using LDAPprep [RFC4683];
o Internet Application Protocol Collation Registry [RFC4790];
o SMTP Auth using SASLprep [RFC4954];
o POP3 Auth using SASLprep [RFC5034];
o TLS SRP using SASLprep [RFC5054];
o IRI and URI in XMPP [RFC5122];
o PKIX CRL using LDAPprep [RFC5280];
o IAX using Nameprep [RFC5456];
o SASL SCRAM using SASLprep [RFC5802];
o Remote management of Sieve using SASLprep [RFC5804];
o The i;unicode-casemap Unicode Collation [RFC5051].
There turned out to be some difficulties with IDNA2003, documented in
[RFC4690]. These difficulties led to a new IDN specification, called
IDNA2008 [RFC5890], [RFC5891], [RFC5892], [RFC5893]. Additional
background and explanations of the decisions embodied in IDNA2008 are
presented in [RFC5894]. One of the effects of IDNA2008 is that
Nameprep and Stringprep are not used at all. Instead, an algorithm
based on Unicode properties of codepoints is defined. That algorithm
generates a stable and complete table of the supported Unicode
codepoints. This algorithm is based on an inclusion-based approach,
instead of the exclusion-based approach of Stringprep/Nameprep.
This document lists the shortcomings and issues found by protocols
listed above that defined Stringprep profiles. It also lists some
early conclusions and requirements for a potential replacement of
Stringprep.
2. Usage and Issues of Stringprep
2.1. Issues raised during newprep BOF
During IETF 77, a BOF discussed the current state of the protocols
that have defined Stringprep profiles [NEWPREP]. The main
conclusions are:
o Stringprep is bound to a specific version of Unicode: 3.2.
Stringprep has not been updated to new versions of Unicode.
Therefore, the protocols using Stringprep are stuck with Unicode
3.2.
o The protocols need to be updated to support new versions of
Unicode. The protocols would like to not be bound to a specific
version of Unicode, but rather have better Unicode agility in the
way of IDNA2008. This is important partly because it is usually
impossible for an application to require Unicode 3.2; the
application gets whatever version of Unicode is available on the
host.
o The protocols require better bidirectional support (bidi) than
currently offered by Stringprep.
o If the protocols are updated to use a new version of Stringprep or
another framework, then backward compatibility is an important
requirement. For example, Stringprep is based on and may use NFKC
[UAX15], while IDNA2008 mostly uses NFC [UAX15].
o Protocols use each other; for example, a protocol can use user
identifiers that are later passed to SASL, LDAP or another
authentication mechanism. Therefore, a common set of rules or
classes of strings is preferred over specific rules for each
protocol.
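The binding to Unicode 3.2 noted in the conclusions above can be
observed with Python's standard library, whose stringprep module is
built on a frozen Unicode 3.2 database. The following is a minimal
sketch for illustration only and is not part of any profile; the
sample codepoint U+20B9 (INDIAN RUPEE SIGN, assigned after Unicode
3.2) is our own choice.

```python
import stringprep
import unicodedata

# U+20B9 INDIAN RUPEE SIGN was assigned after Unicode 3.2 (sample chosen
# for illustration only).
ch = "\u20b9"

# Table A.1 of RFC 3454 lists codepoints unassigned in Unicode 3.2.
unassigned_in_3_2 = stringprep.in_table_a1(ch)

# The same codepoint as seen by the frozen 3.2 database and by the
# (newer) database shipped with the running interpreter.
category_3_2 = unicodedata.ucd_3_2_0.category(ch)
category_host = unicodedata.category(ch)

print("host Unicode version:", unicodedata.unidata_version)
print("unassigned in 3.2:", unassigned_in_3_2)
print("category (3.2 / host):", category_3_2, "/", category_host)
```

A Stringprep profile has to treat this codepoint as unassigned, while
the host's own Unicode tables classify it as a currency symbol.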
Protocols that use Stringprep profiles use strings for different
purposes:
o XMPP uses a different Stringprep profile for each part of the XMPP
address (JID): a localpart which is similar to a username and used
for authentication, a domainpart which is a domain name and a
resource part which is less restrictive than the localpart.
o iSCSI uses a Stringprep profile for the IQN, which is very similar
to (often is) a DNS domain name.
o SASL and LDAP use a Stringprep profile for usernames.
o LDAP uses a set of Stringprep profiles.
During the newprep BOF, it was the consensus of the attendees that it
would be highly preferable to have a replacement of Stringprep, with
similar characteristics to IDNA2008. That replacement should be
defined so that the protocols could use internationalized strings
without a lot of specialized internationalization work, since
internationalization expertise is not available in the respective
protocols or working groups.
2.2. Specific issues with particular Stringprep profiles
[[anchor6: This section is where issues raised in the individual
profile reviews go. A review of the WG tracker state on 2010-10-06
suggests those reviews haven't happened yet.
--ajs@shinkuro.com]]
2.3. Inclusion vs. exclusion of characters
One of the primary changes of IDNA2008 is in the way it approaches
Unicode characters. IDNA2003 created an explicit list of excluded or
mapped-away characters; anything in Unicode 3.2 that was not so
listed could be assumed to be allowed under the protocol. IDNA2008
begins instead from the assumption that characters are disallowed,
and then relies on Unicode properties to derive whether a given
character actually is allowed in the protocol.
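The difference between the two approaches can be sketched as follows.
This is an illustration only: the exclusion list and the set of
admitted General_Category values are hypothetical and much simpler
than the tables of [RFC3454] or the derivation rules of [RFC5892].

```python
import unicodedata

# Hypothetical exclusion list: anything not listed is allowed.
EXCLUDED = {"\u0020", "\u00ad"}  # SPACE, SOFT HYPHEN (illustrative only)

# Hypothetical inclusion rule: only these General_Category values pass.
ALLOWED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc", "Nd"}


def allowed_by_exclusion(ch):
    """Stringprep-style: allowed unless explicitly listed."""
    return ch not in EXCLUDED


def allowed_by_inclusion(ch):
    """IDNA2008-style: disallowed unless a property rule admits it."""
    return unicodedata.category(ch) in ALLOWED_CATEGORIES


# A symbol nobody thought to exclude: U+2603 SNOWMAN (category So).
snowman = "\u2603"
print(allowed_by_exclusion(snowman), allowed_by_inclusion(snowman))
```

Under the exclusion approach a character that nobody listed slips
through; under the inclusion approach it is rejected unless a rule
admits it, and the rule can be re-evaluated against newer Unicode
versions.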
Moreover, there is more than one class of "allowed in the protocol".
While some characters are simply disallowed, some are allowed only in
certain contexts. The reasons for the context-dependent rules have
to do with the way some characters are used. For instance, the ZERO
WIDTH JOINER and ZERO WIDTH NON-JOINER characters (ZWJ, U+200D and
ZWNJ, U+200C) are allowed with contextual rules because they are
required in some circumstances, yet are considered punctuation by
Unicode and would therefore be DISALLOWED under the usual IDNA2008
derivation rules.
The working group needs to decide whether similar contextual cases
need to be supported.
2.4. Stringprep and NFKC
Stringprep profiles may use normalization. If they do, they use NFKC
[UAX15]. It is not clear that NFKC is the right normalization to use
in all cases. In [UAX15], there is the following observation
regarding Normalization Forms KC and KD: "It is best to think of
these Normalization Forms as being like uppercase or lowercase
mappings: useful in certain contexts for identifying core meanings,
but also performing modifications to the text that may not always be
appropriate." For things like the spelling of users' names, then,
NFKC may not be the best form to use. At the same time, one of the
nice things about NFKC is that it deals with the width of characters
that are otherwise similar, by canonicalizing half-width to full-
width. This mapping step can be crucial in practice. The WG will
need to analyze the different use profiles and consider whether NFKC
or NFC is a better normalization for each profile.
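The practical difference between the two forms can be seen with a few
sample strings. The sketch below uses Python's unicodedata module;
the samples are our own choice and are not drawn from any profile.

```python
import unicodedata

samples = {
    "fullwidth Latin A": "\uff21",      # U+FF21
    "halfwidth Katakana KA": "\uff76",  # U+FF76
    "ligature fi": "\ufb01",            # U+FB01
    "e + combining acute": "e\u0301",
}

for name, text in samples.items():
    nfc = unicodedata.normalize("NFC", text)
    nfkc = unicodedata.normalize("NFKC", text)
    print(f"{name}: NFC={nfc!r} NFKC={nfkc!r}")
```

NFC only composes the last sample; NFKC additionally folds the
fullwidth Latin letter to ASCII, the halfwidth Katakana letter to its
ordinary-width form, and the ligature to two letters. Whether such
folding helps or harms depends on the kind of string being prepared.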
2.5. Case mapping
In IDNA2003, labels are always mapped to lower case before the
Punycode transformation. In IDNA2008, there is no mapping at all:
input is either a valid U-label or it is not. At the same time,
upper-case characters are by definition not valid in U-labels, because
they fall into the Unstable category (category B) of [RFC5892].
If there are protocols that require that upper and lower case be
preserved, then the analogy with IDNA2008 will break down. The
working group will need to decide whether there are any cases that
require upper case, and what to do about them if so.
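As an illustration of mapping-based case handling, Python's built-in
"idna" codec implements IDNA2003 and therefore applies the Nameprep
case mapping before the Punycode step. The sketch below, with sample
strings of our own choosing, also shows that case folding and
lower-casing differ and that neither preserves the original case.

```python
# Python's built-in "idna" codec implements IDNA2003 (RFC 3490), so it
# applies Nameprep, including case mapping, before Punycode.
upper = "B\u00dcCHER".encode("idna")
lower = "b\u00fccher".encode("idna")
print(upper, lower)

# Case folding is not the same as lower-casing, and neither preserves case.
sharp_s = "Stra\u00dfe"
print(sharp_s.lower(), sharp_s.casefold())
```

Both spellings of the label yield the same encoded form, so the
original case cannot be recovered from it. A protocol that needs case
preservation would have to keep the original string alongside any
folded form used for comparison.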
2.6. Whether to use ASCII-compatible encoding
The development of IDNA2008 depended on the notion that there was a
narrow repertoire of reasonable traditional labels, and what was
necessary was to internationalize that repertoire rather than to
incorporate any characters into domain name labels. More exactly,
the idea was to internationalize the traditional hostname rules (the
"LDH rule"; see [RFC4690], Section 5.1). Efforts to
internationalize email ([RFC5336]) have started from different
assumptions. The email example suggests that in some cases, the
right answer might be to internationalize the target protocol rather
than to depend on a technology to ensure protocol slots can use only
ASCII. The working group will need to determine which approach is
correct for the different use-cases.
2.7. Issues with delimiters
There are two kinds of issues to address with delimiters. First,
exactly where a delimiter will appear on the screen when dealing with
bidirectional parts of a string can be extremely surprising. In the
case of IDNA2008, just what to do in these cases remains a display
issue (there is no question about the wire format, because the wire
format is an A-label and it is always left to right).
Second, there is the question of whether to include different kinds
of protocol separators. For instance, FULL STOP, U+002E (.) may not
be available on all keyboards. In addition, in some languages there
is more than one full stop character, each a variant of the others. The
working group will need to decide how to handle such cases: whether
there will be a mapping, some restrictions, or something else.
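One possible treatment, mapping variant full stops to a single
separator, can be sketched as follows. The separator set is the one
IDNA2003 recognizes ([RFC3490], Section 3.1); the helper name
split_labels is hypothetical, and nothing here implies that a
Stringprep replacement should adopt this mapping.

```python
import re

# Separator set taken from IDNA2003 (RFC 3490, Section 3.1); whether a
# Stringprep replacement should map these is an open question.
DOT_VARIANTS = "\u002e\u3002\uff0e\uff61"
_SEPARATOR = re.compile("[" + re.escape(DOT_VARIANTS) + "]")


def split_labels(name):
    """Split on any dot variant (hypothetical helper)."""
    return _SEPARATOR.split(name)


print(split_labels("example\u3002com"))
```

The alternative treatments mentioned above, restricting the input or
leaving variants untouched, would replace the split with a rejection
or with a split on U+002E only.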
3. Considerations for Stringprep replacement
The above suggests the following direction for the working group:
o A Stringprep replacement should be defined.
o The replacement should take an approach similar to IDNA2008, in
that it enables Unicode agility.
o Protocols share similar characteristics of strings. Therefore,
defining i18n preparation algorithms for a (small) set of string
classes may be sufficient for most cases and would provide
coherence among a set of related protocols.
o The sets of string classes need to be evaluated for the following
properties:
* the normalization needed (NFC vs NFKC);
* whether case-folding, case preservation, and case-insensitive
matching are needed;
* what restrictions on input are reasonable for the class (i.e.
whether there is something like an "LDH rule" for the class),
or whether the ASCII-only input in the protocol slot is lightly
constrained;
* the extent to which bidi considerations are important for the
class.
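The properties listed above could be recorded per string class in a
small data structure. The sketch below is purely hypothetical: the
StringClass type, its field names, the two example classes, and the
prepare function are our own inventions for illustration and are not
proposed definitions. Input restrictions and bidi rules are not
sketched.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class StringClass:
    """Hypothetical description of one string class; not from any spec."""
    name: str
    normalization: str   # "NFC" or "NFKC"
    fold_case: bool      # True: case-insensitive matching via folding
    check_bidi: bool     # placeholder flag; bidi rules are not sketched


def prepare(text, cls):
    """Apply the class's case and normalization choices, in that order."""
    if cls.fold_case:
        text = text.casefold()
    return unicodedata.normalize(cls.normalization, text)


identifier = StringClass("identifier-like", "NFKC", True, True)
free_form = StringClass("free-form", "NFC", False, False)

print(prepare("\uff21lice", identifier), prepare("\uff21lice", free_form))
```

With these assumed settings the identifier-like class folds a
fullwidth, upper-case first letter to plain ASCII lower case, while
the free-form class leaves the string unchanged.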
Existing deployments already depend on Stringprep profiles.
Therefore, the working group will need to consider the effects of any
new strategy on existing deployments. By way of comparison, it is
worth noting that some characters were acceptable in IDNA labels
under IDNA2003, but are not protocol-valid under IDNA2008 (and
conversely). Different implementers may make different decisions
about what to do in such cases; this could have interoperability
effects. The working group will need to trade better support for
different linguistic environments against the potential side effects
of backward incompatibility.
4. Security Considerations
This document merely states what problems are to be solved, and does
not define a protocol. There are undoubtedly security implications
of the particular results that will come from the work to be
completed.
5. IANA Considerations
This document has no actions for IANA.
6. Discussion home for this draft
This document is intended to define the problem space discussed on
the precis@ietf.org mailing list.
7. Informative References
[NEWPREP] "Newprep BoF Meeting Minutes", March 2010.
[RFC3454] Hoffman, P. and M. Blanchet, "Preparation of
Internationalized Strings ("stringprep")", RFC 3454,
December 2002.
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
"Internationalizing Domain Names in Applications (IDNA)",
RFC 3490, March 2003.
[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
Profile for Internationalized Domain Names (IDN)",
RFC 3491, March 2003.
[RFC3492] Costello, A., "Punycode: A Bootstring encoding of Unicode
for Internationalized Domain Names in Applications
(IDNA)", RFC 3492, March 2003.
[RFC3530] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
Beame, C., Eisler, M., and D. Noveck, "Network File System
(NFS) version 4 Protocol", RFC 3530, April 2003.
[RFC3722] Bakke, M., "String Profile for Internet Small Computer
Systems Interface (iSCSI) Names", RFC 3722, April 2004.
[RFC3748] Aboba, B., Blunk, L., Vollbrecht, J., Carlson, J., and H.
Levkowetz, "Extensible Authentication Protocol (EAP)",
RFC 3748, June 2004.
[RFC3920] Saint-Andre, P., Ed., "Extensible Messaging and Presence
Protocol (XMPP): Core", RFC 3920, October 2004.
[RFC3922] Saint-Andre, P., "Mapping the Extensible Messaging and
Presence Protocol (XMPP) to Common Presence and Instant
Messaging (CPIM)", RFC 3922, October 2004.
[RFC4011] Waldbusser, S., Saperia, J., and T. Hongal, "Policy Based
Management MIB", RFC 4011, March 2005.
[RFC4013] Zeilenga, K., "SASLprep: Stringprep Profile for User Names
and Passwords", RFC 4013, February 2005.
[RFC4279] Eronen, P. and H. Tschofenig, "Pre-Shared Key Ciphersuites
for Transport Layer Security (TLS)", RFC 4279,
December 2005.
[RFC4314] Melnikov, A., "IMAP4 Access Control List (ACL) Extension",
RFC 4314, December 2005.
[RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and
Security Layer (SASL)", RFC 4422, June 2006.
[RFC4505] Zeilenga, K., "Anonymous Simple Authentication and
Security Layer (SASL) Mechanism", RFC 4505, June 2006.
[RFC4511] Sermersheim, J., "Lightweight Directory Access Protocol
(LDAP): The Protocol", RFC 4511, June 2006.
[RFC4513] Harrison, R., "Lightweight Directory Access Protocol
(LDAP): Authentication Methods and Security Mechanisms",
RFC 4513, June 2006.
[RFC4518] Zeilenga, K., "Lightweight Directory Access Protocol
(LDAP): Internationalized String Preparation", RFC 4518,
June 2006.
[RFC4616] Zeilenga, K., "The PLAIN Simple Authentication and
Security Layer (SASL) Mechanism", RFC 4616, August 2006.
[RFC4643] Vinocur, J. and K. Murchison, "Network News Transfer
Protocol (NNTP) Extension for Authentication", RFC 4643,
October 2006.
[RFC4683] Park, J., Lee, J., Lee, H., Park, S., and T. Polk,
"Internet X.509 Public Key Infrastructure Subject
Identification Method (SIM)", RFC 4683, October 2006.
[RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
Recommendations for Internationalized Domain Names
(IDNs)", RFC 4690, September 2006.
[RFC4790] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet
Application Protocol Collation Registry", RFC 4790,
March 2007.
[RFC4954] Siemborski, R. and A. Melnikov, "SMTP Service Extension
for Authentication", RFC 4954, July 2007.
[RFC5034] Siemborski, R. and A. Menon-Sen, "The Post Office Protocol
(POP3) Simple Authentication and Security Layer (SASL)
Authentication Mechanism", RFC 5034, July 2007.
[RFC5051] Crispin, M., "i;unicode-casemap - Simple Unicode Collation
Algorithm", RFC 5051, October 2007.
[RFC5054] Taylor, D., Wu, T., Mavrogiannopoulos, N., and T. Perrin,
"Using the Secure Remote Password (SRP) Protocol for TLS
Authentication", RFC 5054, November 2007.
[RFC5122] Saint-Andre, P., "Internationalized Resource Identifiers
(IRIs) and Uniform Resource Identifiers (URIs) for the
Extensible Messaging and Presence Protocol (XMPP)",
RFC 5122, February 2008.
[RFC5280] Cooper, D., Santesson, S., Farrell, S., Boeyen, S.,
Housley, R., and W. Polk, "Internet X.509 Public Key
Infrastructure Certificate and Certificate Revocation List
(CRL) Profile", RFC 5280, May 2008.
[RFC5336] Yao, J. and W. Mao, "SMTP Extension for Internationalized
Email Addresses", RFC 5336, September 2008.
[RFC5456] Spencer, M., Capouch, B., Guy, E., Miller, F., and K.
Shumard, "IAX: Inter-Asterisk eXchange Version 2",
RFC 5456, February 2010.
[RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File
System (NFS) Version 4 Minor Version 1 Protocol",
RFC 5661, January 2010.
[RFC5802] Newman, C., Menon-Sen, A., Melnikov, A., and N. Williams,
"Salted Challenge Response Authentication Mechanism
(SCRAM) SASL and GSS-API Mechanisms", RFC 5802, July 2010.
[RFC5804] Melnikov, A. and T. Martin, "A Protocol for Remotely
Managing Sieve Scripts", RFC 5804, July 2010.
[RFC5890] Klensin, J., "Internationalized Domain Names for
Applications (IDNA): Definitions and Document Framework",
RFC 5890, August 2010.
[RFC5891] Klensin, J., "Internationalized Domain Names in
Applications (IDNA): Protocol", RFC 5891, August 2010.
[RFC5892] Faltstrom, P., "The Unicode Code Points and
Internationalized Domain Names for Applications (IDNA)",
RFC 5892, August 2010.
[RFC5893] Alvestrand, H. and C. Karp, "Right-to-Left Scripts for
Internationalized Domain Names for Applications (IDNA)",
RFC 5893, August 2010.
[RFC5894] Klensin, J., "Internationalized Domain Names for
Applications (IDNA): Background, Explanation, and
Rationale", RFC 5894, August 2010.
[UAX15] "Unicode Standard Annex #15: Unicode Normalization Forms",
UAX 15, September 2009.
Authors' Addresses
Marc Blanchet
Viagenie
2600 boul. Laurier, suite 625
Quebec, QC G1V 4W1
Canada
Email: Marc.Blanchet@viagenie.ca
URI: http://viagenie.ca
Andrew Sullivan
519 Maitland St.
London, ON N6B 2Z5
Canada
Email: ajs@crankycanuck.ca