One document matched: draft-ietf-eai-mailinglistbis-04.xml


<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY RFC2369 SYSTEM "reference.RFC.2369.xml">
<!ENTITY RFC2919 SYSTEM "reference.RFC.2919.xml">
<!ENTITY RFC3986 SYSTEM "reference.RFC.3986.xml">
<!ENTITY RFC5228 SYSTEM "reference.RFC.5228.xml">
<!ENTITY RFC6068 SYSTEM "reference.RFC.6068.xml">
<!ENTITY RFC6409 SYSTEM "reference.RFC.6409.xml">
<!ENTITY RFC6530 SYSTEM "reference.RFC.6530.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<?rfc toc="yes" ?>
<?rfc tocindent="yes" ?>
<?rfc tocdepth="2" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes"?>
<?rfc iprnotified="no" ?>

<rfc category="info" docName="draft-ietf-eai-mailinglistbis-04" ipr="trust200902">
  <front>
    <title>Mailing Lists and non-ASCII Addresses</title>
    
    <author fullname="John Levine" initials="J." surname="Levine">
      <organization>Taughannock Networks</organization>

      <address>
        <postal>
          <street>PO Box 727</street>
          <city>Trumansburg</city>
          <code>14886</code>
          <region>NY</region>
        </postal>
        <phone>+1 831 480 2300</phone>
        <email>standards@taugh.com</email>
        <uri>http://jl.ly</uri>
      </address>
    </author>

    <author fullname="Randall Gellens" initials="R." surname="Gellens">
      <organization>Qualcomm Incorporated</organization>

      <address>
        <postal>
          <street>5775 Morehouse Drive</street>
          <city>San Diego</city>
          <code>92121</code>
          <region>CA</region>
        </postal>
        <email>rg+ietf@qualcomm.com</email>
      </address>
    </author>

    <date month="July" year="2012" />

    <area>Applications</area>
    <workgroup>EAI</workgroup>

    <keyword>Mail</keyword>
    <keyword>internationalization</keyword>
    <keyword>mailing lists</keyword>

    <abstract>
      <t>   This document describes considerations for mailing lists with the 
	 introduction of non-ASCII UTF-8 email addresses.  
	 It outlines some possible scenarios for handling lists
	 with mixtures of non-ASCII and traditional addresses, but
	 does not specify protocol changes or offer implementation or
	 deployment advice.</t>

      <t><spanx style="strong">NOTE TO REVIEWERS: Missing or odd-looking
	   references between sections are due to bugs in xml2rfc.  The
	   XML is OK, and the HTML output looks reasonable.</spanx></t>

    </abstract>
  </front>
  
  <middle>
     <section title="Introduction">

	<t>This document describes considerations for mailing lists with
	   the introduction of non-ASCII UTF-8 email addresses. The usage
	   of such addresses is described in <xref target='RFC6530' />.
	</t>

	<t> Mailing lists are an important part of email usage and
	   collaborative communications.  The introduction of internationalized
	   email addresses affects mailing lists in three main areas: (1)
	   transport (receiving and sending messages); (2) message headers of
	   received and retransmitted messages; and (3) mailing list operational
	   policies. </t>

	<t>A mailing list is a mechanism that distributes
	   a message to multiple recipients when the originator sends it to a single address.
	   An agent, usually software rather than
	   a person, at that single address receives the message and
	   then causes the message to be redistributed to a list of recipients.
	   This agent usually sets the envelope return address (henceforth called the
	   bounce address) of the redistributed
	   message to a different address from that of the original message.
	   Using a different bounce address directs error
	   and other automatically generated messages to an error handling
	   address associated with the mailing list.  This sends error
	   and other automatic messages to the list agent, which can often
	   do something useful with them, rather than to the original sender, who typically
	   doesn't control the list and hence can't do anything about them.</t>
   
	<t>In most cases, the 
	   mailing list agent redistributes a received message to its 
	   subscribers as a new message, that is, conceptually it uses
	   <xref target="RFC6409">message submission</xref> (as did the sender of the original message). 
	   The exception, where the mailing list is not managed by a separate agent that 
	   receives and redistributes messages in separate transactions, but is 
	   implemented by an expansion step within an SMTP transaction where one local 
	   address expands to multiple local or non-local addresses, is not
	   addressed by this document.</t>
    
	<section title="Mailing list header additions and modifications">
	<t>Some list agents alter message header fields,
	   while others do not.  A number of standardized list-related header
	   fields have been defined, and many lists add one or more of these
	   headers.  Separate from these standardized list-specific header
	   fields, and despite a history of interoperability problems from doing
	   so, some lists alter or add header fields in an attempt to control
	   where replies are sent.  Such lists typically add or replace the
	   "Reply-To" field and some add or replace the "Sender" field.
	   Some lists alter or replace other fields, including "From".</t>
   
	<t>Among these list-specific header fields are those specified
	   in <xref target="RFC2369">RFCs 2369</xref> and <xref target="RFC2919">2919</xref>.
	   For more information, see <xref target="listheaders" />.</t>
	</section>
	<section title="Non-ASCII email addresses">

	<t>While the mail transport protocol is the same for
	      regular email recipients and mailing list recipients, list agents
	      have special considerations with non-ASCII email addresses
	      because they retransmit messages composed by other agents to
	      potentially many recipients. </t>
	      
	   <t>There are considerations for non-ASCII email addresses in
	      the envelope as well as in header fields of redistributed messages.
	      In particular, a message with non-ASCII addresses in the headers or envelope
	      cannot be sent to non-SMTPUTF8 recipients.</t>
   
	   <t>With mailing lists, there are two different types of
	      considerations: first, the purely technical ones involving message
	      handling, error cases, and the like, and second, those
	      that arise from the fact that humans use mailing lists to communicate.
	      As an example of the first, list agents might choose to reject all
	      messages from non-ASCII addresses if they are unprepared to
	      handle SMTPUTF8 mail.  As an
	      example of the second, a user who sends a message to a list often is
	      unaware of the list membership.  In particular, the user often doesn't
	      know if the members are SMTPUTF8 mail users or not, and often neither the
	      original sender nor the recipients personally know each other.  As a
	      consequence of this, remedies that may be readily available for
	      one-to-one communication might not be appropriate when dealing with
	      mailing lists.  For example, if a user sends a message which is
	      undeliverable, normally the telephone, instant messaging, or other
	      forms of communication are available to obtain a working address.
	      With mailing lists, the users may not have any recourse.  Of course,
	      with mailing lists, the original sender usually does not know which
	      list members successfully received a message, or if it was
	      undeliverable to some.</t>
   
	   <t>Conceptually, a mailing list's internationalization can
	      be divided into three capabilities: First, does the list have a non-ASCII
	      submission address?  Second, does the list agent accept
	      subscriptions for addresses containing non-ASCII characters?
	      And third, does the list agent accept
	      messages that require SMTPUTF8 capabilities?</t>
	   
	   <t>If a list has subscribers with ASCII addresses, those subscribers
	      might or might not be able to accept SMTPUTF8 messages.</t>

	</section>
     </section>

     <section anchor="Scenarios" title ="Scenarios Involving Mailing Lists">
    
	<t> Generally (and exclusively within the scope of this document),
	   an original message is sent to a mailing list as a completely separate
	   and independent transaction from the list agent sending the
	   retransmitted message to one or more list recipients.  In both cases,
	   the message might be addressed only to the list address, or
	   might have recipients in addition to the list.
	   Furthermore, the list agent
	   might choose to send the retransmitted message to each list recipient
	   in a separate message submission transaction, or might
	   choose to include multiple recipients per transaction.  Often,
	   list agents are constructed to work in cooperation with, rather than
	   include the functionality of, a message submission server, and hence
	   the list transmits to a single submission server one copy of the
	   retransmitted message.  The submission server then decides which recipients to
	   include in which transaction.</t> 

	<section title="Fully SMTPUTF8 lists">
	   <t>Some lists may wish to be fully SMTPUTF8.
	      That is, all subscribers are expected to be able to receive SMTPUTF8 mail.
	      For list hygiene reasons, such a list would probably want to prevent
	      subscriptions from addresses that are unable to receive SMTPUTF8 mail.
	      If a putative subscriber has a non-ASCII address, it must be able
	      to receive SMTPUTF8 mail, but there is no way to tell whether a subscriber
	      with an ASCII address can receive SMTPUTF8 mail short of sending an SMTPUTF8 probe
	      or confirmation message and somehow finding out whether it was delivered,
	      e.g., if the user clicked a link in the confirmation message.
	   </t>
	</section>

	<section title="Mixed SMTPUTF8 and ASCII lists">
	   <t>Other lists may wish to handle a mixture of SMTPUTF8 and ASCII subscribers,
	      either as a transitional measure as subscribers upgrade to SMTPUTF8-capable
	      mail software, or as an ongoing feature. While it is not possible in
	      general to downgrade SMTPUTF8 mail to ASCII mail, list software might divide
	      the recipients into two sets, SMTPUTF8 and ASCII recipients, and create a downgraded
	      version of SMTPUTF8 list messages to send to ASCII recipients.
	      See <xref target='dghead' /> and <xref target='dgsub' />.
	      
	   </t>
	   <t>To determine which set an address belongs in, list software might make the
	      conservative assumption that ASCII addresses get ASCII messages, it might
	      try to probe the address with an SMTPUTF8 test message, or it might let the
	      subscriber set the message format manually, similar to the way that some
	      lists now let subscribers choose between plain text and HTML mail, or
	      individual messages and a daily digest.</t>
	   <t>To determine whether a message needs to be downgraded for ASCII recipients,
	      list software might assume that any message received via an SMTPUTF8 SMTP session
	      is an SMTPUTF8 message, or might examine the headers and body of the
	      message to see whether it needs SMTPUTF8 treatment.
	      Depending on the interface between the list software and the MTA and MDA that handle
	      incoming messages, it may not be able to tell the type of session
	      for incoming messages.</t>
	</section>
	    
	<section title="SMTP issues">
	   <t>Mailing list software usually changes the envelope addresses on each message.
	      The bounce address is set to an address that will return bounces to the list agent,
	      and the recipient addresses are set to the subscribers of the list. For some lists,
	      all messages to a list get the same bounce address. For others,
	      list software may create a
	      bounce address per recipient, or a unique bounce address
	      per message per recipient, bounce management techniques known as
	      <xref target='VERP'>Variable Envelope Return Path or VERP</xref>.</t>
	   <t> The bounce address for a list typically includes the name of the list, so a list with
	      a non-ASCII name will have a non-ASCII bounce address.
	      Given the unknown paths that bounce
	      messages might take, list software might instead use an ASCII bounce address to
	      make it more likely that bounces can be delivered back to the list agent. Similarly,
	      a VERP address for each subscriber typically embeds a version of the subscriber's
	      address so the VERP bounce address for a non-ASCII subscriber address
	      will be a non-ASCII address. For the same reason, the list software might
	      use ASCII bounce addresses that encode the recipient's identity in some other way.
	   </t>
	</section>

     </section>

     <section title="List headers" anchor="listheaders">
	<t>List agents typically adds list-specific headers to each message before resending
	   the message to list recipients.</t>

	<section title="SMTPUTF8 list headers">
	   <t>The list headers in <xref target="RFC2369">RFCs 2369</xref> and <xref target="RFC2919">2919</xref>
	      were all specified before SMTPUTF8 mail existed and their definitions do
	      not address where non-ASCII characters might appear.
	      These include, for example:</t>

	   <figure><artwork>
List-Id: List Header Mailing List 
   <list-header.example.com>
List-Help: 
   <mailto:list@example.com?subject=help> 
List-Unsubscribe: 
   <mailto:list@example.com?subject=unsubscribe>
List-Subscribe: 
   <mailto:list@example.com?subject=subscribe>
List-Post: 
   <mailto:list@example.com>
List-Owner: 
   <mailto:listmom@example.com> (Contact Person for Help)
 
List-Archive: <mailto:archive@example.com?subject=index%20list>
	      </artwork></figure>
    
	   <t>As described in <xref target="RFC2369" />, "The contents of the list header fields
	      mostly consist of angle-bracket ('<', '>') enclosed URLs, with
	      internal whitespace being ignored." <xref target="RFC2919" /> specifies
	      that "The list identifier will, in most cases, appear like a host
	      name in a domain of the list owner."
	      Since these headers were defined in the context of ASCII mail,
	      these headers permit only ASCII text including in the URLs.
	   </t>
	   <t>    
	      The most commonly-used URI schemes in List-* headers tend to be http and 
	      <xref target='RFC6068'>mailto</xref>, although they sometimes include https and ftp,
	      and in principle can contain any valid URI.
	   </t>
	   <t>
	      Even if a scheme permits an internationalized form, it
	      should use a pure ASCII form of the URI described in
	      <xref target ="RFC3986" />. Future work may extend
	      these header fields or define replacements to directly
	      support unencoded non-ASCII outside the ASCII repertoire in these
	      and other header fields, but in the absence of such extension
	      or replacement, non-ASCII characters can only be included by
	      encoding them as ASCII.
	      </t>
	      <t>

		 The encoding technique specified in
		 <xref target="RFC3986" /> is to use a
		 pair of hex digits preceded by a percent sign, but percent signs have
		 been used informally in mail addresses to do source routing.  Although
		 few mail systems still permit source routing, a lot of mail software
		 still forbids or escapes characters formerly used for source routing,
		 which can lead to unfortunate interactions with percent-encoded URIs
		 or any URI that includes one of those characters.
		 If a program interpreting a mailto: URI knew that the MUA in use
		 were able to handle non-ASCII data, the program could pass the URI in
		 unencoded non-ASCII,
		 avoiding problems with misinterpreted percent signs, but at this point
		 there is no standard or even informal way for MUAs to signal SMTPUTF8
		 capabilities.  Also, note that whether internationalized domain names
		 should be percent-encoded or puny-coded depends on the context in which
		 they occur.

	      </t>
	   <t>
	      The List-ID header field uniquely identifies a list.  The 
	      intent is that the value of this header remain constant, even if the 
	      machine or system used to operate or host the list changes.  
	      This 
	      header field is often used in various filters and tests, such as 
	      client-side filters, <xref target='RFC5228'>Sieve filters</xref>, and so forth.
	      If the definition of a List-ID header field were to be extended to allow non-ASCII text,
	      filters and 
	      tests might not properly compare encoded and unencoded versions
	      of a non-ASCII value.  In addition to these comparison considerations, 
	      it is
	      generally desirable that this header field contain something 
	      meaningful that users can type in.  However, ASCII encodings of 
	      non-ASCII characters are unlikely to be meaningful to users or easy 
	      for them to accurately type.  
	   </t>
	</section>
      
	<section title="Downgrading list headers" anchor='dghead'>
	   <t>
	      If list software prepares a downgraded version of an SMTPUTF8 message,
	      all the List-* headers must be downgraded.
	      In particular, if a List-* header contains a non-ASCII 
	      mailto (even encoded in ASCII), it may be advisable to edit the 
	      header to remove the non-ASCII address, or replace it with an equivalent
	      ASCII address if one is known to the list software.
	      Otherwise, a client might run 
	      into trouble if the decoded mailto results in a non-ASCII address.
	      If a header that contains a mailto URL is downgraded by percent encoding,
	      some mail software may misinterpret the percent signs as attempted
	      source routing.
	   </t>
	   <t>
	      When downgrading list headers, it may not be possible to produce a
	      downgraded version that is satisfactorily equivalent to the original
	      header.
	      In particular, if a non-ASCII List-ID is downgraded to an ASCII version,
	      software and humans at recipient systems will typically not be able
	      to tell that both refer to the same list.
	   </t>
	   <t>If lists permit mail with multiple MIME parts, some MIME
	      headers in SMTPUTF8 messages may include non-ASCII characters in file names
	      and other descriptive text strings.  Downgrading these strings
	      may lose the sense of the names, break references from other MIME
	      parts (such as HTML IMG references to embedded images) and otherwise
	      damage the mail.</t>
	</section>
      
	<section title="Subscribers' addresses in downgraded headers" anchor='dgsub'>
	   <t>List software typically leaves the original submitter's address in the
	      From: line, both so that recipients can tell who wrote the message, and
	      so that they have a choice of responding to the list or directly to the
	      submitter.
	      If a submitter has a non-ASCII address, there is no way to downgrade the From:
	      header and preserve the address so that ASCII recipients can respond to it,
	      since non-SMTPUTF8 mail systems can't send mail to non-ASCII addresses.
	   </t>
	   <t>
	      Possible work arounds (none implemented that we know of) might include
	      allowing subscribers with non-ASCII addresses to register an alternate ASCII
	      address with the list software, having the list software itself create
	      ASCII forwarding addresses, or just putting the list's address in the From:
	      line and losing the ability to respond directly to the submitter.</t>
	</section>
     </section>

     <section title="Security considerations">
	<t>None beyond what mailing list agents do now.</t>
     </section>
  </middle>

  <back>
     <references title="Normative References">
	&RFC3986;
	&RFC6068;
	&RFC6409;
	&RFC6530;
     </references>
     <references title="Informative References">
	&RFC2919;
	&RFC2369;
	&RFC5228;
	<reference anchor="VERP">
	   <front><title>Variable Envelope Return Path</title>
	      <author fullname='Wikipedia'>
		 <address>
		    <uri>https://secure.wikimedia.org/wikipedia/en/wiki/Variable_envelope_return_path</uri>
		 </address>
	      </author>
	      <date />
	   </front>
	</reference>
     </references>

     <section title="Change Log">
	<t><spanx style="strong">NOTE TO RFC EDITOR: This section may be removed
	   upon publication of this document as an RFC.</spanx></t>
	
	<section title="Change from -03 to -04">
	   <t>Update references</t>
	</section>
	<section title="Change from -02 to -03">
	   <t>Distinguish lists from agents.</t>
	   <t>Change refs to EAI to non-ASCII addresses or SMTPUTF8 mail capabilities.</t>
	   <t>Reference for VERP</t>
	   <t>Clarify discussions of IRIs.</t>
	   <t>CapiTALiZe MaiLto and hTTp ConsistentlY.</t>
	</section>
	<section title="Changes up to -02">
	   <t>Various editorial changes.</t>
	   <t>Refer to RFC 6068.</t>
	</section>
	<section title="Changes up to -00">
	   <t>Rewrite completely.</t>
	</section>
     </section>
  </back>
</rfc>

PAFTECH AB 2003-20262026-04-24 04:21:05