<?xml version='1.0' ?>
<!DOCTYPE rfc SYSTEM 'rfcXXXX.dtd'>
<rfc ipr="trust200902" docName="draft-ietf-marf-redaction-03" category="info">
	<?rfc toc="yes" ?>
	<?rfc tocompact="yes" ?>
	<?rfc symrefs="yes" ?>
	<?rfc sortrefs="yes" ?>
	<?rfc compact="yes" ?>
	<?rfc rfcprocack="yes" ?>
	
	<front>
		<title abbrev="Redaction">
			Redaction of Potentially Sensitive Data from Mail Abuse Reports
		</title>
		<author initials="J.D." surname="Falk" fullname="J.D. Falk" role="editor">
			<organization>Return Path</organization>
			<address>
				<postal>
					<street>100 Mathilda Place, Suite 100</street>
					<city>Sunnyvale</city>
					<region>CA</region>
					<code>94086</code>
					<country>US</country>
				</postal>
				<email>ietf@cybernothing.org</email>
				<uri>http://www.returnpath.net/</uri>
			</address>
		</author>
		<author initials="M." surname="Kucherawy" fullname="M. Kucherawy" role="editor">
			<organization>Cloudmark</organization>
			<address>
				<postal>
					<street>128 King St., 2nd Floor</street>
					<city>San Francisco</city>
					<region>CA</region>
					<code>94107</code>
					<country>US</country>
				</postal>
				<email>msk@cloudmark.com</email>
			</address>
		</author>

		<date/>
		
		<area>Applications</area>
		<workgroup>MARF Working Group</workgroup>
		<keyword>ARF</keyword>
		<keyword>MARF</keyword>
		<keyword>feedback loop</keyword>
		<keyword>spam reporting</keyword>
		
		<abstract>
			<t> Email messages often contain information which might be considered
				private or sensitive, per either regulation or social norms.  When
				such a message becomes the subject of a report intended to be
				shared with other entities, the report generator may wish to
				redact or elide the sensitive portions of the message.  This memo
				suggests one method for doing so effectively. </t>

			<t> [NOTE TO EDITOR: Murray Kucherawy is listed as an
			     author only to enable him to complete the
			     publication process on behalf of J.D. Falk.
			     Please remove Murray from the author list prior
			     to publication.] </t>
		</abstract>
	</front>
	
	<middle>
		<section title="Introduction">
			<t> <xref target="ARF"/> defines a message format for sending reports of 
				abuse in the messaging infrastructure, with an eye toward automating 
				both the generation and consumption of those reports. </t>
			
			<t> For privacy reasons, it might be the policy of a report
			    generator to redact, or obscure, portions of the report that might
			    identify an end user that caused the report to be generated.
			    Precisely how this is done is unspecified in <xref target="ARF"/> 
				as it will generally be a matter of local policy.  That specification 
				does admonish generators against being overzealous with this
			    practice, as obscuring too much data makes the report non-actionable. </t>

			<t> Previous redaction practices, such as replacing local-parts of
			    addresses with a uniform string like "xxxxxxxx", often frustrate any
			    kind of prioritizing or grouping of reports. </t>

			<t> Generally, it is assumed that the recipient-identifying fields of a 
				message, when copied into a report, are to be obscured to protect 
				the identity of the end user who submitted the complaint about the 
				message.  However, it is also presumed that other data will be left 
				intact, and that data could theoretically be correlated against log
				files or other resources to determine the intended recipient of the 
				message. </t>
		</section>

		<section title="Recommended Practice">
			<t> To enable correlation of reports that might refer to a common but
			    anonymous source, the following redaction practice is recommended
			    (but not required): </t>
			
			<t>
				<list style="numbers">
					<t> Select an arbitrary string that will be used by an Administrative
						Domain (ADMD) that generates reports.  This string will not be 
						changed except according to a key rotation policy or similar.  
						Call this the "redaction key". </t>
					
					<t> Identify string(s) (such as local-parts of email addresses) in a
						message that need to be redacted.  Call this the "private data". </t>
						
					<t> Construct a new string that is a copy of the redaction key with
						the private data concatenated to it. </t>
						
					<t> Compute a digest of that string with any hashing/digest algorithm
						such as one defined in <xref target="FIPS-180-3-2008"/>. </t>
						
					<t> Encode that hash with the base64 algorithm as defined in 
						<xref target="MIME"/>. </t>
						
					<t> Replace the private data with the encoded hash when generating
						the report. </t>
				</list>
			</t>
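			<t> The steps above can be sketched in Python as follows.  This is
			    a minimal illustration, not a normative implementation: the
			    function name "redact", the UTF-8 encoding of the input, the
			    SHA-256 default, and the sample key are all assumptions made
			    for the example. </t>

			<figure><artwork>
```python
import base64
import hashlib


def redact(redaction_key: str, private_data: str, algorithm: str = "sha256") -> str:
    """Return the base64-encoded digest of redaction_key + private_data."""
    # Step 3: concatenate the private data to the redaction key.
    combined = (redaction_key + private_data).encode("utf-8")
    # Step 4: digest the combined string with the chosen algorithm.
    digest = hashlib.new(algorithm, combined).digest()
    # Step 5: base64-encode the raw digest octets.
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Step 1: hypothetical key, chosen once per ADMD and kept until rotation.
    key = "example-redaction-key"
    # Step 2: "alice" stands in for the local-part to be hidden.
    token = redact(key, "alice")
    # Step 6: the token replaces the private data in the report.
    print(token + "@example.net")
```
			</artwork></figure>

			<t> With SHA-256 the resulting token is 44 characters long; with
			    SHA-1 it is 28.  The same key and private data always yield
			    the same token, which is what makes correlation possible. </t>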
			
			<t> This has the effect of obscuring the data in an irreversible way but
			    still allows the report recipient to observe that numerous reports
			    are about one particular end user.  Such detection enables the
			    receiver to prioritize its reactions based on problems that appear to
			    be focused on specific end users that may be under attack. </t>
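
			<t> On the receiving side, such correlation might look like the
			    following sketch.  The function name and the sample recipient
			    strings are hypothetical; the only assumption about the input
			    is that the redacted token occupies the local-part of the
			    address. </t>

			<figure><artwork>
```python
from collections import Counter


def reports_per_recipient(redacted_recipients):
    """Count reports per redacted local-part; originals are never recovered."""
    # Only the part before the last "@" is the redacted token.
    tokens = [address.rsplit("@", 1)[0] for address in redacted_recipients]
    return Counter(tokens)


if __name__ == "__main__":
    # Hypothetical recipient fields taken from three received reports.
    received = [
        "rZ8cqXWGiKHzhz1MsFRGTysHia4=@example.net",
        "AAAAAAAAAAAAAAAAAAAAAAAAAAA=@example.net",
        "rZ8cqXWGiKHzhz1MsFRGTysHia4=@example.net",
    ]
    # Most frequently reported (still anonymous) recipients come first.
    for token, count in reports_per_recipient(received).most_common():
        print(count, token)
```
			</artwork></figure>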
			
		</section>
		
		<section title="Security Considerations">
		 <section title="General">
			<t> General security issues with respect to these
			    reports are found in <xref target="ARF"/>. </t>
		 </section>

		 <section title="Digest Collisions">
			<t> Message digest collisions are a well-understood
			    issue.  Their application here involves a
			    report receiver improperly concluding that two
			    pieces of redacted information were originally the
			    same when in fact they are not.  This can lead
			    to a denial of service, where the inadvertently
			    improper application of complaint data causes
			    unjustified corrective action.  Such cases are
			    sufficiently unlikely as to be of little
			    concern. </t>
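
			<t> As a rough illustration of how unlikely, the standard
			    birthday approximation can be evaluated as below.  The
			    sketch assumes an ideal digest whose full output is used
			    (no truncation); the function name and the figure of one
			    billion distinct values are illustrative only. </t>

			<figure><artwork>
```python
import math


def collision_probability(distinct_values: int, digest_bits: int) -> float:
    """Approximate chance that any two distinct inputs share a digest."""
    # Number of unordered pairs of distinct inputs.
    pairs = distinct_values * (distinct_values - 1) / 2
    # Birthday approximation: 1 - exp(-pairs / 2**bits).
    # expm1 keeps precision when the result is extremely small.
    return -math.expm1(-pairs / 2.0 ** digest_bits)


if __name__ == "__main__":
    # One billion distinct redacted values under a 160-bit digest (SHA-1).
    print(collision_probability(10**9, 160))
```
			</artwork></figure>

			<t> For those inputs the result is on the order of 10^-31. </t>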
		 </section>

		 <section title="Information Not Redacted">
			<t> Although the identity of a report generator can
			    be redacted using this mechanism, other properties
			    of a message (such as the Message-ID field)
			    that are not redacted could be used to recover
			    the original data.  It is incumbent on the
			    report generator to anticipate and redact or
			    otherwise obscure such data, or accept that such
			    recovery is possible. </t>

			<t> <xref target="FBL-BCP"/> and Section 8 of
			    <xref target="ARF"/> discuss topics related to
			    establishment of bilateral agreements between
			    report producers and consumers.  The issues
			    raised here are also things to be considered when
			    establishing such agreements. </t>
		 </section>
		</section>

		<section title="Privacy Considerations">
			<t> While the method of redaction described in this document may 
				reduce the likelihood of some types of private data
				leaking between Administrative Domains, it is extremely unlikely
				that report generation software could ever be created to recognize
				all of the different ways that private information may be expressed
				through human written language.  If further protections are
				required, implementers may wish to consider establishing 
				some sort of out-of-band arrangements between the 
				relevant entities to contain private data as much as possible. </t>
		</section>
		
		<section title="IANA Considerations">
			<t> This memo includes no request to IANA. </t>
			<t> [RFC Editor note: This section may be removed prior
			    to publication.] </t>
		</section>
	</middle>
	
	<back>
		<references title="Normative References">
			<reference anchor='ARF'>
				<front>
					<title>An Extensible Format for Email Feedback Reports</title>
					<author initials='Y.' surname='Shafranovich' fullname='Y. Shafranovich'>
						<organization />
					</author>
					<author initials='J.' surname='Levine' fullname='J. Levine'>
						<organization />
					</author>
					<author initials='M.' surname='Kucherawy' fullname='M. Kucherawy'>
						<organization />
					</author>
					<date year='2010' month='August' />
				</front>
				<seriesInfo name='RFC' value='5965' />
			</reference>
		</references> 

		<references title="Informative References">
			<reference anchor='FBL-BCP'>
				<front>
					<title abbrev='FBL Recommendations'>
						Complaint Feedback Loop
						Operational Recommendations
					</title>

					<author initials='J.D.' surname='Falk' fullname='J.D. Falk'>
						<organization>Messaging Anti-Abuse Working Group</organization>
					</author>
					<date year='2011' month='November' />
				</front>
				<seriesInfo name='RFC' value='6449' />
			</reference>

			<reference anchor='MIME'>
				<front>
					<title abbrev='Internet Message Bodies'>Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies</title>
					<author initials='N.' surname='Freed' fullname='Ned Freed'>
						<organization>Innosoft International, Inc.</organization>
					</author>
					<author initials='N.S.' surname='Borenstein' fullname='Nathaniel S. Borenstein'>
						<organization>First Virtual Holdings</organization>
					</author>
					<date year='1996' month='November' />
				</front>
				<seriesInfo name='RFC' value='2045' />
			</reference>

			
         <reference
            anchor="FIPS-180-3-2008">
            <front>
               <title>Secure Hash Standard</title>

               <author
                  fullname="U.S. Department of Commerce"
                  surname="U.S. Department of Commerce" />
               <date
                  month="October"
                  year="2008" />
            </front>
            <seriesInfo
               name="FIPS PUB"
               value="180-3" />
         </reference>
		</references>

		<section title="Example">
			<t> Assume the following input message:

	<figure><artwork>
  From: alice@example.com
  To: bob@example.net
  Subject: Make money fast!
  Message-ID: &lt;123456789@mailer.example.com&gt;
  Date: Thu, 17 Nov 2011 22:19:40 -0500

  Want to make a lot of money really fast?  Check it out!
  http://www.example.com/scam/0xd0d0cafe
	</artwork></figure> </t>

	<t> On receipt, bob@example.net reports this message as abusive
	    through whatever mechanism his mailbox provider has
	    established.  This causes an <xref target="ARF"/> message to
	    be generated.  However, example.net wishes to obscure Bob's
	    email address lest it be relayed to the offending agent, which
	    could lead to more trouble for Bob. </t>

	<t> Thus, example.net plans to redact the local-part of the recipient
	    address in the To: field.  It has selected a redaction key of
	    "potatoes", and the private data in this case is the string
	    "bob".  The concatenation, "potatoesbob", is digested with SHA-1
	    and then base64-encoded to the string
	    "rZ8cqXWGiKHzhz1MsFRGTysHia4=". </t>

	<t> Thus, when constructing the ARF message in response to Bob's
	    complaint, the following form of the received message is used in
	    the third part of the ARF report:

	<figure><artwork>
  From: alice@example.com
  To: rZ8cqXWGiKHzhz1MsFRGTysHia4=@example.net
  Subject: Make money fast!
  Message-ID: &lt;123456789@mailer.example.com&gt;
  Date: Thu, 17 Nov 2011 22:19:40 -0500

  Want to make a lot of money really fast?  Check it out!
  http://www.example.com/scam/0xd0d0cafe
	</artwork></figure> </t>
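
	<t> A redacted To: field of this form could be produced along the
	    following lines.  This is a sketch only: the function name is
	    hypothetical, and it assumes the address contains a single
	    mailbox with an ASCII local-part. </t>

	<figure><artwork>
```python
import base64
import hashlib


def redact_local_part(address: str, redaction_key: str) -> str:
    """Replace the local-part of an address with its keyed SHA-1 token."""
    # Split on the last "@" so that only the local-part is treated as private.
    local_part, domain = address.rsplit("@", 1)
    # Digest the redaction key with the private data concatenated to it.
    digest = hashlib.sha1((redaction_key + local_part).encode("ascii")).digest()
    # Base64-encode the digest and reattach the unredacted domain.
    return base64.b64encode(digest).decode("ascii") + "@" + domain


if __name__ == "__main__":
    print("To: " + redact_local_part("bob@example.net", "potatoes"))
```
	</artwork></figure>

	<t> The local-part it prints is a 28-character base64 string, since
	    a SHA-1 digest is 20 octets long. </t>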

	<t> Note, however, that it is possible the redacted information can
	    be recovered by agents at example.com by searching their logs for
	    the original envelope associated with the message by correlating
	    with the Message-ID contents, which were not redacted here.  It
	    is expected that feedback loops generating such reports involve
	    senders that have been vetted against such information
	    leakage. </t>
		</section>

		<section title="Acknowledgements">
			<t> Much of the text in this document was initially moved from 
				other MARF working group documents, crafted by Murray Kucherawy
				with contributions from Monica Chew, Tim Draegen, Michael
				Adkins, and myself.  Additional feedback was provided by 
			        S. Moonesamy, Alessandro Vesely, and Mykyta Yevstifeyev. </t>
		</section>
		
	</back>
</rfc>
