<?xml version='1.0' ?>
<!DOCTYPE rfc SYSTEM 'rfcXXXX.dtd'>
<rfc ipr="trust200902" docName="draft-ietf-marf-redaction-02" category="info">
<?rfc toc="yes" ?>
<?rfc tocompact="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc rfcprocack="yes" ?>
<front>
<title abbrev="Redaction">
Redaction of Potentially Sensitive Data from Mail Abuse Reports
</title>
<author initials="J.D." surname="Falk" fullname="J.D. Falk" role="editor">
<organization>Return Path</organization>
<address>
<postal>
<street>100 Mathilda Place, Suite 100</street>
<city>Sunnyvale</city>
<region>CA</region>
<code>94086</code>
<country>US</country>
</postal>
<email>ietf@cybernothing.org</email>
<uri>http://www.returnpath.net/</uri>
</address>
</author>
<author initials="M." surname="Kucherawy" fullname="M. Kucherawy" role="editor">
<organization>Cloudmark</organization>
<address>
<postal>
<street>128 King St., 2nd Floor</street>
<city>San Francisco</city>
<region>CA</region>
<code>94107</code>
<country>US</country>
</postal>
<email>msk@cloudmark.com</email>
</address>
</author>
<date/>
<area>Applications</area>
<workgroup>MARF Working Group</workgroup>
<keyword>ARF</keyword>
<keyword>MARF</keyword>
<keyword>feedback loop</keyword>
<keyword>spam reporting</keyword>
<abstract>
<t> Email messages often contain information that might be considered
private or sensitive, under either regulation or social norms. When
such a message becomes the subject of a report intended to be
shared with other entities, the report generator may wish to
redact or elide the sensitive portions of the message. This memo
suggests one method for doing so effectively. </t>
<t> [NOTE TO EDITOR: Murray Kucherawy is listed as an
author only to enable him to complete the
publication process on behalf of J.D. Falk.
Please remove Murray from the author list prior
to publication.] </t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t> <xref target="ARF"/> defines a message format for sending reports of
abuse in the messaging infrastructure, with an eye toward automating
both the generating and consumption of those reports. </t>
<t> For privacy reasons, it might be the policy of a report
generator to redact, or obscure, portions of the report that might
identify an end user who caused the report to be generated.
Precisely how this is done is unspecified in <xref target="ARF"/>,
as it will generally be a matter of local policy. That specification
does admonish generators against being overzealous with this
practice, as obscuring too much data makes the report non-actionable. </t>
<t> Previous redaction practices, such as replacing local-parts of
addresses with a uniform string like "xxxxxxxx", often frustrate any
kind of prioritizing or grouping of reports. </t>
<t> Generally, it is assumed that the recipient-identifying fields of a
message, when copied into a report, are to be obscured to protect
the identity of the end user who submitted the complaint about the
message. However, it is also presumed that other data will be left
intact, and that data could theoretically be correlated against log
files or other resources to determine the intended recipient of the
message. </t>
</section>
<section title="Recommended Practice">
<t> To enable correlation of reports that might refer to a common but
anonymous source, the following redaction practice is recommended
(but not required): </t>
<t>
<list style="numbers">
<t> Select an arbitrary string that will be used by an Administrative
Domain (ADMD) that generates reports. This string will not be
changed except according to a key rotation policy or similar.
Call this the "redaction key". </t>
<t> Identify string(s) (such as local-parts of email addresses) in a
message that need to be redacted. Call this the "private data". </t>
<t> Construct a new string that is a copy of the redaction key with
the private data concatenated to it. </t>
<t> Compute a digest of that string with any hashing/digest algorithm
such as one defined in <xref target="FIPS-180-3-2008"/>. </t>
<t> Encode that hash with the base64 algorithm as defined in
<xref target="MIME"/>. </t>
<t> Replace the private data with the encoded hash when generating
the report. </t>
</list>
</t>
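<t> The steps above can be sketched as follows. This is a minimal,
non-normative illustration in Python. The function name redact(), its
parameter names, the default choice of SHA-1, and the use of UTF-8 to
turn the strings into bytes are all assumptions made for this example;
this memo does not specify any of them.
<figure><artwork><![CDATA[
```python
import base64
import hashlib


def redact(redaction_key, private_data, algorithm="sha1"):
    """Return the base64-encoded digest of redaction_key + private_data.

    The names, the default algorithm, and the UTF-8 encoding are
    illustrative assumptions, not requirements of this memo.
    """
    # Step 3: copy of the redaction key with the private data appended.
    material = (redaction_key + private_data).encode("utf-8")
    # Step 4: compute a digest of that string.
    digest = hashlib.new(algorithm, material).digest()
    # Step 5: encode the raw digest with base64.
    return base64.b64encode(digest).decode("ascii")
```
]]></artwork></figure>
The value returned by redact() is what would replace the private data
in step 6. Because the redaction key is fixed between rotations, the
same private data always yields the same replacement string. </t>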
<t> This has the effect of obscuring the data in an irreversible way but
still allows the report recipient to observe that numerous reports
are about one particular end user. Such detection enables the
receiver to prioritize its reactions based on problems that appear to
be focused on specific end users that may be under attack. </t>
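<t> As a hedged illustration of the receiving side, a report recipient
could count how many reports carry the same redacted local-part and
look at the most frequent ones first. The function name and the shape
of the input (a list of already-redacted recipient addresses) are
assumptions made for this sketch.
<figure><artwork><![CDATA[
```python
from collections import Counter


def rank_redacted_recipients(redacted_addresses):
    """Return (token, report_count) pairs, most-reported token first.

    Each input is assumed to look like "TOKEN@domain", where TOKEN is
    the encoded hash that replaced the original local-part.
    """
    # Split at the last "@" so only the redacted local-part is counted.
    tokens = (addr.rpartition("@")[0] for addr in redacted_addresses)
    return Counter(tokens).most_common()
```
]]></artwork></figure>
The recipient never learns who the end user is; it only sees that
several reports share one opaque token. </t>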
</section>
<section title="Security Considerations">
<section title="General">
<t> General security issues with respect to these
reports are found in <xref target="ARF"/>. </t>
</section>
<section title="Digest Collisions">
<t> Message digest collisions are a well-understood
issue. Their application here involves a
report receiver improperly concluding that two
pieces of redacted information were originally the
same when in fact they are not. This can lead
to a denial of service, where the inadvertently
improper application of complaint data causes
unjustified corrective action. Such cases are
sufficiently unlikely as to be of little
concern. </t>
</section>
<section title="Information Not Redacted">
<t> Although the identity of a report generator can
be redacted using this mechanism, other properties
of a message (such as the Message-ID field)
that are not redacted could be used to recover
the original data. It is incumbent on the
report generator to anticipate and redact or
otherwise obscure such data, or accept that such
recovery is possible. </t>
<t> Section 8 of <xref target="ARF"/> covers
topics related to establishment of bilateral
agreements between report producers and consumers.
The issues raised here are also things to be
considered when establishing such agreements. </t>
</section>
</section>
<section title="Privacy Considerations">
<t> While the method of redaction described in this document may
reduce the likelihood of some types of private data
leaking between Administrative Domains, it is extremely unlikely
that report generation software could ever be created to recognize
all of the different ways that private information may be expressed
through human-written language. If further protections are
required, implementers may wish to consider establishing
some sort of out-of-band arrangements between the
relevant entities to contain private data as much as possible. </t>
</section>
<section title="IANA Considerations">
<t> This memo includes no request to IANA. </t>
<t> [RFC Editor note: This section may be removed prior
to publication.] </t>
</section>
</middle>
<back>
<references title="Normative References">
<reference anchor='ARF'>
<front>
<title>An Extensible Format for Email Feedback Reports</title>
<author initials='Y.' surname='Shafranovich' fullname='Y. Shafranovich'>
<organization />
</author>
<author initials='J.' surname='Levine' fullname='J. Levine'>
<organization />
</author>
<author initials='M.' surname='Kucherawy' fullname='M. Kucherawy'>
<organization />
</author>
<date year='2010' month='August' />
</front>
<seriesInfo name='RFC' value='5965' />
</reference>
</references>
<references title="Informative References">
<reference anchor='MIME'>
<front>
<title abbrev='Internet Message Bodies'>Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies</title>
<author initials='N.' surname='Freed' fullname='Ned Freed'>
<organization>Innosoft International, Inc.</organization>
</author>
<author initials='N.S.' surname='Borenstein' fullname='Nathaniel S. Borenstein'>
<organization>First Virtual Holdings</organization>
</author>
<date year='1996' month='November' />
</front>
<seriesInfo name='RFC' value='2045' />
</reference>
<reference
anchor="FIPS-180-3-2008">
<front>
<title>Secure Hash Standard</title>
<author
fullname="U.S. Department of Commerce"
surname="U.S. Department of Commerce" />
<date
month="October"
year="2008" />
</front>
<seriesInfo
name="FIPS PUB"
value="180-3" />
</reference>
</references>
<section title="Example">
<t> Assume the following input message:
<figure><artwork>
From: alice@example.com
To: bob@example.net
Subject: Make money fast!
Message-ID: &lt;123456789@mailer.example.com&gt;
Date: Thu, 17 Nov 2011 22:19:40 -0500

Want to make a lot of money really fast? Check it out!
http://www.example.com/scam/0xd0d0cafe
</artwork></figure> </t>
<t> On receipt, bob@example.net reports this message as abusive
through whatever mechanism his email service provider has
established. This causes an <xref target="ARF"/> message to
be generated. However, example.net wishes to obscure Bob's
email address lest it be relayed to the offending agent, which
could lead to more trouble for Bob. </t>
<t> Thus, example.net plans to redact the local-part of the recipient
address in the To: field. It has selected a redaction key of
"potatoes", and the private data in this case is the string
"bob". The concatenation of "potatoesbob" is digested with SHA1
and then base64-encoded to the string
"rZ8cqXWGiKHzhz1MsFRGTysHia4=". </t>
<t> Thus, when constructing the ARF message in response to Bob's
complaint, the following form of the received message is used in
the third part of the ARF report:
<figure><artwork>
From: alice@example.com
To: rZ8cqXWGiKHzhz1MsFRGTysHia4=@example.net
Subject: Make money fast!
Message-ID: &lt;123456789@mailer.example.com&gt;
Date: Thu, 17 Nov 2011 22:19:40 -0500

Want to make a lot of money really fast? Check it out!
http://www.example.com/scam/0xd0d0cafe
</artwork></figure> </t>
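<t> The rewriting of the To: field in this example can be sketched as
follows. This is an illustrative Python fragment only: the function
name redact_local_part(), the UTF-8 encoding of the input, and the
hard-coded use of SHA-1 are assumptions chosen to mirror the example,
and splitting at the last "@" is a simplification that ignores display
names and quoted local-parts.
<figure><artwork><![CDATA[
```python
import base64
import hashlib


def redact_local_part(address, redaction_key):
    """Replace the local-part of a bare address with its encoded hash."""
    # Split at the last "@" so the domain is left untouched.
    local_part, _, domain = address.rpartition("@")
    # Redaction key first, then the private data, as in the example.
    material = (redaction_key + local_part).encode("utf-8")
    digest = hashlib.sha1(material).digest()
    token = base64.b64encode(digest).decode("ascii")
    return token + "@" + domain


# Build the redacted header field for the recipient in the example.
redacted_to = "To: " + redact_local_part("bob@example.net", "potatoes")
```
]]></artwork></figure>
Only the local-part changes; the domain is preserved so the report
still shows which ADMD received the message. </t>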
<t> Note, however, that the redacted information could still
be recovered by agents at example.com by searching their logs for
the original envelope associated with the message by correlating
with the Message-ID contents, which were not redacted here. It
is expected that feedback loops generating such reports involve
senders that have been vetted against such information
leakage. </t>
</section>
<section title="Acknowledgements">
<t> Much of the text in this document was initially moved from
other MARF working group documents, crafted by Murray Kucherawy
with contributions from Monica Chew, Tim Draegen, Michael
Adkins, and myself. Additional feedback was provided by
S. Moonesamy, Alessandro Vesely, and Mykyta Yevstifeyev. </t>
</section>
</back>
</rfc>