Internet Draft: The Text/Plain Format Parameter      R. Gellens, Editor
Document: draft-gellens-format-03.txt                          Qualcomm
Expires: 24 August 1999                                24 February 1999
The Text/Plain Format Parameter
Status of this Memo:
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
<http://www.ietf.org/ietf/1id-abstracts.txt>
The list of Internet-Draft Shadow Directories can be accessed at
<http://www.ietf.org/shadow.html>.
A version of this draft document is intended for submission to the
RFC editor as a Proposed Standard for the Internet Community.
Discussion and suggestions for improvement are requested.
Comments:
Private comments should be sent to the author. Public comments may
be sent to the IETF 822 mailing list, <ietf-822@imc.org>. To
subscribe, send a message to <ietf-822-request@imc.org> with the
word SUBSCRIBE as the body of the message. Archives for the list
are at <http://www.imc.org/ietf-822/>.
Copyright Notice
Copyright (C) The Internet Society 1999. All Rights Reserved.
Gellens                     [Page 1]               Expires August 1999
Internet Draft            The Format Parameter           February 1999
Table of Contents
1. Changes in this Version
2. Abstract
3. Conventions Used in this Document
4. The Problem
  4.1. Paragraph Text
  4.2. Embarrassing Line Wrap
  4.3. New Media Types
5. The Format Parameter to the Text/Plain Media Type
  5.1. Interpreting Format=Flowed
  5.2. Generating Format=Flowed
  5.3. Usenet Signature Convention
  5.4. Space-Stuffing
  5.5. Quoting
  5.6. Digital Signatures and Encryption
  5.7. Line Analysis Table
  5.8. Examples
6. ABNF
7. Failure Modes
  7.1. Trailing White Space Corruption
8. Security Considerations
9. IANA Considerations
10. Internationalization Considerations
11. Acknowledgments
12. References
13. Editor's Address
14. Full Copyright Statement
1. Changes in this Version
- The quote indicator is now ">"
- Lines now space-stuffed
- Flowed lines now end in one or more spaces
- Reconciled text, examples, and ABNF for paragraph definition
- Clarified Quoted-Printable instructions
- Added "Interpreting Format=Flowed" section
- Clarified section on Quoting
- Added section on space-stuffing
- Added examples to section on "Embarrassing Line Wrap"
- Modified "Generating Format=Flowed" and ABNF in light of other
changes
- Added IANA and Internationalization sections
- Updated Acknowledgments and Internet Draft boiler plate
2. Abstract
Interoperability problems have been observed with erroneous
labelling of paragraph text as Text/Plain, and with various forms of
"embarrassing line wrap." (See section 4.)
Attempts to deploy new media types, such as Text/Enriched [RICH] and
Text/HTML [HTML] have suffered from a lack of backwards
compatibility and an often hostile user reaction at the receiving
end.
What is required is a format which is in all significant ways
Text/Plain, and therefore is quite suitable for display as
Text/Plain, and yet allows the sender to express to the receiver
which lines can be considered a logical paragraph, and thus flowed
(wrapped and joined) as appropriate.
This memo proposes a new parameter to be used with Text/Plain, and,
in the presence of this parameter, the use of trailing whitespace to
indicate flowed lines. This results in an encoding which appears as
normal Text/Plain in older implementations, since it is in fact
normal Text/Plain.
3. Conventions Used in this Document
The key words "REQUIRED", "MUST", "MUST NOT", "SHOULD", "SHOULD
NOT", and "MAY" in this document are to be interpreted as described
in "Key words for use in RFCs to Indicate Requirement Levels"
[KEYWORDS].
4. The Problem
The Text/Plain media type is the lowest common denominator of
Internet email, with lines of no more than 997 characters (by
convention usually no more than 80), and where the CRLF sequence
represents a line break [MIME-IMT].
Text/Plain is usually displayed as preformatted text, often in a
fixed font. That is, the characters start at the left margin of the
display window, and advance to the right until a CRLF sequence is
seen, at which point a new line is started, again at the left
margin. When a line length exceeds the display window, some clients
will wrap the line, while others invoke a horizontal scroll bar.
Text which meets this description is defined by this memo as
"fixed".
Some interoperability problems have been observed with this media
type:
4.1. Paragraph Text
Many modern programs use a proportional-spaced font and CRLF to
represent paragraph breaks. Line breaks are "soft", occurring as
needed on display. That is, characters are grouped into a paragraph
until a CRLF sequence is seen, at which point a new paragraph is
started. Each paragraph is displayed, starting at the left margin
(or paragraph indent), and continuing to the right until a word is
encountered which does not fit in the remaining display width. This
word is displayed at the left margin of the next line. This
continues until the paragraph ends (a CRLF is seen). Extra vertical
space is left between paragraphs.
Text which meets this description is defined by this memo as
"flowed".
Numerous software products erroneously label this media type as
Text/Plain, resulting in much user discomfort.
4.2. Embarrassing Line Wrap
As Text/Plain messages get quoted in replies or forwarded messages,
the length of each line gradually increases, resulting in
"embarrassing line wrap." This results in text which is at best hard
to read, and often confuses attributions.
Example:
>>>>>>This is a comment from the first message to show a
>quoting example.
>>>>>This is a comment from the second message to show a
>quoting example.
>>>>This is a comment from the third message.
>>>This is a comment from the fourth message.
It can be confusing to assign attribution to lines 2 and 4 above.
In addition, as devices with display widths smaller than 80
characters become more popular, embarrassing line wrap has become
even more prevalent, even with unquoted text.
Example:
This is paragraph text that is
meant to be flowed across
several lines.
However, the sending mailer is
converting it to fixed text at
a width of 72
characters, which causes it to
look like this when shown on a
PDA with only
30 character lines.
4.3. New Media Types
Attempts to deploy new media types, such as Text/Enriched [RICH] and
Text/HTML [HTML] have suffered from a lack of backwards
compatibility and an often hostile user reaction at the receiving
end.
In particular, Text/Enriched requires that open angle brackets ("<")
and hard line breaks be doubled, with resulting user unhappiness
when viewed as Text/Plain. Text/HTML requires even more alteration
of text, with a corresponding increase in user complaints.
A proposal to define a new media type to explicitly represent the
paragraph form suffered from a lack of interoperability with
currently deployed software. Some programs treat unknown subtypes
of TEXT as an attachment.
What is desired is a format which is in all significant ways
Text/Plain, and therefore is quite suitable for display as
Text/Plain, and yet allows the sender to express to the receiver
which lines can be considered a logical paragraph, and thus flowed
(wrapped and joined) as appropriate.
5. The Format Parameter to the Text/Plain Media Type
This document defines a new MIME parameter for use with Text/Plain:
Name: Format
Value: Fixed, Flowed
(Neither the parameter name nor its value is case sensitive.)
If not specified, a value of Fixed is assumed. The semantics of the
Fixed value are those usually associated with Text/Plain [MIME-IMT].
A value of Flowed indicates that the definition of flowed text (as
specified in this memo) was used on generation, and MAY be used on
reception.
This section discusses flowed text; section 6 provides a formal
definition.
Because flowed lines are all-but-indistinguishable from fixed lines,
currently deployed software treats flowed lines as normal Text/Plain
(which is what they are). Thus, no interoperability problems are
expected.
Note that this memo describes an on-the-wire format. It does not
address formats for local file storage.
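As an illustration only, a receiving agent might read the parameter as
sketched below. This is a minimal sketch using Python's standard email
package; the function name text_plain_format is hypothetical, and
treating an unrecognized value like the default is an assumption, not
something this memo specifies.

```python
from email import message_from_string


def text_plain_format(raw_message: str) -> str:
    """Return "flowed" or "fixed" based on the Content-Type header."""
    msg = message_from_string(raw_message)
    # get_param matches the parameter name case-insensitively and
    # returns the fallback when the header or the parameter is absent.
    value = msg.get_param("format", "fixed")
    value = str(value).strip().lower()  # the value is case-insensitive too
    # Assumption: unrecognized values are treated like the default.
    return value if value in ("fixed", "flowed") else "fixed"
```

A message with no Format parameter yields "fixed", matching the
default described above.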
5.1. Interpreting Format=Flowed
If the first character of a line is a quote mark (">"), the line is
considered to be quoted (see section 5.5). Logically, all quote
marks are counted and deleted, resulting in a line with a non-zero
quote depth, and content. (The agent is of course free to display
the content with quote marks or excerpt bars or anything else.)
Logically, this test for quoted lines is done before any other tests
(that is, space-stuffed and flowed).
If the first character of a line is a space, the line has been
space-stuffed (see section 5.4). Logically, this leading space is
deleted before examining the line further (that is, before checking
for flowed).
If the line ends in one or more spaces, the line is flowed.
Otherwise it is fixed.
A series of one or more flowed lines followed by one fixed line is
considered a paragraph, and MAY be flowed (wrapped and unwrapped) as
appropriate on display and in the construction of new messages (see
section 5.5).
A line consisting of one or more spaces (after deleting a stuffed
space) is considered a flowed line.
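The order of the three tests (quoted, then space-stuffed, then flowed)
can be sketched as follows. This is a non-normative sketch; the names
ParsedLine and parse_line are hypothetical, the line is assumed to have
had its CRLF removed already, and the signature separator of section
5.3 is checked before the stuffed space is deleted, which is one
reading of the ABNF in section 6.

```python
from typing import NamedTuple


class ParsedLine(NamedTuple):
    quote_depth: int
    content: str
    flowed: bool


def parse_line(line: str) -> ParsedLine:
    """Analyse one Format=Flowed line whose CRLF has been removed."""
    # 1. Quoted? Count and delete the leading ">" characters.
    depth = len(line) - len(line.lstrip(">"))
    rest = line[depth:]
    # Special case (section 5.3): "-- " is never considered flowed.
    if rest == "-- ":
        return ParsedLine(depth, rest, False)
    # 2. Space-stuffed? Delete exactly one leading space.
    if rest.startswith(" "):
        rest = rest[1:]
    # 3. Flowed? The line ends in one or more spaces.
    return ParsedLine(depth, rest, rest.endswith(" "))
```

The content keeps its trailing space, so consecutive flowed lines can
be joined by simple concatenation.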
5.2. Generating Format=Flowed
When generating Format=Flowed text, lines SHOULD be shorter than 80
characters. As suggested values, any paragraph longer than 79
characters in total length could be wrapped using lines of 72 or
fewer characters. While the specific line length used is a matter
of aesthetics and preference, longer lines are more likely to
require rewrapping and to encounter difficulties with older mailers.
It has been suggested that 66 character lines are the most readable.
When creating flowed text, the generating agent wraps, that is,
inserts 'soft' line breaks (SP CRLF sequences) as needed. Soft line
breaks are added between words.
A generating agent SHOULD:
1. Ensure all lines (fixed and flowed) are less than 80
characters in length, not counting the CRLF.
2. Trim spaces before user-inserted hard line breaks.
3. Space-stuff lines which start with a space, "From ", or
">".
In order to create messages which do not require space-stuffing, and
are thus more aesthetically pleasing when viewed as Format=Fixed, a
generating agent MAY avoid wrapping immediately before ">", "From ",
or space.
(See sections 5.4 and 5.5 for more information on space-stuffing and
quoting, respectively.)
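The generation rules above can be sketched for a single unquoted
paragraph as follows. This is an illustrative sketch, not a reference
implementation: the name encode_paragraph and the default width of 72
are assumptions, runs of whitespace in the input are collapsed to one
space, and a single word longer than the width is left on a line of its
own rather than broken.

```python
from typing import List


def encode_paragraph(text: str, width: int = 72) -> List[str]:
    """Wrap one unquoted paragraph into Format=Flowed lines (no CRLF)."""
    lines: List[str] = []
    current = ""
    for word in text.split():
        # Room is needed for the joining space and for the trailing
        # space that marks a soft line break.
        if current and len(current) + 1 + len(word) + 1 > width:
            lines.append(current + " ")  # soft line break: SP before CRLF
            current = word
        else:
            current = f"{current} {word}" if current else word
    lines.append(current)  # the last line is fixed: no trailing space
    # Space-stuff any line that starts with a space, ">", or "From ".
    return [" " + l if l.startswith((" ", ">", "From ")) else l
            for l in lines]
```

With a width of 72, a stuffed line is at most 73 characters, which
stays under the 80 character limit recommended above.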
A Format=Flowed message consists of zero or more paragraphs, each
containing one or more flowed lines followed by one fixed line. The
usual case is a series of flowed text lines with blank (empty) fixed
lines between them.
Any number of fixed lines can appear between paragraphs.
[Quoted-Printable] encoding SHOULD NOT be used with Format=Flowed
unless absolutely necessary (for example, non-US-ASCII (8-bit)
characters over a strictly 7-bit transport such as unextended SMTP).
In particular, a message SHOULD NOT be encoded in Quoted-Printable
for the sole purpose of protecting the trailing space on flowed
lines unless the body part is cryptographically signed (see Section
5.6).
The intent of Format=Flowed is to allow user agents to generate
flowed text which is non-obnoxious when viewed as pure Text/Plain;
use of Quoted-Printable hinders this and may cause Format=Flowed to
be rejected by end users.
5.3. Usenet Signature Convention
There is a convention in Usenet news of using "-- " as the separator
line between the body and the signature of a message. When
generating a Format=Flowed message containing a Usenet-style
separator before the signature, the separator line is sent as-is.
This is a special case; an (optionally quoted) line consisting of
DASH DASH SP is not considered flowed.
5.4. Space-Stuffing
In order to allow for unquoted lines which start with ">", and to
protect against systems which "From-munge" in-transit messages
(modifying any line which starts with "From " to ">From "),
Format=Flowed provides for space-stuffing.
Space-stuffing adds a single space to the start of any line which
needs protection when the message is generated. On reception, if
the first character of a line is a space, it is logically deleted.
This occurs after the test for a quoted line, and before the test
for a flowed line.
On generation, unquoted lines which start with ">", and any lines
which start with a space or "From ", need to be space-stuffed.
Other lines MAY be space-stuffed as desired.
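A minimal sketch of the two halves of space-stuffing follows. The
function names are hypothetical. Both functions operate on line content
with any quote marks already removed, as the ordering above requires on
reception.

```python
def space_stuff(content: str) -> str:
    """Generation: add one leading space where protection is needed."""
    if content.startswith((" ", ">", "From ")):
        return " " + content
    return content


def space_unstuff(content: str) -> str:
    """Reception: logically delete one leading space, if present."""
    return content[1:] if content.startswith(" ") else content
```

Because every line that starts with a space is stuffed on generation,
deleting one leading space on reception restores the original content.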
5.5. Quoting
In Format=Flowed, the canonical quote indicator (or quote mark) is
one or more close angle bracket (">") characters. Lines which start
with the quote indicator are considered quoted. The number of ">"
characters at the start of the line specifies the quote depth.
Flowed lines which are also quoted may require special handling on
display and when copied to new messages.
When creating quoted flowed lines, each such line starts with the
quote indicator.
Note that because of space-stuffing, the lines
>> Exit, Stage Left
and
>>Exit, Stage Left
are semantically identical; both have a quote-depth of two, and a
content of "Exit, Stage Left".
However, the line
> > Exit, Stage Left
is different. It has a quote-depth of one, and a content of
"> Exit, Stage Left".
When generating quoted flowed lines, an agent needs to pay attention
to changes in quote depth. A sequence of quoted lines of the same
quote depth SHOULD be encoded as a paragraph, with the last line
generated as fixed and prior lines generated as flowed.
If a receiving agent wishes to reformat flowed quoted lines (joining
and/or wrapping them) on display or when generating new messages,
the lines SHOULD be de-quoted, reformatted, and then re-quoted. To
de-quote, the number of close angle brackets in the quote indicator
at the start of each line is counted. Consecutive lines with the
same quoting depth are considered one paragraph and are reformatted
together. To re-quote after reformatting, a quote indicator
containing the same number of close angle brackets originally
present is prefixed to each line.
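The de-quote, reformat, and re-quote steps can be sketched as follows
for one paragraph whose lines share a quote depth. The name
reflow_quoted is hypothetical, Python's textwrap module stands in for
whatever wrapping an agent actually uses, and the sketch chooses to
space-stuff every output line (permitted, since other lines MAY be
space-stuffed) so that content is never mistaken for quote marks.

```python
import textwrap
from typing import List


def reflow_quoted(lines: List[str], width: int = 72) -> List[str]:
    """Reflow one quoted paragraph; all lines share one quote depth."""
    # De-quote: count the quote marks, then drop one stuffed space.
    depth = len(lines[0]) - len(lines[0].lstrip(">"))
    parts = []
    for line in lines:
        body = line[depth:]
        if body.startswith(" "):
            body = body[1:]
        parts.append(body)
    text = "".join(parts)  # flowed lines already end in a space
    # Reformat: leave room for quote marks, the stuffed space, and
    # the trailing space of a soft line break.
    room = width - depth - 2
    wrapped = textwrap.wrap(text, room, break_long_words=False,
                            break_on_hyphens=False)
    if not wrapped:
        return [">" * depth]
    # Re-quote: prefix the same number of quote marks to each line.
    prefix = ">" * depth + " "
    return ([prefix + w + " " for w in wrapped[:-1]]
            + [prefix + wrapped[-1]])
```

All lines but the last are emitted as flowed, and the last as fixed,
which matches the paragraph encoding recommended for generation.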
On reception, if a change in quoting depth occurs on a flowed line,
this is an improperly formatted message. The receiver SHOULD handle
this error by using the 'quote-depth-wins' rule, which is to ignore
the flowed indicator and treat the line as fixed. That is, the
change in quote depth ends the paragraph.
For example, consider the following sequence of lines (using '*' to
indicate a soft line break, and '#' to indicate a hard line break):
> Thou villainous ill-breeding spongy dizzy-eyed*
> reeky elf-skinned pigeon-egg!* <--- problem ---<
>> Thou artless swag-bellied milk-livered*
>> dismal-dreaming idle-headed scut!#
>>> Thou errant folly-fallen spleeny reeling-ripe*
>>> unmuzzled ratsbane!#
>>>> Henceforth, the coding style is to be strictly*
>>>> enforced, including the use of only upper case.#
>>>>> I've noticed a lack of adherence to the coding*
>>>>> styles, of late.#
>>>>>> Any complaints?#
The second line ends in a soft line break, even though it is the
last line of the one-deep quote block. The question then arises as
to how this line should be interpreted, considering that the next
line is the first line of the two-deep quote block.
The example text above, when processed according to
quote-depth-wins, results in the first two lines being considered as
one quoted, flowed section, with a quote depth of 1; the third and
fourth lines become a quoted, flowed section, with a quote depth of 2.
A generating agent SHOULD NOT create this situation; a receiving
agent SHOULD handle it using quote-depth-wins.
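A receiving agent's paragraph assembly under the quote-depth-wins rule
might look like the sketch below. It is illustrative only: the name
paragraphs is hypothetical, lines are assumed to have had their CRLF
removed, the signature separator of section 5.3 is not special-cased,
and trimming the trailing space of a line that is forced to be fixed is
an assumption.

```python
from typing import List, Optional, Tuple


def paragraphs(lines: List[str]) -> List[Tuple[int, str]]:
    """Join flowed lines into (quote_depth, text) paragraphs."""
    result: List[Tuple[int, str]] = []
    open_depth: Optional[int] = None  # depth of the unfinished paragraph
    open_text = ""
    for line in lines:
        depth = len(line) - len(line.lstrip(">"))
        body = line[depth:]
        if body.startswith(" "):  # delete the stuffed space
            body = body[1:]
        # Quote-depth-wins: a change in depth ends the open paragraph,
        # even though its last line was marked as flowed.
        if open_depth is not None and depth != open_depth:
            result.append((open_depth, open_text.rstrip(" ")))
            open_depth, open_text = None, ""
        open_text += body
        if body.endswith(" "):  # flowed: the paragraph continues
            open_depth = depth
        else:  # fixed: the paragraph is complete
            result.append((depth, open_text))
            open_depth, open_text = None, ""
    if open_depth is not None:  # the message ended on a flowed line
        result.append((open_depth, open_text.rstrip(" ")))
    return result
```

Applied to the first four lines of the example, this yields one
paragraph at quote depth 1 and one at quote depth 2.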
5.6. Digital Signatures and Encryption
If a message is digitally signed or encrypted, and is natively in
paragraph form, it is important that cryptographic processing use
the on-the-wire Format=Flowed format. That is, during generation
the message SHOULD be prepared for transmission, including addition
of soft line breaks, and space-stuffing before being digitally
signed or encrypted; similarly, on receipt the message SHOULD have
the signature verified or be decrypted before removal of stuffed
spaces, soft line breaks and quote marks, and reflowing.
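The reason for processing the on-the-wire form can be illustrated with
a short sketch. It uses an HMAC from Python's standard library purely
as a stand-in for a real digital signature scheme; the key and the
function names are hypothetical.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical secret; a stand-in for a real signing key


def sign(wire_body: bytes) -> bytes:
    """Compute a tag over the exact on-the-wire bytes."""
    return hmac.new(KEY, wire_body, hashlib.sha256).digest()


def verify(wire_body: bytes, tag: bytes) -> bool:
    """Check the tag before any unstuffing or reflowing is done."""
    return hmac.compare_digest(sign(wire_body), tag)


# Generation: soft line breaks are inserted first, then the body is signed.
wire = b"A flowed paragraph that \r\ncontinues here.\r\n"
tag = sign(wire)

# A transport that strips trailing spaces changes the signed bytes.
stripped = b"\r\n".join(l.rstrip(b" ") for l in wire.split(b"\r\n"))
```

Here verify(wire, tag) succeeds while verify(stripped, tag) fails,
which is why section 5.2 allows Quoted-Printable to protect trailing
spaces when the body part is cryptographically signed.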
5.7. Line Analysis Table
Lines contained in a Text/Plain body part with Format=Flowed can be
analyzed by examining the start and end of the line. If the line
starts with the quote indicator, it is quoted. If the line ends
with one or more space characters, it is flowed. (The signature
separator described in section 5.3 is the one exception.) This is
summarized by the following table:
Starts    Ends in
with      One or More   Line
Quote     Spaces        Type
------    -----------   ---------------
no        no            unquoted, fixed
yes       no            quoted, fixed
no        yes           unquoted, flowed
yes       yes           quoted, flowed
5.8. Examples
The following example contains three paragraphs:
`Take some more tea,' the March Hare said to Alice, very
earnestly.
`I've had nothing yet,' Alice replied in an offended tone, `so
I can't take more.'
`You mean you can't take LESS,' said the Hatter: `it's very
easy to take MORE than nothing.'
This could be encoded as follows (using '*' to indicate a soft line
break, that is, SP CRLF sequence, and '#' to indicate a hard line
break, that is, CRLF):
`Take some more tea,' the March Hare said to Alice, very*
earnestly.#
#
`I've had nothing yet,' Alice replied in an offended tone, `so*
I can't take more.'#
#
`You mean you can't take LESS,' said the Hatter: `it's very*
easy to take MORE than nothing.'#
Here we have the same exchange, in quoted form:
>>>Take some more tea.#
>>I've had nothing yet, so I can't take more.#
>You mean you can't take LESS, it's very easy to take*
>MORE than nothing.#
6. ABNF
The constructs used in Text/Plain; Format=Flowed body parts are
described using [ABNF]:
paragraph   = 1*flowed-line fixed-line
fixed-line  = fixed / sig-sep
fixed       = [quote] [stuffing] *text-char non-sp CRLF
flowed-line = flow-qt / flow-unqt
flow-qt     = quote [stuffing] *text-char 1*SP CRLF
flow-unqt   = [stuffing] *text-char 1*SP CRLF
non-empty   = *text-char non-sp
non-sp      = %x01-09 / %x0B / %x0C / %x0E-1F / %x21-7F
                 ; any 7-bit US-ASCII character, excluding
                 ; NUL, CR, LF, and SP
quote       = 1*">"
sig-sep     = [quote] "--" SP CRLF
stuffing    = [SP] ; space-stuffed, added on generation if
                   ; needed, deleted on reception
text-char   = non-sp / SP
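For experimentation, the line-level rules can be transcribed into
regular expressions as sketched below. This is an informal translation,
not part of the grammar: the name line_kind is hypothetical, and the
character ranges follow the comment on non-sp (any 7-bit US-ASCII
character excluding NUL, CR, LF, and SP).

```python
import re

NON_SP = r"[\x01-\x09\x0B\x0C\x0E-\x1F\x21-\x7F]"
TEXT_CHAR = r"[\x01-\x09\x0B\x0C\x0E-\x7F]"  # non-sp / SP
QUOTE = r">+"

SIG_SEP = re.compile(rf"(?:{QUOTE})?-- \r\n")
FLOWED = re.compile(rf"(?:{QUOTE})? ?{TEXT_CHAR}* +\r\n")
FIXED = re.compile(rf"(?:{QUOTE})? ?{TEXT_CHAR}*{NON_SP}\r\n")


def line_kind(line: str) -> str:
    """Name the rule that matches a complete line, CRLF included."""
    # sig-sep is tested first because "-- " also fits the flowed pattern.
    if SIG_SEP.fullmatch(line):
        return "sig-sep"
    if FLOWED.fullmatch(line):
        return "flowed"
    if FIXED.fullmatch(line):
        return "fixed"
    return "no match"
```

Note that, as written, the fixed rule requires at least one non-space
character, so an empty line (a bare CRLF) matches none of these rules,
even though the examples in section 5.8 use empty fixed lines between
paragraphs.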
7. Failure Modes
7.1. Trailing White Space Corruption
There are systems in existence which alter trailing whitespace on
messages which pass through them. Such systems may strip, or in
rarer cases, add trailing whitespace, in violation of RFC 821 [SMTP]
section 4.5.2.
Stripping trailing whitespace has the effect of converting flowed
lines to fixed lines, which results in a message no worse than if
Format=Flowed had not been used.
Adding trailing whitespace leaves flowed lines flowed, but converts
fixed lines to flowed, because section 5.1 treats any line ending in
one or more spaces as flowed. For a message which uses the
Format=Flowed parameter, the effect may be a corrupted display or
reply. Since
most systems which add trailing white space do so to create a line
which fills an internal record format, the result is almost always a
line which contains an even number of characters (counting the added
trailing white space).
One possible avoidance, therefore, would be to define Format=Flowed
lines to use either one or two trailing space characters to indicate
a flowed line, such that the total line length is odd. However,
considering the scarcity of such systems today, it is not worth the
added complexity.
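The two kinds of corruption can be simulated with a short sketch. It
applies the rule of section 5.1 (a line ending in one or more spaces is
flowed) and ignores the signature separator; the function names and the
80 character record size are assumptions chosen for illustration.

```python
from typing import List


def count_flowed(lines: List[str]) -> int:
    """Count lines that end in one or more spaces."""
    return sum(1 for line in lines if line.endswith(" "))


def strip_trailing(lines: List[str]) -> List[str]:
    """Simulate a system that strips trailing whitespace."""
    return [line.rstrip(" ") for line in lines]


def pad_to_record(lines: List[str], record: int = 80) -> List[str]:
    """Simulate a system that pads lines to a fixed record length."""
    return [line.ljust(record) for line in lines]


body = ["first flowed line ", "ends here.", "", "a fixed line"]
```

For this body, stripping leaves no flowed lines, so the message
displays as ordinary Text/Plain, while padding marks all four lines
(including the blank one) as flowed.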
8. Security Considerations
This parameter introduces no security considerations beyond those
which apply to Text/Plain.
Section 5.6 discusses the interaction between Format=Flowed and
digital signatures or encryption.
9. IANA Considerations
IANA is requested to add a reference to this specification in the
Text/Plain Media Type registration.
10. Internationalization Considerations
The line wrap and quoting specifications of Format=Flowed may not be
suitable for certain charsets, such as for Arabic and Hebrew
characters that read from right to left. Care should be taken in
applying Format=Flowed in these cases, as Format=Fixed combined with
Quoted-Printable encoding may be more suitable.
11. Acknowledgments
This proposal evolved from a discussion of Chris Newman's
Text/Paragraph draft which took place on the IETF 822 mailing list.
Special thanks to Ian Bell, Steve Dorner, Brian Kelley, Dan Kohn,
and Laurence Lundblade.
12. References
[ABNF] Crocker, Overell, "Augmented BNF for Syntax Specifications:
ABNF", RFC 2234, Internet Mail Consortium, Demon Internet Ltd.,
November 1997.
[KEYWORDS] Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119, Harvard University, March 1997.
[RICH] Resnick, Walker, "The text/enriched MIME Content-type", RFC
1896, QUALCOMM, InterCon, February 1996.
[MIME-IMT] Freed, Borenstein, "Multipurpose Internet Mail Extensions
(MIME) Part Two: Media Types", RFC 2046, Innosoft, First Virtual,
November 1996.
[Quoted-Printable] Freed, Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message Bodies", RFC
2045, Innosoft, First Virtual, November 1996.
[SMTP] Postel, "Simple Mail Transfer Protocol", RFC 821, Information
Sciences Institute, August 1982.
13. Editor's Address
Randall Gellens +1 619 651 5115
QUALCOMM Incorporated randy@qualcomm.com
6455 Lusk Blvd.
San Diego, CA 92121-2779
USA
14. Full Copyright Statement
Copyright (C) The Internet Society 1999. All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.