Network Working Group Donald Eastlake 3rd
INTERNET-DRAFT Motorola
Expires: March 2002 September 2001
The Protocol versus Document Points of View
--- -------- ------ -------- ------ -- ----
<draft-eastlake-proto-doc-pov-04.txt>
Status of This Document
This draft is intended to become an Informational RFC. Its
distribution is unlimited. Please send comments to the author.
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026. Internet-Drafts are
working documents of the Internet Engineering Task Force (IETF), its
areas, and its working groups. Note that other groups may also
distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
Two points of view are contrasted: the "document" point of view,
where digital objects of interest are like pieces of paper, and the
"protocol" point of view where objects of interest are composite
dynamic protocol messages. While each point of view has a place,
adherence to a document point of view can be damaging to protocol
design. By understanding both of these points of view, conflicts
between them may be clarified and reduced.
D. Eastlake 3rd [Page 1]
INTERNET-DRAFT Protocol versus Document Viewpoints September 2001
Table of Contents
Status of This Document....................................1
Abstract...................................................1
Table of Contents..........................................2
1. Introduction............................................3
2. Points of View..........................................3
2.1 The Basic Points of View...............................3
2.2 Questions of Meaning...................................4
2.2.1 Core Meaning.........................................4
2.2.2 Adjunct Meaning......................................5
2.3 Processing Models......................................5
2.3.1 Amount of Processing.................................5
2.3.2 Granularity of Processing............................6
2.3.3 Extensibility of Processing..........................6
2.4 Security and Canonicalization..........................7
2.4.1 Canonicalization.....................................7
2.4.2 Digital Authentication...............................8
2.4.3 Canonicalization and Digital Authentication..........9
2.4.4 Encryption...........................................9
2.5 Unique Internal Labels................................10
3. Examples...............................................11
4. Resolution of the Points of View.......................11
5. Conclusion.............................................12
References................................................13
Author's Address..........................................14
Expiration and File Name..................................14
1. Introduction
Two points of view are contrasted: the "document" point of view,
where digital objects of interest are like pieces of paper, and the
"protocol" point of view where objects of interest are composite
dynamic protocol messages. While each point of view has a place,
adherence to a document point of view can be damaging to protocol
design. By understanding both of these points of view, conflicts
between them may be clarified and reduced.
Much of the IETF's traditional work has concerned low level binary
protocol constructs. These are almost always viewed from the
protocol point of view. But as higher level application constructs
and syntaxes are involved in the IETF and other standards processes,
difficulties can arise due to participants who have the document
point of view. These two points of view, the document oriented and
the protocol oriented, are defined and explored in Section 2 below.
Those accustomed to one point of view frequently have great
difficulty in appreciating the other. Even after they understand the
other, they almost always start by considering things from their
accustomed point of view, assume that most of the universe of
interest is best viewed from their perspective, and commonly slip
back into thinking about things entirely from that point of view.
Section 3 gives some examples. And Section 4 tries to synthesize the
views and give general design advice in areas which can reasonably be
viewed either way.
2. Points of View
The following subsections contrast the document and protocol points
of view. Each viewpoint is EXAGGERATED for effect.
The document point of view is indicated in paragraphs headed "DOCUM"
while the protocol point of view is indicated in paragraphs headed
"PROTO".
2.1 The Basic Points of View
DOCUM: What is important are complete (digital) documents, analogous
to pieces of paper, viewed by people. A major concern is to be
able to present such documents as directly as possible to a court
or other third party. Since what is presented to the person is
all that is important, anything which can affect this, such as a
"style sheet", MUST be considered part of the document. Sometimes
the fact that the "document" originates in a computer, may travel
over, be processed in, and stored in computer systems, and is
viewed on a computer and that such operations may involve
transcoding, enveloping, or data reconstruction, is forgotten.
PROTO: What is important are bits on the wire generated and consumed
by well defined computer protocol processes. No person ever sees
the full message as such; it is only viewed as a whole by a geek
when debugging and even then they see some translated visible
form. If you actually ever have to demonstrate something about
such a message in a court or to a third party, there isn't any way
to avoid having experts interpret it. Sometimes the fact that
pieces of such messages may end up being included in or
influencing data displayed to a person is forgotten.
2.2 Questions of Meaning
"Human" meaning is something which the document oriented tend to
consider extremely important but the protocol oriented rarely think
about at all.
2.2.1 Core Meaning
DOCUM: The "meaning" of a document is a deep and interesting human
question related to volition. It is probably necessary for the
document to include or reference human language policy and/or
warranty / disclaimer information. At an absolute minimum, some
sort of semantic labeling is required. The assumed situation is
always a person interpreting the whole "document" without other
context. Thus it is reasonable to consult attorneys during
message design, require human readable statements to be "within
the four corners" of the document, etc.
PROTO: The "meaning" of a protocol message should be clear from the
protocol specification. It is frequently defined in terms of the
state machines of the sender and recipient processes and may have
only the most remote connection with human volition. Such
processes have additional context and the message is usually only
meaningful with that additional context. Adding any human
readable text that is not functionally required is silly.
Consulting attorneys in design is a bad idea that complicates the
protocol and could tie a design effort in knots.
2.2.2 Adjunct Meaning
Adjuncts are things that can be added or are logically addenda.
DOCUM: From a document point of view, at the top level we have the
equivalent of a person looking at a document. So adjunct items
such as digital signatures, person's names, dates, etc., must be
carefully documented as to meaning. Thus a digital signature
needs to include, in human readable form, what that signature
means (is the signer a witness, author, guarantor, or what?).
Similarly, a person's name or date needs to be accompanied by what
that person's role is or the meaning of the date, such as editor,
author, contributor or date of creation, modification, or
distribution. Furthermore, given the unrestrained scope of what
can be documented, there is a risk, in the design process, of
trying to enumerate and standardize all possible "semantic tags"
for each type of adjunct data. This can be a difficult, complex,
and essentially infinite task (i.e., a rat hole).
PROTO: From a protocol point of view, the semantics of the message
and every adjunct in it are defined in the protocol specification.
Thus, if there is a slot for a digital signature, person's name, a
date, or whatever, the party that is to enter that data, the party
or parties that are to read it, and its meaning are all pre-
defined. Even if there are several possible meanings, the
specific meaning that applies can be specified by a separate
enumerated type field. There is no reason to have such a field be
directly human readable. Only the "meanings" directly relevant to
the particular protocol need be considered. Another way to look at
this is that the "meaning" of each adjunct, instead of being
pushed into and coupled with the adjunct, as the document point of
view encourages, is commonly promoted to the level of the protocol
specification, resulting in simpler adjuncts.
2.3 Processing Models
The document oriented and protocol oriented have very different views
on what is likely to happen to an object.
2.3.1 Amount of Processing
DOCUM: The model is of a quasi-static object like a piece of paper.
About all you do to pieces of paper is transfer them as a whole,
from one storage area to another, or add signatures, date stamps,
or similar adjuncts. (Possibly you might want an extract from a
document or to combine multiple documents into a summary but this
isn't the common case.)
PROTO: The standard model of a protocol message is as an ephemeral
composite, multi-level object created by a source process and
consumed by a destination process. Such a message is constructed
from information contained in previously received messages,
locally stored information, local calculations, etc. It is normal
for their to be quite complex processing.
2.3.2 Granularity of Processing
DOCUM: The document view is generally of uniform processing or
evaluation of the entire object being specified. There may be an
allowance for attachments but if so they would probably be simple,
one level, self documenting attachments.
PROTO: Processing is complex and almost always affects different
pieces of the message differently. Some pieces may be intended
for use only by the destination process and be extensively
processed there. Others may be present so that the destination
process can, at some point, do minimal processing and forward them
in other messages to yet other processes. The object's structure
can be quite rich and have multilevel or recursive aspects.
Because messages are processed in context, you can have things
like a signature which covers the combination of some data in the
message, some data received in previous messages and stored, and
some locally calculated data.
2.3.3 Extensibility of Processing
DOCUM: The document oriented don't usually think of extensibility as
a major problem. They assume that their design, perhaps with some
simple version scheme, will meet all requirements. Or, coming from
an SGML/DTD world of closed systems, they may assume that
knowledge of new versions or extensions can be easily and
synchronously distributed to all participating sites.
PROTO: Protocolists assume that protocols will always need to be
extended and that it will not be possible to update all
implementations as such extensions are deployed and/or retired.
view try to provide the tools needed. For example, carefully
defined versioning and extension/feature labeling, the ability to
negotiate version and features where possible and at least a
specification of how parties running different levels should
interact, providing length/delimiting information for all data so
it can at least be skipped if not understood, destination labeling
so that a process can tell that it should ignore data except for
passing it through to a later player, etc.
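The "length/delimiting information for all data" tool mentioned above can be sketched as a toy type-length-value (TLV) parser. The 2-byte type / 2-byte length layout and the set of known types here are invented for illustration, not taken from any particular IETF protocol:

```python
import struct

def parse_tlv(buf):
    """Parse a byte buffer of (type, length, value) records.

    Unknown types are skipped using the length prefix, so an old
    implementation can safely ignore extensions it does not
    understand instead of failing on them.
    """
    KNOWN_TYPES = {1, 2}          # types this (hypothetical) version understands
    records, offset = [], 0
    while offset < len(buf):
        t, length = struct.unpack_from(">HH", buf, offset)
        offset += 4
        value = buf[offset:offset + length]
        offset += length
        if t in KNOWN_TYPES:
            records.append((t, value))
        # else: skipped -- the length prefix lets us hop over unknown data
    return records

msg = (struct.pack(">HH", 1, 2) + b"hi"      # known type
       + struct.pack(">HH", 99, 3) + b"xyz"  # unknown extension
       + struct.pack(">HH", 2, 1) + b"!")    # known type
assert parse_tlv(msg) == [(1, b"hi"), (2, b"!")]
```

The unknown type 99 record is silently passed over, which is exactly the behavior that lets deployed implementations coexist with later extensions.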
2.4 Security and Canonicalization
Security is a subtle area. Some of the problems can be solved in a
general way, and those solutions are typically incorporated into
standard security syntaxes such as those for ASN.1 [RFC 2630] and XML
Signatures [RFC 3075]. But there are application specific questions,
particularly questions of exactly which information needs
authentication or confidentiality.
Questions of exactly what needs to be secured and how to do so
robustly are deeply entwined with canonicalization. They are also
somewhat different for authentication and encryption, as discussed
below.
2.4.1 Canonicalization
Canonicalization is the transformation of the "significant"
information in a message into a "standard" form, discarding
"insignificant" information. For example, encoding into a standard
character set or changing line endings into a standard encoding and
discarding the information as to what the original character set or
line ending encodings were. Obviously, what is "significant" and what
is "insignificant" varies with the application or protocol and can be
tricky to determine. However, it is common that for a particular
syntax, such as ASCII [ASCII], ASN.1 [ASN.1], or XML [XML], a
standard canonicalization is specified or developed through practice.
This leads to the design of applications that assume such standard
canonicalization and in turn reduces the need for per-application
canonicalization.
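The character-set and line-ending examples above can be sketched as a toy canonicalization. Which information counts as "insignificant" is application-specific; the particular choices here (line-ending style and Unicode composition are insignificant, the character sequence is significant) are only an illustration:

```python
import unicodedata

def canonicalize(text):
    """A toy canonicalization: discard "insignificant" form.

    Line endings are standardized to LF, Unicode is standardized to
    NFC composition, and the result is encoded in a standard
    character set (UTF-8), discarding the original encodings.
    """
    text = text.replace("\r\n", "\n").replace("\r", "\n")  # standard line ends
    text = unicodedata.normalize("NFC", text)              # standard composition
    return text.encode("utf-8")                            # standard charset

# Two byte-different forms of the same "significant" content:
a = "caf\u00e9\r\nbar"   # precomposed e-acute, CRLF line ending
b = "cafe\u0301\nbar"    # combining acute accent, LF line ending
assert a != b
assert canonicalize(a) == canonicalize(b)
```

Both inputs reduce to the same canonical byte string even though they differ byte-for-byte before canonicalization.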
DOCUM: From the document point of view, canonicalization is suspect
if not outright evil. After all, if you have a piece of paper
with writing on it, any modification to "standardize" its format
can be an unauthorized change in the original message as created
by the "author" who is always visualized as a person. From the
document point of view, digital signatures are like authenticating
signatures or seals or time stamps on the bottom of the "piece of
paper". They do not justify and should not depend on changes in
the message appearing above them. Similarly, from the document
point of view, encryption is just putting the "piece of paper" in
a vault that only certain people can open, and does not justify
any standardization or canonicalization of the message.
PROTO: From the protocol point of view, you have a pattern of bits
that are calculated, processed / stored / communicated, and
finally parsed and acted on. Most of these bits have never been
seen and never will be seen by a person. In fact, many of the
parts of the message will be artifacts of encoding, protocol
structure, and computer representation rather than anything
intended for a person to see. Perhaps in theory, the "original",
idiosyncratic form of any digitally signed part could be conveyed
unchanged through the computer process, storage, and
communications channels which implement the protocol and usefully
signed in that form. But in practical systems of any complexity,
this is unreasonably difficult, at least for most parts of
messages. And if it were possible, it would be virtually useless,
as you would still have to determine the equivalence of the local
message form with the preserved original form. Thus, signed data
must be canonicalized as part of signing and verification to
compensate for insignificant changes made in processing, storage,
and communication. Even if, miraculously, an initial system
design avoids all cases of signed message reconstruction based on
processed data or re-encoding based on character set or line
ending or capitalization or numeric representation or time zones
or whatever, later protocol revisions and extensions are certain
to eventually require such reconstruction and/or re-encoding.
Therefore canonicalization is simply a necessity. It is just a
question of exactly what canonicalization or canonicalizations.
2.4.2 Digital Authentication
DOCUM: The document oriented view on authentication tends to be a
"digital signature" and "Forms" point of view. Since the worry is
always about human third parties and viewing the document in
isolation, they want the "digital signature" characteristics of
"non-repudiability", etc. (See any standard reference on the
subject for the usual meaning of these terms in this context.)
From their point of view, you have a piece of paper or form which
a person signs. Sometimes a signature covers only part of a form,
but that's usually because a signature can only cover data which
is already there. And normally at least one signature covers the
"whole" document/form. Thus they want to be able to insert
digital signatures into documents without changing the document
type and even "inside" the data being signed (which requires a
mechanism to skip the signature so that it does not try to sign
itself).
PROTO: From a protocol point of view, the right kind of
authentication to use, whether "digital signature" or symmetric
keyed authentication code or biometric or whatever, is just
another engineering decision affected by questions of efficiency,
desired security model, etc. Furthermore, the concept of signing
a "whole" message seems very peculiar (unless it is a copy being
saved for archival purposes in which case you might be signing a
whole archive at once anyway). Typical messages are made up of
various pieces with various destinations, sources, and security
requirements. Furthermore, there are commonly fields you can't
sign because they change as the message is communicated and
processed, such as hop counts, routing history, or local
forwarding tags. Certainly, different kinds of authentication are
commonly mixed in one message.
2.4.3 Canonicalization and Digital Authentication
For authenticating protocol system messages of practical complexity,
you are faced with the choice of
(1) doing "too little canonicalization" and having brittle
authentication, useless due to insignificant failures to verify, or
(2) doing the sometimes difficult and tricky work of selecting or
designing an appropriate canonicalization or canonicalizations to be
used as part of authentication generation and verification, producing
robust and useful authentication, or
(3) doing "too much canonicalization" and having insecure
authentication, useless because it still verifies even when
significant changes are made in the signed data.
The only useful option above is number 2.
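The brittle-versus-robust distinction can be sketched with a symmetric keyed authentication code over a toy canonicalization (line endings only, for illustration; the key and field names are invented):

```python
import hashlib
import hmac

def mac(key, data):
    """Symmetric keyed authentication code (HMAC-SHA256)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def canonical(text):
    # Toy canonicalization: treat line-ending style as insignificant.
    return text.replace("\r\n", "\n").encode("utf-8")

key = b"shared-secret"                 # illustrative key
sent = "Amount: 100\r\n"
received = "Amount: 100\n"             # an intermediary rewrote the line ending

# Option 1 (too little canonicalization): brittle -- verification
# fails on an insignificant change made in transit.
assert mac(key, sent.encode()) != mac(key, received.encode())

# Option 2 (appropriate canonicalization): robust -- insignificant
# changes no longer break verification, significant ones still do.
assert mac(key, canonical(sent)) == mac(key, canonical(received))
assert mac(key, canonical(sent)) != mac(key, canonical("Amount: 999\n"))
```

Option 3 would correspond to a canonicalization so aggressive that the third assertion above would fail, i.e. a changed amount would still verify.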
2.4.4 Encryption
In terms of processing, transmission, and storage, encryption turns
out to be much easier than signatures to get working. Why? Because
the output of encryption is essentially random bits and it is clear
from the beginning that those bits need to be transferred to the
destination in some absolutely clean way that does not change even
one bit. Because the encrypted bits are meaningless to a human
being, there is no temptation from the document oriented to try to
make them more "readable". So appropriate techniques of encoding at
the source, such as Base64 [RFC 2045], and decoding at the
destination, are always incorporated to protect or "armor" the
encrypted data.
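The armoring idea can be sketched as follows; random bytes stand in for real encrypted output, since the point is only how arbitrary bits survive a text-oriented channel:

```python
import base64
import os

ciphertext = os.urandom(32)   # stand-in for the output of a real cipher

# Armor: encode the arbitrary bytes into a restricted, transport-safe
# alphabet so no intermediate channel is tempted to "fix" them.
armored = base64.b64encode(ciphertext).decode("ascii")
assert armored.isascii()
assert "\r" not in armored and "\0" not in armored

# Decoding at the destination recovers every bit exactly.
assert base64.b64decode(armored) == ciphertext
```

The round trip is exact by construction, which is what makes whole-object encryption comparatively easy to deploy.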
While the application of canonicalization is more obvious with
digital signatures, it may also apply to encryption, particularly
encryption of parts of a message. Sometimes elements of the
environment where the plain text data is found may affect its
interpretation. For example, the character encoding or bindings of
dummy symbols. When the data is decrypted, it may be into an
environment with a different character encoding and dummy symbol
bindings. With a plain message part, it is usually clear which of
these environmental elements need to be incorporated in or conveyed
with the message. But an encrypted message part is opaque. Thus some
canonical representation that incorporates such environmental factors
may be needed.
DOCUM: Encryption of the entire document is usually what is thought
of. Because signatures are always thought of as human assent,
people with a document point of view tend to vehemently assert
that encrypted data should never be signed unless you know what
the plain text is.
PROTO: Messages are complex composite multi-level structures some
pieces of which are forwarded multiple hops. Thus the design
question is what fields should be encrypted by what techniques to
what destination or destinations and with what canonicalization.
It sometimes makes perfect sense to sign encrypted data you don't
understand; for example, the signature could just be for integrity
protection or as a time stamp, as specified in the protocol.
2.5 Unique Internal Labels
It is desirable to be able to reference parts of structured messages
or objects by some sort of "label" or "id" or "tag". The idea is
that this forms a fixed "anchor" that can be used "globally", at
least within an application domain, to reference the tagged part.
DOCUM: From the document point of view, it seems logical to just
provide for a text tag. The concept would be that users or
applications could easily come up with short readable tags. These
would probably be meaningful to a person if humanly generated
(e.g., "Susan") and at least fairly short and systematic if
automatically generated (e.g., "A123"). The ID attribute type in
XML [XML] appears to have been thought of this way, although it
can be used in other ways.
PROTO: From a protocol point of view, unique internal labels look
very different than they do from a document point of view. Since
you should assume that pieces of different protocol messages will
later be combined in a variety of ways, previously unique labels
can conflict. There are really only three possibilities if you
need such tags, as follows:
(1) Have a system for dynamically rewriting such tags to maintain
uniqueness. This is usually a disaster as it (a) invalidates
any stored copies of the tags that are not rewritten, and it
is usually impossible to be sure there aren't more copies
lurking somewhere you failed to update, and (b) invalidates
digital signatures that cover a changed tag.
(2) Use some form of hierarchical qualified tags. Thus the total
tag can remain unique even if a part is moved, because its
qualification changes. This avoids the digital signature
problems of the above possibility. But it destroys the
concept of a globally unique anchor embedded in and moving
with the data. And stored tags may still be invalidated by
data moves. Nevertheless, within a particular carefully
designed protocol, such as IOTP [RFC 2801], this can work.
(3) Construct a lengthy globally unique tag string. This can be
done successfully by using a good enough random number
generator and big enough random tags, or more sequentially, as
in the way email message IDs are created [RFC 2822].
Thus, from a protocol point of view, such tags are difficult but
if you really need them, choice 3 works best.
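Choice 3 can be sketched as follows; the tag format loosely imitates the style of email Message-IDs [RFC 2822] but is invented for illustration, as is the domain name:

```python
import secrets
import time

def make_tag(domain="example.com"):
    """Globally unique label in the spirit of choice 3.

    128 random bits make accidental collisions negligible; the
    timestamp and domain suffix follow the general style of email
    Message-IDs, though this exact format is not a standard one.
    """
    return "<%d.%s@%s>" % (int(time.time()), secrets.token_hex(16), domain)

# Tags generated independently do not collide, so parts of different
# messages can later be combined without rewriting any labels.
tags = {make_tag() for _ in range(1000)}
assert len(tags) == 1000
```

Because uniqueness comes from randomness rather than coordination, such tags remain unique when previously separate messages are merged, which is exactly the failure mode of choices 1 and 2.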
3. Examples
IETF protocols are replete with examples of the protocol viewpoint
such as TCP [RFC 793], IPSEC [RFC 2411], SMTP [RFC 2821], and IOTP
[RFC 2801, 2802].
An example of something that can easily be viewed both ways, and
where the best results frequently come from attention to not only
the document but also the protocol point of view, is the eXtensible
Markup Language [XML].
An example of something designed, to a significant extent, from the
document point of view is the X.509 Certificate [X509v3].
4. Resolution of the Points of View
There is some merit to each point of view. Certainly the document
point of view has some intuitive simplicity and appeal and is OK for
applications where it meets the needs.
The protocol point of view can come close to encompassing the
document point of view as a limiting case. In particular, as
the complexity of messages declines to a single payload (perhaps
with a few attachments) and
the mutability of the payload declines to some standard format
that needs little or no canonicalization and
the number of parties and amount of processing as messages are
transferred declines and
the portion of the message intended for more or less direct human
consumption increases,
the protocol point of view would be narrowed to something close to
the document point of view. Even when the document point of view is
questionable, the addition of a few options to a protocol, such as
optional lack of canonicalization or optional policy statement /
pointer / semantic label inclusion, will usually mollify the
perceived needs of those looking at things from a document point of
view.
On the other hand, the document point of view is hard to stretch to
encompass the protocol case. From a piece of paper point of view,
canonicalization is wrong, inclusion of human language policy text
within every significant object and a semantic tag with every adjunct
should be mandatory, etc. Objects designed in this way are rarely
suitable for protocol use, at least low level protocol use, as they
tend to be improperly structured to accommodate hierarchy and
complexity, inefficient (due to unnecessary text and self documenting
inclusions), and insecure (due to brittle signatures).
Thus, to produce usable protocols, it is best to start with the
protocol point of view and add such document point of view items as
are necessary to achieve consensus.
5. Conclusion
I hope that this document will help explain to those of either point
of view where those with the other view are coming from. Perhaps
this will decrease conflict, shed some light -- in particular on the
difficulties of security design -- and lead to better protocol
designs.
References
[ASCII] - "USA Standard Code for Information Interchange", X3.4,
American National Standards Institute: New York, 1968.
[RFC 793] - "Transmission Control Protocol", J. Postel, Sep-01-1981.
[RFC 2045] - "Multipurpose Internet Mail Extensions (MIME) Part One:
Format of Internet Message Bodies", N. Freed & N. Borenstein,
November 1996.
[RFC 2411] - "IP Security Document Roadmap", R. Thayer, N. Doraswamy,
R. Glenn, November 1998.
[RFC 2630] - "Cryptographic Message Syntax", R. Housley, June 1999.
[RFC 2801] - "Internet Open Trading Protocol - IOTP Version 1.0", D.
Burdett, April 2000.
[RFC 2802] - "Digital Signatures for the v1.0 Internet Open Trading
Protocol (IOTP)", K. Davidson, Y. Kawatsura, April 2000.
[RFC 2821] - "Simple Mail Transfer Protocol", J. Klensin, Editor,
April 2001.
[RFC 2822] - "Internet Message Format", P. Resnick, Editor, April
2001.
[RFC 3075] - "XML-Signature Syntax and Processing", D. Eastlake, J.
Reagle, D. Solo, March 2001.
[X509v3] - "ITU-T Recommendation X.509 version 3 (1997), Information
Technology - Open Systems Interconnection - The Directory
Authentication Framework", ISO/IEC 9594-8:1997.
[XML] - "Extensible Markup Language (XML) 1.0 Recommendation (2nd
Edition)". T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler,
October 2000. <http://www.w3.org/TR/2000/REC-xml-20001006>
D. Eastlake 3rd [Page 13]
Author's Address
The author of this document is:
Donald E. Eastlake 3rd
Motorola
155 Beaver Street
Milford, MA 01757 USA
Phone: +1 508-261-5434 (w)
+1 508-634-2066 (h)
Fax: +1 508-261-4777 (w)
EMail: Donald.Eastlake@motorola.com
Expiration and File Name
This draft expires March 2002.
Its file name is <draft-eastlake-proto-doc-pov-04.txt>.