Network Working Group                                Donald Eastlake 3rd
INTERNET-DRAFT                                                  Motorola
Expires: August 2001                                       February 2001



                Protocol versus Document Points of View
                -------- ------ -------- ------ -- ----
                 <draft-eastlake-proto-doc-pov-02.txt>



Status of This Document

   This draft is intended to become an Informational RFC.  Its
   distribution is unlimited.  Please send comments to the author.

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.  Internet-Drafts are
   working documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups.  Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.



Abstract

   Two points of view are contrasted: the "document" point of view,
   where data objects of interest are like pieces of paper, and the
   "protocol" point of view where objects of interest are composite
   protocol messages.  While each point of view has a place, adherence
   to a document point of view is damaging to protocol design.  By
   understanding both of these points of view, conflicts between them
   may be clarified and ameliorated.










D. Eastlake 3rd                                                 [Page 1]


INTERNET-DRAFT    Protocol versus Document Viewpoints      February 2001


Table of Contents

      Status of This Document....................................1
      Abstract...................................................1

      Table of Contents..........................................2

      1. Introduction............................................3
      2. Points of View..........................................3
      2.1 The Basic Points of View...............................3
      2.2 Questions of Meaning...................................4
      2.2.1 Core Meaning.........................................4
      2.2.2 Adjunct Meaning......................................4
      2.3 Processing Models......................................5
      2.3.1 Amount of Processing.................................5
      2.3.2 Granularity of Processing............................6
      2.3.3 Extensibility of Processing..........................6
      2.4 Security and Canonicalization..........................6
      2.4.1 Canonicalization.....................................7
      2.4.2 Digital Authentication...............................8
      2.4.3 Canonicalization and Digital Authentication..........9
      2.4.4 Encryption...........................................9
      2.5 Unique Internal Labels................................10
      3. Examples...............................................10
      4. Resolution of the Points of View.......................11
      5. Conclusion.............................................12

      References................................................13

      Author's Address..........................................14
      Expiration and File Name..................................14


1. Introduction

   Much of the IETF's traditional work has concerned low level binary
   protocol constructs.  These are almost always viewed from the
   protocol point of view.  But as higher level application constructs
   and syntaxes are involved in the IETF standards process, difficulties
   can arise due to participants who are fixated on the document point
   of view.  These two different points of view of documentists and
   protocolists are defined and explored in Section 2 below.

   Those practiced in and accustomed to one point of view sometimes have
   great difficulty in appreciating the other.  Even after they
   understand the other, they commonly slip back into thinking about
   things entirely from their accustomed point of view.

   Section 3 gives some examples, and Section 4 tries to synthesize the
   two points of view and give general design advice for areas which can
   reasonably be viewed either way.



2. Points of View

   The following subsections contrast the document and protocol points
   of view.  Each viewpoint is EXAGGERATED for effect.

   The document point of view is indicated in paragraphs headed "DOCUM"
   while the protocol point of view is indicated in paragraphs headed
   "PROTO".



2.1 The Basic Points of View

   DOCUM: What is important are complete (digital) documents, analogous
      to pieces of paper, viewed by people.  A major concern is to be
      able to present such documents as directly as possible to a court
      or other third party.  Since what is presented to the person is
      all that is important, anything which can affect this, such as a
      "style sheet", MUST be considered part of the document.  Sometimes
      the fact that the "document" originates in a computer, may travel
      over, be processed in, and stored in computer systems, and is
      viewed on a computer, is forgotten.

   PROTO: What is important are bits on the wire generated and consumed
      by well defined computer protocol processes.  Normally no person
      ever sees the message as such; it is only viewed as a whole by a
      geek when debugging.  If you actually ever have to demonstrate
      something about such a message in a court or to a third party,
      there isn't any way to avoid having experts interpret it.


      Sometimes the fact that pieces of such messages may end up being
      included in or influence data displayed to a person is forgotten.



2.2 Questions of Meaning

   Human meaning is something about which documentists tend to get
   wrapped around the axle but about which protocolists rarely think at
   all.



2.2.1 Core Meaning

   DOCUM: The "meaning" of a document is a deep and interesting human
      question related to human volition.  It is probably necessary for
      the document to include or reference human language policy and/or
      warranty / disclaimer information.  At an absolute minimum, some
      sort of semantic labeling is required as the assumed situation is
      a person interpreting the whole "document" without other context.
      Thus it is reasonable to consult attorneys during document design,
      require human readable statements to be "within the four corners"
      of the document, etc.

   PROTO: The "meaning" of a protocol message should be completely clear
      from the protocol specification.  It is frequently defined in
      terms of the state machines of the sender and recipient processes
      and has only the remotest connection with human volition.  Such
      processes usually have additional context and the message is
      usually only meaningful with that additional context.  Adding any
      human readable text that is not functionally required is silly.
      Consulting attorneys in design is a bad idea that complicates the
      protocol and could tie a design effort in knots.



2.2.2 Adjunct Meaning

   Adjuncts are things that can be added or are logically addenda.

   DOCUM: From a document point of view, at the top level we have the
      equivalent of a person looking at a piece of paper.  So adjunct
      items such as digital signatures, person's names, dates, etc.,
      must, in general, be self documenting as to meaning.  Thus a
      digital signature needs to include what that signature means (is
      the signer a witness, author, guarantor, or what?).  Similarly, a
      person's name or a date needs to be accompanied by that person's
      role or by the meaning of the date, such as editor, author, or
      contributor, or date of creation, modification, or distribution.


      Furthermore, given the unrestrained scope of what can be
      documented, there is a risk in trying to enumerate and standardize
      all possible "semantic tags" for each type of adjunct data.  This
      can be a difficult, complex, and essentially infinite task (i.e.,
      rat hole).

   PROTO: From a protocol point of view, the semantics of the message
      and every adjunct in it are defined in the protocol specification.
      Thus, if there is a slot for a digital signature, person's name, a
      date, or whatever, the party that is to enter that data, the party
      or parties that are to read it, and its meaning are all pre-
      defined.  Even if there are several possible meanings, the
      specific meaning that applies can be specified by a separate
      enumerated field and only the meanings relevant to the particular
      protocol need be considered.  Thus, there is no need to accompany
      each adjunct with a meaning field or semantic label.  Another way
      to look at this is that the "meaning" of each adjunct, instead of
      being pushed into and coupled with the adjunct, as the document
      point of view encourages, is generally promoted to the level of
      the protocol specification, resulting in simpler adjuncts.
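
   For illustration, promoting meaning to the specification might look
   like the following sketch.  (The role names and the adjunct layout
   are invented for this example, not taken from any real protocol.)

```python
from enum import IntEnum

class SignerRole(IntEnum):
    # An enumerated "meaning" field: the meanings live here, in the
    # protocol specification, not in free text carried with each
    # signature.
    AUTHOR = 1
    WITNESS = 2
    GUARANTOR = 3

# The adjunct itself stays simple: a role code and the signature value,
# with no self-documenting semantic label inside the adjunct.
adjunct = {"role": int(SignerRole.WITNESS), "sig": b"...signature bytes..."}

# A receiver interprets the code against the specification.
assert SignerRole(adjunct["role"]) is SignerRole.WITNESS
```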



2.3 Processing Models

   Documentists and protocolists have very different views on what is
   likely to happen to an object.



2.3.1 Amount of Processing

   DOCUM: The model of a document is as a quasi-static object somewhat
      like a piece of paper.  About all you do to documents is transfer
      them as a whole, from one storage area to another, or add
      signatures, date stamps, or similar attachments.  (Possibly you
      might want an extract from a document or to combine multiple
      documents into a summary but this isn't the common case.)

   PROTO: The standard model of a protocol message is as an ephemeral
      composite object created by a source process and consumed by a
      destination process.  Such a message is constructed from
      information contained in previously received messages, locally
      stored information, local calculations, etc.  It is common for
      there to be quite complex processing.







2.3.2 Granularity of Processing

   DOCUM: The document view is generally uniform processing or
      evaluation of the overall unified object being specified.  There
      may be an allowance for attachments but if so they would probably
      be simple, one level, self documenting attachments.

   PROTO: Processing is complex and almost always affects different
      pieces of the object differently.  Some pieces may be intended for
      use only by the destination process and be extensively processed
      there.  Others may be present so that the destination process
      can, at some point, do minimal processing and forward them in
      other messages to yet other processes.  The object's structure can
      be quite rich and have multilevel or recursive aspects.  Because
      messages are processed in context, you can have things like a
      signature which covers the combination of some data in the
      message, some data received in previous messages and stored, and
      some locally calculated data.



2.3.3 Extensibility of Processing

   DOCUM: Documentists don't usually think of extensibility as a serious
      problem.  They assume that their design, perhaps with some simple
      version scheme, will meet all requirements.  Coming from an
      SGML/DTD world of closed systems, they may assume that knowledge
      of new versions or extensions can be easily and rapidly
      distributed to all participating sites.

   PROTO: Protocolists assume that protocols will need to be extended
      and that it will not be possible to update all implementations as
      such extensions are deployed and/or retired.  This is a difficult
      problem, but those from the protocol point of view try to provide
      the tools needed: carefully defined versioning and
      extension/feature labeling; the ability to negotiate versions and
      features where possible, and at least a specification of how
      parties running different levels should interact; length and
      delimiting information for all data, so that it can at least be
      skipped if not understood; destination labeling, so that a process
      can tell that it should ignore data except for passing it through
      to a later player; etc.
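
   As a sketch of the length/delimiting tool mentioned above, consider
   a hypothetical type-length-value (TLV) layout with one-byte type and
   length fields.  An implementation that does not understand a type
   can still step over it:

```python
def parse_tlv(buf, known_types):
    """Parse 1-byte-type / 1-byte-length records, skipping unknown types."""
    fields, i = [], 0
    while i + 2 <= len(buf):
        ftype, flen = buf[i], buf[i + 1]
        value = buf[i + 2 : i + 2 + flen]
        if ftype in known_types:
            fields.append((ftype, value))
        # Unknown types are skipped, not fatal: the length information
        # lets an old implementation step over extensions it does not
        # understand.
        i += 2 + flen
    return fields

msg = bytes([1, 2, 0xAA, 0xBB,   # known type 1, two value bytes
             9, 1, 0xFF,         # unknown extension type 9, skipped
             2, 1, 0xCC])        # known type 2, one value byte
assert parse_tlv(msg, {1, 2}) == [(1, b"\xaa\xbb"), (2, b"\xcc")]
```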



2.4 Security and Canonicalization

   Security is a subtle area.  Some of the problems can be solved in a
   general way, and those solutions are typically incorporated into
   standard security syntaxes such as those for ASN.1 [RFC 2630] and XML
   [XMLDSIG].  But there are application specific questions,
   particularly questions of exactly what information for which you need
   to provide authentication or confidentiality.

   Questions of exactly what needs to be secured and how to do so
   robustly are deeply entwined with canonicalization.  They are also
   slightly different for authentication and encryption, as discussed
   below.



2.4.1 Canonicalization

   Canonicalization is the transformation of the information in a
   message into a "standard" form, discarding "insignificant"
   information.  For example, encoding into a standard character set or
   changing line endings into a standard encoding and discarding the
   information as to what the original character set or line ending
   encodings were.  Obviously, what is "standard" and what is
   "insignificant" varies with the application or protocol and can be
   tricky to determine.  (However, for a particular syntax, such as
   ASCII [ASCII], ASN.1 [ASN.1], or XML [XML], it is common for a
   standard canonicalization to be specified or developed through
   practice.  This
   leads to the design of applications that assume such standard
   canonicalization and in turn reduces the need for per-application
   canonicalization.)
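
   As an illustration, a hypothetical per-application canonicalization
   might standardize line endings, strip trailing white space, and
   encode into UTF-8, discarding exactly that information as
   "insignificant":

```python
def canonicalize(text):
    # Standardize line endings: CRLF and bare CR both become LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Discard trailing white space on each line ("insignificant" by
    # this application's rules, by fiat) and encode into a standard
    # character encoding (UTF-8).
    return "\n".join(line.rstrip() for line in text.split("\n")).encode("utf-8")

# Two variant encodings of the "same" text canonicalize identically.
assert canonicalize("Hello \r\nWorld") == canonicalize("Hello\nWorld")
```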

   DOCUM: From the document point of view, canonicalization is suspect
      if not outright evil.  After all, if you have a piece of paper
      with writing on it, any modification to "standardize" its format
      can be an unauthorized change in the original message as created
      by the author.  From the document point of view, digital
      signatures are like authenticating signatures or seals or time
      stamps on the bottom of the "piece of paper".  They do not justify
      and should not depend on the slightest change in the message
      appearing above them.  Similarly, from the document point of view,
      encryption is just putting the "piece of paper" in a vault that
      only certain people can open, and does not justify any
      standardization or canonicalization of the message.

   PROTO: From the protocol point of view, you have a pattern of bits
      that are calculated, processed / stored / communicated, and
      finally parsed and acted on.  Most of these bits have never been
      seen and never will be seen by a person.  In fact, many of the
      parts of the message will be artifacts of encoding, protocol
      structure, and computer representation rather than anything
      intended for a person to see.  In theory, the "original"
      idiosyncratic form of any digitally signed part could be conveyed
      unchanged through the computer process, storage, and
      communications channels which implement the protocol and usefully
      signed in that form.  But in practical systems of any complexity,
      this is unreasonably difficult for parts of some messages. Thus,
      signed data must be canonicalized as part of signing and
      verification to compensate for insignificant changes made in
      processing, storage, and communication.  Even if, miraculously, an
      initial system design avoids all cases of signed message part
      reconstruction based on processed data or re-encoding based on
      character set or line ending or capitalization or numeric
      representation or time zones or whatever, later protocol revisions
      and extensions are certain to require such reconstruction and/or
      re-encoding.  Therefore, canonicalization is simply a necessity.
      The only question is exactly which canonicalization or
      canonicalizations to use.



2.4.2 Digital Authentication

   DOCUM: The documentist's view of authentication tends to be a
      "digital signature" and "forms" point of view.  Since documentists
      are always worried about third parties and viewing the document in
      isolation, they want the "digital signature" characteristics of
      "non-repudiability", etc. (See any standard reference on the
      subject for the usual meaning of these terms in this context.)
      From their point of view, you have a document or form which people
      sign.  Sometimes a signature covers only part of a form, but
      that's usually because a signature can only cover data which is
      already there.  And normally at least one signature covers the
      "whole" document/form.  Thus they want to be able to insert
      digital signatures into documents without changing the document
      type.

   PROTO: From a protocol point of view, the right kind of
      authentication to use, whether "digital signature" or symmetric
      keyed authentication code or whatever, is just another design
      decision affected by questions of efficiency, desired security
      model, etc.  Furthermore, the concept of signing a "whole" message
      seems very peculiar (unless it is a copy being saved for archival
      purposes in which case you might be signing a whole archive at
      once anyway).  Typical messages are made up of various pieces with
      various destinations, sources, and security requirements.
      Furthermore, there are normally fields you can't sign because they
      change as the message is communicated and processed, such as hop
      counts or routing history.  Certainly different kinds of
      authentication are normally mixed in one message.







2.4.3 Canonicalization and Digital Authentication

   For authenticating protocol system messages of practical complexity,
   you are faced with the choice of
      (1) doing no canonicalization and having brittle authentication,
   useless due to insignificant failures to verify, or
      (2) doing the sometimes difficult and tricky work of selecting or
   designing an appropriate canonicalization or canonicalizations to be
   used as part of authentication generation and verification, producing
   robust and useful authentication.
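
   A sketch of choice (2): canonicalize, then authenticate the
   canonical form.  (A symmetric keyed authentication code stands in
   here for whatever mechanism a real protocol would specify, and the
   canonicalization rule is a toy one.)

```python
import hashlib
import hmac

def canonicalize(text):
    # Toy canonicalization: standardize line endings, encode as UTF-8.
    return text.replace("\r\n", "\n").replace("\r", "\n").encode("utf-8")

def make_mac(key, text):
    # HMAC-SHA256 over the canonical form, not over the raw bytes.
    return hmac.new(key, canonicalize(text), hashlib.sha256).hexdigest()

def verify(key, text, mac):
    return hmac.compare_digest(make_mac(key, text), mac)

key = b"shared-secret"
mac = make_mac(key, "amount: 10\r\ncurrency: USD")   # sender used CRLF

# A relay rewrote the line endings; verification still succeeds because
# both sides compare canonical forms.  Without canonicalization, this
# insignificant change would make the authentication brittle and fail.
assert verify(key, "amount: 10\ncurrency: USD", mac)

# A significant change still fails to verify.
assert not verify(key, "amount: 99\ncurrency: USD", mac)
```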



2.4.4 Encryption

   In terms of processing, transmission, and storage, encryption turns
   out to be much easier than signatures to get working.  Why?  Because
   the output of encryption is essentially random bits and it is clear
   from the beginning that those bits need to be transferred to the
   destination in some absolutely clean way that does not change even
   one bit.  Because the encrypted bits are, by definition, meaningless
   to a human being, there is no temptation for a documentist to try to
   make them "readable".  So appropriate techniques of encoding at the
   source, such as Base64 [RFC 2045], and decoding at the destination,
   are incorporated to protect or "armor" the encrypted data.
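
   For example, armoring hypothetical ciphertext with Base64 [RFC 2045]
   produces a pure printable ASCII form that text-oriented channels
   will not alter, and decoding recovers the bits exactly:

```python
import base64
import os

ciphertext = os.urandom(32)   # stand-in for the output of real encryption

# Armor: encode the opaque bits as printable ASCII for transport.
armored = base64.b64encode(ciphertext).decode("ascii")

# Decode at the destination: the original bits are recovered exactly,
# without even one bit of change.
assert base64.b64decode(armored) == ciphertext
assert armored.isascii()
```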

   While the application of canonicalization is more obvious with
   digital signatures, it may also apply to encryption, particularly
   encryption of parts of a message.  Sometimes elements of the
   environment where the encrypted data is found affect its
   interpretation, for example, the character encoding or the bindings
   of dummy symbols.  When the data is decrypted, it may be into an
   environment with a different character encoding and different dummy
   symbol bindings.  With a plain text message part, it is usually clear
   which of these environmental elements need to be incorporated in or
   conveyed with the message.  But an encrypted message part is opaque,
   so some canonical representation that incorporates such environmental
   factors may be needed.

   DOCUM: Encryption of the entire document is usually what is thought
      of, although there are still questions as to whether document
      signatures should be inside or outside of the encryption or both.

   PROTO: Messages are complex composite structures, some pieces of
      which are forwarded multiple hops.  Thus the design question is
      what fields should be encrypted by what techniques to what
      destination or destinations.





2.5 Unique Internal Labels

   It is desirable to be able to reference parts of structured messages
   / objects by some sort of "label" or "id" or "tag".  The idea is that
   this forms a fixed "anchor" that can be used "globally", at least
   within an application domain, to reference the tagged part.

   DOCUM: From the document point of view, it seems logical to just
      provide for a text tag.  The concept would be that users or
      applications could easily come up with short readable tags.  These
      would probably be meaningful to a person if humanly generated
      (e.g., "Susan") and at least fairly short and systematic if
      automatically generated (e.g., "A123").  The ID attribute type in
      XML [XML] appears to have been thought of this way, although it
      can be used in other ways.

   PROTO: From a protocol point of view, unique internal labels look
      very different than they do from a document point of view.  Since
      you should assume that pieces of different protocol messages will
      later be combined in a variety of ways, previously unique labels
      can conflict.  There are really only three possibilities if you
      need such tags, as follows:
      (1) Have a system for dynamically rewriting such tags to maintain
          uniqueness.  This is usually a disaster as it (a) invalidates
          any stored copies of the tags that are not rewritten, and it
          is usually impossible to be sure there aren't more copies
          lurking somewhere you failed to update, and (b) invalidates
          digital signatures that cover a changed tag.
      (2) Use some form of hierarchical qualified tags.  Thus the total
          tag can remain unique even if a part is moved, because its
          qualification changes.  This avoids the digital signature
          problems of possibility (1).  But it can destroy the concept
          of a globally unique anchor embedded in and moving with the
          data, and stored tags may still be invalidated by data moves.
          Nevertheless, for a particular carefully designed protocol,
          such as IOTP [RFC 2801], this can work.
      (3) Construct a lengthy globally unique tag string.  This can be
          done successfully by using a good enough random number
          generator and big enough random tags, or more sequentially,
          as in the way email message IDs are created [RFC 822].
      Thus, from a protocol point of view, such tags are difficult, but
      if you really need them, choice (3) works best.
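
   A sketch of choice (3): build a lengthy tag from a strong random
   source, optionally qualified by a domain in the style of email
   message IDs.  (The tag format and the domain are invented for this
   example.)

```python
import secrets

def make_tag(domain="example.com"):
    # 128 random bits make accidental collision negligibly unlikely;
    # the "@domain" suffix merely follows the style of email message
    # IDs [RFC 822] and is not required for uniqueness.
    return secrets.token_hex(16) + "@" + domain

a, b = make_tag(), make_tag()
assert a != b                      # tags from a good source do not collide
assert a.endswith("@example.com")
```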



3. Examples

   IETF protocols are replete with examples of the protocol viewpoint
   such as TCP [RFC 793], IPSEC [RFC 2411], SMTP [RFC 821], and IOTP
   [RFC 2801-2803].


   An example of something that can easily be viewed both ways, and
   where the best results frequently require attention not only to the
   document but also to the protocol point of view, is the eXtensible
   Markup Language (XML [XML]).

   An example of something designed, to a significant extent, from the
   document point of view is the X.509v3 Certificate [X509v3].



4. Resolution of the Points of View

   There is some merit to each point of view.  Certainly the document
   point of view has some intuitive simplicity and appeal and is fine
   for applications where it meets the needs.

   The protocol point of view can come close to encompassing the
   document point of view as a limiting case.  In particular, as

      the complexity of messages declines to a single payload (perhaps
      with a few attachments) and

      the mutability of the payload declines to some standard format
      that needs no canonicalization and

      the number of parties and amount of processing as messages are
      transferred declines and

      the portion of the message intended for more or less direct human
      consumption increases,

   the protocol point of view would be narrowed to something close to
   the document point of view.  Even when the document point of view is
   questionable, the addition of a few options to a protocol, such as
   optional lack of canonicalization or optional policy statement /
   pointer / semantic label inclusion, will usually satisfy the
   perceived needs of those holding a document point of view.

   On the other hand, the document point of view is hard to stretch to
   encompass the protocol case.  From a document point of view,
   canonicalization is wrong, inclusion of human language policy text
   within every significant object and a meaning with every adjunct
   should be mandatory, etc.  Objects designed in this way are rarely
   suitable for protocol use as they tend to be inefficient (due to
   unnecessary text and self documenting inclusions) and insecure (due
   to brittle signatures).

   Thus, to produce usable protocols, it is best to start with the
   protocol point of view and add only such limited document point of
   view items as are necessary to achieve consensus.


5. Conclusion

   The author hopes that this document will help explain to those of
   either point of view where those with the other view are coming from.
   Perhaps this will decrease conflict, shed some light, in particular
   on the difficulties of security design, and lead to better protocol
   design.


References

   [ASCII] - "USA Standard Code for Information Interchange", X3.4,
   American National Standards Institute: New York, 1968.

   [ASN.1] - "Information Technology - Abstract Syntax Notation One
   (ASN.1): Specification of Basic Notation", ITU-T Recommendation
   X.680, 1997.

   [RFC 793] - "Transmission Control Protocol", J. Postel, Sep-01-1981.

   [RFC 821] - "Simple Mail Transfer Protocol", J. Postel, Aug-01-1982.

   [RFC 822] - "Standard for the format of ARPA Internet text messages",
   D. Crocker, Aug-13-1982.

   [RFC 2045] - "Multipurpose Internet Mail Extensions (MIME) Part One:
   Format of Internet Message Bodies", N. Freed & N. Borenstein,
   November 1996.

   [RFC 2411] - "IP Security Document Roadmap", R. Thayer, N. Doraswamy,
   R. Glenn, November 1998.

   [RFC 2630] - "Cryptographic Message Syntax", R. Housley, June 1999.

   [RFC 2801] - "Internet Open Trading Protocol - IOTP Version 1.0", D.
   Burdett, April 2000.

   [RFC 2802] - "Digital Signatures for the v1.0 Internet Open Trading
   Protocol (IOTP)", K. Davidson, Y. Kawatsura, April 2000.

   [RFC 2803] - "Digest Values for DOM (DOMHASH)", H. Maruyama, K.
   Tamura, N.  Uramoto, April 2000.

   [X509v3] - "ITU-T Recommendation X.509 version 3 (1997), Information
   Technology - Open Systems Interconnection - The Directory
   Authentication Framework",  ISO/IEC 9594-8:1997.

   [XML] - Extensible Markup Language (XML) 1.0 Recommendation. T. Bray,
   J. Paoli, C. M. Sperberg-McQueen. February 1998.
   <http://www.w3.org/TR/1998/REC-xml-19980210>

   [XMLDSIG] - draft-ietf-xmldsig-core-*.txt, D. Eastlake, J. Reagle, D.
   Solo, October 2000.












Author's Address

   The author of this document is:

        Donald E. Eastlake 3rd
        Motorola
        155 Beaver Street
        Milford, MA 01757 USA

        Phone:  +1 508-261-5434 (w)
                +1 508-634-2066 (h)
        Fax:    +1 508-261-4777 (w)
        EMail:  Donald.Eastlake@motorola.com



Expiration and File Name

   This draft expires August 2001.

   Its file name is <draft-eastlake-proto-doc-pov-02.txt>.






























