<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
  <!ENTITY rfc3650 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3650.xml">
  <!ENTITY rfc3651 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3651.xml">
  <!ENTITY rfc3652 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3652.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<?rfc toc="yes" ?>
<?rfc tocindent="yes" ?>
<?rfc tocdepth="2" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes"?>
<?rfc iprnotified="yes" ?>

<rfc category="info" docName="draft-iab-doi-05" ipr="trust200902">
  <front>
    <title abbrev="DOIs for RFCs">Assigning Digital Object Identifiers to RFCs</title>

    <author fullname="John Levine" initials="J." surname="Levine">
      <organization>Taughannock Networks</organization>

      <address>
        <postal>
          <street>PO Box 727</street>
          <city>Trumansburg</city>
          <code>14886</code>
          <region>NY</region>
        </postal>
        <phone>+1 831 480 2300</phone>
        <email>standards@taugh.com</email>
        <uri>http://jl.ly</uri>
      </address>
    </author>

    <date month="August" year="2015" />

    <area>RFC Editor</area>

    <keyword>DOI</keyword>
    <keyword>indexing</keyword>

    <abstract>
       <t>
	  We describe the way that Digital Object Identifiers (DOIs) are assigned to past and future RFCs.
	  The DOI is a widely used system that
	  assigns unique identifiers to digital documents that can be queried
	  and managed in a consistent fashion.
       </t>
    </abstract>
  </front>
  <middle>
     <section title="Introduction">
	<t>
	  The Digital Object Identifier (DOI) system
	  assigns unique identifiers to digital documents that can be queried
	  and managed in a consistent fashion.  The structure of DOIs is
	  defined by <xref target="ISO-DOI">ISO 26324:2012</xref>, and the system is implemented
	  by a group of registration agencies coordinated by the International DOI Foundation.
	</t>
	<t>
	   Each DOI is associated with bibliographic metadata about the object, including one or more URIs
	   where the object can be found.
	   The metadata is stored in a public database with entries retrieved via HTTP.
	</t>
	<t>
	  DOIs are widely used by publishers and consumers of technical journals and other
	  technical material published online.

	  <list style="empty">
	     <t>
	     ``Typical web addresses are unreliable for locating online resources, because
	     they can move, change or disappear entirely. But persistent identifiers are fixed, with an
	     infrastructure that allows for the location of the item to be updated. The result is that
	     the identifier can provide persistent access to the data. DataCite provides
	     such a service, and DOIs (used by DataCite) were by far the identifier most commonly
	     mentioned by interviewees, closely followed by Handles (on which the DOI system is
	     built). There was a keen preference for DOIs from interviewees because this is a system
	     already used and understood by publishers for traditional publications and so the barrier
	     to uptake would presumably be lower than for an entirely novel system.''
	     </t>
	  </list>
	  (From <xref target="CITABILITY" />, page 15, citations omitted.)
	  </t>
	  <t>
	   Some scholarly publishers accept DOIs as references in published documents,
	   and some versions of BibTeX can automatically retrieve the bibliographic
	   data for a DOI and format it.
	   DOIs may have other advantages, such as making it easier to find the free online
	   versions of RFCs rather than paywalled copies when following references or using
	   some document indexes.
	</t>
	<t>
	   The benefits of DOIs apply equally to documents from all of the RFC submission streams,
	   so all RFCs are assigned DOIs.
	</t>
     </section>
     <section title="Structure and resolution of DOIs">
	<t>
	   DOIs are an application of the Handle system defined by RFCs
	   <xref target="RFC3650" />, <xref target="RFC3651" />, and <xref target="RFC3652" />.
	   A DOI for an RFC might be
	</t>
	<figure>
	   <artwork>10.17487/rfc1149</artwork>
	</figure>
	<t>A DOI begins with the number 10, which identifies it as a DOI within the Handle system,
	   followed by a dot and a unique number assigned to a publisher, in this case 17487.
	   Together, these form the DOI prefix.
	   Following the prefix is a slash and a text string assigned by the publisher,
	   called the DOI suffix.
	</t><t>
	   DOIs are treated as opaque identifiers.
	   The DOI suffixes assigned to RFCs are currently based on the doc-id field of the
	   XML index rfc-index.xml, but those for future RFCs might be based on something else
	   if circumstances change.
	   Hence,
	   the reliable way to find the DOI for an RFC is not to guess, but to look it
	   up in the RFC index, or in the bibxml entries generated from the index.
	</t>
	<t>
	   Although the Handle system has its own protocol described in <xref target="RFC3652" />,
	   DOIs are usually looked up over the web.
	   A proposed "doi:" URN was never widely implemented, so the
	   standard way to look up a DOI is to use the public HTTP proxy at https://dx.doi.org.
	   The example DOI above could be looked up at:
	   </t>
	<figure>
	   <artwork>https://dx.doi.org/10.17487/rfc1149</artwork>
	</figure>
	<t>Whenever a publisher assigns a DOI, it provides the bibliographic metadata for the
	   object (henceforth called a document, since that is what the objects are in this
	   context) to its registration agency, which then makes it available to clients
	   that look up DOIs.
	   The document's metadata is typically uploaded to the registration agency in XML
	   using an HTTP-based API.
	   Users or publishing software
	   can retrieve the metadata by fetching the DOI's URL and using standard HTTP
	   content negotiation to request application/citeproc+json, application/rdf+xml, or
	   other bibliographic formats.
	</t>
	<t>
	   Publishers have considerable flexibility as to what resides
	   at the URI(s) that a DOI refers to.
	   Sometimes it's the document itself, while for commercial publishers it's
	   typically a page with the abstract and bibliographic information, and some
	   way to buy the actual document.
	   Since some RFCs are in multiple formats (e.g., PostScript and text),
	   an appropriate URI is that of the RFC Editor's info page that has
	   the document's abstract and links to the document(s) in various formats.
	   Hence the URI above, when requested as text/html, redirects to:
	</t>
	<figure>
	   <artwork>https://www.rfc-editor.org/info/rfc1149</artwork>
	</figure>
	<t>
	   More information on the structure and use of DOIs is in the
	   <xref target="DOI-HB">DOI Handbook</xref>.
	</t>
     </section>
     <section title="DOIs for RFCs">
	<t>With DOIs assigned to each RFC, it is useful to include DOI information
	in the XML bibliography as a "seriesInfo" item, so that
	rendering engines can display it if desired.
	Online databases and indexes that include RFCs, e.g., the ACM Digital Library,
	should be updated to include the DOI.
	(A practical advantage of this is that the DOI would link directly to the RFC
	Editor, rather
	than perhaps to a copy of an RFC behind a paywall.)
	</t>
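	<t>
	   As an illustrative sketch rather than a definitive template, a bibliography
	   entry for an RFC might carry its DOI as a second "seriesInfo" item.
	   The DOI value is the example used in the previous section; the anchor and the layout
	   of the entry are assumptions, and the actual DOI should be taken from the RFC index.
	</t>
	<figure>
	   <artwork><![CDATA[
<reference anchor="RFC1149"
           target="https://www.rfc-editor.org/info/rfc1149">
  <front>
    <title>Standard for the transmission of IP datagrams on
      avian carriers</title>
    <author initials="D." surname="Waitzman"
            fullname="D. Waitzman" />
    <date year="1990" month="April" />
  </front>
  <seriesInfo name="RFC" value="1149" />
  <!-- DOI copied from the index, not derived from the number -->
  <seriesInfo name="DOI" value="10.17487/rfc1149" />
</reference>
]]></artwork>
	</figure>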
	<t>
	   Since RFCs are immutable, existing RFCs still don't mention their own DOIs within
	   the RFCs themselves, but putting their DOIs into indexes provides value.
	</t>
     </section>
     <section title="The process of assigning DOIs">
	<t>There are three phases to assigning DOIs to RFCs:
	   getting a DOI prefix, retroactively assigning DOIs to existing documents,
	   and updating the publication process to assign DOIs as new RFCs are published.
	</t>
	<section title="Getting a DOI prefix">
	   <t>
	   There are ten <xref target='DOI-RA'>registration agencies</xref> that assign
	   DOI prefixes.
	   Most of them serve specialized audiences or limited geographic areas, but there
	   are a few that handle scholarly and technical materials.
	   The RFC Editor chose Crossref, an agency widely used by journal publishers.
	   All registration agencies charge for DOIs to defray the cost of maintaining
	   the metadata databases.
	   Crossref publishes its price list;
	   the prices are on the order of $660 per year for membership,
	   with deposit fees of 15 cents per document
	   for a bulk upload of the backfile (the existing RFCs) and $1 per document for
	   RFCs deposited as they are published.
	   </t>
	   <t>
	      The RFC Editor's DOI prefix is 10.17487.
	   </t>
	</section>
	<section title="Retroactively assigning DOIs">
	   <t>
	      Other than paying the deposit fees,
	      assigning DOIs to all of the existing RFCs was primarily a software problem.
	      We updated the RFC Production Center's internal database to include a DOI
	      field for each RFC, changed the schema for the XML index rfc-index.xml
	      to include a DOI
	      field, and updated the scripts that create the XML and text indexes to include the DOI
	      for each RFC.
	      A specialized DOI submission script extracted the metadata for all of the RFCs
	      from the XML index and
	      submitted it to the registration agency using the agency's online API.
	   </t>
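	   <t>
	      As a rough sketch of the result, an index entry with the added DOI field might
	      look like the following. Only the doc-id and DOI fields are named in the text
	      above; the other element names, the doc-id value, and the overall layout are
	      assumptions made for illustration and may not match the actual schema of
	      rfc-index.xml.
	   </t>
	   <figure>
	      <artwork><![CDATA[
<rfc-entry>
  <!-- doc-id value shown here is assumed, not quoted -->
  <doc-id>RFC1149</doc-id>
  <title>Standard for the transmission of IP datagrams on
    avian carriers</title>
  <author>
    <name>D. Waitzman</name>
  </author>
  <date>
    <month>April</month>
    <year>1990</year>
  </date>
  <!-- the new field; value is this document's example DOI -->
  <doi>10.17487/rfc1149</doi>
</rfc-entry>
]]></artwork>
	   </figure>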
	</section>
	<section title="Assigning DOIs to new RFCs">
	   <t>
	      As RFCs are published, the publication software assigns a
	      DOI to each new RFC.
	      The submission script extracts the metadata for new RFCs from the XML index and submits
	      the information for new RFCs to the registration agency.
	   </t>
	</section>
	<section title="Use of DOIs in RFCs">
	   <t>
	      The DOI agency requests that documents that are assigned DOIs in turn
	      include DOIs, when possible, when referring to other organizations' documents.
	      DOIs can be listed using the existing seriesInfo field in the xml2rfc reference
	      entity, and authors are requested to provide DOIs for non-RFC documents when possible.
	      The RFC Production Center might add missing DOIs when it's easy to do so,
	      e.g., when the same reference with a DOI has appeared in a prior RFC, or
	      a quick online search finds the DOI.
	      With DOIs in the xml2rfc reference databases, DOIs in references from citation
	      libraries can appear in the RFCs automatically.
	   </t>
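	   <t>
	      For a non-RFC document, the DOI can be the only "seriesInfo" item.
	      The sketch below mirrors this document's own reference to the DOI Handbook;
	      it is meant as one plausible form, not as a rule of the RFC Style Guide.
	   </t>
	   <figure>
	      <artwork><![CDATA[
<reference anchor="DOI-HB" target="http://www.doi.org/hb.html">
  <front>
    <title>DOI Handbook</title>
    <author>
      <organization>International DOI Foundation</organization>
    </author>
    <date year="2012" month="April" />
  </front>
  <!-- DOI as supplied by the publisher of the cited work -->
  <seriesInfo name="DOI" value="10.1000/182" />
</reference>
]]></artwork>
	   </figure>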
	   <t>
	      The RFC Style Guide will be updated to describe the rules for including DOIs
	      in the References sections of RFCs.
	   </t>
	   <t>
	      Since it is usually possible to retrieve the bibliographic information
	      for a document from its DOI (as BibTeX can do, as described above), it might
	      also be worth adding this feature to xml2rfc, so that the bibliographic data for
	      a reference with only a DOI could be automatically fetched and expanded.
	   </t>
	</section>
     </section>
     <section title="Internationalization">
	<t>Adding DOIs presents no new internationalization issues.
	</t>
	<t>Since DOIs are opaque, the characters used in any particular DOI are unimportant beyond
	   ensuring that they can be represented where needed.
	   The Handle system says they are UTF-8 encoded Unicode, but in practice all DOIs appear
	   to use only printable ASCII characters.
	   The metadata for each RFC is uploaded as UTF-8 encoded XML.
	</t>
     </section>
  </middle>
       
  <back>
    <references title="Informative References">
    &rfc3650;
    &rfc3651;
    &rfc3652;
    <reference anchor='ISO-DOI' target='http://www.iso.org/iso/catalogue_detail?csnumber=43506'>
       <front>
	  <title>ISO 26324:2012
	     Information and documentation -- Digital object identifier system</title>
	  <author><organization>International Organization for Standardization (ISO)</organization></author>
	  <date year='2012' />
       </front>
    </reference>
      
    <reference anchor='DOI-HB' target='http://www.doi.org/hb.html'>
       <front>
	  <title>DOI Handbook</title>
	  <author><organization>International DOI Foundation</organization></author>
	  <date year='2012' month='April' />
       </front>
       <seriesInfo name="DOI" value="10.1000/182" />
    </reference>
    <reference anchor='DOI-RA' target='http://www.doi.org/registration_agencies.html'>
       <front>
	  <title>DOI Registration Agencies</title>
	  <author><organization>International DOI Foundation</organization></author>
	  <date year='2013' month='July' />
       </front>
    </reference>
    <reference anchor='CITABILITY' target='http://www.stm-assoc.org/2012_07_10_STM_Research_Data_Group_Data_Citation_and_Evolving_Roles_ODE_Report.pdf'>
       <front>
	  <title>Report
	     on best practices for citability of data and on evolving roles in scholarly
	     communication
	  </title>
	  <author initials='R.' surname='Kotarski' fullname='R. Kotarski' />
	  <author initials='S.' surname='Reilly' fullname='S. Reilly' />
	  <author initials='S.' surname='Schrimpf' fullname='S. Schrimpf' />
	  <author initials='E.' surname='Smit' fullname='E. Smit' />
	  <author initials='K.' surname='Walshe' fullname='K. Walshe' />
	  <date year='2012' />
       </front>
    </reference>
    </references>
    <section title="Change Log">
       <t>Remove this section before publication, please.</t>
      <section title="Changes from -04 to -05">
	 <t>
	    Editorial clarifications, reorganize first part and add quote from [CITABILITY].
	    Add i18n section.
	 </t>
      </section>
      <section title="Changes from -03 to -04">
	 <t>
	    Make the rest of everything present tense.  Fix typos, note that RSE
	    style guide will include use of DOIs.
	 </t>
      </section>
      <section title="Changes from -02 to -03">
	 <t>
	    Make everything present tense, minor adjustments to reflect reality.
	 </t>
      </section>
      <section title="Changes from -01 to -02">
	 <t>Clarify submission process, multi-document DOIs.
	    Note all streams treated the same.
	    Remove unused reference.
	 </t>
      </section>
      <section title="Changes from -00 to -01">
        <t>DOI in the xml, not necessarily in the text
	</t>
        <t>Use of DOI in RFCs section.
	</t>
      </section>
    </section>
  </back>
</rfc>
