Internet Draft                                              Tim Kindberg
Document: draft-kindberg-tag-uri-04.txt      Hewlett-Packard Corporation
Expires: March 1, 2003                                      Sandro Hawke
                                               World Wide Web Consortium
                                                          September 2002


                 The 'tag' URI scheme and URN namespace


STATUS OF THIS MEMO

   This document is an Internet-Draft and is in  full  conformance  with
   all provisions of Section 10 of RFC2026.

   This document replaces draft-kindberg-tag-uri-03.txt.

   Internet-Drafts are working documents  of  the  Internet  Engineering
   Task  Force  (IETF),  its  areas,  and its working groups.  Note that
   other groups may  also  distribute  working  documents  as  Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and  may be updated, replaced, or obsoleted by other documents at any
   time.  It  is  inappropriate  to  use  Internet-Drafts  as  reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
        http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
        http://www.ietf.org/shadow.html.

   This Internet-draft will expire on March 1, 2003.

   Copyright Notice Copyright (C) The Internet Society (2001). All
   Rights Reserved.

   DISCLAIMER. The views and opinions of authors expressed herein do not
   necessarily state or reflect those of the World Wide Web Consortium,
   and may not be  used for advertising or product endorsement purposes.
   This proposal  has not undergone technical review within the
   Consortium and must not be construed as a Consortium recommendation.

ABSTRACT

   This document describes the "tag" Uniform Resource Identifier (URI)
   scheme and the "tag" Uniform Resource Name (URN) namespace, for
   identifiers that are unique across space and time. Tag URIs (also



Kindberg         Informational - Expires March 1, 2003          [Page 1]

Internet-Draft                    tags                    September 2002


   known as "tags") are distinct from most other URIs in that they are
   intended for uses that are independent of any particular method for
   resource location or name resolution. A tag URI may be used purely as
   an entity identifier. It may also be presented to services for
   resolution into a web resource or into one or more further URIs, but
   no particular resolution scheme is implied or preferred by a tag URI
   itself. Unlike UUIDs or GUIDs such as "uuid" URIs and "urn:oid" URIs,
   which also have some of the above properties, tag URIs are designed
   to be tractable to humans. Furthermore, they have many of the
   desirable properties that "http" URLs have when used as identifiers,
   but none of the drawbacks.


0. TERMINOLOGY
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.


1. INTRODUCTION

   A tag is a type of Uniform Resource Identifier (URI) [RFC2396]
   designed to meet the following requirements:

   1) Identifiers are unique across space and time and come from a
      practically inexhaustible supply;
   2) identifiers are relatively convenient for humans to mint (create),
      read, type, remember etc.;
   3) zero registration cost, at least to holders of domain names or
      email addresses; and negligible cost to mint new identifiers;
   4) independence of any particular resource-location or identifier-
      resolution scheme.

   For example, the above requirements may apply in the case of a user
   who wants to place identifiers on their documents:

   A) They want to be sure that the identifier is unique. Global
      uniqueness is valuable because it guarantees that no identifier
      can conflict with another, no matter how resources become shared.
   B) It is useful for the identifier to be tractable to humans: they
      should be able to mint new identifiers conveniently, and to type
      them into emails and forms.
   C) They do not want to have to communicate with anyone else in order
      to mint identifiers for their documents.
   D) As a good net citizen, the user does not want to use an identifier
      that might be assumed by software to imply the existence of a
      corresponding resource in a default binding scheme – so that an
      attempt to retrieve that resource is likely but doomed to failure.



Kindberg         Informational - Expires March 1, 2003          [Page 2]

Internet-Draft                    tags                    September 2002


      Of course, this leaves them free to exploit the identifier in
      particular applications and services, where the context is clear.

   Existing identification schemes satisfy some but not all of the
   general requirements 1-4. For example:

   UUIDs [UUID, ISO11578] are hard for humans to read.

   OIDs [OID, RFC3061] and Digital Object Identifiers [DOI] require
   naming authorities to register themselves, even if they already hold
   a domain name registration.

   URLs (in particular, "http" URLs) are sometimes used as identifiers
   that satisfy most of our requirements. Many users and organisations
   have already registered a domain name, and the use of the domain name
   to mint identifiers comes at no additional cost. But there are
   drawbacks to URLs-as-identifiers:

   1) Software might try to dereference a URL-as-identifier, even though
      there is no resource at the "location".
   2) The new holder of a domain name can't be sure that they are
      minting new names. If Smith registers champignon.net and then
      Jones registers it, how can Jones know, in general, whether Smith
      has already used http://champignon.net/99?


1.1 OUTLINE

   Section 2 gives a specification for tags: their syntax and the rules
   governing their creation and comparison. Section 3 revisits the
   requirements outlined above and shows that the tag specification
   meets them. Section 4 explains how tags are embedded in the URN
   namespace, by adopting "tag" as a URN namespace identifier. Section 5
   covers security considerations. Appendix A contains a URN namespace
   registration request for "tag".


1.2 FURTHER INFORMATION AND DISCUSSION OF THIS DOCUMENT

   Information about the tag URI scheme additional to this document --
   motivation, genesis and discussion -- can be obtained from
   http://www.taguri.org.

   Earlier drafts of this document have been discussed on uri@w3.org.
   The authors welcome further discussion and comments.






Kindberg         Informational - Expires March 1, 2003          [Page 3]

Internet-Draft                    tags                    September 2002


2. THE 'TAG' URI SCHEME

   Examples of tag URIs are:

         tag:hpl.hp.com,2001:tst.1234567890
         tag:hp.com,2000-12-30:tst.1234567890
         tag:exploratorium.edu,2001-06:pi.99
         tag:fred@flintstone.biz,2001-07-02:rock.123
         tag:sandro@w3.org,2001:Sandro
         tag:myIDs.com,2001-09-01:TimKindberg/doc.101

   Each tag consists of a "tag authority" followed, optionally, by a
   specific identifier. The tag authority consists of an "authority
   name" -- a fully qualified domain name or an email address containing
   a fully qualified domain name -- followed by a date. The tag
   authority is globally unique because domain names and email addresses
   are assigned to at most one entity at a time. That entity can be sure
   of minting unique identifiers.

   The date specifies, according to the Gregorian calendar, any
   particular day on which the authority name was assigned to the
   minting entity at 00:00 UTC (the start of the day). The date is
   specified using one of the "YYYY", "YYYY-MM" and "YYYY-MM-DD" formats
   allowed by the ISO 8601 standard [ISO8601]. The tag specification
   permits no other formats.

   In the interests of brevity, the month and day default to "01". A day
   value of 01 MAY be omitted; a month value of 01 MAY be omitted unless
   it is followed by a day value other than 01. For example, "2001-07"
   is the date 2001-07-01 and "2000" is the date 2000-01-01. All dates
   specify a moment (00:00) of a single day; they MUST NOT be taken as
   periods of a day or more, such as "the whole of July 2001" or "the
   whole of 2000".

   It is RECOMMENDED that tag authorities use only one formulation for a
   given date, since alternative formulations of the same date will be
   counted as distinct and hence tags containing them will be unequal.
   For example, tags beginning "tag:hp.com,2000:" are never equal to
   those beginning "tag:hp.com,2000-01-01:", even though they refer to
   the same date.

   A tag authority mints specific identifiers that are unique within its
   context, in accordance with any internal scheme that uses only URI
   characters. Some tag authorities (e.g. corporations, mailing lists)
   consist of many people, in which case group decision-making and
   record-keeping procedures are required to achieve uniqueness.





Kindberg         Informational - Expires March 1, 2003          [Page 4]

Internet-Draft                    tags                    September 2002


   Entities that were assigned an authority name at 00:00 UTC -- the
   start -- of a given date MAY mint tags rooted at that date-qualified
   name. An entity MUST NOT mint tags under an authority name that was
   assigned to a different entity on the given date, and it MUST NOT
   mint tags under a future date.

   An entity that acquires an authority name immediately after a period
   during which the name was unassigned MAY mint tags as if the entity
   was assigned the name during the unassigned period. This practice has
   considerable potential for error and MUST NOT be used unless the
   entity has substantial evidence that the name was unassigned during
   that period. The authors are currently unaware of any mechanism that
   would count as evidence, other than daily polling of the "whois"
   registry.

   For example, Hewlett-Packard holds the domain registration for hp.com
   and may mint any tags rooted at that name with a current or past date
   when it held the registration. It must not mint tags such as
   "tag:champignon.net,2001:" under domain names not registered to it.
   It must not mint tags dated in the future, such as
   "tag:hp.com,2999:". If it obtains assignment of
   "extremelyunlikelytobeassigned.org" on 2001-05-01, then it must not
   mint tags under "extremelyunlikelytobeassigned.org,2001-01-01" unless
   it has evidence proving that that name was continuously unassigned
   between 2001-01-01 and 2001-05-01.

   The general syntax of a tag URI, in ABNF, is:

         tagURI        = "tag:" tagAuthority ":" [specific]
   Where:
         tagAuthority  = authorityName "," date
         authorityName = DNSname / emailAddress
         date          = 4*dig ["-" 2*dig ["-" 2*dig ]] ; [ISO8601]
         DNSname       = DNScomp / DNSname "." DNScomp  ; [RFC1035]
         DNScomp       = lowAlphaNum [*(lowAlphaNum /"-") lowAlphaNum]
         emailAddress  = 1*(lowAlphaNum /"-"/"."/"_") "@" DNSname
         lowAlphaNum   = dig / lowAlpha
         specific      = 1*(URIchars)  ; URIchars defined in [RFC2396]
         lowAlpha      = %x61-7A ; any char in the range "a" through "z"
         dig           = %x30-39 ; any char in the range "0" through "9"


   The component "tagAuthority" is the name space part of the URI. To
   avoid ambiguity, this MUST be expressed in lower case; the domain
   name in "authorityName" (whether an email address or a simple domain
   name) MUST be fully qualified.





Kindberg         Informational - Expires March 1, 2003          [Page 5]

Internet-Draft                    tags                    September 2002


   Authority names could, in principle, belong to any syntactically
   distinct namespaces whose names are assigned to a unique entity at a
   time. Those include, for example, certain IP addresses, certain MAC
   addresses, and telephone numbers. However, to simplify the tag
   scheme, we restrict authority names to be domain names and email
   addresses. Future standards efforts may allow use of other authority
   names following syntax that is disjoint from this syntax. To allow
   for such developments, software that processes tags MUST NOT reject
   them on the grounds that they are outside the syntax defined above.

   The component "specific" is the name-space-specific part of the URI:
   it is any string of valid URI characters [RFC2396] chosen by the
   minter of the URI. It is RECOMMENDED that specific identifiers should
   be human-friendly.


2.1 EQUALITY OF TAGS

   As with URIs in general, different tags may identify the same thing.
   Tags themselves, however, are simply strings of characters and are
   considered equal if and only if they are completely indistinguishable
   in their machine representations. That is, one can compare tags for
   equality by comparing the numeric codes of their characters, in
   sequence, for numeric equality.

   This equality-criterion allows for simplification of tag-handling
   software, which does not have to transform tags in any way to compare
   them. However, it puts the onus on minting authorities not to
   inadvertently create tags that were intended to be identical but are
   unequal, such as "tag:sandro@w3.org,2001-01-01:Sandro" and
   "tag:sandro@w3.org,2001:Sandro".

   Organizations may choose to define equivalances of tags, saying for
   instance that "tag:sandro@w3.org,2001-01-01:Sandro" and
   "tag:sandro@w3.org,2001:Sandro" should be treated as synonyms, but
   the two tags themselves remain distinct. Conventions for asserting
   equivalences like this are beyond the scope of this document.


3. MEETING REQUIREMENTS 1-4

   Requirement 2 of Section 1 -- convenience for humans -- is met by the
   URL-like syntax for tag authorities. However, the onus is on
   individual naming authorities to use human-friendly specific
   identifiers.

   Requirement 3 -- negligible costs -- follows from use of domain names
   and email addresses. Those identifiers are already held by many



Kindberg         Informational - Expires March 1, 2003          [Page 6]

Internet-Draft                    tags                    September 2002


   individuals and organisations and are cheap to obtain. Specific
   identifiers may be minted without communication with any other
   entity.

   Requirement 4 -- independence of resolution schemes -- is asserted by
   definition. However, this state of affairs is subject to actual usage
   conventions.

   Requirement 1 specifies uniqueness over space and time. Tag URIs meet
   that requirement by using uniquely assigned authority names and by
   handling transfers of their assignment, e.g. the transfer of a domain
   name's registration from one entity to another. The date is used to
   guarantee uniqueness of "tagAuthority" across assignments of the
   authority name.

   For example, suppose that on November 2, 2001, the champignon.net
   domain registration becomes assigned to a new entity. That entity
   must qualify the domain name with a date on which it is or was
   assigned to it, to ensure that its tag authority is and will remain
   unique. In particular, it must take care not to use defaults in such
   a way as to specify an earlier date. For example, the new assignee of
   champignon.net may use 2001-11-02, 2001-12 or 2002 (assuming it
   retains the assignment) but not 2001 or 2001-11.


4. TAG URNS

   The tag namespace lends itself to embedding in the URN namespace
   [RFC2141] by adopting "tag" as a URN namespace identifier, to form a
   set of tag-based URIs that inherit the URN property of denoting one
   resource persistently in any given naming context.

   The syntax for tag URNs is thus:

         "urn:tag:" tagAuthority ":" [specific]

   with syntactic components as defined in Section 2. For example,
   urn:tag:timothy@hpl.hp.com,2001:fred is a tag URN. Any binding of
   that URI must be guaranteed to be persistent.

   Appendix A contains a completed URN namespace identifier registration
   template, requesting assignment of "tag".


5. SECURITY CONSIDERATIONS

   Minting a tag, by itself, is an operation internal to the minting
   entity with no external consequences. The consequences of using an



Kindberg         Informational - Expires March 1, 2003          [Page 7]

Internet-Draft                    tags                    September 2002


   improperly minted tag (due to malice or error) in an application
   depends on the application, and must be considered in the design of
   any application that uses tags.

   There is a significant possibility of minting errors by people who
   fail to apply the rules governing minting dates, or who use a shared
   (organizational) authority-name without prior organization-wide
   agreement. Tag-aware software MAY help catch and warn against these
   errors. (As stated in Section 2, however, to allow for future
   expansion, software MUST NOT reject tags which do not conform to the
   syntax specified in Section 2. It MAY however have a "lint" mode,
   where it checks URIs for conformance to particular specifications and
   notifies the person requesting the check of abnormalities.)


REFERENCES

   [DOI]       Norman Paskin (1997). Information Identifiers. Learned
               Publishing, Vol. 10, No. 2, pp. 135-156, April. See also
               www.doi.org.

   [ISO11578]  ISO (International Organization for Standardization).
               ISO/IEC 11578:1996. "Information technology - Open
               Systems Interconnection - Remote Procedure Call (RPC)"

   [ISO8601]   ISO (International Organization for Standardization). ISO
               8601:1988. Data elements and interchange formats --
               Information interchange -- Representation of dates and
               times.

   [OID]       ITU-T recommendation X.208 (ASN.1). See also RFC 1778.

   [RFC822]    David H. Crocker (1982). Standard for the format of ARPA
               Internet text messages.

   [RFC1035]   P. Mocapetris (1987). Domain Names - implementation and
               specification.

   [RFC2141]   R. Moats (1997). URN syntax.

   [RFC2396]   T. Berners-Lee, R. Fielding, L. Masinter (1998). Uniform
               Resource Identifiers (URI): Generic Syntax.

   [RFC3061]   M. Mealling (2001). A URN Namespace of Object
               Identifiers.

   [UUID]      Paul Leach, Rich Salz (1997). UUIDs and GUIDs. Internet-
               Draft Draft-leach-uuids-01.



Kindberg         Informational - Expires March 1, 2003          [Page 8]

Internet-Draft                    tags                    September 2002


AUTHORS' ADDRESSES

   Tim Kindberg
   Hewlett-Packard Laboratories
   1501 Page Mill Road, MS 1138
   Palo Alto, CA 94304, USA
   Tel:   +1 650 857-5609
   Email: timothy@hpl.hp.com

   Sandro Hawke
   World Wide Web Consortium
   200 Technology Square
   Cambridge, MA 02139, USA
   Tel:   +1 617 253-7288
   Email: sandro@w3.org


APPENDIX A: REQUEST FOR REGISTRATION OF 'TAG' URN NAMESPACE IDENTIFIER

   Namespace ID:

      tag

   Registration Information:

      Version 3
      Date: 1 August 2002

   Declared registrant of the namespace:

      Name:          Tim Kindberg
      E-mail:        timothy@hpl.hp.com
      Affiliation:   Hewlett-Packard Company
      Address:       1501 Page Mill Rd
                     Palo Alto, Ca 94304, USA

   Declared registrant of the namespace:

      Name:          Sandro Hawke
      E-mail:        sandro@w3.org
      Affiliation:   World Wide Web Consortium
      Address:       200 Technology Square
                     Cambridge, MA 02139, USA

   Declaration of structure:






Kindberg         Informational - Expires March 1, 2003          [Page 9]

Internet-Draft                    tags                    September 2002


      The identifier structure is as follows:

      "urn:tag:" authorityName "," date ":" [specific] (see Section 2)

   Relevant ancillary documentation:

      See Sections 1-4 above and References section below.

   Identifier uniqueness considerations:

      Guaranteed by uniqueness of assignment of a domain name or email
      address on a given date -- as long as the "specific" string is
      locally unique. See Section 3 above.

   Identifier persistence considerations:

      A tag URN persistently designates the resource to which the minter
      of the tag bound it. This Draft does not mandate any particular
      protocol for effecting that binding.

   Process of identifier assignment:

      Assignment of these URNs is delegated to entities that hold domain
      names or email addresses (See Section 2).

   Process for identifier resolution:

      No resolution procedures are mandated.

   Rules for Lexical Equivalence:

      Character-by-character equality. See section 2.1 above.

   Conformance with URN Syntax:

      No special considerations.

   Validation mechanism:

      None specified.

   Scope:

      Global.







Kindberg         Informational - Expires March 1, 2003         [Page 10]