Network Working Group                                            M. Wahl
INTERNET-DRAFT                                       Critical Angle Inc.
                                                                T. Howes
                                           Netscape Communications Corp.
Expires in six months from                              10 November 1996
Intended Category: Standards Track


                     Use of Language Codes in LDAP
                  <draft-ietf-asid-ldapv3-lang-00.txt>

1. Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working 
   documents of the Internet Engineering Task Force (IETF), its areas, and
   its working groups.  Note that other groups may also distribute working
   documents as Internet-Drafts.
 
   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference material
   or to cite them other than as "work in progress."
 
   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing  contained in the Internet-Drafts Shadow
   Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe),
   ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

2. Abstract

   The Lightweight Directory Access Protocol [1] provides a means for 
   clients to interrogate and modify information stored in a distributed
   directory system.  The information in the directory is maintained as 
   attributes [2] of entries.  Most of these attributes have syntaxes which
   are human-readable strings, and it is desirable to be able to indicate the
   natural language associated with attribute values.  

   This document describes how language codes [3] are carried in LDAP and are
   to be interpreted by LDAP servers.  All implementations must be prepared to
   accept language codes in the LDAP protocols.  Servers may or may not be 
   capable of storing attributes with language codes in the directory.

3. Language Codes

   Section 2 of RFC 1766 [3] describes the language code format which is used 
   in LDAP.  Briefly, it is a string of ASCII alphabetic characters and 
   hyphens.  Examples include "fr", "en-US" and "ja-JP". 

   Language codes are case insensitive.  For example, the language code "en-us"
   is the same as "EN-US" and "en-US".  One language code is a prefix of 
   another if both codes are equal up to the length of the first code.  For
   example, the language code "en" is a prefix of the language codes "en-us" 
   and "EN-US".





Wahl, Howes                                                        [Page 1]   

INTERNET-DRAFT            Use of Language Codes in LDAP       November 1996

   Implementations must not otherwise interpret the structure of the code when
   comparing two codes, but should treat them as simply strings of characters.
   Client and server implementations must allow any arbitrary string which 
   follows the patterns given in RFC 1766 to be used as a language code.
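
   The comparison rules in this section can be sketched as follows.  This
   is an illustrative Python sketch, not part of the protocol; the function
   names same_language_code and is_prefix are hypothetical.  Following the
   text literally, the prefix test is a plain case-insensitive string
   comparison and does not inspect subtag boundaries.

```python
def same_language_code(a: str, b: str) -> bool:
    """Language codes are compared without regard to case."""
    return a.lower() == b.lower()


def is_prefix(first: str, second: str) -> bool:
    """True if both codes are equal up to the length of the first code."""
    return second.lower().startswith(first.lower())


print(same_language_code("en-us", "EN-US"))  # True
print(is_prefix("en", "EN-US"))              # True
print(is_prefix("en-US", "en"))              # False
```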

4. Use of Language Codes in LDAP
  
   This section describes how LDAP implementations must interpret language 
   codes in performing operations.  

   In general, an attribute with a language code is to be treated as a subtype 
   of the attribute without a language code.  If a server does not support 
   storing language codes with attribute values in the DIT, then it must 
   always treat an attribute with a language code as an unrecognized attribute.

   Clients may request the use of a particular language through the 
   preferredLanguage service control.  This control determines how the server
   interprets attributes without an explicit language parameter.  The details
   of this interaction for specific operations are given below.

4.1. Attribute Description

   An attribute consists of a type, a list of options for that type, and a 
   set of one or more values.  In LDAP, the type and the options are combined
   into the AttributeDescription, defined in section 4.1.4 of [1]. This is 
   represented as an attribute type name and a possibly-empty list of
   options.  One of these options associates a natural language with values
   for that attribute. 

        <language-option> ::= "lang=" <lang-code>

        <lang-code> ::= <printable-ascii> -- a code as defined in RFC 1766

   There can be at most one language option present in an AttributeDescription.

   The language code has no effect on the character set encoding for string 
   representations of DirectoryString syntax values; the UTF-8 representation
   of UniversalString (ISO 10646) is always used.

   Examples of valid AttributeDescription:
        givenName;lang=en-US
        CN;lang=ja-JP-kanji
        CN;lang=ja-JP-romaji

   In LDAP and in examples in this document, a directory attribute is 
   represented as an AttributeDescription with a list of values.  Note that 
   the data may be stored in the LDAP server in a different representation.
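
   As an illustration of the grammar above, the following Python sketch
   splits an AttributeDescription into its type, its language code and any
   remaining options.  The function name parse_attribute_description is
   hypothetical, and matching the "lang=" keyword without regard to case is
   an assumption rather than something this document states.

```python
from typing import List, Optional, Tuple


def parse_attribute_description(desc: str) -> Tuple[str, Optional[str], List[str]]:
    """Split "type;option;..." into (type, language code, other options)."""
    attr_type, *options = desc.split(";")
    lang = None
    others = []
    for option in options:
        # Assumption: the "lang=" keyword is matched case-insensitively.
        if option.lower().startswith("lang="):
            if lang is not None:
                # At most one language option may be present.
                raise ValueError("more than one language option: " + desc)
            lang = option[len("lang="):]
        else:
            others.append(option)
    return attr_type, lang, others


print(parse_attribute_description("givenName;lang=en-US"))
print(parse_attribute_description("CN;lang=ja-JP-kanji"))
```

   A description with two language options, such as "CN;lang=en;lang=fr",
   raises ValueError in this sketch.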

4.2.  Preferred Language Control

   The preferredLanguage session control is always non-critical.  Its value is
   a language code as defined in RFC 1766 [3].  If this control is absent,
   the default is that there is no preferred language for the client.

   It is recommended that clients use the most general language code 
   which is suitable for their purpose.  A language code with multiple 
   subtags may result in too much directory information being filtered out
   of responses.  In most cases, it is recommended that only the primary 
   language tag (such as "EN") be provided.

   If the server supports the storing of language codes with attribute values  
   in the DIT, then it must indicate that "preferredLanguage" is a supported 
   control in the supportedControl attribute of the root DSE.  Otherwise it 
   must not indicate support for the "preferredLanguage" control.

4.3. Distinguished Names and Relative Distinguished Names

   No attribute description options are permitted in Distinguished Names or 
   Relative Distinguished Names.  Thus language codes MUST NOT be used in 
   forming DNs.

4.4. Search Filter

   A client may provide a language code in an AttributeDescription in a search
   filter.  If present, a value in the directory matches the filter only if
   its attribute type is the base attribute type or one of its subtypes, its
   language code is the same as that in the filter, and it matches the
   assertion value.

   For example, consider an equality match filter of type "name;lang=en-US"
   with assertion value "Billy Ray", applied to the following directory
   entry:

   objectclass: top                     DOES NOT MATCH (wrong type)
   objectclass: person                  DOES NOT MATCH (wrong type)
   name;lang=EN-US: Billy Ray           MATCHES 
   name;lang=EN-US: Billy Bob           DOES NOT MATCH (wrong value)
   CN;lang=EN-US;dynamic: Billy Ray     MATCHES
   CN;lang=en;dynamic: Billy Ray        DOES NOT MATCH (differing lang=)
   name: Billy Ray                      DOES NOT MATCH (no lang=)
   SN: Ray                              DOES NOT MATCH (wrong value)

   (Note that "CN" and "SN" are subtypes of "name".)  

   If the server does not support storing language codes with attribute values
   in the DIT, then any filter which includes a language code will always fail 
   to match, as it is an unrecognized attribute type (note, however, that no 
   error will be returned because of this).

   If no language code is specified in the search filter, then only the 
   base attribute type and the assertion value need match the value in the 
   directory.

   For example, consider an equality match filter of type "name" with
   assertion value "Billy Ray", applied to the following directory entry:

   objectclass: top                     DOES NOT MATCH (wrong type)
   objectclass: person                  DOES NOT MATCH (wrong type)
   name;lang=EN-US: Billy Ray           MATCHES
   name;lang=EN-US: Billy Bob           DOES NOT MATCH (wrong value)
   CN;lang=EN-US;dynamic: Billy Ray     MATCHES
   CN;lang=en;dynamic: Billy Ray        MATCHES
   name: Billy Ray                      MATCHES
   SN: Ray                              DOES NOT MATCH (wrong value)

   There is no effect of the preferredLanguage control in filtering.
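
   The two filter examples in this section can be reproduced with the
   following Python sketch.  It is illustrative only: the SUPERTYPE table
   and the function names are hypothetical, values are compared as exact
   strings rather than with the attribute's matching rule, and language
   codes in the filter are compared for case-insensitive equality.

```python
# Hypothetical subtype table: CN and SN are subtypes of name.
SUPERTYPE = {"cn": "name", "sn": "name"}


def split_description(desc):
    """Return (lowercased type, lowercased language code or None)."""
    attr_type, *options = desc.split(";")
    lang = None
    for option in options:
        if option.lower().startswith("lang="):
            lang = option[len("lang="):].lower()
    return attr_type.lower(), lang


def is_type_or_subtype(attr_type, base_type):
    """Walk up the supertype chain looking for base_type."""
    while attr_type is not None:
        if attr_type == base_type:
            return True
        attr_type = SUPERTYPE.get(attr_type)
    return False


def equality_match(filter_desc, assertion, stored_desc, stored_value):
    f_type, f_lang = split_description(filter_desc)
    s_type, s_lang = split_description(stored_desc)
    if not is_type_or_subtype(s_type, f_type):
        return False                      # wrong type
    if f_lang is not None and f_lang != s_lang:
        return False                      # differing or missing lang=
    return stored_value == assertion      # simplification: exact comparison


ENTRY = [
    ("objectclass", "top"),
    ("objectclass", "person"),
    ("name;lang=EN-US", "Billy Ray"),
    ("name;lang=EN-US", "Billy Bob"),
    ("CN;lang=EN-US;dynamic", "Billy Ray"),
    ("CN;lang=en;dynamic", "Billy Ray"),
    ("name", "Billy Ray"),
    ("SN", "Ray"),
]


def matching(filter_desc, assertion):
    return [d for d, v in ENTRY if equality_match(filter_desc, assertion, d, v)]


print(matching("name;lang=en-US", "Billy Ray"))
print(matching("name", "Billy Ray"))
```

   With a language code in the filter, two values match; without one, the
   four "Billy Ray" values match regardless of their language code.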

4.5. Compare 

   A client may provide a language code in an AttributeDescription used in
   a compare request AttributeValueAssertion.  This is to be treated by 
   servers the same as the use of language codes in a search filter with an 
   equality match, as described in the previous section.  If there is no
   attribute in the entry with the same subType and language code, the 
   noSuchAttributeType error must be returned.

   A server may return a language code as part of the matchedSubtype field
   in the result.  

   For example, consider a compare request of type "name" and assertion value
   "Johann" against the following directory entry:

   objectclass: top
   objectclass: person
   givenName;lang=de-DE: Johann
   CN: Johann Sibelius
   SN: Sibelius

   The server must return compareTrue, and may set the matchedSubtype field
   to be "givenName;lang=de-DE".
 
   If the server does not support storing language codes with attribute values
   in the DIT, then any comparison which includes a language code will always 
   fail to locate an attribute type, and noSuchAttributeType must be returned.

   There is no effect of the preferredLanguage control in comparing.
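
   A minimal Python sketch of the compare behaviour described in this
   section follows.  The SUPERTYPE table, the function names and the
   server_supports_lang flag are hypothetical; values are compared as exact
   strings for simplicity.  The compareFalse result is not discussed in this
   document and is taken from the LDAP protocol [1].

```python
# Hypothetical subtype table: givenName, CN and SN are subtypes of name.
SUPERTYPE = {"givenname": "name", "cn": "name", "sn": "name"}


def split_description(desc):
    """Return (lowercased type, lowercased language code or None)."""
    attr_type, *options = desc.split(";")
    lang = None
    for option in options:
        if option.lower().startswith("lang="):
            lang = option[len("lang="):].lower()
    return attr_type.lower(), lang


def is_type_or_subtype(attr_type, base_type):
    while attr_type is not None:
        if attr_type == base_type:
            return True
        attr_type = SUPERTYPE.get(attr_type)
    return False


def compare(entry, desc, assertion, server_supports_lang=True):
    """Return (result, matchedSubtype or None)."""
    c_type, c_lang = split_description(desc)
    if c_lang is not None and not server_supports_lang:
        # Treated as an unrecognized attribute type.
        return "noSuchAttributeType", None
    candidates = []
    for stored_desc, value in entry:
        s_type, s_lang = split_description(stored_desc)
        if is_type_or_subtype(s_type, c_type) and c_lang in (None, s_lang):
            candidates.append((stored_desc, value))
    if not candidates:
        return "noSuchAttributeType", None
    for stored_desc, value in candidates:
        if value == assertion:
            return "compareTrue", stored_desc
    return "compareFalse", None


ENTRY = [
    ("objectclass", "top"),
    ("objectclass", "person"),
    ("givenName;lang=de-DE", "Johann"),
    ("CN", "Johann Sibelius"),
    ("SN", "Sibelius"),
]

print(compare(ENTRY, "name", "Johann"))
print(compare(ENTRY, "name;lang=fr", "Johann"))
```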

4.6. Requested Attributes in Search

   Clients may provide language codes in AttributeDescription in the 
   requested attribute list in a search request.

   If a language code is provided in an attribute description, then only
   attribute values in a directory entry which have the same language code 
   as that provided may be returned. Thus if a client requests an attribute
   "description;lang=en", the server must not return values of an attribute 
   "description" or "description;lang=fr".


   Clients may provide in the attribute list multiple AttributeDescription 
   which have the same base attribute type but different options. For example
   a client may provide both "name;lang=en" and "name;lang=fr", and this 
   would permit an attribute with either language code to be returned.  Note 
   there would be no need to provide both "name" and "name;lang=en" since
   all subtypes of name would match "name".

   If a server does not support storing language codes with attribute values
   in the DIT, then any attribute descriptions in the list which include 
   language codes are to be ignored, just as if they were unknown attribute
   types.

   If a request is made specifying all attributes or an attribute is 
   requested without providing a language code, and the preferredLanguage 
   control has not been set, then all attribute values regardless of their 
   language code are returned.

   For example, if the client has set no preferredLanguage session control and
   requests a "description" attribute, and a matching entry contains 

   objectclass: top
   objectclass: organization
   O: Software GmbH
   description: software  
   description;lang=en: software products
   description;lang=de: softwareproduckte
   postalAddress: Berlin 8001 Germany
   postalAddress;lang=de: Berlin 8001 

   The server will return: 

   description: software
   description;lang=en: software products
   description;lang=de: softwareproduckte
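
   The selection of requested attributes when no preferredLanguage control
   is set can be sketched in Python as follows.  The function names are
   hypothetical, and subtyping between attribute types (such as CN under
   name) is deliberately not modelled here; only the language code handling
   is shown.

```python
def split_description(desc):
    """Return (lowercased type, lowercased language code or None)."""
    attr_type, *options = desc.split(";")
    lang = None
    for option in options:
        if option.lower().startswith("lang="):
            lang = option[len("lang="):].lower()
    return attr_type.lower(), lang


def select_attributes(entry, requested):
    """Pick the (description, value) pairs named by the requested list."""
    wanted = [split_description(r) for r in requested]
    result = []
    for desc, value in entry:
        a_type, a_lang = split_description(desc)
        for r_type, r_lang in wanted:
            # No language code requested: any language code is acceptable.
            # Language code requested: it must be the same as the stored one.
            if a_type == r_type and r_lang in (None, a_lang):
                result.append((desc, value))
                break
    return result


ENTRY = [
    ("objectclass", "top"),
    ("objectclass", "organization"),
    ("O", "Software GmbH"),
    ("description", "software"),
    ("description;lang=en", "software products"),
    ("description;lang=de", "softwareproduckte"),
    ("postalAddress", "Berlin 8001 Germany"),
    ("postalAddress;lang=de", "Berlin 8001"),
]

for desc, value in select_attributes(ENTRY, ["description"]):
    print(f"{desc}: {value}")
```

   Requesting "description;lang=en" instead returns only the single value
   carrying that language code.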
   
   If the client has set a preferredLanguage control, then attributes are 
   excluded from the result if either of the following is true:
     - the attribute has a language code for which the preferredLanguage value 
       is not a prefix, or 
     - the attribute does not have a language code, but there is another 
       attribute of the same type or a subtype in the entry, which has a 
       language code for which the preferredLanguage value is a prefix.

   For example, if the client sets the preferredLanguage control to "en" and
   requests all attributes, then the following will be returned.  The 
   "description;lang=de" and "postalAddress;lang=de" attributes are excluded,
   since the preferredLanguage value is not a prefix of their language codes.
   The "description" attribute is excluded, since the entry also contains
   "description;lang=en", an attribute of the same type whose language code
   does match the preferredLanguage.

   objectclass: top
   objectclass: organization
   O: Software GmbH
   description;lang=en: software products
   postalAddress: Berlin 8001 Germany
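
   The two exclusion rules for the preferredLanguage control can be
   sketched in Python as follows.  This is an illustration under stated
   assumptions, not a definitive implementation: the SUPERTYPE table and
   the function names are hypothetical, and the prefix test is a plain
   case-insensitive string comparison.

```python
# Hypothetical subtype table, used only for illustration.
SUPERTYPE = {"cn": "name", "sn": "name"}


def split_description(desc):
    """Return (lowercased type, lowercased language code or None)."""
    attr_type, *options = desc.split(";")
    lang = None
    for option in options:
        if option.lower().startswith("lang="):
            lang = option[len("lang="):].lower()
    return attr_type.lower(), lang


def is_type_or_subtype(attr_type, base_type):
    while attr_type is not None:
        if attr_type == base_type:
            return True
        attr_type = SUPERTYPE.get(attr_type)
    return False


def apply_preferred_language(entry, preferred):
    """Drop the attributes excluded by the preferredLanguage control."""
    if preferred is None:
        return list(entry)            # control absent: nothing is excluded
    preferred = preferred.lower()
    parsed = [split_description(desc) for desc, _ in entry]
    result = []
    for (desc, value), (a_type, a_lang) in zip(entry, parsed):
        if a_lang is not None:
            # Rule 1: keep only if preferredLanguage is a prefix of the code.
            keep = a_lang.startswith(preferred)
        else:
            # Rule 2: drop if the same type or a subtype has a matching code.
            keep = not any(
                o_lang is not None
                and o_lang.startswith(preferred)
                and is_type_or_subtype(o_type, a_type)
                for o_type, o_lang in parsed
            )
        if keep:
            result.append((desc, value))
    return result


ENTRY = [
    ("objectclass", "top"),
    ("objectclass", "organization"),
    ("O", "Software GmbH"),
    ("description", "software"),
    ("description;lang=en", "software products"),
    ("description;lang=de", "softwareproduckte"),
    ("postalAddress", "Berlin 8001 Germany"),
    ("postalAddress;lang=de", "Berlin 8001"),
]

for desc, value in apply_preferred_language(ENTRY, "en"):
    print(f"{desc}: {value}")
```

   Run against the example entry with a preferredLanguage of "en", the
   sketch prints the same five attributes listed above.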

   If a server does not support storing language codes with attribute values
   in the DIT, then it will ignore the preferredLanguage control.

4.7. Add Operation

   Clients may provide language codes in AttributeDescription in attributes
   of a new entry to be created, subject to the limitation that the client 
   must provide the attribute values used in the RDN without any language code
   or any other option.

   A client may provide multiple attributes with the same attribute type and 
   value, so long as each attribute has a different language code.

   Servers which support storing language codes in the DIT must allow any
   attribute with DirectoryString syntax to have a language code associated
   with it.  Servers may allow language codes to be associated with other
   attributes.

   For example, the following is a legal request.

   objectclass: top
   objectclass: person
   objectclass: residentialPerson
   name: John Smith
   CN: John Smith
   CN;lang=en: John Smith
   SN: Smith
   streetAddress: 1 University Street
   streetAddress;lang=en: 1 University Street
   streetAddress;lang=fr: 1 rue University
   houseIdentifier;lang=fr: 9e etage

   If a server does not support storing language codes with attribute values
   in the DIT, then it must treat an AttributeDescription with a language 
   code as an unrecognized attribute. If the server forbids the addition of 
   unrecognized attributes then it must fail the add request with the 
   appropriate result code.

   There is no effect of the preferredLanguage control in storing attributes
   in the add operation.

4.8. Modify Operation

   A client may provide a language code in an AttributeDescription as part of
   a modification element in the modify operation.  

   Attribute types and language codes must match exactly against values stored
   in the directory.  For example, if the modification is a "delete", then if
   the stored values to be deleted have a language code, the language code must
   be provided in the modify operation, and if the stored values to be deleted
   do not have a language code, then no language code is to be provided.

   If the server does not support storing language codes with attribute values
   in the DIT, then it must treat an AttributeDescription with a language code
   as an unrecognized attribute, and must fail the request with an 
   appropriate result code.

   There is no effect of the preferredLanguage control in performing this 
   operation.

4.9. Diagnostic Messages

   If the server supports returning diagnostic messages in more than one 
   language, then if the preferredLanguage control has been set, it may 
   use the preferredLanguage to choose an appropriate message.  If the 
   preferredLanguage is not recognized, the diagnostic messages must be 
   returned in the default language.

   It is strongly recommended that in the default language for diagnostic 
   messages, only printable ASCII characters be used, as not all clients 
   will be able to display the full range of Unicode.

5.  Security Considerations

   Security issues are not discussed in this memo.

6.  Bibliography

   [1] M. Wahl, T. Howes, S. Kille, "Lightweight Directory Access Protocol 
       (Version 3)", INTERNET DRAFT <draft-ietf-asid-ldapv3-protocol-03.txt>,
       October 1996.

   [2] M. Wahl, A. Coulbeck, T. Howes, S. Kille, "Lightweight X.500 Directory 
       Access Protocol Standard and Pilot Attribute Definitions", 
       <draft-ietf-asid-ldapv3-attributes-03.txt>, October 1996.

   [3] H. Alvestrand, "Tags for the Identification of Languages",
       RFC 1766, March 1995.

7.  Authors' Addresses

       Mark Wahl
       Critical Angle Inc.
       4815 W Braker Lane #502-385
       Austin, TX 78759
       USA

       EMail:  M.Wahl@critical-angle.com


       Tim Howes
       Netscape Communications Corp.
       501 E. Middlefield Rd
       Mountain View, CA 94043
       USA
       
       Phone:  +1 415 937-3419
       EMail:   howes@netscape.com

<draft-ietf-asid-ldapv3-lang-00.txt> Expires: April 5, 1997


