Subject scheme maps

A subject scheme map enables adopters to create custom controlled values and to manage metadata attribute values for an organization or a project without having to write a DITA specialization.

Subject scheme maps use key definitions to define a collection of controlled values rather than a collection of topics. The highest-level map that uses the set of controlled values must reference the subject scheme map in which those controlled values are defined.
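The reference from the highest-level map can be sketched as follows. This is a minimal illustration, not normative markup: the file names audience-scheme.ditamap and overview.dita are hypothetical, and the sketch assumes the scheme is referenced with an ordinary <topicref> whose @format is "ditamap" and whose @type is "subjectScheme".

```xml
<map>
    <!-- Hypothetical file name: the subject scheme map that defines the controlled values -->
    <topicref href="audience-scheme.ditamap" format="ditamap" type="subjectScheme"/>
    <!-- Hypothetical topic that uses those values in its metadata attributes -->
    <topicref href="overview.dita"/>
</map>
```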

A controlled value is a short, readable, and meaningful keyword that can be used as a value in a metadata attribute. For example, the @audience metadata attribute may take a value that identifies the user group associated with a particular content unit. Typical user values for a medical-equipment product line might include therapist, oncologist, physicist, radiologist, and so on. In a subject scheme map, an information architect can define a list of these audience values. Authoring tools may use these lists of controlled values to provide value lists from which authors may select values when they are entering metadata.

If controlled values for a metadata attribute are defined using the subject scheme map, tools may give an organization a list of readable labels, a hierarchy of values to simplify selection, and a shared definition of the value.

Controlled values may be used to classify content for filtering and flagging at build time. They may also be used for retrieval and traversal of the content at run time if information viewers that provide such functionality are available.
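As an illustration of build-time filtering and flagging, a DITAVAL file might name controlled values as shown in this sketch. The choice of values and of the yellow background is an assumption made for the example only.

```xml
<val>
    <!-- Remove content whose audience attribute is "therapist" -->
    <prop att="audience" val="therapist" action="exclude"/>
    <!-- Highlight content whose audience attribute is "oncologist" (color chosen for illustration) -->
    <prop att="audience" val="oncologist" action="flag" backcolor="yellow"/>
</val>
```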

Tools may validate controlled values for attributes by reference to the subject scheme map. As with all key definitions and references, the reference must appear in the highest map that makes use of the controlled values.

Defining a list of controlled values

A specialized DITA map, <subjectScheme>, is used to define a collection of controlled values. Each controlled value is defined using a specialized topic reference, called <subjectdef>. The <subjectdef> is used to define both a category and a list of controlled values. The top-level <subjectdef> defines the category and the children define the controlled values. The following example illustrates the use of <subjectdef> to define controlled values for a group of users:

<subjectScheme>
    <!-- Define the users category and its controlled values -->
    <subjectdef keys="users">
        <subjectdef keys="therapist"/>
        <subjectdef keys="oncologist"/>
        <subjectdef keys="physicist"/>
        <subjectdef keys="radiologist"/>
    </subjectdef>

    <!-- Define an enumeration of the audience attribute, equal to
         each value in the users subject. This makes the following values
         valid for the audience attribute: therapist, oncologist, physicist, radiologist -->
    <enumerationdef>
        <attributedef name="audience"/>
        <subjectdef keyref="users"/>
    </enumerationdef>...
</subjectScheme>

Within the <subjectdef> element:

<navtitle> can provide a more readable value name.
<shortdesc> within <topicmeta> can provide a definition.
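A subject definition that carries both a label and a definition might look like the following sketch. The label and description text are invented for illustration, and the sketch assumes that <navtitle> is placed inside <topicmeta> alongside <shortdesc>.

```xml
<subjectdef keys="therapist">
    <topicmeta>
        <!-- Readable label that a tool may show instead of the key (illustrative text) -->
        <navtitle>Radiation therapist</navtitle>
        <!-- Shared definition of the value (illustrative text) -->
        <shortdesc>A clinician who delivers prescribed radiation treatments.</shortdesc>
    </topicmeta>
</subjectdef>
```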

An enumeration may be defined with hierarchical levels by nesting subject definitions. If filtering or flagging excludes "therapist" and does not explicitly identify "novice", processing should apply filtering to all subsets of therapist. If filtering includes "novice" but does not explicitly exclude "therapist", processing should include the general therapist content because it applies to "novice". If flagging explicitly includes "therapist" but is not set explicitly for "novice", processing should apply the "therapist" flag to the "novice" content as a special type of therapist.

<subjectScheme>
    <subjectdef keys="users">
        <subjectdef keys="therapist">
            <subjectdef keys="novice"/>
            <subjectdef keys="expert"/>
        </subjectdef>
        <subjectdef keys="oncologist"/>
        <subjectdef keys="physicist"/>
        <subjectdef keys="radiologist"/>
    </subjectdef>
</subjectScheme>
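
To make the first hierarchical rule concrete, consider a DITAVAL file that names only the parent value. This is a sketch under the assumption that the processor has loaded the scheme above and honors the hierarchy.

```xml
<val>
    <!-- Only the parent value is named; "novice" and "expert" are not mentioned -->
    <prop att="audience" val="therapist" action="exclude"/>
</val>
```

With the scheme in effect, content marked audience="novice" or audience="expert" should also be excluded, because both values are subsets of "therapist".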

The <subjectdef> element can use an @href attribute to refer to a more detailed definition of the subject. For example, the value of "oncologist" could refer to an encyclopedia entry that describes the oncologist role in medicine.

<subjectdef keys="oncologist" href="encyclopedia/oncologist.dita"/>

These definitions may help to clarify the meaning of a value, especially when different parts of an organization may use the same term differently. An editor may support drilling down to the subject definition topic for a detailed explanation of the subject. DITA output formatting may produce a help file, PDF, or other readable catalog for understanding the controlled values.

Validating metadata attributes against a subject scheme

After locating the scheme, editors may validate an attribute against the bound enumeration, preventing users from entering misspelled or undefined values. A map editor may validate the audience attribute in a map against the scheme. A processor may check that all values listed for an attribute in a DITAVAL file are bound to the attribute by the scheme before filtering or flagging.
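
The kind of check described above can be illustrated with a map fragment. The sketch assumes the map already references a scheme that binds therapist, oncologist, physicist, and radiologist to @audience; the topic file names and the misspelled value are hypothetical.

```xml
<map>
    <!-- "oncologist" is bound to the audience attribute by the scheme: valid -->
    <topicref href="dosage.dita" audience="oncologist"/>
    <!-- "oncologyst" is not defined in the scheme: a validating editor may report it -->
    <topicref href="calibration.dita" audience="oncologyst"/>
</map>
```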

Scaling a subject scheme to define a taxonomy

A taxonomy differs from a controlled values list primarily in the degree of precision with which the metadata values are defined. A controlled values list is sometimes regarded as the simplest form of taxonomy. Regardless of whether the goal is a simple list of controlled values or a taxonomy:

The same core elements are used (<subjectScheme>, <subjectdef>, and <schemeref>).
A category and its subjects can have a binding that enumerates the values of a metadata attribute.
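
Of the core elements listed above, <schemeref> is the one not otherwise shown in this section. A minimal sketch of its use follows; the file name baseUsers.ditamap and the key "dosimetrist" are hypothetical, and the sketch assumes that the referenced scheme defines the "users" category.

```xml
<subjectScheme>
    <!-- Reference another scheme that defines the users category (hypothetical file name) -->
    <schemeref href="baseUsers.ditamap"/>
    <!-- Extend the existing category with an additional value (hypothetical key) -->
    <subjectdef keyref="users">
        <subjectdef keys="dosimetrist"/>
    </subjectdef>
</subjectScheme>
```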

Beyond the core elements and the attribute binding elements, sophisticated taxonomies can take advantage of some optional elements in the scheme. Most of these optional elements make it possible to specify more precise relationships among subjects.

The <hasNarrower>, <hasPart>, <hasKind>, <hasInstance>, and <hasRelated> elements specify the kind of relationship in a hierarchy between a container subject and its contained subjects. The following example defines San Francisco as an instance of a city but a geographic part of California.

<subjectScheme>
  <hasInstance>
    <subjectdef keys="city" navtitle="City">
      <subjectdef keys="la" navtitle="Los Angeles"/>
      <subjectdef keys="nyc" navtitle="New York City"/>
      <subjectdef keys="sf" navtitle="San Francisco"/>
    </subjectdef>
    <subjectdef keys="state" navtitle="State">
      <subjectdef keys="ca" navtitle="California"/>
      <subjectdef keys="ny" navtitle="New York"/>
    </subjectdef>
  </hasInstance>
  <hasPart>
    <subjectdef keys="place" navtitle="Place">
      <subjectdef keys="ca">
        <subjectdef keys="la"/>
        <subjectdef keys="sf"/>
      </subjectdef>
      <subjectdef keys="ny">
        <subjectdef keys="nyc"/>
      </subjectdef>
    </subjectdef>
  </hasPart>
</subjectScheme>

Sophisticated tools can use this scheme to associate content about San Francisco with related content about other California places or with related content about other cities (depending on the interests of the current user).

The scheme can also define relationships between subjects that are not hierarchical. For instance, cities sometimes have "sister city" relationships. The example scheme could add a <subjectRelTable> element to define these associative relationships, with a row for each sister-city pair and the two cities in different columns in the row.
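
Such a table might be sketched as follows. The key "city-x" is a hypothetical placeholder for a city defined elsewhere in the scheme, and the pairing is illustrative only; it is not a statement about real sister-city relationships. The sketch assumes that each row is a <subjectRel> element and each column cell is a <subjectRole> element.

```xml
<subjectRelTable>
    <!-- One row per sister-city pair -->
    <subjectRel>
        <!-- First column: one city of the pair -->
        <subjectRole>
            <subjectdef keyref="sf"/>
        </subjectRole>
        <!-- Second column: its partner (hypothetical key) -->
        <subjectRole>
            <subjectdef keyref="city-x"/>
        </subjectRole>
    </subjectRel>
</subjectRelTable>
```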

While users who have access to sophisticated processing tools benefit from defining taxonomies with this level of precision, other users can safely ignore this advanced markup and define taxonomies with hierarchies of <subjectdef> elements that are not precise about the kind of relationship between the subjects.

Darwin Information Typing Architecture (DITA) Version 1.2
