ISO/IEC JTC 1/SC 34 N0554

ISO/IEC JTC 1/SC 34

Information Technology --
Document Description and Processing Languages

TITLE: Topic Maps -- Reference Model
SOURCE: Patrick Durusau and Steven R. Newcomb
PROJECT: WD 13250-5: Information Technology -- Topic Maps -- Reference Model
PROJECT EDITORS: Mr. Patrick Durusau; Dr. Steven R. Newcomb
STATUS: Informational (4.6)
ACTION: For review and comment
DATE: 2004-11-07
DISTRIBUTION: SC34 and Liaisons
REPLY TO:

Dr. James David Mason
(ISO/IEC JTC 1/SC 34 Chairman)
Y-12 National Security Complex
Bldg. 9113, M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
Network: masonjd@y12.doe.gov
http://www.y12.doe.gov/sgml/sc34/
ftp://ftp.y12.doe.gov/pub/sgml/sc34/

Mr. G. Ken Holman
(ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada)
Crane Softwrights Ltd.
Box 266,
Kars, ON K0A-2E0 CANADA
Telephone: +1 613 489-0999
Facsimile: +1 613 489-0995
Network: jtc1sc34@scc.ca
http://www.jtc1sc34.org



2004-11-07

JTC 1/SC 34 N 0554

Topic Maps -- Reference Model

Version 4.6, 2004-11-07
The current editors' working revision is available at http://www.isotopicmaps.org/TMMM/TMMM-latest.html.

The current revision with no editorial marks is available at http://www.isotopicmaps.org/TMMM/TMMM-latest-clean.html.

Send comments to sc34wg3@isotopicmaps.org.
When commenting about any particular paragraph, please mention the paragraph's unique identifier (e.g. "parid2902"); these are not subject to change.
(Please don't bother to mention section numbers, note numbers, etc.; they change from revision to revision.)

Table of Contents

0 Introduction (Informative)
1  Scope
2  Glossary
2.1 built-in
2.2 conferred
2.3 disclosure
2.4 merging
2.5 Other Property (OP)
2.6 property class
2.7 property instance
2.8 reification
2.9 subject
2.10 Subject Identity Property (SIP)
2.11 subject proxy
2.12 topic map
2.13 Topic Map Application (TMA)
2.14 topic map view
2.15 Topic Maps Reference Model (TMRM)
3  Subjects and Subject Proxies
4  Disclosures of Topic Map Applications (TMAs)
5  Formal Model (remarks on Barta's Tau)

CHANGE HISTORY:

[parid9004] Version 4 is yet another complete redraft, with the goal of achieving both clarity and brevity.

[parid9003] Version 3 is a complete redraft that narrows the focus to nomenclature and Disclosure.

[parid9002] Version 2 is a major revision of Version 1. The ideas of Version 1 are preserved, but they are now all explained in terms of the properties of topics. There are also some terminological changes; for example, what in Version 1 was called a "node" is called a "topic" in Version 2.

[parid9001] Made IS13250::t-roles{ } an SIDP instead of an OP. According to the TMRM paradigm, it is inescapable that the subject of a t-node is its roles, at least for all purposes of merging topics.


0    [parid0001] Introduction (Informative)

[parid0010] Topic maps are sets of subject proxies, each of which is a surrogate for a single subject. In the language of ISO 13250:2002, the general notion of subject proxies was divided into "topics," "associations," and "occurrences," despite all three being surrogates for subjects in any given topic map. The Topic Maps Reference Model (TMRM) concerns itself with surrogates for all kinds of subjects, including but not limited to subjects that fall strictly into one of the three traditional categories; it defines the notion of "subject proxy".

[parid0011] The TMRM also specifies requirements for disclosing how to determine whether two or more subject proxies are proxies for the same subject, and how to view all of them as a single proxy. In a topic map, if a subject proxy is the only proxy for its subject, then users can access all information about that subject through that proxy's single (virtual) "location".

[parid0020] Topic maps are useful ways of viewing information because, as in all kinds of maps (including geographic maps), every specific location is unique, and all information that is relevant to a given geographic location is available at the corresponding location in the map. For example, in a map of Russia that depicts cities and elevations, the elevation of St. Petersburg can be found at the place where St. Petersburg is depicted, but one would not expect to find this information at the place where Moscow is depicted. The idea of a unique location within the framework of a geographic map corresponds to the use of "subject proxy" within the TMRM.

[parid0030] All maps are written within specific frameworks of expression. In the case of geographic maps, the nature of the correspondence between locations on the map and actual geographic locations is determined by the framework within which the map is expressed. Often a projection technique, such as the Mercator projection technique, is an important aspect of the frameworks of geographic maps. Other aspects of framework design include decisions about what to depict and what not to depict, the symbols that will be used, and so on; the framework of a map can have many aspects and dimensions. No map can be understood in the absence of at least some understanding of its framework of expression. When people share the same understanding of the framework of a given map, they can understand that map in the same way.

[parid0031] Any map, regardless of its framework, can be seen in terms of any other mapping framework, but only if both frameworks are understood. Maps can be combined ("merged") if their frameworks are known, and if their contents can be understood in terms of a framework capable of encompassing the combination of their contents.

[parid0060] In the TMRM, the framework of expression of a topic map is called a "Topic Map Application (TMA)". The TMRM requires TMA disclosures to disclose, among other things:

  1. [parid0070] the definitions of the kinds of properties ("property classes") that subject proxies can have,

  2. [parid0080] the rules for determining when multiple proxies are surrogates for the same subject, and

  3. [parid0090] the rules for merging the values of the properties of proxies, when it has been determined that the proxies are surrogates for the same subject and they need to be viewable as a single proxy.

[parid0061] The TMRM's disclosure requirements are designed to facilitate the uniform understanding and merging of diverse topic maps, so that all of the diverse, independently expressed information about each subject can be viewed as if it were available at that subject's unique virtual "location" -- its unique subject proxy.


1   [parid0110] Scope

[parid0120] This International Standard specifies:

  1. [parid0130] The abstract definition of subject proxy.

  2. [parid0131] The abstract definition of merging subject proxies.

  3. [parid0140] The abstract definition of Topic Map Application (TMA), and requirements to be met by disclosures of TMAs.

  4. [parid0141] Other definitions and specifications in support of the above.

[parid0150] This International Standard does not specify:

  1. [parid0160] The subjects of subject proxies nor constraints on such subjects.

  2. [parid0161] The classes of the properties of subject proxies nor constraints on such classes.

  3. [parid0162] The values of the properties of subject proxies nor constraints on such values.

  4. [parid0170] The supporting algorithms and data models that may be used to represent subjects, to detect when two or more subject proxies are proxies for the same subject, or to merge the values of Subject Identity Properties or Other Properties.


2   [parid0190] Glossary

2.1   [parid0230] built-in

[parid0240] (Said of the value of a property of a subject proxy:) Unsupported by a disclosed conferral rule; given; axiomatic. The opposite of conferred (the values of properties of subject proxies can be either built-in or conferred).

Note 1: 

[parid0242] The TMRM does not constrain the rules, if any, whereby built-in values are calculated and assigned, nor does it require or constrain their disclosure. Such rules are not a part of any TMA. Nevertheless, the operation of such rules results in subject proxies whose properties are governed by one or more TMAs, so, by establishing disclosure requirements for TMAs, the TMRM facilitates the expression of such rules.



2.2   [parid0250] conferred

[parid0260] (Said of the value of a property of a subject proxy:) Existing because of the operation of a conferral rule that is defined by the Topic Map Application (TMA) that defines the class of the property. The opposite of built-in (the values of properties of subject proxies can be either built-in or conferred).


2.3   [parid0263] disclosure

[parid0264] Information that, within itself and/or by reference to other information, comprehensively defines (or is part of a comprehensive definition of) a Topic Map Application (TMA). Optionally, a disclosure may also define one or more interchange syntaxes, data models, implementations, and/or implementation strategies, along with unambiguous and comprehensive instructions as to how instances of each such syntax, data model, etc. are intended to be interpreted in terms of the defined Topic Map Application.


2.4   [parid0270] merging

[parid0280] The process whereby two subject proxies that are surrogates for the same subject become viewable as a single resulting subject proxy. The single proxy that results from such a merger has instances of all of the property classes of which either or both of the original two proxies have instances. If an instance of a given property class appears in only one of the original two proxies, then that property instance appears unchanged in the resulting single proxy. If both of the original subject proxies have instances of a given property class, the values of those instances are combined in conformance with the rule for combining the values of instances of that class that is disclosed by the TMA that governs the class, and the resulting value becomes the value of the instance of that property class that appears in the resulting single proxy.
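
As an informal illustration of the definition above, the following Python sketch merges two proxies that are assumed to have the same subject. The representation is an assumption made only for this example: a proxy is a dictionary keyed by (TMA name, property class name), and the TMA name "ex", its property classes, and the merging rules are all invented. The TMRM does not prescribe any such data model.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical representation: a proxy maps (TMA name, property class name)
# to the value of its single instance of that property class.
PropertyKey = Tuple[str, str]
Proxy = Dict[PropertyKey, Any]
MergingRule = Callable[[Any, Any], Any]


def merge_proxies(a: Proxy, b: Proxy,
                  rules: Dict[PropertyKey, MergingRule]) -> Proxy:
    """Return one proxy that can be viewed in place of a and b."""
    merged: Proxy = {}
    for key in a.keys() | b.keys():
        if key in a and key in b:
            # Both proxies have an instance of this class: combine the
            # values with the rule disclosed for the class.
            merged[key] = rules[key](a[key], b[key])
        else:
            # Only one proxy has an instance: it appears unchanged.
            merged[key] = a[key] if key in a else b[key]
    return merged


# Invented merging rules and proxies, for illustration only.
rules = {
    ("ex", "identifiers"): lambda x, y: x | y,  # set union
    ("ex", "names"): lambda x, y: x | y,        # set union
}
p1 = {("ex", "identifiers"): {"id-1"}, ("ex", "names"): {"name A"}}
p2 = {("ex", "identifiers"): {"id-2"}, ("ex", "remark"): "only in p2"}
merged = merge_proxies(p1, p2, rules)
```

In this sketch a missing rule for a property class that both proxies share raises a KeyError; how such cases are handled is a choice left to the TMA designer.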


2.5   [parid0310] Other Property (OP)

[parid0320] A property class or property instance that is not a Subject Identity Property (SIP).


2.6   [parid0330] property class

[parid340] A named type of Subject Identity Property (SIP) or Other Property (OP), instances of which may appear in subject proxies. Within a TMA, the names of all property classes are unique.


2.7   [parid0350] property instance

[parid0360] One of the named values that comprise a subject proxy: an instance of a Subject Identity Property (SIP) or Other Property (OP). Its name is the same as the name of the property class of which it is an instance. In the subject proxy in which it appears, it is the only instance of its class.


2.8   [parid0365] reification

[parid0366] The representation of a subject by a subject proxy.

Note 2: 

[parid0367] The question of what has or has not been reified in a topic map view can be answered by examining the SIPs of its subject proxies. In every subject proxy, at least one Subject Identity Property (SIP) must appear, and no more than one SIP that was defined by any given single TMA can appear in a given subject proxy.



2.9   [parid0370] subject

[parid0380] Any thing whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever. A potential or actual subject of conversation.


2.10   [parid0390] Subject Identity Property (SIP)

[parid0400]

  1. [parid0391] A property class whose instances specify the subjects of the subject proxies in which they appear.

  2. [parid0392] A property instance that specifies the subject of the subject proxy in which it appears.

Note 3: 

[parid0405] In every subject proxy, there is always at least one SIP. As with all properties, both SIPs and OPs, the nature and structure of the value of an SIP is defined and constrained by the TMA that defines the corresponding SIP class. The value of an SIP or OP may be a single value, or it may have multiple named and/or unnamed components, and the components themselves may have multiple named and/or unnamed components, recursively. Property values may be other subject proxies and/or any kinds of data; the TMRM does not constrain the designs of property values.



2.11   [parid0410] subject proxy

[parid0420] A surrogate for a subject. Subject proxies consist of property instances, at least one of which must be a Subject Identity Property (SIP). In a subject proxy, there cannot be more than one SIP whose class is defined by the same Topic Map Application (TMA).
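
The structural constraints in this definition can be checked mechanically. The following Python sketch is one possible check, under assumptions made only for this example: property instances are (TMA name, class name, value) triples, and the set of SIP classes is supplied by the caller as it would be known from the governing TMA disclosures. The names PropertyInstance and check_subject_proxy are hypothetical.

```python
from collections import Counter
from typing import Any, List, NamedTuple, Set, Tuple


class PropertyInstance(NamedTuple):
    tma: str          # name of the TMA that defines the property class
    class_name: str   # name of the property class
    value: Any        # unconstrained by the TMRM


def check_subject_proxy(instances: List[PropertyInstance],
                        sip_classes: Set[Tuple[str, str]]) -> List[str]:
    """Return a list of violations; an empty list means none were found."""
    problems: List[str] = []
    keys = [(p.tma, p.class_name) for p in instances]

    # A proxy has at most one instance of each property class.
    for key, count in Counter(keys).items():
        if count > 1:
            problems.append(f"{count} instances of property class {key}")

    # At least one property instance must be an SIP.
    sips = [key for key in keys if key in sip_classes]
    if not sips:
        problems.append("no SIP instance")

    # No more than one SIP whose class is defined by the same TMA.
    for tma, count in Counter(tma for tma, _ in sips).items():
        if count > 1:
            problems.append(f"{count} SIP instances defined by TMA {tma!r}")
    return problems


# Invented SIP classes for a hypothetical TMA named "ex".
sip_classes = {("ex", "identifiers"), ("ex", "locator")}
ok = [PropertyInstance("ex", "identifiers", {"id-1"}),
      PropertyInstance("ex", "names", {"name A"})]
no_sip = [PropertyInstance("ex", "names", {"name A"})]
two_sips = [PropertyInstance("ex", "identifiers", {"id-1"}),
            PropertyInstance("ex", "locator", "loc-1")]
```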


2.12   [parid0430] topic map

[parid0440]

  1. [parid0450] A set of subject proxies that is treated as a unit. The classes of the properties (both SIPs and OPs) that comprise the subject proxies, and the rules for recognizing when multiple proxies are surrogates for the same subject, are both disclosed by one or more TMAs (called the "governing TMAs"). The governing TMAs also disclose the rules for viewing the values of multiple instances of each property class as a single value of that class; these rules enable multiple proxies for the same subject to be viewed as if they were a single proxy with a single instance of each property class.

  2. [parid0460] A document written in a topic map syntax; an interchangeable expression of a set of subject proxies.

Note 4: 

[parid0465] The first definition of "topic map" given above is a restatement of the definition in ISO 13250:2002, clause 3.26 a). (That clause is expressed in terms of ISO/IEC 10744:1997.)

[parid0470] The second definition for "topic map" reflects the definition in ISO 13250:2002, clause 3.26 b): "Any topic map document conforming to the SGML Architecture defined by this International Standard, or the document element (topicmap) of such a document."



2.13   [parid0480] Topic Map Application (TMA)

[parid0490] A Disclosure of rules that govern properties of subject proxies.


2.14   [parid0560] topic map view

[parid0570] A subject-oriented view of a corpus of information, via a set of subject proxies that are governed by a TMA. By conforming to the TMA's rules for recognizing subject sameness, for merging the properties of proxies, etc., a topic map view can provide the convenience of subject-oriented access to the contents of the corpus.


2.15   [parid0580] Topic Maps Reference Model (TMRM)

[parid0590] This International Standard.


3   [parid0600] Subjects and Subject Proxies

[parid0610] In a topic map view, all subjects for which a single virtual "location" can be established have a surrogate, that is, a subject proxy. Every subject proxy is a surrogate for a single unique subject.

[parid0620] The constraint that every subject proxy must have, among its properties, one SIP per governing TMA, fulfills two related but independent functions:

  1. [parid0630] the SIP instances collectively identify the subjects that the author of a particular topic map view chose to reify; and

  2. [parid0640] the classes of the SIP instances reveal how the author chose to distinguish those subjects from one another, or, in other words, how the author chose to recognize when two or more proxies have the same subject.

[parid0650] The function of SIP instances (that is, specifying the subjects of the subject proxies created by an author) enables interchange of topic map views with the expectation that the view as authored can be the same view seen by a user. The user is not constrained to take that view, but, since the ability to interchange topic map views is a requirement, this International Standard must enable authors to give users of topic map views the ability to replicate the views as authored.

[parid0660] The function of SIP classes (that is, revealing how the topic map view's author chose to distinguish the subjects of subject proxies) facilitates the merging of independently authored topic map views, not only when the views are expressed in terms defined by the same TMAs, but also when the TMAs are different. In the latter case, each independent TMA's Disclosure of its means for distinguishing the subjects of subject proxies can provide a useful basis for designing rules for combining the proxies governed by the independent TMAs.
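
To make the role of SIP classes concrete, the following Python sketch groups proxies by subject using per-class sameness rules. Everything in it is an assumption made for illustration: the dictionary representation of proxies, the invented TMA name "ex", the rule that two identifier sets denote the same subject when they intersect, and the treatment of sameness as transitive (which a real TMA might or might not intend).

```python
from typing import Any, Callable, Dict, List, Tuple

PropertyKey = Tuple[str, str]          # (TMA name, property class name)
Proxy = Dict[PropertyKey, Any]
SamenessRule = Callable[[Any, Any], bool]


def same_subject(a: Proxy, b: Proxy,
                 sip_rules: Dict[PropertyKey, SamenessRule]) -> bool:
    """True if any SIP class that both proxies share reports sameness."""
    return any(rule(a[key], b[key])
               for key, rule in sip_rules.items()
               if key in a and key in b)


def group_by_subject(proxies: List[Proxy],
                     sip_rules: Dict[PropertyKey, SamenessRule]) -> List[List[int]]:
    """Partition proxy indices into groups that share a subject."""
    parent = list(range(len(proxies)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(proxies)):
        for j in range(i + 1, len(proxies)):
            if same_subject(proxies[i], proxies[j], sip_rules):
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(proxies)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


# Invented rule: identifier sets that intersect specify the same subject.
ID = ("ex", "identifiers")
sip_rules = {ID: lambda x, y: bool(x & y)}
proxies = [{ID: {"a"}}, {ID: {"b"}}, {ID: {"a", "b"}}, {ID: {"c"}}]
groups = group_by_subject(proxies, sip_rules)
```

Here the first two proxies share no identifier with each other, but both are grouped with the third, which shares one with each; the fourth remains alone.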

Editor's Note 1: 

[parid0670] PD Question: Should the notion of assertions be treated here? Reasoning that assertion models are definitions of the SIP of the subject proxy known as the 'a-node' in earlier versions of the TMRM. There is no requirement that the values that make up the SIP of the 'a-node' of an assertion be other subject proxies, although there are obvious benefits to a TMA making and enforcing such requirements.

[parid0671] SRN: I think it's the camel's nose under the tent. There is no limit to the amount of advice we could provide here for TMA designers. The design of an assertion model, and even the question of whether a TMA must or should provide a property value type of "subject proxy", are things I think we'd better exclude from this standard.


4   [parid0700] Disclosures of Topic Map Applications (TMAs)

[parid0710] Each Disclosure of a Topic Map Application (TMA) must disclose the following things:

  1. [parid0720] Topic Map Application Name. The name of the TMA.

  2. [parid0730] Property Class Names. The names of all of the TMA's Subject Identity Property (SIP) and Other Property (OP) classes. Every TMA must define at least one SIP class, but OP classes are optional. Within a given TMA, the names of all SIP and OP classes must be unique; they all share the same namespace.

  3. [parid0731] Property Value Constraints. Definitions of the constraints on the values of instances of each SIP class and OP class. These constraints may include, for example, their value types.

  4. [parid0732] Property Instance Merging Rules. For each SIP class and OP class, a definition of the rule for combining the values of multiple instances of it into the value of a single instance.

    Note 5: 

    [parid0734] Whenever two subject proxies have been determined to have the same subject and each of them has an instance of a given property class, the values of the two property instances are combined into the value of a single instance of the same property class. (Each subject proxy has only one instance of each property class.) TMA-defined property instance merging rules disclose how that calculation should be made for all of the instances of each property class.


    Note 6: 

    [parid0735] When combining two or more subject proxies, if there is a single instance of a property, then that instance is preserved in the merged subject proxy. (See the definition of merging.)


    Note 7: 

    [parid0733] Designers of TMAs are free to define rules for combining the values of multiple instances of property classes in such a way that they must raise exceptions or have other special behaviors when combining the values is impossible, or when combining them would violate the logic of the TMA. Such exceptional behaviors are unconstrained by the TMRM; they are left to the discretion of TMA designers.


  5. [parid0740] Subject Sameness Detection Rules. For each SIP class, the rule for comparing any two instances of the class in order to determine whether they specify the same subject, i.e., whether the subject proxies in which two instances of the same SIP class appear can be viewed as a single subject proxy in which their respective property instances have been merged.

  6. [parid0780] Conferral Rules. If the TMA includes rules (called "conferral rules") that require values or value components to be conferred upon property instances, then, for each conferral rule, the property/value conditions that trigger the operation of the rule, and the effects of its operation on property values, must be disclosed. Conferral rules can require the conferral of values or value components on property instances that do not have any built-in value components, and therefore would not exist without their conferred values; such property instances are said to be "conferred into existence". Property instances that are conferred into existence may appear in subject proxies that would not otherwise have any property instances, and therefore would not otherwise exist; such subject proxies are also said to be "conferred into existence".

    Note 8: 

    [parid0781] Individual topic maps may or may not have any proxies whose property values are conferred (as opposed to built-in). For example, a topic map can be governed by a TMA that has no conferral rules, or that has conferral rules whose triggering conditions are unmet in the topic map. However, every topic map must have at least one built-in property value; without it, there would be nothing to trigger the operation of any conferral rule.
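
The triggering condition and effect of a conferral rule (item 6 above) can be pictured with a deliberately simplified Python sketch. It rests on assumptions that are not part of the TMRM: proxies are dictionaries keyed by (TMA name, property class name), a rule confers a whole value on a property instance of the same proxy that triggered it, it does so only when that instance does not yet exist (that is, the instance is conferred into existence), and rules are applied in a single pass. The class ConferralRule and the rule shown are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

PropertyKey = Tuple[str, str]          # (TMA name, property class name)
Proxy = Dict[PropertyKey, Any]


@dataclass
class ConferralRule:
    tma: str                           # TMA that defines the rule
    trigger: Callable[[Proxy], bool]   # property/value condition
    target_class: str                  # class (of the same TMA) receiving the value
    confer: Callable[[Proxy], Any]     # computes the conferred value


def apply_conferral_rules(proxy: Proxy, rules: List[ConferralRule]) -> Proxy:
    """Return a copy of proxy with conferred property instances added."""
    result = dict(proxy)
    for rule in rules:
        # The target key uses the rule's own TMA name, so a rule can only
        # confer values on property classes of the TMA that defines it.
        key = (rule.tma, rule.target_class)
        if key not in result and rule.trigger(proxy):
            result[key] = rule.confer(proxy)
    return result


# Invented rule: a proxy that has names is given a count of its names.
NAMES = ("ex", "names")
rule = ConferralRule(
    tma="ex",
    trigger=lambda p: NAMES in p,
    target_class="name_count",
    confer=lambda p: len(p[NAMES]),
)
with_names = {("ex", "identifiers"): {"id-1"}, NAMES: {"a", "b"}}
without_names = {("ex", "identifiers"): {"id-2"}}
conferred = apply_conferral_rules(with_names, [rule])
untouched = apply_conferral_rules(without_names, [rule])
```

Conferral onto other proxies, conferral of value components, and chained rules are omitted from this sketch.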


[parid0790] The following grammatical productions are expressed in a notation similar to that of Clause 5 of ISO 8879. They summarize what the TMRM requires TMA Disclosures to disclose, and how the property instances that appear in the subject proxies of topic map views invoke the disclosures that apply to them by means of the names of TMAs, SIP classes, and OP classes.

[parid0800]

(1) topicMapApplicationDisclosure = (topicMapApplicationName,
                                     propertyClassDefinition+, 
                                     conferralRuleDefinition*)

[parid0810]

(2) propertyClassDefinition = ( SIPClassDefinition | OPClassDefinition)

[parid0820]

(3) SIPClassDefinition = (propertyClassName,
                          propertyValueConstraints,
                          propertyInstanceMergingRule,
                          propertyInstanceSubjectSamenessDetectionRule)

[parid0830]

(3) OPClassDefinition = (propertyClassName,
                         propertyValueConstraints,
                         propertyInstanceMergingRule)

[parid0840]

(4) subjectProxy = (SIPInstance+, 
                    OPInstance* )

   

[parid0850]

(5) (SIPInstance | OPInstance) = (topicMapApplicationName,
                                  propertyClassName,
                                  propertyValue)

   

[parid0860]

(6) (topicMapApplicationName | 
    propertyClassName | 
    propertyValueConstraints | 
    propertyInstanceMergingRule |
    propertyInstanceSubjectSamenessDetectionRule |
    propertyValue)          = [unconstrained by the TMRM]

   

[parid0870]

(7) conferralRule           = [unconstrained by the TMRM, except that
                              only instances of property classes
                              defined by the same TMA that defines the
                              conferral rule can have values conferred
                              upon them by that conferral rule.]

   

[parid0880]

(8) topicMapView = (topicMapApplicationName+, subjectProxy+)
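
For readers who prefer a programming notation, the shape of a TMA Disclosure described by the productions above can be approximated with Python dataclasses. This is a sketch under assumptions, not a normative binding: the class and field names are invented, the unconstrained items are typed as Any, and an SIP class is distinguished from an OP class solely by the presence of a subject sameness detection rule.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class PropertyClassDefinition:
    name: str
    value_constraints: Any                 # unconstrained by the TMRM
    merging_rule: Any                      # unconstrained by the TMRM
    sameness_rule: Optional[Any] = None    # present only for SIP classes

    @property
    def is_sip(self) -> bool:
        return self.sameness_rule is not None


@dataclass
class TMADisclosure:
    name: str
    property_classes: List[PropertyClassDefinition]
    conferral_rules: List[Any] = field(default_factory=list)

    def __post_init__(self) -> None:
        names = [c.name for c in self.property_classes]
        # SIP and OP class names share one namespace within a TMA.
        if len(names) != len(set(names)):
            raise ValueError("property class names must be unique within a TMA")
        # Every TMA must define at least one SIP class.
        if not any(c.is_sip for c in self.property_classes):
            raise ValueError("a TMA must define at least one SIP class")


# Invented example disclosure for a hypothetical TMA named "ex".
disclosure = TMADisclosure(
    name="ex",
    property_classes=[
        PropertyClassDefinition("identifiers", "set of strings", "set union",
                                sameness_rule="non-empty intersection"),
        PropertyClassDefinition("names", "set of strings", "set union"),
    ],
)
```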

   

Editor's Note 2: 

[parid0970] The items to be defined by a TMA deserve better names. The longer forms are used herein for clarity in discussion within the committee and should not necessarily be taken as suggested final forms.

Editor's Note 3: 

[parid0980] The TMA concept answers the question raised by the often-heard requirement that merging be possible on any basis in addition to or instead of the rules articulated in the TMDM (ISO 13250-2): If a topic map view is to be constructed with merging based on such an additional or replacement set of merging rules, how is that to be disclosed for interchange purposes? It also answers the question of how to merge two or more topic map views that follow rules that depart from the TMDM.

Editor's Note 4: 

[parid0990] While not proposed herein, it is noted that some notation should be required of TMAs in order to facilitate interchange of topic map views based upon TMAs. At a minimum, it is suggested that names for the various components of a TMA be defined, even though the content to follow those names is of necessity unconstrained. (For example, the definition of an SIP for resources held in COBOL may need to use a different syntax and terminology than for resources that are held in XML.) For interchange purposes, however, it would be helpful for users to know that an SIP is being defined, even if the user must accept the burden of understanding whatever syntax is used to express the definition.


5   [parid1000] Formal Model (remarks on Barta's Tau)

[parid1010] While extremely interesting as an implementation strategy or model, the Tau model has no concept of identity (by design), preferring to leave that issue to other parts of the topic maps standard.

[parid1020] For the TMRM, on the other hand, the question of identity lies at the core of having a topic map view. That is, in order to have interchangeable, mergeable, or even useful topic map views, one must know what was authored as a subject proxy (in TMRM terms, what possesses an SIP) and on what basis that SIP was to be compared to others for purposes of detecting subject sameness, and what to do when subject sameness is detected. Knowledge of how such processing will be performed is of vital interest to promote use of topic maps, but is not the same as formulation of the requirements for disclosing those choices.