ISO/IEC JTC 1/SC34 N (no official status)

ISO/IEC JTC 1/SC34

Information Technology --

Document Description and Processing Languages

TITLE: Reference Model for ISO 13250 Topic Maps (RM4TM)
SOURCE: Steven R. Newcomb, Sam Hunting, Jan Algermissen and Patrick Durusau
PROJECT: Topic Maps
PROJECT EDITORS: Michel Biezunski, Martin Bryan, Steven R. Newcomb
STATUS: Editor's Draft, Revision 1.15
ACTION: For review and comment
DATE: 3 February 2003
SUMMARY:
DISTRIBUTION: SC34 and Liaisons
REFER TO:
SUPERSEDES:
REPLY TO: Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Y-12 National Security Complex
Information Technology Services
Bldg. 9113 M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
E-mail: mxm@y12.doe.gov
http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm

Ms. Sara Hafele Desautels, ISO/IEC JTC 1/SC 34 Secretariat
American National Standards Institute
25 West 43rd Street
New York, NY 10036
Tel: +1 212 642-4937
Fax: +1 212 840-2298
E-mail: sdesaute@ansi.org

3 February 2003

JTC 1/SC 34 N (no official status)

Reference Model for ISO 13250 Topic Maps (RM4TM)

Version 1.15, 2003/02/03 (The current editors' working revision is available at http://www.isotopicmaps.org/TMMM/TMMM-latest.html.
The last officially published version (version 1.0) is ISO/IEC JTC1/SC34/N0344.)

Send comments to sc34wg3@isotopicmaps.org.
When commenting about any particular paragraph, please mention the paragraph's unique identifier (e.g. "parid2902"); these are not subject to change.
(Please don't bother to mention section numbers, note numbers, etc.; they change from revision to revision.)

Table of Contents

0 Introduction
1  Scope
2  Glossary
3  Topic map graphs
3.1  The common structural abstraction for topic maps
3.2 Topic map graphs consist of nodes and arcs.
3.3 Arcs
3.4  Nodes and subjects
3.5 Well-formed nodes
3.6  Assertions
3.7  Well-formedness constraints on Assertions
3.8  Well-formedness constraints on topic map graphs
3.9  Well-formed and fully merged topic map graphs
4  Properties of nodes
4.1 Only a common framework for properties; no common properties
4.2  Every property is governed by a single TM Application
4.3  Subject identity discrimination properties ("SIDPs")
4.4  Other properties ("OPs")
4.5  Names of properties of nodes
4.6  Values of properties of nodes
4.7  Assignment of values of properties of nodes
4.8  Internal consistency of the values of a node's SIDPs
5  Definitions of TM Applications
5.1  Introduction
5.2  Constraints on definitions of aspects of TM Applications
6  Constructing fully-merged topic map graphs from well-formed topic map graphs
6.1  Construct the topic map graph
6.2  Validate assertion instances for conformance to definitions
6.3  Assign values to properties of nodes
6.4  Validate the values of the SIDPs of nodes
6.5 Merge nodes according to the defined merging rules
6.6  Conditionally stop or repeat
7  Conformance
7.1  Conforming TM Applications
7.2  Conforming TM Application definitions
7.3  Conforming implementations of TM Applications
7.4  Conforming interchangeable topic maps
Annex A Brief informal overview (informative)
A.1  The structure of topic spaces: topic map graphs
A.2  One subject per node; one node per subject
A.3  All subjects are represented by nodes
A.4  Nodes have properties
Annex B Assertion diagrams (informative)
Annex C Sample properties that reflect assertion structure (informative)

CHANGE HISTORY:

[parid3000] What happened between Revision 1.0 and Revision 1.15:

  • [parid3001] Minor editing of [parid0907] 3.3.

  • [parid3002] Corrected numerous errors in the "Six Cases of Well Formed Nodes" table ([parid0146] 3.5.1) (thanks to Marc de Graauw).

  • [parid3003] Standardized the form of the unique identifiers of paragraphs, so as to enable the new hyperlink-to-comments editorial support system.

  • [parid3004] In [parid0992] 4.7.1 and [parid0993] 6.3, clarified that during the merging process, conferred property values may need to be erased, and that built-in property values are never erased.

  • [parid3005] Clarified that some of the data in syntactic instances becomes the values of built-in SIDPs. This clarification involved editing [parid0922] 5.2.7, [parid0970] 5.2.10.2 and [parid0421] 6.1.1, and adding [parid0989] 5.2.10.2 and [parid0991] 5.2.10.2.


0    [parid2001] Introduction
Editor's Note 1: 

[parid0000] This is a test of the hyperlink-to-comments system.

[parid2902] This Reference Model for ISO 13250 Topic Maps (RM4TM) provides a framework for the definitions of Topic Map Applications (TM Applications). Diverse topic maps that conform to diverse TM Applications that are defined in keeping with this framework can be interpreted and amalgamated automatically by independently implemented systems, without losing information, and with predictable results.

[parid2901] Many of the key advantages of the Topic Maps paradigm derive from the achievement of its primary objective, the "Subject Location Uniqueness Objective", which is to make everything known about every subject in a topic space accessible from a single location within that space. The achievement of the Subject Location Uniqueness Objective means that the efficiency with which users can find information is maximized, not only because the subject's single location, once found, acts as a comprehensive catalog of the things that are known about it, but also because the subject's location can be found in terms of any of its relationships to other subjects.

[parid2004] This RM4TM facilitates the development of TM Applications and systems that can achieve the Subject Location Uniqueness Objective with respect to all subjects, including those that are only implicit in interchangeable topic map instances, as well as with respect to subjects that are relationships (and aspects of relationships) among other subjects.

[parid0941] Moreover, this RM4TM facilitates the development of TM Applications and implementations that can amalgamate the topic spaces represented by topic maps that conform to diverse TM Applications into a single resulting topic space in which each subject has a single location, there is no redundant information, and all of the information represented by the comprising topic maps is preserved.

[parid2005] This RM4TM provides definition requirements for user-defined Topic Map Applications that allow such definitions to serve as contracts between topic map creators, users, and system implementers, such that when the interchange or amalgamation of topic maps fails due to nonconformance to the definition of a TM Application, the nonconforming aspects of the topic maps or system implementations can be identified.


1   [parid0001] Scope

[parid0005] This RM4TM defines:

  1. [parid0481] an abstract graph structure for the representation of relationships between subjects;

  2. [parid0482] rules for defining Applications of the Topic Maps paradigm; and

  3. [parid0483] rules for processing the information contained in topic maps.

Note 1: 

[parid0952] See Annex [parid0951] A for a brief informal overview of this RM4TM.



2   [parid0441] Glossary
Editor's Note 2: 

[parid0936] (The glossary hasn't been drafted yet.)


3   [parid0022] Topic map graphs

3.1   [parid0023] The common structural abstraction for topic maps

[parid0024] This RM4TM defines an abstract structure, called a "topic map graph", in terms of which all kinds of topic maps can be uniformly interpreted, regardless of their governing TM Applications, and regardless of the TM Application-defined interchange syntaxes in which they may be representable.

[parid0900] The "topic map graph" form of any given topic map represents all of the subjects that participate in the topic map explicitly, even if they were only implicitly represented in the interchangeable form of the given topic map.

[parid0484] The following subclauses name and define the rules and cases to which topic map graph components and entire topic map graphs must conform in order to be considered "well formed", and the additional rules to which topic map graphs must conform in order to be considered "fully merged". Topic map graphs that are under construction may or may not be well formed, but only well-formed topic map graphs are eligible to become fully merged.


3.2   [parid0025] Topic map graphs consist of nodes and arcs.

[parid0026] A topic map graph consists of nodes and arcs. In a well-formed topic map graph, every arc is a typed, oriented connectedness of two nodes, and every node is one of the two endpoints of zero or more arcs.

Note 2: 

[parid0932] This RM4TM uses the neologism "connectedness" in order to avoid implying that TM Applications must be implemented in such a way that arcs are represented as a data structure. For example, the arc abstraction can be fully honored by the property values of the nodes that serve as its endpoints.



3.3   [parid0042] Arcs
Note 3: 

[parid0938] The reader's understanding of the remainder of this clause [parid0022] 3 is likely to be aided by referring to the informative "Assertion Diagrams" Annex [parid0488] B.


[parid0506] An "arc" in a topic map graph is a two-ended connectedness between nodes that satisfies all of the following criteria:


3.3.1   [parid0045] Four arc types

[parid0046] There are four arc types, named "AT", "AC", "CR", and "Cx". The significance of each type of arc is different.


3.3.2   [parid0047] Names of arc types and arc endpoint types

[parid0048] The first letter of an arc type's name is the name of one of its endpoint types. The second letter of the arc type's name is the name of its other endpoint type. That is, an AT arc has two endpoints, one of endpoint type "A" and the other of endpoint type "T".

Note 4: 

[parid0927] In a well-formed topic map graph, only a-nodes serve as "A" endpoint types, only c-nodes serve as "C" endpoint types, only r-nodes serve as the "R" endpoint types, and only t-nodes serve as the "T" endpoint types. There is no such thing as an "x-node", because all kinds of nodes are eligible to serve as the x endpoints of Cx arcs. The exceptional character of the x endpoints of Cx arcs is the reason why "x" is the only endpoint type name that is always rendered in lower case.



3.3.3   [parid0058] Eight forms of connectedness are possible

[parid0056] In all instances of each type of arc, the significance of a node's service as one of the endpoints is different from the significance of a node's service as the other endpoint. Given two nodes, N1 and N2, there are therefore eight possible forms of connectedness between them: each of the four arc types can connect them in either of two orientations. They are enumerated in the following subclauses.


3.3.3.1   [parid0060] Form 1: "A" to "T"

[parid0061] The connectedness of N1 and N2 is an instance of an AT arc type in which N1 is the A endpoint, and N2 is the T endpoint.


3.3.3.2   [parid0062] Form 2: "T" to "A"

[parid0063] The connectedness of N1 and N2 is an instance of an AT arc type in which N1 is the T endpoint, and N2 is the A endpoint. (This is the reverse of Form 1.)


3.3.3.3   [parid0064] Form 3: "A" to "C"

[parid0065] The connectedness of N1 and N2 is an instance of an AC arc type in which N1 is the A endpoint, and N2 is the C endpoint.


3.3.3.4   [parid0066] Form 4: "C" to "A"

[parid0067] The connectedness of N1 and N2 is an instance of an AC arc type in which N1 is the C endpoint, and N2 is the A endpoint. (This is the reverse of Form 3.)


3.3.3.5   [parid0068] Form 5: "C" to "R"

[parid0069] The connectedness of N1 and N2 is an instance of a CR arc type in which N1 is the C endpoint, and N2 is the R endpoint.


3.3.3.6   [parid0070] Form 6: "R" to "C"

[parid0071] The connectedness of N1 and N2 is an instance of a CR arc type in which N1 is the R endpoint, and N2 is the C endpoint. (This is the reverse of Form 5.)


3.3.3.7   [parid0072] Form 7: "C" to "x"

[parid0073] The connectedness of N1 and N2 is an instance of a Cx arc type in which N1 is the C endpoint, and N2 is the x endpoint.


3.3.3.8   [parid0074] Form 8: "x" to "C"

[parid0075] The connectedness of N1 and N2 is an instance of a Cx arc type in which N1 is the x endpoint, and N2 is the C endpoint. (This is the reverse of Form 7.)

Note 5: 

[parid0076] The above list of Forms of Connectedness can be represented in tabular form as follows:

Table 1:  The Eight Forms of Connectedness
[parid0077]

Form   N1   N2
1      A    T
2      T    A
3      A    C
4      C    A
5      C    R
6      R    C
7      C    x
8      x    C

Note 6: 

[parid0908] The above enumeration of the Forms of Connectedness serves two purposes in this RM4TM:

  1. [parid0910] It establishes a name ("Form n", where n is an integer in the sequence 1..8) for each of the Forms of Connectedness that an arc can represent, as a convenience for use elsewhere in this document, and possibly in the definitions of TM Applications.

  2. [parid0909] It establishes that the orientation of the connectedness represented by an arc is an essential aspect of the definition of "arc" in this RM4TM. For purposes of a TM Application's definition of a "situation feature" (see [parid0928] 3.4.2), for example, it is insufficient merely to say that two nodes are connected by a certain type of arc. The specification of the arc must also include information as to which node serves as which endpoint type. In order to represent connectedness equivalent to the connectedness represented by an RM4TM arc in some "directed graph" paradigms, at least two directed graph arcs must be used, plus whatever additional machinery may be required to associate the two directed graph arcs so as to indicate that they are different directional aspects of the same connectedness. By contrast, RM4TM arcs are nondirectional, but oriented.
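
The eight Forms of Connectedness follow mechanically from the four arc types and their two orientations. The following Python sketch is illustrative only and is not part of this RM4TM: the names ARC_TYPES, FORMS, Arc and form_of, and the use of strings as node identifiers, are assumptions made for the example.

```python
from dataclasses import dataclass

# The four arc types named in 3.3.1. Each letter names an endpoint type.
ARC_TYPES = ("AT", "AC", "CR", "Cx")

# Forms 1..8 as (endpoint type of N1, endpoint type of N2), in Table 1 order.
FORMS = {}
for arc_type in ARC_TYPES:
    first, second = arc_type[0], arc_type[1]
    FORMS[len(FORMS) + 1] = (first, second)   # e.g. Form 1: "A" to "T"
    FORMS[len(FORMS) + 1] = (second, first)   # e.g. Form 2: "T" to "A"


@dataclass(frozen=True)
class Arc:
    """A typed, oriented (but nondirectional) connectedness of two nodes."""
    arc_type: str   # one of ARC_TYPES
    first: str      # hypothetical id of the node at the first-letter endpoint
    second: str     # hypothetical id of the node at the second-letter endpoint


def form_of(arc: Arc, n1: str, n2: str) -> int:
    """Return the Form number (1..8) of the connectedness of n1 and n2."""
    if (n1, n2) == (arc.first, arc.second):
        pair = (arc.arc_type[0], arc.arc_type[1])
    elif (n1, n2) == (arc.second, arc.first):
        pair = (arc.arc_type[1], arc.arc_type[0])
    else:
        raise ValueError("arc does not connect the given nodes")
    return next(number for number, p in FORMS.items() if p == pair)
```

In this sketch a single Arc("CR", "c1", "r1") yields Form 5 when queried as form_of(arc, "c1", "r1") and Form 6 when queried as form_of(arc, "r1", "c1"). One arc object carries both views, which is the sense in which an arc is oriented without being directional.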



3.4   [parid0032] Nodes and subjects

3.4.1   [parid0033] One subject for each node

[parid0034] In topic map graphs, only nodes can represent subjects, and every node represents a single subject.


3.4.2   [parid0036] Situations and subjects

[parid0037] A node serves as one endpoint of zero or more arcs.

Note 7: 

[parid0509] A node that serves as an endpoint of no arcs at all is not well-formed unless it has at least one built-in SIDP value. (See [parid0502] 3.4.2.)


[parid0502] A node that is the endpoint of zero arcs is said to be "isolated." In a well-formed topic map graph, only "built-in" nodes (see Clause [parid0220] 4) can be isolated.

[parid0503] A node that is the endpoint of one or more arcs is said to be "situated." A node's "situation" is its service as one of the endpoints of all of the "connected paths" through the graph to all other nodes accessible via such paths. (Given node n[0], a "connected path" is a finite alternating sequence n[0], arc[1], n[1], arc[2], n[2], ... n[n] such that each arc[i] in the sequence connects n[i-1] and n[i].)
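
The definitions of "situated" and "connected path" can be checked mechanically. The following Python sketch is illustrative only: it assumes, purely for the example, that nodes are identified by strings and that each arc is a pair of node identifiers (arc types are ignored here). The function names are hypothetical.

```python
def is_situated(node, arcs):
    """A node is "situated" if it is an endpoint of one or more arcs."""
    return any(node in arc for arc in arcs)


def is_connected_path(sequence, arcs):
    """Check an alternating sequence n[0], arc[1], n[1], ..., arc[k], n[k].

    Each arc[i] must belong to the graph and must connect n[i-1] and n[i].
    """
    if len(sequence) % 2 == 0:
        return False  # an alternating sequence starts and ends with a node
    nodes, path_arcs = sequence[0::2], sequence[1::2]
    for i, arc in enumerate(path_arcs, start=1):
        if arc not in arcs:
            return False
        if set(arc) != {nodes[i - 1], nodes[i]}:
            return False
    return True
```

A node for which is_situated returns False is "isolated" in the sense of [parid0502].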

[parid0504] Except for the built-in values of the properties of built-in nodes, all of the values of the properties of nodes are determined by their situations. Thus, except for the built-in subjects of built-in nodes, the subjects of all nodes are entirely determined by their situations.

[parid0928] Except for the restrictions on the subjects of nodes that have special functions within assertion subgraphs (see [parid0185] 3.6.2.2), TM Applications are free to define "situation features" (features of the situations of nodes) and how those features, when they occur, affect the values of the properties of the nodes whose situations include those situation features. The values of all properties can be affected by such situation features, including both Subject Identity Discriminating Properties (SIDPs) and Other Properties (OPs), in accordance with the specifications provided in the definition of the TM Application that defines the properties and the situation features (see [parid0253] 4.7.2.2).

Note 8: 

[parid0505] The situation of a node in a topic map graph is always and only as visible as the values of its properties make it. See Clause [parid0220] 4.


Note 9: 

[parid0904] The definition of a situation feature can include, but is not limited to, the situated node's status as a role player in one or more assertions. The definition of a situation feature can also include the situated node's status as another kind of assertion component node, such as an r-node component of one or more assertions (see [parid0185] 3.6.2.2).



3.5   [parid0080] Well-formed nodes

3.5.1   [parid0081] Six cases of well-formed nodes

[parid0082] A node that satisfies all the criteria in the subclauses of one of the six cases described in the following subclauses is well formed. A node that does not satisfy the criteria of one of the six cases is not well formed.


3.5.1.1   [parid0083] Well-formed node Case 1

3.5.1.1.1   [parid0084] Defining characteristics of Case 1 nodes

3.5.1.1.1.1   [parid0085] The node serves as no endpoint of any arc.

3.5.1.1.1.2   [parid0086] The node has at least one built-in SIDP value (see Clause [parid0220] 4).

3.5.1.1.2   [parid0087] Node type name of Case 1 nodes

[parid0088] Case 1 nodes do not have a node type name.


3.5.1.1.3   [parid0089] Subjects of Case 1 nodes

[parid0090] The subjects of Case 1 nodes are not constrained by this RM4TM.


3.5.1.2   [parid0091] Well-formed node Case 2

3.5.1.2.1   [parid0092] Defining characteristics of Case 2 nodes

3.5.1.2.1.1   [parid0093] The node serves as one or more of the x endpoints of any number of well-formed Cx arcs.

3.5.1.2.1.2   [parid0094] The node does not serve as any other endpoint type of any instance of any arc type.

3.5.1.2.1.3   [parid0095] The node either has at least one built-in SIDP value, or its situation as a role player causes at least one SIDP value to be conferred upon it.

3.5.1.2.2   [parid0096] Node type name of Case 2 nodes

[parid0097] Case 2 nodes do not have a node type name.


3.5.1.2.3   [parid0098] Subjects of Case 2 nodes

[parid0099] The subjects of Case 2 nodes are not constrained by this RM4TM.


3.5.1.3   [parid0100] Well-formed node Case 3 ("a-node")

3.5.1.3.1   [parid0101] Defining characteristics of Case 3 nodes

3.5.1.3.1.1   [parid0102] The node serves as zero or more of the x endpoints of any number of Cx arcs.

3.5.1.3.1.2   [parid0103] The node serves as the A endpoint of two or more AC arcs.

3.5.1.3.1.3   [parid0104] The node may or may not serve as the A endpoint of one AT arc.

3.5.1.3.1.4   [parid0105] The node does not serve as any other endpoint of any instance of any arc type.

3.5.1.3.2   [parid0106] Node type name of Case 3 nodes

[parid0107] A Case 3 node is called an "a-node" (where "a" stands for "assertion").


3.5.1.3.3   [parid0108] Subjects of Case 3 nodes

[parid0109] The subject of an a-node is always the relationship that is specified via the assertion for which it serves as the unique nexus. The relationship is an instance of the type of relationship which is the subject of the node that serves as the T endpoint of the AT arc of which the a-node is the A endpoint, if any. If the a-node is not the A endpoint of an AT arc, the type of the relationship is unspecified.
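
The rule for finding the type of an a-node's relationship can be sketched as a lookup over AT arcs. This Python fragment is illustrative only: the function name relationship_type and the representation of AT arcs as (A endpoint, T endpoint) pairs of string identifiers are assumptions of the example.

```python
from typing import List, Optional, Tuple


def relationship_type(a_node: str, at_arcs: List[Tuple[str, str]]) -> Optional[str]:
    """Return the t-node that types the a-node's relationship.

    Returns None when the a-node is not the A endpoint of any AT arc,
    in which case the type of the relationship is unspecified.
    """
    t_nodes = [t for (a, t) in at_arcs if a == a_node]
    if len(t_nodes) > 1:
        # A well-formed a-node is the A endpoint of at most one AT arc.
        raise ValueError("a-node is the A endpoint of more than one AT arc")
    return t_nodes[0] if t_nodes else None
```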


3.5.1.4   [parid0113] Well-formed node Case 4 ("c-node")

3.5.1.4.1   [parid0114] Defining characteristics of Case 4 nodes

3.5.1.4.1.1   [parid0115] The node serves as zero or more of the x endpoints of any number of Cx arcs.

3.5.1.4.1.2   [parid0116] The node serves as the C endpoint of a single AC arc.

3.5.1.4.1.3   [parid0117] The node serves as the C endpoint of a single CR arc.

3.5.1.4.1.4   [parid0118] The node may or may not serve as the C endpoint of a single Cx arc.

3.5.1.4.1.5   [parid0119] The node does not serve as any other endpoint of any instance of any arc type.

3.5.1.4.2   [parid0120] Node type name of Case 4 nodes

[parid0121] A Case 4 node is called a "c-node" (where "c" stands for "casting").

Note 10: 

[parid0914] The term "casting" is consistent with the theatrical metaphor invoked by the term "role player". In an assertion, the role players are like the actors in a stage play. Each c-node represents the "casting" of an actor (a role player) in a specific role (a role type) in a specific stage production (a specific assertion), which may or may not be a production of a specific stage play (a specific assertion type). See [parid0454] 3.6.1.



3.5.1.4.3   [parid0122] Subjects of Case 4 nodes

3.5.1.4.3.1   [parid0123] Case 4 nodes with role players

[parid0124] If a c-node serves as the C endpoint of a Cx arc, then its subject is the playing of a specific role type by a specific subject in a specific relationship.


3.5.1.4.3.2   [parid0125] Case 4 nodes without role players

[parid0126] If a c-node does not serve as the C endpoint of a Cx arc, then its subject is the fact that a specific role type in a specific relationship is not played by any subject.


3.5.1.5   [parid0127] Well-formed node Case 5 ("r-node")

3.5.1.5.1   [parid0128] Defining characteristics of Case 5 nodes

3.5.1.5.1.1   [parid0129] The node serves as zero or more of the x endpoints of any number of Cx arcs.

3.5.1.5.1.2   [parid0130] The node serves as the R endpoint of one or more CR arcs.

3.5.1.5.1.3   [parid0131] The node does not serve as any other endpoint of any instance of any arc type.

3.5.1.5.2   [parid0132] Node type name of Case 5 nodes

[parid0133] A Case 5 node is called an "r-node" (where "r" stands for "role type").


3.5.1.5.3   [parid0134] Subjects of Case 5 nodes

[parid0135] The subject of an r-node is a role type that can be played by subjects in relationships. The subjects of the c-nodes that serve as the C endpoints of the CR arcs whose R endpoint is the r-node are the castings of the role players that play the role type.


3.5.1.6   [parid0136] Well-formed node Case 6 ("t-node")

3.5.1.6.1   [parid0137] Defining characteristics of Case 6 nodes

3.5.1.6.1.1   [parid0138] The node serves as zero or more of the x endpoints of any number of Cx arcs.

3.5.1.6.1.2   [parid0139] The node serves as the T endpoint of one or more AT arcs.

3.5.1.6.1.3   [parid0140] The node does not serve as any other endpoint of any instance of any arc type.

3.5.1.6.2   [parid0141] Node type name of Case 6 nodes

[parid0142] A Case 6 node is called a "t-node" (where "t" stands for assertion "type").


3.5.1.6.3   [parid0143] Subjects of Case 6 nodes

[parid0144] The subject of a t-node is a class of relationship, including the roles that can be played in instances of the class, and the values that are conferred on the properties of role players by virtue of their situations as players of specific roles in instances of the class. The subjects of all of the a-nodes that serve as the A endpoints of all of the AT arcs of which a t-node serves as the T endpoint are instances of the class of relationship that is the subject of the t-node.

Note 11: 

[parid0145] The above well-formedness requirements for nodes can be summarized in tabular form as follows:


Table 2:  The Six Cases of Well-formed Nodes
[parid0146]

Form of Connectedness   8    7    6    5    4    3    2    1    Node type   Subject constraint  Requires built-in
(node N2)               C    x    C    R    A    C    A    T    name        (if any).           SIDP value(s)?
node N1                 x    C    R    C    C    A    T    A    (if any)    Subject is:

N1 Case 1               0    0    0    0    0    0    0    0    (none)      (unconstrained)     yes
N1 Case 2               1+   0    0    0    0    0    0    0    (none)      (unconstrained)     no
N1 Case 3               0+   0    0    0    0    2+   0    1?   "a-node"    assertion           no
N1 Case 4               0+   1?   0    1    1    0    0    0    "c-node"    casting             no
N1 Case 5               0+   0    1+   0    0    0    0    0    "r-node"    role type           no
N1 Case 6               0+   0    0    0    0    0    1+   0    "t-node"    assertion type      no
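
The six cases can be tested by counting how often a node serves as each endpoint type. The following Python sketch is illustrative only and is not a normative algorithm. It assumes that each arc is a triple (arc type, first-letter endpoint, second-letter endpoint) of strings, that the presence of SIDP values is supplied by the caller as two boolean flags, and that all Cx arcs are themselves well formed. The names endpoint_counts and classify are hypothetical.

```python
from collections import Counter


def endpoint_counts(node, arcs):
    """Count the node's services per endpoint, keyed like ("AC", "A")."""
    counts = Counter()
    for arc_type, first, second in arcs:
        if first == node:
            counts[(arc_type, arc_type[0])] += 1
        if second == node:
            counts[(arc_type, arc_type[1])] += 1
    return counts


def classify(node, arcs, has_builtin_sidp=False, has_conferred_sidp=False):
    """Return the well-formed node case (1..6), or None if not well formed."""
    c = endpoint_counts(node, arcs)
    x = c[("Cx", "x")]                          # service as an x endpoint
    at_a, at_t = c[("AT", "A")], c[("AT", "T")]
    ac_a, ac_c = c[("AC", "A")], c[("AC", "C")]
    cr_c, cr_r = c[("CR", "C")], c[("CR", "R")]
    cx_c = c[("Cx", "C")]
    if not any((at_a, at_t, ac_a, ac_c, cr_c, cr_r, cx_c)):
        if x == 0:                              # Case 1: isolated node
            return 1 if has_builtin_sidp else None
        # Case 2: role player only; needs a built-in or conferred SIDP value
        return 2 if (has_builtin_sidp or has_conferred_sidp) else None
    if ac_a >= 2 and at_a <= 1 and not any((at_t, ac_c, cr_c, cr_r, cx_c)):
        return 3                                # a-node
    if ac_c == 1 and cr_c == 1 and cx_c <= 1 and not any((at_a, at_t, ac_a, cr_r)):
        return 4                                # c-node
    if cr_r >= 1 and not any((at_a, at_t, ac_a, ac_c, cr_c, cx_c)):
        return 5                                # r-node
    if at_t >= 1 and not any((at_a, ac_a, ac_c, cr_c, cr_r, cx_c)):
        return 6                                # t-node
    return None
```

Service as an x endpoint is permitted in Cases 3 through 6, so the sketch does not test it there. It decides only between Case 1 and Case 2 when the node serves as no other endpoint type.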