Editor's working copy

ISO/IEC

ISO/IEC JTC 1/SC34

Information Technology —

Document Description and Processing Languages

Title: SAM issue 'topic-naming-constraint'
Source: Lars Marius Garshol, JTC1/SC34
Project: ISO 13250
Project editor: Steven R. Newcomb, Michel Biezunski, Martin Bryan
Status: Editor's draft
Action: For review and comment
Date: 2002-07-14
Summary:
Distribution: SC34 and Liaisons
Refer to:
Supersedes:
Reply to: Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Y-12 National Security Complex
Information Technology Services
Bldg. 9113 M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
E-mail: mailto:mxm@y12.doe.gov
http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm

Ms. Sara Hafele, ISO/IEC JTC 1/SC 34 Secretariat
American National Standards Institute
25 West 43rd Street
New York, NY 10036
Tel: +1 212 642-4937
Fax: +1 212 840-2298
E-mail: shafele@ansi.org

SAM issue 'topic-naming-constraint'

Editor's draft 14 07 2002

This version:
Editor's working copy
Editor:
Lars Marius Garshol, Ontopia <larsga@ontopia.net>

Abstract

This document is a position paper on SAM issue 'topic-naming-constraint' and presents one proposed resolution to the issue, together with the rationale for that resolution.

This paper argues that the topic naming constraint, as known from ISO 13250:2000 and XTM 1.0, causes serious practical difficulties. It proposes a solution under which topic map documents that use the topic naming constraint can continue to do so.

This is $Revision: 1.1 $.

Table of Contents

1 Introduction
2 The problems
    2.1 Non-human authoring
    2.2 Recursive identity
    2.3 Does it actually work?
    2.4 The performance cost
3 The proposed resolution

Appendix

A References


1 Introduction

In topic maps there are two mechanisms for establishing the identity of a subject: one can use a URI reference (either to the subject itself or to a resource with unambiguous identification) or one can give the topic a name. That name string will then, together with its associated scope, serve as a globally unique identifier for the topic.

The current standards require topic map implementations to merge topics based on the combination of the name string and the name's scope. Any two topics that have the same name string in the same scope will be merged, regardless of what other properties they may have.
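The merging rule just described can be illustrated with a minimal Python sketch. It is not taken from either standard: the data layout (a topic id mapped to a set of (name string, scope) pairs), the function name merge_by_naming_constraint and the sample topic "city-of-light" are all assumptions made for illustration. Scopes are compared purely by topic id, which sidesteps the fact that the scoping topics may themselves be merged.

```python
from collections import defaultdict


def merge_by_naming_constraint(topics):
    """Group topic ids that share at least one (name string, scope) pair.

    `topics` maps a topic id to a set of (name, frozenset of scoping topic ids).
    This layout is hypothetical; it is not defined by ISO 13250 or XTM 1.0.
    """
    parent = {tid: tid for tid in topics}

    def find(t):
        # Follow parent links to the representative, compressing the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    first_owner = {}
    for tid, names in topics.items():
        for key in names:
            if key in first_owner:
                # Same name string in the same scope: the two topics merge.
                root_a, root_b = find(first_owner[key]), find(tid)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                first_owner[key] = tid

    groups = defaultdict(set)
    for tid in topics:
        groups[find(tid)].add(tid)
    return sorted(groups.values(), key=sorted)


# Hypothetical sample data: two topics named "Paris" in the scope "france".
topics = {
    "paris1": {("Paris", frozenset({"france"}))},
    "paris2": {("Paris", frozenset({"texas"}))},
    "city-of-light": {
        ("Paris", frozenset({"france"})),
        ("City of Light", frozenset()),
    },
}
merged = merge_by_naming_constraint(topics)
```

In this sample, "paris1" and "city-of-light" end up in one group because they share the name "Paris" in the scope "france", while "paris2" stays separate. Nothing else about the topics is consulted, which is the behaviour the paragraph above describes.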

The main problem is that there is no way to assign a name to a topic without invoking the topic naming constraint. This paper argues that this is sufficiently harmful that it must change, and proposes a way to let those who wish to continue using the topic naming constraint do so, while allowing those who do not to avoid it.

2 The problems

This section presents the problems caused by the current design of the topic naming constraint.

2.1 Non-human authoring

Not all topic maps are authored by humans. In many cases, topic maps are built automatically by software, for example from existing data in other representations or from events in the software's environment. In these cases the software will often encounter subjects whose names are not unique. These names can then only be used as the base names of the corresponding topics if an appropriate scope can be found for them. In a few cases such a scope may be found (for example the topic's type, or another of its associated topics), but in general this will not be possible.

At this point the software designer has two choices:

  • Make the base name a meaningless string that can be guaranteed to be unique.

  • Scope the base name by the topic itself. Since no other topics will have the same scope, no unwanted merging will occur.

Both of these approaches fill the topic map with junk information solely in order to neutralize the topic naming constraint. Add to this the performance cost of the topic naming constraint, and it becomes clear that this is a poor solution for such applications.
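The two workarounds can be sketched in Python as follows. This is an illustration only: the record ids ("emp-17", "emp-42"), the helper names and the use of uuid to produce a unique string are all assumptions, and a base name is modelled simply as a (name string, scope) pair.

```python
import uuid


def opaque_name(_display_name):
    # Workaround 1: discard the real name and use a meaningless unique string.
    return (uuid.uuid4().hex, frozenset())


def self_scoped_name(topic_id, display_name):
    # Workaround 2: keep the real name, but scope it by the topic itself.
    return (display_name, frozenset({topic_id}))


def colliding_keys(names_by_topic):
    """Return the (name, scope) pairs that are used by more than one topic."""
    owners = {}
    for topic_id, key in names_by_topic.items():
        owners.setdefault(key, []).append(topic_id)
    return {key: ids for key, ids in owners.items() if len(ids) > 1}


# Hypothetical source data: two distinct people who share a name.
records = {"emp-17": "John Smith", "emp-42": "John Smith"}

naive = {tid: (name, frozenset()) for tid, name in records.items()}
opaque = {tid: opaque_name(name) for tid, name in records.items()}
self_scoped = {tid: self_scoped_name(tid, name) for tid, name in records.items()}

unwanted_merges = colliding_keys(naive)         # one collision on "John Smith"
no_merges_opaque = colliding_keys(opaque)       # empty, but the names are noise
no_merges_scoped = colliding_keys(self_scoped)  # empty, but every scope is noise
```

Both workarounds avoid the collision, but only by adding information (an opaque string, or a self-referential scope) that carries no meaning for a user of the topic map.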

2.2 Recursive identity

The topic naming constraint causes formal subject identity to become recursive, in the sense that the identity of a topic depends on the identity of the topics in its base name scopes. This causes a number of difficulties.

To illustrate, we start with an example that is often used ([scope]) to explain why the topic naming constraint is part of the standard. We begin with the usual topic definitions in LTM syntax ([LTM]).

[paris1 = "Paris" / france]
[paris2 = "Paris" / texas]
[paris3 = "Paris" / mythology ancient-greece]
[paris4 = "Paris" / romeo-and-juliet]
[paris5 = "Paris" / botany]

Unfortunately, these definitions are not complete. For this fragment to be useful, we need to provide the scoping topics with names as well. However, to do so we need further scopes, in order to avoid unwanted merging. This gives the following additional topics.

[france = "France" / geography]
[texas = "Texas" / usa geography]
[mythology = "Mythology" / subject-area]
[romeo-and-juliet = "Romeo and Juliet" / shakespeare]
[botany = "Botany" / subject-area]

Now we have defined our contexts, but it turns out that none of these contexts is itself context-free, and so we are no further along than we were. Another iteration leads to:

[geography = "Geography" / subject-area]
[usa = "United States of America" / geography]
[subject-area = "Subject area"] /* unscoped, since it is so high-level */
[shakespeare = "William Shakespeare" / literature]

Now, however, things are beginning to bottom out. Only one topic remains to be defined:

[literature = "Literature" / subject-area]
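The recursion in this example can be made concrete with a small Python sketch that follows the scoping topics transitively. The dictionary below is a hand transcription of the LTM fragments above; the function name scope_closure and the data layout are assumptions for illustration. The fragments as given do not define "ancient-greece", so the sketch reports it as undefined rather than guessing a definition.

```python
# Topic id -> (base name, scoping topic ids), transcribed from the LTM above.
definitions = {
    "paris1": ("Paris", ["france"]),
    "paris2": ("Paris", ["texas"]),
    "paris3": ("Paris", ["mythology", "ancient-greece"]),
    "paris4": ("Paris", ["romeo-and-juliet"]),
    "paris5": ("Paris", ["botany"]),
    "france": ("France", ["geography"]),
    "texas": ("Texas", ["usa", "geography"]),
    "mythology": ("Mythology", ["subject-area"]),
    "romeo-and-juliet": ("Romeo and Juliet", ["shakespeare"]),
    "botany": ("Botany", ["subject-area"]),
    "geography": ("Geography", ["subject-area"]),
    "usa": ("United States of America", ["geography"]),
    "subject-area": ("Subject area", []),
    "shakespeare": ("William Shakespeare", ["literature"]),
    "literature": ("Literature", ["subject-area"]),
}


def scope_closure(topic_id, definitions):
    """Return (scoping topics reachable from topic_id, those left undefined)."""
    reachable, undefined = set(), set()
    stack = [topic_id]
    while stack:
        current = stack.pop()
        if current not in definitions:
            # Used as a scoping topic but never given a name of its own.
            undefined.add(current)
            continue
        for scoping_topic in definitions[current][1]:
            if scoping_topic not in reachable:
                reachable.add(scoping_topic)
                stack.append(scoping_topic)
    return reachable, undefined


texas_paris, texas_missing = scope_closure("paris2", definitions)
myth_paris, myth_missing = scope_closure("paris3", definitions)
```

Under this model, the identity of "paris2" depends on four further topics (texas, usa, geography and subject-area), and the recursion ends only because "subject-area" was deliberately left unscoped.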

2.3 Does it actually work?

2.4 The performance cost

The topic naming constraint is complicated to implement and requires considerable resources to maintain. [...]

3 The proposed resolution

[stuff]

A References

LTM
...
scope
Towards a General Theory of Scope, Steve Pepper and Geir Ove Grønmo, Ontopia, 2001-06-25.