ISO/IEC JTC 1/SC 34 N0490

ISO/IEC JTC 1/SC 34

Information Technology --
Document Description and Processing Languages

TITLE: Topic Maps -- Reference Model Use Cases
SOURCE: Mr Patrick Durusau; Dr Steven R. Newcomb
PROJECT: WD 13250-5: Information Technology -- Topic Maps -- Reference Model
PROJECT EDITORS: Mr. Patrick Durusau; Dr. Steven R. Newcomb
STATUS: Informational (2.7)
ACTION: For review and comment
DATE: 2004-03-20
DISTRIBUTION: SC34 and Liaisons
REPLY TO:

Dr. James David Mason
(ISO/IEC JTC 1/SC 34 Chairman)
Y-12 National Security Complex
Bldg. 9113, M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
Network: masonjd@y12.doe.gov
http://www.y12.doe.gov/sgml/sc34/
ftp://ftp.y12.doe.gov/pub/sgml/sc34/

Mr. G. Ken Holman
(ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada)
Crane Softwrights Ltd.
Box 266,
Kars, ON K0A-2E0 CANADA
Telephone: +1 613 489-0999
Facsimile: +1 613 489-0995
Network: jtc1sc34@scc.ca
http://www.jtc1sc34.org



2004-03-20

JTC 1/SC 34 N0490

Topic Maps -- Reference Model Use Cases

Version 2.7, 2004-03-20

Table of Contents

0 Introduction
1  No Abstract Model of Topic Maps
2  Subject Identity Based on Connections
2.1 Overview
2.2 Preconditions
2.3 Scenario
2.4 Postconditions
2.5 Business Case
3  Specifying Properties of Topics
3.1 Overview
3.2 Preconditions
3.3 Scenario
3.4 Postcondition
3.5 Business Case
4  Topic Maps and Diverse Information Resources
4.1 Providing an Integrated View of European and UK Parliamentary Information
5  Subject Identity and Data
5.1 Overview
5.2 Preconditions
5.3 Scenario
5.4 Postconditions
5.5 Business Case
6  Subject Identity and Soundex Matching
6.1 Overview
6.2 Preconditions
6.3 Scenario
6.4 Postconditions
6.5 Business Case
7  Disclosure of Merging Rules
7.1 Overview
7.2 Preconditions
7.3 Scenario
7.4 Postcondition
7.5 Business Case
8  Access to Information from Multiple Sources, Preserving Context
8.1 Access to Information from Multiple Sources, Requiring Context
9  Conclusion: A Procrustean Bed of Subject Identity?


0    Introduction

The following use cases have been developed to guide the discussion of requirements for the Topic Maps Reference Model (TMRM). There has been extensive work on and discussion of both topic maps and the TMRM, and this document is written against that background. The casual reader is therefore cautioned that terms of art and usage occur without warning or explanation.


1   No Abstract Model of Topic Maps

Currently ISO 13250 provides no abstract model of topic maps. The situation is analogous to the path not taken in the early development of airplanes. Without the underlying model that guided the design of the Wrights' airplane, others could copy their work, making airplanes that, like the Wrights' flyer, would really fly -- but only for a few hundred meters. The development of airplanes for diverse practical purposes required a general model of the dynamics of powered flight -- one that could form a basis on which many problems could have many creative solutions. Similarly, the first interchange syntaxes and processing models for topic maps have guided the construction of topic maps that really work. However, by themselves, these syntaxes and processing models provide an inadequate basis for creating and using diverse solutions to the evolving problems confronted by those who create, manage and use human knowledge.

The existing interchange syntaxes and processing models for topic maps reflect particular approaches to the identification of subjects -- specific techniques for determining when two or more topics represent the same subject. Both ISO 13250 and the proposed revisions of it concede that their interchange syntaxes and processing models can be extended, but they provide no guidance for modeling or meaningfully disclosing those extensions, such that meaningful construction and interchange of such topic maps are possible. In the absence of an abstract model for topic maps, it is not possible for vendors and users to extend current syntaxes and processing models in a reliable and interchangeable way.

The TMRM exposes principles on the basis of which diverse designs for topic maps can be expressed, compared, evaluated, and made to work together. This document describes some use cases in which the TMRM is expected to enable solutions to problems that, in the absence of the TMRM, would be more difficult to solve.


2   Subject Identity Based on Connections

2.1   Overview

In topic maps, the connections between topics represent connections between subjects. George may be connected to Laura (his wife), to the US Government (his employer) and to Osama bin Laden (his nemesis). The question raised by this use case is: "How, in the absence of an abstract model for saying so, can we know whether to merge two topics on the basis of their connections to other topics?"

In this particular use case, the Social Security Administration, an agency of the US government responsible for distributing funds to elderly and disabled US citizens, is interested in investigating fraud in claims for payment. In some cases, fraud is committed by persons who claim multiple payments by pretending that they are multiple persons, each with a different "social security number" (a different presumably-unique identifier assigned by the US government). In order to detect this kind of fraud, the Social Security Administration wishes to enable merging of topics that represent individuals on the basis of their connections to other individuals, geographic locations, treating physicians and the types of claims being made.


2.2   Preconditions


2.3   Scenario

The investigator needs to enable merging of topics that represent individuals, despite the lack of equal locator items (as specified in the proposed Topic Maps - Data Model Section 5.4.6 "Properties"), on the basis of their connections to topics for the type of claim, the treating physician, and the geographic locations of both the person and the physician.


2.4   Postconditions

The investigator obtains a merger of topics representing individuals who share a connection to the specified geographic location, a connection to a particular physician, and a connection to a type of claim. When such mergers occur, there is some possibility of fraud if the resulting merged topic has more than one social security number.
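The merging rule described in this use case can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a rule defined by the TMRM or the TMDM: the class name ClaimantTopic, its field names, and the sample values are all hypothetical, and exact equality of the connected topics is assumed to be the merging criterion.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimantTopic:
    """Hypothetical topic for one claimant; every field name is an assumption."""
    ssn: str                 # social security number recorded on the topic
    patient_location: str    # geographic location topic of the person
    physician: str           # treating physician topic
    physician_location: str  # geographic location topic of the physician
    claim_type: str          # type-of-claim topic


def merge_by_connections(topics):
    """Merge topics that share all four connections; collect their SSNs."""
    merged = defaultdict(set)
    for t in topics:
        key = (t.patient_location, t.physician, t.physician_location, t.claim_type)
        merged[key].add(t.ssn)
    return dict(merged)


def possible_fraud(merged):
    """Keep only merged topics that carry more than one social security number."""
    return {key: ssns for key, ssns in merged.items() if len(ssns) > 1}


# Invented sample data: the first two claimants share every connection.
topics = [
    ClaimantTopic("000-00-0001", "Town A", "Physician X", "Town A", "disability"),
    ClaimantTopic("000-00-0002", "Town A", "Physician X", "Town A", "disability"),
    ClaimantTopic("000-00-0003", "Town B", "Physician Y", "Town B", "retirement"),
]
flagged = possible_fraud(merge_by_connections(topics))
```

Here flagged holds a single merged topic with two social security numbers, which is the condition the postcondition identifies as a possible indicator of fraud. A merged topic with several numbers is a lead for the investigator, not proof.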


2.5   Business Case

The approach allows efficient detection of cases in which there is a possibility of collusion of patients and physicians in fraud. The approach depends on having flexibility in how subject identity is determined. (If the Social Security Administration needed to determine subject identity solely on the basis of social security numbers, the TMDM's approach to subject identification would be adequate.) This use case illustrates that there is utility in applying merging rules other than those provided by the TMDM.


3   Specifying Properties of Topics

3.1   Overview

The interchange syntax-based explication of topic maps in ISO 13250 enunciates certain properties for topics, including topic name, occurrence, and association. Some current proposals, such as the TMDM, recognize that 13250's interchange syntax is not intended to constrain the properties of the non-interchangeable topic objects found in implementations. These proposals provide additional properties, but they nevertheless would provide all topics with only a single, specific fixed set of properties, exclusively reserving unto themselves the privilege of defining the properties of topics.

There is nothing particularly sacred about the properties of topics reflected in any interchange syntax or proposed data model. As has been the case for many years, different industries and users employ different notions of subject identity, even when they are processing exactly the same information. Creators of topic maps should have the ability to declare the properties in terms of which they intend their topics to be understood, and their ability to declare such properties should not be constrained by the topic maps standard. Users of topic maps should be free to use the advice provided by their creators, or to ignore it; users should be able to decide for themselves the properties in terms of which they wish to understand topic maps.

In this particular use case, the US Geological Survey (USGS), another agency of the US government, wishes to construct a topic map in which topics have, in addition to names, location properties whose values are expressed in terms of longitude and latitude. As it happens, the USGS does not wish to create topics to represent individual quanta of geographic space; instead, it prefers to understand latitude and longitude values as points in their respective continua. This attitude has implications for subject identity, and therefore for merging, and the USGS needs to understand and explain those implications to itself and to the users of its topic map products.


3.2   Preconditions

The USGS wishes to build a topic map that contains topics whose subjects are geographic locations. For each such topic, the following information will be conveyed:

The USGS intends the topic map to be understood in such a way that, when any two topics have the same longitude (within some tolerance), the same latitude (within some tolerance), and any name or variant name in common, the two topics will be regarded as having the same subject (i.e., they will be merged).
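The intended merging rule can be sketched as follows. This is a minimal illustration, not a rule published by the USGS: the class LocationTopic, the tolerance value, and the sample coordinates are assumptions. Because "within some tolerance" is not a transitive relation, the sketch also assumes that merging is closed transitively (if A merges with B and B with C, all three merge). The source does not say how that case should be handled, and wrap-around of longitude at 180 degrees is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationTopic:
    """Hypothetical topic for a geographic location."""
    names: frozenset   # names and variant names
    longitude: float   # decimal degrees
    latitude: float    # decimal degrees


TOLERANCE_DEG = 0.001  # assumed value; the source leaves the tolerance open


def same_subject(a, b, tol=TOLERANCE_DEG):
    """Pairwise rule: coordinates agree within tol and at least one name is shared."""
    return (abs(a.longitude - b.longitude) <= tol
            and abs(a.latitude - b.latitude) <= tol
            and bool(a.names & b.names))


def merge_groups(topics, tol=TOLERANCE_DEG):
    """Return sorted lists of topic indices that merge (transitive closure assumed)."""
    parent = list(range(len(topics)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(topics)):
        for j in range(i + 1, len(topics)):
            if same_subject(topics[i], topics[j], tol):
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(topics)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Invented sample data: the first two topics are close together and share a name.
topics = [
    LocationTopic(frozenset({"Mount Rainier", "Tahoma"}), -121.7603, 46.8523),
    LocationTopic(frozenset({"Tahoma"}), -121.7605, 46.8524),
    LocationTopic(frozenset({"Mount Rainier"}), -121.0, 46.0),
]
groups = merge_groups(topics)
```

The first two topics fall into one group and the third stays separate, since it shares a name with the first but lies well outside the tolerance. A different tolerance, or a refusal to merge transitively, would yield a different but equally legitimate notion of subject identity, which is why the rule needs to be disclosed along with the topic map.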


3.3   Scenario

The ability to integrate information about a given specific set of geographic coordinates is just one of the USGS requirements. Another requirement is to be able to respond to queries about identified locations with respect to any set of coordinates. In the data model that the USGS needs to use for its topic maps, longitude and latitude are as much characteristics of its topics as topic names or information locators may be in some other data model.


3.4   Postcondition

The TMRM shows how the USGS can enjoy the benefits of Topic Maps without having to surrender the freedom to construct subject identity properties that accurately reflect its own understandings and attitudes with respect to the subjects within its domain (in this case, geographic locations). Because the TMRM establishes the minimum requirements for usefully disclosing such understandings and attitudes, the USGS can provide the users of its topic maps with the option of understanding them exactly as USGS intends them to be understood.


3.5   Business Case

The TMRM allows the benefits of creating and using Topic Maps to be realized by diverse user communities, even when their notions about how subjects should be identified are highly specialized, or are themselves subject to change. It allows the topic maps paradigm to be adapted to the attitudes of its users with respect to their knowledge domains, rather than requiring the users to adapt their thinking about the representation of their knowledge domains to the constraints of topic maps. This can significantly reduce the learning burden on new users who are already familiar with a data model of their own. It also maximizes the freedom of topic map creators/maintainers to adapt to changes that occur within their knowledge domains.


4   Topic Maps and Diverse Information Resources

One of the listed purposes of ISO 13250 was to provide integration of diverse information resources (both structured and unstructured) through the use of a topic map. The practical requirements for integrating truly diverse resources may, but very likely will not always, be fully satisfied by the properties of topics (and the merging rules based on those properties) that have been proposed as revisions to ISO 13250.

This section, Providing an Integrated View of European and UK Parliamentary Information, written by Ann Wrightson, is one example of such a use case. This material, Copyright 2003 Ann Wrightson, appears here with her permission.


4.1   Providing an Integrated View of European and UK Parliamentary Information

4.1.1   Overview

At the time of writing, the European Parliament is evaluating Topic Maps as a medium for recording the existence and organization of a range of information assets, and the UK Parliament has decided to adopt RDF for indexing some of its information assets. This use case takes this situation forward into a plausible future scenario where both these illustrious organizations have followed through on these early directions, and have furthermore made substantial collections of their respective information assets available by remote access. These access interfaces include the following capabilities:

This use case is a high-level description of a user interface that gives an integrated view across these two collections, including search and retrieval functions that do not require the user to interact separately with the two collections of information. The researcher is an independent third party.


4.1.2   Preconditions

  1. Researcher wishes to investigate a matter with relevant sources in both UK and European Parliament collections.

  2. Researcher has a tool driven by the TMRM, called the RM-Nav below.

  3. Each collection includes metadata, for example Dublin Core.

  4. Access to both collections is available, through an interface supporting querying of subject index & metadata, and retrieval of documents.

  5. A "researcher's friend" ontology is available -