|Title:||Topic Map Constraint Language (TMCL) Requirements and Use Cases|
|Source:||Mary Nishikawa, Graham Moore, JTC1/SC34|
|Project:||19756 Topic Map Constraint Language|
|Project editor:||Graham Moore, H. Holger Rath|
|Action:||For Review and Comment|
|Distribution:||National Bodies and Liaisons of SC34|
|Supersedes:||N0405 TMCL Requirements|
|Version:||$Id: requirements.html,v 1.17 2003/10/20 15:05:19 larsga Exp $|
The ISO JTC1 SC34 Project for a Topic Map Constraint Language (TMCL) [N0221] was voted on and approved on 2001-10-5. The NP vote can be found in ISO/IEC JTC 1/SC34 N0259 [N0259], along with the National Body comments on N0226 Draft Requirements, examples and a "low bar" proposal for Topic Map Constraint Language [N0226]. At the time this NP was submitted, a data model was proposed, and since this model would be needed to develop TMCL, work was suspended until development of the model had sufficiently progressed. It was decided in Baltimore 2002 to resume work on TMCL and begin collecting requirements. The changes in this document over N0226 are based on the comments received from National Bodies on N0226, plus additional comments gathered from email@example.com and discussions on irc://irc.freenode.net/#topicmaps.
TMCL will provide a means to express constraints on classes of topic maps conforming to the Data Model [DM] for ISO/IEC 13250:2002 Topic Maps. Its goal will be to provide a language such that a topic map can be said to conform to a set of constraints. A topic map can be said to conform if it is successfully validated against the set of constraints. TMCL may help optimize both topic map storage and TMQL [N0249] queries based on schema information. It may also aid applications in providing more intuitive user interfaces for creating and maintaining topic maps.
This document, split into three core sections, outlines the general requirements that the language should meet, provides technical use cases (these are data structure level use cases), and provides use cases that cater to the wider deployment of TMCL (such as illustrating how TMCL could be used in a document management solution to provide metadata authoring rigor).
The keywords "MUST," "MUST NOT," "REQUIRED," "SHALL," "SHALL NOT," "SHOULD," "SHOULD NOT," "RECOMMENDED," "MAY," and "OPTIONAL" will be used in this document as defined in [RFC 2119].
"Topic Map Constructs" are the information items described in [DM].
"TMCL Schema" can be defined as a collection of constraints (defined using TMCL) used together for some purpose.
"Schema Introspection" is the ability to interrogate the data model of a set of constraints.
A "Selector" is an expression that locates or identifies some set of topic map constructs.
A "Restrictor" expresses conditions upon items identified by the selector.
"Typing Constraints" are constraints that only apply to classes of topics or classes of associations.
"Free Ranging Patterns" are constraints that apply to constructs that can be arbitrarily dispersed throughout a topic map.
"Constraint Reification" is the ability to represent a constraint as topics in some topic map.
TMCL must be specified in terms of the Topic Map data model.
TMCL may define a set of validation exceptions. Where these exceptions are defined it must be unambiguous as to what constitutes a violation and what exception occurs. A violation is considered a topic map construct that breaks a TMCL constraint.
Given two schemas, it must be possible to compose them. In cases where there are schema contradictions, these must be communicated to the client by the merging application.
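One possible reading of schema composition can be sketched as follows. The sketch assumes, purely for illustration, that each constraint is an inclusive cardinality range and that composing two schemas means intersecting the ranges; the function name compose and the constraint keys are hypothetical and are not part of TMCL.

```python
def compose(schema_a, schema_b):
    """Compose two schemas and report contradictions.

    Each schema maps a hypothetical constraint key to an inclusive
    (min, max) cardinality range. A missing key means "unconstrained".
    """
    unconstrained = (0, float("inf"))
    merged, contradictions = {}, []
    for key in sorted(set(schema_a) | set(schema_b)):
        lo_a, hi_a = schema_a.get(key, unconstrained)
        lo_b, hi_b = schema_b.get(key, unconstrained)
        lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
        if lo > hi:
            # The two ranges do not overlap: report this to the client.
            contradictions.append(key)
        else:
            merged[key] = (lo, hi)
    return merged, contradictions


schema_a = {"person/name": (1, 2)}
schema_b = {"person/name": (3, 5), "person/occurrence": (0, 1)}
merged, contradictions = compose(schema_a, schema_b)
print(contradictions)  # ['person/name']
```

Returning the contradictions alongside the merged result, rather than raising an error, leaves it to the merging application to decide how to communicate them.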
Schemas should be resources that have their own unique identifiers, such as Published Subject Indicators (PSIs).
Schemas must be able to explicitly extend other schemas in order to reuse constraints while adding new constraints. Schema extension must be a transitive relation; if schema A extends schema B, and schema B extends schema C, then schema A implicitly extends schema C as well.
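The transitivity of schema extension can be illustrated with a small sketch. The schema identifiers and the extends mapping below are hypothetical; the function simply collects every schema reachable through explicit extension statements.

```python
def extended_schemas(schema, extends):
    """Return every schema that `schema` extends, directly or implicitly.

    `extends` maps a schema identifier to the schemas it explicitly extends.
    """
    seen = set()
    stack = list(extends.get(schema, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Follow the extension chain one step further.
            stack.extend(extends.get(current, ()))
    return seen


# Hypothetical identifiers: A extends B, and B extends C.
extends = {"A": ["B"], "B": ["C"]}
print(sorted(extended_schemas("A", extends)))  # ['B', 'C']
```

The seen set also protects against accidental extension cycles, which would otherwise make the traversal loop forever.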
A Topic Map must be able to explicitly commit to a specific schema.
It must be possible to provide metadata for each schema. This metadata should take the form of a topic map.
The language must provide features for relating different versions of the same schema. This should include features for relating revisions to prior versions, explicit statements of backwards-compatibility, and the ability to deprecate identifiers (i.e., to state they are available for backwards-compatibility only, and should not be used in new applications/documents.)
The language must be able to constrain topic class hierarchies. This includes, but is not limited to, sub classing and transitivity.
The language must be able to express constraints on topic characteristics. These include, but are not limited to, names, occurrences, identifiers and roles in associations.
TMCL will not specify its own set of basic data types. However, it will be enabled to utilize data typing as defined by standards such as XML Schema data types, Relax-NG or DSDL.
The language must be able to express that a set of topic map information items and/or literals constitutes the allowed values of a property, be it in relation to topic types, scope or allowed role players.
The language must support the specification of cardinality restrictions on topic map structures.
The language should have one and only one standardized XML serialization syntax.
Note: Text will be provided from updated TMQL requirements (in progress).
The language must be able to constrain all topic map structures defined in the data model for Topic Maps.
Anything retrievable by TMQL can be constrained.
The language must allow for topic map constraints to apply to several disjoint topic map structures, i.e., TMCL will not just restrict the properties of a topic of a given class, or the role types of a given association type.
The language must be constructed in such a way that it is possible to introspect any given schema for the purpose of user interface creation or automated semantic processing.
TMCL will have a defined data model, preferably the data model for Topic Maps [DM].
TMCL should enable alternative serializations of a TMCL schema.
TMCL must define a) a syntax for expressing constraints, b) a model for the internal representation of these constraints, and c) the behavior of TMCL validation.
TMCL should be defined in two parts. TMCL Big will allow for the constraint of any topic map construct; TMCL Lite will only enable the constraint of commonly used topic map constructs such as topic type and association type.
Some examples include constraints that
Some examples include
These may include but are not limited to valid combinations of locator (URI pattern), scope, and parent topic.
Internal occurrence data types and values should be constrained and used when constraining other aspects of the topic map.
These are selectors that
A selector may restrict a topic to be used for scoping topic characteristic(s): base names, occurrences and association roles.
A selector may restrict which topics can be used as scoping topics.
Cardinality will be defined, for example, for
TMCL should allow different combinations of logical connectives and quantifiers such as exists, forall, and, or, not, implies and equivalent.
TMCL should represent information about equality (= and /=).
TMCL should include basic constructs for making statements about
TMCL may include basic constructs for making statements about string-based data using regular expressions.
TMCL may allow the introduction of "virtual" (extended) constructs which can be reused in defining new validation rules (a kind of "virtual" relations/associations).
TMCL may create named complex conditions.
A prescriptive validating topic map application MUST ensure that only instances of schema constructs exist in the map, i.e., only topics of types defined in the schema can exist and that all constraints defined are conformed to, in the topic map.
With permissive validation, any constraints defined must be met, but any structures not governed by the constraints are not deemed to be in violation of the schema.
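The difference between the two validation modes can be sketched as follows. The sketch assumes that prescriptive validation rejects topics whose type is not defined in the schema, while permissive validation checks only the constraints that are declared and tolerates everything else. The schema form (topic type mapped to a minimum number of names) and the topic tuples are hypothetical.

```python
def validate(topics, schema, prescriptive):
    """Return a list of (topic_id, reason) violations.

    topics: iterable of (topic_id, topic_type, name_count) tuples.
    schema: dict mapping a topic type to its minimum number of names
            (a hypothetical constraint form used only for this sketch).
    """
    violations = []
    for topic_id, topic_type, name_count in topics:
        if topic_type not in schema:
            # Not governed by the schema: only an error in prescriptive mode.
            if prescriptive:
                violations.append((topic_id, "type not defined in schema"))
            continue
        if name_count < schema[topic_type]:
            violations.append((topic_id, "too few names"))
    return violations


schema = {"person": 1}
topics = [("erica", "person", 1), ("brisbane", "city", 0), ("anon", "person", 0)]
print(validate(topics, schema, prescriptive=True))
print(validate(topics, schema, prescriptive=False))
```

In this example only the prescriptive run reports the topic of type city, because the schema says nothing about that type.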
This section provides use cases that are based on topic map constructs. These low-level constraints help to define the scope and shape of the building blocks of TMCL. The examples in this section show how people would constrain constructs such as names and identities.
General rules for all constraints:
Elements of association definition in Topic Maps:
This particular constraint defines what topics can be used as association type topics. For example, the topic "student" is not the right topic to be used as an association type topic, while the topic "is-studying-at" is.
a. Topic T can be used as an association type topic.
Example: Association of type born-in defines a class of associations.
Restriction: Further restrictions may be enforced on this particular constraint such as limiting the use of a particular topic as an association type topic and nothing else.
This particular constraint defines the scope a particular association should have. This may also restrict which topics may be used to scope an association.
a. Association with association type A must be in scope S.
Example: Association of type born-in must be in scope biography.
b. Topic T can be used to scope the association (used as an association scope).
Example: Topic biography can be used to scope association.
Restriction: Further restrictions may be enforced on this particular constraint. For the second usage, the use of a particular topic may be limited to scoping associations and nothing else.
This particular constraint defines which association roles an association should have, and it also enforces cardinalities of the association roles. An association must have association roles, and this constraint defines which roles should be included with the association.
a. Association with association type A has only roles R1 and R2.
Example: Association of type born-in must have roles person and place.
b. Association with association type A has at least the roles R1 and R2.
Example: Association of type employed-by must have at least the roles employee and employer.
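The two usages above differ only in whether additional roles are tolerated. A minimal sketch, assuming an association is reduced to the set of its role types (the function name check_roles is hypothetical):

```python
def check_roles(roles, required, exact):
    """Check the role types present in one association.

    exact=True  -> the association has only the required roles (usage a).
    exact=False -> the association has at least the required roles (usage b).
    """
    roles, required = set(roles), set(required)
    return roles == required if exact else required <= roles


# born-in must have only the roles person and place.
print(check_roles({"person", "place"}, {"person", "place"}, exact=True))   # True
# employed-by must have at least the roles employee and employer.
print(check_roles({"employee", "employer", "contract"},
                  {"employee", "employer"}, exact=False))                  # True
```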
This particular constraint defines which topics can be used as association role topics. The topic born-in is not an appropriate choice as an association role topic, but the topic employee is.
a. Topic T can be used as an association role topic.
Example: Topic container can be used as association role topic.
b. Topic T can be used as an association role topic in association with association type A.
Example: Topic killer can be used as an association role topic within association of type murder.
Restriction: Further restrictions may be enforced on this particular constraint such as limiting the use of a particular topic as association role topics only and nothing else, whether or not within a specific association only.
This particular constraint defines which topics can be used as association player topics. The topic born-in is not an appropriate choice as an association player topic, but the topic Erica is.
Example: Topic of type student can participate in an association.
b. Topic of type T can be used as association player topic or participate in an association with association type A.
Example: Topic of type student can participate in association with association type studying-at.
Restriction: Further restrictions may be enforced on this particular constraint. As in the first usage, limit the use of a particular topic as association player topics only and nothing else, whether or not within a specific association only.
This constraint defines what an association should look like.
a. Association of type A must have (only/at least) two participating topics where one is of type T1 and the other is of type T2.
Example: Association of type employed-by must have at least two participating topics where one is of type employee and the other is of type employer.
b. Association of type A must have the role R being played by a topic of type T.
Example: Association of type studying-in must have the role student being played by a topic of type student.
c. Association of type A has role R played by exactly two topics of type T.
Example: Association of type project-partnership must have the role partner being played by two topics of type businessman.
d. Association of type A has role R1 played by topic of type T1 and role R2 played by topic of type T1 or T2.
Example: Association of type born-in must have the role mortal being played by one topic of type person and role birthplace played by one topic of type location.
Description: Constraint association of type A such that
Example: Association of type located-in must have the roles container and containee, such that the container may be
- a topic of type country, in which case the containee must be of type state, county, or city
- a topic of type state, in which case the containee must be of type county or city
- a topic of type county, in which case the containee must be of type city
- a topic of type city, in which case the containee must be of type theatre.
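The located-in example is a conditional constraint: the type of the container determines which containee types are allowed. A minimal sketch, using a lookup table built directly from the example (the names ALLOWED_CONTAINEES and valid_located_in are hypothetical):

```python
# Container type -> containee types allowed by the located-in example.
ALLOWED_CONTAINEES = {
    "country": {"state", "county", "city"},
    "state": {"county", "city"},
    "county": {"city"},
    "city": {"theatre"},
}


def valid_located_in(container_type, containee_type):
    """Return True if the pair of player types satisfies the constraint."""
    return containee_type in ALLOWED_CONTAINEES.get(container_type, set())


print(valid_located_in("state", "city"))     # True
print(valid_located_in("city", "country"))   # False
```

Container types that do not appear in the table are rejected outright, which corresponds to reading the example as an exhaustive list.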
Elements of topic
Topics can have names, whether explicit or implicit. This constraint limits the existence and/or value of the explicit names of a topic.
a. Topic of type T must have a specified number of explicit names (cardinality).
Example: Topic of type person must have two explicit names, i.e., one is a nickname and the other is the full name.
b. Topic of type T must have at least one explicit name.
Example: Topic of type student must have at least one registered name.
c. Topic of type T must have a name where the value matches the particular pattern.
Example: Topic of type tutorial must have a name containing the word tutorial.
d. Topic of type T must not have a name.
Example: Topic of type time must not have any explicit name.
e. Topic of type T must have a name with scope S.
Example: Topic of type English-man must have a name with scope en (meaning English).
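The name constraints above combine cardinality, value patterns and scope. A minimal sketch, assuming a topic's explicit names are given as (value, scope) pairs and that patterns are regular expressions (the function name check_names and its parameters are hypothetical):

```python
import re


def check_names(names, min_count=0, max_count=None, pattern=None, scope=None):
    """Check the explicit names of one topic.

    names: list of (value, scope) pairs.
    min_count / max_count: cardinality bounds (max_count=0 means "no names").
    pattern: regular expression at least one name value must match.
    scope: scope at least one name must have.
    """
    if len(names) < min_count:
        return False
    if max_count is not None and len(names) > max_count:
        return False
    if pattern is not None and not any(re.search(pattern, v) for v, _ in names):
        return False
    if scope is not None and not any(s == scope for _, s in names):
        return False
    return True


# A tutorial topic must have a name containing the word "tutorial".
print(check_names([("XTM tutorial", "en")], min_count=1, pattern=r"tutorial"))  # True
# A topic of type time must not have any explicit name.
print(check_names([("noon", None)], max_count=0))                               # False
```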
This constraint is to ensure that a particular topic has the occurrences as required.
a. Topic of type T must have a specified number of occurrences (cardinality).
Example: Topic of type tutorial must have at least two occurrences, one is the main reference and the other is backup reference.
b. Topic of type T must have at least one occurrence following a particular pattern.
Example: Topic of type tutorial must have at least one occurrence which has the pattern http:// (denotes it is a web page).
c. Topic of type T must have an occurrence that is of type O.
Example: Topic of type tutorial must have an occurrence that is of type homepage.
d. Topic of type T must have an occurrence in scope S.
Example: Topic of type country must have occurrence of type map in the scope geography.
e. Topic of type T must not have any occurrence.
Example: Topic of type is-doing-project must not have any occurrences.
This constraint is to ensure that a particular topic has the subject inline as required.
a. Topic of type T must have a specified number of subject indicators (cardinality).
Example: Topic of type tutorial must have at least one subject indicator.
b. Topic of type T must have at least one subject indicator.
Example: Topic of type project must have at least one subject indicator no matter what the value is.
c. Topic of type T must have a subject indicator in scope S.
Example: Topic of type person must have a subject indicator in scope social-security-number.
d. Topic of type T must have a subject indicator of type O.
Example: Topic of type country must have a subject indicator of type map.
e. Topic of type T must have a subject indicator with a particular value or match a pattern P.
Example: Topic of type country must have the subject indicator: http://www.oasis.org/psi/country/.
f. Topic of type T must have a subject indicator defined in a certain registry.
Example: Topic of type country must have a subject indicator defined in the registry at http://www.oasis-open.org/psi/country.
g. Topic of type T must not have any subject indicator.
Example: Topic of type undefined must not have any subject indicator.
This constraint defines which topics can be used for topic typing.
a. Topic T can be used for typing other topics and nothing else.
Example: Topic person can be used for typing other topics, while the topic Erica is not appropriate to use for typing other topics.
b. Topic T can be used for typing occurrences.
Example: Topic map can be used for typing occurrence, and nothing else.
c. Topic T can be used for typing subject indicator.
Example: Topic locator can only be used for typing subject indicator and must not occur anywhere else.
This constraint defines which topics can be used for scoping, and what they should scope.
a. Topic of type T can only be used for scoping basenames and nothing else.
Example: Topic of type English can only be used for scoping names and nothing else.
b. Topic of type T can only be used for scoping occurrences and nothing else.
Example: Topic of type map can only be used for scoping occurrences and nothing else.
c. Topic of type T can only be used for scoping associations and nothing else.
Example: Topic of type doing-something can only be used for scoping associations and nothing else.
This constraint defines the allowed instances of a type and may include cardinality.
a. A list of topics are instances of topic type T.
Example: Instances of topic type primary-color can only be red, blue, yellow.
b. There are at least 3 topics that are instances of topic type T.
Example: There have to be at least 5 topic instances of topic type student.
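Both usages can be checked against the set of instances of a topic type. A minimal sketch, assuming instances are identified by plain strings (the function name check_instances and the student identifiers are hypothetical):

```python
def check_instances(instances, allowed=None, min_count=0):
    """Check the instances of one topic type.

    allowed: if given, every instance must be one of these topics (usage a).
    min_count: minimum number of instances required (usage b).
    """
    instances = set(instances)
    if allowed is not None and not instances <= set(allowed):
        return False
    return len(instances) >= min_count


primary_colors = {"red", "blue", "yellow"}
print(check_instances({"red", "blue"}, allowed=primary_colors))   # True
print(check_instances({"red", "green"}, allowed=primary_colors))  # False
print(check_instances({"s1", "s2"}, min_count=5))                 # False
```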
This constraint defines which association a particular topic should participate in.
a. Topic of type T must participate in association type A.
Example 1: A topic of type person must be the only member in association born-in who plays the role mortal.
Example 2: A topic of type student must participate in an association studying-at.
This constraint defines valid combinations of scopes, locators, and parent topics.
a. Valid combinations of the use of scopes S1, S2, S3 are S1, S2 and S2, S3 only.
Example: Valid combinations of topic scopes English, French, and beginner are English - beginner and French - beginner.
b. Scope T1-T2 defined by the topics T1 and T2 to be used to qualify characteristic assignments of type C.
Example: Scope English beginner and French beginner may only qualify occurrence of type description.
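Because a scope is an unordered set of topics, valid scope combinations can be modelled as a set of sets. A minimal sketch based on the English beginner and French beginner example (the names VALID_SCOPES and valid_scope are hypothetical):

```python
# Each allowed scope is an unordered set of scoping topics.
VALID_SCOPES = {
    frozenset({"English", "beginner"}),
    frozenset({"French", "beginner"}),
}


def valid_scope(scope_topics):
    """Return True if the given scoping topics form an allowed combination."""
    return frozenset(scope_topics) in VALID_SCOPES


print(valid_scope(["beginner", "English"]))  # True, order does not matter
print(valid_scope(["English", "French"]))    # False
```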
This constraint defines where occurrences may appear.
a. Occurrence of type O can only be a characteristic of topics of type T.
Example: Occurrences of type map may only be characteristics of topics of type country.
This constraint defines the valid occurrence types and scopes.
a. Occurrence of type O can only be used within scope S.
Example: Occurrence of type download can only be used within the scope of internet.
This constraint defines the valid occurrence value.
a. Occurrence of type O must have locators that match a URI pattern P.
Example: Occurrences of type map must have a URI that matches the pattern http://www.maps.com/.
By default, it is defined that if topic T is a subtype of topic A, and topic A is a subtype of topic S, then topic T is a subtype of topic S. This constraint allows other ways to define the supertype-subtype relationship.
a. Topic A is subtype of topic T, and topic T is subtype of topic B, but topic A is not a subtype of B.
This constraint defines the transitivity rule for association roles within an association.
a. If topics T1 and T2 play roles R1 and R2 in association A, and topics T2 and T3 play roles R1 and R2 in association A, then topics T1 and T3 are transitively related.
Example: Queensland plays role containee and Australia plays role container in association with association type contained-in; Brisbane plays role containee and Queensland plays role container in association with association type contained-in. Transitivity rule therefore defines Brisbane plays role containee and Australia plays role container in association with association type contained-in.
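The transitivity rule can be illustrated by computing the closure of a set of (containee, container) pairs. This is a minimal sketch that assumes each contained-in association is reduced to such a pair; the function name transitive_closure is hypothetical.

```python
def transitive_closure(pairs):
    """Return all (containee, container) pairs implied by transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a is contained in b, and b (== c) is contained in d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


contained_in = {("Queensland", "Australia"), ("Brisbane", "Queensland")}
print(("Brisbane", "Australia") in transitive_closure(contained_in))  # True
```

The loop repeats until no new pair is added, so longer containment chains are handled as well.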
This constraint defines the data type for a particular topic characteristic, or even for any value contained within a topic map.
a. Topic of type T must have a characteristic and the value has to be a particular datatype.
Example 1: Topic that is of type product must have an inline characteristic which is price and the value should be of data type currency.
b. Topic of type T must have an occurrence, and the value has to be a particular data type.
Example 2: Topic must have an occurrence that can be either isURL, isURN, isURI, isNumber.
Topic Maps allow merging between topic map documents; therefore, when two topic maps are merged, TMCL should also allow the constraint schema for one topic map to be merged with the constraint schema for the other.
Example: Topic map A is created following the constraint OA, while topic map B is created following the constraint OB. When topic map A is merged with topic map B then the two constraints OA and OB need to be merged as well.
Some constraints may be enforced on the value of the particular topic characteristics, e.g., number calculation.
Example: A customer can only order a maximum value of $100. This constraint is explained further in section 4.1 of this document.
This section provides use cases that are more high level than those in the previous sections. It provides use cases that describe the wider application of TMCL to business problems and specific deployment domains. These constraints are intended to keep the big picture in mind when developing the low level building blocks of TMCL in order to support the data structure use cases.
TMCL shall be able to aid applications in providing more intuitive user interfaces.
TMCL should support schema-driven presentation of information.
Comment: Without TMCL, software viewers can only show topic map constructs presented explicitly in a topic map. A TMCL schema makes it possible to present "what is expected," "what is entered already," and "what is still missing" in a topic map.
TMCL should support schema-driven information editing.
Comment: Schema-based software editors can automatically generate visual forms to support editing existing topic map constructs and entering new "missing" or "possible" topic map constructs.
TMCL should support context-sensitive, schema-optimized entering of information by users.
Comment: When users try to enter new information, software editors can suggest a list of all entered-before topics as candidates for "new" items. In real life scenarios, it can include thousands of topics. TMCL Schema-based editors can optimize the list of "candidates" based on context. Example: Topic is instance of Type(s) and constraints.
TMCL may provide support for generating topic map user interfaces with such technologies as XSLT, XForms or other equivalent technologies.
TMCL should support introspection
The presence of a TMCL schema may also allow applications to improve the results of merging topics/topic maps by providing enough information to allow implementations to do additional transformations and redundancy removal. See SAM issue merge-use-of-schemas and references to it.
TMCL will have at least one human-friendly syntax, preferably one based on XML. It may be based on XTM.
TMCL should have a standardized representation in topic map form using published subjects defined in the TMCL standard.
TMCL will make use of TMQL to define both selectors and restrictions. In other words, TMCL will use TMQL to locate the topic map constructs or properties to be constrained and TMQL will use TMCL to identify particular situations with the task of TMQL only being the generation of output.
A model may be created. It may be useful in helping us to understand how TMCL makes use of TMQL to define selectors and restrictions.
Note: TMCL should support the validation of topic map data authored by humans.
The results of the validation should either be confirmation that the topic map satisfies the given set of constraints or, if any constraint is not satisfied, error messages. These error messages may also include warnings, depending on the validation rules (refer to R2 on section 3). Results of the validation may produce a virtual topic map.
The language should be based on the next version of ISO 13250 as defined by [N0323] and may also make use of ISO 18048 TMQL [N0249]. It may make use of [XML Schema] Part 2 and OWL [OWL].
E-Sell Corp. has many local offices, some of them large and others small, that all maintain their own customer and order databases. In order to provide unified access to all of these, E-Sell plans to use Topic Maps. The contents of the various databases, which all have slightly varying schemas, will be mapped into Topic Maps which will then be merged. Some information may also be authored in topic map form. In order to ensure consistency across all these topic maps, they must be constrained to follow certain structural rules.
A typical solution for knowledge management is a relational database. The following ER diagram shows how the information should be represented in a relational database.
This particular ER diagram can be represented as a Topic Map. The mapping will be from a relational database to a topic map document (note that this is only a suggested mapping).
The entities, as shown on the ER diagram, are customer, product, and order.
Each customer is represented by a topic that has the topic-type customer. Attributes of the customer can be represented as the topic characteristics. Customer name is represented as the topic's basename, address as topic's occurrence, and the email address as topic's occurrence as well. In addition to that, customer may also have a customer id that may be represented as both occurrence and subject identifier.
Each product is represented by a topic that has the topic-type product. Attributes of the product can be represented as the topic characteristics. The product name may be represented as basename, and the product price can be represented inline. The product id can be represented as both an occurrence and subject identifier.
Each order is also represented by a topic that has the topic-type order. Attributes of the order can be represented as the topic characteristics. Order ID is represented as both occurrence and subject identifier.
The relationships between the entities, as shown on the ER diagram, are orders and contains.
Each relationship may be represented by an association. The relationship orders may be represented as an association of type is-making-order with two association roles customer and order. The relationship contains can be represented as an association with association-type contains with two association roles order and product.
After designing the topic map structure for this particular application, constraints or rules should be defined to ensure the consistency of data contained within the topic map document. Several possible constraints for this particular topic map document can be defined.
Many more constraints will be required to limit and ensure the consistency of the data contained in the topic map document.
These constraints for a topic map document should be able to be represented in Topic Map Constraint Language. Therefore, TMCL in this particular application must be able to define the following constraints.
Business rules may also be represented in TMCL; for example, a customer may order up to a maximum of $100. The is-making-order association, which holds all the orders that a customer makes, should help ensure this: for all the topics playing the role order, the total order value is calculated and checked against the $100 limit.
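The order limit rule is a constraint on a calculated value rather than on structure. A minimal sketch, assuming the totals of a customer's orders have already been collected from the topics playing the role order (the names ORDER_LIMIT and within_limit are hypothetical):

```python
from decimal import Decimal

# The $100 limit from the business rule.
ORDER_LIMIT = Decimal("100")


def within_limit(order_totals, limit=ORDER_LIMIT):
    """Return True if the sum of a customer's order totals stays within the limit."""
    return sum(order_totals, Decimal("0")) <= limit


print(within_limit([Decimal("40"), Decimal("60")]))     # True, exactly at the limit
print(within_limit([Decimal("40"), Decimal("60.01")]))  # False
```

Decimal is used instead of float so that currency amounts are compared exactly.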
E-Sell Corp.'s local offices may update their database from time to time, and these changes have to be incorporated to the topic map construct. These updates may include price list updates, new customers, new products, etc.
Incoming data need to be added to the current topic map construct. In order to keep the consistency of the current topic map, the new data needs to be validated to check its consistency. The new data can be added manually to the topic map construct via an authoring interface. Another option is to construct a topic map document for the new data and then merge it with the current topic map construct.
If the incoming data is in a topic map construct, then merging should take place. This incoming topic map TM2 may have constraints C2 defined for itself, but it has to be validated against the constraints C1 of the E-Sell topic map constructs before or during the merging process. The constraints C1 and C2 may need to be merged along with the topic map documents. The merging of constraints is, however, independent of the merging of topic maps; it is up to the author to decide whether merging the constraints is required.
Within E-Sell topic map constraints, the constraint language should be able to define the merging rules. During the merging, TMCL is able to filter "bad" data and only add the necessary data to E-Sell topic maps document.
If the incoming data is instead added manually to the topic map document, the current document needs to be edited. TMCL should act as a template, and the editor is schema-based. Any changes made to the current document are made through a schema-based editor. Any changes that do not satisfy the template will be rejected, so the topic map document is kept valid even after editing.
E-Sell Corp.'s product database is currently in a relational database. These products are grouped in different product categories, and E-Sell is currently facing a problem with adding a new product category, since different product categories require different information to be kept in the database.
Products may have categories, and each category relates to a set of product characteristics. These categories are
Each of E-Sell's products has different characteristics that are related only to a particular category. Modelling this in a relational database can be very ugly; native topic map storage may therefore provide a proper solution. Adding a new category in a topic map can be straightforward, and a different product category may have specific constraints imposed on it.
A new product category is just another topic with required characteristics. The constraint language can help us define which characteristics this particular product category must have and it can also define the merging rules.
However, these may be used for illustrative purposes in usage scenarios.
[Issue 2] If a vocabulary is not within the scope of TMCL then how is the vocabulary merged with the schema? An ontology is the vocabulary plus its structure.
TMCL will not require the use of any registry or repository.
It is recommended that a test suite be developed, but it will not be within the scope of this ISO project of work. However, we encourage other standards bodies such as OASIS to consider it.
If a suite were developed, it could possibly consist of a set of topic maps and schema and a superstructure that tells you
The progress of TMCL development rests on dependencies on the topic map family of standards [N0323] [N0388] and may also be dependent on the Web Ontology Language [OWL] and XML Schema Part 2: Datatypes [W3C Schema]. As these standards progress so does TMCL development. With changes in dependent models such as the data model for Topic Maps, the XTM Syntax specification, and TMQL, backwards compatibility must always be kept in mind to ensure integration with all standards TMCL is dependent on.
We are indebted to Lars Marius Garshol, Steve Pepper, Robert Barta, Erica Santosaputri, Dmitry Bogachev, Kal Ahmed, Bernard Vatant, Eric Freese, Anthony B. Coates, and members of firstname.lastname@example.org and irc://irc.freenode.net/#topicmaps for the source of requirements and use cases and for discussions of this draft. Special thanks go to Erica Santosaputri for providing the use case examples and usage scenario.