[sc34wg3] New SAM draft

Martin Bryan sc34wg3@isotopicmaps.org
Thu, 5 Dec 2002 14:33:06 -0000


Lars Marius

As I shan't be able to get to the part of the WG3 meeting scheduled to
discuss the SAM I thought I would submit comments by e-mail and hope that someone
could bring them up at the appropriate point in the discussions.

Abstract
Rather than saying "will supersede ISO 13250" I suggest we state that it
"will augment ISO 13250".

1 Introduction, para 3
"Clearly documented" is judgemental: "documented" is sufficient, unless you
want to make it clear that such documentation is publicly available, in
which case "publicly documented" would be better.

1 Issue (scope-extension)
The use of the word scope in this context is ambiguous. I take it that we
are not referring to topic map scopes but the scope of the standard itself.
ISO 13250 is certainly not "restricted to only defining the issues related
to the interchange of topic maps". It defines architectural forms to which
topic map architectures must conform, irrespective of whether or not the
architectures are interchangeable.

3.1 1. [notation]
The statement "If not, the two first characters of the string must be x-" is
invalid for the HyTime locator notation and for any other ISO defined
locator, which must conform to the ISO rules for naming notations. I suggest
you change "If not" to "For user-defined notations".

3.2 and 3.3
A better explanation is needed of why source locators can be sets. Can such
sets only be created using merging rules? (If so, say so specifically.) If
not, make it clear how two source locators can be assigned to a single
subject. (Remember that ISO 13250 does not formally define source locators.)

3.3 Para immediately preceding SAM Constraint heading
What happens if the only difference between two entries is the source
locator chosen to reify the topic map item? Surely this isn't a conformance
issue!

3.4 Issue (subject-vs-resource)
A subject and a RFC 2396 resource are definitely not the same thing. A
subject is a categorization of a resource. It serves as metadata with
respect to it. Acts Ch12 v5 is a perfectly good subject. It has millions of
potential resources (at least 10 in the house I am currently in!). For each
of these potential resources the subject can serve as metadata "for one
small verse from a large collection of clearly identified verses" (not that
all bibles do clearly identify all verses!).

3.4 4th para
"Every topic represents one, and only one, subject" is incorrect. Suppose I
create a supertype node that brings together the subjects of "astronomy" and
"quantum theory". There is no known name for this subject so I create a new
topic and name it "Astronomy & Quantum Theory". Is this topic really about
only one subject? (I know what you are trying to get at, but the wording is
inaccurate!)

When you say "it is not clear that they represent different subjects" you
are introducing a double negative. It would be better to be positive and
state that "it is clear that they represent the same subject".

3.4.1 Subject address
The definition of the term subject address is inadequate to make it clear
where and when subject addresses can be used. To me "a locator that refers
to the information source that is the subject of a topic" is a statement
that could refer to any occurrence identified by the topic. I do not think
this is how the term is used within the SAM, but why it should not be used
in this way is unclear. A clearer distinction between subject identifier
and subject address is required.

3.4.1 Issue (term-subject-indicator-def)
Acts Ch11 V5 is a subject indicator that does not refer to "a specific"
information resource, but could still exist as a locator of the form
urn:purl:bible:JamesI:Acts:Ch11:V5 or some equivalent identifier.

3.4.1 Issue (term-subject-address-def)
Re "Does it represent that storage location" the answer must be no. What if
the address is reassigned to different hardware containing a copy of the
referenced resource? What if the address is notational, as in the above
example? What if there are multiple copies of a particular resource (whether
notational or not, as per retrieval from caches rather than the original
resource)? This must be left an open issue for applications to determine.

3.4.1 Issue (topic-naming-constraint)
For backward compatibility with version 1 of the standard the constraint
should be retained, though a means of deliberately overriding it can be
provided as an extension.

3.4.3 Scope
When rewriting be aware of the conflict between the word "all" in the first
para and the word "each" in the Issue statement. Each is better as scope
means "in one or more of these contexts".

3.4.3 Issue (scope-unconstrained-rep)
Unconstrained scopes could be represented by an empty set.
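
As a rough illustration of the empty-set suggestion (a sketch only: the
class and function names are hypothetical and do not come from the SAM
draft, and the "at least one theme" reading of scope is the one argued for
under 3.4.3 Scope above, not something the draft settles):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedName:
    """Hypothetical stand-in for any scoped item (name, occurrence, ...)."""
    label: str
    # Assumption: an empty set of themes means "unconstrained scope".
    scope: frozenset = frozenset()


def is_unconstrained(item: ScopedName) -> bool:
    # No themes at all: the item is valid in every context.
    return len(item.scope) == 0


def valid_in(item: ScopedName, context: frozenset) -> bool:
    # Assumed reading of scope: valid if unconstrained, or if at least
    # one of the item's themes is present in the current context.
    return is_unconstrained(item) or bool(item.scope & context)
```

With this representation no special "unconstrained" marker topic is needed:
ScopedName("Oslo") is valid in any context, while a name scoped by
{"norwegian", "english"} is valid in a context containing "english" but not
in one containing only "german".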

3.4.2 and 3.4.3
These are suddenly invisible :-)

3.4.4 2nd para
The phrase "or to associate a topic name with certain topics" suggests that
the same name can be assigned to different topics. I would not want to
suggest this as a good idea to anyone, or to suggest that there is a
workaround for it. (This workaround sounds suspiciously like RDF reification,
which I suspect you will find adds too much of an overhead to everything to
be viable.)

3.4.4 3rd para
Do we really want "if an information item has a source locator item that is
equal to one of the items in the [subject identifiers] property of a topic,
that topic item reifies the information item" to apply if the source locator
is associated with an occurrence item? This is what the sentence states at
present. I suggest you may want to qualify this statement so that
"information item" becomes "topic item".

3.4.5
The names in the UML do not match those in the list for [topic names] and
[subject identifiers].

3.4.5 Issue (prop-subj-address-values)
You need to give a clear reason as to why sets are not applicable in this
single instance of a locator property.

3.4.5 Issue (prop-subj-address-scope)
Anything defined as a topic is a valid theme for a scope. This is one of the
reasons why I oppose making built-in subjects into topics.

3.4.5 Issue (strings-as-subjects)
It should be possible to create topics that represent strings. I might want
to create a topic that refers to all occurrences of the string XML (I would
use an XML query to identify the occurrences of this topic!)

3.4.5 SAM Constraint: Source locator and subject identifier namespace
The reason for this constraint is not obvious. Must it be flagged as a fatal
error? The word namespace occurs in the title. These two names are in
different namespaces. Why cannot applications simply assign them to
different namespaces to resolve the possible conflict?

3.5
The UML model, and the overall model, are not conformant with the fact that
ISO 13250 allows multiple sets of base names to be associated with multiple
display names, and multiple sort names. Under this model data would need to
be replicated for each of the base names associated with a set of
display/sort variant names.

3.5 Issue (prop-value)
One vote for the use of [label] rather than [value] for the strings that
make up names.

3.5 Issue (names-with-types) and (names-as-subject)
Base names should be allowed to have types (that's why multiple basenames
are allowed in 13250: you can assign both Acronym and FullName as the types
of name of a subject). But you should be wary of merging based only on
labels because the same label can have different roles. What you need to
base merging on is the combination of type, label and a match of one or more
scopes (i.e. do not require an exact match of scope sets, but require that
the shared label has at least one of the scopes assigned to the name it is
being merged with.)
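
The merging test proposed above can be sketched as follows. This is an
illustration only: BaseName and should_merge are hypothetical names, not
part of the SAM or of ISO 13250, and the treatment of names with no scope
at all is left open here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BaseName:
    """Hypothetical base name with a label, an optional type and a scope."""
    label: str
    name_type: Optional[str] = None   # e.g. "Acronym" or "FullName"
    scope: frozenset = frozenset()    # set of themes


def should_merge(a: BaseName, b: BaseName) -> bool:
    # Merge only when type and label are equal AND the two scopes share
    # at least one theme. An exact match of the scope sets is not required.
    # Names with an empty scope never match in this sketch; how unscoped
    # names should be treated is an open question, not decided here.
    return (
        a.name_type == b.name_type
        and a.label == b.label
        and bool(a.scope & b.scope)
    )
```

Under this rule "XML" as an Acronym scoped by {"computing", "english"}
merges with "XML" as an Acronym scoped by {"english"}, but not with "XML"
as a FullName, and not with an Acronym "XML" scoped only by {"french"}.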

3.7
ISO 13250 does not require that all occurrence types be topics: this is a
constraint only imposed (unreasonably in my view) by XTM.

3.9
ISO 13250 does not require all roles to be topics either. (One reason for
this is that it prevents situations where roles, and occurrence types, get
used to create scopes for topics.)

4.1
The third item in the list is misleading. The subject address locator should
not be OR'd with the other entries, as it is something that is an AND
constraint on the other three options. While item 6 in the list is misleading, in
that there is no constraint stated in 3.4.1 about the sharing of subject
addresses and the statement does not require them to have matching
addresses, the para after the list makes the third item impossible unless
one of the other three is in force.

4.2
Basenames should not be merged if they have different types (see comments
above re 3.5).

4.3
It is bad practice to overwrite any existing map with a merged map. Merging
should always create a new map, just as new items do. You may then
remove the source maps, but you should never change the sources.

Don't you need to apply 4.1 and 4.2 after any such merging?
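
The "never change the sources" point can be sketched like this. It is a
minimal illustration with hypothetical names (TopicMap, merge_maps); items
are reduced to plain strings, and the sketch does not re-apply the 4.1 and
4.2 rules, which is exactly the question raised above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicMap:
    """Hypothetical immutable topic map holding a set of items."""
    items: frozenset = frozenset()


def merge_maps(a: TopicMap, b: TopicMap) -> TopicMap:
    # Build a brand-new map from the union of both item sets.
    # Neither source map is modified (frozen dataclass, frozenset),
    # so the caller decides whether to discard the sources afterwards.
    return TopicMap(items=a.items | b.items)
```

After merged = merge_maps(a, b), both a and b still hold exactly the items
they held before the call, and merged is a separate object.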

5.3 2nd/last para
Change "as a label" to "one of the possible labels" (switching between
variants is an application dependent decision, which is why multiple display
names, of different types, are permitted in 13250 so that you can choose the
type of display that suits your working environment).

Martin Bryan