[sc34wg3] Alignment of N0396 with N0393

Lars Marius Garshol
27 Apr 2003

Jan Algermissen
| The alignment of N0396 with N0393 is now online at
| http://www.isotopicmaps.org/tmmm/TMSM-1.3/TMSM-1.3.html

My first question about this document is how it is intended to be
used. Is it just a rough draft showing a first attempt at a SAM-RM
mapping that is intended to live together with SAM and RM, or is it
intended as a replacement for N0396? Clarification on this would be
welcome.

On reviewing the document, a number of shortcomings become apparent.
Some are simple bugs (like the definition of the 'text' data type),
others are omissions, and a last category is more subtle. I think it's
clear that the simple bugs can easily be fixed, and it would seem that
the omissions can also be handled (though I'm not too sure). The last
category I am not sure the RM can handle at all.

The omissions are obvious, and some are even stated in the document.
The TMA leaves out variant names themselves, the PSIs for variant
names, and the unique-characteristic PSI. These should be added once
the author has time to do so.

The subtle problem is that the merging rules are wrong. Subject
identifiers and source locators share a namespace, but this is not
modelled. Further, base names are merged here if they have the same
string value, even if their scopes, types, and parent topics are
different. Finally, I can see no indication that duplicate topic
characteristics are removed as they ought to be.
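
To make the merging objections concrete, here is a minimal Python
sketch. It is not taken from the SAM, the RM, or the TMSM document;
the class and function names (BaseName, Topic, names_by_value_only,
names_by_full_key, topics_should_merge, dedupe) are hypothetical, and
the rules are only my reading of the paragraph above: base names
should count as equal only when value, scope, type, and parent topic
all match, and subject identifiers and source locators should be
compared as one pool.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class BaseName:
    # Hypothetical, simplified stand-in for a base name item.
    value: str
    scope: FrozenSet[str] = frozenset()
    type: Optional[str] = None
    parent: Optional[str] = None  # id of the owning topic


@dataclass
class Topic:
    # Hypothetical, simplified stand-in for a topic item.
    id: str
    subject_identifiers: set = field(default_factory=set)
    source_locators: set = field(default_factory=set)


def names_by_value_only(a: BaseName, b: BaseName) -> bool:
    # The behaviour objected to: only the string value is compared.
    return a.value == b.value


def names_by_full_key(a: BaseName, b: BaseName) -> bool:
    # Assumed intended rule: value, scope, type and parent must all match.
    return (a.value, a.scope, a.type, a.parent) == (
        b.value, b.scope, b.type, b.parent)


def topics_should_merge(a: Topic, b: Topic) -> bool:
    # Treat subject identifiers and source locators as one shared
    # namespace: any locator in common triggers a merge.
    ids_a = a.subject_identifiers | a.source_locators
    ids_b = b.subject_identifiers | b.source_locators
    return bool(ids_a & ids_b)


def dedupe(names):
    # Drop exact duplicates, keeping first-seen order. The frozen
    # dataclass compares and hashes on all four fields.
    return list(dict.fromkeys(names))
```

With these definitions, two base names "Oslo" in different scopes on
different topics are equal under names_by_value_only but not under
names_by_full_key, and a topic whose subject identifier equals another
topic's source locator is reported as a merge candidate.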

What is much worse is that I am not convinced that the RM as it
currently stands can overcome these problems. So, in order to prove
conclusively that the SAM can be modelled in the RM, these problems
would have to be corrected.

SAM associations are also modelled as RM assertions, despite the fact
that these are structured differently, and there are some other warts.
How serious these are is difficult to judge.

Another subtle problem is that a number of things that are handled by
the infoset formalism in the SAM are done in prose in the RM, which in
my opinion is not good, and it also seems to indicate that the RM as a
formalism is less suited than the infoset is.

This would also seem to be borne out by the fact that this document is
much less understandable than the SAM is. Back in 2001 when we were
discussing suitable formalisms for modelling topic maps I proposed
using EXPRESS, an existing and very suitable ISO standard, but was
told that, however suitable it might be, the problem was that most
people did not know it, so it would not communicate well. The RM
suffers even more from this, of course.

The RM only adds value over and above the infoset if it simplifies the
expression of the SAM. To me it seems obvious that rather than
simplifying the SAM it considerably complicates the modelling of it.

