[sc34wg3] Individual contribution on the U.S. N.B. position on the progression of Topic Map standards

Patrick Durusau sc34wg3@isotopicmaps.org
Sat, 03 Apr 2004 07:52:45 -0500


Robert Barta wrote:
> On Thu, Apr 01, 2004 at 03:48:14PM +0200, Bernard Vatant wrote:
>>>I believe that the four parts of ISO 13250 in progress at the moment
>>>address all four of your points, but as you noted, there is currently no
>>>way for an application to specify merging rules declaratively.
>>I'm not sure to understand what you mean by "specify merging rules
>>declaratively", but it sounds to me a sort of paradox. From the
>>recent thread about merging rules, what I understood was that the
>>debate was about having or not merging rules *at all* in the core
>>standard, since they are procedural specifications.
> Bernard,
> I happily disagree. :-)
> I think Dmitry's assessment of the situation that you can capture all
> merging rules with 'additional statements' is quite correct.

Dmitry is only correct if you think "subject identifier" can, like 
Humpty Dumpty's words, mean whatever you wish it to mean at the moment, 
undisclosed to any author or user of a topic map.

In the context of the TMDM, which is, after all, a data model for XTM 
syntax, subject identifier has a specific meaning, to-wit:

3.24 subject identifier
a locator that refers to a subject indicator

[What is a locator?]

3.11 locator
a string conforming to some locator notation that references one or more 
information resources

[What is a subject indicator?]

3.25 subject indicator
an information resource that is referred to from a topic map in an 
attempt to unambiguously identify the subject of a
topic to a human being. Any information resource can become a subject 
indicator by being referred to as such from
within some topic map, whether or not it was intended by its publisher 
to be a subject indicator.

Dmitry's case of baseName is a good one, but not for the reason posed.

How do I determine, based upon an examination of the syntax of a topic 
map in front of me, that Dmitry has used baseName as the basis for 
subject identity? Certainly not reflected in the syntax of the topic 
map. Not in the TMDM.

Can I apply some TMCL rule as you suggest to reach that result? Sure, 
but that is determining subject identity on an ad hoc basis and not in 
terms of specifying the rules for subject identity prior to processing 
the topic map.

What is the meaning of the data to which I am applying the TMCL rule? As 
far as I can tell, the syntax and TMDM don't say and there is no 
mechanism for that to be made known. I have seen examples that make 
resourceData determinative of subject identity. How am I going to 
distinguish those cases from cases where that is not happening?

This is really the struggle between developers who want to make software 
"do" something and the need to have documentation for why it works the 
way it does. Private knowledge of the meaning of various bits of syntax 
in a particular context is a really poor way to achieve interchange of 
information, and a poor foundation on which to base a standard.

> At least my view is that merging is _always_ application specific, it
> just depends how you identify two things. In this sense a 'merging
> rule' is nothing else than an additional constraint on a map: "It
> SHALL never be that two topics are in one map where .... <and here
> comes some condition involving the two topics>".
> If one accepts that a merging rule is nothing else than a constraint
> then one may also consequently think that this is something which
> should belong in a TMCL statement. This makes sense to me as a TMCL
> document is supposed to constrain the form of a topic map.

Correct on the question being: "how you identify two things." The TMDM 
does not say, nor does it provide a way to say it. In order to apply some 
other rule for that purpose, you have to know what you are applying the 
rule to.
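
To make the point concrete, here is a minimal sketch of Robert's "merging 
rule as a constraint" reading. Everything in it is an assumption made for 
illustration: the Topic class, the two identity rules and the example.org 
identifiers are hypothetical and are not TMDM, TMCL or TMQL constructs.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List, Tuple


@dataclass
class Topic:
    # Hypothetical, simplified topic; not the TMDM topic item.
    topic_id: str
    subject_identifiers: frozenset = frozenset()
    base_names: frozenset = frozenset()


# An identity rule answers: do these two topics represent the same subject?
IdentityRule = Callable[[Topic, Topic], bool]


def by_subject_identifier(a: Topic, b: Topic) -> bool:
    # Same subject if the topics share at least one subject identifier.
    return bool(a.subject_identifiers & b.subject_identifiers)


def by_base_name(a: Topic, b: Topic) -> bool:
    # Same subject if the topics share at least one base name.
    return bool(a.base_names & b.base_names)


def violations(topics: List[Topic], same_subject: IdentityRule) -> List[Tuple[str, str]]:
    # The constraint: no two topics in one map may satisfy the identity rule.
    # Returns the pairs of topic ids that break it.
    return [
        (a.topic_id, b.topic_id)
        for a, b in combinations(topics, 2)
        if same_subject(a, b)
    ]


topics = [
    Topic("t1", frozenset({"http://example.org/psi/paris"}), frozenset({"Paris"})),
    Topic("t2", frozenset({"http://example.org/psi/paris-texas"}), frozenset({"Paris"})),
]

print(violations(topics, by_subject_identifier))  # []
print(violations(topics, by_base_name))           # [('t1', 't2')]
```

The same two topics pass or fail depending on which rule is handed to 
violations(). Nothing in the topics themselves says which rule the author 
had in mind, which is the disclosure gap I am describing.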

For example, you limit the content of a database field to integers.

That rule does not confer a notion of integers on that field. You had to 
have a notion of integers before you could even state the rule, much 
less have it make any sense.

I would submit the same is true for TMCL/TMQL. If I don't know what 
meaning (in terms of subject identity) was attached to particular parts 
of topic map syntax/data model, how is that going to be supplied by a 
TMCL/TMQL statement?

Oh, that is not to say that I could not use TMCL/TMQL to impose 
arbitrary subject identities on a particular topic map, without regard 
to its original authoring, etc., but that is a different case from the 
one under discussion. Even in that case, I think we need to have more 
than ad hoc notions of subject identity to underlie disclosure.

> And that can and....
>>And seems to me that Jim's point is to ask for a RM which would
>>contain only declarative semantics, and not procedural
> ....should be declarative, yes.
> ==
> My impression - and here I speak with the hat of a computer scientist
> on - is that the TM community tries to burden the "data model" with
> all sorts of 'semantical' constraints. I do not think this is a clever
> move and it will bite us later when we have to integrate TM?L.

Note that the TMRM is not trying to burden the data model with semantic 
constraints. It is designed to enable disclosure of the basis for 
subject identity and nothing more.

Quite honestly, I don't know why the TMRM is viewed as competing with the 
data model. It does something quite different and necessary in order to 
talk about subject identity.

> Please note, that this is NOT like building a house, starting from
> ground up and then making the roof.
> \rho
> PS: If someone wants to follow my thought experiments:
>     http://topicmaps.it.bond.edu.au/docs/23/toc

Interesting. I will read in detail over the weekend but one quick 
comment from a "lite" scan.

Note that you presume that the merging rules of the TMDM result in a new 
topic. Actually, I have been told in person (and I assume this exists in 
the email archives somewhere) that whether one follows the rules of 
merging found in the TMDM is an arbitrary thing.

Second, and more importantly, note that "actual" merging involves the 
loss of information, namely that two (or more) topics existed where now 
there is just one. It may in fact be desirable, for auditing purposes 
for example, to have the "appearance" of merging and not the "actual" 
merging that you posit in your thought experiment. In other words, your 
TMQL statements provide a view "as though" all the information was 
located at a single point and conduct further operations as though that 
were the case.

Depends on the situation and demands of the project as to which one you 
would want to follow if not some combination of the two, differing 
depending upon your information needs for particular parts of the topic map.
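
A rough sketch of the difference, under stated assumptions: the Topic 
class, the single subject_identifier field used as the merge key, and the 
function names are all hypothetical, chosen for illustration only. This 
is not the TMDM merging procedure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    # Hypothetical, simplified topic with one subject identifier.
    topic_id: str
    subject_identifier: str
    names: tuple


def merge_actual(topics):
    """'Actual' merging: one new topic per subject identifier.

    The originals are not kept, so the fact that several topics
    once existed is only recoverable from the synthesized id.
    """
    grouped = defaultdict(list)
    for t in topics:
        grouped[t.subject_identifier].append(t)
    merged = []
    for sid, group in grouped.items():
        names = tuple(sorted({n for t in group for n in t.names}))
        merged.append(Topic("+".join(t.topic_id for t in group), sid, names))
    return merged


def merge_view(topics):
    """'Apparent' merging: group the untouched originals by subject identifier."""
    view = defaultdict(list)
    for t in topics:
        view[t.subject_identifier].append(t)
    return dict(view)


def names_for(view, sid):
    # Query the view "as though" all names were located on a single topic.
    return sorted({n for t in view[sid] for n in t.names})


topics = [
    Topic("a", "sid:1", ("Paris",)),
    Topic("b", "sid:1", ("Paname",)),
    Topic("c", "sid:2", ("Lyon",)),
]

merged = merge_actual(topics)
view = merge_view(topics)

print(len(merged))               # 2
print(len(view["sid:1"]))        # 2
print(names_for(view, "sid:1"))  # ['Paname', 'Paris']
```

With merge_view() the two source topics remain available for auditing, 
while names_for() still answers as if they were one. With merge_actual() 
only the combined topic survives.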

Hope you are having a great day!


Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model

Topic Maps: Human, not artificial, intelligence at work!