[sc34wg3] Documenting merging rules in TMDM

Patrick Durusau sc34wg3@isotopicmaps.org
Sat, 13 Mar 2004 15:42:19 -0500


Steve Pepper wrote:
> * Patrick:
> | > Answers to my questions in your capacity as leader and
> | > spokesperson for the US National Body, I hope? :-)
> | > 
> | Too early in the weekend for that sort of responsibility. :-)
> Fair enough, but I really hope you will answer them.
> Will you?
Back for a few moments, but yes, I will answer all of your questions to 
the best of my ability. That does not mean that I will be able to do so 
completely this weekend.

> | > But are you saying that you would accept Kal's approach, even
> | > though it delegates responsibility for defining how to document
> | > merging rules to TMCL?
> |
> | No.
> | 
> | Kal's approach is forced by the limitations of the TMDM that you note 
> | below.
> Why do you regard this as a limitation? Doesn't it make
> eminent sense to delegate expression of merging rules to
> the part of the Topic Maps standards family that is
> concerned with constraints?

I would separate your second question into two parts: ;-) (I really do 
miss Daniel.)

1. Does it make sense to delegate merging rules to a separate part of 
the Topic Maps standards family?

If I say yes, then why does the TMDM have any merging rules at all? If 
the goal is clean separation of merging rules from other parts of the 
standard, wouldn't that be better served by having all the merging rules 
in one place?

2. Does it make sense that merging rules (wherever they are) are based 
upon a particular model of subject identity?

An unqualified yes, as I can't think of a merging rule that would make 
sense without reference to the rules for subject identity. I could have 
a merging rule that says if topic A has subject identity property X and 
topic B has subject identity property X, then merge the two, but if 
there is no allowance for subject identity property X, the rule has 


I must defer one of your questions, not to avoid it but simply for lack of time:

> Why do you regard this as a limitation? 

I will return to it early in the coming week.

> | If one follows the TMDM, then Kal's suggestion may be a good fix.
> "If one follows the TMDM"...
> So, for people who want to follow the TMDM, it's OK to use TMCL to
> express merging rules.
> ...but you voted not to approve the TMDM on the grounds that it does
> not have a mechanism for specifying merging rules.
> Which is it? Do you want the TMDM to have that capability or don't
> you? If so, what *kind* of merging rules do you mean?

Note I said "may be a good fix."

The lack of a mechanism for specifying merging rules is a serious 
problem with the TMDM.

What *kind* of merging rules? Well, in part the same kind that the TMDM 
specifies in 5.4.1 Subjects and topics, where it says:

"Merging beyond the minimal merging required by the rules of Clause 6 is 
freely allowed. Most commonly this will be done by inferring the subject 
of the topics from their characteristics."

Kal's solution is optional. Well, I suppose all disclosure is optional in 
some sense. One could write up TMCL statements and bind them in a fancy 
binder that had nothing to do with the actual operation of an application.

More on disclosure next week.

> | (still have the problem of vendor lock by embedding merging rules
> | in software, even if Kal's suggestion is a good theoretical answer
> | to the documentation problem)
> What? Kal's suggestion means that the rules would be defined in the
> TMCL schema, in an open, standardized manner. What does that have
> to do with vendor lock? (Please try and answer this precisely
> because as someone who has devoted himself to promoting topic maps
> out of intellectual conviction - but is also a vendor - I am very
> sensitive to the use of such arguments. I have ceased to communicate
> with at least one person because of it :-)
Actually I was making that argument to appeal to all the current vendors 
of topic maps. When topic maps start to gain notice from, shall we say, 
larger vendors, it is my assumption that the ability to promote a better 
product and assure customers that there is no risk in switching from 
vendors who have licensed the sunset, the English language and similar 
assets, will be important. If anything, I think a standard that avoids 
"embrace and extend" empowers vendors with a history of involvement in 
the development of a standard.

Would you prefer "undocumented feature" to "vendor lock?" What I am 
concerned about are custom and non-disclosed merging rules embedded in 
an application that a customer relies upon without realizing that the 
rules are in operation. If another vendor comes along, say with an 
implementation that follows only the express rules of the TMDM, the 
customer can't understand why the behavior of their topic map is now 
different. Well, you could certainly say: "It must have had different 
merging rules." but that is hardly of any comfort to the customer.
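
To make the concern concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the topic representation (a dict of property names to value sets), the `shares` and `merge` helpers, the property names, and the two "vendors" are assumptions made for illustration and are not taken from the TMDM or from any product. It only shows how one extra, undisclosed rule changes the result of merging the same data.

```python
from typing import Callable, Dict, List, Set

Topic = Dict[str, Set[str]]            # property name -> set of values
Rule = Callable[[Topic, Topic], bool]  # True if two topics should merge


def shares(prop: str) -> Rule:
    # Build a rule: merge two topics if they share a value for `prop`.
    def rule(a: Topic, b: Topic) -> bool:
        return bool(a.get(prop, set()) & b.get(prop, set()))
    return rule


def merge(topics: List[Topic], rules: List[Rule]) -> List[Topic]:
    # Work on copies so the caller's topics are left untouched.
    result = [{k: set(v) for k, v in t.items()} for t in topics]
    changed = True
    while changed:
        changed = False
        for i in range(len(result)):
            for j in range(i + 1, len(result)):
                if any(rule(result[i], result[j]) for rule in rules):
                    # Fold topic j into topic i, then rescan from the start.
                    for k, v in result.pop(j).items():
                        result[i].setdefault(k, set()).update(v)
                    changed = True
                    break
            if changed:
                break
    return result


topics = [
    {"subject_identifier": {"http://psi.example.org/p1"},
     "email": {"pat@example.org"}},
    {"subject_identifier": {"http://psi.example.org/p2"},
     "email": {"pat@example.org"}},
]

# Vendor A silently adds an email-based rule; vendor B applies only the
# identifier rule. Same data, different topic maps.
vendor_a_rules = [shares("subject_identifier"), shares("email")]
vendor_b_rules = [shares("subject_identifier")]

print(len(merge(topics, vendor_a_rules)))
print(len(merge(topics, vendor_b_rules)))
```

Under these assumptions the customer sees one topic with the first implementation and two with the second, and nothing in the data explains why.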

I thought I always answered with precision? ;-)

> | If one departs from the TMDM, then Kal's suggestion, to the extent that 
> | TMCL follows the TMDM, is less useful.
> If one departs from the standard data model there is little that
> a standard can do to help, in my opinion, but this is a separate
> issue. Let us try not to confuse them. The first thing to clear
> up is
>   What does the TMDM need to do (if anything) in order to satisfy
>   the requirement to be able to document merging rules?
> Kal and Dmitry have both answered "nothing", because the job is
> better done by TMCL. I tend to agree with them. The US National
> Body, based on its comments in the ballot, seems to think that the
> TMDM needs to have this capability itself. Is that the case or
> is it not?

In light of the precision remark above I am tempted to just say "yes." 
;-) (could not resist)

The TMDM specifies some merging rules and does not say that other 
merging rules have to be specified by TMCL, only that they are "freely 
allowed." If it is going to have merging rules, then it should have a 
mechanism for disclosing other merging rules that are "freely allowed."

If TMCL takes, I don't know, another year to complete (random guess), 
does that mean that applications may have non-disclosed merging rules 
embedded in them? Is there a way to avoid non-disclosed merging rules?

> | As you say, the TMDM limits the values of X that I can talk about for 
> | both A and B, and it does not cover "X is true of A and Y is true of B."
> Neither does it cover "X is true of both" for anything other than
>   X = (same name of same type) | (same occurrence of same type).
> However, this is simply a generalization (and "optionalization") of
> the old topic naming constraint, which the majority of the committee
> wanted to do away with. It was never intended to cover arbitrary
> merging rules, only those that correspond to OWL's "unique-property".
> | Whether we will eventually agree on the details of those limits remains 
> | to be seen, but I see them as being imposed by a particular 
> | implementation strategy and not inherent in the nature of topic maps. Do 
> | you disagree?
> We ought to be able to agree on the details of the limits, because
> if we can't it means that the standard doesn't specify them clearly
> enough. (Either that or you and I are not smart enough.)
> I don't understand what you mean by "a particular implementation
> strategy". Can you be more explicit?
Take the topic item equality rules for example.

Other than the reified properties, topics are deemed equal on the basis 
of a variety of locator items.

Locator items (well, locators anyway) are defined as:

3.11 locator
a string conforming to some locator notation that references one or more 
information resources

That certainly is one implementation strategy for determining the 
identity of a subject. Compare the locators to see if they point to the 
same place. If they do then the two topics are equal.

Another implementation strategy would be to resolve the locators and 
compare the content or results of operations on the content of what is 
found with the locators.

Another would be to have additional properties for topics and to base 
merging on those additional properties. (Properties here not being 
locators as defined above.)

This is not meant to be exhaustive.
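
The three strategies above can be sketched roughly as follows. This is an illustration only: the `Topic` class, the function names, the `isbn` property, and the example URLs are assumptions of mine, not definitions from the TMDM, and the "resolver" is a plain dictionary lookup standing in for whatever retrieval an application would really do.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple


@dataclass(frozen=True)
class Topic:
    # Hypothetical, simplified topic: locator strings plus extra
    # (key, value) properties. This is not the TMDM topic item.
    locators: FrozenSet[str] = frozenset()
    properties: FrozenSet[Tuple[str, str]] = frozenset()


def equal_by_locator(a: Topic, b: Topic) -> bool:
    # Strategy 1: compare the locator strings themselves.
    return bool(a.locators & b.locators)


def equal_by_content(a: Topic, b: Topic,
                     resolve: Callable[[str], str]) -> bool:
    # Strategy 2: resolve each locator and compare what is found there.
    # `resolve` is supplied by the caller, so no network access is assumed.
    found_a = {resolve(loc) for loc in a.locators}
    found_b = {resolve(loc) for loc in b.locators}
    return bool(found_a & found_b)


def equal_by_property(a: Topic, b: Topic, key: str) -> bool:
    # Strategy 3: compare values of an additional, non-locator property.
    values_a = {v for k, v in a.properties if k == key}
    values_b = {v for k, v in b.properties if k == key}
    return bool(values_a & values_b)


# Two locators that differ as strings but lead to the same content.
fake_web = {
    "http://example.org/a": "Same document text",
    "http://example.org/b": "Same document text",
}
t1 = Topic(frozenset({"http://example.org/a"}), frozenset({("isbn", "123")}))
t2 = Topic(frozenset({"http://example.org/b"}), frozenset({("isbn", "123")}))

print(equal_by_locator(t1, t2))
print(equal_by_content(t1, t2, fake_web.__getitem__))
print(equal_by_property(t1, t2, "isbn"))
```

With this data the string comparison keeps the two topics apart, while the other two strategies would merge them, which is the sense in which I call locator comparison one implementation strategy among several.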

> That applications *need* to apply additional rules in certain
> circumstances is beyond any doubt (we do it all the time in our
> applications). I've given some examples in my original posting.
> Are they sufficiently representative or do you have more?

Hate to defer again but I am in the final stages of editing the TMRM use 
case document that will appear on Monday.

I will answer your original questions, which seem to have gotten lost in 
the flurry of emails, as well as comment on your examples and offer 
others starting early next week. (I anticipate it will take more than 
one or two emails to cover them in any detail.)

Hope you are having a great day!


> Steve
> --
> Steve Pepper <pepper@ontopia.net>
> Chief Strategy Officer, Ontopia
> Convenor, ISO/IEC JTC 1/SC 34/WG 3
> Editor, XTM (XML Topic Maps 1.0)

Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model

Topic Maps: Human, not artificial, intelligence at work!