[sc34wg3] Individual contribution on the U.S. N.B. position on the progression of Topic Map standards

Robert Barta sc34wg3@isotopicmaps.org
Sat, 3 Apr 2004 21:06:29 +1000

On Thu, Apr 01, 2004 at 01:38:09PM -0500, Mason, James David (MXM) wrote:
>  Bernard, I think we are in agreement, and I thank you for helping me
> clarify my thinking.
> I'm not a great theoretician, but my understanding of merging is that it
> should be based somehow on subject identity. Now subject identity is a sort
> of fuzzy thing in the current 13250 (I believe Patrick will soon have more
> to say about this). We've had (controversial) approaches to merging based on
> names. We've talked a lot about PSIs as a basis for identity. But the
> closest I've seen in something intended to become part of a standard for
> defining how to establish identity is the discussion of  SIDPs in the RM
> (that's a pretty bad sentence!). We've all fallen down on the job here, from
> the original text of 13250 to the present because we've just assumed we knew
> what identity was. If I remember my past life in algebras, establishing (or
> postulating) a basis for identity is one of the first things that needs to
> be done. I think that's one of the more valuable things the RM can
> accomplish, so we're not all depending on implicit assumptions.
> What I want from the RM is not merging rules but a statement of the basis on
> which merging rules must work. And I think that's what you're saying in the
> statement that merging rules should not be in the RM. And if we agree on
> that, then the supposed Use Cases for the RM are really not something that
> the RM must solve completely: all that's necessary from the RM is to show
> that it provides an adequate basis for establishing identity to support
> those UCs. Then some QL can, outside the RM, take that identity basis and
> formulate procedures for executing the merges. But how that gets done is
> irrelevant to the RM.
> So the RM is no threat to TMQL, but, without the RM, all the other things
> we're working on, like TMQL and TMDM,  are all based on inadequately
> documented assumptions. In short, by jumping in on these other parts of the
> TM family, we've not just put the cart before the horse, we've shoved the
> cart out on the road before we've found the horse.


This could be, but it also could be the other way round.

I appreciate the approach many people take: first define a data
model to somehow capture the information they have in mind. This is
what RM/SAM/TMDM all try to do, with varying degrees of ... elegance.
But what a data structure really "is", its meaning, the whole gist
of it, you can _only_ define with operations.

To define what exactly a stack is, you will not get very far with a
data structure. Whatever structure you come up with, I will find a way
to abuse and pervert it.

But if you say, the stack S is something which has to follow the rule

    a = pop (push (S, a))

where 'pop' and 'push' are operations on a stack

    pop  : S   -> A    # a stack gives its top element
    push : SxA -> S    # a stack plus an element gives a new stack

then you have exactly characterized _what_ a stack is. Amazingly
enough, an implementation has ALL the freedom to invent different data
structures, as long as the above 'Axiom' is met. This is where
implementations differ in speed, memory consumption, concurrency, etc.
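
As a rough illustration of this point, here is a small Python sketch.
It assumes, as in the rule above, that 'pop' returns the element on
top of the stack. All function names are made up for this example and
are not taken from any of the drafts. Two quite different data
structures both satisfy the axiom, while a queue-like 'pop' does not.

```python
# Implementation 1: a stack as an immutable tuple, top at the end.
def push_tuple(s, a):
    return s + (a,)

def pop_tuple(s):
    return s[-1]

# Implementation 2: a stack as nested cons cells, top at the head.
# The empty stack is represented by None.
def push_cons(s, a):
    return (a, s)

def pop_cons(s):
    return s[0]

# A deliberately wrong 'pop' for the tuple layout: it returns the
# oldest element, i.e. it behaves like a queue, not a stack.
def pop_oldest(s):
    return s[0]

def satisfies_axiom(push, pop, empty, samples):
    """Check a == pop(push(S, a)) for a growing sequence of stacks S."""
    s = empty
    for a in samples:
        if pop(push(s, a)) != a:
            return False
        s = push(s, a)  # continue the check on a larger stack
    return True

print(satisfies_axiom(push_tuple, pop_tuple, (), [1, 2, 3]))
print(satisfies_axiom(push_cons, pop_cons, None, [1, 2, 3]))
print(satisfies_axiom(push_tuple, pop_oldest, (), [1, 2, 3]))
```

The checker never looks inside the data structure; it only applies
the operations. That is the sense in which the rule, and not the
layout, says what a stack is.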

What I am trying to say is that I get the impression that some of you
are trying very hard to work on the data structure level. I am not sure
to what degree this can succeed. My understanding is that only once I
have captured the rules for how this data structure behaves when I
apply operations to it have I actually defined what it is.

But applying operations to a map is exactly what a query and update
language does.