[sc34wg3] Towards TMDM 3.0

Lars Marius Garshol larsga at garshol.priv.no
Wed Feb 25 08:10:29 EST 2009


* Rani Pinchuk
>
> Throwing the item identifiers after a merge is exactly what I suggest:
> I suggest to simplify things by having a mandatory one item identifier
> per item. The item identifiers are not used for merging, only for
> identifying items. And no collection of item identifiers is done after
> merging.

Why throw them away only after merging?

> For (1) - with your suggestion, [...]

This is not my suggestion, Rani, but what's been the most common  
internal model for Topic Maps engines for the past decade. Actually,  
since the first Topic Maps engine was written. And it's been the  
standard for a number of years now. All parts of ISO 13250, except -1  
and -5, build on this model. As do TMCL and TMQL. And TMAPI 1.0 and  
2.0. Plus a number of non-ISO specifications (LTM, JTM, tolog, ...).

So changing this property is going to involve a *lot* of updates. If
we're going to change it, we need a very good reason. Of course, we  
could decide that changing it would be better, but not worth it. So  
far, though, I haven't seen anything to persuade me that any of your  
proposed changes would be for the better.

> indeed all topics have a "kind of identifier" but it is actually a  
> subject identifier, as it is not item identifier (although it is  
> called that way).

Actually, no. TMDM defines subject identifier as "locator that refers  
to a subject indicator" and subject indicator as "information resource  
that is referred to from a topic map in an attempt to unambiguously  
identify the subject represented by a topic to a human being".

Item identifiers don't refer to subject indicators, so they really are  
not subject identifiers.

> The reason it is not item identifier, is that it does not help you  
> to identify one item but a group of items (because we collect the  
> item identifiers, we do not have any more one item identifier - one  
> topic relationship).

It's true that it only identifies an item uniquely within a single  
topic map. I don't consider that a flaw. In fact, I consider that  
unavoidable. So far, the only person I'm aware of who disagrees is you.

I really don't see any problem here. And if I *did* think this was a
problem, your suggestion would not solve it.

Imagine this CTM topic map:

   topic.

Now load it into two different TopicMap objects in the same engine.  
That gives you two different topic items with the same item  
identifier, even if we make the change you suggest.
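
To make that concrete, here is a minimal Python sketch. The classes
SingleIdTopic and SimpleTopicMap and the locator SOURCE_IID are made up
for illustration; they are not TMAPI or any real engine's API. Each
topic carries exactly one item identifier, as in the proposed change.

```python
class SingleIdTopic:
    """Hypothetical topic item with exactly one item identifier."""

    def __init__(self, item_identifier):
        self.item_identifier = item_identifier


class SimpleTopicMap:
    """Hypothetical topic map that indexes its own topics by item identifier."""

    def __init__(self):
        self._by_iid = {}

    def load_topic(self, item_identifier):
        # Each load creates a fresh topic item owned by this topic map.
        topic = SingleIdTopic(item_identifier)
        self._by_iid[item_identifier] = topic
        return topic

    def lookup(self, item_identifier):
        return self._by_iid.get(item_identifier)


# Assumed locator for the topic in the CTM source; the real value would
# depend on where the file was loaded from.
SOURCE_IID = "http://example.org/map.ctm#topic"

tm1, tm2 = SimpleTopicMap(), SimpleTopicMap()
t1 = tm1.load_topic(SOURCE_IID)
t2 = tm2.load_topic(SOURCE_IID)

same_identifier = t1.item_identifier == t2.item_identifier
same_item = t1 is t2
```

Under these assumptions same_identifier is true while same_item is
false: the identifier picks out one item per topic map, not one item
globally.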

> For (2) - Let's examine a concrete example:
> Suppose we have a topic map with a topic with id "person" and item
> identifier http://one/person.
> We have a query to show all persons (pseudo code of course):
>    show all topics of type "person".
>
> I assume here that we do not use the full item identifier in the
> queries we write.

No, you'd typically write something like

   person << types

> Now we merge with another topic map, that has other persons, and are
> typed with a topic with item identifier http://two/human
>
> If we still use topic map http://one we still can use our query. If we
> now use topic map http://two, we cannot, because we use an ID in our
> query that does not match to "human" and cannot be extracted from the
> other item identifier.

That's true. In fact, querying in the Omnigator with tolog and  
opera.xtm used to (and might still) show this problem.
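
A rough Python sketch of why the lookup fails, assuming (this is an
assumption, not a description of how tolog or the Omnigator work) that a
query engine resolves a bare ID against the base address of the topic
map being queried. The function find_by_local_id and the index contents
are invented for illustration.

```python
from urllib.parse import urljoin

# After the merge, the type topic carries both item identifiers.
MERGED_TYPE = "merged person/human type topic"
index = {
    "http://one/person": MERGED_TYPE,
    "http://two/human": MERGED_TYPE,
}


def find_by_local_id(base, local_id):
    """Resolve a bare ID against a base address and look it up (hypothetical)."""
    return index.get(urljoin(base, local_id))


from_one = find_by_local_id("http://one/", "person")  # http://one/person
from_two = find_by_local_id("http://two/", "person")  # http://two/person
```

Here from_one finds the merged topic, but from_two is None, because
http://two/person was never an item identifier of any topic.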

> So I cannot see any gain here.

You mean, any gain relative to having only one item identifier? There  
would be a gain if the two topics that merged came from the same topic  
map, but I agree that's a marginal case.

> A much simpler way to achieve the same is to simply keep the local
> item identifiers when merging with external topic maps. The collection
> of item identifiers does not help.

A collection versus a single-value property? No, it doesn't help much.

> For (3) - This is indeed a rare situation. If A and C are merged, it
> means that we had a reason to merge them. The same with B and C. Only
> if those merges were done without PSIs, merging A and B using item
> identifiers will actually make any sense. Merging without PSIs seems
> to me at least as difficult as assigning PSIs to the topics that
> should be merged.

And yet merging without PSIs is the common case. It's what usually  
happens.


In any case, I think you should turn your argument around. If we agree  
that item identifiers are going to exist at all, I think you should  
consider carefully what the benefits of making it a single-value  
property are.

As a general rule, when two topics merge, no information is lost  
(except the distinction between the two topics). All properties are  
preserved. Why should item identifiers be an exception to this rule?
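
As a simplified Python sketch of that rule (MergeableTopic and
merge_topics are illustrative names, and this models only three set-valued
properties, not the full TMDM merging procedure): every property of the
merged topic is the union of the corresponding properties of the two
originals, item identifiers included.

```python
from dataclasses import dataclass, field


@dataclass
class MergeableTopic:
    """Hypothetical topic item with set-valued properties."""

    item_identifiers: set = field(default_factory=set)
    subject_identifiers: set = field(default_factory=set)
    names: set = field(default_factory=set)


def merge_topics(a, b):
    """Return a new topic whose properties are the unions of a's and b's."""
    return MergeableTopic(
        item_identifiers=a.item_identifiers | b.item_identifiers,
        subject_identifiers=a.subject_identifiers | b.subject_identifiers,
        names=a.names | b.names,
    )


person = MergeableTopic({"http://one/person"}, set(), {"Person"})
human = MergeableTopic({"http://two/human"}, set(), {"Human"})
merged = merge_topics(person, human)
```

The merged topic keeps both http://one/person and http://two/human, just
as it keeps both names. A single-value property would force the merge to
discard one of them.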

Remember also that XTM 1.0, XTM 2.0, and CTM all allow you to assign  
more than one item identifier to the same topic, so this is not only  
about merging.

Given all this, why should it be a single-value property? What are the  
benefits of that? Yes, a single-value property requires less overhead  
than one that's multi-value, but that's all. There are no other  
benefits that I know of, or that you have explained. So I really don't  
understand why you consider this change so important.

--Lars M.
http://www.garshol.priv.no/blog/
http://www.garshol.priv.no/tmphoto/




