[sc34wg3] Question on TNC / Montreal minutes

Lars Marius Garshol sc34wg3@isotopicmaps.org
17 Sep 2002 22:51:23 +0200


* Steve Pepper
| Let's have Level Zero
|
| (0) At the level of the application.

* Marc de Graauw
| 
| This is "Let's ditch the TNC" phrased politely, not? :-)

It depends on what you mean by that phrase. It certainly is "let's
take it out of the core standard". You should be aware, however, that
many of the applications I have written do merging based on names. I
still don't think having that behaviour in the standard is right.
 
| I think it would be a big mistake to leave the TNC up to
| applications. The TNC supports some very generic behaviours that
| merit a place in the standard.

If they do, why don't all the other forms of merging by topic
characteristic belong there as well?
 
| I noted there are two ways in this topic to merge with another
| topic: merging based on subjectIdentity and merging based on the
| TNC. As a former relational database administrator I am completely
| allergic to unnecessary redundancy. It will only lead to mistakes
| (one mechanism says merge, the other says don't) and multiplication
| of unwarranted merges due to human mistakes.

I agree with this, except that there is no way in topic maps to say
that two topics shouldn't be merged. (Well, short of using [subject
address], that is.)

| So one mechanism had to go.

Why? Surely there can be more than one way to establish identity
without this counting as redundancy? In fact, it is the other way
around. If you can only do merging based on names that will force
people who want to do merging on the basis of other characteristics to
turn those into names (even when they are not) or to duplicate them.

Another problem is that if the TNC is all you have to control merging
with, you will very often run into the "I don't want topics with the
same name but different types to merge" problem. The TNC solution to
that is to redundantly add the type to all TNC name scopes...
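
To make the homonym problem concrete, here is a minimal sketch of
TNC-style grouping. The dictionary layout, the function names, and the
"Paris" topics are all made up for illustration; this is not how any
particular engine represents topics.

```python
from collections import defaultdict

def tnc_groups(topics):
    """Group topic ids that share a base name in the same scope (TNC-style)."""
    groups = defaultdict(list)
    for topic in topics:
        for name, scope in topic["names"]:
            groups[(name, frozenset(scope))].append(topic["id"])
    # Only keys shared by more than one topic would trigger a merge.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

def add_type_to_scope(topics):
    """The workaround: redundantly put each topic's type into its name scopes."""
    return [
        {**t, "names": [(n, set(s) | {t["type"]}) for n, s in t["names"]]}
        for t in topics
    ]

# Hypothetical data: two different subjects that happen to share a name.
topics = [
    {"id": "paris-city", "type": "city", "names": [("Paris", set())]},
    {"id": "paris-person", "type": "person", "names": [("Paris", set())]},
]

unwanted = tnc_groups(topics)                    # the two topics would merge
avoided = tnc_groups(add_type_to_scope(topics))  # no shared (name, scope) key
```

With unscoped names the two topics fall into one merge group; once the
type is copied into every scope nothing merges, but the type information
is now stated twice for each topic.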

| The name is not a candidate: users will need the name to navigate
| the Topic Map. 

They need the name, but do they need name-with-merging-semantics?

| So the subject identity is redundant: the name plus the TNC already
| establishes subject identity. This applies not only to this example,
| but to every case of a controlled vocabulary with unique names. Note
| that in such vocabularies things as string matching are usually not
| going to cause much trouble, since those vocabularies are well
| defined. There is one big advantage names have over subject
| indicators: names are human readable, subject indicators are not (by
| most humans).

When I first read this I thought you were arguing that merging by URI
should be taken out of the standard entirely, and I have to say I have
rarely felt so discouraged in a topic map standard discussion.
 
| There are two other arguments against subject identity in the case
| of a controlled vocabulary:
|
| 1) The vocabulary is already there and can be used right away. PSI's
| have to be made first, which is more effort.

The URIs do not have to resolve to anything. It's better if they do,
but they don't have to.

| 2) Some realms do not lend themselves very well to defining PSI's
| because they are too volatile. Social security number (privacy
| issues aside for the sake of the argument) spring into existence
| every day. Do we really want to have someone republish all those
| numbers as PSI's again?

The first problem with that is that SSNs are not names. I suspect you
want to stick them in base names because the standard will not let you
merge by occurrence. In any case, the same argument as above applies.

| (We could of course use PSI's which are not published on the WWW,
| but to me that sounds suspiciously much like using names and
| disguising them as PSI's.)

It's precisely what it is (except that it is not limited to names).
It's just that URIs have none of the problems that plague merging by
name.
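
As a rough sketch of what that can look like: the identifier is built
mechanically from a value the application already has, and identity is
decided by comparing strings, never by fetching anything. The base URI
and the function names below are assumptions made up for this example.

```python
from urllib.parse import quote

# Hypothetical base URI; nothing has to be published at this address.
BASE = "http://example.org/id/"

def mint_identifier(kind, value):
    """Build an identifier URI from any characteristic, not only a name."""
    return f"{BASE}{quote(kind, safe='')}/{quote(value, safe='')}"

def same_subject(id_a, id_b):
    """Identity is plain URI comparison; no dereferencing is involved."""
    return id_a == id_b

ssn_uri = mint_identifier("ssn", "123-45-6789")
```

Because the kind of characteristic is part of the URI, a value used as
an SSN never collides with the same string used as something else.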
 
| Since the behaviour described is pretty common, I believe we have a
| very strong case not to leave the TNC up to applications, just make
| it optional.

I'm sorry, but so far I've only seen an argument for why the TNC is
useful. I think merging by name is useful, but that the TNC is not.
What we really need is a more general mechanism for doing merging by
topic characteristics in a manner specified by an individual
application and independent of each topic map used by that
application.
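
A minimal sketch of what I mean, assuming a made-up dictionary
representation of topics: the application supplies the merge rules as
functions, and the same rules are applied to whatever topic map it
loads. None of the names here come from the standard or from any
existing API.

```python
def merge_by(topics, key_funcs):
    """Return merge groups: topics sharing a key under any rule are grouped."""
    parent = {t["id"]: t["id"] for t in topics}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    seen = {}
    for t in topics:
        for rule_no, key_func in enumerate(key_funcs):
            for key in key_func(t):
                # The first topic seen with this key owns it; later ones join it.
                owner = seen.setdefault((rule_no, key), t["id"])
                parent[find(t["id"])] = find(owner)

    groups = {}
    for t in topics:
        groups.setdefault(find(t["id"]), []).append(t["id"])
    return sorted(sorted(g) for g in groups.values())

# Application-defined rules (both hypothetical).
def by_identifier(topic):
    return topic.get("identifiers", [])

def by_ssn(topic):
    occurrences = topic.get("occurrences", {})
    return [occurrences["ssn"]] if "ssn" in occurrences else []

# Hypothetical data.
topics = [
    {"id": "a", "identifiers": ["http://example.org/psi/alice"],
     "occurrences": {"ssn": "123"}},
    {"id": "b", "occurrences": {"ssn": "123"}},
    {"id": "c", "identifiers": ["http://example.org/psi/alice"]},
    {"id": "d", "identifiers": ["http://example.org/psi/bob"],
     "occurrences": {"ssn": "999"}},
]

both_rules = merge_by(topics, [by_identifier, by_ssn])
uri_only = merge_by(topics, [by_identifier])
```

With both rules, "a", "b" and "c" end up in one group; with only the
identifier rule, "b" stays on its own. The point is that the choice of
rules sits in the application, not in each topic map.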

-- 
Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
ISO SC34/WG3, OASIS GeoLang TC        <URL: http://www.garshol.priv.no >