[sc34wg3] Re: Going beyond SIPs?

Jan Algermissen sc34wg3@isotopicmaps.org
Thu, 02 Sep 2004 10:40:48 +0200


Robert Barta wrote:

> This is where the TMRM concept 'TMA' enters the stage. Here "identity"
> is defined as "whatever topics have a circle, sufficiently congruent
> shall be regarded the same".
> 
> My view is that this is a "constraint": It says "if I had my way, then
> no two topics in a map exist which have a sufficiently congruent
> geographic range".
> 
> There are two questions:
> 
>   (a) what is the formal language (if any) to express this?

In TMTK (publication pending, I know) you would create a TMA (ultimately
an object that is loaded at run time and drives the whole engine) and, in
that TMA, a property that serves as the SIDP that 'controls' *your* merging
desire. The 'rule' is expressed as an equivalence test on values of that
property. TMTK's merging algorithm processes all SIDPs, applies each SIDP's
equivalence function to all of its values, and merges the topics that
represent the same subject (according to the equivalence test, that is).

For the SubjectIndicators property, for example, the equivalence function
is notDisjoint(value1,value2). If you wonder where notDisjoint comes into
the scene....it is defined by the value type (the class if you like OO speak)
of the SubjectIndicators property.
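
To make the idea concrete, here is a minimal Python sketch of an
equivalence test driving a merge. It is not TMTK code: the names
not_disjoint and merge_by_sidp, the dict-based topic layout, and the
example.org identifiers are all assumptions made for illustration, and the
naive pairwise loop is only the simplest way to show the principle.

```python
from itertools import combinations


def not_disjoint(value1, value2):
    """Equivalence test: the two value sets share at least one member."""
    return not set(value1).isdisjoint(value2)


def merge_by_sidp(topics, prop, equivalent):
    """Group topic ids whose values for `prop` pass the `equivalent` test.

    Hypothetical helper: compares every pair of topics and collects the
    results with union-find, so groups are closed transitively.
    """
    parent = {t: t for t in topics}

    def find(t):
        # Follow parent links to the representative, halving the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in combinations(topics, 2):
        va, vb = topics[a].get(prop), topics[b].get(prop)
        if va is not None and vb is not None and equivalent(va, vb):
            parent[find(a)] = find(b)

    groups = {}
    for t in topics:
        groups.setdefault(find(t), set()).add(t)
    return sorted(sorted(g) for g in groups.values())


# Illustrative data only; the identifiers are made up.
topics = {
    "t1": {"SubjectIndicators": {"http://example.org/psi/istanbul"}},
    "t2": {"SubjectIndicators": {"http://example.org/psi/istanbul",
                                 "http://example.org/psi/constantinople"}},
    "t3": {"SubjectIndicators": {"http://example.org/psi/constantinople"}},
    "t4": {"SubjectIndicators": {"http://example.org/psi/oslo"}},
}
merged = merge_by_sidp(topics, "SubjectIndicators", not_disjoint)
print(merged)
```

Note that not_disjoint on its own is not transitive (t1 and t3 share no
indicator), so the sketch takes the transitive closure: t1, t2 and t3 end
up in one group because t2 overlaps with both.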

Theoretically, this is actually quite easy... it just took me about a year
to get the generic merging algorithm in place and to bring its complexity
below O(N^4) :-)

Jan

> 
>   (b) what must TM software do to make this constraint TRUE, i.e.
>       change the map in such a way that it does not violate the
>       constraint?
> 
> For (b), the answer seems easy: The software must remove all
> situations which would conflict with the constraint. It will detect
> the topics and will - without further control - merge them into
> one. (A query/transformation language could actually give more control
> over what the 'merged' map looks like.)
> 
> For (a) the answer is not so simple. Every language has a certain
> degree of expressivity. The higher that is, the more complex things
> one can express. But the price of this is that at some stage you lose
> the ability to "stay in control". For instance, it may be
> theoretically impossible to always decide whether two constraints are
> effectively saying the same thing (and are just using a different way
> to do so), or whether two constraints contradict each other
> (making it impossible for maps to satisfy both constraints).
> 
> So this is a thin path to walk.
> 
> ---
> 
> What you guys suggest within TMRM is to consider only properties of
> the subjects in question to be part of such a constraint. This is all
> well and good for many cases like the one above or "all persons having
> the same email address should be regarded the same".
> 
> But it is rather arbitrary. What about the following "constraint":
> 
>   Two "cities" (i.e. this constraint only applies to instances of the
>   concept "city") are to be regarded the same if both are directly or
>   indirectly linked to a geographic item (river, mountain, bay, sea,
>   lake, ...).
> 
> In our case above, all three cities may be linked to the "Bosporus",
> so they may be candidates for a merge. Note that we have left
> the exact nature of the linking completely unspecified. This could be
> "is-bordering-at", or "on-the-banks-of".
> 
> Now you could argue that this is still to be done with properties,
> maybe properties which have to be defined ad-hoc-ishly for this very
> purpose.
> 
> But I can top it. Now I would like to identify as equal all persons
> who are involved in more than 5 associations of type "knows". Slightly
> bizarre, but possible. Or what about "all persons with the same
> distance from G.W. Bush should be regarded equal" (not sure how
> bizarre that is).
> 
> In these cases there is no property involved. Still, a constraint
> language may be able to capture this (or at least parts of this).
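
A rough Python sketch of the "more than 5 'knows' associations" rule quoted
above, added here only to illustrate the point. It is an assumption, not
TMTK or TMCL: associations are modelled as (type, players) tuples and the
helper names knows_count and well_connected are hypothetical. The rule is
evaluated over the association structure rather than over a stored
property of each topic.

```python
from collections import Counter


def knows_count(associations):
    """Count, per player, the 'knows' associations the player takes part in."""
    counts = Counter()
    for assoc_type, players in associations:
        if assoc_type == "knows":
            for player in set(players):
                counts[player] += 1
    return counts


def well_connected(associations, threshold=5):
    """Players involved in more than `threshold` 'knows' associations.

    Under the quoted rule, all of these would be candidates for one merge.
    """
    counts = knows_count(associations)
    return {p for p, n in counts.items() if n > threshold}


# Illustrative data only: alice takes part in six 'knows' associations.
associations = [("knows", ("alice", f"p{i}")) for i in range(6)]
associations.append(("knows", ("bob", "p0")))
associations.append(("works-with", ("bob", "alice")))
candidates = well_connected(associations)
print(candidates)
```

Whether such a derived value should count as a "property" in the TMRM
sense is exactly the open question; the sketch only shows that the rule
itself is computable from the map.
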
> 
> --
> 
> I am not saying that a future TMCL should have these features, what I
> am saying is that we should orient ourselves towards the expressivity
> and not an artificial selection of what is going to be compared.
> 
> \rho

-- 
Jan Algermissen                           http://www.topicmapping.com
Consultant & Programmer	                  http://www.gooseworks.org