[sc34wg3] Re: Going beyond SIPs?

Robert Barta sc34wg3@isotopicmaps.org
Thu, 2 Sep 2004 18:15:52 +1000


On Fri, Aug 27, 2004 at 02:17:56PM -0400, Patrick Durusau wrote:
> You have said more than once that the \tau can go beyond SIPs (hard to 
> think you can go beyond what is already unlimited but the \tau is your 
> model so your entitled to your opinion).

Patrick, et al.,

What some of us are arguing (I seem to remember that Dmitry was with
me here) is that 'properties' alone are just one way to pinpoint the
identity of a subject.

Let us reconsider the example with the cities which are "identified"
by their geographical coordinates. What happens here:

  - First someone writes a topic (subject proxy, if you wish) about
    the city 'Konstantinopel'. He creates associations around it, e.g.
    that it borders the Bosporus, but also adds X and Y coordinates (and
    maybe a radius R) to approximate the location of Konstantinopel.

  - Then another topic about Istanbul is added, with similar X/Y and
    similar R.

  - And then another, 'Byzantium', is added. Again, similar X/Y, similar R.

Now, as already discussed, "it depends" whether the three topics
should be identified (= "regarded as equivalent") or not. A historian
would probably want to see all three of them, or maybe not.
Obviously, it is unwise to hard-code this equivalence into the map
itself. It is something which the "application", i.e. the context in
which the map is used, is supposed to define.

This is where the TMRM concept 'TMA' enters the stage. Here "identity"
is defined as "whichever topics have sufficiently congruent circles
shall be regarded as the same".

My view is that this is a "constraint": It says "if I had my way, then
no two topics would exist in a map which have sufficiently congruent
geographic ranges".
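
To make the geographic constraint concrete, here is a minimal Python
sketch. Everything in it is an assumption made for illustration: the
`Topic` record, the reading of "sufficiently congruent" as "centres and
radii differ by at most a fraction `tol` of the smaller radius", and the
coordinates, which are arbitrary units rather than real positions. It
only detects violations; it does not repair them.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Topic:
    """Hypothetical stand-in for a topic carrying a circular range."""
    name: str
    x: float
    y: float
    r: float


def congruent(a, b, tol=0.2):
    """True if centres and radii differ by at most tol * smaller radius."""
    limit = tol * min(a.r, b.r)
    centre_gap = hypot(a.x - b.x, a.y - b.y)
    return centre_gap <= limit and abs(a.r - b.r) <= limit


def violations(topics, tol=0.2):
    """All pairs of topic names that break the 'no congruent ranges' constraint."""
    return [(a.name, b.name)
            for a, b in combinations(topics, 2)
            if congruent(a, b, tol)]


# Made-up coordinates in arbitrary units.
topics = [
    Topic("Konstantinopel", 100.0, 200.0, 10.0),
    Topic("Istanbul", 101.0, 200.0, 11.0),
    Topic("Byzantium", 100.0, 201.0, 10.0),
    Topic("Vienna", 500.0, 500.0, 10.0),
]

for pair in violations(topics):
    print(pair)
```

The threshold `tol` is exactly the kind of decision that belongs to the
application rather than to the map: a historian might set it to zero.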

There are two questions:

  (a) what is the formal language (if any) to express this?

  (b) what must TM software do to make this constraint TRUE, i.e.
      change the map in such a way that it does not violate the
      constraint?

For (b), the answer seems easy: The software must remove all
situations which would conflict with the constraint. It will detect
the offending topics and will - without further control - merge them
into one. (A query/transformation language could actually exert more
control over what the 'merged' map looks like.)
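
One possible reading of "detect and merge", as a sketch only: treat the
constraint as a pairwise test and take its transitive closure with a
union-find structure. `merge_topics`, `near` and the sample coordinates
are hypothetical names and values, not part of any TM engine.

```python
def merge_topics(topics, same):
    """Group topics into merge classes under the pairwise test `same`."""
    parent = list(range(len(topics)))

    def find(i):
        # Follow parent links to the class representative, halving paths.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Every violating pair ends up in the same class.
    for i in range(len(topics)):
        for j in range(i + 1, len(topics)):
            if same(topics[i], topics[j]):
                parent[find(j)] = find(i)

    groups = {}
    for i, topic in enumerate(topics):
        groups.setdefault(find(i), []).append(topic)
    return list(groups.values())


# (name, x, y) in arbitrary, made-up units.
cities = [
    ("Konstantinopel", 100, 200),
    ("Istanbul", 101, 200),
    ("Byzantium", 100, 201),
    ("Vienna", 500, 500),
]


def near(a, b):
    """Assumed equivalence test: both coordinates within 2 units."""
    return abs(a[1] - b[1]) <= 2 and abs(a[2] - b[2]) <= 2


merged = merge_topics(cities, near)
print(merged)
```

Note a side effect of merging "without further control": a pairwise
similarity test is not transitive, so A and C can end up merged because
each is close to B, even if they are not close to each other.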

For (a) the answer is not so simple. Every language has a certain
degree of expressivity. The higher that is, the more complex the things
one can express. But the price of this is that at some stage you lose
the ability to "stay in control". For instance, it may be
theoretically impossible to always decide whether two constraints are
effectively saying the same thing (and are just using a different way
to do so), or whether two constraints contradict each other
(making it impossible for maps to satisfy both constraints).

So this is a narrow path to walk.

---

What you guys suggest within TMRM is to consider only properties of
the subjects in question to be part of such a constraint. This is all
well and good for many cases like the one above or "all persons having
the same email address should be regarded as the same".

But it is rather arbitrary. What about the following "constraint":

  Two "cities" (i.e. this constraint only applies to instances of the
  concept "city") are to be regarded the same if both are directly or
  indirectly linked to the same geographic item (river, mountain, bay,
  sea, lake, ...).

In our case above, all three cities may be linked to the "Bosporus",
so they may be candidates for a merge. Note that we have left
completely unspecified the exact nature of the linking. This could be
"is-bordering-at", or "on-the-banks-of".

Now you could argue that this is still to be done with properties,
maybe properties which have to be defined ad-hoc-ishly for this very
purpose.
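
Whether or not one squeezes it into properties, the linking constraint
can be sketched as plain graph reachability over associations, ignoring
their types. Assumptions in this sketch: "linked" means connected within
`max_hops` associations, the two cities must reach the same geographic
item, and the "has-harbour"/"opens-onto" associations and the `Harbour`
topic are invented for the example.

```python
from collections import deque


def reachable_geo_items(start, links, geo_items, max_hops=2):
    """Geographic items reachable from `start` within `max_hops` links."""
    seen = {start}
    queue = deque([(start, 0)])
    found = set()
    while queue:
        node, hops = queue.popleft()
        if node in geo_items and node != start:
            found.add(node)
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in links.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return found


def merge_candidates(a, b, links, geo_items, max_hops=2):
    """True if both topics reach at least one common geographic item."""
    return bool(reachable_geo_items(a, links, geo_items, max_hops)
                & reachable_geo_items(b, links, geo_items, max_hops))


# (association type, player, player); the type is deliberately ignored.
associations = [
    ("is-bordering-at", "Konstantinopel", "Bosporus"),
    ("on-the-banks-of", "Istanbul", "Bosporus"),
    ("has-harbour", "Byzantium", "Harbour"),
    ("opens-onto", "Harbour", "Bosporus"),
    ("on-the-banks-of", "Vienna", "Danube"),
]
geo_items = {"Bosporus", "Danube"}

links = {}
for _type, a, b in associations:
    links.setdefault(a, set()).add(b)
    links.setdefault(b, set()).add(a)

print(merge_candidates("Konstantinopel", "Byzantium", links, geo_items))
print(merge_candidates("Konstantinopel", "Vienna", links, geo_items))
```

The hop limit matters: with unbounded "indirect" linking, almost
everything in a connected map becomes a merge candidate.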

But I can top it. Now I would like to identify as equal all persons
who are involved in more than 5 associations of type "knows". Slightly
bizarre, but possible. Or what about "all persons with the same
distance from G.W. Bush should be regarded equal" (not sure how
bizarre that is).

In these cases there is no property involved. Still, a constraint
language may be able to capture this (or at least parts of this).
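
Both of these can be computed from the association structure alone, as
the following sketch suggests. It assumes that "knows" associations are
binary and symmetric, and that "distance" means the number of "knows"
hops from a chosen person; the function names and the sample data are
hypothetical.

```python
from collections import Counter, defaultdict, deque


def well_connected(knows, threshold=5):
    """Persons taking part in more than `threshold` 'knows' associations."""
    degree = Counter()
    for a, b in knows:
        degree[a] += 1
        degree[b] += 1
    return {p for p, n in degree.items() if n > threshold}


def classes_by_distance(knows, origin):
    """Partition reachable persons by hop distance from `origin`."""
    neighbours = defaultdict(set)
    for a, b in knows:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Breadth-first search gives the shortest hop count to each person.
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        p = queue.popleft()
        for q in neighbours[p]:
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)

    classes = defaultdict(set)
    for p, d in dist.items():
        classes[d].add(p)
    return dict(classes)


# Made-up data: "hub" knows six people, one of whom knows "far".
knows = [("hub", f"p{i}") for i in range(6)] + [("p0", "far")]

print(well_connected(knows))
print(classes_by_distance(knows, "hub"))
```

Each set returned by `classes_by_distance` is one candidate merge class;
no property of any person is consulted to build it.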

--

I am not saying that a future TMCL should have these features. What I
am saying is that we should orient ourselves by expressivity and not
by an artificial selection of what is going to be compared.

\rho