[sc34wg3] revised draft Reference Model document N0298

Tue, 9 Apr 2002 15:17:06 +0200

Hello there

Jim forwarded me lately a message where Holger asked if I was listening to the debate. OK,
I'm there, and subscribed to the sc34wg3 list, so now I think I should join the debate,
although jumping right in the middle of it is not easy.

I will stick for the moment to one point. I cc to our "graph brains" Pascal and Patrice so
that they be able to correct me if I am wrong. Following comments about the following
propositions:

P1: "In the draft Reference Model, a topic map is seen as a set of "assertions", no more
and no less."

P2: "Every topic map is a graph, and every assertion within a topic map is a subgraph
within that graph."

That looks great, but it figures anyway you have to stand on more accurate definitions.

-- First P1 and P2 are not consistent with each other. P1 implies that assertions are
*elements* in a topic map graph, whereas P2 claims that they are subsets. Clearly IMO, P2
is right, and P1 is wrong. And it figures also that in the graphical representation, the
use of "assertion" to name the A-nodes (those linked to and A-end of a "CA" arc) is
misleading. Is assertion the node or the subgraph "around that node"? Certainly the
latter. But it figures the elements of that subgraph are to be clearly explicited. Which
nodes and arcs belong to a given assertion?

-- I would suggest to replace "subgraph within that graph" in P2 by "subgraph *of* that
graph", which means precisely:

Given a (topic map) graph G, if A is an assertion of G, then:
    - A is a subset of G : nodes and arcs of A are nodes and arcs of G.
    - A has a graph structure, inherited from the graph structure of G.

Note that is *not* a definition of an assertion. You get just necessary conditions, not
sufficient conditions. Which means it gives you no way to look for an assertion. I'll be
back to that later on.

-- Assuming we have defined (correctly) an assertion as a (sub)graph (of G), if we want
now to replace P1 by something consistent, it looks like it should be something like:

P3: "A topic map G is a (finite) union of assertions".

BTW such a definition seems to imply that there is not such a thing in G like an "isolated
node" - e.g. a topic with no assertions about. Is this what the model really wants? It is
somehow restrictive. You can at some point get a topic with no characteristic (a node
belonging/linked to no assertion), because e.g. you have extracted the graph of assertions
valid in a certain scope, and no assertion concerning this node is valid in that scope.
But you want to keep that node in the topic map anyway (just in case). That kind of topic
map graph would not be validated against the model if you want a topic map graph to be an
union of assertions.

Unless ... I think we could stick to P3, because it is so simple. Since we have so far no
formal definition of an assertion (unless I missed it), we have to set that definition
sufficiently generic to have P3 making sense - even in the above case. Maybe we could
include in the definition something like "null-assertions" (sets of one or more isolated
nodes).

Which leads to the very fundamental question: How is an assertion defined? If I have a
graph processor, and I want to extract assertions subgraphs, how do I identify them? From
the document, I can just guess what is an assertion, and I would like precise answers to
some questions before trying to set a definition. Like:

Q1: How many A-nodes does an assertion contains? Exactly one? At most one? One or more?

Q2: Is the union (merging) of two assertions an assertion?

Is "yes" it implies the answer to Q1 is "one or more", and it implies obviously that any
finite union of assertions is an assertion. Along with P3, it implies that a topic map,
being a finite union of assertions, is itself an assertion. Which makes sense after all. A
topic map is a maybe very complex assertion, but it is an assertion - but that means the
concept of topic map is redundant in the model. Weird, but unescapable.

OTOH answering "never" does not seem sustainable. Think about adding a new role, merging,
inference, query ...

And answering "yes, sometimes" and "no, sometimes" kills the simplicity of the model, and
looks like making the devil enter through the back door.

Q3: What about constraints on cardinality? like:

How many CA arcs are allowed for a given C-node?
Is the CR arc mandatory for every C-node?

Well - enough for today. Just an answer to Holger recent post:

*Holger
> If "graph" is understood as a collection of nodes
> you can reach following the arcs starting at an arbitrary node ...

I think, Holger, you are wrong here. Your definition reduces the general notion of graph
to "connected graph" (in which there exists a finite walk from any node to any other
node). I don't see why a topic map should be constrained to be connected. If you merge two
topic maps with disjoint sets of subjects, role and association types, you will get a
merged graph with (at least) two disjoint connected components. So being connected is not
a requirement for G.

> ... and "subgraph" is understood as a subset of this
> collection, then I would say that "every assertion within
> a topic map" is not necessarily a subgraph.
> Why? Because assertions may define graphs which are not
> connected to other assertions/graphs.

I don't see your point. Two assertions A1 and A2 can be two disjoint subgraphs of G. This
is not a problem in graph theory. Or an assertion could be a non-connected subgraph, if it
is the union of A1 and A2, like "Bernard works for Mondeca and Holger likes Scottish
beer" - although it would be quite easy to add a third assertion making the connection
happen in that case :))

To be continued

Bernard

-------------------------------------------------
Bernard Vatant - Consultant
Mondeca - "Making Sense of Content"
www.mondeca.com
bernard.vatant@mondeca.com
-------------------------------------------------