[sc34wg3] revised draft Reference Model document N0298

Steven R. Newcomb sc34wg3@isotopicmaps.org
10 Apr 2002 06:07:01 -0500


"Bernard Vatant" <bernard.vatant@mondeca.com> writes:

> P1: "In the draft Reference Model, a topic map is
> seen as a set of "assertions", no more and no less."

> P2: "Every topic map is a graph, and every assertion
> within a topic map is a subgraph within that graph."

> That looks great, but it figures anyway you have to
> stand on more accurate definitions.

> -- First P1 and P2 are not consistent with each
> other. P1 implies that assertions are *elements* in a
> topic map graph, whereas P2 claims that they are
> subsets. Clearly IMO, P2 is right, and P1 is wrong.

I think you are right, Bernard.  For example, a topic
map may have nothing in it but topics whose subjects
are addressable.  

(Therefore, P1 is a case of rhetorical excess.  We had
reason to be rhetorically excessive in N0298, but we
can't afford that kind of thing in the standard.  I
think of N0298 as a kind of manifesto.  The standard
itself cannot be a manifesto; it must be rigorously and
comprehensively detailed.  With N0298, we were
attempting to make the most efficient possible use of
everyone's time, including ours, by omitting the usual
legalistic mumbo-jumbo that always seems to obscure the
intent of a standard, and by also omitting a great deal
of hair-splitting that will ultimately be necessary.
We felt that the right time for making these
significant time investments is only *after* we all
feel we're in agreement about the outline and intent of
the Reference Model.)

> And it figures also that in the graphical
> representation, the use of "assertion" to name the
> A-nodes (those linked to an A-end of a "CA" arc) is
> misleading. Is assertion the node or the subgraph
> "around that node"? Certainly the latter.

This is subtle.  The assertion itself is a subgraph,
but _the node whose subject is the assertion_ is the
A-node.  The A-node is not the assertion; it's just the
thing you identify as a role player when you want to
make an assertion about the assertion.  Sometimes I
describe the A-node as the "assertion nexus" or the
"assertion handle".

> But it figures the elements of that subgraph are to
> be clearly explicited. Which nodes and arcs belong to
> a given assertion?

My explanation of this can probably be much more
elegantly and compellingly expressed by someone who is
more skillful in mathematics and graph theory than I
am.  The following is my poor attempt to answer your
question:

By definition: 

* Every arc has a C end, an A end, or both.  C ends are
  always C-nodes; A ends are always A-nodes.

* Every C-node and every A-node is a component of
  exactly one unique assertion.  

* Every arc that's connected to a C-node and/or an
  A-node is a component of the same unique assertion of
  which the C-node or A-node is also a component.

* Every A-node is a node that uniquely corresponds to
  exactly one assertion, and serves as the only nexus
  of that same assertion.  Every assertion has exactly
  one A-node.

* For any C-node, there is only one A-node that is
  connected to it by an AC arc.  The A-node thus
  connected to the C-node is the A-node that uniquely
  corresponds to the assertion of which the C-node is
  also a member.

For any arc, you can always determine the one-and-only
assertion of which it is a component, because it is
connected to exactly one A-node (perhaps through a
C-node).  (If the arc is a Cx arc, you have to ignore
the node that serves as its x end.)
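
That lookup can be sketched in Python.  This is only an
illustration, not part of N0298.  It assumes each arc is
recorded as a (kind, first, second) triple, where kind is
"AC", "AP", "CR", or "Cx" and the two ends appear in the
order the kind names them.  The names Arc and a_node_of
are hypothetical.

```python
from typing import List, NamedTuple

class Arc(NamedTuple):
    kind: str    # assumed encoding: "AC", "AP", "CR", or "Cx"
    first: str   # node at the end named by kind[0] (A or C)
    second: str  # node at the end named by kind[1] (C, P, R, or x)

def a_node_of(arc: Arc, arcs: List[Arc]) -> str:
    """Return the A-node of the one assertion that `arc` belongs to."""
    if arc.kind in ("AC", "AP"):
        # The arc has an A end, so it touches its A-node directly.
        return arc.first
    if arc.kind not in ("CR", "Cx"):
        raise ValueError(f"unknown arc kind: {arc.kind!r}")
    # CR and Cx arcs reach the A-node through their C-node.
    # The R end or x end is ignored.
    c_node = arc.first
    owners = [a.first for a in arcs
              if a.kind == "AC" and a.second == c_node]
    if len(owners) != 1:
        raise ValueError(
            f"C-node {c_node!r} has {len(owners)} AC arcs, expected 1")
    return owners[0]

# Hypothetical assertion a1 with two role players, t1 and t2.
arcs = [
    Arc("AP", "a1", "p1"),
    Arc("AC", "a1", "c1"), Arc("CR", "c1", "r1"), Arc("Cx", "c1", "t1"),
    Arc("AC", "a1", "c2"), Arc("CR", "c2", "r2"), Arc("Cx", "c2", "t2"),
]
```

Every arc in this example resolves to "a1", whichever
end the lookup starts from.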

> -- I would suggest to replace "subgraph within that
> graph" in P2 by "subgraph *of* that graph", which
> means precisely:

> Given a (topic map) graph G, if A is an assertion of
> G, then:

>     - A is a subset of G : nodes and arcs of A are
>       nodes and arcs of G.

>     - A has a graph structure, inherited from the
>       graph structure of G.

I'm happy with everything you say, except I don't think
I fully understand the significance of "inherited
from".  What does this mean?

> Note that is *not* a definition of an assertion. You
> get just necessary conditions, not sufficient
> conditions. Which means it gives you no way to look
> for an assertion. I'll be back to that later on.

> -- Assuming we have defined (correctly) an assertion
> as a (sub)graph (of G), if we want now to replace P1
> by something consistent, it looks like it should be
> something like:

> P3: "A topic map G is a (finite) union of
> assertions".

> BTW such a definition seems to imply that there is
> not such a thing in G like an "isolated node" -
> e.g. a topic with no assertions about. Is this what
> the model really wants? 

No, N0298 is wrong to imply that isolated nodes are not
allowed.  Sorry for this inconsistency!  The revised
N0298, as now posted, does say "zero" in the following
statement,

  "Every node in a topic map graph serves as some
  combination of zero or more of the eight end points
  of the four types of arcs."

> It is somehow restrictive. You can at some point get
> a topic with no characteristic (a node
> belonging/linked to no assertion), because e.g. you
> have extracted the graph of assertions valid in a
> certain scope, and no assertion concerning this node
> is valid in that scope.  But you want to keep that
> node in the topic map anyway (just in case). That
> kind of topic map graph would not be validated
> against the model if you want a topic map graph to be
> an union of assertions.

Right.
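
A small Python sketch of that situation, assuming a graph
is kept as a set of node names plus a list of
(kind, first, second) arc triples.  This representation
and the name isolated_nodes are hypothetical, chosen only
for illustration.

```python
def isolated_nodes(nodes, arcs):
    """Return the nodes that serve as zero arc end points."""
    touched = {n for _, first, second in arcs for n in (first, second)}
    return set(nodes) - touched

# One assertion (a1) with a single role player t1, plus a
# topic t2 about which nothing is currently asserted.
nodes = {"a1", "c1", "r1", "t1", "t2"}
arcs = [("AC", "a1", "c1"), ("CR", "c1", "r1"), ("Cx", "c1", "t1")]

leftover = isolated_nodes(nodes, arcs)
```

Because the node set is stored separately from the arcs,
t2 stays in the graph even though no assertion reaches it.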

> Unless ... I think we could stick to P3, because it
> is so simple. Since we have so far no formal
> definition of an assertion (unless I missed it), we
> have to set that definition sufficiently generic to
> have P3 making sense - even in the above case. Maybe
> we could include in the definition something like
> "null-assertions" (sets of one or more isolated
> nodes).

I think there's no need to go that far.  Let's let
nodes be nodes, and assertions be assertions, and try
very hard to say exactly what we mean.

> Which leads to the very fundamental question: How is
> an assertion defined? If I have a graph processor,
> and I want to extract assertions subgraphs, how do I
> identify them? From the document, I can just guess
> what is an assertion, and I would like precise
> answers to some questions before trying to set a
> definition. Like:

> Q1: How many A-nodes does an assertion contain?
> Exactly one? At most one? One or more?

Exactly one.  (Unless when you say "contains", you
include the role players -- the nodes that serve as the
x ends of Cx arcs.)
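
To make the distinction concrete, here is a Python
sketch.  It assumes arcs are (kind, first, second)
triples with kind "AC", "AP", "CR", or "Cx".  The function
names are hypothetical.  It treats only the A-node and
its C-nodes as components; leaving out the R and P end
nodes is an assumption of the sketch, since this message
does not settle whether they count.

```python
def assertion_nodes(a_node, arcs):
    """Return the A-node plus the C-nodes joined to it by AC arcs."""
    c_nodes = {second for kind, first, second in arcs
               if kind == "AC" and first == a_node}
    return {a_node} | c_nodes

def role_players(a_node, arcs):
    """Return the x ends of the Cx arcs of the assertion at a_node."""
    c_nodes = assertion_nodes(a_node, arcs) - {a_node}
    return {second for kind, first, second in arcs
            if kind == "Cx" and first in c_nodes}

# Hypothetical data: assertion a2 says something about assertion a1,
# so a1's A-node is a role player in a2.
arcs = [
    ("AC", "a1", "c1"), ("CR", "c1", "r1"), ("Cx", "c1", "t1"),
    ("AC", "a2", "c3"), ("CR", "c3", "r3"), ("Cx", "c3", "a1"),
    ("AC", "a2", "c4"), ("CR", "c4", "r4"), ("Cx", "c4", "t3"),
]

components = assertion_nodes("a2", arcs)
players = role_players("a2", arcs)
```

Here a2 has one A-node among its components, and a1 shows
up only among its role players.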

> Q2: Is the union (merging) of two assertions an assertion?

If you mean by "merging" what I mean by "merging", the
answer is "Yes".  I'm not comfortable with your
implication that the term "union" means the same thing
as "merge", but maybe that's because I don't understand
it.  I don't think of merging as an operation on sets.
(Although I suppose at some extremely Platonic level of
abstraction it would be OK to think of it that way.)

> If "yes", it implies the answer to Q1 is "one or more",

No, it doesn't.  Merging always involves a reduction in
the number of nodes.  After merging, the answer to
Q1 is still, "Exactly one."

> and it implies obviously that any finite union of
> assertions is an assertion. Along with P3, it implies
> that a topic map, being a finite union of assertions,
> is itself an assertion.

No.  I think maybe we're being led astray here by
confusing merging with unioning.

> Which makes sense after all. A topic map is a maybe
> very complex assertion, but it is an assertion - but
> that means the concept of topic map is redundant in
> the model. Weird, but unescapable.

I don't follow you, here.

> Q3: What about constraints on cardinality? like:

> How many CA arcs are allowed for a given C-node?
> Is the CR arc mandatory for every C-node?

OK, here are our proposed rules for graph
well-formedness:

* All nodes can serve as the x ends of any number of Cx
  arcs.

  At the level of nodes and arcs, a node's service as
  the x end of a Cx arc (i.e., as a role player) has no
  implications for its subject.  

  (NOTE: However, the semantics of some assertion
         types, such as the Reference Model's
         "topic-subjectIndicator" assertion type, can
         impact the subjects of nodes.  Instances of
         this assertion type declare that the subject
         of the node that plays the "topic" role is the
         subject that's indicated by the addressable
         subject that plays the "subjectIndicator"
         role.  Applications, including but not limited
         to the Standard Application, may provide many
         assertion types that impact the subjects of
         players of certain of their roles.)

* Any node that serves as the C end of any AC, CR, or
  Cx arc:

   * MUST serve as the C end of exactly one AC arc, and
     as the C end of exactly one CR arc, and as the C
     end of exactly one Cx arc, and

   * CANNOT serve as the A end of any AC or AP arc, and

   * CANNOT serve as the P end of any AP arc, and

   * CANNOT serve as the R end of any CR arc.

   The subject of such a node is always the playing of
   a specific role in a specific assertion by a
   specific role player.

* Any node that serves as the R end of any CR arc:

   * MAY serve as the R end of any number of CR arcs, and

   * CANNOT serve as the A end of any AC or AP arc, and

   * CANNOT serve as the P end of any AP arc, and

   * CANNOT serve as the C end of any CR, Cx, or AC arc.

   The subject of such a node is always a role (or,
   using the jargon established in HyTM, a "role
   type").

* Any node that serves as the A end of any AP or AC arc:

   * MAY serve as the A end of any number of AC arcs,
     where that number is greater than 0, and

   * MAY serve as the A end of exactly one AP arc, and

   * CANNOT serve as the P end of any AP arc, and

   * CANNOT serve as the R end of any CR arc, and

   * CANNOT serve as the C end of any CR, Cx, or AC arc.

   The subject of such a node is always an assertion.

* Any node that serves as the P end of any AP arc:

   * MAY serve as the P end of any number of AP arcs, and

   * CANNOT serve as the A end of any AC or AP arc, and

   * CANNOT serve as the R end of any CR arc, and

   * CANNOT serve as the C end of any CR, Cx, or AC arc.

   The subject of such a node is always an assertion
   type, with or without patterning information.
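
The rules above can be sketched as a checker in Python.
This is an illustration under stated assumptions, not a
normative implementation.  Arcs are assumed to be
(kind, first, second) triples with kind "AC", "AP", "CR",
or "Cx".  The sketch reads "greater than 0" as "at least
one AC arc" and "MAY serve as the A end of exactly one AP
arc" as "at most one AP arc".  The name check_well_formed
is hypothetical.

```python
from collections import Counter

KINDS = ("AC", "AP", "CR", "Cx")

def check_well_formed(arcs):
    """Return a list of rule violations (empty if none were found)."""
    ends = {}  # node -> Counter keyed by (arc kind, end letter)
    for kind, first, second in arcs:
        if kind not in KINDS:
            raise ValueError(f"unknown arc kind: {kind!r}")
        ends.setdefault(first, Counter())[kind, kind[0]] += 1
        ends.setdefault(second, Counter())[kind, kind[1]] += 1

    problems = []
    for node in sorted(ends):
        n = ends[node]
        is_c = n["AC", "C"] + n["CR", "C"] + n["Cx", "C"] > 0
        is_r = n["CR", "R"] > 0
        is_a = n["AC", "A"] + n["AP", "A"] > 0
        is_p = n["AP", "P"] > 0
        # The CANNOT rules make the C, R, A, and P roles mutually
        # exclusive.  Serving as an x end is never restricted.
        if is_c + is_r + is_a + is_p > 1:
            problems.append(f"{node}: serves as more than one of C, R, A, P")
        if is_c:
            for kind in ("AC", "CR", "Cx"):
                if n[kind, "C"] != 1:
                    problems.append(
                        f"{node}: C end of {n[kind, 'C']} {kind} arcs, expected 1")
        if is_a:
            if n["AC", "A"] < 1:
                problems.append(f"{node}: A-node has no AC arc")
            if n["AP", "A"] > 1:
                problems.append(f"{node}: A-node has more than one AP arc")
    return problems

# Hypothetical well-formed assertion a1 with two role players.
good = [
    ("AP", "a1", "p1"),
    ("AC", "a1", "c1"), ("CR", "c1", "r1"), ("Cx", "c1", "t1"),
    ("AC", "a1", "c2"), ("CR", "c2", "r2"), ("Cx", "c2", "t2"),
]
# Giving c1 a second role player breaks the "exactly one Cx arc" rule.
bad = good + [("Cx", "c1", "t9")]
```

Counting end points per node keeps each rule a simple
comparison, and a node that is only ever a role player
passes untouched.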

-- Steve

Steven R. Newcomb, Consultant
srn@coolheads.com

Coolheads Consulting
http://www.coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

1527 Northaven Drive
Allen, Texas 75002-1648 USA