[sc34wg3] Mathematical Expression of Reference Model

Mon, 15 Apr 2002 23:45:51 +0200

Jan

Thanks for your feedback. Answers below.

> (P1) Every element of T is called a topic.
>
> (P2) C, A, R and P are disjoint subsets of T

>     <comment>
>     From a philosophical standpoint this makes sense
>     (e.g. no assertion can be a pattern)  but as far
>     as I know dRM makes no such restrictions (yet).

I think the dRM does imply those restrictions. At least that is what I inferred from the set
of constraints that Steve sent recently in answer to my questions. I would like Steve to
confirm. But in fact the definition of C, A, R and P is more functional than generic,
which means they are defined to be the respective range or domain of the functions
representing the four types of "arcs".
I could have defined those functions first, and then defined C as the domain of Ca (and Cx), A
as the range of Ca, etc.
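To make the "functions first" option concrete, here is a minimal Python sketch. The four functions are modelled as dictionaries, and the topic names (c1, a1, ...) are invented for illustration; nothing here is taken from the dRM text itself. C, A, R and P are derived from the functions rather than declared up front, and (P2) becomes a simple check.

```python
# Hypothetical toy data: every topic name below is invented.
Ca = {"c1": "a1", "c2": "a1"}   # casting -> assertion
Cx = {"c1": "x1", "c2": "x2"}   # casting -> role player
Cr = {"c1": "r1", "c2": "r2"}   # casting -> role
Ap = {"a1": "p1"}               # assertion -> pattern

# The four sets fall out of the functions.
C = set(Ca)            # domain of Ca (and of Cx)
A = set(Ca.values())   # range of Ca
R = set(Cr.values())   # range of Cr
P = set(Ap.values())   # range of Ap

def pairwise_disjoint(*sets):
    """True if no element appears in more than one of the given sets (P2)."""
    seen = set()
    for s in sets:
        if seen & s:
            return False
        seen |= s
    return True
```

With this toy data, `pairwise_disjoint(C, A, R, P)` holds, and `set(Cx) == C` confirms that Ca and Cx share the same domain.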

>     I think that P2 is a very simple formalism for the
>     'connection rules' mentioned in dRM: "There are
>     rules regarding the combinations of arc endpoints
>     as which a given node can serve,..."

>     [Is this assumption correct, Steve ?]

>     P1 and P2 are a very nice way of expressing the
>     fact that dRM does not use typed nodes, but that
>     nodes belong to certain 'categories', depending on
>     the types of arcs they are endpoints of.

Yes. It figures that no property of elements in set theory - including "type" - is
generic. Properties are inherited from the relations an element is involved in. That's why
set theory is the perfect way to express a topic: a topic is a "blank" element as long as
there is no assertion on it.

>     <comment>
>     I wonder what the 'meaning' of this set of topics
>     is that are neither elements of A,R,P or C.
>     </comment>

That "meaning" has been almost there ;-) In a first version I named that set "ground" or
"level-0" topics, then changed my mind, because this set is not used in the function
definitions. I prefer to consider them "neither-neither". But they are the majority of
topics in most topic maps!
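As a small illustration, the "neither-neither" set is just a set difference. This is a Python sketch with invented topic names; the variable name `neither` is hypothetical.

```python
# Hypothetical toy topic map: all names are invented.
T = {"c1", "c2", "a1", "r1", "r2", "p1", "x1", "x2", "lonely"}
C = {"c1", "c2"}
A = {"a1"}
R = {"r1", "r2"}
P = {"p1"}

# Topics that are in none of C, A, R, P.
neither = T - (C | A | R | P)
```

Here `neither` contains the plain role players x1 and x2 as well as the unconnected topic "lonely", which matches the remark that such topics are the majority in most topic maps.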

> (P3) Ca, Ct [you mean Cx, right ?], Cr and Ap are
>      functions, of which domain and range are subsets
>      of T.

I do mean Cx - thanks for correcting the typo.

>     <comment>
>     This is very good! You provide a formalism for
>     traversals using set theory. The functions allow
>     one to say: "start at a set of seed nodes and
>     traverse a certain arc. The set of nodes that are
>     'reached' that way is the result set of the
>     traversal. 'traverse a certain arc' maps to your
>     functions, e.g. "apply Ca to every topic in the
>     seed set".
>
>     [BTW: The query language of the GooseWorks Toolkit
>     works exactly this way. ;-) ]

Good. I don't see how it could work otherwise ;-)
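The traversal described in the comment above can be sketched in a few lines of Python. This assumes each function is stored as a dictionary; the helper name `traverse` and the topic names are invented, and this is not meant to describe how the GooseWorks query language is actually implemented.

```python
def traverse(f, seeds):
    """Return the set of topics reached by applying f to every seed.

    Seeds outside the domain of f are simply skipped.
    """
    return {f[t] for t in seeds if t in f}

# Hypothetical toy data.
Ca = {"c1": "a1", "c2": "a1"}   # casting -> assertion
Cx = {"c1": "x1", "c2": "x2"}   # casting -> role player

assertions = traverse(Ca, {"c1", "c2"})
players = traverse(Cx, {"c1", "c2", "not-a-casting"})
```

Two castings of the same assertion collapse to a single element in the result set, because the result is a set and not a list.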

>     I suggest that you add functions for the opposite
>     'direction' too:
>
>     AC, xC, RC and PA
>     </comment>

WRONG !!! You miss a very important point there. The inverse of a function is not, in
general, a function !!!
Ca is a function because every element in C is linked to exactly one element in A, but
*not the other way round*: an element of A can be the image of several elements of C.
And it's the same with the three others. The function "arrows" have to be exactly in the
direction in which they have been defined. They do not make sense as functions in the opposite
direction. That is very important. A function is not an oriented arc!

If you want to define reverse functions, you will have to use functions from T to P(T), the
set of all subsets of T (the power set of T, not to be confused with the set P of patterns).
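A minimal Python sketch of such a reverse function, assuming Ca is stored as a dictionary and using invented topic names. The name `preimage` is hypothetical; it maps a topic to the set of all castings that Ca sends to it, so its values are subsets of T.

```python
def preimage(f, t):
    """Return the set of all elements that f maps to t (possibly empty)."""
    return frozenset(x for x, y in f.items() if y == t)

# Hypothetical toy data: two castings point at the same assertion.
Ca = {"c1": "a1", "c2": "a1", "c3": "a2"}

castings_of_a1 = preimage(Ca, "a1")
castings_of_x = preimage(Ca, "x1")   # x1 is not an assertion
```

The result for a1 has two elements and the result for x1 is empty, which is exactly why the reverse direction cannot be a function from T to T.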

>     <comment>
>     set theory makes it easy to 'explain' dRM ;-)
>     </comment>

I would prefer "express" to "explain". Anyway the whole point of mathematical language,
and set theory in particular, is to make things easy to express :))

> (P4) For every element a of A is defined a subset Ga of
>      T, called the assertion-graph of a.
>      Ga = { a , ci , xi } such as i belongs to a
>      finite set I, and for every i in I : a = Ca ( ci )
>      and xi = Cx ( ci ).  {xi} is the set of
>      role-players of the assertion graph.

>          <comment>
>          Is there a reason why you did not include the
>          Pattern and the roles ?

I was waiting for that one. It's not obvious. You could - I did include them to begin
with. I finally left them out to stay closer to the notions of "connected components"
and "lift" that we have in the hypergraph model. Roles and Patterns belong to another
semantic layer.
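For illustration, the assertion-graph of (P4) can be computed as follows. This is a Python sketch with dictionaries standing in for Ca and Cx; the function names and topic names are invented. Roles and patterns are deliberately left out of Ga, as in the definition.

```python
def role_players(a, Ca, Cx):
    """The set {xi}: players of all castings ci with Ca(ci) == a."""
    return {Cx[c] for c, target in Ca.items() if target == a}

def assertion_graph(a, Ca, Cx):
    """Ga: the assertion a, its castings ci, and their role players xi."""
    castings = {c for c, target in Ca.items() if target == a}
    return {a} | castings | role_players(a, Ca, Cx)

# Hypothetical toy data: x1 plays a role in both assertions.
Ca = {"c1": "a1", "c2": "a1", "c3": "a2"}
Cx = {"c1": "x1", "c2": "x2", "c3": "x1"}

Ga1 = assertion_graph("a1", Ca, Cx)
Ga2 = assertion_graph("a2", Ca, Cx)
```

In this toy data the two graphs overlap only in the shared role player x1.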

>          Doing so would enable us to say that a topic map
>          is the union of all its assertion subgraphs (Ga)
>          </comment>

I don't see your point there. First, the assertion-pattern-graph is also an assertion
subgraph, so whether or not p and r belong to Ga does not change the validity of
that proposition. Second, we have to consider isolated topics (those which are neither
in the range nor in the domain of any of the four functions). Anyway, I don't see that saying
that a topic map is the union of its assertion subgraphs has any practical use in the
model.

> (P5) An assertion-pattern is an element ap of A,
>      of which the assertion-graph contains exactly one
>      role-player which belongs to P, whereas the other
>      role-players belong to R.
>
>          <comment>
>          Hhmm, you seem to say here that the connection
>          between an assertion topic (A) and its pattern (P)
>          is 'labeled' by a role ???

Not at all. I don't see why you understand it that way.
I'm speaking there of the assertion-graph of (ap), representing the
"assertionPattern-role-rolePlayerConstraints" assertion.

>          It's not! The patterns of assertions are not
>          role-players (in the assertion they do pattern),
>          they are just connected to them via AP arcs.

Agreed. p = Ap(a) does not say anything else.

>          Also, I think that topics 'become' patterns
>          (elements of P ;-) by playing the dRM predefined
>          role 'role-pattern' in a
>          "assertionPattern-role-rolePlayerConstraints" assertion!
>          That means that there CAN be patterns that are not connected
>          to an assertion via an AP arc!  Steve, am I correct here ?
>          </comment>

I think you are correct here. There might be patterns with no assertion instance, and
roles with no casting instance. In that case, it means the range of Ap is *a subset of* P,
and the range of Cr is *a subset of* R. I will make that modification.

> (P6) Assertion-pattern consistency rule:
>      If p = Ap ( a ) and a = Ca ( c ) and r = Cr ( c )
>      then p and r belong to the assertion-graph of the
>      same assertion-pattern
>
>        <comment>
>        I wonder how many other useful statements there are!
>        </comment>

Good question. This is definitely not the end of it, even for patterns. As said
previously, patterns are the most difficult thing to handle, in any model. The cardinality
constraints are not yet expressed, for example.
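Rule (P6) as quoted above can be checked mechanically. The following Python sketch uses dictionaries for the four functions and invented topic names; the function names are hypothetical. As a simplifying assumption, the castings of the assertion-pattern ap1 are given no role and ap1 is given no pattern of its own, so the check skips them.

```python
def assertion_graph(a, Ca, Cx):
    """Ga: the assertion a, its castings, and their role players."""
    castings = {c for c, target in Ca.items() if target == a}
    return {a} | castings | {Cx[c] for c in castings}

def satisfies_p6(Ca, Cx, Cr, Ap, pattern_assertions):
    """Check that each (p, r) pair sits in the graph of one assertion-pattern."""
    graphs = [assertion_graph(ap, Ca, Cx) for ap in pattern_assertions]
    for c, a in Ca.items():
        if a not in Ap or c not in Cr:
            continue  # toy simplification: skip incomplete castings
        p, r = Ap[a], Cr[c]
        if not any(p in g and r in g for g in graphs):
            return False
    return True

# Hypothetical toy data: ap1 has role players p1 (a pattern), r1 and r2 (roles).
Ca = {"c1": "a1", "c2": "a1", "k0": "ap1", "k1": "ap1", "k2": "ap1"}
Cx = {"c1": "x1", "c2": "x2", "k0": "p1", "k1": "r1", "k2": "r2"}
Cr = {"c1": "r1", "c2": "r2"}
Ap = {"a1": "p1"}

ok = satisfies_p6(Ca, Cx, Cr, Ap, {"ap1"})

# A casting whose role r9 is unknown to the pattern breaks the rule.
Cr_bad = {"c1": "r1", "c2": "r9"}
broken = satisfies_p6(Ca, Cx, Cr_bad, Ap, {"ap1"})
```

The cardinality constraints mentioned above are not covered by this check.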

And there is the controversial constraint on connected components we have in the
hypergraph model, which can also be expressed in this model (to be delivered ASAP).

The whole question is which constraints are to be in the generic model. Afterwards,
it will be easy to add more constraints to define more specific models, for specific
classes of topic maps (layered, connected ...).

Keep feedback coming.

Bernard