[sc34wg3] And yet another...

Lars Marius Garshol sc34wg3@isotopicmaps.org
Mon, 26 Jul 2004 09:13:57 +0200


* Robert Barta
| 
| Part II
| 
| - What I think that the proposal shows successfully is that you can
|   translate any TMDM structure into RDF. Your quadruples are actually
|   RDF statements, or, more precisely, implementations thereof.

Nearly, but not quite. All my quads that have a statement ID as the
first item in the quad are very awkward to reproduce in real RDF. The
real merit here (if there is any) must lie in the ability of the model
to represent both TMDM and RDF more easily than TMDM and RDF can
represent this model.
 
|   All implementations of RDF I have seen use this approach, although
|   the notation is maybe different:
| 
|     my $subject   = ....
|     my $predicate = ....
|     my $object    = ....
|     my $statement = new RDF::Core::Statement($subject, $predicate, $object);
| 
|   $statement is nothing else than the identifier for the whole thing, like
|   what you suggest.

Sure, but you can't easily refer to $statement in a new statement in
RDF. That's what makes exporting TMs to RDF so difficult: where do
you put scope, variants, and reification?
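
A rough Python sketch of the point, not taken from the proposal: the
SCOPE property and the "european-sizes" theme are made-up names for
illustration, and the RDF side uses the standard reification vocabulary
(rdf:Statement, rdf:subject, rdf:predicate, rdf:object).

```python
# FM-style quad: (subject, property, statement-id, value)
stmt = ("rho", "shoesize", "s1", "41")
# Saying something about that statement is one more quad whose first
# item is the statement ID. SCOPE is a hypothetical property name.
scoped = ("s1", "SCOPE", "s2", "european-sizes")


def reify(node, subject, predicate, obj):
    """Return the four triples of standard RDF reification for one statement."""
    return [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", predicate),
        (node, "rdf:object", obj),
    ]


# In RDF the plain triple has no handle, so the statement is described
# a second time before anything can be attached to it.
triples = [("rho", "shoesize", "41")]
triples += reify("_:s1", "rho", "shoesize", "41")
triples.append(("_:s1", "SCOPE", "european-sizes"))

print(len([stmt, scoped]), "quads versus", len(triples), "triples")
```

The quad version gets the handle for free; the triple version has to
build one out of four extra statements first.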
 
|   In this context you write
| 
|      Finally, the only way to represent associations in this model is
|      to turn them into full nodes with each role player connected in
|      by a triple of its own.
| 
|   Well, this is RDF isn't it?

Not really. In RDF the employed-by relationship goes like this:

  (employee-topic, employed-by, employer-topic)

while in the FM it goes

  (assoc-id, employee, id1, employee-topic)
  (assoc-id, employer, id2, employer-topic)
  (assoc-id, TYPE,     id3, employed-by)
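
To make the difference concrete, here is a minimal Python sketch. The
function name association_to_quads and the generated statement IDs are
assumptions made for illustration; they are not part of the proposal.

```python
from itertools import count


def association_to_quads(assoc_id, assoc_type, roles):
    """Expand one association into (subject, property, statement-id, value) quads.

    roles maps each role type to its role player. Statement IDs are
    generated as id1, id2, ... purely for illustration.
    """
    ids = (f"id{n}" for n in count(1))
    # One quad per role player, each hanging off the association node.
    quads = [(assoc_id, role, next(ids), player) for role, player in roles.items()]
    # One more quad carrying the association type.
    quads.append((assoc_id, "TYPE", next(ids), assoc_type))
    return quads


# The RDF form is a single triple connecting the two topics directly.
rdf_triple = ("employee-topic", "employed-by", "employer-topic")

# The FM form turns the association into a node of its own.
fm_quads = association_to_quads(
    "assoc-id",
    "employed-by",
    {"employee": "employee-topic", "employer": "employer-topic"},
)
for quad in fm_quads:
    print(quad)
```

Note that the predicate of the RDF triple ends up as the *value* of the
TYPE quad, while the role types take over the property position.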

| - The question is now whether the foundation model (FM) is a model for
|   topic maps. For me this is the case when all topic maps, say
|   instances of TMDM, can be mapped into the FM (completeness).
| 
|   You have not shown all details, but I cannot see a problem here. 

There is one problem, actually, which isn't described in the document.
If we use the occurrence type as the second item of the quad to
represent occurrences, there is nothing in the quad that says that
this actually *is* an occurrence and not, say, a typed name.

The solutions I've thought of for this problem are

  a) introduce a TMDM restriction that says a type can only be used as
     a type for a single kind of construct, or

  b) turn the quads into quints.

I haven't gone for either yet, because I think this problem is
intimately related to the RDF/TM interoperability problem, and that
remains unsolved here, even though it does appear to be within reach.
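
The ambiguity and the two candidate fixes can be sketched in Python as
follows. This is only an illustration under assumptions: the "nickname"
type, the kind labels OCCURRENCE and NAME, and the choice to put the
kind in fifth position are all invented here.

```python
# Quad form: (subject, type, statement-id, value)
occurrence_quad = ("rho", "nickname", "s1", "Rho")  # meant as an occurrence
name_quad = ("rho", "nickname", "s2", "Rho")        # meant as a typed name


def shape(t):
    """Return everything in a tuple except its statement ID."""
    return t[:2] + t[3:]


# Apart from the statement ID the two quads are identical, so nothing
# tells a consumer which kind of construct each one came from.
print(shape(occurrence_quad) == shape(name_quad))

# Option (b): quints, with a fifth item naming the kind of construct.
occurrence_quint = ("rho", "nickname", "s1", "Rho", "OCCURRENCE")
name_quint = ("rho", "nickname", "s2", "Rho", "NAME")
print(shape(occurrence_quint) == shape(name_quint))


def types_used_for_several_kinds(quints):
    """Option (a) as a check: find types that type more than one kind of construct."""
    kinds = {}
    for _subject, type_, _sid, _value, kind in quints:
        kinds.setdefault(type_, set()).add(kind)
    return {t for t, ks in kinds.items() if len(ks) > 1}


print(types_used_for_several_kinds([occurrence_quint, name_quint]))
```

With restriction (a) in place the last call would have to return an
empty set for every valid topic map, and the quads alone would suffice.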

|   Actually, the RDF people have argued all along that this can be
|   done and that TMs can be disregarded because of this.

If you take the Omnigator, go into opera.xtm, click export, select
RDF/XML format, and study the result, I think you'll have to admit that
it's harder than it seems. In fact, it's so hard that I think the
claim can be dismissed out of hand, unless I misunderstand something.

|   The other question is whether all sets of quadruples form valid topic
|   maps (soundness). This is less clear to me. Is, for instance, the set
| 
|     [ 1, 2, 3, 4 ]
|     [ 5, 6, 7, 8 ]
| 
|   a map? And if not, why not?

That's a good question, and I think this is what the first paragraph
of section 5 is talking about. Frankly, I don't know if this is an
interesting question.

Do we care about this model as a model, or do we only care about it as
a way of connecting XTM/CXTM with TMQL/TMCL? I don't know the answer
to this, but I think we should answer this question for ourselves
before we spend a lot of time creating a waterproof model for
something that may in the end be no more than an editorial device.
 
|   In this context you write that 
| 
|       Since the thinking behind this proposal is that TMDM remain as
|       it is and where it is, it is not necessarily a problem for the
|       foundational model to not have the constraints in it. The
|       constraints will be provided by TMDM, and the foundational
|       model will be specified as a transformation from TMDM to the
|       set of tuples.
| 
|   If the FM is simply a transformation of TMDM items into quadruple sets,
|   then what exactly is the additional benefit, except of showing one
|   alternative to implement TMDM instances?

It makes it easier to do TMQL. End of story, really. (Of course, there
is the RDF bit, but whether we can make any use of that in the
standards work I don't know.)
 
|   Should then not FM simply be an addendum to TMDM, or - even better
|   - should TMDM not be drastically simplified by using quadruples
|   only?

I'm not sure it *would* be simplified, since you'd have to import all
the constraints into the quad model. How that would look I haven't
considered.

The other thing is that TMDM has gone to CD after three years of
committee work. I'd hate to start the mill necessary to stop it and
replace it with something based on quads, especially when, as you say,
it's not clear that there's any benefit to it.

My conclusion is that I think this model, if we do further work on it
at all, should go into TMQL as the basis for TMQL and the means of
connecting TMQL with TMDM.
 
| - The FM would allow to map TMDM instances to quadruples. This is
|   making use of the 'vocabulary' we all (almost) love and use:
|   basename, occurrences, variants, ....
 
Yep.

|   What about using another vocabulary? birthdate, shoesize? Is this
|   then un-TM-ishly improper? 

Birthdate and shoesize are both occurrence types in TMDM. So

  <topic id="rho">
    <occurrence>
      <instanceOf>
        <topicRef xlink:href="#shoesize"/>
      </instanceOf>
      <resourceData>41</resourceData>
    </occurrence>
    <occurrence>
      <instanceOf>
        <topicRef xlink:href="#birthdate"/>
      </instanceOf>
      <resourceData>unknown</resourceData>
    </occurrence>
  </topic>

would become (using symbolic identifiers this time, for readability)

  (rho, SOURCE_LOCATOR, statement1, "file://...#rho")
  (rho, shoesize, statement2, "41")
  (rho, birthdate, statement3, "unknown")
  (shoesize, SOURCE_LOCATOR, statement4, "file://...#shoesize")
  (birthdate, SOURCE_LOCATOR, statement5, "file://...#birthdate")

I guess what I'm saying is that you don't *need* another vocabulary
to do this sort of thing; you just do it within TMDM.
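
For what it's worth, the translation above is mechanical enough to
sketch in a few lines of Python. Everything here is an assumption made
for illustration rather than something the proposal specifies: the
function name topic_quads, the wrapper <topicMap> element (added only
so the xlink prefix is declared), the omission of the XTM default
namespace, and the rule that topics referenced only as types get their
source locator quads last.

```python
import xml.etree.ElementTree as ET
from itertools import count

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

XTM = """
<topicMap xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="rho">
    <occurrence>
      <instanceOf><topicRef xlink:href="#shoesize"/></instanceOf>
      <resourceData>41</resourceData>
    </occurrence>
    <occurrence>
      <instanceOf><topicRef xlink:href="#birthdate"/></instanceOf>
      <resourceData>unknown</resourceData>
    </occurrence>
  </topic>
</topicMap>
"""


def topic_quads(xtm_text, base="file://..."):
    """Turn topics and their typed occurrences into symbolic quads."""
    ids = (f"statement{n}" for n in count(1))
    quads, referenced = [], []
    for topic in ET.fromstring(xtm_text.strip()).iter("topic"):
        tid = topic.get("id")
        quads.append((tid, "SOURCE_LOCATOR", next(ids), f"{base}#{tid}"))
        for occ in topic.iter("occurrence"):
            # The occurrence type becomes the second item of the quad.
            type_id = occ.find("instanceOf/topicRef").get(XLINK_HREF).lstrip("#")
            quads.append((tid, type_id, next(ids), occ.findtext("resourceData")))
            if type_id not in referenced:
                referenced.append(type_id)
    # Topics that were only referenced as types get a source locator too.
    for type_id in referenced:
        quads.append((type_id, "SOURCE_LOCATOR", next(ids), f"{base}#{type_id}"))
    return quads


for quad in topic_quads(XTM):
    print(quad)
```

The point is that shoesize and birthdate never need special treatment:
they are ordinary topics that happen to be used as occurrence types.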

|   How can I define which are there and in which constellation they
|   may appear? (Others would call it 'disclosure'.)

TMCL is still the answer to that question, as far as I'm concerned.
 
|   Would I have to write then a TMDM-rho which - in prose - would
|   define all this stuff? Isn't this a bit like going to field number
|   one, given all the TMRM discussions we had over the last year?

It would be, yes. So I'm not talking about that, but only about
connecting TMDM with TMCL/TMQL.

-- 
Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50                  <URL: http://www.garshol.priv.no >