[sc34wg3] Modularization

Fri, 7 Feb 2003 01:26:51 -0500

* Lars Marius Garshol:

> Michel, I've read this posting several times now, but I can't make out
> what you are saying. Could you start from N0323 and describe what it
> is you would like to see done differently?

I believe both the SAM and the RM need to be revisited,
as well as their boundaries. This requires a
particularly big effort from those like you who are
directly involved in one of them and have a view that
privileges one of the aspects of the whole game. But
it's particularly important to take the perspective one
level up in order to make the various pieces work
*together* instead of considering that they work
*against* each other. This is why I am not too
surprised by your reaction, and I am hopeful that after
we clarify the various points we'll arrive at a point
where all of the parties involved, having worked on
different aspects, will feel comfortable that
everything can fit into the global picture.

In Topic Maps I see 4 main aspects which I will
describe as follows: 

1) the subject being represented as a topic.

2) the assertion structure, i.e. the internal
elementary constituents (in other words, the assembly
language level) of what it takes to utter an assertion.

3) the concepts of name, occurrence, and scope, which
make the SAM model specific and can be described as
predefined assertion types.

4) the element type definitions in the syntaxes which
embody the concepts and make their properties available
using some kind of XML-based processing.

These aspects are present in the current topic maps
model as defined in current ISO 13250, in the SAM, in
the RM, in the so-called future guide, in the proposals for 
TMCL and TMQL. They should be defined in a modular way, so that
we know which are the ones which are being referred to
and implemented in certain application models and
certain applications. There is no necessity to have
them all implemented in all applications.  But at least
we need to know what is where. This will help topic map
users to choose the application they need depending on
their requirements.

The situation we have today shows contradictions that
we need to resolve.  Here when I use "we", it means "we"
as the whole topic map group, regardless of which
subgroup each of us is most interested in.

- We can't claim at the same time that the RM is the
  foundation of Topic Maps and ask that it be removed
  from the Topic Maps standard.

- We can't claim that the SAM provides for everything
  everybody would ever want to do and at the same time
  limit it to a small number of built-in assertion
  types.

This situation is potentially risky: the more layers we
add (TMCL for example), the more sophisticated and
complex the applications will become, and the more
difficult it's going to be to disentangle what's
fundamental from what's specific. We are going to
create a situation where we will only be able to
process a narrow universe because the model is
over-defined. (Remember ODA, which tried to define all
the characteristics of word processing applications,
for example, and which after a while looked quite
pretentious and ridiculous.) I don't think any of us
wants to put ourselves in a similar situation.

A standard cannot be built like a software
application. Yes, there is a need to support existing
software applications and I don't think anybody denies
that. We need to make sure that the existing software
applications that are able today to define themselves
as topic maps compliant will still be able to do so in
the future even if it requires perhaps some minor
adjustments. The difference is that a standard must be
enabling, i.e. it must accommodate some space for
applications which are not necessarily following the
models used by today's existing applications. This is,
I think, a necessary condition to ensure long-term
viability for the standard. The level of abstraction of
the standard must be higher than the level in which
applications are being designed. 

On the one hand, one of the problems I see with the SAM
as currently defined is that it's too precise in the
sense that it implies a given type of API. I do not
deny the interest of having had to do so in order to
understand what we are doing and we have to be very
grateful to the current implementers who are sharing
technical information about the way they have
implemented things. This is extremely useful but we
should use this to build the concepts in a way that
takes some distance from what has been made available.

On the other hand, the approach taken by the RM also
has its drawbacks. The RM doesn't distinguish clearly
enough between the "subject location uniqueness
objective", which is the basic principle that underlies
the very nature of what topic maps are, and the
assertion structure mechanism (all the business about
the types of nodes and arcs). For that reason, the RM
looks like a big, monolithic, entangled set of stuff
that is extremely constraining (why the hell do we need
to go through all this ordeal?), and it has the effect
that some of the software implementers see it as
something astronomically costly in terms of
guaranteeing compliance. I believe that this problem
could be overcome by separating what the reference
model brings to topic maps into several parts that need
to be integrated at completely different levels in the
forthcoming new version of the standard.

What I propose therefore is to redefine the building
blocks that will facilitate building the kinds of
applications the market is looking for. So now to
answer your question, what I am proposing is not
different from anything that has been developed both as
part of the RM and the SAM, but I propose to organize
the building blocks in a clearer, more harmonious,
better integrated way. I think this is feasible, and I
think that once it is done, it will achieve what the
contributors of the various parts are looking for. It
will have the effect of providing us with something
that actually works and will be usable over the long
term. I believe it's worth the effort, even if what I
am saying might seem painful at first sight.

That's all! Let me now try to answer your specific
questions or comments.

> I comment on specific parts below, but, really, it is not just the
> parts that are incomprehensible to me, it is first and foremost the
> whole. I suggest that you use the comments below as guidance to what I
> didn't understand, in order to better succeed in your next attempt.
> 
> (Just saying this because I think that if you start replying to my
> comments individually we'll just lose ourselves in the details without
> going anywhere.)
> 
> * Michel Biezunski
> |
> | Yes, the existing syntaxes should still be normative.

* Lars: 
> What does it mean for a syntax to be normative?

To be defined in a normative section of the standard.

* Michel:

> | I propose a modularization strategy where we clearly separate 1)
> | from 2).

* Lars:

> Isn't that what the current SAM/RM separation does? I can't work out
> how you want your 1) and 2) to be different from the RM and the SAM.

See above. I think I have answered your question.

* Michel:
> | Any topic map candidate application should be able to be represented
> | as 1), but 2) could be replaced by any model that exists or will
> | exist and can be mapped to 1). In other words, 2) would be a plug-in
> | to 1) that could be unplugged and replaced by another plug-in.
* Lars:
> What is a "topic map candidate application", who would create one, and
> why? Is it the same as a "model"?

An RDF-based navigation system, a relational database,
a book indexed within a word processor, for
example. Lots of people are already using those. Why?
Ask them! Some of them have models (databases have
schemas); some do not necessarily have an explicit
model (an index of a book, for example).

* Michel:
> | Currently, the SAM as written inherits from the confusion resulting
> | in a fuzzy separation between these 2 levels.  Which translates
> | partially in the fact that concepts are represented directly by
> | syntactical constructs. 
* Lars:
> The SAM does not express things in terms of syntactic constructs.
> There is nothing inherently syntactical in the notion of base names,
> occurrences, or scope.

Except that there is a direct (often one-to-one)
relationship between such a concept and an element
type.

* Michel:
> | We thought when we did it that this was going to make it "easy" to
> | grasp, but now I see this as a severe limitation to expansibility.

* Lars:
> What is extensibility, and what do we want it for?

It is the possibility of regarding as a TM application
some application which is not in XTM (I am not speaking
of XML namespaces here). For example, it's possible to
regard HTML documents as topic maps if the meta tag
contains information that can be used to derive base
names. If we enable that, the Web becomes a gigantic,
already existing, topic map.
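
To make the HTML example a little more concrete, here is
a minimal sketch in Python of how base names might be
derived from meta tags. Everything specific in it is an
assumption made for illustration only: the choice of
which meta names count as name-bearing, the
`html_to_topic` helper, and the dictionary shape used
for a "topic". None of it comes from ISO 13250, the SAM,
or the RM.

```python
from html.parser import HTMLParser


class MetaBaseNameExtractor(HTMLParser):
    """Collect candidate base names from <meta> tags of an HTML page."""

    # Hypothetical: which meta names are treated as carrying a name.
    NAME_BEARING_META = {"title", "dc.title", "subject"}

    def __init__(self):
        super().__init__()
        self.base_names = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None when written without a value.
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").strip()
        if name in self.NAME_BEARING_META and content:
            self.base_names.append(content)


def html_to_topic(url, html):
    """Regard one HTML document as one topic (illustrative shape only)."""
    parser = MetaBaseNameExtractor()
    parser.feed(html)
    parser.close()
    return {"subject": url, "base_names": parser.base_names}


page = (
    '<html><head>'
    '<meta name="DC.title" content="Topic Maps">'
    '<meta name="keywords" content="indexing, navigation">'
    '</head><body></body></html>'
)
topic = html_to_topic("http://example.org/tm.html", page)
```

The point of the sketch is only that the name concept is
recovered here without any XTM element type being
involved; a real mapping would have to say precisely
which meta information qualifies.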

* Michel:
> | The part that corresponds to 2) should define conceptually,
> | abstractly, what we mean for example when we speak of an occurrence
> | or of a name in this representation.
* Lars:
> The SAM already does this.

Good. The current text of ISO 13250 as well.

[...]

* Michel:
> | Because it's possible that other applications will use the same
> | notion of names, and other element types. We need a description
> | precise enough to know if the model can be assimilated or not.
* Lars:
> Of course. That's why I am talking about mapping for example XFML
> directly to the SAM.

Yes, but what about a mapping between XFML and another
model (XIL?)? There is no reason a priori why the SAM
should be at the center, especially if the applications
invent concepts which have no equivalent in the
SAM. Then what would you do?

* Michel:
> | Last remark: it might be appropriate to change the name "Standard"
> | in the expression "Standard Application Model" for a name more
> | hospitable to coexistence with others. 
* Lars:
> This takes us back to the question of what other models we envisage
> and why we would create any.

Yes. But it's not necessarily "us". Some other people
might want to create other models, will create them,
and already are creating them, whether we want it or
not.

* Michel:
> | Because if another standard organization publishes a standard that
> | we can consider Topic Maps-compliant, then there will be another
> | standard application model. 
* Lars:
> Yeah, but it won't be topic maps. When you write software you'll have
> to either implement the SAM, or not implement it. If you implement RDF
> you've implemented RDF, not topic maps. If you implement the RM you've
> implemented the RM, and not the SAM.

This is where we differ fundamentally. For me, it can
still be topic maps, at least in certain
circumstances. For example, an RDF application as such
doesn't mean anything except for the fact that it is
represented with the RDF syntax. There is for example a
calendar application using RDF. This one does not have
a direct relation with topic maps (I'm not even sure,
but let's say for now that it doesn't). Other
applications of RDF are annotation managers,
topic-based navigation systems, and graphs of
associations between subjects; these are genuine topic
map applications. They just happen to be expressed with
a different syntax, but what they are doing is exactly
the same kind of thing we are describing as topic map
applications. Why should we prevent ourselves from
regarding them as actual topic map applications?

To summarize my point, if our concepts are clearly
expressed and are properly modularized, we take a
huge step forward in the span of applicability of the
topic maps standard. I don't see any reason why we
shouldn't at least try to make it work and push the
limits further.

Michel

===================================
Michel Biezunski
Coolheads Consulting
402 85th Street #5C
Brooklyn, New York 11209
Email:mb@coolheads.com
Web  :http://www.coolheads.com
Voice: (718) 921-0901
==================================