[sc34wg3] Analysis of TMRM Use Cases

12 Apr 2004 12:39:06 -0400

As you already know, I unfortunately can't make it to the
next meeting. Here is my contribution to the current discussions:

Steve Pepper wrote:

> I certainly see the attraction of having a "foundational" model
> that has associations but not basenames or occurrences... At least
> I see how that could be intellectually satisfying.
> 
> But I'm not sure I see what use we would have for it. And without
> understanding what its purpose is, there is no way we can answer
> important questions like:
> 
> * What happens to identifiers in such a model?
> 
> * Will such a model require us to be able to show how the same 
>   information can be represented using associations instead of 
>   base names and occurrences? (If so, are we sure that we will
>   not open a can of worms in doing so?)
> 
> * To what extent are templating constructs required as part of
>   the model?
> 
> * How far do we need to go in terms of defining the fundamental
>   rules that apply to the model, such as (in simplified terms)
>   "can a topic be both a topic type and an association role
>   type"?
> 
> Without answers to questions like these, we don't know enough
> about the kind of model we need and so we can neither evaluate
> current proposals or create new ones.

I believe that the purpose of the Topic Maps standard is to
interchange information, not processing. And we are not interchanging
just pieces of information declared as element instances (XML is
sufficient for that); we are describing the connections between
pieces of information. And what is extremely important, and what
makes Topic Maps a standard, is that these connections are described
independently of:
- any processing that has been performed to bring them to their
  current state,
- any processing that can or will be performed on them.

Now, from this perspective, let me try to address the issues you
raise.

> * What happens to identifiers in such a model?

What matters here (at the interchange stage) is the guarantee that
subjects have been identified. A subject is not equivalent
to an address (simply because the address itself may be a subject).
The XML ID used can itself be a subject. So what is important
is the disclosure of what the subjects are, and how they are
exposed. This is really the fundamental issue: when you
interchange an existing topic map (i.e. you publish it for use
by others), you want to make sure that the essence of the topic
map is preserved. By "essence of the topic map" I mean the work
which has been put into differentiating any subject from
any other. This perspective has the consequence that merging
rules should be expressed externally to the interchangeable subject
map. Whether or not the recipient is going to apply them
is more or less out of the picture.
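
To make the separation concrete, here is a minimal sketch in Python.
It is only an illustration under my own assumptions: the names Topic,
same_subject and merge are hypothetical and do not come from XTM, the
TMDM or the TMRM. The interchanged map only discloses subjects; the
merging rule lives outside it, and the recipient chooses whether to
apply it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    # Hypothetical structure: a local id plus the disclosed subject identifiers.
    local_id: str
    subject_identifiers: frozenset


def same_subject(a, b):
    """One possible external merging rule: the topics share an identifier."""
    return bool(a.subject_identifiers & b.subject_identifiers)


def merge(topics, rule):
    """Group topics that the supplied rule treats as the same subject."""
    groups = []
    for topic in topics:
        # Find every existing group that the rule connects to this topic.
        matching = [g for g in groups if any(rule(topic, other) for other in g)]
        merged = [topic]
        for g in matching:
            merged.extend(g)
        # Drop the absorbed groups (compared by identity) and keep the merged one.
        groups = [g for g in groups if not any(g is m for m in matching)]
        groups.append(merged)
    return groups


# The interchanged map: three topics, nothing merged, subjects disclosed.
interchanged = [
    Topic("t1", frozenset({"http://example.org/subject/x"})),
    Topic("t2", frozenset({"http://example.org/subject/x",
                           "http://example.org/subject/y"})),
    Topic("t3", frozenset({"http://example.org/subject/z"})),
]
```

A recipient who calls merge(interchanged, same_subject) gets two
groups; a recipient who applies no rule still has the three topics
exactly as published.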

> * Will such a model require us to be able to show how the same
>   information can be represented using associations instead of 
>   base names and occurrences? (If so, are we sure that we will
>   not open a can of worms in doing so?)

The actual model created by the designers of a topic
map need not be constrained. What matters
is that pieces of information get connected. The very nature
of the connections can be as diverse as we want. If some people
want to have a more precise level of interchange and want to
standardize their maps for example on an industry-wide level, fine.
But that should be left open, and there certainly is no constraint
on how information should be represented. Your point shows that
even using strict XTM-based syntax, there is already an infinity
of possible solutions to represent the same things. So there is
no harm in opening it to other, new schemas for doing topic maps.
Since topic maps is a modeling language for knowledge (given that
knowledge is considered as connections between information items)
there is no way to constrain knowledge representations. Trying to
limit that would not lead to anything that could have any usability
value outside of a very limited circle of users who basically all
know each other. That's not what the topic maps standard is trying to
achieve. We need to position the standard as being able to
encompass any representation of connected knowledge items. Whether
or not we like what people have designed should not be our problem.

> * To what extent are templating constructs required as part of
>   the model?

I don't understand what you mean by "templating constructs".

Personally, I see XTM itself (or rather the TMDM model) as a
set of templating constructs, as valid as any other one, only
different because there has been more work on it up to now.
It's like the difference between custom-made and ready-made. It all
depends on how sophisticated and specific the user requirements
are.

> * How far do we need to go in terms of defining the fundamental
>   rules that apply to the model, such as (in simplified terms)
>   "can a topic be both a topic type and an association role
>   type"?

We should define the rules in terms of disclosing what subjects
are, i.e. what gets reified and what has to be taken for granted.
The rules used for merging topics can be disclosed, not so much
because we let others do the merging instead of providing a
merged representation (I didn't have time to do it, so I let
you do this instead!), but because we want to give them the
ability to extend the topic map with their own information
using the same techniques we used. This is a matter of disclosure.

It is important that we are able to do so because we want topic
maps users to *trust* the standard in its ability to describe
what they want to accomplish, with no hidden issues in
the background. We want to avoid having to tell them: by the
way, by accepting this model you have implicitly agreed that
the information you were representing was so and so. In other
words, we don't want fine print to be hidden in the contract
that a user commits to when he/she signs up for a topic map
compliant application. My answer goes beyond your question.
To your question I would simply answer: it doesn't matter.

In one of the topic map applications I am dealing
with, a different topic type appears when there is a double
dash in one of the base names for a topic. And this new
topic type overrides the current topic type which has been
set by another process. I don't think there is any point
in standardizing this rule, simply because there
is an infinite number of specific cases, exceptions, etc., that
need to be addressed. But what's important is that the actual
topic type which is in use is disclosed.
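
As a rough illustration of that rule (a sketch only; the function
name, the type names and the "reason" strings are hypothetical and
not taken from the real application), the local rule stays private
to the application while its outcome is what gets disclosed:

```python
def effective_topic_type(base_names, assigned_type,
                         override_type="double-dash-type"):
    """Return (topic type in use, reason for it).

    Application-specific rule, not part of any standard: a double dash
    in any base name overrides the type set by another process.
    """
    if any("--" in name for name in base_names):
        return override_type, "local rule: double dash in a base name"
    return assigned_type, "set by another process"


# What would be interchanged is the first element: the type actually in use.
overridden = effective_topic_type(["Paris -- city"], "place")
unchanged = effective_topic_type(["Paris"], "place")
```

Here overridden[0] is "double-dash-type" and unchanged[0] is "place".
The recipient sees which type is in use without needing to know, or
agree with, the rule that produced it.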

> Again, there is a certain intellectual satisfaction in this,
> but it begs further important questions:

I disagree with your handling of the concept of "intellectual
satisfaction", which implies that this is a kind of secondary
goal. The real issue here is how much users can trust the
standard to reflect their requirements.

There is a tension that I know quite well (I am there too)
between users and implementers. I consider the topic map standard
to be useful as purely declarative. All the processing that is
being performed can be standardized or not, but it's not
at the same level. To draw a parallel with the XML family
of standards, I see the standard as the equivalent of XML
(mostly declarative and wide open, with some validation/well-formedness
rules that need to be explicit), with all kinds of satellite
applications (Styles, Queries, etc.) that can optionally be
applied without having any influence on whether the
application is XML or not. We don't want the Topic
Maps standard to go in a direction where a single application
has plenty of features, some of them extremely constraining,
and others have nothing and are excluded from being considered
topic maps. That's my real concern, and this is why I believe
that the potential brought by what is now called TMRM gives
us a direction for avoiding this trap.

> * What other "instantiations" already exist or might exist? 

As Jim said, relational databases are an example of the
information sets that can -- and will -- be regarded as
topic maps. Any existing metadata schema (be it in XML or
not) can be regarded as a topic map. Some Semantic Web
applications are already topic maps. Google search results
are a topic map. Etc.

> * Do we want to call those "instantiations" Topic Maps?

Yes. Because the usability is in the information content
they provide, not in the internal encoding syntax they have been
designed in.

> * If so, to what extent does it serve or damage the interests
>   of Topic Maps users for there to be multiple models, all of
>   which can legitimately be called Topic Maps?

Topic Maps users want their information to be connectable
with that of others. Otherwise, what would they be doing topic maps for?

> * To what extent should multiple "instantiations" of the
>   generic model be interoperable?

It depends on what you call interoperable. I think we should be
realistic here and limit interoperability to identifying what
the subjects are. This is where it's important to distinguish
the declarative perspective from the procedural. We can't be
doing both if we want to enable diversity. We need to focus
on the declarative parts: "Expose your knowledge" would be
the slogan, not "Here is what to do about it".

> I have never said that the idea is not worth pursuing. I'm not
> sure anyone has (except possibly in a moment of extreme frustration).
> It's a question of *how* we pursue the RM, and whether we let that
> pursuit delay other parts of the standard.

A standard is the result of consensus reached through committee work.
Building the consensus is not a waste of time. We should
avoid building things that do not have consensus.
That's where delay comes from.

> We have to agree on "industry requirements" before we can proceed.
> The use cases were intended to help in that process. Lars Marius
> and I did our best to extract a real industry requirement from those
> use cases and then demonstrated that that requirement could be
> satisfied using the TMDM and TMQL. SRN and Patrick have since
> claimed that we missed the point and have made another effort to
> get that point across. I am still trying to get my head around
> their latest contribution.

Good.

> So, yes, I'm all for an intellectually satisfying generic model,
> provided we first agree on its exact purpose in terms that make
> it possible to standardize something that people actually have a
> use for.

Agreed.

> But this goes way beyond a mere restatement of 13250, which is
> what we are supposed to be doing at the moment, 

Yes.

> and it is very
> definitely *not* as urgent as TMQL and TMCL.

Unless we decide that TMQL and TMCL are to be considered "satellite
standards", as XQuery, XSL, etc. are for XML. If we
decide to go along that line, then the work on those would
not be delayed. It would be performed in parallel.

-- 
==================================
Michel Biezunski
Coolheads Consulting
402 85th Street #5C
Brooklyn NY 11209
Email: mb@coolheads.com
Web  : http://www.coolheads.com
Voice: (718) 921-0901
==================================