[sc34wg3] XTM 1.1 issues

Robert Barta sc34wg3@isotopicmaps.org
Wed, 14 Dec 2005 20:15:49 +1000


On Mon, Dec 05, 2005 at 09:14:37PM +0100, Lars Marius Garshol wrote:
> An additional rationale is that the editors consider that a
> standardized interchange syntax is not something that should be
> changed only half-way. Anything that is not corrected now will hang
> around as an irritant for years to come, and so we would much rather
> consider all the options than only half of them.

I support this view, namely that of XTM being an _interchange_ syntax,
not an authoring syntax. This is a shift of focus away from the
2000-ish understanding, though, which some members will be reluctant
to follow.

I only think that XTM will have a hard time serving in a dual role, as
both an 'interchange' syntax and an 'authoring' syntax. [ see Goldoni:
Diener zweier Herren (The Servant of Two Masters) ].

> POSSIBLE FURTHER CHANGES
> 
> --- <baseName>
> 
> The <baseName> element is in contradiction with TMDM, where the same
> construct is named "topic name". In addition, <baseNameString> is not
> a good element type name.
> 
> Two proposals:
> 
>   1) <topicName>
>         <baseName>...</baseName>
> 
>   2) <topicName>
>        <instanceOf>... <scope>...
>        <value>...</value>
>        <variant>...
> 
> The editors prefer 2).

Me too, although the separate <value> element is not overly lean.

> --- Reification
> 
> The reification processing in the current XTM draft is very heavy,
> basically because reification is explicit in the model, but not in the
> syntax, so you have to wait until the entire document is read before
> joining up the reification connections.
> 
> The solution is to make it explicit in the syntax. We thought of four
> alternatives:
> 
>   1) -reifier- attribute on anything that can be reified
>      cut ID from everything, except <topic>
> 
>      bad: have to add everywhere, not consistent with general XTM 1.0
>           syntax design

This strikes me as the most elegant solution, although I am inclined
to keep the ID on topics.

"Everywhere" means only assocs, names, occurrences and variants.
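
To illustrate why an explicit attribute keeps processing light, here is
a minimal Python sketch. It assumes a hypothetical "reifier" attribute
and namespace-free element names; neither is taken from a published XTM
schema, and the plain "href" stands in for xlink:href. Each reification
link is recorded as soon as the start tag of the reified element is
seen, so nothing has to be remembered until the end of the document.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical XTM-like sample; element names and the "reifier"
# attribute follow proposal 1) and are assumptions, not spec text.
SAMPLE = b"""<topicMap>
  <topic id="t1"/>
  <topic id="r1"/>
  <association reifier="r1">
    <member><topicRef href="#t1"/></member>
  </association>
</topicMap>"""

# Constructs assumed to be reifiable in this sketch.
REIFIABLE = {"association", "topicName", "occurrence", "variant"}

def collect_reifiers(xml_bytes):
    """Map reifying topic id -> tag of the construct it reifies, in one pass."""
    reified_by = {}
    # Attributes are already available when the "start" event fires.
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("start",)):
        if elem.tag in REIFIABLE and "reifier" in elem.attrib:
            reified_by[elem.get("reifier")] = elem.tag
    return reified_by

result = collect_reifiers(SAMPLE)
print(result)
```

With alternative 4) the same link would only become known once the
reified construct itself turns up, which is the bookkeeping the
attribute avoids.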

>   4) <reifies> sub-element on <subjectIdentity>, with xlink:href
>      <reificationRef>? <reifiedRef>? <reifiesRef>? <subjectRef>?
> 
>      favourite: <reificationRef>
>      bad: can't think of a good name
>      bad: means when you process the topic you don't know what
>           it reifies, so you have to remember until you find the thing

Yes, this is very inconvenient.

> --- The -id- attributes
> 
> Every XTM 1.0 element has an -id- attribute, but only about half can
> be used within XTM 1.0 itself, and you have to read the syntax spec to
> find out which.
> 
> There are two alternatives:
> 
>   1) Lose from everything which cannot be reified
>   2) Lose from everything except topic
> 
> Which is chosen depends on how reification comes out; leaning towards
> 4) above the editors lean toward 1) here. If 1) is chosen above then
> the result would be 2) here.

I would go for 2), again under the assumption that this is INTERCHANGE
and NOT authoring.

> --- Topic references
> 
> In XTM 1.0 there are three elements which can be used to reference
> topics everywhere in the syntax. However, they are all just shorthands
> for <topicRef>s, and this is complicated to teach, and also makes
> implementation much harder.
> 
> We propose only allowing <topicRef> elements to refer to topics.

+1

> The consequence is that every topic in a topic map must have a <topic>
> element, which is not really a bad thing, although this is slightly
> awkward with fragments. 

Yes, but on the other hand, creating a topic stub for everything being
used in a fragment makes the fragment a NON-fragment, i.e. something
self-contained. And that means it can be processed without any further
context (except that of TMDM semantics, of course).

And this is A Good Thing (TM).
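
As a rough sketch of what "self-contained" buys a recipient, the Python
below checks that every <topicRef> in a document resolves to a <topic>
in the same document. The element names without namespace, the plain
"href" attribute (instead of xlink:href) and the topic ids are
assumptions made for the example only.

```python
import xml.etree.ElementTree as ET

def dangling_refs(xml_text):
    """Return the sorted ids that are referenced but have no <topic> element."""
    root = ET.fromstring(xml_text)
    topic_ids = {t.get("id") for t in root.iter("topic")}
    # Only local "#id" style references are considered in this sketch.
    refs = {r.get("href", "").lstrip("#") for r in root.iter("topicRef")}
    return sorted(refs - topic_ids)

# A fragment that refers to a topic it does not contain.
FRAGMENT = """<topicMap>
  <topic id="puccini"/>
  <association>
    <member><topicRef href="#puccini"/></member>
    <member><topicRef href="#tosca"/></member>
  </association>
</topicMap>"""

# The same fragment with a stub topic added for the missing reference.
WITH_STUB = FRAGMENT.replace('<topic id="puccini"/>',
                             '<topic id="puccini"/><topic id="tosca"/>')

missing_before = dangling_refs(FRAGMENT)
missing_after = dangling_refs(WITH_STUB)
print(missing_before, missing_after)
```

Once the stub is present the list of dangling references is empty, and
the document can be processed without any outside context.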

> This change, however, makes it difficult to write the XSLT translation
> stylesheet, though that's not really a valid argument against it. The
> translation is still very much possible, it just becomes hard to do in
> XSLT.

I thought it would get easier: fewer elements, one canonical place to
find things...

> Given this change, probably ID on topic should become required again,
> since without it it really will be impossible to refer to topics.

Yeah, I would probably keep that.

> --- Serialization of item identifiers
> 
> <topicRef> in subjectIdentity should become <itemIdentity>. The
> rationale is that the number of people on the planet who know how to
> interpret this is probably less than 10, and there seems no reason for
> the element to have this particular name here if what it is used for
> is to persist item identifiers.
> 
> The element type name should not be <itemIdentityRef>, since it is not
> a reference to anything, it *is* the identity.

Hmmm, should item identifiers really be serialized? Is this role not
completely covered with IDs on topics and the "reifier" attribute
above?

> --- Other sub-elements of <subjectIdentity>
> 
> The <resourceRef> and <subjectIndicatorRef> elements within
> <subjectIdentity> are really misnamed, as they really correspond to
> TMDM properties with completely different names.
> 
> We propose:
> 
>   resourceRef -> subjectLocator in subjectIdentity
>   subjectIndicatorRef -> subjectIdentifier in subjectIdentity

Consistent naming ++.

> --- External references
> 
> Currently, <topicRef> elements which point to external topics are
> specified to cause the external resource to be loaded. This is quite
> simply a bad idea, and causes things like:
> 
>   <topicRef xlink:href="composer"/> --> crash because no such file
> 
>   causes lots of loading of files from topicmaps.org because of
>   xml:base (and similar issues)
> 
> This was really a mistake in XTM 1.0, but we can fix it now.

Yes, it was. I have even disabled this in my software to discourage
its use. As someone who plans to run a substantial open-access TM
server, I also have nightmares of DoS attacks if it is easy to pull in
SUMO at every map upload.

Again, I think that if XTM instances can be made completely
self-contained, this is consistent with _interchange_.
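
One possible shape for such a policy, sketched in Python: classify each
reference as internal (fragment-only, like "#composer") or external,
and never dereference the external ones. The function name and the
sample hrefs are illustrative assumptions, not part of any XTM
processor's API.

```python
from urllib.parse import urlsplit

def is_internal(href):
    """True only for fragment-only references such as '#composer'."""
    parts = urlsplit(href)
    return (not parts.scheme and not parts.netloc
            and not parts.path and bool(parts.fragment))

def refs_to_skip(hrefs):
    """Return the references a cautious processor would refuse to load."""
    return [h for h in hrefs if not is_internal(h)]

# Illustrative references; "composer" (no '#') is the crash case quoted above.
HREFS = [
    "#composer",
    "composer",
    "http://www.topicmaps.org/xtm/1.0/core.xtm#psi",
]

skipped = refs_to_skip(HREFS)
print(skipped)
```

Nothing here touches the network, so an upload cannot make the server
fetch a large remote map as a side effect.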

> --- Why allow external <topicRef>s?
> 
> A further simplification would be to make <topicRef> use IDREFs
> instead of xlink:href, as this would enormously simplify processing of
> XTM files. However, this would mean you can't reference topics in
> other files that are <mergeMap>-ed in, and instead, you'll have to
> create duplicate topics with the same identity.

You mean that an XTM instance will have to have stub topics? Yet
again, perfectly ok for an _interchange_ syntax. For authoring, it
s..ks.

> --- <mergeMap>
> 
> Another question is whether the <mergeMap> element really belongs in
> an interchange syntax. The capability for merging topic maps is
> useful, but the act of doing so is really an act of authoring, and
> putting <mergeMap> in the syntax is really putting authoring features
> in the syntax.
> 
> The alternative is to say: which files get loaded depends what the
> user told the processor to load. The user can choose to load 1, 5, or
> 17 files into a single topic map.

I have criticized <mergeMap> for many years exactly for the reasons you
write, especially for interchange. Even for authoring, I am EXTREMELY
reluctant to have such an "include" feature. It is NOT the right horse
for the right course.

> --- Added themes on <mergeMap>
> 
> This is perhaps the most complicated part of XTM to implement, and
> none of the editors can recall seeing any topic map which uses it
> (except for test cases constructed to test the processing of it).  
> 
> The arguments against <mergeMap> apply with even more force against
> this feature, which is also not sufficiently powerful to really be
> useful, since the original intention was that added themes would allow
> users to track the source of statements. However, to be useful such a
> facility has to be able to handle updates to both the remote and local
> topic maps, and XTM just doesn't do this, simply because it's not
> something an interchange syntax should get involved with.
> 
> The editors propose that whether <mergeMap> stays or goes, the added
> themes should definitely go.

+1

It is very expensive to process, especially in virtualized
environments.

> --- XLink
...
> Given that XLink support adds no value, but considerable cost, we
> think this is an easy decision to make.

No opinion on that.

> --- xml:base
> 
> This is, quite simply, a problem, and not the solution to anything.
> The presence of this attribute causes us to get into considerable
> difficulties over how to interpret -id- attributes, the processing of
> "#foo" URIs, and it often causes problems with processing files whose
> references appear to be local, but actually do point out to the
> network.

RDF/XML has the same problem of abusing a per-document feature (which
is what xml:base is) as a way to specify an address space. I think the
semantics of it are ok so far.

> The editors can think of no useful use case supported by the presence
> of this attribute, and so want to remove it.

If

  - topicRefs only point to topics and are IDREF
  - and consequently EVERYTHING can be interpreted 'relative' (inside
    the document),

then you might be able to remove it. But this has to be thought
through.
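
To make the "#foo" difficulty concrete, a small Python sketch, assuming
a processor that resolves references naively against whatever base is
in effect. Both URIs are made-up examples.

```python
from urllib.parse import urljoin

# Hypothetical values for illustration only.
DOCUMENT_URI = "http://example.org/maps/opera.xtm"  # where the file lives
XML_BASE = "http://www.topicmaps.org/xtm/1.0/"      # an xml:base in the file

def resolve(href, base):
    """Resolve a reference against a base URI (RFC 3986 style)."""
    return urljoin(base, href)

# The same "#foo" ends up in two different address spaces.
without_base = resolve("#foo", DOCUMENT_URI)
with_base = resolve("#foo", XML_BASE)
print(without_base)
print(with_base)
```

With the xml:base present, a reference that looks local no longer
points into the document itself. If topicRefs were plain IDREFs there
would be nothing to resolve, which is the condition stated above.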

> --- <topicMap> content model
> 
> Should we require <association>s to follow after the <topic>s? It
> seems tidier, however, it is more restrictive on software generating
> XTM (have to do all topics before you can do your first association).
> On consideration we reject this proposal.

No, no ordering please.

> A related issue is whether all <mergeMap> elements (if they stay)
> should be required to come before any <topic> and <association>
> elements. When added themes were present this would have simplified
> processing quite considerably. Now that they are gone it is probably
> less of an issue. (This occurred to me (LMG) after the meeting, so the
> editors have not discussed this.)

No, no ordering please.

> --- Naming convention
> 
> The editors discussed leaving badCamelCase, of which neither is fond,
> but decided against it.

There are no badCamels and no GoodCamels (only the camel on the Perl
book is a good camel) :-)

> --- <subjectIdentity>
> 
> Why have this element? It's just a wrapper, and is not needed for
> anything. XTM 1.0 would look the same if it was removed. The elements
> currently inside it (after the changes above) look much better outside
> than inside.
> 
> Overall, we favour this.

Yes, it is unnecessary.

> --- General item identity
> 
> Should the <itemIdentity> element be allowed for all topic map
> constructs? 
> 
>   this would allow the item identities on all constructs to be
>     preserved
>   however, not sure we want to do that
>   if not, we can just lose [item identifiers] from TMDM (except on
>     topic)

Are item identifiers in an XTM instance necessary to reconstruct the
TMDM instance on the recipient side without violating TMDM semantics?
I do not think so, so why have it travelling in XTM?

\rho

PS: Your cleanup efforts are MUCH appreciated.