[sc34wg3] XTM 1.1 issues

Lars Marius Garshol sc34wg3@isotopicmaps.org
Mon, 05 Dec 2005 21:14:37 +0100


The editors have just had a meeting to decide what changes to make to
XTM 1.1, and what not to do. The meeting was quite effective and
successful, but a large number of issues came up, and the editors did
not really feel comfortable pushing it all into a draft without
conferring with the committee first. The problem, basically, is that
the combined weight of the changes is quite heavy, and some of them
start touching the model.

The underlying rationale for all the changes is the same: to make XTM
1.1 easier to process than XTM 1.0 is, and to bring it closer to TMDM
terminology and structure.

An additional rationale is that the editors consider that a
standardized interchange syntax is not something that should be
changed only half-way. Anything that is not corrected now will hang
around as an irritant for years to come, and so we would much rather
consider all the options than only half of them.

However, the full weight of all of these options is quite substantial,
and some of the alternatives are difficult to fully reason through.
What the editors are looking for is not really detailed discussion of
the various issues (which could go on for years), but more general
reactions to the overall direction they are taking.

The editors still think they can have a draft ready on December 20,
although right now the task does look a bit bigger than anticipated.


ATLANTA DECISIONS

These are the decisions that were made in Atlanta:

  parameters -> scope
  member -> role
  roleSpec -> instanceOf
  reverse the order of subjectIdentity and instanceOf in the content
    model for topic
  remove the variantName wrapper element
  require instanceOf on association, occurrence, association role, and
    topic name elements
  restrict role elements to containing a single (required) role player
  remove nesting of variants

(See <URL: http://www.jtc1sc34.org/repository/0676.htm >.)
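
To make the pure renames above concrete, here is a rough Python sketch
that applies them to a parsed fragment. It covers only the three
element renames; the content-model changes are not attempted. The
sample fragment is made up for illustration, and namespaces and XLink
prefixes are left out for brevity (an assumption, not what real XTM
files look like).

```python
import xml.etree.ElementTree as ET

# The three pure renames from the Atlanta list (old name -> new name).
RENAMES = {"parameters": "scope", "member": "role", "roleSpec": "instanceOf"}

def apply_renames(root):
    """Rename elements in place and return the root element."""
    for elem in root.iter():
        elem.tag = RENAMES.get(elem.tag, elem.tag)
    return root

# Hypothetical, namespace-free fragment in the old vocabulary.
SAMPLE = """<association>
  <member>
    <roleSpec><topicRef href="#composer"/></roleSpec>
    <topicRef href="#puccini"/>
  </member>
</association>"""

root = apply_renames(ET.fromstring(SAMPLE))
tags = [e.tag for e in root.iter()]
```

After the call, tags holds the element names in document order, with
member and roleSpec replaced by role and instanceOf.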

POSSIBLE FURTHER CHANGES

--- <baseName>

The <baseName> element is in contradiction with TMDM, where the same
construct is named "topic name". In addition, <baseNameString> is not
a good element type name.

Two proposals:

  1) <topicName>
        <baseName>...</baseName>

  2) <topicName>
       <instanceOf>... <scope>...
       <value>...</value>
       <variant>...

The editors prefer 2).

--- Reification

The reification processing in the current XTM draft is very heavy,
basically because reification is explicit in the model, but not in the
syntax, so you have to wait until the entire document is read before
joining up the reification connections.

The solution is to make it explicit in the syntax. We thought of four
alternatives:

  1) -reifier- attribute on anything that can be reified
     cut ID from everything, except <topic>

     bad: have to add everywhere, not consistent with general XTM 1.0
          syntax design

  2) <reifier> sub-element on anything that can be reified with
     <topicRef> inside

      bad: have to add everywhere
           very intrusive
           becomes very ugly indeed inside <topicMap>

  3) -reifies- attribute on <topic>

     bad: not consistent with general XTM 1.0 syntax design

  4) <reifies> sub-element on <subjectIdentity>, with xlink:href
     <reificationRef>? <reifiedRef>? <reifiesRef>? <subjectRef>?

     favourite: <reificationRef>
     bad: can't think of a good name
     bad: means when you process the topic you don't know what
          it reifies, so you have to remember until you find the thing

The editors haven't really landed here, but are proceeding with 4) as
a working hypothesis for the moment. However, suggestions further down
imply that 1) might be made to work better.
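
The cost described at the top of this section can be sketched in
Python as follows. The sketch assumes that reification is expressed by
a subject indicator reference whose fragment matches the ID of some
other element, looks only at top-level children, and uses a plain href
attribute without namespaces; all of that is simplification for
illustration. The point is just that the join cannot happen until the
whole document has been read.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free sample: the reifying topic comes first,
# the association it reifies comes last.
SAMPLE = """<topicMap>
  <topic id="t-reifier">
    <subjectIdentity><subjectIndicatorRef href="#assoc1"/></subjectIdentity>
  </topic>
  <topic id="t-other">
    <subjectIdentity><subjectIndicatorRef href="http://example.org/psi/other"/></subjectIdentity>
  </topic>
  <association id="assoc1"/>
</topicMap>"""

def join_reifiers(xml_text):
    """Return {topic id: (target id, target element name)} for reifying topics."""
    root = ET.fromstring(xml_text)
    by_id = {}     # IDs of non-topic elements seen so far
    pending = []   # (topic id, fragment) pairs that cannot be resolved yet
    for elem in root:
        if elem.tag == "topic":
            for ref in elem.iter("subjectIndicatorRef"):
                href = ref.get("href", "")
                if href.startswith("#"):
                    pending.append((elem.get("id"), href[1:]))
        elif elem.get("id") is not None:
            by_id[elem.get("id")] = elem.tag
    # Only here, after the last element, can the connections be joined up.
    return {topic: (target, by_id[target])
            for topic, target in pending if target in by_id}

reified = join_reifiers(SAMPLE)
```

With an explicit reifier in the syntax, the pending list would shrink
or disappear, depending on which alternative is chosen.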

--- The -id- attributes

Every XTM 1.0 element has an -id- attribute, but only about half can
be used within XTM 1.0 itself, and you have to read the syntax spec to
find out which.

There are two alternatives:

  1) Lose from everything which cannot be reified
  2) Lose from everything except topic

Which is chosen depends on how reification comes out. Since the
editors are leaning towards 4) above, they lean towards 1) here. If 1)
is chosen above, the result would be 2) here.

--- Topic references

In XTM 1.0 there are three elements which can be used to reference
topics everywhere in the syntax. However, they are all just shorthands
for <topicRef>s, and this is complicated to teach, and also makes
implementation much harder.

We propose only allowing <topicRef> elements to refer to topics.

The consequence is that every topic in a topic map must have a <topic>
element, which is not really a bad thing, although this is slightly
awkward with fragments. 
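
One way to picture the consequence: a processor (or a validator) could
check that every local <topicRef> has a matching <topic> element. The
Python sketch below does this for a made-up, namespace-free fragment
using a plain href attribute; both are assumptions for illustration,
not proposed syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "tosca" is referenced but has no <topic> element.
SAMPLE = """<topicMap>
  <topic id="puccini"/>
  <association>
    <role><topicRef href="#puccini"/></role>
    <role><topicRef href="#tosca"/></role>
  </association>
</topicMap>"""

def dangling_refs(xml_text):
    """Return the sorted IDs that local topicRefs point at without a <topic>."""
    root = ET.fromstring(xml_text)
    topic_ids = {t.get("id") for t in root.iter("topic")}
    hrefs = [r.get("href", "") for r in root.iter("topicRef")]
    return sorted(h[1:] for h in hrefs
                  if h.startswith("#") and h[1:] not in topic_ids)

missing = dangling_refs(SAMPLE)
```

A fragment like the one above is exactly the awkward case: it would
need an (empty) <topic> element for "tosca" to be complete.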

A related problem is that XTM 1.0 has the three reference element
types reappearing inside <subjectIdentity> with different semantics,
which is not very clean. This change avoids that problem. (The
<resourceRef> element could be said to be used in a third sense inside
<occurrence> and <variantName>.)

This change, however, makes it difficult to write the XSLT translation
stylesheet, though that's not really a valid argument against it. The
translation is still very much possible, it just becomes hard to do in
XSLT.

Given this change, probably ID on topic should become required again,
since without it it really will be impossible to refer to topics.

--- Serialization of item identifiers

<topicRef> in subjectIdentity should become <itemIdentity>. The
rationale is that the number of people on the planet who know how to
interpret this is probably fewer than 10, and there seems no reason for
the element to have this particular name here if what it is used for
is to persist item identifiers.

The element type name should not be <itemIdentityRef>, since it is not
a reference to anything, it *is* the identity.

--- Other sub-elements of <subjectIdentity>

The <resourceRef> and <subjectIndicatorRef> elements within
<subjectIdentity> are misnamed, as they really correspond to
TMDM properties with completely different names.

We propose:

  resourceRef -> subjectLocator in subjectIdentity
  subjectIndicatorRef -> subjectIdentifier in subjectIdentity

--- External references

Currently, <topicRef> elements which point to external topics are
specified to cause the external resource to be loaded. This is quite
simply a bad idea, and causes things like:

  <topicRef xlink:href="composer"/> --> crash because no such file

  causes lots of loading of files from topicmaps.org because of
  xml:base (and similar issues)

This was really a mistake in XTM 1.0, but we can fix it now. There is
no loss of expressive power, since if you want to load in the external
resource you can just use <mergeMap>. (In fact, when external
references cause loading there's little or no point in <mergeMap>.)
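
The first failure above is easy to reproduce with ordinary URI
resolution. In the Python sketch below, the base URI and the helper
names are made up for illustration; the idea is that a processor can
tell same-document references from external ones without fetching
anything.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical location of the topic map being read.
BASE = "http://example.org/maps/opera.xtm"

def resolve(base, href):
    """Resolve a (possibly relative) reference against the base URI."""
    return urljoin(base, href)

def is_same_document(base, href):
    """True when the reference differs from the base at most in its fragment."""
    return urldefrag(resolve(base, href)).url == urldefrag(base).url

forgot_hash = resolve(BASE, "composer")    # author meant "#composer"
with_hash = resolve(BASE, "#composer")
```

Here forgot_hash names a separate resource next to the topic map,
which a loading processor would try to fetch, while with_hash stays
inside the document. Under the proposal an external reference would
simply be recorded, and loading would be left to <mergeMap>.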

--- Why allow external <topicRef>s?

A further simplification would be to make <topicRef> use IDREFs
instead of xlink:href, as this would enormously simplify processing of
XTM files. However, this would mean you can't reference topics in
other files that are <mergeMap>-ed in, and instead, you'll have to
create duplicate topics with the same identity.

We didn't conclude on this.

--- <mergeMap>

Another question is whether the <mergeMap> element really belongs in
an interchange syntax. The capability for merging topic maps is
useful, but the act of doing so is really an act of authoring, and
putting <mergeMap> in the syntax is really putting authoring features
in the syntax.

The alternative is to say: which files get loaded depends on what the
user told the processor to load. The user can choose to load 1, 5, or
17 files into a single topic map.

--- Added themes on <mergeMap>

This is perhaps the most complicated part of XTM to implement, and
none of the editors can recall seeing any topic map which uses it
(except for test cases constructed to test the processing of it).  

The arguments against <mergeMap> apply with even more force against
this feature, which is also not sufficiently powerful to really be
useful, since the original intention was that added themes would allow
users to track the source of statements. However, to be useful such a
facility has to be able to handle updates to both the remote and local
topic maps, and XTM just doesn't do this, simply because it's not
something an interchange syntax should get involved with.

The editors propose that whether <mergeMap> stays or goes, the added
themes should definitely go.

--- XLink

The editors propose that XTM 1.1 not use XLink. The original arguments
in favour of using XLink were:

  a) XLink processors can at least make some sense of XTM files
  b) Using XLink is politically shrewd

It is the opinion of the editors (now that they are older and wiser)
that a) never was true, and that b) probably isn't any more, either.

The consequences of using XLink are:

  every element which has xlink:href must also have xlink:type
    if xlink:type is omitted the XLink is no longer valid
    every topic map omits the xlink:type
    i.e. no actual topic map gets the benefits of xlink:type unless it
      is processed with a DTD

  quite a bit of extra text and references in the standards document itself

  dependency issues like whether XLink supports IRIs or not (it does,
    but this is a general problem)

  interpretation of references in XTM depends on XLink
    these references are quite complex, being relative URIs
    to find out how to process them you need to read XLink

Given that XLink support adds no value, but considerable cost, we
think this is an easy decision to make.
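
The xlink:type point can be seen with any parser that is not given a
DTD. The Python sketch below parses a made-up <topicRef> the way topic
maps are typically written, with xlink:href only, and looks for
xlink:type. The sample element omits the XTM namespace for brevity.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical reference as authors usually write it: no xlink:type.
SAMPLE = ('<topicRef xmlns:xlink="http://www.w3.org/1999/xlink" '
          'xlink:href="#composer"/>')

elem = ET.fromstring(SAMPLE)
href = elem.get("{%s}href" % XLINK_NS)        # the reference itself
xlink_type = elem.get("{%s}type" % XLINK_NS)  # absent: no DTD supplied it
```

Since no DTD is involved, xlink_type comes back as None, so an XLink
processor would have nothing to go on here.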

--- xml:base

This is, quite simply, a problem, and not the solution to anything.
The presence of this attribute causes us to get into considerable
difficulties over how to interpret -id- attributes, the processing of
"#foo" URIs, and it often causes problems with processing files whose
references appear to be local, but actually do point out to the
network.

The editors can think of no useful use case supported by the presence
of this attribute, and so want to remove it.
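
A small Python sketch of the "#foo" problem, assuming references are
resolved against the nearest xml:base in scope. The document URI, the
sample content and the plain href attribute are made up for
illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical topic map whose references look local.
SAMPLE = """<topicMap xml:base="http://www.topicmaps.org/xtm/1.0/core.xtm">
  <topic id="foo"/>
  <topicRef href="#foo"/>
</topicMap>"""

def resolved_refs(xml_text, document_uri):
    """Resolve every topicRef against the base URI in scope at that element."""
    out = []

    def walk(elem, base):
        # An xml:base attribute replaces (or refines) the inherited base.
        base = urljoin(base, elem.get(XML_BASE, ""))
        if elem.tag == "topicRef":
            out.append(urljoin(base, elem.get("href", "")))
        for child in elem:
            walk(child, base)

    walk(ET.fromstring(xml_text), document_uri)
    return out

DOC = "http://example.org/maps/opera.xtm"
with_base = resolved_refs(SAMPLE, DOC)
without_base = resolved_refs(
    SAMPLE.replace(' xml:base="http://www.topicmaps.org/xtm/1.0/core.xtm"', ""),
    DOC)
```

With the attribute present, "#foo" ends up pointing at a file on
topicmaps.org rather than at the <topic> two lines above it; without
it, the same reference stays in the document being read.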

--- <topicMap> content model

Should we require <association>s to follow after the <topic>s? It
seems tidier, but it is more restrictive on software generating
XTM (you have to do all topics before you can do your first
association). On consideration we reject this proposal.

A related issue is whether all <mergeMap> elements (if they stay)
should be required to come before any <topic> and <association>
elements. When added themes were present this would have simplified
processing quite considerably. If they go, as proposed, it is probably
less of an issue. (This occurred to me (LMG) after the meeting, so the
editors have not discussed this.)

--- Naming convention

The editors discussed leaving badCamelCase, of which neither is fond,
but decided against it.

--- <subjectIdentity>

Why have this element? It's just a wrapper, and is not needed for
anything. XTM 1.0 would look the same if it was removed. The elements
currently inside it (after the changes above) look much better outside
than inside.

Overall, we favour this.

--- General item identity

Should the <itemIdentity> element be allowed for all topic map
constructs? 

  this would allow the item identities on all constructs to be
    preserved
  however, not sure we want to do that
  if not, we can just lose [item identifiers] from TMDM (except on
    topic)

The editors discussed this at length without really reaching a
conclusion, and decided to continue thinking about it.


OTHER THOUGHTS

We probably have to have an iso:topic-name PSI to serve as the default
name type. This is needed for the XTM 1.0 -> 1.1 stylesheet, but also
seems like a general requirement. XTM 1.0, unfortunately, does not
provide a PSI for this in core.xtm.

The same applies to association-role.

Both should be defined in TMDM, rather than in XTM 1.1.

-- 
Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50                  <URL: http://www.garshol.priv.no >