[sc34wg3] CXTM Issue: Values

Lars Marius Garshol larsga at garshol.priv.no
Wed Apr 23 07:14:50 EDT 2008

* Lars Heuer
> The current CXTM draft <http://www.isotopicmaps.org/cxtm/2008-04-14/>
> is silent about the canonical representation of values.
> From my understanding of the current draft, this may lead to
> different canonical representations of the same topic map.

Not quite. It means the canonical representation of the value is the  
original string value that the engine was given, whether that was in  
CTM or XTM or ...

> Example: xsd:boolean has four valid lexical representations: "true",
> "false", "0", "1"; CXTM does not enforce that the canonical
> representation ("true" / "false") of xsd:boolean is used; it would be
> legal to use "0" or "1" in the CXTM output.
> Shouldn't it be made explicit that the canonical form of the value
> according to the datatype should be used in the CXTM output?

This is a quite deep issue, actually. TMDM just kept quiet about this,  
without getting too deeply into datatypes, but now with TMQL and CXTM  
that will have to change.

The stack of specs as it stands says you must keep the string  
representation of the value, simply because nothing else is said
anywhere. The places we could change this are the TMDM->TMRM mapping  
(which actually would say something else, strictly speaking) and CXTM.  
These two documents have to be in sync on this point.

The trouble with the current state of affairs is that it prevents  
engines from throwing away the inefficient string representation of  
the literal and representing the value directly. Ideally engines  
should be free to be efficient here.
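
To make the trade-off concrete, here is a minimal Python sketch. The class names are hypothetical and not taken from any spec or engine. It only illustrates that an engine which stores the value natively can no longer reproduce the original lexical form, which is what the specs currently require.

```python
from dataclasses import dataclass

XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean"


@dataclass(frozen=True)
class StringBackedLiteral:
    """Hypothetical literal that keeps the string exactly as it was parsed."""
    lexical: str
    datatype: str

    def serialize(self) -> str:
        # The original lexical form survives the round trip.
        return self.lexical


@dataclass(frozen=True)
class NativeBoolean:
    """Hypothetical literal that stores only the native value."""
    value: bool

    @classmethod
    def parse(cls, lexical: str) -> "NativeBoolean":
        if lexical in ("true", "1"):
            return cls(True)
        if lexical in ("false", "0"):
            return cls(False)
        raise ValueError(f"not a valid xsd:boolean: {lexical!r}")

    def serialize(self) -> str:
        # Whether the input was "1" or "true" is no longer known.
        return "true" if self.value else "false"
```

With the input "1", the first class writes "1" back out and the second writes "true", so two conforming engines could only agree if the output form were pinned down somewhere.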

Some things we could do:

   (1) Leave the current specs as they are, and just make sure to not
       construct tests which "invite" trouble by using non-canonical
       input values.

   (2) Leave the current specs as they are, and make the test system
       smarter, so that we can specify alternative canonicalizations
       of the same input.

   (3) Put canonicalization of values into CXTM. If we do this, it has
       to be consistent with TMQL.
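
As a rough illustration of what option (3) might involve, the following Python sketch maps a value to the canonical lexical form of its datatype before it is written out. The function name and the choice of datatypes are my assumptions and nothing in the CXTM draft prescribes them. Only xsd:boolean and xsd:integer are handled, and every other datatype passes through unchanged.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# The four valid xsd:boolean lexical forms and their canonical forms.
_BOOLEAN_CANONICAL = {
    "true": "true",
    "1": "true",
    "false": "false",
    "0": "false",
}

_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def canonical_value(value: str, datatype: str) -> str:
    """Hypothetical helper: canonical lexical form for known datatypes."""
    if datatype == XSD + "boolean":
        lexical = value.strip(" \t\r\n")
        if lexical not in _BOOLEAN_CANONICAL:
            raise ValueError(f"not a valid xsd:boolean: {value!r}")
        return _BOOLEAN_CANONICAL[lexical]
    if datatype == XSD + "integer":
        lexical = value.strip(" \t\r\n")
        if not _INTEGER_LEXICAL.fullmatch(lexical):
            raise ValueError(f"not a valid xsd:integer: {value!r}")
        # int() drops a leading "+" and leading zeros, and turns "-0" into 0.
        return str(int(lexical))
    # Unknown datatype: keep the string as given.
    return value
```

Under this sketch "1" and "true" both serialize as "true", so the output no longer depends on which lexical form the source document used.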

This leaves open the question of how to handle this in TMQL, which has  
to be consistent, and where I'm not yet 100% certain of how it will  
pan out. Going into that is a bit much for me right now.

--Lars M.
