[sc34wg3] Feedback on the CTM draft

Steve Pepper pepper at ontopia.net
Mon Aug 7 13:51:40 EDT 2006

Thank you for your useful feedback on CTM. Our responses to your comments
are below.

We would very much appreciate more feedback as soon as possible, especially
from WG3 members who will not be attending the Montreal meeting.

Gabriel Hopmans
Lars Heuer
Sam Oh
Steve Pepper

* Lars Marius Garshol
| The draft is somewhat tricky to evaluate, since it's not really a 
| specification, but more like a tutorial. It's also a rather 
| superficial tutorial, unfortunately.

This is exactly true. But for a first draft we felt most people would need
something more like a tutorial, and that it would be a waste of time to make
the text read like a specification until the general design has stabilized.
It's the general design we really need feedback on at this stage.

| I would particularly have liked
| to see the BNF included together with the examples, and the text 
| centered around the BNF, with examples as support. (The XML 
| Recommendation sets a good example in this regard, BTW.)

Good suggestion. We will follow it in the next draft.

| --- 1
| It's not true that CTM is defined through a mapping to TMDM. This is 
| what the spec should do when we have one, but this isn't true yet.

This sentence is borrowed from the XTM spec and signifies our intent: it
will be true once Annex B has been written. That, in turn, has to wait for
the grammar to stabilize a little more.

| --- 3
| The EBNF in annex A is not the ISO EBNF, which is very different. The 
| EBNF used in annex is that from the XML Recommendation. I would keep 
| the EBNF as it is and just reference the XML Rec instead. This is what 
| TMQL does.

We would like more input on this. In general, ISO standards are obliged to
reuse other ISO standards when appropriate ones exist. The question is: are
there acceptable reasons for NOT using ISO 14977?

| Why is there a normative reference to XTM? I didn't find any such 
| references in the text.

There aren't, but there might be in the future.

| --- 4
| I would cut this section. It serves no purpose.

You are probably right. At the moment it doesn't look as if we need to
define any new terms.

| --- 5.1.1
| TMQL uses only the # comment syntax. Do we really need the /* ... */ 
| comments? Personally, I'd prefer to see only #.

We are interested to hear what others think on this. The editors are
divided: some of them have a real use for multi-line comments (for
commenting out large chunks of a topic map, including non-CTM data
temporarily during the editing process, etc.).

| --- 5.1.2
| Why do HTTP URIs get special treatment? And why use '<' and '>' as 
| delimiters? IMHO those should be avoided, for obvious reasons.

Ideally the editors would have liked to avoid having to delimit URIs at all,
but this is not possible without causing ambiguity with respect to QNames.
As a compromise it was decided to special case the URI scheme that is likely
to be most used (HTTP); at the same time, this allowed us to promote what we
regard as best practice (i.e., the use of HTTP URIs rather than other kinds
of URIs as subject identifiers).
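
To make the ambiguity concrete, here is a rough Python sketch. The
simplified QName pattern, the classify() helper and the example.org
address are assumptions made for illustration; none of them is taken from
the CTM grammar.

```python
import re

# Simplified QName pattern (an assumption, not the CTM grammar):
# prefix ':' local, each part starting with a letter or underscore.
QNAME = re.compile(r"[A-Za-z_][\w.-]*:[A-Za-z_][\w.-]*")

def classify(token: str) -> str:
    """Classify an undelimited token under these hypothetical rules."""
    if token.startswith("http://"):
        # "//" cannot begin a QName local part, so no delimiter is needed.
        return "http-uri"
    if QNAME.fullmatch(token):
        # Might be a QName, or a URI in another scheme (e.g. mailto:).
        return "qname-or-uri"
    return "other"

print(classify("http://example.org/topic"))  # http-uri
print(classify("mailto:steve"))              # qname-or-uri
print(classify("dc:title"))                  # qname-or-uri
```

Under these assumptions an undelimited "mailto:steve" cannot be told apart
from a QName, whereas anything starting with "http://" can.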

| --- 5.1.3
| Why use both " and """ for strings? Isn't it enough to just allow line 
| breaks inside single-quote strings?

The argument was that triple-quoted strings permit strings that include
unescaped quote marks and that this is familiar to many users through
Python. The question is whether this advantage is big enough to warrant the
additional syntax. What do others think?
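
For readers less familiar with the Python convention, this is how it
behaves in Python itself (not in CTM, whose string syntax is still open):

```python
# A triple-quoted string may contain unescaped quote marks and line
# breaks; the single-quoted form needs an escape for each of them.
triple = """He said "hello"
and left."""
escaped = "He said \"hello\"\nand left."

print(triple == escaped)  # True
```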

| --- 5.2.5 [apologies for the error in the section numbering]
| If the encoding declaration is to have any value it must be required 
| to appear first in the document, before any whitespace or other 
| characters. This is so that it is possible to detect the basic *type* 
| of encoding in cases where it is not ASCII-compatible (UTF-16, UTF-32, 
| EBCDIC, ...).

Ack. Thanks for pointing this out.
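
A rough Python sketch of the kind of detection this makes possible. It
assumes that the declaration begins with a '%' character, as the %version
directive does; the "%encoding" spelling, the list of candidate families
and the sniff_family() helper are hypothetical and not taken from the
draft.

```python
from typing import Optional

# Candidate encoding families, widest code units first, so that UTF-32LE
# input (25 00 00 00) is not reported as UTF-16LE (25 00).
FAMILIES = ["utf-32-be", "utf-32-le", "utf-16-be", "utf-16-le",
            "ascii", "cp037"]  # cp037 is one EBCDIC code page

def sniff_family(data: bytes, first_char: str = "%") -> Optional[str]:
    """Guess the encoding family from the bytes of the first character."""
    for name in FAMILIES:
        if data.startswith(first_char.encode(name)):
            return name
    return None  # the document does not start with the declaration

print(sniff_family("%encoding".encode("utf-16-le")))  # utf-16-le
print(sniff_family("%encoding".encode("cp037")))      # cp037
print(sniff_family(b" %encoding"))                    # None
```

The last call shows why nothing may precede the declaration: a single
leading space is enough to defeat this kind of sniffing.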

| --- 5.2.6
| Are the <>s really needed here?

Strictly speaking, no -- as long as the namespace is defined using the HTTP
scheme -- but it is not wrong to use them.

| ---
| Including scope in the name and occurrence type declaration is not a 
| good idea. This means including the scope for every instance of those 
| types, which strongly implies that the scope is in fact a statement 
| about the name/occurrence type, and not about the individual 
| instances. This seems like poor modelling.

That is the concern raised by the Note. Unless someone can come up with a
convincing argument for allowing scope to be specified in a template, this
will probably go away in the next iteration (unless the template mechanism
is defined in a way that would make this non-intuitive).

| ---
| Why allow the datatype here? Is it for when the datatype is not one of 
| the predefined ones, to avoid repetition?

Yes. The main reason is to enable greater compactness, since the datatype
will not have to be specified on every individual instance of an occurrence
type whose values always have a datatype that is not "autodetected".

An additional advantage of allowing datatypes in a template MIGHT be to
enable more datatypes to be autodetected (e.g. "2006" could be recognized as
an xsd:gYear rather than an xsd:integer).
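
A minimal Python sketch of how a datatype declared once in a template
could interact with autodetection. The detection rules and the
detect_datatype() helper are assumptions for illustration; the draft does
not define them.

```python
import re
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"

def detect_datatype(literal: str, declared: Optional[str] = None) -> str:
    """Return a datatype IRI for an unquoted literal.

    `declared` stands for a datatype given once in a (hypothetical)
    occurrence type template; it takes precedence over autodetection.
    """
    if declared is not None:
        return declared
    if re.fullmatch(r"[+-]?\d+", literal):
        return XSD + "integer"
    if re.fullmatch(r"[+-]?\d+\.\d+", literal):
        return XSD + "decimal"
    return XSD + "string"

print(detect_datatype("2006"))                  # ends in "#integer"
print(detect_datatype("2006", XSD + "gYear"))   # ends in "#gYear"
```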

| --- 5.3.8
| NOTE 5 should follow from the EBNF.

| Why can assertion blocks be terminated either with a blank line *or* a 
| period? It should be one or the other. Given that line breaks have no 
| significance anywhere else in CTM it seems strange for them to assume 
| significance here. (Of course, one could get rid of some of the 
| delimiters inside assertion blocks using line breaks instead, which 
| might actually be a better option.)

We have gone back and forth on both of these options. It seems not to be
possible to get rid of delimiters inside assertion blocks without either
reducing expressiveness (in the case of comma), or requiring some other
additional syntax (in the case of semicolon).

Regarding the termination of an assertion block, there seemed to be strong
arguments in favour of both the period (consistency with comma and semicolon
syntax; conservation of vertical space), and the empty line (likely to be
used for readability anyway when editing lengthy topic maps). So we ended up
giving the user the choice.

| ---
| Please find some other solution than the %ROLE syntax. Another keyword 
| (NULL) might work better, but %ROLE is really ugly.

This is just a placeholder.

| ISA is dubious, but used in TMQL. It's true that it is ambiguous 
| (there is no reason why ISA couldn't be used for subclassing as well). 
| I don't think a special predefined syntax for superclass/subclass is 
| necessary.

The editors are divided on the need for special predefined syntax for
supertype/subtype. More opinions are solicited.

Some of the editors also regard ISA as dubious but have yet to come up with
a better alternative. Suggestions welcome.

One point regarding TMQL (and TMCL): CTM obviously has to be aligned with
these standards, but the fact that one or the other has made a particular
design choice in its current draft is not necessarily an argument to do
things that way. We need to find solutions that fit the requirements of all
three standards, and that may involve some modifications to the current
drafts of TMQL and TMCL.

| --- 5.3.11
| The syntax for reifying the TM is unattractive. The ctm:self option is 
| not too pretty, either. Personally, I think I prefer a directive for 
| this.

This seems to be very much a matter of personal taste. Other opinions?

| --- 6
| This section is OK for now, but does not belong in the final standard.

Why? Because you think CTM should support all of the TMDM, or for some other
reason?

| --- Annex A
| I spent some time with this, but it's hard to read without any text, 
| and it doesn't really seem 100% coherent. I've skipped diving deep 
| before we have a more proper specification.

Fair enough.

| There should be a general statement about the handling of whitespace 
| here, since whitespace is generally omitted in the grammar.


| Comments should not be included in the grammar, since they are removed 
| in the lexing stage.

And yet the XML spec, which you suggest using as a model in other respects,
*does* include comments.

| Why allow prefix-directives and templates to be mixed in with assertion 
| blocks?

Some of the editors felt it would be wrong to prevent this; others felt we
should encourage the best practice of keeping all directives and templates
in the header. More opinions on this are solicited.

| The definition of version-directive looks wrong. It says that
|    %version  ctm 1.0
| is syntactically invalid, but that seems very strange.

Yes, you're right.

| The EOL{2} syntax is not used in the XML Rec. There's no need, anyway, 
| as EOL EOL would do just as well.


| The constraint on name-type does not belong in the BNF. Not sure the 
| constraint makes any sense, anyway.


| The NOTE about IRIs is not actually a NOTE (it's normative), and it 
| belongs in the spec proper, anyway, together with the BNF.


| Why does WS not include \r?

Good point.
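
A small Python sketch of the practical effect on files with CRLF line
endings. The two whitespace patterns are illustrative assumptions and are
not copied from Annex A.

```python
import re

WS_WITHOUT_CR = r"[ \t\n]+"   # whitespace class that omits \r
WS_WITH_CR = r"[ \t\r\n]+"    # whitespace class that includes \r

def tokens(text: str, ws: str) -> list:
    """Split text on the given whitespace pattern, dropping empty pieces."""
    return [t for t in re.split(ws, text) if t]

crlf_text = "foo bar\r\nbaz"
print(tokens(crlf_text, WS_WITHOUT_CR))  # ['foo', 'bar\r', 'baz']
print(tokens(crlf_text, WS_WITH_CR))     # ['foo', 'bar', 'baz']
```

Without \r in the whitespace class, a stray carriage return stays attached
to the preceding token.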