[sc34wg3] Comments on CTM

Patrick Durusau patrick at durusau.net
Tue Oct 7 11:04:04 EDT 2008


Lars,

Thanks!

It would be real helpful if the editors could prepare a summary 
response (no particular form) to these and other comments so we could 
run through the uncontested ones rather quickly (or even have a summary 
page that simply lists them) so that we can concentrate on issues that 
merit discussion.

Hope you are having a great day!

Patrick

PS: I am assuming that we should start with CTM, followed by TMCL and 
TMQL. But that is just my assumption. Is there any feeling among the 
editors that another order would be more productive?

Lars Marius Garshol wrote:
>
> Here are my comments on the latest CTM draft, based on a close reading 
> of the spec, an (unfinished) attempt to implement it, and some early 
> fragmentary use. The comments are the same as those in the CTM ballot 
> result, with two additions.
>
> I'm hoping we can discuss this in Leipzig.
>
> --Lars M.
> http://www.garshol.priv.no/blog/
> http://www.garshol.priv.no/tmphoto/
>
>
> ------------------------------------------------------------------------
>
> MB 	Clause no. 	Paragraph 	Type of comment 	Comment 	Proposed change 
> Sec obs
> NO 	  	  	ge 	Throughout the document some clauses, like for example 
> 3.5, have text before the first numbered subclause. This is not 
> allowed under the ISO style, and in fact also disallowed in is.rnc. 
> Reorganize the document to conform to ISO style. 	 
> NO 	  	  	ge 	Throughout the document the text "an error is flagged" 
> recurs. This is awkward, because it raises the question of what it 
> means to flag an error, and this is nowhere answered. It would be 
> better to use the more common phrasing of saying something "is an 
> error", and then to have a separate clause stating what to do in the 
> case of errors. 	Rephrase throughout, and add an extra clause. 	 
> NO 	Introduction 	2 	ed 	Text refers to "the TMDM". Normal usage is 
> "TMDM". 	Remove "the". 	 
> NO 	2 	  	ed 	References to ISO 13250-2 and -3 are not in the 
> ISO-mandated style. 	Correct the references. 	 
> NO 	3.1 	2 	ed 	The term "CTM documents" is unnecessary, and 
> potentially misleading, since CTM does not need to appear in 
> "documents". 	Rephrase to avoid the term "CTM documents". 	 
> NO 	3.2 	1 	ed 	This paragraph is a verbatim copy of a similar 
> paragraph in the XTM 2.0 specification. It does not really add 
> anything to the specification. 	Remove paragraph. 	 
> NO 	3.2 	3 	te 	Text says that the byte stream is converted to a 
> character stream, but does not say how. 	Define character conversion 
> process. 	 
> NO 	3.2 	6 	ed 	The paragraph "Each CTM processor ..." is awkwardly 
> formulated. 	Replace with "The following QName prefix is predefined, 
> and shall be recognized by all CTM processors:" 	 
> NO 	3.2 	6 	te 	Is it necessary to have a predefined prefix? Users can 
> define this prefix if they need it. 	Remove paragraph. 	 
> NO 	3.3.1 	2 	ed 	Text says "Space (#x20) characters are allowed 
> everywhere". This seems to imply that only ASCII character 32, space, 
> is allowed, and that the other three whitespace characters are not 
> allowed. 	Alter text to say "Whitespace characters are allowed 
> everywhere". 	 
> NO 	3.3.2 	2 	te 	Production [2] uses "#xA" as line separator, but 
> different line separators are used on different platforms. This needs 
> to be specified more carefully. 	Alter text to account for different 
> line separators. 	 
> NO 	3.3.2 	3 	ed 	Text says "This part of ... allows nesting". This is 
> clumsily phrased. 	Alter text to say ", and may be nested." 	 
> NO 	3.3.3 	[5] 	te 	Production [5] refers to RFC 3987 for part of the 
> production. This makes it harder for implementors to know exactly what 
> is allowed and what is not. There is every chance that implementations 
> will differ in what they accept here, harming interoperability. 
> Define production explicitly. 	 
> NO 	3.3.3 	  	ed 	Second para of first note says "An IRI which is not 
> ...". This is normative, and so does not belong in a note. Further, it 
> should really be specified in the grammar. 	Remove para and 
> incorporate this in the grammar. 	 
> NO 	3.3.4 	1 	ed 	Text says "They [QNames] are declared as follows:", 
> which is confusing, as the reader is likely to think this means that 
> the following will show how QNames are declared in CTM. In fact, the 
> productions show the syntax for QNames. This awkward phrasing recurs 
> for many, many grammar fragments throughout the document. 	Rephrase 
> paragraph to "QNames are used to abbreviate IRIs. The syntax of QNames 
> is as follows:" Also rephrase other corresponding sentences in the 
> spec. 	 
> NO 	3.3.6 	  	ed 	The note is clearly meant to be normative, and so 
> should not be a note. 	Make the note a normal paragraph. 	 
> NO 	3.3.7 	  	ed 	The note is clearly meant to be normative, and so 
> should not be a note. The semicolon should be made optional in the 
> EBNF instead. 	Remove the note and adapt the EBNF. 	 
> NO 	3.3.7 	[18] 	te 	One of the arguments for adding the embedded 
> topic syntax was that this would be very useful for TMCL, and might 
> even remove the need for templates. The new TMCL draft does not use 
> embedded topics at all, and it is not clear that they would be very 
> useful. At the same time, they substantially complicate the 
> implementation of CTM. There seems to be no compelling reason to keep 
> embedded topics. 	Remove embedded topics. 	 
> NO 	3.3.8 	  	ed 	The clause describes the creation of locators from 
> the wildcards, but does not say where these locators go. Nor is there any 
> mention of any topic being created. 	Make it clear that a topic is 
> created, and where the locators go. 	 
> NO 	3.3.9 	[22] 	te 	The production explicitly states that there may 
> be whitespace (WS) after the '@' character, but does not specify that 
> whitespace is allowed around the comma. This is inconsistent, and 
> seems unnecessary when 3.3.1 states that whitespace is allowed 
> everywhere. 	Remove the WS reference from the production. 	 
> NO 	3.3.9 	3 	ed 	The text refers to "the [scope] property of the 
> Topic Maps construct", which is rather awkward. 	Rephrase to "the 
> [scope] property of the statement". 	 
> NO 	3.3.10 	1 	ed 	The text refers to "Topic Maps construct", which is 
> a rather awkward term. 	Rephrase to "statement". 	 
> NO 	3.3.10 	[23] 	te 	Same as above. 	Remove the WS reference from the 
> production. 	 
> NO 	3.3.11 	1 	ed 	Text says "...used to assign a type to /a/ Topic 
> Maps item in which it occurs...", which is grammatically incorrect. 
> Replace "a" by "the". 	 
> NO 	3.3.11 	2 	ed 	Text says "...Topic Maps item...", which is 
> inconsistent with the terminology used in TMDM. 	Rephrase to 
> "information item". 	 
> NO 	3.4 	[31] 	ed 	The production just references the XML Schema spec. 
> It would be better to incorporate the full production in the grammar. 
> Similarly for [32]. 	Specify the grammar explicitly. 	 
> NO 	3.4 	NOTE 	ed 	The note is clearly meant to be normative, and so 
> should not be a note. 	Make the note a normal paragraph. 	 
> NO 	3.4.1 	  	ed 	The interpretation of the various escape sequences 
> is not defined in the text, leaving implementors to guess what is 
> meant. 	Define the interpretation. 	 
> NO 	3.5 	[40] 	te 	The grammar says "(directive* reifier)?" which 
> means that it is impossible to have a directive without a reifier. 
> This is unlikely to be what is actually meant. 	Redefine to 
> "directive* reifier?". 	 
> NO 	3.5.1 	  	ed 	Production [41] is explained in 3.5.1, but defined 
> in 3.5. For context it might be better to define it in 3.5.1. 	Move 
> production to 3.5.1. Alternatively, remove 3.5.1 entirely, or merge it 
> into the previous clause. 	 
> NO 	3.5 	  	te 	Directives, as defined here, must be terminated by a 
> newline, but this seems unnecessary. Whitespace is not significant 
> elsewhere in the grammar, so why make an exception here? 	Remove the 
> restriction. 	 
> NO 	3.5 	  	te 	Directives use "#xA" as a line separator, just like 
> production [2], which has the same problems. 	Remove the restriction. 
> Alternatively, fix as with [2]. 	 
> NO 	3.5.2 	  	ed 	The text says encoding names are to be given in the 
> form recommended by XML 1.0. XML 1.0 references an IANA registry. We 
> should do the same. 	Reference IANA directly. 	 
> NO 	3.5.2 	  	te 	The text says the encoding directive must be on the 
> first line, if given. This is insufficient. There must be no 
> whitespace before the directive, either. The purpose of this is to 
> allow implementors to auto-detect the character encoding, as in XML 
> <http://www.w3.org/TR/REC-xml/#sec-guessing>. The encoding 
> detection/directive must also be connected with the byte-to-character 
> stream conversion in 3.2. 	Change text accordingly. 	 
> NO 	3.5.3 	4 	ed 	The text says "this part of ... recommends", which 
> is awkward. 	Change to "Its usage is recommended for ...". 	 
> NO 	3.5.3 	5 	te 	The text sets out strict requirements for where the 
> version directive can appear in terms of line breaks using prose. Does 
> this need to be specified in terms of prose rather than EBNF, and does 
> it matter how many line breaks there are before the version directive? 
> The EBNF already requires this directive to be in the prolog, which 
> should be sufficient. 	Remove entire paragraph. 	 
> NO 	3.6 	NOTE 	ed 	The note is normative, and so should not be a note. 
> However, this requirement should be worked into the EBNF grammar 
> instead. 	Remove the note. 	 
> NO 	3.7 	[45] 	ed 	In this production the parentheses around the OR 
> group appear to be missing. 	Insert parentheses. 	 
> NO 	3.7 	[45] 	te 	Strictly speaking, the semicolons (";") after each 
> statement are not required, and in actual CTM files their appearance 
> is aesthetically not very pleasing. On the other hand, their presence 
> may make finding syntax errors easier for novices. 	Discuss whether to 
> remove semicolons. 	 
> NO 	3.8 	  	ed 	Clause 3.7 covers the "topic-tail", which primarily 
> consists of names and occurrences, which are covered in 3.9, 3.10, and 
> 3.11. In between come associations in 3.8, which cannot be part of 
> the topic-tail. This order is distracting for readers. 	Move 3.8 after 
> 3.11. 	 
> NO 	3.8 	[51] 	ed 	A separate type production with processing rules 
> has already been defined in 3.3.11, and is used in 3.9 and 3.10. Why 
> not here also? 	Remove production and use type instead. 	 
> NO 	3.8 	NOTE 	ed 	The note uses "shall" indicating that it is 
> normative, but in reality it is just advising users not to put a "(" 
> next to an unbracketed IRI. This could be solved by disallowing "(" 
> characters in IRIs if the EBNF for IRIs is given explicitly. 	Disallow 
> "(" in IRIs and remove the note. Alternatively, reword the note so it 
> does not seem to be normative. 	 
> NO 	3.12 	1 	ed 	The definition of what a template is is quite poor, 
> and does not at all refer to the uses of templates. 	Rewrite. 	 
> NO 	3.12 	  	ed 	The scope of template declarations is not explicitly 
> defined, although it is alluded to in many places in the spec. 	Define 
> the scope explicitly. 	 
> NO 	3.12 	[61] 	te 	Production template-body allows the prefix 
> directive to occur within a template body. This raises the issue of 
> the scope of the prefix declaration (both for the specification and in 
> the minds of users), and also complicates the grammar slightly. It 
> also makes implementations more complex, and the value of this is 
> disputable at best. 	Change the grammar to disallow prefix directives 
> in template bodies. 	 
> NO 	3.13 	[64] 	te 	The need for a separate topic-template-invocation 
> is not clear. Why must templates invoked in a topic block be invoked 
> with a parameter when templates outside a topic block do not need 
> this? 	Remove production. 	 
> NO 	3.13 	NOTE 	ed 	The note is normative, and so should not be a 
> note. Secondly, instead of directly defining the scope rules for 
> template definitions, this note indirectly describes the effects of 
> these rules, which is awkward and difficult to read. 	Remove note. 	 
> NO 	3.13 	  	ed 	This clause does not actually define the effect of 
> invoking a template, nor how template variables are bound. One assumes 
> there is a rule for including the topic in a topic block as the first 
> parameter, but this is not defined anywhere. 	Define how template 
> invocation works. 	 
> NO 	3.14.1 	  	ed 	The scope rules for prefix bindings are not 
> defined. 	Define the scope rules. 	 
> NO 	3.14.1 	[69] 	ed 	This production defines a reference as either an 
> IRI or a random collection of non-WS characters. The text does not 
> define the interpretation of the second alternative. 	Explain or 
> remove second alternative. 	 
> NO 	3.14.2 	  	ed 	The text defining the interpretation of the include 
> directive is much too vague and awkwardly written. There is no 
> reference to the processing of the iri-ref, nor to how deserialization 
> is to be performed, and so on. It is possible to guess the intent 
> behind the text, but it could be stated much more directly. 	Clean up 
> and fill out the text. 	 
> NO 	3.14.3 	  	ed 	The text in this clause suffers from the same 
> problems of vagueness as that of 3.14.2. 	Clean up and fill out the 
> text. 	 
> NO 	3.14.3 	  	te 	Are syntax identifiers case-sensitive or not? And 
> is "CTM" or "ctm" the syntax identifier? 	Tighten up the text. 	 
> NO 	3.14.3 	  	te 	The "xtm" syntax identifier is not clearly defined. 
> Does this mean XTM 1.0, XTM 2.0, or both? There are no references to 
> other specifications here, making it even more difficult to know what 
> is meant. 	Define separate identifiers for XTM 1.0 and 2.0, and 
> reference the specifications. 	 
> NO 	3.14.3 	  	te 	What happens if some implementors start using, say, 
> "LTM" to mean Ontopia's syntax, and then later ISO updates CTM, 
> defining "LTM" to mean something else? Should ISO reserve parts of the 
> namespace for itself? Or should syntax identifiers be PSIs? 	Discuss. 	 
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> sc34wg3 mailing list
> sc34wg3 at isotopicmaps.org
> http://www.isotopicmaps.org/mailman/listinfo/sc34wg3
>   

-- 
Patrick Durusau
patrick at durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)


