xml:id RE: [sc34wg3] Compact syntax requirement question

Lars Marius Garshol sc34wg3@isotopicmaps.org
Tue, 19 Jul 2005 17:44:40 +0200

* Bernard Vatant
| Apologies to have seemed so "negative" and "aggressive".

Apology accepted. I guess I may have been a bit more touchy than
necessary, too.

* Lars Marius Garshol
| I think it's quite clear that XML is going to have to struggle
| quite hard to compete with
| [bernard : person = "Bernard Vatant"; "vatant, bernard"]
* Bernard Vatant
| This is where I disagree, and where I think Patrick got my
| point. The above has never seemed *simple* nor intuitive to me;
| neither to read, nor to write (too many [ : ; [ % @ make me uneasy).

That's OK, but we already have XTM and topic map editors to satisfy
people who feel that way. The problem is that there are a lot of
people who are not put off by this at all, and who have a real need
for this kind of syntax.

| To tell the truth I never even tried to learn to read LTM or AsTMa= so
| actually I only guess the semantics of the above. But maybe it's
| only because I'm lazy :))

I don't think you're lazy; I just think you're not a programmer, and
so you don't need LTM/AsTMa=. 

| Anyway simplicity is what is simple for the user, and is not
| necessarily in linear relation with string length ...

Of course not.

| I've edited a lot of XTM, and never found it "painful". Verbose and
| heavy, yes, but not "painful".

I don't edit XTM except when forced to when fixing something that went
wrong in a demo or suchlike. It really is just too painful to even
think about editing. Note that I don't expect you to feel the same
way; just to accept that there is a large and important group of users
who feel this way.

| Now the argument that it is needed for TMQL is another story. I
| don't understand it completely because I'm not sure what INSERT in
| TMQL is, but it's certainly relevant.

Oh, OK. I guess this should have been explained.

One of the things that the committee in general picked up in the
review of TMQL proposals was that Robert Barta had designed a language
family with consistent syntax, instead of creating three separate
languages (compact topic map syntax, schema syntax, query syntax).
It's kind of obvious once you realize it, of course, but it really is
a major benefit if, say, scope is indicated the same way in all three
languages. And so on.

At the moment, of course, ISO is only doing two of these languages: a
schema language and a query language. However, the query language is
going to have an update part, once we've done the query-only part.

Typically, update languages have three operations:

  INSERT: add new data
  DELETE: remove data
  UPDATE: change data

To support the INSERT operation, we will be forced to provide some
way to express the topic map information to be added. If you want to
add the "bernard" topic from my previous example you have to be able
to express the characteristics etc of that topic, so that you can

  INSERT <topic-map-fragment-goes-here>;

or something like it.
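
To make the dependency concrete, here is a minimal Python sketch of
why INSERT drags a textual topic map syntax along with it. Everything
in it is an assumption made for illustration: the regular expression
covers only a toy LTM-like subset (one topic with a type, a name and
an optional sort name), and parse_fragment, execute and the plain
dict used as the topic map are hypothetical names, not part of TMQL,
LTM or any existing engine.

```python
import re

# Toy pattern for a fragment such as
#   [bernard : person = "Bernard Vatant"; "vatant, bernard"]
# ASSUMPTION: an LTM-like subset chosen for illustration only.
FRAGMENT = re.compile(
    r'\[\s*(?P<id>[\w-]+)\s*:\s*(?P<type>[\w-]+)\s*'
    r'=\s*"(?P<name>[^"]*)"\s*(?:;\s*"(?P<sort>[^"]*)"\s*)?\]'
)


def parse_fragment(text):
    """Turn one bracketed topic fragment into a plain dict."""
    match = FRAGMENT.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a well-formed topic fragment: " + text)
    return match.groupdict()


def execute(statement, topic_map):
    """Run a hypothetical 'INSERT <fragment>;' statement against a dict."""
    keyword, _, rest = statement.strip().partition(" ")
    if keyword != "INSERT" or not rest.endswith(";"):
        raise ValueError("only 'INSERT <fragment>;' is sketched here")
    topic = parse_fragment(rest[:-1])
    # The statement cannot be executed without first parsing the fragment.
    topic_map[topic["id"]] = topic
    return topic


topic_map = {}
execute('INSERT [bernard : person = "Bernard Vatant"; "vatant, bernard"];',
        topic_map)
```

The point is only that execute cannot do anything useful until
parse_fragment exists, so the update language and the compact syntax
end up being designed together whether we plan it or not.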

This means we have to either create a textual syntax for topic maps or
use XTM. Having only XTM isn't an option, and so we will have to
define a textual syntax.

However, it doesn't really seem like a good idea to first design TMQL
for querying, then TMCL for schemas, and only afterwards come up with
a textual syntax for topic maps. It seems better to design the three
in parallel, so that we can ensure that they are consistent.

I hope this makes things clearer.

* Lars Marius Garshol
| That's an interesting requirement, but I'm not sure exactly what you
| mean by it. Validate on what level? Syntactically? Or against a schema?
* Bernard Vatant
| I'm amazed by those questions. 

Well, in this thread the amazement is mutual. :-)

| If you specify a language, I guess you provide ways to check if the
| files you produce are conformant to the specification. 

Yes, but you wrote as though "parse" and "validate" were different,
while "validate" is usually part of "parsing", and so I wasn't sure I
understood what you meant.

| Call it well-formed, valid, whatever, in any case : when I get a
| file "foo.ctm", how do I make sure it's conformant to the CTM
| specification? What kind of tool do I use?

Same as with XML: a parser.

| When I have an XML file, I know that I have two possible levels of
| validation (at least), and the ways to check it in my XML
| editor. What should be the levels of validation for CTM, I don't
| know.

The same: syntactic and schema. However, a good implementation will
have syntactic validation for CTM and XTM separately, but will share
*all* of the code for the schema validation. (The OKS does this, so
you can import RDF and validate it with OSL if you feel like it.)
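
As a rough sketch of that split (not how the OKS is actually
written), the Python below has two syntactic front ends feeding one
shared schema check. The two input formats are toy stand-ins I made
up for the example, not CTM or XTM, and Topic, parse_compact,
parse_xml and validate_schema are hypothetical names.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

# ASSUMPTION: a topic is reduced to an id and a type for this sketch.
Topic = namedtuple("Topic", "id type")


def parse_compact(text):
    """Syntactic front end 1: lines of the toy form 'id : type'."""
    topics = []
    for line in text.splitlines():
        if not line.strip():
            continue
        topic_id, sep, topic_type = (p.strip() for p in line.partition(":"))
        if not sep or not topic_id or not topic_type:
            raise SyntaxError("bad compact line: " + line)
        topics.append(Topic(topic_id, topic_type))
    return topics


def parse_xml(text):
    """Syntactic front end 2: toy XML, <topics><topic id=".." type=".."/></topics>."""
    root = ET.fromstring(text)  # raises ParseError if not well-formed
    return [Topic(el.get("id"), el.get("type")) for el in root.iter("topic")]


def validate_schema(topics, allowed_types):
    """Schema check shared by both front ends; returns a list of problems."""
    return [
        f"{t.id}: unknown type {t.type!r}"
        for t in topics
        if t.type not in allowed_types
    ]


compact_source = "bernard : person\nontopia : company"
xml_source = (
    '<topics><topic id="bernard" type="person"/>'
    '<topic id="ontopia" type="company"/></topics>'
)
```

Both parsers produce the same list of Topic values, so
validate_schema never needs to know which syntax the data arrived in.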

| I just wonder how they will be specified, and which tools I will
| use. Certainly a simple text editor will not do it, right?

No, an editor won't do it, but your parser will.

* Lars Marius Garshol
| And easy for whom? The implementor or the user?
* Bernard Vatant
| Anyone who is bound to ask the question : is that file valid CTM?
| For example ...
| [bernard : person = "Bernard Vatant"; vatant, bernard"]
| ... is not well-formed LTM, I guess. You need a specific parser to
| find out why, right?

Not really. You (Bernard) know what's wrong with it, without even
having learned LTM. But really this is the same as with all other
syntaxes: people learn it, and then they can use it. Their parser
tells them if they screw up, but most of their mistakes they spot
themselves long before that.
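
For what it's worth, the kind of message a parser gives for the
broken example can be sketched in a few lines of Python. This is a
deliberately naive scanner, assuming only that a quoted string must
follow "=" or ";" in the toy fragment form used above; it is not a
real LTM parser, and check_fragment is a hypothetical name.

```python
def check_fragment(text):
    """Return None if the toy fragment scans cleanly, else an error message."""
    i = 0
    expect_string = False  # set after '=' or ';', cleared by a quoted string
    while i < len(text):
        ch = text[i]
        if ch == '"':
            end = text.find('"', i + 1)
            if end == -1:
                return f"unterminated string starting at offset {i}"
            i = end + 1
            expect_string = False
        elif ch in "=;":
            expect_string = True
            i += 1
        elif ch.isspace():
            i += 1
        elif expect_string:
            return f"expected a quoted string at offset {i}, found {ch!r}"
        else:
            i += 1
    return None


good = '[bernard : person = "Bernard Vatant"; "vatant, bernard"]'
bad = '[bernard : person = "Bernard Vatant"; vatant, bernard"]'
good_result = check_fragment(good)
bad_result = check_fragment(bad)
```

Here the scanner accepts the first fragment and, for the second,
complains at the "v" of "vatant", which is exactly where a human
reader spots the missing quote.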

| Are such parsers available?

Yes. There is one in the OKS, one in TM4J, one (I think) in
tmapi-utils, one in the Perl::XTM package, one in the Python engine
being built, etc.

A major reason why there are so many such parsers is that once you
have a topic map implementation creating an LTM parser is not a big

| You say a lot of people use LTM. How do they cope with that? Are
| there LTM editors/parsers/validators around?

There are no specific LTM editors that I know of, but there are
editing modes for some editors. Parsers there are lots of, and
validators are the same thing. The Omnigator lets you load, convert,
validate, browse, etc etc your LTM as much as you like, and really,
the only difference from XTM is that it's not XML. 

To be frank I find this discussion utterly baffling. New syntaxes are
created all the time, and nobody seems to be bothered in the slightest
by this. So what's wrong with a standard compact syntax for topic
maps? There is a real user community there that wants one; we have to
create one anyway for another standard, etc. 

In fact, even if you think compact syntaxes for topic maps are a bad
thing you should applaud CTM, because it means we will go from two
compact syntaxes for topic maps (LTM and AsTMa=) to just one. 

Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50                  <URL: http://www.garshol.priv.no >