[sc34wg3] Re: Public Interest and ISO WAS: [topicmapmail] <mergeMap> questions

Lars Marius Garshol sc34wg3@isotopicmaps.org
24 Oct 2001 23:34:22 +0200

* James David Mason
| The Topic Map community may not be so old or well established as the
| SGML/XML community as a whole, but it's part of that latter
| community and should accept the openness to change of the larger
| community. That we have proposals for models and auxiliary languages
| moving through the ISO process is a sign that the TM community is
| alive and capable of change.

I don't think anyone would ever want to outlaw the possibility of
changes, but on the other hand a certain measure of stability is
needed. If we change the standard slightly and carefully every three
years, say, I don't think anyone will be very much upset. Certainly
I won't be.

To make changes continually and fundamentally, on the other hand, is
something I just don't think we can do. Patrick Durusau likes the
example of Unicode. Well, one thing in Unicode that will never EVER
change, even though it is certainly broken from some points of view,
is the assignments of characters to code points. New ones can be
added, but the existing ones will never go away or change.

My personal opinion is that a standard of this kind that changes too
often just isn't going to be very usable, or even very much used.
This applies to the model, to the interpretation of serialized topic
maps, and to the interchange syntaxes alike. Some change is
inevitable, but frequent change is something I fear will kill this
whole thing.

And, like Kal, I would much prefer to see changes based on user
experience to changes based on speculation. (Yes, bugs are bugs, and
need to be fixed; I'm talking about changes in general.)

| So if we know there's something that needs fixing in ISO 13250, we
| need to fix it. Users and vendors alike need to work on the fix. And
| we ought to move on the fix as soon as we can figure out what the
| best solution is.

I guess in many cases the bone of contention will be whether it really
is broken or not, like in this case, where many people disagree about
whether anything is broken at all, and some (like me) don't know yet.

| ISO 13250 is not specified in a rigorous mathematical language; it's
| not susceptible to proof. The only thing formulaic we have is two
| DTDs, and we know they're not consistent. So we need to fix
| something, and that will probably mean changing both words and
| DTDs. If that means changing some vendors' code, so be it.

I'm aware of that, and I'm resigned to it. Given the lack of rigor in
those two documents there is no way we can prevent this from happening
in quite a few cases. Mostly edge cases, but probably many.

What this discussion has been about, however, is completely different.
It is a change to both ISO 13250 and XTM 1.0 that has nothing to do
with harmonizing them.

| I can't speak for how much work is involved in fixing code (except
| to guess that while nontrivial it's considerably less than the job
| of creating the overall code).

Well, frankly, the effort of, say, changing all topic characteristics
to be able to have sets of scopes internally in Ontopia would probably
be somewhere between two man-weeks and a man-month. For our customers
it would probably mostly be less effort, perhaps nearly as much in one
or two cases, probably close to zero in some cases.

Later, of course, when we have more software built on top of the
engine, it would be more. When we have a full query engine, for
example, the amount of work would increase quite drastically, among
other things because a lot of work that had previously gone into
optimization and performance tuning would suddenly be worthless.

For empolis I think the effort would probably be greater, given that
they already have a query engine. For Mondeca I have not the faintest
idea. As for their customers I know nothing whatever of them.

So the problem is not that this can't be done, or that it would kill
us, or anyone else, but that it's a bad signal to send, and that it's
a bad practice that we can avoid if we run a proper process. If we
find that something really is broken then probably it's better to bite
that bullet than to live with true brokenness (which wouldn't be a
good signal, either, in any case), but even so fixes will have to come
in batches, I think.

SGML and XML have set a good precedent here, I think. In the case of
XML they waited two and a half years before even tightening up the
prose. This despite the fact that the XML syntax has bugs in it[1].
The first revision that will actually change XML in any way (and those
changes will be very, very minor) hasn't even passed beyond the
requirements stage yet, one year later. (This is XML Blueberry[2].)

So what we need to do is not to make no changes ever, but to be very
disciplined about the changes we do make.

--Lars M.

[1] One of them is that PIs and comments are allowed after the
    document element, which means that you can't have multiple
    documents in an input stream and know where to split the stream
    without some kind of partitioning syntax (like MIME multipart).

    This has been known since some time in 1998 and is an acknowledged
    bug. Another is the infamous newline problem[3].
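
    To make the first of these concrete, here is a minimal sketch in
    Python using the standard library's minidom. The sample documents
    and the helper is_well_formed are made up for illustration. It
    shows that a comment and a PI after the document element are
    accepted, and that a stream of two documents with a comment in
    between can be cut in two places that both yield well-formed
    documents.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


# A comment and a PI after the document element are legal XML.
single = "<a/><!-- trailing comment --><?pi after root?>"
node_types = [n.nodeType for n in minidom.parseString(single).childNodes]

# Two documents sent back to back with a comment in between (made-up data).
stream = "<a/><!-- whose comment is this? --><b/>"

# Two candidate ways of cutting the stream into documents.
split_early = ["<a/>", "<!-- whose comment is this? --><b/>"]
split_late = ["<a/><!-- whose comment is this? -->", "<b/>"]

# The stream as a whole is not one document, but both splits parse,
# so the XML syntax alone does not say where the boundary is.
print(is_well_formed(stream))
print(all(is_well_formed(d) for d in split_early))
print(all(is_well_formed(d) for d in split_late))
```

    Since both splits parse, a receiver has no way to tell from the
    XML syntax alone which document the comment belongs to, which is
    why some outer framing is needed.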

[2] <URL: http://www.w3.org/TR/xml-blueberry-req >

[3] <URL: http://www.w3.org/TR/newline >