[sc34wg3] Editorial structure of N0396

Patrick Durusau sc34wg3@isotopicmaps.org
Tue, 22 Apr 2003 10:01:33 -0400


Martin Bryan wrote:

>Patrick wrote
>>On a more technical issue, you might want to note that the definition of
>>String in the SAM:
>>>    Strings are sequences of abstract Unicode characters conforming to
>>>    Unicode Normalization Form C [unicode]
>>>    <http://www.isotopicmaps.org/sam/sam-model/#unicode>
>>While following the W3C for XML 1.1 (see details at:
>>http://www.w3.org/TR/charmod/) does exclude (unless this is one of those
>>optional things) other normalization forms that may be required in
>>non-Web based topic map contexts. This may be of particular significance
>>for systems using Chinese/Japanese texts in non-web based topic maps.
>Coming from someone who I seem to remember criticised me for suggesting that
>something other than the concrete abstract syntax should be applicable in
>SGML I find this somewhat rich ;-)
Glad to have made your day! (I recall my point being somewhat different 
but do appreciate the irony.)

>W3C have, after much arguing and many years of wrangling, finally got around
>to agreeing a single preferred normalization form for Unicode within XML
>documents and Patrick wants us to allow topic map users to be able to adopt
>an alternative normalization scheme!!! This is supposed to make integration
>of topic maps easier in some way. Two topic maps, using different encodings,
>both in XTM cannot be merged safely if they adopt different normalization
The question is whether one normalization scheme, particularly one designed
for the web, fits all topic maps. I really don't think that limiting topic
maps to the WWW is a good idea: I have not in the past, do not now, and am
unlikely to in the future. Note that I saw a note yesterday that 75% of all
business data is held in COBOL. I am glad that the SAM allows any notation
to be used for locators.
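
To make the normalization question concrete, here is a minimal sketch in
Python. It is not part of the SAM or of any topic map tool. The helper name
same_name and the sample strings are assumptions chosen for illustration.
It shows two things: strings that differ only in normalization compare
unequal until both are brought to the same form, and Form C itself rewrites
some CJK compatibility ideographs, which is one reason a non-web application
might want a different policy.

```python
import unicodedata


def same_name(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form.

    Hypothetical helper for illustration; not an API of any topic map engine.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "\u00e9"      # "e with acute" as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Plain code-point comparison treats the two spellings as different.
raw_equal = composed == decomposed

# After normalizing both sides to Form C they compare equal.
nfc_equal = same_name(composed, decomposed)

# Form C is not neutral for all CJK data: U+F900, a CJK compatibility
# ideograph, has a canonical mapping to U+8C48, so NFC replaces it.
compat = "\uf900"
nfc_compat = unicodedata.normalize("NFC", compat)

print(raw_equal, nfc_equal, nfc_compat == "\u8c48")
```

Whether the comparison form belongs in the model or in each serialization
is exactly the open question; the sketch only shows that some agreed form is
needed before strings from two sources can be compared.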

I am not saying that two topic maps based on XTM should use different 
normalizations nor that such would make "integration" easier. That is 
both unfair and inaccurate. The SAM, as I understand it, is supposed to 
be "the" data model for all topic maps. Are you suggesting that it is 
only the data model for XTM based topic maps?

>Having said that, I do believe that this statement should not be part of the
>SAM model, but should be part of the XTM serialization of the model. As HyTM
>is based on SGML rather than XML we can expect user-defined character sets
>to be defined as part of HyTM. We can, of course, agree to differ as to
>whether or not topic maps based on different character sets need to be
>normalized to conform to Form C before being interchanged/merged. This
>subject should be added to one of the discussion lists for London, but I'd
>hate to suggest which one.
All jousting aside, I think this comes close to my original point.


Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Co-Editor, ISO Reference Model for Topic Maps