[sc34wg3] Almost arbitrary markup in resourceData

Michel Biezunski sc34wg3@isotopicmaps.org
19 Nov 2003 16:14:20 -0500

On Wed, 2003-11-19 at 11:41, Mason, James David (MXM) wrote:

Oops. I wanted to answer Jim's mail and I just sent it as it was.
Sorry about that.

I wanted to add to what Jim and Eric said another requirement I have for
names. In the topic map application I am currently working on,
topics have several names, and they don't play the same role. One
is a "privileged name" and the other ones are treated as synonyms.

From the pure topic map point of view, nothing distinguishes them,
and I can't use scope because of the name-based merging rule (because
in my case I want them to merge). I can't use topic types because
it has nothing to do with names. I don't want to use variants because
it doesn't correspond to what I need.

> It was initially baseNameString that I was most interested in
> corrupting. resourceData is of less importance to me. 
> It may be laziness/ignorance on my part, but baseNameString is what I
> choose to display to my users. I see that name (and indeed all names)
> primarily for human consumption. Variants, resourceData, are of less
> interest to me precisely because baseNameString is what drives my UI.
> It's true that I'm working in an environment where I have a lot of
> control, that I never expect to receive or transmit an arbitrary TM,
> so I don't need the fallback of somewhere having a string that's
> guaranteed to be raw text.

Same for me basically except maybe in the future for the level of
> As I've commented elsewhere in this thread, I don't believe in
> arbitrary interchange. I expect there to be an at least implicit DTD
> for all my data. So there's never really "almost arbitrary markup" for
> me, though the markup may come as a surprise to the topic map engine.
> I never believed in name-based merging because, as a linguist, I'm all
> too aware of the variability and fragility of names. 

I am using name-based merging, but not conforming to what the topic
map standard says, because I merge names even when they're not in
the same scope. So it's the opposite to your case, but the observation
is similar: As designed, the name-based merging rule doesn't fit my

> Yes, I need to render XML. For me, the primary problem is that there
> are things I need to display that I can't display without additional
> markup. I sometimes need to display more than one paragraph. I need
> subscripts and superscripts. I need (Oh Horror!) the dreaded
> <emphasis> tag. I need things that will require XSLT processing, such
> as generating labels like "Note:" I don't want the topic map engine to
> mess with that stuff, just pass it through to where the user agent can
> do whatever it takes to render the stuff.

Do you need this markup to go inside the names?

> Maybe I'm pushing topic maps too hard, but the projects I have in my
> shop now generally involve creating portals to collections of
> information, and the users want the information displayed in the
> portal to look like the information it's the gateway to. My impression
> is that I'm not alone in this, that Eric, for one, has similar
> requirements.
>         Lars Marius (and Steve N.):
>         | So, if there's markup in a <baseNameString>, and name-based
>         | merging is switched on, on what basis will name matching be
>         | done?
>
>         The equivalence rule for topic name items. We haven't defined
>         it yet in the presence of markup (will be part of the XML
>         representation proposal), but I think we'll have to base it on
>         Canonical XML. (From what I gathered from Dan Connolly, that
>         seems to be what the RDF folks will do, and for the same
>         reason I propose it: lack of alternatives.)
>
> As I said, I never liked name-based merging. I'd much rather have
> merging based on some formalized subject indicator. In one of the maps
> I'm currently working on, name-based merging would be absolutely
> disastrous. I'm mapping our products and their parts. Several of our
> products have parts called "apple", but those parts, though named
> identically, are wildly different things. (Yes, I know, I could
> qualify the apples, and indeed I do scope the names according to the
> parent product. But my TMs are generated by scripts from data that I
> don't control, and I've had to go back and generate scopes for names,
> scopes that aren't in the source data, just to protect myself.)
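
The Canonical XML idea quoted above (comparing names that contain
markup) could be sketched roughly as follows. This is only an
illustration, not the equivalence rule itself, which the quoted text
says has not been defined yet. It assumes Python 3.8 or later, whose
standard library implements Canonical XML 2.0; the function name
names_match is hypothetical.

```python
from xml.etree.ElementTree import canonicalize


def names_match(name_xml_a, name_xml_b):
    """Compare two <baseNameString> fragments by their canonical form.

    Assumption: two names are treated as equal when their Canonical XML
    serializations are identical. Quote style and attribute order stop
    mattering, but a difference in markup or text still counts.
    """
    return canonicalize(name_xml_a) == canonicalize(name_xml_b)


a = '<baseNameString>CO<sub class="s" id="n">2</sub></baseNameString>'
b = "<baseNameString>CO<sub id='n'   class='s'>2</sub></baseNameString>"
c = "<baseNameString>CO2</baseNameString>"

same_markup = names_match(a, b)      # differ only in quoting and attribute order
markup_vs_plain = names_match(a, c)  # one has <sub> markup, the other does not
```

Under this assumption a name with markup never matches the same text
without markup, which is one of the consequences a real rule would have
to decide on.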

I see this as 2 different things:
- you can already merge by producing adequate subject indicators.
- you can't express the rule by which your subject indicators are
  being built. 
The question is: do we need a standard representation to do this?
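
To make the two points above concrete, here is a minimal sketch of a
rule that builds subject indicators for the "apple" parts case from the
quoted message. Everything in it is an assumption made for illustration:
the base URI, the function names, and the choice of product plus part
name as the key. The rule lives only in the script, which is exactly the
part that has no standard representation.

```python
from urllib.parse import quote

# Hypothetical base URI for the generated subject indicators.
BASE = "http://example.org/psi/"


def subject_indicator(product, part):
    """Build a subject indicator from a product name and a part name."""
    product_key = quote(product.strip().lower(), safe="")
    part_key = quote(part.strip().lower(), safe="")
    return f"{BASE}{product_key}/{part_key}"


def merge_by_indicator(records):
    """Group (product, part) records that share a subject indicator."""
    merged = {}
    for product, part in records:
        merged.setdefault(subject_indicator(product, part), []).append((product, part))
    return merged


records = [("Widget A", "apple"), ("Widget B", "apple"), ("widget a", "Apple")]
groups = merge_by_indicator(records)
```

With these inputs the two "Widget A" records fall together and the
"Widget B" apple stays separate, without any scoping of the names.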

I have a similar problem. When I say I merge by name, I first use
several transformations from the actual names to get to some normalized
forms that are used to merge. And this mechanism is currently not
interchangeable. It doesn't seem to be a problem in the short term,
but it might become desirable to be able to do this in the future.
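
A minimal sketch of merging on normalized name forms, regardless of
scope, might look like the following. The transformations shown
(Unicode NFKC, case folding, whitespace collapsing) are assumptions
chosen for illustration, not the ones used in the application described
above, and the function names are hypothetical.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Hypothetical normalization: NFKC, case folding, collapsed whitespace."""
    text = unicodedata.normalize("NFKC", name).casefold()
    return " ".join(text.split())


def merge_by_normalized_name(topics):
    """Return sets of topic ids that share at least one normalized name.

    `topics` maps a topic id to a list of name strings. Scope is ignored
    on purpose, and merging is transitive.
    """
    parent = {tid: tid for tid in topics}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # normalized name -> first topic id that carried it
    for tid, names in topics.items():
        for name in names:
            key = normalize_name(name)
            if not key:
                continue  # skip names that normalize to nothing
            if key in first_seen:
                root_a, root_b = find(first_seen[key]), find(tid)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                first_seen[key] = tid

    groups = defaultdict(set)
    for tid in topics:
        groups[find(tid)].add(tid)
    return list(groups.values())


topics = {
    "t1": ["New  York"],
    "t2": ["new york", "NYC"],
    "t3": ["nyc"],
    "t4": ["Boston"],
}
merged = merge_by_normalized_name(topics)
```

Here t1, t2 and t3 end up in one group and t4 stays alone. The
normalization function is the piece that would need an interchangeable
representation for another processor to reproduce the same merges.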

Michel Biezunski
Coolheads Consulting
402 85th Street #5C
Brooklyn NY 11209
Email: mb@coolheads.com
Web  : http://www.coolheads.com
Voice: (718) 921-0901