[sc34wg3] Almost arbitrary markup in resourceData

Mon, 17 Nov 2003 16:19:15 -0500

I grant that name-based merging has its utility: my problem, however, is
that I can't enter some names without embedded markup (e.g., chemical
formulae). I don't want to use variant names because I need the primary
display name to be the one that shows the results of the embedded markup.
For me, the value of a name as a human interface trumps its utility in
merging.

Jim

-----Original Message-----
From: Steven R. Newcomb [mailto:srn@coolheads.com] 
Sent: Monday, November 17, 2003 2:10 PM
To: sc34wg3@isotopicmaps.org
Subject: Re: [sc34wg3] Almost arbitrary markup in resourceData

"Mason, James David (MXM)" <masonjd@y12.doe.gov> writes:

> I'd like to hear more from Martin and the other two original editors 
> on what they thought their goals were for TM interchange.

I can only speak for myself.  My goal was to interchange only the
information that we thought was Topic Map information.  For me, the decision
to exclude non-TM markup was a matter of controlling the scope of the
project, as well as the scope of any project involving the implementation of
a Topic Map processor.  As everybody knows, rigorous scope control is at the
heart of creating a successful standard.

I will not oppose the idea of foreign markup in <resourceData>, but,
personally, I'll never willingly include foreign markup in a topic map.  The
interpretive context of such markup will probably become ambiguous, once it
has been stored inside what can only be a Topic Map engine, at least if the
name of our standard has any meaning, and if Topic Map engines aren't going
to be required to duplicate and special-case at least some of what really
and exclusively belongs in an XML engine.  In other words, this whole idea
violates modularity for me.  (Of course, many XML "helper" standards violate
modularity at least as egregiously; those horses escaped the barn long ago,
which makes me wonder why I still give a hoot, actually.)

I find the inclusion of markup in <baseNameString> hard to swallow. Maybe
that's because I'm an unreconstructed believer in the name-based merging
rule.  I'm not comfortable with the idea that name-based merging should
always be relegated to the realm of URIs -- which is now the only place left
where the name-based merging rule is still globally in effect.

  (When we wrote the XTM DTD, I hadn't yet understood REST, or even
  the fact that a W3C "resource" is in fact exactly the same thing we
  mean when we say "subject".  For me, the reason we used a URI for
  subject identity was to point at a piece of information in some
  particular storage context (an "anchor", in HyTime parlance), and it
  was that piece of information, and not its name/address (its URI)
  that would serve as the binding point for the subject.  

  Well, that is the HyTime/GROVE way, but it's not the Web way.  And
  XTM is supposed to be Webby, so anchor-based merging is gone, and
  all that's left is name-based merging.  That history is the reason
  why name-based merging now occurs in two wildly different places in
  the syntax.  And it's why people sometimes excitedly "discover" that
  if they turn their topic names into URIs, they can get the
  name-based merging they want.  

  But I still don't think it's a good thing to confuse knowledge
  management power with Webbiness.  They're not the same thing.  Not
  at all.  I see no advantage to the public in making XTM into a tool
  for implicitly convincing people that URIs should always be used as
  the names of subjects that are used for name-based merging purposes.
  It does nothing but add overhead to applications, and, of course, it
  hypes the Web.  Nobody gains anything from the overhead, and the Web
  doesn't need hyping.)

So, if there's markup in a <baseNameString>, and name-based merging is
switched on, on what basis will name matching be done?  What about the
nonsignificant whitespace in such markup, and what about the order of the
attribute value specifications in the start tags?  Suddenly we have to have
a whole bunch of complicated rules, or to invoke a parser-output standard
like RAST, where a simple, application-neutral string match used to be
sufficient.  I don't like it when things get more complex.  There's gotta be
a damn good reason.  Jim says he has one, and I take him at his word, but
I'd be happier if he would explain why <variantName> won't meet his needs,
and/or what he's doing about name-based merging of names that contain
markup.
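
To make the matching question above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sample names, the `sub` element with its `class` and `id` attributes, and the `name_key` helper are all hypothetical, and the normalization used is XML Canonicalization (C14N 2.0) from Python's standard library, swapped in for a parser-output standard like RAST. It shows that a plain string match fails on two serializations of the same markup, that canonicalization repairs attribute order, and that whitespace in the content still needs a rule of its own.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical base names carrying embedded markup (water, as a formula).
# a and b differ only in attribute order and whitespace inside the start tag.
a = '<baseNameString>H<sub class="idx" id="s1">2</sub>O</baseNameString>'
b = '<baseNameString>H<sub id="s1"  class="idx" >2</sub>O</baseNameString>'
# c adds line breaks and indentation inside the element content.
c = '<baseNameString>\n  H<sub class="idx" id="s1">2</sub>O\n</baseNameString>'


def name_key(xml_text: str) -> str:
    """Hypothetical merge key: the C14N 2.0 form of the name's markup."""
    return canonicalize(xml_text)


# A simple string match treats a and b as different names.
print(a == b)
# Canonicalization sorts attributes and normalizes the tag's inner spacing.
print(name_key(a) == name_key(b))
# Whitespace in character data is preserved, so c still does not match a.
print(name_key(c) == name_key(a))
```

Whether the whitespace in `c` counts as significant is exactly the kind of rule that would have to be written down before markup-bearing names could take part in name-based merging.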

I take Eric's point, which is why I can reluctantly accept foreign (but not
XTM) markup in <resourceData>.  But I think it's a serious mistake to allow
markup in <baseNameString>, at least if non-URI-based name-based merging is
switched on.

Steven R. Newcomb, Consultant
srn@coolheads.com

Coolheads Consulting
http://www.coolheads.com

direct: +1 540 951 9773
main:   +1 540 951 9774
fax:    +1 540 951 9775

208 Highview Drive
Blacksburg, Virginia 24060 USA