[sc34wg3] Almost arbitrary markup in resourceData
19 Nov 2003 16:02:08 -0500
On Wed, 2003-11-19 at 11:41, Mason, James David (MXM) wrote:
> See Below.
> Jim Mason
> Lars Marius:
> What does make me worried about <baseNameString> is two
> 1) our rationale for allowing XML in <resourceData> is that
> equivalent to <resourceRef>, but <baseNameString> really
> and topic names have no [locator] property,
> 2) base names are crucial to all kinds of user interfaces,
> they provide labels for the topics, and without those you
> really have much of a UI. We can have resources as names
> topics (through variants), but having base names as
> ensures that there's always *something* that can be displayed as
> a mere string.
> If we allow markup in here that goes out the door. You may have
> to strip (or, even worse, render) XML markup to be able to label
> your topics.
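
For illustration, the "strip" option could look roughly like the sketch below. It assumes the base name arrives as a well-formed <baseNameString> fragment; the function name plain_label and the sample markup are hypothetical, not part of any proposal in this thread.

```python
import xml.etree.ElementTree as ET


def plain_label(base_name_string_xml: str) -> str:
    """Return the text of a <baseNameString> fragment with all markup removed."""
    element = ET.fromstring(base_name_string_xml)
    # itertext() yields the text of the element and its descendants in document order.
    text = "".join(element.itertext())
    # Collapse runs of whitespace so the label is a single clean line.
    return " ".join(text.split())


# Hypothetical base name containing subscript and emphasis markup.
label = plain_label(
    "<baseNameString>H<sub>2</sub>O <emphasis>heavy</emphasis> water</baseNameString>"
)
print(label)
```

Stripping loses exactly the information the markup carried (here, that the 2 is a subscript), which is the cost Lars Marius is pointing at.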
> I'd be interested to hear what people think of this. Should we
> change our minds and only do this for <resourceData>?
> It was initially baseNameString that I was most interested in
> corrupting. resourceData is of less importance to me.
> It may be laziness/ignorance on my part, but baseNameString is what I
> choose to display to my users. I see that name (and indeed all names)
> primarily for human consumption. Variants, resourceData, are of less
> interest to me precisely because baseNameString is what drives my UI.
> It's true that I'm working in an environment where I have a lot of
> control, that I never expect to receive or transmit an arbitrary TM,
> so I don't need the fallback of somewhere having a string that's
> guaranteed to be raw text.
> As I've commented elsewhere in this thread, I don't believe in
> arbitrary interchange. I expect there to be an at least implicit DTD
> for all my data. So there's never really "almost arbitrary markup" for
> me, though the markup may come as a surprise to the topic map engine.
> I never believed in name-based merging because, as a linguist, I'm all
> too aware of the variability and fragility of names.
> Yes, I need to render XML. For me, the primary problem is that there
> are things I need to display that I can't display without additional
> markup. I sometimes need to display more than one paragraph. I need
> subscripts and superscripts. I need (Oh Horror!) the dreaded
> <emphasis> tag. I need things that will require XSLT processing, such
> as generating labels like "Note:". I don't want the topic map engine to
> mess with that stuff, just pass it through to where the user agent can
> do whatever it takes to render the stuff.
> Maybe I'm pushing topic maps too hard, but the projects I have in my
> shop now generally involve creating portals to collections of
> information, and the users want the information displayed in the
> portal to look like the information it's the gateway to. My impression
> is that I'm not alone in this, that Eric, for one, has similar
> Lars Marius (and Steve N.):
> | So, if there's markup in a <baseNameString>, and name-based merging is
> | switched on, on what basis will name matching be done?
> The equivalence rule for topic name items. We haven't defined
> it yet in the presence of markup (will be part of the XML
> representation proposal), but I think we'll have to base it on
> Canonical XML. (From what I gathered from Dan Connolly, that
> seems to be what the RDF folks will do, and for the same
> reason I propose it: lack of alternatives.)
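
A minimal sketch of what name matching based on canonicalization might look like, assuming each name is available as a well-formed XML fragment. It uses the canonicalize function in Python's standard library, which implements C14N 2.0 rather than the Canonical XML 1.0 that was current at the time of this thread; the function name names_match is hypothetical, and the equivalence rule itself is, as noted above, not yet defined.

```python
from xml.etree.ElementTree import canonicalize


def names_match(name_a: str, name_b: str) -> bool:
    """Compare two name fragments by their canonical XML serializations."""
    # Canonicalization normalizes attribute order, attribute quoting and
    # empty-element syntax, so purely lexical differences do not matter.
    return canonicalize(name_a) == canonicalize(name_b)


# Same content, different lexical form: attribute order, quotes, empty tag.
a = '<baseNameString>x<sub a="1" b="2"/></baseNameString>'
b = "<baseNameString>x<sub b='2' a='1'></sub></baseNameString>"
print(names_match(a, b))

# Different markup, same visible characters: treated as different names.
c = "<baseNameString>H<sub>2</sub>O</baseNameString>"
d = "<baseNameString>H2O</baseNameString>"
print(names_match(c, d))
```

Under this kind of rule, two names that render identically can still fail to match, so markup differences alone would be enough to prevent a merge.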
> As I said, I never liked name-based merging. I'd much rather have
> merging based on some formalized subject indicator. In one of the maps
> I'm currently working on, name-based merging would be absolutely
> disastrous. I'm mapping our products and their parts. Several of our
> products have parts called "apple", but those parts, though named
> identically, are wildly different things. (Yes, I know, I could
> qualify the apples, and indeed I do scope the names according to the
> parent product. But my TMs are generated by scripts from data that I
> don't control, and I've had to go back and generate scopes for names,
> scopes that aren't in the source data, just to protect myself.)
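
The "apple" problem can be sketched as follows, assuming name-based merging treats two topics as the same subject when they share a name string in the same scope. The topic identifiers, product names and the function merge_groups are hypothetical and exist only to show the effect of adding scopes.

```python
from collections import defaultdict


def merge_groups(names):
    """Return the sets of topic ids that name-based merging would merge.

    `names` is an iterable of (topic_id, name_string, scope) tuples, where
    scope is an iterable of scoping topic ids (empty for the unconstrained scope).
    """
    groups = defaultdict(set)
    for topic_id, name, scope in names:
        # Topics collide only when both the string and the scope are identical.
        groups[(name, frozenset(scope))].add(topic_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Unscoped: both "apple" parts would be merged into one topic.
unscoped = [("part-1", "apple", ()), ("part-2", "apple", ())]
print(merge_groups(unscoped))

# Scoped by parent product: the names no longer collide.
scoped = [("part-1", "apple", ("product-a",)), ("part-2", "apple", ("product-b",))]
print(merge_groups(scoped))
```

Generating the scope from the parent product is what keeps the two parts apart; without it, the only distinguishing information lives outside the name.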
> LMG and SRN:
> | I don't like it when things get more complex. There's gotta be a damn
> | good reason. Jim says he has one, and I take him at his word, but I'd
> | be happier if he would explain why <variantName> won't meet his needs,
> | [...]
> I'd very much like to hear this too. Jim?
> This is all dreadfully complex anyway. I hate making the topic map
> engine have to do any more work than is necessary, but we can't assume
> that topic maps live in splendid isolation from the data they're
> mapping. Real data is messy. I'm spending most of my time now trying
> to unscramble other folks' data to the point where I can reliably run
> XSLT scripts on it to generate TMs that work. I'm about to increase
> the number of system parts in the TM I mentioned above by about an
> order of magnitude. I'm getting this data from multiple sources, some
> of them older than a number of members of SC34. It's really messy. My
> other project, the one I've talked about at Extreme, is an interface
> to a document-management system. When you start talking about
> documents, things get really messy (after all, that's why most of us
> work in SGML/XML and not in HTML). What more can I say? I can't talk
> about TMs just out in TM land. The map is not the territory, but it
> can't be separated from the territory, either. I'm a publisher, not an
> abstract topologist.