[sc34wg3] Almost arbitrary markup in resourceData

Tue, 18 Nov 2003 15:03:11 +0100

Well, since Lars Marius seems to agree with me about 90% of the way, I
think it is my duty to argue a bit about the remaining 10%.

> * Bernard Vatant
> |
> | The latter is the fundamental question. My first-cut answer would
> | be, along the lines of Graham's one ("no big deal"), that it just
> | *can't mean anything* from the viewpoint of a TM application.

* Lars Marius Garshol
> This is my view as well, and this is the reason why I feel we should
> exclude markup in the XTM namespace from appearing within
> <resourceData>. It's not clear from what you write whether you agree
> with that or not.

Oh, yes, of course, I do agree. I was talking implicitly about exotic
markup. Using XTM namespace markup inside <resourceData> is so weird that
I had never imagined one could do it - although I generally tend to
imagine too much, as you know :)

> | Allowing embedded markup could open the door to lazy
> | modeling, meaning by that it might often be the case (and well, if
> | you adopt the Reference Model philosophy, it certainly *is* always
> | the case) that semantics captured in the embedded markup could have
> | been expressed as proper TM information at a finest level of
> | granularity. And the specification prose should recommend to do so
> | whenever possible.
>
> This I agree with, and this is something that's bothered me a bit
> about allowing embedded XML. We may well see people representing stuff
> with elements and attributes that they really should be representing
> with topics and associations. Or, even worse, we may see them doing
> both, so that they have (horrors!) redundant data.
>
> Whether we can, and whether we should, do anything about that I am not
> sure. So far I've leaned towards just leaving it to "user education",
> but I could be convinced that that's wrong.

I think that does not belong in the specification, but should go into
some kind of best-practices tutorial - whoever is in charge of that. It might
be included in version 1.1 of the "TAO of Topic Maps", by You-Know-Whom :))

> | Example of a "lazy occurrence" of type "PostalAddress" for topic
> | "John Smith".
> |
> | <resourceData>
> | 	<street>Main Street</street>
> | 	<number>23</number>
> | 	<city>Nothing Gulch</city>
> | </resourceData>
> |
> | It's clear that the lazy TM author could (should?) have defined
> | "PostalAddress" as a topic class, then "street", "number" and "city"
> | as occurrence types, and linked "John Smith" to "John Smith's
> | address" using a "PersonalAddress" association.
>
> Wow. I would have argued the opposite, actually, that this is just a
> piece of data, and that it shouldn't be topic mapped, because you'd be
> unlikely to want to ever say anything more about the address than what
> you do in the example above ... <snip/>

Well, it depends. I am currently working on a use case involving linguistic
tools, where assertions need to be managed at such a fine-grained level. But
I don't want to quibble over the details of the example. Sure, it can be
modeled in different ways, but all of them using a proper TM representation;
that is what I was getting at, and we agree on that.
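
To make "proper TM representation" concrete, here is a minimal sketch of
the fine-grained modeling described in the quoted example, in XTM 1.0
syntax. It is only one of several possible ways to model it. The topic ids
and the two role types ("resident", "address") are invented for
illustration; only "PostalAddress", "street", "number", "city" and
"PersonalAddress" come from the example itself.

```xml
<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- Typing topics, declared as bare stubs to keep the sketch short -->
  <topic id="PostalAddress"/>
  <topic id="street"/>
  <topic id="number"/>
  <topic id="city"/>
  <topic id="PersonalAddress"/>
  <!-- Hypothetical role types, not part of the original example -->
  <topic id="resident"/>
  <topic id="address"/>

  <topic id="john-smith">
    <baseName><baseNameString>John Smith</baseNameString></baseName>
  </topic>

  <!-- The address becomes a topic of class PostalAddress; each former
       embedded element becomes a typed occurrence holding plain text -->
  <topic id="john-smith-address">
    <instanceOf><topicRef xlink:href="#PostalAddress"/></instanceOf>
    <occurrence>
      <instanceOf><topicRef xlink:href="#street"/></instanceOf>
      <resourceData>Main Street</resourceData>
    </occurrence>
    <occurrence>
      <instanceOf><topicRef xlink:href="#number"/></instanceOf>
      <resourceData>23</resourceData>
    </occurrence>
    <occurrence>
      <instanceOf><topicRef xlink:href="#city"/></instanceOf>
      <resourceData>Nothing Gulch</resourceData>
    </occurrence>
  </topic>

  <!-- The link between the person and the address is an association -->
  <association>
    <instanceOf><topicRef xlink:href="#PersonalAddress"/></instanceOf>
    <member>
      <roleSpec><topicRef xlink:href="#resident"/></roleSpec>
      <topicRef xlink:href="#john-smith"/>
    </member>
    <member>
      <roleSpec><topicRef xlink:href="#address"/></roleSpec>
      <topicRef xlink:href="#john-smith-address"/>
    </member>
  </association>
</topicMap>
```

Compared with the three-line embedded version, this is clearly more
verbose, which is exactly the trade-off under discussion: each street,
number or city assertion can now be typed, scoped and merged like any other
topic map information.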

> ... it seems that while embedded XML does
> provide one more way of blowing your foot off, there are several
> almost identical ones there already.

Sure. Let's not make our life more complicated than it is already.

> | I'm not pretending that any embedded markup cases can practically
> | and easily boil down to that kind of reduction, but my ground
> | experience, in Mondeca real world implementations, so far, is that
> | even in cases where representation of fine-grained information
> | embedded in existing resources has been needed, a workaround to
> | embedded markup has been found.
>
> I think that is true. The question is how bad we consider the
> workarounds to be, and so far the general opinion seems to be that we
> consider them bad enough that we want to get rid of them.

I'm not 100% convinced yet. I would need to hear more about the rationale
behind the use cases from Eric, Jim, Martin ... to be sure their trade-off
was worth it.

Bernard