[sc34wg3] Identifying and comparing subjects (and a possible
extension to \tau )
Mon, 02 Aug 2004 12:54:12 -0400
Apologies for the delayed response!
I am not sure that we need an additional model for merging.
From my reading of the Tau model, which is probably imperfect and/or
incomplete, I don't think it compels the same merging model that is in
use in the TMDM/TAO model.
My impression is that the Tau model allows a definition of the identity
of subject proxies that supports the SIP model used by the TMRM.
Recall that the SIP of a subject proxy in the TMRM is a complex of
components that identify the subject. In other words, one can achieve
whatever granularity of subject identity a particular environment
requires.
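To make the granularity point concrete, here is a minimal sketch in Python.
It is only an illustration: the Proxy class, the component names ("term",
"language", "source") and the helper functions are hypothetical and are not
taken from the TMRM or the Tau model. It shows two proxies counting as the
same subject under a coarse choice of SIP components and as different
subjects under a finer one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proxy:
    """Hypothetical subject proxy: a label plus its SIP components."""
    label: str
    sip: frozenset  # frozenset of (component_name, value) pairs


def make_proxy(label, **components):
    # Store the components as an immutable set of name/value pairs.
    return Proxy(label, frozenset(components.items()))


def identity_key(proxy, component_names):
    """Project the SIP onto the chosen components.

    Returns None when the proxy lacks any of the requested components,
    so that it cannot be identified at that granularity.
    """
    sip = dict(proxy.sip)
    if not all(name in sip for name in component_names):
        return None
    return tuple((name, sip[name]) for name in sorted(component_names))


def same_subject(a, b, component_names):
    """True when both proxies carry equal values for every chosen component."""
    key_a = identity_key(a, component_names)
    return key_a is not None and key_a == identity_key(b, component_names)


# Hypothetical proxies, loosely based on the colour example quoted below.
p1 = make_proxy("speaker's usage", term="coch", language="cy")
p2 = make_proxy("dictionary entry", term="coch", language="cy",
                source="dictionary")
p3 = make_proxy("English usage", term="red", language="en")

coarse = ["term", "language"]
fine = ["term", "language", "source"]

print(same_subject(p1, p2, coarse))  # compared on term and language only
print(same_subject(p1, p2, fine))    # p1 has no "source" component
print(same_subject(p1, p3, coarse))
```

Which components take part in the comparison is a choice of the
environment, not something fixed in the proxies themselves.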
If one has two separate topic map instances that follow that model,
merging can be based on any combination of the components in the
Hope you are having a great day!
Ann Wrightson wrote:
> Small rant and suggested extension to \tau model
> Identifying and comparing subjects via proxies is not in general
> tractable/computable. So, what we are looking for is a reasonable practical
> strategy. That the TMDM/TAO way works in practice for a growing number of
> real situations is important (and is BTW a v. legitimate basis for LMG's
> Subjects are neither atomic nor stable. A classic example is colours:
> although a dictionary would equate "grey" with "llwyd", and "red" with
> "coch" (Welsh), there are things I would say were "llwyd" or "coch" in Welsh
> that I would say were brown in English. Moreover, a younger Welsh-speaker
> would be likely to make more use of "brown" as a loan-word from English than
> I do, though I would still expect their boundary "brown/coch" to be
> different from the red/brown boundary of an English speaker. This example is
> particularly clear and simple, however this kind of thing happens all over
> (witness our own group discussions!).
> To combat this, you might put in place a more effective/stable way of
> identifying/comparing subjects, by using proxies designed for the job, such
> as a term set derived from a controlled vocabulary. This works v. well for
> everyone using that term set. However, the picture changes once you have
> more than one term set (eg medical and social-work). In this situation, the
> two goals of mapping between the term sets and preserving the information of
> each source actually conflict. That is, if you set up a correspondence
> between two different term sets, then use that correspondence to merge two
> topic maps based on the respective term sets, then you generally lose
> information in real terms even though there is an argument that the result
> topic map "contains" everything the source topic maps contained. The
> severity of this problem increases in proportion to the precision of the
> term sets used, because that very precision is lost, and is valued by the
> information users.
> This has been (since 2000 ;-) my main argument against merging always being
> an a+b=c kind of operation that loses the fine structure of a & b. I believe
> that there should also be a different operation that both preserves the
> original maps and defines a view-as-if-merged-according-to-this-mapping.
> I'm realizing as I write that this suggests a different formal account of
> merging - call it merge2 - eg (in \tau) where the two maps being merged have
> different sets of names (curly N1 and N2) in the universe (curly I) and
> instead of merge2 being a set union, it is a function where the desired
> property of "preserving information" from the source maps to the target maps
> becomes a morphism property of the function.
> Ann W.
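The contrast Ann draws between an a+b=c merge and a
view-as-if-merged can be sketched roughly as follows. This is an
illustration only and does not attempt the \tau formalism: the two tiny
"maps", the term strings, the correspondence table and both function names
are hypothetical, chosen just to show where the precision of the source
term sets is lost and where it is kept.

```python
# Hypothetical source maps: topic id -> term from that map's own term set.
medical = {"m1": "myocardial infarction", "m2": "angina pectoris"}
social_work = {"s1": "heart problem"}

# Hypothetical correspondence between the two term sets.
mapping = {
    "myocardial infarction": "heart problem",
    "angina pectoris": "heart problem",
    "heart problem": "heart problem",
}


def merge_destructive(*maps, mapping):
    """a + b = c: keep one topic per mapped term and drop the source terms."""
    merged = set()
    for topic_map in maps:
        for term in topic_map.values():
            merged.add(mapping[term])
    return merged


def merged_view(*maps, mapping):
    """View-as-if-merged: group the original topics under each mapped term.

    The source maps are not modified, and every original (id, term) pair
    stays reachable through the view.
    """
    view = {}
    for topic_map in maps:
        for topic_id, term in topic_map.items():
            view.setdefault(mapping[term], []).append((topic_id, term))
    return view


flat = merge_destructive(medical, social_work, mapping=mapping)
view = merged_view(medical, social_work, mapping=mapping)

# Terms still recoverable from the view.
recovered = {term for group in view.values() for _, term in group}
originals = set(medical.values()) | set(social_work.values())

print(flat)
print(view)
print(recovered == originals)
```

In the destructive result the two medical terms are indistinguishable; in
the view they remain distinct, which is a rough stand-in for the
"preserving information" property Ann asks of merge2.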
Director of Research and Development
Society of Biblical Literature
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Topic Maps: Human, not artificial, intelligence at work!