[sc34wg3] Identifying and comparing subjects (and a possible extension to \tau )

Patrick Durusau sc34wg3@isotopicmaps.org
Mon, 02 Aug 2004 12:54:12 -0400


Apologies for the delayed response!

I am not sure that we need an additional model for merging.

From my reading of the Tau model, which is probably imperfect and/or 
incomplete, I don't think it compels the same merging model that is in 
use in the TMDM/TAO model.

My impression is that the Tau model allows a definition of the identity 
of subject proxies that supports the SIP model used by the TMRM.

Recall that the SIP of a subject proxy in the TMRM is a complex of 
components that identify the subject. That is, one can achieve whatever 
granularity of subject identity a particular environment requires.

If one has two separate topic map instances that follow that model, 
merging can be based on any combination of the components in the 
respective SIPs.
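
As a rough illustration of that last point, here is a minimal sketch in 
Python. Everything in it is an assumption made for illustration: the 
dictionary layout for a proxy, the merge_on name, and the sample 
"term"/"vocabulary" components are hypothetical, and nothing here is 
defined by the TMRM or the Tau model. It only shows that the choice of 
SIP components decides which proxies merge.

```python
from itertools import chain


def merge_on(proxies_a, proxies_b, keys):
    """Merge proxies that agree on every SIP component named in `keys`.

    Hypothetical layout: each proxy is a dict with a "sip" dict
    (component name -> value) and a "properties" set.
    """
    merged = {}
    for index, proxy in enumerate(chain(proxies_a, proxies_b)):
        sip = proxy["sip"]
        if all(key in sip for key in keys):
            # Identity is the tuple of values for the chosen components.
            identity = tuple(sip[key] for key in keys)
        else:
            # A proxy lacking a chosen component is never merged.
            identity = ("unmatched", index)
        entry = merged.setdefault(identity, {"sip": {}, "properties": set()})
        for name, value in sip.items():
            # Keep every value seen, so differing components are not lost.
            entry["sip"].setdefault(name, set()).add(value)
        entry["properties"] |= proxy["properties"]
    return list(merged.values())


# Two maps built on different term sets (sample data, assumed).
map_a = [{"sip": {"term": "assessment", "vocabulary": "medical"},
          "properties": {"source:a"}}]
map_b = [{"sip": {"term": "assessment", "vocabulary": "social-work"},
          "properties": {"source:b"}}]

coarse = merge_on(map_a, map_b, ("term",))             # one merged proxy
fine = merge_on(map_a, map_b, ("term", "vocabulary"))  # two separate proxies
```

Merging on "term" alone collapses the two proxies into one; adding 
"vocabulary" to the chosen components keeps them apart, which is the 
kind of control over granularity I have in mind.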

Hope you are having a great day!


Ann Wrightson wrote:
> Small rant and suggested extension to \tau model
> ================================================
> Identifying and comparing subjects via proxies is not in general
> tractable/computable. So, what we are looking for is a reasonable practical
> strategy. That the TMDM/TAO way works in practice for a growing number of
> real situations is important (and is BTW a v. legitimate basis for LMG's
> position). 
> Subjects are neither atomic nor stable. A classic example is colours:
> although a dictionary would equate "grey" with "llwyd", and "red" with
> "coch" (Welsh), there are things I would say were "llwyd" or "coch" in Welsh
> that I would say were brown in English. Moreover, a younger Welsh-speaker
> would be likely to make more use of "brown" as a loan-word from English than
> I do, though I would still expect their boundary "brown/coch" to be
> different from the red/brown boundary of an English speaker. This example is
> particularly clear and simple, however this kind of thing happens all over
> (witness our own group discussions!).
> To combat this, you might put in place a more effective/stable way of
> identifying/comparing subjects, by using proxies designed for the job, such
> as a term set derived from a controlled vocabulary. This works v. well for
> everyone using that term set. However, the picture changes once you have
> more than one term set (eg medical and social-work). In this situation, the
> two goals of mapping between the term sets and preserving the information of
> each source actually conflict. That is, if you set up a correspondence
> between two different term sets, then use that correspondence to merge two
> topic maps based on the respective term sets, then you generally lose
> information in real terms even though there is an argument that the result
> topic map "contains" everything the source topic maps contained. The
> severity of this problem increases in proportion to the precision of the
> term sets used, because that very precision is lost, and is valued by the
> information users.
> This has been (since 2000 ;-) my main argument against merging always being
> an a+b=c kind of operation that loses the fine structure of a & b. I believe
> that there should also be a different operation that both preserves the
> original maps and defines a view-as-if-merged-according-to-this-mapping. 
> I'm realizing as I write that this suggests a different formal account of
> merging - call it merge2 - eg (in \tau) where the two maps being merged have
> different sets of names (curly N1 and N2) in the universe (curly I) and
> instead of merge2 being a set union, it is a function where the desired
> property of "preserving information" from the source maps to the target maps
> becomes a morphism property of the function.
> Ann W.
> _______________________________________________
> sc34wg3 mailing list
> sc34wg3@isotopicmaps.org
> http://www.isotopicmaps.org/mailman/listinfo/sc34wg3

Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model

Topic Maps: Human, not artificial, intelligence at work!