[sc34wg3] Occurrences in the data model

Patrick Durusau sc34wg3@isotopicmaps.org
Wed, 07 Jan 2004 05:50:59 -0500


Peter Brown wrote:
> ----- Original Message -----
> From: Patrick Durusau <mailto:Patrick.Durusau@sbl-site.org>
> To: sc34wg3 <mailto:sc34wg3@isotopicmaps.org>
> Sent: Monday, December 22, 2003 1:20 PM
> Subject: [sc34wg3] Occurrences in the data model
> <snip/>
> The more interesting case arises with location information,
> such as Right Ascension/Declination in astronomy, longitude and
> latitude in GIS systems (and targeting systems), where finding all the
> occurrences that share a point on a particular axis could well be important.
> Note that I don't think making coordinates topics would solve the
> problem as given the fine grained nature of coordinate systems there
> would be a proliferation of topics for any relatively sophisticated
> system of coordinates. Not to mention that coordinates are commonly
> thought to be characteristics of objects/locations and not subjects in
> their own right.
> Is there some conceptual reason for this treatment of occurrences in
> the data model?
> <PB>
> On the contrary, I think there is a compelling argument that such 
> treatment should be explicitly excluded: they are indiscrete (or 
> analogue) variables and can never be defined with a discrete value: in 
> taxonomy work, it is the phenomenon of spectrum values: as you say, it 
> depends on the granularity to which you are prepared to take a 
> particular classification.

You raise an interesting point (which I address below) but I am not 
certain that it is relevant to the question I was originally posing.

My original question concerned the failure of the data model to merge 
two occurrences that have different parents but the same value. That 
does not pose the problem of a spectrum of possible values.
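
A rough sketch of the behavior I have in mind, in Python rather than in 
any topic map syntax. The Occurrence class, its fields, and the 
occurrences_by_value helper are all made up for illustration; none of 
them come from the data model itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    parent: str    # hypothetical: name of the topic that owns the occurrence
    occ_type: str  # hypothetical: what kind of value this is
    value: str


def occurrences_by_value(occurrences):
    """Group occurrences by (type, value), ignoring which topic is the parent."""
    groups = defaultdict(list)
    for occ in occurrences:
        groups[(occ.occ_type, occ.value)].append(occ)
    return dict(groups)


occs = [
    Occurrence("star-a", "right-ascension", "05h 55m"),
    Occurrence("star-b", "right-ascension", "05h 55m"),
    Occurrence("star-c", "right-ascension", "06h 45m"),
]

shared = occurrences_by_value(occs)
# Parents whose occurrences carry the same value on this axis.
same_ra = sorted(o.parent for o in shared[("right-ascension", "05h 55m")])
```

Here same_ra holds the two parents that share a value. The values are 
compared only for exact equality, so no spectrum of values comes into it.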

> There is also a principle of economy to be considered: in a very random 
> or unevenly distributed set of values, the "discrimination" offered may 
> vary wildly: whereas for one part of the spectrum, values of 21,22, 23 
> might be enough to discriminate between different occurrences; at 
> another part you might need values as fine as 1.113, 1.114, 1.115, etc.
> You can never know in advance how to model indiscrete values in a 
> discrete manner...and it indeed makes assertions about equivalence well 
> nigh impossible; do two people with ages of "23", "23 years and 1 day", 
> and "22 years 11 months and 27 days" all have the same age?

Hmmm, don't think I understand your point:

"know in advance how to model indiscrete values in a discrete manner"

Can you say a bit more about that? What I think I am missing is what you 
mean by "model."

To illustrate what I may be missing, say I am running the voter 
registration office. In order to register to vote, voters must be of a 
certain age, say 23. We know that "in fact" some people will come in to 
register on their 23rd birthday and others will come in after it has 
passed. Despite the differences in their ages, I simply record 23 (or 
whatever the voter's age is) in my records. Close enough for government 
work, as they say in the US. ;-)

Doesn't that mean I have chosen the granularity at which I will be 
modeling age, despite common knowledge that it is fairly imprecise?

Or did you mean something else?

(BTW, for voting purposes in the US, "23" and "23 years and 1 day" are 
the same age, but "22 years 11 months and 27 days" is not.)
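
The choice of granularity can be sketched in a few lines of Python. The 
Age type and the recorded_age function are my own inventions for the 
example, and truncating to whole years is simply the rule I assumed the 
registration office uses.

```python
from typing import NamedTuple


class Age(NamedTuple):
    years: int
    months: int = 0
    days: int = 0


def recorded_age(age: Age) -> int:
    """Truncate to whole years: the granularity the office chose to record."""
    return age.years


def same_recorded_age(a: Age, b: Age) -> bool:
    """Two ages are 'the same' only relative to the chosen granularity."""
    return recorded_age(a) == recorded_age(b)


exactly_23 = Age(23)
a_day_over = Age(23, 0, 1)
just_under = Age(22, 11, 27)

first_pair_equal = same_recorded_age(exactly_23, a_day_over)
second_pair_equal = same_recorded_age(exactly_23, just_under)
```

The equivalence test is discrete even though the underlying quantity is 
not. A different office could swap in a different recorded_age and get 
different answers from the same data.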

> Would/Could it be useful to know whether the concept - however 
> formulated - of scope helps here: can we state that we are interested in 
> "documents with version numbers between 1 and 3"; or "items in the night 
> sky between RA/DEC coordinates xy and x'y' " ? It seems that all the 
> debate about facets/scope has looked (correctly) at the issue of an 
> "axis" of interest, but not this problem of discrete and indiscrete 
> values/ranges.

Actually I think it should be possible to specify merging rules that 
deal with information along an axis of measurement. If the subject of 
a topic is anything I want to speak about, why can't I have a topic that 
is all the parts (as represented by topics) that weigh less than 10 
kilos? And expect to find all the information related to that subject at 
that one topic?

Of course, this presumes that one does not have fixed, one size fits all 
merging rules that operate in all cases, forcing authors to work around 
whatever rules have been chosen as standard.

Not to mention that the merging rules I need may change depending upon 
the nature of the information I need.
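
To make the parts example concrete, here is a sketch in Python where the 
merging rule is passed in as a function instead of being fixed in 
advance. PartTopic, gather, and the sample data are hypothetical names 
and values of my own, not anything defined by the data model.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class PartTopic:
    name: str
    weight_kg: float
    notes: List[str] = field(default_factory=list)


def gather(subject: str,
           topics: Iterable[PartTopic],
           rule: Callable[[PartTopic], bool]) -> dict:
    """Collect, under one subject, the information from every topic the rule selects."""
    members = [t for t in topics if rule(t)]
    return {
        "subject": subject,
        "members": [t.name for t in members],
        "notes": [n for t in members for n in t.notes],
    }


parts = [
    PartTopic("bolt", 0.05, ["stocked in bin 4"]),
    PartTopic("bracket", 2.5, ["two suppliers"]),
    PartTopic("engine block", 140.0, ["ships by freight"]),
]

# The rule is supplied by the author, so it can change with the need.
light_parts = gather("parts under 10 kilos", parts, lambda t: t.weight_kg < 10)
heavy_parts = gather("parts of 10 kilos or more", parts, lambda t: t.weight_kg >= 10)
```

The same set of topics yields different gathered topics depending on the 
rule handed to gather, which is the flexibility I am asking about.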

Hope you are having a great day!


> All the best...
> Peter
> </PB>

Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model

Topic Maps: Human, not artificial, intelligence at work!