[sc34wg3] Topic Equality test

Wed, 22 Jun 2005 06:03:57 -0400

Lars,

I think we have a numbering inconsistency, so I have inserted numbers into my 
original post:

Lars Marius Garshol wrote:

>* Patrick Durusau
>| 
>| Equality rule: Two topic items are equal if they have:
>| 
>  
>
1.

>| - at least one equal string in their [subject identifiers] properties,
>|
>
2.

> 
>| - at least one equal string in their [source locators] properties,
>  
>
3.

>| 
>| - at least one equal string in their [subject locators] properties,
>  
>
4.

>| 
>| - an equal string in the [subject identifiers] property of the one
>|   topic item and the [source locators] property of the other, or
>  
>
The test here being whether a string from #1 appears in #2.

>| 
>| - the same information item in their [reified] property.
>| 
>| (Note, although not explicitly stated, I am assuming "or" applies to
>| all these equality conditions.)
>
>It does. The "or" appears between the last and the second-to-last
>condition. Isn't that sufficient?
>
>  
>
I think what made me pause when reading it closely was the "at least" 
language. When included on each condition, it seems to imply that each 
condition is required to be met.

Contrast:

Equality rule: Two topic items are equal if they have:

- at least one equal string in their [subject identifiers] properties,

- at least one equal string in their [source locators] properties,
...

(current draft)

With:

Equality rule: Two topic items are equal if they have at least:

- one equal string in their [subject identifiers] properties,

- one equal string in their [source locators] properties,
...
..., or, 

Which seems to make it clearer that "at least" one of the following conditions has to be met.

Actually to make it completely explicit, think about the following:

Equality rule: Two topic items are equal if one or more of the following conditions are met:

(then delete the "at least" language from the following clauses)

I think that captures the case where there is more than one equal string as well. To make that completely explicit, think about "one or more equal strings" where I have suggested "one equal string."
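
As a minimal sketch of the "one or more of the following conditions" reading, the rule could be expressed as below. The TopicItem class, its field names, and the topics_equal function are hypothetical illustrations, not anything defined by the draft; the sketch also assumes that an empty [reified] property (None here) never counts as "the same information item."

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TopicItem:
    # Hypothetical stand-in for a topic item; property names follow the draft.
    subject_identifiers: set = field(default_factory=set)
    source_locators: set = field(default_factory=set)
    subject_locators: set = field(default_factory=set)
    reified: Optional[object] = None


def topics_equal(a: TopicItem, b: TopicItem) -> bool:
    """Return True if one or more of the five conditions is met."""
    conditions = (
        # 1. one or more equal strings in [subject identifiers]
        bool(a.subject_identifiers & b.subject_identifiers),
        # 2. one or more equal strings in [source locators]
        bool(a.source_locators & b.source_locators),
        # 3. one or more equal strings in [subject locators]
        bool(a.subject_locators & b.subject_locators),
        # 4. a subject identifier of one topic equals a source locator of the other
        bool(a.subject_identifiers & b.source_locators)
        or bool(b.subject_identifiers & a.source_locators),
        # 5. the same information item in [reified] (assumed: None never matches)
        a.reified is not None and a.reified is b.reified,
    )
    return any(conditions)
```

Writing the rule with any() makes the "or" explicit, and the set intersections cover the "one or more equal strings" case without any "at least" wording on the individual clauses.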

>| Shouldn't the fourth item, "an equal string in the [subject
>| identifiers] property..." read:
>| 
>| "at least one equal string in the [subject identifiers] property..."
>
>That certainly is the intent, and I guess it is also what the text
>should say.
> 
>| Reasoning that the test is:
>| 
>| topic one {{subject identifiers}{source locators}} intersection topic
>| two {{subject identifiers}{source locators}} does not equal the empty
>| set?
>
>This is somewhat wider than test 3, which is
>
>  topic1 {subject identifiers} intersect topic2 {source locators} != Ø or
>  topic2 {subject identifiers} intersect topic1 {source locators} != Ø
>
>  
>
Don't you mean test 4?

Let's see:

Test 1:

topic1 {subject identifiers} intersect topic2 {subject identifiers} != Ø

Test 2:

topic1 {source locators} intersect topic2 {source locators} != Ø

Test 4 (as you report):

topic1 {subject identifiers} intersect topic2 {source locators} != Ø or

topic2 {subject identifiers} intersect topic1 {source locators} != Ø

The test I was suggesting is broader than #4, but that is because it 
also captures #1 and #2 in a single operation.

That is if you have:

topic1 {{subject identifiers}{source locators}} intersection topic2 {{subject identifiers}{source locators}} != Ø

What are the possible intersections?

a. subject identifiers (test 1)

b. source locators (test 2)

c. subject identifier of one topic equal to a source locator of the other (test 4)

Since the test I was suggesting also covers a and b (which are not covered by #4 in the draft), yes, it is broader.
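
A rough way to check that the single combined test is logically equivalent to tests 1, 2, and 4 taken together is to compare the two over every combination of small sets. This is only an illustrative sketch: the function names and the three-string universe are assumptions made for the example, and si/sl stand for the [subject identifiers] and [source locators] of topic 1 and topic 2.

```python
from itertools import combinations, product


def separate_tests(si1, sl1, si2, sl2):
    """Tests 1, 2, and 4 evaluated separately and joined with 'or'."""
    test1 = bool(si1 & si2)
    test2 = bool(sl1 & sl2)
    test4 = bool(si1 & sl2) or bool(si2 & sl1)
    return test1 or test2 or test4


def combined_test(si1, sl1, si2, sl2):
    """Single test: union of both properties per topic, then intersect."""
    return bool((si1 | sl1) & (si2 | sl2))


def all_subsets(universe):
    """Every subset of the universe, as frozensets."""
    return [
        frozenset(combo)
        for size in range(len(universe) + 1)
        for combo in combinations(universe, size)
    ]


def tests_agree(universe=("x", "y", "z")):
    """Exhaustively compare both formulations over all subset combinations."""
    subsets = all_subsets(universe)
    return all(
        separate_tests(*sets) == combined_test(*sets)
        for sets in product(subsets, repeat=4)
    )
```

The agreement follows from intersection distributing over union: (SI1 ∪ SL1) ∩ (SI2 ∪ SL2) expands into exactly the four pairwise intersections that tests 1, 2, and 4 examine. Test 3 and the [reified] condition are outside both formulations and still need their own checks.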

>| That is to say for merging purposes it really doesn't matter which
>| test, 1, 2, or 4 triggered the merge, it is enough that the merging
>| was triggered?
>
>Not really. Test 3 is a separate test, but I guess what you are saying
>is that it can be generalized to also cover 1 and 2. Test 4 is
>entirely separate, however.
>
>  
>
Sorry, I think we are off on the numbering. Test 3, as I read it, is:

"at least one equal string in their [subject locators] properties,"

I did not intend to suggest a test broad enough to encompass that one.

>Tests 1-3 are written separately simply because that seems more
>logical, given that each test corresponds to a certain "merge test".
>
>  
>
Sure. Depending on how the "at least" language is resolved, making the 
4th condition the same as the others would be helpful.

I was just thinking that, from an implementation standpoint, I would 
rather do one test for merging (covering conditions 1, 2, and 4; the 
others still have to be tested separately), assuming it is logically 
equivalent to the separate tests.

On that point, I have no objection to tests 1, 2, and 4 being 
written separately, but I would like it to be clear that it is an 
"or" test. My point about combining the tests was more a question of 
understanding/implementation than of the standard per se.

Hope you are having a great day!

Patrick

-- 
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work!