[sc34wg3] TM Data Model issue: prop-subj-address-values

Graham Moore sc34wg3@isotopicmaps.org
Mon, 3 Nov 2003 11:39:50 +0100


OK, here's my take:

FACT: 2 different URIs can reference the same resource.

When merging topics based on, for example, subject identifiers, the
current draft states that it is an exception if the topics being merged
have different values in the subject address property. From the fact
above it follows that this is an unsafe assumption. Thus, to accommodate
this fact, this non-determinism, subject address should be a collection.
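
To make the difference concrete, here is a minimal sketch of the two merge
rules. The Topic class, the property names and the example URIs are
assumptions made up for illustration; none of this is taken from the draft
itself.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    # Hypothetical, simplified topic: only the two identity properties
    # discussed in this thread are modelled.
    subject_identifiers: set = field(default_factory=set)
    subject_addresses: set = field(default_factory=set)


class MergeError(Exception):
    """Raised by the strict rule when subject addresses differ."""


def merge_collection(a, b):
    # Collection reading: keep every address, in the same way that
    # differing subject identifiers are kept.
    return Topic(
        subject_identifiers=a.subject_identifiers | b.subject_identifiers,
        subject_addresses=a.subject_addresses | b.subject_addresses,
    )


def merge_strict(a, b):
    # Single-valued reading: differing addresses are treated as an error.
    if (a.subject_addresses and b.subject_addresses
            and a.subject_addresses != b.subject_addresses):
        raise MergeError("topics have different subject addresses")
    return merge_collection(a, b)


# Two topics that share a subject identifier but address the resource
# through different URIs (both URIs are invented for the example).
t1 = Topic({"http://psi.example.org/home"}, {"http://example.org/"})
t2 = Topic({"http://psi.example.org/home"}, {"http://example.org/index.html"})

merged = merge_collection(t1, t2)
```

With the strict rule the same pair of topics cannot be merged at all, even
if both URIs do reference the same resource.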

Let me also position things in terms of a more general principle taken
from the use and expectations of subject identifiers. There is no reason
why two subject identifiers cannot indicate the same subject via
different indicators. It is desirable to have one and only one, but that
is unlikely no matter how hard we try. But we don't say in the standard
that to have two subject indicators is an error, i.e. we err on the side
of flexibility because we can't be sure. If two topics are merged because
a name match has triggered it or some application code has triggered it,
then we don't throw an exception if the two topics have different subject
identifiers, because that's the way it works. Thus, I don't see a big
distinction between supporting a scheme we have defined (subject
indicators) and a scheme that exists (URIs) when in both cases there is a
degree of ambiguity/flexibility.

I'm sure I've changed my mind on this about five times now but I think
that having subject addresses is the way to go - for now anyway :)

gra


----------------------------------------------------------------
Graham Moore, Ontopian            moore@ontopia.net
GSM: +47 926 82 437           http://www.ontopia.net


-----Original Message-----
From: sc34wg3-admin@isotopicmaps.org [mailto:sc34wg3-admin@isotopicmaps.org] On Behalf Of Kal Ahmed
Sent: 31 October 2003 15:43
To: sc34wg3@isotopicmaps.org

On Fri, 2003-10-31 at 08:40, Geir Ove Grønmo wrote:
> * Kal Ahmed
> | On Thu, 2003-10-30 at 21:54, Geir Ove Grønmo wrote:
> | > I _might_ consider http://example.org/, http://example.org/.,
> | > http://example.org/foobar/../,
> | > http://example.org/nodes.py?id=root,
> | > http://example.org/index.htm, http://example.org/index.jsp and
> | > http://example.org/index.html all to reference the same resource
> | > even though they are different locators. Not sure, but this issue
> | > may boil down to whether or not the same _resource_ can be
> | > referenced by more than _one_ URI. This depends on our definition
> | > of what a _resource_ is.
> |
> | I think that is the key point. The definition of resource is not a
> | clear-cut thing (not even the relevant RFCs, standards, and TAG
> | pronouncements seem to match on this). Also there is the issue of
> | what is a URI - are two URIs equivalent if they resolve to the same
> | resource? And then you get into a circular trap...
>
> I found the following message, which should be of interest in this
> discussion:
>
> http://rdfweb.org/pipermail/rdfweb-dev/2003-August/011820.html
>
> RFC 2396 (URIs) does seem to indicate that different URIs can
> reference the same resource.
>
> | > | If you were to allow multiple subject locators, you would not
> | > | only allow the arguably correct case of two locators which
> | > | return the same resource, but also a whole raft of incorrect
> | > | cases where the two locators return different resources.
> | > | [subject locator] is the lesser of these two evils.
> | >
> | > If it is incorrect -- then it is the _author's_ fault. No more, no
> | > less. If it is incorrect - that's a human being's fault. Shit in -
> | > shit out etc. That's life.
> |
> | No, consider mirrored sites - site A is mirrored by site B, such
> | that each URL x on A is mirrored to URL x' on B. Most of the time a
> | URL x' resolves to exactly the same sequence of content bytes as
> | URL x. But when the resource at x is updated, there is a lag before
> | x' is synchronised again. So a topic with x and x' as subject
> | indicators sometimes represents one resource and sometimes
> | represents two resources.
>
> Is inter-site mirroring any different from intra-site mirroring? ;)
>
> | > Why would we not want to trust topic maps authors?
> |
> | It's not a question of trust, it's a question of the model that we
> | have for URIs, resources and subjects and the particular set of
> | heuristics that we choose. I feel that the original heuristics
> | encoded in XTM 1.0 are right - one locator, one resource, one
> | subject.
>
> I see this as a matter of whether we should let authors express the
> fact that a subject that is an information resource can have more
> than a single locator (e.g. URI) referencing it, or whether we should
> force the topic map processor to choke if it so happened that two
> different authors were using synonymous locators. I have difficulties
> seeing the usefulness of that.
>

But there is nothing to stop the author creating a topic that represents
the resource and adding occurrences indicating where it is mirrored or
alternate access points to it. Equally, there is nothing to stop a topic
map application that wants to from processing occurrences of a particular
type as providing subject identity and doing merging on that basis.
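
As a rough sketch of what such an application-level rule could look like:
the Topic class, the "mirror" occurrence type and the URLs below are all
hypothetical, chosen only to illustrate the idea, and are not defined by
any of the drafts.

```python
from dataclasses import dataclass, field

MIRROR = "mirror"  # hypothetical occurrence type chosen by the application


@dataclass
class Topic:
    subject_address: str  # exactly one locator per topic
    occurrences: list = field(default_factory=list)  # (type, locator) pairs


def identity_locators(topic):
    # The subject address plus every occurrence of the chosen type.
    mirrors = {loc for typ, loc in topic.occurrences if typ == MIRROR}
    return {topic.subject_address} | mirrors


def should_merge(a, b):
    # Application-level rule: merge when the identity sets overlap.
    return bool(identity_locators(a) & identity_locators(b))


a = Topic("http://example.org/doc", [(MIRROR, "http://mirror.example.net/doc")])
b = Topic("http://mirror.example.net/doc")
c = Topic("http://example.org/other")
```

Here should_merge(a, b) is true because b's address appears among a's
mirror occurrences, while should_merge(a, c) is false. The model itself
still has one locator per topic; the extra identity rule lives entirely in
the application.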

Having a topic map model that says one locator, one resource, one subject
is useful as it provides a concrete model that people can work with
rather than "Oh you have a topic with multiple subject addresses, well
they could all be the same resource or it could be a mistake, I don't
know, you had better get the resources and check for yourself".

Cheers,

Kal


_______________________________________________
sc34wg3 mailing list
sc34wg3@isotopicmaps.org
http://www.isotopicmaps.org/mailman/listinfo/sc34wg3