[sc34wg3] Unique names in Draft Reference Model

Steven R. Newcomb sc34wg3@isotopicmaps.org
01 Jan 2003 16:55:09 -0600


"Anthony B. Coates" <abcoates@TheOffice.net> writes:

> ** Reply to message from "Steven R. Newcomb"
> <srn@coolheads.com> on 31 Dec 2002 09:15:27 -0600
> 
> Dear Steve,
> 
> > * For me, the primary reason for having a single
> >   namespace is not technical; it's human.  Having a
> >   single namespace is a significant advantage for human
> >   beings who are trying to talk with each other about
> >   TM Models.  With a single namespace within which
> >   every aspect of a given TM Model has a unique name,
> >   we can speak to each other unambiguously, without
> >   having to be painfully precise every time we open our
> >   mouths.  We can say only "zorp", instead of having to
> >   say "the zorp assertion type", or "the zorp role
> >   type", or "the zorp SIDP", since "zorp" can mean only
> >   one thing (whichever it is).  (As a standards
> >   developer yourself, you know how hard it is to
> >   establish an unambiguous universe of discourse.
> >   Considering the vanishingly small price of this
> >   "single namespace" idea, and the potentially enormous
> >   cost of misunderstandings regarding TMs and what they
> >   mean, doesn't the single namespace idea make sense to
> >   you, too?)

> The single namespace is an understandable aspiration,
> but if you thought it was achievable in practice, you
> wouldn't already have suggested creating ad-hoc
> namespaces like "A::zorp" and "B::zorp".  Topic map
> systems will potentially contain myriads of names,
> some controlled internally to an organisation, some
> imposed from external bodies.  It will never be
> practical to find unique names for everything within
> a single namespace.  You only have to look at human
> language to see how hard it is.

Tony, I think you're arguing against something I'm not
saying.  I agree with you about what you're saying
above.

I wasn't talking about names and namespaces in general.
Facilities for handling names and namespaces in general
can be declared by specific TM Models.  For example, in
the straw-man expression of the Standard Model, I
expect to capture the whole ontology of naming with
only three assertion types.  With those three (only
three) assertion types (and six role types), topic maps
that conform to the Standard Model can support any
number of namespaces, each of which contains any number
of names.

I was only talking about the names of the assertion
types, the role types, and a similarly small number of
properties, *as those names are declared in the
definition of the TM Model*.  In order for topic
map information to be interchangeable in accordance
with some TM Model, everybody needs to agree about the
names of those very few things.  Multiple namespaces
are downright undesirable in this case.

Similarly, the XTM DTD (or any other DTD) provides a
very few element type names, and everybody has to use
those names in order to conform to it.  But, even with
only those very few element types, XTM syntax allows
topic maps to be interchanged that specify subjects to
have any number of names in any number of namespaces.

In a topic map, element types can be subjects (because
anything can be a subject), so it's still possible for
a topic map to specify that an XTM element type (like
<resourceRef>, for example) has many other names, in
addition to "resourceRef", in many other namespaces.
Such a topic map might be useful to establish the
mappings between the corresponding element type
definitions in different localized versions of the XTM
DTD.  Each localized DTD is a different namespace of
element type names.  But in the context of any single
DTD, there's only one namespace of element type names:
that's all that's needed, and, in general, that's all
that anybody wants.

> Now, if you were to propose a single but explicitly
> hierarchically namespaced system, that I could see as
> a saleable item.  Having a well understood way to
> construct names like "A::B::C::D::zorp" would mean
> that people constructing enterprise-scale systems
> could have some confidence that they wouldn't spend
> all of their time in thesauri trying to find unused
> synonyms for the names that have already been used.
> That's my view, anyway.

Above, you're articulating a perfectly good requirement
for a topic map, and, as noted above, the Topic Maps
paradigm (via its Standard TM Model, or any other
appropriate TM Model) allows you to meet such
requirements using topic maps.  The Standard Model will
allow you to do what you want, *in a topic map*.  But
the Standard Model itself doesn't need or want multiple
names for each of the very few assertion types that
collectively support the requirement you mention.

Let me use an analogy between XML and Topic Maps:

  DTDs are to XML instances
            as
  TM Models are to topic map instances;

and

  element types are to XML DTDs
            as
  assertion types are to TM Models.

When we declare an XML DTD, we normally give each
element type a single name.  Nevertheless, *instances*
of an XML DTD, such as instances of the XTM DTD, can be
used to interchange information about subjects that
includes many names, in many namespaces, for each
subject.

Now that I've said that, you could argue: "Yes, Steve,
but, in XML, for example, there's a different
sub-namespace of attribute names for every element type
name."  True.  That makes sense in XML, because XML
information is essentially hierarchical.  Topic maps,
however, can represent hierarchies, but they are not
essentially hierarchical.  In a topic map, everything,
without exception, is a subject, and every subject is
equally privileged to have any number of relationships,
of any kinds, with any other subjects.

For example, a TM Model with only two relationship
types (assertion types) can support any number of
hierarchies of topic types in a single topic map.
We could name these two assertion types
"superclass-subclass" and "class-instance".  Or, we
could name them "type-subtype" and "type-instance".  It
doesn't matter what we name them, but, whatever we name
them, it's vital that everybody knows what their names
are.  Otherwise, XML-based information interchange is
not possible.
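
As a rough illustration of the paragraph above, here is
a minimal Python sketch in which two assertion types are
enough to hold more than one hierarchy of topic types.
The tuple layout, the function names, and the sample
topics are assumptions made for this example; they are
not part of any TM Model definition.

```python
from collections import defaultdict

# The two agreed-upon assertion type names. Which names are chosen is
# arbitrary; what matters is that everyone uses the same ones.
TYPE_SUBTYPE = "type-subtype"
TYPE_INSTANCE = "type-instance"

# Hypothetical data: (assertion type, player of the type role, other player).
assertions = [
    (TYPE_SUBTYPE, "animal", "mammal"),
    (TYPE_SUBTYPE, "mammal", "dog"),
    (TYPE_SUBTYPE, "document", "DTD"),      # a second, unrelated hierarchy
    (TYPE_INSTANCE, "dog", "Rex"),
    (TYPE_INSTANCE, "DTD", "XTM DTD"),
]


def supertypes(topic_type, assertions):
    """Return every direct or indirect supertype of topic_type."""
    parents = defaultdict(set)
    for kind, type_, subtype in assertions:
        if kind == TYPE_SUBTYPE:
            parents[subtype].add(type_)
    found, stack = set(), [topic_type]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def types_of(instance, assertions):
    """Return the direct types of instance plus all of their supertypes."""
    direct = {t for kind, t, i in assertions
              if kind == TYPE_INSTANCE and i == instance}
    result = set(direct)
    for topic_type in direct:
        result |= supertypes(topic_type, assertions)
    return result
```

Both hierarchies live in the same list of assertions, and
the lookup code only needs to know the two agreed names.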

Because, in a topic map instance, each assertion type
is a subject like any other subject, it can have any
number of names, in any number of namespaces.  But when
we declare an assertion type in the definition of a TM
Model, it's best for it to have exactly one name, so
that human beings can communicate about it, and about
the TM Model in which it appears, conveniently and
reliably.  It's the same situation in XML-land: every
element type in a DTD has exactly one name, at least
with respect to that specific DTD.  If XML didn't
impose this restriction, it would be inconvenient for
people to discuss a DTD in terms of its element types;
whenever they uttered the name of the element type,
they would have to say the name of the namespace in
which the element had that particular name.  And what
if it were also necessary to utter the name of the
namespace in which the namespace has that name?
Ultimately, it's better to establish a convention that
each TM Model definition establishes exactly one
namespace, just as it's a convention in XML that each
DTD establishes exactly one element type namespace.

The number of names in any given TM Model should be
quite small.  It doesn't make sense to make a TM Model
any larger than the smallest set of relationship types
that supports the benefits the TM Model is meant to
provide.  True, some TM Models will
encompass many assertion types, but they will "borrow"
(inherit) the bulk of them from much smaller modules,
each of which is itself a full-fledged TM Model, and
each of which will establish its own namespace.
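
One way to picture this kind of borrowing is sketched
below in Python.  It assumes, purely for illustration,
that an inherited name stays qualified by the namespace
of the module that declared it.  The class `TMModel`, its
methods, and the names "A", "B", "C", "zorp" and "quux"
are hypothetical.

```python
class TMModel:
    """A TM Model that declares names in exactly one namespace of its own."""

    def __init__(self, namespace, names, inherits=()):
        names = list(names)
        # Within one model's namespace, every declared name must be unique.
        if len(set(names)) != len(names):
            raise ValueError(f"duplicate name declared in {namespace}")
        self.namespace = namespace
        self.own_names = names
        self.inherits = tuple(inherits)

    def qualified_names(self):
        """Own names plus inherited ones, qualified by the declaring model."""
        result = {f"{self.namespace}::{name}" for name in self.own_names}
        for module in self.inherits:
            result |= module.qualified_names()
        return result


# Two small modules that happen to use the same local name.
module_a = TMModel("A", ["zorp"])
module_b = TMModel("B", ["zorp"])

# A larger model that borrows from both and adds one name of its own.
combined = TMModel("C", ["quux"], inherits=[module_a, module_b])
```

Under that assumption, `combined.qualified_names()`
contains "A::zorp" and "B::zorp" as distinct entries,
while each module still has only one namespace to keep
unambiguous.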

Modularity is good, both in TM Models, and in the
software that implements them.  I would expect, for
example, that the two assertion types I mentioned above
(type-subtype and type-instance), together with their
four role types and four subject identity
discrimination properties (SIDPs), would constitute an
entire inheritable TM Model.  This TM Model, like all
TM Models, would establish its own namespace, which
we'll call "NS" for purposes of the following example.
In this particular TM Model, there are only ten named
things:

(1) NS::type-subtype                 (assertion type #1)
(2) NS::type-instance                (assertion type #2)
(3) NS::typeRoleOfType-subtype       (role type #1)
(4) NS::subtypeRoleOfType-subtype    (role type #2)
(5) NS::typeRoleOfType-instance      (role type #3)
(6) NS::instanceRoleOfType-instance  (role type #4)
(7) NS::supertypes                   (SIDP #1)
(8) NS::subtypes                     (SIDP #2)
(9) NS::types                        (SIDP #3)
(10) NS::instances                   (SIDP #4)

I don't see how it can be regarded as burdensome that
all of these ten names must be unique.  On the
contrary, it's very helpful, because if anybody, in
conversation or elsewhere, says "NS::types", it's
automatically known that what's being referenced is a
property, because only the "types" property has the
name "types".  (There is no assertion type or role type
whose name is "types".)
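
The benefit described above can be sketched in a few
lines of Python.  The ten names come from the list in
this message; the registry layout and the helper names
`build_namespace` and `kind_of` are assumptions made for
the example.

```python
NAMESPACE = "NS"

# The ten named things, each paired with the kind of thing it names.
DECLARATIONS = [
    ("type-subtype", "assertion type"),
    ("type-instance", "assertion type"),
    ("typeRoleOfType-subtype", "role type"),
    ("subtypeRoleOfType-subtype", "role type"),
    ("typeRoleOfType-instance", "role type"),
    ("instanceRoleOfType-instance", "role type"),
    ("supertypes", "SIDP"),
    ("subtypes", "SIDP"),
    ("types", "SIDP"),
    ("instances", "SIDP"),
]


def build_namespace(declarations):
    """Build a name -> kind mapping, rejecting any name declared twice."""
    registry = {}
    for name, kind in declarations:
        if name in registry:
            raise ValueError(f"name {name!r} is already declared")
        registry[name] = kind
    return registry


REGISTRY = build_namespace(DECLARATIONS)


def kind_of(qualified_name):
    """Resolve a name such as 'NS::types' to the kind of thing it names."""
    namespace, _, local_name = qualified_name.partition("::")
    if namespace != NAMESPACE:
        raise KeyError(f"unknown namespace {namespace!r}")
    return REGISTRY[local_name]
```

Because each name appears only once, `kind_of("NS::types")`
can answer "SIDP" without being told whether an assertion
type, a role type, or a property is meant.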

-- Steve

Steven R. Newcomb, Consultant
srn@coolheads.com

Coolheads Consulting
http://www.coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

1527 Northaven Drive
Allen, Texas 75002-1648 USA