[sc34wg3] Re: FYI: Yet another TMRM Formalization (well, not really)

Robert Barta sc34wg3@isotopicmaps.org
Fri, 16 Jul 2004 07:29:10 +1000


On Thu, Jul 15, 2004 at 05:12:13PM +0100, Martin Bryan wrote:
> > Counterquestion: How can the natural numbers be implemented such that
> > a computer can understand?
>
> By humans adopting conventions that machines can interpret to
> determine the relevant values that can be used in binary arithmetic
> to represent the number.

Martin,

But there are MANY relevant values in the set of natural numbers.
A formal model ...

> What I want are a set of machine processable identifiers for TM concepts
> that humans can enter safely by hand.

...by itself may not necessarily be transferable into a computer,
unless you do it strictly symbolically. And it still has value, in
that you can analyse the structure of your things.

This huge system 'number theory' talks about monstrous numbers and
their properties. This is all symbolic math in people's heads, with
almost no machinery in place.

But all of it together is so strong that crypto technologies (RSA, EC,
DSA) are built on it and used in wars. The NSA allegedly employs 1000
mathematicians. They are not doing data entry there....

> > The \tau model is a formalism, nothing else. It is not meant to be
> > implemented as is, in the same way as relational databases do not
> > implement tables with the relational algebra.
> 
> But they do have a computable form.

Having said that, the \tau model also happens to be 'computable' in
the sense that it talks about finite sets of terms. An assertion is a
finite set and a map is a finite set of assertions.

If Jan asked me again whether this is an 'implementation', I would
answer: yes, but a slooooooooooow one.
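
To make 'computable but slow' concrete, here is a minimal sketch in
Python. It is not the \tau model itself: I am assuming that an
assertion can be modelled as a finite set of (role, player) pairs, and
the names make_assertion, players_of, alice, bob, etc. are invented
for illustration only.

```python
from typing import FrozenSet, Set, Tuple

# Assumption: an assertion is a finite set of (role, player) pairs.
Assertion = FrozenSet[Tuple[str, str]]


def make_assertion(*pairs: Tuple[str, str]) -> Assertion:
    """Build one assertion from (role, player) pairs."""
    return frozenset(pairs)


def players_of(tmap: Set[Assertion], role: str) -> Set[str]:
    """Scan every assertion in the map: correct, but linear in the map size."""
    return {player for a in tmap for (r, player) in a if r == role}


# A map is a finite set of assertions (all names are made up).
tmap = {
    make_assertion(("author", "alice"), ("opus", "book-1")),
    make_assertion(("author", "bob"), ("opus", "book-2")),
}

print(players_of(tmap, "author"))
```

Every lookup walks the whole set, which is where the slowness comes
from; a real engine would add indexes and optimisation on top.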

> > What RDBMSes do is to incorporate knowledge about the relational
> > algebra, at least to a certain extent. This is also the intention
> > behind the path language, namely that engines can use it to perform
> > optimization steps during processing. The model is nothing other
> > than a basis to formulate the rules.
> 
> But until you take the next step the model will remain just that = a
> "theoretical" model which explains all and does f.all.

'Just' is not a just characterisation. But this is more a question
about the usefulness of theoretical models in general. The mobile in
your pocket, the combustion engine in your car, the plane you took
last month: part of each came from experiment, but a lot of it only
emerged on paper first.

> >    ($m / is-author-of / author/bn @uc)
> 
> What if I, in my vast ignorance, misenter this as
> ($x/author/is-author-of/basename ^@unconstrained)

Then your query processor will give you more or less friendly
feedback that either

   (a) the query is not consistent with the map structure
       if that is known to the processor, or

   (b) sorry, nothing of this kind can be found
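
A rough sketch of these two kinds of feedback, in Python. This is only
an illustration under my own assumptions: the function query_feedback,
its parameters and the idea of checking path steps against a set of
known names are hypothetical, not part of any existing query processor.

```python
from typing import List, Optional, Set


def query_feedback(steps: List[str],
                   known_names: Optional[Set[str]],
                   matches: List[str]) -> str:
    """Return the feedback a (hypothetical) processor might give.

    steps        -- the names used in the query path
    known_names  -- names occurring in the map, or None if the
                    processor knows nothing about the map structure
    matches      -- whatever the evaluation of the query returned
    """
    # Case (a): only possible if the map structure is known.
    if known_names is not None:
        unknown = [s for s in steps if s not in known_names]
        if unknown:
            return ("query is not consistent with the map structure: "
                    "unknown " + ", ".join(unknown))
    # Case (b): the query is plausible but matches nothing.
    if not matches:
        return "sorry, nothing of this kind can be found"
    return "ok: %d result(s)" % len(matches)


names = {"is-author-of", "author", "bn"}
print(query_feedback(["author", "is-author-of", "basename"], names, []))
print(query_feedback(["author", "is-author-of", "basename"], None, []))
```

A check on names alone would not catch steps entered in the wrong
order; for that the processor needs to know which roles belong to which
assertion types.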

In this sense the query language should behave like those for

   relational DBs
   OO DBs
   hierarchical DBs
   XML DBs
   you-name-it DBs

All the ones I know require the 'programmer' to know something about
the structure (some call it schema) of the database. Either the human
knows this - and in projects where you NEED the control to guarantee a
degree of quality this will be the case - or the development
infrastructure will help you.

In UNIX, for instance, there are many tools where you can use the
'TAB' key to get suggestions on how to complete a command / xml-tag /
sentence /...

> My point is that information in natural languages can be entered in
> different orders to get the same result.

True, but TMs are NOT about natural language processing. They are - as
I understand it - about 'shallow knowledge'. And that HAS a structure,
albeit a weak one.

> SQL only offers a limited functionality for doing this. By naming
> each field XML allows you to identify when the wrong information has
> been entered in the wrong order.

XML does not help at all. Look at the snippet

<something>
   <Shoesize>123</Shoesize>
   <StudentNr>234</StudentNr>
</something>

Whichever entity puts in the information has to know what these
things 'mean'. The same holds for relational DBs and for TM DBs.
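
A small Python sketch of the point, using the standard
xml.etree.ElementTree parser. The 'swapped' document and the helper
fields are my own illustrative additions: the parser accepts both
versions equally, because the element names are just labels to it.

```python
import xml.etree.ElementTree as ET

snippet = """<something>
   <Shoesize>123</Shoesize>
   <StudentNr>234</StudentNr>
</something>"""

# Same structure, values entered the wrong way round (hypothetical).
swapped = """<something>
   <Shoesize>234</Shoesize>
   <StudentNr>123</StudentNr>
</something>"""


def fields(xml_text: str) -> dict:
    """Map each child element name to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


# Both parse without complaint; nothing flags the second as wrong.
print(fields(snippet))
print(fields(swapped))
```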

> At present your model seems to be suggesting I need to be sure that
> type a is used in map x, that the third part of the path is always
> an association, the fourth is a TM construct and the fifth is a
> qualification :-( But is this correct, fixed, inviolate,
> adequate....?

The *formal* model suggests that there are ALWAYS only assertions
(which are equivalent to associations, excluding scoping). So the
'formal programmer' would have to know which roles and which kinds of
assertions are involved to reach a particular piece of information.

In the 'practical query language' we will use topics and associations
because that is what people are familiar with. The *formal* model can
be used to make clear (more than prose can) what a query should
actually accomplish for a given map.

But also on the practical level the 'programmer' HAS to know what
types, roles, .... are relevant to reach particular information.

If he/she does not, then tools can easily extract this information
out of a map without knowing anything about it:

   - what topics are used only as types?
   - what are only roles?
   - what are instances?
   - is there a type hierarchy?
   - ....

Basic ontology reengineering. Equipped with that knowledge and the
proper labeling, the programmer can get as much information as he needs
to formulate the query.
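
Such a tool can be sketched in a few lines of Python. Everything here
is an assumption made for the sake of the example: the Assoc record,
the summarize function, and the convention that instance-of and
subclass relationships appear as associations named "is-a" and
"is-subclass-of" with the roles shown.

```python
from typing import Dict, List, NamedTuple


class Assoc(NamedTuple):
    kind: str               # the association type
    roles: Dict[str, str]   # role name -> player


def summarize(tmap: List[Assoc]) -> Dict[str, set]:
    """Derive a crude ontology summary from the map alone."""
    kinds = {a.kind for a in tmap}
    roles = {r for a in tmap for r in a.roles}
    players = {p for a in tmap for p in a.roles.values()}
    return {
        # names that never show up as a role or as a player
        "only_types": kinds - roles - players,
        # names that never show up as a type or as a player
        "only_roles": roles - kinds - players,
        "instances": {a.roles["instance"] for a in tmap
                      if a.kind == "is-a" and "instance" in a.roles},
        "hierarchy": {(a.roles["subclass"], a.roles["superclass"])
                      for a in tmap
                      if a.kind == "is-subclass-of"
                      and "subclass" in a.roles
                      and "superclass" in a.roles},
    }


# A tiny made-up map.
tmap = [
    Assoc("is-author-of", {"author": "alice", "opus": "book-1"}),
    Assoc("is-a", {"instance": "alice", "class": "person"}),
    Assoc("is-subclass-of", {"subclass": "person", "superclass": "agent"}),
]

summary = summarize(tmap)
print(summary["instances"], summary["hierarchy"])
```

The summary is what a development tool could then present to the
programmer, for instance as completion suggestions.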

> Doubting TM (Thomas/Martin)

Well, if you associate TMs with natural language processing, then I
would also doubt them.

\rho