[tmql-wg] Result set requirements

Tue, 16 Mar 2004 10:54:34 +0100

* Lars Marius Garshol
|
| I've seen the XQuery folks do this, though I have to say I am not
| very keen to follow in their tracks. This quickly gets very
| complicated, and to make use of it requires an unusually advanced
| implementation, plus close integration between QL and CL. There's a
| lot of pain down this route; I'm not convinced there's a lot of
| gain.

* Robert Barta
| 
| I think there is. At least those small parts which I understand from
| functional programming (Hendler et al.) point strongly in this
| direction.
| 
| My gut feeling says 'typing' is even more important for TMQL in
| terms of potential speed benefits than for XQuery. My conjecture is
| 'the more irregular a data structure is, the more a query language
| has to rely on input from the application developer'. And TM allows
| more flexibility than XML, I suppose.

It does, but I think it also becomes harder to make efficient use of
what type information you do get. Note that I think we are not really
disagreeing on anything. It's clear that type information in general
does help with optimization. There are many ways to provide type
information, however, and in general there is a tension between strict
typing and ease of use.

| The trick they have accomplished is that XQuery works with NO TYPE
| INFORMATION as well. But then, of course, fewer optimizations can be
| done.

I think in general a compromise like that is a good thing, though I
think in many cases it will be possible to do the optimizations
anyway, by making use of information from whatever sources are
available.

Say there is an inference rule like this:

  my-inference-rule($A, $B) :-
    occurrence-type-x($A, $C), $C =< $B.

If we knew that $A was a topic of type 'person' and that $B and $C had
to be dates we could do things more efficiently, but there are some
other considerations:

 - if we require type declarations we make the programmer's job much
   harder,

 - declaring the types will in general restrict the use of a
   function/inference rule to situations where the types fit (an
   excellent example is Java, where the same methods are often defined
   X times for X different types), and

 - if you can establish the types used for a particular invocation you
   can internally create a new instantiation of the function/rule
   where the types have been upgraded and if necessary do this for all
   the different type combinations that turn up.

I do think that for this to work properly the language does need to be
carefully designed, though I don't see any reason why we couldn't do
this. We will do all kinds of evaluation of whatever proposal we come
up with to make sure it's good enough, and this is just another kind
of evaluation. Subtle, to be sure, but definitely doable.

The main downside to having an implicit typing approach like this, I
think, is that the performance model of the language tends to become
very complex. What this means is that users may find that minor tweaks
to schemas or queries cause huge performance differences in practice
in ways that are very difficult for them to predict, and similarly
that which queries run fast on which implementations may also be very
difficult to predict.

Actually, I think a lot of previous experience with programming
languages is very relevant to this. There's a huge amount of
experience out there to learn from.

| The research papers (Chamberlin, Fankhauser) are quite interesting.
| 
|    http://topicmaps.it.bond.edu.au/mda/markup/xml/xquery/xquery

I'll have a look, but this route is not free of dangers, as I think
Jeni Tennison documented admirably:

<URL: http://www.idealliance.org/papers/extreme03/html/2003/Tennison01/EML2003Tennison01-toc.html >

(Well, I haven't read the paper; I just saw the presentation, which
was excellent.) 

| Not sure. I would love to see something like this, but this would
| need much much much more consideration and is beyond my
| capabilities.

I'm not sure it really is beyond your capabilities. It's not simple,
but I don't think it's impossible, either.

| I am not sure about the future of TMCL. At the moment it looks like
| RDFS light.

It does, but we don't want it to wind up that way, nor do I think
Graham wants that. So I wouldn't get worried just yet. I think the
final result is much more likely to look like AsTMa!/OSL represented
in topic maps, with some special syntax. The stuff Dmitry has been
playing with lately looks promising to me.

-- 
Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50                  <URL: http://www.garshol.priv.no >