[tmql-wg] Result set requirements

Mon, 1 Mar 2004 21:19:03 +1000

On Sat, Feb 28, 2004 at 07:36:40PM +0100, Rani Pinchuk wrote:
> > And my claim is that you cannot do that if you want TMQL to have other
> > results than lists of topic map items. And even then you have
> > implicitly specified a 'list of topic map items':
> > 
> >    select $basename, $occurrence, ......
> > 
> > in contrast to, say,
> > 
> >    select $occurrence, $basename

> It is true that any query language should be able to deliver 
> somehow the results. The question is how complex you make this delivery. 
> 
> I see here kind of scale. You can start with something extremely simple 
> like single string. You can continue with something like what SQL gives.
> You can still continue with let's say adding the ability to deliver
> whatever XML. Next step might be the ability to deliver any format or
> structure. Actually you can still continue letting the user write
> algorithms within the query language.
> Something like (obviously, very approximate):
> 
> 
>  ---+--------+---------+----------+-----------+---------
>    single   SQL       XML        Any       algorithms
>    string   like                Format

Hmmm, I cannot see different complexities here.

Whether I say

   return
         "Rumsti"   # a string

or

   return
         ("Rumsti", "Ramsti", "Romsti")

or

   return
        <Rumsti>
            <Ramsti/>
            <Romsti/>
        </Rumsti>

does not make much difference to me. I am also not quite sure
what you mean by 'algorithms' above.

In any case you might be right that inside the content constructor
we will need loops and branching. But that is true for all the above.
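
To make that concrete, a minimal sketch in Python (not AsTMa? or TMQL;
the function names and the sample values are made up for illustration)
of three content constructors fed with the same values. The XML one
already needs a loop and a branch, but so would any non-trivial list:

```python
import xml.etree.ElementTree as ET

# Made-up values standing in for whatever a query matched.
NAMES = ["Rumsti", "Ramsti", "Romsti"]

def as_string(values):
    # Simplest constructor: return a single string.
    return values[0]

def as_list(values):
    # List constructor: return all values as a tuple.
    return tuple(values)

def as_xml(values):
    # XML constructor: first value is the root, the rest are children.
    if not values:                  # branching inside the constructor
        return ""
    root = ET.Element(values[0])
    for name in values[1:]:         # looping inside the constructor
        ET.SubElement(root, name)
    return ET.tostring(root, encoding="unicode")
```

All three take the same input; only the shape of what is returned
differs.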

> I am not totally sure where we should end up in that scale. I guess I 
> would prefer somewhere between SQL-like and XML.

I would hope that we could leave SQL far behind us. It was invented in
the '70s (or '60s). :-)

> You gave good reasons to include (any) XML. However, I think the query
> language could do without it. I have two reasons for that:
> 1. It seems that in AsTMa (as well as in TMTL) the creation of XML
>    (or other formats) is based on processing strings.

Negative, captain. (I'm a Star Trek fan :-)

In the case of AsTMa we DO NOT expand strings. It is true that we write
the query as a string (well, of course), but the XML constructor (like
the list and Topic Map constructors) is not a text template; it is
internalized.

>    So actually the 
>    separation is easy to achieve:
> 
>    <albums>{
>         forall $t [ $a (album)
>                     bn: $bn ] in $m
>         return
>        <album id="{$a}">{$bn}</album>
>    }
>    </albums>
> 
>    Could also be written like:
>   
>    template: 
>     <while condition="loop_over_query_results">   
>       <albums>
>         <album id="$a">$b</album>
>       </album>
>     </while>
> 

I guess you mean <albums> to be at the outermost level...

>    where "loop_over_query_results" is a callback function that gets the 
>    query from a phrasebook of queries, runs it and place in $a and $b 
>    the results (possibly with some extra processing of the results
>    before sending them to the template). 

Sure, sure. But how much have we actually gained and how much lost?

I could understand doing something TMTLish, this time using a taglib
approach:

<albums>
  <astma:loop query="get them from somewhere"
              bind="somehow bind parts of one result to $a and $b here?">
    <album id="{$a}">{$b}</album>
  </astma:loop>
</albums>

But isn't that just an XMLish notation for AsTMa? itself? And what happens
now in the case of nested queries?

<albums>{
    forall $t [ $a (album)
                bn: $bn ] in $m
    return
    <album id="{$a}">{$bn}
    {
     forall [ (is-producer-of)
              album: $a
              producer: $p ] in $m
        return
           <producer>{$p/bn}</producer>
     }
    </album>
}
</albums>

If you translate this into your XML notation, how good is the separation
between finding the matches and generating the results?
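
A rough Python analogue of the nested query above, only to show where
the two concerns meet. The data, the names ALBUMS and PRODUCERS, and
the function albums_xml are all hypothetical stand-ins for the map $m
and are not part of AsTMa? or of any proposal here:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the topics and associations in $m.
ALBUMS = [("album-1", "First Album"), ("album-2", "Second Album")]
PRODUCERS = [                       # (album id, producer basename)
    ("album-1", "Producer X"),
    ("album-1", "Producer Y"),
    ("album-2", "Producer Z"),
]

def albums_xml(albums, producers):
    root = ET.Element("albums")
    for album_id, basename in albums:           # outer forall: $a, $bn
        album = ET.SubElement(root, "album", id=album_id)
        album.text = basename
        for a, producer in producers:           # inner forall
            if a == album_id:                   # depends on the outer $a
                ET.SubElement(album, "producer").text = producer
    return root
```

The inner match cannot run before the outer one has bound $a, and its
results have to land inside the <album> element that is being built at
that moment, so matching and generating are interleaved rather than
cleanly separated.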

> 2. I am not sure what is the correct way to generate XML like the 
>    above. Maybe if we decide in the end that TMQL can return topic
>    objects and association objects as excerpts of XTM (so XML elements),
>    we should use XSLT to generate other XMLs.

This is a no-go, I think. Even if we can assume that the XTM code is
fully merged and canonical (how to explain this to an engineer!), I
can hardly imagine that a TMQL processor will (a) serialize results
into XTM only to (b) XSL-transform them into something else.

What is worse, XTM is not really orthogonal, and there are lots of other
problems as well...

>    But maybe this is too
>    complex, and we should do something like you do it in AsTMa, or
>    maybe we should do it using DTD and Xpath (actually your ideas 
>    as well) described in
>    http://www.spaceapplications.com/toma/Toma.html#xml
>    So unless someone really know what is the correct way to do it,
>    I think it is wrong to include it into TMQL.

Well, I can claim that it works, because it does here. And as someone
(Dmitry?) already mentioned, once you have it in XML, integrating it
content-wise is easy.

> > I think I understand your concern, namely, if someone has a particular
> > query like
> > 
> >    forall [ some pattern P ]
> >    return
> >        some constructor C
> > 
> > that you want to factor out the way the data is used in the constructor
> > and avoid the repetition of the pattern P
> > 
> >    forall [ some pattern P ]
> >    return
> >        another constructor D
> > 
> > That would only be possible if you store the captured data which was
> > detected in the pattern in an intermediate store which you can then
> > reuse with different constructors C and D.
> > 
> > In any serious language that intermediate store is SO COMPLEX that
> > reusing it becomes definitely more expensive than simply repeating
> > the pattern.

> I am not sure that we understood each other. My only gain in the 
> separation is that the final application is more readable and
> maintainable. The price I pay is performance and complexity.
....
> So I don't try to avoid running the same query. I try to avoid
> hard-coding the same query.

Isn't that exactly what I said above? That 'some pattern P' should
be in one place and is reused with different constructors?
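
As a sketch of what I mean, in Python and with made-up names
(TOPIC_MAP, pattern_p, constructor_c and constructor_d are purely
illustrative): the pattern is written once, and each constructor
simply runs it again instead of sharing an intermediate store.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical toy map: a list of topics as plain dicts.
TOPIC_MAP = [
    {"id": "album-1", "type": "album", "basename": "First Album"},
    {"id": "person-1", "type": "person", "basename": "Somebody"},
]

def pattern_p(topic_map):
    # 'Some pattern P', written in exactly one place.
    # Yields one binding per matching topic.
    for topic in topic_map:
        if topic["type"] == "album":
            yield {"a": topic["id"], "bn": topic["basename"]}

def constructor_c(bindings):
    # Constructor C: a plain list of basenames.
    return [b["bn"] for b in bindings]

def constructor_d(bindings):
    # Constructor D: an XML fragment built from the same bindings.
    return "".join(
        f"<album id={quoteattr(b['a'])}>{escape(b['bn'])}</album>"
        for b in bindings
    )
```

Calling constructor_c(pattern_p(TOPIC_MAP)) and
constructor_d(pattern_p(TOPIC_MAP)) evaluates the pattern twice, but it
is hard-coded only once.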

\rho