[DAS2] query language description
chris mungall
cjm at fruitfly.org
Sat Mar 18 00:20:14 UTC 2006
On Mar 16, 2006, at 6:05 PM, Andrew Dalke wrote:
>> right now they are forced to bypass the constraint language and go direct
>> to SQL.
>
> In addition, we provide defined ways for a server to indicate
> that there are additional ways to query the server.
I was positing this as a bad feature, not a good one, or at least as a
symptom of an incorrectly designed system (at least in the case of the
GO DB API - it may not carry forward to DAS - though if you're going to
allow querying by terms...)
>
>> None of these really fit into the DAS paradigm. I'm guessing you want
>> something simple that can be used as easily as an API with get-by-X
>> methods but will seamlessly blend into something more powerful. I
>> think what you have is on the right lines. I'm just arguing to make
>> this language composable from the outset, so that it can be extended
>> to whatever expressivity is required in the future, without bolting on
>> a new query system that's incompatible with the existing one.
>
> We have two ways to compose the system. If the simple query language
> is extended, for example, to support word searches of the text field
> instead of substring searches, then a server can say
>
> <CAPABILITY type="features"
> query_uri="http://somewhere.over.rainbow/server.cgi">
> <SUPPORTS name="word-search"/>
> </CAPABILITY>
>
> This is backwards compatible, so the normal DAS queries work. But
> a client can recognize the new feature and support whatever new filters
> that 'word-search' indicates, eg
>
> http://somewhere.over.rainbow/server.cgi?note-wordsearch=Andre*
>
> (finds features with notes containing words starting with 'Andre' )
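To make the difference between a substring search and a word search concrete, here is a minimal Python sketch. The matching rule (split the note into words, then apply a shell-style wildcard to each word) is an assumption for illustration; the message does not define what 'word-search' means on a real server, and the function name `word_search` is hypothetical.

```python
import fnmatch
import re

def word_search(note, pattern):
    """Return True if any whole word in `note` matches the wildcard pattern.

    Assumed semantics only: words are runs of \\w characters and the
    pattern uses shell-style wildcards, as in 'Andre*'.
    """
    words = re.findall(r"\w+", note)
    return any(fnmatch.fnmatchcase(word, pattern) for word in words)

note_a = "reviewed by Andrew Dalke"
note_b = "curated by McAndrews"

# A substring search matches both notes, because 'Andre' occurs inside 'McAndrews'.
substring_hits = [n for n in (note_a, note_b) if "Andre" in n]

# The word search only matches notes with a word that starts with 'Andre'.
word_hits = [n for n in (note_a, note_b) if word_search(n, "Andre*")]
```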
>
> These are composable. For example, suppose Sanger allows modification
> date searches of curation events. Then it might say
>
> <CAPABILITY type="features"
> query_uri="http://somewhere.over.rainbow/server.cgi">
> <SUPPORTS name="word-search"/>
> <SUPPORTS name="sanger-curation"/>
> </CAPABILITY>
so this is limited to single-argument search functions?
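As a rough illustration of how a client might discover these extensions, the sketch below parses the CAPABILITY fragment quoted above with Python's standard `xml.etree.ElementTree`. The helper name `supported_extensions` is hypothetical, and a real DAS2 sources document wraps this element in more structure than is shown here.

```python
import xml.etree.ElementTree as ET

# The CAPABILITY fragment as quoted in the message.
CAPABILITY_XML = """
<CAPABILITY type="features"
    query_uri="http://somewhere.over.rainbow/server.cgi">
  <SUPPORTS name="word-search"/>
  <SUPPORTS name="sanger-curation"/>
</CAPABILITY>
"""

def supported_extensions(xml_text):
    """Return (query_uri, set of SUPPORTS names) for one CAPABILITY element."""
    capability = ET.fromstring(xml_text.strip())
    names = {s.get("name") for s in capability.findall("SUPPORTS")}
    return capability.get("query_uri"), names

query_uri, extensions = supported_extensions(CAPABILITY_XML)

# A client would only offer the extra filters when the extension is advertised.
can_word_search = "word-search" in extensions
```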
>
> and I can search for notes containing words starting with "Andre"
> which were modified by "dalke" between 2002 and 2005 by doing
>
> http://somewhere.over.rainbow/server.cgi?note-wordsearch=Andre*&
> modified-by=dalke&modified-before=2005&modified-after=2002
but the compositionality is always associative since the CGI parameter
constraint forbids nesting
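The point about flat composition can be seen by building the query with Python's `urllib.parse`. This is only a sketch: the filter names are the ones from the example URL, the host is the one from the capability example, and `build_query` is a hypothetical helper. Every key=value pair sits at the same level, so there is nowhere to express grouping such as "A and (B or C)".

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

BASE = "http://somewhere.over.rainbow/server.cgi"

def build_query(base, filters):
    """Join filters into one URL; all filters are implicitly ANDed together."""
    # safe="*" keeps the wildcard readable instead of encoding it as %2A.
    return base + "?" + urlencode(filters, safe="*")

filters = {
    "note-wordsearch": "Andre*",
    "modified-by": "dalke",
    "modified-before": "2005",
    "modified-after": "2002",
}
url = build_query(BASE, filters)

# Parsing the URL back gives the same flat list of pairs: no nesting survives,
# because the query string has no syntax for it.
round_trip = dict(parse_qsl(urlsplit(url).query))
```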
> An advantage to the simple boolean logic of the current system
> is that the GUI interface is easy, and in line with existing
> simple search systems.
there's nothing preventing you from implementing a simple GUI on top of
an expressive system - there is nothing forcing you to use the
expressivity
> If someone wants to implement a new search system which is
> not backwards compatible then the server can indicate that
> alternative with a new CAPABILITY. Suppose Thomas at Sanger
> comes up with a new search mechanism based on an object query
> language he invented,
>
> <CAPABILITY type="down-oql"
> query_uri="http://sanger.ac.uk/oql-search" />
>
> The Sanger and EBI clients might understand that and support
> a more complex GUI, eg, with a text box interface. Everyone
> else must ignore unknown capability types.
but this doesn't integrate with the existing query system
>
> Then that would be POSTED (or whatever the protocol defines)
> to the given URL, which returns back whatever results are
> desired.
>
> Or the server can point to a public MySQL port, like
>
> <CAPABILITY type="mysql-connection"
> query_uri="mysql://username:password@hostname:port/databasename"
> />
>
> That's what you are doing to bypass the syntax, except that
> here it isn't a bypass; you can define the new interface in
> the DAS sources document.
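The rule that clients must ignore unknown capability types could look roughly like this on the client side. The `SOURCE` wrapper element and the `KNOWN_TYPES` set are assumptions made for the sketch; only the CAPABILITY elements themselves come from the message.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element around the three capabilities from the message.
SOURCES_XML = """
<SOURCE>
  <CAPABILITY type="features"
      query_uri="http://somewhere.over.rainbow/server.cgi"/>
  <CAPABILITY type="down-oql"
      query_uri="http://sanger.ac.uk/oql-search"/>
  <CAPABILITY type="mysql-connection"
      query_uri="mysql://username:password@hostname:port/databasename"/>
</SOURCE>
"""

# Capability types this particular (hypothetical) client understands.
KNOWN_TYPES = {"features"}

def usable_capabilities(xml_text, known=KNOWN_TYPES):
    """Map capability type to query_uri, silently skipping unknown types."""
    root = ET.fromstring(xml_text.strip())
    return {
        cap.get("type"): cap.get("query_uri")
        for cap in root.iter("CAPABILITY")
        if cap.get("type") in known
    }

capabilities = usable_capabilities(SOURCES_XML)
```

A client that also understood "down-oql" would pass a larger `known` set; the server document does not change.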
>
>> The generic language could just be some kind of simple
>> extensible function syntax for search terms, boolean operators,
>> and some kind of (optional) nesting syntax.
>
> Which syntax? Is it supposed to be easy for people to write?
> Text oriented? Or tree structured, like XML, or SQL-like?
I'd favour some concrete abstract syntax that looks much like the
existing DAS QL
> And which clients and servers will implement that search
> language?
all servers. clients optional
>
> If there was a generic language it would allow
> OR("on segment Chr1 between 1000 and 2000",
> "on segment ChrX between 99 and 777")
> which is something we are expressly not allowing in DAS2
> queries. It doesn't make sense for the target applications
> and by excluding it it simplifies the server development,
> which means less chance for bugs.
this example is pointless but it's easy to imagine plenty of ontology
term queries or other queries in which this would be useful
I guess I depart from the normal DAS philosophy - I don't see this
being a high barrier for entry for servers, if they're not up to this
it'll probably be a buggy hacky server anyway
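For a sense of how small a nestable query evaluator can be, here is a minimal Python sketch. Everything in it is an assumption for illustration: the tuple-based syntax, the operator names ("and", "or", "not", "overlaps", "type") and the feature dictionaries are not part of DAS2 or of any proposal in this thread.

```python
def matches(query, feature):
    """Evaluate a nested query tuple against one feature dictionary."""
    op, *args = query
    if op == "and":
        return all(matches(q, feature) for q in args)
    if op == "or":
        return any(matches(q, feature) for q in args)
    if op == "not":
        return not matches(args[0], feature)
    if op == "overlaps":
        segment, start, end = args
        return (feature["segment"] == segment
                and feature["start"] <= end
                and feature["end"] >= start)
    if op == "type":
        return feature["type"] == args[0]
    raise ValueError(f"unknown operator: {op!r}")

# Made-up features, purely for the example.
features = [
    {"id": "f1", "segment": "Chr1", "start": 1500, "end": 1800, "type": "exon"},
    {"id": "f2", "segment": "ChrX", "start": 100, "end": 200, "type": "gene"},
    {"id": "f3", "segment": "Chr2", "start": 1000, "end": 2000, "type": "exon"},
    {"id": "f4", "segment": "ChrX", "start": 700, "end": 900, "type": "exon"},
]

# exon AND (on Chr1 1000..2000 OR on ChrX 99..777): needs nesting to express.
query = ("and",
         ("type", "exon"),
         ("or",
          ("overlaps", "Chr1", 1000, 2000),
          ("overlaps", "ChrX", 99, 777)))

hits = [f["id"] for f in features if matches(query, f)]
```

New search functions are added as one more branch, which is the sense in which such a syntax stays extensible without a second query system.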
> Also, I personally haven't figured out a decent way to
> do a GUI composition of a complex boolean query which is
> as easy as learning the query language in the first place.
doesn't mean it doesn't exist.
i'm not sure what's hard about having say, a clipboard of favourite
queries, then allowing some kind of drag-and-drop composition
> A more generic language implementation is a lot of overhead
> if most (80%? 90%?) need basic searches, and many of the
> rest can fake it by breaking a request into parts and
> doing the boolean logic on the client side.
this is always an option - if the user doesn't mind the additional,
possibly very high, overhead. it's just a little bit of a depressing
approach, as if Codd's seminal paper from 1970, or whenever it was, never
happened.
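The client-side workaround and its overhead can be sketched as follows. `fetch` is a hypothetical in-memory stand-in for one HTTP request to a server that only supports ANDed filters; the feature data is made up.

```python
# Made-up server-side data, purely for the example.
FEATURES = [
    {"id": "f1", "segment": "Chr1", "type": "exon"},
    {"id": "f2", "segment": "ChrX", "type": "exon"},
    {"id": "f3", "segment": "Chr1", "type": "gene"},
]

request_count = 0

def fetch(filters):
    """Hypothetical stand-in for one request: all filters are ANDed."""
    global request_count
    request_count += 1
    return [f for f in FEATURES
            if all(f.get(key) == value for key, value in filters.items())]

def client_side_or(*filter_sets):
    """Emulate OR by issuing one request per branch and merging by id."""
    merged = {}
    for filters in filter_sets:
        for feature in fetch(filters):
            # Features matching several branches are transferred repeatedly
            # and only de-duplicated here, on the client.
            merged.setdefault(feature["id"], feature)
    return list(merged.values())

# "on Chr1 OR of type exon" costs two round trips, and f1 is sent twice.
result_ids = [f["id"] for f in client_side_or({"segment": "Chr1"},
                                              {"type": "exon"})]
```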
> Feedback I've heard so far is that DAS1 queries were
> acceptable, with only a few new search fields needed.
>
>> hmm, not sure how useful this would be - surely you'd want something
>> more dasmodel-aware?
>
> The example I gave was a bad one. What I meant was to show
> how there's an extension point so someone can develop a new
> search interface and clients can know that the new functionality
> exists, without having to change the DAS spec.
ok
that's probably all I've got to say on the matter, sorry for being
irksome. I guess I'm fundamentally missing something, namely: why wrap
simple and expressive declarative query languages in limited ad-hoc
constraint systems with consciously limited expressivity and limited
means of extensibility?
cheers
chris
>
> Andrew
> dalke at dalkescientific.com
>
> _______________________________________________
> DAS2 mailing list
> DAS2 at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das2