[DAS2] query language description
Gregg_Helt at affymetrix.com
Fri Mar 17 00:22:54 UTC 2006
For the type query filter, I'd suggest keeping the exacttype semantics
you discuss below, but using "type" for the field name rather than
"exacttype". If we're getting rid of one of them, and a non-exact type
is a meaningless concept, it seems like keeping that "exact" part is
unnecessary and potentially confusing.
> I think we decided that there is no type inferencing done in
> the server; it's a client side thing. In that case the 'type'
> field goes away. We can still keep 'exacttype'. The URI
> used for the matching is the type uri, and NOT the ontology URI.
> (We don't have an ontology URI yet, and when we do we can add
> an 'ontology' query.)
> The segment URI must accept the local identifier. For
> interoperability with other servers they must also accept the
> equivalent global identifier, if there is one.
> If range searches are given then one and only one segment is
> allowed. Multiple segments may be given, but then ranges are not
> allowed.
> The string searches support a simple search language.
> ABC -- contains a word which exactly matches "ABC" (identity,
> *ABC -- words ending in "ABC"
> ABC* -- words starting with "ABC"
> *ABC* -- words containing the substring "ABC"
> If you want a field which exactly contains a '*' you're kinda
> out of luck. The interpretation of whitespace in the query or
> in the search string is implementation dependent. For that
> matter, the meaning of "word" is implementation dependent. (Is
> *O'Malley* one word? *Lethbridge-Stewart*?)
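A minimal sketch of the four pattern forms above, in Python. The quoted text leaves the meaning of "word" and the handling of whitespace to the implementation, so this sketch assumes whitespace-separated words and case-sensitive comparison. The names word_matches and field_matches are hypothetical, not part of any DAS2 spec.

```python
def word_matches(pattern, word):
    """Test one word against ABC, *ABC, ABC* or *ABC*."""
    leading = pattern.startswith("*")
    trailing = len(pattern) > 1 and pattern.endswith("*")
    core = pattern[1:] if leading else pattern
    core = core[:-1] if trailing else core
    if leading and trailing:
        return core in word            # *ABC*  substring
    if leading:
        return word.endswith(core)     # *ABC   suffix
    if trailing:
        return word.startswith(core)   # ABC*   prefix
    return word == core                # ABC    exact word


def field_matches(pattern, text):
    """True if any word of the field matches the pattern.

    Assumption: words are split on whitespace; a real server may
    tokenize differently (e.g. on hyphens or apostrophes).
    """
    return any(word_matches(pattern, w) for w in text.split())
```

Under these assumptions "Lethbridge-Stewart" counts as a single word, so the pattern "Stewart" would not match it but "*Stewart" would.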
> When we looked into this last month at Sanger we verified that
> all the databases could handle %substring% searches, which was
> all that people there wanted. The Affy people want searches for
> exact word, prefix and suffix matches, as supported by the
> back-end databases.
> XXX CORRECT ME XXX
> The 'name' search searches.... It used to search the 'name'
> attribute and the 'alias' fields. There is no 'name' now. I
> moved it to 'title'. I think I did the wrong thing; it should
> be 'name', but it's a name meant for people, not computers.
> Some features (sub-parts) don't have human-readable names so
> this field must be optional.
> The "prop-*" is a search of the <PROP> elements. Features may
> have properties, like
> <PROP key="cellular_component" value="membrane" />
> To do a string search for all 'membrane' cellular components,
> construct the query key by taking the string "prop-" and
> appending the property key text ("cellular_component"). The
> query value is the text to search for.
> To search for any cellular_component containing the substring "mem"
> The rules for multiple searches with the same key also apply to the
> prop-* searches. To search for all 'membrane' or 'nuclear'
> cellular components, use two 'prop-cellular_component' terms, as
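A rough sketch of how a client might build the prop-* query terms described above, in Python. Only the "prop-" plus property key rule and the repeated-key rule come from the quoted text. The "key=value" form, the "&" separator, and the helper names prop_term and build_query are assumptions made for illustration.

```python
from urllib.parse import urlencode


def prop_term(key, value):
    """Query key is the string "prop-" followed by the property key."""
    return ("prop-" + key, value)


def build_query(terms):
    """Encode (key, value) pairs as a query string.

    Assumptions: terms are joined with "&", and "*" is left
    unescaped so the wildcard stays readable.
    """
    return urlencode(terms, safe="*")


# All 'membrane' cellular components.
exact = build_query([prop_term("cellular_component", "membrane")])

# Any cellular_component containing the substring "mem".
substring = build_query([prop_term("cellular_component", "*mem*")])

# 'membrane' or 'nuclear': repeat the same key.
either = build_query([
    prop_term("cellular_component", "membrane"),
    prop_term("cellular_component", "nuclear"),
])
```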
> The range searches are defined with explicit start and end
> coordinates. The range syntax is in the form "start:end", for
> example, "1:9".
> Let 'min' be the smallest coordinate for a feature on a given
> segment and 'max' be one larger than the largest coordinate.
> These are the lower and upper bounds for the feature.
> An 'overlaps' search matches if and only if
> min < end AND max > start
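The overlaps test above is small enough to sketch directly in Python. The function names are hypothetical, and the sketch assumes integer coordinates and a well-formed "start:end" string.

```python
def parse_range(text):
    """Split a "start:end" range such as "1:9" into two integers."""
    start, end = text.split(":")
    return int(start), int(end)


def feature_bounds(coordinates):
    """Return (min, max): the smallest coordinate and one larger
    than the largest coordinate of a feature on a segment."""
    return min(coordinates), max(coordinates) + 1


def overlaps(fmin, fmax, start, end):
    """Matches if and only if min < end AND max > start."""
    return fmin < end and fmax > start
```

For example, a feature covering coordinates 5 through 8 has bounds (5, 9). It overlaps the range "1:9" but not "9:12", because its upper bound is not greater than 9.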
> XXX For GREG XXX
> What do 'inside' and 'contains' do? Can't we just get
> away with 'excludes', which is the complement of 'overlaps'?
> Searches are done as:
> Step 0) specify the segment
> Step 1) do all the includes (if none, match all features on
> Step 2) do all the excludes, inverted (like an includes search)
> Step 3) only return features which are in Step 1 but not
> in Step 2)
> Step 4) ...
> Step 5) Profit!
> I think this will support your smart code, and it's easy
> enough to implement.
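The include/exclude steps above could be sketched in Python roughly as follows. The quoted text does not say how multiple includes combine, so this sketch assumes any one of them is enough (a union), and it assumes that an empty include list matches every feature on the segment. The feature tuples and the names overlapping and search are hypothetical.

```python
def overlapping(start, end):
    """Build a predicate for an overlaps search on (name, min, max)."""
    return lambda feature: feature[1] < end and feature[2] > start


def search(features, includes, excludes):
    """features: the features on the one segment already selected.

    Step 1: keep features matching any include (all, if none given).
    Step 2: run each exclude as if it were an include.
    Step 3: return what is in Step 1 but not in Step 2.
    """
    if includes:
        included = [f for f in features if any(p(f) for p in includes)]
    else:
        included = list(features)
    return [f for f in included if not any(p(f) for p in excludes)]


features = [("a", 1, 5), ("b", 4, 10), ("c", 20, 30)]
result = search(features, [overlapping(0, 12)], [overlapping(8, 9)])
```

Here "a" and "b" pass the include, "b" is then dropped by the exclude, and "c" never matches, so only "a" is returned.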
> Everyone but you was planning to use 'overlaps'. Only you
> wanted to use 'inside'. Anyone want to use 'contains'?
> dalke at dalkescientific.com
> DAS2 mailing list
> DAS2 at lists.open-bio.org