[DAS2] feature search algorithm (was Re: feature locations)

Andrew Dalke dalke at dalkescientific.com
Sat Aug 19 14:18:36 UTC 2006


> Given a database:
>
> Foo: (10, 60)
>   Bar: (10, 20)
>   Baz: (50, 60)

I'll modify it a bit more by giving each feature a type.  Make it be

Foo: type=transcript, location=(10, 60)
   Bar: type=exon, location=(10, 20)
   Baz: type=exon, location=(50, 60)

What does the search for

   overlaps(30,40), type==exon, title==Foo

return and why?  I can think of three answers:

  1) return everything because in the feature group

    there is a feature which overlaps(30,40) (Foo)
    there is a feature which is of type exon (Bar and Baz)
    there is a feature with title "Foo" (Foo)

(call this the "each query term must match at least
one feature in a feature group" algorithm)

  2) return nothing because there is no feature
    which overlaps(30,40) and has type exon and
    has title "Foo"

(call this the "at least one feature must be
matched by all query terms" algorithm.  This is
the current algorithm)

  3) return nothing because, while the root feature
     overlaps(30,40), there is no feature which is both
     of type exon and has title "Foo".

(call this the "range searches are special" algorithm.)


Now what does the search for

   overlaps(30,40), type==exon, title==Bar

return and why?  Using the same three algorithms:

   1) return everything because each of the three
    criteria is matched by at least one feature in
    the feature group

   2) return nothing because no feature matches all
    three criteria.

   3) return everything because the root feature
    overlaps(30,40) and the Bar feature meets the
    other two criteria.
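
To make the comparison concrete, here is a minimal Python sketch of the
three algorithms run against the example feature group.  Everything in
it is an assumption made for illustration: the Feature class, the term
helpers, the function names, treating ranges as closed intervals, and
treating the first feature in the list as the root.  It is not the DAS2
server code.  For algorithm 3 it follows one reading of "range searches
are special": range terms are tested against the root feature only, and
the remaining terms must all match a single feature.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Feature:
    title: str
    type: str
    start: int
    end: int


# A query term is a predicate over a single feature.
Term = Callable[[Feature], bool]


def overlaps(start: int, end: int) -> Term:
    # Assumption: both ranges are treated as closed intervals.
    return lambda f: f.start <= end and start <= f.end


def type_is(value: str) -> Term:
    return lambda f: f.type == value


def title_is(value: str) -> Term:
    return lambda f: f.title == value


# The example feature group; by convention here, GROUP[0] is the root.
GROUP = [
    Feature("Foo", "transcript", 10, 60),
    Feature("Bar", "exon", 10, 20),
    Feature("Baz", "exon", 50, 60),
]


def each_term_matches_some_feature(
    group: List[Feature], range_terms: List[Term], other_terms: List[Term]
) -> bool:
    # Algorithm 1: every term is satisfied by at least one feature.
    terms = range_terms + other_terms
    return all(any(term(f) for f in group) for term in terms)


def some_feature_matches_all_terms(
    group: List[Feature], range_terms: List[Term], other_terms: List[Term]
) -> bool:
    # Algorithm 2: at least one feature satisfies every term.
    terms = range_terms + other_terms
    return any(all(term(f) for term in terms) for f in group)


def range_searches_are_special(
    group: List[Feature], range_terms: List[Term], other_terms: List[Term]
) -> bool:
    # Algorithm 3: range terms apply to the root feature only; the
    # remaining terms must all be satisfied by one single feature.
    root = group[0]
    return all(term(root) for term in range_terms) and any(
        all(term(f) for term in other_terms) for f in group
    )


ALGORITHMS = (
    each_term_matches_some_feature,
    some_feature_matches_all_terms,
    range_searches_are_special,
)


def search(title: str) -> List[bool]:
    """Run overlaps(30,40), type==exon, title==<title> under each algorithm.

    True means the whole feature group is returned, False means nothing is.
    """
    range_terms = [overlaps(30, 40)]
    other_terms = [type_is("exon"), title_is(title)]
    return [alg(GROUP, range_terms, other_terms) for alg in ALGORITHMS]


print(search("Foo"))  # [True, False, False]
print(search("Bar"))  # [True, False, True]
```

The two printed lists line up with the two sets of answers above:
algorithms 1 and 2 give the same result for both queries, and only
algorithm 3 tells title==Foo apart from title==Bar.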


					Andrew
					dalke at dalkescientific.com



