[Biojava-l] Finding a feature by its ID

Thomas Down td2@sanger.ac.uk
Wed, 16 Oct 2002 14:36:18 +0100


On Wed, Oct 16, 2002 at 02:20:52PM +0200, Jansen wrote:
> 
> FeatureFilter.ByAnnotation ff1 = new
> FeatureFilter.ByAnnotation("org.biojava.bio.program.xff.id",identifier);
> FeatureHolder holder = dasSeq.filter(ff1,true);
> 
> But unfortunately this does not find all the features. For example it finds
> "components/NT_021937" if the DASSequence corresponds to chromosome 1 on the
> human genome 8.30. If I am searching for "ENSE00000897633" (which is a
> direct child of NT_021937) this works too. But if I am looking for
> "components/AL139424.21.1.86720" which is also a child of NT_021937, the
> feature is not found. The same for "AK074279@10439-10678" which is a child
> of "AL139424.21.1.86720".
> 
> Any idea about this strange behaviour?

This is rather strange behaviour...  I've taken a quick look through
but can't see any obvious explanation.  Could you tell me:

   - Which version of BioJava are you using?  Some significant
     changes to how FeatureFiltering worked were introduced into
     the development tree a couple of days ago, but won't appear
     in release versions until 1.3

   - Does the filter operation simply return an empty FeatureHolder,
     or do you get an error message?

   - (Roughly) how long does the program take to run?  Does it
     return an empty set in a couple of seconds, or does it
     take 30+ seconds?

Note that, given the current implementation of the client,
finding features far down the hierarchy is potentially a very
slow operation since the client has to fetch /all/ the annotation
from the server and scan through this exhaustively.  The latest
version of the DAS protocol has a specific mechanism for retrieving
features by ID, and future versions of the BioJava client code will
take advantage of this (indeed, one of the reasons for the API
changes which have just gone in was to support the implementation
of this feature).
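
To illustrate why an exhaustive scan gets expensive, here is a rough,
self-contained sketch.  It is NOT the BioJava or DAS client code: the
Node class, findById method and visited counter are made up for this
example, and the toy tree just reuses the IDs from your message.  A real
client would also pay a network fetch for each level it descends into,
which this sketch does not model.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a nested feature hierarchy. Hypothetical names throughout;
// this does not use or imitate the BioJava API.
public class FeatureSearchSketch {

    /** A toy feature: an ID plus any number of child features. */
    static final class Node {
        final String id;
        final List<Node> children = new ArrayList<>();

        Node(String id) { this.id = id; }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Counts the nodes examined since it was last reset, to show the cost. */
    static int visited = 0;

    /**
     * Exhaustive depth-first scan: every node is examined until the ID
     * matches. Returns null if no node in the tree has the given ID.
     */
    static Node findById(Node root, String id) {
        visited++;
        if (root.id.equals(id)) {
            return root;
        }
        for (Node child : root.children) {
            Node hit = findById(child, id);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node leaf = new Node("AK074279@10439-10678");
        Node clone = new Node("components/AL139424.21.1.86720").add(leaf);
        Node contig = new Node("components/NT_021937")
                .add(new Node("ENSE00000897633"))
                .add(clone);
        Node chromosome = new Node("chromosome 1").add(contig);

        visited = 0;
        Node hit = findById(chromosome, "AK074279@10439-10678");
        System.out.println((hit != null ? "found " + hit.id : "not found")
                + " after visiting " + visited + " nodes");
    }
}
```

The deeper the target sits, the more of the tree has to be examined
before it turns up, and a miss means the whole tree was walked.  A
server-side lookup by ID avoids that walk entirely.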

I'll try to test this this evening,

     Thomas.