[Biojava-l] Mass Search Results

Keith James kdj@sanger.ac.uk
09 Jan 2002 11:01:45 +0000


>>>>> "Will" == William Old <William.Old@UCHSC.edu> writes:

[...]

    >> Yes sounds good. There are some aspects of
    >> SeqSimilaritySearchHit like score typing that seem a little too
    >> specific since different algorithms may use different scoring
    >> strategies. So I think this is a good strategy in general as
    >> well as for including MS searches. Do you think it would be a
    >> good idea to add a getAnnotations() method to the base
    >> interface that returns a map of values for a result (both in
    >> the SearchResult interface and the Hit interface)? I am
    >> thinking this would be nice since, for example, there are lots
    >> of MS search algorithms out there and they all produce similar
    >> results with slight differences. So a getAnnotations() method
    >> would make it easier to create these different implementations
    >> without having to have a separate interface for each.

    Will> I agree with creating a new base interface to capture the
    Will> common elements of searches and extending it for the
    Will> specific requirements of the searches.  Using a generalized
    Will> method to return annotations for searches, results, and hits
    Will> is also a good idea. Even though each search algorithm
    Will> returns similar types of results, they can be very
    Will> different, and over time the data types returned will
    Will> change.

I replied to Michael off-list about this. In summary, the way Biojava
has been coping with highly variable report formats is to use the
org.biojava.bio.Annotatable interface (which has one method:
org.biojava.bio.Annotation getAnnotation()). That way concrete classes
that don't need (or want) annotations simply don't implement
Annotatable, but the option is always there for those that do.
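
To illustrate the opt-in pattern, here is a minimal sketch. It does not
use the real Biojava classes: Annotation and Annotatable below are
simplified stand-ins for the org.biojava.bio interfaces, and SearchHit,
SimpleAnnotation, PlainHit, MassSpecHit and AnnotatableSketch are
hypothetical names invented for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for org.biojava.bio.Annotation (assumption: a read-only property bag).
interface Annotation {
    Object getProperty(Object key);
    Map<Object, Object> asMap();
}

// Stand-in for org.biojava.bio.Annotatable: a single method, as described above.
interface Annotatable {
    Annotation getAnnotation();
}

// Hypothetical base interface holding only what every kind of hit shares.
interface SearchHit {
    String getSubjectId();
}

// Hypothetical map-backed Annotation implementation.
class SimpleAnnotation implements Annotation {
    private final Map<Object, Object> props;

    SimpleAnnotation(Map<Object, Object> props) {
        this.props = new LinkedHashMap<>(props); // defensive copy, keeps insertion order
    }

    public Object getProperty(Object key) {
        return props.get(key);
    }

    public Map<Object, Object> asMap() {
        return Collections.unmodifiableMap(props);
    }
}

// A hit with nothing extra to report: it does not implement Annotatable.
class PlainHit implements SearchHit {
    private final String subjectId;

    PlainHit(String subjectId) {
        this.subjectId = subjectId;
    }

    public String getSubjectId() {
        return subjectId;
    }
}

// A hit with algorithm-specific values: it opts in to Annotatable.
class MassSpecHit implements SearchHit, Annotatable {
    private final String subjectId;
    private final Annotation annotation;

    MassSpecHit(String subjectId, Annotation annotation) {
        this.subjectId = subjectId;
        this.annotation = annotation;
    }

    public String getSubjectId() {
        return subjectId;
    }

    public Annotation getAnnotation() {
        return annotation;
    }
}

class AnnotatableSketch {
    // Client code checks for Annotatable rather than requiring it of every hit.
    static String describe(SearchHit hit) {
        if (hit instanceof Annotatable) {
            Annotation a = ((Annotatable) hit).getAnnotation();
            return hit.getSubjectId() + " " + a.asMap();
        }
        return hit.getSubjectId() + " (no annotation)";
    }

    public static void main(String[] args) {
        Map<Object, Object> props = new LinkedHashMap<>();
        props.put("missedCleavages", 1); // made-up key, for illustration only
        System.out.println(describe(new MassSpecHit("P12345", new SimpleAnnotation(props))));
        System.out.println(describe(new PlainHit("Q99999")));
    }
}
```

The base interface stays small, and each search algorithm can put its
own extra values in the annotation without a separate interface per
algorithm.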

    Will> What about an interface for the storage/retrieval of search
    Will> results/hits in a database schema? An example of using a
    Will> database to store mass spec search results was recently
    Will> published in Proteomics: (Proteomics 2001, 1, 1489-1494).

If we are to get into object-relational mapping I think that it would
be good to come up with a solution which could be applied to any
Biojava object. This makes it quite a thorny problem, but one which
needs to be tackled at some point. In any event, I think that a
persistence mechanism is needed (for search result and other data),
but as a separate package.

I don't know if anyone has been considering O/R mappings for Biojava?
I've been dabbling a bit recently (with Castor, ObjectBridge etc) and
it's all fairly brain-wrenching.

Keith

-- 

-= Keith James - kdj@sanger.ac.uk - http://www.sanger.ac.uk/Users/kdj =-
Pathogen Sequencing Unit, Wellcome Trust Sanger Institute, Cambridge, UK