[Biojava-dev] blast parsing continued

Keith James kdj@sanger.ac.uk
13 Nov 2002 21:42:27 +0000


>>>>> "Doug" == Doug Rusch <drusch@tcag.org> writes:

[...]

    Doug> I think the use of sequenceDBs was a better approach than
    Doug> using just queryID, databaseID, and subjectID. Minimally, if
    Doug> you look at blast output, there are 3 valuable attributes of
    Doug> a sequence: the id, the definition line, and the length of
    Doug> the sequence. The problem is that there is no such thing
    Doug> as a sequence-less Sequence object. I tested and implemented
    Doug> an approach that makes a VirtualSequence object that is
    Doug> built with a SymbolList.EMPTY and has an overridden
    Doug> getLength method. This allows the parser to keep all
    Doug> the valuable information you might have about a sequence you
    Doug> see in a blast output while allowing you to use all the
    Doug> functionality of the sequenceDB classes.

    Doug> What are everyone else's opinions on this?

This sounds like a fine idea. It's certainly better than the dummy
objects which I was using.
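
For anyone following along, the idea can be sketched roughly as below.
This is only an illustration under assumptions: SimpleSequence and
SimpleSequenceDB are hypothetical stand-ins invented for the example,
not the BioJava Sequence or SequenceDB interfaces, and this
VirtualSequence is not Doug's actual implementation. The point is that
the object carries no symbols but still reports the id, the definition
line and the length taken from the blast report.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a sequence interface (NOT the BioJava API).
interface SimpleSequence {
    String getName();          // the id from the blast report
    String getDescription();   // the definition line
    List<Character> symbols(); // the residues, if we have them
    int length();              // the sequence length
}

// A "sequence-less" sequence: no symbols are stored, but the length
// reported in the blast output is kept and returned by length().
final class VirtualSequence implements SimpleSequence {
    private final String name;
    private final String description;
    private final int reportedLength;

    VirtualSequence(String name, String description, int reportedLength) {
        if (reportedLength < 0) {
            throw new IllegalArgumentException("length must be >= 0");
        }
        this.name = name;
        this.description = description;
        this.reportedLength = reportedLength;
    }

    public String getName() { return name; }

    public String getDescription() { return description; }

    // Always empty: the blast report does not give us the full sequence.
    public List<Character> symbols() { return Collections.emptyList(); }

    // Overridden to report the length from the blast output rather than
    // the size of the (empty) symbol list.
    public int length() { return reportedLength; }
}

// Hypothetical stand-in for a sequence database keyed by id.
final class SimpleSequenceDB {
    private final Map<String, SimpleSequence> byId = new HashMap<>();

    void add(SimpleSequence seq) { byId.put(seq.getName(), seq); }

    // Returns null if no sequence with this id has been added.
    SimpleSequence get(String id) { return byId.get(id); }
}
```

A parser could then add one VirtualSequence per query or subject to
such a database and look it up by id later, without ever needing the
residues themselves.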

Almost all the feedback I've had previously was from people finding
that the original API was hard to use. If we can roll your solution
into the dist then we can have the best of both worlds (by the magic
of CVS).

Keith

-- 

- Keith James <kdj@sanger.ac.uk> bioinformatics programming support -
- Pathogen Sequencing Unit, The Wellcome Trust Sanger Institute, UK -