[Biojava-dev] blast parsing continued

Simon Brocklehurst simon.brocklehurst@CambridgeAntibody.com
Tue, 19 Nov 2002 09:35:02 +0000


Doug Rusch wrote:
> 
> I agree that checking in the code I have now is a problem. Breaking
> the HMMer, FASTA, and possibly WU-BLAST parsers would be very bad, not to
> mention that it requires Java 1.4. Short of overhauling all the existing
> tool parsers, there are only a few options that I can see:
> 
> 1) branching
> 2) creating new packages parallel to the existing parsing code
>    (search/ssbind/sax)
> 3) starting a code base for BioJava 2
> 
> I would like to make my code available for the community to look at,
> test, and comment on, but not at the inconvenience of a large number of
> BioJava's users. Is there a preferred solution?
> 

Doug,

I haven't had a chance to look at your DTD in detail, so I don't know
how similar it is to the original.  But if the changes you need could be
made as *optional* additions to the DTD, people could slot your new
parser straight into their existing ContentHandlers.
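
As a rough illustration of why optional additions would be painless for
existing users, here is a minimal sketch of a SAX handler that only knows
about one element and silently skips anything else.  The element names
(Report, Hit, ExtraStats) and the class and method names are hypothetical,
chosen for the example; they are not taken from the BioJava DTD or from
Doug's DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class OptionalElementDemo {

    /** Stands in for an existing handler that only knows about <Hit>. */
    static class HitCollector extends DefaultHandler {
        final List<String> hitIds = new ArrayList<String>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attrs) {
            if ("Hit".equals(qName)) {
                hitIds.add(attrs.getValue("id"));
            }
            // Any other element, including a newly added optional one,
            // is simply ignored by this handler.
        }
    }

    /** Parses the XML string and returns the ids of all <Hit> elements. */
    static List<String> collectHitIds(String xml) throws Exception {
        HitCollector handler = new HitCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.hitIds;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical event stream as the original parser might emit it.
        String oldStyle = "<Report><Hit id=\"a\"/></Report>";
        // Same stream with a hypothetical optional child element added.
        String newStyle =
            "<Report><Hit id=\"a\"><ExtraStats value=\"1\"/></Hit></Report>";

        System.out.println(collectHitIds(oldStyle));
        System.out.println(collectHitIds(newStyle));
    }
}
```

Both inputs yield the same list of hit ids, because the handler never
looks at the extra element.  A handler written against the new DTD could
pick up ExtraStats without the old one needing any change.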

To me, that seems the ideal solution.  I'd be surprised if your parser
didn't work better for NCBI Blast; using the new regular expressions in
Java 1.4 is a neat way to go.  It would be nice if people could simply
plug it in.
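
For anyone who hasn't tried the java.util.regex package that arrived in
Java 1.4, a minimal sketch of the approach follows.  It pulls the bit
score, raw score and E-value out of a line shaped like the "Score = ...
Expect = ..." line in NCBI Blast text output.  The class name, method name
and sample line are made up for illustration; this is not Doug's code, and
a real parser would need to cope with more variants of the line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlastScoreLine {

    // Captures: 1 = bit score, 2 = raw score, 3 = E-value.
    private static final Pattern SCORE_LINE = Pattern.compile(
        "Score\\s*=\\s*([0-9.]+)\\s*bits\\s*\\((\\d+)\\),"
        + "\\s*Expect\\s*=\\s*([0-9.eE+-]+)");

    /** Returns {bits, rawScore, expect}, or null if the line doesn't match. */
    static String[] parse(String line) {
        Matcher m = SCORE_LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        // Illustrative sample line, not taken from a real report.
        String[] fields = parse(" Score = 87.4 bits (215), Expect = 2e-17");
        System.out.println(fields[0] + " " + fields[1] + " " + fields[2]);
    }
}
```

Keeping each line format in its own compiled Pattern makes it fairly easy
to adjust one expression when the output format shifts, without touching
the rest of the parser.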

As I say, I haven't had a chance to look at the DTD, so this may not be
possible.

Simon
--
Dr Simon M. Brocklehurst, Ph.D.
Director of Informatics & Robotics

Cambridge Antibody Technology
The Science Park
Melbourn
Cambridgeshire
SG8 6JJ, UK

Telephone: + 44 (0) 1763 263233
Facsimile + 44 (0) 1763 263413
Email: mailto:simon.brocklehurst@cambridgeantibody.com
http://www.cambridgeantibody.com