[Bioperl-l] RFC: Bio::SearchIO and Bio::Search::* objects

Ewan Birney birney@ebi.ac.uk
Sun, 21 Oct 2001 13:18:37 +0100 (BST)

On Sun, 21 Oct 2001, Thomas Down wrote:

> I'd say go for it.  Event-driven schemes are nice, not only
> in that they allow power users to extract subsets and summaries
> of the information quickly and efficiently, they also open the
> possibility of inserting `transducers' in the pipeline to rewrite
> the information in some way (BioJava uses transducers to help
> parse some file formats).
> We've been using two different event-driven parsing systems in
> BioJava for quite some time:
>   - Sequence IO (a.k.a. newio in BioJava 1.1):
>       This uses a SeqIOListener interface which was in some ways
>       inspired by SAX, but dedicated to sequence handling.
>   - Blast (and other apps) parsing, designed by the CAT people:
>       This actually uses SAX interfaces.  The parsers implicitly
>       transform blast output into XML.  With a simple SAX -> XML
>       dumper, you can actually see this XML if you want.


Can you give us a pointer to a clean description of a SAX-style
interface, or a quick four-line method summary?
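
For reference, a SAX-style interface comes down to a handful of
callbacks that the parser invokes as it reads: start of an element,
character data, end of an element. The sketch below uses Python's
standard xml.sax module purely as an illustration; it is not the BioJava
or BioPerl API. The class and function names (HitCollector,
UppercaseNames, collect_hits) and the <search>/<hit> element names are
hypothetical, made up for this example. It also shows a `transducer' in
the sense Thomas describes: a filter that sits between the parser and
the listener and rewrites events on the way through.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class HitCollector(ContentHandler):
    """Listener: receives parse events and keeps only the hit names."""

    def __init__(self):
        super().__init__()
        self.hits = []
        self._in_hit = False
        self._text = []

    def startElement(self, name, attrs):
        # Called when the parser sees an opening tag.
        if name == "hit":
            self._in_hit = True
            self._text = []

    def characters(self, content):
        # Character data may arrive in several chunks, so buffer it.
        if self._in_hit:
            self._text.append(content)

    def endElement(self, name):
        # Called when the parser sees a closing tag.
        if name == "hit":
            self.hits.append("".join(self._text).strip())
            self._in_hit = False


class UppercaseNames(XMLFilterBase):
    """Transducer: rewrites character events before passing them on."""

    def characters(self, content):
        super().characters(content.upper())


def collect_hits(xml_bytes, transduce=False):
    """Parse xml_bytes and return the text of each (hypothetical) <hit>."""
    reader = xml.sax.make_parser()
    if transduce:
        # Insert the filter into the pipeline: parser -> filter -> listener.
        reader = UppercaseNames(reader)
    collector = HitCollector()
    reader.setContentHandler(collector)
    reader.parse(io.BytesIO(xml_bytes))
    return collector.hits
```

The listener never sees the whole document at once, which is what lets
a power user pull out a summary cheaply; swapping the filter in or out
changes what the listener receives without touching the parser.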

Jason, how do you feel about all of this: you asked for opinions and
just got people making your life harder? (Remember, whoever codes it
wins the argument ;)).

> [In the last week, there has been a new suggestion in BioJava-land.
> Namely, using a lex-like parser generator -- jflex in this case --
> to generate a lot of the code of a flatfile -> SAX parser.  There
> was a quite impressive example of a blast parser implemented this
> way posted a few days back.  Search the BioJava mailing list archives
> for LSAX if you're interested].

I believe this is close to the Biopython Martel system. We are at least
getting close to a convergence of ideas between the bio* projects, though
not code sharing ;)

> Hope this helps,
>     Thomas.