[Bioperl-l] RFC: Bio::SearchIO and Bio::Search::* objects
Ewan Birney
birney@ebi.ac.uk
Sun, 21 Oct 2001 13:18:37 +0100 (BST)
On Sun, 21 Oct 2001, Thomas Down wrote:
>
> I'd say go for it. Event-driven schemes are nice, not only
> in that they allow power users to extract subsets and summaries
> of the information quickly and efficiently, they also open the
> possibility of inserting `transducers' in the pipeline to rewrite
> the information in some way (BioJava uses transducers to help
> parse some file formats).
>
> We've been using two different event-driven parsing systems in
> BioJava for quite some time:
>
> - Sequence IO (a.k.a. newio in BioJava 1.1):
>
> This uses a SeqIOListener interface which was in some ways
> inspired by SAX, but dedicated to sequence handling.
>
> - Blast (and other apps) parsing, designed by the CAT people:
>
> This actually uses SAX interfaces. The parsers implicitly
> transform blast output into XML. With a simple SAX -> XML
> dumper, you can actually see this XML if you want.
[snip]
Can you give us a pointer to a clean description of a SAX-style interface,
or a quick four-line method summary?
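
As a rough illustration of what such a summary might look like, here is a
minimal sketch using Python's standard xml.sax module (not BioJava's or
BioPerl's code). The parser pushes events into a handler through a handful
of callbacks: startElement, characters and endElement. The element names
(report, hit, score), the sample data and the ScoreCollector and
parse_scores names are all made up for the example.

```python
import xml.sax

# Hypothetical XML view of a search report; the element names are invented.
REPORT = b"""<report>
  <hit id="sp|P12345"><score>231</score></hit>
  <hit id="sp|Q67890"><score>87</score></hit>
</report>"""


class ScoreCollector(xml.sax.ContentHandler):
    """Keeps only hit ids and scores; everything else is ignored."""

    def __init__(self):
        super().__init__()
        self.scores = {}
        self._current_hit = None
        self._buffer = None  # non-None only while inside <score>

    def startElement(self, name, attrs):
        if name == "hit":
            self._current_hit = attrs["id"]
        elif name == "score":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "score":
            self.scores[self._current_hit] = int("".join(self._buffer))
            self._buffer = None


def parse_scores(data):
    handler = ScoreCollector()
    xml.sax.parseString(data, handler)
    return handler.scores
```

The handler never sees a whole document tree, only the events it cares
about, which is what makes extracting subsets and summaries cheap.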
Jason - how do you feel about all of this? You asked for opinions and
just got people making your life harder (remember, whoever codes it does
win the argument ;)).
>
> [In the last week, there has been a new suggestion in BioJava-land.
> Namely, using a lex-like parser generator -- jflex in this case --
> to generate a lot of the code of a flatfile -> SAX parser. There
> was a quite impressive example of a blast parser implemented this
> way posted a few days back. Search the BioJava mailing list archives
> for LSAX if you're interested].
>
I believe this is close to the Biopython Martel system. We are at least
getting close to a convergence of ideas between the bio* projects, though
not code sharing ;)
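
To make the flatfile -> SAX idea concrete, here is a small hand-written
sketch in Python. It is not the LSAX/jflex parser and not Martel's API:
a single regular expression stands in for a generated lexer, and the
one-line-per-hit input format, the element names and the flatfile_to_sax
and dump_xml functions are assumptions made up for the example. Events
are sent to any SAX ContentHandler; plugging in the standard XMLGenerator
gives the "SAX -> XML dumper" Thomas mentions.

```python
import io
import re
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# Hypothetical flat format, one hit per line, e.g. "sp|P12345  score=231"
HIT_LINE = re.compile(r"^(?P<id>\S+)\s+score=(?P<score>\d+)$")


def flatfile_to_sax(lines, handler):
    """Turn recognised lines into SAX events on the given handler."""
    handler.startDocument()
    handler.startElement("report", AttributesImpl({}))
    for line in lines:
        match = HIT_LINE.match(line.strip())
        if match is None:
            continue  # skip lines this toy grammar does not recognise
        handler.startElement("hit", AttributesImpl({"id": match.group("id")}))
        handler.startElement("score", AttributesImpl({}))
        handler.characters(match.group("score"))
        handler.endElement("score")
        handler.endElement("hit")
    handler.endElement("report")
    handler.endDocument()


def dump_xml(lines):
    """Use XMLGenerator as the handler to see the implied XML."""
    out = io.StringIO()
    flatfile_to_sax(lines, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()
```

Because the producer only talks to the handler interface, the same
function could feed an object builder or a filter instead of the dumper.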
>
> Hope this helps,
>
> Thomas.
>