[Bioperl-l] new directions

Chris Mungall cjm@fruitfly.bdgp.berkeley.edu
Wed, 7 Mar 2001 16:22:22 -0800 (PST)


On Wed, 7 Mar 2001, Jason Stajich wrote:

> [snip]
>  o The Blast issues.  I think the pluggable features to BPlite would be
>    ideal, I don't know how well it will work ( wanting to parse more or
>    less of the report -- runtime plugging of 'adaptors'?) . I like the
>    html features of Bio::Tools::Blast.  What about parsing NCBI Blast XML?

this isn't a new idea, so apologies if i'm retreading old ground - i like
the idea of having a 'layered' parser for blast. a lightweight,
object-model-neutral parser would turn the raw output into sax-like
events, probably isomorphic to ncbi-xml. you could layer on top of this
something that catches the events and turns them into bioperl objects; or
if you wanted html-ised blast reports you would build this directly on the
event layer.
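
A rough sketch of that layering, written in Python for brevity rather
than as a proposed bioperl API. Everything in it is an assumption made up
for illustration: the event names (report/hit/hsp), the handler classes,
and above all the toy input format, which is far simpler than a real
blast report. The point is only that the parser knows nothing about the
consumers, and the object builder and the html writer sit side by side on
the same event stream.

```python
import html
import re


class EventHandler:
    """Base handler: ignores every event unless a subclass overrides it."""

    def start_element(self, name, attrs):
        pass

    def end_element(self, name):
        pass


def parse_report(lines, handler):
    """Turn a toy report into SAX-like events (hypothetical format, not real BLAST)."""
    handler.start_element("report", {})
    in_hit = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new hit closes the previous one, if any.
            if in_hit:
                handler.end_element("hit")
            handler.start_element("hit", {"id": line[1:].strip()})
            in_hit = True
        else:
            m = re.match(r"\s*Score\s*=\s*([\d.]+)", line)
            if m and in_hit:
                handler.start_element("hsp", {"score": float(m.group(1))})
                handler.end_element("hsp")
    if in_hit:
        handler.end_element("hit")
    handler.end_element("report")


class ObjectBuilder(EventHandler):
    """Catches events and builds plain dicts, standing in for object-model objects."""

    def __init__(self):
        self.hits = []

    def start_element(self, name, attrs):
        if name == "hit":
            self.hits.append({"id": attrs["id"], "hsps": []})
        elif name == "hsp":
            self.hits[-1]["hsps"].append(attrs["score"])


class HtmlWriter(EventHandler):
    """Builds an HTML list straight from the events, with no objects in between."""

    def __init__(self):
        self.parts = []

    def start_element(self, name, attrs):
        if name == "report":
            self.parts.append("<ul>")
        elif name == "hit":
            self.parts.append("<li>" + html.escape(attrs["id"]))
        elif name == "hsp":
            self.parts.append(" (score %s)" % attrs["score"])

    def end_element(self, name):
        if name == "hit":
            self.parts.append("</li>")
        elif name == "report":
            self.parts.append("</ul>")
```

Usage would be along the lines of parse_report(open(path), ObjectBuilder())
or parse_report(open(path), HtmlWriter()); swapping the consumer needs no
change to the parser.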
  
>  o Bio::Index::Blast which can read fetch ( and store?) seqs from a blast
>    index.

you mean using the .nsq files or whatever they are to do random access of
a fasta file to pluck out your seq of interest?

I already have a perl module for random access of fasta files (we
occasionally use fasta files with ~20mb entries and it's useful to be able
to snip out a couple of kb from right in the middle of the entry), I'll
happily donate this to bioperl. 

it can be slowish for randomly accessing EST seqs out of huge EST fasta
files, so I was going to write the underlying implementation in C and have
it use the blast index (it currently creates its own custom index format).
If no one has written something like this already (surely someone must
have?) and folks would find this useful I'll do it as a bioperl project.
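
For anyone unfamiliar with the idea, here is a minimal sketch of offset-based
random access into a fasta file, in Python. It is not the module described
above and not the blast index format; the function names and the in-memory
index layout are invented for illustration. It assumes every sequence line
within an entry has the same width (except possibly the last) and does no
bounds checking against the end of the entry.

```python
def build_index(path):
    """Map each sequence id to (offset of first base, bases per line, bytes per line)."""
    index = {}
    with open(path, "rb") as fh:
        while True:
            line = fh.readline()
            if not line:
                break
            if line.startswith(b">"):
                # The id is the first whitespace-delimited word of the header.
                seq_id = line[1:].split()[0].decode("ascii")
                start = fh.tell()
                first = fh.readline()
                bases = len(first.rstrip(b"\r\n"))
                index[seq_id] = (start, bases, len(first))
                # Rewind so the main loop sees this line again.
                fh.seek(start)
    return index


def fetch(path, index, seq_id, start, length):
    """Return `length` bases beginning at 0-based position `start` of one entry."""
    offset, bases, width = index[seq_id]
    if length <= 0 or bases == 0:
        return ""
    first_line, col = divmod(start, bases)
    last_line = (start + length - 1) // bases
    # Each line boundary crossed adds the line terminator bytes to the read.
    nbytes = length + (last_line - first_line) * (width - bases)
    with open(path, "rb") as fh:
        fh.seek(offset + first_line * width + col)
        chunk = fh.read(nbytes)
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode("ascii")
```

The seek goes straight to the wanted bases, so snipping a couple of kb out
of the middle of a ~20mb entry reads only those few kb. In this sketch the
cost that remains is building the index, which still scans the whole file
once.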
  
> [snip]
> Jason Stajich
> jason@chg.mc.duke.edu
> Center for Human Genetics
> Duke University Medical Center 
> http://www.chg.duke.edu/ 