[Bioperl-l] Questions about Bio::Index and Bio::DB
morgarws@mh.us.sbphrd.com
Tue, 02 Apr 2002 16:54:38 -0500
I have a couple of questions about these two methods for random access into
FASTA files:
1. Since a GenBank FASTA file normally has the gi and gb identifiers
concatenated together in the header, it appears that Bio::Index can only
access a sequence by the concatenated ID. Is this the expected behavior?
2. Bio::DB::Fasta seems to be able to generate an index (via the -makeid
option) for one of the IDs but not for all that appear, i.e. the routine
specified with the -makeid option is supposed to return a scalar. Is it
planned, or in the works, to allow the -makeid routine to return a list of IDs
under which the sequence is indexed?
3. Both the latest version of WashU BLAST and (I believe, though I haven't
checked for myself) NCBI BLAST can generate an index file for a FASTA file,
which can then be used for random access to the sequences in that file
(WashU provides this with its xdformat and xdget commands). Is anybody working
on adding support for these indexes to either Bio::Index or Bio::DB? And if
someone wanted to do it, which module would it make the most sense to add
it to?
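
To make questions 1 and 2 concrete, here is a rough sketch of the behavior I
have in mind. It is written in Python rather than Perl, it is not BioPerl code
and does not use any BioPerl API, and the function names (ids_from_header,
build_index, fetch) and the accession in the comment are made up for
illustration. It assumes headers of the form ">gi|NUMBER|gb|ACCESSION| text"
and records the same byte offset under the full concatenated ID, the gi number,
and the accession.

```python
def ids_from_header(header):
    """Return every ID a record could be looked up by (hypothetical helper).

    Assumes an NCBI-style header such as
    '>gi|129295|gb|AAA12345.1| some protein' (accession invented here).
    """
    words = header.lstrip(">").split()
    if not words:
        return []
    full_id = words[0]
    fields = full_id.split("|")
    ids = [full_id]  # the concatenated ID, as it is indexed today
    # Fields alternate tag, value (gi, 129295, gb, AAA12345.1); a trailing
    # unpaired field such as a locus name is ignored in this sketch.
    for _tag, value in zip(fields[0::2], fields[1::2]):
        if value:
            ids.append(value)
    return ids


def build_index(path):
    """Map each ID to the byte offset of its record's header line."""
    index = {}
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                header = line.decode("ascii", "replace")
                for seq_id in ids_from_header(header):
                    index[seq_id] = offset
    return index


def fetch(path, index, seq_id):
    """Seek straight to a record and return (header, sequence)."""
    with open(path, "rb") as handle:
        handle.seek(index[seq_id])
        header = handle.readline().decode("ascii", "replace").rstrip()
        chunks = []
        for line in handle:
            if line.startswith(b">"):
                break  # next record starts here
            chunks.append(line.decode("ascii", "replace").strip())
    return header, "".join(chunks)
```

With an index built this way, fetch(path, index, "129295") and
fetch(path, index, "AAA12345.1") would return the same record as a lookup by
the full concatenated ID, which is what a list-returning -makeid routine would
give me.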
Thanks in advance,
Bill Morgart
morgarws@molbio.sbphrd.com