[Bioperl-l] Bio::DB::Fasta

Fri Jun 27 21:49:21 EDT 2003

I have two follow-up questions, the first of which is small:
You sent me three links to documentation earlier.
Was this on one of them, or somewhere else? I couldn't find it.

Second:
I have one database that just consists of one Fasta ID line followed by
the entire genome.  When I say

$db = Bio::DB::Fasta->new($filename);
$bigseq = $db->get_Seq_by_id($id);

it crashes with this error:

Odd number of elements in anonymous hash at
/usr/local/bio/www/cgi-bin/BPPNew/Bio/DB/Fasta.pm line 969.

It does THAT because of the call in Fasta.pm that goes:

  return bless { db    => $db,
                 id    => $id,
                 start => $start || 1,
                 stop  => $stop  || $db->length($id)
               },$class;

because $stop is undefined and $db->length($id) evidently returns nothing
(an empty list in list context, rather than a defined value).
Since $db comes from the Bio::DB::Fasta constructor, I tend to assume
that that's where the problem is, but that's where the code gets hard to
understand and my investigation ended, so now I'm asking you folks.
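
For anyone trying to reproduce the warning without BioPerl, here is a minimal sketch of the mechanism as I understand it. It assumes the lookup fails with a bare "return", which I have not confirmed in Fasta.pm itself. The names length_of and %lengths are hypothetical stand-ins, not part of Bio::DB::Fasta.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a length lookup that fails for an unknown ID.
# A bare "return" yields undef in scalar context but an EMPTY LIST in
# list context, and the right-hand side of || inherits list context
# inside an anonymous hash constructor.
sub length_of {
    my ($lengths, $id) = @_;
    return unless exists $lengths->{$id};
    return $lengths->{$id};
}

my %lengths = (chr1 => 1000);

# Capture warnings so they can be inspected instead of going to STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my ($start, $stop);    # both undefined, as in the failing call

# Known ID: the lookup returns one value, so the hash gets 6 elements.
my $good = { id    => 'chr1',
             start => $start || 1,
             stop  => $stop  || length_of(\%lengths, 'chr1') };

# Unknown ID: the lookup returns an empty list, leaving 5 elements.
my $bad  = { id    => 'nosuch',
             start => $start || 1,
             stop  => $stop  || length_of(\%lengths, 'nosuch') };

print "good stop: $good->{stop}\n";
print "warnings captured: ", scalar(@warnings), "\n";
print $warnings[0] if @warnings;
```

If this is what is happening, the warning would mean that $id is not among the IDs the index knows about, rather than that the sequence is too large.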

It works when I use a database that's broken up into chunks.

The only other (obvious) difference between the working version and the
non-working version is that the working version is uppercase and the
non-working version is lowercase, and I don't expect this should be an
issue.

Is the large size of the database a problem for returning seq objects?
Do I need to go back to LargeSeq?

Thanks,
Mike

On Fri, 2003-06-27 at 07:38, Brian Osborne wrote:
> Michael,
> 
> This comes directly from the module's documentation:
> 
> use Bio::DB::Fasta;
> 
> # Bio::SeqIO-style access
> my $stream  = Bio::DB::Fasta->new('test.fa')->get_PrimarySeq_stream;
> 
> while ( my $seq = $stream->next_seq ) {
>    print $seq->seq;
> }
> 
> You also asked about "customizing" the indexing so you can use ids you have
> in hand, which partially match the ids in the file. See the bptutorial
> section on "Indexing or accessing..." or FAQ question 2.5.
> 
> 
> Brian O.