[Bioperl-l] need help with large genbank file

Dinakar Desai Desai.Dinakar@mayo.edu
Tue, 23 Jul 2002 17:30:52 -0500


Hello:

I am new to Perl and BioPerl. I have downloaded the nt file from NCBI 
(ftp://ftp.ncbi.nih.gov/blast/db/nt), and this file is quite large. I am 
trying to parse this file for a certain pattern with BioPerl, but I get 
an error. I have looked into largefasta.pm, and they suggest not using it.
I would appreciate it if you could help me with this problem.

My code, which is meant to test only 5 records from this big file, is as follows:
<code>
#!/usr/bin/env perl

use lib '/home/desas2/perl_mod/lib/site_perl/5.6.0/';

use Bio::SeqIO;

my $seqio = Bio::SeqIO->new( -file   => "/home/desas2/data/nt",
                             -format => 'Fasta' );

# Print at most 5 sequences, stopping early if the file has fewer.
my $count = 5;
while ( $count-- > 0 and defined( my $seqobj = $seqio->next_seq() ) ) {
    print $seqobj->seq(), "\n";
}
</code>
and the error message is:
<error>
------------ EXCEPTION  -------------
MSG: Could not open /home/desas2/data/nt for reading: File too large
STACK Bio::Root::IO::_initialize_io /home/desas2/perl_mod/lib/site_perl/5.6.0//Bio/Root/IO.pm:244
STACK Bio::SeqIO::_initialize /home/desas2/perl_mod/lib/site_perl/5.6.0//Bio/SeqIO.pm:381
STACK Bio::SeqIO::new /home/desas2/perl_mod/lib/site_perl/5.6.0//Bio/SeqIO.pm:314
STACK Bio::SeqIO::new /home/desas2/perl_mod/lib/site_perl/5.6.0//Bio/SeqIO.pm:327
STACK toplevel ./test_fasta.pl:8

--------------------------------------
</error>

Do you have any suggestions on how I could read this big file and 
get a sequence object? I know how to manipulate a sequence object.

Thank you.

Dinakar