[Bioperl-l] Bio::DB::GenPept fetch pooping out every ~200 queries?

Dave Messina David.Messina at sbc.su.se
Mon Oct 27 00:27:57 UTC 2008


Hey everyone,

We're seeing weird behavior when fetching GenPept protein
records by accession number over the net. After 200-250 queries we get
a bogus "acc <foo> does not exist" error, accompanied by a
"Couldn't fork: Resource temporarily unavailable" exception. Repeat the
query with the same accession, though, and it's retrieved just fine.
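
Since repeating the query succeeds, one possible workaround is to retry a failed fetch a few times with a short pause. The following is only a minimal sketch, assuming the failure is transient. fetch_with_retry and its parameters are hypothetical names, not part of BioPerl; the fetch is passed in as a code reference so the helper does not depend on any particular database class.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): call $fetch->() up to
# $max_tries times, pausing $pause seconds between attempts.
# Returns the first true result, or undef if every attempt failed.
sub fetch_with_retry {
    my ($fetch, $max_tries, $pause) = @_;
    $max_tries = 3 unless defined $max_tries;
    $pause     = 1 unless defined $pause;

    for my $try (1 .. $max_tries) {
        # eval traps the exception so a failed attempt is not fatal
        my $result = eval { $fetch->() };
        return $result if $result;

        my $err = $@ || "no result returned\n";
        warn "attempt $try of $max_tries failed: $err";
        sleep $pause if $pause && $try < $max_tries;
    }
    return;
}
```

In the script below, the call inside the loop could then become something like
my $seq = fetch_with_retry(sub { $db->get_Seq_by_acc($id) }, 3, 2);
where the retry count and pause are arbitrary values chosen for illustration.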

Anyone else experience this, understand why, or have a solution?

On a related note, this is a fatal error. It seems to me it should be
just a warning so that additional queries can continue.
What's the rationale for making it fatal?
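
Until that changes, the exception can be trapped on the caller's side so that one bad fetch does not stop the run. This is a rough sketch under the assumption that the fetch call dies on failure; fetch_each and its callback arguments are hypothetical names made up for illustration, not BioPerl API.

```perl
use strict;
use warnings;

# Hypothetical helper: run $fetch->($id) for each ID, turning any
# exception into a warning so the remaining IDs are still processed.
# Successful results are handed to $on_success.
# Returns the list of IDs that could not be fetched.
sub fetch_each {
    my ($ids, $fetch, $on_success) = @_;
    my @failed;
    for my $id (@$ids) {
        my $seq = eval { $fetch->($id) };
        if ($seq) {
            $on_success->($seq);
        }
        else {
            my $err = $@ || "no sequence returned\n";
            warn "Didn't find seq $id: $err";
            push @failed, $id;
        }
    }
    return @failed;
}
```

With the script below, this might be called as
my @failed = fetch_each(\@ids, sub { $db->get_Seq_by_acc($_[0]) }, sub { $out->write_seq($_[0]) });
and the IDs left in @failed could be run through a second pass.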

Errors and code below.

Thanks!
Dave



> ------------- EXCEPTION -------------
> MSG: Couldn't fork: Resource temporarily unavailable
> STACK Bio::DB::WebDBSeqI::_open_pipe /Network/Servers/courses2.cshl.edu/Users/jramsey/Library/Perl/5.8.8/Bio/DB/WebDBSeqI.pm:709
> STACK Bio::DB::WebDBSeqI::_stream_request /Network/Servers/courses2.cshl.edu/Users/jramsey/Library/Perl/5.8.8/Bio/DB/WebDBSeqI.pm:738
> STACK Bio::DB::WebDBSeqI::get_seq_stream /Network/Servers/courses2.cshl.edu/Users/jramsey/Library/Perl/5.8.8/Bio/DB/WebDBSeqI.pm:455
> STACK Bio::DB::NCBIHelper::get_Stream_by_acc /Network/Servers/courses2.cshl.edu/Users/jramsey/Library/Perl/5.8.8/Bio/DB/NCBIHelper.pm:466
> STACK Bio::DB::WebDBSeqI::get_Seq_by_acc /Network/Servers/courses2.cshl.edu/Users/jramsey/Library/Perl/5.8.8/Bio/DB/WebDBSeqI.pm:173
> STACK toplevel ./genbank:26
> -------------------------------------
>
>
> ------------- EXCEPTION -------------
> MSG: acc AAF52404 does not exist
> STACK Bio::DB::WebDBSeqI::get_Seq_by_acc /Network/Servers/courses2.cshl.edu/Users/jramsey/Library/Perl/5.8.8/Bio/DB/WebDBSeqI.pm:182
> STACK toplevel ./genbank:26


>
> #!/usr/bin/perl -w
> use strict;
> use Bio::DB::GenPept;
> use Bio::SeqIO;
> use Getopt::Long;
>
> my $idfile;
> my $format   = 'genbank';
> GetOptions(
>            'i|input:s' => \$idfile,
>           'f|format:s' => \$format,
>         );
> $idfile = shift @ARGV if ! defined $idfile;
> my $db = Bio::DB::GenPept->new;
>
> my $out = Bio::SeqIO->new(-format => $format,
>                           -file   => ">$idfile.out",
>                          );
>
> # three-argument open keeps the file name from being parsed as a mode
> open(my $fh, '<', $idfile) || die "cannot open '$idfile': $!";
>
> while (<$fh>) {
>   my $id = $_;
>   chomp($id); # refseq id from the file
>   my $seq = $db->get_Seq_by_acc($id);
>   if ( $seq ) {
>     $out->write_seq($seq);
>   } else {
>     warn("Didn't find seq $id\n");
>   }
> }
>



