[Bioperl-l] Re: [Bioperl-guts-l] Notification: incoming/996

Jason Stajich jason@chg.mc.duke.edu
Mon, 30 Jul 2001 13:36:38 -0400 (EDT)


Neilay -

I'm not sure I agree - you should be testing whether next_seq returns a
valid Seq or PrimarySeq object before you try to call length.  The
defined behavior of the next_seq method is that if there is no valid
sequence to read in, it returns undef.  I'm not sure I understand your
statement here.

> If the fasta file contains no sequence,
> the following call which should ideally return
> 0 throws an exception "Cannot parse entry".

Are you asking for it to throw an exception OR return 0 here?  

You should try upgrading to bioperl 0.7, as we are currently not going back
and fixing any bugs on the 0.6 branch.  Perhaps I am not seeing the same
behavior as you since I am running off the main trunk.

SeqIO is an iterator, which means you cannot expect next_seq to always
return a valid object - if it returns undef, it has reached the end of the
data stream.

use Bio::SeqIO;

my $seqio = Bio::SeqIO->new(-file => $file);
my $saw_any = 0;

# next_seq returns undef once the stream is exhausted
while( my $seq = $seqio->next_seq ) {
 $saw_any = 1;
}

if( ! $saw_any ) {
 # file was empty
}
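
The same check can be applied to the chained call from the report.  This is
only a minimal sketch: next_seq_length is a hypothetical helper name, not
part of bioperl, and it assumes next_seq returns undef when there is nothing
left to read.  It does not catch an exception thrown by the parser itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of bioperl): return the length of the
# next sequence in the stream, or 0 if the stream has no more sequences.
# $seqio is expected to be a Bio::SeqIO stream, i.e. any object that
# provides a next_seq method.
sub next_seq_length {
    my ($seqio) = @_;
    my $seq = $seqio->next_seq;    # undef at end of stream
    return defined $seq ? $seq->length : 0;
}

# Usage, assuming bioperl is installed and $file names a fasta file:
#   use Bio::SeqIO;
#   my $length = next_seq_length( Bio::SeqIO->new(-file => $file) );
```

Checking the return value in one place keeps the caller from invoking length
on undef, which is what the unguarded chain does on an empty stream.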

-jason

On Mon, 30 Jul 2001 bioperl-bugs@bioperl.org wrote:

> JitterBug notification
> 
> new message incoming/996
> 
> Message summary for PR#996
> 	From: dedhia@cshl.org
> 	Subject: Bio::SeqIO::fasta::next_primary_seq throws an exception if sequence length is zero.
> 	Date: Mon, 30 Jul 2001 11:17:39 -0400
> 	0 replies 	0 followups
> 
> ====> ORIGINAL MESSAGE FOLLOWS <====
> 
> >From dedhia@cshl.org Mon Jul 30 11:17:39 2001
> Received: from localhost (localhost [127.0.0.1])
> 	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f6UFHdw07710
> 	for <bioperl-bugs@pw600a.bioperl.org>; Mon, 30 Jul 2001 11:17:39 -0400
> Date: Mon, 30 Jul 2001 11:17:39 -0400
> Message-Id: <200107301517.f6UFHdw07710@pw600a.bioperl.org>
> From: dedhia@cshl.org
> To: bioperl-bugs@bioperl.org
> Subject: Bio::SeqIO::fasta::next_primary_seq throws an exception if sequence length is zero. 
> 
> Full_Name: Neilay Dedhia
> Module: Bio::SeqIO::fasta
> Version: 0.6.1
> PerlVer: 5.00502
> OS: Solaris
> Submission from: (NULL) (143.48.7.14)
> 
> 
> If the fasta file contains no sequence,
> the following call which should ideally return
> 0 throws an exception "Cannot parse entry". 
> 
> $length = Bio::SeqIO->new(-file => $file)
>                      ->next_seq()
>                      ->length(); 
> 
> Here is a patch:
> 
> *** /usr/local/lib/perl5/site_perl/5.005/Bio/SeqIO/fasta_old.pm Mon Jul 30 10:35:47 2001
> --- /usr/local/lib/perl5/site_perl/5.005/Bio/SeqIO/fasta.pm     Mon Jul 30 10:36:20 2001
> ***************
> *** 111,117 ****
>         return unless $entry = $self->_readline;
>     }
>   
> !   my ($top,$sequence) = $entry =~ /^(.+?)\n([^>]+)/s
>       or $self->throw("Can't parse entry");
>     my ($id,$fulldesc) = $top =~ /^\s*(\S+)\s*(.*)/
>       or $self->throw("Can't parse fasta header");
> --- 111,117 ----
>         return unless $entry = $self->_readline;
>     }
>   
> !   my ($top,$sequence) = $entry =~ /^(.+?)\n([^>]*)/s
>       or $self->throw("Can't parse entry");
>     my ($id,$fulldesc) = $top =~ /^\s*(\S+)\s*(.*)/
>       or $self->throw("Can't parse fasta header");
> 
> 
> _______________________________________________
> Bioperl-guts-l mailing list
> Bioperl-guts-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-guts-l
> 

Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center 
http://www.chg.duke.edu/