[Bioperl-l] Empty FASTA files with Bio::SeqIO
Hilmar Lapp
lapp@gnf.org
Wed, 20 Dec 2000 16:03:10 -0800
"J.C. Diggans" wrote:
>
> I went ahead and patched my local version from 0.6.2 (patch below). It
> was a quick fix, can anyone think of a case in which this would fail?
>
> - jc
>
> 122,123c122,134
> < my ($top,$sequence) = $entry =~ /^(.+?)\n([^>]+)/s
> < or $self->throw("Can't parse entry");
> ---
> > # Check for empty sequences and handle gracefully
> > my ($top,$sequence);
> > if( $entry =~ /^(.+?)\n([^>]+)/s ) {
> > # There is valid sequence present
> > ($top,$sequence) = $entry =~ /^(.+?)\n([^>]+)/s
> > or $self->throw("Can't parse entry");
> > } else {
> > # There is no sequence present,
> > $top = $entry =~ /^(.+?)\n/
> > or $self->throw("Can't parse entry"); # save top
> > $sequence = ""; # set sequence to empty string
> > }
> >
>
A correctly FASTA-formatted empty sequence ought to have an empty line after
the '>' line. I think we should check for that, just to be sure we're not
misinterpreting something.
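A minimal sketch of such a check, assuming the entry arrives as the header line plus whatever follows it, with the leading '>' already stripped. The name parse_fasta_entry is a hypothetical stand-alone helper, not part of Bio::SeqIO, and it uses die where the module would call $self->throw. Note that the header is captured in list context; a scalar assignment such as `$top = $entry =~ /.../` would store the match result rather than the captured text.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::SeqIO): split one FASTA entry into
# its header line and its sequence. An empty sequence is accepted only when
# the header is followed by an explicit empty line.
sub parse_fasta_entry {
    my ($entry) = @_;

    # List-context match: $top gets the header, $body everything after it.
    my ($top, $body) = $entry =~ /^([^\n]+)\n(.*)\z/s
        or die "Can't parse entry: no header line\n";

    if ($body =~ /^\s*\z/) {
        # No sequence characters at all: insist on the empty line.
        $body =~ /^\n/
            or die "Can't parse entry: no empty line after header '$top'\n";
        return ($top, '');
    }

    # Regular entry: strip line breaks and other whitespace from the sequence.
    (my $sequence = $body) =~ s/\s+//g;
    return ($top, $sequence);
}

# Usage: a regular entry, then a well-formed empty one.
my ($id, $seq) = parse_fasta_entry("id1 test\nACGT\nTT\n");
print "$id => '$seq'\n";
($id, $seq) = parse_fasta_entry("empty1\n\n");
print "$id => '$seq'\n";
```

With this shape, a header that is followed by nothing at all is still reported as a parse error, while a header followed by an empty line comes back as an empty string.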
Second, Bio::Seq currently won't let you define an empty seq. This needs to
be fixed, too.
If your fix works for you, that's fine. 0.7 will still take a while anyway,
unless someone donates a fuzzy-location full-coverage package for
Christmas.
Hilmar
--
-------------------------------------------------------------
Hilmar Lapp email: lapp@gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------