[Bioperl-l] Re: fasta.pm and '>>' in description

Mon, 31 Jul 2000 16:10:41 +0100

Yes, it was me who fixed this. BTW, the comment is partially incorrect, as I
now see; it should actually say

      # a greater sign not preceded by a newline indicates that there is
      # a greater sign within the description, so we need more to complete
      # the record

Now, the regexp detecting this should be okay in your case, too. However,
the statement

   return unless $next_rec = $self->_readline;

within the loop is probably the offending line, because it treats two
things in the same way, namely an undefined value and an empty string,
which both evaluate to FALSE. The duplicated '>' will make the call to
_readline() return an empty string. To fix this, replace the statement with

     return unless defined($next_rec = $self->_readline());

and replace the regexp in the while condition with

     /(^|.)>$/

It works for me, and I'll fix it in the main trunk tonight (I do not have
CVS access at work).
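
To illustrate the difference between the two tests, here is a minimal
sketch. The helpers read_chunk() and count_chunks() are made up for this
example and are not the BioPerl code; read_chunk() merely assumes a reader
that returns one '>'-terminated chunk with the trailing '>' stripped.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a record reader (NOT the real _readline()):
# read one '>'-terminated chunk and strip the trailing '>'.
sub read_chunk {
    my ($fh) = @_;
    local $/ = '>';
    my $chunk = <$fh>;
    chomp $chunk if defined $chunk;   # chomp removes the current $/, i.e. '>'
    return $chunk;                    # undef only at end of input
}

# Count the chunks seen before the loop gives up, using either a plain
# truth test or a defined() test as the stop condition.
sub count_chunks {
    my ($data, $use_defined) = @_;
    open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
    my $n = 0;
    while (1) {
        my $chunk = read_chunk($fh);
        if ($use_defined) { last unless defined $chunk; }
        else              { last unless $chunk; }   # '' is false, too
        $n++;
    }
    close $fh;
    return $n;
}

my $data = "seq1 a >> b\nACGT\n";
print "truth test:   ", count_chunks($data, 0), " chunk(s)\n";
print "defined test: ", count_chunks($data, 1), " chunk(s)\n";
```

With this input the truth test stops at the empty chunk between the two
'>' characters, while the defined() test only stops at end of input.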

     Hilmar

BTW if possible you probably shouldn't deviate from the $/ mechanism,
because this way the actual record reading is abstracted from the
underlying physical operations, and fasta.pm would then be the only module
bypassing this.
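
As a rough sketch of how record reading via $/ can cope with '>' inside a
description, the following reassembles records from '>'-terminated chunks
using the regexp given above. read_records() is a made-up name and reads
from an in-memory string; it is an illustration under those assumptions,
not the code in fasta.pm.

```perl
use strict;
use warnings;

# Hypothetical helper: split FASTA-like text into records using $/ = '>'.
sub read_records {
    my ($data) = @_;
    open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
    local $/ = '>';
    my @records;

    my $rec = <$fh>;
    # The very first chunk is just the '>' that opens the first record.
    $rec = <$fh> if defined $rec && $rec eq '>';

    while (defined $rec) {
        # A trailing '>' that is not preceded by a newline sits inside
        # the description, so append chunks until the record is complete.
        while ($rec =~ /(^|.)>$/) {
            my $more = <$fh>;
            last unless defined $more;
            $rec .= $more;
        }
        chomp $rec;                      # drop the '>' of the next record
        push @records, $rec if length $rec;
        $rec = <$fh>;
    }
    close $fh;
    return @records;
}

my @records = read_records(">seq1 a >> b\nACGT\n>seq2 desc\nTTTT\n");
print scalar(@records), " records\n";
```

The caller only ever sees whole records; how they are cut out of the
stream stays hidden behind the $/ setting.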

Kris Boulez <krbou@pgsgent.be> on 31.07.2000 14:26:10

To:   HILMAR LAPP/PH/Novartis@PH
cc:
Subject:  fasta.pm and '>>' in description

Dear,

Judging by your initials, it was you who corrected Bio::SeqIO::fasta.pm
for descriptions containing a '>' in bioperl-live.

      # HL 05/25/2000
      # a greater sign not preceded by a newline indicates that there is
      # a newline within the description, so we need more to complete the
      # record

Your fix unfortunately doesn't work for description lines containing
'>>'. I could go and fix this one, but might it not be a better idea to
avoid
       local $/ = '>';
and read line by line instead?
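
For illustration only, a line-by-line reader could look roughly like the
sketch below. parse_fasta_lines() and the header/seq hash keys are made up
for this example, and the default $/ (newline) is assumed; this is not
code from bioperl-live.

```perl
use strict;
use warnings;

# Hypothetical helper: parse FASTA-like text one line at a time.
# Only a '>' at the start of a line opens a new entry, so any '>'
# inside a description is left alone.
sub parse_fasta_lines {
    my ($data) = @_;
    open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
    my (@entries, $current);
    while (defined(my $line = <$fh>)) {
        chomp $line;
        if ($line =~ /^>(.*)/) {
            push @entries, $current if $current;
            $current = { header => $1, seq => '' };
        }
        elsif ($current) {
            $current->{seq} .= $line;    # join wrapped sequence lines
        }
    }
    push @entries, $current if $current;
    close $fh;
    return @entries;
}

my @entries = parse_fasta_lines(">seq1 a >> b\nACGT\n>seq2 desc\nTTTT\n");
print "$_->{header}: $_->{seq}\n" for @entries;
```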

Kris,
--
Kris Boulez    Aventis CropScience N.V.