[Bioperl-l] Found bug in fasta.pm

Aaron J Mackey ajm6q@virginia.edu
Thu, 9 Nov 2000 23:54:44 -0500 (EST)


On Thu, 9 Nov 2000, Hilmar Lapp wrote:

> > > {
> > >   local $/ = "\n>";
> > >   while(<>) {
> > >     chomp;                 # remove trailing "\n>"
> > >     my ($id, $desc, $seq) =
> > >       $_ =~ m/^>?        # beginning >, only 1st seq.
> > >                 (\S+)\s+   # identifier
> > >                 ([^\n]+)\n # description line
> > >                 (.*)$      # sequence
> > >                /sox;       # dot matches newline, compile-once, ignore-whitespace
> 
> To be honest, I'm not so happy with having the above expression literally
> in the code because IMHO it appears to be too strict: a description must be
> present, as must be the id.
> 
> Fasta seqs frequently come without a description, and through
> web-interfaces often even without an id. 

I agree with the first point entirely.

I don't understand the second point. You mean a fasta entry that looks
like:

>
MAGHRETRH...


?? That's very odd.  But I guess acceptable.  So go with the "\n>" input
record separator, and everyone's happy.
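
For illustration, here is a minimal sketch of that approach. It is not the
actual fasta.pm code: parse_fasta is a hypothetical helper, and reading from
an in-memory string is only an assumption to keep the example self-contained.
It reads one record per "\n>" separator and treats both the id and the
description as optional, so the odd entry above would still parse.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fasta.pm): split FASTA text into
# [id, description, sequence] triples, tolerating a missing id or description.
sub parse_fasta {
    my ($text) = @_;
    my @records;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    local $/ = "\n>";                 # read one FASTA record at a time
    while (my $rec = <$fh>) {
        chomp $rec;                   # remove trailing "\n>" (the current $/)
        $rec =~ s/^>//;               # only the first record keeps its leading ">"
        my ($header, $seq) = split /\n/, $rec, 2;
        $header = '' unless defined $header;
        $seq    = '' unless defined $seq;
        # Both captures may be empty: no id, no description, or neither.
        my ($id, $desc) = $header =~ /^(\S*)\s*(.*)$/;
        $seq =~ s/\s+//g;             # join wrapped sequence lines
        push @records, [$id, $desc, $seq];
    }
    close $fh;
    return @records;
}

my @recs = parse_fasta(">seq1 first protein\nMAGH\nRETRH\n>seq2\nACGT\n>\nMAGHRETRH\n");
printf "id='%s' desc='%s' seq='%s'\n", @$_ for @recs;
```

Splitting the header line off first and matching it separately keeps the
pattern short, and an entry with neither id nor description simply yields two
empty strings instead of a failed match.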

-Aaron