[Bioperl-l] Found bug in fasta.pm
Aaron J Mackey
ajm6q@virginia.edu
Thu, 9 Nov 2000 23:54:44 -0500 (EST)
On Thu, 9 Nov 2000, Hilmar Lapp wrote:
> > > {
> > > local $/ = "\n>";
> > > while(<>) {
> > > chomp; # remove trailing "\n>"
> > > my ($id, $desc, $seq) =
> > > $_ =~ m/^>? # beginning >, only 1st seq.
> > > (\S+)\s+ # identifier
> > > ([^\n]+)\n # description line
> > > (.*)$ # sequence
> > > /sox; # dot matches newline (/s), compile-once, ignore-whitespace
>
> To be honest, I'm not so happy with having the above expression literally
> in the code because IMHO it appears to be too strict: a description must be
> present, as must be the id.
>
> Fasta seqs frequently come without a description, and through
> web-interfaces often even without an id.
I agree with the first point entirely.
I don't understand the second point. You mean a fasta entry that looks
like:
>
MAGHRETRH...
?? That's very odd, but I guess it's acceptable. So go with the "\n>" input
record separator, and everyone's happy.
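For what it's worth, here is a rough sketch of how the "\n>" record separator could be combined with a more forgiving header match, so that both the description and the id may be missing. This is only an illustration under my own assumptions, not the code in fasta.pm: the name parse_fasta and the hash keys id, desc and seq are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fasta.pm): read FASTA records from a
# filehandle, tolerating a missing description and a missing id.
sub parse_fasta {
    my ($fh) = @_;
    my @records;
    local $/ = "\n>";                 # each read stops just before the next header
    while (my $entry = <$fh>) {
        chomp $entry;                 # remove the trailing "\n>" if present
        $entry =~ s/^>//;             # only the first record still carries its ">"
        # Split the header line from the remaining sequence lines.
        my ($header, $seq) = split /\n/, $entry, 2;
        $header = '' unless defined $header;
        $seq    = '' unless defined $seq;
        # First word of the header is the id, the rest is the description;
        # either may be empty.
        my ($id, $desc) = $header =~ /^\s*(\S*)\s*(.*)$/;
        $seq =~ s/\s+//g;             # join wrapped sequence lines
        push @records, { id => $id, desc => $desc, seq => $seq };
    }
    return @records;
}

# Usage with an in-memory filehandle; the sample data is invented.
my $sample = ">seq1 first protein\nMAGH\nRETRH\n>seq2\nACGT\n>\nMAGHRETRH\n";
open my $in, '<', \$sample or die "cannot open in-memory handle: $!";
for my $rec (parse_fasta($in)) {
    print "id=[$rec->{id}] desc=[$rec->{desc}] seq=[$rec->{seq}]\n";
}
close $in;
```

With this sample the third record comes back with an empty id and an empty description, which is the odd-but-acceptable case above. Splitting the header off first keeps the match simple and avoids requiring whitespace or a description after the id.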
-Aaron