[Bioperl-l] fasta format
Wiepert, Mathieu
Wiepert.Mathieu@mayo.edu
Mon, 26 Aug 2002 08:46:57 -0500
All good points. Patterns below are not exact, only meant to illustrate (I hope) the use cases. My take is
>\n(or whatever OS says a new line is)
is valid
>\s+\n
is valid
>\s+(.*)
is valid, as described by Bill Pearson. Should have null ID, then description.
>\S+\s+(.*)
is valid, as it already works
Is that all agreeable?
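
For illustration, the four cases above could be handled roughly as in the sketch below. This is only a sketch under my reading of the patterns: split_fasta_header is a hypothetical helper, not an existing BioPerl function, and treating a bare ">" as an empty ID with an empty description is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not BioPerl API): split one FASTA header line into
# (id, description), following the four cases listed above.
sub split_fasta_header {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;             # drop the line ending, Unix or DOS style
    return unless $line =~ s/^>//;    # not a header line: return an empty list

    # ">" alone, or ">" followed only by whitespace: null ID, null description
    return ('', '') if $line =~ /^\s*$/;

    # ">" then whitespace then text: null ID, the text is the description
    return ('', $1) if $line =~ /^\s+(.*)$/;

    # ">" immediately followed by an ID, then an optional description
    my ($id, $desc) = $line =~ /^(\S+)\s*(.*)$/;
    return ($id, $desc);
}

my ($id, $desc) = split_fasta_header("> a protein with no identifier\n");
print "id='$id' desc='$desc'\n";
```

The order of the checks matters: the whitespace-only case has to be tested before the "whitespace then description" case, otherwise both would fall into the same branch (with the same result here, but less readably).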
-----Original Message-----
From: Matthew Pocock [mailto:matthew_pocock@yahoo.co.uk]
Sent: Saturday, August 24, 2002 11:26 AM
To: Paul Gordon
Cc: bioperl-l@bioperl.org
Subject: Re: [Bioperl-l] fasta format
Hi. There is no one-size-fits-all solution for fasta description lines.
Perhaps an optional callback on the fasta parser object that takes all
text following ">" including all whitespace, and returns an array -
(id,description)? You could write a handful of default callbacks with
obvious names and in really mad situations (SCOP fasta may be a candidate
for this), the user can provide their own. Apologies if the bioperl
fasta already has this functionality.
Matthew
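
A minimal sketch of the callback idea, to make it concrete. Everything named here (%header_parsers, first_word, strict_id, parse_headers) is hypothetical and is not the Bio::SeqIO interface; the sketch only shows a reader that hands all text after ">" to a replaceable code reference and expects (id, description) back.

```perl
use strict;
use warnings;

# Hypothetical default callbacks. Each receives the raw text after ">"
# (whitespace included) and returns (id, description).
my %header_parsers = (
    # First whitespace-delimited word is the ID; leading spaces are skipped.
    first_word => sub {
        my ($text) = @_;
        my ($id, $desc) = $text =~ /^\s*(\S*)\s*(.*)/;
        return ($id, $desc);
    },
    # Leading whitespace means "no ID, everything is description".
    strict_id => sub {
        my ($text) = @_;
        return ('', $1) if $text =~ /^\s+(.*)/;
        my ($id, $desc) = $text =~ /^(\S*)\s*(.*)/;
        return ($id, $desc);
    },
);

# Hypothetical reader: returns one [id, description] pair per header line
# found in a string of FASTA text. Sequence lines are ignored here.
sub parse_headers {
    my ($fasta_text, $callback) = @_;
    $callback ||= $header_parsers{first_word};    # assumed default
    my @records;
    for my $line (split /\r?\n/, $fasta_text) {
        next unless $line =~ /^>(.*)/;
        push @records, [ $callback->($1) ];
    }
    return @records;
}

# A user-supplied callback for an unusual format: split on the first "|".
my $pipe_parser = sub {
    my ($text) = @_;
    my ($id, $desc) = split /\|/, $text, 2;
    return (defined $id ? $id : '', defined $desc ? $desc : '');
};

for my $rec (parse_headers(">d1abc_|SCOP-like entry\nACGT\n", $pipe_parser)) {
    print "id='$rec->[0]' desc='$rec->[1]'\n";
}
```

The two defaults differ only in how they treat a space after ">", which is exactly the tradeoff discussed in the quoted message below.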
Paul Gordon wrote:
>>my ($id,$fulldesc) = $top =~ /^\s*(\S+)\s*(.*)/
>
>
> I guess the tradeoffs are between:
>
> 1. people who put a description, but no identifier at all, for whom the
> current code does not work nicely
>
> 2. people who have a space between the > and the identifier.
>
> So, which is more likely to occur? If you wanted to get really fancy, you
> might check, if there is a leading space, if the next word looks like an
> identifier (e.g. /^[^A-Z\-]$/i). Even swissprot ids usually have
> numbers or underscores. It may not work all the time (e.g. 16S kind of
> descriptors), but perhaps it's better than assuming the user isn't
> providing an identifier at all? And it would be mostly backward
> compatible?
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>
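
The heuristic Paul describes could be sketched as below. One assumption to flag: the pattern as quoted, /^[^A-Z\-]$/i, only matches a one-character word, so this sketch takes the intent to be "the word contains at least one character that is neither a letter nor a hyphen". The names looks_like_id and split_header_guessing_id are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical heuristic: a word "looks like an identifier" if it contains
# at least one character that is neither a letter nor a hyphen
# (for example a digit or an underscore).
sub looks_like_id {
    my ($word) = @_;
    return $word =~ /[^A-Za-z\-]/ ? 1 : 0;
}

# Takes the text following ">" and returns (id, description).
sub split_header_guessing_id {
    my ($text) = @_;
    my ($lead, $first, $rest) = $text =~ /^(\s*)(\S*)\s*(.*)/;

    # No leading space: keep the existing behaviour, first word is the ID.
    return ($first, $rest) if $lead eq '';

    # Leading space: accept the first word as an ID only if it looks like one.
    return ($first, $rest) if looks_like_id($first);

    # Otherwise treat the whole line as description with a null ID.
    # Note that runs of whitespace after the first word collapse to one space.
    return ('', length($rest) ? "$first $rest" : $first);
}

for my $header (' P12345 kinase', ' hypothetical protein', ' 16S ribosomal RNA') {
    my ($id, $desc) = split_header_guessing_id($header);
    print "id='$id' desc='$desc'\n";
}
```

The last header shows the weakness already mentioned: "16S" contains a digit, so it is taken as an identifier even though it is really the start of a description.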
--
BioJava Consulting LTD - Support and training for BioJava
http://www.biojava.co.uk