[Bioperl-l] Phylip format error

Peter Cock p.j.a.cock at googlemail.com
Thu May 23 08:30:21 UTC 2013


On Thu, May 23, 2013 at 8:22 AM, Alexey Morozov
<alexeymorozov1991 at gmail.com> wrote:
> This is made worse by the existence of a relaxed phylip format,
> which allows up to 250 chars for a taxon name. The name is separated from the
> sequence by a single space, which creates problems if names were padded to
> 10 chars with whitespace, as in Felsenstein's strict format. On the whole,
> phylip is as messily defined a format as one can make from a plain text file
> with the information content of fasta.
> The BioPerl documentation says nothing about whether Bio::SeqIO accepts relaxed
> phylip or how it tells the dialects apart. Even if the code support
> is OK, it may be worthwhile to explain this somewhere at bioperl.org

Biopython's AlignIO defines both a (strict) "phylip" and a "phylip-relaxed"
format as two separate formats (or variants, like the "fastq" variants).
Doing the same in BioPerl would seem sensible, since auto-detection is
not easy.

http://biopython.org/wiki/AlignIO#File_Formats
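
To illustrate why auto-detection is hard, here is a minimal Python sketch
(not Biopython or BioPerl code; the function names and sample lines are
made up for illustration). It splits the same line under both conventions,
assuming a strict name is exactly the first 10 columns and a relaxed name
ends at the first space:

```python
def split_strict(line, name_width=10):
    """Strict convention: the name is the first 10 columns, padding stripped."""
    name = line[:name_width].strip()
    seq = line[name_width:].replace(" ", "")
    return name, seq


def split_relaxed(line):
    """Relaxed convention: the name runs up to the first space."""
    name, _, rest = line.strip().partition(" ")
    seq = rest.replace(" ", "")
    return name, seq


# Hypothetical sample lines, one written for each dialect.
strict_line = "Homo sapieACGTACGTAC"           # 10-column name with an internal space
relaxed_line = "Homo_sapiens_chr1 ACGTACGTAC"  # long name, single space separator

for line in (strict_line, relaxed_line):
    print("strict :", split_strict(line))
    print("relaxed:", split_relaxed(line))
```

Each line parses without error under the wrong convention, but the name
and sequence come out wrong, so a parser cannot reliably tell the dialects
apart from a single line. Asking the caller to name the dialect avoids
the guess.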

Peter

P.S. Where does that 250-character limit for the taxon name come from?
The trouble with relaxed phylip is that some tools are more relaxed than
others ;)


