[Bioperl-l] Phylip format error

Adam Witney awitney at sgul.ac.uk
Thu May 23 08:43:15 UTC 2013


Not sure if there is an actual question in these messages, but BioPerl
can be used to generate valid Phylip format and run PHYLIP on it, like this:

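## Load the modules used below
use Bio::SimpleAlign;
use Bio::AlignIO;
use Bio::Tools::Run::Phylo::Phylip::Pars;

## Build Align object ($seqs is an array ref of Bio::LocatableSeq objects)
my $aln = Bio::SimpleAlign->new(-seqs=>$seqs);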

## Swap the taxon names for unique 8-character IDs;
## $ref_name is a hash ref mapping each new ID back to the original name
my ($aln_safe, $ref_name) = $aln->set_displayname_safe(8);

## Write out a phylip format infile (sequential, using the safe IDs)
Bio::AlignIO->new(-file => '>infile.out', -format => 'phylip',
                  -interleaved => 0)->write_aln($aln_safe);

## run PHYLIP's pars program
my @params = (idlength => 10);   # optionally also: jumble => "17,10"
my $tree_factory = Bio::Tools::Run::Phylo::Phylip::Pars->new(@params);
$tree_factory->quiet(1);  # Suppress pars messages to terminal
my $tree = $tree_factory->create_tree($aln_safe);

## Restore the original names on the leaf nodes
my @nodes = sort { defined $a->id && defined $b->id && $a->id cmp $b->id } $tree->get_nodes();
foreach my $nd (@nodes) {
	if ( $nd->is_Leaf ) {
		$nd->id($ref_name->{$nd->id_output});
	}
}
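
In case it helps with the strict vs. relaxed point raised in the quoted
messages below, here is a minimal sketch in plain Perl (no BioPerl) of the
two name layouts. phylip_row is a made-up helper for illustration only, not
a BioPerl function, and the sequences are toy data:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): format one alignment row.
# Strict PHYLIP: name padded or truncated to exactly 10 columns,
#                so the sequence data starts in column 11.
# Relaxed PHYLIP: full name, a single space, then the sequence data.
sub phylip_row {
    my ($name, $seq, $strict) = @_;
    return $strict
        ? sprintf('%-10.10s%s', $name, $seq)
        : "$name $seq";
}

# Toy data, assumed to be already aligned (equal lengths)
my %seqs = (Homo_sapiens => 'ACGTACGT', Mus => 'ACGTACGA');
my $len  = length((values %seqs)[0]);

for my $strict (1, 0) {
    # Header line: number of taxa, then alignment length
    printf " %d %d\n", scalar(keys %seqs), $len;
    print phylip_row($_, $seqs{$_}, $strict), "\n" for sort keys %seqs;
}
```

Note that in the strict layout a name longer than 10 characters is cut off,
which is why unique short IDs are swapped in before running PHYLIP above.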

HTH

Adam

On 23/05/2013 08:22, Alexey Morozov wrote:
> This is made worse by the existence of a relaxed phylip format, which
> allows up to 250 characters for a taxon name. There the name is separated
> from the sequence by a single space, which creates problems if names were
> padded with whitespace to 10 characters, as in Felsenstein's strict format.
> On the whole, phylip is as messily defined a format as one can make from a
> plain text file with the information content of fasta.
> The BioPerl documentation says nothing about whether Bio::SeqIO accepts
> relaxed phylip or how it tells the dialects apart. Even if the code support
> is OK, it may be worthwhile to explain this somewhere at bioperl.org
> 
> 
> 2013/5/21 Bernard Cohen <b.l.cohen_home at btinternet.com>
> 
>> Hello!
>>
>> I happened to check what the Perl webpage says about Phylip
>> format for DNA alignment files and see that it is erroneous.
>>
>> I am not a Perl user and do not want to be bothered to register or
>> otherwise learn how to make an official comment, so I am forwarding this
>> for someone to pick up.
>>
>> Phylip format allows up to 10 characters for taxon names; the data must
>> start in the 11th column. This can be checked on Joe Felsenstein's site.
>>
>> The Perl page found by searching for "Phylip format PERL" allows only 8
>> characters for the name.
>>
>> B. L. Cohen
>>
> 
> 
> 


