[Bioperl-l] extending the PHYLIP format

Martin MOKREJŠ mmokrejs at ribosome.natur.cuni.cz
Sat May 31 11:10:53 UTC 2008


BTW, the truncated IDs could also be fixed using t-coffee; at least, that
is described in the docs:
http://www.tcoffee.org/Documentation/t_coffee/t_coffee_tutorial.htm#_Toc148261714

Albert Vilella wrote:
> Hi Heikki,
> 
> About a year ago, some code was added to deal with the "more than 10 chars"
> ids problem
> (https://www.nescent.org/wg_phyloinformatics/Phylohackathon_1/BioPerl_Targets).
> 
> 
> Basically: (1) mapping the long ids to 10-char numeric ids, (2) running the
> program with the id limitation, (3) reverting the ids back to the originals
> in the output. The pods explain how to do it.
> 
> So I would say that the solution is at least "partially" there :-)
> 
>     Albert.
> 
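
The three steps Albert lists can be sketched roughly as follows. This is only
an illustration in Python, not the BioPerl code he refers to: the function
names shorten_ids and restore_ids and the sequence names are made up for the
example, and the "program output" is a hand-written Newick string standing in
for whatever the phylogenetics program would return.

```python
import re


def shorten_ids(long_ids, width=10):
    """Step 1: map each long ID to a unique zero-padded numeric ID."""
    forward = {}
    for long_id in long_ids:
        if long_id in forward:
            raise ValueError(f"duplicate ID: {long_id!r}")
        short_id = str(len(forward) + 1).zfill(width)
        if len(short_id) > width:
            raise ValueError("too many sequences for the chosen ID width")
        forward[long_id] = short_id
    return forward


def restore_ids(text, forward):
    """Step 3: put the original long IDs back into the program output."""
    reverse = {short: long for long, short in forward.items()}
    if not reverse:
        return text
    # One pass over the text, so a restored long ID is never rewritten again.
    pattern = re.compile("|".join(re.escape(short) for short in reverse))
    return pattern.sub(lambda m: reverse[m.group(0)], text)


# Hypothetical long names, as one might have for HIV sequences.
ids = ["HIV1_subtypeB_patient042_clone7", "HIV1_subtypeC_patient113_clone2"]
forward = shorten_ids(ids)

# Step 2 (running the ID-limited program) is replaced here by a
# hand-written tree that uses the short IDs.
tree = "(0000000001:0.1,0000000002:0.2);"
print(restore_ids(tree, forward))
```

Purely numeric short IDs are easy to generate, but they could collide with
numbers that already appear in the output (branch lengths, bootstrap values),
so a real implementation would probably want a less ambiguous scheme.
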
> On Wed, May 28, 2008 at 9:23 AM, Heikki Lehvaslaiho <heikki at sanbi.ac.za>
> wrote:
> 
>> I just learned that a number of phylogenetics packages (PAUP, PHYML, and
>> MrBayes at least) now allow IDs longer than 10 characters in PHYLIP format.
>> The documentation is scarce, but the rules seem to be:
>>
>> 1. There can be spaces before the ID.
>> 2. The ID can be up to 50 characters long.
>> 3. The ID can contain any characters. If you are using spaces within the ID,
>> you have to put the whole ID in single quotes ('). Single quotes can be
>> used for all IDs and are removed when parsing in.
>> 4. It is customary to have two spaces between the ID and the sequence.
>>
>> This custom seems to have come into PHYLIP format from Nexus.
>> Note that this allows sequences in a file to start at different columns.
>>
>> Can anyone shed more light on the matter?
>>
>>
>> I need to get this into bioperl, as the names in the HIV sequences that I
>> work with are very long and cannot be sensibly truncated.
>>
>> What would be the best way to do this?
>> 1. Add more options to the already heavily hacked Bio::AlignIO::phylip.pm
>> 2. Create a Bio::AlignIO::phyliplong.pm
>>
>> Do those ugly hacks for supporting fixed-length long IDs really, really
>> belong in the vanilla phylip.pm file?
>>
>> Opinions?
>>
>>        -Heikki
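
For what it is worth, the four rules Heikki lists above could be handled per
line along these lines. This is a minimal sketch in Python, assuming the rules
are exactly as he describes them; it is not how Bio::AlignIO::phylip works.
The name parse_relaxed_line and the max_id_len parameter are made up for the
example, and the "ntax nchar" header line and interleaved blocks are not
handled.

```python
import re

# One data line: optional indent, ID (quoted or bare), whitespace, sequence.
_LINE = re.compile(
    r"""
    ^\s*                        # rule 1: spaces may precede the ID
    (?: '(?P<quoted>[^']*)'     # rule 3: quoted ID, may contain spaces
      | (?P<bare>[^'\s]\S*) )   # unquoted ID, no spaces allowed
    \s+                         # rule 4: separator, customarily two spaces
    (?P<seq>\S.*?)\s*$          # sequence, possibly in spaced blocks
    """,
    re.VERBOSE,
)


def parse_relaxed_line(line, max_id_len=50):
    """Split one relaxed-PHYLIP data line into (id, sequence)."""
    match = _LINE.match(line)
    if match is None:
        raise ValueError(f"cannot parse line: {line!r}")
    seq_id = match.group("quoted")
    if seq_id is None:
        seq_id = match.group("bare")
    # Rule 2: the ID may be up to 50 characters long (quotes not counted).
    if not seq_id or len(seq_id) > max_id_len:
        raise ValueError(f"ID must be 1 to {max_id_len} characters: {seq_id!r}")
    # Drop any spaces used to group the sequence into blocks.
    sequence = "".join(match.group("seq").split())
    return seq_id, sequence


# Hypothetical input lines; note that the sequences start at different columns.
for raw in ["  'HIV1 subtype B isolate 42'  ACGT-ACGT",
            "short_id  ACGT ACGT"]:
    print(parse_relaxed_line(raw))
```

Because the ID is delimited by quotes or by the first run of whitespace rather
than by a fixed column, the same function copes with sequences that start at
different columns, which is the consequence Heikki points out.
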



