[Bioperl-l] BioPerl and NHX tree

Laurence Amilhat Laurence.Amilhat at toulouse.inra.fr
Thu Jan 3 14:29:09 UTC 2008


Dear all,

I am trying to convert a Newick tree into an NHX tree so that I can add 
a taxid tag to each leaf.

I am using the modules Bio::TreeIO and Bio::Tree::NodeNHX.
The idea is:
1) read the Newick tree
2) get each leaf and look up its taxid
3) add the NHX taxonomy ID tag (T) to the leaf
4) write the NHX tree

I was able to do the first two steps, and I can create a NodeNHX object 
and add the T tag to it,
but I don't know how to write an NHX tree that uses the NodeNHX objects 
I created...

Does anyone have an idea? Any help is welcome.

Thanks,

laurence.


Here are my code and the sample files for better understanding. The script is run as:
newick2nhx.pl -f test_tree.nwk -o tata -c seq_taxid.txt

_newick2nhx.pl:_
use strict;
use warnings;
use Bio::TreeIO;
use Bio::Tree::NodeNHX;
use Getopt::Long;

my $tree_file;
my $outfile;
my $codefile;
my %corresp;    # sequence id => taxid

GetOptions('f|file:s' => \$tree_file, 'o|out:s' => \$outfile, 'c|code:s' => \$codefile);

# Load the sequence id / taxid table (tab-separated)
open(my $code_fh, '<', $codefile) or die "Cannot open $codefile: $!";
while (<$code_fh>)
{
    chomp;
    my ($seq_id, $taxid) = split(/\t/);
    $corresp{$seq_id} = $taxid;
}
close($code_fh);

my $treeio  = Bio::TreeIO->new(-format => 'newick', -file => $tree_file);
my $treeout = Bio::TreeIO->new(-format => 'nhx', -file => ">$outfile");

while (my $tree = $treeio->next_tree)
{
    my @nodes = $tree->get_nodes();
    foreach my $nd (@nodes)
    {
        if ($nd->is_Leaf())
        {
            my $id = $nd->id();
            print "$id TAXID ", $corresp{$id}, "\n";

            # This NodeNHX object carries the T tag but is never attached to $tree
            my $nodenhx = Bio::Tree::NodeNHX->new();
            $nodenhx->nhx_tag({T => $corresp{$id}});
        }
    }
    $treeout->write_tree($tree);
}


_test_tree.nwk_:
(((((42558930:100.0,42558943:100.0):100.0,(42558969:100.0,(42558981:100.0,
42558942:100.0):100.0):72.0):81.0,(((((90185247:100.0,56405380:100.0):100.0,
(42558987:100.0,148887393:100.0):100.0):90.0,66774197:100.0):100.0,AAEL015662:100.0):100.0,
42558970:100.0):82.0):100.0,(42558929:100.0,42558958:100.0):79.0):100.0,
42558941:100.0);

_seq_taxid.txt:_
AAEL015662      7159
42558969        9606
42558981        10090
42558942        9606
42558970        6239
42558929        10116
42558987        9606
42558930        10116
42558943        9606
148887393       10090
42558958        10090
42558941        9606
56405380        10090
90185247        9606
66774197        6239
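
The table above may be separated by tabs or by runs of spaces, depending on how it was saved or pasted. As a minimal sketch (plain Perl, no BioPerl), the mapping could be loaded by splitting on any whitespace rather than on tabs only. The helper name `load_taxid_map` is hypothetical and not part of the script above, and the in-memory sample merely stands in for seq_taxid.txt.

```perl
use strict;
use warnings;

# Hypothetical helper: read "sequence_id <whitespace> taxid" pairs from a filehandle.
sub load_taxid_map {
    my ($fh) = @_;
    my %taxid_of;
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /\S/;                 # skip blank lines
        my ($seq_id, $taxid) = split ' ', $line;   # split on any run of whitespace
        next unless defined $taxid;                # skip lines without a second column
        $taxid_of{$seq_id} = $taxid;
    }
    return \%taxid_of;
}

# Usage: an in-memory filehandle stands in for seq_taxid.txt (assumption).
my $sample = "AAEL015662      7159\n42558969\t9606\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $map = load_taxid_map($fh);
print "$_ => $map->{$_}\n" for sort keys %$map;
```

Splitting on `' '` accepts both tab-separated and space-padded lines, so the same loader works whichever way the file was written.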


_And the tata resulting file:_
(((((42558930:100.0,42558943:100.0):100.0[&&NHX],(42558969:100.0,
(42558981:100.0,42558942:100.0):100.0[&&NHX]):72.0[&&NHX]):81.0[&&NHX],
(((((90185247:100.0,56405380:100.0):100.0[&&NHX],
(42558987:100.0,148887393:100.0):100.0[&&NHX]):90.0[&&NHX],66774197:100.0):100.0[&&NHX],
AAEL015662:100.0):100.0[&&NHX],42558970:100.0):82.0[&&NHX]):100.0[&&NHX],
(42558929:100.0,42558958:100.0):79.0[&&NHX]):100.0[&&NHX],42558941:100.0);
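
In this output the internal nodes carry an empty `[&&NHX]` block and the leaves carry none, whereas the goal is a `T=<taxid>` tag on every leaf. To illustrate that target format only, here is a rough sketch that rewrites the Newick string directly with a regular expression instead of going through Bio::TreeIO. It is a swapped-in technique, not the BioPerl way. It assumes leaf labels consist of word characters and directly follow `(` or `,`, and the helper name `annotate_leaves` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: append [&&NHX:T=<taxid>] to every leaf listed in %$taxid_of.
# Assumes leaf labels are word characters that directly follow an opening
# parenthesis or a comma (optionally after whitespace).
sub annotate_leaves {
    my ($newick, $taxid_of) = @_;
    $newick =~ s{
        ([(,]\s*)          # 1: delimiter that precedes a leaf
        (\w+)              # 2: leaf label
        (:[\d.]+)?         # 3: optional branch length
        (?=\s*[,)])        # leaf ends before a comma or closing parenthesis
    }{
        my ($pre, $label, $len) = ($1, $2, defined $3 ? $3 : '');
        exists $taxid_of->{$label}
            ? "$pre$label$len\[&&NHX:T=$taxid_of->{$label}]"
            : "$pre$label$len";    # leave unknown leaves untouched
    }gex;
    return $newick;
}

# Two leaves taken from the sample files.
my %taxid_of = ('42558930' => 10116, '42558943' => 9606);
print annotate_leaves('(42558930:100.0,42558943:100.0):100.0;', \%taxid_of), "\n";
```

Internal nodes are left alone because their labels and branch lengths follow a closing parenthesis, which the pattern never accepts as a leading delimiter. Labels containing quotes, spaces, or punctuation would need a real Newick parser.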




-- 
====================================================================
= Laurence Amilhat    INRA Toulouse 31326 Castanet-Tolosan     	   = 
= Tel: 33 5 61 28 53 34   Email: laurence.amilhat at toulouse.inra.fr =
====================================================================