[Bioperl-l] BioPerl and NHX tree
Laurence Amilhat
Laurence.Amilhat at toulouse.inra.fr
Thu Jan 3 14:29:09 UTC 2008
Dear all,
I am trying to convert a newick tree into an NHX tree, so I can add the
taxid tag for each leaf.
I am using the modules: Bio::TreeIO & Bio::Tree::NodeNHX
The idea is:
1) read the newick tree
2) get each leaf and look up the corresponding taxid
3) add the nhx species tag
4) write the nhx tree
I was able to do the first 2 steps, and I could create an object
node_nhx and add the tag T,
but I don't know how to write an nhx Tree with the node_nhx previously
created...
Does anyone have an idea? Any help is welcome.
Thanks,
laurence.
Here are my code and the sample files, for better understanding:
newick2nhx.pl -f test_tree.nwk -o tata -c seq_taxid.txt
_newick2nhx.pl:_
use strict;
use Bio::TreeIO;
use Bio::Tree::NodeNHX;
use Getopt::Long;

my $tree_file;
my $outfile;
my $codefile;
my %corresp;    # sequence id => taxid

GetOptions('f|file:s' => \$tree_file, 'o|out:s' => \$outfile, 'c|code:s' => \$codefile);

# read the tab-separated "sequence id <TAB> taxid" table
open (CODE, "< $codefile") or die "Cannot open $codefile: $!";
while (<CODE>)
{
    chomp;
    my ($seq_id, $taxid) = split (/\t/);
    $corresp{$seq_id} = $taxid;
}
close (CODE);

my $treeio  = new Bio::TreeIO (-format => 'newick', -file => "$tree_file");
my $treeout = new Bio::TreeIO (-format => 'nhx', -file => ">$outfile");

while (my $tree = $treeio->next_tree)
{
    my @nodes = $tree->get_nodes();
    foreach my $nd (@nodes)
    {
        if ($nd->is_Leaf())
        {
            my $id = $nd->id();
            print "$id TAXID ", $corresp{$id}, "\n";
            # this NodeNHX object gets the T tag, but it is never attached to $tree
            my $nodenhx = new Bio::Tree::NodeNHX();
            $nodenhx->nhx_tag({T => $corresp{$id}});
        }
    }
    $treeout->write_tree($tree);
}
_test_tree.nwk:_
(((((42558930:100.0,42558943:100.0):100.0,(42558969:100.0,(42558981:100.0,
42558942:100.0):100.0):72.0):81.0,(((((90185247:100.0,56405380:100.0):100.0,
(42558987:100.0,148887393:100.0):100.0):90.0,66774197:100.0):100.0,AAEL015662:100.0):100.0,
42558970:100.0):82.0):100.0,(42558929:100.0,42558958:100.0):79.0):100.0,
42558941:100.0);
_seq_taxid.txt:_
AAEL015662 7159
42558969 9606
42558981 10090
42558942 9606
42558970 6239
42558929 10116
42558987 9606
42558930 10116
42558943 9606
148887393 10090
42558958 10090
42558941 9606
56405380 10090
90185247 9606
66774197 6239
_And the resulting tata file:_
(((((42558930:100.0,42558943:100.0):100.0[&&NHX],(42558969:100.0,(42558981:100.0,42558942:100.0):100.0[&&NHX]):72.0[&&NHX]):81.0[&&NHX],
(((((90185247:100.0,56405380:100.0):100.0[&&NHX],(42558987:100.0,148887393:100.0):100.0[&&NHX]):90.0[&&NHX],66774197:100.0):100.0[&&NHX],
AAEL015662:100.0):100.0[&&NHX],42558970:100.0):82.0[&&NHX]):100.0[&&NHX],(42558929:100.0,42558958:100.0):79.0[&&NHX]):100.0[&&NHX],42558941:100.0);
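In the output above the leaves carry no T tag at all; only the internal nodes get an empty [&&NHX]. To make the intended result concrete, here is a minimal sketch of the leaf tagging done as plain text substitution on the newick string, without Bio::TreeIO. The helper name add_taxid_tags is hypothetical (it is not a BioPerl function), and the sketch assumes every leaf is written as name:branch_length directly after a "(" or ",". It is only meant to illustrate the target format, not to replace a proper BioPerl solution.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): append an NHX "T" (taxonomy id)
# tag to every leaf of a newick string that has an entry in the lookup hash.
# Assumption: a leaf is "name:length" that directly follows "(" or ",".
sub add_taxid_tags {
    my ($newick, $taxid_of) = @_;
    $newick =~ s{([(,]\s*)([^\s():,;\[\]]+):([0-9.eE+-]+)}{
        my ($lead, $name, $len) = ($1, $2, $3);
        # leaves without a known taxid are left untouched
        my $tag = exists $taxid_of->{$name}
                ? '[&&NHX:T=' . $taxid_of->{$name} . ']'
                : '';
        $lead . $name . ':' . $len . $tag;
    }ge;
    return $newick;
}

# Small usage example with two ids taken from seq_taxid.txt
my %taxid_of = ('AAEL015662' => 7159, '42558970' => 6239);
print add_taxid_tags(
    '((AAEL015662:100.0,42558970:100.0):82.0,42558941:100.0);',
    \%taxid_of
), "\n";
# prints ((AAEL015662:100.0[&&NHX:T=7159],42558970:100.0[&&NHX:T=6239]):82.0,42558941:100.0);
```

Internal nodes are skipped because their branch length follows a ")" rather than a "(" or ",", so only leaves are tagged.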
--
====================================================================
= Laurence Amilhat INRA Toulouse 31326 Castanet-Tolosan =
= Tel: 33 5 61 28 53 34 Email: laurence.amilhat at toulouse.inra.fr =
====================================================================