[Bioperl-l] PAML nssites model result object

Edward Chuong echuong at gmail.com
Thu Mar 10 18:23:53 EST 2005


Hey,

Some progress when I use your file: it now does return a PAML::Result
object, but there's an error
Can't use an undefined value as an ARRAY reference at
/Library/Perl/5.8.1/Bio/Tools/Phylo/PAML/ModelResult.pm line 308,
<GEN0> line 329.
which appears because it's trying to access positive selection array,
which doesn't exist for the nssites = 0 object-- there is no
"dnds_site_classes" which the other nssites models ( = 1 or =2 etc)
have, even though I think there should be? I can't find where these
values are stored--they don't appear to be stored on the individual
nodes as the documentation would suggest.

I've found that when I specify only one model for nssites, (nssites =
0 or 1 or 2), the get_NSSites code doesn't work, but if I specify more
than one for the PAML run it does.

I've sent the mlc files I have in a private e-mail, if you have time to check

Thanks so much!
-Ed

On Thu, 10 Mar 2005 16:38:59 -0500, Jason Stajich
<jason.stajich at duke.edu> wrote:
> I'm using t/data/codeml_nssites.mlc
> 
> --
> Jason Stajich
> jason.stajich at duke.edu
> http://www.duke.edu/~jes12/
> 
> On Mar 10, 2005, at 3:47 PM, Edward Chuong wrote:
> 
> > Hi,
> >
> > I think the problem is that the $result object, according to Dumper,
> > doesn't store any ModelResult (NSSite_results), so the for loop
> > condition in this code ($result->get_NSSite_results) is never true. Is
> > this working on a mlc file that you have, and if so, can you send it
> > so I can see if it's a problem on my side?
> >
> > Thanks
> > -Ed
> >
> >
> > On Thu, 10 Mar 2005 09:48:02 -0500, Jason Stajich
> > <jason.stajich at duke.edu> wrote:
> >> The script needs to be adjusted for NSsites because their are trees
> >> are
> >> associated with each model result so you need one more loop on the
> >> get_NSSite_results. I added some code to the script to print out the
> >> positively selected sites as well.
> >>
> >> #!/usr/bin/perl -w
> >> use strict;
> >> use Bio::Tools::Phylo::PAML;
> >>
> >> my $outcodeml = shift(@ARGV);
> >>
> >> my $paml_parser = new Bio::Tools::Phylo::PAML(-file => "./$outcodeml",
> >>                                               -dir => "./");
> >> my $result = $paml_parser->next_result();
> >> my $MLmatrix = $result->get_MLmatrix(); # get MaxLikelihood Matrix
> >> my @otus = $result->get_seqs;
> >> # process the NSsites results
> >> for my $ns_result ( $result->get_NSSite_results ) {
> >>      print "model ", $ns_result->model_num, " ",
> >> $ns_result->model_description, "\n";
> >>      while ( my $tree = $ns_result->next_tree ) {
> >>         for my $node ( $tree->get_nodes ) {
> >>             my $id;
> >>             if( $node->is_Leaf() ) {
> >>                 $id = $node->id;
> >>             } else {
> >>                 $id = "(".join(",", map { $_->id } grep { $_->is_Leaf
> >> }
> >>                                $node->get_all_Descendents) .")";
> >>             }
> >>             if( ! $node->ancestor || ! $node->has_tag('t') ) {
> >>                 # skip when no values have been associated with this
> >> node
> >>                 # (like the root node)
> >>                 next;
> >>             }
> >>                 printf
> >> "%s\tt=%.3f\tS=%.1f\tN=%.1f\tdN/
> >> dS=%.4f\tdN=%.4f\tdS=%.4f\tS*dS=%.1f\tN*dN=%.1f\n",
> >>             $id,
> >>             map { ($node->get_tag_values($_))[0] }
> >>             qw(t S N dN/dS dN dS), 'S*dS', 'N*dN';
> >>         }
> >>      }
> >>      print "positively selected sites:\n";
> >>      #  get the positively select sites
> >>      for my $site ( $ns_result->get_pos_selected_sites ) {
> >>         print join(" ", @$site, "\n");
> >>      }
> >>      print "\n";
> >> }
> >>
> >> --
> >> Jason Stajich
> >> jason.stajich at duke.edu
> >> http://www.duke.edu/~jes12/
> >>
> >> On Mar 9, 2005, at 6:37 PM, Edward Chuong wrote:
> >>
> >>> Hi Jason,
> >>>
> >>> Thanks for the help.
> >>>
> >>> The code seems to get stuck at
> >>>
> >>>  if( ! $node->ancestor || ! $node->has_tag('t') ) {
> >>> (this condition turns out true for every node, not just root, so it
> >>> always hits "next")
> >>>
> >>> I used Data::Dumper to check on the node and I've pasted the
> >>> results--it seems like those tags aren't being sent in?
> >>>
> >>>
> >>> Thanks!
> >>> -Ed
> >>>
> >>> '_root_cleanup_methods' => [
> >>>                 sub { "DUMMY" }
> >>>               ],
> >>> '_creation_id' => 0,
> >>> '_branch_length' => '0.613722',
> >>> '_desc' => {},
> >>> '_id' => 'NP_033437.2_mus',
> >>> '_ancestor' => bless( {
> >>>        '_root_cleanup_methods' => [
> >>>
> >>> $VAR1->{'_root_cleanup_methods'}[0]
> >>>                                       ],
> >>>        '_creation_id' => 3,
> >>>        '_desc' => {
> >>>                                 '2' => bless( {
> >>>                                         '_root_cleanup_methods' => [
> >>>
> >>> $VAR1->{'_root_cleanup_methods'}[0]
> >>>                                                ],
> >>>                                         '_creation_id' => 2,
> >>>                                         '_branch_length' =>
> >>> '0.768322',
> >>>                                         '_desc' => {},
> >>>                                         '_id' => 'PM_BWp0001H02f',
> >>>                                         '_ancestor' =>
> >>> $VAR1->{'_ancestor'},
> >>>                                         '_root_verbose' => 0
> >>>                                       }, 'Bio::Tree::Node' ),
> >>>                                 '0' => $VAR1,
> >>>                                 '1' => bless( {
> >>>                                         '_root_cleanup_methods' => [
> >>>
> >>>                             $VAR1->{'_root_cleanup_methods'}[0]
> >>>
> >>>                           ],
> >>>                                         '_creation_id' => 1,
> >>>                                         '_branch_length' =>
> >>> '0.366319',
> >>>                                         '_desc' => {},
> >>>                                         '_id' => 'NP_742070.1_rat',
> >>>                                         '_ancestor' =>
> >>> $VAR1->{'_ancestor'},
> >>>                                         '_root_verbose' => 0
> >>>                                       }, 'Bio::Tree::Node' )
> >>>                               },
> >>>        '_id' => '',
> >>>        '_height' => undef,
> >>>        '_root_verbose' => 0
> >>>    }, 'Bio::Tree::Node' ),
> >>> '_root_verbose' => 0
> >>> }, 'Bio::Tree::Node' );
> >>>
> >>>
> >>> On Wed, 9 Mar 2005 18:01:34 -0500, Jason Stajich
> >>> <jason.stajich at duke.edu> wrote:
> >>>> Resend with code pasted....
> >>>>
> >>>> #!/usr/bin/perl -w
> >>>> use strict;
> >>>> use Bio::Tools::Phylo::PAML;
> >>>>
> >>>> my $outcodeml = shift(@ARGV);
> >>>>
> >>>> my $paml_parser = new Bio::Tools::Phylo::PAML(-file =>
> >>>> "./$outcodeml",
> >>>>                                               -dir => "./");
> >>>> my $result = $paml_parser->next_result();
> >>>> my $MLmatrix = $result->get_MLmatrix(); # get MaxLikelihood Matrix
> >>>> my @otus = $result->get_seqs;
> >>>> if( $#{$MLmatrix} < 0 ) {
> >>>>      for my $tree ($result->next_tree ) {
> >>>>         for my $node ( $tree->get_nodes ) {
> >>>>             my $id;
> >>>>             if( $node->is_Leaf() ) {
> >>>>                 $id = $node->id;
> >>>>             } else {
> >>>>                 $id = "(".join(",", map { $_->id } grep {
> >>>> $_->is_Leaf
> >>>> }
> >>>>                            $node->get_all_Descendents) .")";
> >>>>             }
> >>>>             if( ! $node->ancestor || ! $node->has_tag('t') ) {
> >>>>                 # skip when no values have been associated with this
> >>>> node
> >>>>                 # (like the root node)
> >>>>                 next;
> >>>>             }
> >>>>              # I know this looks complicated
> >>>>             # but we use the get_tag_values method to pull out the
> >>>> annotations
> >>>>             # for each branch
> >>>>             # The ()[0] around the call is because get_tag_values
> >>>> returns a
> >>>> list
> >>>>             # if we want to just get the 1st item in the list we
> >>>> have
> >>>>             # to tell Perl we are treating it like an array.
> >>>>             # in the future get_tag_values needs to be smart and
> >>>> just
> >>>>             # return the 1st item in the array if called in scalar
> >>>>             # context
> >>>>
> >>>>             printf
> >>>> "%s\tt=%.3f\tS=%.1f\tN=%.1f\tdN/
> >>>> dS=%.4f\tdN=%.4f\tdS=%.4f\tS*dS=%.1f\tN*dN=%.1f\n",
> >>>>             $id,
> >>>>             map { ($node->get_tag_values($_))[0] }
> >>>>             qw(t S N dN/dS dN dS), 'S*dS', 'N*dN';
> >>>>         }
> >>>>      }
> >>>> } else {
> >>>>      my $i =0;
> >>>>      my @seqs = $result->get_seqs;
> >>>>      for my $row ( @$MLmatrix ) {
> >>>>         print $seqs[$i++]->display_id, join("\t",@$row), "\n";
> >>>>      }
> >>>> }
> >>>>
> >>>> On Mar 9, 2005, at 5:41 PM, Jason Stajich wrote:
> >>>>
> >>>>> I just updated things last week so this is brand-spanking-new.  I
> >>>>> don't know if I connected everything up for NSsites stuff quite yet
> >>>>> as that is handled in - the branch-specific parsing should work
> >>>>> now.
> >>>>> I don't know if the synopsis code is really up to snuff either.
> >>>>> When
> >>>>> I get around to it I will try and see what still needs to be
> >>>>> connected
> >>>>> in NSsites parsing.
> >>>>>
> >>>>> I don't think $node->param() is going to work -
> >>>>> $node->get_tag_values() is the way I've implemented it.
> >>>>>
> >>>>> <00parse_codeml.pl>
> >>>>>
> >>>>> -jason
> >>>>> --
> >>>>> Jason Stajich
> >>>>> jason.stajich at duke.edu
> >>>>> http://www.duke.edu/~jes12/
> >>>>>
> >>>>> On Mar 9, 2005, at 5:23 PM, Edward Chuong wrote:
> >>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> I'm trying to parse PAML results, and running into some trouble.
> >>>>>> I'm
> >>>>>> using branch specific omega model, and I want to get the branch
> >>>>>> specific ka/ks values out.
> >>>>>> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/
> >>>>>> Bio/
> >>>>>> Tools/Phylo/PAML.pm?rev=HEAD&cvsroot=bioperl&content-type=text/
> >>>>>> vnd.viewcvs-markup
> >>>>>> says that $node->param('omega') should work, but Data::Dumper
> >>>>>> shows
> >>>>>> that this value isn't stored in the node (only branch lengths and
> >>>>>> seq
> >>>>>> IDs appear to be stored).
> >>>>>>
> >>>>>> I'm assuming that I can get these values out of the
> >>>>>> get_NSSite_result() Bio::Tools::Phylo::PAML::ModelResult object,
> >>>>>> but
> >>>>>> I'm not sure how to call it. The current synopsis uses
> >>>>>> "get_model_params" but it seems to be out of date because it's not
> >>>>>> in
> >>>>>> the current souce. The docs at
> >>>>>> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/
> >>>>>> Bio/
> >>>>>> Tools/Phylo/PAML/Result.pm?rev=HEAD&cvsroot=bioperl&content-
> >>>>>> type=text/vnd.viewcvs-markup
> >>>>>> say to use my
> >>>>>> @results = @{$self->get_NSSite_results};
> >>>>>> --that looks like a mistake, and I've tried
> >>>>>> @result = $result->get_NSSite_results but that doesn't work either
> >>>>>> (just get undefined objs).
> >>>>>>
> >>>>>> Am I doing something wrong, or is this functionality still being
> >>>>>> worked on? I've tried using both 1.4 and the LIVE versions. Any
> >>>>>> help
> >>>>>> is appreciated, thanks!
> >>>>>>
> >>>>>> -Ed
> >>>>>> _______________________________________________
> >>>>>> Bioperl-l mailing list
> >>>>>> Bioperl-l at portal.open-bio.org
> >>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >>>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> Edward Chuong
> >>> (949) 939-2732
> >>> AIM: edawad85
> >>>
> >>
> >>
> >
> >
> > --
> > Edward Chuong
> > (949) 939-2732
> > AIM: edawad85
> >
> 
> 
> 


-- 
Edward Chuong
(949) 939-2732
AIM: edawad85


More information about the Bioperl-l mailing list