[Bioperl-l] Re: [Bioperl-guts-l] bioperl commit

Aaron J. Mackey amackey at pcbi.upenn.edu
Thu May 27 16:12:35 EDT 2004


Hi Stefan,

I have a few questions about this latest commit; I'm sure it does what  
you need it to do, but it's a little "crufty".

What does this mean?  Why would you provide a probability threshold in
whole integers, and why are values outside the range 3 to 7 illegal?  Is
Bio::Matrix::PSM nucleotide-specific?  Why wouldn't this
"get_all_vectors" method be useful for any PSM?  Why not use
Bio::Tools::IUPAC to generate a sequence stream from a calculated
consensus sequence?
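
To make the last question concrete, here is a minimal sketch of the idea in
plain Perl: expand an IUPAC consensus string into every unambiguous sequence
it encodes. It deliberately does not call Bio::Tools::IUPAC (which hands back
a stream of sequence objects rather than a list of strings); expand_iupac and
the %IUPAC table are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
my %IUPAC = (
    A => [qw(A)],     C => [qw(C)],     G => [qw(G)],     T => [qw(T)],
    R => [qw(A G)],   Y => [qw(C T)],   S => [qw(C G)],   W => [qw(A T)],
    K => [qw(G T)],   M => [qw(A C)],
    B => [qw(C G T)], D => [qw(A G T)], H => [qw(A C T)], V => [qw(A C G)],
    N => [qw(A C G T)],
);

# expand_iupac is a hypothetical helper, not a BioPerl method.
# It returns every unambiguous sequence matching the consensus string.
sub expand_iupac {
    my ($consensus) = @_;
    my @strings = ('');
    for my $code (split //, uc $consensus) {
        my $bases = $IUPAC{$code}
            or die "Unknown IUPAC code '$code'\n";
        # Extend every partial sequence with every base allowed here.
        @strings = map { my $prefix = $_; map { $prefix . $_ } @$bases } @strings;
    }
    return @strings;
}

my @seqs = expand_iupac('ARY');
print join(',', @seqs), "\n";    # prints AAC,AAT,AGC,AGT
```

The number of sequences is the product of the per-position choices, so a
consensus with many N positions grows quickly; that is one argument for a
stream over a fully built array.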

-Aaron

On May 27, 2004, at 2:37 PM, Stefan Kirov wrote:

>
> skirov
> Thu May 27 14:37:54 EDT 2004
> Update of /home/repository/bioperl/bioperl-live/Bio/Matrix/PSM
> In directory pub.open-bio.org:/tmp/cvs-serv8640
>
> Modified Files:
> 	SiteMatrix.pm SiteMatrixI.pm
> Log Message:
> method added: get_all_vectors, all possible seq to satisfy the PFM  
> under a given threshold
>
> bioperl-live/Bio/Matrix/PSM SiteMatrix.pm,1.15,1.16  
> SiteMatrixI.pm,1.7,1.8
> ===================================================================
> RCS file:  
> /home/repository/bioperl/bioperl-live/Bio/Matrix/PSM/SiteMatrix.pm,v
> retrieving revision 1.15
> retrieving revision 1.16
> diff -u -r1.15 -r1.16
> ---  
> /home/repository/bioperl/bioperl-live/Bio/Matrix/PSM/SiteMatrix.pm	 
> 2004/05/12 18:27:30	1.15
> +++  
> /home/repository/bioperl/bioperl-live/Bio/Matrix/PSM/SiteMatrix.pm	 
> 2004/05/27 18:37:54	1.16
> @@ -883,4 +883,48 @@
>  return $score;
>  }
>
> +
> +=head2 get_all_vectors
> +
> + Title   : get_all_vectors
> + Usage   :
> + Function:  returns all possible sequence vectors to satisfy the PFM  
> under
> +            a given threshold
> + Throws  :  If threshold outside of 3..7 (no sense to do that)
> + Example :  my @vectors=$self->get_all_vectors(4);
> + Returns :  Array of strings
> + Args    :  (optional) floating
> +
> +=cut
> +
> +sub get_all_vectors {
> +	my $self=shift;
> +	my $thresh=shift;
> +  $self->throw("Out of range. Threshold should be between 3 and 7.\n") if  
> (($thresh<3) || ($thresh>7));
> +  my @seq=split(//,$self->consensus($thresh));
> +  my @perm;
> +  $thresh=$thresh/10;
> +  for my $i (0..$#{$self->{probA}}) {
> +    push @{$perm[$i]},'A' if ($self->{probA}->[$i]>$thresh);
> +    push @{$perm[$i]},'C' if ($self->{probC}->[$i]>$thresh);
> +    push @{$perm[$i]},'G' if ($self->{probG}->[$i]>$thresh);
> +    push @{$perm[$i]},'T' if ($self->{probT}->[$i]>$thresh);
> +    push @{$perm[$i]},'N' if  ($seq[$i] eq 'N');
> +  }
> +  my $fpos=shift @perm;
> +  my @strings=@$fpos;
> +  foreach my $pos (@perm) {
> +    my @newstr;
> +    foreach my $let (@$pos) {
> +      foreach my $string (@strings) {
> +        my $newstring = $string . $let;
> +        push @newstr,$newstring;
> +      }
> +    }
> +    @strings=@newstr;
> +  }
> +	return @strings;
> +}
> +
> +
>  1;
>
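
For comparison with the method above, a rough sketch of the same enumeration
with the threshold given directly as a probability and with no assumption
about the alphabet. This is not the committed code and not part of
Bio::Matrix::PSM: all_vectors, its arguments, and the %pfm data are
hypothetical. It also differs in behaviour from the commit: it uses >=
rather than >, and it returns nothing when no symbol passes at some position
instead of falling back to 'N'.

```perl
use strict;
use warnings;

# all_vectors is a hypothetical stand-alone function.
# $prob maps each symbol to an array ref of per-position probabilities;
# $thresh is a probability in (0, 1].
sub all_vectors {
    my ($prob, $thresh) = @_;
    die "Threshold must be > 0 and <= 1\n"
        unless defined $thresh && $thresh > 0 && $thresh <= 1;

    my @symbols = sort keys %$prob;
    return () unless @symbols;
    my $length = @{ $prob->{ $symbols[0] } };

    my @strings = ('');
    for my $i (0 .. $length - 1) {
        # Symbols whose probability at this position meets the threshold.
        my @allowed = grep { $prob->{$_}[$i] >= $thresh } @symbols;
        # No symbol passes: no string can satisfy the matrix here.
        return () unless @allowed;
        @strings = map { my $p = $_; map { $p . $_ } @allowed } @strings;
    }
    return @strings;
}

my %pfm = (
    A => [0.9, 0.5, 0.1],
    C => [0.1, 0.5, 0.1],
    G => [0.0, 0.0, 0.4],
    T => [0.0, 0.0, 0.4],
);
my @vectors = all_vectors(\%pfm, 0.4);
print "@vectors\n";    # prints AAG AAT ACG ACT
```

Because the symbols come from the keys of the hash, the same function would
work unchanged for a protein matrix.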
> ===================================================================
> RCS file:  
> /home/repository/bioperl/bioperl-live/Bio/Matrix/PSM/SiteMatrixI.pm,v
> retrieving revision 1.7
> retrieving revision 1.8
> diff -u -r1.7 -r1.8
> ---  
> /home/repository/bioperl/bioperl-live/Bio/Matrix/PSM/SiteMatrixI.pm	 
> 2004/05/12 18:27:30	1.7
> +++  
> /home/repository/bioperl/bioperl-live/Bio/Matrix/PSM/SiteMatrixI.pm	 
> 2004/05/27 18:37:54	1.8
> @@ -572,5 +572,21 @@
>      $self->throw_not_implemented();
>  }
>
> +=head2 get_all_vectors
>
> + Title   : get_all_vectors
> + Usage   :
> + Function:  returns all possible sequence vectors to satisfy the PFM  
> under
> +            a given threshold
> + Throws  :  If threshold outside of 3..7 (no sense to do that)
> + Example :  my @vectors=$self->get_all_vectors(4);
> + Returns :  Array of strings
> + Args    :  (optional) floating
> +
> +=cut
> +
> +sub get_all_vectors {
> + my $self = shift;
> +    $self->throw_not_implemented();
> +}
>  1;
>
--
Aaron J. Mackey, Ph.D.
Dept. of Biology, Goddard 212
University of Pennsylvania       email:  amackey at pcbi.upenn.edu
415 S. University Avenue         office: 215-898-1205
Philadelphia, PA  19104-6017     fax:    215-746-6697


