[Bioperl-l] statistics of sequences

S.Paul s.paul at surrey.ac.uk
Thu Apr 22 17:06:45 EDT 2004


Thanks, Heikki.  It worked.

Sujoy
Sujoy Paul, PRISE Centre, UniS, s.paul at surrey.ac.uk
----- Original Message -----
From: "Heikki Lehvaslaiho" <heikki at ebi.ac.uk>
To: <bioperl-l at portal.open-bio.org>
Cc: "S.Paul" <s.paul at surrey.ac.uk>
Sent: Wednesday, April 21, 2004 7:53 AM
Subject: Re: [Bioperl-l] statistics of sequences


> Sujoy,
>
> You are on the right track to use OddCodes. The OddCodes methods give you
> back a reference to a plain string (which, BTW, you can store in a
> specialized sequence object of type Bio::Seq::Meta) that can be manipulated
> using standard Perl functions.
>
> Here are two possibilities:
>
> # 1. works without knowing the characters
>
> my %hash;
> for (split / */, $new_coding5) {
>     $hash{$_}++;
> }
> for (keys %hash) {
>     print $_, ": ", $hash{$_}, "\n";
> }
>
> #2. you have to know what you are looking for
>
> my ($O) = $new_coding5 =~ tr/O//;
> print "O: $O\n";
> my ($I) = $new_coding5 =~ tr/I//;
> print "I: $I\n";
>
> ###
>
> There are more ways of doing the same thing; it all depends on what you want
> to do with the data.
>
> Yours,
> -Heikki
>
> On Tuesday 20 Apr 2004 23:36, S.Paul wrote:
> > Hi Everybody:
> >
> > I am pretty new to bioperl and am trying to find the statistics of the
> > polarity of amino acids in the protein sequence, e.g. how many are polar,
> > hydrophobic etc.  I tried using SeqStats to calculate the mol wt and
> > the number of A and C but cannot calculate the number of hydrophobic
> > amino acids present.  I am enclosing the portion of the code.  I would
> > appreciate it if anybody could offer any suggestions in this regard.
> >
> >
> > ***************************************************************************
> > my $seq_stats = Bio::Tools::SeqStats->new($seq);
> > my $weight = $seq_stats->get_mol_wt();
> > # note: $weight is a reference to a two-element array (lower and upper bound)
> > print " the weight is ", $$weight[0], "\n";
> > my $monomer_ref = $seq_stats->count_monomers();
> > print "Number of A\'s in sequence is $$monomer_ref{'A'} \n";
> > print "Number of C\'s in sequence is $$monomer_ref{'C'} \n";
> > print "Number of T\'s in sequence is $$monomer_ref{'T'} \n";
> > print "Number of G\'s in sequence is $$monomer_ref{'G'} \n";
> >
> >
> > print "\-----------------------------------------------\n";
> > my $oddcode_obj = Bio::Tools::OddCodes->new(-seq =>$seq);
> > #returns the reference
> >
> > my $output1 = $oddcode_obj->charge();
> > my $output2 = $oddcode_obj->structural();
> > my $output3 = $oddcode_obj->chemical();
> > my $output4 = $oddcode_obj->functional();
> >
> > my $output5= $oddcode_obj->hydrophobic();
> >
> > #displays
> > my $new_coding1 =$$output1;
> > print "\nthe charge of the sequence is $new_coding1";
> >
> > print "\-----------------------------------------------\n";
> >
> > my $new_coding2 =$$output2;
> > print "\nthe structural sequence $new_coding2";
> > print "\-----------------------------------------------\n";
> > my $new_coding3 =$$output3;
> > print "\n the chemical structure is : $new_coding3";
> > print "\-----------------------------------------------\n";
> > my $new_coding4 =$$output4;
> > print "\n the functional nature of the protein: $new_coding4";
> > print "\-----------------------------------------------\n";
> >
> >   my $new_coding5 =$$output5;
> > print "\n the hydrophobic nature of the protein: $new_coding5";
> >
> >
> > ***************************************************************************
> >
> > Thanks
> >
> > Sujoy Paul
> > Sujoy Paul, PRISE Centre, UniS, s.paul at surrey.ac.uk
>
> --
> ______ _/      _/_____________________________________________________
>       _/      _/                      http://www.ebi.ac.uk/mutations/
>      _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
>     _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
>    _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
>   _/  _/  _/  Cambs. CB10 1SD, United Kingdom
>      _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
> ___ _/_/_/_/_/________________________________________________________
>
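
For reference, the two counting approaches suggested in the quoted reply can be
combined into one small, self-contained sketch. It does not call BioPerl: the
string assigned to $new_coding5 is a made-up stand-in for the dereferenced
result of hydrophobic(), and the helper name count_symbols is hypothetical.
It also assumes, as the reply does, that the coded string uses the letters
I and O.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $$output5; in the thread this string would come
# from dereferencing the result of $oddcode_obj->hydrophobic().
my $new_coding5 = 'IOOIIOIOOI';

# Approach 1: count every symbol without knowing the alphabet in advance.
# Returns a reference to a hash mapping symbol => number of occurrences.
sub count_symbols {
    my ($coded) = @_;
    my %count;
    $count{$_}++ for split //, $coded;
    return \%count;
}

my $count = count_symbols($new_coding5);
my $total = length $new_coding5;
for my $symbol (sort keys %$count) {
    printf "%s: %d (%.1f%%)\n",
        $symbol, $count->{$symbol}, 100 * $count->{$symbol} / $total;
}

# Approach 2: count one known symbol. With an empty replacement list and no
# /d modifier, tr/// returns the number of matches and leaves the string
# unchanged.
my $n_I = ($new_coding5 =~ tr/I//);
print "I (via tr): $n_I\n";
```

The hash version is the safer default when the alphabet of the coded string is
not known beforehand; the tr version is shorter when only one or two symbols
matter.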


