[Biopython-dev] Bio.SubstMat (was: Re: Calculating motif scores)

Michiel de Hoon mjldehoon at yahoo.com
Tue Aug 25 10:41:20 UTC 2009


I did (3) and (4) below. For (2), I added a __str__ method but didn't touch the other print functions.

For (1), maybe a better way is to subclass the SeqMat class for each of the matrix types instead of storing the matrix type in self.mat_type. Any comments or objections (especially Iddo)?
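
To make the subclassing idea for (1) concrete, here is a minimal sketch. The names SeqMatSketch, ObservedFrequencyMatrix, LogOddsMatrix and is_log_odds are placeholders made up for illustration; they are not existing Biopython classes, and the real SeqMat constructor takes more arguments than shown here.

```python
class SeqMatSketch(dict):
    """Hypothetical stand-in for SeqMat: maps (letter1, letter2) to a value."""
    description = "untyped matrix"


class ObservedFrequencyMatrix(SeqMatSketch):
    """Hypothetical subclass replacing mat_type == OBSFREQ."""
    description = "observed frequency matrix"


class LogOddsMatrix(SeqMatSketch):
    """Hypothetical subclass replacing mat_type == LO."""
    description = "log-odds matrix"


def is_log_odds(matrix):
    # With one subclass per matrix type, a type test takes the place
    # of comparing matrix.mat_type against an integer code.
    return isinstance(matrix, LogOddsMatrix)


lo = LogOddsMatrix({("A", "A"): 4.0, ("A", "C"): -1.0})
print(lo.description)
print(is_log_odds(lo))
print(is_log_odds(ObservedFrequencyMatrix()))
```

The point of this design would be that type-specific behaviour lives in the subclass instead of in branches on self.mat_type.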

--Michiel.

--- On Sat, 7/25/09, Iddo Friedberg <idoerg at gmail.com> wrote:
> I'm the author of subsmat IIRC.
> Everything sounds good, but I would not make 2.6 changes
> that will break on 2.5. Ubuntu still uses 2.5 and I imagine
> other linux distros do too.

> 1) The matrix types (NOTYPE = 0, ACCREP = 1, OBSFREQ = 2,
> SUBS = 3, EXPFREQ = 4, LO = 5) are now global variables (at
> the level of Bio.SubsMat). I think that these should be
> class variables of the Bio.SubsMat.SeqMat class.
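
For comparison, the class-variable version of (1) could look roughly like this. SeqMatSketch is a made-up stand-in, not the real SeqMat, and its constructor signature is an assumption; only the six type codes and their values come from the list above.

```python
class SeqMatSketch(dict):
    """Hypothetical stand-in for Bio.SubsMat.SeqMat (not the real class)."""

    # Matrix type codes as class attributes instead of module globals.
    NOTYPE = 0
    ACCREP = 1
    OBSFREQ = 2
    SUBS = 3
    EXPFREQ = 4
    LO = 5

    def __init__(self, data=None, mat_type=NOTYPE):
        dict.__init__(self, data or {})
        self.mat_type = mat_type


m = SeqMatSketch({("A", "A"): 4}, mat_type=SeqMatSketch.LO)
print(m.mat_type == SeqMatSketch.LO)
```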
>
> 2) The print_mat method. It would be more Pythonic to use
> __str__, __format__ for this, though the latter is only
> available for Python versions >= 2.6.
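
As an illustration of (2), a __str__ method on a dict-based matrix might look like the sketch below. PrintableMatrixSketch, its constructor, and the lower-triangular layout are assumptions chosen for the example, not the code that was committed.

```python
class PrintableMatrixSketch(dict):
    """Hypothetical stand-in: maps (letter1, letter2) to an integer value."""

    def __init__(self, data, letters):
        dict.__init__(self, data)
        self.letters = letters

    def __str__(self):
        # One row per letter showing the lower triangle, followed by
        # a footer line with the column labels.
        lines = []
        for i, row in enumerate(self.letters):
            cells = []
            for col in self.letters[: i + 1]:
                # A pair may be stored in either order.
                value = self.get((row, col), self.get((col, row), 0))
                cells.append("%4d" % value)
            lines.append(row + " " + " ".join(cells))
        lines.append("  " + " ".join("%4s" % c for c in self.letters))
        return "\n".join(lines)


m = PrintableMatrixSketch({("A", "A"): 4, ("C", "A"): -1, ("C", "C"): 9}, "AC")
print(m)
```

With __str__ defined, print(m) and str(m) work directly, and this also runs on Python 2.5, which has no __format__.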
>
> 3) The __sum__ method. I guess that this was intended to be
> __add__?
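
Python has no __sum__ special method; the + operator calls __add__. A minimal sketch of element-wise addition for (3), using a made-up AddableMatrixSketch class and assuming that a pair missing from one matrix counts as 0:

```python
class AddableMatrixSketch(dict):
    """Hypothetical stand-in: maps (letter1, letter2) to a value."""

    def __add__(self, other):
        # Element-wise sum over the union of the keys.
        result = AddableMatrixSketch()
        for key in set(self) | set(other):
            result[key] = self.get(key, 0) + other.get(key, 0)
        return result


a = AddableMatrixSketch({("A", "A"): 1, ("A", "C"): 2})
b = AddableMatrixSketch({("A", "A"): 10, ("C", "C"): 5})
total = a + b
print(sorted(total.items()))
```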
>
> 4) The sum_letters attribute. To calculate the sum of all
> values for a given letter, currently the following two
> functions are involved:
>
>    def all_letters_sum(self):
>       for letter in self.alphabet.letters:
>          self.sum_letters[letter] = self.letter_sum(letter)
>
>    def letter_sum(self,letter):
>       assert letter in self.alphabet.letters
>       sum = 0.
>       for i in self.keys():
>          if letter in i:
>             if i[0] == i[1]:
>                sum += self[i]
>             else:
>                sum += (self[i] / 2.)
>       return sum
>
> As you can see, the result is not returned, but stored in
> an attribute called sum_letters. I suggest replacing this
> with the following:
>
>     def sum(self):
>         # Return {letter: total}; an off-diagonal pair contributes
>         # half of its value to each of its two letters.
>         result = {}
>         for letter in self.alphabet.letters:
>             result[letter] = 0.0
>         for pair, value in self.items():
>             i1, i2 = pair
>             if i1 == i2:
>                 result[i1] += value
>             else:
>                 result[i1] += value / 2.
>                 result[i2] += value / 2.
>         return result
>
> so without storing the result in an attribute.
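
A small worked example of the same calculation, written as a standalone function so it can be run without Biopython. The function name letter_sums and the sample counts are made up for illustration; the loop goes over .items() so that each pair comes with its value.

```python
def letter_sums(matrix, letters):
    """Return {letter: total}, splitting off-diagonal values evenly."""
    result = dict((letter, 0.0) for letter in letters)
    for (i1, i2), value in matrix.items():
        if i1 == i2:
            result[i1] += value
        else:
            # An (i1, i2) entry counts half for each of the two letters.
            result[i1] += value / 2.0
            result[i2] += value / 2.0
    return result


counts = {("A", "A"): 10, ("A", "C"): 4, ("C", "C"): 6}
sums = letter_sums(counts, "AC")
# A: 10 + 4/2 = 12.0, C: 6 + 4/2 = 8.0
print(sums)
```

The per-letter totals add up to the total of all entries (20 here), which is a quick sanity check on the halving.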
>
> Any comments, objections?
>
> --Michiel
>
> --- On Fri, 7/24/09, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
>
> > From: Michiel de Hoon <mjldehoon at yahoo.com>
> > Subject: Re: [Biopython-dev] Calculating motif scores
> > To: "Bartek Wilczynski" <bartek at rezolwenta.eu.org>
> > Cc: biopython-dev at biopython.org
> > Date: Friday, July 24, 2009, 5:34 AM
> >
> > > As for the PWM being a separate class and used by the motif:
> > > I don't know. I'm using Bio.SubsMat.FreqTable for implementing
> > > frequency table, so I understand that the new PWM class would
> > > be basically a "smarter" FreqTable. I'm not sure whether it
> > > solves any problems...
> >
> > Wow, I didn't even know the Bio.SubsMat module existed.
> > As we have several different but related modules
> > (Bio.Motif, Bio.SubsMat, Bio.Align), I think we should
> > define the purpose and scope of each of these modules.
> > Maybe a good way to start is the documentation. Bio.SubsMat
> > is currently divided into two chapters (14.4 and 16.2). I'll
> > have a look at this over the weekend to see if this can be
> > cleaned up a bit.
> >
> > --Michiel.
> >
> > _______________________________________________
> > Biopython-dev mailing list
> > Biopython-dev at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biopython-dev