[Biopython-dev] [Bug 2731] Adding .upper() and .lower() methods to the Seq object

Mon Jan 12 23:30:49 UTC 2009

http://bugzilla.open-bio.org/show_bug.cgi?id=2731

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk  2009-01-12 18:30 EST -------
Created an attachment (id=1191)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1191&action=view)
Patch to Bio/Seq.py ONLY adding upper and lower methods

This patch is a proof of principle of how we could add upper and lower methods
while following the strict alphabet checking proposed on Bug 2597.  The code is
a little complicated/nasty in order to localise the change to Bio/Seq.py only.

Here is a usage example with the patch applied:

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import IUPAC
>>> my_dna = Seq("AGGGTGTTGA",IUPAC.IUPACUnambiguousDNA())
>>> my_dna
Seq('AGGGTGTTGA', IUPACUnambiguousDNA())
>>> my_dna.lower()
Seq('agggtgttga', NucleotideAlphabet())
>>> my_dna.lower().upper()
Seq('AGGGTGTTGA', NucleotideAlphabet())

Note that if we implemented (private) upper and lower methods in the Alphabet
objects as I suggested on Bug 2532, the code in the Seq class would be much
simpler, e.g.

def upper(self) :
    return Seq(str(self).upper(), self.alphabet._upper())
def lower(self) :
    return Seq(str(self).lower(), self.alphabet._lower())

The generic alphabets (where the list of letters is undefined) would just
return self, while the AlphabetEncoders could also implement these methods
simply.  Individual explicit alphabets (i.e. the IUPAC ones) would have to
define sensible upper/lower mappings - perhaps by defining lower case variants
(see Bug 2532).
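
To make the idea above concrete, here is a rough, self-contained sketch of how
the alphabet-driven approach might look. It does not use the real Bio.Alphabet
or Bio.Seq classes: the Alphabet, UnambiguousDNA, LowerUnambiguousDNA and Seq
classes below are hypothetical stand-ins, and the lower case variant alphabet
is only one possible way of defining the mapping.

```python
class Alphabet(object):
    """Hypothetical generic alphabet; letters=None means 'undefined'."""
    letters = None

    def _upper(self):
        # No explicit letters to remap, so the alphabet is unchanged.
        return self

    def _lower(self):
        return self

    def __repr__(self):
        return "%s()" % self.__class__.__name__


class UnambiguousDNA(Alphabet):
    """Hypothetical explicit upper case alphabet (assumption, not IUPAC code)."""
    letters = "GATC"

    def _lower(self):
        # Explicit alphabets must name their lower case counterpart.
        return LowerUnambiguousDNA()


class LowerUnambiguousDNA(Alphabet):
    """Hypothetical lower case variant of the alphabet above."""
    letters = "gatc"

    def _upper(self):
        return UnambiguousDNA()


class Seq(object):
    """Toy sequence with strict checking against the alphabet's letters."""

    def __init__(self, data, alphabet=None):
        self.data = data
        self.alphabet = alphabet if alphabet is not None else Alphabet()
        letters = self.alphabet.letters
        if letters is not None:
            for char in data:
                if char not in letters:
                    raise ValueError("%r not in alphabet %r"
                                     % (char, self.alphabet))

    def __str__(self):
        return self.data

    def __repr__(self):
        return "Seq(%r, %r)" % (self.data, self.alphabet)

    def upper(self):
        return Seq(str(self).upper(), self.alphabet._upper())

    def lower(self):
        return Seq(str(self).lower(), self.alphabet._lower())
```

With these stand-in classes, Seq("AGGGTGTTGA", UnambiguousDNA()).lower() gives
a sequence whose alphabet is LowerUnambiguousDNA(), and calling .upper() on
that result returns to UnambiguousDNA() rather than falling back to a generic
alphabet.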
