[Biopython-dev] [Bug 2532] New: Using IUPAC alphabets in mixed case Seq objects

Mon Jun 30 22:50:01 UTC 2008

http://bugzilla.open-bio.org/show_bug.cgi?id=2532

           Summary: Using IUPAC alphabets in mixed case Seq objects
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk

Bio.Alphabet.IUPAC defines a number of alphabets whose lists of valid letters
are in upper case ONLY.

Bio.Nexus and Bio.Sequencing.Phd create Seq objects which use these alphabets
even with mixed case sequences.

This contradicts how I think the alphabet's .letters property is intended to be
used (although currently this is not enforced by the Seq object).
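
As a rough illustration of the mismatch, here is a minimal self-contained
sketch. It deliberately does not import Biopython: UNAMBIGUOUS_DNA_LETTERS is
a stand-in for the upper-case .letters string of the IUPAC unambiguous DNA
alphabet, and letters_outside_alphabet is a hypothetical helper, not an
existing Biopython function.

```python
# Stand-in (assumption) for the upper-case-only .letters string of the
# IUPAC unambiguous DNA alphabet; no Biopython import is needed here.
UNAMBIGUOUS_DNA_LETTERS = "GATC"


def letters_outside_alphabet(sequence, letters):
    """Return the sorted characters of `sequence` that are not in `letters`.

    Hypothetical helper for illustration only.
    """
    return sorted(set(sequence) - set(letters))


mixed_case = "ACGTacgt"

# The mixed-case sequence contains letters the alphabet does not list.
print(letters_outside_alphabet(mixed_case, UNAMBIGUOUS_DNA_LETTERS))

# After upper-casing, every character is covered by the alphabet.
print(letters_outside_alphabet(mixed_case.upper(), UNAMBIGUOUS_DNA_LETTERS))
```

A check along these lines is what the Seq object currently does not perform,
which is why the mixed case sequences go unnoticed.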

I suggest either:

(a) Bio.Nexus etc switch to using generic DNA/RNA alphabets for any Seq objects
that include lower case letters (or, more simply, for all Seq objects).

(b) We add lower case and mixed case variants of the alphabet objects, and use
the mixed case IUPAC alphabets in Bio.Nexus etc for the Seq objects.

There is also the option of (c) extending the existing upper case only IUPAC
alphabets to include lower case too, but I fear this could have unexpected side
effects (e.g. where people are looping over the expected set of letters).
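
The kind of side effect feared under (c) might look like the following sketch.
It is an assumption about how user code could be written, not taken from any
real project: UPPER_ONLY stands in for an existing upper case .letters string,
EXTENDED for a hypothetical extended version, and count_table is an
illustrative helper.

```python
# Stand-in (assumption) for an existing upper-case-only .letters string.
UPPER_ONLY = "GATC"

# Hypothetical result of option (c): the same alphabet with lower case added.
EXTENDED = UPPER_ONLY + UPPER_ONLY.lower()


def count_table(sequence, letters):
    """Build a per-letter count by looping over the alphabet's letters.

    Illustrative user code that assumes `letters` is the full expected set.
    """
    return {letter: sequence.count(letter) for letter in letters}


seq = "GATTACA"

# With the current letters the table has the four expected entries.
print(len(count_table(seq, UPPER_ONLY)))

# With extended letters the same loop silently produces twice as many
# entries, half of them zero counts for lower case letters.
print(len(count_table(seq, EXTENDED)))
```

Code that sizes arrays or prints columns from such a loop would change its
output without any change on the user's side.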
