[Biopython-dev] [Biopython (old issues only) - Bug #2597] Enforce alphabet letters in Seq objects

redmine at redmine.open-bio.org
Thu Jan 19 16:59:13 UTC 2017


Issue #2597 has been updated by Peter Cock.

Assignee changed from Biopython Dev Mailing List to Peter Cock

Moved to GitHub as https://github.com/biopython/biopython/issues/1040

----------------------------------------
Bug #2597: Enforce alphabet letters in Seq objects
https://redmine.open-bio.org/issues/2597#change-15389

* Author: Peter Cock
* Status: In Progress
* Priority: Normal
* Assignee: Peter Cock
* Category: Main Distribution
* Target version: Not Applicable
* URL: 
----------------------------------------
If a Seq object is created with an alphabet that has a pre-defined set of letters (e.g. the IUPAC alphabets), then I think Biopython should validate that the sequence uses only those letters.

This will catch misuse of ambiguous sequences with non-ambiguous alphabets, letters in an unexpected case, and most importantly any unexpected symbols (e.g. from a parsing problem).
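To make the three failure cases concrete, here is a minimal sketch of the kind of check being proposed. It is not the attached patch and not Biopython API: the function names unexpected_letters and check_sequence are hypothetical, and the letter strings are assumed to match what the IUPAC DNA alphabets list.

```python
# Illustrative sketch only: these names are hypothetical, not Biopython API.

# Assumed letter sets for the IUPAC DNA alphabets (upper case only).
UNAMBIGUOUS_DNA = "GATC"
AMBIGUOUS_DNA = "GATCRYWSMKHBVDN"


def unexpected_letters(sequence, letters):
    """Return the sorted characters of sequence that are not in letters."""
    return sorted(set(sequence) - set(letters))


def check_sequence(sequence, letters):
    """Return sequence unchanged, or raise ValueError on unexpected letters."""
    bad = unexpected_letters(sequence, letters)
    if bad:
        raise ValueError(
            "Sequence contains letters not in the alphabet: %s" % "".join(bad)
        )
    return sequence
```

With these definitions, "GATNACA" fails against the unambiguous letters but passes against the ambiguous ones, "gattaca" fails on case, and "GAT-ACA*" fails on the stray symbols.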

This will impose a performance overhead, since every letter must be checked when the Seq object is created. Users can avoid the overhead by choosing a generic DNA/RNA/protein alphabet, which does not list the expected letters.
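One way the opt-out could work is sketched below, assuming a generic alphabet is represented by letters being None. SketchAlphabet and SketchSeq are hypothetical stand-ins written for this illustration, not the real Seq or alphabet classes.

```python
# Illustrative sketch only: SketchAlphabet and SketchSeq are hypothetical
# stand-ins, not the Biopython classes.


class SketchAlphabet:
    """An alphabet; letters=None means 'generic', i.e. no letters listed."""

    def __init__(self, letters=None):
        self.letters = letters


class SketchSeq:
    """A sequence that validates its data only for alphabets listing letters."""

    def __init__(self, data, alphabet):
        if alphabet.letters is not None:
            # One pass over the data: this is the overhead being discussed.
            unexpected = set(data) - set(alphabet.letters)
            if unexpected:
                raise ValueError(
                    "Unexpected letters: %s" % "".join(sorted(unexpected))
                )
        # Generic alphabets skip the check entirely.
        self.data = data
        self.alphabet = alphabet


generic_dna = SketchAlphabet()          # no letters listed, no validation
strict_dna = SketchAlphabet("GATC")     # letters listed, validated
```

Here SketchSeq("gatNNN", generic_dna) is accepted without inspection, while the same data with strict_dna raises ValueError.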

Note that we will have to resolve Bug 2532 before doing this, as some parts of Biopython currently misuse the upper-case-only IUPAC alphabet objects with mixed-case sequences.

---Files--------------------------------
bug2597.patch (459 Bytes)

