[Biopython-dev] Sequence object allows non-alphabet characters

Markus Piotrowski Markus.Piotrowski at ruhr-uni-bochum.de
Tue Dec 20 14:48:25 UTC 2011


Eric Talevich <eric.talevich <at> gmail.com> writes:

> As another alternative, you could add a method Seq.validate() which must be
> called separately. Then you'd have a way to trigger validation even after
> directly setting seq.data or .alphabet.
> 
> -E
> 

There is a function _verify_alphabet(sequence) in the Bio.Alphabet package,
which does exactly this. However, the example given in the API documentation
doesn't work for me as written:

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import IUPAC
>>> my_seq = Seq ("MKQHK", IUPAC.protein)
>>> _verify_alphabet(my_seq)

Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    _verify_alphabet(my_seq)
NameError: name '_verify_alphabet' is not defined

>>> from Bio import Alphabet
>>> Alphabet._verify_alphabet(my_seq)
True

Still, I would prefer to have the sequence checked against the chosen alphabet
during initialization, maybe as an option: Seq(sequence[, alphabet, verify])
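
A rough sketch of how such an option could behave, assuming a small stand-in
class rather than the real Seq. The names CheckedSeq, PROTEIN_LETTERS, letters
and verify are made up for illustration and are not part of Biopython. The
separate validate() method follows Eric's suggestion, so the check can also be
triggered after the data has been changed directly:

```python
# Illustrative stand-in only; this is NOT Biopython's Seq class or API.
PROTEIN_LETTERS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acid letters


class CheckedSeq:
    """Hypothetical sequence type that can check its letters on creation."""

    def __init__(self, data, letters=None, verify=False):
        self.data = data
        self.letters = letters
        if verify:
            self.validate()

    def validate(self):
        """Raise ValueError if any character is outside the allowed letters."""
        if self.letters is None:
            return  # no alphabet given, so there is nothing to check against
        bad = sorted(set(self.data) - set(self.letters))
        if bad:
            raise ValueError("letters not in alphabet: " + ", ".join(bad))


# A valid protein sequence passes the check silently.
ok = CheckedSeq("MKQHK", PROTEIN_LETTERS, verify=True)

# An invalid one is rejected at construction time.
try:
    CheckedSeq("MKQHKX1", PROTEIN_LETTERS, verify=True)
except ValueError as err:
    message = str(err)  # names the offending characters

# Without verify, nothing is checked until validate() is called explicitly.
lazy = CheckedSeq("MKQHK", PROTEIN_LETTERS)
lazy.data = "MKQ-HK"  # direct assignment, as with seq.data
try:
    lazy.validate()
    late_error = False
except ValueError:
    late_error = True
```

Leaving verify off by default in this sketch keeps existing calls unchanged and
avoids the cost of scanning long sequences unless the caller asks for it.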

Markus  



