[Biopython-dev] RNA Alphabet: request for comments

Wed Jun 16 09:41:35 UTC 2010

On Wed, Jun 16, 2010 at 10:03 AM, Kristian Rother <krother at rubor.de> wrote:
>
> Hi Peter,
>
>> Why do you need the _set_sequence method? Why not just put that
>> small piece of code inside the __init__ method?
>
> In _set_sequence there'll be a small parser taking care of modifications
> where the one-letter abbreviations do not suffice. E.g. a sequence could
> be
>
> "CCC022UCCC"
>
> (22U is a 5-hydroxyuridine).
>
> --> being parsed into a list of RNAAlphabetEntries
> ['C','C','C','22U','C','C','C']
>
> So the code will grow a little, but the basic idea stays the same.
>
> If someone wants a one-letter representation, it could be "CCCxCCC", but
> this is degenerate because 'x' is used for several modifications.
>
> Best Regards,
>   Kristian
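
To make the parsing step concrete, here is a rough sketch of how such a
tokenizer might look. It is only a guess based on the single example
above: the function name parse_modified_rna is hypothetical, and the rule
that a "0" followed by digits and one letter marks a multi-character
modification code is an assumption, not the actual format specification.

```python
import re

# Assumed token rule (inferred from the one example "CCC022UCCC"):
#   "0" + digits + one letter -> a multi-character modification code,
#                                stored without the leading "0" marker
#   any other single character -> a plain one-letter entry
_TOKEN = re.compile(r"0(\d+[A-Za-z])|(.)")


def parse_modified_rna(text):
    """Split a sequence string into a list of alphabet entry names."""
    entries = []
    for match in _TOKEN.finditer(text):
        # group(1) is set for a modification code, group(2) otherwise
        entries.append(match.group(1) or match.group(2))
    return entries


print(parse_modified_rna("CCC022UCCC"))
```

With that assumed rule, an unmodified string such as "ACGU" simply comes
back as one entry per letter.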

Thinking ahead, we are planning to make Seq objects compare by their
sequence string instead of by object identity. When that happens, I would
suggest that your subclass implement the equality method so that, when
comparing against another instance of the modified RNA Seq, it compares
at the more detailed "22U" level, and otherwise, for compatibility, it
compares at the single-letter level ("x", even though degenerate).
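
Something along these lines is what I have in mind. This is only a
sketch: ModifiedRNASeq and to_one_letter are made-up names standing in
for your subclass, and it is a plain class rather than a real Seq
subclass, so that it runs on its own.

```python
class ModifiedRNASeq:
    """Hypothetical stand-in for the proposed modified RNA Seq subclass."""

    def __init__(self, entries):
        # entries is a list such as ['C', 'C', 'C', '22U', 'C', 'C', 'C']
        self.entries = list(entries)

    def to_one_letter(self):
        # Multi-character modification codes collapse to the degenerate 'x'
        return "".join(e if len(e) == 1 else "x" for e in self.entries)

    def __str__(self):
        return self.to_one_letter()

    def __eq__(self, other):
        if isinstance(other, ModifiedRNASeq):
            # Both sides know about modifications: compare in full detail
            return self.entries == other.entries
        # Otherwise fall back to the single-letter string for compatibility
        return self.to_one_letter() == str(other)


a = ModifiedRNASeq(['C', 'C', 'C', '22U', 'C', 'C', 'C'])
b = ModifiedRNASeq(['C', 'C', 'C', '51U', 'C', 'C', 'C'])
print(a == b)          # detailed comparison between two modified sequences
print(a == "CCCxCCC")  # single-letter comparison against a plain string
```

Note that a and b both equal "CCCxCCC" without being equal to each other,
so equality is no longer transitive across types. Defining __eq__ this
way also leaves the class unhashable in Python 3 unless __hash__ is
defined as well.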

Peter