[Biopython-dev] Fwd: Re: sequence class proposal

Blanca Postigo Jose Miguel jblanca at btc.upv.es
Mon Jun 2 19:11:19 UTC 2008

----- Forwarded message from Blanca Postigo Jose Miguel <jblanca at btc.upv.es> -----
    Date: Mon,  2 Jun 2008 21:08:59 +0200
    From: Blanca Postigo Jose Miguel <jblanca at btc.upv.es>
Reply-To: Blanca Postigo Jose Miguel <jblanca at btc.upv.es>
 Subject: Re: [Biopython-dev] sequence class proposal
      To: Peter <biopython at maubp.freeserve.co.uk>

Quoting Peter <biopython at maubp.freeserve.co.uk>:

> In reply to Jose, I (Peter) wrote:
> >> One of your points seemed to be that the SeqRecord couldn't have a
> >> __getitem__ and methods like reverse, complement, etc.  I don't see
> >> why it couldn't have these.  Perhaps rather than introducing a whole
> >> new class, enhancing the SeqRecord would be a better avenue.
> I've filed Bug 2507 to try and show what I had in mind for the
> __getitem__ method.
> http://bugzilla.open-bio.org/show_bug.cgi?id=2507
I think that would be great. I've just added a question to the bug about the
.seq property of SeqRecord.
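
A minimal sketch of what slicing a record could look like. This is not the
patch attached to Bug 2507; the SeqRecord class below is a simplified stand-in
written for illustration, with a plain string in place of a Seq object, and the
choice to carry the id and description over to the slice is an assumption.

```python
class SeqRecord:
    """Simplified stand-in for a record: a sequence plus some metadata."""

    def __init__(self, seq, id="<unknown id>", description=""):
        self.seq = seq
        self.id = id
        self.description = description

    def __len__(self):
        return len(self.seq)

    def __getitem__(self, index):
        # A slice gives back a new record that keeps the metadata
        # (assumed behaviour); an integer index gives back one letter.
        if isinstance(index, slice):
            return SeqRecord(self.seq[index], id=self.id,
                             description=self.description)
        return self.seq[index]


record = SeqRecord("ACGTACGT", id="read1")
part = record[2:6]
print(part.id, part.seq)  # read1 GTAC
```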

> Adding further methods for (reverse) complement etc could be done in
> much the same way.
> Returning to extending Biopython to support per-letter-annotation, I
> can see two options:
> Right now, the SeqRecord object HAS a Seq object.  If we create a new
> RichSeq which subclasses the Seq object to provide
> per-letter-annotation, then you could use a SeqRecord where the .seq
> property is in fact a RichSeq object.  The SeqRecord class doesn't
> need to have any changes made for this to work (assuming the RichSeq
> provides the same API as the Seq object).
Here I had a slightly different idea, but maybe yours is better. Basically, my
RichSeq proposal is just a SeqRecord with slicing and without the seq property.
The problem with the approach you describe is that the RichSeq would hold
the per-letter annotation, so SeqRecord would have the general annotation and
RichSeq (in the .seq) would have other features. I would find that confusing.
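
To make the first option concrete, here is a rough sketch of a SeqRecord that
HAS a RichSeq. All three classes are simplified stand-ins, not the Biopython
ones, and the attribute name letter_annotation is an assumption. It shows that
SeqRecord needs no changes, and also that the annotation ends up in two places.

```python
class Seq:
    """Stand-in for a Seq object: a wrapper around a string of letters."""

    def __init__(self, data):
        self.data = str(data)

    def __len__(self):
        return len(self.data)

    def __str__(self):
        return self.data

    def __getitem__(self, index):
        if isinstance(index, slice):
            return Seq(self.data[index])
        return self.data[index]


class RichSeq(Seq):
    """Hypothetical Seq subclass with one annotation value per letter."""

    def __init__(self, data, letter_annotation):
        super().__init__(data)
        if len(letter_annotation) != len(self.data):
            raise ValueError("need exactly one annotation value per letter")
        self.letter_annotation = list(letter_annotation)

    def __getitem__(self, index):
        # Slice the letters and their annotation together.
        if isinstance(index, slice):
            return RichSeq(self.data[index], self.letter_annotation[index])
        return self.data[index]


class SeqRecord:
    """Unchanged record class: it simply holds whatever it is given as seq."""

    def __init__(self, seq, id="<unknown id>", annotations=None):
        self.seq = seq
        self.id = id
        self.annotations = dict(annotations or {})


record = SeqRecord(RichSeq("ACGT", [30, 31, 12, 40]), id="read1",
                   annotations={"source": "example"})
sub = record.seq[1:3]
print(sub, sub.letter_annotation)  # CG [31, 12]
print(record.annotations)          # general annotation lives on the record
```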

> If we make the SeqRecord a subclass of the Seq object, then I would
> suggest either RichSeq subclassing SeqRecord subclassing Seq, or
> perhaps SeqRecord subclassing RichSeq subclassing Seq.  It depends on
> if you think the id/name/description/dbxrefs/etc properties would be
> useful in common use cases of the RichSeq object.
If SeqRecord is a subclass of Seq, RichSeq is not necessary anymore. That's what
I was proposing. The problem is that current users of SeqRecord would have a
hard time with the new behaviour, because in that case supporting the seq
property would be hard. To avoid that breakage I was proposing to create
RichSeq. RichSeq would be just the SeqRecord that you propose, but it would let
users migrate to RichSeq without forcing them to change to a new
SeqRecord behaviour.
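
A sketch of that idea, under the same caveats: Seq is a simplified stand-in,
and the RichSeq constructor arguments and the letter_annotation attribute are
made up for illustration. Here RichSeq IS a Seq, carries the record-style
metadata itself, and has no seq property.

```python
class Seq:
    """Stand-in for a Seq object: a wrapper around a string of letters."""

    def __init__(self, data):
        self.data = str(data)

    def __len__(self):
        return len(self.data)

    def __str__(self):
        return self.data

    def __getitem__(self, index):
        if isinstance(index, slice):
            return Seq(self.data[index])
        return self.data[index]


class RichSeq(Seq):
    """Hypothetical class: sequence, metadata and per-letter annotation in one."""

    def __init__(self, data, id="<unknown id>", description="",
                 letter_annotation=None):
        super().__init__(data)
        self.id = id
        self.description = description
        if letter_annotation is None:
            letter_annotation = [None] * len(self.data)
        if len(letter_annotation) != len(self.data):
            raise ValueError("need exactly one annotation value per letter")
        self.letter_annotation = list(letter_annotation)

    def __getitem__(self, index):
        # Slicing keeps the metadata and cuts the annotation to match.
        if isinstance(index, slice):
            return RichSeq(self.data[index], id=self.id,
                           description=self.description,
                           letter_annotation=self.letter_annotation[index])
        return self.data[index]


read = RichSeq("ACGT", id="read1", letter_annotation=[30, 31, 12, 40])
head = read[:2]
print(head, head.id, head.letter_annotation)  # AC read1 [30, 31]
```

Existing code that reads record.seq keeps working against the old SeqRecord,
since that class is left untouched in this scheme.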

> It's not going to be possible for all three classes to have the same
> __init__ parameters without breaking existing scripts (and only
> supporting the lowest common denominator).
That's another reason to rename your new proposed SeqRecord to RichSeq.
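
A small illustration of the __init__ clash. The signatures are simplified
assumptions: a Seq built from (data, alphabet) and a record built from
(sequence, id). The rebuild helper is hypothetical; it stands for any existing
code that constructs objects positionally.

```python
class Seq:
    """Stand-in Seq taking (data, alphabet) in that order."""

    def __init__(self, data, alphabet="generic"):
        self.data = data
        self.alphabet = alphabet


class SeqRecord(Seq):
    """Hypothetical subclass keeping the record-style order (sequence, id)."""

    def __init__(self, data, id="<unknown id>", alphabet="generic"):
        super().__init__(data, alphabet)
        self.id = id


def rebuild(obj, new_data):
    # Written with Seq in mind: passes the alphabet as the second
    # positional argument.
    return type(obj)(new_data, obj.alphabet)


plain = rebuild(Seq("ACGT", "dna"), "AC")
rec = rebuild(SeqRecord("ACGT", id="read1", alphabet="dna"), "AC")
print(plain.alphabet)         # dna
print(rec.id, rec.alphabet)   # dna generic
```

In the subclass the alphabet silently lands in the id slot, which is the kind
of breakage a shared argument order would be needed to avoid.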

> Peter

Jose Blanca

----- End of forwarded message -----

