[Biopython] Additions to the SeqRecord

Peter biopython at maubp.freeserve.co.uk
Wed Nov 18 13:30:35 UTC 2009


Peter wrote:
> In thinking about this, perhaps there is another less invasive change,
> which I'm going to call Plan(C):
>
> We expect (and could even enforce this assumption) there to be at
> most one "source" feature in a GenBank/EMBL file, and that it should
> span the full length of the sequence. Taking this as a special case, when
> slicing a SeqRecord, we could also slice the "source" SeqFeature to
> match the new reduced sequence. Furthermore, when adding two
> SeqRecord objects, we would try to combine the two "source"
> SeqFeatures - taking only common annotation information.

Here is an outline of what I have in mind (incomplete, but it does
the basics). If we want to discuss the implementation, perhaps we
should move this to the dev list...

http://github.com/peterjc/biopython/commit/a074919b9925cb908935abf3161a50758f21f607

However, the point is that "Plan C" looks feasible, and it should
handle SeqRecord slicing and addition fairly nicely when there is a
"source" SeqFeature (i.e. preserving it for things like removing part
of a sequence, or doing an origin shift).
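
To make the idea concrete, here is a minimal sketch in plain Python. It
is not the code in the commit above: Feature and Record are simplified
stand-ins for SeqFeature and SeqRecord, and the helper names
(find_source, slice_record, add_records) are invented for illustration.
It also assumes that only features lying wholly inside a slice are kept.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical stand-in for a SeqFeature (0-based, end-exclusive)."""
    type: str
    start: int
    end: int
    qualifiers: dict = field(default_factory=dict)


@dataclass
class Record:
    """Hypothetical stand-in for a SeqRecord."""
    seq: str
    features: list = field(default_factory=list)


def find_source(record):
    """Return the single full-length "source" feature, else None."""
    sources = [f for f in record.features if f.type == "source"]
    if len(sources) != 1:
        return None
    source = sources[0]
    if source.start != 0 or source.end != len(record.seq):
        return None
    return source


def slice_record(record, start, end):
    """Slice the sequence and resize the "source" feature to match."""
    start, end, _ = slice(start, end).indices(len(record.seq))
    end = max(start, end)
    new_seq = record.seq[start:end]
    source = find_source(record)
    features = []
    if source is not None:
        # Special case: keep the annotation, but span the new length.
        features.append(
            Feature("source", 0, len(new_seq), dict(source.qualifiers)))
    for f in record.features:
        if f is source:
            continue
        # Assumption: keep only features wholly inside the slice.
        if start <= f.start and f.end <= end:
            features.append(Feature(f.type, f.start - start,
                                    f.end - start, dict(f.qualifiers)))
    return Record(new_seq, features)


def add_records(a, b):
    """Concatenate two records, merging their "source" features."""
    src_a, src_b = find_source(a), find_source(b)
    offset = len(a.seq)
    features = []
    if src_a is not None and src_b is not None:
        # Take only the annotation common to both sources.
        common = {k: v for k, v in src_a.qualifiers.items()
                  if k in src_b.qualifiers and src_b.qualifiers[k] == v}
        features.append(
            Feature("source", 0, offset + len(b.seq), common))
    for f in a.features:
        if f is not src_a:
            features.append(
                Feature(f.type, f.start, f.end, dict(f.qualifiers)))
    for f in b.features:
        if f is not src_b:
            # Shift coordinates of the second record's features.
            features.append(Feature(f.type, f.start + offset,
                                    f.end + offset, dict(f.qualifiers)))
    return Record(a.seq + b.seq, features)


def origin_shift(record, shift):
    """Rotate a (circular) sequence using slicing and addition."""
    n = len(record.seq)
    return add_records(slice_record(record, shift, n),
                       slice_record(record, 0, shift))
```

With these helpers an origin shift is just two slices and an addition,
and since both halves carry identical "source" annotation the combined
record ends up with a full-length "source" feature again. Note that any
feature spanning the cut point is lost in this simplified version.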

Peter
