[Biopython-dev] Merging the GFF3 and VCF branches

Peter Cock p.j.a.cock at googlemail.com
Wed Jun 10 09:18:03 UTC 2015


On Tue, Jun 9, 2015 at 8:17 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
> On Thu, Jun 4, 2015 at 3:44 AM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
>>
>> This would be great to have merged - pathological test cases
>> and interconversion too :)
>>
>> Did we settle on a plan for parent/child relationships in
>> SeqFeature objects (beyond deprecating sub_features
>> which has been replaced with CompoundLocations)?
>>
>> Peter
>
>
> The last thread I see on this topic is from the end of summer 2012:
> http://mailman.open-bio.org/pipermail/biopython-dev/2012-July/018979.html
> (thread)
> http://mailman.open-bio.org/pipermail/biopython-dev/2012-September/019101.html
> (terminal)
>
> I'm a bit confused because the CompoundLocation class exists in
> Bio/SeqFeature.py, and git blame says it was written in late 2011 -- Peter's
> Time Machine in action? Does the f_loc5 branch modify the existing
> CompoundLocation class, then?

Those were old commits rebased onto master; perhaps a merge would
have been clearer? As far as I recall, f_loc5 (or whatever the final
iteration of this was) is all in master now.

> The threads above also mention a deprecation process. I suppose in order to
> begin that process we need to determine what we're deprecating in favor of,
> then apply the new functionality and trigger a DeprecationWarning from the
> old-and-tired sub_features attribute along with some shim to keep things
> working approximately the way they used to?

Calling SeqFeature(..., sub_features=some_list) will trigger a deprecation
warning.

Accessing a (non-empty) sub_features list will also trigger a deprecation
warning, e.g.

$ python
>>> from Bio import SeqIO
>>> r = SeqIO.read("NC_000932.gb", "gb")
>>> r.features[1].sub_features
... BiopythonDeprecationWarning: Rather using f.sub_features,
f.location should be a CompoundFeatureLocation ...
[SeqFeature(FeatureLocation(ExactPosition(97998),
ExactPosition(98793), strand=-1), type='gene'),
SeqFeature(FeatureLocation(ExactPosition(69610), ExactPosition(69724),
strand=-1), type='gene')]

(Looks like a grammatical typo in that message, whoops)
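
For anyone unfamiliar with the pattern, here is a minimal sketch of how
this kind of deprecation shim can work, using only the standard library.
The names SketchFeature, SketchDeprecationWarning and demo are
hypothetical and purely illustrative; this is not the actual code in
Bio/SeqFeature.py, just the general idea of warning on the constructor
argument and on access to a non-empty list.

```python
import warnings


class SketchDeprecationWarning(DeprecationWarning):
    """Hypothetical stand-in for BiopythonDeprecationWarning."""


class SketchFeature:
    """Toy feature class for illustration; NOT Bio.SeqFeature.SeqFeature."""

    def __init__(self, location, type="", sub_features=None):
        self.location = location
        self.type = type
        if sub_features:
            # Warn when the caller passes the deprecated argument.
            warnings.warn(
                "sub_features is deprecated; use a compound location instead",
                SketchDeprecationWarning,
                stacklevel=2,
            )
        self._sub_features = list(sub_features or [])

    @property
    def sub_features(self):
        # Warn only for a non-empty list, so features without
        # sub-features stay quiet.
        if self._sub_features:
            warnings.warn(
                "Rather than using f.sub_features, use a compound location",
                SketchDeprecationWarning,
                stacklevel=2,
            )
        return self._sub_features


def demo():
    """Count warnings after construction, access, and empty access."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        child = SketchFeature((0, 10), type="exon")
        parent = SketchFeature((0, 30), type="gene", sub_features=[child])
        after_construct = len(caught)
        parent.sub_features  # non-empty: warns
        after_access = len(caught)
        child.sub_features  # empty: silent
        after_empty = len(caught)
    return after_construct, after_access, after_empty
```

In this sketch, old code keeps working (the list is still returned), but
every use of the deprecated argument or a non-empty attribute is flagged,
which is roughly the behaviour shown in the session above.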

So I think we're ready to remove the sub_features attribute (and
the associated code in the GenBank parser, etc., which populates it).

What to add for parent/child relationships between features is
yet to be decided.

> Even if a perfectly smooth transition isn't possible, I think it's
> worthwhile to make a gentle break to allow Biopython to correctly
> handle modern file formats for genomic features/annotations.

This should be pretty smooth.

Peter

