[Biopython] iterating over FeatureLocation
Peter Cock
p.j.a.cock at googlemail.com
Mon Jan 13 16:18:01 UTC 2014
On Mon, Jan 13, 2014 at 4:07 PM, Michael Thon <mike.thon at gmail.com> wrote:
> Here are two examples from the GenBank format file (not from GenBank though)
>
>
> CDS order(6621..6658,6739..6985)
> /Source="maker"
> /codon_start=1
> /ID="CFIO01_14847-RA:cds"
> /label="CDS"
>
> CDS 419..2374
> /Source="maker"
> /codon_start=1
> /ID="CFIO01_05899-RA:cds"
> /label="CDS"
>
> If the feature is a simple feature, then I just need to access its start and end.
> If it's a compound feature, then I need to iterate over each segment, accessing the start and end.
>
> What I am doing at the moment is this:
>
> if feat._sub_features:
>     for sf in feat.sub_features:
>         start = sf.location.start
>         …
> else:
>     start = feat.location.start
>     …
>
> it works, I think. Is there a better way?
Don't do that :) Python variables, methods, etc. whose names start with a
single underscore are private by convention and should not generally be
used. In this case, ._sub_features is an internal detail that provides
behind-the-scenes backwards compatibility for the now deprecated property
.sub_features (don't use that either).
Instead, use the location object directly. It now holds any sub-location
information itself, using a CompoundLocation object.
See the .parts attribute, which gives a list of simple locations,
e.g.
for part in feat.location.parts:
    start = part.start
    ...
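
To make this concrete, here is a minimal sketch that rebuilds the two
example CDS features by hand and walks .parts on each. The helper name
segment_bounds is made up for illustration, and the coordinates are
converted from the GenBank 1-based positions to Python's 0-based,
end-exclusive style. It also assumes a simple FeatureLocation exposes
.parts as a one-item list, so the same loop covers both cases.

```python
from Bio.SeqFeature import SeqFeature, FeatureLocation, CompoundLocation

# GenBank "419..2374" becomes start=418, end=2374 in Python coordinates.
simple = SeqFeature(FeatureLocation(418, 2374, strand=1), type="CDS")

# GenBank "order(6621..6658,6739..6985)" as a two-part CompoundLocation.
compound = SeqFeature(
    CompoundLocation(
        [
            FeatureLocation(6620, 6658, strand=1),
            FeatureLocation(6738, 6985, strand=1),
        ],
        operator="order",
    ),
    type="CDS",
)


def segment_bounds(feat):
    """Return (start, end) for each part of the feature's location.

    Hypothetical helper, not part of Biopython. No branching on
    simple versus compound is needed because both expose .parts.
    """
    return [(int(part.start), int(part.end)) for part in feat.location.parts]


for feat in (simple, compound):
    for start, end in segment_bounds(feat):
        print(feat.type, start, end)
```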
>
> Also, is there an easy way to get the sequence represented by the seqfeature,
> if it is made up of CompoundLocations? These features are CDSs where each
> sub-feature is an exon. I need to splice them all together and get the translation.
>
Yes, where `feat` is a SeqFeature object, use `feat.extract(the_parent_sequence)`
to get the spliced sequence, which you can then translate. See the section
"Sequence described by a feature or location" in the Tutorial,
http://biopython.org/DIST/docs/tutorial/Tutorial.html
http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
On reflection, the Tutorial could do with a bit more detail on how to use
a CompoundLocation, but I did try to cover this in the docstrings.
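
As a small illustration of the extract-then-translate step, here is a
sketch on a toy sequence. The parent sequence, the exon coordinates and
the "join" operator are all invented for the example; a real CDS would
come from the parsed GenBank record, and the right codon table depends
on the organism.

```python
from Bio.Seq import Seq
from Bio.SeqFeature import SeqFeature, FeatureLocation, CompoundLocation

# Toy parent: flank, exon 1 (ATGGCC), intron (TTTTT), exon 2 (AAATGA), flank.
parent = Seq("CCCATGGCCTTTTTAAATGACCC")

# Two exons on the forward strand, in 0-based end-exclusive coordinates.
cds = SeqFeature(
    CompoundLocation(
        [
            FeatureLocation(3, 9, strand=1),
            FeatureLocation(14, 20, strand=1),
        ],
        operator="join",
    ),
    type="CDS",
)

# extract() slices each part out of the parent and concatenates them.
spliced = cds.extract(parent)

# Translate with the standard table, stopping at the first stop codon.
protein = spliced.translate(table=1, to_stop=True)

print(spliced)
print(protein)
```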
Regards,
Peter