[Biopython] iterating over FeatureLocation
Peter Cock
p.j.a.cock at googlemail.com
Tue Jan 21 16:52:35 UTC 2014
On Tue, Jan 21, 2014 at 4:39 PM, Michael Thon <mike.thon at gmail.com> wrote:
> Here’s another question. I have this GenBank formatted feature:
>
> CDS order(complement(3448..3635),complement(2617..3256))
> /Source="maker"
> /codon_start=1
> /ID="CFIO01_05457-RA:cds"
> /label="CDS"
>
> When I extract the sequence I get this:
>
> (Pdb) str(feat.extract(seq).seq)
> ...
>
> This is supposed to be a CDS that translates to a protein sequence
> starting with M and ending with a stop codon. The sequence above isn’t
> correct: the exons are in the wrong order. When I reverse the order of
> the exons, I get a CDS sequence that can be translated:
>
> (Pdb) feat.location.parts.reverse()
> (Pdb) str(feat.extract(seq).seq)
> ...
> (Pdb) str(feat.extract(seq).seq.translate())
>
> 'MSHEHSHDGPHGHAHSHEGGFNAQEHGHSHEILDGPGSYLGREMPIVEGRNWSDRAFTIGIGGPVGSGKTALMLALCLALREKYSIAAVTNDIFTREDAEFLTRHKALPAPRIRAIETGGCPHAAVREDISANLAALEDLHREFDADLLLIESGGDNLAANYSRELADYIIYVIDVSGGDKIPRKGGPGITQSDLLVVNKTDLAEIVGADLGVMERDARKMREGGPTVFAQVKKNVAVDHIVNLMLSAWKASGAEENRRAAGGPRPTEGLDSLKA*'
>
> So my question is, is there something wrong with the file I’m parsing?
>
Possibly - the 'order' tag actually means the order of the parts is unknown.
If the order is known, it should be 'join' instead:
join(complement(3448..3635),complement(2617..3256))
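
To make the effect of part order concrete, here is a minimal plain-Python sketch. It does not use Biopython and is not how Biopython implements extraction; the helpers parse_location and extract, and the 20 bp toy genome, are made up for illustration. It assumes a location consisting only of a single join(...) or order(...) wrapping simple start..end or complement(start..end) parts. Each part is sliced, reverse-complemented if needed, and the pieces are concatenated in the order they are listed:

```python
import re

# Translation table for complementing DNA (assumes unambiguous bases only).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

PART_RE = re.compile(r"complement\((\d+)\.\.(\d+)\)|(\d+)\.\.(\d+)")


def parse_location(text):
    """Split 'join(...)' or 'order(...)' into (operator, parts).

    Each part is (start, end, strand), using the 1-based inclusive
    coordinates exactly as written in the GenBank feature table.
    """
    match = re.fullmatch(r"(join|order)\((.*)\)", text.strip())
    if not match:
        raise ValueError(f"unsupported location: {text!r}")
    operator, body = match.groups()
    parts = []
    for item in body.split(","):
        m = PART_RE.fullmatch(item.strip())
        if not m:
            raise ValueError(f"unsupported part: {item!r}")
        if m.group(1):  # complement(start..end)
            parts.append((int(m.group(1)), int(m.group(2)), -1))
        else:           # start..end
            parts.append((int(m.group(3)), int(m.group(4)), 1))
    return operator, parts


def extract(sequence, parts):
    """Concatenate the parts in the order given, reverse-complementing
    any part on the minus strand."""
    pieces = []
    for start, end, strand in parts:
        piece = sequence[start - 1:end]
        if strand == -1:
            piece = piece.translate(COMPLEMENT)[::-1]
        pieces.append(piece)
    return "".join(pieces)


# Toy genome (hypothetical): a two-exon gene on the reverse strand whose
# spliced CDS is ATGGCC + AAATAG.
genome = "GGCTATTTCTACGGCCATAA"

operator, parts = parse_location("join(complement(13..18),complement(3..8))")
print(operator, extract(genome, parts))        # join ATGGCCAAATAG
print(extract(genome, list(reversed(parts))))  # AAATAGATGGCC
```

The same parts in the opposite order give a sequence that no longer starts with ATG, which matches the symptom described in the quoted message.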
What's the accession/URL for the full file this example came from?
Peter