[Biopython-dev] Error in SeqFeature.CompoundLocation parsing NCBI efetch format

Brynjar Smári Bjarnason binni at binnisb.com
Thu Dec 5 18:06:45 UTC 2013


I'll ask someone who knows, but I think I could skip the bonds. Can you
suggest how I can ignore them, either in the efetch response or in the
parser?

Thanks a lot for looking at this!
On 5 Dec 2013 18:12, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:

> On Thu, Dec 5, 2013 at 4:46 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >
> > Not to worry - the site did respond when I retried a bit later, and
> > I can reproduce the parser error:
> >
> > >>> from Bio import SeqIO
> > >>> r = SeqIO.read("1MRR_A.gp", "genbank")
> > BiopythonParserWarning: Couldn't parse feature location:
> > 'join(bond(84),bond(115),bond(118),bond(238))'
> > BiopythonParserWarning: Couldn't parse feature location:
> > 'join(bond(115),bond(204),bond(238),bond(241))'
> > BiopythonParserWarning: Couldn't parse feature location:
> > 'join(bond(194),bond(272))'
> > ...
> > ValueError: CompoundLocation should have at least 2 parts
>
> The problem is the bond locations, and in particular while the
> parser gave up on the ones with a warning, it fell over the
> single bond entry, bond(196).
>
> This is partly due to a change in the use of the bond term,
> which used to be a compound entry like bond(194,272).
> Also the GenBank parser was and is primarily used on
> nucleotide sequences rather than GenPept files which are
> occasionally more weird (like here!).
>
> A short term hack would be to strip out the bond term
> (with a warning) and parse the remainder as a simple
> join or single residue accordingly.
>
> Would that work for you - do you need the bond bit?
>
> Peter
>
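The short-term hack described in the quoted message (strip the bond term
with a warning, then treat the remainder as a simple join or a single
residue) could be sketched roughly as below. This is only an illustration
working on the location string itself, not the actual Biopython parser
code; the helper name strip_bonds and the regular expression are
hypothetical, and it assumes bond(...) never contains nested parentheses.

```python
import re
import warnings

# Hypothetical pattern: a bond(...) term with no nested parentheses inside.
_BOND_RE = re.compile(r"bond\(([^()]*)\)")


def strip_bonds(location):
    """Drop bond(...) wrappers from a GenPept feature location string.

    Not part of Biopython; a sketch of the "strip out the bond term"
    workaround. Returns a plain join(...) or a single position.
    """
    stripped = _BOND_RE.sub(r"\1", location)
    if stripped == location:
        # No bond term present, leave the location untouched.
        return location
    warnings.warn("Ignoring bond() in feature location %r" % location)
    # Peel off an outer join(...) if there is one, then rebuild it only
    # when more than one position remains.
    if stripped.startswith("join(") and stripped.endswith(")"):
        inner = stripped[len("join("):-1]
    else:
        inner = stripped
    parts = [p.strip() for p in inner.split(",")]
    if len(parts) == 1:
        return parts[0]
    return "join(%s)" % ",".join(parts)


if __name__ == "__main__":
    for loc in ("join(bond(194),bond(272))", "bond(196)", "bond(194,272)"):
        print(loc, "->", strip_bonds(loc))
```

The single-entry case, bond(196), comes back as a bare position, so
nothing downstream is asked to build a CompoundLocation from one part.
How to hook this in (pre-processing the efetch text, or inside the
parser) is left open here.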


