[Biopython-dev] Error in SeqFeature.CompoundLocation parsing NCBI efetch format

Brynjar Smári Bjarnason binni at binnisb.com
Fri Dec 6 08:03:45 UTC 2013


On 5 December 2013 19:08, Brynjar Smári Bjarnason <binni at binnisb.com> wrote:

> Thanks, will look at this when I'm at the computer :-)
> On 5 Dec 2013 19:06, "Brynjar Smári Bjarnason" <binni at binnisb.com> wrote:
>
>> I'll ask one who knows but I think I could skip using the bonds. Can you
>> suggest how I can ignore the bonds in efetch response, or the parser?
>>
>> Thanks a lot for looking at this!
>> On 5 Dec 2013 18:12, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:
>>
>>> On Thu, Dec 5, 2013 at 4:46 PM, Peter Cock <p.j.a.cock at googlemail.com>
>>> wrote:
>>> >
>>> > Not to worry - the site did respond when I retried a bit later, and
>>> > I can reproduce the parser error:
>>> >
>>> >>>> from Bio import SeqIO
>>> >>>> r = SeqIO.read("1MRR_A.gp", "genbank")
>>> > BiopythonParserWarning: Couldn't parse feature location:
>>> > 'join(bond(84),bond(115),bond(118),bond(238))'
>>> > BiopythonParserWarning: Couldn't parse feature location:
>>> > 'join(bond(115),bond(204),bond(238),bond(241))'
>>> > BiopythonParserWarning: Couldn't parse feature location:
>>> > 'join(bond(194),bond(272))'
>>> > ...
>>> > ValueError: CompoundLocation should have at least 2 parts
>>>
>>> The problem is the bond locations, and in particular while the
>>> parser gave up on the ones with a warning, it fell over the
>>> single bond entry, bond(196).
>>>
>>> This is partly due to a change in the use of the bond term,
>>> which used to be a compound entry like bond(194,272).
>>> Also the GenBank parser was and is primarily used on
>>> nucleotide sequences rather than GenPept files which are
>>> occasionally more weird (like here!).
>>>
>>> A short term hack would be to strip out the bond term
>>> (with a warning) and parse the remainder as a simple
>>> join or single residue accordingly.
>>>
>>> Would that work for you - do you need the bond bit?
>>>
>>> Peter
>>>
>>
For our purposes, I believe leaving out the bond information is fine, so your
patch should work well.
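
For reference, this is roughly how I understand the short-term hack described
above: drop the bond(...) wrapper with a warning, then treat what is left as a
plain join or a single residue. It is only a sketch of the idea, not the actual
patch. The helper name strip_bond_terms is made up for illustration and is not
part of Biopython, and it only rewrites a location string; it does not hook
into the GenBank parser.

```python
import re
import warnings

# Matches a bond(...) term whose contents have no nested parentheses,
# e.g. "bond(84)" or the older compound form "bond(194,272)".
_BOND_RE = re.compile(r"bond\(([^()]*)\)")


def strip_bond_terms(location):
    """Hypothetical helper: remove bond(...) wrappers from a location string.

    Returns the location rewritten as a plain join(...) or a single
    position, warning whenever a bond term was dropped.
    """
    stripped, count = _BOND_RE.subn(r"\1", location)
    if count == 0:
        # Nothing to do, leave ordinary locations untouched.
        return location

    warnings.warn("Dropping bond(...) from feature location %r" % location)

    # Old-style bond(194,272) leaves a bare "194,272"; wrap it in a join.
    if "," in stripped and not stripped.startswith("join("):
        stripped = "join(%s)" % stripped

    # A join with only one part is really a single position.
    single = re.fullmatch(r"join\(([^(),]+)\)", stripped)
    if single:
        stripped = single.group(1)

    return stripped


if __name__ == "__main__":
    for loc in ("join(bond(84),bond(115),bond(118),bond(238))",
                "bond(196)",
                "bond(194,272)"):
        print(loc, "->", strip_bond_terms(loc))
```

With the locations from the warnings above, the join of four bonds would come
out as a join of the four positions, and the lone bond(196) as just that one
position, which is the case that currently raises the ValueError.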

Any suggestions on a good way to apply this patch? Should I build Biopython
from that branch, or clone the latest stable release and apply the patch
before building?

Thank you

Brynjar



