[Biopython-dev] Problems importing GenBank Files with complex LOCATION tags

Bruce Southey bsouthey at gmail.com
Mon Feb 2 14:39:03 UTC 2009


Hi,
I guess this pertains to Bugs 2681 and 2745. Please see Peter's 
comments and suggested patch on Bug 2745.

http://bugzilla.open-bio.org/show_bug.cgi?id=2681
http://bugzilla.open-bio.org/show_bug.cgi?id=2745
 
Any comments or thoughts on these would be appreciated!

Thanks
Bruce

Nick Loman wrote:
> Hi there,
>
> I'm attempting to import the whole of RefSeq into a BioSQL schema 
> using the Biopython loader. However, I am encountering problems with 
> items in the CON division, such as NW_002063152. I am using a stock 
> Biopython 1.49 install.
>
> The problem occurs when parsing complex CONTIG location tags, such as 
> the following (spacing adjusted for readability):
>
> CONTIG
>    join(NZ_ABJI01000250.1:1..6235,gap(unk100),
>    NZ_ABJI01000251.1:1..2827,gap(1420),NZ_ABJI01000252.1:1..1802,
>    gap(unk100),NZ_ABJI01000253.1:1..2460,gap(unk100),
>    NZ_ABJI01000254.1:1..12092,gap(639),NZ_ABJI01000255.1:1..1192,
>    gap(unk100),NZ_ABJI01000256.1:1..5498,gap(unk100),
>    NZ_ABJI01000257.1:1..20442,gap(unk100),NZ_ABJI01000258.1:1..2364,
>    gap(511),NZ_ABJI01000259.1:1..17405,gap(unk100),
>    NZ_ABJI01000260.1:1..2462,gap(570),NZ_ABJI01000261.1:1..3348,
>    gap(410),NZ_ABJI01000262.1:1..815,gap(196),
>    NZ_ABJI01000263.1:1..589)
>
> I have worked around the problem by rewriting the records during my 
> import to produce a blank ORIGIN definition, which at least gets the 
> sequence features imported.
>
> I realise complex location parsing has been discussed before on this 
> list - would the authors expect this to parse correctly, or is it out 
> of the scope of the current code?
>
> Best regards,
>
> Nick.
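
For anyone looking at the bugs above, here is a minimal sketch of how a
CONTIG expression like the one quoted could be split into its pieces. It is
not Biopython code and not the patch attached to Bug 2745; the function name
parse_contig and the tuple layout are made up for illustration. It assumes
only the two element forms seen in the example (ACCESSION.VERSION:start..end
and gap(...)), so things like complement(...) are rejected rather than
handled.

```python
import re

# Hypothetical helper, not part of Biopython.
# Matches e.g. "NZ_ABJI01000250.1:1..6235"
_SEGMENT = re.compile(r"^([A-Za-z0-9_.]+):(\d+)\.\.(\d+)$")
# Matches e.g. "gap(1420)", "gap(unk100)" and "gap()"
_GAP = re.compile(r"^gap\((unk)?(\d*)\)$")


def parse_contig(text):
    """Split a CONTIG join(...) string into a list of tuples.

    Segments become ("segment", accession, start, end).
    Gaps become ("gap", length_or_None, length_is_known).
    """
    # The expression is wrapped over several lines in the flat file,
    # so remove all whitespace before looking at it.
    expr = "".join(text.split())
    if not (expr.startswith("join(") and expr.endswith(")")):
        raise ValueError("expected a join(...) expression")

    parts = []
    for item in expr[len("join("):-1].split(","):
        seg = _SEGMENT.match(item)
        if seg:
            accession, start, end = seg.groups()
            parts.append(("segment", accession, int(start), int(end)))
            continue
        gap = _GAP.match(item)
        if gap:
            unk, digits = gap.groups()
            length = int(digits) if digits else None
            known = unk is None and length is not None
            parts.append(("gap", length, known))
            continue
        # Anything else (e.g. complement(...)) is outside this sketch.
        raise ValueError("unsupported CONTIG element: %r" % item)
    return parts


example = """join(NZ_ABJI01000250.1:1..6235,gap(unk100),
   NZ_ABJI01000251.1:1..2827,gap(1420))"""
for part in parse_contig(example):
    print(part)
```

Raising on anything unrecognised is deliberate: it seems safer for a bulk
RefSeq import to fail loudly on a record than to load a partial assembly.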



