[Biopython-dev] [Bug 1758] genbank parser chokes on /transl_except

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Tue Mar 8 17:19:21 EST 2005


http://bugzilla.open-bio.org/show_bug.cgi?id=1758





------- Additional Comments From biopython-bugzilla at maubp.freeserve.co.uk  2005-03-08 17:19 -------
You can test this with accession NT_033779 available here:

ftp://ftp.ncbi.nih.gov/genomes/Drosophila_melanogaster/CHR_2/NT_033779.gbk

I think the problem is that the /transl_except=... entry spans multiple lines,
but is not wrapped in quotes (as done normally for multi-line entries).

Due to bug 1747 I haven't tried loading this with the current GenBank parser as
I don't have enough RAM.  However, for the record, even my personal GenBank
parser (patch on bug 1747) doesn't yet cope with the /transl_except=... entry.

Are this file and others like it "wrong", i.e. should they quote the entry?

Or do we need to cope with this as well (not sure how painful that would be!)?

Snippet from the file:

     CDS             complement(join(18108857..18109603,18109665..18110692,
                     18111046..18111608,18111671..18112909,18113657..18114058,
                     18115560..18116014))
                     /gene="kel"
                     /locus_tag="CG7210"
                     /codon_start=1
                     /transl_except=(pos:complement(18111697..18111699),
                     aa:OTHER)
                     /protein_id="NP_476589.4"
                     /db_xref="GI:45549017"
                     /db_xref="FLYBASE:FBgn0001301"
                     /db_xref="GeneID:35084"
                     /translation="MIALSALLTKYTIGIMSNLSNGNSNNNNQQQQQQQQGQNPQQPA
                     QNEGGAGAEFVAPPPGLGAAVGVAAMQQRNRLLQQQQQQHHHHQNPAAEGSGLERGSC
                     LLRYASQNSLDESSQKHVQRPNGKERGTVGQYSNEQHTARSFDAMNEMRKQKQLCDVI
                     LVADDVEIHAHRMVLASCSPYFYAMFTSFEESRQARITLQSVDARALELLIDYVYTAT
                     VEVNEDNVQVLLTAANLLQLTDVRDACCDFLQTQLDASNCLGIREFADIHACVELLNY
                     AETYIEQHFNEVIQFDEFLNLSHEQVISLIGNDRISVPNEERVYECVIAWLRYDVPMR
                     EQFTSLLMEHVRLPFLSKEYITQRVDKEILLEGNIVCKNLIIEALTYHLLPTETKSAR
                     TVPRKPVGMPKILLVIGGQAPKAIRSVEWYDLREEKWYQAAEMPNRRCRSGLSVLGDK
                     VYAVGGFNGSLRVRTVDVYDPATDQWANCSNMEARRSTLGVAVLNGCIYAVGGFDGTT
                     GLSSAEMYDPKTDIWRFIASMSTRRSSVGVGVVHGLLYAVGGYDGFTRQCLSSVERYN
                     PDTDTWVNVAEMSSRRSGAGVGVLNNILYAVGGHDGPMVRRSVEAYDCETNSWRSVAD
                     MSYCRRNAGVVAHDGLLYVVGGDDGTSNLASVEVYCPDSDSWRILPALMTIGRSYAGV
                     CMIDKPMXMEEQGALARQAASLAIALLDDENSQAEGTMEGAIGGAIYGNLAPAGGAAA
                     AAAPAAPAQAPQPNHPHYENIYAPIGQPSNNNNNSGSNSNQAAAIANANAPANAEEIQ
                     QQQQPAPTEPNANNNPQPPTAAAPAPSQQQQQQQAQPQQPQRILPMNNYRNDLYDRSA
                     AGGVCSAYDVPRAVRSGLGYRRNFRIDMQNGNRCGSGLRCTPLYTNSRSNCQRQRSFD
                     DTESTDGYNLPYAGAGTMRYENIYEQIRDEPLYRTSAANRVPLYTRLDVLGHGIGRIE
                     RHLSSSCGNIDHYNLGGHYAVLGHSHFGTVGHIRLNANGSGVAAPGVAGTGTCNVPNC
                     QGYMTAAGSTVPVEYANVKVPVKNSASSFFSCLHGENSQSMTNIYKTSGTAAAMAAHN
                     SPLTPNVSMERASRSASAGAAGSAAAAVEEHSAADSIPSSSNINANRTTGAIPKVKTA
                     NKPAKESGGSSTAASPILDKTTSTGSGKSVTLAKKTSTAAARSSSSGDTNGNGTLNRI
                     SKSSLQWLLVNKWLPLWIGQGPDCKVIDFNFMFSRDCVSCDTASVASQMSNPYGTPRL
                     SGLPQDMVRFQSSCAGACAAAGAASTIRRDANASARPLHSTLSRLRNGEKRNPNRVAG
                     NYQYEDPSYENVHVQWQNGFEFGRSRDYDPNSTYHQQRPLLQRARSESPTFSNQQRRL
                     QRQGAQAQQQSQQPKPPGSPDPYKNYKLNADNNTFKPKPIAADELEGAVGGAVAEIAL
                     PEVDIEVVDPVSLSDNETETTSSQNNLPSTTNSNNLNEHND"
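
If we do decide to cope with it, one possible approach is to treat an unquoted
qualifier as unfinished while its parentheses are unbalanced. The following is
only a rough sketch of that idea, not code from the current GenBank parser or
from my patch on bug 1747. The function names (is_complete,
join_feature_qualifiers) are made up for illustration, and it assumes the
leading feature-table indentation has already been stripped from each line.
Continuation lines are joined with no separator, which suits /transl_except
and /translation but would not be right for free text such as /note.

```python
def is_complete(qualifier):
    """Return True if the qualifier text looks finished.

    Hypothetical helper: quoted values are complete once the number of
    double quotes is even (GenBank escapes a quote by doubling it);
    unquoted values are complete once parentheses balance.
    """
    key, sep, value = qualifier.partition("=")
    if not sep:
        # Flag-style qualifier with no value, e.g. /pseudo
        return True
    if value.startswith('"'):
        return value.count('"') % 2 == 0
    return value.count("(") == value.count(")")


def join_feature_qualifiers(lines):
    """Merge continuation lines so each qualifier is a single string."""
    qualifiers = []
    current = None
    for line in lines:
        text = line.strip()
        if not text:
            continue
        # Start a new qualifier or extend the unfinished one.
        current = text if current is None else current + text
        if is_complete(current):
            qualifiers.append(current)
            current = None
    if current is not None:
        # Keep a trailing unfinished qualifier rather than dropping it.
        qualifiers.append(current)
    return qualifiers


if __name__ == "__main__":
    snippet = [
        "/codon_start=1",
        "/transl_except=(pos:complement(18111697..18111699),",
        "aa:OTHER)",
        '/protein_id="NP_476589.4"',
    ]
    for qualifier in join_feature_qualifiers(snippet):
        print(qualifier)
```

With the lines from the snippet above, the two /transl_except lines should come
out as one qualifier, and the quoted entries are left as they are.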



