[Bioperl-l] Problems downloading and parsing GenBank records

Fields, Christopher J cjfields at illinois.edu
Wed Jun 21 02:53:59 UTC 2017


Hi Jon,

It looks like the script is trying to parse a bad GenBank record, one that was truncated by an error on the NCBI side (note the "External viewer error: Empty Response" text in the middle of the translation), and failing. That is probably a good thing if the record is faulty.
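
A rough sketch of how a download could be checked before it is handed to the parser, assuming the record is held as plain text. It is written in Python with the standard library only, purely for illustration. The function name is hypothetical, and the error marker is simply the string that appears in the log further down, not a documented NCBI format. The check relies on two properties of GenBank flat files: a record starts with a LOCUS line and ends with a "//" terminator line.

```python
def genbank_record_problems(text):
    """Return a list of reasons the flat-file text looks truncated.

    An empty list means none of these simple checks failed; it does not
    prove that the record is valid.
    """
    # Ignore blank lines so trailing newlines do not hide the terminator.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ["empty response"]

    problems = []
    # A GenBank flat file begins with a LOCUS line.
    if not lines[0].startswith("LOCUS"):
        problems.append("does not start with a LOCUS line")
    # Every complete record is closed by a line containing only "//".
    if lines[-1].strip() != "//":
        problems.append("missing '//' record terminator")
    # Assumption: this marker is copied from the log in this thread.
    if "Error: External viewer error" in text:
        problems.append("NCBI error text embedded in record")
    return problems
```

A download that returns a non-empty list could be fetched again, or skipped, instead of being passed to the parser.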

I noticed that the record for that protein is no longer valid (it has been discontinued); the genome was replaced with this one:

https://www.ncbi.nlm.nih.gov/genome/?term=txid1343740[Organism:noexp]

Was this an older cached record?

chris

From: Bioperl-l <bioperl-l-bounces+cjfields=illinois.edu at mailman.open-bio.org> on behalf of "Moller, Abraham" <mollera2 at miamioh.edu>
Date: Tuesday, June 20, 2017 at 7:24 PM
To: "bioperl-l at mailman.open-bio.org" <bioperl-l at mailman.open-bio.org>
Subject: [Bioperl-l] Problems downloading and parsing GenBank records

Hi all,

I have been using a script to parse GenBank files to find taxonomic information corresponding to bacterial genomes. After several tries, my script has failed with the following error:

...
Bacteria_Actinobacteria_Streptomycetales_Streptomycetaceae_Streptomyces_Streptomyces_sp._4F
Bacteria_Actinobacteria_Streptomycetales_Streptomycetaceae_Streptomyces_Streptomyces_glaucescens
--------------------- WARNING ---------------------
MSG: Unbalanced quote in:
/locus_tag="M271_25565"
/inference="COORDINATES: ab initio prediction:GeneMarkS+"
/note="Derived by automated computational analysis using
gene prediction method: GeneMarkS+."
/codon_start=1
/transl_table=11
/product="membrane protein"
/protein_id="YP_008791527.1"
/db_xref="GeneID:17596261"
/translation="MPSPTSLAPAGPTATPTRTTATARRLMAICGTLLAALLCALSVG
ANSASAHAALTSTDPADGSVVKTAPREVTLNFSEGVLLSGDSVRVLDPKGKRVDTGKT
AHVDGKSSTAAAGLHSGLPDG Error: External viewer error: Empty Response. Bytes read: 0 Status: TimeoutNo further qualifiers will be added for this feature
---------------------------------------------------

After this, the script seems to halt for hours at least, if not indefinitely...
Is this a BioPerl or GenBank issue? Any help would be appreciated.
Thanks,
Jon Moller

--
Abraham (Jon) Moller
Microbiology and Chemistry | 2016
Cell, Molecular, and Structural Biology (CMSB) BS/MS | Liang Bioinfo Lab
Microbiology Club President

