[Biopython] Entrez.efetch
Michiel de Hoon
mjldehoon at yahoo.com
Fri Feb 26 03:47:16 UTC 2010
> ##Try on E. coli genome:
> parseGenome("CP000819.1")
> ##Try on Drosophila chromosome 4
> parseGenome("NC_004353.3")
> ##Try on Drosophila X chromosome
> parseGenome("NC_004354")
Have you tried "NC_004354.3" instead of "NC_004354"?
--Michiel.
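
A minimal sketch of the check suggested above. Before parsing, it confirms that an accession carries a version suffix and that the downloaded GenBank text ends with the "//" record terminator. The thread does not establish that the missing version is the cause of the error, so treat this as a diagnostic aid only. The helper names (has_version, looks_complete, fetch_genbank_text), the db="nucleotide" setting, and the email parameter are assumptions for illustration and are not taken from the original script.

```python
import re


def has_version(accession):
    """Return True if the accession ends in a '.N' version suffix."""
    return re.fullmatch(r"[A-Za-z_]+\d+\.\d+", accession) is not None


def looks_complete(genbank_text):
    """Return True if the last non-blank line is the '//' record terminator."""
    lines = [line for line in genbank_text.splitlines() if line.strip()]
    return bool(lines) and lines[-1].strip() == "//"


def fetch_genbank_text(accession, email):
    """Download one GenBank record as text (needs Biopython and network access).

    Assumption: db="nucleotide" is used here; the original post used db="genome".
    """
    from Bio import Entrez  # imported lazily so the helpers above work without it

    Entrez.email = email
    handle = Entrez.efetch(db="nucleotide", id=accession,
                           rettype="gb", retmode="text")
    try:
        return handle.read()
    finally:
        handle.close()
```

A truncated download fails looks_complete() before the parser ever sees it, which helps separate a bad download from a parser problem.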
--- On Thu, 2/25/10, Rohan Maddamsetti <rohan.maddamsetti at gmail.com> wrote:
> From: Rohan Maddamsetti <rohan.maddamsetti at gmail.com>
> Subject: [Biopython] Entrez.efetch
> To: biopython at lists.open-bio.org
> Date: Thursday, February 25, 2010, 9:33 PM
> Hello,
>
> I'm new to Biopython (installed yesterday), so please bear with me. This
> problem is similar to one sent to the list on Wed, Oct 8, 2008, with the
> same subject line as this email, by a Stephan. Interestingly, though, my
> code works in a couple of cases (including the chromosome input used by
> Stephan), but not in a third. I wrote the following simple function.
>
> def parseGenome(genbank_id):
>     handle = Entrez.efetch(db="genome",rettype="gb",id=genbank_id)
>     for seq_record in SeqIO.parse(handle,"gb"):
>         print "%s with %i features" % (seq_record.id, len(seq_record.features))
>     handle.close()
>
> ##Try on E. coli genome:
> parseGenome("CP000819.1")
> ##Try on Drosophila chromosome 4
> parseGenome("NC_004353.3")
> ##Try on Drosophila X chromosome
> parseGenome("NC_004354")
>
> And this is the output I get:
>
> CP000819.1 with 8759 features
> NC_004353.3 with 1191 features
> Traceback (most recent call last):
>   File "BiasCalc.py", line 48, in <module>
>     parseGenome("NC_004354")
>   File "BiasCalc.py", line 38, in parseGenome
>     for seq_record in SeqIO.parse(handle,"gb"):
>   File "/Library/Frameworks/Python.framework/Versions/6.0.4/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 420, in parse_records
>     record = self.parse(handle, do_features)
>   File "/Library/Frameworks/Python.framework/Versions/6.0.4/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 403, in parse
>     if self.feed(handle, consumer, do_features):
>   File "/Library/Frameworks/Python.framework/Versions/6.0.4/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 380, in feed
>     misc_lines, sequence_string = self.parse_footer()
>   File "/Library/Frameworks/Python.framework/Versions/6.0.4/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 762, in parse_footer
>     raise ValueError("Premature end of file in sequence data")
> ValueError: Premature end of file in sequence data
>
> Is this a bug, or am I doing something wrong? My eventual goal is to
> iterate through the features in the seq_record, and collect GC content
> statistics for the coding regions and introns.
>
> Thanks,
> Rohan
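
For the stated goal of collecting GC content statistics per region, here is a rough sketch in plain Python. It assumes the sequence is available as a string and that each region is a 0-based, half-open (start, end) pair on the forward strand. The function names gc_fraction and gc_by_region are hypothetical. With Biopython, the subsequence for a feature can instead be obtained with feature.extract(seq_record.seq), which also handles strand and joined locations.

```python
def gc_fraction(sequence):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        # No unambiguous bases (e.g. an all-N stretch): report 0.0
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def gc_by_region(sequence, regions):
    """GC fraction for each (start, end) slice; 0-based, end-exclusive."""
    return [gc_fraction(sequence[start:end]) for start, end in regions]
```

Ambiguous bases such as N are left out of the denominator here. That is a design choice and not the only reasonable one.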