[Biopython] Entrez EFetch Options
Joshua Klein
mobiusklein at gmail.com
Thu Jul 9 03:29:01 UTC 2015
If you store your list of identifiers in a file given by the variable
'file_path', with one identifier per line, you can use:
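ids_to_fetch = ",".join(line.strip() for line in open(file_path) if line.strip())
This code opens the file and uses the default iteration behavior of file
objects to yield one line at a time. Each line is stripped of its trailing
newline, and blank lines are skipped, before it is passed to the join method
of the string ",". Without the strip, every identifier would keep its
newline and the resulting string would be malformed. The result is one long
string of comma-separated identifiers to pass as the id argument in your
efetch call.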
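If the list is very long, it may help to split it into smaller groups
rather than sending everything in one request. The following is only a
sketch under assumptions: the helper names read_ids and batches are made up
for illustration, and the batch size of 200 is an arbitrary choice, not a
documented limit.

```python
def read_ids(file_path):
    """Return the non-empty, whitespace-stripped lines of a file as a list."""
    with open(file_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def batches(ids, size=200):
    """Yield comma-separated ID strings with at most `size` IDs in each.

    The default size of 200 is an assumption chosen for illustration.
    """
    for start in range(0, len(ids), size):
        yield ",".join(ids[start:start + size])
```

Each string yielded by batches(read_ids(file_path)) could then be passed as
the id argument of the same Entrez.efetch(db="nucleotide", rettype="gb",
retmode="text", ...) call used in the original script, and each returned
handle parsed with SeqIO.parse(handle, "gb") as before.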
On Wed, Jul 8, 2015 at 9:12 PM, Zach Gayk <zgayk at nmu.edu> wrote:
> Hello,
>
> I would like to use the following code from the biopython tutorial to
> retrieve gi numbers for a number of sequences that matched to scaffolds on
> a genome assembly:
>
> import os
> os.chdir('/Users/zachgayk/Desktop/GAVIABioinformatics/')
> from Bio import Entrez # this is the most likely script modified
> from Bio import SeqIO
> Entrez.email = "zgayk at nmu.edu"
> handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text", \
> id="gi|50254217|gb|`, gi|50254217|gb|AY567890.1|,
> gi|559028|gb|L33375.1|GVSMTDGI,
> gi|559028|gb|L33375.1|GVSMTDGI")
> for seq_record in SeqIO.parse(handle, "gb"):
>     # [:100] limits the number of characters; "..." marks the truncation
>     print seq_record.description[:100] + "..."
> handle.close()
>
> The problem, however, is that there are a large number of gi numbers I
> wish to retrieve, and so there are simply too many to manually enter into
> the id ="" field. What I would like to do is specify a file containing all
> of the needed gi numbers in a list and then have the code parse all of
> them. I haven't been able to figure out how to do this yet, and if anyone
> has any ideas they would be very much appreciated.
>
> Thank you,
> Zach Gayk