[Biopython-dev] GI number for NcbiblastpCommandline

Peter Cock p.j.a.cock at googlemail.com
Thu Feb 20 11:22:26 UTC 2014


On Wed, Feb 19, 2014 at 8:42 PM, Brown, Tom <Tom.Brown at enmu.edu> wrote:
> Currently using result_handle = NCBIWWW.qblast("blastp", "nr", blastGI)
> where blastGI = 113000 in the Biopython program and would like to convert
> it to a local blastp. Is there a way to specify the blastGI within
> NcbiblastpCommandline instead of having to provide a fasta file for blast?
> What are my options?
>
> from Bio.Blast.Applications import NcbiblastpCommandline
> blastp_cline = NcbiblastpCommandline(query="sh3.fasta", db="nr", evalue=0.001, outfmt=5, out="sh3.xml")
> stdout, stderr = blastp_cline()
>
> Thanks
>
> Tom

Hi Tom,

Sorry, but no: you will need to fetch the actual protein
sequence for GI:113000 yourself in order to give it to the
standalone blastp tool.

e.g. http://www.ncbi.nlm.nih.gov/protein/113000

One way would be to use Entrez Fetch (efetch) via Bio.Entrez.
Depending on your protein set, it might be simpler to download
the sequences via FTP. Your example is a yeast protein that is
also in UniProtKB, so it is also possible to fetch sequences via
their UniProt ID.
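
As a rough sketch of the efetch route (not tested against your setup):
fetch the record as FASTA, save it, and pass that file as the query.
The helper names, the output file names and the email address below
are placeholders I made up for illustration; the blastp settings are
copied from your example, and a local "nr" database is assumed.

```python
def fasta_path_for_gi(gi):
    """Return a local FASTA filename for a GI number (the naming is arbitrary)."""
    return "gi_%s.fasta" % gi


def fetch_protein_fasta(gi, email, out_path=None):
    """Download the protein record for `gi` as FASTA text and save it to disk."""
    # Imported here so the filename helper above works without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address with each request
    out_path = out_path or fasta_path_for_gi(gi)
    handle = Entrez.efetch(db="protein", id=str(gi),
                           rettype="fasta", retmode="text")
    try:
        with open(out_path, "w") as out_handle:
            out_handle.write(handle.read())
    finally:
        handle.close()
    return out_path


def run_local_blastp(query_path, xml_path, db="nr"):
    """Run standalone blastp with the same settings as the original example."""
    from Bio.Blast.Applications import NcbiblastpCommandline

    blastp_cline = NcbiblastpCommandline(query=query_path, db=db,
                                         evalue=0.001, outfmt=5, out=xml_path)
    return blastp_cline()  # returns (stdout, stderr)


def main():
    # Needs network access for efetch and a local "nr" BLAST database.
    query = fetch_protein_fasta(113000, "you@example.org")  # placeholder email
    stdout, stderr = run_local_blastp(query, "gi_113000.xml")
```

Calling main() would leave the fetched sequence in gi_113000.fasta and
the BLAST XML in gi_113000.xml. For many GIs it is probably kinder to
NCBI to fetch them in batches rather than one request each.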

Peter


