[BioPython] Can't download a FASTA file from NCBI to BLAST
Peter
biopython at maubp.freeserve.co.uk
Tue Jun 19 16:46:20 UTC 2007
Roger Barrette wrote:
> Hi again Peter,
>
> You are correct in your assumptions as to what I'm trying to accomplish. I
> have a habit of pulling random code from different places when I'm at a loss
> for how to do something, when I can't find documentation or examples.
If we start with part of your last attempt, you can see that this NCBI
dictionary just returns raw FASTA records as strings:
>>> from Bio import GenBank
>>> ncbi_dict = GenBank.NCBIDictionary("nucleotide","fasta")
>>> ncbi_dict["A0B5H8"]
'>gi|121693723|sp|A0B5H8|A0B5H8_9EURY TATA-box binding\nMESTINI...'
>>> print ncbi_dict["A0B5H8"]
>gi|121693723|sp|A0B5H8|A0B5H8_9EURY TATA-box binding
MESTINIENVVASTKLADEFDLVKIESELEGAEYNKEKFPGLVYRVKSPKAAFLIFTSGKVVCTGAKNVE
DVRTVITNMARTLKSIGFDNINLEPEIHVQNIVASADLKTDLNLNAIALGLGLENIEYEPEQFPGLVYRI
KQPKVVVLIFSSGKLVVTGGKSPEECEEGVRIVRQQLENLGLL
You can just write these directly to your file:
from Bio import GenBank
from Bio import SeqIO
acc_list = ["A0B5H8", "A0C5G2", "A0CM02", "A0CRU8"]
# Don't use any record parser, we just want the raw text
ncbi_dict = GenBank.NCBIDictionary("nucleotide","fasta")
fasta_file = open("c:\\Current_Query.fasta","w")
for acc in acc_list :
    # Each lookup returns one complete FASTA record as a string
    fasta_file.write(ncbi_dict[acc])
fasta_file.close()
This is very simple as there is no conversion between file formats - you
are asking the NCBI for fasta format records, and you save them to a
file as is.
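One thing to watch when saving raw text like this is that every record should end with a newline, otherwise the next ">" header line gets glued onto the end of the previous sequence. Here is a rough sketch of the same loop with that check added. Note that write_raw_fasta is a made-up helper name, and fake_ncbi_dict with its two short records is a hypothetical stand-in for ncbi_dict so the example runs without contacting the NCBI:

```python
import os
import tempfile


def write_raw_fasta(lookup, accessions, path):
    """Write the raw FASTA text for each accession to path.

    lookup is anything that maps an accession to a FASTA string.
    Returns the number of records written.
    """
    count = 0
    with open(path, "w") as handle:
        for acc in accessions:
            text = lookup[acc]
            # Make sure the record ends with a newline, so the next
            # ">" header starts on a line of its own.
            if not text.endswith("\n"):
                text += "\n"
            handle.write(text)
            count += 1
    return count


# Hypothetical stand-in for ncbi_dict (made-up records, no network needed).
fake_ncbi_dict = {
    "ACC1": ">ACC1 first example record\nMESTINI\n",
    "ACC2": ">ACC2 second example record\nMKQPKVV",  # no trailing newline
}

out_path = os.path.join(tempfile.mkdtemp(), "Current_Query.fasta")
written = write_raw_fasta(fake_ncbi_dict, ["ACC1", "ACC2"], out_path)
```

In your script you would pass the real ncbi_dict and acc_list in place of the stand-ins, along with whatever output filename you like.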
Another option (which I was suggesting in the previous email) is to have
the NCBIDictionary parse the data into SeqRecord objects (rather than
raw text) and then write those to your file, possibly using Bio.SeqIO.
Peter