[BioPython] qblast
Peter
biopython at maubp.freeserve.co.uk
Sat Aug 18 21:38:52 UTC 2007
Hi Michael,
Here is a short script based on your example, which I have tested for
calling qblast:
from Bio.Blast.NCBIWWW import qblast
seq_string = "TGTGATGGATATCTGCAGAATTCGCCCTTTAAACTTCAGGGTGACCAAAA" \
+ "AATCAAAATAAATGTTGAAATAATACTGGATCTCCACCACCACTAACTTC" \
+ "AAAAAATGTTGTATTAAAATTTCTATCAGTTAATAACATTGTTATAGCAC" \
+ "CCCCTAATACTGGTAATGATAATAATAATAATCATGCTGTTATAAATACA" \
+ "GCTCAAACAAATAAAGGTAACTTAAACATACTCATACCAGGTGTTCGCAT" \
+ "ATTAATAACAGTAACAATAAAATTTATTGAACCTAATATTGATGATATAC" \
+ "CAGCTAAATGTAAACTAAATATTGCACATTCTATTGAACCTCCTGAATGT" \
+ "GAAAATATACCAGATAATGGTGGATAAACAGTTCAACCTGTACCTGCCCC" \
+ "CATCTCGACTACAGATGATCAAATTAATAAAAAAAATGATGGTACTAATA" \
+ "ATCAAAAACTTATATTATTTAATCTTGGGAATGCCATATCAGGAGCTCCT" \
+ "ATCATTAAAGGTAAAAATCAATTACCAAAACCACCCATTAATGCAGGCAT" \
+ "AACCATAAAAAATATCATTATTAAAGCATGTGCTGTTATTAACACATTAT" \
+ "ATGCTTGATGATTGTAATTTAATATTACTGCACCAGCATCTGATAATTCT" \
+ "ATACGTATTAATATAGATCAAAATGTTCCTATTAAACCTGCTAAAAATGC" \
+ "AAATATTAAATATAATGTTCCAATATCTTTATGATTTGTTGACCAAGGGC" \
+ "GAATTCCAGCACACTGGCGGCCGTTACTAG"
#result_handle = qblast('blastn', 'nr', seq_string, format_type='HTML')
#output_handle = open("test.html", "w")
#output_handle.write(result_handle.read())
#output_handle.close()
result_handle = qblast('blastn', 'nr', seq_string, format_type='Text')
output_handle = open("test.txt", "w")
output_handle.write(result_handle.read())
output_handle.close()
#result_handle = qblast('blastn', 'nr', seq_string, format_type='XML')
#output_handle = open("test.xml", "w")
#output_handle.write(result_handle.read())
#output_handle.close()
print("Done")
The top hits from the script were:
gb|AY916130.1| Epidermophyton floccosum mitochondrion, complete
gb|EF180206.1| Penicillium confertum voucher 171.87 cytochrom...
gb|EF180399.1| Penicillium soppii voucher IBT 14908 cytochrom...
gb|EF180398.1| Penicillium soppii voucher IBT 3331 cytochrome...
gb|EF180397.1| Penicillium soppii voucher IBT 18220 cytochrom...
gb|EF180396.1| Penicillium soppii voucher 226.28 cytochrome o...
gb|EF180395.1| Penicillium soppii voucher 144.83 cytochrome o...
The top hits for me using the online blastn web page for the same
sequence, also against the nr database, were:
gb|AY129164.1| Pythium aphanidermatum cytochrome oxidase I ge...
gb|AY561976.1| Scopalina ruetzleri cytochrome oxidase subunit...
gb|EF468468.1| Phytophthora sp. H-6/02 cytochrome oxidase sub...
gb|DQ832717.1| Phytophthora sojae mitochondrion, complete genome
gb|EF468470.1| Phytophthora sp. H-8/02 cytochrome oxidase sub...
gb|EF468469.1| Phytophthora sp. H-7/02 cytochrome oxidase sub...
gb|AY129166.1| Phytophthora capsici cytochrome oxidase I gene...
i.e. Very different!
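To make the comparison concrete, here is a minimal sketch that pulls the accessions out of the two hit lists above and checks for overlap. The helper name `accession` is only illustrative, and the descriptions are shortened because only the accession field matters here.

```python
# Hit lines copied from the two lists above (descriptions shortened).
script_hits = [
    "gb|AY916130.1| Epidermophyton floccosum mitochondrion",
    "gb|EF180206.1| Penicillium confertum voucher 171.87",
    "gb|EF180399.1| Penicillium soppii voucher IBT 14908",
    "gb|EF180398.1| Penicillium soppii voucher IBT 3331",
    "gb|EF180397.1| Penicillium soppii voucher IBT 18220",
    "gb|EF180396.1| Penicillium soppii voucher 226.28",
    "gb|EF180395.1| Penicillium soppii voucher 144.83",
]
web_hits = [
    "gb|AY129164.1| Pythium aphanidermatum cytochrome oxidase I",
    "gb|AY561976.1| Scopalina ruetzleri cytochrome oxidase",
    "gb|EF468468.1| Phytophthora sp. H-6/02 cytochrome oxidase",
    "gb|DQ832717.1| Phytophthora sojae mitochondrion",
    "gb|EF468470.1| Phytophthora sp. H-8/02 cytochrome oxidase",
    "gb|EF468469.1| Phytophthora sp. H-7/02 cytochrome oxidase",
    "gb|AY129166.1| Phytophthora capsici cytochrome oxidase I",
]


def accession(hit_line):
    """Return the accession from a 'gb|ACCESSION| description' line."""
    return hit_line.split("|")[1]


script_ids = set(accession(line) for line in script_hits)
web_ids = set(accession(line) for line in web_hits)
shared = sorted(script_ids & web_ids)

print("Shared top hits: %s" % (shared if shared else "none"))
```

Only the top seven hits from each run are listed here, so this says nothing about overlap further down the two result lists.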
I switched to using plain text output as it's easier to read by hand.
Both correctly understood the input query was 780 letters long.
Both claimed to be output from BLASTN 2.2.17.
Both claimed to be output from the same database.
There were some differences in the parameters footer (the Lambda, K and H
values, and the match score in the blastn matrix line) - but I'm not sure
why. Using the script:
Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,
environmental samples or phase 0, 1 or 2 HTGS sequences)
Posted date: Aug 16, 2007 6:06 PM
Number of letters in database: -51,729,944
Number of sequences in database: 5,751,035
Lambda K H
1.37 0.711 1.31
Gapped
Lambda K H
1.37 0.711 1.31
Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
...
While using the web browser:
Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,
environmental samples or phase 0, 1 or 2 HTGS sequences)
Posted date: Aug 16, 2007 6:06 PM
Number of letters in database: -51,729,944
Number of sequences in database: 5,751,035
Lambda K H
0.634 0.408 0.912
Gapped
Lambda K H
0.634 0.408 0.912
Matrix: blastn matrix:2 -3
Gap Penalties: Existence: 5, Extension: 2
...
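The two footers can also be compared mechanically. This is a minimal sketch that assumes both footers have the same number of lines in the same order (true for the excerpts above). The function name `differing_lines` is only illustrative, and just the statistics and scoring lines are copied in.

```python
# Statistics and scoring lines copied from the two footers above.
script_footer = """\
Lambda K H
1.37 0.711 1.31
Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
"""

web_footer = """\
Lambda K H
0.634 0.408 0.912
Matrix: blastn matrix:2 -3
Gap Penalties: Existence: 5, Extension: 2
"""


def differing_lines(first, second):
    """Pair up the lines of two texts and keep only the pairs that differ.

    Assumes both texts have the same number of lines in the same order.
    """
    pairs = zip(first.splitlines(), second.splitlines())
    return [(a, b) for a, b in pairs if a != b]


diffs = differing_lines(script_footer, web_footer)
for script_line, web_line in diffs:
    print("script: %s" % script_line)
    print("web:    %s" % web_line)
```

On these excerpts the only differing lines are the Lambda/K/H values and the blastn matrix line (1 -3 from the script versus 2 -3 from the web page); the gap penalties are identical.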
There is something funny here... does this throw any light on things?
Peter