[BioPython] plain txt blast output - xml instead
Peter
biopython at maubp.freeserve.co.uk
Thu Jun 15 17:30:18 UTC 2006
Rohini Damle wrote:
> Hi,
> I am using BioPython 1.41 on Windows. I have also updated
> NCBIStandalone.py from the link you gave. Here is my code.
>
> from Bio.Blast import NCBIStandalone
> from Bio.Blast import NCBIXML
> blast_out = open("4proteinblast.xml","r")
> b_iterator = NCBIStandalone.Iterator(blast_out, NCBIXML.BlastParser())
>
> for b_record in b_iterator :
>     query_name = b_record.query
>     print query_name
>     for alignment in b_record.alignments:
>         print '****Alignment****'
>         print 'sequence:', alignment.title
>
> This code gives "sequences producing significant alignments" for all 4
> proteins, but prints the query name as P1.
This code does the same thing, but prints less on screen so it's easier
to read:
from Bio.Blast import NCBIStandalone
from Bio.Blast import NCBIXML
blast_out = open("4proteinblast.xml","r")
b_iterator = NCBIStandalone.Iterator(blast_out, NCBIXML.BlastParser())
for b_record in b_iterator:
    query_name = b_record.query
    print query_name
    for alignment in b_record.alignments:
        print query_name, alignment.title.split()[0]
> I mean I am getting all the information I want, but I have 4 protein
> queries and this code is giving only P1 as a query (not P2, P3, P4,
> though it gives information about them). I am attaching the XML file of
> the 4 protein blast results. Thank you for your help.
Looking at the raw XML file by hand, I could only see references to P1,
the first protein.
If the file had results for all four proteins I would expect to see:
<?xml version="1.0"?>
... results for P1 ...
<?xml version="1.0"?>
... results for P2 ...
<?xml version="1.0"?>
... results for P3 ...
<?xml version="1.0"?>
... results for P4 ...
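
One quick way to check this without reading the file by eye is to count the XML declarations. The following is only a rough sketch: it assumes the output has the layout shown above (one "<?xml" declaration per query), and the function name and the sample string are made up for illustration.

```python
def count_xml_declarations(text):
    """Count how many '<?xml' declarations appear in the given text."""
    return text.count("<?xml")


# Hypothetical stand-in for the contents of a four-query output file,
# following the layout sketched above.
sample = (
    '<?xml version="1.0"?>\n... results for P1 ...\n'
    '<?xml version="1.0"?>\n... results for P2 ...\n'
    '<?xml version="1.0"?>\n... results for P3 ...\n'
    '<?xml version="1.0"?>\n... results for P4 ...\n'
)

print(count_xml_declarations(sample))
```

For the real file, the text could be read with open("4proteinblast.xml").read(). A count of 1 would suggest the file only holds results for a single query.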
Are you sure you gave Blast all four input sequences - and not just the
first sequence?
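
To double check the input side, you could count the records in the FASTA file you passed to Blast. This is a minimal sketch, assuming the input is plain FASTA where each record starts with a ">" header line. The function name and the sample records are hypothetical placeholders, not your real sequences.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


# Hypothetical placeholder input: four records with made-up sequences.
sample_fasta = [
    ">P1", "ACDEFGHIK",
    ">P2", "LMNPQRSTV",
    ">P3", "WYACDEFGH",
    ">P4", "IKLMNPQRS",
]

print(count_fasta_records(sample_fasta))
```

An open file handle can be passed in place of the list, since iterating over it yields lines in the same way.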
Peter