[BioPython] Parsing blast.out

Ravinder Singh Ravinder.Singh@colorado.edu
Tue, 14 May 2002 13:05:21 -0600


Hi,
I'm trying to parse a BLAST output file and have tried it both ways, i.e.
saving the output to a file and then opening a file handle on it, or
wrapping the output in a cStringIO.StringIO handle.
Either way I get the following error. Any help would be appreciated. Many thanks,
Ravinder
*******
------------------------------------------------------------
SyntaxError: Expected blank line, but got:
           1,221,820 sequences; 5,507,506,871 total letters

--------------------------------------------------------------
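
The error message quotes only the line the parser rejected. To see what
surrounds that line in the saved output, here is a minimal sketch in plain
Python (no Biopython). The helper names lines_around and show_context and
the window size are illustrative assumptions, not part of any library.

```python
def lines_around(lines, needle, context=2):
    """Return (line_number, text) pairs near the first line containing needle.

    Line numbers are 1-based. Returns an empty list if needle is not found.
    """
    for index, line in enumerate(lines):
        if needle in line:
            start = max(0, index - context)
            stop = min(len(lines), index + context + 1)
            return [(n + 1, lines[n].rstrip('\n')) for n in range(start, stop)]
    return []


def show_context(path, needle, context=2):
    """Print the lines of the file at path that surround needle."""
    handle = open(path, 'r')
    try:
        lines = handle.readlines()
    finally:
        handle.close()
    # repr() makes blank lines and stray whitespace visible.
    for number, text in lines_around(lines, needle, context):
        print('%5d: %r' % (number, text))
```

Calling show_context('my_blast.out', 'total letters') would print the
rejected line together with the two lines on either side of it.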
I know that the BLAST search itself works, since it writes the BLAST output
to a file. It gets stuck at the parsing. The problem occurs when I generate
the b_record, using either handle. If I comment out the b_record1 line it
prints neither C nor D; however, if I comment out the b_record2 line it
prints C but not D:

b_record1 = blast_parser.parse(b_results)
print 'C'
b_record2 = blast_parser.parse(string_result_handle)
print 'D'
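
Rather than commenting lines in and out, each parse call could be wrapped so
that a failure is reported and the script carries on to the next handle. This
is a minimal sketch, assuming only that the parser raises an exception on
failure. try_parse is a hypothetical helper, and parse stands for any
callable such as blast_parser.parse.

```python
import traceback


def try_parse(label, parse, handle):
    """Call parse(handle); return the result, or None if it raised.

    On failure the label and the full traceback are printed, so the
    failing handle can be identified without editing the script.
    """
    try:
        return parse(handle)
    except Exception:
        print('%s: parse failed' % label)
        traceback.print_exc()
        return None
```

For example, try_parse('file handle', blast_parser.parse, b_results) followed
by the same call for string_result_handle would show whether both handles
fail at the same place.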

****************
If needed, my code is:
----------------------------------------------------------------
#! /usr/local/bin/python

import cStringIO

from Bio import Fasta
from Bio.Blast import NCBIWWW

# Read the first FASTA record from the query file.
file_for_blast = open('m_cold.fasta', 'r')
f_iterator = Fasta.Iterator(file_for_blast)
f_record = f_iterator.next()

# Run blastn against nr at NCBI.
b_results = NCBIWWW.blast('blastn', 'nr', f_record)

# Save the raw output to a file.
blast_results = b_results.read()
save_file = open('my_blast.out', 'w')
save_file.write(blast_results)
save_file.close()

# First handle: the same text wrapped in an in-memory handle.
string_result_handle = cStringIO.StringIO(blast_results)

# Second handle: the saved file, reopened for reading.
b_results = open('my_blast.out', 'r')

print 'A'
blast_parser = NCBIWWW.BlastParser()
print 'B'

# Parse from the file handle.
b_record = blast_parser.parse(b_results)
print 'C'

# Parse from the string handle.
b_record = blast_parser.parse(string_result_handle)
print 'D'
*******************
I'd like to do all of the following if and when the above code works.
E_VALUE_THRESH = 0.04

for alignment in b_record.alignments:
    for hsp in alignment.hsps:
        if hsp.expect < E_VALUE_THRESH:
            print '****Alignment****'
            print 'sequence:', alignment.title
            print 'length:', alignment.length
            print 'e value:', hsp.expect
            print hsp.query[0:75] + '...'
            print hsp.match[0:75] + '...'
            print hsp.sbjct[0:75] + '...'
--
********************************************************************************

Dr. Ravinder Singh
Assistant Professor
MCD Biology
347 UCB
University of Colorado
Boulder, CO 80309-0347

(303)492-8886 (voice)
(303)492-7744 (fax)