[Bioperl-l] Re: bl2seq reader chokes when no results (PR#927)

Brad Chapman chapmanb@arches.uga.edu
Tue, 24 Apr 2001 21:05:37 -0400


Hilmar:
> Hmm. Very true. I didn't expect them to be so crappy. 

:-). Yeah, when you put junk into BLAST it likes to give you junk back.

> So, this appears
> to boil down to 1) recognize by a parsing error that the program
> producing the file failed (crashed), and 2) start to interpret that
> failure of the program. Sounds like fun to add this doesn't it.

Lots of fun...
 
> Does this mean that your ErrorParser in fact does such an
> interpretation?

Yes, you are exactly on target with how it was implemented. BTW, the
class is actually called BlastErrorParser and is found in
Bio/Blast/NCBIStandalone.py, if you want to look at the Python
code.

It was implemented in Python using exceptions -- the parser will
raise a SyntaxError whenever it hits something it can't parse. So, we
save the BLAST report before parsing, parse it, and then catch any
SyntaxErrors that occur. If we get a SyntaxError, then we try to
analyze the saved report to see if we know what kind of error it
is. We then either raise a specific error if we recognize the
problem, or raise the SyntaxError again and let the code calling the
parser deal with it (callers are free to catch the errors and just go
on to parse the next record if they want).
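
Here is a rough sketch of that save / parse / catch / diagnose flow, in
case it helps to see the shape of it. This is not the actual
BlastErrorParser code: strict_parse, diagnose, UnusableQueryError and
the "QUERY REJECTED" marker are all made up for illustration, and the
"reports" are toy strings rather than real BLAST output.

```python
class UnusableQueryError(Exception):
    """Hypothetical specific error for a failure we recognize."""


def strict_parse(report_text):
    # Stand-in for the real parser: SyntaxError on anything unexpected.
    if not report_text.startswith("BLASTN"):
        raise SyntaxError("unexpected start of report")
    if "Sequences producing significant alignments" not in report_text:
        raise SyntaxError("no hit table found")
    return {"program": "BLASTN"}


def diagnose(report_text):
    # Inspect the saved text for a known failure pattern.
    # "QUERY REJECTED" is an invented marker, not real BLAST output.
    if "QUERY REJECTED" in report_text:
        return UnusableQueryError("query was rejected by the program")
    return None


def parse_with_diagnosis(report_text):
    # report_text is the saved copy, kept so it can be re-read on failure.
    try:
        return strict_parse(report_text)
    except SyntaxError:
        specific = diagnose(report_text)
        if specific is not None:
            raise specific  # recognized problem: raise the specific error
        raise  # unrecognized problem: re-raise the original SyntaxError


# Calling code can catch the errors and move on to the next record.
reports = [
    "BLASTN 2.1\nSequences producing significant alignments\n",
    "BLASTN 2.1\nQUERY REJECTED\n",
    "garbage",
]
results = []
for text in reports:
    try:
        results.append(parse_with_diagnosis(text))
    except (UnusableQueryError, SyntaxError) as err:
        results.append(type(err).__name__)
```

The strict parser stays untouched; all the guessing about what went
wrong lives in the wrapper, which is why the report has to be saved
first.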

Anyways, it is kind of ugly, since it is not guaranteed that BLAST
will continue to produce similar junk in different revisions, but it's
better than nothing. It also introduces a little extra overhead since we
save each BLAST report before parsing it, but I couldn't see a way
around that without messing with Jeff's pretty parsing code (and he
didn't want that :-).

Brad