[Bioperl-l] Re: bl2seq reader chokes when no results (PR#927)
Ann Loraine
loraine@loraine.net
Tue, 24 Apr 2001 21:10:11 -0700 (PDT)
The trick I think is to distinguish between two possibilities:
1. bl2seq ran but exited prematurely due to some kind of error.
(no_alignment.1)
2. bl2seq ran without error but couldn't align the two sequences because
they aren't homologous (no_alignment.2)
In case 2, I would want the parser to report that the program ran but
failed to produce an alignment. In case 1, I would be happy if the parser
threw an exception and died. Currently it throws an exception and dies
in both cases.
I think it's possible to distinguish case 1 from case 2 by examining the
output -- in case 2, the program produces statistics such as
"effective HSP length" and so on. (see alignment.2 uploaded earlier)
The parser could search for these strings and then throw an exception if
it fails to find them in the output.
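
A rough Python sketch of that check, to make the idea concrete. Only the "effective HSP length" string comes from the output described above. The names Bl2seqFailed and classify_bl2seq_report are made up for illustration, and using a "Score =" line as the sign of an alignment is an assumption about the text output that may not hold for every BLAST version.

```python
class Bl2seqFailed(Exception):
    """Hypothetical error: the report lacks the trailing statistics."""


# "effective HSP length" is the marker named in the message; the exact
# wording and capitalization may differ between BLAST versions.
STATS_MARKERS = ("effective hsp length",)

# Assumption: an alignment (HSP) in text output contains a "Score =" line.
HSP_MARKER = "score ="


def classify_bl2seq_report(text):
    """Return "aligned" or "no_alignment"; raise Bl2seqFailed for case 1."""
    lowered = text.lower()
    if not any(marker in lowered for marker in STATS_MARKERS):
        # Case 1: no statistics at all, so the run probably ended early.
        raise Bl2seqFailed("no statistics found; bl2seq may have exited early")
    if HSP_MARKER in lowered:
        return "aligned"
    # Case 2: the program finished but found nothing to align.
    return "no_alignment"
```

With this, a caller can treat "no_alignment" as an ordinary result and reserve the exception for reports that look incomplete.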
-Ann
---
Ann E. Loraine
http://www.loraine.net
On Tue, 24 Apr 2001, Brad Chapman wrote:
> Hilmar:
> > Hmm. Very true. I didn't expect them to be so crappy.
>
> :-). Yeah, when you put junk into BLAST it likes to give you junk back.
>
> > So, this appears
> > to boil down to 1) recognize by a parsing error that the program
> > producing the file failed (crashed), and 2) start to interpret that
> > failure of the program. Sounds like fun to add this, doesn't it?
>
> Lots of fun...
>
> > Does this mean that your ErrorParser in fact does such an
> > interpretation?
>
> Yes, you are exactly on target with how it was implemented. BTW, the
> class is actually called BlastErrorParser and is found in
> Bio/Blast/NCBIStandalone.py, if you want to look at the python
> code.
>
> It was implemented in python using exceptions -- the parser will
> raise a SyntaxError whenever it hits something it can't parse. So, we
> save the BLAST report before parsing, parse it, and then catch any
> SyntaxErrors that occur. If we get a SyntaxError, then we try to
> analyze the bad report to see if we know what kind of error it
> is. Eventually we either end up raising a specific error if we
> recognize the problem, or raise the SyntaxError again and let the code
> calling the parser deal with it (they are free to catch the errors,
> and just go on to parse the next record if they want).
>
> Anyways, it is kind of ugly, since it is not guaranteed that Blast
> will continue to produce similar junk in different revisions, but it's
> better than nothing. It also introduces a little extra overhead since we
> save each BLAST report before parsing it, but I couldn't see a way
> around that without messing with Jeff's pretty parsing code (and he
> didn't want that :-).
>
> Brad
>
>
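
For reference, the flow Brad describes (save the report, parse it, catch the SyntaxError, then either raise something more specific or re-raise) could look roughly like the sketch below. This is not the BlastErrorParser code from Bio/Blast/NCBIStandalone.py. The names strict_parse, diagnose, ReportTruncated and parse_with_diagnosis are hypothetical, and the checks inside them are stand-ins chosen only to make the example runnable.

```python
import io


class ReportTruncated(Exception):
    """Hypothetical specific error for a report that stops early."""


def strict_parse(report):
    """Stand-in for a strict parser: raises SyntaxError on unexpected input."""
    if not report.startswith("BLAST"):
        raise SyntaxError("unexpected start of report")
    if "effective HSP length" not in report:
        raise SyntaxError("report ended before the statistics")
    return {"length": len(report)}


def diagnose(report):
    """Return a specific exception for a known failure, else None."""
    if report.startswith("BLAST") and "effective HSP length" not in report:
        return ReportTruncated("output stops before the statistics")
    return None


def parse_with_diagnosis(handle):
    report = handle.read()  # keep the raw report so it can be inspected later
    try:
        return strict_parse(report)
    except SyntaxError:
        specific = diagnose(report)
        if specific is not None:
            raise specific from None  # a failure we recognize
        raise  # unknown problem: let the caller deal with the SyntaxError


if __name__ == "__main__":
    try:
        parse_with_diagnosis(io.StringIO("BLAST 2.1\n(cut off)"))
    except ReportTruncated as err:
        print("recognized failure:", err)
```

Reading the whole report into memory before parsing is the extra overhead mentioned above; it is what lets the diagnosis step look at the text again after the parser has given up.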