[Biopython-dev] Blast

Jeffrey Chang jeffrey.chang at duke.edu
Thu Sep 29 22:16:04 EDT 2005


On Sep 29, 2005, at 1:46 PM, Michiel De Hoon wrote:

> To my surprise, the parser in Blast.NCBIWWW tries to parse HTML output
> instead of text output. My guess is that the HTML output changes more
> often and is more difficult to parse than text output. So isn't it
> possible to make NCBIWWW.qblast return text output instead of HTML and
> parse that instead? So my question is, why was the choice made to parse
> HTML instead of text? Is it simply because blast-on-the-web couldn't
> return text output in the past?

You are right.  It was done that way back when the only way to use
NCBI's BLAST was through its HTML output.  (Actually, there was a
version that you could access through a proprietary non-HTTP
protocol, but its databases were not updated as frequently.)  Now
that we can get text, perhaps it is time to encourage users to use
the text output instead.  I believe the HTML parser is a few versions
behind now and can no longer parse current BLAST output.
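
As a rough illustration of asking the server for text rather than HTML,
here is a minimal sketch that only builds the result-retrieval URL and
does not contact NCBI.  It assumes the public NCBI BLAST URL API, where
the FORMAT_TYPE parameter selects the output format.  The helper name
build_result_url and the request ID are hypothetical, and this is not
how NCBIWWW.qblast is implemented.

```python
from urllib.parse import urlencode

# Assumption: the public NCBI BLAST URL API endpoint.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_result_url(rid, format_type="Text"):
    """Return the URL that fetches results for request ID `rid`.

    `build_result_url` is a hypothetical helper for illustration only.
    FORMAT_TYPE="Text" asks for plain-text output instead of HTML.
    """
    params = {"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type}
    return BLAST_URL + "?" + urlencode(params)


# "EXAMPLE-RID" is a placeholder, not a real request ID.
print(build_result_url("EXAMPLE-RID"))
```

Passing format_type="HTML" to the same helper would request the HTML
report, so switching formats is a one-parameter change on the client
side; the harder part is keeping a parser in step with whichever format
is chosen.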

Jeff

