[Biopython-dev] Blast

Michiel De Hoon mdehoon at c2b2.columbia.edu
Thu Sep 29 18:43:41 EDT 2005


> On Thu, 2005-09-29 at 13:46 -0400, Michiel De Hoon wrote:
> > Hi everybody,
> > 
> > Recently there have been some problems with the Blast parser in
> > Biopython, to the degree that the example in 3.1.2 in the tutorial
> > does not work as advertised. The problem, of course, is that the NCBI
> > file format as returned by a www blast run keeps changing, so we are
> > condemned to keep fixing our parser to keep up with NCBI.
> > To my surprise, the parser in Blast.NCBIWWW tries to parse HTML output
> > instead of text output. My guess is that the HTML output changes more
> > often and is more difficult to parse than text output. So isn't it
> > possible to make NCBIWWW.qblast return text output instead of HTML and
> > parse that instead?
> > So my question is, why was the choice made to parse HTML instead of
> > text? Is it simply because blast-on-the-web couldn't return text
> > output in the past?
> > 
>
> I'd guess many people still want to really *look* at the output in their
> browser, which is just more comfortable with html, not to mention the
> possibility of clicking on the links, etc.

Then the easiest solution might be to add a keyword argument to qblast that
specifies whether HTML or text output is desired (defaulting to text), and to
parse the text output with the text parser in NCBIStandalone. Then we only
need to maintain the text-based parser (which is easier to maintain anyway),
and users who want HTML output can still get it.
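
As a rough illustration of the idea, here is a minimal sketch of how such a
keyword argument could be threaded through to the request. It is not the
actual NCBIWWW.qblast code: the function name build_result_query and the
argument name format_type are hypothetical, and the CMD, RID and FORMAT_TYPE
parameter names are my understanding of the NCBI URL interface, so please
check them against the NCBI documentation.

```python
from urllib.parse import urlencode

# Output formats this sketch knows about. "Text" is the proposed default;
# "HTML" stays available for users who want to view results in a browser.
ALLOWED_FORMATS = ("Text", "HTML")


def build_result_query(rid, format_type="Text"):
    """Build the query string that asks for the results of a finished search.

    Hypothetical helper for illustration only, not part of Biopython.
    rid is the request ID returned when the search was submitted.
    """
    if format_type not in ALLOWED_FORMATS:
        raise ValueError(
            "format_type must be one of %s, got %r"
            % (", ".join(ALLOWED_FORMATS), format_type)
        )
    # A list of pairs keeps the parameter order stable in the output.
    params = [
        ("CMD", "Get"),
        ("RID", rid),
        ("FORMAT_TYPE", format_type),  # assumed NCBI parameter name
    ]
    return urlencode(params)


def can_use_text_parser(format_type):
    """Return True when the output should go to the text-based parser."""
    return format_type == "Text"
```

With something like this, a caller who does nothing special gets text output
that the text parser can handle, while build_result_query(rid,
format_type="HTML") would still return the browser-friendly version.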

--Michiel.


