[Biojava-dev] Interesting BLAST 2.2.25+ XML behaviour

Peter Cock p.j.a.cock at googlemail.com
Tue May 3 09:24:08 UTC 2011


Hello all,

I've CC'd the BioPerl, BioRuby, BioJava and Biopython development mailing
lists to make sure you're aware of this, but can we continue any discussion
on the cross-project open-bio-l mailing list please?

I noticed that recent versions of BLAST+ no longer use a single <Iteration>
block for each query, which was the historical behaviour and is what the
Biopython BLAST XML parser assumes. This may be a bug in BLAST.
See the link in the forwarded message below for an example.

Has anyone else noticed this, and has it been reported to the NCBI yet?

Thanks,

Peter

(Not for the first time, I wish there was a public bug tracker for BLAST,
or at least a private bug tracker so we could talk about issues with an
NCBI assigned reference number.)

---------- Forwarded message ----------
From: Peter Cock <p.j.a.cock at googlemail.com>
Date: Wed, Apr 20, 2011 at 6:08 PM
Subject: Interesting BLAST 2.2.25+ XML behaviour
To: Biopython-Dev Mailing List <biopython-dev at biopython.org>


Hi all,

Have a look at this XML file from a FASTA vs FASTA search
using blastp from BLAST 2.2.25+ (current release), which
is a test file I created for the BLAST+ wrappers in Galaxy:

https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml

I just put it through the Biopython BLAST XML parser, and
was surprised not to get four records back (since as you
might guess from the filename, there were four queries).

It appears this version of BLAST+ is incrementing the
iteration counter for each match... or something like that.
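
A minimal sketch of one way to check for this, using only the Python
standard library rather than the Biopython parser: count how many
<Iteration> blocks share each <Iteration_query-def>. The embedded sample
XML and the function name are hypothetical, made up for illustration; it
is not the content of the linked Galaxy test file, and it only imitates
the suspected pattern of several iterations for one query.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed BLAST-style XML (NOT the real test file).
# query_A appears in two <Iteration> blocks, query_B in one.
SAMPLE = b"""<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_query-def>query_A</Iteration_query-def>
    </Iteration>
    <Iteration>
      <Iteration_iter-num>2</Iteration_iter-num>
      <Iteration_query-def>query_A</Iteration_query-def>
    </Iteration>
    <Iteration>
      <Iteration_iter-num>3</Iteration_iter-num>
      <Iteration_query-def>query_B</Iteration_query-def>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def count_iterations_per_query(handle):
    """Return {query definition: number of <Iteration> blocks seen}."""
    counts = {}
    # iterparse streams the file, so large BLAST outputs are not loaded whole.
    for _, elem in ET.iterparse(handle, events=("end",)):
        if elem.tag == "Iteration":
            query = elem.findtext("Iteration_query-def")
            counts[query] = counts.get(query, 0) + 1
            elem.clear()  # release the children of the finished block
    return counts


if __name__ == "__main__":
    for query, n in count_iterations_per_query(io.BytesIO(SAMPLE)).items():
        flag = "" if n == 1 else "  <-- more than one Iteration block"
        print(f"{query}: {n}{flag}")
```

If every query maps to exactly one block, a parser that assumes one
record per <Iteration> should return one record per query; any count
above one would point at the behaviour described here.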

Has anyone else noticed this? I wonder if it is accidental...

Peter



