[Biopython] Legacy blastn XML outfile parsing is slow. What XML parser is actually used?

Tanya Golubchik golubchi at stats.ox.ac.uk
Thu Sep 27 11:35:35 UTC 2012


Thanks, Peter, that's good to know.

Cheers,
Tanya

On 25/09/12 17:00, Peter Cock wrote:
> On Tue, Sep 25, 2012 at 3:39 PM, Tanya Golubchik
> <golubchi at stats.ox.ac.uk> wrote:
>> Hello,
>>
>> Apologies for not having followed the entire discussion, but just wanted
>> to say that we're also using NCBIXML here and are likely to be
>> incorporating it in a new piece of software soon, so it would be really
>> unfortunate if some tags disappeared, were renamed or (even worse)
>> changed meaning in future releases.
>>
>> I'm a bit late coming in here so maybe this has been answered, but is
>> there a better parser that should be used at the moment? I was under the
>> impression that NCBIXML is the only one.
>>
>> Thanks,
>> Tanya
> 
> Hi Tanya,
> 
> I hope I can reassure you there is nothing to worry about :)
> 
> Right now there is only the NCBIXML parser, and we're not going
> to change it (except possibly to make it a little faster if people
> want to work on that).
> 
> We're planning to add a new module based on Bow's GSoC
> code, under the working name SearchIO, which would cover
> BLAST, BLAT, HMMER, etc. This would have a different API
> and in the long term would probably replace all of Bio.Blast.
> http://biopython.org/wiki/SearchIO
> 
> The discussion about possible changes has been (I think)
> only about this new code (and would have been better off
> on the development mailing list but this thread went off on
> a slight tangent).
> 
> Once 'SearchIO' is released, we'd want to encourage
> people to use that instead of NCBIXML, with a view to
> deprecating and eventually removing NCBIXML. See:
> http://biopython.org/wiki/Deprecation_policy
> 
> Regards,
> 
> Peter


