[Biopython-dev] Blast parsers and records

Peter biopython at maubp.freeserve.co.uk
Tue Jun 1 13:48:30 UTC 2010


On Mon, May 31, 2010 at 7:56 PM, Blanca Postigo Jose Miguel
<jblanca at btc.upv.es> wrote:
> Quoting Michael Sandford <sandford at ufl.edu>:
>
>> I've got a few comments as well:
>> > 4) The current Blast record stores its information in attributes. If you
>> use Bio.Entrez to parse Blast XML output (Biopython 1.54 contains the
>> necessary DTDs to do so), the information is stored in dictionaries. This has
>> some advantages. For example, it allows you to use record.keys() to find out
>> what the record contains. Ideally, I think that a Blast Record class should
>> inherit from a dictionary.
>
> I've developed for my own use a dict structure that represents a BLAST result.
> This structure can also represent many other results, like exonerate, SSAHA or
> any number of other aligners. Having a common representation for all of them
> allows you to create common filters that work with the same interface. I don't
> know if it is very efficient, but it has proven to be very convenient for us.
> You can take a look at:
>
> http://github.com/JoseBlanca/franklin/blob/master/franklin/alignment_search_result.py
>
> Best regards,
>
> Jose Blanca

It has some similarities to what I was imagining for a BioPerl-SearchIO-like
module. I'm still not convinced that we should just be using (subclasses of)
dictionaries - I would rather have important core properties like the hit
co-ordinates held explicitly as properties or attributes (and always using
Python counting, i.e. zero-based, not whatever a given file format uses, like
the one-based locations in BLAST output).
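
As a rough illustration of the idea, here is a minimal sketch of a record that
holds hit co-ordinates as explicit attributes in Python counting. The class
name HitSpan, its fields, and the from_one_based helper are all hypothetical
and not part of Biopython. The sketch also assumes the input co-ordinates are
one-based and inclusive, and it ignores minus-strand hits, where the start can
be larger than the end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HitSpan:
    """Hypothetical record for one aligned region (not a Biopython class)."""

    hit_id: str
    start: int  # zero-based, inclusive
    end: int    # zero-based, exclusive (Python slice convention)

    @classmethod
    def from_one_based(cls, hit_id, first, last):
        """Build from one-based inclusive co-ordinates (assumed input style)."""
        if first > last:
            # Simplification: reversed (minus-strand) ranges are not handled.
            raise ValueError("expected first <= last")
        return cls(hit_id, first - 1, last)

    def __len__(self):
        # With half-open co-ordinates the length is a plain difference.
        return self.end - self.start


# One-based positions 5..10 become the Python slice [4:10].
span = HitSpan.from_one_based("example_hit", 5, 10)
subject = "ACGTACGTACGTACGT"
aligned = subject[span.start:span.end]
```

Because the attributes already follow the slice convention, they can be used
directly on a sequence without any per-format adjustment.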

Peter


