[Biopython-dev] Blast parsers and records

Peter biopython at maubp.freeserve.co.uk
Mon Jun 7 13:50:06 UTC 2010


On Fri, Jun 4, 2010 at 4:55 PM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> Michael, Peter, Sebastian, Laurent, Jose, and others,
>
> Thanks for your comments. It looks like there are lots of things to discuss,
> so let's start with the easiest ones.
>
> About converting a record to a string (point 5): I agree that using __str__ is
> probably not the best choice, so let's use __format__ instead, or add a "write"
> method. The added advantage of these is that we can print out a record in
> different formats (xml, text, table) by specifying the requested format as an argument.

The __format__ or format method sounds like a great idea (following other
bits of Biopython).
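
As a rough sketch of what that could look like (the Record class, its
attributes and the format names "text" and "table" are all made up for
illustration here, not existing Bio.Blast code):

```python
class Record:
    """Hypothetical BLAST record, for illustration only."""

    def __init__(self, query, hits):
        self.query = query
        self.hits = hits  # list of (subject id, e-value) tuples

    def __format__(self, format_spec):
        # An empty spec is what plain format(record) passes in.
        if format_spec in ("", "text"):
            lines = ["Query: %s" % self.query]
            lines += ["  %s  E=%g" % (name, evalue)
                      for name, evalue in self.hits]
            return "\n".join(lines)
        if format_spec == "table":
            # One tab separated line per hit.
            return "\n".join("%s\t%s\t%g" % (self.query, name, evalue)
                             for name, evalue in self.hits)
        raise ValueError("Unknown format %r" % format_spec)


record = Record("query1", [("subjA", 1e-30), ("subjB", 0.002)])
print(format(record, "table"))
print("{0:text}".format(record))
```

The same record is written out differently depending on the format name
passed to format(), and an unknown name raises a ValueError.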

> For point 3), maybe my wording was confusing; actually what I had in mind
> is the case where a given Blast program can produce different output formats
> (xml, text, table, etc.). This was inspired by this bug report:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2176
> In my mind, the different output formats are just different intermediates, but
> in essence they are the same and should therefore be stored in the same
> class. So, if I run blastp, save the result as XML, and parse it, I'd expect the
> same class as when I run blastp and save and parse the output in table format.
> Just in the latter case, some information may be missing if it is not available in
> the output in table format. Does that sound acceptable?

I agree that records from all the different BLAST output formats should be
represented by a common base class - but not necessarily the same class.
For example, the default plain text and XML formats include the pairwise
alignments, but the tabular output does not. To me having a sub-class which
stores the pairwise alignments seems natural here.
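
Roughly what I have in mind, as a sketch only (the class names
BlastRecord and AlignedBlastRecord and their attributes are hypothetical,
not a proposal for the final API):

```python
class BlastRecord:
    """Hypothetical base class: what every output format provides."""

    def __init__(self, query, hits):
        self.query = query
        self.hits = list(hits)  # (subject id, e-value) tuples

    def best_hit(self):
        # Lowest e-value wins.
        return min(self.hits, key=lambda hit: hit[1])


class AlignedBlastRecord(BlastRecord):
    """Hypothetical sub-class for plain text and XML output."""

    def __init__(self, query, hits, alignments):
        BlastRecord.__init__(self, query, hits)
        # subject id -> (aligned query string, aligned subject string)
        self.alignments = dict(alignments)


hits = [("subjA", 1e-30), ("subjB", 0.002)]
from_table = BlastRecord("query1", hits)
from_xml = AlignedBlastRecord("query1", hits, {"subjA": ("MKV-L", "MKVAL")})

# Code that only needs the common fields works with either record.
for record in (from_table, from_xml):
    print(record.query, record.best_hit()[0])
```

Code that needs the alignments can check for the sub-class with
isinstance, while everything else just uses the base class interface.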

Peter


