[Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?

Wibowo Arindrarto w.arindrarto at gmail.com
Mon Aug 26 16:04:38 UTC 2013

On Mon, Aug 26, 2013 at 4:04 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Sat, Aug 24, 2013 at 11:22 AM, Wibowo Arindrarto
> <w.arindrarto at gmail.com> wrote:
>> Hi Peter, everyone,
>> As for the readiness, I think the important features that we planned
>> have been implemented in SearchIO. I don't have any major feature
>> change that I would like to implement anytime soon, too. So yes, I
>> think it is ready.
> So you'd be comfortable with removing the experimental warning
> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
> thing is resolved)?

Yes. I think all public-facing modules are OK now. There are still two
issues which I consider minor, but which I think should be mentioned
before we lift the warning:

1. Storing [T]FAST[X|Y] query and hit strand information (see
https://redmine.open-bio.org/issues/3419). I'm not sure yet if I
should do the commit, but Jason's patch looks sensible (and I can
probably add some more so that the parser knows whether to set the
strand on hit or query sequence).

2. Collapsing / merging overlapping HSPs. I've received one (or two)
mail(s) asking whether it is possible to merge overlapping HSPs
(apparently BLAST sometimes does this). I haven't figured out a way to
cleanly implement this, so this is on hold for now.
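
As a rough illustration of the coordinate side of item 2 only, the
sketch below merges overlapping ranges in plain Python. It is not
SearchIO code: the HSPs are reduced to hypothetical (start, end) tuples
on the query, assumed to be zero-based and half-open, and the function
name merge_overlapping is made up for this example. Deciding what to do
with the alignments, scores and hit coordinates of merged HSPs is the
hard part and is not addressed here.

```python
def merge_overlapping(ranges):
    """Merge overlapping (start, end) ranges.

    Coordinates are assumed to be zero-based and half-open, so ranges
    that merely touch (end == next start) are kept separate.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start < merged[-1][1]:
            # Overlaps the previously kept range: extend that range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # No overlap: start a new range.
            merged.append((start, end))
    return merged


# Hypothetical query ranges of three HSPs from one hit.
hsp_query_ranges = [(10, 50), (40, 80), (100, 120)]
print(merge_overlapping(hsp_query_ranges))
```

Sorting first means each range only has to be compared with the last
merged one, so a single pass is enough.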

In addition, we had a discussion some months ago about the Bio._utils
module that SearchIO uses (see
and http://lists.open-bio.org/pipermail/biopython-dev/2013-February/010290.html).
We had an extensive discussion about this last time, which went as far
as considering a change on how we run our tests. Since the Bio._utils
module itself is private, however, no public-facing functions in
SearchIO are affected.

Other than these, the planned features include implementing the HMMER3.1
parser (which I think should not interfere with lifting the warning).

> And you would like to officially support plain text BLAST parsing
> (despite it not being recommended by the NCBI, and known to have
> been quite a lot of work in the past to keep the parser working)?

Looking at http://lists.open-bio.org/pipermail/biopython/2012-September/008166.html,
the most sensible approach seems to be to put the current parser under
SearchIO (hence the module reorganization I did; so we can deprecate
Bio.Blast as a whole without losing functionality), without actually
advertising that we have full support for parsing the text output
(perhaps put a disclaimer that plain text is not guaranteed to work?).
I feel like some people may still want to use previous BLAST versions
anyway, and we do have a functioning parser tested up to 2.2.26+, so
throwing it away doesn't seem to be the best thing to do here. And in
the case that someone does want to extend the parser (could be me,
could be someone else) to work with the latest BLAST version, (s)he
can then extend the existing parser.

> We should probably also give you (Bow) commit rights too, so you
> can handle basic parser updates within SearchIO directly - assuming
> you're happy with that?

This is fine with me.


P.S. I made the pull request for the reorganization here:
https://github.com/biopython/biopython/pull/223, comments are welcome
