[Biopython-dev] HMMER (+ BLAT) wrappers

Chris Mitchell chris.mit7 at gmail.com
Wed May 2 14:50:05 UTC 2012


Hey Bow,

I think it would be better to have an option to send the query to a local
gfServer if one is already running, rather than wrapping a gfServer that
only lives for the duration of a given Python process.  That way, someone
whose BLAT queries are split up within a script would not incur the
database loading time more than once.  The gfServer/gfClient setup is also
rather a pain to use in my experience (it's all relative paths).  I also
think pslx output (-out=pslx) would be a better default, since plain psl
output doesn't give you the aligned sequences.
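
To make that concrete, here is a rough sketch of what I mean by talking to
a server that is already up. It only builds and runs gfClient command
lines; the function names, host, port and file names are made up for
illustration, and it assumes gfClient is on the PATH and a gfServer is
already listening.

```python
import subprocess


def build_gfclient_cmd(host, port, seq_dir, query_fasta, out_file,
                       out_format="pslx"):
    """Return the argument list for one gfClient call.

    Follows the documented usage 'gfClient host port seqDir in.fa out.psl';
    seq_dir is the directory holding the .2bit/.nib files.
    """
    return [
        "gfClient",
        "-out=%s" % out_format,  # pslx keeps the aligned sequences
        host,
        str(port),
        seq_dir,
        query_fasta,
        out_file,
    ]


def run_queries(host, port, seq_dir, jobs):
    """Run several (query_fasta, out_file) jobs against the same server.

    The database stays loaded in the server, so only the first start-up
    of gfServer pays the loading cost.
    """
    for query_fasta, out_file in jobs:
        cmd = build_gfclient_cmd(host, port, seq_dir, query_fasta, out_file)
        subprocess.run(cmd, check=True)
```

Something like run_queries("localhost", 17779, ".", [("a.fa", "a.pslx"),
("b.fa", "b.pslx")]) would then reuse the one running server for both
queries.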

Chris

On Wed, May 2, 2012 at 4:17 AM, Wibowo Arindrarto <w.arindrarto at gmail.com> wrote:

> Hi everyone,
>
> The past week I've been trying to generate some test cases for BLAST,
> HMMER, et al. I was writing some short scripts to automate the test
> case generation, when I realized that Biopython doesn't have wrappers
> for HMMER and BLAT, so I decided to write them. The code is here:
> https://github.com/bow/gsoc/blob/master/hmmer/_HMMER.py and here:
> https://github.com/bow/gsoc/blob/master/blat/_BLAT.py.
>
> If it is of general interest to Biopython, I'd love to submit a pull
> request for these wrappers. They were primarily written for test case
> generation, but I imagine they won't require many tweaks to make
> them suitable for inclusion in Biopython. However, before I can do that,
> there are some issues that I think need to be discussed:
>
> 1. Where should the wrappers be put? I noticed that different wrappers
> are located in different directories according to their 'theme' (e.g.
> BLAST wrappers in Bio.Blast.Applications and ClustalW wrapper in
> Bio.Align.Applications). For the HMMER wrapper, should it be put
> inside Bio.Motif.Applications? For the BLAT wrapper, should I create a
> new Bio.Blat folder just for it? Yesterday I thought it might be
> easier if all application wrappers were put inside the same directory
> (e.g. all in Bio.Applications), so maybe that's a viable option for
> future releases?
>
> 2. How should shared options among slightly different programs be
> handled? We can rely on creating abstract subclasses for them, but I
> find it easier to simply create lists and then combine them in the
> different programs. The current HMMER wrapper employs both of these
> approaches, but I think it needs to stick to just one approach to make
> the code easier to understand.
>
> 3. Is there a convention for naming the command line arguments? For
> example, if the command line option trigger is '--domE', should I name
> the Python variable, for example, 'domE', 'dome', 'dom_e', or 'dom_E'?
>
> 4. For the HMMER wrapper, there are some flags that are exclusive to
> each other (i.e. the user can only choose one of the flags). If the
> user chooses both, HMMER doesn't show any error messages, but nothing
> is run. Should the wrapper check for such mutually exclusive flags
> when it's created as well?
>
> 5. For BLAT, the installed suite includes a program that runs a BLAT
> server to handle search requests from different clients. It doesn't
> seem to be a typical program that should be wrapped by Biopython, but
> I might be wrong. Should a wrapper for the server be included as well?
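
On points 2 and 4, a minimal sketch of the "plain lists" approach with a
check for mutually exclusive flags might look like the following. The
grouping, the Python-side names and the choice of --cut_ga/--cut_nc/--cut_tc
as the exclusive set are assumptions for illustration, not what _HMMER.py
actually does.

```python
# Option groups shared by several HMMER programs, kept as plain lists.
# The grouping and the Python-side names are illustrative assumptions.
REPORTING_OPTIONS = ["E", "domE"]                 # -E, --domE
CUTOFF_SWITCHES = ["cut_ga", "cut_nc", "cut_tc"]  # --cut_ga, --cut_nc, --cut_tc

# Each program simply combines the shared lists it needs.
PROGRAM_OPTIONS = {
    "hmmscan": REPORTING_OPTIONS + CUTOFF_SWITCHES,
    "hmmsearch": REPORTING_OPTIONS + CUTOFF_SWITCHES,
}

# Assumed here: at most one switch from each group may be set.
EXCLUSIVE_GROUPS = [frozenset(CUTOFF_SWITCHES)]


def check_options(program, chosen):
    """Raise ValueError for unknown or mutually exclusive options."""
    chosen = set(chosen)
    unknown = sorted(chosen - set(PROGRAM_OPTIONS[program]))
    if unknown:
        raise ValueError(
            "unknown option(s) for %s: %s" % (program, ", ".join(unknown))
        )
    for group in EXCLUSIVE_GROUPS:
        clash = sorted(chosen & group)
        if len(clash) > 1:
            raise ValueError(
                "mutually exclusive option(s): " + ", ".join(clash)
            )
```

Running the check when the command line object is built would turn a
silent no-op from HMMER into an immediate Python error.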
>
> cheers,
> Bow
