[emboss-dev] Fwd: Request For Work

Lapointe, David David.Lapointe at umassmed.edu
Wed Aug 13 14:46:31 UTC 2008

We are currently running Rocks (4.3) on our cluster, and the Bio roll has
EMBOSS (4.1.0) installed. One peculiarity is that EMBOSS is installed
locally on every node, so updating databases (rebase, tfsites, etc.) must
be done on every node. Other than that, there could be some creative work
with distributed computing (distinct from MPI, which would also be
interesting). Having a mechanism to share the data would be a plus.
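
As a rough illustration of the per-node update chore, here is a minimal
Python sketch that pushes one master copy of the EMBOSS data directory to
each node with rsync. The node names and the data path are assumptions for
illustration only, not taken from our actual Rocks setup, and it presumes
passwordless SSH between nodes.

```python
import subprocess

def build_sync_commands(data_dir, nodes):
    """Build one rsync command per node, mirroring data_dir to the same path."""
    src = data_dir.rstrip("/") + "/"  # trailing slash: copy the directory contents
    return [["rsync", "-a", "--delete", src, f"{node}:{src}"] for node in nodes]

def sync(data_dir, nodes, dry_run=True):
    """Print the commands (dry run) or execute them one node at a time."""
    for cmd in build_sync_commands(data_dir, nodes):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if rsync fails on a node

if __name__ == "__main__":
    # Hypothetical path and node names; adjust to the local installation.
    sync("/path/to/emboss/data", ["compute-0-0", "compute-0-1"])
```

A shared (e.g. NFS-exported) data directory would avoid the copy step
altogether; the sketch only automates what is currently done by hand.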


-----Original Message-----
From: emboss-dev-bounces at lists.open-bio.org
[mailto:emboss-dev-bounces at lists.open-bio.org] On Behalf Of Peter Rice
Sent: Wednesday, August 13, 2008 10:22 AM
To: jitesh dundas
Cc: emboss-dev at lists.open-bio.org
Subject: Re: [emboss-dev] Fwd: Request For Work

Dear jitesh,

> Thank you for your reply. Please excuse me for the delay in replying 
> as I was out of town.
> I am looking at working on this issue in 2 ways:-
> 1) I wish to parallelize the phases of different software (if they 
> are in development stage).
> 2) Next, if there is a connection or dependency between two or more 
> projects (or applications), then we can try to give the output that is
> needed based on the current status of the output-supplying

Aha ... so you are looking at running several EMBOSS applications in
parallel? That is a very interesting issue for us.

> I will need to know if there is any relationship identified between 
> any of the applications defined in the  EMBOSS project. If there are 
> any relations already present between the applications, it will become
> easier to get a handle to move the execution from one point to

The inputs and outputs of all EMBOSS applications are marked up in the
.acd files with a "knowntype" that identifies common outputs that could,
for example, be combined and visualised together, and also which
outputs could be used as inputs by other applications. For sequences,
features, alignments and reports this includes whether the type is
nucleotide or protein.
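
To make the idea concrete, here is a minimal Python sketch of how matching
"knowntype" values could be used to find applications whose outputs can feed
other applications. It does not parse real .acd files: the application names
and knowntype strings below are hypothetical placeholders, and the dictionary
layout is just an assumed summary of what the ACD markup provides.

```python
from collections import defaultdict

# Hypothetical summary of ACD metadata: the knowntype of each input and
# output per application. None of these names are real EMBOSS entries.
APPS = {
    "app_a": {"inputs": [], "outputs": ["type x"]},
    "app_b": {"inputs": ["type x"], "outputs": ["type y"]},
    "app_c": {"inputs": ["type y"], "outputs": []},
}

def find_links(apps):
    """Return sorted (producer, consumer, knowntype) triples."""
    # Index the applications that accept each knowntype as input.
    consumers = defaultdict(set)
    for name, io in apps.items():
        for knowntype in io["inputs"]:
            consumers[knowntype].add(name)

    # Pair every output with each other application that consumes its type.
    links = set()
    for name, io in apps.items():
        for knowntype in io["outputs"]:
            for consumer in consumers.get(knowntype, ()):
                if consumer != name:
                    links.add((name, consumer, knowntype))
    return sorted(links)

if __name__ == "__main__":
    for producer, consumer, knowntype in find_links(APPS):
        print(f"{producer} -> {consumer} via '{knowntype}'")
```

The resulting pairs form a dependency graph, which is the kind of
relationship between applications asked about above.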

> Also, running applications in parallel will require a change in the 
> way we make our applications. We need to define a master relationship 
> between all the applications, so as to relate all the applications
> with each other.

We are also looking at adding definitions for the algorithm used by an
application, and a standard way to represent the transformations of
inputs into outputs.

Any feedback on these issues would be very welcome.

We are also interested in looking at executing EMBOSS code in parallel
if anyone is looking at that.


Peter Rice