James Bonfield jkb at mrc-lmb.cam.ac.uk
Wed Jan 10 15:16:48 UTC 2001

On Wed, Jan 10, 2001 at 03:39:41PM +0100, Catherine Letondal wrote:
> Maybe something like a 'pipe' attribute, at least for input/output files,
> would be useful to connect programs together.

As I understand it there is a -filter option which reads from stdin and writes
to stdout, although obviously it only works for programs that take exactly one
sequence. Output could be tricky too. You'd also need a way to silence the
rest of the output, or alternatively to output it to stderr instead.

A program started with execl() inherits the caller's open file descriptors [1]
(unless the close-on-exec flag is set), which means that a parent process
could open file descriptor 3 and then call a command which writes to
descriptor 3. This way a master program could create a communication channel
shared by all the separate programs in the pipeline. That's perhaps overkill
though!
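
A rough sketch of the inherited-descriptor idea, in Python rather than C and
for POSIX systems only. The function name and the child script are made up
for illustration. Two things differ from the description above and are
assumptions of the sketch: Python sets close-on-exec on new descriptors by
default, so the descriptor has to be listed in pass_fds to survive the exec,
and the descriptor number is handed to the child as an argument instead of
being fixed at 3.

```python
import os
import subprocess
import sys


def run_child_with_side_channel(message):
    """Run a child that writes `message` to an inherited descriptor.

    Returns (child_stdout, side_channel_data) as strings.
    Hypothetical helper, not part of any existing package.
    """
    read_fd, write_fd = os.pipe()

    # The child gets the descriptor number on its command line and writes
    # the side-channel data there, keeping stdout for its normal output.
    child_code = (
        "import os, sys\n"
        "fd = int(sys.argv[1])\n"
        "os.write(fd, sys.argv[2].encode())\n"
        "print('normal output')\n"
    )
    try:
        proc = subprocess.run(
            [sys.executable, "-c", child_code, str(write_fd), message],
            pass_fds=(write_fd,),  # keep this descriptor open across exec
            stdout=subprocess.PIPE,
            check=True,
        )
    finally:
        # The parent must close its copy of the write end, otherwise the
        # read below would never see end-of-file.
        os.close(write_fd)

    # Reading only after the child has exited is fine for short messages;
    # anything larger than the pipe buffer would need concurrent reading.
    with os.fdopen(read_fd, "rb") as side:
        side_data = side.read().decode()
    return proc.stdout.decode(), side_data


if __name__ == "__main__":
    out, side = run_child_with_side_channel("status: ok")
    print("stdout:", out.strip())
    print("side channel:", side)
```

The point of the design is that the child's stdout stays free for the real
result while anything else travels over the extra descriptor.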

We thought a bit about connecting multiple programs together to produce
pipelines. However, that's a long way off in our plans. You'd obviously need a
way of linking arguments together so that the user cannot specify the output
of program 1 as "xyzzy" and the input for program 2 as "plugh".
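
One way the argument linking might look, as a minimal Python sketch. Every
name here (Stage, link, the program and parameter names) is hypothetical and
not taken from EMBOSS or ACD. The master program, not the user, chooses the
file name, and a linked parameter that the user has already set is rejected.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Stage:
    """One program in the pipeline plus the arguments chosen so far."""
    program: str
    args: dict = field(default_factory=dict)


def link(producer, out_param, consumer, in_param, workdir):
    """Bind producer's output and consumer's input to one generated path."""
    for stage, param in ((producer, out_param), (consumer, in_param)):
        if param in stage.args:
            raise ValueError(
                f"{stage.program}: '{param}' is linked and cannot be set by the user"
            )
    # Both ends receive the same name, so they cannot disagree.
    path = os.path.join(workdir, f"{producer.program}.{out_param}")
    producer.args[out_param] = path
    consumer.args[in_param] = path
    return path


if __name__ == "__main__":
    first = Stage("prog1")
    second = Stage("prog2")
    print(link(first, "outfile", second, "infile", "/tmp"))
```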

> I agree with the suggestion about having default prompts expanded by an acdpretty or 
> acdcomplete feature as well as a way to group parameters in a hierarchical way. 
> I feel very sorry for the acd2xml not being perfect!! :-) Its purpose was only
> to adapt EMBOSS descriptions to Pise, not to convert ACD into XML. I keep on suggesting
> that native XML ACDs would be great.

The problems we have with acd2xml though are mainly that it doesn't always
copy over everything in the acd files, and that it's clearly designed to be
read by perl rather than tcl (as it includes embedded perl expressions). As
you rightly point out, this is because it's designed for Pise so I don't feel
that we have any right to complain about this! :) Actually one other key
reason for reading acd directly was that the tcl xml parser we used had some
rather weird licence restrictions which we weren't entirely happy about, and
reading acd is easier than xml anyway.

If the acd reading works well enough we may even use the same format for our
own analysis functions in Spin as it will cut down on the rather repetitive
nature of our tcl/tk code!


[1] Rather amusingly I once discovered that on many systems the unix "write"
command had a problem with open file descriptors. A shell out would reset the
uid and gid to revoke the gid tty permissions, but it forgot to close the fd
to the terminal being written to. This in turn meant that other programs could
then use ioctl()s (e.g. TIOCSTI) on the terminal to do intriguing things, such
as inserting real keypresses into the 'keyboard buffer', and hence completely
hijack the remote session. Of course that was back in my naughtier Unix
days :)

James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Tel: 01223 402499   Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/

More information about the emboss-dev mailing list