Reorganisation of EMBOSS

Mon Oct 14 14:12:15 UTC 2002

On Mon, 14 Oct 2002 11:26:52 +0100
"Gary Williams, Tel 01223 494522" <gwilliam at hgmp.mrc.ac.uk> wrote:

> 
> Sorry - I'm guilty of imprecise terminology here.
> 
> When I said "A program should have a function", I meant that a program
> should have a single biological job to do, not that a program should
> only have one call to a single function in the code libraries, (which I
> agree is very silly.)
> 
> Gary

Not so.

Far from being silly, it is a very good idea indeed: ideally a program
should have a single functional entry point. So the general layout would
be
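	int main(int argc, char **argv)
	{
		/* turn the command-line strings into typed values */
		struct args args = process_arguments(argc, argv);
		/* the routine that actually is the program */
		return do_work(&args);
	}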

This has been advocated for more than fifteen years, as far as I remember.
The rationale is that doing it this way is no more complicated than the
classic approach of doing everything directly in "main(argc,argv)", but it
adds a very useful layer:

Instead of having to process command line arguments or text strings, the
routine that is actually your program gets actual binary arguments. Thus 
building new programs that need to use the functionality of an existing
one is easy: simply call the routine with the appropriate arguments.
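
As a rough sketch of what that reuse looks like (the names count_base,
struct count_args and gc_fraction are invented here for illustration and are
not taken from EMBOSS), assume the routine that "is" the first program counts
one base in a sequence. A second routine can then be built on top of it with
a plain function call:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical typed arguments for a small "count one base" program. */
struct count_args {
	const char *sequence;	/* NUL-terminated sequence text */
	char        base;	/* base to count, e.g. 'G' */
};

/* The single entry point: this routine is the program proper. */
size_t count_base(const struct count_args *args)
{
	size_t n = 0;

	for (const char *p = args->sequence; *p != '\0'; p++)
		if (*p == args->base)
			n++;
	return n;
}

/* A new routine built on the old one. It calls count_base() directly,
 * with typed arguments and no strings to format or parse. */
double gc_fraction(const char *sequence)
{
	struct count_args g = { sequence, 'G' };
	struct count_args c = { sequence, 'C' };
	size_t len = strlen(sequence);

	if (len == 0)
		return 0.0;	/* avoid dividing by zero on empty input */
	return (double)(count_base(&g) + count_base(&c)) / (double)len;
}
```

The main() of the original program would only parse its command line into a
struct count_args and hand it to count_base(); nothing in gc_fraction() has
to know that a command line ever existed.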

Doing so without a single routine entry point, i.e. with a traditional
program, would entail renaming main to something, converting all the actual
arguments to strings and calling that something, or rewriting the program's
main() entirely.

The approach of first processing the command line and then calling a single
routine forces you to consider the program as something that carries out one
conceptual piece of functional work, and saves work if you later want to
build on top of it. It is more elegant and demonstrates that you really have
a clear idea of what the program is supposed to achieve and what it needs to
do it.

That is most important in today's world: if you want to make that
functionality remotely invokable for distributed computing and it is a
routine, you simply write the IDL and you're done; otherwise you either
redesign the program or write a wrapper to invoke it, which results in
additional overhead. Ditto for adding interfaces: the point of separating the
command-line processing is to have that ability. If you now want to build a
new interface, it is easy to use the single routine entry as a hook to call
the program instead of marshalling the arguments into text strings again...
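
To make the "new interface" point concrete, here is a minimal sketch under
assumed names: struct args with its from/to fields, process_query and the
"from=..&to=.." query format are all made up for this example. Two front ends
fill in the same typed arguments, and the one do_work() routine never learns
which of them was used:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical typed arguments: a region given by its two end positions. */
struct args {
	long from;
	long to;
};

/* The program proper: length of the region, both ends included. */
long do_work(const struct args *a)
{
	return a->to - a->from + 1;
}

/* Front end 1: the command line, e.g. "prog 10 20". Returns 0 on success.
 * (Number validation is left out to keep the sketch short.) */
int process_arguments(int argc, char **argv, struct args *out)
{
	if (argc != 3)
		return -1;
	out->from = strtol(argv[1], NULL, 10);
	out->to   = strtol(argv[2], NULL, 10);
	return 0;
}

/* Front end 2: a made-up "from=10&to=20" query string, as a web form
 * might send it. Returns 0 on success. */
int process_query(const char *query, struct args *out)
{
	if (sscanf(query, "from=%ld&to=%ld", &out->from, &out->to) != 2)
		return -1;
	return 0;
}
```

Adding a third interface (a GUI, a remote call) means writing one more
function that fills in a struct args; do_work() itself stays untouched.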

All in all, it's old wisdom that, for future maintenance, it is much better
to have your program invoke a single function to do its work.

					j