To do list.

David Martin david.martin at biotek.uio.no
Tue Oct 31 12:49:20 UTC 2000



Having just seen the 'to do' list on the web, can I throw my spanner in?

Could the results output from a number of programs be modified in the
following way:

A results object (in much the same way as a sequence object) is
created. The program author doesn't have to worry about any formatting
because the result object knows about it.
New output styles get added at a library level, not an application level.
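
As a rough sketch of the idea (in Python for brevity, although EMBOSS itself
is C): the application only fills in a result object, and the formats live in
a library-level registry. Every name here (Result, register_format, FORMATS)
is made up for illustration and is not an existing EMBOSS or AJAX call.

```python
from dataclasses import dataclass, field
from html import escape

# Hypothetical library-level registry: format name -> writer function.
FORMATS = {}


def register_format(name):
    """Decorator that adds a writer to the library-level registry."""
    def wrap(func):
        FORMATS[name] = func
        return func
    return wrap


@dataclass
class Result:
    """Holds values only; it knows nothing about how they are printed."""
    columns: list
    rows: list = field(default_factory=list)

    def add(self, *values):
        if len(values) != len(self.columns):
            raise ValueError("row length does not match columns")
        self.rows.append(tuple(values))

    def write(self, fmt="text"):
        # The application never branches on the format itself.
        return FORMATS[fmt](self)


@register_format("text")
def _text(result):
    lines = ["\t".join(result.columns)]
    lines += ["\t".join(str(v) for v in row) for row in result.rows]
    return "\n".join(lines)


@register_format("html")
def _html(result):
    head = "".join(f"<th>{escape(c)}</th>" for c in result.columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in result.rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# Application code: no formatting logic at all.
res = Result(["Name", "Length"])
res.add("seq1", 100)
print(res.write("text"))
```

Adding a new style would then mean registering one more writer in the
library, with no change to any application.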

This sprang out of looking at the code for INFOSEQ with the thought of
sending its output to an SQL database. As every output statement seems to
be wrapped in
if (html) 
else

it looked somewhat ugly, and adding a third format would make it worse.
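
The SQL case could then be one more writer instead of another branch in
every statement. A minimal sketch, assuming the results arrive as column
names plus rows; it uses Python's sqlite3 module purely as a stand-in
database, and write_sql and the table and column names are hypothetical.

```python
import sqlite3


def write_sql(conn, table, columns, rows):
    """Store result rows in an SQL table (hypothetical library-level writer)."""
    # Identifiers cannot be bound as parameters, so restrict them to safe names.
    for name in [table, *columns]:
        if not name.isidentifier():
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    col_list = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_list})")
    # Values are bound as parameters, never pasted into the statement.
    conn.executemany(f"INSERT INTO {table} ({col_list}) VALUES ({marks})", rows)
    conn.commit()


# Illustrative values only, not real infoseq output.
conn = sqlite3.connect(":memory:")
write_sql(conn, "infoseq", ["name", "length"], [("seq1", 120), ("seq2", 300)])
print(conn.execute("SELECT name, length FROM infoseq").fetchall())
```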

Having consistent result objects would be very nice for downstream parsers
and integration into all sorts of other things. I know I am guilty of
having a hideous output format in the apps I have written but am more than
prepared to rewrite them.

It also means that programs get easier to write, that the various formats
produce better output (eg tagging each HTML tag with
class=emboss-result so local style sheets can be used), and that XML
becomes available for integration with Kathrine's and Kate's various
interfaces, and so on.
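
To illustrate the HTML and XML points, here is a small sketch using Python's
standard xml.etree.ElementTree module. Only the class name emboss-result
comes from the suggestion above; the function names and the XML element
names (result, record, value) are assumptions, not an agreed schema.

```python
import xml.etree.ElementTree as ET

CSS_CLASS = "emboss-result"  # class name taken from the suggestion above


def to_html(columns, rows):
    """Render a table in which every tag carries the shared CSS class."""
    table = ET.Element("table", {"class": CSS_CLASS})
    header = ET.SubElement(table, "tr", {"class": CSS_CLASS})
    for col in columns:
        ET.SubElement(header, "th", {"class": CSS_CLASS}).text = col
    for row in rows:
        tr = ET.SubElement(table, "tr", {"class": CSS_CLASS})
        for value in row:
            ET.SubElement(tr, "td", {"class": CSS_CLASS}).text = str(value)
    return ET.tostring(table, encoding="unicode", method="html")


def to_xml(program, columns, rows):
    """Render the same data as XML (hypothetical element names)."""
    root = ET.Element("result", {"program": program})
    for row in rows:
        record = ET.SubElement(root, "record")
        for col, value in zip(columns, row):
            ET.SubElement(record, "value", {"name": col}).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
print(to_html(["Name", "GC"], [("seq1", 0.5)]))
print(to_xml("geecee", ["Name", "GC"], [("seq1", 0.5)]))
```

Because the ElementTree serialiser escapes text and attribute values,
awkward characters in sequence names cannot break the markup.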

Suggested initial result types:

1) Single value lists: lists of single values, for example the output
from infoseq, geecee and btwisted. Each is labelled with the sequence ID,
followed by a list of results.

2) Composition tables (eg cusp, compseq and so on)

3) Site lists as feature tables? eg restrict.
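
The three types above might look something like this as data structures.
This is only a sketch in Python; the class and field names are invented for
illustration, and a real version would sit in the C library.

```python
from dataclasses import dataclass, field


@dataclass
class ValueList:
    """Type 1: single values per sequence (infoseq, geecee, btwisted)."""
    sequence_id: str
    values: dict = field(default_factory=dict)  # label -> value


@dataclass
class CompositionTable:
    """Type 2: counts of words or codons (cusp, compseq)."""
    sequence_id: str
    counts: dict = field(default_factory=dict)  # word -> count

    def frequencies(self):
        total = sum(self.counts.values())
        return {k: (v / total if total else 0.0) for k, v in self.counts.items()}


@dataclass
class Site:
    start: int
    end: int
    strand: str
    label: str


@dataclass
class SiteList:
    """Type 3: located sites, feature-table style (restrict)."""
    sequence_id: str
    sites: list = field(default_factory=list)

    def sorted_by_position(self):
        return sorted(self.sites, key=lambda s: (s.start, s.end))


# Illustrative values only.
table = CompositionTable("seq1", {"A": 1, "C": 3})
print(table.frequencies())
```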

On top of this, every program that writes output should emit a standard
format header saying which program was run, when, where, and with which
parameters.
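
One possible shape for such a header, as a sketch. The "#"-prefixed layout
and the field names are assumptions for illustration, not an agreed EMBOSS
format; "where" is taken here to mean the host name.

```python
import socket
from datetime import datetime, timezone


def make_header(program, params, when=None, where=None):
    """Build a comment-style header: program, date, host, parameters."""
    when = when or datetime.now(timezone.utc)
    where = where or socket.gethostname()
    lines = [
        f"# Program: {program}",
        f"# Date: {when.strftime('%Y-%m-%d %H:%M:%S %Z')}",
        f"# Host: {where}",
    ]
    # One line per parameter, in the order the caller supplied them.
    lines += [f"# Parameter: -{name} {value}" for name, value in params.items()]
    return "\n".join(lines)


# Illustrative parameter names and values only.
print(make_header("infoseq", {"sequence": "seq1", "html": "N"}))
```

Passing when and where explicitly keeps the header reproducible, which is
handy when comparing output between runs.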


In addition to this, a new database access format: not sequences but codon
usage. The codon usage tables distributed with EMBOSS are positively
fossilised (6 years old). An access method for reading a species from CUTG
.spsum files would be great. Extracting all species from CUTG is probably
a bit over the top so an access method for just the interesting species
should suffice.
Showdb should then show a C as the database type.
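
A sketch of what reading only the interesting species could look like. It
ASSUMES each .spsum record is a header line of the form "id:species name:
number of CDS" followed by one line of whitespace-separated codon counts;
that layout and the codon order should be checked against the CUTG
documentation. The order is therefore passed in as an argument, and
read_spsum is a hypothetical name.

```python
import io


def read_spsum(handle, wanted, codon_order):
    """Yield (species, cds_count, {codon: count}) for the wanted species only.

    Assumed record layout (verify against CUTG): a header line
    "id:species name: cds_count", then one line of codon counts.
    """
    wanted = {name.lower() for name in wanted}
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        header = header.strip()
        if not header:
            continue  # skip blank lines between records
        counts_line = handle.readline()
        _taxid, rest = header.split(":", 1)
        species, cds = (part.strip() for part in rest.rsplit(":", 1))
        if species.lower() not in wanted:
            continue  # uninteresting species: counts are never parsed
        counts = [int(x) for x in counts_line.split()]
        if len(counts) != len(codon_order):
            raise ValueError(
                f"{species}: expected {len(codon_order)} counts, got {len(counts)}"
            )
        yield species, int(cds), dict(zip(codon_order, counts))


# Toy data with four made-up columns, not real CUTG content.
toy = io.StringIO("1:Species one: 2\n1 2 3 4\n2:Species two: 5\n5 6 7 8\n")
for record in read_spsum(toy, ["Species two"], ["AAA", "AAC", "AAG", "AAU"]):
    print(record)
```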

..d
 
 
---------------------------------------------------------------------
*  Dr. David Martin                  Biotechnology Centre of Oslo   *
*  Node Manager                      Gaustadalleen 21               *
*  The Norwegian EMBNet Node         P.O. box 1125 Blindern         *
*  tel +47 22 84 05 35               N-0317 Oslo                    *
*  fax +47 22 84 05 01               Norway                         * 
---------------------------------------------------------------------





