restrict and supermatcher

Peter Rice peter.rice at uk.lionbioscience.com
Fri Feb 15 13:02:38 UTC 2002


Heather Davidson wrote:
> 
> I would like to know how to use supermatcher. I have used matcher with
> no problems. I want to compare one long sequence against a lot of other
> smaller sequences. In GCG one made a file of filenames of the smaller
> sequences and the comparison was done all at once. I do not know how to
> do this on supermatcher and the help does not help.

The file of filenames is a VMS concept ... used by GCG and by EMBOSS.

Put the filenames into a list file, one per line, and give @filename (the
list file's name, prefixed with @) on the command line in place of a sequence.

There are some known problems (for example if a file does not exist, or
with complicated USAs or nested files of filenames) but as a simple list it
should be fine.
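
As a rough sketch of building such a list file (Python is used only for
illustration; the helper name write_list_file and the file names are made up,
and skipping missing files is just one way to sidestep the problem mentioned
above):

```python
from pathlib import Path

def write_list_file(seq_files, list_path):
    """Write one existing filename per line; return (kept, missing)."""
    kept, missing = [], []
    for name in seq_files:
        # Leave out files that do not exist, since a missing file in the
        # list is one of the known trouble spots.
        (kept if Path(name).is_file() else missing).append(str(name))
    Path(list_path).write_text("".join(f"{name}\n" for name in kept))
    return kept, missing
```

If the list were written to, say, small.list, it would then be given as
@small.list wherever the program expects the set of smaller sequences.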

> Also I have used restrict but the output is not what I want. Basically I
> want an output like mapsort in GCG and I have asked support at HGMP and
> they cannot help either. I want the enzymes listed alphabetically (
> which I know it does fine except it does a long line instead of across
> the paper) and with the site location but then with the size of
> fragments produced from that digest listed underneath

Interesting ....

Restrict now produces a feature report, with a qualifier 'enzyme'. (Try
'restrict -rf embl' to see what is really happening inside).

What you are really asking for is to sort the report output by a named
column/qualifier. This is quite easy to implement: a sort by one or more
qualifiers (internal or printed names) and then by start/end position.
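
The ordering itself could look something like this. It is only a sketch in
Python, not EMBOSS code: the row layout (one dict per feature with "start",
"end" and qualifier keys) and the sample sites are assumptions made up for
the example.

```python
def sort_report(rows, qualifiers):
    """Sort feature rows by the named qualifiers, then by start and end."""
    def key(row):
        return tuple(row[q] for q in qualifiers) + (row["start"], row["end"])
    return sorted(rows, key=key)

# Invented sample data, not real restrict output.
rows = [
    {"enzyme": "HindIII", "start": 120, "end": 125},
    {"enzyme": "BamHI", "start": 300, "end": 305},
    {"enzyme": "BamHI", "start": 40, "end": 45},
]

for row in sort_report(rows, ["enzyme"]):
    print(row["enzyme"], row["start"], row["end"])
```

Here both BamHI sites come out first, in position order, followed by the
HindIII site. Note that this plain string comparison is case-sensitive; a
real implementation might well choose differently.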

Once implemented, this would apply to all EMBOSS programs that have the new
'report' output, a set that will grow to include all programs that generate
sequence annotation.

For restrict, it would be a command line option:

% restrict -rsort enzyme

(As the cut positions are in the feature table too, you could also sort by
5prime instead of the start of the site match.)

Report qualifiers never have spaces, so a space-separated list of names would
be fine. ACD sees the full list of report columns/qualifiers, so EMBOSS can
give an error message at the start of the program if a name is not recognised.
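
To illustrate the idea of checking the names up front, here is a small
sketch in Python. The function name parse_sort_keys is hypothetical, and the
column set is just the names mentioned in this message, not the actual list
ACD would hold.

```python
def parse_sort_keys(spec, known_columns):
    """Split a space-separated key list and reject unknown names up front."""
    keys = spec.split()
    unknown = [k for k in keys if k not in known_columns]
    if unknown:
        raise ValueError("unknown report column(s): " + ", ".join(unknown))
    return keys

# Illustrative column names only.
columns = {"start", "end", "enzyme", "5prime"}
print(parse_sort_keys("enzyme 5prime", columns))
```

A misspelt name fails before any sequence is read, which is the point of
letting ACD see the column list.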

> I have only used emboss twice and each time I cannot get what I used to
> be able to get effortlessly from GCG

Ah, you should perhaps not expect to get what GCG gives you!!!

Remember, with EMBOSS you can effortlessly get things that GCG cannot do
...

... and that includes some requests for new features :-)

regards,

Peter Rice

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723



