Preferred isoschizomer ?

Fernan Aguero fernan at
Wed Apr 30 18:19:45 UTC 2003

Sorry to insist on this point. I am now also suffering from this
behaviour ...

+----[ ableasby at <ableasby at> (14.Apr.2003 15:28):
| 1) If your colleague had explicitly said  -enzymes psti
|    on the command line (or equivalent GUI) then it would
|    be found. The output would be overly verbose if all
|    isoschizomers are reported so as a compromise it reports
|    only one.

Right, if you ask for PstI, you would get PstI and not any
other isoschizomer. And I also agree with the compromise of
reporting only one isoschizomer. The problem is deciding
which one to report.

| 2) If you take the emboss files from the REBASE (NEB) distro
|    then, after renaming and putting them in data/REBASE, it will
|    probably report PstI (haven't tried it). 

I don't know exactly what you mean by 'take the emboss files
from the REBASE distro'. If you mean getting the withrefm
file, as explained in the EMBOSS admin tutorial, that's what
I did, and in all the cases I tried I get BspMAI instead of PstI
(this is only one particular case; I expect the same to
happen for many other enzymes as well):

restrict -> BspMAI
restrict -commercial t -> BspMAI
restrict -preferred t -> BspMAI
restrict -commercial t -preferred t -> BspMAI

I've been looking at the withrefm file and according to the
description of the format provided within the file itself,
there is no provision for 'preferred' isoschizomers. At
least not explicitly declared.  The fields for each entry
are: name, isoschizomers, sequence, methylation site,
organism, source, commercial provider, references.

So, if you look for 'CTGCAG' (the sequence recognized by
PstI), you will see that it occurs several times. The file
is sorted in alphabetical order by enzyme name. However, the
list of isoschizomers within each entry is not in strict
alphabetical order, and the ordering seems to
suggest a 'preferred' isoschizomer. Going through the PstI
isoschizomers in alphabetical order, in all the entries I
looked at (until I got tired) PstI is always the first in the
list of isoschizomers.

So, why is restrict not using it?
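
The check described above can be repeated mechanically rather than by eye.
What follows is only a rough sketch in Python. It assumes each withrefm
field sits on its own line tagged <1> to <8>, in the field order listed
above, and that the cut position is marked with '^'. The sample records are
made up for illustration and are not copied from REBASE; parse_records,
group_by_site and first_isoschizomer are hypothetical helper names.

```python
import re
from collections import defaultdict

# Assumed record layout: one "<n>value" line per field, n = 1..8
# (1 = name, 2 = isoschizomers, 3 = recognition sequence, ...).
FIELD = re.compile(r"^<([1-8])>(.*)$")


def parse_records(lines):
    """Yield one dict per entry, keyed by field number."""
    record = {}
    for line in lines:
        match = FIELD.match(line.strip())
        if match is None:
            continue  # header text, blank lines, wrapped references
        number, value = int(match.group(1)), match.group(2).strip()
        if number == 1 and record:
            yield record  # a new name field starts the next entry
            record = {}
        record[number] = value
    if record:
        yield record


def group_by_site(records):
    """Group entries by recognition sequence with the cut mark removed."""
    groups = defaultdict(list)
    for record in records:
        site = record.get(3, "").replace("^", "").upper()
        if site:
            groups[site].append(record)
    return groups


def first_isoschizomer(record):
    """Return the first name in the isoschizomer field, or None if empty."""
    names = [n.strip() for n in record.get(2, "").split(",") if n.strip()]
    return names[0] if names else None


# Illustrative records only, not real REBASE content.
SAMPLE = """\
<1>BspMAI
<2>PstI
<3>CTGCA^G
<1>EcoRI
<2>
<3>G^AATTC
<1>PstI
<2>BspMAI
<3>CTGCA^G
"""

groups = group_by_site(parse_records(SAMPLE.splitlines()))
for record in groups["CTGCAG"]:
    print(record[1], "->", first_isoschizomer(record))
```

Run against the real file (for example by passing an open file handle to
parse_records), this would list every enzyme recognizing CTGCAG next to the
first isoschizomer named in its entry, which is the ordering in question.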

And perhaps a more difficult question to answer, as pointed out before
by Guy Bottu: why is restrict preferring BspMAI over the rest of the
isoschizomers?

| I arranged with NEB
|    that they would provide only the 'common' REs in their files.
|    I believe this is what some other packages do. Using REBASEEXTRACT
|    on the withrefm file gives all the REs.

So, the answer would be that rebaseextract does nothing to
mark/tag/select a preferred isoschizomer and instead relies
on the withrefm file to contain only 'preferred'
isoschizomers? As far as I can see the withrefm file
contains all the isoschizomers for each recognition sequence.

Taken from the REBASE README:
... ... ...
#31. All Enzymes (each w/ ref & isos)           withrefm.###
... ... ...
#37. EMBOSS                                     emboss_e.###, emboss_r.###
... ... ...

I also checked all the emboss* files (apparently, REBASE
already provides the same files that rebaseextract
produces?) and they also contain all the isoschizomers, and
not a reduced subset.

However, if this is the case, what's the use of a '-preferred t/f'
option for restrict? There would only be 'preferred' files
in the restriction enzyme database accessed by EMBOSS ...

| 3) You can equate any reported RE to another by adding an entry
|    into embossre.equ   e.g.
|    BspMAI PstI

And I have to do this myself for all enzymes when --
apparently -- it is all already in the withrefm file?
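
If the entries had to be generated rather than typed, something along these
lines might do. It is a sketch only: the one-pair-per-line layout is taken
from the 'BspMAI PstI' example quoted above, the isoschizomer table and the
set of preferred names are illustrative inputs that would have to come from
elsewhere (for instance from the withrefm file and a list of prototypes),
and equivalence_lines is a hypothetical helper, not part of EMBOSS.

```python
def equivalence_lines(isoschizomers, preferred):
    """Build 'reported preferred' pairs in the style of embossre.equ.

    isoschizomers: dict mapping an enzyme name to its isoschizomer names,
                   in the order they are listed in the source file.
    preferred:     set of names that should be reported instead.
    """
    lines = []
    for name in sorted(isoschizomers):
        if name in preferred:
            continue  # already a preferred name, nothing to equate
        for candidate in isoschizomers[name]:
            if candidate in preferred:
                lines.append(f"{name} {candidate}")
                break  # the first preferred isoschizomer wins
    return lines


# Illustrative input only, not real REBASE content.
table = {
    "BspMAI": ["PstI"],
    "PstI": ["BspMAI"],
}
for line in equivalence_lines(table, preferred={"PstI"}):
    print(line)
```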


I hope this helps to find a solution.

In the meantime, a workaround would be to keep at hand
a file listing all commonly used, commercially available
enzymes, and use it like this:

restrict -enzymes @enz.list

Such a list of enzymes may be the one containing enzyme
prototypes (they are called proto.* at the REBASE site,
proto.304 is the current one). I have modified it and used
it successfully as a list.
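
A different way to build such a list would be to keep only the enzymes
whose commercial provider field is not empty. The sketch below assumes that
the (name, suppliers) pairs have already been extracted from the database
file, and that a list file read with '@' holds one enzyme name per line; I
have not checked either assumption against the EMBOSS sources. The function
names and the sample data are hypothetical.

```python
import io


def commercial_names(records):
    """Return sorted, unique names of enzymes with at least one supplier.

    records: iterable of (name, suppliers) pairs, where suppliers is the
             raw text of the commercial provider field (may be empty).
    """
    return sorted({name for name, suppliers in records if suppliers.strip()})


def write_list(names, handle):
    """Write one enzyme name per line to an open text handle."""
    for name in names:
        handle.write(name + "\n")


# Illustrative pairs only; the supplier codes are made up.
records = [("PstI", "N"), ("BspMAI", ""), ("EcoRI", "ABC")]

buffer = io.StringIO()  # stand-in for open("enz.list", "w")
write_list(commercial_names(records), buffer)
print(buffer.getvalue(), end="")
```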

A comparison of what happens when one uses withrefm or the
proto list does not lead to a rapid conclusion. Using
withrefm sometimes gives you a prototype enzyme, even if
there are other isoschizomers, and even if they appear first
(in alphabetical order). I wasn't able to understand what
guides restrict in choosing from the list of available
enzymes.  Looking at the source code was my next step, but
I'm still not knowledgeable enough in C ...



| Alan

F e r n a n   A g u e r o
