[Biojava-l] RestrictionEnzyme can't handle double sites

Jesse jesse-t at chello.nl
Tue Jun 28 04:46:09 EDT 2005


I think a solution requires the RestrictionEnzyme class to be changed.

Maybe change getRecognitionSite() to return an array of Strings or
SymbolLists instead of a single String?

-Jesse


-----------------------------------
mark.schreiber at novartis.com
Wed Jun 22 21:01:12 EDT 2005

What would be your recommended solution to this problem?





"Jesse" <jesse-t at chello.nl>
Sent by: biojava-l-bounces at portal.open-bio.org
06/22/2005 11:05 PM

 
        To:     <biojava-l at biojava.org>
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] RestrictionEnzyme can't handle double sites


Another problem.

Some restriction enzymes have more than one recognition site. Usually this
can be notated by using ambiguous symbols, but for some restriction enzymes
this is not possible because the ambiguous symbols depend on each other.

Usually an ambiguous symbol is something like this:
ANNC
The first "N" is independent of the second "N". For example, it can match
with:
AAAC
AACC
AAGC
AATC
....
....
ATTC
16 possibilities. The ambiguous symbols are independent of each other.

But in some restriction enzymes, the ambiguous symbols are dependent on each
other. So a sequence like
ANNC
would then only match:
AAAC
ACCC
AGGC
ATTC
Only 4 possibilities. The ambiguous symbols are dependent on each other.
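
The difference between the two cases can be illustrated with plain
java.util.regex, outside of BioJava. This is only a sketch: the class and
method names are made up for the illustration, and the back-reference
pattern is one possible way to express "both N must be the same base", not
something RestrictionEnzyme does today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of BioJava.
public class DependentAmbiguityDemo {
    private static final char[] BASES = {'A', 'C', 'G', 'T'};

    // Enumerate every concrete 4-mer of the form A??C (16 sequences).
    static List<String> candidates() {
        List<String> out = new ArrayList<>();
        for (char x : BASES) {
            for (char y : BASES) {
                out.add("A" + x + y + "C");
            }
        }
        return out;
    }

    // Count how many of the 16 candidates the given regex matches in full.
    static long countMatches(String regex) {
        Pattern p = Pattern.compile(regex);
        return candidates().stream()
                .filter(s -> p.matcher(s).matches())
                .count();
    }

    public static void main(String[] args) {
        // Independent ambiguity: each N is chosen separately.
        System.out.println(countMatches("A[ACGT][ACGT]C"));
        // Dependent ambiguity: \1 forces the second N to repeat the first.
        System.out.println(countMatches("A([ACGT])\\1C"));
    }
}
```

The first pattern accepts all 16 candidates, the second only AAAC, ACCC,
AGGC and ATTC.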


This happens with these enzymes:
TaqII
M.PhiBssHII (unknown cut location)
M.Phi3TI (unknown cut location)
M.Rho11sI (unknown cut location)
M.SPBetaI (unknown cut location)
M.SPRI (unknown cut location)

<1>TaqII
<2>
<3>GACCGA(11/9),CACCCA(11/9)
<4>
<5>Thermus aquaticus YTI
<6>J.I. Harris
<7>X
<8>Barker, D., Hoff, M., Oliphant, A., White, R., (1984) Nucleic Acids Res.,
vol. 12, pp. 5567-5581.
Myers, P.A., Roberts, R.J., Unpublished observations.
Rutkowska, S.M., Jaworowska, I., Skowron, P.M., Unpublished observations.


RestrictionEnzymeManager takes only the last recognition site in this example;
it skips GACCGA.

Name: TaqII
RecognitionSite:caccca
ForwardRegex: cac{3}a
ReverseRegex: tg{3}tg
CutType: 0
DownStreamEndType: 0
IsPalindromic: false
DownstreamCut: 17, 15,
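
As a rough sketch of what keeping both sites could look like, the <3> field
can be split on commas and the sites joined into one alternation regex. This
is not how RestrictionEnzymeManager is implemented; the class and method
names below are hypothetical, and the parser assumes each entry is plain
a/c/g/t letters followed by an optional "(n/m)" cut annotation, as in the
TaqII record above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava.
public class RebaseSiteField {
    // One entry: site letters, optionally followed by "(n/m)" cut offsets.
    private static final Pattern ENTRY =
            Pattern.compile("([A-Za-z]+)(?:\\((-?\\d+)/(-?\\d+)\\))?");

    // Return every recognition site in the field, in lower case.
    static List<String> sites(String field) {
        List<String> result = new ArrayList<>();
        for (String part : field.split(",")) {
            Matcher m = ENTRY.matcher(part.trim());
            if (m.matches()) {
                result.add(m.group(1).toLowerCase());
            }
        }
        return result;
    }

    // Reverse complement of an unambiguous a/c/g/t site.
    static String reverseComplement(String site) {
        StringBuilder sb = new StringBuilder();
        for (int i = site.length() - 1; i >= 0; i--) {
            switch (site.charAt(i)) {
                case 'a': sb.append('t'); break;
                case 'c': sb.append('g'); break;
                case 'g': sb.append('c'); break;
                case 't': sb.append('a'); break;
                default: throw new IllegalArgumentException("not a/c/g/t: " + site);
            }
        }
        return sb.toString();
    }

    // Forward-strand pattern that matches any one of the sites.
    static Pattern forwardPattern(List<String> sites) {
        return Pattern.compile(String.join("|", sites));
    }

    public static void main(String[] args) {
        List<String> s = sites("GACCGA(11/9),CACCCA(11/9)");
        System.out.println(s);
        System.out.println(forwardPattern(s).pattern());
    }
}
```

With an alternation, gaccga and caccca are both found, while the mixed form
gaccca is not, which is the dependency a single ambiguous site cannot
express.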


- Jesse


