[Bioperl-l] Restriction Enzyme cuts on Circular plasmids

Fri Oct 31 09:48:10 EST 2003

Yes, you are quite right. In the test tube, probably only one of the 
overlapping sites will be cut in any given molecule, chosen at random. 
We could mimic that behavior, but the results would be very confusing. 
I agree it is better to map the sites rather than cut the sequence, and 
mapping could actually be faster.

Perhaps a better way to do it is to add a new function that returns 
mapped cut sites. I need to look more at the docs, but isn't there 
already a bioperl feature to look for occurrences of one sequence 
within another? For non-ambiguous recognition sequences, using index 
rather than split could be a lot faster and use less memory.
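
As a rough sketch of the index idea (this is not the current 
Bio::Restriction::Analysis code; the name map_cut_sites and its 
arguments are made up for illustration), something like the following 
would map every occurrence of a non-ambiguous site, including 
overlapping ones, by advancing the search one base at a time:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Restriction::Analysis.
# Returns, for each occurrence of an unambiguous recognition site,
# the number of bases that lie before the cut on the top strand.
# $cut_offset is the number of site bases upstream of the cut
# (1 for G^CGCGC).
sub map_cut_sites {
    my ($seq, $site, $cut_offset) = @_;
    ($seq, $site) = (uc $seq, uc $site);
    my @cuts;
    my $pos = index($seq, $site);
    while ($pos != -1) {
        push @cuts, $pos + $cut_offset;
        # Advance by one base, not by the site length, so that
        # overlapping sites are also reported.
        $pos = index($seq, $site, $pos + 1);
    }
    return @cuts;
}

my @cuts = map_cut_sites('GCGCGCGC', 'GCGCGC', 1);
print "@cuts\n";    # prints "1 3"
```

For John's BssH II example this reports both possible cuts, after 
bases 1 and 3. It only covers linear sequences and sites without 
ambiguity codes.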

At the moment the module actually calculates cut sites by fragmenting 
the sequence with split and then calculating the length of each of the 
fragments returned. This is probably not optimal.
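
For comparison, if cut positions were available directly, fragment 
lengths could be derived from the positions alone, without building 
the fragments. A minimal sketch, assuming each cut position is given 
as the number of bases before the cut and using a hypothetical 
fragment_lengths helper; the circular case simply joins the last and 
first pieces:

```perl
use strict;
use warnings;

# Hypothetical helper: fragment lengths from mapped cut positions.
# $cuts     - array ref of cut positions (bases before each cut)
# $seq_len  - total sequence length
# $circular - true if the molecule is circular
sub fragment_lengths {
    my ($cuts, $seq_len, $circular) = @_;
    my @sorted = sort { $a <=> $b } @$cuts;
    return ($seq_len) unless @sorted;    # uncut molecule

    # Internal fragments lie between consecutive cuts.
    my @lengths;
    for my $i (1 .. $#sorted) {
        push @lengths, $sorted[$i] - $sorted[$i - 1];
    }

    my $head = $sorted[0];               # start of sequence to first cut
    my $tail = $seq_len - $sorted[-1];   # last cut to end of sequence
    if ($circular) {
        push @lengths, $tail + $head;    # one fragment spans the origin
    }
    else {
        unshift @lengths, $head;
        push @lengths, $tail;
    }
    return @lengths;
}

print join(' ', fragment_lengths([100, 400], 1000, 0)), "\n";  # 100 300 600
print join(' ', fragment_lengths([100, 400], 1000, 1)), "\n";  # 300 700
```

Note that this assumes every listed site is cut, which will not be 
true in the tube for overlapping sites.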

Maybe this should be a bugzilla bug report....

Rob

On Friday, October 31, 2003, at 08:36  AM, Gray, John wrote:

> After reading some of your comments about how the site recognition is 
> functioning, I am concerned that there may be another problem.  It 
> commonly occurs that restriction enzyme recognition sites will 
> overlap, and I think this may cause your method to miss some sites.  I 
> am wondering whether it may be necessary to separate the process of 
> site mapping and cleavage.
>
> For example, BssH II cuts at G^CGCGC, and the sequence of GCGCGCGC 
> theoretically has two cut sites within it.  Of course, your algorithm 
> is similar to reality in that once the enzyme cuts the sequence once, 
> it probably won't be able to recognize the other site.  However, in 
> the test tube what you will actually get is a random distribution of 
> cutting at the two sites.  Traditionally (at least in the software I 
> have used), the site mapping algorithms have returned all possible cut 
> sites.
>
> I am thinking the only way around this would be to first map the sites 
> into an array, and then use that array to either calculate fragment 
> sizes or sequences.  With the possibility of overlapping sites in 
> mind, I still can't think of any way to circumvent the problem of the 
> origin on circular sequences without concatenating the sequence to 
> simulate circularity.
>
> John
>
> -----Original Message-----
> From: Rob Edwards [mailto:redwards at utmem.edu]
> Sent: Thursday, October 30, 2003 7:53 PM
> To: bioperl-l at portal.open-bio.org
> Subject: Re: [Bioperl-l] Restriction Enzyme cuts on Circular plasmids
>
> The following is a quick patch for Bio/Restriction/Analysis.pm so that
> it handles circular sequences correctly if there is another cut site in
> the region that has been linearized. At the moment it won't handle a
> single cut site at that point (e.g. pBR322 has a single EcoRI site at
> the point it is circularized). I am not sure how to deal with this and
> need to think about it (the fragments are right but the cut sites are
> not).
>
> Can someone submit it for me?
>
> I have submitted a Bugzilla report as #1548
>
> 120c120,121
> < for further analysis. However, this will change the start of the
> ---
>> for further analysis. This fragment will also be checked for cuts
>> by the enzyme(s). However, this will change the start of the
> 737c738,749
> <                 unshift (@re_frags, $last.$first);
> ---
>>               my $newfrag=$last.$first;
>>               my @cuts = split /($beforeseq)($afterseq)/i, $newfrag;
>>               my @newfrags;
>>               if ($#cuts) {
>>                # there is another cut
>>                for (my $i=0; $i<=$#cuts; $i+=2) {push (@newfrags, $cuts[$i].$cuts[$i+1])}
>>               }
>>               else {
>>                # there isn't another cut
>>                push (@newfrags, $newfrag);
>>               }
>>               push @re_frags, @newfrags;
>
>
> Thanks
>
> Rob