[Bioperl-l] Bio::Tools::RestrictionEnzyme
Chris Fields
cjfields at uiuc.edu
Fri Nov 3 18:28:53 UTC 2006
Nick,
Could you file this as a bug?
Chris
On Nov 3, 2006, at 10:29 AM, Staffa, Nick (NIH/NIEHS) wrote:
> The module Bio::Tools::RestrictionEnzyme uses the Perl split function
> to generate the fragments of a digestion, with the recognition pattern
> as the delimiter. It then glues onto the resulting strings the parts
> of the pattern that lie before and after the cut. This is fine for
> unambiguous patterns, but the output starts looking funny for patterns
> with ambiguities, because the ambiguity codes themselves end up in the
> fragments.
> Worse, in a double digest (one enzyme after another), an ambiguity
> code character can mask a true cut site.
> I was using BsaHI [GRCGYC] followed by HpaII [CCGG].
> Below is an example of the CCGG pattern being masked by a Y,
> and the different results of the digestion.
>
>
> CGYCGGCATGTCGATGGTGACCATGTGACAGCACGAGTCACTGCTGCTTTCAAGTTCCGAACAGGAATTA
> GAAA
> As opposed to the real thing:
> CGCCGGCATGTCGATGGTGACCATGTGACAGCACGAGTCACTGCTGCTTTCAAGTTCCGAACAGGAATTA
> GAAA
>
> When cut by HpaII [CCGG], the real sequence yields
> first A_B_frag =
> CGC
>
> instead of the following, which the masked sequence gives:
> first A_B_frag =
>
> CGYCGGCATGTCGATGGTGACCATGTGACAGCACGAGTCACTGCTGCTTTCAAGTTCCGAACAGGAATTA
> GAAAG
> ACTTGCTAGTGCTGTTGGGTCTCC
> TTGACTCTGAGACAATGATAACAATGTTGAAGGTGGTCTAGGCATTTGGGTGCTGTGGAGTTATAAAGAG
> GAAAAG
> AAAAGATAAAACAAAAAAAAATAG
> GAAACAAATGATTAAGCCACTACTAAGGGGTCTAGTCTAATGCCAACTGGGTAAATTCATGGGAACAATG
> TGTGCC
> AGTCTTTAGAAACACTGTTTCATA
> TTGCATATATTATGGCATGGTATTACATTGATTAATTTTACTTTAGAGATGAAGAAGCTGAGATTTGGGG
> TGAATA
> GCAATTATCCCAAAGTCTCTCAGA
> TAGCTGGAGGCAGCAGGGTCTGGGGTATTCACAGTCCCTACTCCATATTGTGTGGTCAGAACCAAATGAG
> ACAGAT
> AAAGGGCAGACAAAAGAGAAAGTG
> GGGAGTATGATTTGAAAATGATGGTGTGACCCAGATTTCTGATGGAAATATCTAATGGCTGCAGACTGGA
> TAGCTG
> TGACCATTTTAGTTACTGAATTCA
> GGAGATCTTATCTCAATGGAGGCATGTTGTCAACCAAAAGCCAGGATAAGCAAGGGTCAGTGTCTAGACA
> TTGGAG
> TAAGGTTTGCCTGGATATTTCCAC
> AGGGAACCAAGTGTCATGGAGTCTTATTCATTGGGAGGTTATCTTTGTTACACACATGGACATATCATCA
> AGCCAG
> CAATTCAGCAAAACTGTCAACACA
> CAAATAGAGATGTATTGACAACGGGGAACCACAAGTCATGCTTATTCCAAGCTAAAGCCCTCATGTGGAA
> CTTGTT
> TTGTATGGCATTTGTCTCATCTAC
> ACATTGATGGGAAGGGTAAAAGGAAGTCTTTGGTGGGATTACAGAAGTCAGTAAAAAAGCAAAAGGAAAG
> ATTTAG
> AAAACAAAGAAAAAGAAAAGGGAG
> GAAAGGAAAAGAAAAAAGATTTCAGAGATCTCAACATCAATTCAGACCAAGGGTGCCTCTTATACTATGT
> CCAAGC
> CAGTAAGTGGGGTTGTTCTTGTTA
> ACTACAGCCATGTATAGAGGTGAACTTCAGGCTCCTGACTGATCCTCTGAGGTAGAAAGTAAACAGTACT
> CTTATG
> ACACACGCAGTTGTTCAGTGCTGA
> CATGAAAATGTCATTGCTTACAGCGCTAGGAGAC
>
>
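The masking described above can be reproduced on a short made-up sequence. This is only a sketch of the split-and-glue behaviour as the post describes it, not the module's actual code: naive_digest, site_to_regex and the test sequence are hypothetical, and the cut offset of 2 (GR^CGYC) is read off the fragment shown in the post.

```perl
use strict;
use warnings;

# Hypothetical helper: only the two ambiguity codes needed for BsaHI.
my %iupac = (R => '[AG]', Y => '[CT]');

sub site_to_regex {
    my ($site) = @_;
    return join '', map { $iupac{$_} // $_ } split //, $site;
}

# Split on the site, then glue the literal site text (ambiguity codes
# included) back onto the fragments, as the post describes.
sub naive_digest {
    my ($seq, $site, $cut) = @_;
    my $re    = site_to_regex($site);
    my @frags = split /$re/, $seq, -1;
    my $head  = substr $site, 0, $cut;    # part of the site before the cut
    my $tail  = substr $site, $cut;       # part of the site after the cut
    for my $i (0 .. $#frags) {
        $frags[$i] .= $head             if $i < $#frags;
        $frags[$i] = $tail . $frags[$i] if $i > 0;
    }
    return @frags;
}

my @naive = naive_digest('AAAGACGCCGGCATG', 'GRCGYC', 2);
print "$_\n" for @naive;

# The second fragment now starts with CGYCGG, so a plain search for
# CCGG no longer finds the HpaII site that is present in the real DNA.
print "HpaII site masked\n" if index($naive[1], 'CCGG') < 0;
```

The real second fragment would be CGCCGGCATG, which does contain CCGG.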
> This subroutine yields, I believe, the true sequence, although I
> don't know how efficient it is. I suspect it is more efficient than
> turning each fragment from the first digestion into a BioPerl
> sequence object before applying the cut_seq method.
>
> # Cut $bigline at every match of $recognition_site; $cutsite is the
> # cut offset within the site. Fragments keep the real bases.
> sub cut_seq {
>     my ($bigline, $recognition_site, $cutsite) = @_;
>     # expanded_string() (not shown in this post) turns the site,
>     # ambiguity codes included, into a regex.
>     my $pat = expanded_string($recognition_site);
>     my @frags;
>     my $start = 0;                       # start of the current fragment
>     while ($bigline =~ /$pat/g) {
>         my $cut = $-[0] + $cutsite;      # cut position in $bigline
>         push @frags, substr($bigline, $start, $cut - $start);
>         $start = $cut;
>         # Resume at the cut; step forward if the cut offset is 0,
>         # otherwise the same site would match forever.
>         pos($bigline) = $cut > $-[0] ? $cut : $-[0] + 1;
>     }
>     # Last fragment
>     push @frags, substr($bigline, $start) if $start < length $bigline;
>     return @frags;
> }
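The double digest from the post can be sketched on plain strings as follows. Everything here is illustrative rather than BioPerl API: cut_string and expand_site are hypothetical names, the IUPAC table covers only a few codes, the sequence is made up, and the cut offsets (GR^CGYC and C^CGG) are taken from the fragments shown in the post.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a few IUPAC ambiguity codes into regex classes.
my %IUPAC = (R => '[AG]', Y => '[CT]', N => '[ACGT]');

sub expand_site {
    my ($site) = @_;
    return join '', map { $IUPAC{$_} // $_ } split //, $site;
}

# Cut a plain string at every occurrence of $site, $cutsite bases into
# the site. Fragments are taken from the input, so they hold real bases.
sub cut_string {
    my ($seq, $site, $cutsite) = @_;
    my $pat   = expand_site($site);
    my @frags;
    my $start = 0;
    while ($seq =~ /$pat/g) {
        my $cut = $-[0] + $cutsite;
        push @frags, substr($seq, $start, $cut - $start);
        $start = $cut;
        pos($seq) = $cut > $-[0] ? $cut : $-[0] + 1;
    }
    push @frags, substr($seq, $start) if $start < length $seq;
    return @frags;
}

my $seq    = 'AAAGACGCCGGCATG';
my @first  = cut_string($seq, 'GRCGYC', 2);                 # BsaHI
my @double = map { cut_string($_, 'CCGG', 1) } @first;      # then HpaII
print "$_\n" for @double;
```

Because the first digest returns the real bases, the second digest can be applied to each fragment directly, with no sequence objects in between.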
>
>
>
>
> Nick Staffa
> Telephone: 919-316-4569 (NIEHS: 6-4569)
> Scientific Computing Support Group
> NIEHS Information Technology Support Services Contract
> (Science Task Monitor: Jack L. Field, field1 at niehs.nih.gov)
> National Institute of Environmental Health Sciences
> National Institutes of Health
> Research Triangle Park, North Carolina
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign