[Bioperl-l] counting gaps in sequence data
Michael S. Robeson II
popgen23 at mac.com
Fri Oct 15 00:55:34 EDT 2004
Wow, that seems to work pretty well. However, I am unsure of what the
following line means:
push @{$gaptype{$gap}}, $-[0] + 2;
Especially the " $-[0] + 2" part of it. I understand that it is an
array but what is going on there is a little vague. Other than that I
pretty much understand the code. Also, not being able to match gaps at
the end of a string will be a problem. I am currently working from what
you've posted to see if I can fit in (using a character class, I
suppose) a "\Z", "\z", or "$" to match any gaps at the end of a line.
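
For reference, a minimal sketch of what "$-[0]" holds and of one way to
pick up terminal gaps. After a successful match, the built-in array @-
holds the 0-based offsets at which the match and its capture groups
start, so $-[0] is where the whole match begins. In the quoted regex the
match begins on the base just before the gap, so +1 steps onto the first
dash and another +1 converts from 0-based to 1-based counting. The
sub name find_gaps and the test sequence below are made up for
illustration; they are not from the posted code. Matching the run of
dashes on its own, with no flanking bases, means the offset only needs
+1 and gaps at either end of the string are found too.

```perl
use strict;
use warnings;

# Return a list of [start, length] pairs, one per run of '-' in $seq.
# start is 1-based. Runs at the very beginning or end are included,
# because the pattern does not require a base on either side.
sub find_gaps {
    my ($seq) = @_;
    my @gaps;
    while ($seq =~ /-+/g) {
        # $-[0] is the 0-based offset where the whole match begins;
        # $+[0] is the offset just past the end of the match.
        push @gaps, [ $-[0] + 1, $+[0] - $-[0] ];
    }
    return @gaps;
}

# Hypothetical sequence with a leading, an internal, and a trailing gap.
my $seq = "--acgtt---cgat-----";
for my $gap (find_gaps($seq)) {
    my ($start, $length) = @$gap;
    print "Gap of length $length beginning at position $start\n";
}
```

To keep the per-length loop from the quoted code instead, a pattern
along the lines of /(?<!-)-{$gap}(?!-)/g should also match terminal
gaps, since the lookarounds only check that no further dash is adjacent.
In that case the match starts on the gap itself, so the position would
be $-[0] + 1 rather than $-[0] + 2.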
-Cheers!
-Mike
On Oct 14, 2004, at 17:32, Barry Moore wrote:
> Mike-
>
> Something like this maybe?
>
> use strict;
> use warnings;
>
> my %seqs = (human => "acgtt---cgatacg---acgact-----t",
>             chimp => "acgtacgatac---actgca---ac",
>             mouse => "acgata---acgatcg----acgt");
>
> for my $seq (keys %seqs) { # Loop over the names in your hash of sequences
>   print "\n\nThe $seq sequence has the following gaps:\n";
>   my %gaptype;
>   for my $gap (1..5) { # 5 or however large you want gaps to be counted
>     # Notice that this won't catch terminal gaps.
>     while ($seqs{$seq} =~ /[atgc]-{$gap}[atgc]/g) {
>       # This creates a hash of arrays. The arrays hold the locations of
>       # the gaps, and the count of each gap type is determined by the
>       # length of that array.
>       push @{$gaptype{$gap}}, $-[0] + 2;
>     }
>     if ($gaptype{$gap}) {
>       my $positions = join ", ", @{$gaptype{$gap}};
>       print "\tGap length $gap beginning at positions:\t$positions\n";
>     }
>   }
> }
>
> Barry Moore
>
>
> Michael Robeson wrote: