[Bioperl-l] HOWTO: take a slice of a split location
Cook, Malcolm
MEC at stowers-institute.org
Sat Dec 10 02:06:03 EST 2005
Fellow Bioperlers,
I needed to extract the 3'-most 1000 bp from multiple genomic CDS regions (for designing 70mer u-array probes).
I looked in vain for Bio::Location->slice($from,$to);
So I wrote one, which works but suffers from actually materializing the list of integer indices into the sequence for every base.
Has anyone a better approach they'd care to share?
Malcolm Cook - mec at stowers-institute.org
Stowers Institute for Medical Research - Kansas City, MO USA
P.S. Here's what I wrote:
package Bio::LocationI;    # Code in the interface so it works
                           # with both ::Split and ::Simple
                           # Bio::Locations
use Bio::Location::Simple; # needed by slice() below
use Bio::Location::Split;

sub _intspans {
    # Purpose: for a (presumably) monotonically increasing list of
    # integers, return list of arrays each holding min and max of
    # the list's internal contiguous spans.
    #
    # Example: 1..5,10..20,30 => ([1,5],[10,20],[30,30])
    my @i = @_;
    die "nothing passed to _intspans" unless @i;
    my $first = shift @i;
    my @s = ([$first, $first]);      # most recent span is kept at the front
    foreach (@i) {
        if ($_ == 1 + $s[0][1]) {
            $s[0][1] = $_;           # extend the current span
        } else {
            unshift @s, [$_, $_];    # start a new span
        }
    }
    return reverse @s;               # restore ascending order
}

sub slice {
    # Purpose: compute a slice of the Location, using Perl's normal slice
    # semantics, except that it trims out-of-range values.
    my ($self, $from, $to) = @_;
    # Expand every sub-location into its individual positions.
    my @int = map { $_->start .. $_->end } $self->each_Location;
    @int = @int[$from .. $to];
    @int = grep { defined } @int;    # Remove undefs (in case $from/$to out of bounds).
    my @intspans = _intspans(@int);
    return Bio::Location::Split->new(
        -strand    => $self->strand,
        -locations => [
            map {
                Bio::Location::Simple->new(
                    -start  => $_->[0],
                    -end    => $_->[1],
                    -strand => $self->strand,
                )
            } @intspans
        ],
    );
}
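
One possible way to avoid materializing every index is to walk the sub-locations and clip each one against the requested offsets. The sketch below is only an illustration under assumptions: slice_spans is a hypothetical helper (not part of BioPerl), it works on plain [start, end] pairs rather than Bio::Location objects, and it resolves a negative $from or $to independently against the total length, so a mixed-sign request such as (-3, 2) behaves differently from the Perl range used in slice() above. Strand is not considered.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): slice a list of [start, end]
# spans by 0-based offsets into their concatenation, without expanding
# the spans into individual positions.
sub slice_spans {
    my ($from, $to, @spans) = @_;

    # Total number of positions covered by all spans.
    my $total = 0;
    $total += $_->[1] - $_->[0] + 1 for @spans;

    # Assumption: negative offsets count from the end, each on its own.
    $from += $total if $from < 0;
    $to   += $total if $to   < 0;

    # Trim out-of-range offsets rather than producing undefs.
    $from = 0          if $from < 0;
    $to   = $total - 1 if $to > $total - 1;
    return () if $from > $to;

    my @out;
    my $offset = 0;    # offset of the current span's first position
    for my $span (@spans) {
        my ($start, $end) = @$span;
        my $len  = $end - $start + 1;
        my $last = $offset + $len - 1;
        # Overlap of the requested offsets with this span's offsets.
        my $lo = $from > $offset ? $from : $offset;
        my $hi = $to   < $last   ? $to   : $last;
        push @out, [ $start + $lo - $offset, $start + $hi - $offset ]
            if $lo <= $hi;
        $offset += $len;
    }
    return @out;
}

# The last 3 positions of 1..5,10..20,30:
my @tail = slice_spans(-3, -1, [1, 5], [10, 20], [30, 30]);
print join(',', map { "$_->[0]..$_->[1]" } @tail), "\n";
```

Each returned pair could then be wrapped in a Bio::Location::Simple and collected into a Bio::Location::Split, as slice() does with the output of _intspans. The work is proportional to the number of sub-locations rather than the number of bases.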