[Bioperl-l] HOWTO: take a slice of a split location
Jason Stajich
jason.stajich at duke.edu
Sat Dec 10 13:11:57 EST 2005
Hi Malcolm -
I haven't had a chance to look at your code, but my approach to this
problem would be to first splice the sequence out of the genome:
my $feature = Bio::SeqFeature::Generic->new(-location => $splitlocation);
my $cdsseq = $feature->spliced_seq;
Then just retrieve the last 1000 bases of this sequence:
my $threeprime = $cdsseq->subseq($cdsseq->length - 999, $cdsseq->length);
(subseq coordinates are 1-based and inclusive, so starting at
length - 1000 would return 1001 bases.)
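The coordinate arithmetic for "the last N bases" can be checked in plain Perl, with a string standing in for the spliced sequence object. This is only a sketch: `tail_coords` is a hypothetical helper, not part of BioPerl, and the clamping for sequences shorter than N is an assumption about what you would want in that case.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the 1-based, inclusive
# (start, end) pair covering the last $n bases of a sequence of length $len.
sub tail_coords {
    my ($len, $n) = @_;
    die "empty sequence\n" unless $len > 0;
    my $start = $len - $n + 1;      # inclusive coordinates, hence the +1
    $start = 1 if $start < 1;       # clamp when the sequence is shorter than $n
    return ($start, $len);
}

my $cds = 'ATG' . ('ACGT' x 300) . 'TAA';    # 1206-base stand-in for the spliced CDS
my ($start, $end) = tail_coords(length $cds, 1000);

# substr is 0-based and takes a length, so convert from 1-based inclusive.
my $threeprime = substr($cds, $start - 1, $end - $start + 1);
printf "%d..%d => %d bases\n", $start, $end, length $threeprime;
```

The same ($start, $end) pair could be passed to subseq on the spliced sequence object, since subseq also takes 1-based inclusive coordinates.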
There is also a module to map between coordinates -
Bio::Coordinate::GeneMapper if you need to go from transcript to
genomic coordinates.
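For the transcript-to-genomic direction, the underlying arithmetic can be sketched without listing every base: walk the exon spans and track how many spliced bases have been consumed so far. The name `slice_spans` and the [start, end] pair representation are assumptions for illustration; this is not the Bio::Coordinate::GeneMapper API, and it assumes forward-strand spans sorted in ascending order.

```perl
use strict;
use warnings;

# Hypothetical helper: given exon spans as [start, end] pairs (1-based,
# inclusive, ascending, forward strand) and a 0-based inclusive offset range
# ($from, $to) in spliced coordinates, return the genomic spans covering it.
# Offsets outside the spliced length are trimmed; negative offsets are
# clamped to 0 rather than counted from the end.
sub slice_spans {
    my ($spans, $from, $to) = @_;
    my @out;
    my $consumed = 0;                        # spliced bases before this exon
    $from = 0 if $from < 0;
    for my $span (@$spans) {
        my ($start, $end) = @$span;
        my $len   = $end - $start + 1;
        my $first = $consumed;               # spliced offset of exon's first base
        my $last  = $consumed + $len - 1;    # spliced offset of exon's last base
        $consumed += $len;
        next if $last < $from;               # exon lies before the slice
        last if $first > $to;                # exon lies after the slice
        my $lo = $from > $first ? $from : $first;
        my $hi = $to   < $last  ? $to   : $last;
        push @out, [ $start + ($lo - $first), $start + ($hi - $first) ];
    }
    return @out;
}

# Three exons of 5, 11, and 1 bases (17 spliced bases in total).
my @exons = ([1, 5], [10, 20], [30, 30]);

# Last 8 spliced bases: offsets 9..16.
my @tail = slice_spans(\@exons, 17 - 8, 16);
print join(',', map { "$_->[0]..$_->[1]" } @tail), "\n";   # prints 14..20,30..30
```

The work is proportional to the number of exons rather than the number of bases, and each returned pair could be turned into a simple location object.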
-jason
On Dec 10, 2005, at 2:06 AM, Cook, Malcolm wrote:
> Fellow Bioperlers,
>
> I was in need of extracting the 3'-most 1000 bp from multiple
> genomic CDS regions (designing 70mer u-array probes).
>
> I looked in vain for Bio::Location->splice($from,$to);
>
> So I wrote one which works but suffers from actually materializing
> the list of integer indices into the sequence for every base.
>
> Has anyone a better approach they'd care to share?
>
> Malcolm Cook - mec at stowers-institute.org
> Stowers Institute for Medical Research - Kansas City, MO USA
>
> P.S. Here's what I wrote:
>
> package Bio::LocationI; # Code in the interface so it works
> # with both ::Split and ::Simple
> # Bio::Locations
>
> sub _intspans {
>     # Purpose: for a (presumably) monotonically increasing list of
>     # integers, return list of arrays each holding min and max of
>     # the list's internal contiguous spans.
>     #
>     # Example: 1..5,10..20,30 => ([1,5],[10,20],[30,30])
>     my @i = @_;
>     die "nothing passed to intspans" unless @i;
>     my @s = ([$i[0], shift(@i)]);
>     foreach (@i) {
>         if ($_ == 1 + $s[0][1]) {
>             $s[0][1] = $_;
>         } else {
>             unshift @s, [$_, $_];
>         }
>     }
>     return reverse @s;
> }
>
> sub slice {
>     # Purpose: compute a slice of the Location, using Perl's normal slice
>     # semantics, except that it trims out-of-range values.
>     my ($self, $from, $to) = @_;
>     # Expand every sub-location into its integer positions using the
>     # range (..) operator.
>     my @int = map { $_->start .. $_->end } $self->each_Location;
>     @int = @int[$from .. $to];
>     # Remove undefs (in case $from/$to are out of bounds).
>     @int = grep { defined } @int;
>     my @intspans = _intspans(@int);
>     return Bio::Location::Split->new(
>         -strand    => $self->strand,
>         -locations => [
>             map {
>                 Bio::Location::Simple->new(
>                     -start  => $_->[0],
>                     -end    => $_->[1],
>                     -strand => $self->strand,
>                 )
>             } @intspans
>         ],
>     );
> }
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
--
Jason Stajich
Duke University
http://www.duke.edu/~jes12