[Bioperl-l] Alignment->slice() issue?
Kevin Brown
Kevin.M.Brown at asu.edu
Thu Jan 18 18:08:18 UTC 2007
Never mind, it looks like I found the issue. The alignment object needs
the sequences padded so that they line up (even though each sequence
already has a start and stop value in the alignment). I was trying to
speed up my padding method, and it wasn't fully filling out the
sequences. So I wrote my own splice function, so that the perl
interpreter doesn't have to pad some sequences with as many as
3,000,000 '.' characters.
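
For anyone hitting the same thing, here is a minimal sketch of the idea
of slicing by coordinates instead of padding. It is not my actual
function and not BioPerl API: the record layout (a hash with id, start,
end, and seq keys) and the name slice_by_coords are made up for
illustration, and it assumes ungapped hits with 1-based, inclusive
coordinates on the reference.

```perl
use strict;
use warnings;

# Hypothetical record layout (not a BioPerl structure): each hit stores its
# ungapped sequence string plus 1-based, inclusive start/end coordinates on
# the reference.
sub slice_by_coords {
    my ($hits, $from, $to) = @_;
    die "from ($from) must not exceed to ($to)\n" if $from > $to;

    my @out;
    for my $hit (@$hits) {
        # Skip hits that do not overlap the requested window at all.
        next if $hit->{end} < $from || $hit->{start} > $to;

        # Clip the window to the part this hit actually covers.
        my $s = $hit->{start} > $from ? $hit->{start} : $from;
        my $e = $hit->{end}   < $to   ? $hit->{end}   : $to;

        push @out, {
            id    => $hit->{id},
            start => $s,
            end   => $e,
            # Offset into the stored string; no padding is ever built.
            seq   => substr($hit->{seq}, $s - $hit->{start}, $e - $s + 1),
        };
    }
    return \@out;
}

# Toy data standing in for the chromosome and two primer hits.
my @hits = (
    { id => 'reference', start => 1,  end => 20, seq => 'ACGTACGTACGTACGTACGT' },
    { id => 'primerA',   start => 5,  end => 8,  seq => 'ACGT' },
    { id => 'primerB',   start => 15, end => 18, seq => 'GTAC' },
);

my $window = slice_by_coords(\@hits, 6, 16);
printf "%s %d %d %s\n", @{$_}{qw(id start end seq)} for @$window;
```

The work per hit depends only on the size of the overlap, so a short
primer hit near the end of the chromosome costs the same as one near
the beginning.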
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown
> Sent: Wednesday, January 17, 2007 9:17 AM
> To: bioperl-l list
> Subject: [Bioperl-l] Alignment->slice() issue?
>
> Bioperl: 1.5.2_100
> Perl: perl -v
> This is perl, v5.8.5 built for i386-linux-thread-multi
>
> I'm hoping this is just me, but I've created a huge alignment of a set
> of primers on a chromosome, and I'm trying to slice that one large
> alignment into smaller alignments based around the CDS features of
> the chromosome (taken from a GenBank file that the script read in
> previously, which gives me both the features and the chromosome
> sequence).
> The error occurs when I request the slice. I get the following:
>
> ------------- EXCEPTION -------------
> MSG: Bad start,end parameters. Start [1088] has to be less than end
> [850]
> STACK Bio::PrimarySeq::subseq
> /usr/lib/perl5/site_perl/5.8.5/Bio/PrimarySeq.pm:354
> STACK Bio::SimpleAlign::slice
> /usr/lib/perl5/site_perl/5.8.5/Bio/SimpleAlign.pm:929
> STACK toplevel ./PrimerAnalysis.pl:376
>
> --------------------------------------
>
> But based on debug output I added to my script, that isn't the range I
> requested from the alignment. What I requested is
> $align = $alignments{$key}->slice($start, $stop);
> where $start is 1088 and $stop is 2377 (from the printout below):
> "Forward strand with start(1088) and stop(2377) at ./PrimerAnalysis.pl
> line 358, <$primer> line 657."
>
> The feature I'm initially after is BMAA0001 start:1139 stop:2326 with
> some upstream and downstream sequence.
>
> I noticed that slice does "foreach my $seq ( $self->each_seq() )", so I
> copied that to print out all the sequences held by the alignment with
> their start and stop locations, and got the following:
> NC_006349 1 2325379
> BurkM_0005_a-f..BurkM_0005_a-r 80686 81516
> BurkM_0005_a-f..BurkM_0005_a-r 268747 269577
> BurkM_0005_a-f..BurkM_0005_a-r 329852 330682
> BurkM_0005_a-f..BurkM_0005_a-r 560818 561648
> BurkM_0005_a-f..BurkM_0005_a-r 592443 593273
> BurkM_0005_a-f..BurkM_0005_a-r 908245 909075
> BurkM_0005_a-f..BurkM_0005_a-r 935390 936220
> BurkM_0005_a-f..BurkM_0005_a-r 1014714 1015544
> BurkM_0005_a-f..BurkM_0005_a-r 1034315 1035145
> BurkM_0005_a-f..BurkM_0005_a-r 1225934 1226764
> BurkM_0005_a-f..BurkM_0005_a-r 1324779 1325609
> BurkM_0005_a-f..BurkM_0005_a-r 1413075 1413905
> BurkM_0005_a-f..BurkM_0005_a-r 1480717 1481547
> BurkM_0005_a-f..BurkM_0005_a-r 1517965 1518795
> BurkM_0005_a-f..BurkM_0005_a-r 1900786 1901616
> BurkM_0005_a-f..BurkM_0005_a-r 1921906 1922736
> BurkM_0005_a-f..BurkM_0005_a-r 1957979 1958809
> BurkM_0005_a-f..BurkM_0005_a-r 2136301 2137131
> BurkM_0005_a-r..BurkM_0005_a-f 103238 104068
> BurkM_0005_a-r..BurkM_0005_a-f 170641 171471
> BurkM_0005_a-r..BurkM_0005_a-f 408755 409585
> BurkM_0005_a-r..BurkM_0005_a-f 432906 433736
> BurkM_0005_a-r..BurkM_0005_a-f 509458 510288
> BurkM_0005_a-r..BurkM_0005_a-f 565194 566024
> BurkM_0005_a-r..BurkM_0005_a-f 656754 657584
> BurkM_0005_a-r..BurkM_0005_a-f 733927 734757
> BurkM_0005_a-r..BurkM_0005_a-f 838705 839535
> BurkM_0005_a-r..BurkM_0005_a-f 869777 870607
> BurkM_0005_a-r..BurkM_0005_a-f 892021 892851
> BurkM_0005_a-r..BurkM_0005_a-f 909903 910733
> BurkM_0005_a-r..BurkM_0005_a-f 1061801 1062631
> BurkM_0005_a-r..BurkM_0005_a-f 1096777 1097607
> BurkM_0005_a-r..BurkM_0005_a-f 1636356 1637186
> BurkM_0005_a-r..BurkM_0005_a-f 1636356 1643935
> BurkM_0005_a-r..BurkM_0005_a-f 1643105 1643935
> BurkM_0005_a-r..BurkM_0005_a-f 1790703 1791533
> BurkM_0005_a-r..BurkM_0005_a-f 2267109 2267939
> BurkM_0005_a-f..BurkM_0005_a-f 560818 566024
> BurkM_0005_a-f..BurkM_0005_a-f 908245 910733
> BMA_0006_a-r..BMA_0006_a-r 561646 565196
> BMA_0006_a-r..BMA_0006_a-r 909073 909905
> BMA_0046_a-f..BMA_0046_a-r 437921 438661
> BurkM_0092_a-f..BurkM_0092_a-f 561670 565172
> BurkM_0092_a-f..BurkM_0092_a-f 909097 909881
> BMA_0113_a-f..BMA_0113_a-r 1310782 1311536
> BMA_0113_a-r..BMA_0113_a-f 172284 173038
> BMA_0113_a-r..BMA_0113_a-f 2266197 2266951
> BMA_0146_a-f..BMA_0146_a-r 1172194 1173065
> BMA_0146_a-f..BMA_0146_a-r 2267012 2269123
> BMA_0146_a-r..BMA_0146_a-f 167410 168281
> BMA_0146_a-r..BMA_0146_a-f 320180 321051
> BMA_0146_a-r..BMA_0146_a-f 894226 895097
> BMA_0146_a-r..BMA_0146_a-f 894226 900207
> BMA_0146_a-r..BMA_0146_a-f 899335 900207
> BMA_0146_a-r..BMA_0146_a-f 1638747 1639622
> BMA_0146_a-r..BMA_0146_a-f 1972415 1973286
> BMA_0146_a-r..BMA_0146_a-f 2157899 2158770
> BMA_0146_a-r..BMA_0146_a-f 2321169 2322040
>
> So, I can see that all the sequences held in the alignment have
> Start < Stop as expected. What I can't figure out is where the end
> value that is messing this up is coming from.
>
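
Since incomplete padding turned out to be the cause, a quick sanity
check before slicing is to confirm that every row of the alignment has
the same string length. The sketch below is plain Perl on id/string
pairs; short_rows is a hypothetical helper, not part of BioPerl, and the
toy rows are invented. How slice() ends up with an end smaller than the
start for a short row is not shown here; the check only finds the rows
that were not padded out.

```perl
use strict;
use warnings;

# Hypothetical helper: given id => aligned-string pairs, return the ids whose
# string is shorter than the longest one, i.e. rows that were not padded out
# to the full alignment width.
sub short_rows {
    my (%rows) = @_;

    # The alignment width is taken to be the longest row.
    my $width = 0;
    for my $str (values %rows) {
        $width = length($str) if length($str) > $width;
    }

    my @short = grep { length($rows{$_}) < $width } keys %rows;
    return sort @short;
}

# Toy rows: primerB is missing its trailing padding.
my %rows = (
    reference => 'ACGTACGTAC',
    primerA   => '..GTAC....',
    primerB   => '....ACGT',
);

print "not fully padded: $_\n" for short_rows(%rows);
```
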
> Any help is greatly appreciated.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>