[Bioperl-l] Alignment->slice() issue?

Kevin Brown Kevin.M.Brown at asu.edu
Thu Jan 18 18:08:18 UTC 2007


Never mind, it looks like I found the issue.  The alignment object needs
the sequences to be padded so that they line up (even though each one
already has a start and stop value in the alignment).  I was trying to
speed up the pad method, and it wasn't fully filling the sequences out.
So I created my own splice function, so that the Perl interpreter doesn't
have to pad some sequences with as many as 3,000,000 '.' characters.
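
A minimal sketch of the idea described above: slicing by reference
coordinates directly, so that no sequence ever has to be padded out to the
full alignment length.  This is not the actual function from the script.
The record layout (id, start, end, seq), the name slice_unpadded and the
toy data are all assumptions made for illustration, and the sketch assumes
each sequence is stored ungapped.

```perl
use strict;
use warnings;

# Hypothetical record layout (an assumption, not BioPerl's internal format):
#   { id => ..., start => ..., end => ..., seq => ... }
# start/end are 1-based reference coordinates; seq holds only the residues
# between them, with no leading or trailing '.' padding and no internal gaps.
sub slice_unpadded {
    my ($records, $start, $stop) = @_;
    die "start ($start) must not exceed stop ($stop)\n" if $start > $stop;

    my @slice;
    for my $rec (@$records) {
        # Skip sequences that lie entirely outside the requested window.
        next if $rec->{end} < $start || $rec->{start} > $stop;

        # Clip the window to the part this sequence actually covers.
        my $from = $rec->{start} > $start ? $rec->{start} : $start;
        my $to   = $rec->{end}   < $stop  ? $rec->{end}   : $stop;

        push @slice, {
            id    => $rec->{id},
            start => $from,
            end   => $to,
            # Offset into the unpadded string, not into the full alignment.
            seq   => substr($rec->{seq}, $from - $rec->{start}, $to - $from + 1),
        };
    }
    return \@slice;
}

# Toy data: a 20-base reference and two short hits (made-up values).
my @records = (
    { id => 'ref',   start => 1,  end => 20, seq => 'ACGTACGTACGTACGTACGT' },
    { id => 'hit_a', start => 5,  end => 8,  seq => 'ACGT' },
    { id => 'hit_b', start => 15, end => 18, seq => 'GTAC' },
);

my $slice = slice_unpadded(\@records, 4, 10);
printf "%s\t%d\t%d\t%s\n", @{$_}{qw(id start end seq)} for @$slice;
```

Each record costs only a substr on its own short string, so the work
depends on the length of the hits rather than on the length of the
chromosome.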

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown
> Sent: Wednesday, January 17, 2007 9:17 AM
> To: bioperl-l list
> Subject: [Bioperl-l] Alignment->slice() issue?
> 
> Bioperl: 1.5.2_100
> Perl: perl -v
> This is perl, v5.8.5 built for i386-linux-thread-multi
> 
> I'm hoping this is just me, but I've created a huge alignment of a set
> of primers on a chromosome and then I'm trying to slice up that one
> large alignment into smaller alignments based around the CDS features
> of the chromosome (taken from a Genbank file that the script read in
> previously that gives me both the features and the chromosome
> sequence).  The error occurs when I request the slice.  I get the
> following:
> 
> ------------- EXCEPTION  -------------
> MSG: Bad start,end parameters. Start [1088] has to be less than end [850]
> STACK Bio::PrimarySeq::subseq
> /usr/lib/perl5/site_perl/5.8.5/Bio/PrimarySeq.pm:354
> STACK Bio::SimpleAlign::slice
> /usr/lib/perl5/site_perl/5.8.5/Bio/SimpleAlign.pm:929
> STACK toplevel ./PrimerAnalysis.pl:376
> 
> --------------------------------------
> 
> But, based on output I've added to my script, that isn't the range I
> requested from the alignment.  What I requested is
> $align = $alignments{$key}->slice($start, $stop);
> where $start is 1088 and $stop is 2377 (from the printout below):
> "Forward strand with start(1088) and stop(2377) at ./PrimerAnalysis.pl
> line 358, <$primer> line 657."
> 
> The feature I'm initially after is BMAA0001 start:1139 stop:2326 with
> some upstream and downstream sequence.
> 
> I noticed that slice does "foreach my $seq ( $self->each_seq() )", so I
> copied that to print out all the sequences held by the alignment and
> their start and stop locations, and got the following:
> NC_006349       1       2325379
> BurkM_0005_a-f..BurkM_0005_a-r  80686   81516
> BurkM_0005_a-f..BurkM_0005_a-r  268747  269577
> BurkM_0005_a-f..BurkM_0005_a-r  329852  330682
> BurkM_0005_a-f..BurkM_0005_a-r  560818  561648
> BurkM_0005_a-f..BurkM_0005_a-r  592443  593273
> BurkM_0005_a-f..BurkM_0005_a-r  908245  909075
> BurkM_0005_a-f..BurkM_0005_a-r  935390  936220
> BurkM_0005_a-f..BurkM_0005_a-r  1014714 1015544
> BurkM_0005_a-f..BurkM_0005_a-r  1034315 1035145
> BurkM_0005_a-f..BurkM_0005_a-r  1225934 1226764
> BurkM_0005_a-f..BurkM_0005_a-r  1324779 1325609
> BurkM_0005_a-f..BurkM_0005_a-r  1413075 1413905
> BurkM_0005_a-f..BurkM_0005_a-r  1480717 1481547
> BurkM_0005_a-f..BurkM_0005_a-r  1517965 1518795
> BurkM_0005_a-f..BurkM_0005_a-r  1900786 1901616
> BurkM_0005_a-f..BurkM_0005_a-r  1921906 1922736
> BurkM_0005_a-f..BurkM_0005_a-r  1957979 1958809
> BurkM_0005_a-f..BurkM_0005_a-r  2136301 2137131
> BurkM_0005_a-r..BurkM_0005_a-f  103238  104068
> BurkM_0005_a-r..BurkM_0005_a-f  170641  171471
> BurkM_0005_a-r..BurkM_0005_a-f  408755  409585
> BurkM_0005_a-r..BurkM_0005_a-f  432906  433736
> BurkM_0005_a-r..BurkM_0005_a-f  509458  510288
> BurkM_0005_a-r..BurkM_0005_a-f  565194  566024
> BurkM_0005_a-r..BurkM_0005_a-f  656754  657584
> BurkM_0005_a-r..BurkM_0005_a-f  733927  734757
> BurkM_0005_a-r..BurkM_0005_a-f  838705  839535
> BurkM_0005_a-r..BurkM_0005_a-f  869777  870607
> BurkM_0005_a-r..BurkM_0005_a-f  892021  892851
> BurkM_0005_a-r..BurkM_0005_a-f  909903  910733
> BurkM_0005_a-r..BurkM_0005_a-f  1061801 1062631
> BurkM_0005_a-r..BurkM_0005_a-f  1096777 1097607
> BurkM_0005_a-r..BurkM_0005_a-f  1636356 1637186
> BurkM_0005_a-r..BurkM_0005_a-f  1636356 1643935
> BurkM_0005_a-r..BurkM_0005_a-f  1643105 1643935
> BurkM_0005_a-r..BurkM_0005_a-f  1790703 1791533
> BurkM_0005_a-r..BurkM_0005_a-f  2267109 2267939
> BurkM_0005_a-f..BurkM_0005_a-f  560818  566024
> BurkM_0005_a-f..BurkM_0005_a-f  908245  910733
> BMA_0006_a-r..BMA_0006_a-r      561646  565196
> BMA_0006_a-r..BMA_0006_a-r      909073  909905
> BMA_0046_a-f..BMA_0046_a-r      437921  438661
> BurkM_0092_a-f..BurkM_0092_a-f  561670  565172
> BurkM_0092_a-f..BurkM_0092_a-f  909097  909881
> BMA_0113_a-f..BMA_0113_a-r      1310782 1311536
> BMA_0113_a-r..BMA_0113_a-f      172284  173038
> BMA_0113_a-r..BMA_0113_a-f      2266197 2266951
> BMA_0146_a-f..BMA_0146_a-r      1172194 1173065
> BMA_0146_a-f..BMA_0146_a-r      2267012 2269123
> BMA_0146_a-r..BMA_0146_a-f      167410  168281
> BMA_0146_a-r..BMA_0146_a-f      320180  321051
> BMA_0146_a-r..BMA_0146_a-f      894226  895097
> BMA_0146_a-r..BMA_0146_a-f      894226  900207
> BMA_0146_a-r..BMA_0146_a-f      899335  900207
> BMA_0146_a-r..BMA_0146_a-f      1638747 1639622
> BMA_0146_a-r..BMA_0146_a-f      1972415 1973286
> BMA_0146_a-r..BMA_0146_a-f      2157899 2158770
> BMA_0146_a-r..BMA_0146_a-f      2321169 2322040
> 
> So, I can see that all the sequences held in the alignment have
> Start < Stop as expected.  What I can't figure out is where the end
> value that is messing this up is coming from.
> 
> Any help is greatly appreciated.
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 



