[Bioperl-l] Bio::Coordinate::GeneMapper cds to peptide bug
Heikki Lehvaslaiho
heikki.lehvaslaiho at gmail.com
Wed May 12 10:23:03 UTC 2010
Outch. I'll definitely have a look.
Strange that none of the tests have picked this up...
-Heikki
Heikki Lehvaslaiho - skype:heikki_lehvaslaiho
cell: +966 545 595 849 office: +966 2 808 2429
Computational Bioscience Research Centre (CBRC), Building #2, Office #4216
4700 King Abdullah University of Science and Technology (KAUST)
Thuwal 23955-6900, Kingdom of Saudi Arabia
On 12 May 2010 01:40, Aaron Mackey <amackey at virginia.edu> wrote:
> Hi Chris,
>
> I was hoping Heikki might take up the cause and investigate further --
> let's
> give him a chance to respond. But it's a frightening bug if it's really
> been that way for all this time ...
>
> -Aaron
>
> On Tue, May 11, 2010 at 6:31 PM, Chris Fields <cjfields at illinois.edu>
> wrote:
>
> > Aaron,
> >
> > Do we want to write this up as a set of tests to add to the bioperl test
> > suite? We can probably add it after the github migration tomorrow.
> >
> > chris
> >
> > On May 11, 2010, at 4:26 PM, Aaron Mackey wrote:
> >
> > > Hi Zerui (and others),
> > >
> > > I've confirmed there seems to be a bug in Bio::Coordinate::GeneMapper,
> > > specifically in this code:
> > >
> > > lines:
> > > 1170: (-start => int ($loc->start / 3 ) +1,
> > > 1171: -end => int ($loc->end / 3 ) +1,
> > >
> > > both of those lines should look like: int (($loc->start - 1) / 3) + 1
> > >
> > > otherwise for CDS coordinates 1, 2, 3, 4, 5, 6, you get incorrect
> peptide
> > > positions 1, 1, 2, 2, 2, 3 (when you instead want 1, 1, 1, 2, 2, 2)
> > >
> > > There is also a problem when mapping exon coordinates that are
> > outside/after
> > > the CDS region (instead of getting undefined locations, you continue to
> > get
> > > peptide coordinates, but they are invalid, larger than the protein
> > length).
> > >
> > > Dennis and fringy -- this may affect the SNPtab.pl script I wrote for
> > you,
> > > as it uses this module to calculate codons for SNPs.
> > >
> > > -Aaron
> > >
> > > P.S. a script the demonstrates the problem:
> > >
> > > use Bio::Coordinate::GeneMapper;
> > >
> > > my $mapper =
> > > Bio::Coordinate::GeneMapper
> > > ->new( -in => "chr",
> > > -out => "propeptide",
> > > -exons => [ Bio::Location::Simple
> > > ->new( -start => 101,
> > > -end => 109 ),
> > > Bio::Location::Simple
> > > ->new( -start => 201,
> > > -end => 221 ),
> > > ],
> > > -cds => Bio::Location::Simple
> > > ->new(-start => 101, -end => 209),
> > > );
> > >
> > >
> > > print join("\t", "chr", "aa"), "\n";
> > > for my $pos (99..111,199..211) {
> > > my $res = $mapper->map(
> > > Bio::Location::Simple->new(-start => $pos, -end => $pos, -seq_id =>
> > 1));
> > > my $start = $res->start; $start = "NA" unless defined $start;
> > > my $end = $res->end; $end = "NA" unless defined $end;
> > > print join("\t", $pos, $start), "\n";
> > > }
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
More information about the Bioperl-l
mailing list