[Bioperl-l] Bio::Coordinate::GeneMapper cds to peptide bug
Aaron Mackey
amackey at virginia.edu
Tue May 11 21:26:50 UTC 2010
Hi Zerui (and others),
I've confirmed there seems to be a bug in Bio::Coordinate::GeneMapper,
specifically in this code:
lines:
1170: (-start => int ($loc->start / 3 ) +1,
1171: -end => int ($loc->end / 3 ) +1,
both of those lines should look like: int (($loc->start - 1) / 3) + 1
otherwise for CDS coordinates 1, 2, 3, 4, 5, 6, you get incorrect peptide
positions 1, 1, 2, 2, 2, 3 (when you instead want 1, 1, 1, 2, 2, 2)
There is also a problem when mapping exon coordinates that are outside/after
the CDS region (instead of getting undefined locations, you continue to get
peptide coordinates, but they are invalid, larger than the protein length).
Dennis and fringy -- this may affect the SNPtab.pl script I wrote for you,
as it uses this module to calculate codons for SNPs.
-Aaron
P.S. a script the demonstrates the problem:
use Bio::Coordinate::GeneMapper;
my $mapper =
Bio::Coordinate::GeneMapper
->new( -in => "chr",
-out => "propeptide",
-exons => [ Bio::Location::Simple
->new( -start => 101,
-end => 109 ),
Bio::Location::Simple
->new( -start => 201,
-end => 221 ),
],
-cds => Bio::Location::Simple
->new(-start => 101, -end => 209),
);
print join("\t", "chr", "aa"), "\n";
for my $pos (99..111,199..211) {
my $res = $mapper->map(
Bio::Location::Simple->new(-start => $pos, -end => $pos, -seq_id => 1));
my $start = $res->start; $start = "NA" unless defined $start;
my $end = $res->end; $end = "NA" unless defined $end;
print join("\t", $pos, $start), "\n";
}
More information about the Bioperl-l
mailing list