[Bioperl-l] Question about Bio::Coordinate::Pair
Dan Bolser
dan.bolser at gmail.com
Fri Oct 8 15:04:44 UTC 2010
Well... I tried to toggle the strand of the gap sublocation to match
that of the match sublocation, and overall I ended up failing one test
within t/Coordinate/CoordinateMapper.t, test number 55 (line 119)...
This test actually tests for the strandedness of the gap sublocation
that I'm specifically changing because it leads to the 'unexpected'
(or at least, 'inconsistent') behaviour that I'm calling a bug (test
55 is at the end, with enough preceding context to reproduce it):
# propepide
my $match1 = Bio::Location::Simple->new
(-seq_id => 'propeptide', -start => 21, -end => 40, -strand=>1 );
# peptide
my $match2 = Bio::Location::Simple->new
(-seq_id => 'peptide', -start => 1, -end => 20, -strand=>1 );
ok my $pair = Bio::Coordinate::Pair->new(-in => $match1,
-out => $match2,
-negative => 0, # false, default
);
...
#
# partial match = gap & match
#
$pos2 = Bio::Location::Simple->new
(-start => 20, -end => 22, -strand=> -1 );
ok $res = $pair->map($pos2);
...
is $res->gap->strand, -1; # TEST 55. Fails when I 'fix' Bio::Coordinate::Pair
In the absence of any other information, can I take this to mean that
the strand of the gap sublocations are not used for anything
significant?
Cheers,
Dan.
On 5 October 2010 11:36, Dan Bolser <dan.bolser at gmail.com> wrote:
> Hi,
>
> Can someone describe in a bit more detail the purpose of the Gap
> sublocations that are sometimes returned by Bio::Coordinate::Pair [1]?
>
> I found that, according to Bio::Location::Split, if the Match and Gap
> sublocations have a different strand, the strand method (called via
> Bio::Coordinate::Result) returns undef. This is inconsistent with the
> way Bio::Coordinate::Result tends to behave. See the test script and
> results below, also pasted here [2].
>
> The question is, can I just toggle the strand of the Gap sublocation
> to match that of the Match sublocation? Or does the strand of the Gap
> sublocation encode some important but as yet undocumented information?
> If the strand of the Gap and Match sublocations are made to match
> (within Bio::Coordinate::Pair) this will simplify code that uses
> Bio::Coordinate::Pair, making it more consistent, and perhaps help
> with some other bugs [3].
>
>
> Cheers,
> Dan.
>
> [1] http://www.bioperl.org/wiki/Module:Bio::Coordinate::Pair
> [2] http://www.bioperl.org/wiki/Module_Discussion:Bio::Coordinate::Pair
> [3] http://tinyurl.com/36na2cp
>
>
> #!/usr/bin/perl -w
>
> ## Stress test Bio::Coordinate::Pair
>
> use strict;
> use Data::Dumper;
>
> use Bio::Location::Simple;
> use Bio::Coordinate::Pair;
>
> ## A contig
> my $ctg = Bio::Location::Simple->
> new( -seq_id => 'ctg',
> -start => 1,
> -end => 1001,
> -strand => +1,
> );
>
> ## The contigs position on a chromosome (forward)
> my $ctg_on_chr_f = Bio::Location::Simple->
> new( -seq_id => 'ctg on chr r',
> -start => 5001,
> -end => 6001,
> -strand => +1,
> );
>
> ## The contigs position on a chromosome (reverse)
> my $ctg_on_chr_r = Bio::Location::Simple->
> new( -seq_id => 'ctg on chr r',
> -start => 5001,
> -end => 6001,
> -strand => -1,
> );
>
> ## Coordinate mapping (forward)
> my $agp_f = Bio::Coordinate::Pair->
> new( -in => $ctg,
> -out => $ctg_on_chr_f
> );
>
> ## Coordinate mapping (reverse)
> my $agp_r = Bio::Coordinate::Pair->
> new( -in => $ctg,
> -out => $ctg_on_chr_r
> );
>
>
>
> ## A match, in contig coordinates...
> my $match_on_ctg_4 = Bio::Location::Simple->
> new( -seq_id => 'hit 4',
> -start => 925,
> -end => 1125,
> -strand => -1,
> );
>
> ## Map it into chromosome coordinates (forward)
> my $match_on_chr_4_f =
> $agp_f->map( $match_on_ctg_4 );
>
> print Dumper $match_on_chr_4_f, "\n";
>
> ## Map it into chromosome coordinates (reverse)
> my $match_on_chr_4_r =
> $agp_r->map( $match_on_ctg_4 );
>
> print Dumper $match_on_chr_4_r, "\n";
>
> __END__
>
> $VAR1 = bless( {
> '_sublocations' => [
> bless( {
> '_strand' => -1,
> '_seqid' => 'ctg on chr r',
> '_start' => 5925,
> '_location_type' => 'EXACT',
> '_end' => 6001
> },
> 'Bio::Coordinate::Result::Match' ),
> bless( {
> '_strand' => -1,
> '_seqid' => 'ctg',
> '_location_type' => 'EXACT',
> '_start' => 1002,
> '_end' => 1125
> }, 'Bio::Coordinate::Result::Gap' )
> ],
> '_gap' => $VAR1->{'_sublocations'}[1],
> 'strand' => -1,
> '_match' => $VAR1->{'_sublocations'}[0],
> '_splittype' => 'JOIN'
> }, 'Bio::Coordinate::Result' );
> $VAR2 = '
> ';
> $VAR1 = bless( {
> '_sublocations' => [
> bless( {
> '_strand' => 1,
> '_seqid' => 'ctg on chr r',
> '_start' => 5001,
> '_location_type' => 'EXACT',
> '_end' => 5077
> },
> 'Bio::Coordinate::Result::Match' ),
> bless( {
> '_strand' => -1,
> '_seqid' => 'ctg',
> '_location_type' => 'EXACT',
> '_start' => 1002,
> '_end' => 1125
> }, 'Bio::Coordinate::Result::Gap' )
> ],
> '_gap' => $VAR1->{'_sublocations'}[1],
> 'strand' => 1,
> '_match' => $VAR1->{'_sublocations'}[0],
> '_splittype' => 'JOIN'
> }, 'Bio::Coordinate::Result' );
> $VAR2 = '
> ';
>
More information about the Bioperl-l
mailing list