[Bioperl-l] Question about Bio::Coordinate::Pair
Dan Bolser
dan.bolser at gmail.com
Tue Oct 5 10:36:49 UTC 2010
Hi,
Can someone describe in a bit more detail the purpose of the Gap
sublocations that are sometimes returned by Bio::Coordinate::Pair [1]?
I found that, according to Bio::Location::Split, if the Match and Gap
sublocations have a different strand, the strand method (called via
Bio::Coordinate::Result) returns undef. This is inconsistent with the
way Bio::Coordinate::Result tends to behave. See the test script and
results below, also pasted here [2].
The question is, can I just toggle the strand of the Gap sublocation
to match that of the Match sublocation? Or does the strand of the Gap
sublocation encode some important but as yet undocumented information?
If the strand of the Gap and Match sublocations are made to match
(within Bio::Coordinate::Pair) this will simplify code that uses
Bio::Coordinate::Pair, making it more consistent, and perhaps help
with some other bugs [3].
Cheers,
Dan.
[1] http://www.bioperl.org/wiki/Module:Bio::Coordinate::Pair
[2] http://www.bioperl.org/wiki/Module_Discussion:Bio::Coordinate::Pair
[3] http://tinyurl.com/36na2cp
#!/usr/bin/perl -w
## Stress test Bio::Coordinate::Pair
use strict;
use Data::Dumper;
use Bio::Location::Simple;
use Bio::Coordinate::Pair;
## A contig
my $ctg = Bio::Location::Simple->
new( -seq_id => 'ctg',
-start => 1,
-end => 1001,
-strand => +1,
);
## The contigs position on a chromosome (forward)
my $ctg_on_chr_f = Bio::Location::Simple->
new( -seq_id => 'ctg on chr r',
-start => 5001,
-end => 6001,
-strand => +1,
);
## The contigs position on a chromosome (reverse)
my $ctg_on_chr_r = Bio::Location::Simple->
new( -seq_id => 'ctg on chr r',
-start => 5001,
-end => 6001,
-strand => -1,
);
## Coordinate mapping (forward)
my $agp_f = Bio::Coordinate::Pair->
new( -in => $ctg,
-out => $ctg_on_chr_f
);
## Coordinate mapping (reverse)
my $agp_r = Bio::Coordinate::Pair->
new( -in => $ctg,
-out => $ctg_on_chr_r
);
## A match, in contig coordinates...
my $match_on_ctg_4 = Bio::Location::Simple->
new( -seq_id => 'hit 4',
-start => 925,
-end => 1125,
-strand => -1,
);
## Map it into chromosome coordinates (forward)
my $match_on_chr_4_f =
$agp_f->map( $match_on_ctg_4 );
print Dumper $match_on_chr_4_f, "\n";
## Map it into chromosome coordinates (reverse)
my $match_on_chr_4_r =
$agp_r->map( $match_on_ctg_4 );
print Dumper $match_on_chr_4_r, "\n";
__END__
$VAR1 = bless( {
'_sublocations' => [
bless( {
'_strand' => -1,
'_seqid' => 'ctg on chr r',
'_start' => 5925,
'_location_type' => 'EXACT',
'_end' => 6001
},
'Bio::Coordinate::Result::Match' ),
bless( {
'_strand' => -1,
'_seqid' => 'ctg',
'_location_type' => 'EXACT',
'_start' => 1002,
'_end' => 1125
}, 'Bio::Coordinate::Result::Gap' )
],
'_gap' => $VAR1->{'_sublocations'}[1],
'strand' => -1,
'_match' => $VAR1->{'_sublocations'}[0],
'_splittype' => 'JOIN'
}, 'Bio::Coordinate::Result' );
$VAR2 = '
';
$VAR1 = bless( {
'_sublocations' => [
bless( {
'_strand' => 1,
'_seqid' => 'ctg on chr r',
'_start' => 5001,
'_location_type' => 'EXACT',
'_end' => 5077
},
'Bio::Coordinate::Result::Match' ),
bless( {
'_strand' => -1,
'_seqid' => 'ctg',
'_location_type' => 'EXACT',
'_start' => 1002,
'_end' => 1125
}, 'Bio::Coordinate::Result::Gap' )
],
'_gap' => $VAR1->{'_sublocations'}[1],
'strand' => 1,
'_match' => $VAR1->{'_sublocations'}[0],
'_splittype' => 'JOIN'
}, 'Bio::Coordinate::Result' );
$VAR2 = '
';
More information about the Bioperl-l
mailing list