[Bioperl-l] Bio::RangeI intersection and Bio::DB::GFF
Marco Blanchette
mblanche at berkeley.edu
Tue May 2 20:32:58 UTC 2006
Brian--
Even when both elements of intersection() are from the negative strand, the
return object is from the positive strand and $overlap is actually the
revervese complement of the intersection between the 2 exons. Here is part
of the output from the script below:
===
ex1 Strand: -1
CTTTTTTCCACATACGTCGTCAACGTGATTCGACCTTTTCCGGTTTATTAGTTGAACATGGCAGTCGGCAAAAATA
AAGGTCTTTCCAAGGGTGGTAAGAAGGGCGG
TAAGAAGAAGGTGGTGGACCCGTTTTCTCGCAAGGACTG
ex2 Strand: -1
CTTTTTTCCACATACGTCGTCAACGTGATTCGACCTTTTCCGGTTTATTAGTTGAACATGGCAGTCGGCAAAAATA
AAGGTCTTTCCAAGGGTGGTAAGAAGGGCGG
TAAGAAGAAGGTGGTGGACCCGTTTTCTCGCAAGGACTGGTACGATGTCAAAGCTCCGAATATGTTTCAAACCCGT
CAAATCG
overlap Strand: 1
CAGTCCTTGCGAGAAAACGGGTCCACCACCTTCTTCTTACCGCCCTTCTTACCACCCTTGGAAAGACCTTTATTTT
TGCCGACTGCCATGTTCAACTAATAAACCGG
AAAAGGTCGAATCACGTTGACGACGTATGTGGAAAAAAG
...
If both are from the positive strand, the return object is positive as in:
===
ex1 Strand: 1
CAACGCAGACGTGGTACGGCGTTTTAAATCTGATAACATTTTGAACCGGGAATTATTTTAGAGTACCATTCTTTGT
TTTGTGCCTGTTTCAGTATAAATTAATTATG
CGCCTGATTTAAAGTACAAAATGTGTAAATATATCACCTTACCGTCGCGGGTGCACCCAATTGTGCTTTGATGAAT
AAATATACATATATGCAACATATATAACTTC
CTGTGTTAGTATAAGTGTATGTCAGCCAAAAACAAATATATATATGAGTGTTTATCGGCATTCGTGTGCTGGCAGA
GCAGCGATCAAAGCTGCGTTCGGTACTCGTT
GACTGGCCCAAGAATGAATTCTCGTGCAAGTGTGTTGATAAAAAGTATACGTATGTAT
ex2 Strand: 1
ATCGACAGTTGCCATCGTCGTTATTCCAGCACTAATTTAAAAAAAATTCGATCAACGCAGACGTG
overlap Strand: 1
CAACGCAGACGTG
Is there something I am missing? Here is the script generating the output
Many thanks all...
Marco
use strict;
use warnings;
use Bio::DB::GFF;
MAIN:{
my $db = Bio::DB::GFF->new( -adaptor => 'dbi::mysql',
-dsn =>
'dbi:mysql:database=dmel_43_LS;host=riolab.net',
-user => 'guest');
my $test_db = $db->segment('4');
# Load up the exons into $exons_p
for my $gene ($test_db->features(-types => 'gene')){
my $exons_p = extractExons($gene);
cluster($exons_p) unless ($#{$exons_p} == -1);
}
}
sub extractExons {
my $gene = shift;
my %ex_list;
my @tcs = $gene->features( -type =>'processed_transcript',
-attributes =>{Gene => $gene->group});
for my $tc (@tcs){
my @exons = $tc->features (-type => 'exon',
-attributes => {Parent => $tc->group}
);
for (@exons){
my $ex_id = $_->id;
$ex_list{$ex_id} = $_ unless (exists $ex_list{$ex_id});
}
}
my @values = values %ex_list;
return(\@values);
}
sub cluster {
my $exons_p = shift;
for (my $s = 0; $s <= $#{$exons_p}; $s++){
for (my $t = $s+1; $t <= $#{$exons_p}; $t++){
my $exon1 = $exons_p->[$s];
my $exon2 = $exons_p->[$t];
if (!($exon1->equals($exon2)) && $exon1->overlaps($exon2)){
my $overlap = $exon1->intersection($exon2);
print "===\n";;
print "ex1\tStrand: ", $exon1->strand, "\n",
$exon1->seq, "\n";
print "ex2\tStrand: ", $exon2->strand, "\n",
$exon2->seq, "\n";
print "overlap\tStrand: ", $overlap->strand, "\n",
$overlap->seq, "\n";
}
}
}
}
On 5/2/06 13:17, "Brian Osborne" <osborne1 at optonline.net> wrote:
> Marco,
>
> Yes, this is how intersection() is supposed to work. If both of the Range
> objects have the same strand then the strand information is returned as part
> of the result but if they aren't on the same strand then no strand
> information is returned.
>
> Brian O.
>
>
> On 5/2/06 3:30 PM, "Marco Blanchette" <mblanche at berkeley.edu> wrote:
>
>> Dear all--
>>
>> I have been trying to use the intersection function to extract overlapping
>> region from alternatively spliced exons as in the following script. The
>> returned object from the 'my $overlap = $exon1->intersection($exon2);' is
>> actually loosing the strand of $exon1 if $exon1 is from the negative strand.
>> Is this behavior expected? Should I check the strand of $exon1 before
>> working on the object return by any Bio::RangeI function?
>>
>> Many thanks
>>
>> #!/usr/bin/perl
>> use strict;
>> use warnings;
>> use Bio::DB::GFF;
>>
>> MAIN:{
>>
>> my $db = Bio::DB::GFF->new( -adaptor => 'dbi::mysql',
>> -dsn =>
>> 'dbi:mysql:database=dmel_43_LS;host=riolab.net',
>> -user => 'guest');
>> my $test_db = $db->segment('4');
>>
>> # Load up the exons into $exons_p
>> for my $gene ($test_db->features(-types => 'gene')){
>>
>> my $exons_p = extractExons($gene);
>>
>> cluster($exons_p) unless ($#{$exons_p} == -1);
>>
>> }
>> }
>>
>> sub extractExons {
>> my $gene = shift;
>> my %ex_list;
>> my @tcs = $gene->features( -type =>'processed_transcript',
>> -attributes =>{Gene => $gene->group});
>>
>> for my $tc (@tcs){
>> my @exons = $tc->features (-type => 'exon',
>> -attributes => {Parent => $tc->group}
>> );
>>
>> for (@exons){
>> my $ex_id = $_->id;
>> $ex_list{$ex_id} = $_ unless (exists $ex_list{$ex_id});
>>
>> }
>>
>> }
>> my @values = values %ex_list;
>> return(\@values);
>> }
>>
>> sub cluster {
>> my $exons_p = shift;
>>
>> for (my $s = 0; $s <= $#{$exons_p}; $s++){
>> for (my $t = $s+1; $t <= $#{$exons_p}; $t++){
>> my $exon1 = $exons_p->[$s];
>> my $exon2 = $exons_p->[$t];
>>
>> if (!($exon1->equals($exon2)) && $exon1->overlaps($exon2)){
>>
>> my $overlap = $exon1->intersection($exon2);
>>
>> print "===\n";;
>> print "ex1\n", $exon1->seq, "\n";
>> print "ex2\n", $exon2->seq, "\n";
>> print "overlap\n", $overlap->seq, "\n";
>> }
>> }
>> }
>> }
>> ______________________________
>> Marco Blanchette, Ph.D.
>>
>> mblanche at uclink.berkeley.edu
>>
>> Donald C. Rio's lab
>> Department of Molecular and Cell Biology
>> 16 Barker Hall
>> University of California
>> Berkeley, CA 94720-3204
>>
>> Tel: (510) 642-1084
>> Cell: (510) 847-0996
>> Fax: (510) 642-6062
>
>
______________________________
Marco Blanchette, Ph.D.
mblanche at uclink.berkeley.edu
Donald C. Rio's lab
Department of Molecular and Cell Biology
16 Barker Hall
University of California
Berkeley, CA 94720-3204
Tel: (510) 642-1084
Cell: (510) 847-0996
Fax: (510) 642-6062
--
More information about the Bioperl-l
mailing list