[Bioperl-l] Odd behaviour in Bio::SeqFeature::Gene::Transcript
Jason Stajich
jason.stajich at duke.edu
Thu May 12 13:50:08 EDT 2005
On May 12, 2005, at 7:30 AM, Oliver Burren wrote:
> Dear Developers,
>
> I'm working with Bio::SeqFeature::Gene::Transcript and I am getting
> some
> odd behaviour. In a nutshell I'm getting an error depending on which
> order i add exons to a transcript when I try and dump introns.
>
> Best illustrated with a script which I attach (test_intron.pl).
>
> Here is the output that I get :
> 229 machine /home/xxx % perl test_intron.pl
> Building transcript on -ve strand
> Exon Order is utr3prime,exon3,exon2,exon1,utr5prime
> Strand is set to -1
> SEQ intron 31 30 . - .
> SEQ intron 21 25 . - .
> SEQ intron 11 15 . - .
> SEQ intron 6 5 . - .
> Exon Order is utr5prime,exon1,exon2,exon3,utr3prime
> Strand is set to -1
>
> ------------- EXCEPTION -------------
> MSG: Intron gap begins after '10' and ends before '1'
> STACK Bio::SeqFeature::Gene::Intron::location /home/xxxxx/bioperl-
> live/Bio/SeqFeature/Gene/Intron.pm:288
> STACK Bio::SeqFeature::Generic::strand /home/xxxxx/bioperl-
> live/Bio/SeqFeature/Generic.pm:356
> STACK Bio::Tools::GFF::_gff2_string /home/xxxxx/bioperl-
> live/Bio/Tools/GFF.pm:777
> STACK Bio::Tools::GFF::gff_string /home/xxxxx/bioperl-
> live/Bio/Tools/GFF.pm:680STACK
> Bio::SeqFeature::Generic::gff_string /home/xxxxx/bioperl-
> live/Bio/SeqFeature/Generic.pm:762
> STACK toplevel test_intron.pl:56
>
> --------------------------------------
>
>
>
> Is this the expected behaviour or a feature or more likely have I
> made a
> mistake somwhere ?
>
>
> I can fix it by the following. However this is probably the wrong
> thing
> to do. It can also be fixed by sorting exons before addition to
> transcript.
>
Sorting needs to take into account strandedness - it should be
handled by the transcript object I would think. I am assuming we
won't ever have exons in the same transcript which are actually
represented on different contigs. This would then make the sorting
not work - we run into the same problem in Bio::SeqFeatureI-
>spliced_seq function too.
So the sort needs to have a strand as part of the equation. I think
this takes care of it. Multiply start by the strand in the
schwartzian transformation.
$ cvs diff Bio/SeqFeature/Gene/Transcript.pm
Index: Bio/SeqFeature/Gene/Transcript.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqFeature/Gene/
Transcript.pm,v
retrieving revision 1.33
diff -r1.33 Transcript.pm
294c294,295
< @exons = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map
{ [ $_, $_->start()] } @exons;
---
> @exons = map { $_->[0] } sort { $a->[1] <=> $b->[1] }
> map { [ $_, $_->start * ($_->strand || 1)] } @exons;
I can check this in if it works - seems to for your example.
--jason
>
> cvs diff Transcript.pm
> Index: Transcript.pm
> ===================================================================
> RCS file: /home/repository/bioperl/bioperl-
> live/Bio/SeqFeature/Gene/Transcript.pm,v
> retrieving revision 1.33
> diff -r1.33 Transcript.pm
> 287c287
> < $rev_order = ($exons[0]->end() < $exons[1]->start() ? 0 : 1);
> ---
>
>> #$rev_order = ($exons[0]->end() < $exons[1]->start() ? 0 : 1);
>>
> 291c291,292
> < if((! defined($strand)) || ($strand != -1) || (! $rev_order)) {
> ---
>
>> #if((! defined($strand)) || ($strand != -1) || (! $rev_order)) {
>> if((! defined($strand)) || ($strand != -1)) {
>>
>
>
> Thanks
>
> Olly Burren
> JDRF/WT DIL
>
> <test_intron.pl>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
More information about the Bioperl-l
mailing list