[Bioperl-l] Bio::Location::Split

Chris Fields cjfields at uiuc.edu
Mon Oct 16 17:57:35 UTC 2006


> I recently came across bug 2101, where Bio::Location::Split::to_FTstring
> gives the incorrect order for multi-sublocation locations on the minus
> strand. That is, I found it by getting incorrect results, and then found
> it in Bugzilla and in the September archives.
>
> I'm converting CDS files from one format to another. E.g., I read an
> EMBL file with a chromosome and CDS features, and want to output the
> location in a FASTA header. If I do something like:
> 
> # $in is assumed to be a Bio::SeqIO object opened on the EMBL file
> while (my $seq = $in->next_seq) {
>     foreach my $feat ($seq->get_SeqFeatures) {
>         print $feat->location->to_FTstring(), "\n";
>     }
> }
> 
> I get the wrong results for multi-exon CDSs on the -1 strand, as
> described in the bug report.
> 
> Is there a relatively easy way around this? I assume I can't get at the
> original string of the location, which in this case is all I need. Can I
> just flip the order of the exons in certain cases? Chris F, can you tell
> me the preliminary solution you mentioned?
> 
> I must say I'm sort of surprised this wasn't found before. It seems like
> a not-that-rare occurrence. Oh well.
> 
> Thanks,
> 
> - Amir Karger
> Research Computing
> Life Sciences Division
> Harvard University

Could you let me know specifically which EMBL file contains the odd
locations?  The bug report uses theoretical locations, not actual ones, so
it would be nice to have a real-world example to test against.  

As for why this wasn't caught sooner, the particular types of locations that
cause the issue are quite rare.  Note that the bug report actually covers two
separate bugs.  The first (and more serious) is still unresolved.  The second
(where remote locations were treated differently in Location::Split, which
caused more problems than it was worth) had a fix committed about a month
ago.

Every fix I have tried for the first bug breaks several other methods
that rely on the current Location::Split object logic for retrieving
sequences, building feature strings, etc.  Since a new RC is imminent and
the bug only affects a small number of locations, I have held off until
after a final release is made (the last thing I want to do is commit a fix
that breaks ~6-8 other methods), but I'll try looking at it again this week.
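
In the meantime, one way to sidestep to_FTstring for these cases is to build
the string yourself from the exon coordinates.  The following is only a rough
sketch, not the eventual fix: ft_string_from_exons is a hypothetical helper
(not a BioPerl method), and it assumes that every sublocation is on the same
strand and on the same sequence (no remote locations).  It writes the exons in
ascending coordinate order and wraps the whole join in complement() for the
minus strand, which is the conventional feature-table form.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl.
# Takes a strand (1 or -1) and a list of [start, end] pairs and returns a
# feature-table style location string.
# Assumes all exons share one strand and one sequence (no remote locations).
sub ft_string_from_exons {
    my ($strand, @exons) = @_;

    # Order exons by ascending start coordinate
    my @sorted = sort { $a->[0] <=> $b->[0] } @exons;

    # Format each exon as "start..end"
    my @parts = map { sprintf('%d..%d', $_->[0], $_->[1]) } @sorted;

    # Only use join() when there is more than one exon
    my $loc = @parts > 1 ? 'join(' . join(',', @parts) . ')' : $parts[0];

    # Wrap the whole location in complement() for the minus strand
    return $strand < 0 ? "complement($loc)" : $loc;
}

print ft_string_from_exons(-1, [300, 400], [100, 200]), "\n";
# prints complement(join(100..200,300..400))
```

With a real feature, the [start, end] pairs could be collected from
$feat->location->each_Location using each sublocation's start and end, with
the strand taken from $feat->location->strand.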


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign



