[Bioperl-l] split location problems
Chris Fields
cjfields at uiuc.edu
Tue Oct 17 01:07:55 UTC 2006
On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:
> The whole point of split locations is to represent genes with
> introns so that is not the "rare" case.
>
> I'm confused where the problem is. The locations that I get out
> with to_FTstring on the location object are exactly the same as
> those input.
The problem is with a subset of the split locations described in the
bug report. The following works:
complement(join(2691..4571,4918..5163))
whereas this:
join(complement(4918..5163),complement(2691..4571))
gives this:
complement(join(4918..5163,2691..4571))
which is not equivalent to the input. It should be:
complement(join(2691..4571,4918..5163))
since 'join' implies that the order of the segments to be joined is
important ('order' and 'bond' do not, I guess).
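To make the expected behaviour concrete, here is a minimal standalone sketch in Python of the rule described above. It is not BioPerl code and not the actual Split::to_FTstring() implementation; the function name to_ft_string and the (start, end, strand) tuple layout are assumptions made only for illustration. The idea is that when every sublocation is on the minus strand, the complement() is hoisted outside the join() and the segment order is reversed at the same time.

```python
def to_ft_string(segments):
    """Render a feature-table location string.

    `segments` is a list of (start, end, strand) tuples in the order they
    appear inside join(); strand is 1 or -1. Hypothetical helper, not a
    BioPerl API.
    """
    if not segments:
        raise ValueError("at least one segment is required")

    def span(start, end):
        return f"{start}..{end}"

    def wrap_join(parts):
        # A single segment does not need a join() wrapper.
        return parts[0] if len(parts) == 1 else "join(" + ",".join(parts) + ")"

    strands = {strand for _, _, strand in segments}

    if strands == {-1}:
        # All segments are on the minus strand: hoist complement() outside
        # the join and reverse the segment order so the meaning is preserved.
        inner = [span(s, e) for s, e, _ in reversed(segments)]
        return "complement(" + wrap_join(inner) + ")"

    # Plus-strand or mixed-strand case: keep the given order and mark
    # minus-strand segments individually.
    parts = [
        span(s, e) if strand == 1 else "complement(" + span(s, e) + ")"
        for s, e, strand in segments
    ]
    return wrap_join(parts)


# The problematic input from the bug report, as (start, end, strand) tuples.
print(to_ft_string([(4918, 5163, -1), (2691, 4571, -1)]))
```

Under these assumptions the two minus-strand segments come out in ascending order inside a single complement(join(...)), while mixed-strand locations keep their original order with per-segment complement().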
> I have processed the genbank fungal genomes into GFF3 and have had
> no problems so I'm confused where you are breaking down. If I
> write them out as embl I also get the correct thing. This is using
> the CVS version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata
> chromosome downloaded from genbank. Perhaps the problem is on the
> EMBL parsing side, I didn't test that.
>
> On the technical side, I still am not sure I fully know where the
> strand information should be stored - the top level container or
> the sub-features. I'll try and stay up on the discussion if
> anything has been decided that I should know about.
>
> -jason
Split::strand() sets the sublocations as well, which seems to confuse
the situation more but it is consistent with LocationI, as Hilmar
points out. I'm looking into a few solutions now, including a fix in
Split::to_FTstring().
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign