[Biocorba-l] SeqFeatureLocation
Jason Stajich
jason@chg.mc.duke.edu
Thu, 8 Feb 2001 12:11:39 -0500 (EST)
I do follow your example below. I guess the only other case is: how would
just the location 5.10 be represented? Start and end are known, but the
whole location is fuzzy, not the endpoints. I have made this work in
bioperl by adding a location fuzzy code as well, which can be EXACT,
WITHIN, or BETWEEN.
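A minimal Perl sketch of that idea (not the actual bioperl or BioCORBA
code): each position carries the start/extension/fuzzy triple from the
quoted example below, and the location itself carries its own fuzzy code.
The hash layout and the helper names position_to_string and
location_to_string are hypothetical, and EXACT = 1 is an assumption; only
WITHIN = 2 and BETWEEN = 3 come from the quoted message.

```perl
use strict;
use warnings;

# Fuzzy type codes. WITHIN = 2 and BETWEEN = 3 follow the quoted message;
# EXACT = 1 is an assumption made for this sketch.
use constant { EXACT => 1, WITHIN => 2, BETWEEN => 3 };

# Symbol used inside a fuzzy position, and separator used between the
# start and end of a whole location (hypothetical tables).
my %SYMBOL = (WITHIN() => '.',  BETWEEN() => '^');
my %JOIN   = (EXACT()  => '..', WITHIN()  => '.', BETWEEN() => '^');

# Render one position hash (start, extension, fuzzy) as text.
sub position_to_string {
    my ($pos) = @_;
    return "$pos->{start}" if $pos->{fuzzy} == EXACT;
    my $last = $pos->{start} + $pos->{extension};
    return sprintf '(%d%s%d)', $pos->{start}, $SYMBOL{ $pos->{fuzzy} }, $last;
}

# Render a whole location; its own fuzzy code picks the separator.
sub location_to_string {
    my ($loc) = @_;
    return join(
        $JOIN{ $loc->{fuzzy} },
        position_to_string($loc->{start}),
        position_to_string($loc->{end}),
    );
}

# Fuzzy endpoints, exact location: (1^3)..(5.10)
my $fuzzy_ends = {
    fuzzy => EXACT,
    start => { start => 1, extension => 2, fuzzy => BETWEEN },
    end   => { start => 5, extension => 5, fuzzy => WITHIN },
};

# Exact endpoints, fuzzy location: 5.10
my $fuzzy_whole = {
    fuzzy => WITHIN,
    start => { start => 5,  extension => 0, fuzzy => EXACT },
    end   => { start => 10, extension => 0, fuzzy => EXACT },
};

print location_to_string($fuzzy_ends),  "\n";   # (1^3)..(5.10)
print location_to_string($fuzzy_whole), "\n";   # 5.10
```

The point of the extra field is that the second case cannot be expressed
with per-endpoint fuzziness alone: both endpoints are exact, and only the
location-level code says the feature lies somewhere within 5 to 10.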
On Thu, 8 Feb 2001, Alan Robinson wrote:
>
> On Wed, 7 Feb 2001, Jason Stajich wrote:
>
> > Okay, so I've been told that <5..100 and 5<..100 mean the same thing.
> > I feel better about that.
>
> I'm relieved too since the EMBL FeatureTable specification
> [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html]
> makes no mention of the latter case and I've been digging around.
>
>
> > But I'm still not clear how the location model will handle
> > (5.10) or (1^3) as locations. Are they really valid locations? We can
> > fudge it by making the start position be the value (since it can be
> > represented that way) and make it so there is no ending position. Sort of
> > circumvents the model though.
>
> These cases of fuzziness are looked after using a combination of 'start',
> 'extension' and 'fuzzy' variables for the start and end position. The IDL
> is modelled after (i.e. stolen from) the EMBL IDL which handles these
> types of occurrences.
>
>
> For your (truly horrible) example location, (1^3)..(5.10), the
> following in Perl would return all the information about the location to you:
>
>
> # Return the single SeqFeatureLocation for this SeqFeature object
> my @location = @{$mySeqFeature->locations()};
>
>
> # First - do the starting position: 1^3
>
> # Get the start position as a SeqFeaturePosition object:
> my $startSeqFeaturePosition = $location[0]->start;
>
> # Get the first base - the value should be 1
> my $start = $startSeqFeaturePosition->start;
>
> # Get the extension of this position - the value should be 2:
> my $extension = $startSeqFeaturePosition->extension;
>
> # Get the type code for the fuzziness - should be 3 (i.e. BETWEEN
> # or '^' if you look this up in the FuzzyTypeCode interface):
> my $fuzzy = $startSeqFeaturePosition->fuzzy;
>
>
> # Now do the end position: 5.10
>
> # Get the end position as a SeqFeaturePosition object:
> my $endSeqFeaturePosition = $location[0]->end;
>
> # Get the first base - the value should be 5:
> $start = $endSeqFeaturePosition->start;
>
> # Get the extension of this position - the value should be 5:
> $extension = $endSeqFeaturePosition->extension;
>
> # Get the type code for the fuzziness - should be 2 (i.e. WITHIN
> # or '.' if you look this up in the FuzzyTypeCode interface):
> $fuzzy = $endSeqFeaturePosition->fuzzy;
>
>
> From the above:
>
> For the starting position:
>
> start = 1
> extension = 2
> fuzzy = '^' (BETWEEN)
> -----------
> (1^3)
>
> For the end position:
>
> start = 5
> extension = 5
> fuzzy = '.' (WITHIN)
> -----------
> (5.10)
>
>
> Thus the final Location is (1^3)..(5.10).
>
>
> Do you follow this? Or is there a problem I've missed?
>
>
>
> > So we don't have a part for a fuzzy location, only fuzzy endpoints. What
> > if the whole location is fuzzy, i.e. it is within 5.10 but we're not sure
> > where it starts or ends? To make this work we'd need to add a fuzzy field
> > to the SeqFeatureLocation struct.
>
>
Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center
http://www.chg.duke.edu/