[Biocorba-l] SeqFeatureLocation

Alan Robinson alan@ebi.ac.uk
Thu, 8 Feb 2001 16:56:57 +0000 (GMT Standard Time)


On Wed, 7 Feb 2001, Jason Stajich wrote:

> Okay, so I've been told that <5..100 and 5<..100 mean the same thing.  
> I feel better about that.

I'm relieved too since the EMBL FeatureTable specification
[http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html]
makes no mention of the latter case and I've been digging around.


> But I'm still not clear how the location model will handle
> (5.10) or (1^3) as locations.   Are they really valid locations?  We can
> fudge it by making the start position be the value (since it can be
> represented that way) and make it so there is no ending position.  Sort of
> circumvents the model though.   

These cases of fuzziness are handled using a combination of 'start',
'extension' and 'fuzzy' variables for the start and end positions. The IDL
is modelled after (i.e. stolen from) the EMBL IDL, which handles these
types of occurrences.


For your (truly horrible) example location, (1^3)..(5.10), the
following Perl would return all the information about the location:


# Return the single SeqFeatureLocation for this SeqFeature object
my @location = @{$mySeqFeature->locations()};


# First - do the starting position: 1^3

# Get the start position as a SeqFeaturePosition object:
my $startSeqFeaturePosition = $location[0]->start;

# Get the first base - the value should be 1:
my $start = $startSeqFeaturePosition->start;

# Get the extension of this position - the value should be 2:
my $extension = $startSeqFeaturePosition->extension;

# Get the type code for the fuzziness - should be 3 (i.e. BETWEEN
# or '^' if you look this up in the FuzzyTypeCode interface):
my $fuzzy = $startSeqFeaturePosition->fuzzy;


# Now do the end position: 5.10

# Get the end position as a SeqFeaturePosition object:
my $endSeqFeaturePosition = $location[0]->end;

# Get the first base - the value should be 5:
$start = $endSeqFeaturePosition->start;

# Get the extension of this position - the value should be 5:
$extension = $endSeqFeaturePosition->extension;

# Get the type code for the fuzziness - should be 2 (i.e. WITHIN
# or '.' if you look this up in the FuzzyTypeCode interface):
$fuzzy = $endSeqFeaturePosition->fuzzy;


From the above:

For the starting position:

  start = 1
  extension = 2
  fuzzy = '^' (BETWEEN)
  -----------
  (1^3)

For the end position:

  start = 5
  extension = 5
  fuzzy = '.' (WITHIN)
  -----------
  (5.10)


Thus the final Location is (1^3)..(5.10).
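
To make the arithmetic explicit, here is a rough sketch of how the three
values could be turned back into the text form. It uses plain Perl arrays
rather than the real CORBA objects, and the helper names
(position_to_string, location_to_string) are made up for illustration;
they are not part of the IDL. Only the two fuzzy codes mentioned above
(2 = WITHIN, 3 = BETWEEN) are covered, and the second number is assumed
to be start + extension, as in the example.

```perl
use strict;
use warnings;

# Only the two type codes quoted in this message are mapped here;
# any other FuzzyTypeCode value is deliberately left unhandled.
my %FUZZY_SYMBOL = (
    2 => '.',    # WITHIN
    3 => '^',    # BETWEEN
);

# Hypothetical helper: render one position from (start, extension, fuzzy).
# Assumes the second number is start + extension.
sub position_to_string {
    my ($start, $extension, $fuzzy) = @_;
    my $symbol = $FUZZY_SYMBOL{$fuzzy};
    die "unhandled fuzzy code: $fuzzy\n" unless defined $symbol;
    return sprintf '(%d%s%d)', $start, $symbol, $start + $extension;
}

# Hypothetical helper: join a start and an end position with '..'.
sub location_to_string {
    my ($start_pos, $end_pos) = @_;
    return position_to_string(@$start_pos) . '..'
         . position_to_string(@$end_pos);
}

# Start: start=1, extension=2, fuzzy=3; end: start=5, extension=5, fuzzy=2
print location_to_string([1, 2, 3], [5, 5, 2]), "\n";
```

With the values from the example this prints (1^3)..(5.10).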


Do you follow this? Or is there a problem I've missed?



> So we don't have a part for a fuzzy location only fuzzy endpoints.  What
> if the whole location is fuzzy, ie it is within 5.10 but we're not sure
> where it starts or ends.  To make this work we'd need to add a fuzzy field
> to the SeqFeatureLocation struct.