[Biojava-l] Between Locations

Cox, Greg gcox@netgenics.com
Thu, 4 Oct 2001 14:13:01 -0400


Some months ago, I brought this up incorrectly.  I've spoken to the
scientists here, and I'd like to propose a set of semantics for
BetweenLocations.  This is driven by the two GenBank/EMBL cases 5^6 and
5^10.  LocationTools will require the boolean operations areEqual, contains,
and overlaps.  It also requires two operations that return a location: union
and intersection.

The general proposal is to be generous about returning a value where the
correct behavior is ill-defined.  For example, we recommend that 5^10
intersects 7^12 result in 7^10.  It's easiest to define the semantics case
by case.

Assumptions
* union, areEqual, overlaps, and intersection are symmetric operators
* contains is a directional operator
* In general, 7^12 is analogous to 7.12 in behavior
* A location 1..10 represents nucleotides 1 through 10 inclusive
* A location 5^6 represents the space between nucleotide 5 and nucleotide 6
* A location 5^10 represents one space between two nucleotides, where both
nucleotides lie between 5 and 10 inclusive
* Requesting the features on a subsequence must return the features with
between locations as well
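
To make the notation above concrete, here is a minimal sketch of a value
type that can hold either kind of location and round-trip the text form.
The class name Loc, its fields, and parse() are all hypothetical names made
up for this example; they are not BioJava's Location API.

```java
// Hypothetical stand-in for a location; NOT BioJava's Location interface.
final class Loc {
    final int min;            // first nucleotide, 1-based, inclusive
    final int max;            // last nucleotide, 1-based, inclusive
    final boolean isBetween;  // true for min^max, false for min..max

    Loc(int min, int max, boolean isBetween) {
        this.min = min;
        this.max = max;
        this.isBetween = isBetween;
    }

    /** Parses "5^6" as a between location and "1..10" as a range. */
    static Loc parse(String text) {
        int caret = text.indexOf('^');
        if (caret >= 0) {
            return new Loc(Integer.parseInt(text.substring(0, caret)),
                           Integer.parseInt(text.substring(caret + 1)), true);
        }
        int dots = text.indexOf("..");
        if (dots < 0) {
            throw new IllegalArgumentException("not a location: " + text);
        }
        return new Loc(Integer.parseInt(text.substring(0, dots)),
                       Integer.parseInt(text.substring(dots + 2)), false);
    }

    @Override
    public String toString() {
        return min + (isBetween ? "^" : "..") + max;
    }
}
```

Keeping the between flag next to the bounds lets 5..6 and 5^6 share the
same endpoints and still be different locations.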

Locations                  Intersection   Overlaps
1..10 intersects 5^6       5^6            TRUE
1..10 intersects 5^7       5^7            TRUE
1..10 intersects 10^11     EMPTY          FALSE
1..10 intersects 9^11      9^10           TRUE
5^6 intersects 5^6         5^6            TRUE
5^10 intersects 5^6        5^6            TRUE
5^10 intersects 7^12       7^10           TRUE
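
One rule that reproduces every row of the intersection table is: clip the
two locations to their common bounds, and if either side is a between
location, keep the result only when it still spans two distinct
nucleotides.  The sketch below is just that rule under stated assumptions.
Loc and IntersectSketch are hypothetical names, null stands in for EMPTY,
and the range/range branch is ordinary clipping added for completeness,
since the proposal does not cover it.

```java
// Hypothetical stand-in for a location; NOT BioJava's Location interface.
final class Loc {
    final int min;            // first nucleotide, 1-based, inclusive
    final int max;            // last nucleotide, 1-based, inclusive
    final boolean isBetween;  // true for min^max, false for min..max

    Loc(int min, int max, boolean isBetween) {
        this.min = min;
        this.max = max;
        this.isBetween = isBetween;
    }

    static Loc range(int min, int max)   { return new Loc(min, max, false); }
    static Loc between(int min, int max) { return new Loc(min, max, true); }

    @Override
    public String toString() {
        return min + (isBetween ? "^" : "..") + max;
    }
}

final class IntersectSketch {
    /** Returns the intersection, or null to stand for EMPTY. */
    static Loc intersection(Loc a, Loc b) {
        int lo = Math.max(a.min, b.min);
        int hi = Math.min(a.max, b.max);
        if (a.isBetween || b.isBetween) {
            // A gap needs two distinct flanking nucleotides, so lo == hi
            // (as in 1..10 vs 10^11) is treated as EMPTY.
            return hi > lo ? Loc.between(lo, hi) : null;
        }
        // Range/range is outside the proposal: plain clipping (assumption).
        return hi >= lo ? Loc.range(lo, hi) : null;
    }
}
```

With this rule, 1..10 against 9^11 clips to 9^10, while 1..10 against 10^11
clips to a single nucleotide and so comes back EMPTY.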

areEqual
	5..6 equals 5^6	result FALSE
	5^6 equals 5^6	result TRUE
	5^6 equals 5^10	result FALSE
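
The areEqual cases amount to comparing the kind of location as well as its
bounds.  A minimal sketch, with Loc and EqualSketch as hypothetical names
rather than BioJava classes:

```java
// Hypothetical stand-in for a location; NOT BioJava's Location interface.
final class Loc {
    final int min;            // first nucleotide, 1-based, inclusive
    final int max;            // last nucleotide, 1-based, inclusive
    final boolean isBetween;  // true for min^max, false for min..max

    Loc(int min, int max, boolean isBetween) {
        this.min = min;
        this.max = max;
        this.isBetween = isBetween;
    }

    static Loc range(int min, int max)   { return new Loc(min, max, false); }
    static Loc between(int min, int max) { return new Loc(min, max, true); }
}

final class EqualSketch {
    /** Equal only if both the kind (range vs. between) and the bounds match. */
    static boolean areEqual(Loc a, Loc b) {
        return a.isBetween == b.isBetween && a.min == b.min && a.max == b.max;
    }
}
```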

Contains
	1..10 contains 5^6	result TRUE
	5^6 contains 1..10	result FALSE
	1..10 contains 5^7	result TRUE
	1..10 contains 10^11	result FALSE
	1..10 contains 9^11	result FALSE
	5^6 contains 5^6	result TRUE
	5^10 contains 5^6	result TRUE
	5^6 contains 5^10	result FALSE
	5^10 contains 7^12	result FALSE
	7^12 contains 5^10	result FALSE
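
All of the contains cases above fall out of a plain bounds check: the outer
location contains the inner one when the inner bounds lie within the outer
bounds.  The sketch below uses hypothetical names (Loc, ContainsSketch).
It also makes one assumption the list does not settle: a between location
never contains a range, even when the range fits inside its bounds.

```java
// Hypothetical stand-in for a location; NOT BioJava's Location interface.
final class Loc {
    final int min;            // first nucleotide, 1-based, inclusive
    final int max;            // last nucleotide, 1-based, inclusive
    final boolean isBetween;  // true for min^max, false for min..max

    Loc(int min, int max, boolean isBetween) {
        this.min = min;
        this.max = max;
        this.isBetween = isBetween;
    }

    static Loc range(int min, int max)   { return new Loc(min, max, false); }
    static Loc between(int min, int max) { return new Loc(min, max, true); }
}

final class ContainsSketch {
    /** Directional: contains(a, b) need not equal contains(b, a). */
    static boolean contains(Loc outer, Loc inner) {
        if (outer.isBetween && !inner.isBetween) {
            // Assumption of this sketch only: a gap holds no nucleotides,
            // so it cannot contain a range.
            return false;
        }
        return outer.min <= inner.min && inner.max <= outer.max;
    }
}
```

Note that 1..10 overlaps 9^11 but does not contain it, because 11 falls
outside the range.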

Overlaps
	1..10 overlaps 5^6	result TRUE
	1..10 overlaps 5^7	result TRUE
	1..10 overlaps 10^11	result FALSE
	1..10 overlaps 9^11	result TRUE
	9^11 overlaps 1..10	result TRUE
	5^6 overlaps 5^6	result TRUE
	5^6 overlaps 5^10	result TRUE
	5^10 overlaps 7^12	result TRUE
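
One way to read the overlaps cases is that two locations overlap exactly
when their common bounds are non-empty, where a between location needs at
least two shared nucleotides.  A sketch of that reading, with Loc and
OverlapSketch as hypothetical names; the range/range branch is the usual
test and is not part of the proposal:

```java
// Hypothetical stand-in for a location; NOT BioJava's Location interface.
final class Loc {
    final int min;            // first nucleotide, 1-based, inclusive
    final int max;            // last nucleotide, 1-based, inclusive
    final boolean isBetween;  // true for min^max, false for min..max

    Loc(int min, int max, boolean isBetween) {
        this.min = min;
        this.max = max;
        this.isBetween = isBetween;
    }

    static Loc range(int min, int max)   { return new Loc(min, max, false); }
    static Loc between(int min, int max) { return new Loc(min, max, true); }
}

final class OverlapSketch {
    /** Symmetric: overlaps(a, b) == overlaps(b, a). */
    static boolean overlaps(Loc a, Loc b) {
        int lo = Math.max(a.min, b.min);
        int hi = Math.min(a.max, b.max);
        if (a.isBetween || b.isBetween) {
            // Sharing a single nucleotide (1..10 vs 10^11) leaves no gap
            // in common, so it does not count as an overlap.
            return hi > lo;
        }
        // Range/range is outside the proposal: usual test (assumption).
        return hi >= lo;
    }
}
```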

Union
	1..10 union 5^6	result compoundLocation(1..10,5^6)
	1..10 union 5^7	result compoundLocation(1..10,5^7)
	1..10 union 10^11	result compoundLocation(1..10,10^11)
	1..10 union 9^11	result compoundLocation(1..10,9^11)
	5^6 union 5^6	result 5^6
	5^10 union 5^6	result 5^10
	5^10 union 7^12	result 5^12
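
The union cases suggest two rules: a range and a between location are never
merged, and two overlapping between locations merge into one between
location spanning both.  Here is a sketch of that, with hypothetical names
(Loc, UnionSketch, format).  Two things are assumptions of the sketch and
not part of the proposal: non-overlapping between locations are kept as a
compound pair, and range/range pairs are simply left unmerged.

```java
// Hypothetical stand-in for a location; NOT BioJava's Location interface.
final class Loc {
    final int min;            // first nucleotide, 1-based, inclusive
    final int max;            // last nucleotide, 1-based, inclusive
    final boolean isBetween;  // true for min^max, false for min..max

    Loc(int min, int max, boolean isBetween) {
        this.min = min;
        this.max = max;
        this.isBetween = isBetween;
    }

    static Loc range(int min, int max)   { return new Loc(min, max, false); }
    static Loc between(int min, int max) { return new Loc(min, max, true); }

    @Override
    public String toString() {
        return min + (isBetween ? "^" : "..") + max;
    }
}

final class UnionSketch {
    /** Returns one merged location, or both inputs as a compound pair. */
    static Loc[] union(Loc a, Loc b) {
        boolean bothBetween = a.isBetween && b.isBetween;
        boolean overlap = Math.min(a.max, b.max) > Math.max(a.min, b.min);
        if (bothBetween && overlap) {
            return new Loc[] {
                Loc.between(Math.min(a.min, b.min), Math.max(a.max, b.max))
            };
        }
        // Mixed kinds always stay separate. Disjoint between locations and
        // range/range pairs are kept separate here too (sketch assumption).
        return new Loc[] { a, b };
    }

    /** Renders the result in the notation used in this message. */
    static String format(Loc[] parts) {
        if (parts.length == 1) {
            return parts[0].toString();
        }
        return "compoundLocation(" + parts[0] + "," + parts[1] + ")";
    }
}
```

For example, format(union(Loc.between(5, 10), Loc.between(7, 12))) gives
5^12, and a range paired with any between location comes back as a
compoundLocation of the two inputs.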

If there aren't any objections, I'll code this up and commit it.

Greg