[Bioperl-l] Refactoring Locations...
Heikki Lehvaslaiho
heikki@ebi.ac.uk
Thu, 27 Jun 2002 17:34:23 +0100
I ran into a small problem with Bio::Locations and would like to slightly
refactor them.
From my point of view there are three types of exact sequence locations
which in feature table notation are: 23, 34..55 and 46^47. The first two are
handled by Bio::Location::Simple and have location_type('EXACT'). The last
one is lumped into location_type('BETWEEN') together with locations like
46^78 and handled by Bio::Location::Fuzzy. The source of the confusion is
that the feature table definition allows for locations like 46^78, which I do
not think are used anywhere. To stress: the notation 46^47 is essential when
you have clean insertions between residues.
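
To make the three notations concrete, here is a minimal sketch in Python
(not BioPerl code). The function name classify_location is hypothetical, and
the split between 'IN-BETWEEN' for adjacent positions and 'BETWEEN' for
anything else is an assumption based on the distinction drawn above, not the
behaviour of any existing parser.

```python
import re


def classify_location(text):
    """Classify a feature-table location string (hypothetical helper).

    Returns a (location_type, start, end) tuple.
    """
    text = text.strip()

    # Single position, e.g. "23": start and end coincide.
    m = re.fullmatch(r"(\d+)", text)
    if m:
        pos = int(m.group(1))
        return ("EXACT", pos, pos)

    # Range, e.g. "34..55".
    m = re.fullmatch(r"(\d+)\.\.(\d+)", text)
    if m:
        return ("EXACT", int(m.group(1)), int(m.group(2)))

    # Caret notation, e.g. "46^47" (adjacent) or "46^78" (not adjacent).
    m = re.fullmatch(r"(\d+)\^(\d+)", text)
    if m:
        start, end = int(m.group(1)), int(m.group(2))
        # Assumption: only adjacent positions count as a clean insertion site.
        kind = "IN-BETWEEN" if end == start + 1 else "BETWEEN"
        return (kind, start, end)

    raise ValueError("unrecognised location: %r" % text)
```

With this sketch, "23" and "34..55" both come back as 'EXACT', "46^47" as
'IN-BETWEEN', and "46^78" as 'BETWEEN'.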
Currently we have Bio::LocationI which defines the interface,
Bio::Location::Simple and two subclasses of Simple: Bio::Location::Fuzzy and
Bio::Location::Split.
What I'd like to do is rename the current Simple to Atomic, to serve as a
common superclass, and recreate Bio::Location::Simple so that it can have two
values for the method location_type(): 'EXACT' and 'IN-BETWEEN' (or 'TWEEN',
'TWIXT'?). The object will throw an error if location_type() is 'IN-BETWEEN'
and start() and end() are both defined and not adjacent. The length of an
'IN-BETWEEN' location is always zero. The default value of location_type()
will be 'EXACT'.
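
The rules proposed above can be sketched as follows. This is Python rather
than Perl and is not the code I have ready for committing; the class name
SimpleLocation, the use of ValueError, and the choice to let a missing end
default to start for 'EXACT' locations are all assumptions made for
illustration.

```python
class SimpleLocation:
    """Hypothetical model of the proposed Bio::Location::Simple rules."""

    VALID_TYPES = ("EXACT", "IN-BETWEEN")

    def __init__(self, start=None, end=None, location_type="EXACT"):
        if location_type not in self.VALID_TYPES:
            raise ValueError("unknown location_type: %r" % location_type)

        # Assumption: a single-position EXACT location has end == start.
        if location_type == "EXACT" and end is None:
            end = start

        # An in-between location must sit between two adjacent positions,
        # but the check only applies when both ends are defined.
        if (location_type == "IN-BETWEEN"
                and start is not None and end is not None
                and end - start != 1):
            raise ValueError(
                "IN-BETWEEN needs adjacent positions, got %d^%d" % (start, end))

        self.start = start
        self.end = end
        self.location_type = location_type

    def length(self):
        # An insertion site between two residues covers no residues.
        if self.location_type == "IN-BETWEEN":
            return 0
        if self.start is None or self.end is None:
            return None
        return self.end - self.start + 1
```

In this sketch SimpleLocation(34, 55).length() is 22,
SimpleLocation(46, 47, "IN-BETWEEN").length() is 0, and
SimpleLocation(46, 78, "IN-BETWEEN") raises an error.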
In practice the code changes seem to be easy to make, and there might even be
a slight speed increase: the current Simple does some things in a slightly
convoluted way because its methods are inherited by Fuzzy and Split.
Using Bio::Location::Simple in scripts and other modules becomes more
complicated only if you are concerned about insertions (you should be!).
You can then test either location_type() or length().
The only other place in bioperl core outside Bio::Location that I have found
to be affected is FTHelper.pm where one more condition needs to be added.
I have almost all the code changes ready for committing.
Any comments?
-Heikki
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki@ebi.ac.uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________