[Bioperl-l] Refactoring Locations...
Heikki Lehvaslaiho
heikki@ebi.ac.uk
Tue, 02 Jul 2002 09:14:08 +0100
It's done.
If you see any errors generated by the location code in the CVS HEAD, I'd be
happy to have a look at them.
Bio::Location::Fuzzy now complains if a location like 23^24 is assigned to it.
Use Bio::Location::Simple with location_type('IN-BETWEEN') instead.
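
As a rough illustration of the rule described in the quoted message below (an 'IN-BETWEEN' location must have adjacent start and end, and its length is zero), here is a small language-neutral model in Python. It is only a sketch under stated assumptions: the class name SimpleLocation, its fields and the error type are hypothetical stand-ins and not the BioPerl implementation or API.

```python
from dataclasses import dataclass
from typing import Optional

EXACT = "EXACT"
IN_BETWEEN = "IN-BETWEEN"


@dataclass
class SimpleLocation:
    """Hypothetical model of a simple location; not the BioPerl class."""

    start: Optional[int] = None
    end: Optional[int] = None
    location_type: str = EXACT  # 'EXACT' is the default, as in the proposal

    def __post_init__(self) -> None:
        if self.location_type not in (EXACT, IN_BETWEEN):
            raise ValueError(f"unknown location_type: {self.location_type!r}")
        # An in-between location such as 23^24 sits between two neighbouring
        # residues, so start and end (when both are set) must be adjacent.
        if (
            self.location_type == IN_BETWEEN
            and self.start is not None
            and self.end is not None
            and self.end - self.start != 1
        ):
            raise ValueError(
                f"IN-BETWEEN location needs adjacent positions, "
                f"got {self.start}^{self.end}"
            )

    def length(self) -> int:
        """Zero for IN-BETWEEN locations, inclusive span for EXACT ones."""
        if self.location_type == IN_BETWEEN:
            return 0
        if self.start is None or self.end is None:
            raise ValueError("start and end must both be set for an EXACT length")
        return self.end - self.start + 1
```

With this model, SimpleLocation(23, 24, IN_BETWEEN) is accepted and has length 0, while SimpleLocation(46, 78, IN_BETWEEN) raises an error at construction time.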
The Location.t tests failed overnight because I forgot to add and commit
Bio::Location::Atomic. Fixed.
There are really quite a lot of errors and warnings when running the tests in
the HEAD. It is difficult to see which are important and which are caused
by missing binaries.
-heikki
Heikki Lehvaslaiho wrote:
>
> I ran into a small problem with Bio::Locations and would like to
> slightly refactor them.
>
> From my point of view there are three types of exact sequence locations
> which in feature table notation are: 23, 34..55 and 46^47. The first two
> are handled by Bio::Location::Simple and have location_type('EXACT').
> The last one is lumped into location_type('BETWEEN') together with
> locations like 46^78 and handled by Bio::Location::Fuzzy. The source for
> the confusion is that the feature table definition allows for locations
> like 46^78, which I do not think are used anywhere. To stress: the notation
> 46^47 is essential when you have clean insertions between residues.
>
>
> Currently we have Bio::LocationI which defines the interface,
> Bio::Location::Simple and two subclasses of Simple: Bio::Location::Fuzzy
> and Bio::Location::Split.
>
> What I'd like to do is rename the current Simple to Atomic, to be
> a common superclass, and recreate Bio::Location::Simple so that it can
> have two values for the method location_type(): 'EXACT' and
> 'IN-BETWEEN' ('TWEEN', 'TWIXT' ?). The object will throw an error if
> location_type() is 'IN-BETWEEN' and start() and end() are both defined
> and not adjacent. The length of an 'IN-BETWEEN' location is always zero.
> The default value of location_type() will be 'EXACT'.
>
>
> In practice the code changes seem to be easy to make and there might
> even be a slight speed increase: the current Simple does some things in a
> slightly convoluted way because its methods are inherited by Fuzzy and Split.
> Using Bio::Location::Simple in scripts and other modules becomes more
> complicated only if you are concerned about insertions (you should
> be!). You can then test either location_type() or length().
>
>
> The only other place in bioperl core outside Bio::Location that I have
> found to be affected is FTHelper.pm where one more condition needs to be
> added.
>
>
> I have almost all the code changes ready for committing.
>
> Any comments?
>
> -Heikki
>
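
The three exact notations discussed in the quoted message (23, 34..55 and 46^47), plus the non-adjacent 46^78 form, can be told apart with a few regular expressions. The Python sketch below is illustrative only: the function name classify_location and the labels 'NON-ADJACENT' and 'UNRECOGNISED' are assumptions made for this example, and it covers only these four plain forms rather than the full feature table syntax.

```python
import re


def classify_location(text: str) -> str:
    """Classify a plain feature-table location string.

    Returns 'EXACT' for '23' or '34..55', 'IN-BETWEEN' for an adjacent
    pair such as '46^47', 'NON-ADJACENT' for a pair such as '46^78',
    and 'UNRECOGNISED' for anything else (labels are hypothetical).
    """
    # Single residue, e.g. "23"
    if re.fullmatch(r"\d+", text):
        return "EXACT"
    # Residue range, e.g. "34..55"
    if re.fullmatch(r"\d+\.\.\d+", text):
        return "EXACT"
    # Site between two positions, e.g. "46^47"
    match = re.fullmatch(r"(\d+)\^(\d+)", text)
    if match:
        start, end = int(match.group(1)), int(match.group(2))
        return "IN-BETWEEN" if end - start == 1 else "NON-ADJACENT"
    return "UNRECOGNISED"


if __name__ == "__main__":
    for example in ("23", "34..55", "46^47", "46^78"):
        print(example, classify_location(example))
```

Only the adjacent caret form describes a clean insertion site between two residues, which is why it is treated separately from forms like 46^78 here.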
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki@ebi.ac.uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________