[Bioperl-l] Opening up deleting features from Seq objects again
David Block
dblock@gene.pbi.nrc.ca
Thu, 25 Oct 2001 13:34:49 -0600 (CST)
On Thu, 25 Oct 2001, Ewan Birney wrote:
> On Wed, 24 Oct 2001, David Block wrote:
>
> > As part of our work in Genquire (soon to be released BSD! Finally!) we
> > have come to a point where it makes sense to include delete_feature
> > functionality in our Bio::SeqI implementation.
>
> Great!
>
> >
> > This should be extended to Bio::Seq generally because:
> >
>
> No... see below...
>
> > With complex GeneStructure objects, rebuilding a hierarchy of annotations
> > is not trivial. The old technique, flush/add, flattens the hierarchy and
> > can result in multiple copies of exons being added to the sequence.
> >
> > delete_feature can understand the feature's context and remove only those
> > parts of the parent gene that make sense.
> >
> > Our implementation looks like this:
> > my $orphanlist=$seq->delete_feature($feature,$transcript,$gene);
> >
> > This allows the current exon/transcript/gene hierarchy to be passed to the
> > sequence. It returns a list of features which are no longer part of a
> > coherent gene structure, i.e. if you want to delete one of two
> > transcripts, but want the hypothetical exons that make up the transcript
> > to stick around, the exons will be attached as top-level features and
> > returned to you.
> >
> > This allows our gui to function as expected.
> >
> > I volunteer to bolt this functionality on to Bio::Seq or one of its
> > descendants, if that's better. We want it in SeqI, at least as a stub,
> > so SeqCanvas doesn't barf if it's given any SeqI and asked to delete
> > something.
>
> I'm really against adding this functionality in the Bio::Seq
> implementation.
>
>
> You are really forcing the Genquire update model (deleting individual
> features) into the default Bioperl sequence object. I think this is a bad
> idea and should be discouraged.
>
This is a natural outgrowth of SeqFeature::Gene::GeneStructure et al.
Once you have structured data in memory, flush/add is not appropriate all
of the time.
And what is the problem with having a working implementation of
delete_feature? You don't have to use it...
>
> There are other "update policy" systems for database access (Bioperl-db
> follows a more cvs publish/update type model - ish)
>
Of course, that's fine if that's what you want, but how do you dynamically
decide what goes into the next update? Bio::Seq should be able to remove
one of its features from memory - this may or may not affect the
underlying persistent storage.
>
> Furthermore, I have a sneaky suspicion that the feature delete
> requirements become a bit of a can of worms wrt things like multiple
> users.
>
We have a lock system so that users must register locks on portions of the
database. Bio::Seq _has_no_persistence_mechanism_, so no two users are
ever looking at the same memory space. What's your problem?
>
> What I think you should go for is this sort of model
>
>
> # interface that GenQuire needs for a sequence object to be
> # editable
>
I already have all of this - I want to make this portable for the benefit
of SeqCanvas, not for Genquire.
> Bio::GenQuire::UpdateableSeqI
>
> # implementation of this in pure Perl, can inherit from
> # Bio::Seq if so wished to reduce coding
>
> Bio::GenQuire::Seq
>
> # implementation of this with DB backend
>
> Bio::GenQuire::DB::DavidsNameSpace::Whatever
>
>
> This mirrors what we have done in Ensembl, separating out the "Ensembl
> specific" interfaces into Bio::EnsEMBL::* space, and therefore not
> inflicting Ensembl's update model on everyone else (not that we have one).
>
But that doesn't allow people to use Ensembl on any old Bio::Seq, which is
what we want from Bio::Tk::SeqCanvas.
>
>
> Does this make sense? Do other people have views?
>
>
> The important thing is to keep "updatability" coupled with the update
> policy/functionality of the editor/system, as this is very variable.
>
Updatability can be disabled by default and made available under certain
conditions (such as the presence of GeneStructures).
> > SeqCanvas does nothing if nothing is returned, again as you would expect.
> >
So a stub in Bio::SeqI that returns nothing and does nothing accomplishes
most of what we want.
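
A rough sketch of the stub idea, for illustration only. The package names
My::SeqI and My::EditableSeq, the 'parts' key, and the one-argument
delete_feature call are all assumptions made up for this example; this is
not the Genquire code and not the real Bio::SeqI API. The base class
carries a do-nothing delete_feature, and an editable subclass overrides it
and hands back the orphaned sub-features.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an interface such as Bio::SeqI (not the real module).
package My::SeqI;

sub new {
    my ($class) = @_;
    return bless { features => [] }, $class;
}

sub add_feature {
    my ($self, @feats) = @_;
    push @{ $self->{features} }, @feats;
    return;
}

sub top_features { return @{ $_[0]->{features} } }

# Default stub: deletes nothing and returns an empty list, so a caller
# can invoke it on any sequence object without checking its class first.
sub delete_feature { return () }

# Hypothetical in-memory implementation that overrides the stub.
package My::EditableSeq;
our @ISA = ('My::SeqI');

# Removes $feature from the top-level list. Any sub-features it carried
# (under the assumed 'parts' key) are re-attached as top-level features
# and returned as the orphan list.
sub delete_feature {
    my ($self, $feature) = @_;
    my @kept = grep { $_ != $feature } @{ $self->{features} };
    return () if @kept == @{ $self->{features} };    # not found: nothing to do
    my @orphans = @{ $feature->{parts} || [] };
    $self->{features} = [ @kept, @orphans ];
    return @orphans;
}

package main;

my $exon1      = { name => 'exon1' };
my $exon2      = { name => 'exon2' };
my $transcript = { name => 'transcript1', parts => [ $exon1, $exon2 ] };

# On the plain object the stub leaves everything in place.
my $plain = My::SeqI->new;
$plain->add_feature($transcript);
my @none = $plain->delete_feature($transcript);

# On the editable object the transcript goes and its exons are orphaned.
my $seq = My::EditableSeq->new;
$seq->add_feature($transcript);
my @orphans = $seq->delete_feature($transcript);
print scalar(@orphans), " orphaned features\n";
```

A caller like SeqCanvas would then only need to act when the returned list
is non-empty, whichever kind of sequence object it was given.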
--
David Block
soon to be moving
dblock@gene.pbi.nrc.ca
http://bioinfo.pbi.nrc.ca/wiki
NRC Plant Biotechnology Institute
Saskatoon, SK, Canada