[Bioperl-l] additional methods for Bio::SeqUtils for in-silico cloning

Frank Schwach fs5 at sanger.ac.uk
Tue Jan 10 16:47:41 UTC 2012


Hi Roy,

Sorry, I hadn't explained that very well: it's not the outer boundaries
of the feature that become fuzzy but the "inner" ones of the split
locations:

 --------------------           a feature's location
==========xxxx================= sequence


 ---------                     sublocation 1
          --------             sublocation 2
===============================
         
x= sequence to delete
The feature's location has changed from Simple to Split.

Sublocation 1:
start is still EXACT and has not changed
end is now AFTER because this is not a true end of the feature

Sublocation 2:
start is now BEFORE because this is not a true start of the feature
end is EXACT (but shifted)
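
For illustration only (this is not the code from the pull request), here is
a minimal Python sketch of the coordinate arithmetic described above. It
assumes 1-based inclusive coordinates, as used by BioPerl locations. The
names delete_region and SubLocation are hypothetical, and the position types
are plain strings rather than real Bio::Location objects.

```python
from typing import List, NamedTuple


class SubLocation(NamedTuple):
    """One piece of a feature location (hypothetical helper type)."""
    start: int
    start_type: str  # "EXACT" or "BEFORE"
    end: int
    end_type: str    # "EXACT" or "AFTER"


def delete_region(feat_start: int, feat_end: int,
                  del_start: int, del_end: int) -> List[SubLocation]:
    """Return the feature's sublocations after deleting del_start..del_end.

    All coordinates are 1-based and inclusive (an assumption of this sketch).
    An empty list means the feature was lost with the deleted segment.
    """
    if feat_start > feat_end or del_start > del_end:
        raise ValueError("start must not be greater than end")

    del_len = del_end - del_start + 1

    # Feature lies entirely upstream of the deletion: nothing changes.
    if feat_end < del_start:
        return [SubLocation(feat_start, "EXACT", feat_end, "EXACT")]

    # Feature lies entirely downstream: shift by the deleted length.
    if feat_start > del_end:
        return [SubLocation(feat_start - del_len, "EXACT",
                            feat_end - del_len, "EXACT")]

    # Feature is contained in the deleted region: it is lost.
    if feat_start >= del_start and feat_end <= del_end:
        return []

    parts = []
    # A piece survives upstream of the deletion. Its start is unchanged;
    # its end is marked AFTER because it is not a true end of the feature.
    if feat_start < del_start:
        parts.append(SubLocation(feat_start, "EXACT", del_start - 1, "AFTER"))
    # A piece survives downstream of the deletion. Its start is marked
    # BEFORE; its end stays EXACT but is shifted by the deleted length.
    if feat_end > del_end:
        parts.append(SubLocation(del_start, "BEFORE",
                                 feat_end - del_len, "EXACT"))
    return parts


# A feature at 2..21 with bases 11..14 deleted becomes a two-part split:
print(delete_region(2, 21, 11, 14))
# [SubLocation(start=2, start_type='EXACT', end=10, end_type='AFTER'),
#  SubLocation(start=11, start_type='BEFORE', end=17, end_type='EXACT')]
```

A feature that only extends into the deleted region from one side yields a
single sublocation with one fuzzy end, and a feature fully inside the
deletion yields an empty list.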

I hope this makes more sense(?)

Cheers,

Frank



On Tue, 2012-01-10 at 15:25 +0000, Roy Chaudhuri wrote:
> Hi Frank,
> 
> Looks good to me. One thing I'm not sure about - why do features 
> overlapping a deletion become fuzzy? That behaviour is in 
> trunc_with_features because it's intended to represent taking a 
> subregion of a larger sequence, but if you're representing an internal 
> deletion then the boundaries of the overlapping feature aren't unknown, 
> they have been specifically altered. Maybe you could give absolute 
> coordinates, but add a note indicating that the 5' or 3' end has been 
> truncated by however many bases.
> 
> Cheers,
> Roy.
> 
> On 10/01/2012 13:10, Frank Schwach wrote:
> > Hi Chris,
> >
> > I have made the changes in a Git fork and made the pull request now.
> > If this is accepted into BioPerl I can also write a little SeqUtils
> > HOWTO for the BioPerl wiki.
> >
> > Frank
> >
> >
> > On Mon, 2012-01-09 at 18:29 +0000, Fields, Christopher J wrote:
> >> Sounds very promising!  The easiest way to contribute is via a fork of the code on Github with a pull request (as you already know, being a contributor to the Primer3 modules).
> >>
> >> chris
> >>
> >> On Jan 9, 2012, at 11:10 AM, Frank Schwach wrote:
> >>
> >>> Hi all,
> >>>
> >>> I needed to manipulate Bio::Seq objects with annotations and sequence
> >>> features to simulate molecular cloning techniques, e.g. to cut a vector
> >>> and insert a fragment into it while preserving all the annotations and
> >>> moving the features accordingly.
> >>> My main aim was to split features that span deletion/insertion sites in
> >>> a meaningful way, which cannot be done with the currently available
> >>> methods.
> >>> I have modified Bio::SeqUtils so that I have the following new methods:
> >>>
> >>> delete
> >>> ======
> >>> removes a segment from a sequence object and adjusts positions and types
> >>> of locations of sequence features:
> >>> - locations of features that span the deletion sites are turned into
> >>> Splits.
> >>> - locations that extend into the deleted region are turned to Fuzzy to
> >>> indicate that their true start/end was lost.
> >>> - locations contained inside the deleted regions are lost.
> >>> - other features are shifted according to the length of the deletion.
> >>>
> >>> insert
> >>> ======
> >>> adds a Bio::Seq object into another one between specified insertion
> >>> sites. This also affects the features on the recipient sequence:
> >>> - locations of features that span the insertion site are split but
> >>> position types are not turned to Fuzzy because no part of the original
> >>> feature is lost.
> >>> - other features are shifted according to the length of the insertion.
> >>>
> >>> ligate
> >>> ======
> >>> just for convenience. Supply a recipient, a fragment and one or two
> >>> sites to cut the recipient. Can also flip the fragment if required.
> >>> Simply calls delete [, reverse_complement_with_features] and insert in
> >>> turn.
> >>>
> >>>
> >>> One situation I haven't handled yet is a deletion that spans the origin
> >>> of a circular molecule but that should be a rare thing to do anyway. The
> >>> code currently throws an error if this is attempted.
> >>>
> >>> I'm happy to contribute the code on Github if there is interest?
> >>> Comments on the handling of feature locations highly welcome!
> >>>
> >>> Frank
> >>
> >>
> >>
> >>
> >
> >
> >
> 



