[Bioperl-l] additional methods for Bio::SeqUtils for in-silico cloning
Frank Schwach
fs5 at sanger.ac.uk
Tue Jan 10 16:47:41 UTC 2012
Hi Roy,
Sorry, I hadn't explained that very well: it's not the outer boundaries
of the feature that become fuzzy but the "inner" ones of the split
locations:
 --------------------            a feature's location
==========xxxx=================  sequence
 ---------                       sublocation 1
              --------           sublocation 2
===============================
x= sequence to delete
The feature's location has changed from Simple to Split.
Sublocation 1:
start is still EXACT and has not changed
end is now AFTER because this is not a true end of the feature
Sublocation 2:
start is BEFORE
end is EXACT (but shifted)
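
To make the position types concrete, here is a minimal sketch in plain Python. It is not the Bio::SeqUtils code: the names SubLocation and split_on_deletion are made up for illustration, coordinates are assumed to be 1-based and inclusive, and only the case of a feature that spans the whole deleted region is handled.

```python
from dataclasses import dataclass


@dataclass
class SubLocation:
    """Hypothetical stand-in for one sublocation of a split location."""
    start: int
    end: int
    start_type: str = "EXACT"
    end_type: str = "EXACT"


def split_on_deletion(feat_start, feat_end, del_start, del_end):
    """Split a feature that spans the deleted region del_start..del_end.

    Returns two sublocations in post-deletion coordinates.
    """
    if not (feat_start < del_start and del_end < feat_end):
        raise ValueError("feature must span the whole deleted region")
    del_len = del_end - del_start + 1
    # Left part: start untouched, end is not a true end of the feature.
    left = SubLocation(feat_start, del_start - 1, "EXACT", "AFTER")
    # Right part: start is not a true start, end is exact but shifted left.
    right = SubLocation(del_start, feat_end - del_len, "BEFORE", "EXACT")
    return [left, right]


if __name__ == "__main__":
    for sub in split_on_deletion(2, 21, 11, 14):
        print(sub)
```

With a feature at 2..21 and a deletion of 11..14, the two parts come out as 2..10 (EXACT/AFTER) and 11..17 (BEFORE/EXACT).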
I hope this makes more sense(?)
Cheers,
Frank
On Tue, 2012-01-10 at 15:25 +0000, Roy Chaudhuri wrote:
> Hi Frank,
>
> Looks good to me. One thing I'm not sure about - why do features
> overlapping a deletion become fuzzy? That behaviour is in
> trunc_with_features because it's intended to represent taking a
> subregion of a larger sequence, but if you're representing an internal
> deletion then the boundaries of the overlapping feature aren't unknown,
> they have been specifically altered. Maybe you could give absolute
> coordinates, but add a note indicating that the 5' or 3' end has been
> truncated by however many bases.
>
> Cheers,
> Roy.
>
> On 10/01/2012 13:10, Frank Schwach wrote:
> > Hi Chris,
> >
> > I have made the changes in a Git fork and made the pull request now.
> > If this is accepted into BioPerl I can also write a little SeqUtils
> > HOWTO for the BioPerl wiki.
> >
> > Frank
> >
> >
> > On Mon, 2012-01-09 at 18:29 +0000, Fields, Christopher J wrote:
> >> Sounds very promising! The easiest way to contribute is via a fork of the code on Github with a pull request (as you already know, being a contributor to the Primer3 modules).
> >>
> >> chris
> >>
> >> On Jan 9, 2012, at 11:10 AM, Frank Schwach wrote:
> >>
> >>> Hi all,
> >>>
> >>> I needed to manipulate Bio::Seq objects with annotations and sequence
> >>> features to simulate molecular cloning techniques, e.g. to cut a vector
> >>> and insert a fragment into it while preserving all the annotations and
> >>> moving the features accordingly.
> >>> My main aim was to split features that span deletion/insertion sites in
> >>> a meaningful way, which cannot be done with the currently available
> >>> methods.
> >>> I have modified Bio::SeqUtils so that I have the following new methods:
> >>>
> >>> delete
> >>> ======
> >>> removes a segment from a sequence object and adjusts positions and types
> >>> of locations of sequence features:
> >>> - locations of features that span the deletion sites are turned into
> >>> Splits.
> >>> - locations that extend into the deleted region are turned to Fuzzy to
> >>> indicate that their true start/end was lost.
> >>> - locations contained inside the deleted regions are lost.
> >>> - other features are shifted according to the length of the deletion.
> >>>
> >>> insert
> >>> ======
> >>> adds a Bio::Seq object into another one between specified insertion
> >>> sites. This also affects the features on the recipient sequence:
> >>> - locations of features that span the insertion site are split but
> >>> position types are not turned to Fuzzy because no part of the original
> >>> feature is lost.
> >>> - other features are shifted according to the length of the insertion.
> >>>
> >>> ligate
> >>> ======
> >>> just for convenience. Supply a recipient, a fragment and one or two
> >>> sites to cut the recipient. Can also flip the fragment if required.
> >>> Simply calls delete [, reverse_complement_with_features] and insert in
> >>> turn.
> >>>
> >>>
> >>> One situation I haven't handled yet is a deletion that spans the origin
> >>> of a circular molecule but that should be a rare thing to do anyway. The
> >>> code currently throws an error if this is attempted.
> >>>
> >>> I'm happy to contribute the code on Github if there is interest?
> >>> Comments on the handling of feature locations highly welcome!
> >>>
> >>> Frank
> >>
> >>
> >>
> >>
> >
> >
> >
>
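
The insert behaviour described in the quoted proposal can be sketched in plain Python. This is an illustration under assumptions, not the Bio::SeqUtils implementation: adjust_for_insertion is a hypothetical name, coordinates are assumed to be 1-based and inclusive, and the insertion is assumed to go between positions insert_after and insert_after + 1. Position types are left out because, per the proposal, they stay exact on insertion.

```python
def adjust_for_insertion(feat_start, feat_end, insert_after, insert_len):
    """Return the feature's (start, end) pairs in post-insertion coordinates.

    A feature that spans the insertion site is split into two parts;
    no bases of the original feature are lost.
    """
    if feat_end <= insert_after:
        # Entirely upstream of the insertion site: unchanged.
        return [(feat_start, feat_end)]
    if feat_start > insert_after:
        # Entirely downstream: shifted by the length of the insertion.
        return [(feat_start + insert_len, feat_end + insert_len)]
    # Spans the site: left part unchanged, right part shifted.
    return [
        (feat_start, insert_after),
        (insert_after + 1 + insert_len, feat_end + insert_len),
    ]


if __name__ == "__main__":
    print(adjust_for_insertion(5, 20, 10, 6))
```

For a feature at 5..20 and a 6-base fragment inserted after position 10, the parts are 5..10 and 17..26, which together still cover the original 16 bases.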