[Bioperl-l] additional methods for Bio::SeqUtils for in-silico cloning

Frank Schwach fs5 at sanger.ac.uk
Mon Jan 9 17:10:37 UTC 2012


Hi all,

I needed to manipulate Bio::Seq objects with annotations and sequence
features to simulate molecular cloning techniques, e.g. to cut a vector
and insert a fragment into it while preserving all the annotations and
moving the features accordingly.
My main aim was to split features that span deletion/insertion sites in
a meaningful way, which cannot be done with the currently available
methods.
I have modified Bio::SeqUtils to add the following new methods:

delete
======
removes a segment from a sequence object and adjusts the positions and
location types of its sequence features:
- locations of features that span the whole deleted region are turned
into Split locations.
- locations that extend into the deleted region are turned into Fuzzy
locations to indicate that their true start/end was lost.
- locations contained inside the deleted region are lost.
- other features are shifted according to the length of the deletion.
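
To make the four rules above concrete, here is a minimal sketch in
Python on plain tuples. It is not the Bio::SeqUtils code: the function
name delete_segment, the (name, start, end) feature tuples and the
fuzzy_start/fuzzy_end flags are assumptions made for illustration, and
coordinates are taken to be 1-based and inclusive as in BioPerl.

```python
def delete_segment(seq, features, left, right):
    """Remove bases left..right (1-based, inclusive) and remap features.

    features: list of (name, start, end) tuples on the original sequence.
    Returns (new_seq, remapped); each remapped entry is a dict holding
    'parts' (a list of (start, end) sub-locations) and two fuzzy flags.
    """
    if not (1 <= left <= right <= len(seq)):
        raise ValueError("deletion must lie within the sequence")
    cut = right - left + 1
    new_seq = seq[:left - 1] + seq[right:]
    remapped = []
    for name, start, end in features:
        out = {"name": name, "parts": [],
               "fuzzy_start": False, "fuzzy_end": False}
        if end < left:
            # Entirely upstream of the deletion: unchanged.
            out["parts"] = [(start, end)]
        elif start > right:
            # Entirely downstream: shifted by the deleted length.
            out["parts"] = [(start - cut, end - cut)]
        elif start >= left and end <= right:
            # Contained in the deleted region: the feature is lost.
            continue
        elif start < left and end > right:
            # Spans the whole deletion: becomes a two-part split.
            out["parts"] = [(start, left - 1), (left, end - cut)]
        elif start < left:
            # Runs into the deletion from upstream: true end is lost.
            out["parts"] = [(start, left - 1)]
            out["fuzzy_end"] = True
        else:
            # Starts inside the deletion: true start is lost.
            out["parts"] = [(left, end - cut)]
            out["fuzzy_start"] = True
        remapped.append(out)
    return new_seq, remapped
```

For example, deleting positions 5..8 from a 16-base sequence turns a
feature at 3..10 into the two parts 3..4 and 5..6, and a feature at
2..6 into 2..4 with a fuzzy end.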

insert
======
adds a Bio::Seq object into another one between specified insertion
sites. This also affects the features on the recipient sequence:
- locations of features that span the insertion site are split but
position types are not turned to Fuzzy because no part of the original
feature is lost.
- other features are shifted according to the length of the insertion.
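
The same coordinate arithmetic for an insertion can be sketched as
follows, again in plain Python rather than as the actual method. The
name insert_fragment and the single insert_after argument (the
fragment goes between positions insert_after and insert_after + 1,
1-based) are assumptions for illustration.

```python
def insert_fragment(seq, fragment, features, insert_after):
    """Insert fragment after position insert_after and remap features.

    features: list of (name, start, end) tuples, 1-based inclusive.
    Returns (new_seq, remapped) where remapped is a list of
    (name, parts) and parts is a list of (start, end) sub-locations.
    """
    if not (0 <= insert_after <= len(seq)):
        raise ValueError("insertion site must lie within the sequence")
    shift = len(fragment)
    new_seq = seq[:insert_after] + fragment + seq[insert_after:]
    remapped = []
    for name, start, end in features:
        if end <= insert_after:
            # Entirely upstream of the insertion site: unchanged.
            parts = [(start, end)]
        elif start > insert_after:
            # Entirely downstream: shifted by the fragment length.
            parts = [(start + shift, end + shift)]
        else:
            # Spans the site: split around the insert. Nothing is
            # lost, so neither part needs a fuzzy boundary.
            parts = [(start, insert_after),
                     (insert_after + 1 + shift, end + shift)]
        remapped.append((name, parts))
    return new_seq, remapped
```

Inserting a 3-base fragment after position 4 of an 8-base sequence
leaves a feature at 1..4 untouched, moves one at 5..8 to 8..11 and
splits one at 3..6 into 3..4 and 8..9.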

ligate
======
a convenience method. Supply a recipient, a fragment and one or two
sites at which to cut the recipient. It can also flip the fragment if
required. It simply calls delete [, reverse_complement_with_features]
and insert in turn.
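
The delete / flip / insert chain can be sketched on bare sequence
strings as below (features are left out to keep it short). This is an
illustration only: the argument names and the convention that left is
the last base kept before the insert and right the first base kept
after it are assumptions, not the method's confirmed interface.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def ligate(recipient, fragment, left, right=None, flip=False):
    """Cut recipient and put fragment in the gap.

    left:  last recipient base kept before the insert (1-based).
    right: first recipient base kept after the insert; defaults to
           left + 1, i.e. a single cut site with nothing deleted.
    flip:  insert the reverse complement of fragment instead.
    """
    if right is None:
        right = left + 1
    if not (0 <= left < right <= len(recipient) + 1):
        raise ValueError("need 0 <= left < right <= len(recipient) + 1")
    # Step 1 (delete): drop bases left+1 .. right-1 from the recipient.
    cut = recipient[:left] + recipient[right - 1:]
    # Step 2 (optional flip): reverse-complement the fragment.
    if flip:
        fragment = reverse_complement(fragment)
    # Step 3 (insert): place the fragment after position left.
    return cut[:left] + fragment + cut[left:]
```

With one site, ligate("AAAATTTT", "GG", 4) simply inserts the fragment
after base 4; with two sites, the bases strictly between them are
removed first.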


One situation I haven't handled yet is a deletion that spans the origin
of a circular molecule, but that should be a rare thing to do anyway.
The code currently throws an error if this is attempted.

I'm happy to contribute the code on GitHub if there is interest.
Comments on the handling of feature locations are highly welcome!

Frank







