[Bioperl-l] (no subject)
Ewan Birney
birney at ebi.ac.uk
Wed Sep 10 10:35:39 EDT 2003
I have been scripting my own primer design for a while, as I find I
have better control over the heuristics and (importantly) can include
BLAST/exonerate matching of a region against its own genome to find
unique-in-genome areas.
I know Primer3 is out there, but in some cases, making sure you design
a primer in
a non-duplicated region is more important than getting the right G/C
content etc.
I'd like to propose the following modules:
Bio::Primer::Feature.pm
a single primer, SeqFeatureI compliant, with start/end on a sequence;
reuses seq(), has GC content methods, and has_inversion($size), which
gives back the first inverted string over $size, or undef if none.
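
As a rough illustration of the two per-primer checks, here is a minimal sketch in Python rather than Perl. The function names mirror the proposed methods, but everything else is an assumption: "inverted string" is read as a stretch whose reverse complement also occurs in the primer, "over size" is read as strictly longer than size, and the shortest qualifying window is returned rather than the full inverted stretch.

```python
from typing import Optional

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Fraction of G/C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def has_inversion(seq: str, size: int) -> Optional[str]:
    """Return the first window longer than `size` whose reverse
    complement also occurs in seq, or None if there is none.

    Assumption: only the shortest qualifying window (size + 1 bases)
    is reported; a self-complementary window counts as an inversion.
    """
    s = seq.upper()
    width = size + 1
    for i in range(len(s) - width + 1):
        window = s[i:i + width]
        if reverse_complement(window) in s:
            return window
    return None
```

For example, has_inversion("GGGATCAAAAGATCCC", 4) finds "GGGAT", because its reverse complement "ATCCC" sits at the 3' end of the same string.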
Bio::Primer::Pair.pm
a pair of primers, having left and right Bio::Primer::Feature.pm's
with "joint" methods such as
diff_gc(), the difference in GC content between the two primers
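
A pair object along these lines could look like the following Python sketch. The class and field names are hypothetical, and whether diff_gc() should be signed or absolute is not settled above; the absolute difference is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    """Hypothetical stand-in for a Bio::Primer::Feature-like object."""
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    seq: str

    def gc_content(self) -> float:
        """Fraction of G/C bases in the primer sequence."""
        if not self.seq:
            return 0.0
        s = self.seq.upper()
        return (s.count("G") + s.count("C")) / len(s)


@dataclass(frozen=True)
class PrimerPair:
    """A left and a right primer with 'joint' methods."""
    left: Primer
    right: Primer

    def diff_gc(self) -> float:
        """Absolute difference in GC fraction between the two primers
        (assumption: unsigned)."""
        return abs(self.left.gc_content() - self.right.gc_content())
```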
Bio::Primer::AssessmentI.pm
an interface which defines the method
$score = $assor->assess($pair);
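
To make the shape of that interface concrete, here is a small Python sketch using an abstract base class. Only the single assess method comes from the proposal; the pair representation (a tuple of left start and right end) and the example ProductLength scoring formula are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from typing import Tuple

# Hypothetical pair representation: (left_start, right_end),
# 1-based inclusive coordinates on the template sequence.
Pair = Tuple[int, int]


class Assessment(ABC):
    """Anything that can score a primer pair."""

    @abstractmethod
    def assess(self, pair: Pair) -> float:
        """Return a score for the pair; higher is better."""


class ProductLength(Assessment):
    """Example implementation: penalise distance from an ideal product length."""

    def __init__(self, ideal: int = 1000) -> None:
        self.ideal = ideal

    def assess(self, pair: Pair) -> float:
        left_start, right_end = pair
        product = right_end - left_start + 1
        # 0.0 at the ideal length, increasingly negative away from it.
        return -abs(product - self.ideal) / self.ideal
```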
Bio::Primer::Design.pm
takes a sequence, an optional left hand region (defaults to
50bp), an optional right hand region (defaults to 50bp), an optional
primer size (default of 20), an optional prune score and a list of
Bio::Primer::AssessmentI.pm compliant modules.
design works the following way:
generates every left hand and right hand primer of the given size
for each left/right pair, applies each Assessment module in turn.
If the score falls below prune at any point, discards this pair
immediately (so, for example, by setting prune to -100 and having an
assessment module such as inversion_greater_than_5 return -200,
primer pairs with such an inversion are never considered, which keeps
the list manageable if need be).
stores the final score for this pair
provides the final "best pair" or the complete list
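
The design loop described above can be sketched as follows, again in Python. Several details are assumptions rather than part of the proposal: scores from successive assessment modules are summed into a running total, a primer is represented as a (start, sequence) tuple with 0-based starts, the right hand primer is kept as the forward-strand substring, and no check is made that the two regions do not overlap.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Primer = Tuple[int, str]             # (0-based start, primer sequence)
Pair = Tuple[Primer, Primer]         # (left, right)
Assessor = Callable[[Pair], float]   # stands in for $assor->assess($pair)


def candidate_primers(seq: str, begin: int, end: int, size: int) -> List[Primer]:
    """Every window of `size` bases lying fully inside seq[begin:end]."""
    return [(i, seq[i:i + size]) for i in range(begin, end - size + 1)]


def design(seq: str,
           assessors: Sequence[Assessor],
           left_region: int = 50,
           right_region: int = 50,
           size: int = 20,
           prune: Optional[float] = None) -> List[Tuple[float, Pair]]:
    """Score every left/right pair; return (score, pair) best first."""
    lefts = candidate_primers(seq, 0, min(left_region, len(seq)), size)
    rights = candidate_primers(seq, max(0, len(seq) - right_region), len(seq), size)

    scored: List[Tuple[float, Pair]] = []
    for left in lefts:
        for right in rights:
            pair = (left, right)
            score = 0.0
            pruned = False
            for assess in assessors:
                score += assess(pair)  # assumption: scores are additive
                if prune is not None and score < prune:
                    pruned = True      # discard this pair immediately
                    break
            if not pruned:
                scored.append((score, pair))

    scored.sort(key=lambda item: item[0], reverse=True)
    return scored
```

The first element of the returned list is the "best pair"; the whole list is the complete set of pairs that survived pruning.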
Assessment modules first up would be:
Bio::Primer::Assessment::inversion_length.pm
Bio::Primer::Assessment::GC_content.pm
Bio::Primer::Assessment::GC_matching.pm (primers should have the
same melting temperature)
Bio::Primer::Assessment::product_length.pm (ideal product length of
around 1 kb)
These would all take a "weight" constructor argument to allow them to
be weighted differently.
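
One way the weight could work, sketched in Python: a base class multiplies each module's raw score by its weight. The class names, the raw_score hook, the target GC fraction of 0.5 and the pair representation (two plain sequence strings) are all assumptions for illustration.

```python
from typing import Tuple

Pair = Tuple[str, str]  # hypothetical: (left primer seq, right primer seq)


def gc_fraction(seq: str) -> float:
    """Fraction of G/C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


class WeightedAssessment:
    """Base class: assess() is the module's raw score times its weight."""

    def __init__(self, weight: float = 1.0) -> None:
        self.weight = weight

    def raw_score(self, pair: Pair) -> float:
        raise NotImplementedError

    def assess(self, pair: Pair) -> float:
        return self.weight * self.raw_score(pair)


class GCContent(WeightedAssessment):
    """Penalise each primer's distance from a target GC fraction."""

    def __init__(self, target: float = 0.5, weight: float = 1.0) -> None:
        super().__init__(weight)
        self.target = target

    def raw_score(self, pair: Pair) -> float:
        return -sum(abs(gc_fraction(p) - self.target) for p in pair)


class GCMatching(WeightedAssessment):
    """Penalise a GC difference between the two primers."""

    def raw_score(self, pair: Pair) -> float:
        left, right = pair
        return -abs(gc_fraction(left) - gc_fraction(right))
```

With this layout, doubling a module's weight doubles its influence on the summed score without touching its scoring logic.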
I'd also build in Bio::SearchIO or SeqFeature based modules which
"banned" certain regions of the
sequence from being used.
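
A banned-region module could be just another assessment that returns a large negative score when either primer touches a banned interval, so that the prune threshold removes the pair. The Python sketch below assumes the banned regions have already been reduced to (start, end) intervals (for example from search hits or features); the class name, the default penalty of -200 and the pair representation are hypothetical.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]        # 1-based, inclusive (start, end)
Pair = Tuple[Interval, Interval]  # (left primer span, right primer span)


def overlaps_any(span: Interval, regions: Iterable[Interval]) -> bool:
    """True if span shares at least one base with any region."""
    start, end = span
    return any(start <= r_end and r_start <= end for r_start, r_end in regions)


class BannedRegions:
    """Assessment that vetoes pairs with a primer inside a banned region."""

    def __init__(self, regions: Iterable[Interval], penalty: float = -200.0) -> None:
        self.regions: List[Interval] = list(regions)
        self.penalty = penalty

    def assess(self, pair: Pair) -> float:
        left, right = pair
        if overlaps_any(left, self.regions) or overlaps_any(right, self.regions):
            return self.penalty
        return 0.0
```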
I thought about putting the Bio::Primer::Feature.pm in
Bio::SeqFeature::Primer.pm but I thought that
keeping all the modules together made more sense.
This could also go under
Bio::Tools::Primer::*
if people preferred.
Any views?