[Bioperl-l] New modules

Rob Edwards redwards at utmem.edu
Sat Oct 18 22:52:47 EDT 2003


I have written a couple of modules that are bioperl-ish. They are by no 
means finished, polished objects, and there are no t/ test scripts at the 
moment. (But there are some docs!)

Bio::Tools::RepeatFinder for finding direct and indirect, perfect, and 
imperfect repeats in DNA sequences.

It is not necessarily speed-optimized, but it works well for sequences 
of up to several hundred kb. Some notes on the imperfect repeats part: 
(a) it just joins other repeats that it finds, so there is a lower limit 
on the length of imperfect repeats (2n+1, where n is the user-defined 
length of one half of a repeat); (b) in the current implementation it 
requires Tie::RefHash; (c) the distance between repeats can be 
specified; and (d) it really slows things down, because all the 
repeats are compared to all other repeats. You don't have to calculate 
imperfect repeats, though, if you are only looking for perfect repeats.
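
To illustrate where the 2n+1 lower limit comes from, here is a minimal 
sketch in Python. It is not the code in Bio::Tools::RepeatFinder; the 
function names are made up for this example, it only handles direct 
repeats, and it assumes that "joining" means linking two perfect 
repeats of length n that sit either side of a single mismatched base.

```python
from collections import defaultdict


def find_perfect_direct_repeats(seq, n):
    """Return sorted (i, j) start pairs, i < j, where seq[i:i+n] == seq[j:j+n]."""
    positions = defaultdict(list)
    for i in range(len(seq) - n + 1):
        positions[seq[i:i + n]].append(i)
    pairs = []
    for starts in positions.values():
        # Every pair of occurrences of the same n-mer is a perfect repeat.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                pairs.append((starts[a], starts[b]))
    return sorted(pairs)


def join_across_mismatch(seq, pairs, n):
    """Join perfect repeats that flank one mismatched base in both copies.

    Returns (i, j, length) tuples. The length is always 2n+1:
    n matching bases, one mismatch, then n more matching bases.
    """
    pair_set = set(pairs)
    joined = []
    for i, j in pairs:
        # Look for a second perfect repeat starting one base past this one.
        if (i + n + 1, j + n + 1) in pair_set and seq[i + n] != seq[j + n]:
            joined.append((i, j, 2 * n + 1))
    return joined


seq = "ACGTAACGT" + "GGG" + "ACGTCACGT"
pairs = find_perfect_direct_repeats(seq, 4)
imperfect = join_across_mismatch(seq, pairs, 4)
```

With n = 4, the two 9-base copies differ only at their middle base, so 
the shortest imperfect repeat this approach can report is 2*4+1 = 9 
bases long. The sketch uses a set lookup for the joining step rather 
than comparing every repeat to every other repeat.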

Bio::SearchIO::Blink is a parser for BLINK from NCBI.

Blink is NCBI's Blast Link - precalculated BLAST searches for every 
protein in the NR database (I believe). Unfortunately these reports 
don't contain useful things like starts and stops of matches so at the 
moment this doesn't return a very bioperl-ish result, but I am hoping 
that as NCBI develop BLINK this module can evolve into something. It's 
useful as is, though, depending on what you need.

The modules are available from http://www.salmonella.org/bioperl/

Take a look and let me know what you think or if you find bugs.

Rob


