[Bioperl-l] Query re naming / distribution of bio-related module [bio-newbie - be gentle!]
Simon Whitaker
simon@netcetera.org
Mon, 15 Apr 2002 13:01:14 +0100
On Mon, 15 Apr 2002 08:36:52 +0800 (SGT) Elia Stupka wrote:
> don't worry we are not known to bite ;)
Phew. :-)
> Your module sounds interesting and it would probably be very helpful if
> you could post some more specs about the input data, all the various
> methods it has, and the output types, for us to understand better where it
> fits in the package, and whether it overlaps with some of the other
> modules you have.
The data we're working on is arranged in two types of file. A repeat
length data file contains data on the number of repeats found at a
number of loci for a number of strains of the organism being examined.
Data is in comma- or tab-separated format, with a header row naming the
columns. For example:
lgtC,lex2A,lic1,lic2,lic3,strain
15,20,33,30,29,1231
15,20,34,30,29,1232
43,26,39,22,36,1209
41,25,39,22,36,1207
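As a rough illustration of how a file in this format could be read, here is a minimal sketch in Perl. The helper name parse_rl_line is hypothetical and is not one of the module's methods; the sketch assumes the first line is a header and that fields never contain embedded commas or tabs.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module described here):
# split one line of repeat length data on commas or tabs.
sub parse_rl_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;          # strip the line ending, if any
    return split /[,\t]/, $line;
}

# The sample data from above, held in memory instead of a file
my @lines = (
    "lgtC,lex2A,lic1,lic2,lic3,strain",
    "15,20,33,30,29,1231",
    "15,20,34,30,29,1232",
    "43,26,39,22,36,1209",
    "41,25,39,22,36,1207",
);

# Assumption: the first line is a header (locus names plus "strain")
my @header = parse_rl_line(shift @lines);

# Each remaining line becomes one row of a 2D array
my @rows = map { [ parse_rl_line($_) ] } @lines;

printf "%d loci, %d strains\n", scalar(@header) - 1, scalar(@rows);
```

Holding the rows as an array of array references matches the "reference to 2D array" shape mentioned for getRLData() below, though the real internal layout may differ.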
A mutation rate data file maps repeat lengths to their estimated
mutation rates. For example:
10,0.00001
11,0.0001
12,0.001
13,0.01
14,0.1
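A mapping like this fits naturally into a Perl hash keyed on repeat length. The following is only a sketch: the variable name %rate_for is made up for illustration, and it assumes the mutation rate file uses the same comma or tab separators as the repeat length file.

```perl
use strict;
use warnings;

# The sample mutation rate data from above
my @mr_lines = ("10,0.00001", "11,0.0001", "12,0.001", "13,0.01", "14,0.1");

# Build a hash mapping repeat length to estimated mutation rate
my %rate_for;
for my $line (@mr_lines) {
    my ($length, $rate) = split /[,\t]/, $line;
    $rate_for{$length} = $rate;
}

# Look up two repeat lengths; a length absent from the table yields undef
for my $length (12, 15) {
    my $rate = $rate_for{$length};
    print "$length => ", (defined $rate ? $rate : "no rate listed"), "\n";
}
```

How repeat lengths that have no entry in the table should be treated is not specified here, so the sketch simply reports them.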
Here's a breakdown of the public methods (as currently envisaged):
Constructor:
new() - no arguments. Creates and initialises a new Bio::MutPhenotyper
object. (of course :-)
Member functions:
getErrorStr() - returns information on the last error.
loadRLFile() - takes a filename as an argument. Loads repeat length data
for this object from that file. Returns the number of rows read on
success; on failure, sets an error message (retrievable with getErrorStr)
and returns 0. (If the object already has repeat length data loaded, it
is overwritten.)
e.g.:
my $typer = Bio::MutPhenotyper->new;
my $rows = $typer->loadRLFile("somedata.rl") or die $typer->getErrorStr;
readRLData() - as above, but takes a single line of data as an argument.
Returns the number of fields read on success; on failure, sets an error
message (retrievable with getErrorStr) and returns 0.
loadMRFile() - takes a filename as an argument. Loads mutation rate data
for this object from that file. Returns the number of rows read on
success; on failure, sets an error message and returns 0. (Basically as
above.)
getRLData() - returns reference to 2D array containing repeat length
data for the current object
getMRData() - returns reference to a hash containing mutation rate data.
calcMutPhenotypes() - no arguments. Calculates mutation phenotypes for
all strains in the repeat-length data. Returns reference to
2-dimensional array of results on success, sets error message and
returns 0 on failure.
calcMeanRepeats() - no arguments. Calculates mean and s.d. on repeat
lengths in the repeat-length data. Returns reference to 2-dimensional
array of results on success, sets error message and returns 0 on failure.
calcTotalAlleles() - no arguments. Calculates total number of alleles
for each locus. Returns reference to 2-dimensional array of results on
success, sets error message and returns 0 on failure.
translateToUniqueAlleles() - no arguments. Translates repeat length data
into unique alleles. Returns reference to 2-dimensional array of results
on success, sets error message and returns 0 on failure.
translateToMutRates() - no arguments. Translates repeat length data into
mutation rates. Returns reference to 2-dimensional array of results on
success, sets error message and returns 0 on failure.
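To give a feel for the kind of per-locus calculation that calcMeanRepeats() and calcTotalAlleles() describe, here is a minimal sketch using the sample data above. It is not the module's implementation. The helper name locus_stats is hypothetical, the standard deviation is assumed to be the sample form (dividing by n - 1), and "total alleles" is assumed to mean the number of distinct repeat lengths seen at a locus.

```perl
use strict;
use warnings;

# Repeat lengths, one row per strain; columns follow @loci
my @loci = qw(lgtC lex2A lic1 lic2 lic3);
my @rows = (
    [15, 20, 33, 30, 29],
    [15, 20, 34, 30, 29],
    [43, 26, 39, 22, 36],
    [41, 25, 39, 22, 36],
);

# Hypothetical helper: mean, sample s.d. and distinct-value count
# for one column (locus) of the repeat length data.
sub locus_stats {
    my ($data, $col) = @_;
    my @values = map { $_->[$col] } @$data;
    my $n = @values;

    my $sum = 0;
    $sum += $_ for @values;
    my $mean = $sum / $n;

    # Sum of squared deviations from the mean
    my $ss = 0;
    $ss += ($_ - $mean) ** 2 for @values;

    # Assumption: sample standard deviation (n - 1 in the denominator)
    my $sd = $n > 1 ? sqrt($ss / ($n - 1)) : 0;

    # Assumption: each distinct repeat length counts as one allele
    my %seen = map { $_ => 1 } @values;

    return ($mean, $sd, scalar keys %seen);
}

for my $col (0 .. $#loci) {
    my ($mean, $sd, $alleles) = locus_stats(\@rows, $col);
    printf "%-6s mean=%.2f sd=%.2f alleles=%d\n",
        $loci[$col], $mean, $sd, $alleles;
}
```

Collecting one (mean, sd, alleles) triple per locus into an array of array references would give the 2-dimensional result shape mentioned in the method descriptions.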
All the best,
Simon
--
Simon Whitaker
http://netcetera.org/