[Bioperl-l] Query re naming / distribution of bio-related module [bio-newbie - be gentle!]

Simon Whitaker simon@netcetera.org
Mon, 15 Apr 2002 13:01:14 +0100


On Mon, 15 Apr 2002 08:36:52 +0800 (SGT) Elia Stupka wrote:

> don't worry we are not known to bite ;)

Phew. :-)

> Your module sounds interesting and it would probably be very helpful if
> you could post some more specs about the input data, all the various
> methods it has, and the output types, for us to understand better where it
> fits in the package, and whether it overlaps with some of the other
> modules you have.

The data we're working on is arranged in two types of file. A repeat
length data file records the number of repeats found at each of several
loci, for each strain of the organism being examined. The data is in
comma- or tab-separated format, one strain per row. For example:

    lgtC,lex2A,lic1,lic2,lic3,strain
    15,20,33,30,29,1231
    15,20,34,30,29,1232
    43,26,39,22,36,1209
    41,25,39,22,36,1207

A mutation rate data file maps repeat lengths to their estimated
mutation rates. For example:

    10,0.00001
    11,0.0001
    12,0.001
    13,0.01
    14,0.1
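
To make the two formats concrete, here is a rough sketch of how they
might be read into Perl structures. The helper names (split_fields,
parse_rl_lines, parse_mr_lines) are hypothetical and not part of the
module. It assumes the first row of the repeat length data is a header
and that the last column holds the strain.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::MutPhenotyper): split one line
# of comma- or tab-separated data into fields.
sub split_fields {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop a trailing newline, if any
    return split /[,\t]/, $line;
}

# Parse repeat length lines into a 2D array (array of array refs).
# Assumption: the first line is the header row, as in the example above.
sub parse_rl_lines {
    my @lines = @_;
    return [ map { [ split_fields($_) ] } grep { /\S/ } @lines ];
}

# Parse mutation rate lines into a hash: repeat length => rate.
sub parse_mr_lines {
    my @lines = @_;
    my %rate;
    for my $line (grep { /\S/ } @lines) {
        my ($length, $mr) = split_fields($line);
        $rate{$length} = $mr;
    }
    return \%rate;
}

my $rl = parse_rl_lines("lgtC,lex2A,lic1,lic2,lic3,strain\n",
                        "15,20,33,30,29,1231\n");
my $mr = parse_mr_lines("10,0.00001\n", "11,0.0001\n");

print "locus $rl->[0][0], strain $rl->[1][5]: $rl->[1][0] repeats\n";
print "rate for 10 repeats: $mr->{10}\n";
```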

Here's a breakdown of the public methods (as currently envisaged):

Constructor:

new() - no arguments. Creates and initialises a new Bio::MutPhenotyper
object. (of course :-)

Member functions:

getErrorStr() - returns information on last error

loadRLFile() - takes a filename as an argument. Loads repeat length data
for this object from that file. Returns number of rows read on success,
sets an error message (getable with getErrorStr) and returns 0 on
failure. (If the object already has repeat length data loaded it is
overwritten.)

e.g.:

    my $typer = Bio::MutPhenotyper->new;
    my $rows = $typer->loadRLFile("somedata.rl") or die $typer->getErrorStr;

readRLData() - as above, but takes a single line of data as an argument.
Returns number of fields read on success, sets an error message (getable
with getErrorStr) and returns 0 on failure.
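
The convention used by these methods (return a count on success, return
0 and store a message on failure) could look roughly like the sketch
below. Sketch::Typer is a hypothetical stand-in class used only to show
the pattern; the real Bio::MutPhenotyper internals may well differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in class, used only to illustrate the convention.
package Sketch::Typer;

sub new {
    my ($class) = @_;
    return bless { rl_data => [], error => '' }, $class;
}

# Returns the message stored by the last failing call ('' if none).
sub getErrorStr { return $_[0]{error} }

# Takes one line of comma- or tab-separated data. Returns the number of
# fields read, or 0 (with an error message set) on failure.
sub readRLData {
    my ($self, $line) = @_;
    if (!defined $line || $line !~ /\S/) {
        $self->{error} = 'readRLData: empty line';
        return 0;
    }
    $line =~ s/\r?\n\z//;
    my @fields = split /[,\t]/, $line;
    push @{ $self->{rl_data} }, \@fields;
    $self->{error} = '';
    return scalar @fields;
}

package main;

my $typer = Sketch::Typer->new;
my $n = $typer->readRLData("15,20,33,30,29,1231")
    or die $typer->getErrorStr;
print "read $n fields\n";
```

Because 0 is false in Perl, callers can chain the call with
"or die $typer->getErrorStr", as in the loadRLFile example.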

loadMRFile() - takes filename as argument. Loads mutation rate data for
this object from that file. Returns number of rows read on success, sets
error message and returns 0 on failure. (Basically as above.)

getRLData() - returns reference to 2D array containing repeat length
data for the current object.

getMRData() - returns reference to a hash containing mutation rate data.

calcMutPhenotypes() - no arguments. Calculates mutation phenotypes for
all strains in the repeat-length data. Returns reference to
2-dimensional array of results on success, sets error message and
returns 0 on failure.

calcMeanRepeats() - no arguments. Calculates mean and s.d. on repeat
lengths in the repeat-length data. Returns reference to 2-dimensional
array of results on success, sets error message and returns 0 on failure.
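
As an illustration of the per-locus calculation, here is a minimal
sketch. The helper mean_sd is hypothetical, and it assumes the sample
standard deviation (n - 1 denominator); the module might equally use
the population form.

```perl
use strict;
use warnings;

# Hypothetical helper: mean and sample standard deviation (n - 1
# denominator) of a list of numbers. The choice of the sample form is
# an assumption, not something stated about the module.
sub mean_sd {
    my @x = @_;
    my $n = @x;
    return (undef, undef) unless $n;

    my $sum = 0;
    $sum += $_ for @x;
    my $mean = $sum / $n;
    return ($mean, 0) if $n == 1;    # no spread with a single value

    my $ss = 0;                      # sum of squared deviations
    $ss += ($_ - $mean) ** 2 for @x;
    return ($mean, sqrt($ss / ($n - 1)));
}

# Repeat lengths for the lgtC locus from the example data above.
my ($mean, $sd) = mean_sd(15, 15, 43, 41);
printf "lgtC: mean %.2f, s.d. %.2f\n", $mean, $sd;
```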

calcTotalAlleles() - no arguments. Calculates total number of alleles
for each locus. Returns reference to 2-dimensional array of results on
success, sets error message and returns 0 on failure.
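
One possible reading of "total number of alleles" is the number of
distinct repeat lengths seen at each locus. The sketch below takes that
reading as an assumption; count_alleles is a hypothetical helper, and
the input layout (header row first, strain in the last column) follows
the example data above.

```perl
use strict;
use warnings;

# Hypothetical helper: count distinct repeat lengths per locus.
# Assumption: an "allele" here means a distinct repeat length.
sub count_alleles {
    my ($rl) = @_;
    my ($header, @rows) = @$rl;
    my %count;
    for my $col (0 .. $#{$header} - 1) {    # skip the strain column
        my %seen;
        $seen{ $_->[$col] } = 1 for @rows;
        $count{ $header->[$col] } = scalar keys %seen;
    }
    return \%count;
}

# The example repeat length data from earlier in this message.
my $rl = [
    [qw(lgtC lex2A lic1 lic2 lic3 strain)],
    [15, 20, 33, 30, 29, 1231],
    [15, 20, 34, 30, 29, 1232],
    [43, 26, 39, 22, 36, 1209],
    [41, 25, 39, 22, 36, 1207],
];

my $alleles = count_alleles($rl);
print "$_: $alleles->{$_}\n" for qw(lgtC lex2A lic1 lic2 lic3);
```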

translateToUniqueAlleles() - no arguments. Translates repeat length data
into unique alleles. Returns reference to 2-dimensional array of results
on success, sets error message and returns 0 on failure.

translateToMutRates() - no arguments. Translates repeat length data into
mutation rates. Returns reference to 2-dimensional array of results on
success, sets error message and returns 0 on failure.
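
The translation amounts to looking up each repeat length in the
mutation rate table. A minimal sketch follows; to_mut_rates is a
hypothetical helper, the strain names and repeat lengths in the usage
are made up for illustration, and leaving lengths that are missing from
the table as undef is an assumption about the behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each repeat length with its mutation
# rate, keeping the header row and the strain column unchanged.
# Assumption: lengths absent from the rate table become undef.
sub to_mut_rates {
    my ($rl, $rate) = @_;
    my ($header, @rows) = @$rl;
    my @out = ([@$header]);
    for my $row (@rows) {
        my @cells  = @$row;
        my $strain = pop @cells;    # last column is the strain
        push @out, [ (map { $rate->{$_} } @cells), $strain ];
    }
    return \@out;
}

# Rates taken from the example mutation rate file above.
my $rate = { 10 => 0.00001, 11 => 0.0001, 12 => 0.001,
             13 => 0.01,    14 => 0.1 };

# Illustrative input only: these strains and lengths are made up.
my $rl = [
    [qw(lgtC lex2A lic1 strain)],
    [10, 12, 15, 'S1'],
    [14, 11, 13, 'S2'],
];

my $out = to_mut_rates($rl, $rate);
for my $row (@$out) {
    print join(',', map { defined $_ ? $_ : 'NA' } @$row), "\n";
}
```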


All the best,

Simon

-- 
Simon Whitaker
http://netcetera.org/