[Bioperl-l] Re: MicroarrayIO proposal

Chervitz, Steve Steve_Chervitz@affymetrix.com
Mon, 14 Oct 2002 13:13:35 -0700


Adam,

Good point about the word "probe" and raising the MGED connection. I think
it would be a good idea to stick with the MGED terminology of "feature" (the
spot on an array) and "reporter" (the physical sequence). If anyone needs
some background here, see
http://www.mged.org/Workgroups/MAGE/designelement.html.

I don't think it's a good idea to try to incorporate MAGEstk itself into
Bioperl, but it would be a good idea to keep module names and data accessor
names aligned as best we can. 

The MAGEstk perl code is autogenerated from the MAGE object model and serves
the narrow purpose of serializing/de-serializing expression data to/from
MAGE-ML. So it's important for MAGE-perl to stay tightly coupled to the MAGE
effort, which is still evolving.

It *would* be cool to create MAGE-perl adaptor or wrapper classes that could
live within Bioperl. This would enable bioperlers to make use of the
MAGE-perl code for handling MAGE-ML data. I haven't thought much about
specifics here, but would be interested if others have ideas along these
lines.

Allen's modules serve mostly to help parse various non-MAGE-ML file formats
containing expression data, a worthy cause which is outside the scope of
MAGEstk. All we'd really need to tie into the MAGE world would be some
adaptors that could interconvert his objects with MAGE-perl objects.
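
To make the adaptor idea a bit more concrete, here is a minimal sketch of
what such an interconversion layer might look like. Everything in it is an
assumption for illustration: the Sketch::* package names and the accessor
names (x, y, value, column, row, signal) are stand-ins, not Allen's actual
classes and not the MAGE-perl API.

```perl
use strict;
use warnings;

# Stand-in for a Bioperl-side feature object
# (hypothetical; not Allen's actual class).
package Sketch::BioperlFeature;
sub new   { my ($class, %args) = @_; return bless { %args }, $class }
sub x     { $_[0]{x} }
sub y     { $_[0]{y} }
sub value { $_[0]{value} }

# Stand-in for a MAGE-side feature object
# (hypothetical; not the MAGE-perl API).
package Sketch::MageFeature;
sub new    { my ($class, %args) = @_; return bless { %args }, $class }
sub column { $_[0]{column} }
sub row    { $_[0]{row} }
sub signal { $_[0]{signal} }

# The adaptor is the only place that knows both vocabularies.
package Sketch::FeatureAdaptor;
sub to_mage {
    my ($class, $f) = @_;
    return Sketch::MageFeature->new(
        column => $f->x, row => $f->y, signal => $f->value );
}
sub from_mage {
    my ($class, $m) = @_;
    return Sketch::BioperlFeature->new(
        x => $m->column, y => $m->row, value => $m->signal );
}

package main;

# Round-trip one feature through the adaptor.
my $feature = Sketch::BioperlFeature->new( x => 3, y => 7, value => 412.5 );
my $mage    = Sketch::FeatureAdaptor->to_mage($feature);
my $back    = Sketch::FeatureAdaptor->from_mage($mage);
printf "round trip: (%d,%d) = %.1f\n", $back->x, $back->y, $back->value;
```

Keeping the mapping in one adaptor class would mean that changes on the
MAGE side only touch the adaptor, not the parsing modules.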

More thoughts on Allen's modules. His ProbeI module is actually a feature,
so perhaps it could be renamed MicroarrayFeatureI.pm. His Probeset.pm module
is probably close to the reporter concept, but I haven't looked at it in
detail yet. Would be useful to compare Allen's code to the corresponding
MAGE-perl classes.

For the record, here's how Affy array layout data is mapped to the
feature/reporter MAGE system:

   Single spot (cell) on an array = Feature
   Perfect match cell + its mismatch cell = Reporter
   All probe pairs (PM+MM) in a probe set = CompositeSequence
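
As a rough illustration of the mapping above, the following sketch groups
individual cells (features) into PM+MM pairs (reporters) and collects those
under one probe set (composite sequence). The hash layout, the field names
(x, y, type, pair), the helper build_reporters, and the probe set id are all
made up for this example; they are not taken from Allen's modules, from
MAGE-perl, or from any Affy file format.

```perl
use strict;
use warnings;

# Hypothetical cells from one probe set. Each cell is one Feature.
# 'pair' is the probe-pair index; 'type' is PM (perfect match) or MM (mismatch).
my @cells = (
    { x => 10, y => 20, type => 'PM', pair => 0 },
    { x => 10, y => 21, type => 'MM', pair => 0 },
    { x => 11, y => 20, type => 'PM', pair => 1 },
    { x => 11, y => 21, type => 'MM', pair => 1 },
);

# Group features into reporters: one reporter per PM+MM probe pair.
sub build_reporters {
    my @features = @_;
    my %by_pair;
    for my $f (@features) {
        $by_pair{ $f->{pair} }{ $f->{type} } = $f;
    }
    my @reporters;
    for my $pair (sort { $a <=> $b } keys %by_pair) {
        push @reporters, {
            pair => $pair,
            PM   => $by_pair{$pair}{PM},
            MM   => $by_pair{$pair}{MM},
        };
    }
    return @reporters;
}

# All reporters of the probe set together form the CompositeSequence.
my %composite_sequence = (
    id        => 'example_probeset',
    reporters => [ build_reporters(@cells) ],
);

for my $r (@{ $composite_sequence{reporters} }) {
    printf "%s pair %d: PM(%d,%d) MM(%d,%d)\n",
        $composite_sequence{id}, $r->{pair},
        $r->{PM}{x}, $r->{PM}{y}, $r->{MM}{x}, $r->{MM}{y};
}
```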

Steve

> -----Original Message-----
> From: Adam Witney [mailto:awitney@sghms.ac.uk] 
> Sent: Monday, October 14, 2002 2:03 AM
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] Re: MicroarrayIO proposal
> 
> 
> 
> Hi Allen,
> 
> I would recommend staying away from the use of the word 
> 'Probe' as there is always confusion as to what you are 
> referring to. Many people use 'probe' to refer to the DNA 
> molecule spotted onto the array and others use it to refer to 
> the labelled sample hybridised to the array.
> 
> The Microarray and Gene Expression Database group 
> (www.mged.org) has avoided use of this word completely and 
> uses 'reporter' to describe the spotted DNA and 
> 'labeledExtract' to refer to the hybridising material.
> 
> The MGED group has developed a model to represent microarray 
> data, which is fast being incorporated into the major microarray 
> databases in use. Also, one of the working groups at MGED is 
> developing a perl/java software toolkit to create and 
> manipulate microarray data objects. Take a look here for more details 
> 
> http://www.mged.org/Workgroups/MAGE/magestk.html
> 
> Maybe it would be worth incorporating the magestk work into 
> your work here?
> 
> cheers
> 
> adam
> 
> 
> > Message: 10
> > Date: Fri, 11 Oct 2002 17:05:05 -0700 (PDT)
> > From: Allen Day <allenday@ucla.edu>
> > To: Bioperl <bioperl-l@bioperl.org>
> > Subject: [Bioperl-l] MicroarrayIO proposal
> > 
> > Hi all,
> > 
> > I'm getting ready to commit some MicroarrayIO classes to bioperl-live,
> > but first I'd like to get some feedback on how I've set them up.  See
> > below:
> > 
> > Hierarchy is like this:
> > 
> > Bio/
> > Expression/
> > Microarray/
> > ProbeI.pm
> > Probeset.pm
> > Affymetrix/
> > Array.pm
> > Data.pm
> > Probe.pm
> > MicroarrayI.pm
> > MicroarrayIO.pm
> > MicroarrayIO/
> > affymetrix.pm
> > ProbeI.pm
> > 
> > 
> > 
> > 
> > And usage is like this:
> > 
> > use Bio::Expression::MicroarrayIO;
> > 
> > # create an IO object.  an array object is created
> > # based on -template
> > my $mio = Bio::Expression::MicroarrayIO->new(
> >            -file     => 'path/to/datafile',
> >            -format   => 'affymetrix',
> >            -template => 'path/to/template',
> >         );
> > 
> > # fill the array object created by the constructor
> > # with data from the next array.  returns a
> > # Bio::Expression::MicroarrayI compliant object
> > my $array = $mio->next_array;
> > 
> > # this will write affy-format files, given
> > # a MicroarrayI compliant object
> > my $out = Bio::Expression::MicroarrayIO->new(
> >            -file     => '>path/to/outputfile',
> >            -format   => 'affymetrix',
> >         );
> > 
> > #write $array to file
> > $out->write_array($array);
> > 
> > #print a list of outliers and masked probes
> > foreach my $probeset ($array->each_probeset){
> >   foreach my $probe ($probeset->each_probe){
> >     next unless $probe->is_outlier or $probe->is_masked;
> >     # one tab-separated line per flagged probe
> >     print join("\t",
> >       $probeset->id,
> >       $probe->x,
> >       $probe->y,
> >       $probe->value,
> >       $probe->is_outlier,
> >       $probe->is_masked,
> >     ), "\n";
> >   }
> > }
> > 
> > 
> > Comments appreciated.  Enjoy the weekend.
> > 
> > -Allen
> > 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org http://bioperl.org/mailman/listinfo/bioperl-l
>