[Bioperl-microarray] Extraction of gene expression levels from NCBI GEO...
Allen Day
allenday at ucla.edu
Fri Dec 19 14:29:41 EST 2003
Daniel,
> I am new to this list, so bear with me if my question gets asked a lot.
Welcome!
> Being newer to bioinformatics, there are still many things I'm only now
> becoming familiar with. What I am trying to accomplish is to establish a
> method of retrieving Microarray data from NCBI's GEO database, specifically
> in connection with a particular gene. Once that data has been pulled in, it
> needs to be formatted to create a chart/report of basic expression levels in
> various tissue types. Projects like this have been done by Novartis at
> http://expression.gnf.org, retrieving the data points associated with a
> particular gene and then assembling the values into a Java-generated graph.
> While the graph is easy to read, what I want most is the tissue expression
> values themselves.
> The problem with the aforementioned database is that all data is based
> on results from the Affymetrix U95A (human) and U74A (mouse) chips,
> which do not have probes corresponding to the most recent genes and
> ESTs. NCBI has data from the more recent U133A and B chips, as well as other
> array formats, and is therefore more likely to have the data I'm looking for,
> although that data is often derived from non-"normal" tissue. Initially, I
> expect to have to download the individual files from NCBI manually, and then
> parse the files with a Perl script to retrieve the expression
> values for the gene of interest. From there, I can assemble a report
> combining the overlapping data points to create a proposed average
> expression level of Gene X in Tissue Y. The bottom line is this: what
> Bioperl modules (if any) are best suited to my purposes? I have been
There isn't anything. I'm doing work very similar to what you describe.
I take the approach of using bioperl-microarray to parse the files (from
NCBI or elsewhere) so that I can get at the expression data as objects.
Then I load the data into a local chado database (http://www.gmod.org)
using the RAD gene expression module and annotate the arrays using various
ontologies.
For the latter half it might make your life easier to try to obtain the
data in a MAGE-ML format, but I don't think it's always available from
public databases. Most of the time I end up annotating w/ ontology terms
by hand.
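
To make the "average expression level of Gene X in Tissue Y" step concrete,
here is a minimal sketch. It is written in Python only for brevity and is not
bioperl-microarray code; the same logic ports directly to a Perl script. It
assumes each downloaded sample has been reduced to a tab-delimited table with
ID_REF and VALUE columns (as in GEO SOFT sample tables), and that you have
already assigned a tissue label to each sample by hand. The function names
and the probe ID "probe_X" are hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def probe_value(sample_text, probe_id):
    """Return the VALUE for probe_id in one sample table, or None if absent."""
    # Drop blank lines and SOFT-style metadata lines ('^', '!' or '#' prefix).
    rows = [ln for ln in sample_text.splitlines() if ln and ln[0] not in "^!#"]
    reader = csv.DictReader(io.StringIO("\n".join(rows)), delimiter="\t")
    for row in reader:
        if row.get("ID_REF") == probe_id:
            try:
                return float(row.get("VALUE"))
            except (TypeError, ValueError):
                return None  # missing or non-numeric value, e.g. "null"
    return None


def mean_by_tissue(samples, probe_id):
    """samples: iterable of (tissue_label, sample_text) pairs.

    Returns {tissue_label: mean VALUE of probe_id across that tissue's samples}.
    Samples with no usable value for the probe are skipped.
    """
    values = defaultdict(list)
    for tissue, text in samples:
        v = probe_value(text, probe_id)
        if v is not None:
            values[tissue].append(v)
    return {tissue: mean(vs) for tissue, vs in values.items()}
```

Note that a plain mean is only meaningful if the samples were normalized
comparably; values from different platforms or submitters generally are not
directly comparable without further processing.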
> browsing the docs and have not seen any that seem to apply to what I am
> doing, but I can easily miss something as there are so many modules in
> the latest release. Any ideas or suggestions would be most appreciated.
> Thanks!!!!!
-Allen