From risetograce at hotmail.com  Fri Dec 19 13:37:04 2003
From: risetograce at hotmail.com (Daniel Newkirk)
Date: Fri Dec 19 13:42:52 2003
Subject: [Bioperl-microarray] Extraction of gene expression levels from NCBI GEO...
Message-ID:

Hello all,

I am new to this list, so bear with me if my question gets asked a lot.
Being newer to bioinformatics, there are still many things I'm only now
becoming familiar with. What I am trying to accomplish is to establish a
method of retrieving microarray data from NCBI's GEO database, specifically
in connection with a particular gene. Once that data has been pulled in, it
needs to be formatted to create a chart/report of basic expression levels in
various tissue types. Projects like this have been done by Novartis at
http://expression.gnf.org, retrieving the data points associated with a
particular gene and then assembling the values into a Java-created graph.
While the graph is easy to read, the tissue expression values are what I
desire most.

The problem with the aforementioned database is that all data is based
on results from the Affymetrix U95A (human) and U74A (mouse) chips, which
do not have probes corresponding to the most recent genes and ESTs. NCBI
has data from the more recent U133A and B chips, as well as other array
formats, and therefore is more likely to have the data I'm looking for,
albeit the data being often derived from non-"normal" tissue.

Initially, I expect to have to download the individual files from NCBI
manually, and from that point parse the files with a Perl script and
retrieve the expression values for the gene of interest. From that point,
I can assemble a report combining the overlapping data points to create a
proposed average expression level of Gene X in Tissue Y. The end of the
matter is this: what modules are most suited to my purposes in Bioperl
(if any)?
I have been browsing the docs and have not seen any that seem to apply to
what I am doing, but I can easily miss something as there are so many
modules in the latest release. Any ideas or suggestions would be most
appreciated.

Thanks!!!!!

Daniel Newkirk

From allenday at ucla.edu  Fri Dec 19 14:29:41 2003
From: allenday at ucla.edu (Allen Day)
Date: Fri Dec 19 14:35:26 2003
Subject: [Bioperl-microarray] Extraction of gene expression levels from NCBI GEO...
In-Reply-To:
Message-ID:

Daniel,

> I am new to this list, so bear with me if my question gets asked a lot.

Welcome!

> Being newer to bioinformatics, there are still many things I'm only now
> becoming familiar with. What I am trying to accomplish is to establish a
> method of retrieving microarray data from NCBI's GEO database, specifically
> in connection with a particular gene. Once that data has been pulled in, it
> needs to be formatted to create a chart/report of basic expression levels in
> various tissue types. Projects like this have been done by Novartis at
> http://expression.gnf.org, retrieving the data points associated with a
> particular gene and then assembling the values into a Java-created graph.
> While the graph is easy to read, the tissue expression values are what I
> desire most.
> The problem with the aforementioned database is that all data is based
> on results from the Affymetrix U95A (human) and U74A (mouse) chips, which
> do not have probes corresponding to the most recent genes and ESTs. NCBI
> has data from the more recent U133A and B chips, as well as other array
> formats, and therefore is more likely to have the data I'm looking for,
> albeit the data being often derived from non-"normal" tissue.
> Initially, I expect to have to download the individual files from NCBI
> manually, and from that point parse the files with a Perl script and
> retrieve the expression values for the gene of interest. From that point,
> I can assemble a report combining the overlapping data points to create a
> proposed average expression level of Gene X in Tissue Y. The end of the
> matter is this: what modules are most suited to my purposes in Bioperl
> (if any)? I have been

There isn't anything. I'm doing work very similar to what you describe.
I take the approach of using bioperl-microarray to parse the files (from
NCBI or elsewhere) so that I can get at the expression data as objects.
Then I load the data into a local Chado database (http://www.gmod.org)
using the RAD gene expression module and annotate the arrays using various
ontologies.

For the latter half it might make your life easier to try to obtain the
data in MAGE-ML format, but I don't think it's always available from
public databases. Most of the time I end up annotating with ontology terms
by hand.

> browsing the docs and have not seen any that seem to apply to what I am
> doing, but I can easily miss something as there are so many modules in
> the latest release. Any ideas or suggestions would be most appreciated.
> Thanks!!!!!

-Allen
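
The manual workflow described in this thread (download the sample files,
pull out the values for one gene, then average them per tissue) can be
sketched roughly as below. This is only an illustration, written in Python
rather than Perl, and it does not use Bioperl or bioperl-microarray. It
assumes each downloaded file looks like a GEO SOFT-style sample record: a
`!Sample_source_name_ch1` line naming the tissue, and a tab-separated table
between `!sample_table_begin` and `!sample_table_end` with `ID_REF` and
`VALUE` columns. Real files may be laid out differently. The function
names, sample names, probe IDs and values are all made up for the example.

```python
from collections import defaultdict
from statistics import mean


def parse_sample(text):
    """Return (tissue, {probe_id: value}) for one SOFT-style sample record.

    Assumed layout (not checked against any particular GEO release):
    a '!Sample_source_name_ch1 = <tissue>' line and a tab-separated table
    with ID_REF and VALUE columns between the table begin/end markers.
    """
    tissue = "unknown"
    values = {}
    in_table = False
    header = None
    for line in text.splitlines():
        lowered = line.lower()
        if line.startswith("!Sample_source_name_ch1"):
            _, _, value = line.partition("=")
            tissue = value.strip() or tissue
        elif lowered.startswith("!sample_table_begin"):
            in_table = True
            header = None
        elif lowered.startswith("!sample_table_end"):
            in_table = False
        elif in_table and line.strip():
            fields = line.split("\t")
            if header is None:
                header = fields  # first table row holds the column names
                continue
            row = dict(zip(header, fields))
            try:
                values[row["ID_REF"]] = float(row["VALUE"])
            except (KeyError, ValueError):
                continue  # skip rows with missing or non-numeric values
    return tissue, values


def mean_expression_by_tissue(samples, probe_id):
    """Average the value of one probe across samples, grouped by tissue."""
    by_tissue = defaultdict(list)
    for text in samples:
        tissue, values = parse_sample(text)
        if probe_id in values:
            by_tissue[tissue].append(values[probe_id])
    return {tissue: mean(vals) for tissue, vals in by_tissue.items()}


# Made-up records standing in for manually downloaded files.
SAMPLES = [
    "^SAMPLE = GSM_A\n!Sample_source_name_ch1 = liver\n"
    "!sample_table_begin\nID_REF\tVALUE\n"
    "1007_s_at\t100.0\n1053_at\t12.5\n!sample_table_end\n",
    "^SAMPLE = GSM_B\n!Sample_source_name_ch1 = liver\n"
    "!sample_table_begin\nID_REF\tVALUE\n"
    "1007_s_at\t200.0\n1053_at\tnull\n!sample_table_end\n",
    "^SAMPLE = GSM_C\n!Sample_source_name_ch1 = brain\n"
    "!sample_table_begin\nID_REF\tVALUE\n"
    "1007_s_at\t50.0\n!sample_table_end\n",
]

if __name__ == "__main__":
    for tissue, level in mean_expression_by_tissue(SAMPLES, "1007_s_at").items():
        print(f"{tissue}\t{level:.1f}")
```

A plain mean across samples ignores differences in normalization between
experiments and array platforms, so values drawn from different GEO series
would probably need to be brought onto a common scale before being averaged
this way. Mapping a gene to its probe IDs on each platform is a separate
step that this sketch leaves out.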