[DAS2] Notes from the weekly DAS/2 teleconference, 19 Jun 2006

Wed Jun 21 08:27:10 UTC 2006

>
> 1. delivering mappings of probe sets onto other ids (e.g., AGI gene
> ids) using different authorities: TAIR, us, Affymetrix, University of
> Michigan, and so on.

We're doing this with the NetAffx schema, which has been loaded into
Postgres/Chado and full-text indexed.  I think we have Affy probeset -> TAIR
ID mappings, but not the others.
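To illustrate the idea of keeping mappings from several authorities side
by side, here is a minimal Python sketch.  The authority names come from the
list quoted above; the row layout, the function names, and the probe set and
AGI ids are placeholders made up for illustration, and none of this reflects
the actual NetAffx/Chado schema.

```python
from collections import defaultdict

# Hypothetical rows: (authority, probe set id, target id).
# All ids below are placeholders, not real annotation.
ROWS = [
    ("TAIR", "probe_A_at", "AT1G00001"),
    ("Affymetrix", "probe_A_at", "AT1G00001"),
    ("Affymetrix", "probe_A_at", "AT1G00002"),
    ("TAIR", "probe_B_at", "AT2G00001"),
]


def build_index(rows):
    """Group target ids by authority, then by probe set id."""
    index = defaultdict(lambda: defaultdict(set))
    for authority, probeset, target in rows:
        index[authority][probeset].add(target)
    return index


def lookup(index, authority, probeset):
    """Return the sorted target ids one authority assigns to a probe set."""
    return sorted(index.get(authority, {}).get(probeset, set()))
```

Keying on the authority first means the same probe set can carry different
answers from TAIR and from Affymetrix without one overwriting the other.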

> 2. filtering out probe sets using various criteria, e.g., promiscuous
> probe sets that match multiple genes, probe sets that "behave badly"
> in all known experiments, and so on. Each filtering procedure can be
> given a name.

Yes, that is something I am looking at right now.  Actually, as you get more
and more arrays, the probeset behavior becomes very clear, with many
transcripts showing discrete on/off states, e.g. a bunch of genes highly
expressed in human tongue:

taste receptor, type 2, member 1
http://celsius-cgi.genomics.ctrl.ucla.edu/cgi/plot_element.Rsh?221324_at

gastrin-releasing peptide receptor
http://celsius-cgi.genomics.ctrl.ucla.edu/cgi/plot_element.Rsh?207929_at

olfactory receptor, family 10
http://celsius-cgi.genomics.ctrl.ucla.edu/cgi/plot_element.Rsh?221346_at

natural cytotoxicity triggering receptor
http://celsius-cgi.genomics.ctrl.ucla.edu/cgi/plot_element.Rsh?217088_s_at

There are even clear trimodals, like thyroid receptor alpha:

http://celsius-cgi.genomics.ctrl.ucla.edu/cgi/plot_element.Rsh?1316_at
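As a rough sketch of how on/off states like these could be called
automatically, the Python below splits one probe set's values into a low and
a high group with a simple one-dimensional 2-means.  This is only an
illustration, not the method behind the plots above: the function names are
hypothetical, the input is assumed to be log-scale expression values for a
single probe set across many arrays, and the split is forced rather than
tested, so it does not by itself show that a probe set is bimodal (and it
would not handle the trimodal case).

```python
def two_state_split(values, max_iter=100):
    """Return (low_mean, high_mean, threshold) from a 1-D 2-means split."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are identical; nothing to split")
    for _ in range(max_iter):
        threshold = (lo + hi) / 2.0
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # group means stopped moving
        lo, hi = new_lo, new_hi
    return lo, hi, (lo + hi) / 2.0


def call_states(values):
    """Label each value 'on' or 'off' relative to the split threshold."""
    _, _, threshold = two_state_split(values)
    return ["on" if v > threshold else "off" for v in values]
```

For example, `call_states([1.0, 1.2, 0.9, 8.0, 8.3, 7.9])` labels the first
three values "off" and the last three "on".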

> 3. providing expression values generated from 'cel' files using either
> RMA or MAS5, w/ PMA calls on both

Yes, you can do this in R with XML, but it's a pain.  Better for expression
data to use TSV as you are doing.  We have an R lib in development for doing
large batch retrieval of hundreds of arrays.  Getting annotation into R
turns out to be easier with XML, as annotation is just easier to represent
in the more flexible format.
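
To show why TSV is convenient for the expression values, here is a minimal
Python sketch that reads a probe set by array matrix.  The layout is an
assumption (a header row of array names, then one row per probe set with the
id in the first column), the function name is hypothetical, and the numbers
in the sample are made up; only the two probe set ids are taken from the
links above.

```python
import csv
import io

# Made-up sample in the assumed layout: header row, then one row per probe set.
SAMPLE = (
    "probeset\tarray1\tarray2\n"
    "221324_at\t2.0\t9.5\n"
    "1316_at\t4.25\t7.5\n"
)


def read_expression_tsv(handle):
    """Return (array names, {probe set id: {array name: value}})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    arrays = header[1:]  # first column holds the probe set id
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        table[row[0]] = dict(zip(arrays, map(float, row[1:])))
    return arrays, table


arrays, table = read_expression_tsv(io.StringIO(SAMPLE))
```

A flat numeric matrix like this needs no schema at all, whereas nested
annotation does not fit a fixed set of columns as naturally.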

-Allen