[MOBY-l] Genomic position-based GO search...
Simon Twigger
simont at mcw.edu
Thu Nov 21 17:49:12 UTC 2002
Hi there,
(Apologies for the GO/BioMOBY cross-post, but I think it's relevant to
both.)
As part of our ongoing work to implement an ontology schema within RGD
we were doing some use case analyses and one of the big things I think
our users (Rat geneticists/genomic people, positional cloners, etc)
want to do is to find out what genes are in their region of interest
(defined by a QTL, syntenic region, or similar) and from that, the GO
terms associated with those genes. The next step would be to build in
some sort of filter that allowed them to ask "What genes in this region
are part of XX process/component/function?" etc. I'm sure this is
something that isn't restricted to rat genomics.
I know that this isn't too hard to build for each individual db, as they
have the gene information, the mapping information and the GO
information locally and it's all integrated. However, when you are doing
comparative analyses (at least in our case) you know the syntenic
region you are interested in but you don't have all the
genes/positions/terms for the other organism(s) in your own database, so
you can't easily offer that functionality. You might not want to bump
the user off to the other organism db to use their interface (if it
exists), and this also wouldn't work if you want this functionality
inside a tool rather than as a user-operated search function.
I'm wondering if this is functionality that could be provided either by
GO, or by each db for its own organism, that others could use, thereby
saving them the hassle of maintaining lots of info about the other
organisms' genes and locations?
Two potential solutions that I thought of:
Option 1 - add the mapping information into GO: add chromosome and
genomic location (and presumably build/reference map info). If you then
know the region you are interested in from the other organism you can
get all genes with GO terms of interest between START and STOP on
chromosome N. Downsides to this are adding yet more info to GO files
and schema, the hassle of keeping things up to date at GO, etc. and it
might not be worth the pain.
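To make Option 1 concrete, here is a minimal sketch of what a GO
annotation record with mapping information attached might look like, and
how the START/STOP query would run against it. The record layout, the
field names, the function name and the sample genes, GO IDs and build
name are all invented for illustration; none of them come from the
actual GO files or schema.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class PositionedAnnotation:
    """One gene-to-GO-term association plus hypothetical mapping fields."""
    gene: str
    go_id: str
    chromosome: str
    start: int
    stop: int
    build: str  # build/reference map the coordinates refer to


def genes_in_region(annotations: Iterable[PositionedAnnotation],
                    chromosome: str, start: int, stop: int, build: str,
                    go_ids: Optional[Set[str]] = None) -> List[str]:
    """Return sorted gene names overlapping [start, stop] on one chromosome.

    If go_ids is given, keep only genes annotated with at least one of them.
    """
    hits = set()
    for a in annotations:
        if a.chromosome != chromosome or a.build != build:
            continue
        # Skip genes that lie entirely outside the region of interest.
        if a.stop < start or a.start > stop:
            continue
        if go_ids is not None and a.go_id not in go_ids:
            continue
        hits.add(a.gene)
    return sorted(hits)


# Made-up sample data: gene names, GO IDs and build are placeholders.
ANNOTATIONS = [
    PositionedAnnotation("geneA", "GO:9990001", "2", 1000, 2000, "build_1"),
    PositionedAnnotation("geneA", "GO:9990002", "2", 1000, 2000, "build_1"),
    PositionedAnnotation("geneB", "GO:9990002", "2", 5000, 6000, "build_1"),
    PositionedAnnotation("geneC", "GO:9990001", "3", 1000, 2000, "build_1"),
]

all_genes = genes_in_region(ANNOTATIONS, "2", 500, 5500, "build_1")
filtered = genes_in_region(ANNOTATIONS, "2", 500, 5500, "build_1",
                           go_ids={"GO:9990001"})
```

Note that the build has to be part of every record and every query,
which is exactly the keeping-things-up-to-date burden mentioned above.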
Option 2 - (I like this one) Have a standard API offered by a db
(webservices/BioMoby would seem to be a good fit here) that others can
call to extract this information: You pass in the chromosome, the map
and the region and optionally some GO terms that you want to use to
refine the returned results and the webservice returns a list of genes
in that region that match those criteria. On the MOBY front - I'm not
sure if this is violating the atomic input > transform > output concept
by doing too much in the transform step; it could certainly be
broken down into component parts and joined back together.
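As a rough sketch of the "component parts" idea, the search could be
split into two small steps (region lookup, then GO filter) that a
convenience call simply chains together. This is plain Python standing
in for the services, not BioMOBY code; every function name, the Gene
tuple and the sample data are assumptions made up for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Set


class Gene(NamedTuple):
    symbol: str
    chromosome: str
    map_name: str
    start: int
    stop: int


def find_genes(genes: Iterable[Gene], chromosome: str, map_name: str,
               start: int, stop: int) -> List[Gene]:
    """Step 1: region in, genes out (overlap with [start, stop])."""
    return [g for g in genes
            if g.chromosome == chromosome and g.map_name == map_name
            and g.start <= stop and g.stop >= start]


def filter_by_go(genes: Iterable[Gene], annotations: Dict[str, Set[str]],
                 go_terms: Set[str]) -> List[Gene]:
    """Step 2: genes in, genes annotated with any wanted GO term out."""
    return [g for g in genes
            if annotations.get(g.symbol, set()) & go_terms]


def region_go_search(genes: Iterable[Gene], annotations: Dict[str, Set[str]],
                     chromosome: str, map_name: str, start: int, stop: int,
                     go_terms: Optional[Set[str]] = None) -> List[str]:
    """Composite call: chain step 1 and, if GO terms are given, step 2."""
    found = find_genes(genes, chromosome, map_name, start, stop)
    if go_terms:
        found = filter_by_go(found, annotations, go_terms)
    return [g.symbol for g in found]


# Made-up sample data: symbols, map name and GO IDs are placeholders.
GENES = [
    Gene("geneA", "5", "map_1", 100, 300),
    Gene("geneB", "5", "map_1", 400, 900),
    Gene("geneC", "5", "map_2", 100, 300),
]
GO_BY_GENE = {"geneA": {"GO:9990001"}, "geneB": {"GO:9990002"}}

unfiltered = region_go_search(GENES, GO_BY_GENE, "5", "map_1", 50, 500)
refined = region_go_search(GENES, GO_BY_GENE, "5", "map_1", 50, 500,
                           go_terms={"GO:9990002"})
```

Keeping the two steps separate means the region lookup stays useful on
its own, while callers who want the one-shot search still get it.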
What do others think about this? Ultimately I'd love to see this
genomic position based search expanded so I could pop a genome browser
on top to display not every gene/feature/SNP etc. in a region but only
those that match certain criteria - a genome-based search engine for
the db.
Simon.
------------------------------------------------------------------------
Simon Twigger, Ph.D.
Assistant Professor, Bioinformatics Research Center
Medical College of Wisconsin
8701 Watertown Plank Road,
Milwaukee, WI, 53226
tel. 414-456-8802, fax 414-456-6595