[MOBY-l] Genomic position-based GO search...

Simon Twigger simont at mcw.edu
Thu Nov 21 17:49:12 UTC 2002


Hi there,

(Apologies for the GO/BioMOBY cross-post, but I think it's relevant to
both.)

As part of our ongoing work to implement an ontology schema within RGD
we have been doing some use case analyses. One of the big things I think
our users (rat geneticists/genomics people, positional cloners, etc.)
want to do is find out which genes are in their region of interest
(defined by a QTL, syntenic region, or similar) and, from that, the GO
terms associated with those genes. The next step would be to build in
some sort of filter that lets them ask "What genes in this region
are part of XX process/component/function?" etc. I'm sure this is
something that isn't restricted to rat genomics.

I know that this isn't too hard to build for each individual db, as each
has the gene information, the mapping information and the GO
information locally and it's all integrated. However, when you are doing
comparative analyses (at least in our case) you know the syntenic
region you are interested in, but you don't have all the
genes/positions/terms for the other organism(s) in your own database, so
you can't easily offer that functionality. You might not want to bump
the user off to the other organism's db to use their interface (if it
exists), and this also wouldn't work if you want the functionality
inside a tool rather than as a user-operated search function.

I'm wondering if this is functionality that could be provided either by
GO, or by each db for its own organism, so that others could use it and
be saved the hassle of maintaining lots of info about the other
organisms' genes and locations?

Two potential solutions that I thought of:
Option 1 - add the mapping information into GO: add chromosome and
genomic location (and presumably build/reference map info). If you then
know the region you are interested in from the other organism, you can
get all genes with GO terms of interest between START and STOP on
chromosome N. The downsides are adding yet more info to the GO files
and schema, the hassle of keeping things up to date at GO, etc., and it
might not be worth the pain.
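
To make Option 1 a bit more concrete, here is a rough Python sketch of
what a GO annotation row might look like with position fields added, and
the START/STOP query over it. The field names, gene names, coordinates
and the "example_build" label are all made up for illustration; none of
this is part of the current GO files or schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionedAnnotation:
    """One gene-to-GO-term annotation with hypothetical position fields."""
    gene: str
    go_id: str
    chromosome: str
    start: int
    stop: int
    build: str  # build/reference map the coordinates refer to


def annotations_in_region(rows, chromosome, start, stop, build, go_ids=None):
    """Return rows overlapping [start, stop] on one chromosome and build.

    If go_ids is given, keep only rows whose GO ID is in that set
    (exact match only; descendant terms are not considered here).
    """
    hits = []
    for row in rows:
        if row.chromosome != chromosome or row.build != build:
            continue
        if row.stop < start or row.start > stop:
            continue  # no overlap with the region of interest
        if go_ids is not None and row.go_id not in go_ids:
            continue
        hits.append(row)
    return hits


# Made-up rows: placeholder genes and coordinates, illustration only.
EXAMPLE_ROWS = [
    PositionedAnnotation("GeneA", "GO:0006915", "1", 1200, 1800, "example_build"),
    PositionedAnnotation("GeneB", "GO:0005634", "1", 4000, 4600, "example_build"),
    PositionedAnnotation("GeneC", "GO:0006915", "1", 9000, 9500, "example_build"),
    PositionedAnnotation("GeneD", "GO:0006915", "2", 1500, 2500, "example_build"),
]

if __name__ == "__main__":
    for row in annotations_in_region(
        EXAMPLE_ROWS, "1", 1000, 5000, "example_build", go_ids={"GO:0006915"}
    ):
        print(row.gene, row.go_id, row.start, row.stop)
```

Note that the build has to travel with every row, which is part of the
maintenance cost mentioned above.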

Option 2 - (I like this one) Have a standard API offered by each db
(webservices/BioMoby would seem to be a good fit here) that others can
call to extract this information: you pass in the chromosome, the map
and the region, and optionally some GO terms to refine the returned
results, and the webservice returns a list of genes in that region that
match those criteria. On the MOBY front - I'm not sure if this violates
the atomic input > transform > output concept by doing too much in the
transform step. It could certainly be broken down into component parts
and joined back together.
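
As a rough illustration of how Option 2 might be broken into component
parts, here is a Python sketch with three small steps (region > genes,
genes > GO terms, GO terms > filtered genes) and one convenience call
that joins them. It runs against in-memory dictionaries rather than a
real webservice; the function names, data layout and sample values are
assumptions for illustration and are not BioMOBY or RGD APIs.

```python
def genes_in_region(positions, chromosome, map_name, start, stop):
    """Step 1: region -> names of genes overlapping [start, stop]."""
    return sorted(
        gene
        for gene, (chrom, gmap, g_start, g_stop) in positions.items()
        if chrom == chromosome and gmap == map_name
        and g_start <= stop and g_stop >= start
    )


def go_terms_for_genes(annotations, genes):
    """Step 2: gene names -> {gene: set of GO IDs}."""
    return {gene: set(annotations.get(gene, ())) for gene in genes}


def filter_by_go(terms_by_gene, wanted_go_ids):
    """Step 3: keep genes annotated with at least one wanted GO ID."""
    wanted = set(wanted_go_ids)
    return sorted(g for g, terms in terms_by_gene.items() if terms & wanted)


def region_go_search(positions, annotations, chromosome, map_name,
                     start, stop, go_ids=None):
    """Combined call: the three steps joined back together."""
    genes = genes_in_region(positions, chromosome, map_name, start, stop)
    if not go_ids:
        return genes  # no GO filter requested
    return filter_by_go(go_terms_for_genes(annotations, genes), go_ids)


# Made-up data: gene -> (chromosome, map, start, stop), gene -> GO IDs.
POSITIONS = {
    "GeneA": ("1", "example_map", 1200, 1800),
    "GeneB": ("1", "example_map", 4000, 4600),
    "GeneC": ("1", "example_map", 9000, 9500),
}
ANNOTATIONS = {
    "GeneA": ["GO:0006915"],
    "GeneB": ["GO:0005634"],
    "GeneC": ["GO:0006915", "GO:0005634"],
}

if __name__ == "__main__":
    print(region_go_search(POSITIONS, ANNOTATIONS, "1", "example_map",
                           1000, 5000, go_ids=["GO:0006915"]))
```

Each step could be exposed as its own service, with the combined call
offered only as a convenience for clients that do not want to chain
them.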

What do others think about this? Ultimately I'd love to see this
genomic position-based search expanded so I could pop a genome browser
on top to display not every gene/feature/SNP etc. in a region but only
those that match certain criteria - a genome-based search engine for
the db.


Simon.


--------------------------------------------------------------------------
Simon Twigger, Ph.D.
Assistant Professor, Bioinformatics Research Center

Medical College of Wisconsin
8701 Watertown Plank Road,
Milwaukee, WI, 53226
tel. 414-456-8802, fax 414-456-6595



