[MOBY] [MOBY-l] Services on MOBYClient

Simon Twigger simont at mcw.edu
Mon Dec 8 21:47:33 UTC 2003


On Monday, Dec 8, 2003, at 13:51 America/Chicago, Mark Wilkinson wrote:

> On Mon, 2003-12-08 at 13:13, Simon Twigger wrote:
>
>> sub register_service {
>>    my $C = shift;
>>    my $reg = $C->registerService(
>>      serviceName  => 'keywordToGene',
>>      authURI      => $authURI,
>>      contactEmail => $email,
>>      description  => "Test service: matches gene symbol to RGD gene record, if available",
>>      URL          => $url,
>>      input        => [ ['', [Object => []]], ],
>
> Should be:     input        => [ ['', [Object => ['Global_Keyword']]], ],

I'll give that a go.

>
> In fact, registration should probably fail in that case since
> (contrary to what I said in my last message) Global_Keyword is a
> namespace, not an object type, so it shouldn't even successfully
> register.

I wondered about this too, but when you look at the Object list
(http://mobycentral.cbr.nrc.ca/cgi-bin/types/Objects), Global_Keyword
is also listed there as an Object, which is why I was trying to do it
that way.

> about the service itself:
>
> I'm still confused about what the service does... it consumes a
> "keyword" gene identifier, and returns a string containing the same
> gene identifier plus a bunch of cross-references... is that the
> intention?

Not really, it should take a broader range of inputs and not just  
regurgitate what you put in with a little bit more info. Though, if one  
went with the idea of extending the objects like we have in sequence -  
VirtualGene, GenericGene, FullyAnnotatedGene, etc. (going from bare  
minimum info - symbol, name and ID to every annotation under the sun)  
this might have a use.


> If so, then why is it consuming "Global_Keyword"?  Gene identifiers
> are not really keywords (per se), or at least not the way you are
> implying that they be used.

Mentally I tend to distinguish identifiers from symbols, which perhaps
colors my thinking. For me, identifiers are stable accession numbers,
ID numbers, etc. (e.g. RGD:12345); symbols are much less stable and are
closer to a keyword than anything else.

> For example, I executed your service using "kinase" and got nothing,
> but I executed it using A2m and got the locus. If it were really a
> keyword lookup, I think I would have expected a return from "kinase".
> As it is, you are calling RGD IDs "keywords" and thus taking all of
> the semantic meaning out of them.

This is a test and it just does an exact match on the query word,
which isn't what you'd really want. I was trying to build anything
that worked and this was just where I started. In retrospect this
service might serve the purpose of answering 'do you have any gene
records in RGD that match this string?'. You'd input a String object,
look for matches in Symbols, Aliases, Gene Names, etc., and return a
list of matching up-to-date RGD gene symbols. The symbol(s) returned
would be the current correct nomenclature, along with associated cross
references for you to move on with. This is like doing a search on the
Gene table on RGD, where you can put in 'kinase' and get back anything
with kinase in the name.

Though this then opens up a can of worms: should I check to see if
Kinase is a GO term and then search on the GO annotations as well,
just to be complete? Not every kinase has 'kinase' in the name. There
is a danger of biting off too much in one go, and I would opt for
breaking this up into multiple services that could be wrapped up and
called as an uber-service (that hopefully ran the various searches in
parallel before merging at the end).
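
As a rough illustration of the "several small searches, merged at the
end" idea, here is a minimal sketch. It is written in Python rather
than against the MOBY Perl client, and everything in it is an
assumption made for the example: the in-memory GENES list stands in
for the RGD gene table, the toy records are invented, and the function
names are hypothetical. Matching is a case-insensitive substring test.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the RGD gene table. The "Toy" records are
# invented for illustration and are not real RGD data.
GENES = [
    {"symbol": "A2m", "name": "alpha-2-macroglobulin", "aliases": []},
    {"symbol": "Toyk1", "name": "toy kinase 1", "aliases": ["Tk-old"]},
    {"symbol": "Toy2", "name": "toy gene 2", "aliases": ["kinase-like toy"]},
]


def search_symbols(term):
    """Return current symbols whose symbol contains the term."""
    t = term.lower()
    return {g["symbol"] for g in GENES if t in g["symbol"].lower()}


def search_aliases(term):
    """Return current symbols of genes with an alias containing the term."""
    t = term.lower()
    return {g["symbol"] for g in GENES
            if any(t in alias.lower() for alias in g["aliases"])}


def search_names(term):
    """Return current symbols of genes whose name contains the term."""
    t = term.lower()
    return {g["symbol"] for g in GENES if t in g["name"].lower()}


def search_genes(term):
    """Run each sub-search in its own thread, then merge the results."""
    searches = (search_symbols, search_aliases, search_names)
    merged = set()
    with ThreadPoolExecutor(max_workers=len(searches)) as pool:
        for hits in pool.map(lambda search: search(term), searches):
            merged.update(hits)
    # Sort so the merged list has a stable, predictable order.
    return sorted(merged)
```

Each sub-search could equally be a separate service; a GO-annotation
search would then just be one more entry in the searches tuple, which
keeps any single service from biting off too much.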

I think my original thought process was that a keyword is just one word  
as opposed to a String which is potentially a phrase and I was looking  
to limit what went in. This could be used as a nomenclature server for  
example - you enter a gene symbol (always one word) and get back the  
current approved symbol. There are obviously other ways to do this and  
getting the correct meaning behind each input is important for  
consistency.
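
The nomenclature-server idea could be sketched like this. Again this
is Python for illustration only, not the MOBY API; the CURRENT_SYMBOL
table, its toy entries and the approved_symbol name are all
assumptions made up for the example.

```python
# Hypothetical lookup from lower-cased symbols and retired aliases to
# the current approved symbol. The "Toy" entries are invented.
CURRENT_SYMBOL = {
    "a2m": "A2m",
    "toy1": "Toy1",
    "oldtoy1": "Toy1",
}


def approved_symbol(query):
    """Return the current approved symbol for a one-word query, or None."""
    word = query.strip()
    # A keyword is exactly one word; reject empty input and phrases.
    if len(word.split()) != 1:
        raise ValueError("expected a single-word gene symbol")
    return CURRENT_SYMBOL.get(word.lower())
```

Rejecting multi-word input up front is what separates this from a
general String search: the input is limited to one word, as described
above.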


Simon.

------------------------------------------------------------------------
Simon Twigger, Ph.D.
Assistant Professor, Bioinformatics Research Center

Medical College of Wisconsin
8701 Watertown Plank Road,
Milwaukee, WI, 53226
tel. 414-456-8802, fax 414-456-6595



