[MOBY-l] Core information about biological objects...
Simon Twigger
simont at mcw.edu
Thu Sep 18 12:13:10 EDT 2003
Hi there,
I've been thinking about literature-related services using MOBY, in
particular how one might be able to query a db to find out what it knew
about a particular reference. The idea would be to query a db service
asking 'do you have this reference in your system, and if so, what
information have you got linked to it?' The service would then return
something describing what, if anything, it knew about a reference. In
our context at the Rat Genome Database this would be things such as
the genes, rat strains, sequences, etc. that we have curated from the
paper.
Ideally you would want the list of 'things' that get returned to be
MOBY objects but lightweight, as there might be a lot of info for some
references and you don't necessarily know what you are interested in
(though I suppose this could be an additional parameter in the initial
query ("do you have any gene records linked to this reference?")).
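To make the query idea concrete, here is a rough Python sketch of a
lookup with an optional object-type filter. It is not MOBY code: the
LinkedObject class, the CURATED dictionary, the objects_for_reference
function and the "REF:1" key are all hypothetical names made up for
illustration, and the dictionary just stands in for a real database.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LinkedObject:
    """Lightweight description of one object curated from a reference."""
    object_type: str  # e.g. "Gene", "Strain", "QTL", "SSLP"
    symbol: str
    id: str
    namespace: str


# Hypothetical stand-in for the database: reference ID -> curated objects.
# "REF:1" is a placeholder key, not a real reference identifier.
CURATED: Dict[str, List[LinkedObject]] = {
    "REF:1": [
        LinkedObject("Gene", "Abc1", "12345", "RGD"),
        LinkedObject("Strain", "BN/SsMCW", "345346", "RGD"),
        LinkedObject("QTL", "Bp123", "23453", "RGD"),
    ],
}


def objects_for_reference(reference_id: str,
                          object_type: Optional[str] = None) -> List[LinkedObject]:
    """Return what the database knows about a reference.

    An unknown reference yields an empty list. If object_type is given,
    only objects of that type are returned.
    """
    linked = CURATED.get(reference_id, [])
    if object_type is None:
        return list(linked)
    return [obj for obj in linked if obj.object_type == object_type]
```

Calling objects_for_reference("REF:1") returns everything linked to the
reference, while objects_for_reference("REF:1", "Gene") answers the
narrower "do you have any gene records linked to this reference?"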
This leads me to wonder what such a lightweight object would look like
and what it would contain. You'd like to know it was a 'gene' but you
don't want a load of extra info, perhaps just the gene symbol, the db's
ID for the gene and perhaps a URL to link in to the report in the
database. This should be sufficient to generate a useful report on the
reference and give info on how to follow up if you identify specific
objects you want to know more about.
How would one create such a lightweight object in the MOBY environment:
use a full-blown gene object but only return a limited core attribute
set, or have a separate lightweight gene object with just these
attributes? Ideally everyone implementing the service would use the
same object. By using a lightweight object with a limited set of
standard core attributes, it would be usable by everyone and still
allow db's to implement a 'heavyweight' object to include their
specific attributes, accessible via a different, specific service.
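One way to picture the second option is a shared core class that each
database extends. The sketch below is plain Python, not the MOBY object
model; CoreGene, DetailedGene, the extra dictionary and to_core are
hypothetical names, and the "chromosome" value is a placeholder rather
than real data.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CoreGene:
    """The small set of attributes every database would agree on."""
    symbol: str
    id: str
    namespace: str
    url: str


@dataclass
class DetailedGene(CoreGene):
    """A database's own 'heavyweight' gene: the core plus local extras."""
    # Database-specific attributes; the keys are up to each database.
    extra: Dict[str, str] = field(default_factory=dict)

    def to_core(self) -> CoreGene:
        """Drop the local extras and return only the shared attributes."""
        return CoreGene(self.symbol, self.id, self.namespace, self.url)


full = DetailedGene(
    symbol="Abc1",
    id="12345",
    namespace="RGD",
    url="http://rgd.mcw.edu/query/query.cgi?id=RGD:12345",
    extra={"chromosome": "placeholder"},  # illustrative value only
)
core = full.to_core()
```

Because DetailedGene is a subclass, a heavyweight object can still be
passed to anything that only expects the core attributes, and to_core
gives the stripped-down version for the literature service to return.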
I notice we have a lightweight Amino Acid object. I'd like to have
similar objects for gene, strain, quantitative trait locus and SSLP
(microsatellite marker). What do other people think about such standard
objects, and what might the core attributes be for them? The trick is
to return the smallest amount of useful information that allows a
subsequent service or user to know which object they are interested in
without having to get the heavyweight objects to find the appropriate
attribute. It also helps to limit the table joins in the database so
the query is fast (though one can always run off views or specific flat
files).
Here are my thoughts. A number of these objects have mapping
information, though the question then becomes which map coordinates
you want: all of them, a default of genome position, etc. The
complexity goes up when you add this in and things cease to be
lightweight.
Gene:
Symbol: Abc1
ID: 12345
namespace: RGD
URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:12345
[Might also want brief mapping information: chromosome, position?]
Strain:
Symbol: BN/SsMCW
ID: 345346
namespace: RGD
URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:345346
[Might also want strain type: inbred, outbred, consomic, congenic,
recombinant inbred, etc., I'm sure there are other terms for the other
organisms]
QTL:
Symbol: Bp123
ID: 23453
namespace: RGD
URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:23453
[Might also want long name (Blood pressure QTL 123), chromosome]
SSLP:
Symbol: D1Rat1
ID: 65432
namespace: RGD
URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:65432
[Might also want chromosome, position]
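The four examples above share the same core fields, so a single record
type with an optional bag of extras might be enough. Here is a rough
Python sketch that builds them and prints a simple report. LightObject,
REPORT_URL, report and EXAMPLES are hypothetical names; the URL pattern
is copied from the RGD examples above and would presumably differ for
other namespaces.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# URL pattern taken from the RGD examples; other databases would need
# their own pattern.
REPORT_URL = "http://rgd.mcw.edu/query/query.cgi?id={namespace}:{id}"


@dataclass
class LightObject:
    object_type: str  # "Gene", "Strain", "QTL" or "SSLP"
    symbol: str
    id: str
    namespace: str
    # Optional extras such as a long name, chromosome or strain type.
    optional: Dict[str, str] = field(default_factory=dict)

    @property
    def url(self) -> str:
        """Link into the database report, built from namespace and ID."""
        return REPORT_URL.format(namespace=self.namespace, id=self.id)


def report(objects: List[LightObject]) -> str:
    """Format one line per object, with any optional extras indented."""
    lines = []
    for obj in objects:
        lines.append(
            f"{obj.object_type}: {obj.symbol} "
            f"({obj.namespace}:{obj.id}) {obj.url}"
        )
        for key, value in sorted(obj.optional.items()):
            lines.append(f"  {key}: {value}")
    return "\n".join(lines)


EXAMPLES = [
    LightObject("Gene", "Abc1", "12345", "RGD"),
    LightObject("Strain", "BN/SsMCW", "345346", "RGD"),
    LightObject("QTL", "Bp123", "23453", "RGD",
                {"long name": "Blood pressure QTL 123"}),
    LightObject("SSLP", "D1Rat1", "65432", "RGD"),
]

print(report(EXAMPLES))
```

Deriving the URL from the namespace and ID keeps the object one field
smaller, at the cost of every consumer needing to know the pattern;
returning the URL explicitly, as in the examples above, avoids that.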
What do other people think??
Simon
------------------------------------------------------------------------
Simon Twigger, Ph.D.
Assistant Professor, Bioinformatics Research Center
Medical College of Wisconsin
8701 Watertown Plank Road,
Milwaukee, WI, 53226
tel. 414-456-8802, fax 414-456-6595