[MOBY-l] Core information about biological objects...

Lincoln Stein lstein at cshl.edu
Wed Oct 1 10:42:47 EDT 2003


Hi,

I'm going to speak very generally here from the standpoint of someone who has 
been struggling with the tradeoffs between "light" vs "heavy" objects for 
some time.

I have found it to be extremely useful to export objects that are stripped of 
their content and consist only of attribute names and pointers.  As the 
application needs to access the contents of the object, the object is filled 
in bit by bit in a transparent way.  I call this process "filling" of the 
object.  However, this works best when the application only needs a few 
attributes and when network latency is low.  If the application needs most of 
the attributes, or if network latency is high, each fill costs another round 
trip and this strategy becomes a performance killer.
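
To make the filling idea concrete, here is a minimal Python sketch.  It is 
not taken from MOBY or any existing library: the LazyObject class, the 
fetch_attribute callback and the in-memory STORE standing in for a remote 
database are all hypothetical, and the record reuses the gene example from 
Simon's message quoted below.

```python
class LazyObject:
    """Hypothetical proxy: knows its attribute names, fetches values on first use."""

    def __init__(self, object_id, attribute_names, fetch_attribute):
        self._id = object_id
        self._names = set(attribute_names)
        self._fetch = fetch_attribute  # assumed signature: (object_id, name) -> value
        self._filled = {}              # attributes retrieved so far
        self.round_trips = 0           # number of simulated network calls

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails, i.e. for remote attributes.
        if name.startswith("_") or name not in self._names:
            raise AttributeError(name)
        if name not in self._filled:
            self.round_trips += 1      # one round trip per unfilled attribute
            self._filled[name] = self._fetch(self._id, name)
        return self._filled[name]


# In-memory stand-in for the remote database (an assumption for this sketch).
STORE = {"RGD:12345": {"symbol": "Abc1", "id": "12345", "namespace": "RGD"}}


def fetch_attribute(object_id, name):
    return STORE[object_id][name]


gene = LazyObject("RGD:12345", STORE["RGD:12345"].keys(), fetch_attribute)
print(gene.symbol, gene.round_trips)     # first access triggers a fetch
print(gene.symbol, gene.round_trips)     # second access is served locally
print(gene.namespace, gene.round_trips)  # a new attribute costs another trip
```

The round_trips counter shows the tradeoff: touching one or two attributes 
is cheap, but touching every attribute costs one call per attribute.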

My approach has been to let the application tune this behavior, by specifying 
in the request three different types of retrieval:

	1) fetch the object completely unfilled, and fill dynamically as needed
		(this is the default)

	2) fetch the object completely filled

	3) fetch the object partly-filled, by listing the attributes to
		be filled in the request

For generality, one should allow requests that return multiple objects, 
and one should be able to control the filling behavior for each object class.
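
The three retrieval types, with a separate setting per object class, could 
be sketched in Python roughly as follows.  This is an illustration under 
stated assumptions rather than an actual MOBY request format: the Fill 
enum, the build_response function and the shape of the response are all 
hypothetical, and the sample records come from Simon's examples quoted 
below.

```python
from enum import Enum


class Fill(Enum):
    UNFILLED = "unfilled"  # 1) names only, fill dynamically later (the default)
    FILLED = "filled"      # 2) every attribute included up front
    PARTIAL = "partial"    # 3) only the attributes listed in the request


DEFAULT_POLICY = (Fill.UNFILLED, ())


def build_response(records, policies):
    """Apply a per-class fill policy to a list of (class_name, attributes) pairs.

    policies maps class_name -> (Fill mode, attribute names wanted); classes
    that are not mentioned fall back to the unfilled default.
    """
    response = []
    for class_name, attrs in records:
        mode, wanted = policies.get(class_name, DEFAULT_POLICY)
        if mode is Fill.FILLED:
            filled = dict(attrs)
        elif mode is Fill.PARTIAL:
            filled = {name: attrs[name] for name in wanted if name in attrs}
        else:
            filled = {}
        # Attribute names are always sent so the client knows what it can ask for.
        response.append({"class": class_name,
                         "names": sorted(attrs),
                         "filled": filled})
    return response


records = [
    ("Gene", {"symbol": "Abc1", "id": "12345", "namespace": "RGD"}),
    ("Strain", {"symbol": "BN/SsMCW", "id": "345346", "namespace": "RGD"}),
    ("QTL", {"symbol": "Bp123", "id": "23453", "namespace": "RGD"}),
]
policies = {
    "Gene": (Fill.PARTIAL, ("symbol",)),
    "Strain": (Fill.FILLED, ()),
}
response = build_response(records, policies)
for item in response:
    print(item["class"], item["filled"])
```

Here Gene comes back with only its symbol, Strain comes back complete, and 
QTL, which has no policy of its own, falls back to the unfilled default.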

I think MOBY does have something similar to the concept of filled and 
unfilled, but once an object is filled it can't be emptied.  Is that right?

Lincoln

On Thursday 18 September 2003 12:13 pm, Simon Twigger wrote:
> Hi there,
>
> I've been thinking about literature related services using MOBY, in
> particular how one might be able to query a db to find out what it knew
> about a particular reference. The idea would be to query a db service
> asking 'do you have this reference in your system, and if so, what
> information have you got linked to it?' The service would then return
> something describing what, if anything, it knew about a reference. In
> our context at the Rat Genome Database this would be such things as -
> the genes/rat strains/sequences, etc. that we have curated from the
> paper.
>
> Ideally you would want the list of 'things' that get returned to be
> MOBY objects but lightweight, as there might be a lot of info for some
> references and you don't necessarily know what you are interested in
> (though I suppose this could be an additional parameter in the initial
> query ("do you have any gene records linked to this reference?")).
>
> This leads me to wonder what such a lightweight object would look like
> and what it would contain. You'd like to know it was a 'gene' but you
> don't want a load of extra info, perhaps just the gene symbol, the db's
> ID for the gene and perhaps a URL to link in to the report in the
> database. This should be sufficient to generate a useful report on the
> reference and give info on how to follow up if you identify specific
> objects you want to know more about.
>
> How would one create such a lightweight object in the MOBY environment:
> use a full-blown gene object but only return a limited core attribute
> set, or have a separate lightweight gene object with just these
> attributes? Ideally everyone implementing the service would use the
> same object, and by using a lightweight object with a limited set of
> standard core attributes it would be useable by everyone and still
> allow db's to implement a 'heavyweight' object to include their
> specific attributes, accessible via a different, specific service.
>
> I notice we have a lightweight Amino Acid object, I'd like to have such
> things as gene, strain, quantitative trait locus, SSLP (microsatellite
> marker). What do other people think about such standard objects, and
> what might the core attributes be for them? The trick is to return the
> smallest amount of useful information that allows a subsequent service
> or user to know which object they are interested in without having to
> get the heavyweight objects to find the appropriate attribute. It also
> helps to limit the table joins in the database so the query is fast
> (one can always run off views or specific flat files, though).
>
> Here are my thoughts. A number of these objects have mapping
> information, though the question then becomes which map coordinates
> you want: all of them, a default of genome position, etc. The
> complexity goes up when you add this in and things cease to be
> lightweight.
>
> Gene:
> Symbol: Abc1
> ID: 12345
> namespace: RGD
> URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:12345
>
> [Might also want brief mapping information: chromosome, position?]
>
> Strain:
> Symbol: BN/SsMCW
> ID: 345346
> namespace: RGD
> URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:345346
>
> [Might also want strain type: inbred, outbred, consomic, congenic,
> recombinant inbred, etc. I'm sure there are other terms for the other
> organisms]
>
> QTL:
> Symbol: Bp123
> ID: 23453
> namespace: RGD
> URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:23453
>
> [Might also want long name (Blood pressure QTL 123), chromosome]
>
> SSLP:
> Symbol: D1Rat1
> ID: 65432
> namespace: RGD
> URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:65432
>
> [Might also want chromosome, position]
>
>
>
> What do other people think??
>
> 	Simon
>
> ------------------------------------------------------------------------
> Simon Twigger, Ph.D.
> Assistant Professor, Bioinformatics Research Center
>
> Medical College of Wisconsin
> 8701 Watertown Plank Road,
> Milwaukee, WI, 53226
> tel. 414-456-8802, fax 414-456-6595
>
> _______________________________________________
> moby-l mailing list
> moby-l at biomoby.org
> http://biomoby.org/mailman/listinfo/moby-l

-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein at cshl.org                                Cold Spring Harbor, NY
========================================================================



