[MOBY] Re: [MOBY-l] Core information about biological objects...

Mark Wilkinson markw at illuminae.com
Wed Oct 1 14:33:15 EDT 2003


On Wed, 2003-10-01 at 07:42, Lincoln Stein wrote:


> I think MOBY does have something similar to the concept of filled and 
> unfilled, but once an object is filled it can't be emptied.  Is that right?

Yes and no... 

We deal with this in several ways, but none of them are as direct as the
methodology you describe (requesting that portions of the object are
filled as needed).  One way we deal with the problem is through the
inheritance hierarchy - you only request the complexity that you really
need at that moment.  This isn't quite as flexible as your approach, but
it makes the architecture simpler (the server doesn't have to
deal with arbitrary requests for sub-portions of an object); on the
other hand, it does assume that service providers will set up a variety
of services serving different complexities of objects.  We also deal
with this by discouraging unnecessarily complex objects - our objects
are inherently lightweight (so far) and only carry the data that we
would assume the end-user will require.  

"Filling in" an object is quite simple, again because of the ISA
hierarchy.  For example, I have two services that serve genbank
sequences - one returns a "virtual sequence" (service A), the other
returns "*Sequence" (service B).  Service B could be considered a filling-in
of the data returned from A.  If I approach A with a set of NCBI_gi's
and execute the service I get a bunch of "unfilled" virtual sequence
objects.  Since these all inherit directly from Object, and service B
consumes Object, I can (en masse, or individually) send the output of A
to B and get them filled in.
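
To make the chaining above concrete, here is a minimal sketch in Python.
It is not the BioMOBY API: the class names, the service_a/service_b
functions and the in-memory _FAKE_DB are all hypothetical stand-ins,
and "GenericSequence" is just an assumed placeholder for whatever
"*Sequence" class service B really returns.  It only illustrates how an
ISA hierarchy lets the output of A be passed to B unchanged.

```python
from dataclasses import dataclass


@dataclass
class MobyObject:
    """Base of the hierarchy: nothing but a namespace and an ID."""
    namespace: str
    id: str


@dataclass
class VirtualSequence(MobyObject):
    """'Unfilled' sequence: knows its length but carries no residues."""
    length: int = 0


@dataclass
class GenericSequence(VirtualSequence):
    """'Filled' sequence: adds the actual sequence string."""
    sequence_string: str = ""


# Hypothetical stand-in for the remote database behind both services.
_FAKE_DB = {"111": "ATGC", "222": "GGCCTA"}


def service_a(gis):
    """Service A: gi numbers in, lightweight VirtualSequence objects out."""
    return [VirtualSequence("NCBI_gi", gi, len(_FAKE_DB[gi])) for gi in gis]


def service_b(objects):
    """Service B: consumes anything that ISA MobyObject, returns filled objects."""
    filled = []
    for obj in objects:
        seq = _FAKE_DB[obj.id]  # only namespace/id are needed to look it up
        filled.append(GenericSequence(obj.namespace, obj.id, len(seq), seq))
    return filled


unfilled = service_a(["111", "222"])
filled = service_b(unfilled)  # works because VirtualSequence ISA MobyObject
```

Because service_b only relies on the namespace and id that every object
inherits, the client can send it one object or the whole list from A.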

w.r.t. "emptying" the object - there is no pre-defined way to do this,
but it doesn't seem to be a particularly useful thing to do... if a
service is registered that requires "DNASequence" objects, then it must
*need* DNASequence objects (versus Object objects or VirtualSequence
objects).  Since the service provider registers the minimal "full"
object that they require to execute their service, passing them an empty
object would only shift the burden onto them to fill it in (which they
might not be able to do!).  An intelligent client would be able to
decompose objects and throw away unnecessary information that was not
required for service execution, but the system does not *necessitate*
that this client-side work is done.
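
As a rough illustration of the client-side decomposition mentioned
above, the following Python sketch reduces a "full" object to the
minimal class a service is registered to consume.  Everything here is
an assumption for illustration: the class definitions and the
reduce_to helper are hypothetical and are not part of any MOBY client
library.

```python
from dataclasses import dataclass, fields


@dataclass
class MobyObject:
    namespace: str
    id: str


@dataclass
class VirtualSequence(MobyObject):
    length: int = 0


@dataclass
class DNASequence(VirtualSequence):
    sequence_string: str = ""


def reduce_to(obj, target_cls):
    """Return a copy of obj carrying only the fields target_cls declares.

    Reducing only works "upward" in the ISA hierarchy; an object cannot
    be turned into a richer class, since the extra data is not there.
    """
    if not isinstance(obj, target_cls):
        raise TypeError(
            f"{type(obj).__name__} is not a {target_cls.__name__}"
        )
    kept = {f.name: getattr(obj, f.name) for f in fields(target_cls)}
    return target_cls(**kept)


full = DNASequence("NCBI_gi", "111", 4, "ATGC")
slim = reduce_to(full, VirtualSequence)  # drops the sequence string
```

Going the other way (reduce_to(slim, DNASequence)) raises TypeError,
which mirrors the point above: a service that needs DNASequence cannot
be handed an emptier object and be expected to cope.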

M

> Lincoln
> 
> On Thursday 18 September 2003 12:13 pm, Simon Twigger wrote:
> > Hi there,
> >
> > I've been thinking about literature related services using MOBY, in
> > particular how one might be able to query a db to find out what it knew
> > about a particular reference. The idea would be to query a db service
> > asking 'do you have this reference in your system, and if so, what
> > information have you got linked to it?' The service would then return
> > something describing what, if anything, it knew about a reference. In
> > our context at the Rat Genome Database this would be such things as -
> > the genes/rat strains/sequences, etc. that we have curated from the
> > paper.
> >
> > Ideally you would want the list of 'things' that get returned to be
> > MOBY objects but lightweight as there might be a lot of info for some
> > references and you don't necessarily know what you are interested in
> > (though I suppose this could be an additional parameter in the initial
> > query ("do you have any gene records linked to this reference?")).
> >
> > This leads me to wonder what such a lightweight object would look like
> > and what it would contain. You'd like to know it was a 'gene' but you
> > don't want a load of extra info, perhaps just the gene symbol, the db's
> > ID for the gene and perhaps a URL to link in to the report in the
> > database. This should be sufficient to generate a useful report on the
> > reference and give info on how to follow up if you identify specific
> > objects you want to know more about.
> >
> > How would one create such a lightweight object in the MOBY environment
> > - use a full blown gene object but only return a limited core attribute
> > set, have a separate lightweight gene object with just these
> > attributes? Ideally everyone implementing the service would use the
> > same object and by using a lightweight object with a limited set of
> > standard core attributes it would be useable by everyone and still
> > allow db's to implement a 'heavyweight' object to include their
> > specific attributes, accessible via a different, specific service.
> >
> > I notice we have a lightweight Amino Acid object, I'd like to have such
> > things as gene, strain, quantitative trait locus, SSLP (microsatellite
> > marker). What do other people think about such standard objects and
> > what might the core attributes be for them. The trick is to return the
> > smallest amount of useful information that allows a subsequent service
> > or user to know which object they are interested in without having to
> > get the heavyweight objects to find the appropriate attribute. Also, to
> > limit the table joins in the database so the query is fast (can always
> > run off views or specific flat files though).
> >
> >   Here are my thoughts, a number of these objects have mapping
> > information though the question then becomes which map coordinates do
> > you want, do you want them all, default to genome position, etc. - the
> > complexity goes up when you add this in and things cease to become
> > lightweight.
> >
> > Gene:
> > Symbol: Abc1
> > ID: 12345
> > namespace: RGD
> > URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:12345
> >
> > [Might also want brief mapping information: chromosome, position?]
> >
> > Strain:
> > Symbol: BN/SsMCW
> > ID: 345346
> > namespace: RGD
> > URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:345346
> >
> > [Might also want strain type: inbred, outbred, consomic, congenic,
> > recombinant inbred, etc., I'm sure there are other terms for the other
> > organisms]
> >
> > QTL:
> > Symbol: Bp123
> > ID: 23453
> > namespace: RGD
> > URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:23453
> >
> > [Might also want long name (Blood pressure QTL 123), chromosome]
> >
> > SSLP:
> > Symbol: D1Rat1
> > ID: 65432
> > namespace: RGD
> > URL: http://rgd.mcw.edu/query/query.cgi?id=RGD:65432
> >
> > [Might also want chromosome, position]
> >
> >
> >
> > What do other people think??
> >
> > 	Simon
> >
> > ------------------------------------------------------------------------
> > --------------------------
> > Simon Twigger, Ph.D.
> > Assistant Professor, Bioinformatics Research Center
> >
> > Medical College of Wisconsin
> > 8701 Watertown Plank Road,
> > Milwaukee, WI, 53226
> > tel. 414-456-8802, fax 414-456-6595
> >
> > _______________________________________________
> > moby-l mailing list
> > moby-l at biomoby.org
> > http://biomoby.org/mailman/listinfo/moby-l
-- 
Mark Wilkinson <markw at illuminae.com>
Illuminae


