[MOBY-l] Re: BIoinformatics Protocols Ontological KnowledgeBase?

Lincoln Stein lstein at cshl.org
Mon Apr 28 13:26:21 UTC 2003


Hi Elia,

What you're talking about is an ontology of bioinformatics services.  This is 
similar to what we've been talking about for MOBY, and also similar to the 
(much more ambitious) MyGrid project.

It's a hard problem, but it has taken on a certain feeling of inevitability, 
and I would very much welcome your group working on this with us. We are 
currently pursuing two different forks in the road: one is to use WSDL to 
define bioinformatics service signatures and to facilitate automatic 
invocation.  This relies on a central registry.  The other is to use RDF to 
create a distributed ontology of service descriptions; no registry required. 

As far as the connection to Current Protocols goes, I'm on the editorial board 
at Current Protocols in Bioinformatics, and can introduce the idea of a 
relationship at the next board meeting on June 13.  Do you think you might 
have a concrete proposal or discussion paper ready by then?

Lincoln

On Monday 28 April 2003 02:24 am, Elia Stupka wrote:
> Hello everybody,
>
> as our BioPipe paper is being accepted Shawn, Kiran and I are thinking of
> the next possible step in the field, and we came up with a possible
> interesting project, an ontology driven database of bioinformatics
> protocols. I've included a few people in the list because it's something
> that conceptually or concretely overlaps with these resources: SO, GKB,
> BioPipe, Current Protocols Books....
>
> I don't want to drag this particular e-mail too long but here is a very
> short summary of the idea. A database that will contain bioinformatics
> protocols that have been published or are being used by current large scale
> projects and/or are de facto accepted protocols in bioinformatics. These can
> be as simple as "reciprocal blast for ortholog finding" or as complex as the
> full genome annotation pipeline at Ensembl. The idea is to make
> the protocols a) reproducible (BioPipe), ontology driven (SO), curated
> internally (a la GKB), publishable (Current Protocols), annotated by
> external users (a la Amazon).
>
> We would like to see SO extended somewhat from being only an ontology for
> sequence features to encompassing what we consider input data, which at times
> may be sequence features, but often will be raw data. The ontology for raw
> data could be as simple as dna or protein and, if dna, whether it is cDNA,
> genomic DNA, etc., to allow us to describe from which layer of the
> ontology "onwards" in the tree a protocol will work. We would further like to
> "ontologize" analysis programs, so that one could simply say this step
> requires a "similarity search" without having to state what program, etc.
>
> So primarily a bioinformatics protocol would be a graph of SO input, SO
> analysis, SO for output. This in turn would allow us to automatically
> generate BioPipe XML by pairing the term "whole genome dna" with all current
> methods of fetching genomes in BioPipe, pairing "similarity search" to all
> blast-like programs in BioPipe, pairing "protein-based gene prediction" to
> genewise, genomescan, etc. and pairing "genes" output to iohandlers for
> storing genes. This is the BioPipe aspect of it, which would enable us to
> make a protocol not only a dead "record" but an executable alive protocol.
>
> The effort would need some sort of curation team, serious setup, like GKB
> to validate protocols, annotate protocols, research literature, etc. It would
> also need serious software development as it progresses because we *are
> aware* it will get hairy as we will really start populating this beast, but
> it will be as exciting as challenging, I think.
>
> Finally, a collaboration with somebody like Current Protocols could really
> push this project out. Be it to simply provide a summary in the book of what
> is in the website, or be it a proper tight integration where "protocol
> identifiers" would be the same between the two, and the book would almost
> become an extension to the project rather than the other way around.
> This is at the first, brainstorming sort of stage, Shawn, Kiran and I have
> just started toying with the idea and would like to know what you all think.
> If it is of interest to get decent critical mass we can then think of the
> next stage, perhaps a Banbury (or Banbury-like) meeting? By the end of
> summer Shawn will be close to SO folks (at Stanford), Kiran will be
> at Lincoln's, and I will be back in Europe, closer to Hinxton, so we would
> be well located to take this forward together.
>
> Ciao!
>
> Elia
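
The pairing of ontology terms to concrete tools described in the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions, not BioPipe code: the term tree, the placement of "whole genome dna" under "genomic dna", the single "blast" entry standing in for "all blast-like programs", and the function names is_a, accepts and expand are all hypothetical.

```python
from itertools import product

# Hypothetical is-a tree for input data terms (child -> parent).
# The layering is an assumption made for this sketch only.
PARENT = {
    "cDNA": "dna",
    "genomic dna": "dna",
    "whole genome dna": "genomic dna",
    "dna": "raw data",
    "protein": "raw data",
}

# Hypothetical pairing of abstract analysis terms to concrete programs.
# "blast" is a placeholder for whatever blast-like programs are available.
IMPLEMENTATIONS = {
    "similarity search": ["blast"],
    "protein-based gene prediction": ["genewise", "genomescan"],
}

# A protocol as a small graph: input term, ordered analysis terms, output term.
PROTOCOL = {
    "input": "dna",
    "analysis": ["similarity search", "protein-based gene prediction"],
    "output": "genes",
}


def is_a(term, ancestor):
    """Return True if term equals ancestor or lies below it in the tree."""
    while term is not None:
        if term == ancestor:
            return True
        term = PARENT.get(term)
    return False


def accepts(protocol, data_term):
    """A protocol works from its input term 'onwards' down the tree."""
    return is_a(data_term, protocol["input"])


def expand(analysis_terms):
    """List every combination of concrete programs for the abstract steps."""
    choices = [IMPLEMENTATIONS[term] for term in analysis_terms]
    return [list(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(accepts(PROTOCOL, "whole genome dna"))
    print(accepts(PROTOCOL, "protein"))
    for pipeline in expand(PROTOCOL["analysis"]):
        print(pipeline)
```

Each expanded combination would correspond to one candidate executable pipeline; generating the actual BioPipe XML from it is left out because its schema is not described here.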

-- 
Lincoln Stein
lstein at cshl.org
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)


