[MOBY-l] Spewing forth from the Hackathon - RFC please!

Mark Wilkinson markw at illuminae.com
Tue Feb 18 07:59:22 UTC 2003


Attached is a document with a few fairly simple examples of the
structure of MOBY-compliant objects, queries, and responses as one might
~expect them to look in the upcoming post-hackathon release of the MOBY
Spec.

A few significant changes are apparent, the primary ones being that:

1) we now explicitly use a moby namespace (though we haven't got an XSD
for it yet), 

2) the query/response structures are a bit more complex to allow space
for future provision information blocks, or whatever else might come
along, 

3) MOBY has its own primitives!  We have moby:INT, moby:STRING,
moby:FLOAT, etc.  This ensures that *everything* in a MOBY Object is
itself a MOBY Object (honestly!  We had to overload these because we
sometimes need to add a namespace and id to an 'int'... as absurd as
that might sound...), and thus can be used to trigger service requests
from MOBY Central - e.g. I have 200 sequences; if I pass you 200 INTs
representing the Length of each sequence, can you give me back a FLOAT
indicating the average length of these sequences?

4)  Some rules for the structure and use of CrossReference blocks are
now clarified, and some rules surrounding the switching of namespaces
between the query and the response have also become apparent.


There are probably other goodies in there that I can't remember, but
those are the biggies.

Lincoln has suggested that we register all biological web-services (CGI,
MOBY-SOAP, non-MOBY-SOAP, or whatever) in MOBY Central, and call it
'GOOBY'... I dunno... sure, why not :-)

Anyway, I'm sending this out as a request for comment.  It is a snapshot
(only!) of what the current state of MOBY is as of 4:00 this afternoon.  We
are still working on it continuously (passing parameters in queries is
the next hurdle!) so it will be in flux for quite a while, but comments
from the MOBY community would be extremely valuable at this stage of the
game - if you see something you don't like, say so now before it becomes
more concrete!

Greetings from hot and sticky Singapore,

Mark


-- 
=======================================================================
                                    |--==\
Mark Wilkinson, Ph.D.                \==-|       
Bioinformatics Consultant             \=/        0010010010100101110010
Illuminae Media                       /-\        
727 6th Ave. N.                      /-==|       0010100100111101010010
Saskatoon, SK, Canada               |==-/        
S7K 2S8                              \=/         0100100100010010010101
+1 (306) 373 3841                     /\         
markw at illuminae.com                  /=-\        1101001010100101010101
                                    |--==\
=======================================================================
-------------- next part --------------
Sample Objects, Queries, & Responses
BioHackathon, 2003 - Singapore.
Version 0.01

These are samples only, to exemplify correctly structured MOBY objects,
queries, and responses in the upcoming (post-hackathon) MOBY release.
They do not necessarily represent objects whose
structures are currently registered in MOBY Central, nor do they
necessarily represent what those structures might be when we do register
them.  They also do not represent legitimate data nor responses.  They
are meant to give you an idea of what things should look like, and what
things to consider when constructing objects, queries, and responses.
Comments indicating points of interest in the examples are included.

========================================================================

OBJECTS:  objects don't need to have the moby namespace explicitly
defined since they will invariably be inside of a Query or Response
wrapper that does.

(Base) Object: (the "MOBY Triple")
<moby:Object  moby:namespace="GenBank/GI" moby:id="163483"/>


Sequence: note that Sequence inherits from (IS-A) Object
<moby:Sequence  moby:namespace="GenBank/GI" moby:id="163483">
    <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">375</moby:INT>
    <moby:STRING moby:namespace="primitive" moby:id="" moby:tagName="SequenceString">
        ATTGCGCATGCGAGCTAGTAGCATGCGATGAGGTCGATGCATCT
    </moby:STRING>
</moby:Sequence>
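
As a rough illustration of the "MOBY Triple", here is a minimal Python
sketch that pulls (object type, namespace, id) out of an object and its
children.  It assumes the namespace URI used in the sample queries in
this document and declares the moby prefix inline so the fragment parses
on its own; the helper name moby_triple is hypothetical and not part of
any MOBY library.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI shown in the sample queries of this document.
MOBY_NS = "http://www.biomoby.org/moby"


def moby_triple(element):
    """Return (object type, namespace, id) for a MOBY object element."""
    # ElementTree expands prefixed names to the "{uri}local" form.
    object_type = element.tag.split("}", 1)[-1]
    namespace = element.get(f"{{{MOBY_NS}}}namespace", "")
    object_id = element.get(f"{{{MOBY_NS}}}id", "")
    return object_type, namespace, object_id


# A bare object needs the prefix declared somewhere; here it is declared inline.
SAMPLE = f"""<moby:Sequence xmlns:moby="{MOBY_NS}" moby:namespace="GenBank/GI" moby:id="163483">
    <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">375</moby:INT>
</moby:Sequence>"""

root = ET.fromstring(SAMPLE)
triple = moby_triple(root)
child_triples = [moby_triple(child) for child in root]
print(triple)
print(child_triples)
```

Because the primitives carry the same two attributes, the same helper
works on the Sequence and on the INT inside it.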


GO_Term: (IS-A Object)
<moby:GO_Term moby:namespace="GO/Acc" moby:id="GO:0005575">
    <moby:STRING moby:namespace="primitive" moby:tagName="go:name">cellular component</moby:STRING>
    <moby:STRING moby:namespace="primitive" moby:tagName="go:definition">
    That fraction of cells, prepared by disruptive biochemical methods, that includes the plasma and other membranes. 
    </moby:STRING>
</moby:GO_Term>


GO_Annotation: (IS-A Object, HAS-A GO_Term)
<moby:GO_Annotation moby:namespace="" moby:id="">
    <moby:GO_Term moby:namespace="GO/Acc" moby:id="GO:0005575">
        <moby:STRING moby:namespace="primitive" moby:tagName="go:name">cellular component</moby:STRING>
        <moby:STRING moby:namespace="primitive" moby:tagName="go:definition">
            That fraction of cells, prepared by disruptive biochemical methods, that includes the plasma and other membranes. 
        </moby:STRING>
    </moby:GO_Term>
    <moby:GO_Term moby:namespace="GO/Acc" moby:id="GO:0005622">
        <moby:STRING moby:namespace="primitive" moby:tagName="go:name">intracellular</moby:STRING>
        <moby:STRING moby:namespace="primitive" moby:tagName="go:synonym">protoplasm</moby:STRING>
        <moby:STRING moby:namespace="primitive" moby:tagName="go:definition">
            A structure composed of a very long molecule of DNA and associated proteins (e.g. histones) that carries hereditary information.
        </moby:STRING>
    </moby:GO_Term>
</moby:GO_Annotation>


AnnotatedSequence: (IS-A Sequence, HAS-A GO_Annotation) 
<moby:AnnotatedSequence  moby:namespace="GenBank/GI" moby:id="163483">
    <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">375</moby:INT>
    <moby:STRING moby:namespace="primitive" moby:id="" moby:tagName="SequenceString">
        ATTGCGCATGCGAGCTAGTAGCATGCGATGAGGTCGATGCATCT
    </moby:STRING>
    <moby:GO_Annotation moby:namespace="" moby:id="">
        <moby:GO_Term moby:namespace="GO/Acc" moby:id="GO:0005575">
            <moby:STRING moby:namespace="primitive" moby:tagName="go:name">cellular component</moby:STRING>
            <moby:STRING moby:namespace="primitive" moby:tagName="go:definition">
                That fraction of cells, prepared by disruptive
                biochemical methods, that includes the plasma and other
                membranes. 
            </moby:STRING>
        </moby:GO_Term>
        <moby:GO_Term moby:namespace="GO/Acc" moby:id="GO:0005622">
            <moby:STRING moby:namespace="primitive" moby:tagName="go:name">intracellular</moby:STRING>
            <moby:STRING moby:namespace="primitive" moby:tagName="go:synonym">protoplasm</moby:STRING>
            <moby:STRING moby:namespace="primitive" moby:tagName="go:definition">
                A structure composed of a very long molecule of DNA and
                associated proteins (e.g. histones) that carries
                hereditary information.
            </moby:STRING>
        </moby:GO_Term>
    </moby:GO_Annotation>    
</moby:AnnotatedSequence>

________________________________________________________________________






QUERIES and RESPONSES:

========================================================================    
Simple single-object query and its response
(e.g. "Service takes sequences as input and calculates
 Tm for that sequence")


QUERY: 
<?xml version="1.0" encoding="UTF-8"?>
<moby:MOBY xmlns:moby="http://www.biomoby.org/moby">
    <moby:queryInput>
        <moby:Sequence  moby:namespace="GenBank/GI" moby:id="163483">
            <moby:INT moby:namespace="primitive" moby:id="length1" moby:tagName="Length">375</moby:INT>
            <moby:STRING moby:namespace="primitive" moby:id="sequence1" moby:tagName="SequenceString">
                ATTGCGCATGCGAGCTAGTAGCATGCGATGAGGTCGATGCATCT
            </moby:STRING>
        </moby:Sequence>
    </moby:queryInput>
</moby:MOBY>


RESPONSE:
<?xml version="1.0" encoding="UTF-8"?>
<moby:MOBY xmlns:moby="http://www.biomoby.org/moby"
            moby:authURI="http://www.tempcalculator.org">
    <moby:queryResponse>
        <moby:FLOAT moby:namespace="GenBank/GI" moby:id="163483">69.8
        </moby:FLOAT>
    </moby:queryResponse>
</moby:MOBY>


COMMENT:  Note that, although it is implicit that the service used the
string with id "sequence1", the namespace/id of the response is that of
the *root* object of the query request, i.e. the Sequence object.
========================================================================
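
To make the rule in the comment above concrete, here is a minimal
service-side sketch in Python: it builds a moby:FLOAT response that
reuses the namespace/id of the query's root object rather than that of
the inner STRING.  The function name build_float_response is
hypothetical, and the value 69.8 is simply copied from the sample; no
real Tm calculation is attempted.

```python
import xml.etree.ElementTree as ET

MOBY_NS = "http://www.biomoby.org/moby"
ET.register_namespace("moby", MOBY_NS)  # serialize with the "moby:" prefix


def q(name):
    """Qualified ElementTree name for a tag or attribute in the moby namespace."""
    return f"{{{MOBY_NS}}}{name}"


def build_float_response(query_xml, value, auth_uri):
    """Wrap value in a moby:FLOAT that reuses the query root object's namespace/id."""
    query = ET.fromstring(query_xml)
    root_object = query.find(q("queryInput"))[0]  # first object inside queryInput
    response = ET.Element(q("MOBY"), {q("authURI"): auth_uri})
    query_response = ET.SubElement(response, q("queryResponse"))
    result = ET.SubElement(query_response, q("FLOAT"), {
        q("namespace"): root_object.get(q("namespace"), ""),
        q("id"): root_object.get(q("id"), ""),
    })
    result.text = str(value)
    return ET.tostring(response, encoding="unicode")


QUERY = """<moby:MOBY xmlns:moby="http://www.biomoby.org/moby">
    <moby:queryInput>
        <moby:Sequence moby:namespace="GenBank/GI" moby:id="163483">
            <moby:STRING moby:namespace="primitive" moby:id="sequence1" moby:tagName="SequenceString">ATTGCGCATGCG</moby:STRING>
        </moby:Sequence>
    </moby:queryInput>
</moby:MOBY>"""

# 69.8 is the placeholder value from the sample response, not a computed Tm.
response_xml = build_float_response(QUERY, 69.8, "http://www.tempcalculator.org")
print(response_xml)
```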

========================================================================

Simple multiple query, each with a single object
(e.g. "please BLAST these two sequences")

QUERY:
<?xml version="1.0" encoding="UTF-8"?>
<moby:MOBY xmlns:moby="http://www.biomoby.org/moby">
    <moby:queryInput>
        <moby:Sequence  moby:namespace="GenBank/GI" moby:id="557634">
            <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">10</moby:INT>
            <moby:STRING moby:namespace="primitive" moby:id="" moby:tagName="SequenceString">
                TGCGCATGCG
            </moby:STRING>
        </moby:Sequence>
    </moby:queryInput>
    <moby:queryInput>
        <moby:Sequence  moby:namespace="GenBank/GI" moby:id="163483">
            <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">20</moby:INT>
            <moby:STRING moby:namespace="primitive" moby:id="" moby:tagName="SequenceString">
                ATTGCGCATGCGAGCTAGTA
            </moby:STRING>
        </moby:Sequence>
    </moby:queryInput>
</moby:MOBY>


RESPONSE:
<?xml version="1.0" encoding="UTF-8"?>
<moby:MOBY  xmlns:moby="http://www.biomoby.org/moby"
            moby:authURI="http://www.ncbi.nlm.nih.gov">
    <moby:queryResponse>
        <moby:BLAST_TEXT moby:namespace="GenBank/GI" moby:id="163483">
        <![CDATA[
            BLASTN 2.2.2 [Jan-08-2002]


            Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
            Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
            "Gapped BLAST and PSI-BLAST: a new generation of protein database search
            programs",  Nucleic Acids Res. 25:3389-3402.

            Query= 163483
            ....
            ]]>
        </moby:BLAST_TEXT>
    </moby:queryResponse>
    <moby:queryResponse>
        <moby:BLAST_TEXT moby:namespace="GenBank/GI" moby:id="557634">
        <![CDATA[
            BLASTN 2.2.2 [Jan-08-2002]


            Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
            Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
            "Gapped BLAST and PSI-BLAST: a new generation of protein database search
            programs",  Nucleic Acids Res. 25:3389-3402.

            Query= 557634
            ....
            ]]>
        </moby:BLAST_TEXT>
    </moby:queryResponse>
</moby:MOBY>

COMMENT: again, note that the namespace/id of the "root" input object is
maintained, and that this is sufficient to determine which BLAST report
belongs with which sequence, even though they were returned in a
different order.
========================================================================
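
On the client side, the matching described in the comment above might
look something like the following Python sketch.  It pairs each
queryInput with the queryResponse whose root object has the same
namespace/id, regardless of order.  The XML is trimmed down from the
sample, and the helper names key and pair_responses are hypothetical.

```python
import xml.etree.ElementTree as ET

MOBY_NS = "http://www.biomoby.org/moby"
NS = {"moby": MOBY_NS}


def key(element):
    """(namespace, id) pair of a MOBY object element."""
    return (element.get(f"{{{MOBY_NS}}}namespace"), element.get(f"{{{MOBY_NS}}}id"))


def pair_responses(query_xml, response_xml):
    """Map each query object's (namespace, id) to the response object that shares it."""
    query_keys = [key(qi[0])
                  for qi in ET.fromstring(query_xml).findall("moby:queryInput", NS)]
    responses = {key(qr[0]): qr[0]
                 for qr in ET.fromstring(response_xml).findall("moby:queryResponse", NS)}
    return {k: responses.get(k) for k in query_keys}


QUERY = """<moby:MOBY xmlns:moby="http://www.biomoby.org/moby">
    <moby:queryInput><moby:Sequence moby:namespace="GenBank/GI" moby:id="557634"/></moby:queryInput>
    <moby:queryInput><moby:Sequence moby:namespace="GenBank/GI" moby:id="163483"/></moby:queryInput>
</moby:MOBY>"""

# The responses arrive in the opposite order, as in the sample.
RESPONSE = """<moby:MOBY xmlns:moby="http://www.biomoby.org/moby">
    <moby:queryResponse><moby:BLAST_TEXT moby:namespace="GenBank/GI" moby:id="163483">Query= 163483</moby:BLAST_TEXT></moby:queryResponse>
    <moby:queryResponse><moby:BLAST_TEXT moby:namespace="GenBank/GI" moby:id="557634">Query= 557634</moby:BLAST_TEXT></moby:queryResponse>
</moby:MOBY>"""

pairs = pair_responses(QUERY, RESPONSE)
for query_key, report in pairs.items():
    print(query_key, "->", report.text if report is not None else None)
```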
 
========================================================================
Single query with two parameters
(e.g. "please return sequences annotated with this GO term that contain
this sequence motif")

QUERY:
<?xml version="1.0" encoding="UTF-8"?>
<moby:MOBY xmlns:moby="http://www.biomoby.org/moby">
    <moby:queryInput>
        <moby:Sequence  moby:namespace="GenBank/GI" moby:id="557634">
            <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">10</moby:INT>
            <moby:STRING moby:namespace="primitive" moby:id="" moby:tagName="SequenceString">
                TGCGCATGCG
            </moby:STRING>
        </moby:Sequence>
        <moby:GO_Term moby:namespace="GO/Acc" moby:id="GO:0005575">
            <moby:STRING moby:namespace="primitive" moby:tagName="go:name">cellular component</moby:STRING>
            <moby:STRING moby:namespace="primitive" moby:tagName="go:definition">
            That fraction of cells, prepared by disruptive biochemical
            methods, that includes the plasma and other membranes. 
            </moby:STRING>
        </moby:GO_Term>
    </moby:queryInput>
</moby:MOBY>


RESPONSE:
<?xml version="1.0" encoding="UTF-8"?>
<moby:MOBY xmlns:moby="http://www.biomoby.org/moby"
            moby:authURI="http://www.geneontology.org">
    <moby:queryResponse>
        <moby:Sequence  moby:namespace="EMBL/Acc" moby:id="A998645">
            <moby:CrossReference>
                <moby:queryXref moby:namespace="GenBank/GI" moby:id="557634"/>
                <moby:queryXref moby:namespace="GO/Acc" moby:id="GO:0005575"/>
                <moby:Object moby:namespace="PubMed/ID" moby:id="72364722"/>
            </moby:CrossReference>
            <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">1265</moby:INT>
            <moby:STRING moby:namespace="primitive" moby:id="" moby:tagName="SequenceString">
                ATCGAGCATCAGGCAGTGCGAGCTGAGCTGAGCTGATGCGGGATGTAGTAGCGA
                GGAGCGATGCTAGTGCGAGTCGATGCGTAGTG....
            </moby:STRING>
        </moby:Sequence>
    </moby:queryResponse>
</moby:MOBY>


COMMENT:
    
Note that the namespace of the returned object has changed!  As a
result, the service provider *MUST* return a CrossReference block
containing queryXref objects that carry the namespace/id of the query
object(s), so that the client can map a response back to the query that
it relates to.  As it happens, the Service also knew about a PubMed
entry that relates to this sequence, and included it as a
cross-referencing base Object.

========================================================================
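
A client could resolve such a response along the lines of the Python
sketch below: if the object carries queryXref entries, use them to find
the originating query objects; otherwise fall back to the object's own
namespace/id.  The fallback is my reading of the earlier examples rather
than a stated rule, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

MOBY_NS = "http://www.biomoby.org/moby"
NS = {"moby": MOBY_NS}


def key(element):
    """(namespace, id) pair of a MOBY object element."""
    return (element.get(f"{{{MOBY_NS}}}namespace"), element.get(f"{{{MOBY_NS}}}id"))


def query_keys_for(response_object):
    """Return the (namespace, id) pairs of the query objects this response answers."""
    xrefs = response_object.findall("moby:CrossReference/moby:queryXref", NS)
    if xrefs:
        return [key(x) for x in xrefs]
    # Assumption: with no queryXref, the response kept the query root's namespace/id.
    return [key(response_object)]


def other_xrefs(response_object):
    """Cross-references that are not queryXref, e.g. the PubMed base Object."""
    block = response_object.find("moby:CrossReference", NS)
    if block is None:
        return []
    return [key(child) for child in block if child.tag != f"{{{MOBY_NS}}}queryXref"]


RESPONSE_OBJECT = """<moby:Sequence xmlns:moby="http://www.biomoby.org/moby"
        moby:namespace="EMBL/Acc" moby:id="A998645">
    <moby:CrossReference>
        <moby:queryXref moby:namespace="GenBank/GI" moby:id="557634"/>
        <moby:queryXref moby:namespace="GO/Acc" moby:id="GO:0005575"/>
        <moby:Object moby:namespace="PubMed/ID" moby:id="72364722"/>
    </moby:CrossReference>
</moby:Sequence>"""

obj = ET.fromstring(RESPONSE_OBJECT)
print(query_keys_for(obj))
print(other_xrefs(obj))
```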

========================================================================
Single query with two parameters, one of which is a collection
(e.g. "please create a Blast database from the collection of sequences,
then Blast the given sequence against that database")

QUERY:
<?xml version="1.0" encoding="UTF-8"?>
<moby:MOBY xmlns:moby="http://www.biomoby.org/moby">
    <moby:queryInput>
        <moby:collection moby:namespace="collection" moby:id="collection1">
            <moby:Sequence  moby:namespace="GenBank/GI" moby:id="557634">
                <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">2003</moby:INT>
                <moby:STRING moby:namespace="primitive" moby:id="" moby:tagName="SequenceString">
                    TGCGCATGCGATCGATCAGCAGCGTAGCTAGGCATCGATTCG...
                </moby:STRING>
            </moby:Sequence>
            <moby:Sequence  moby:namespace="GenBank/GI" moby:id="883472">
                <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">1664</moby:INT>
                <moby:STRING moby:namespace="primitive" moby:id="" moby:tagName="SequenceString">
                    TGCGCATGCGTCGAGCTAGCTAGCGATGCGATGCTGTAGCGTGAC....
                </moby:STRING>
            </moby:Sequence>
        </moby:collection>
        <moby:Sequence  moby:namespace="GenBank/GI" moby:id="1177346">
            <moby:INT moby:namespace="primitive" moby:id="" moby:tagName="Length">164</moby:INT>
            <moby:STRING moby:namespace="primitive" moby:id="" moby:tagName="SequenceString">
                ACCCCTGCGCATGCGTCGAGCTAGCTAGCGATGCGATGCTGTAGCGTGAC....
            </moby:STRING>
        </moby:Sequence>
    </moby:queryInput>
</moby:MOBY>



RESPONSE:
<?xml version="1.0" encoding="UTF-8"?>
<moby:MOBY xmlns:moby="http://www.biomoby.org/moby"
            moby:authURI="http://www.ncbi.nlm.nih.gov">
    <moby:queryResponse>
        <moby:BLAST_TEXT moby:namespace="GenBank/GI" moby:id="1177346">
            <moby:CrossReference>
                <moby:queryXref moby:namespace="collection" moby:id="collection1"/>
                <moby:queryXref moby:namespace="GenBank/GI" moby:id="1177346"/>
            </moby:CrossReference>
            <![CDATA[
            BLASTN 2.2.2 [Jan-08-2002]


            Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
            Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
            "Gapped BLAST and PSI-BLAST: a new generation of protein database search
            programs",  Nucleic Acids Res. 25:3389-3402.

            Query= 1177346
            ....
            ]]>
        </moby:BLAST_TEXT>
    </moby:queryResponse>
</moby:MOBY>

COMMENT:
Since there were two input objects and one output object, it is again
necessary to include cross-references back to both of them in the
CrossReference block to avoid ambiguity (despite the fact that the
namespace/id of the query Sequence is duplicated in the BLAST_TEXT
object triple; if the server returns one queryXref, it *MUST* return
them all).
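
The "all or none" rule could be checked with something like the Python
sketch below.  It treats the direct children of queryInput (here the
collection and the lone Sequence) as the objects that must be
cross-referenced, and it assumes the collection carries moby-prefixed
namespace and id attributes.  Both points are my assumptions, and the
function name missing_query_xrefs is hypothetical.

```python
import xml.etree.ElementTree as ET

MOBY_NS = "http://www.biomoby.org/moby"
NS = {"moby": MOBY_NS}


def key(element):
    """(namespace, id) pair of a MOBY object element."""
    return (element.get(f"{{{MOBY_NS}}}namespace"), element.get(f"{{{MOBY_NS}}}id"))


def missing_query_xrefs(query_input, response_object):
    """Return query keys absent from a response that uses queryXref at all.

    An empty set means the response lists every input object or lists none.
    """
    expected = {key(child) for child in query_input}
    found = {key(x)
             for x in response_object.findall("moby:CrossReference/moby:queryXref", NS)}
    return expected - found if found else set()


QUERY_INPUT = ET.fromstring("""<moby:queryInput xmlns:moby="http://www.biomoby.org/moby">
    <moby:collection moby:namespace="collection" moby:id="collection1">
        <moby:Sequence moby:namespace="GenBank/GI" moby:id="557634"/>
        <moby:Sequence moby:namespace="GenBank/GI" moby:id="883472"/>
    </moby:collection>
    <moby:Sequence moby:namespace="GenBank/GI" moby:id="1177346"/>
</moby:queryInput>""")

COMPLETE = ET.fromstring("""<moby:BLAST_TEXT xmlns:moby="http://www.biomoby.org/moby"
        moby:namespace="GenBank/GI" moby:id="1177346">
    <moby:CrossReference>
        <moby:queryXref moby:namespace="collection" moby:id="collection1"/>
        <moby:queryXref moby:namespace="GenBank/GI" moby:id="1177346"/>
    </moby:CrossReference>
</moby:BLAST_TEXT>""")

# Same response with one queryXref dropped, which breaks the rule.
PARTIAL = ET.fromstring("""<moby:BLAST_TEXT xmlns:moby="http://www.biomoby.org/moby"
        moby:namespace="GenBank/GI" moby:id="1177346">
    <moby:CrossReference>
        <moby:queryXref moby:namespace="GenBank/GI" moby:id="1177346"/>
    </moby:CrossReference>
</moby:BLAST_TEXT>""")

complete_gaps = missing_query_xrefs(QUERY_INPUT, COMPLETE)
partial_gaps = missing_query_xrefs(QUERY_INPUT, PARTIAL)
print(complete_gaps)
print(partial_gaps)
```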

