[Bioperl-pipeline] DataMonger

Wed, 11 Sep 2002 17:29:08 -0700

hi all,
Following are the proposed changes to the XML structure for the new schema.
The structure doesn't correspond to the schema exactly, as the schema is
overloaded but the XML needs to be more user-friendly.

A rule can now have the action RUN_DATAMONGER, in which case it needs an
analysis_id and a datamonger_id, and no next_analysis_id is required.
This is because, although the datamonger is implemented as a runnable for
other reasons, encapsulating it as a preprocessor seemed better.

For the converter, I have added it to the analysis itself, so an analysis
can have many converters and each will be run one after the other.
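
As a rough illustration of "one after the other", here is a minimal Python
sketch that reads the converters of one analysis and orders them. It assumes
the chain is ordered by ascending <rank>, which the proposal implies but does
not state. The function name converter_chain and the inline sample are
hypothetical and are not part of the pipeline code.

```python
import xml.etree.ElementTree as ET

# Sample <analysis> fragment using the proposed tags. The converters are
# listed out of order on purpose, to show that <rank> decides the order.
ANALYSIS_XML = """
<analysis id="1">
  <logic_name>genewise</logic_name>
  <converter>
    <module>Bio::Ensembl::Converter::EnsToGFD</module>
    <method>convert_ens_to_gfd</method>
    <rank>2</rank>
  </converter>
  <converter>
    <module>Bio::Ensembl::Converter::BioToEns</module>
    <method>convert_bio_to_ens</method>
    <rank>1</rank>
  </converter>
</analysis>
"""


def converter_chain(analysis_xml):
    """Return (module, method) pairs for one analysis, lowest rank first.

    Assumption: converters run in ascending <rank> order.
    """
    analysis = ET.fromstring(analysis_xml.strip())
    ranked = []
    for conv in analysis.findall("converter"):
        # strip() guards against stray whitespace inside the element text
        rank = int(conv.findtext("rank").strip())
        module = conv.findtext("module").strip()
        method = conv.findtext("method").strip()
        ranked.append((rank, module, method))
    ranked.sort(key=lambda item: item[0])
    return [(module, method) for _, module, method in ranked]


chain = converter_chain(ANALYSIS_XML)
for module, method in chain:
    print(module, method)
```

Keeping the rank inside each <converter> means the order does not depend on
where the element appears in the file.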

Please go through the following changes and let me know if you see some other
abstraction or reorganisation for the new structure.
After we finalise this, I will update the DTD and
genomic_sequence_annotation.xml. The comments by Shawn in
genomic_sequence_annotation.xml are very good and should serve as a model for
writing future comments when we add more tags to the XML.

kiran

    <data_monger id="1">
      <filter>
        <module>Bio::Pipeline::Filter::xxx</module>
        <rank>1</rank>
        <argument>
          <tag>feature_type</tag>
          <value>xxx</value>
        </argument>
        <argument>
          <tag>xxxxx</tag>
          <value>xxx</value>
        </argument>
      </filter>
      <input_create>
        <module>Bio::Pipeline::InputCreate::xxx</module>
        <rank>1</rank>
        <iohandler>
          <tag>protein_iohandler</tag>
          <iohandler_id>2</iohandler_id>
        </iohandler>
        <iohandler>
          <tag>contig_iohandler</tag>
          <iohandler_id>3</iohandler_id>
        </iohandler>
      </input_create>
    </data_monger>

    <analysis id="1">
      <logic_name>genewise</logic_name>
      <runnable>Bio::Pipeline::Runnable::Genewise</runnable>
      <nodegroup_id>1</nodegroup_id>

      <!-- iohandler to use to store genewise features -->
      <output_iohandler_id>1</output_iohandler_id>

      <converter>
        <module>Bio::Ensembl::Converter::BioToEns</module>
        <method>convert_bio_to_ens</method>
        <rank>1</rank>
      </converter>
      <converter>
        <module>Bio::Ensembl::Converter::EnsToGFD</module>
        <method>convert_ens_to_gfd</method>
        <rank>2</rank>
      </converter>
    </analysis>

    <rule>
      <action>RUN_DATAMONGER</action>
      <analysis_id>1</analysis_id>
      <datamonger_id>1</datamonger_id>
    </rule>

    <rule>
      <action>COPY_ID</action>
      <current_analysis_id>1</current_analysis_id>
      <next_analysis_id>2</next_analysis_id>
    </rule>
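
To make the rule constraints above concrete, here is a small Python sketch
that checks which tags each <rule> carries. It is only an illustration under
assumptions: the <rules> wrapper element and the check_rule helper are
hypothetical, and treating every action other than RUN_DATAMONGER as needing
both current_analysis_id and next_analysis_id is a guess based on the COPY_ID
example.

```python
import xml.etree.ElementTree as ET

# The two rules from the proposal, wrapped in a hypothetical <rules> root
# so that the fragment parses as a single XML document.
RULES_XML = """
<rules>
  <rule>
    <action>RUN_DATAMONGER</action>
    <analysis_id>1</analysis_id>
    <datamonger_id>1</datamonger_id>
  </rule>
  <rule>
    <action>COPY_ID</action>
    <current_analysis_id>1</current_analysis_id>
    <next_analysis_id>2</next_analysis_id>
  </rule>
</rules>
"""


def check_rule(rule):
    """Return a list of problems found in one <rule> element (empty if none)."""
    action = (rule.findtext("action") or "").strip()
    present = {child.tag for child in rule}
    problems = []
    if action == "RUN_DATAMONGER":
        # Per the proposal: analysis_id and datamonger_id, no next_analysis_id.
        for tag in ("analysis_id", "datamonger_id"):
            if tag not in present:
                problems.append("missing " + tag)
        if "next_analysis_id" in present:
            problems.append("unexpected next_analysis_id")
    else:
        # Assumption: other actions link a current analysis to a next one.
        for tag in ("current_analysis_id", "next_analysis_id"):
            if tag not in present:
                problems.append("missing " + tag)
    return problems


rules = ET.fromstring(RULES_XML.strip()).findall("rule")
results = [check_rule(rule) for rule in rules]
for rule, problems in zip(rules, results):
    print(rule.findtext("action"), problems or "ok")
```

A check like this could also be expressed in the DTD once it is updated, but
a DTD cannot make the required children depend on the <action> value, so some
check in code would probably still be needed.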