[Bioperl-pipeline] xml dir housekeeping

Thu Jan 30 13:07:41 EST 2003

> Sure, I agree with the idea of a dev directory for xml. However, I am
> thinking about the pipeline test with ensembl. As you know, the
> converter stuff is so far designed to convert objects between bioperl
> and ensembl, so that the pipeline can make full use of huge data sets
> rather than flat files.

Actually there is quite a nice way to test the conversion without having a
whole database around, which is to freeze Perl objects into files and thaw
them back. You could create Ensembl objects, freeze them, and keep the
frozen files as your test data. That would test the conversion well.
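
A minimal sketch of the freeze/thaw idea, assuming the core Storable module
is used for serialization. My::FakeFeature is a hypothetical stand-in for a
real Ensembl object, and the temporary file stands in for a file you would
keep under t/.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempdir);

# Hypothetical stand-in for an Ensembl object; a real test would
# build the object through the Ensembl API instead.
package My::FakeFeature;
sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}
sub start { $_[0]->{start} }
sub end   { $_[0]->{end} }

package main;

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/feature.frozen";

# One-off step: freeze the object to disk.
my $feature = My::FakeFeature->new(start => 100, end => 250);
nstore($feature, $file);

# In the test: thaw it back, then hand it to the converter.
my $thawed = retrieve($file);
print ref($thawed), " ", $thawed->start, "-", $thawed->end, "\n";
```

The class of the frozen object has to be loaded before you call methods on
the thawed copy, so the frozen files are only useful on a machine that has
the matching modules installed.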

I don't like the idea of dumping out from a current database. If you want
to test against a database, you can do it by keeping a ready-made dump in
the t directory and creating a test database object from it, as is done in
biosql/bioperl-db, as well as in some older Ensembl tests. This is the right
way to do it, and as long as you gracefully skip the test if the user
doesn't have ensembl, I think you can do it. You don't need to do it by
dumping, creating and adding analyses. You can create a ready-made dump and
build a test Ensembl database from it. But then you need to make sure it is
kept up to date every time Ensembl changes its schema, just like the
converters.
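
One way to skip gracefully could look like the sketch below. It assumes
Test::More and checks for Bio::EnsEMBL::DBSQL::DBAdaptor as a sign that the
Ensembl API is installed. The helper module_available is a hypothetical name,
and the checks inside the SKIP block are placeholders for the real
database-backed tests.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Return 1 if the named module can be loaded, 0 otherwise.
sub module_available {
    my ($module) = @_;
    my $ok = eval "require $module; 1";
    return $ok ? 1 : 0;
}

my $have_ensembl = module_available('Bio::EnsEMBL::DBSQL::DBAdaptor');

SKIP: {
    skip 'Ensembl API not installed', 2 unless $have_ensembl;

    # Placeholder checks; a real test would load the dump kept in t/
    # into a scratch database and run the converters against it.
    ok($have_ensembl, 'Ensembl API loaded');
    can_ok('Bio::EnsEMBL::DBSQL::DBAdaptor', 'new');
}
```

Users without ensembl then see the tests reported as skipped rather than
failed.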

> Hence, developing a module for preparing the dataset is a reasonable
> requirement. I think a wider variety of individual requirements will
> come to the pipeline as more people use it.

I think you should:

1) Make a simple converter test, using frozen objects

2) Make a complex db-based test, using a ready-made database dump (very
small but functional) and the EnsTestDB.pm module

3) Move towards working with Bala and Shawn on the real Ciona annotation
pipeline

Elia

********************************
* http://www.fugu-sg.org/~elia *
* tel:    +65 6874 1467        *
* mobile: +65 9030 7613        *
* fax:    +65 6779 1117        *
********************************