Bioperl: XML/BioPerl

Ewan Birney birney@sanger.ac.uk
Thu, 31 Dec 1998 12:02:00 +0000 (GMT)


David,

	A very interesting read (as were Gunther's comments on it).

	I think XML has a big part to play in serving up biological
data in a more compatible form than hard-to-digest ASCII formats. XML
fits very well with perl's 'just-do-it' philosophy, and I suspect that
any XML transport mechanisms will be first implemented in perl... so
bioperl is a good place to start airing these sorts of ideas.

	As you mentioned there are a number of already existing biological
schemas - from the two XML things already out there, through the NCBI's
ASN.1 definition. In addition there is the work by the CORBA people at the
EBI: in particular, the EBI's IDL for the sequence database is quite well
thought out (you can read more about it at http://corba.ebi.ac.uk/).
Finally, in the CORBA field, you may know that there is currently
discussion in the Life Sciences Research working group about biological
data: a number of people in bioperl have responded to it in some way.

	I think all these approaches are valid ways of getting at the
problem (I don't find it very useful to argue about whether it is better
to do it using one technology or another - the important thing is just to
do it and not argue about things forever...especially technology). You
seem to have made good headway in describing things as XML objects. What
I might suggest is the following:

	a) that you start a web site pulling together all these links to
other biological schemas and your own take on the problem

	b) encourage other people to comment on it (as Gunther already
has). I think this mailing list is as good a place to start as any. This
may or may not mean merging your efforts with one of the existing
ones.

	c) come up with some concrete proposals

For both a) and c) I can strongly encourage you to do this inside the 
'bioperl' web site and code base. I know I would be very supportive of it.
You can get an account on bio.perl.org and add pages to the web site via
CVS (the web site is under cvs control). It will probably take a while
to gather all the information and come up with proposals that enough
people agree with to make them sensible, and once you have done that you
can start trying to lay out some serious pieces of code inside the bioperl
code base. That's looking ahead 3-6 months, probably at a minimum.


I know you are a busy man ;) so you might feel that it is better done
by someone else. However, I think you seem interested and have already
thought about the problem, and so I would encourage you to keep at it
and try to make sense of what is already out there, followed by coming
up with serious proposals.

BTW - if you are interested, I wrote a paper about software engineering
in bioinformatics and it has a chapter about distributing objects, where
I discuss XML. It is at
ftp://ftp.sanger.ac.uk/pub/birney/libs/components.ps

I know I should get it into html... but I haven't installed latex2html
yet. <sigh>


To get an account on bio.perl.org, email dag@genetics.com. If you
do want to use the bioperl web site to host something, you probably
want to get on the guts list (available from the same place as this one).


Anyway - great to see you take interest in this. I look forward to more
discussion on it....


Ewan Birney
<birney@sanger.ac.uk>
http://www.sanger.ac.uk/Users/birney/

=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================