[Bioperl-l] miscellanea
Jason Stajich
jason@cgt.mc.duke.edu
Sat, 9 Feb 2002 18:13:17 -0500 (EST)
A number of off-line things have been going on, and I just wanted to make
sure the community stays informed. Some day soon we will have an
easy-to-update news system on the bioperl site so you can have your
KNewsTickers flashing up bioperl news updates ;) - for now I am just
sending this in a list email.
* XML parsers. We're looking to converge on a single Perl XML parser in
the future. We want to go with SAX2-compliant parsers, but I think
that we are still debating which one is speediest. My
guess is that we'll leave things as they are for 1.00 but expect to
retool the internals of any modules that use an XML parser and
converge on a single solution where possible.
* Bibliographic objects are getting added in Bio::Biblio. These are
based on Martin Senger's design and we will be adding a Medline
parser (and possibly Entrez NCBI XML if someone wants to help write
it). We will have access to CORBA and HTTP fetches for remote data as
well as hooks into an SQL db layer through the biosql project.
* Markers and Maps will be coming up to speed as we test the objects out
on real projects.
* The Hackathon produced some great ideas - one of which is to
formalize some of the sequence database access. We've decided to call
the existing system, which sends an HTTP request with an accession or
gi and gets back sequence data in a standard format (fasta, genbank,
embl, bsml, agave), "BioFetch". This probably means that the DB.t tests
should get moved to BioFetch.t and we leave only non-BioFetch
DB tests in there (hmm and those are...?)
* I'm considering proposing an event-based parsing model for SeqIO (post
1.0 of course) in the same way the event-based parsing was written for
SearchIO. This would also be the time and place to insert some
smarter feature location parsing with a grammar
(using Parse::RecDescent or equivalent) rather than the pieced-together
regexps.
* That said - we need an AGAVE SeqIO parser at some point - hopefully
once the new framework is in place this will be extremely easy to
write. Maybe the DT guys want to write one?
* SteveC and I have been musing about how we want to deal with the
scripts and examples directories. Maybe it makes sense to have a single
directory called scripts and have examples live in there. The notion is
that examples should demonstrate bioperl functionality but may not be
general purpose (cmdline args, etc) while a script is something that
people should be able to use out of the box for real work. There
will also be some scripts which don't use bioperl - I have started a
dir called scripts/contributed which is where these types of scripts
can live. I would like to consider breaking this off into a new CVS
module so we can grant write accounts to non-bioperl devs without
worrying about erroneous commits to the main tree. With CVS alias
magic I can actually make these appear in the scripts/contributed
directory anyway.
* The current list of things to do for 1.00 is:
- Finish code reviews for those who have agreed to do them.
This should include answering the questions:
Does the documentation make sense?
Is the SYNOPSIS runnable?
Does a reasonable test exist for all the pertinent objects?
Are there any outstanding issues that need to be re-examined?
- Verify Peter's bptutorial changes wrt new modules, and update the
README and biodesign.pod (Brian O has been way on top of this -
Thanks!).
- Check the bug list to see if there are any other gotchas that
we should fix before the release (there is at least one SeqIO
feature location parsing bug showing up).
(Other things I'm forgetting?)
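
To make the grammar idea for feature location parsing a bit more concrete,
here is a rough sketch of what a recursive-descent location parser could
look like. It is written in Python purely as a language-neutral
illustration; it is not bioperl code and does not use Parse::RecDescent.
The grammar is an assumption that covers only a small subset of
GenBank/EMBL location syntax (plain positions, start..end ranges, and the
join/order/complement operators), and the names tokenize, LocationParser
and parse_location are hypothetical.

```python
import re

# Assumed, simplified grammar (a subset of GenBank/EMBL location syntax):
#   location := range | OPERATOR "(" location ("," location)* ")"
#   range    := INT | INT ".." INT
TOKEN = re.compile(r"\s*(\d+|\.\.|[A-Za-z_]+|[(),])")


def tokenize(text):
    """Split a location string into integers, '..', names and punctuation."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class LocationParser:
    """Hypothetical recursive-descent parser: one method per grammar rule."""

    OPERATORS = {"join", "order", "complement"}

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.index = 0

    def peek(self):
        if self.index < len(self.tokens):
            return self.tokens[self.index]
        return None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {token!r}")
        self.index += 1
        return token

    def parse(self):
        node = self.location()
        if self.peek() is not None:
            raise ValueError(f"unexpected trailing token {self.peek()!r}")
        return node

    def location(self):
        # Operators nest, so this rule calls itself for each argument.
        if self.peek() in self.OPERATORS:
            operator = self.take()
            self.take("(")
            parts = [self.location()]
            while self.peek() == ",":
                self.take(",")
                parts.append(self.location())
            self.take(")")
            return (operator, parts)
        return self.range()

    def integer(self):
        token = self.take()
        if not token.isdigit():
            raise ValueError(f"expected a position, got {token!r}")
        return int(token)

    def range(self):
        # A single position is treated as a range of length one.
        start = self.integer()
        end = start
        if self.peek() == "..":
            self.take("..")
            end = self.integer()
        return ("range", start, end)


def parse_location(text):
    """Parse a location string into nested tuples."""
    return LocationParser(text).parse()
```

Calling parse_location("join(1..10,complement(20..30))") gives back a
nested tuple structure, so nesting is handled by the grammar rules
themselves rather than by a pile of special-case regexps. A real
implementation would still need to cover fuzzy positions, remote
accessions and the rest of the location syntax.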
Anyone is welcome and encouraged to help with the above. Especially
newcomers - if you can read some of the documentation and tell us what is
unclear, we can be sure to fix it before the release. If you do take
something on and get it completed, send a note to the list so we can put
a tick on the board and move on.
-jason
--
Jason Stajich
Duke University
jason@cgt.mc.duke.edu