[Bioperl-l] miscellanea

Jason Stajich jason@cgt.mc.duke.edu
Sat, 9 Feb 2002 18:13:17 -0500 (EST)


A number of off-line things have been going on, and I just wanted to be
sure to keep the community informed.  Some day soon we will have an
easy-to-update news system on the bioperl site so you can have your
KNewsTickers flashing up bioperl news updates ;) - for now I am just
sending this in a list email.


  * XML parsers.  We're looking to converge on a single Perl XML parser
    in the future.  We want to go with SAX2-compliant parsers, but I
    think we are still debating which is fastest.  My guess is that
    we'll leave things as they are for 1.00, but expect to retool the
    internals of any modules that use an XML parser and converge on a
    single solution where possible.

  * Bibliographic objects are getting added in Bio::Biblio.  These are
    based on Martin Senger's design and we will be adding a Medline
    parser (and possibly Entrez NCBI XML if someone wants to help write
    it).  We will have access to CORBA and HTTP fetches for remote data as
    well as hooks into an SQL db layer through the biosql project.

  * Markers and Maps will be coming up to speed as we test the objects out
    on real projects.

  * The Hackathon produced some great ideas - one of which is to
    formalize some of the sequence database access.  We've decided to call
    the existing system, in which an HTTP request made with an accession
    or gi returns sequence data in a standard format (fasta, genbank,
    embl, bsml, agave), "BioFetch".  This probably means that the DB.t
    tests should get moved to BioFetch.t and we leave only non-BioFetch
    DB tests in there (hmm, and those are...?)

  * I'm considering proposing an event-based parsing model for SeqIO (post
    1.0 of course), in the same way the event-based parsing was written
    for SearchIO.  This would also be the time and place to insert some
    smarter feature location parsing with a grammar (using
    Parse::RecDescent or equivalent) rather than the pieced-together
    regexps.

  * That said - we need an AGAVE SeqIO parser at some point - hopefully
    once the new framework is in place this will be extremely easy to
    write.  Maybe the DT guys want to write one?

  * SteveC and I have been musing how we want to deal with the scripts and
    examples directories.  Maybe it makes sense to have a single directory
    called scripts and have examples be located in there.  The notion is
    that examples should demonstrate bioperl functionality but may not be
    general purpose (cmdline args, etc) while a script is something that
    people should be able to use out of the box for real work.  There
    will also be some scripts which don't use bioperl - I have started a
    dir called scripts/contributed which is where these types of scripts
    can live.  I would like to consider breaking this off into a new CVS
    module so we can grant write accounts to non-bioperl devs without
    worrying about erroneous commits to the main tree.  With CVS alias
    magic I can actually make these appear in the scripts/contributed
    directory anyway.

  * The current list of things to do for 1.00 is:

        - Finish code reviews for those who have agreed to do them.
          This should include answering the questions:
            Does the documentation make sense?
            Is the SYNOPSIS runnable?
            Does a reasonable test exist for all the pertinent objects?
            Are there any outstanding issues that need to be re-examined?

        - Verify Peter's bptutorial changes wrt new modules, and update
          the README and biodesign.pod (Brian O has been way on top of
          this - Thanks!).

        - Check the bug list to see if there are any other gotchas that
          we should fix before the release (there is at least one SeqIO
          feature location parsing bug showing up).

       (Other things I'm forgetting?)

Anyone is welcome and encouraged to help with the above.  Newcomers
especially - if you can read some of the documentation and tell us what is
unclear, we can be sure to fix it before the release.  If you do take
something on and get it completed, send a note to the list so we can put
a tick on the board and move on.
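
To make the grammar idea for feature locations a bit more concrete, here
is a minimal sketch of a recursive-descent parser for a small subset of
the location syntax (single positions, start..end ranges, join(...) and
complement(...)).  It is written in Python purely for illustration and
does not use Parse::RecDescent; the grammar subset and the names
tokenize, LocationParser and parse_location are assumptions made up for
this example, not existing bioperl code.

```python
import re

# Assumed grammar (a simplified subset, for illustration only):
#   location := "complement" "(" location ")"
#             | "join" "(" location ("," location)* ")"
#             | INT ".." INT
#             | INT
TOKEN = re.compile(r"\s*(\d+|\.\.|[(),]|[a-z]+)")


def tokenize(text):
    """Split a location string into tokens, rejecting unknown characters."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class LocationParser:
    """Hypothetical recursive-descent parser: one method per grammar rule."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {token!r}")
        self.pos += 1
        return token

    def parse(self):
        tree = self.location()
        if self.peek() is not None:
            raise ValueError(f"trailing input: {self.peek()!r}")
        return tree

    def location(self):
        token = self.peek()
        if token == "complement":
            self.take()
            self.take("(")
            inner = self.location()
            self.take(")")
            return ("complement", inner)
        if token == "join":
            self.take()
            self.take("(")
            parts = [self.location()]
            while self.peek() == ",":
                self.take()
                parts.append(self.location())
            self.take(")")
            return ("join", parts)
        return self.span()

    def span(self):
        start = self.integer()
        if self.peek() == "..":
            self.take()
            return ("range", start, self.integer())
        # A single position is treated as a one-base range.
        return ("range", start, start)

    def integer(self):
        token = self.take()
        if not token.isdigit():
            raise ValueError(f"expected an integer, got {token!r}")
        return int(token)


def parse_location(text):
    """Parse a location string into a nested tuple tree."""
    return LocationParser(text).parse()
```

For example, parse_location("join(1..10,20..30)") returns
('join', [('range', 1, 10), ('range', 20, 30)]), and malformed input such
as "join(1..5" raises ValueError instead of being silently half-matched,
which is the main thing a grammar buys over stacked regexps.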

-jason
-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu