[Biojava-dev] The future of BioJava
Mark Fortner
phidias51 at gmail.com
Sat Sep 22 18:42:50 UTC 2007
Richard & Andy,
1. I like the idea of making readers more pluggable, and Dozer
definitely looks interesting. Is this going to be supported via the Service
Provider Interface approach (used by Taverna and other projects)?
2. Andy brought up the point of people who create non-standard
variations of EMBL-formatted files. I was wondering if these files were
created by programs written in languages other than Java? If so, would
those users be willing to use Jython, JRuby, or a Perl-like scripting
language such as Sleep? This would allow them to use BioJava as a library
and still write in a scripting language whose syntax they are familiar
with. They would also be producing files in a more standardized format,
which might cut down on the number of parsing mistakes caused by
"unsupported" file variations. You can go to http://scripting.dev.java.net
for more information on the scripting languages that the Java VM supports.
3. Was there any reason why non-standard files were being created?
Perhaps some use-case not being covered?
4. If BioJava is split up into a variety of smaller JARs, how would
you ensure that users had all of the JARs that they needed? Would an
installer be provided to allow users to select groups of JARs? There are a
number of open source installers that would make this process easier. Maven
is suitable if you're a developer, but if you're a scripter it's a little
more difficult to deal with.
5. Are there any thoughts about using a templating system like
Velocity, FreeMarker or JST? This would make it easier to ensure that files
were produced in a standard fashion. It would also make it easier to
maintain support for writing files in different file formats.
6. When it comes to unit testing and continuous building, is the
bio*.org server going to handle that automated build & burn, or is someone
in the group going to have to do it? I think the inability to have the
build set up on the server had us stymied before.
7. Now that Java also includes the Derby database and the Java
Persistence API (JPA), has anyone considered migrating the BioSQL support
from Hibernate to JPA, and using Derby as the default database? This would
make the BioSQL support a little easier to maintain and would minimize the
setup work that a new user would have to do.
8. Richard, you mention in the "Reasoning" section that "users have
moved on". What types of use cases, beyond basic sequence analysis, should
BioJava support? Would support for more lab-related processes expand the
user base and number of committers? Would support for parsing different
types of instrument files be a useful addition? I could imagine use cases
where users would like to be able to parse an Affy file and fetch probe
information, gene information, and perhaps pathway data.
9. Are there any thoughts about using annotations (perhaps in
combination with ontologies) to handle semantic validation of arguments?
For example, you might have an annotation like
@id {ontologyURI="http://www.mygrid.org.uk/ontology#LocusLink_record_id"}
indicating that the attribute or method argument is a LocusLink id.
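
To make point 9 a little more concrete, here is a minimal sketch of what
such an annotation might look like in actual Java syntax (parentheses
rather than braces). The names SemanticId, GeneRecord and ontologyUriOf are
hypothetical and are not part of BioJava; the sketch only shows that the
ontology URI can be attached to a field and read back by reflection. A real
validator would still have to look the URI up in an ontology.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

public class SemanticIdSketch {

    /** Hypothetical annotation linking a field or argument to an ontology term. */
    @Retention(RetentionPolicy.RUNTIME) // keep it visible to reflection at runtime
    @Target({ElementType.FIELD, ElementType.PARAMETER})
    public @interface SemanticId {
        String ontologyURI();
    }

    /** Hypothetical record type, used only for illustration. */
    public static class GeneRecord {
        @SemanticId(ontologyURI = "http://www.mygrid.org.uk/ontology#LocusLink_record_id")
        public String locusLinkId;

        // No annotation: nothing is claimed about this field's meaning.
        public String description;
    }

    /**
     * Returns the ontology URI declared on the named field,
     * or null if the field carries no SemanticId annotation.
     */
    public static String ontologyUriOf(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            SemanticId id = field.getAnnotation(SemanticId.class);
            return id == null ? null : id.ontologyURI();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ontologyUriOf(GeneRecord.class, "locusLinkId"));
        System.out.println(ontologyUriOf(GeneRecord.class, "description"));
    }
}
```

A reader or validator could call ontologyUriOf on each field before
accepting a value, and decide from the URI which checks to apply.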
Thanks for kick-starting this discussion!
Regards,
Mark Fortner