[Biojava-dev] bjv2 alpha 2
Matthew Pocock
matthew.pocock at ncl.ac.uk
Fri May 14 11:20:02 EDT 2004
Hi,
I've just rolled out bjv2 alpha 2 - shelob. This is both a feature and
performance enhancement release. Get it from subversion at:
http://www.derkholm.net/svn/repos/bjv2/branches/shelob
As always, the development version is at:
http://www.derkholm.net/svn/repos/bjv2/trunk
features:
* gff support
* guts for allowing both gff and embl to be viewed as a stream of features
or sequences
* schema support on queryables
performance:
* elide away unnecessary object creation - 10x speed improvement
* adaptive indexing - data sources work out what questions you ask &
build indexes
* starting to do query optimization through the query and integration
layers - 320 sec query down to 6 sec!! I think this has replaced an n*n
scaling with a log(n) scaling. Pure objects would be constant-time
though - still some way to go
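To make the adaptive indexing point concrete, here is a minimal Java sketch of the general idea. It is not the bjv2 code: the class name AdaptiveIndexSketch, the findByKey method and the threshold parameter are all hypothetical, made up for illustration. The source answers lookups by linear scan until it has seen enough of them, then builds a hash index and answers from that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; none of these names come from bjv2.
class AdaptiveIndexSketch<K, V> {
    private final List<V> items = new ArrayList<>();
    private final Function<V, K> keyOf;   // extracts the key that queries ask about
    private final int threshold;          // lookups tolerated before an index is built
    private int lookups = 0;
    private Map<K, List<V>> index;        // null until the index has been built

    AdaptiveIndexSketch(Function<V, K> keyOf, int threshold) {
        this.keyOf = keyOf;
        this.threshold = threshold;
    }

    void add(V item) {
        items.add(item);
        if (index != null) {
            // keep an already-built index in step with the data
            index.computeIfAbsent(keyOf.apply(item), k -> new ArrayList<>()).add(item);
        }
    }

    List<V> findByKey(K key) {
        lookups++;
        if (index == null && lookups > threshold) {
            buildIndex();                 // this question is asked often enough to index
        }
        List<V> out = new ArrayList<>();
        if (index != null) {
            List<V> hit = index.get(key); // hash lookup instead of a scan
            if (hit != null) {
                out.addAll(hit);
            }
            return out;
        }
        for (V item : items) {            // no index yet: linear scan
            if (key.equals(keyOf.apply(item))) {
                out.add(item);
            }
        }
        return out;
    }

    boolean isIndexed() {
        return index != null;
    }

    private void buildIndex() {
        index = new HashMap<>();
        for (V item : items) {
            index.computeIfAbsent(keyOf.apply(item), k -> new ArrayList<>()).add(item);
        }
    }

    public static void main(String[] args) {
        AdaptiveIndexSketch<String, String> src =
            new AdaptiveIndexSketch<String, String>((String s) -> s.substring(0, 1), 2);
        src.add("apple");
        src.add("avocado");
        src.add("banana");
        for (int i = 0; i < 4; i++) {
            System.out.println(src.findByKey("a") + " indexed=" + src.isIndexed());
        }
    }
}
```

A join that calls findByKey once per row of the other data set stops paying for a full scan per row once the index exists, which is the kind of change that turns an n*n query into something much cheaper.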
miscellanea:
* now requires the /newest/ javac - the javac bundled with java 1.5
beta1 was buggy
* more documentation - design and user docs
An example script:
import org.bjv2.seq.Sequence;
import org.bjv2.seq.Sequences;
import org.bjv2.seq.io.IO;
import org.bjv2.symbol.SymbolList;
import org.bjv2.gql.Queryable;

import java.io.File;

/**
 * Demonstration of integrating multiple files.
 * <p/>
 * Use: <pre>IntegrateSequences seqFile1, seqFile2, ...</pre>
 * <p/>
 * The files can be any biological sequence/feature format files -
 * currently gff & embl are supported.
 * The output will be a list of all sequences, and the number of
 * features on the sequence, regardless of whether the feature was
 * annotated in the same file that the sequence was defined in.
 *
 * @author Matthew Pocock
 */
public class IntegrateSequences
{
  public static void main(String[] args)
      throws Throwable
  {
    // load all the data in
    for(String arg: args) {
      File seqFile = new File(arg);
      System.out.println("Loading: " + seqFile);
      IO.loadSequence(seqFile);
    }

    System.out.println("All sequences: ");

    // get the queryable with all the sequence data in
    // this will be made prettier for alpha3
    Queryable<Sequence> allSeqs = (Queryable<Sequence>)
        Sequences.defaultContext().getMapping().image(
            Sequences.getIdentifiers().get(Sequences.Domains.SEQUENCE));
    System.out.println("\t" + allSeqs);

    // loop over all sequences, printing out the sequence length & the
    // number of features
    for(Sequence seq: allSeqs) {
      System.out.println("\t" + seq.getIdentifier());
      SymbolList symL = seq.getSymbolList();
      if(symL != null) {
        System.out.println("\t\tlength: " + symL.length());
      }
      System.out.println("\t\tfeatures: " + seq.getFeatures().size());
    }
  }
}