>>> Part of how we try to handle big data files in Biopython is using
>>> Python iterators, whereby the file is loaded record by record (how
>>> depends on the file format - for BLAST we do this query by query),
>>> not all into memory in one go. I think BioPerl does something very
>>> similar in their parsers; I'm not so familiar with BioJava.
> BioJava uses a visitor pattern. In effect an iterator.
> With all current implementations IO runs, then code, then IO, etc.
> So while we are IO constrained, we actually do worse than the IO
> limit, because reading stalls whenever the code runs.
> What I want is an IO thread going at maximum throughput. Every item
> should get parcelled out for further parsing and processing, in
> parallel to the IO thread.
> We should do better, and make it general. I think we can do
> it by using Scala and the standard BioJava iterators. With Scala it
> can be turned into a parallelized iterator. That is a fun project.
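To make the idea of an IO thread feeding parallel processing concrete, here is a minimal sketch in Python rather than Scala. It is only an illustration under assumptions: the names read_records, prefetch, gc_content and process_parallel are made up for this example, the input is assumed to be FASTA-like text, and none of this is the Biopython or BioJava API. A background thread reads records through an ordinary iterator into a bounded queue, while a thread pool processes them.

```python
import io
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel marking the end of the input stream


def read_records(handle):
    """Yield (header, sequence) pairs from a FASTA-like handle, one at a time."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def prefetch(iterator, maxsize=64):
    """Consume `iterator` in a background thread, buffering in a bounded queue."""
    buf = queue.Queue(maxsize=maxsize)

    def producer():
        try:
            for item in iterator:
                buf.put(item)  # blocks when the buffer is full
        finally:
            buf.put(_DONE)  # always signal the end, even after an error

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            return
        yield item


def gc_content(record):
    """Hypothetical per-record work: fraction of G/C bases in the sequence."""
    header, seq = record
    gc = sum(seq.count(base) for base in "GCgc")
    return header, (gc / len(seq) if seq else 0.0)


def process_parallel(handle, func, workers=4):
    """Apply `func` to every record, keeping the results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, prefetch(read_records(handle))))
```

Calling process_parallel(io.StringIO(">a\nGGCC\n"), gc_content) returns a list with one (header, value) pair per record. Two caveats: ThreadPoolExecutor.map submits all the work up front, so this sketch does not bound memory on the processing side, and in CPython threads only help when the per-record work releases the GIL; for CPU-bound parsing a process pool would be the closer analogue of what I have in mind with Scala.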
>> From my point of view Python guys are doing a very good job in all fields.
>> Unfortunately I'm in love with Ruby :-)
> All you need is love :)
At some point the choice of language will not matter as much, as long as it runs on a VM (something Perl 5 cannot claim at the moment, but Perl 6 can with the Parrot VM).


