[Biojava-l] RichSequence.IOTools performance

Andy Yates ayates at ebi.ac.uk
Thu Mar 31 07:57:33 UTC 2011


Makes a lot of sense. There's no way of knowing if a stream is buffered unless the top level object given was an instance of BufferedOutputStream. Does this mean that by some fluke we could buffer a buffered stream?
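
A minimal Java sketch of the instanceof check under discussion, for reference. It is not the code that was checked into FastaWriter; the names BufferedWrap, ensureBuffered and writeRecords are made up for illustration. If a BufferedOutputStream sits inside some other FilterOutputStream subclass, the check cannot see it and a second buffer is added, which costs an extra copy but does not change the bytes written. The sketch flushes instead of closing, on the assumption that the caller owns the stream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of BioJava. */
final class BufferedWrap {

    private BufferedWrap() {}

    /** Returns the stream unchanged if it is already buffered, otherwise wraps it. */
    static OutputStream ensureBuffered(OutputStream out) {
        if (out instanceof BufferedOutputStream) {
            return out;
        }
        return new BufferedOutputStream(out);
    }

    /** Writes one FASTA-style record per sequence, then flushes without closing. */
    static void writeRecords(OutputStream target, String[] headers, String[] sequences)
            throws IOException {
        OutputStream out = ensureBuffered(target);
        for (int i = 0; i < headers.length; i++) {
            out.write((">" + headers[i] + "\n").getBytes(StandardCharsets.US_ASCII));
            out.write((sequences[i] + "\n").getBytes(StandardCharsets.US_ASCII));
        }
        // Push any buffered bytes to the target; the caller decides when to close it.
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeRecords(sink, new String[] {"seq1", "seq2"}, new String[] {"ACGT", "TTGA"});
        System.out.print(new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

Flushing rather than closing is a design choice in this sketch: closing a locally created wrapper also closes the stream the caller passed in.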

TBH I'm more glad that we've got the speed back :).

Andy

On 30 Mar 2011, at 20:38, Scooter Willis wrote:

> Khalil
> 
> For BioJava3, FastaWriter was simply using an OutputStream, and
> FastaWriterHelper, which wraps its use, was not using a
> BufferedOutputStream. I made changes to FastaWriter to check whether the
> OutputStream is an instance of BufferedOutputStream and, if not, create
> one locally and then close it when returning. Writing 10,000
> sequences, or 4.5MB of data, went from 15 seconds to 0.6 seconds. I
> checked in the code change in case you want to test it with your code.
> 
> Thanks
> 
> Scooter
> 
> On Tue, Mar 29, 2011 at 5:47 PM, Khalil El Mazouari
> <khalil.elmazouari at gmail.com> wrote:
>> Hi
>> I am using the NetBeans profiler.
>> The total exec time was ± 20s (MacBook Pro i7, 4GB, SSD) for ± 10,000 seq.
>> By writing the RichSequence object to ByteArrayOutputStream -> FileChannel,
>> where appropriate, the total exec time dropped to 7s. A huge improvement for
>> the app I am developing. The app will be used to analyze ± 100,000 sequences
>> per run.
>> Regards,
>> khalil
>> 
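
The ByteArrayOutputStream -> FileChannel route mentioned above can be sketched roughly as follows. This is an illustration only, not Khalil's code: the class name InMemoryThenChannel and the method writeAll are hypothetical, plain strings stand in for serialized RichSequence records, and it assumes the whole output fits in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical example for illustration only; not part of BioJava. */
final class InMemoryThenChannel {

    private InMemoryThenChannel() {}

    /** Collects all records in memory, then writes them to the file in one pass. */
    static long writeAll(Path file, String[] records) throws IOException {
        ByteArrayOutputStream memory = new ByteArrayOutputStream();
        for (String record : records) {
            memory.write(record.getBytes(StandardCharsets.US_ASCII));
        }

        ByteBuffer buffer = ByteBuffer.wrap(memory.toByteArray());
        long written = 0;
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // A single write() call may not drain the buffer, so loop until it is empty.
            while (buffer.hasRemaining()) {
                written += channel.write(buffer);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sequences", ".fasta");
        long bytes = writeAll(file, new String[] {">seq1\nACGT\n", ">seq2\nTTGA\n"});
        System.out.println(bytes + " bytes written to " + file);
        Files.delete(file);
    }
}
```

The trade-off in this sketch is memory for fewer, larger writes: nothing reaches the disk until every record has been serialized.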
>> On 29 Mar 2011, at 22:13, Scooter Willis wrote:
>> 
>> Instead of percentage metrics, can you get the time before and after the
>> write execution for comparison, without profiling? What profiler are you
>> using?
>> 
>> On Mar 28, 2011 5:39 PM, "Andy Yates" <ayates at ebi.ac.uk> wrote:
>> 
>> Dang Rich :).
>> 
>> At the moment we've not done anything WRT GenBank outputting but would
>> accept anything to help us out with this.
>> 
>> As for the performance difference between BJ3 & BJ, what happens if you use
>> the writer objects directly with a BufferedOutputStream? Have you got
>> any profiling results? It would be very interesting to see where we've lost
>> the performance ...
>> 
>> Andy
>> 
>> On 28 Mar 2011, at 18:23, Richard Holland wrote:
>> 
>>> In which case you've got little option but to r...
>> 
>> --
>> Andrew Yates                   Ensembl Genomes Engineer
>> EMBL-EBI                       Tel: +44-(0)1223-492538
>> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
>> Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/








