[Biojava-l] BioJava Hackathon - Day 1

Richard Holland holland at eaglegenomics.com
Mon Jan 18 21:12:14 UTC 2010

I'll be there tomorrow to explain in person to those who are there, but for now here's a brief summary:

Files and streams are ultimately just sequences of characters, so grabbing the sequence out of one as a string is trivial and cheap. As long as the code doesn't need to do anything that can't be done at the string level, there's no need to convert it into a SymbolList equivalent. Any code that doesn't need the conversion should therefore be able to accept and work with a plain string, and be much faster and cheaper as a result.

Code that does need more than just a string should be able to convert that string into a SymbolList equivalent on demand, which can be backed by strings, files, nio, whatever.
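
To illustrate the on-demand idea, here is a minimal sketch, not actual BioJava code. The class name LazySequence, its methods, and the "ACGTN" alphabet check are all made up for the example. It keeps the raw string and only builds a symbol-level view the first time something asks for one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder (not a BioJava class): keeps the raw string and
// builds the symbol-level view only when a caller asks for it.
final class LazySequence {
    private final String raw;
    private List<Character> symbols; // stays null until first requested

    LazySequence(String raw) {
        this.raw = raw;
    }

    // String-level access: no conversion cost at all.
    String asString() {
        return raw;
    }

    boolean isConverted() {
        return symbols != null;
    }

    // Symbol-level access: validates and converts once, then caches.
    // Not thread-safe; the DNA alphabet "ACGTN" is an assumption for the sketch.
    List<Character> asSymbols() {
        if (symbols == null) {
            List<Character> parsed = new ArrayList<Character>(raw.length());
            for (int i = 0; i < raw.length(); i++) {
                char c = Character.toUpperCase(raw.charAt(i));
                if ("ACGTN".indexOf(c) < 0) {
                    throw new IllegalArgumentException(
                            "Unexpected symbol '" + c + "' at position " + i);
                }
                parsed.add(c);
            }
            symbols = Collections.unmodifiableList(parsed);
        }
        return symbols;
    }
}
```

Callers that only ever use asString() never pay for validation or boxing; the cost appears only on the first asSymbols() call.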

If the new SymbolList implements the CharSequence interface, and methods that only need strings take CharSequence parameters rather than String, then you can pass either type to those string-only methods (because String implements CharSequence). So if you've already parsed the thing into a SymbolList, you can keep that representation, drop the original string, and still use the string-only methods.
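
A rough sketch of what that could look like, purely for illustration. CharSymbolList and SequenceStats are hypothetical names, not part of BioJava, and the GC count is just a stand-in for any string-only method.

```java
// Hypothetical stand-in for the proposed SymbolList; not the BioJava API.
final class CharSymbolList implements CharSequence {
    private final String symbols; // a real version could be backed by files, nio, ...

    CharSymbolList(CharSequence raw) {
        this.symbols = raw.toString();
    }

    @Override public int length() { return symbols.length(); }

    @Override public char charAt(int index) { return symbols.charAt(index); }

    @Override public CharSequence subSequence(int start, int end) {
        return new CharSymbolList(symbols.substring(start, end));
    }

    @Override public String toString() { return symbols; }
}

final class SequenceStats {
    private SequenceStats() {}

    // Needs only character access, so it asks for CharSequence, not String.
    static int countGC(CharSequence seq) {
        int count = 0;
        for (int i = 0; i < seq.length(); i++) {
            char c = Character.toUpperCase(seq.charAt(i));
            if (c == 'G' || c == 'C') {
                count++;
            }
        }
        return count;
    }
}

class CharSequenceDemo {
    public static void main(String[] args) {
        String raw = "ATGCGC";
        CharSequence parsed = new CharSymbolList(raw);
        // The same method accepts the plain String and the parsed form.
        System.out.println(SequenceStats.countGC(raw));
        System.out.println(SequenceStats.countGC(parsed));
    }
}
```

Both calls go through the same countGC method, so nothing needs to be converted back to a String once the sequence has been parsed.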

Also, by making the new SymbolList implement the standard List interface from Collections, it can be used with nice Java shortcuts such as the new foreach loops, and with standard iterators (instead of the current SymbolListIterator method). The Collections API can then also be used to do things like sublists or reverses.
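
Again just a sketch under assumptions: ListSymbolList is a hypothetical read-only list of Character backed by a String, not the real design. Extending AbstractList is one cheap way to get iterators, foreach and subList without writing them by hand.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical read-only symbol list backed by a String; not the BioJava API.
final class ListSymbolList extends AbstractList<Character> {
    private final String symbols;

    ListSymbolList(String symbols) {
        this.symbols = symbols;
    }

    @Override public Character get(int index) {
        return symbols.charAt(index); // out-of-range indexes throw, as List requires
    }

    @Override public int size() {
        return symbols.length();
    }
}

class ListDemo {
    // Joins symbols back into a String using a foreach loop.
    static String join(List<Character> symbols) {
        StringBuilder sb = new StringBuilder(symbols.size());
        for (char c : symbols) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Reverses via the Collections API; copies first because the list is read-only.
    static String reversed(List<Character> symbols) {
        List<Character> copy = new ArrayList<Character>(symbols);
        Collections.reverse(copy);
        return join(copy);
    }

    public static void main(String[] args) {
        List<Character> seq = new ListSymbolList("ATGCGC");
        System.out.println(join(seq.subList(1, 4))); // subset via standard subList
        System.out.println(reversed(seq));           // reverse via Collections
    }
}
```

Note that Collections.reverse works in place, so a read-only list like this one has to be copied first; a mutable implementation would also need to override set().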

...hope it all makes sense!

On 18 Jan 2010, at 19:17, George Waldon wrote:

> Hi Andreas,
> Thanks for the link. It is quite nice to get news from the hackathon this way.
> Quick question on my side. I understand that there is a need for a bridge between sequence objects and strings, especially for beginners, but how does this become a critical issue to the point of saying "Sequences should be Strings as far as possible"?
> - George
> On Mon, Jan 18, 2010 at 8:37 AM, Andreas Prlic <andreas at sdsc.edu> wrote:
>> Hi,
>> Today is the first day of the BioJava Hackathon. We are 8 BioJava
>> developers meeting here at the Genome Campus in Hinxton to hack on
>> the latest code-base. If you want to stay in touch with what is going
>> on, I am going to blog every day at:
>> http://openbioinformatics.blogspot.com . Andy Yates is tweeting at
>> http://twitter.com/search?q=%23biojava
>> Other ways of staying in touch with us are via Skype or Google Chat.
>> Send me your skype username or google account if you want to talk to
>> us directly.
>> Andreas
>> _______________________________________________
>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l

Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
