[Biojava-l] GenBank parsing

Tue Jun 2 09:40:39 UTC 2015

Hi

I'm coming back to BioJava (BJ) after a couple of years away and am somewhat confused by the current collection of cookbooks, tutorials and APIs. There appear to be a few examples for handling protein structure data, but relatively little for more mainstream tasks such as parsing GenBank files, which I need to do first to get the information I want for investigating protein structure. When I look at the relevant code samples, they refer back to BJ3, BJ1, or even BJX. Even the Wiki page still refers to BJ3, despite the release of BJ4 back in Feb 2015.

I have everything else working for parsing GenBank data, but I'm still trying to get the Annotation information out of the top of a GenBank file and can't find any way of doing this using BJ4: the BJ4 API appears to refer to the RichAnnotation type in the BJX release. Can anyone clarify what you are supposed to do here? Should I start mixing in some BJX (and is BJX still active?), or should I still be using BJ3 until BJ4 stabilizes? I realise this is an open source project, but some clarification on the current status of things would be handy if the project is going to appeal to a larger community :)

Thanks!