[Biojava-dev] Some more light parser changes

Bubba Puryear bubba.puryear at gmail.com
Tue Jul 4 16:48:10 UTC 2006


Greetings again,

I've made some slight adjustments to GenbankFormat.java (biojavax)
that allow me to get biojavax to parse all 292M of GenBank records
that I have access to. There are three changes:

1. Made the regex for locus lines slightly more tolerant. (The modified
date field is now optional; some of the older records I have don't
include the date.)

2. The previous check-in for records with no accession was slightly
incomplete: the accession has to be set on the RichListener, not just
assigned to the local accession variable (which I believe is only used
for logging).

3. I needed a larger readAheadLimit on the BufferedReader for parsing sections.
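
For anyone who wants to see the idea behind items 1 and 3 without opening
the patch, here is a rough sketch. It is not the code in GenbankFormat.java:
the class name, the simplified LOCUS pattern, and the helper methods are all
made up for illustration, and the real locus regex handles more fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: hypothetical names, not part of biojavax.
final class ParserTweaksSketch {

    // Simplified LOCUS pattern (an assumption, not the biojavax one):
    // name, length in bp, molecule type, then an OPTIONAL dd-MMM-yyyy date.
    static final Pattern LOCUS = Pattern.compile(
        "^LOCUS\\s+(\\S+)\\s+(\\d+)\\s+bp\\s+(\\S+)"
        + "(?:\\s+(\\d{2}-[A-Z]{3}-\\d{4}))?\\s*$");

    // Returns the modified date, or null when the record has none.
    static String modifiedDate(String line) {
        Matcher m = LOCUS.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a LOCUS line: " + line);
        }
        return m.group(4); // null if the optional date group did not match
    }

    // Peeks at the next line without consuming it. readAheadLimit is the
    // number of characters that may be read while the mark stays valid;
    // reading further than that can make reset() fail with an IOException.
    static String peekLine(BufferedReader reader, int readAheadLimit) {
        try {
            reader.mark(readAheadLimit);
            String line = reader.readLine();
            reader.reset();
            return line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(modifiedDate("LOCUS       NEWREC1      120 bp    DNA       04-JUL-2006"));
        System.out.println(modifiedDate("LOCUS       OLDREC1      120 bp    DNA"));

        BufferedReader reader =
            new BufferedReader(new StringReader("FEATURES\nORIGIN\n"));
        System.out.println(peekLine(reader, 1024)); // line is still unread afterwards
    }
}
```

The same reasoning applies to item 3: if a section is longer than the
readAheadLimit passed to mark(), the reset() that follows is not guaranteed
to work, so the limit has to cover the longest section being looked ahead at.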

All the tests run (locally anyway) with these changes and pass. Thanks
for your consideration.

Bubba
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GenbankFormat.java.patch
Type: text/x-patch
Size: 3446 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biojava-dev/attachments/20060704/15baf366/attachment-0002.bin>

