[Biojava-l] RefSeq bioJava parser problem

Brad Chapman chapmanb@arches.uga.edu
Thu, 9 May 2002 09:34:08 -0400


[Can the Biojava GenBank parser deal with RefSeq?]

Matthew:
> Could someone with genbank parsing expertise say what the differences
> between the two formats are, and how easy it would be to get the genbank 
> parser to accept refseq documents?

I don't know anything about the Biojava parsers, but I've spent many
hours of my life getting the Biopython GenBank parser to deal with
RefSeq documents. It is definitely possible, but is a complete pain in
the arse -- the RefSeq files are only marginally "GenBank" and often
make up their own field names, qualifier names and value keys on a whim.
Even worse, they like to add their own functions to locations, like
'bond(12,63)'.
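
To make the location problem concrete, here is a minimal sketch of a
tolerant location parser. It is not the Biopython or Biojava code; the
names parse_location and split_top_level are hypothetical, and it
assumes that only join, order and complement count as standard
operators, so anything else (such as bond) is kept but flagged rather
than rejected. It only handles plain integers and start..end ranges.

```python
import re

# Assumption for this sketch: these are the only "standard" operators.
STANDARD_OPERATORS = {"join", "order", "complement"}

_RANGE_RE = re.compile(r"^(\d+)\.\.(\d+)$")
_CALL_RE = re.compile(r"^([A-Za-z_]+)\((.*)\)$")


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_location(text):
    """Parse a location string into nested tuples.

    Unknown operators are returned tagged as "nonstandard" instead of
    raising, so the caller can decide what to do with them.
    """
    text = text.strip()
    match = _RANGE_RE.match(text)
    if match:
        return ("range", int(match.group(1)), int(match.group(2)))
    if text.isdigit():
        return ("point", int(text))
    match = _CALL_RE.match(text)
    if match:
        operator, arguments = match.group(1), match.group(2)
        children = [parse_location(p) for p in split_top_level(arguments)]
        kind = "standard" if operator in STANDARD_OPERATORS else "nonstandard"
        return (kind, operator, children)
    raise ValueError("unrecognised location: %r" % text)
```

With this approach a file containing bond(12,63) still parses, and the
odd operator shows up in the result where it can be logged or
special-cased later.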

Anyway, it's possible, and I've done it whenever people complained about
a particular file or set of files not working, but I wouldn't describe
it as one of the most enjoyable experiences of my life.

My-most-enjoyable-experience-with-coding-up-the-parser-ly yr's,
Brad
-- 
PGP public key available from http://pgp.mit.edu/