[Biojava-l] Re: Biojava-l digest, Vol 1 #334 - 2 msgs

Thomas Down td2@sanger.ac.uk
Mon, 11 Jun 2001 19:53:40 +0100


On Mon, Jun 11, 2001 at 08:17:55PM +0200, Sarath wrote:
> Hi Thomas,
>   It was nice to get such an early response, and it did work the
> way you said, but I had to include a set of dummy characters to
> mislead the program. Is this the only way I can manage with
> such files? The files I suggested as a reference are newly
> sequenced ones, i.e. the sequencing of these genomes was completed on 1st
> June. What do you have to say about this? I don't exactly know the purpose of
> GI in the Genbank format, but do you think this level of rigidity is
> necessary for reading the Genbank format?
> From Sarath

Disclaimer: I'm not a regular Genbank user (mainly 'cos I
have EMBL easily available on-site...).

My best guess is that GI is equivalent to the `ID' of EMBL
entries, i.e. it is an identifier which is per-version rather
than per-accession.  I may be wrong, though...

I agree that tolerating these files is almost certainly the
right thing to do -- but I wanted to post to the list first
to check that other users (some of whom use the Genbank
parser daily) don't have any issues.  If there aren't any
replies by tomorrow morning, I'll fix it.

Is this okay for you?

   Thomas.



PS. In the meantime, if you want to write your own patch, look
around line 418 of src/org/biojava/bio/seq/io/GenbankFormat.java