[Biojava-l] Antw: Re: Exception thrown when parsing GenBank file

Dietmar Birzer Dietmar.Birzer at biologie.uni-regensburg.de
Wed Nov 23 15:18:29 UTC 2011


 Hi all,

As the GenbankLocationParser from biojava-1.8.1 no longer works properly, I was wondering whether there is an equivalent way to do this (GenpeptRichSequenceDB().getRichSequence("14719485")) using BioJava 3.
Unfortunately, I have not found a GenBank/GenPept parser so far. Is that because it does not exist (yet), or just because I have not looked properly?

Best wishes
 Dietmar
 
>>> Peter Cock <p.j.a.cock at googlemail.com> 11/14/2011 6:29 PM >>> 
On Mon, Nov 14, 2011 at 4:53 PM, Dietmar Birzer wrote:
>
> Hi all,
>
> I am currently trying to debug a small application that uses
> BioJava's core-1.8.1.jar library, because it started throwing
> exceptions a while ago.
>
> I guess the problem is that the GenbankLocationParser is not
> able to handle "Het" entries in the features section of the
> GenBank/GenPept format, e.g.
>
>  Het             join(bond(9),bond(125))
>                     /heterogen="( NA,   5 )"
>
> for database id 14719485 (http://www.ncbi.nlm.nih.gov/protein/14719485).

Interesting - note the bond locations are not in the official
DDBJ/EMBL/GenBank feature table specification (v9, Oct 2011):
http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

However, as noted on http://www.bioperl.org/wiki/BioPerl_Locations,
that specification seems to be intended only for nucleotides, not for
proteins as here.
It might be worth contacting the NCBI to find out if there is an official
specification covering these location strings?
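
Until such a specification turns up, one possible workaround is to
recognise the bond(...) terms yourself before handing the location to
a parser. Below is a minimal sketch in plain Java. It is not a BioJava
API: the class name BondLocationSketch and the method parseBonds are
made up for illustration, and it assumes that bond(...) holds one or
more comma-separated integer positions, which is inferred from the
example above rather than from any specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava.
final class BondLocationSketch {

    // Assumption: bond(...) contains one or more comma-separated integers.
    private static final Pattern BOND =
            Pattern.compile("bond\\(\\s*(\\d+(?:\\s*,\\s*\\d+)*)\\s*\\)");

    // Returns the positions of every bond(...) term found in the location
    // string, one inner list per term, in order of appearance.
    static List<List<Integer>> parseBonds(String location) {
        List<List<Integer>> bonds = new ArrayList<>();
        Matcher m = BOND.matcher(location);
        while (m.find()) {
            List<Integer> positions = new ArrayList<>();
            for (String token : m.group(1).split(",")) {
                positions.add(Integer.parseInt(token.trim()));
            }
            bonds.add(positions);
        }
        return bonds;
    }

    public static void main(String[] args) {
        // Location string taken from the "Het" feature quoted above.
        System.out.println(parseBonds("join(bond(9),bond(125))"));
    }
}
```

A location without any bond(...) term yields an empty list, so the
caller can fall back to the regular location parser in that case.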

Peter