[Biojava-l] Antw: Re: Exception thrown when parsing GenBank file

George Waldon gwaldon at geneinfinity.org
Wed Nov 23 16:06:57 UTC 2011


Hi Dietmar,

The GenbankLocationParser from biojava-1.8.1 works correctly as
long as you feed it DNA locations that use the syntax described
in the DDBJ/EMBL/GenBank feature table definition. What you have
(join(bond(9),bond(125))...) is a pseudo-GenBank format that
apparently describes protein structure and uses a slightly different
syntax to indicate heteroatom locations. That is why you get all these
exceptions.
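
One possible workaround, sketched here under assumptions, is to screen location strings for the bond(...) operator before they reach a parser that only understands the nucleotide feature table syntax. The class LocationPreCheck and its method usesBondOperator are hypothetical names made up for illustration; they are not part of BioJava, and the sketch does not call any BioJava API.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava: flags location strings that use
// the protein-style bond(...) operator so the caller can skip them.
final class LocationPreCheck {

    // Matches the "bond(" operator anywhere in a location string.
    private static final Pattern BOND = Pattern.compile("\\bbond\\s*\\(");

    private LocationPreCheck() {}

    // Returns true if the location uses bond(...); null is treated as "no".
    static boolean usesBondOperator(String location) {
        return location != null && BOND.matcher(location).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "join(bond(9),bond(125))",   // from the "Het" feature in the report
            "join(12..78,134..202)",     // ordinary nucleotide-style location
            "complement(34..126)"
        };
        for (String s : samples) {
            String action = usesBondOperator(s)
                    ? "skip (bond operator)"
                    : "pass to location parser";
            System.out.println(s + " -> " + action);
        }
    }
}
```

Whether skipping such features is acceptable depends on whether the application needs the heteroatom annotations at all; if it does, the bond positions would have to be parsed separately.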

George



Quoting Dietmar Birzer <Dietmar.Birzer at biologie.uni-regensburg.de>:

>  Hi all,
>
> as the GenbankLocationParser from biojava-1.8.1 is not working  
> properly anymore, I was wondering if there is an equivalent way to  
> do this ( GenpeptRichSequenceDB().getRichSequence("14719485") )  
> using BioJava 3.
> Unfortunately I could not find any GenBank/GenPept parser so far. Is  
> that because it does not exist (yet), or just because I have not  
> looked properly?
>
> Best wishes
>  Dietmar
>
>>>> Peter Cock <p.j.a.cock at googlemail.com> 11/14/2011 6:29 PM >>>
> On Mon, Nov 14, 2011 at 4:53 PM, Dietmar Birzer wrote:
>>
>> Hi all,
>>
>> I am currently trying to debug a small software application which
>> uses BioJava's core-1.8.1.jar library, because it started throwing
>> exceptions a while ago.
>>
>> I guess the problem is that the GenbankLocationParser is not
>> able to handle "Het" entries in the features section of the
>> GenBank/GenPept format, e.g.
>>
>>  Het             join(bond(9),bond(125))
>>                     /heterogen="( NA,   5 )"
>>
>> for database id 14719485 (http://www.ncbi.nlm.nih.gov/protein/14719485) .
>
> Interesting - note the bond locations are not in the official
> DDBJ/EMBL/GenBank feature table specification (v9, Oct 2011):
> http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
>
> However, as noted on http://www.bioperl.org/wiki/BioPerl_Locations
> that specification seems to be intended only for nucleotides and not
> for proteins, as here.
> It might be worth contacting the NCBI to find out if there is an official
> specification covering these location strings?
>
> Peter
>
>
>





